The relation attention method is often used to calculate a relation score between two objects, for example between a query object and some samples.
To understand what relation attention is, we can read:
Understanding Relation Attention Network in Few-Shot Learning
However, if we do not have a query object, how can we use relation attention?
The paper Frame Attention Networks for Facial Expression Recognition in Videos gives us a method.
In this paper, we only have some video frames. There is no query frame or target object.
To use relation attention, the paper takes two steps:
Step 1: use a sigmoid function to compute an attention weight for each frame, then aggregate the frames with these weights to get a global representation.
Then, use this global representation as the query object.
Step 2: use this global representation for relation attention, scoring each frame against it.
Then, we will get a final representation \(f_v\) for classification.
We should pay attention to how \(f_v\) is calculated from each frame.
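The two steps above can be sketched in plain Python. This is only a minimal illustration under assumptions, not the paper's implementation: it assumes each frame weight is a sigmoid of a dot product with a parameter vector, that a frame is related to the global representation by concatenating the two, and that both aggregations are weighted averages. The names `frame_attention`, `q0`, and `q1` are hypothetical, and the parameters are random here, whereas in a real model they would be learned.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_mean(vectors, weights):
    """Average equal-length vectors, normalized by the sum of the weights."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, vectors)) / total
        for d in range(dim)
    ]


def frame_attention(frames, q0, q1):
    # Step 1: one sigmoid weight per frame, then a global representation.
    alphas = [sigmoid(dot(f, q0)) for f in frames]
    global_rep = weighted_mean(frames, alphas)

    # Step 2: the global representation stands in for the missing query.
    # Each frame is concatenated with it and scored again (assumed form).
    pairs = [f + global_rep for f in frames]
    betas = [sigmoid(dot(p, q1)) for p in pairs]

    # Final video representation f_v: weighted average of the pairs.
    f_v = weighted_mean(pairs, [a * b for a, b in zip(alphas, betas)])
    return alphas, betas, f_v


random.seed(0)
dim, n_frames = 4, 5
frames = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_frames)]
q0 = [random.gauss(0, 1) for _ in range(dim)]      # hypothetical parameters
q1 = [random.gauss(0, 1) for _ in range(2 * dim)]  # hypothetical parameters

alphas, betas, f_v = frame_attention(frames, q0, q1)
print(len(f_v))  # 8: frame features concatenated with the global representation
```

In this sketch \(f_v\) has twice the frame feature dimension, because each frame is concatenated with the global representation before the final weighted average.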
According to the experiments in this paper, using relation attention gives better results than not using it.
We can also see that multiple relation attention layers can be built for video facial expression recognition.