TV-Human Interaction Dataset

The TV-Human Interaction dataset [107] was collected from different movies. It includes 300 videos classified into 5 action classes (Handshake, Highfive, Hug, Kiss, and Negative; see Figure 2.21), where the Negative class contains no interaction. Two hundred of the clips contain one of the four interaction actions, with each action appearing in 50 videos. Negative examples make up the remaining 100 videos. The length of the video clips ranges from 30 to 600 frames. There is a great degree of variation between different clips and, in several cases, within the same clip. The variation includes the number of actors in each scene, their scales, and the camera angle, including abrupt viewpoint changes at shot boundaries. The dataset is split
