Abstract:
This paper presents the development of motion features for accurately extracting the distal segments of human limbs from visual data for human action recognition. Using the depth map provided by the Kinect sensor, motion features are extracted to classify human actions in videos. These features capture the motion of the 3D joint positions of the human body. The 3D joint positions provide precise endpoints for the distal segments of each limb, which are reduced to centroids for efficient recognition. Each limb centroid is described by its angle with respect to the vertical body axis, and together these angles form an action descriptor vector. The descriptor, which represents the position of the torso and four limb segments, is detected and tracked without any manual initialization. It is also invariant to image resolution and video frame rate, making it suitable for a wide range of real-time human tracking applications in surveillance. To evaluate our approach, a public dataset was used for human action recognition. The experimental results show that incorporating these motion features with an SVM classifier is a promising direction for the automated recognition of human actions.
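The descriptor construction can be sketched as follows. This is a minimal illustration only, assuming torso-relative joint coordinates and a 2D projection for the angle computation; the joint names, limb pairings, and function signatures are hypothetical and not taken from the paper:

```python
import math

def limb_angle(joint_a, joint_b):
    """Angle (degrees) between a limb segment's centroid and the
    vertical body axis. Joints are (x, y, z) tuples, assumed to be
    expressed relative to the torso (hypothetical convention)."""
    # Reduce the distal segment's two endpoint joints to a centroid.
    cx = (joint_a[0] + joint_b[0]) / 2.0
    cy = (joint_a[1] + joint_b[1]) / 2.0
    # Angle with respect to the vertical (y) axis in the image plane.
    return math.degrees(math.atan2(cx, cy))

def action_descriptor(skeleton):
    """Build a per-frame descriptor: one angle per limb segment.
    `skeleton` maps hypothetical joint names to 3D positions."""
    limbs = [("shoulder_l", "hand_l"), ("shoulder_r", "hand_r"),
             ("hip_l", "foot_l"), ("hip_r", "foot_r")]
    return [limb_angle(skeleton[a], skeleton[b]) for a, b in limbs]
```

Per-frame descriptors like this could then be fed to an SVM classifier (e.g. scikit-learn's `sklearn.svm.SVC`) for action recognition, as the abstract suggests.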