Action Transformer: Model Improvement and Effective Investigation with MPOSE2021 and MSR Action 3D Datasets
DOI:
https://doi.org/10.37934/araset.62.1.7689
Keywords:
Action Transformer, Human Action Recognition, Skeleton Data, Deep Learning
Abstract
The Action Transformer (AcT) model has shown promising results in action recognition tasks. However, achieving high accuracy on complex and dynamic action sequences remains a challenge. In this paper, we present an approach that improves the accuracy of the AcT model by increasing the model's complexity, validated on the MPOSE2021 and MSR Action 3D datasets. Our method enhances the AcT model with a multi-level feature fusion technique: we introduce additional convolutional and pooling layers that capture more detailed spatial and temporal information from the input data. This increases the model's ability to discriminate between subtle action variations and improves its accuracy in recognizing complex actions. We evaluate the proposed approach through extensive experiments on both datasets. The results show that the enhanced AcT model achieves higher accuracy than the baseline AcT model and outperforms existing state-of-the-art methods. Our method captures the intricacies of complex actions and provides more accurate predictions.
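The abstract does not give the layer configuration, so the following is only a minimal sketch of what multi-level feature fusion with convolution and pooling could look like on a skeleton sequence. It is not the authors' implementation: the function names, the depthwise temporal kernel, the pooling size, and the choice to fuse levels by concatenating time-averaged features are all assumptions made for illustration.

```python
from typing import List

Sequence = List[List[float]]  # shape: frames x features (e.g. flattened keypoints)


def conv1d_time(x: Sequence, kernel: List[float]) -> Sequence:
    """Valid 1D cross-correlation along time, applied to each feature independently."""
    k = len(kernel)
    channels = len(x[0])
    return [
        [sum(kernel[i] * x[t + i][c] for i in range(k)) for c in range(channels)]
        for t in range(len(x) - k + 1)
    ]


def max_pool_time(x: Sequence, size: int) -> Sequence:
    """Non-overlapping max pooling along the time axis."""
    channels = len(x[0])
    return [
        [max(frame[c] for frame in x[start:start + size]) for c in range(channels)]
        for start in range(0, len(x) - size + 1, size)
    ]


def global_average(x: Sequence) -> List[float]:
    """Average each feature over time to get a fixed-length summary."""
    n = len(x)
    return [sum(frame[c] for frame in x) / n for c in range(len(x[0]))]


def multi_level_features(
    x: Sequence, kernel: List[float], pool: int, levels: int
) -> List[float]:
    """Hypothetical fusion: concatenate time-averaged features from every level."""
    fused = global_average(x)  # level 0: summary of the raw input
    current = x
    for _ in range(levels):
        if len(current) < len(kernel):
            raise ValueError("sequence too short for the requested number of levels")
        current = max_pool_time(conv1d_time(current, kernel), pool)
        if not current:
            raise ValueError("sequence too short for the requested number of levels")
        fused += global_average(current)
    return fused


# Toy input: 8 frames with 2 features each (assumed values, not real pose data).
frames = [[t, 2 * t] for t in range(8)]
features = multi_level_features(frames, kernel=[0.5, 0.5], pool=2, levels=2)
```

In a setup like this, the fused vector would be passed to the Transformer encoder or classification head in place of, or alongside, the raw per-frame features. A real model would use learned kernels in a deep learning framework rather than the fixed averaging kernel shown here.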