INPROCEEDINGS

SkillsInterpreter: A case study of automatic annotation of flowcharts to support browsing instructional videos in modern martial arts using large language models

Proceedings of the Augmented Humans International Conference 2024 | pages 217–225, April 2024

Author

Oomori, Kotaro and Ishiguro, Yoshio and Rekimoto, Jun

Abstract

Video is increasingly used to learn physical skills such as modern martial arts. Such skills require decisions that depend on the situation, for example selecting an appropriate off-balance technique based on the position of the opponent’s feet. However, existing video interfaces do not support browsing based on the structure of a physical skill, that is, the situations that arise and the decisions that should be made in each. We hypothesize that structure-based browsing can help users comprehend skills. In this paper, we propose SkillsInterpreter, a structure-based video browsing method that uses large language models (LLMs) to automatically generate a flowchart from a skill instruction video that contains speech. With the generated flowchart, users can explore desired scenes, check the current chapter, and review the skill structure while watching the video. Our study included interviews with experts and evaluations with learners in modern martial arts. The results of these two studies suggest that SkillsInterpreter can support video-based skill learning in modern martial arts, especially in Brazilian Jiu-Jitsu, which requires situation-specific decision making.
