Abstract: |
Virtual Learning Environments (VLEs) dedicated to learning gestures are increasingly used in sports, surgery, and every other domain where accurate and complex technical skills are required. In such environments, a student can learn by observing and imitating a recorded task, performed by the teacher and replayed through a 3D virtual avatar. In addition, the student’s performance can be automatically compared to that of the teacher by considering kinematic, dynamic, or geometric properties. The motions of the body parts or the manipulated objects can be considered as a whole, or temporally and spatially decomposed into a set of ordered steps to make the learning process easier. In this context, CheckPoints (CPs), i.e., simple 3D shapes acting as “visible landmarks” through which a body part or an object must pass, can help define those steps. However, manually setting CPs can be tedious, especially when they are numerous. In this paper, we propose a machine learning-based system that predicts the number and the 3D positions of CPs, given some demonstrations of the task to learn in the VLE. The underlying pipeline uses two models: (a) the “window model” predicts the temporal parts of the demonstrated motion that may hold a CP, and (b) the “position model” predicts the 3D position of the CP for each part predicted by (a). The pipeline is applied to three learning activities: (i) glass manipulation, (ii) geometric shape drawing, and (iii) a dilution process in biology. For each activity, the F1-score is equal to or higher than 70% for the “window model”, while the Normalized Root Mean Squared Error (NRMSE) is below 0.07 for the “position model”. |