In Partial Fulfillment
of the Requirements for the Degree of
Bachelor of Science in Computer Science
Major in Computer Systems Engineering
by
Mella, Raphael Enrico H.
People learn skills that are essential to everyday life by observing and
noticing their surroundings. This kind of learning relies heavily on the sense of sight.
However, this does not apply to the blind, as they cannot use their sense of sight to
learn the way sighted people do. Because the blind cannot see and observe their surroundings,
they also tend to feel insecure outside their homes and within society.
Despite the problems the blind face in learning skills the usual way,
technology can take advantage of another important sense that the blind will
most likely have: hearing. Applications or systems that use movement sonification, the
generation of sounds through movements, to teach skills such as dancing or sports
can be a useful tool in helping the blind.
1.1 Overview
The sense of sight is one of a person’s most important senses. Normally,
people take advantage of their sight to learn new skills by noticing and
observing other people [12]. This is not possible for a person who is blind.
Losing vision or being visually impaired can be a major obstacle to learning
essential skills in this way. Even so, those who are visually impaired can
still learn through their other senses, in particular their sense of hearing.
Another important issue for the visually impaired is
insecurity outside their homes and within society [14]. One way to overcome this
insecurity is to learn a martial art. This study focuses on one specific martial art:
Taekwondo. A system that applies movement sonification as auditory feedback will help
the visually impaired learn Taekwondo using their sense of hearing.
Recent technologies such as Microsoft Kinect have been used to capture and record
human movements in real time at low cost. Building on these developments, feedback
produced by sonifying 3D motion paths can serve as the reference for how a specific
movement should be executed [5].
When it comes to learning martial arts, every movement is executed in detail, from its
ready position to its end point. When teaching, a coach would normally demonstrate a
single movement in several steps, a single pattern, or a series of sequences in one
performance. These movements can be evaluated differently depending on the
kind of results a researcher is looking for. Criteria for evaluating movement include speed,
direction of energy, posture, and accuracy. A study that focuses on complex movements
from Karate evaluated movements based on speed, rhythm, quality, and posture [11].
Blind people cannot learn martial arts the way sighted people do. While sighted people
learn by observing a coach’s demonstration, blind people cannot,
as they cannot see. Despite this problem, blind people are still able to use their sense of
hearing. A system that takes advantage of a blind person’s sense of hearing will be
a useful tool for learning a new skill like Taekwondo.
The study aims to develop a system that applies movement sonification to Taekwondo
hand movements using Kinect.
To build the dataset, Kinect Sensors will be used to capture and track movement.
Algorithms will be applied to extract features from Kinect data. Atomic movements will
be identified from a continuous stream of movement.
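One possible way to identify atomic movements in a continuous stream is to split the stream
wherever the tracked joint comes to rest. The sketch below is only an illustration under that
assumption: this proposal does not fix a segmentation method, and the function name
`segment_movements`, the speed threshold, and the minimum segment length are all hypothetical.
Python is used here purely for illustration.

```python
def segment_movements(speeds, threshold, min_length=3):
    """Split a stream of per-frame joint speeds into candidate atomic movements.

    Returns (start, end) frame index pairs, with `end` exclusive, for every run
    of at least `min_length` consecutive frames whose speed exceeds `threshold`.
    The threshold rule is an assumption, not the method fixed by the study.
    """
    segments = []
    start = None
    for i, speed in enumerate(speeds):
        if speed > threshold and start is None:
            start = i  # movement begins
        elif speed <= threshold and start is not None:
            if i - start >= min_length:
                segments.append((start, i))  # movement ends, long enough to keep
            start = None
    # Close a movement that is still running when the stream ends.
    if start is not None and len(speeds) - start >= min_length:
        segments.append((start, len(speeds)))
    return segments


if __name__ == "__main__":
    stream = [0, 0, 1, 2, 1, 0, 0, 3, 0, 1, 1, 1]
    print(segment_movements(stream, threshold=0.5))
```

In practice the threshold and minimum length would have to be tuned against the expert's judgment
of where one Taekwondo hand movement ends and the next begins.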
The research will perform discriminant function analysis, specifically forward stepwise
analysis, on the different kinds of sound timbre provided by the sonification application
to be used, xSonify, and choose the four (4) most discriminable sound timbres among
those available.
The experiments will include listening tests to train the subjects’ sound
perception. The experiments will also ensure that no harm comes to
the participants, and will anticipate different kinds of reactions or behaviors
from the subjects.
At least 30 participants will be recruited for the research. A Taekwondo expert will also be
recruited to judge movement correctness and to set the thresholds, or degree of deviation,
allowed for each movement.
1.5 Resources
1.5.1 Hardware
1.5.1.1 Microsoft Kinect 2.0
A laptop will be used for developing the program and saving the sound files
generated by movement.
1.5.2 Software
1.5.2.1 Custom application program interface for movement and sound
tracing
This application will use Microsoft Visual Studio as its main platform and will
use Kinect features to record and save data from subjects. The software will be
programmed in C#.
The objective is to test how accurately the system tracks and records movement.
Users will be asked to perform repetitions of one movement. All data gathered by the IMU
and Kinect sensors will be recorded and analyzed.
The objective is to test whether each movement produces its assigned sound. The user of the
prototype will play a reference sound assigned to one movement. The user will then
attempt the movement and listen for differences between the desired reference and their own
performance. The mapping of movement parameters will vary across different
sound timbres, with a choice of varying either volume or pitch.
Children normally use their sense of sight to learn skills that are
essential to life by exploring and noticing their surroundings. The blind, however, need to
acquire certain skills that enable them to move around easily [12]. Martial
arts or combat systems are said to provide a way to develop the body and mind of a blind person [14].
From a psychological point of view, training in martial arts or self-defense develops
mental endurance, self-esteem, and self-confidence, to the point that it can change one’s
attitude towards serenity in life. This allows the blind person to develop mental
toughness [15]. The proposed system will guide the visually impaired through a set of movements that
could help them become more active and secure in the outside world.
Four important phases will be conducted over the duration of the study. In the
first phase, further research will be done, and Kinect and its Software Development Kit,
the Processing language, and the SuperCollider language will be explored. This will be essential
for creating the desired program.
The third phase of the study, which is also the main task for CTRESME and THSCE-1,
involves creating the program, building a dataset, and interfacing the Inertial Measurement Unit.
This phase is about creating a program that captures movements, their sounds,
and labels for these movements, which together form the dataset of the study. The program will
also save this dataset for the blind subject to attempt to replicate. The program will integrate
the Kinect SDK with Processing and SuperCollider. The dataset produced in the
experiment will be used for teaching.
The last phase of the study involves recruiting a blind participant and using the
dataset produced by the program. The blind subject will listen to the sounds saved by the
program and attempt to follow them. Initially, the teacher will physically guide the
subject’s body to familiarize them with
how the movement should feel.
The goal of this paper was to analyze, measure, and evaluate the quality of karate movements.
The system evaluated the performances of several karate students with varying levels of experience
using a motion capture setup. Each subject was tasked to perform kata and was
evaluated by the system based on kinematic and geometric features such as posture and how the
limbs are used. The study evaluated recordings of actual performances rather than the basics of karate.
Examples of measures were provided to evaluate performance quality.
Several kata performances were recorded with the Qualisys motion capture system and passed
through Kinect to acquire the motion. The recordings are evaluated based on temporal
synchronization, spatial alignment, and comparison of joint positions and joint velocities of two
"dance signals". All of these evaluations are done through quaternionic signal processing. Figures 2-1
and 2-2 portray 3D models produced from MoCap data of kata performances.
Figure 2-1. Examples of postures from kata Heian Yondan (MoCap data) [11]
Figure 2-2. Examples of postures from kata Bassai Dai (MoCap data) [11]
After processing, the Automatic Identification of Markers is used to detect all trajectories found
in the records of the Qualisys Track Manager (QTM). From these trajectories, formulas are derived
that describe geometric features of the movements, as shown in Equations 2-1
and 2-2. Wrong postures can be detected from changes in the angle between the shoulders and the spine:
Equation 2-1. Angle equations for θ for each shoulder and Φ for the overall angle between shoulders [11].
Equation 2-2. Proportion of wrong postures in performance, where Nwrong is the number of frames showing
incorrect angles and N is the total number of frames [11].
Figure 2-3 illustrates several shoulder angles at different points of the body, involving
both shoulders, the neck, and the spine. This figure supports Equation 2-1.
The first two equations calculate the distance from the spine to each
shoulder, with the neck as the vertex. The next two equations calculate the shoulder
angles from the first two. The last equation calculates the overall angle Φ of the
shoulders and neck. The average of Φ is then calculated over time, and
the system flags every frame that shows an incorrect angle. Angles
in the interval [0.9 × average Φ, average Φ] and angles larger than the average Φ are accepted;
any angle outside these criteria is rejected. Equation 2-2 then calculates the proportion of frames
showing an incorrect angle between shoulders and neck.
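The acceptance rule and the proportion of wrong postures described above can be sketched as
follows. This is a minimal illustration, assuming the per-frame angle Φ has already been computed
from the marker data; the function name `wrong_posture_proportion` and the sample values are
hypothetical and are not taken from [11].

```python
from statistics import mean


def wrong_posture_proportion(phi_angles, tolerance=0.9):
    """Return N_wrong / N for a sequence of per-frame shoulder angles.

    A frame is accepted when its angle is at least `tolerance` times the
    average angle, which covers both the interval [0.9 * average, average]
    and every angle larger than the average.
    """
    if not phi_angles:
        raise ValueError("at least one frame is required")
    threshold = tolerance * mean(phi_angles)
    n_wrong = sum(1 for phi in phi_angles if phi < threshold)
    return n_wrong / len(phi_angles)


if __name__ == "__main__":
    # Hypothetical angles in degrees; only the fourth frame falls below 0.9 * average.
    frames = [170.0, 172.0, 168.0, 120.0, 171.0]
    print(wrong_posture_proportion(frames))
```

Because the threshold depends on the average of the same recording, a performance that is bent
throughout would lower the average and could pass unnoticed; a fixed reference angle from an
expert would avoid this, at the cost of needing calibration.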
Equation 2-2. A set of equations to show if the back is straight by using variables of both sides of the back hip and the
spine [11].
The set of equations in Equation 2-2 calculates the distances between the spine and both
sides of the hip, using Figure 2-5 as reference. Variables x4, y4, and z4 represent the midpoint between
both sides of the hip. A decrease in the variable Dist indicates a flaw in
posture, such as the back being bent in any direction.
Not all movements can be detected at all times, however; gaps from undetected movements
have to be filled manually with the interpolation algorithms provided by QTM.
The system also calculates kinematic features such as acceleration and biomechanical efficiency of
movement. Acceleration is computed from wrist markers using the MoCap toolbox, while
biomechanical efficiency of movements, namely punches and kicks, is essentially a balance between
acceleration and deceleration. Figure 2-5 represents the peak velocity of a punch, divided
into acceleration and deceleration. The system then uses these variables in calculating maximum
velocity. Maximum velocity is calculated based on the time spent on acceleration and on deceleration,
as shown in Equation 2-3. If the absolute value of the difference between both variables gets closer
to 1, it shows a better balance between acceleration and deceleration.
Equation 2-3. Maximum velocity; tacc is for time spent on acceleration; tdec is time spent on deceleration.
The last equation considered in the system measures the synchronization between pairs of joints.
This is computed from velocity calculated using the MoCap toolbox and from velocity peaks extracted on
the evaluated markers. The methods used measure synchronization and time-delay
patterns between signals. Larger synchronization indices, computed as in Equation 2-4 from the
peaks shown in Figure 2-6, indicate more synchronized movements.
Figure 2-6. Peak velocity values (blue for right shoulder markers and red for right wrist markers) [11]
Equation 2-4. An equation for evaluating synchronization between right wrist and right shoulder using velocity peaks in
Figure 2-6 [11].
Figure 2-7. Board setup for participants as basis for evaluation in bimanual motor skills
The objective of this paper is to test whether melodic sonification offers a benefit over
rhythmic sonification in consecutive tasks, and to check for any factor
in good performance beyond an initial maintenance test. The study shows how melodic sonification
can become a basis for learning certain motor-based abilities.
The study used a sample of 60 participants, mostly undergraduates, post-graduate
researchers, and staff. These participants were asked about any musical or dance experience they
had. A bespoke wooden board with two cut regular polygons was used for evaluation and for the
participants to interact with, as shown in Figure 2-7. Sennheiser headphones were used to highlight
auditory features in the experiment.
Each participant was allocated to one of three experimental conditions and went through seven
stages in the experiment: familiarization, pre-test, practice, short-term retention, 24-h retention, 24-h
post-replay retention, and transfer. The purpose of the entire test is to search for bimanual timing trio
error. The inter-corner timing intervals across hands were compared in each trial as a sequence
of ratios. Figure 2-8 represents the feedback on the participant’s performance in every trial.
Figure 2-9. A multi-layered structure as a basis for assessing movement qualities. [1]
This paper demonstrates that intra-personal synchronization can assist in evaluating movements
based on three expressive qualities, namely fluidity, rigidity, and impulsivity. Figure 2-9 shows the
structure of how movements are assessed.
Professional dancers created short performances for recording. They were asked to make several
repetitions of predefined movements, each in a particular expressive quality. Another
task was to provide their own choreography in which they could better show the expressive
qualities. Figure 2-10 shows the dancer’s skeleton in 3D when performing.
Figure 2-12. Markers and rigid bodies as points for movement detection. [1]
2.1.4 Movement Sonification for the Diagnosis and the Rehabilitation of Graphomotor
Disorders [6]
Figure 2-13. A chart representing the duration of the experiment, which involves evaluation of participants in the
form of diagnostic tests (initial and post-training) and a development stage. [6]
Seven children with dysgraphia were evaluated with the Beknopte Beoordelingsmethode voor
Kinderhandschriften (BHK) test, followed by a handwriting rehabilitation program. Each child was evaluated
over a span of four weeks in three stages: a pre-test, training with auditory feedback, and a post-test.
They were tasked to write a specific French sentence in script.
As a basis for the experiment, three variables were chosen: instantaneous pressure,
instantaneous tangential velocity, and supernumerary velocity peaks. To measure tangential velocity and
pressure, participants used a graphic tablet. Supernumerary velocity peaks were calculated
through the Signal-to-Noise velocity peak difference (SNvpd), one of the two kinematic variables
(the other being movement time), which is based on the difference in velocity peaks between
signals filtered at 5 Hz and 10 Hz.
Figure 2-14. Average of Movement Time (MT) and movement fluency (SNvpd) in loop production. [6]
Figure 2-15. Average of movement time and fluency for the sentence writing tests. [6]
Table 1 represents the results of the handwriting experiment based on legibility and speed. The
results of the experiment show that movement sonification improved the handwriting skills of
children with dysgraphia.
This paper was based on the hypothesis of Alexander Truslit [40], and compares freestyle
movement to a fixed set of movements. All movements were recorded with a 12-camera MoCap system.
Truslit’s theory, which the study investigates, concerns the connection between the inner shape and
motion of music and the perceptual processes of listeners moving with music.
The data in this study was sonified through the continuous parameter mapping method
shown in Equation 2-1, applied as follows: finger movement along the y-axis was mapped
to the pitch of a continuous synthesizer, while movement along the x-axis was mapped to the stereo
panning of the same sound. Python and SuperCollider [39] were used to program the sonification sequences.
$$ s(t) = \sum_{i=1}^{N} f\big(g(\vec{x}_i),\, t\big) $$
The function s(t) computes a q-channel sound signal, with parameter t for time and the vectors
$\vec{x}_1$ to $\vec{x}_N$ forming a
dimensional dataset. The parameter g is for the range of the dimensional dataset. For an ideal linear
scaling of a synthesis parameter, the minimum and maximum of the attributes can be applied as shown in
Figure 2-16 [18] and in Equation 2-2. This sonification technique is also known as sonic
scatter plots or nth order parameter mapping [18]. Figure 2-17 represents a simpler way to interpret
how sonification mapping is done, showing each parameter in different target domains.
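As an illustration of parameter mapping, the sketch below maps each data value linearly from its
observed range onto a frequency range and sums one sine partial per value, in the spirit of
s(t) = Σ f(g(x_i), t). It is a simplified, single-channel example under stated assumptions: the
choice of sine partials, the frequency range of 220 to 880 Hz, and the names `linear_map` and
`sonify` are hypothetical and are not taken from [18].

```python
import math


def linear_map(x, x_min, x_max, p_min, p_max):
    """Linearly map x from [x_min, x_max] to [p_min, p_max], clamping at both ends."""
    if x_max == x_min:
        return p_min
    u = (x - x_min) / (x_max - x_min)
    u = min(1.0, max(0.0, u))
    return p_min + u * (p_max - p_min)


def sonify(data, duration=0.5, sample_rate=8000, f_min=220.0, f_max=880.0):
    """Return (frequencies, samples) for a one-channel parameter-mapping sonification."""
    lo, hi = min(data), max(data)
    # g(x_i): scale each data value into the synthesis parameter range.
    freqs = [linear_map(x, lo, hi, f_min, f_max) for x in data]
    samples = []
    for k in range(int(duration * sample_rate)):
        t = k / sample_rate
        # Sum of f(g(x_i), t) over all data items, normalized to stay within [-1, 1].
        value = sum(math.sin(2.0 * math.pi * f * t) for f in freqs)
        samples.append(value / len(freqs))
    return freqs, samples


if __name__ == "__main__":
    freqs, samples = sonify([0.0, 5.0, 10.0])
    print(freqs)
```

A sigmoidal transfer function, like the alternative mentioned for Figure 2-16, could replace
`linear_map` without changing the rest of the sketch.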
Figure 2-16. Linear transfer function (black) and alternative sigmoidal mapping (red) [17].
Figure 2-17. A simpler representation of mapping in sonification, sending each parameter to a target domain.
Two tests were conducted, a performance test and a perception test, in which about twenty-six
right-handed participants took part.
In the performing test, three of Truslit's original records were divided into twenty-three-second
segments and were used in the first and third phase of recording, while a 7-second segment was used
in the second phase.
The first test, the performing test, was broken down into two main blocks, each having
three movement attempts paired with diverse musical stimuli. The stimuli are
based on several musical recordings and chords. The purpose of the performing test is to explore
Truslit’s idea regarding prototypical musical gestures by comparing free movements to movements
executed alongside Truslit’s original records.
The second test is a perception test that used multi-modal self-other recognition tasks
with four display conditions based on the results of the first study:
animated visual point-light displays, data sonifications, sequences of
images of movement flow, and combined visual point-light and auditory displays. Afterwards, a parameter
called d-prime is calculated as the z-transformed hit rate minus the z-transformed
false-alarm rate. The purpose of d-prime is to analyze judgement sensitivity in the self-other
recognition tasks. These tasks were based on the movement trajectory data
recorded in the first test.
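The d-prime measure can be computed from response counts as sketched below, using the standard
definition d' = z(hit rate) − z(false-alarm rate). The clamping of rates of exactly 0 or 1 to
1/(2N) and 1 − 1/(2N) is one common convention and is an assumption here; the reviewed study does
not state how it handled such cases.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    n_signal = hits + misses
    n_noise = false_alarms + correct_rejections
    if n_signal == 0 or n_noise == 0:
        raise ValueError("both signal and noise trials are required")

    def clamped_rate(count, total):
        # Rates of exactly 0 or 1 have an infinite z-score, so keep them
        # inside [1/(2N), 1 - 1/(2N)] (assumed convention).
        rate = count / total
        return min(1.0 - 0.5 / total, max(0.5 / total, rate))

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(clamped_rate(hits, n_signal)) - z(clamped_rate(false_alarms, n_noise))


if __name__ == "__main__":
    # Hypothetical counts: 75% hits and 25% false alarms.
    print(round(d_prime(75, 25, 25, 75), 3))
```

A d-prime of zero means the listener could not tell their own movement from another person's,
while larger values indicate better self-other discrimination.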
The research concluded that the repeated-measures condition in the first test revealed high
consistency. This means that, across trials, the performances were consistent with Truslit’s
original records discussed previously in the performing test. In the second test, the results
demonstrated a perceptual basis formed by human movements and established beyond individual
percepts of music, though only in terms of visual perception processes.
This paper aimed to apply audiomotor information processing to recognize perceptual qualities
of audition and afterwards apply multisensory perception alongside the procedure of handling
audiovisual information and related beneficial habits.
In a training period of 3 weeks and an extension to about 9 weeks, 48 participants were tasked
to learn basic indoor rowing as shown in Figure 2-18.
Figure 2-18. Sequence of tasks for the experiment, broken down into pre- and post- tests for strength and
technique, training, and retention. [8]
The total sample was divided into subsamples, and each participant then went
through a training procedure defined by its instruction and real-time feedback conditions: three
visuals, regular, audio, and sonified. A rowing ergometer was used, fitted with four sensor systems:
resistance gauge for grip resistance, two for foot pressure, and two for mental encoding, each for
determining pull-out length and sliding seat position. All of the movements were recorded at 100 Hz.
Distance values between the model's technique and the participants' individual technique were evaluated using
dynamic time warping. The assessment shows that real-time movement sonification should be a
useful tool for developing motor rehabilitation. Figure 2-19 shows each instruction and its signal based
on pressure level.
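Dynamic time warping compares two recordings of the same movement even when they are performed at
different speeds. The sketch below is the textbook dynamic-programming form for one-dimensional
sequences with an absolute-difference cost; it is meant only to illustrate the idea, and the
reviewed study [8] may have used a different cost function or constraints.

```python
def dtw_distance(model, attempt):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(model), len(attempt)
    # cost[i][j] is the best cumulative cost of aligning model[:i] with attempt[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(model[i - 1] - attempt[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # model sample repeated
                cost[i][j - 1],      # attempt sample repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0]
    slower = [1.0, 2.0, 2.0, 3.0]  # same shape, performed more slowly
    print(dtw_distance(reference, slower))
```

Because the alignment may repeat samples, a slower but otherwise identical performance yields a
distance of zero, which is why the method suits comparing a learner's technique with a model's.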
Figure 2-19. Three states of visual and auditory stimuli, each showing instruction and signals for pressure level. [8]
This study intends to develop the idea of connecting observation to body developments by
applying an embodied cognition viewpoint, and to use this knowledge to extend the relationship
between music and movement to non-musical sounds.
As a basis for the study, three sound models were used in three different tests to examine
various aspects of sound and body movement. All sound models use white noise. Velocity
magnitude is logarithmically scaled to signal amplitude, so that no sound is produced
when there is no movement. Figure 6-1 shows 2 minutes of spectral content for each sound model.
Each sound model is described as follows:
Table 2-2. Summary of details of utilized sound models in the study [9].
Sound Model | Type of filter | Velocity Magnitude | Q-factor | Amplitude modulation
1 | MaxMSP resonance filter with “resonant” mode | 50-1100 Hz | 1.8-4.0 | 3 Hz
2 | MaxMSP resonance filter with “resonant” mode | 100-900 Hz | 0.2-0.3 | 18 Hz
3 | Band-pass filter | 100-3000 Hz | 0.02-0.6 | Uses triggering peaks through MaxMSP curve~
The first experiment focuses on using the sound models to evaluate characteristics of children
moving freely in a room. The second experiment focuses on looking for distinguishable factors when
it comes to another set of participants attempting to map perception points for motion-related
activities.
The third experiment tries to match the respective sound models to drawings that show
the sounds recorded in the first experiment. All of this involves several equations for indices of energy,
smoothness, and directness. More specifically, the third experiment involves a mapping of body
motion qualities from one sound to another (sound to sound visualization). To address this
problem, a three-alternative forced-choice experiment was applied.
In the experiment, passive reflective markers were tracked using OptiTrack Prime 41, which
acquires 180 frames per second. Movements were tracked and recorded by the OptiTrack
Motive software and sent through NatNet over LAN to a second computer. The study also
uses custom software in C++ to retrieve the data streamed by NatNet. On this second computer,
important features of the recording are extracted and calculations are made, after which
the original data is compressed and sent as Open Sound Control (OSC) messages. Logging,
sound production, and spatialization are then done on a third computer. Logging is done through a custom
C++ program that stores the information received over OSC and timestamps each item with when it
was delivered. Sound production, spatialization, and playback of the sound models shown in
Figure 2-21 are done through a Max/MSP patch.
Sound models can be manipulated for several reasons: characteristics of movement can be
shown through the sounds it creates; varying sounds can be classified as a certain motion quality;
stimuli involving only sounds can convey a potent property of movement; and movement-generated
sounds can be illustrated visually for other people to observe.
This study aimed to investigate if an auditory channel is a possible tool to perceive expressive
movement qualities in dance and convey them through sounds. Five (5) sound models were evaluated
in two separate experiments, wherein one was a web-based perceptual rating experiment where the
evaluation of the recording is solely based on the participants, while the second one was an interactive
one.
In all five (5) sound models, increase in energy was mapped to an increase in amplitude, from
complete silence at zero energy.
The first two sound models focus on inharmonicity tension as a way to show lack of fluidity.
The next two models draw on qualities of physical interactions, using amplitude and spectral tilt to express
energy; the 3rd is artificial and electronic, while the 4th resembles wind gusts.
The last sound model combines the other four sound models and tests harmonic pitch
sensitivity and the physical interpretation of spectral slope and amplitude, and also includes a noise generator.
Increase in fluidity increases the Q-factor of the filter, which leads to the production of a sinusoidal wave.
Decrease in fluidity widens the filter, which gives a noisier output, and adds a detuning component
independent to all filters, resulting in fewer harmonics.
In the second experiment of the study, video synchronization and sonification of the data were
done using custom video playback software in C++ built on openFrameworks. Sound models were
controlled by the energy and fluidity values extracted from the dancer’s movement. The program
provides six sliders for the subjects to adjust, which control the following aspects of the sound
model:
1 – Quantization step of the center frequencies of the band-pass filters, from continuous to
steps of a minor third.
5 – Echo effect
The ideal sound to express fluidity is continuous, pitched and melodic in a comfortably low
register and pleasing to listen to.
This paper aims to find an expressive quality of movement from a subjective perspective, based
on how an observer defines it. This study creates an initial multimodal repository as a basis for
movement qualities.
Several multimodal datasets that involve dance movement were used. Four types of movements
were used: breathing, jumping, expanding, and reducing. The movements are then evaluated
based on the following factors: verticality, leg opening, weight shifting, recurrence, and repetition of
motion. The study applies several strategies for collecting feedback from dance professionals and
choreographers, and conducts experiments to test the feasibility of the implemented theory
at open and scientific occasions.
For synchronizing records, playbacks, and multimodal data analysis, a DANCE software
platform will be used. Participants perform under a Qualisys MoCap system with 13 cameras. Other
tools to be used are headsets for respiratory sounds such as breathing, five IMU units placed on certain
parts of the body, and two video cameras for recording video signals.
Figure 2-22. Left and right ankle data synchronized from x-OSC IMU (black color) and the MoCap (grey
color) (Camurri et.al, 2016).
Figure 2-24. Synchronized audio volume of respiration (black) and full-body kinetic energy (Camurri
et.al., 2016)
Figure 2-22 to Figure 2-24 show signal synchronization. Figures 2-21 and 2-22 show
acceleration computed from markers of the wrists (grey color), together with the absolute acceleration
computed from the IMU sensors near those markers (black color). Figure 2-23 corresponds
to respiration and full-body kinetic energy for an impulsivity trial.
The repository in this study is under further development as dance professionals are willing to
assist in its development. The study is working on studies for perception.
This study aims to test whether real-time feedback is an effective therapy for stroke by testing the motor
abilities of the paretic upper limb.
The participants for the study are divided randomly into two rehabilitation groups, one for
sonification and one for sham-acoustics. Figure 7-1 shows the setup to be used for the experiment.
Both groups receive a training program for the upper body, delivered in several sessions over a
number of weeks as movement therapy for the intercession phase. An MTx miniature 3DOF inertial orientation
tracker is used for tracking arm movements.
For the sonification process, the position of the wrists is calculated using spherical coordinates.
Postures are determined using an azimuth angle, elevation angle, and radial distance between wrist and
the starting point of the coordinate system shown in Figure 2-25. This data is sent to PureData and
CSound for sonification.
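The conversion from a wrist position to the three spherical coordinates can be sketched as below.
The axis convention (z pointing up, azimuth measured in the x-y plane from the x-axis) and the
function name `wrist_to_spherical` are assumptions made for illustration; the reviewed study does
not specify its coordinate frame in this summary.

```python
import math


def wrist_to_spherical(x, y, z):
    """Convert a wrist position relative to the origin into spherical coordinates.

    Returns (azimuth_deg, elevation_deg, radius). Assumed convention: z points up,
    azimuth is measured in the x-y plane from the x-axis, and elevation is the
    angle above that plane.
    """
    radius = math.sqrt(x * x + y * y + z * z)
    if radius == 0.0:
        return 0.0, 0.0, 0.0  # wrist at the origin: angles are undefined, report zeros
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.asin(z / radius))
    return azimuth, elevation, radius


if __name__ == "__main__":
    # Hypothetical wrist position in meters, raised and slightly to the side.
    print(wrist_to_spherical(0.3, 0.1, 0.4))
```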
Frequency modulation is the basis of the sonification technique and is applied to synthesized
sounds that use sawtooth waveforms, with the following parameters. The carrier frequency is
200 Hz for the left arm and 300 Hz for the right arm in a stable position (no movement). Elevating
the arms increases the sound frequency by a maximum of 200 Hz, which is reached by
stretching both arms above the head. Panning and interaural intensity difference are determined by
the azimuth angle. Sound brightness is modified by the radial distance through a logarithmic change of the
frequency modulation index between 0 and 0.15. Lastly, absolute wrist velocity defines
sound amplitude and loudness: velocity is directly proportional to sound amplitude, so
a high arm velocity yields a high sound amplitude. The sounds of arm movements on each side
are produced and modified independently of each other and are delivered to patients through wireless
headphones. In the rest position, no sound is produced. The final output is that arm movements
produce sounds similar to ocean waves not altered by movement trajectories.
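The mapping from arm features to synthesis parameters can be summarized in a short sketch. The
carrier frequencies, the 200 Hz maximum shift, and the index range of 0 to 0.15 come from the
description above; everything else is an assumption for illustration, including the
normalization of all inputs to the range 0 to 1, the pan scaling of azimuth over ±90 degrees,
the particular logarithmic curve, and the function name `sonification_parameters`.

```python
import math

BASE_CARRIER_HZ = {"left": 200.0, "right": 300.0}  # rest-position carrier per arm
MAX_ELEVATION_SHIFT_HZ = 200.0                     # reached with the arm above the head
MAX_MODULATION_INDEX = 0.15


def clamp(value, low, high):
    return max(low, min(high, value))


def sonification_parameters(arm, elevation, azimuth_deg, radius, speed):
    """Map normalized arm features to synthesis parameters.

    `elevation`, `radius`, and `speed` are assumed to be normalized to [0, 1].
    """
    e = clamp(elevation, 0.0, 1.0)
    r = clamp(radius, 0.0, 1.0)
    v = clamp(speed, 0.0, 1.0)
    return {
        # Raising the arm raises the pitch by up to 200 Hz.
        "carrier_hz": BASE_CARRIER_HZ[arm] + MAX_ELEVATION_SHIFT_HZ * e,
        # Azimuth sets the stereo position: -1 is fully left, +1 fully right (assumed scale).
        "pan": clamp(azimuth_deg / 90.0, -1.0, 1.0),
        # Brightness: logarithmic growth of the modulation index from 0 to 0.15 (assumed curve).
        "modulation_index": MAX_MODULATION_INDEX * math.log1p(9.0 * r) / math.log(10.0),
        # Loudness is proportional to wrist speed, so a resting arm is silent.
        "amplitude": v,
    }


if __name__ == "__main__":
    print(sonification_parameters("right", elevation=0.5, azimuth_deg=30.0, radius=0.8, speed=0.6))
```

Keeping the two arms in separate calls mirrors the statement that each side is produced and
modified independently.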
The study wants to test how movement sonification can be an effective cure for stroke
patients. Further studies were said to have results on its effects on acute and chronic stroke.
This study examines how sonification can be used to help a student emulate the complex motion
of a teacher with increasing spatial and temporal accuracy. Sonification was used as a mechanism for
feedback when learning complex 3D motion trajectories in real time.
Figure 2-26. Summary of Smith and Claveau’s approach to sonification (Smith & Claveau, 2014)
While holding the phone using SoundTracer, a teacher will create a reference line first by doing
desired movements. These movements will automatically generate sound, and these sounds will be
the representation of the movement performed in the form of Tracks. This will later on be retrieved
by another person who wants to learn the movements required by the teacher.
All motion is captured by a 3-D tracking device (Kinect) at 30 Hz. This yields a sequence of
3-D points (x, y, z, t) for one joint, where each point is one sample in the
data stream at time t. With this data, features can be computed anywhere along the recorded path.
After the teacher’s movements are recorded with SoundTracer and saved as Tracks, a user who wants to
learn them loads these Tracks and tries to follow the desired track, using
the teacher’s recording as reference. During this process, the user can detect a
possible error in the movement by hearing a difference in the sound while moving. When
the sound matches the desired sound without change, no error has occurred.
As for the comparison in timing, it is calculated based on the student’s progress with the
motion path. The student not only has to follow the motion path, but also to follow the right timing.
The difference in timing will be determined and propagated to the sonification process and the
feedback sound can be modified. If the timing is the same as the teacher’s, then there will be no
difference in sound.
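One way to picture the timing comparison is to locate the student's progress along the teacher's path and compare timestamps at that point. The sketch below is an assumption made for illustration: it uses a nearest-point lookup, and the name `timing_difference` and the sample data are hypothetical. The matching method actually used in the cited study is not reproduced here.

```python
import math


def timing_difference(teacher_path, student_sample):
    """Seconds by which the student leads (negative) or lags (positive).

    teacher_path: list of (x, y, z, t) samples recorded by the teacher.
    student_sample: one (x, y, z, t) sample from the student.
    """
    position = student_sample[:3]
    # Assumed progress measure: the teacher sample closest in space.
    nearest = min(teacher_path, key=lambda p: math.dist(p[:3], position))
    return student_sample[3] - nearest[3]


teacher = [(0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 2.0)]
lag = timing_difference(teacher, (1.1, 0.0, 0.0, 1.5))
```

A result of zero would correspond to the case in the text where the feedback sound is left unchanged.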
Figure 2-27. Envelope curves used for ADSR. Top indicates early attack with long sustain. Bottom is a bell
shaped curve. [5]
For a source sound to be processed by the envelope, it should have two traits: it should be
aesthetically pleasing and it should sustain for a long time. The model has parametric control over
breathing, vibrato, and pitch alongside the ADSR curve in order to control the overall energy of the
output sound.
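An ADSR (attack, decay, sustain, release) envelope can be sketched as a piecewise-linear gain curve that is multiplied with the source sound. The example below is a generic illustration; the segment lengths and sustain level are assumed values, and the bell-shaped variant shown in Figure 2-27 is not modeled.

```python
def adsr(attack, decay, sustain, release, sustain_level):
    """Build a piecewise-linear ADSR envelope.

    attack, decay, sustain, release are segment lengths in samples;
    sustain_level is the gain (0..1) held during the sustain segment.
    """
    envelope = []
    envelope += [i / attack for i in range(attack)]  # rise from 0 toward 1
    envelope += [1.0 - (1.0 - sustain_level) * i / decay for i in range(decay)]
    envelope += [sustain_level] * sustain  # hold
    envelope += [sustain_level * (1.0 - i / release) for i in range(release)]
    return envelope


# Early attack with a long sustain, loosely following the top curve in Figure 2-27.
env = adsr(attack=10, decay=10, sustain=60, release=20, sustain_level=0.7)
```

Each value in `env` would scale one sample of the source sound, so a longer sustain segment keeps the output energy steady for longer.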
Based on the results of the experiment, the SoundTracer is considered very encouraging as an
experimental platform for motion sonification and learning.
Sound-oriented tasks are the main focus of the study. The experiment gives priority to
movement-generated sounds, while auditory-motor loops are regulated by a target sound. Figure 2-28
represents the difference between motion-oriented and sound-oriented tasks.
Figure 2-28. Flow of comparison for both sound-oriented and motion-oriented activities. [3]
In this experiment, subjects listen to a sound to be followed as well as a sound that indicates
incorrect movement. Participants will attempt to recreate a specific sound by moving their dominant
hand from point to point. Participants will have nothing to do with the sounds they produce. The
experiment will come in two phases: Exploration and Adaptation. Exploration is the performing stage.
Subjects perform a set of movements in an attempt to follow the data given as a reference.
Subjects must wait for a beep between movements. In Adaptation, participants are placed in
another situation (in this case, blindfolded) and perform again. Participants will not realize that
the reference velocity is changing.
The setup for the experiment will be as follows: first, subjects will hold a motion interface,
which contains 3D accelerometers and a 3-axis gyroscope represented in Figure 8-2, and will sit at a
table with two points for target sound production. Next, the data produced from point to point will
be sent over a 2.4 GHz band IEEE protocol and transmitted to a computer using Open Sound
Control. Afterwards it will undergo real-time processing of data, sound synthesis using Modalys, and
logging through a Max-based software program. Subjects will listen to sounds using headphones. For
filtering, a resonator will be utilized. Input sounds to be filtered include square signals with
frequencies of 260 Hz, which matches the string's second harmonic, and 910 Hz.
The objective of the sound-oriented task in this experiment is to manipulate a
motion interface that enables continuous sound synthesis. The interface consists of 3D accelerometers
and a 3-axis gyroscope as shown in Figure 10-2. Data is sent to a receiver via the IEEE 802.15.4
protocol and transmitted to a computer using Open Sound Control. A Max-based program is also used,
which handles real-time data processing, sound synthesis, and data logging. The generated sounds are
heard through earphones.
For the mapping process, the angular velocity around the Z-axis of the gyroscope will be used
as input. Sound generated is synthesized from the difference between the profiles of both performer
and the desired reference, and also varies based on certain conditions. The profile generated is a bell
shape curve similar to the reference profile in Figure 2-30 that roughly resembles the velocity profile
usually found when moving the interface from point to point.
Figure 2-30. Reference profile and the associated thresholds enabling the change in the sound qualities
(noise or loud higher harmonic) (Boyer et al., 2016)
The velocity signal is mapped to a sound synthesizer using Modalys. A resonator, which is a
string model, is used to process three different input sound signals: a square signal at a
fundamental frequency of 260 Hz, which matches the second harmonic of the string; another square
signal with a frequency of 910 Hz, matching the 7th harmonic; and lastly pink noise, which has
constant power per octave. The intensity of the inputs will be modulated based on the difference
between the reference and the performer, by increasing harmonics if the difference is positive and
increasing noise if it is negative.
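The mapping above can be sketched as two gains driven by the sign of the velocity difference. This is only an illustration under stated assumptions: the difference is taken as performer minus reference, the gains are clipped to 0..1, and `scale` is a hypothetical tuning parameter. The thresholds of Figure 2-30 and the Modalys synthesis itself are not modeled. The 130 Hz fundamental is inferred from 260 Hz being described as the second harmonic.

```python
STRING_FUNDAMENTAL_HZ = 260.0 / 2  # inferred: 260 Hz is the 2nd harmonic


def input_gains(performer_velocity, reference_velocity, scale=1.0):
    """Return (harmonic_gain, noise_gain), each clipped to the range 0..1.

    Assumed sign convention: difference = performer - reference.
    A positive difference raises the harmonic input, a negative one raises noise.
    """
    difference = performer_velocity - reference_velocity
    harmonic_gain = min(1.0, max(0.0, difference * scale))
    noise_gain = min(1.0, max(0.0, -difference * scale))
    return harmonic_gain, noise_gain


too_fast = input_gains(1.5, 1.0)
too_slow = input_gains(0.5, 1.0)
```

Under the inferred fundamental, 910 Hz divided by 130 Hz gives 7, which agrees with the 7th harmonic stated in the text.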
The outcomes demonstrate that sensorimotor adjustments were seen in both phases of the
experiment. The majority of the participants were able to closely replicate the reference sounds.
Because participants assessed the task as difficult, it was recommended to make the sensorimotor
adjustments more gradual. In addition, those who gained positive results in the experiment did not
bother with the variance of reference points; this means their movements were natural.
| Kolykhalova, K., Camurri, A., Volpe, G., Sanguineti, M., Puppo, E., & Niewiadomski, R. | The system records kata through Motion Capture System Qualisys and passes through Kinect for motion acquisition. All markers are tracked at a frame rate of 250 Hz. | Motion capture and audio recordings are saved using the Qualisys system’s native software Qualisys Track Manager (QTM) 2.9. | Audio parameters are set to 320 kbit/sec for bitrate and 48 kHz for sampling rate, and two channels (SMPTE and audio). This is recorded with a microphone positioned on the head of the performer. This along with its motion capture is synchronized using SMPTE timestamps. | Geometric Features (Angle between shoulders and neck, Straightness of the back), kinematic features (Maximum acceleration during performance, Biomechanical efficiency in punches and kicks), and Synchronization index. | Motion Capture System Qualisys, Qualisys Track Manager (QTM) 2.9., SMPTE Timestamps, microphone, Automatic Identification of Markers (trajectory identification) | 7 participants took part in the study. Each performed two katas: Heian Yondan and Bassai Dai, and evaluating these performances. | Manual mapping of the body is done through Automatic Identification of Markers (AIM). In each record, 30-35% of trajectories becomes unidentified and thus needs manual filling of algorithms. Peak velocity of punches or other movements are differed based on karate experience. | The study evaluated records of actual performances rather than the basics of karate. Examples of measures were provided to evaluate performance quality. |
| Camurri, A., Coletta, P., Ghisio, S., Mancini, M., Niewiadomski, R., Piana, S., Sagoleo, R., & Volpe, G. | Qualisys MoCap capturing at 100Hz was used for sampling dancers' movement and is synchronized with a video recording system. | A Qualisys based system tracking movements on markers at 100 fps. | | Event Synchronization | Qualisys | For each expressive quality, participants should perform several repetitions of predefined movements by attempting to target an expressive quality, and perform their own choreography that is able to express that particular quality in a more impressive manner. | Rigidity, fluidity, and impulsivity were the expressive qualities to be evaluated. Checking of these qualities are done through a phase called Event Synchronization wherein the system determines the current event, model conditions that have to be met by the event, and obtain a time series output by making the input signals discrete, which holds information about events occurrences and timings. | Intra-personal synchronization might contribute to distinguish movements performed with different expressive qualities. |
| Alborno, P, Bresin, R., Elblaus, L., & Frid, E. | The experiment used Optitrack Prime 41 with 17 infrared cameras tracing passive reflective markers that acquires 180 frames per second. | Optitrack Motive Software as a recorder and tracker | Sound models were presented and filtered using MaxMSP. | Energy Index, Smoothness Index, and Directness Index | Optitrack Prime 41 with 17 infrared cameras, Optitrack Motive, NatNet, custom C++ software for data retrieval from NatNet, OSC. Three computers were used. | The study applied three sound models in three different experiments to investigate different properties of sound and body movements. The first experiment focuses on using the sound models to evaluate characteristics of children moving freely in a room. The second experiment focuses on checking if there is any difference when it comes to a set of motion-related perceptual scales by another group of participants. The third experiment tries to match and identify respective sound models to drawings that show the sounds recorded in the first experiment. All of this involves several equations for indices in energy, smoothness, and directness. | Three setups were conducted for motion capture and sonification: 1) spontaneous movement in a room, 2) rating perception in sonified movement data, and 3) movement tracking system using 17 IR cameras. | The reasons why sound models can be manipulated is that characteristics of movement can be shown based on the sounds it creates, varying sounds can be classified as a certain motion quality, stimuli involving only sounds can create a potent property of movement, and movement-generated sounds can be illustrated like a visual for other people to observe. |
2.2.1 Martial Arts for the Blind and Partially Sighted [14]
The study seeks a way to involve the visually impaired in a martial arts setting. The objective is
to enhance their activity, security, courage, perseverance, and persistence while developing
etiquette and independent growth.
The program implemented was based on traditional martial arts teachings. As in most martial arts
training, there are no limits in teaching when it comes to contact practice. There are three ways
a blind person would prefer to learn: by listening, by being guided by their teachers, or by
feeling the movement to be demonstrated through the teacher's body directions. When it comes to
teaching correct combat styles, accurate positioning and movement are important. Figures 2-31 and
2-32 demonstrate how a coach teaches the blind person, who feels the coach’s movement during a
demonstration. A technique's effectiveness is not determined by the hands and fists, but rather by
its power and expression of energy.
According to participants, the program built more confidence and had some influence on their
decision to move out.
Figure 2-31. Parallel grasp on wrist with control of the other hand [14]
The study aims to help the visually impaired adjust to social standards. This was done in the
form of training for the development of their orientation and mobility. This study was
implemented because the education system in Pakistan does not support the blind.
The study sampled children aged 5 to 15 years old who are studying in different special education
institutes in Pakistan. A total of 125 male and female visually impaired children from public and
private special education institutes were selected. Two separate cases for the study were
implemented: Personality and Pro-social behavior.
The study was administered to the visually impaired in the form of a questionnaire with two parts
about "Orientation & Mobility Training": one that asks about its effect as a special education
program, and another that evaluates how effective these programs are for improving their
adaptation to the social environment.
The training was reported to be effective in improving the children's social adaptation, behavior,
and ability to move and orient themselves freely.
The paper aims to fill the gap left by the neglect of self-defense as a topic discussed by
persons with certain disabilities. It assesses the mental condition of disabled people as a step
toward creating a methodology for a self-defense course.
The method in the paper is a questionnaire that assesses the level of self-confidence in an actual
self-defense setting. The questionnaire contained questions that use exploratory methods and led
into a structured interview. The interview was divided into three sections assessing the
self-confidence levels on the different structures. The interview used a 6-point ranking scale to
evaluate the situation given in each question.
Nineteen subjects took part in the initial questionnaire phase. Eleven of them were completely
blind and eight had other visual impairments. Most were alert but not confident of their reactions;
all were scared of the given situations. As a last resort, a person would usually use the
Observe-Orient-Decide-Act loop developed by John Boyd. The Observe-Orient-Decide-Act loop was used
in the methods of this study.
The Mann-Whitney U test was used to assess the differences between the completely blind and those
with other visual impairments. It used an effect size for level of significance of 0.20, but this
effect size was said to not be effective.
The table used for this statistical measure is broken down into three parts as shown
in Table 2-4.
The first part consists of statements asking about confidence levels in situations
like prevention, verbal conflict, and physical assault. The visually impaired feel quite confident
in prevention, less confident in verbal conflict, and not at all confident in physical conflict, as
they do not know how to respond to it.
The last part of the table involves strategies of people with visual impairment.
According to this, it is said that communication strategies are useful in successfully dealing
with conflict situations.
Table 2-3. A questionnaire representing questions for three different sections of the evaluation. [15]
The significant difference found in the study is that blind people are in a riskier
position when it comes to theft of personal belongings. Another significant finding is
that the visually impaired feel under threat when alone at night.
A self-defense course should therefore focus on coping with pre-conflict situations. However,
the results of this study could not be generalized because of its small sample.
| Maleta, B., & Szuszkiewicz, A. | Martial Arts for the Blind and Partially Sighted | 2007, Krakow, Foundation Institute For Regional Development | The study aims to look at problems involved when it comes to preparing a martial arts program for the blind and partially sighted people. | | The program for the blind people was based on traditional systems that Sifu Andrzej Szuszkiewicz implemented. | According to the participants, the program has built more confidence and it gave some influence in decision to move out. |
| Ahmad, N.A., Ismail, M., Malik, S., & Manaf, U.K.A. | Orientation and Mobility Training in Special Education Curriculum for Social Adjustment Problems of Visually Impaired Children in Pakistan | April 2018, International Journal of Instruction | The study aimed to investigate the impact of orientation and mobility training as part of a special education curriculum on the social adjustment of visually impaired children. | The samples used by the study are aged 5 to 15 years old, those who are studying in different special education institutes in Pakistan. 125 total samples of male and female visually impaired children from public and private special education institutes were selected. Two separate cases for the study were implemented: Personality and Pro-social behavior. | A questionnaire with two sub-scales and with 51 items was developed. It was adapted from Teaching Age-Appropriate Purposeful Skills (TAPS). In the first part, the instrument measured the impact on "Orientation & Mobility Training" imparted in special education institutes had 11 items scales. In the second part, it measured the impact of O&M training on the social adjustment of visually impaired children. | “It can be rightly reasoned that Orientation and Mobility Training as a piece of educational module has a positive effect on the social conformity, conduct, and freedom of mobility and orientation of kids.” |
| Čihounková, J., Kohoutková, J., Reguli, Z., & Skotáková, A. | Self-defence for people with visual impairments | January 16, 2015 / “IDO MOVEMENT FOR CULTURE. Journal of Martial Arts Anthropology” | Fill the gap that neglects self-defense as a topic discussed by persons with certain disabilities. | 19 subjects were used in the initial questionnaire phase. Eleven of them were completely blind and eight of them had other visual impairments. | The method in the paper is that a questionnaire is made to assess the level of self-confidence in self-defence situations such as prevention, verbal conflict, and physical assault. | When it comes to creating a self-defense course, it should be focused on coping with situations that are pre-conflicting. This study could not be generalized however due to only having a small sample. |
Figure 3-1. Ideal formation of the kinematic model for 3D environments. [27]
In the case of motion sensors, the analysis of the kinematic model
starts with an analysis program that provides the connection between a processing unit (PC or
laptop) and the sensor itself [29]. This program can be written in languages such
as C#, C++, or Java, as long as the application is linked to dynamic link libraries
such as those from Microsoft, PrimeSense, or OpenNI.
The first phase in the analysis program is the maintenance of a loop containing
initialization, detection, and postural calibration. Through this method, a pattern of light
is sent every 0.033 seconds (in the case of Kinect, 30 frames/s maximum). The purpose
of this phase is to build the depth image and the 3D image of the subject in the sensor’s
environment. For the motion detection to be valid, the subject should be within a
range of 0.8 meters to 4 meters from the sensor. The subject must also be within the visual
field of 58 degrees on the horizontal plane and 43 degrees on the vertical plane.
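The validity conditions above can be expressed as a simple check on a point in sensor coordinates. This is a sketch under assumptions: the axes are taken as x to the right, y up, and z pointing away from the sensor in meters, distance is approximated by depth along z, and the function name `is_trackable` is hypothetical rather than part of any Kinect library.

```python
import math

FRAME_INTERVAL_S = 1.0 / 30.0  # 30 frames/s gives about 0.033 s per frame
MIN_RANGE_M, MAX_RANGE_M = 0.8, 4.0
HORIZONTAL_FOV_DEG, VERTICAL_FOV_DEG = 58.0, 43.0


def is_trackable(x, y, z):
    """Check the range and field-of-view limits quoted in the text.

    Assumed axes: x right, y up, z forward from the sensor, all in meters.
    """
    if not (MIN_RANGE_M <= z <= MAX_RANGE_M):
        return False
    horizontal_angle = math.degrees(math.atan2(abs(x), z))
    vertical_angle = math.degrees(math.atan2(abs(y), z))
    return (horizontal_angle <= HORIZONTAL_FOV_DEG / 2
            and vertical_angle <= VERTICAL_FOV_DEG / 2)
```

For example, a subject standing 2 meters straight ahead passes the check, while one standing 0.5 meters away or 2 meters off to the side at the same depth does not.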
The second phase of the analysis program determines all 20 joints of the body
and segments them into classes as parts of the human body: head, upper limbs, trunk, and
lower body. The first part of the determination process is to detect the position of the
central shoulder. This point is crucial because it holds the first position in the hierarchical structure.
Once the Cartesian coordinates of the central shoulder are determined, the program can
determine the subclasses of points and joints that form the human body pattern as shown
in Figure 3-2.
\[
\text{Roll}\ (z\text{-axis}) = \tan^{-1}\left( \frac{2(wz + xy)}{1 - 2(y^{2} + z^{2})} \right)
\]
Equation 3-3. Conversion of quaternion constants from a skeletal joint into Euler angles. [34]
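Equation 3-3 can be evaluated directly from the four quaternion components. The sketch below covers only the z-axis angle given in the equation. The function name `roll_z_axis` is an assumption, and `math.atan2` is used in place of a plain inverse tangent so that the quadrant is preserved and a zero denominator does not cause an error.

```python
import math


def roll_z_axis(w, x, y, z):
    """Angle about the z-axis (radians) following Equation 3-3."""
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))


# Identity quaternion: no rotation.
no_rotation = roll_z_axis(1.0, 0.0, 0.0, 0.0)

# Quaternion for a quarter turn about the z-axis.
half_angle = math.pi / 4
quarter_turn = roll_z_axis(math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
```

The quarter-turn quaternion evaluates to approximately pi/2 radians, which is the rotation it encodes.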
Figure 3-4. Orientation of Kinect’s quaternions. [35]
data. A model constitutes a virtual object that a user can interact with, and the input of the
user drives the sonification such that the model is recognized as a dynamic system capable
of recognizing dynamic behavior that can be perceived as sound. Model-based approaches
are dependent on a user’s active manipulation of the sonification and high data dimensions.
The process sends signals generated through movements into the input of an audio
amplifier. It is important to note that the range of frequencies for human hearing is
20 Hz to 20,000 Hz. In some cases the data is not on the right time scale to fit this
frequency range. In other cases the signals are very noisy, so filtering the signal will help
listeners hear certain features more clearly. This approach also allows users of the data to
apply their pattern recognition capabilities. It requires waveforms that are frequency- or
time-shifted into the range of audible sound for humans.
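Time-shifting can be illustrated by playing captured samples back at a much higher sample rate, which scales every frequency in the data by the ratio of the two rates. The rates and the 2 Hz movement frequency below are illustrative assumptions, not values taken from the study.

```python
AUDIBLE_MIN_HZ = 20.0
AUDIBLE_MAX_HZ = 20000.0


def playback_frequency(signal_hz, capture_rate_hz, playback_rate_hz):
    """Frequency heard when captured samples are replayed at another rate."""
    return signal_hz * playback_rate_hz / capture_rate_hz


def is_audible(frequency_hz):
    """True if the frequency lies in the human hearing range quoted above."""
    return AUDIBLE_MIN_HZ <= frequency_hz <= AUDIBLE_MAX_HZ


# A 2 Hz movement oscillation captured at 30 Hz and replayed at 44.1 kHz.
shifted = playback_frequency(2.0, 30.0, 44100.0)
```

The original 2 Hz oscillation is below the audible range, while the time-shifted version falls inside it.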
SPL(dB) = 20log10(measured Pressure/reference Pressure)
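The formula can be checked numerically with a short function. The reference pressure is not stated in the text; the sketch assumes 20 micropascals, the value conventionally used for sound in air.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # assumed reference: 20 micropascals


def spl_db(measured_pressure_pa, reference_pressure_pa=REFERENCE_PRESSURE_PA):
    """Sound pressure level in dB for a measured pressure in pascals."""
    return 20.0 * math.log10(measured_pressure_pa / reference_pressure_pa)


at_reference = spl_db(20e-6)
ten_times_reference = spl_db(200e-6)
one_pascal = spl_db(1.0)
```

A pressure equal to the reference gives 0 dB, and every tenfold increase in pressure adds 20 dB.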
The perception of musical pitch for pure tone stimuli varies with the frequency of the
tones. For frequencies below 2.5 kHz, listeners can adjust a second sound so that it is an
octave above the test stimulus, which is approximately double the frequency. Heightening sound
frequency defeats the need to adjust the second sound. Various mechanisms are responsible for
frequency discrimination and pitch perception, and pitch perception operates over the low to
middle frequency range of human hearing.
For complex sounds, i.e. sounds with multiple frequencies, the resulting pitch will have less
energy. For example, a harmonic series of frequency components at 200 Hz intervals will be
perceived to have a fundamental frequency based on the interval between frequencies. This part
of pitch perception occurs even in low-pass noise. This is referred to as ‘periodicity pitch’.
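The 200 Hz example can be approximated by taking the greatest common divisor of the component frequencies. This is a deliberate simplification for illustration, not a model of the auditory system, and it assumes integer frequencies in Hz.

```python
from functools import reduce
from math import gcd


def periodicity_pitch(components_hz):
    """Approximate the perceived pitch as the common spacing of the harmonics."""
    return reduce(gcd, components_hz)


# Harmonics spaced 200 Hz apart, with no component at 200 Hz itself.
perceived = periodicity_pitch([600, 800, 1000, 1200])
```

The result is 200 Hz even though no component sits at that frequency, which matches the interval-based pitch described above.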
Any sound can be decomposed into two varying components: an envelope and a
fine structure. Previous sound data indicate that both of these components are encoded
by the auditory system and play a role in perception of other sounds.
To start sonification experiments, the first step is to get a clear idea
of which auditory displays are of interest to a particular study. All experimental procedures
required should be developed in the context of the application. The next part of the
procedure is to know which particular data and statistical analyses are of interest to the
question of the application; this is done during the design stage. [36]
There are several perceptual issues that are important for researchers to consider when
it comes to auditory displays. People new to the research usually assume that participants
process sounds with the same capabilities as their visual processing. This is,
however, not true in many cases. To understand auditory perception fully, researchers should
consider three aspects that constrain experiments evaluating auditory displays: the transient
nature of sounds, properties of memory for auditory events, and differences in the way attention
is allocated in auditory tasks.
Most research designs that involve comparing auditory displays should be set up in
a way that gives participants as many performance trials as possible, to ensure the
closest evaluation of the displays. It is best for researchers to obtain basic data about
perceptual abilities and limitations.
As mentioned earlier, most research requires comparison of values from more than
one data source. This can result in each stimulus in a particular data set
becoming less identifiable. In such cases, changing the timbre or stereo panning
of a particular stimulus can alleviate the problem. [37]
3.5 Discriminant Function Analysis [38]
With forward selection, the study can improve the predictability
of parameters. Research that uses forward selection algorithms indicates that the method
is capable of managing the interaction effects of sonification parameters on the output
variable, which in this case is the perceptibility of subjects [41]. Forward selection can also
accommodate non-linear relationships; human hearing abilities are known to be nonlinear.
In this case, the forward selection algorithm can help the study look for combinations of
timbre that vary with pitch over time, which is the interaction effect here, in such
a way that these interaction effects will have an impact on the nonlinear parameter, which
is human hearing.
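The general shape of forward selection can be sketched as a greedy loop that adds one candidate at a time and stops when no candidate improves the score. The `score` argument is a hypothetical stand-in for whatever criterion the discriminant analysis uses, and the toy weights below are made-up values chosen only to exercise the loop, not experimental results.

```python
def forward_selection(candidates, score, max_features):
    """Greedily add the candidate that most improves score(selected)."""
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        best_candidate = max(remaining, key=lambda c: score(selected + [c]))
        candidate_score = score(selected + [best_candidate])
        if candidate_score <= best_score:
            break  # no remaining candidate improves the score
        selected.append(best_candidate)
        remaining.remove(best_candidate)
        best_score = candidate_score
    return selected


# Made-up weights, for illustration only.
TOY_WEIGHTS = {"pitch": 3.0, "timbre": 2.0, "volume": 1.0, "pan": -1.0}


def toy_score(features):
    return sum(TOY_WEIGHTS[name] for name in features)


chosen = forward_selection(list(TOY_WEIGHTS), toy_score, max_features=4)
```

In this toy run the last candidate lowers the score, so the loop stops after three features even though four were allowed.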
4. TySON
The system is divided into two stages. The first stage of the system or the instructor
module is the recording and processing of the movements of a Taekwondo professional.
This will use the visual programming platform EyesWeb to create the dataset and
Sonification Sandbox to sonify the data. The second stage of the system or the student
module is to implement an application that contains the set of sonified movements as
references for students to learn.
[Diagram: Movement1, Movement2, Movement3, ..., Movementn; Sonification Module]
The instructor module in Figure 4-1 focuses on recording and processing the
movements of the instructor. The instructor will be recorded in a direction diagonal to the
Kinect in order to extract accurate coordinates of certain Kinect points, which will vary
depending on the type of movement to be modeled. After extracting coordinates, another
program will compute the different joint quaternion angles and will record these features
as a dataset for the particular movement. The quaternion angles gathered by the sensor
will be filtered because the sensor generates noisy values. It is important to
note that not all points need to be extracted; only specific areas of the body will be spotted
and extracted when performing. The features extracted will be saved for the chosen
movement and will be used as a learning basis.
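The text does not name the filter applied to the noisy angle values. As one possible choice, the sketch below uses an exponential moving average; the smoothing factor `alpha` and the function name `smooth` are assumptions made for illustration.

```python
def smooth(values, alpha=0.3):
    """Exponential moving average over a sequence of angle values.

    alpha near 1 follows the raw signal closely; alpha near 0 smooths more.
    """
    smoothed = []
    previous = None
    for value in values:
        previous = value if previous is None else alpha * value + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


# A step from 0 to 1 radian is approached gradually instead of jumping.
step_response = smooth([0.0, 1.0, 1.0], alpha=0.5)
```

A filter of this kind trades responsiveness for stability, so the smoothing factor would need to be tuned against how quickly the modeled movement changes.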
[Flowchart: Open TySon; Open sonified movement database; Select movement; Perform listening test for each feature and combination of features; Capture student's movement data; Feature extraction; Sonify using xSonify; Analysis between instructor and student's]
The student module in Figure 4-2 focuses on learning the movements from the
sonified movement database. At least thirty (30) subjects will be recruited for this
experiment. The application will contain the database of the movements for
the student to select as well as the values measured from the features. After selecting
a movement, the student will hear the sonified sequence of the movement. The subjects
will undergo a listening test by first listening to each feature one at a time and then to
different combinations of features in order to test the sound’s overall perceptibility.
While the student familiarizes himself or herself with the sonified sequence of the movement,
the system will capture the movements of the student, similar to the process of the instructor
module, and provide feedback regarding his or her own movements. The student will then
judge whether he or she was able to follow the same sequence.
To obtain the Euler angles that describe the different orientations of a joint, the
program will first extract a quaternion from the joint, which contains a constant w and
coordinates x, y, and z and is obtainable from the Kinect library. After quaternion
extraction, the program will apply three different Euler angle functions, one with
respect to each of the 3D coordinates x, y, and z, as explained in 3.1.3.
The tool that will be used for sonifying, or transforming the segmented data into
sound, is the xSonify software, which is programmed in Java [36]. Each segment of data will be
assigned a different timbre and a choice of variation in volume or tone that the study
feels will match the corresponding feature. Before mapping the timbre to each
feature, the research will evaluate the perceptibility of each timbre. The process starts
by opening the sonification application and then importing a .csv file representing
one movement.
Next, the study will undergo a Discriminant Function Analysis process, specifically
stepwise forward analysis, as shown in Figure 4-4 for each sound timbre to test its
perceptibility. Four (4) of the most perceptible timbres in the initial set will be mapped to
the Euler Angle features.
The research will create a listening test that will train the subjects' sound
perceptibility. The objective of the test is to choose the graph that corresponds to a sound
being played. The different graphs will represent the varying pitch of a sound parameter
over time. After this first test, the sound parameter with the top score (the most
perceptible by subjects) will be chosen.
Three more batches of tests will be done, except that each succeeding test will
remove the top sound parameter/s that was chosen previously. The objective of the
succeeding tests is to choose the top parameter that is still discriminable while blending in
with the top sound parameters chosen.
The four batches of tests will conclude which sound parameters will be used as a
basis for the movement parameters for the research.
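The four batches can be pictured as repeated rounds in which the top-scoring sound parameter is picked and then excluded from later rounds. The sketch below is illustrative only: the parameter names and scores are placeholders, not data from the study, and the function name `pick_parameters` is hypothetical.

```python
def pick_parameters(batch_scores):
    """Pick the top-scoring parameter per batch, skipping earlier picks.

    batch_scores: one dict per batch mapping parameter name to test score.
    """
    chosen = []
    for scores in batch_scores:
        remaining = {name: s for name, s in scores.items() if name not in chosen}
        chosen.append(max(remaining, key=remaining.get))
    return chosen


# Placeholder scores for four batches of listening tests.
batches = [
    {"A": 0.90, "B": 0.70, "C": 0.60, "D": 0.50, "E": 0.20},
    {"A": 0.95, "B": 0.80, "C": 0.60, "D": 0.40, "E": 0.30},
    {"B": 0.50, "C": 0.75, "D": 0.60, "E": 0.10},
    {"D": 0.70, "E": 0.40},
]
selected_parameters = pick_parameters(batches)
```

Parameter A still scores highest in the second placeholder batch but is skipped because it was already chosen, which mirrors the removal of previous top parameters described above.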
[Flowchart: Open xSonify; Load .csv file representing one movement; Perform Discriminant Function Analysis for the different types of timbre by testing perceptibility; If timbre is perceptible to the human ear, map timbre to Euler Angle feature.]
[Diagram: Initial set of timbre: Woodwinds, Percussion, Keyboards, Brass, Strings, Other]
4.3. Theoretical Analysis
The instructor module is a separate module from the primary system whose
objective is to contain the primary sonified movement database for the system. Aside from
recording these movements, the recording part of this module will also observe the
movement from different angles of motion. Figure 4-5 represents the kinematic model
generated by the Kinect sensor, while Table 4-1 lists the Kinect points
to be checked and from which raw data will be extracted.
Table 4-1. List of points to extract from the kinematic model generated by Kinect.
The student module is responsible for containing the primary application for the
system. The movements will contain the measured values and their respective MIDI files, and
these will be shown when movements are selected. The purpose of this module is to provide
a reference through sonification sequences of each movement for students to familiarize
themselves with. This will serve as the basis for the student’s learning process, although it is
not completely necessary that the student follows the same sound; it is up to the student to
judge whether he or she can relate well to the same sequence. The research will recruit another
Taekwondo expert to judge correctness of movement, as the movement does not necessarily have to
be as perfect as the instructor’s.
4.3.3.2 Quaternion to Euler Angle Conversion
Figure 4-3. Quaternion to Euler angle analysis (in radians) of an arm swing to the left.
The program needs to be able to detect a difference in the change of the angle of
a particular motion. The plot in Figure 4-3 shows different Euler angles of different joints,
which will depend on what kind of movement is being done. In this graph, we can see that
the shoulder is not rolling up or down but it is moving to the right, slowly leading to
negative values. While the shoulder is moving, the elbow quaternion angles will also
increase or decrease in value depending on the movement.
When calculating and finding the correct angles, it should be noted that the system
looks at the movement as a whole rather than one angular motion. This means that it isn’t
guaranteed that the angle value of a joint is correct without considering the angles of
different joints of the body under the same frame or interval.
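Judging the movement as a whole can be sketched as a check that every joint angle in a frame is close to its reference, rather than testing one angle in isolation. The joint names, the tolerance, and the function name `frame_matches` are assumptions made for illustration.

```python
def frame_matches(student_frame, reference_frame, tolerance_rad=0.2):
    """True only if every reference joint angle is matched within tolerance.

    Both frames map a (joint, angle_name) pair to an angle in radians.
    """
    return all(
        abs(student_frame[key] - reference_angle) <= tolerance_rad
        for key, reference_angle in reference_frame.items()
    )


reference = {("shoulder", "yaw"): -0.50, ("elbow", "pitch"): 0.80}
close_attempt = {("shoulder", "yaw"): -0.45, ("elbow", "pitch"): 0.90}
elbow_off = {("shoulder", "yaw"): -0.50, ("elbow", "pitch"): 0.20}
```

In the second attempt the shoulder angle is exact, yet the frame is still rejected because the elbow angle is off, which reflects the point that one correct angle does not make the frame correct.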
Figure 4-5. Instructor Module Design
Figure 4-6. Elbow Yaw graph encoded in an increasing seashore volume
Figure 4-7. Elbow Pitch graph encoded in an increasing strings volume
Figure 4-8. Shoulder Roll graph encoded in an increasing soundtrack pitch
Figure 4-9. Shoulder Yaw graph encoded in an increasing whistle volume
4.4.1. Instructor Module Design
In the instructor module design shown in Figure 4-5, the instructor will create
bases for learning Taekwondo movements. Each angle will correspond to one sound
feature that can be manipulated by the xSonify software. The sound features that can be
mapped to different movement features are the different kinds of timbres, each of which can
be chosen to vary in volume or pitch. One CSV file will be generated
for each movement. The CSV files generated will be sent to the xSonify tool for the
sonification process, turning each file into a playable sound file, representing its reference.
All sound files will be saved in a movement database.
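Writing one movement to a CSV file can be sketched with Python's standard `csv` module. The column names follow the four angles shown in Figures 4-6 to 4-9, but the layout that xSonify actually expects is not specified in the text, so the header and the function name `write_movement_csv` are assumptions.

```python
import csv
import io

# Assumed column layout, named after the angles in Figures 4-6 to 4-9.
COLUMNS = ["frame", "elbow_yaw", "elbow_pitch", "shoulder_roll", "shoulder_yaw"]


def write_movement_csv(frames, fileobj):
    """Write one row of Euler angles (radians) per captured frame."""
    writer = csv.writer(fileobj)
    writer.writerow(COLUMNS)
    for index, angles in enumerate(frames):
        writer.writerow([index, *angles])


# Illustrative angle values for two frames of one movement.
buffer = io.StringIO()
write_movement_csv([(0.1, 0.2, 0.3, 0.4), (0.15, 0.25, 0.35, 0.45)], buffer)
lines = buffer.getvalue().splitlines()
```

An in-memory buffer is used here so the sketch runs without touching the file system; an opened file object could be passed in its place.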
Figures 4-6, 4-7, 4-8, and 4-9 show samples in which different joint quaternions,
particularly those taken from the shoulder and elbow, are encoded in combinations of varying
and constant pitch and volume through the xSonify software. In this example, four timbres
(seashore, strings, soundtrack, and whistle) were chosen, each with increasing pitch or volume.
With this software, a sound file combining the different joint Euler angles will be generated.
The purpose of the student module design is to learn the movements in the created
movement database. Each movement has its respective MIDI file equivalent. This module
will also include a listening test in order to test the subject’s ability to hear the
features of the movement and afterwards combinations of features [10]. The student will
play back the file while also attempting to learn the movement, but will not necessarily compare
the movement to the sound played. With the auditory feedback given, it is up to the student to
judge whether he or she was able to learn the movement correctly. The researcher may survey the
subjects on how their movement differed from the instructor’s.
Appendix A. References
[1] Alborno, P., Piana, S., Mancini, M., Niewiadomski, R., Volpe, G., & Camurri, A.
(2016). Analysis of Intrapersonal Synchronization in Full-Body Movements
Displaying Different Expressive Qualities. Proceedings of the International Working
Conference on Advanced Visual Interfaces - AVI 16. doi:10.1145/2909132.2909262
[2] Bergmann, J., Effenberg, A. O., Hwang, T., Müller, F., & Schmitz, G. (2018).
Movement Sonification in Stroke Rehabilitation. Frontiers in Neurology.
doi:10.3389/fneur.2018.00389
[3] Boyer, E. O., Pyanet, Q., Hanneton, S., & Bevilacqua, F. (2014). Learning
Movement Kinematics with a Targeted Sound. Lecture Notes in Computer Science
Sound, Music, and Motion, 218-233. doi:10.1007/978-3-319-12976-1_14
[4] Bresin, R., Elblaus, L., & Frid, E. (2016). Sonification of Fluidity - An
Exploration of Perceptual Connotations of a Movement Feature. Proceedings of
ISon 2016, 5th Interactive Sonification Workshop. Retrieved June 27, 2018.
[5] Claveau, D., & Smith, K. M. (2014). The Sonification and Learning of Human
Motion. The 20th International Conference on Auditory Display (ICAD-2014).
Retrieved June 27, 2018.
[6] Danna, J., Paz-Villagrán, V., Capel, A., Pétroz, C., Gondre, C., Pinto, S., . . .
Velay, J. (2014). Movement Sonification for the Diagnosis and the Rehabilitation
of Graphomotor Disorders. Lecture Notes in Computer Science Sound, Music, and
Motion, 246-255. doi:10.1007/978-3-319-12976-1_16
[7] Dyer, J. F., Stapleton, P., & Rodger, M. W. (2017). Advantages of melodic over
rhythmic movement sonification in bimanual motor skill learning. Experimental
Brain Research, 235(10), 3129-3140. doi:10.1007/s00221-017-5047-8
[9] Effenberg, A. O., Fehse, U., Schmitz, G., Krueger, B., & Mechling, H. (2016).
Movement Sonification: Effects on Motor Learning beyond Rhythmic
Adjustments. Frontiers in Neuroscience, 10. doi:10.3389/fnins.2016.00219
[10] Frid, E., Bresin, R., Alborno, P., & Elblaus, L. (2016). Interactive Sonification of
Spontaneous Movement of Children—Cross-Modal Mapping and the Perception
of Body Movement Qualities through Sound. Frontiers in Neuroscience, 10.
doi:10.3389/fnins.2016.00521
[11] Hohagen, J., & Wöllner, C. (2016). Movement Sonification of Musical Gestures:
Investigating Perceptual Processes Underlying Musical Performance
Movements. Proceedings SMC 2016. Retrieved June 27, 2018.
[12] Kolykhalova, K., Camurri, A., Volpe, G., Sanguineti, M., Puppo, E., &
Niewiadomski, R. (2015). A Multimodal Dataset for the Analysis of Movement
Qualities in Karate Martial Art. Proceedings of the 7th International Conference on
Intelligent Technologies for Interactive Entertainment.
doi:10.4108/icst.intetain.2015.260039
[13] Malik, S., Manaf, U. K., Ahmad, N. A., & Ismail, M. (2018). Investigating Special
Education Curriculum for Visually Impaired Children in Solving Family
Adjustment Issues in Pakistan. International Journal of Academic Research in Business
and Social Sciences, 7(14). doi:10.6007/ijarbss/v7-i14/3678
[14] Piana, S., Coletta, P., Ghisio, S., Niewiadomski, R., Mancini, M., Sagoleo, R., . . .
Camurri, A. (2016). Towards a Multimodal Repository of Expressive Movement
Qualities in Dance. Proceedings of the 3rd International Symposium on Movement and
Computing - MOCO 16. doi:10.1145/2948910.2948931
[15] Szuszkiewicz, A., & Maleta, B. (2007). Martial arts for the blind and partially sighted.
[16] Čihounková, J., Kohoutková, J., Reguli, Z., & Skotáková, A. (2015). Self-defense
for people with visual impairments. IDO MOVEMENT FOR CULTURE.
Journal of Martial Arts Anthropology, 15(2). doi:10.14589/ido.15.2.5
[17] Hermann, T. (2008). Taxonomy and Definitions for Sonification and Auditory
Display. Proceedings of the 14th International Conference on Auditory Display.
Retrieved July 13, 2018, from
http://www.icad.org/Proceedings/2008/Hermann2008.pdf
[18] Grond, F., & Berger, J. (2011). Parameter Mapping Sonification. In T. Hermann,
A. Hunt, & J. G. Neuhoff (Eds.), The Sonification Handbook (pp. 363-398). Berlin,
Germany: Logos Publishing House.
[19] Hermann, T. (2002, February 12). Sonification for exploratory data analysis
(Doctoral dissertation, Bielefeld University), pp. 38-39.
[20] Maranan, D. S., Alaoui, S. F., Schiphorst, T., Pasquier, P., Subyen, P., & Bartram,
L. (2014). Recognizing Movement Qualities: Mapping LMA Effort Factors to
Visualization of Movement. Proceedings of the 32nd Annual ACM Conference on Human
Factors in Computing Systems - CHI 14. doi:10.1145/2556288.2557251
[21] DANCE Platform version 2. (2017, January). Retrieved October 28, 2018, from
http://dance.dibris.unige.it/index.php/2017-02-08-13-44-31/dance-platform-v2
[22] Haibach-Beach, P., Reid, G., & Collier, D. (n.d.). Motor Learning. Retrieved
October 31, 2018, from https://us.humankinetics.com/blogs/excerpt/motor-learning
[23] Perceptual and Motor Development Domain. (n.d.). Retrieved November 4, 2018,
from https://www.cde.ca.gov/sp/cd/re/itf09percmotdev.asp
[26] Ellison, S., & Kramer, G. (1991). Audification: The Use of Sound to Display
Multivariate Data. International Computer Music Association, 1991. Retrieved
November 4, 2018, from
https://quod.lib.umich.edu/i/icmc/bbp2372.1991.052/1.
[27] Hermann, T., Hunt, A., & Neuhoff, J. (2011). Chapter 16: Model-Based
Sonification. In The Sonification Handbook. Berlin, Germany: Logos Publishing
House.
[28] Koritnik, T., Bajd, T., & Munih, M. (2010). A Simple Kinematic Model of a
Human Body for Virtual Environments. Advances in Robot Kinematics: Motion in Man
and Machine, 401-408. doi:10.1007/978-90-481-9262-5_43
[29] Sottek, R., & Genuit, K. (2005). Models of signal processing in human hearing.
AEU - International Journal of Electronics and Communications, 59(3), 157-165.
doi:10.1016/j.aeue.2005.03.016
[30] Ganea, D., Mereuta, E., & Mereuta, C. (2014). Human Body Kinematics and the
Kinect Sensor. Applied Mechanics and Materials, 555, 707-712.
doi:10.4028/www.scientific.net/amm.555.707
[31] Walker, B., & Nees, M. (2011). Theory of Sonification. In The Sonification Handbook
(pp. 9-31). Berlin: Logos Publishing House.
[32] Carlile, S. (2011). Psychoacoustics. In The Sonification Handbook (pp. 32-61). Berlin:
Logos Publishing House.
[32] Elgendi, M., Picon, F., Magnenat-Thalmann, N., & Abbott, D. (2014). Arm
movement speed assessment via a Kinect camera: A preliminary study in healthy
subjects. BioMedical Engineering OnLine, 13(1), 88. doi:10.1186/1475-925x-13-88
[33] Conversion between quaternions and Euler angles. (n.d.). Retrieved from
https://graphics.fandom.com/wiki/Conversion_between_quaternions_and_Euler_angles
[34] Pterneas, V. (2017, May 28). Kinect Joint Rotation - The Definitive Guide.
Retrieved from https://pterneas.com/2017/05/28/kinect-joint-rotation/
[35] Meaning of Rotation Data of K4W v2. (n.d.). Retrieved from
https://social.msdn.microsoft.com/Forums/en-US/f2e6a544-705c-43ed-a0e1-731ad907b776/meaning-of-rotation-data-of-k4w-v2?forum=kinectv2sdk
[36] Bernard, H., Boller, R., Candey, R., Diaz, W., Qian, F., & Schertenleib, A.
(n.d.). Astrophysics Source Code Library. Retrieved July 2, 2019, from
http://ascl.net/1207.008
[36] Walker, B. N., & Nees, M. A. (2011). Theory of Sonification. In The Sonification
Handbook (pp. 9-31). Berlin: Logos Publishing House.
[38] Discover Which Variables Discriminate Between Groups, Discriminant Function
Analysis. Retrieved from http://www.statsoft.com/Textbook/Discriminant-Function-Analysis
[41] Narisetty, N. N., Mukherjee, B., Chen, Y. H., Gonzalez, R., & Meeker, J. D.
(2018). Selection of nonlinear interactions by a forward stepwise algorithm:
Application to identifying environmental chemical mixtures affecting health
outcomes. Statistics in Medicine, 38(9), 1582–1600. doi:10.1002/sim.8059
Appendix B. List of Taekwondo Movements
Figure 12-9. Knifehands middle block
Appendix C. Preliminary Work
1. Sonification Sandbox – A tool that converts CSV files into playable MIDI
files. A CSV file is not required to use the tool; values can also be entered
manually into the graph. After values are set or a CSV file is loaded, a
corresponding graph of the values is generated. The only parameter that can
be mapped is pitch, and the sound chosen was an echo, as it is close to the
flow of movement.
2. OpenPose – OpenPose is a system that jointly detects human body, hand,
facial, and foot keypoints in images and videos. It is an alternative
source of points for sonifying movement. So far, the points have only been
processed; unlike the Kinect setup, they are not yet saved into CSV
files. Below is a step-by-step process of how the points are
processed.