SingingEGuru

Guide to using the features


The software started off from our simple interest in music. We could play the piano badly and could sing even
worse. So we did what we knew best: we developed software to help us with it. It is not perfect, but it is a start.

Two key features have been developed in this software:

1. To generate the notes from a given audio file.


2. To practice singing.

Developed as our final year project and refined considerably after we graduated, it is finally stable enough to be
brought out of our workstations and installed on simpler machines. This guide explains each feature of the
software and how to use it.

Generation of musical notes given an audio file.


We listen to a song and would like to play it on an instrument. So we start hitting notes on the instrument and
slowly find the right ones that are part of the song. This method is popularly known as ‘playing by ear’, as we
listen to the song and figure out the notes by trying them out on the instrument. For people with experience this
is easy, but for beginners it is difficult and becomes a sort of trial-and-error game.

This is where our first feature comes in. It processes the audio and gives us the notes that make it up, so we
are presented with a list of notes that should be played on the instrument for that audio.
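
To make the idea concrete, here is a minimal Python sketch of the last step of such a conversion: turning a
detected frequency into the name of the nearest note. It is not the software's own code. It assumes equal
temperament with A4 tuned to 440 Hz, and the function name frequency_to_note is made up for this illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def frequency_to_note(freq_hz, a4_hz=440.0):
    """Return the name of the equal-tempered note nearest to freq_hz.

    Assumption for this sketch: A4 = 440 Hz, the common concert pitch.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note numbers: A4 is 69 and each semitone is a factor of 2**(1/12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"

for freq in (261.63, 293.66, 440.0):
    print(freq, "Hz is closest to", frequency_to_note(freq))
```

A real note generator must first estimate the frequency from the audio, which is the hard part; this sketch only
shows how a frequency, once found, maps to a note name.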

Follow these steps to use this feature.

1. Select the menu item ‘Generate Notes’.


2. You will be presented with a file dialog box to open a wav audio file.

Select the audio file that needs to be processed. Most of the audio on our computers is in mp3 format, but the
software only reads wav files, so you need to convert your mp3 audio to wav first. To learn how to convert mp3
files to wav, please see the document ‘ConvertMp3ToWav.pdf’.
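
If you are unsure whether a file really is a wav file, a small check like the following can help before you open
it in the software. This is only a sketch using Python's standard wave module; it is not part of SingingEGuru,
and the function name describe_wav is hypothetical.

```python
import wave

def describe_wav(path):
    """Return basic properties of an uncompressed (PCM) wav file.

    wave.open typically raises wave.Error when the file is not a wav file,
    for example when an mp3 file has merely been renamed to .wav.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": 8 * wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }
```

Calling describe_wav on a recording returns the number of channels, the bit depth, the sampling rate and the
duration in seconds.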

3. On opening this file you will be presented with the Generate Notes window.
The window contains the settings panel. First, you are asked to select the FFT sample size from the given list.
Keeping it non-technical, different values give different results for the same audio, so you need to experiment
to find out which one gives the best result. A value that gave the best result for one audio file may not do so
for another. In general, though, we have found that 8192 and 16384 give the best results.
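
For the technically curious, the trade-off behind the FFT sample size can be sketched in a few lines of Python.
A larger size separates neighbouring frequencies more finely, but each analysis looks at a longer stretch of
audio, so fast note changes get blurred. The sketch assumes a sampling rate of 44100 Hz (the guide does not state
the rate of your wav file), and the function name fft_resolution is made up for this illustration.

```python
def fft_resolution(fft_size, sample_rate_hz=44100.0):
    """Return (frequency step per FFT bin in Hz, window length in ms).

    Assumption: sample_rate_hz defaults to 44100 Hz, a common wav setting.
    """
    bin_width_hz = sample_rate_hz / fft_size
    window_ms = 1000.0 * fft_size / sample_rate_hz
    return bin_width_hz, window_ms

for size in (2048, 4096, 8192, 16384):
    width, window = fft_resolution(size)
    print(f"{size:>6}: {width:6.2f} Hz per bin, {window:6.1f} ms window")
```

Under these assumptions, 8192 gives a step of roughly 5.4 Hz over a window of about 186 ms, and 16384 halves the
step while doubling the window.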

The ‘Sample Shift In MilliSecs’ setting determines, expressed in milliseconds, by how many audio samples the
processing moves ahead in the audio file at each step. The duration of a note is therefore a multiple of this value.

The smaller the value, the more accurately the duration of a note can be measured. However, making it too small
is not helpful either. The rule is simple: if the notes in a song vary quickly, keep the value small; if the
notes are held steady for longer periods, use a larger value.
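
The effect of the sample shift can be illustrated with a short Python sketch. It converts a shift in
milliseconds into a number of audio samples and shows how a note's duration gets rounded to a multiple of the
shift. The 44100 Hz sampling rate and both function names are assumptions made for this example, not values
taken from the software.

```python
def shift_in_samples(shift_ms, sample_rate_hz=44100):
    """Number of audio samples skipped per step for a shift given in ms."""
    return round(shift_ms * sample_rate_hz / 1000)

def quantised_duration_ms(true_duration_ms, shift_ms):
    """Duration as it would be reported when time advances in shift_ms steps."""
    steps = max(1, round(true_duration_ms / shift_ms))
    return steps * shift_ms

print(shift_in_samples(50))             # samples skipped for a 50 ms shift
print(quantised_duration_ms(130, 100))  # a coarse shift loses detail
print(quantised_duration_ms(130, 10))   # a fine shift keeps the duration
```

With a 100 ms shift, a note that really lasts 130 ms is reported as 100 ms; with a 10 ms shift it is reported as
130 ms.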

This value has an allowed range, which widens as the FFT sample size increases. The tooltip for the setting shows
the minimum and maximum values. By default, the maximum value is always set. If you reduce the FFT sample size
and the current value exceeds the new limit, the new maximum value is set instead.

Hit the ‘Start Generating Notes’ button to begin the generation of notes.

4. You may hit the stop button; however, you will then have only partially generated notes.
5. Once the generation of notes is over, the remaining components are enabled.

Now you can listen to the notes that you have generated. You can select the instrument on which you want to hear
them; the notes will be synthesized and played for the instrument you have selected. By listening, you can tell
how well the audio has been converted into notes. If you would like more instruments to be supported, do let me
know which ones.

If you dislike the notes that have been generated, try a different FFT sample size and generate the notes again.
Once you are satisfied with the result, go ahead and save the notes you have just generated.

6. You can save the notes as an xml file or a midi file. We recommend saving them as both.
With the notes saved in an xml file, you can view them in your web browser. You will see the musical notes found
in the audio, in sequence. Though this display is crude for now, it does give you the details of the notes you
have just generated. You may also open the same xml file in the software to listen to the notes again. With time,
a better representation of this data will be provided.

Saving the notes as a midi file lets you import the file into sound editing software such as Audacity, where you
can refine the notes even further.

You may also play the midi file in Windows Media Player, or transfer it to your cell phone if the phone can play
midi files. However, you will not be able to choose an instrument in this case.
Not all of the generated notes will be correct. You still need to refine them manually, as 100% accuracy is not
possible and a lot depends on the audio file. Generally, if the audio is heavy on vocals and has no loud banging
in the background, we get a decent output. The Titanic song used above, for example, does not have any loud
instruments in it. Similarly, we have ‘Tum Se Hee’ from ‘Jab We Met’.

The algorithm actually assumes that the audio is purely vocal, with no background instruments. A better way to
use this feature is therefore to sing the song yourself and record it using the default sound recorder in
Windows. Save it as a wav file and give it to the software. Please see the ‘ConvertMp3ToWav.pdf’ document; the
end of that document explains which settings should be used to save the wav file.

The quality of the output will then depend purely on how well you have sung. Bad singing, bad output. Good
singing, good output. The algorithm relies on the fact that if you are singing well, you are hitting the right
notes.

Don’t worry if your singing is bad, for here our next feature comes in to help you improve it.

Practice Singing

This feature helps you practice hitting the right notes. Here we practice singing the Indian musical notes Sa,
Re, Ga, Ma, Pa, Dha, Ni correctly. Unlike Western music, where each note is defined and has a fixed place on the
piano, in Indian music these notes can be placed anywhere on the piano. However, once the base note, i.e. Sa, is
defined, the rest of the notes are fixed along with it.

For example, if we choose ‘Sa’ to be C4 (note C in octave 4), then the remaining notes are: D4 is ‘Re’, E4 is
‘Ga’, F4 is ‘Ma’, G4 is ‘Pa’, A4 is ‘Dha’ and B4 is ‘Ni’.

The software performs this calculation, so all you need to do is select the base note; you must, however, know
which base note you want.
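
The calculation can be sketched in Python as follows. This is an illustration, not the software's own code: it
assumes the seven notes follow the Western major-scale pattern, which is what the C4 example above implies, and
the names SWARAS, MAJOR_SCALE_STEPS and swara_notes are made up for this sketch.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
SWARAS = ["Sa", "Re", "Ga", "Ma", "Pa", "Dha", "Ni"]
# Assumption: semitone offsets of the major scale, inferred from the
# example Sa = C4, Re = D4, Ga = E4, Ma = F4, Pa = G4, Dha = A4, Ni = B4.
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]

def swara_notes(base_name, octave):
    """Map each Indian note to a Western note, given the base note Sa."""
    base_index = NOTE_NAMES.index(base_name)
    mapping = {}
    for swara, step in zip(SWARAS, MAJOR_SCALE_STEPS):
        semitone = base_index + step
        name = NOTE_NAMES[semitone % 12]
        # Notes that pass B roll over into the next octave.
        mapping[swara] = f"{name}{octave + semitone // 12}"
    return mapping

print(swara_notes("C", 4))
print(swara_notes("D", 4))
```

With D4 as the base note, the same pattern gives E4, F#4, G4, A4, B4 and C#5 for the remaining notes.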

Note that this mode works in real time. You need to have a microphone attached to your computer. Do test your
microphone first; the best way to do so is to use the default sound recorder found in Windows.

Following are the steps to use this feature.

1. Select the menu item ‘Practice Singing’


2. You will be presented with the window below.

Observe the ‘Settings Panel’. You can select the FFT sample size, the sampling rate of the audio from the
microphone and the base note.

When the base note is kept constant, the accuracy with which our singing is measured, denoted by the width of the
green bar, depends on the sampling rate and the FFT sample size. To keep the accuracy high, you need to keep the
sampling rate and the FFT sample size close to each other. Observe below how the width of the bar changes when I
increase the sampling rate but not the FFT sample size.

In such a case, we will not be able to tell the difference between ‘Sa’ and ‘Re’.
In the above case, C0 is at 16.3516 Hz. The male vocal range is said to lie between 80 Hz and 700 Hz, and the
female vocal range between 140 Hz and 1100 Hz. So we would like to start at octave 4, as C4 is at approximately
261.6 Hz.

Note how the bar width has reduced again. This is because when ‘Sa’ starts at C4, the frequency range over which
these notes are spread is also larger.
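
The numbers behind this can be checked with a short Python sketch. It computes note frequencies in equal
temperament and compares the gap between ‘Sa’ and ‘Re’ with the frequency step of one FFT bin. It rests on
several assumptions that are not stated in this guide: A4 is tuned to 440 Hz, the bar width corresponds to
roughly one FFT bin, and the settings are 8192 and 44100 Hz. The function names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_frequency(name, octave, a4_hz=440.0):
    """Frequency in Hz of a note in equal temperament (assumes A4 = 440 Hz)."""
    midi = 12 * (octave + 1) + NOTE_NAMES.index(name)
    return a4_hz * 2 ** ((midi - 69) / 12)

def sa_re_gap_in_bins(octave, fft_size=8192, sample_rate_hz=44100.0):
    """Distance between Sa = C and Re = D in the given octave, in FFT bins."""
    gap_hz = note_frequency("D", octave) - note_frequency("C", octave)
    bin_width_hz = sample_rate_hz / fft_size
    return gap_hz / bin_width_hz

for octave in (0, 4):
    sa = note_frequency("C", octave)
    bins = sa_re_gap_in_bins(octave)
    print(f"C{octave}: {sa:.4f} Hz, Sa to Re spans {bins:.2f} bins")
```

Under these assumptions, ‘Sa’ and ‘Re’ are less than half a bin apart when ‘Sa’ is C0, so they cannot be
separated, while at C4 they are about six bins apart.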

I hope you now understand how the visual representation of accuracy works. In practice, I suggest keeping the
sampling rate well above the FFT sample size; the values 8192 and 44100.0 Hz work just fine.

You may also change your base note to C#, D, D# and so on. It is entirely up to the individual.

In the ‘Listen To’ panel, there is a toggle button for each note that we want to practice. On clicking a toggle
button, you will hear the note being played, which tells you how the note sounds. Listen to it carefully, because
when you are practicing you have to sing the note at exactly the frequency you heard. Once you are ready, hit
start. The green bar starts to move immediately. If you are above or below the range, a message is displayed as
shown.
However, if you are singing just right, the green bar will appear under the note you are singing. For example,
in the screenshot below, we have the note ‘Dha’.
Think of this feature as a game. Whoever can keep the green bar steady at a given note wins. The longer you can
keep it steady, the better you are.
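
The decision the practice screen makes can be imagined along the following lines. This Python sketch is only a
guess at the logic, not the software's actual code: it assumes the major-scale pattern for the seven notes, a
tolerance of half a semitone at either end of the range, and the hypothetical function name classify_pitch.

```python
import math

SWARAS = ["Sa", "Re", "Ga", "Ma", "Pa", "Dha", "Ni"]
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # assumed semitone offsets from Sa

def classify_pitch(freq_hz, sa_hz, tolerance_semitones=0.5):
    """Return the note nearest to freq_hz, or an out-of-range message."""
    # Distance of the sung pitch from Sa, measured in semitones.
    semitones = 12 * math.log2(freq_hz / sa_hz)
    if semitones < -tolerance_semitones:
        return "below range"
    if semitones > MAJOR_SCALE_STEPS[-1] + tolerance_semitones:
        return "above range"
    nearest = min(range(len(SWARAS)),
                  key=lambda i: abs(MAJOR_SCALE_STEPS[i] - semitones))
    return SWARAS[nearest]

sa = 261.63  # Sa placed at C4
for sung in (200.0, 261.63, 440.0, 600.0):
    print(sung, "Hz:", classify_pitch(sung, sa))
```

In this sketch a pitch between two notes simply snaps to the nearer one; the real display presumably shows how
far off you are through the position of the green bar.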

So keep practicing.

To open and play a pre-generated notes file.

You will often want to listen again to notes that you generated and saved earlier. For this, you need the XML
file you saved.

1. Select the menu item ‘Open Existing G.N. XML File’. (G.N. stands for Generated Notes).

2. You will then be asked to open the previously generated XML file.
3. Open the file and you will be presented with the following window.

Here you may select the instrument and play the notes again.

This concludes the features of this software.
