Encountered Difficulties
We did not have trouble loading our data into MATLAB. The recording was saved as a .m4a file and could be loaded into MATLAB with the command "audioread(filename)". Our main difficulty was parsing the signal into separate notes. We had to adjust the time interval size by trial and error to find a window width that captures a single note. We did find a cutoff time that works for this recording; however, it will not suffice in the future when we handle samples with varying rhythms.
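As a minimal illustration of the fixed-window idea (written in Python with only the standard library, not the MATLAB code used for the project), the sketch below cuts a signal into equal windows and reports the strongest frequency in each. The test signal, the sample rate, the window size, and the function names are all assumptions made for the example.

```python
import cmath
import math


def dominant_frequency(frame, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin in one window."""
    n = len(frame)
    best_bin, best_mag = 0, 0.0
    # Bins 1..n/2 cover the distinct positive frequencies of a real signal.
    for k in range(1, n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


def split_into_windows(signal, window_size):
    """Cut the signal into consecutive, non-overlapping windows of equal length."""
    return [signal[i:i + window_size]
            for i in range(0, len(signal) - window_size + 1, window_size)]


def tone(freq_hz, num_samples, sample_rate):
    """Generate a pure sine tone (hypothetical stand-in for a recorded note)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(num_samples)]


SAMPLE_RATE = 4000   # assumed, kept low so the naive DFT stays fast
WINDOW_SIZE = 400    # 0.1 s; this is the value that has to be tuned by hand

# Two notes of 0.1 s each: 440 Hz followed by 660 Hz.
signal = tone(440, 400, SAMPLE_RATE) + tone(660, 400, SAMPLE_RATE)
detected = [dominant_frequency(w, SAMPLE_RATE)
            for w in split_into_windows(signal, WINDOW_SIZE)]
print(detected)
```

Here the window length happens to match the note length, so each window contains exactly one note. With notes of varying length, a single fixed window would sometimes span two notes or only part of one, which is the limitation described above.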
Anticipated Difficulties
Continued ringing of previously played notes
From the recorded legato violin piece, we quickly found that previously played notes continue to ring and can still possess considerable signal strength (high magnitude in the frequency domain), which leads our program to treat the ringing as a repeated note played along with the correct note. We will need to come up with a way to eliminate this source of error in note detection.
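One possible approach, sketched here in Python as an assumption and not as a method we have implemented, is to count a note as newly played only when its magnitude rises sharply compared to the previous window, so that a decaying (ringing) note is ignored. The per-window magnitudes, the note names, and the rise_ratio parameter are hypothetical.

```python
def new_onsets(frames, rise_ratio=1.5):
    """frames: one dict per window, mapping note name -> magnitude.

    Return, for each window, the notes whose magnitude is at least
    rise_ratio times their magnitude in the previous window. A note that
    was absent in the previous window is treated as having magnitude 0.
    """
    onsets = []
    previous = {}
    for current in frames:
        struck = [note for note, mag in current.items()
                  if mag > 0 and mag >= rise_ratio * previous.get(note, 0.0)]
        onsets.append(sorted(struck))
        previous = current
    return onsets


# Hypothetical magnitudes: each earlier note keeps ringing while it decays.
frames = [
    {"A4": 1.0},
    {"A4": 0.6, "B4": 1.0},
    {"A4": 0.3, "B4": 0.5, "C5": 0.9},
]
onsets = new_onsets(frames)
print(onsets)
```

On this toy input only the newly struck note is reported in each window. The sketch would miss a note that is genuinely repeated while its previous ringing is still strong, so it addresses only part of the problem.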
Noise: the program could pick up noise and mistake it for a high note being played.
Multiple Notes
Multiple notes fall into the category of what we initially thought of as chord identification. From our staccato piano recording, we found that with a good window we can still identify all the notes played at the same time. However, by looking at the spectrogram, we see that there are other harmonics with the same signal power as the fundamentals of the notes played. This constitutes another source of error in note detection and a difficulty we need to overcome in this project.
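One simple heuristic, shown here as a Python sketch under our own assumptions and not as a finished solution, is to discard any spectral peak that lies close to an integer multiple of a lower peak that was already accepted. The function name, the tolerance, the harmonic limit, and the example peak frequencies are hypothetical.

```python
def drop_harmonics(peaks_hz, tolerance=0.03, max_harmonic=8):
    """Keep peaks that are not near an integer multiple of a lower kept peak.

    A peak f is treated as a harmonic of f0 when f / f0 is within
    tolerance * n of an integer n, for 2 <= n <= max_harmonic.
    """
    fundamentals = []
    for f in sorted(peaks_hz):
        is_harmonic = False
        for f0 in fundamentals:
            ratio = f / f0
            nearest = round(ratio)
            if 2 <= nearest <= max_harmonic and abs(ratio - nearest) <= tolerance * nearest:
                is_harmonic = True
                break
        if not is_harmonic:
            fundamentals.append(f)
    return fundamentals


# Hypothetical peaks for a C major chord (C4, E4, G4) plus three overtones.
peaks = [261.63, 329.63, 392.00, 523.25, 659.26, 784.00]
fundamentals = drop_harmonics(peaks)
print(fundamentals)
```

For this input the three chord tones are kept and the three overtones are dropped. The same rule would also remove a note that is really played an octave above another one, since that note sits at exactly twice the lower frequency, so peak frequencies alone cannot fully separate harmonics from played notes.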
Rhythm issues: identifying the duration of notes
When identifying time signatures, it is apparent that if there is no pause or moment of silence between two consecutive notes, we will have a hard time finding out exactly how long a note lasted. In this case the window method would not work effectively. The difficulty lies in the fact that the signal strength of a note may stay high across a note boundary, as in the case of repeated notes. In addition, music is recorded at different tempos. At a sufficiently fast tempo, a half note might sound like a quarter note. Without any knowledge of the tempo of the recording, our program would print a quarter note, which does not match the actual human-made sheet music.
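The tempo ambiguity can be made concrete with a small worked example in Python. It assumes that the quarter note gets the beat, so a note lasting d seconds at a tempo of T beats per minute spans d * T / 60 beats. The table of note values and the function name are assumptions made for illustration.

```python
# Note lengths in beats, assuming the quarter note is one beat.
NOTE_VALUES = {
    "sixteenth": 0.25,
    "eighth": 0.5,
    "quarter": 1.0,
    "half": 2.0,
    "whole": 4.0,
}


def note_value(duration_s, tempo_bpm):
    """Return the name of the note value closest to the measured duration."""
    beats = duration_s * tempo_bpm / 60.0
    return min(NOTE_VALUES, key=lambda name: abs(NOTE_VALUES[name] - beats))


# The same one-second note, interpreted at two different tempos.
at_60_bpm = note_value(1.0, 60)
at_120_bpm = note_value(1.0, 120)
print(at_60_bpm, at_120_bpm)
```

A note measured at one second is a quarter note at 60 beats per minute but a half note at 120 beats per minute, so the measured duration alone does not determine what should be printed on the sheet.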
Background noises
Our recording may be made in a place where other people are talking or singing, other instruments are playing, etc., so there is a possibility of picking up a frequency with sufficient intensity (signal strength) to trick us into thinking that it is a note being played. This presents us with another difficulty, and we will have to find a way to eliminate it unless we assume that our recording was made in a place with minimal noise sources.
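A basic starting point, sketched in Python under our own assumptions, is to measure the magnitudes in a segment where no note is played and to accept only peaks that clearly exceed that noise floor. The threshold rule (mean plus k standard deviations), the value of k, and all magnitudes below are hypothetical.

```python
import statistics


def noise_threshold(noise_magnitudes, k=3.0):
    """Threshold = mean + k * standard deviation of note-free magnitudes."""
    return statistics.mean(noise_magnitudes) + k * statistics.pstdev(noise_magnitudes)


def keep_strong_peaks(peaks, threshold):
    """peaks: list of (frequency_hz, magnitude). Keep those above threshold."""
    return [(f, m) for f, m in peaks if m > threshold]


# Hypothetical magnitudes measured while no note is being played.
noise = [0.8, 1.0, 1.2, 0.9, 1.1]
threshold = noise_threshold(noise)

# Hypothetical peaks from one window: one real note and two weak noise peaks.
peaks = [(440.0, 25.0), (1320.0, 1.3), (3100.0, 0.9)]
kept = keep_strong_peaks(peaks, threshold)
print(threshold, kept)
```

This only removes background noise that is weak compared to the instrument. Loud talking, singing, or another instrument would pass the threshold, so those cases would still require either a different technique or the assumption of a quiet recording environment.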