Findit is a Python program which can detect the song being played by comparing it with a local database of songs.
"How Shazam Works" is a brilliant article which discusses the approach from scratch. The code is a direct implementation of that article, barring a few conceptual changes such as an overlapping window.
The user can go through every step of the pipeline, visualise the intermediate results and get a feel for the complete approach, which is the basic pipeline used by major commercial applications such as Shazam.
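To give a feel for the first steps of such a pipeline, here is a minimal sketch (not the code from Findit.py) that computes a spectrogram with an overlapping window and keeps only the strongest frequency in each band of every time frame. The function name `strongest_frequencies`, the window size, the 50% overlap and the band edges in `BANDS` are illustrative assumptions, not values taken from this project.

```python
import numpy as np
from scipy import signal

# Hypothetical band edges, expressed as spectrogram bin indices.
# They assume a 1024-sample window, which yields 513 frequency bins.
BANDS = ((0, 10), (10, 20), (20, 40), (40, 80), (80, 160), (160, 513))


def strongest_frequencies(samples, sample_rate, window_size=1024, overlap=0.5):
    """Return (times, peaks); peaks[i, j] is the strongest frequency in Hz
    found in band j during time frame i."""
    # Consecutive windows share `noverlap` samples (the overlapping window).
    noverlap = int(window_size * overlap)
    freqs, times, spec = signal.spectrogram(
        samples, fs=sample_rate, nperseg=window_size, noverlap=noverlap
    )
    peaks = np.empty((spec.shape[1], len(BANDS)))
    for frame in range(spec.shape[1]):
        for j, (lo, hi) in enumerate(BANDS):
            # Keep only the loudest bin of this band, discard the rest.
            strongest_bin = lo + int(np.argmax(spec[lo:hi, frame]))
            peaks[frame, j] = freqs[strongest_bin]
    return times, peaks


if __name__ == "__main__":
    # Synthetic 3-second, 440 Hz test tone instead of a real song.
    rate = 8000
    t = np.arange(3 * rate) / rate
    tone = np.sin(2 * np.pi * 440 * t)
    times, peaks = strongest_frequencies(tone, rate)
    print(peaks[0])
```

The rows of `peaks` could then be hashed and looked up in a database of fingerprints; how Findit stores and compares them is not shown here.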
Here is a spectrogram of a 3-second audio clip from 'A Sky Full of Stars'.
And here's a filtered version which only keeps the strongest frequencies.
On running it on a test of 200 clips, here are the results:
The program requires:
- Python 3.x
- SoX, a powerful open-source audio processing application. It is used for resampling the audio and trimming random clips for testing. Download the executable, rename it to 'sox' and place it in the parent folder.
- PyDub, SciPy, NumPy and Matplotlib
To run it:
- Place the audio files that form the database in the audio/ directory.
- In the 'Findit.py' script, enter the target audio path and run the program.
- test_maker.py can generate random clips from the audio database, run them through Findit automatically and report the results.
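As an illustration of how a random test clip could be cut with SoX, the sketch below builds a command line using SoX's `trim` effect (start offset, then duration). It is not the code from test_maker.py: the function name `random_clip_command`, the file names, the 3-second default length and the `sox_path` default are assumptions made for this example.

```python
import random


def random_clip_command(src, dst, track_seconds, clip_seconds=3.0,
                        sox_path="sox", rng=None):
    """Build a SoX command that cuts a random clip out of `src`.

    `track_seconds` is the length of the source track and must be supplied
    by the caller; this sketch does not measure it.
    """
    if clip_seconds > track_seconds:
        raise ValueError("clip is longer than the track")
    rng = rng or random.Random()
    # Choose a start offset so the clip fits entirely inside the track.
    start = rng.uniform(0.0, track_seconds - clip_seconds)
    # SoX syntax: sox <input> <output> trim <start> <duration>
    return [sox_path, src, dst, "trim", f"{start:.2f}", f"{clip_seconds:.2f}"]


if __name__ == "__main__":
    # Hypothetical file names; a fixed seed makes the clip reproducible.
    cmd = random_clip_command("audio/song.wav", "clip.wav", 200.0,
                              rng=random.Random(0))
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the SoX executable is in place; adjust `sox_path` to wherever the renamed executable lives.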