GTZAN Genre Collection
This dataset was used for the well-known paper in genre classification
"Musical genre classification of audio signals" by
G. Tzanetakis and P. Cook in IEEE Transactions on Speech and Audio
Processing, 2002.
Unfortunately, the database was collected gradually and very early on
in my research, so I have no titles (and obviously no copyright
permission).
The files were collected in 2000-2001 from a variety of sources,
including personal CDs, radio, and microphone recordings, in order to
represent a variety of recording conditions.
Nevertheless, I have been providing it to researchers
upon request, mainly for comparison purposes. Please contact
George Tzanetakis (firstname.lastname@example.org) if you intend to publish
experimental results using this dataset.
The dataset consists of 1000 audio tracks, each 30 seconds long. It
contains 10 genres, each represented by 100 tracks. The tracks are
all 22050 Hz mono 16-bit audio files in .wav format.
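
The format stated above can be checked on a downloaded file. The following is a minimal sketch using Python's standard wave module. The function names describe_wav and matches_stated_format are hypothetical helpers introduced here for illustration and are not part of the dataset or of any accompanying software.

```python
import wave

# Format stated in the dataset description.
EXPECTED_RATE_HZ = 22050
EXPECTED_CHANNELS = 1      # mono
EXPECTED_SAMPLE_BYTES = 2  # 16-bit samples


def describe_wav(path):
    """Return (sample_rate, channels, sample_width_bytes, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        duration = wav.getnframes() / rate
    return rate, channels, width, duration


def matches_stated_format(path):
    """True if the file is 22050 Hz, mono, 16-bit PCM."""
    rate, channels, width, _ = describe_wav(path)
    return (rate, channels, width) == (
        EXPECTED_RATE_HZ,
        EXPECTED_CHANNELS,
        EXPECTED_SAMPLE_BYTES,
    )
```

Calling matches_stated_format on the path of any track should return True if the file follows the description. The duration returned by describe_wav can be compared against the nominal 30 seconds, allowing for small deviations.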
GTZAN genre collection. (Approximately 1.2GB)
A similar dataset was collected for the purposes of music/speech
discrimination. The dataset consists of 120 tracks, each 30 seconds
long. Each class (music/speech) has 60 examples. The tracks are all
22050 Hz mono 16-bit audio files in .wav format.
GTZAN music/speech collection. (Approximately 297MB)
RWC Music Database