Covers 1000

A ''cover song'' is a different version of the same song, usually performed by a different artist, and often with different instruments, recording settings, mixing/balance, tempo, and key. Although humans can readily identify cover songs, doing so automatically with a machine remains a challenging problem. One might wonder why it isn't possible to use an app like Shazam to automatically identify, say, a live recording of a song. As it turns out, the algorithm that powers Shazam looks for exact clips of recordings using a technique known as audio fingerprinting. It is extremely good at its job, especially given a large database, but it is unable to detect re-renditions, even by the same artist. To help move research in automatic cover song identification forward, we present a medium-sized cover songs dataset consisting of a collection of features from 395 groups of cover songs, which have been checked by hand. We also have a live demo of our recent technique for identifying and aligning cover songs beat-by-beat, which currently achieves state-of-the-art results on automatic cover song identification. Finally, we have implemented an algorithm to synthesize new cover songs from raw audio in a fully automated fashion, and we present two tools (LoopDitty and GraphDitty) which we created to help design our algorithms.
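To make the fingerprinting limitation concrete, here is a minimal toy sketch in Python. It is not Shazam's actual algorithm and not the technique from our papers. It is a simplified stand-in that hashes pairs of consecutive spectral peaks, and every name in it (`synth`, `fingerprint`, `match_score`, `SR`, `FRAME`) is hypothetical. It assumes pure-tone "recordings" and a clip that starts on a frame boundary, which real systems do not require.

```python
import numpy as np

SR = 8000     # toy sample rate in Hz (assumption)
FRAME = 512   # samples per analysis frame (assumption)

def synth(freqs, frames_per_note):
    """Render a melody as a sequence of pure tones, one per note."""
    t = np.arange(frames_per_note * FRAME) / SR
    return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])

def fingerprint(signal):
    """Toy fingerprint: the set of (peak bin, next peak bin) pairs."""
    n_frames = len(signal) // FRAME
    frames = signal[:n_frames * FRAME].reshape(n_frames, FRAME)
    spectra = np.abs(np.fft.rfft(frames * np.hanning(FRAME), axis=1))
    peaks = spectra.argmax(axis=1)  # strongest frequency bin in each frame
    return set(zip(peaks[:-1].tolist(), peaks[1:].tolist()))

def match_score(query, reference):
    """Fraction of the query's hashes that also occur in the reference."""
    q, r = fingerprint(query), fingerprint(reference)
    return len(q & r) / len(q)

# A short melody (A4 B4 C5 D5 E5 D5 C5 B4), four frames per note.
melody = [440.0, 493.88, 523.25, 587.33, 659.25, 587.33, 523.25, 493.88]
recording = synth(melody, frames_per_note=4)

# An exact, frame-aligned excerpt of the same recording.
clip = recording[4 * FRAME:20 * FRAME]

# A "cover": same melody, six semitones higher and played more slowly.
cover = synth([f * 2 ** (6 / 12) for f in melody], frames_per_note=6)

print("exact clip:", match_score(clip, recording))
print("cover:     ", match_score(cover, recording))
```

In this sketch the excerpt shares all of its hashes with the recording, while the transposed rendition shares essentially none, even though a listener would recognize the same tune. Changing the key moves every spectral peak to a different bin, so a representation built on exact peak positions has nothing left to match.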

The Demo

Click here to view examples of cover songs which have been aligned by the algorithms in the paper.

The Covers 80 Dataset	A dataset of low-quality audio consisting of 160 songs split into two disjoint subsets, A and B, each containing exactly one version from every pair, for a total of 80 pairs. Mostly '80s and early '90s pop music.
Kara1k Karaoke Songs Dataset	A dataset with features for 2000 songs: 1000 originals and 1000 corresponding karaoke versions. Also a great dataset for singing voice analysis.
http://www.secondhandsongs.com	A community project of annotations of cover songs which formed the basis of this dataset.
The Second Hand Songs Dataset	Another dataset based on annotations from secondhandsongs.com, which is a subset of the Million Song Dataset consisting of about 20,000 tracks with EchoNest features.
The Youtube Covers Dataset	A collection of chroma, CRP, and CENS features for 350 songs of various genres.

Covers 1000 Dataset

by Chris Tralie
