Coelho, F., J. Devezas, and C. Ribeiro (2013). Large-scale Crossmedia Retrieval for Playlist Generation and Song Discovery. In Proceedings of the 10th International Conference in the RIAO Series (OAIR 2013), Lisbon, Portugal.
To explore vast collections of audio content, users require automated tools capable of providing music search and recommendation even when faced with large-scale collections. Collaborative-filtering recommenders rely on user-generated information and may be hindered by the lack of users or a bias for certain popular genres, enclosing users in an information bubble. Audio content analysis, on the other hand, is a reliable source of audio similarity, used in tasks such as music classification. For highly interactive tasks, however, the performance of analysis algorithms becomes an issue.
In this work, we address the playlist generation and song discovery tasks on large-scale datasets. We generate playlists and explore the collections with example-based queries using audio features, lyrics and tags. Approximate indexing and cross-media reranking are used for efficiency. Audio content is mapped to textual representations that can be handled by information retrieval libraries.
We explored the feasibility of this content-based approach on the Million Song Dataset, a large-scale collection of audio features and associated text data comprising almost 300 GB of information. The proposed strategy can be used either independently, as a content-based music retrieval system, or as a component of hybrid recommender systems.
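The idea of mapping audio content to textual representations, mentioned in the abstract, can be illustrated with a minimal sketch. It is not the authors' method: the quantization scheme, the value range, the bin count, and the names audio_to_tokens, build_index and query_by_example are all assumptions made for illustration. Each feature dimension is binned into a short token, the tokens are stored in a plain inverted index, and an example-based query ranks songs by the number of shared tokens.

```python
from collections import defaultdict


def audio_to_tokens(features, n_bins=8, lo=-1.0, hi=1.0):
    """Quantize each feature dimension into a bin and emit one token per dimension.

    Hypothetical scheme: values are assumed to lie roughly in [lo, hi];
    anything outside that range is clipped.
    """
    width = (hi - lo) / n_bins
    tokens = []
    for dim, value in enumerate(features):
        clipped = min(max(value, lo), hi)
        bin_id = min(int((clipped - lo) / width), n_bins - 1)
        tokens.append(f"d{dim}b{bin_id}")
    return tokens


def build_index(songs):
    """Build an inverted index: token -> set of song ids containing it."""
    index = defaultdict(set)
    for song_id, features in songs.items():
        for token in audio_to_tokens(features):
            index[token].add(song_id)
    return index


def query_by_example(index, features, exclude=None):
    """Rank songs by how many tokens they share with the query features."""
    scores = defaultdict(int)
    for token in audio_to_tokens(features):
        for song_id in index.get(token, ()):
            if song_id != exclude:
                scores[song_id] += 1
    # Highest overlap first; ties are broken by song id for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Toy feature vectors (made up for this sketch, not taken from any dataset).
songs = {
    "a": [0.10, 0.20, -0.40],
    "b": [0.12, 0.22, -0.42],
    "c": [-0.90, 0.90, 0.90],
}
index = build_index(songs)
playlist = query_by_example(index, songs["a"], exclude="a")
```

In this toy example, song "b" falls into the same bins as the query and is returned, while "c" shares no tokens and is left out. Because the tokens are ordinary strings, the same representation could in principle be handed to a text retrieval library in place of the hand-rolled index used here.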