Robust Music Identification Using Long-Term Dynamic Modulation Spectrum

  • Kim, Hyoung-Gook (Samsung Advanced Institute of Technology Computing Lab)
  • Eom, Ki-Wan (Samsung Advanced Institute of Technology Computing Lab)
  • Published: 2006.06.01

Abstract

In this paper, we propose a robust music audio fingerprinting system for automatic music retrieval. The fingerprint feature is extracted from a long-term dynamic modulation spectrum (LDMS) estimated in the perceptually compressed domain. The major advantage of this feature is its robustness against severe background noise, such as street and car noise. In addition, fast search is performed by looking up a hash table keyed by 32-bit hash values, whose bits are quantized from the logarithmic-scale modulation frequency coefficients. Experiments show that the LDMS fingerprint offers high scalability, robustness, and a small fingerprint size. Moreover, under severe recording-noise conditions, its performance improves markedly over other robust fingerprints based on the power spectrum.
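
The hash-based search summarized above can be illustrated with a minimal sketch. The abstract does not give the exact quantization rule or the number of coefficients per frame, so the code assumes 33 coefficients per frame and sets each bit from the sign of the difference between adjacent log-scale coefficients. That rule, and the names `hash_from_coefficients`, `build_index`, and `identify`, are hypothetical choices made for illustration, not the authors' method.

```python
import math
from collections import defaultdict

HASH_BITS = 32  # the abstract specifies 32-bit hash values


def hash_from_coefficients(coeffs):
    """Quantize HASH_BITS + 1 positive coefficients into a 32-bit integer.

    Assumed rule (not taken from the paper): bit i is 1 when the
    log-scale coefficient i + 1 is larger than log-scale coefficient i.
    """
    if len(coeffs) != HASH_BITS + 1:
        raise ValueError("expected %d coefficients" % (HASH_BITS + 1))
    # Small offset avoids log(0) for silent frames.
    logs = [math.log(c + 1e-12) for c in coeffs]
    value = 0
    for i in range(HASH_BITS):
        bit = 1 if logs[i + 1] > logs[i] else 0
        value = (value << 1) | bit
    return value


def build_index(tracks):
    """Map each hash value to the (track_id, frame_position) pairs producing it."""
    index = defaultdict(list)
    for track_id, frames in tracks.items():
        for position, coeffs in enumerate(frames):
            index[hash_from_coefficients(coeffs)].append((track_id, position))
    return index


def identify(index, query_frames):
    """Return the track with the most exact hash matches, or None if no match."""
    votes = defaultdict(int)
    for coeffs in query_frames:
        for track_id, _position in index.get(hash_from_coefficients(coeffs), ()):
            votes[track_id] += 1
    return max(votes, key=votes.get) if votes else None


# Toy database: one frame per track, with synthetic coefficients.
tracks = {
    "track_a": [[1.0 + (i % 3) for i in range(HASH_BITS + 1)]],
    "track_b": [[float(HASH_BITS + 1 - i) for i in range(HASH_BITS + 1)]],
}
index = build_index(tracks)

# A uniformly louder copy of track_a yields the same hash under this rule.
query = [[2.0 * c for c in tracks["track_a"][0]]]
match = identify(index, query)
```

Because each bit in this sketch depends only on the ordering of neighbouring coefficients, a uniform gain change leaves the hash unchanged. A practical system would also need to tolerate bit errors, for example by probing hash values within a small Hamming distance, which this sketch omits.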
