Vocal Separation in Music Using SVM and Selective Frequency Subtraction

  • Hyun-Tae Kim (Department of Multimedia Engineering, Dong-eui University)
  • Received : 2014.10.09
  • Accepted : 2015.01.12
  • Published : 2015.01.31

Abstract

With growing interest in karaoke machines that play original-sound accompaniment, manufacturers of MIDI-based karaoke machines are looking for cheaper alternatives to recording the accompaniment directly in an expensive studio. One such approach is to produce the accompaniment by removing only the singer's voice from the singer's album recording. This paper proposes a system that separates the vocal component from the music accompaniment in stereo recordings. The proposed system consists of two stages. The first stage is vocal detection: it classifies the input signal into vocal and non-vocal portions using a support vector machine (SVM) with mel-frequency cepstral coefficients (MFCCs) as features. In the second stage, selective frequency subtraction is performed at each frequency bin within the vocal portions. In a listening test on music whose vocals were removed by the proposed system, listeners reported a relatively high level of satisfaction.
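
The second stage can be illustrated with a minimal Python sketch. The abstract does not say how frequency bins are selected, so the rule used here is an assumption for illustration only, not the paper's method: a bin is treated as vocal when its left and right levels differ by less than `tol_db`, on the premise that vocals are usually mixed to the center. The names `selective_subtract`, `frame_len`, and `tol_db` are hypothetical, and the per-frame flags in `vocal_frames` stand in for the output of the SVM/MFCC detector from the first stage.

```python
import numpy as np


def selective_subtract(left, right, vocal_frames, frame_len=1024, tol_db=3.0):
    """Cancel center-panned bins, but only in frames flagged as vocal.

    left, right  : 1-D arrays holding the two stereo channels (same length).
    vocal_frames : one boolean per non-overlapping frame; a stand-in for the
                   vocal / non-vocal decision of the first stage.
    tol_db       : ASSUMED selection threshold; a bin counts as "center"
                   when the left/right level difference is below it.
    """
    out_l = np.array(left, dtype=float)   # copies, so inputs stay intact
    out_r = np.array(right, dtype=float)
    n_frames = min(len(out_l) // frame_len, len(vocal_frames))
    eps = 1e-12                           # avoids log(0) in silent bins

    for i in range(n_frames):
        if not vocal_frames[i]:
            continue                      # non-vocal frames pass through
        s = slice(i * frame_len, (i + 1) * frame_len)
        L = np.fft.rfft(out_l[s])
        R = np.fft.rfft(out_r[s])

        # Per-bin level difference between the channels, in dB.
        level_diff = 20.0 * np.abs(np.log10((np.abs(L) + eps) / (np.abs(R) + eps)))
        center = level_diff < tol_db      # assumed selection rule

        # In selected bins, keep only the left-minus-right difference,
        # which cancels whatever is common to both channels.
        diff = 0.5 * (L - R)
        out_l[s] = np.fft.irfft(np.where(center, diff, L), n=frame_len)
        out_r[s] = np.fft.irfft(np.where(center, -diff, R), n=frame_len)

    return out_l, out_r
```

Non-overlapping rectangular frames keep the sketch short but can produce audible discontinuities at frame boundaries; a practical version would likely use overlapping windowed frames with overlap-add resynthesis. Restricting the subtraction to vocal frames leaves center-panned instruments untouched wherever no singing is detected.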
