Character based Hangeul search using Location-specific Character Frequency

  • Lee, Jung-Hwa (Department of Computer Software Engineering, Dongeui University)
  • Lee, Jong-Min (Department of Computer Software Engineering, Dongeui University)
  • Kim, Seong-Woo (Department of Computer Software Engineering, Dongeui University)
  • Published : 2009.09.30

Abstract

Hangeul search functionality, including dictionary search, is used in many Hangeul applications. Existing research on Hangeul search methods uses the Hangeul syllable as the basic unit. However, given the characteristics of Hangeul, research that uses the Hangeul character as the basic unit is also needed. In this paper, we propose a character-based Hangeul search method that uses location-specific frequency information, and we verify the effectiveness of the proposed method through experiments.
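
The difference between syllable-based and character-based search can be illustrated with a minimal Python sketch. It is not the method of the paper, whose details the abstract does not give. It assumes that "character" means an individual jamo (consonant or vowel letter) and that "location" means the index of a jamo within the decomposed word. The function names `to_jamo`, `positional_frequency`, and `jamo_prefix_search` are hypothetical. Only the decomposition arithmetic is standard: it follows the Unicode layout of precomposed Hangeul syllables starting at U+AC00.

```python
from collections import Counter

# Jamo tables in Unicode order for precomposed syllables (U+AC00..U+D7A3).
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
          "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
          "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]        # 27 finals + "none"


def to_jamo(text: str) -> str:
    """Decompose precomposed syllables into jamo; pass other characters through."""
    out = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:  # 19 * 21 * 28 precomposed syllables
            out.append(INITIALS[code // 588])        # 588 = 21 * 28
            out.append(MEDIALS[(code % 588) // 28])
            out.append(FINALS[code % 28])
        else:
            out.append(ch)
    return "".join(out)


def positional_frequency(words):
    """Count how often each jamo occurs at each index (assumed meaning of 'location')."""
    freq = Counter()
    for word in words:
        for index, jamo in enumerate(to_jamo(word)):
            freq[(index, jamo)] += 1
    return freq


def jamo_prefix_search(words, query):
    """Return the words whose jamo sequence starts with the jamo of the query."""
    prefix = to_jamo(query)
    return [word for word in words if to_jamo(word).startswith(prefix)]


words = ["한글", "한국", "학교"]
freq = positional_frequency(words)
matches = jamo_prefix_search(words, "한ㄱ")  # partially typed second syllable
```

In this sketch, the query "한ㄱ" contains an incomplete second syllable, which a purely syllable-based index cannot match directly, while the jamo-level comparison can. The frequency table only shows what position-specific counts might look like; how the proposed method actually uses them is not described in the abstract.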
