Conceptual Extraction of Compound Korean Keywords

  • Received : 2016.08.26
  • Accepted : 2017.01.07
  • Published : 2020.04.30

Abstract

After reading a document, people form a concept of the information they have taken in and combine multiple words into keywords that represent the material. With that in mind, this study proposes a more efficient keyword extraction method in which scholarly journals serve as the basis for production rules built from the concept information of words appearing in a document, so that author-provided keywords remain usable even when they do not appear in the body of the document. The study also presents a new way to determine the importance of each keyword and to exclude irrelevant keywords. To validate the extracted keywords, titles and abstracts of journal articles on natural language and auditory language were collected for analysis. A comparison of author-provided keywords with the keywords produced by the developed system showed that the system was highly useful, with an accuracy of up to 96%.
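The general idea of merging words into compound keywords through concept-based production rules can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the concept dictionary `CONCEPTS`, the rule set `RULES`, the frequency-based importance count, and the definition of accuracy as the share of author-provided keywords that the system recovers.

```python
from collections import Counter

# Hypothetical concept dictionary (word -> concept label); not from the paper.
CONCEPTS = {
    "natural": "QUALIFIER",
    "language": "LANGUAGE",
    "processing": "ACTION",
    "keyword": "TERM",
    "extraction": "ACTION",
}

# Hypothetical production rules: each is a concept sequence that may be
# merged into a single compound keyword.
RULES = [
    ("TERM", "ACTION"),
    ("QUALIFIER", "LANGUAGE", "ACTION"),
    ("QUALIFIER", "LANGUAGE"),
]


def extract_compounds(tokens):
    """Scan left to right and merge the longest span matching a rule.

    Returns a Counter mapping each compound keyword to its frequency,
    used here as a stand-in importance score (an assumption).
    """
    concepts = [CONCEPTS.get(token) for token in tokens]
    rules = sorted(RULES, key=len, reverse=True)  # try longest rules first
    found = Counter()
    i = 0
    while i < len(tokens):
        for rule in rules:
            n = len(rule)
            if tuple(concepts[i:i + n]) == rule:
                found[" ".join(tokens[i:i + n])] += 1
                i += n  # skip past the merged span
                break
        else:
            i += 1  # no rule matched at this position
    return found


def accuracy(extracted, author_keywords):
    """Share of author-provided keywords present among the extracted ones."""
    if not author_keywords:
        return 0.0
    hits = sum(1 for keyword in author_keywords if keyword in extracted)
    return hits / len(author_keywords)


if __name__ == "__main__":
    text = "natural language processing supports keyword extraction"
    keywords = extract_compounds(text.split())
    print(keywords.most_common())
    print(accuracy(keywords, ["keyword extraction", "thesaurus"]))
```

Trying longer rules first keeps "natural language processing" as one keyword rather than splitting it into "natural language" and a stray word. A real system would need a far larger concept dictionary and rule set derived from the journal corpus.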
