A Statistical Approach for Extracting and Naming Relation between Concepts

개념간 관계의 추출과 명명을 위한 통계적 접근방법

  • Published : 2005.08.01

Abstract

Ontologies were proposed as the logical foundation of the next-generation Semantic Web. An ontology represents domain knowledge in a formal form, which enables machines to understand that knowledge and to provide appropriate intelligent services in response to user requests. However, constructing and maintaining an ontology requires considerable time and human effort. This paper proposes an automatic method, as one part of ontology construction, for defining the relations between concepts found in documents. The proposed method works in three steps. First, we find concept pairs that form association rules, based on the concepts in domain-specific documents. Next, we find patterns that describe the relation between two concepts by clustering the contexts that occur between the two concepts of each association rule. Finally, we derive a generalized relation name by clustering the resulting patterns. To verify the proposed method, we extract and evaluate relations between concepts using a document set provided by TREC (Text REtrieval Conference). The results show that the proposed method can provide useful information that describes the relations between concepts.
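
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give thresholds, similarity measures, or clustering algorithms, so the support and confidence cutoffs, the Jaccard similarity, the greedy single-pass clustering, and the most-frequent-word naming rule below are all assumptions chosen for brevity. In particular, the paper names a relation by clustering the patterns a second time, whereas this sketch substitutes a simple frequency count. All function names and the toy documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def association_pairs(docs, concepts, min_support=0.5, min_confidence=0.6):
    """Step 1 (assumed form): rules x -> y over documents, filtered by
    document-level support and confidence."""
    n = len(docs)
    single, pair = Counter(), Counter()
    for doc in docs:
        tokens = set(doc.lower().split())
        present = sorted(c for c in concepts if c in tokens)
        single.update(present)
        pair.update(combinations(present, 2))
    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / single[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


def contexts_between(docs, a, b):
    """Step 2a: collect the words between the first a and the first b,
    only when a occurs before b in the document."""
    found = []
    for doc in docs:
        tokens = doc.lower().split()
        if a in tokens and b in tokens:
            i, j = tokens.index(a), tokens.index(b)
            if i < j:
                found.append(tuple(tokens[i + 1:j]))
    return found


def jaccard(x, y):
    """Set overlap of two contexts; two empty contexts count as identical."""
    sx, sy = set(x), set(y)
    union = sx | sy
    return len(sx & sy) / len(union) if union else 1.0


def cluster_contexts(contexts, threshold=0.5):
    """Step 2b (assumed algorithm): greedy single-pass clustering that joins
    the first cluster whose seed context is similar enough."""
    clusters = []
    for ctx in contexts:
        for cluster in clusters:
            if jaccard(ctx, cluster[0]) >= threshold:
                cluster.append(ctx)
                break
        else:
            clusters.append([ctx])
    return clusters


def name_cluster(cluster):
    """Step 3 (simplified stand-in): label a cluster with its most frequent word."""
    counts = Counter(word for ctx in cluster for word in ctx)
    return counts.most_common(1)[0][0] if counts else ""


# Toy data, invented for illustration only.
docs = [
    "aspirin reduces fever",
    "aspirin quickly reduces fever",
    "aspirin causes bleeding",
    "fever is common",
]
concepts = {"aspirin", "fever", "bleeding"}

for x, y, support, confidence in association_pairs(docs, concepts):
    for cluster in cluster_contexts(contexts_between(docs, x, y)):
        print(x, name_cluster(cluster), y)
```

On the toy documents, the pair (aspirin, fever) passes both thresholds, the two contexts between them fall into one cluster, and that cluster is labeled with the word the contexts share. A real system would need proper tokenization, multi-word concepts, and handling of repeated concept mentions, none of which this sketch attempts.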
