Relation Extraction based on Composite Kernel combining Pattern Similarity of Predicate-Argument Structure

  • Received : 2011.07.01
  • Accepted : 2011.08.08
  • Published : 2011.10.31

Abstract

When relations between named entities are extracted automatically from literature, many kinds of document analysis results can be used. This paper proposes a composite kernel that uses two sources of similarity at the same time: (1) phrase-structure similarity from the convolution parse tree kernel, which has shown comparatively high performance in earlier work, and (2) similarity of predicate-argument structure patterns, which express meaningful associations between two entities. The existing convolution parse tree kernel, which relies on syntactic structure, is combined with a predicate-argument structure pattern similarity kernel, which relies on the semantic structure between predicates and their arguments, so that the two kernels complement each other. The composite kernel was evaluated on several test collections. It performed better than the convolution parse tree kernel used alone, which draws only on phrase-structure information, and it also performed better than the state-of-the-art approach.
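The abstract does not state how the two kernels are combined, so the following is only a minimal sketch of how a composite kernel of this kind might be assembled. It assumes a convex combination with a hypothetical weight `alpha`, a Collins-Duffy style subtree-counting kernel with decay `lam` standing in for the convolution parse tree kernel, and a set-overlap kernel over hypothetical predicate-argument pattern tuples. None of these names, values, or representations are taken from the paper.

```python
import math
from functools import partial

# Assumption: a parse tree is a nested tuple (label, child, ...); leaves are words (str).
# Assumption: predicate-argument patterns are hashable tuples collected in a frozenset.

def label(node):
    return node if isinstance(node, str) else node[0]

def internal_nodes(tree):
    """Yield every non-leaf node of a nested-tuple tree."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from internal_nodes(child)

def production(node):
    """Grammar production at a node: parent label plus the labels of its children."""
    return (node[0], tuple(label(c) for c in node[1:]))

def delta(n1, n2, lam):
    """Decayed count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str) and not isinstance(c2, str):
            score *= 1.0 + delta(c1, c2, lam)
    return score

def tree_kernel(t1, t2, lam=0.4):
    """Sum delta over all pairs of internal nodes (Collins-Duffy style)."""
    return sum(delta(a, b, lam)
               for a in internal_nodes(t1) for b in internal_nodes(t2))

def pas_kernel(p1, p2):
    """Illustrative pattern similarity: number of shared patterns."""
    return float(len(p1 & p2))

def normalized(kernel, x, y):
    """Scale a kernel so that self-similarity is 1 (0 if either side is empty)."""
    denom = math.sqrt(kernel(x, x) * kernel(y, y))
    return kernel(x, y) / denom if denom > 0 else 0.0

def composite_kernel(x, y, alpha=0.5, lam=0.4):
    """Convex combination of the normalized tree and pattern kernels."""
    (tree_x, pats_x), (tree_y, pats_y) = x, y
    k_tree = normalized(partial(tree_kernel, lam=lam), tree_x, tree_y)
    k_pas = normalized(pas_kernel, pats_x, pats_y)
    return alpha * k_tree + (1.0 - alpha) * k_pas

# Two toy relation instances: (parse tree, predicate-argument patterns).
a = (("S", ("NP", ("NNP", "ProtA")),
           ("VP", ("VBZ", "binds"), ("NP", ("NNP", "ProtB")))),
     frozenset({("bind", "ARG1", "ARG2")}))
b = (("S", ("NP", ("NNP", "ProtC")),
           ("VP", ("VBZ", "binds"), ("NP", ("NNP", "ProtD")))),
     frozenset({("bind", "ARG1", "ARG2"), ("inhibit", "ARG1", "ARG2")}))

print(composite_kernel(a, b))
```

Normalizing each kernel before mixing keeps the two similarity scales comparable, so `alpha` can be read as the relative weight given to syntactic versus predicate-argument evidence. A pairwise matrix of such values could then be passed to any classifier that accepts a precomputed kernel.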
