Convergence Study on Research Topics for Thyroid Cancer in Korea

국내 갑상선암 논문 토픽에 대한 융합연구

  • Yang, Ji-Yeon (Dept. of Applied Mathematics, Kumoh National Institute of Technology)
  • 양지연 (금오공과대학교 응용수학과)
  • Received : 2018.12.26
  • Accepted : 2019.02.20
  • Published : 2019.02.28

Abstract

This convergence study investigated trends in research topics related to thyroid cancer in Korea. We collected related research papers from DBpia and applied an LDA-based topic model. As a result, we identified four research topics, concerning "Surgery", "Disease aggressiveness", "Survival analysis", and "Well-being of patients", respectively. Using multinomial logistic regression, we found a significant time trend: the "Surgery"-related topic was popular before 2000, topics regarding "Disease aggressiveness" and "Survival analysis" were frequently addressed in the 2000s, and "Survival analysis" and especially "Well-being of patients" have been pursued since 2010. The findings could serve as a reference guide for research directions. Future work may examine whether the recent change in research topics is also observed for other diseases.

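This study combines statistical techniques to examine trends and changes in research topics related to thyroid cancer in Korea. Applying a topic model based on LDA (latent Dirichlet allocation) to thyroid cancer papers registered in DBpia yielded four research topics, which were identified as concerning "Surgery", "Disease aggressiveness", "Survival analysis", and "Well-being of patients". Examining the trend of research topics over time with a multinomial logit model showed that studies on "Surgery" were common before 2000, on "Disease aggressiveness" and "Survival analysis" in the 2000s, and on "Survival analysis" and especially "Well-being of patients" since 2010. These results can serve as basic data for exploring future directions in thyroid cancer research, and it remains to be examined whether the recent marked shift of research topics toward patient well-being is also observed for other diseases.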
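
The topic-modeling step described in the abstract can be illustrated with a minimal sketch. It fits LDA by collapsed Gibbs sampling, which is one standard inference method and not necessarily the one used in the study. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy tokenized "abstracts" are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on a list of token lists.

    Returns (doc_topic, topic_word, vocab): doc_topic[d][k] estimates
    P(topic k | document d); topic_word[k][w] estimates P(vocab[w] | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    n_dk = [[0] * n_topics for _ in docs]            # tokens of doc d in topic k
    n_kw = [[0] * n_words for _ in range(n_topics)]  # word w assigned to topic k
    n_k = [0] * n_topics                             # all tokens in topic k
    z = []                                           # topic of every token

    # Start from a random topic assignment for each token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = z[d][i]
                # Remove the token from the counts before resampling it.
                n_dk[d][k] -= 1
                n_kw[k][wid] -= 1
                n_k[k] -= 1
                # Full conditional of the token's topic given all other tokens.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wid] + beta) / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wid] += 1
                n_k[k] += 1

    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[k][w] + beta) / (n_k[k] + n_words * beta) for w in range(n_words)]
        for k in range(n_topics)
    ]
    return doc_topic, topic_word, vocab

# Hypothetical, already tokenized toy "abstracts" (not data from the study).
docs = [
    ["thyroidectomy", "surgery", "endoscopic", "surgery"],
    ["survival", "recurrence", "prognosis", "survival"],
    ["surgery", "thyroidectomy", "endoscopic"],
    ["prognosis", "survival", "recurrence"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2)
# Assumption: each paper is labeled with its most probable topic.
dominant = [max(range(2), key=row.__getitem__) for row in doc_topic]
```

Labeling each document with its most probable topic, as in the last line, is one simple way to obtain a per-paper topic that could then be counted or used as the response in a regression; whether the study assigned topics this way is an assumption.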

Fig. 1. Data acquisition and data preprocessing

Fig. 2. Number of topics indicated by four metrics. The metrics were standardized to range between 0 and 1.
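
The standardization mentioned in this caption is presumably a min-max rescaling, which puts metrics with different units on a common 0 to 1 axis so they can be compared in one plot. A minimal sketch follows; the function name `min_max_scale` and the metric values are hypothetical and not taken from the paper.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so the range is zero; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values of one "higher is better" metric for candidate topic counts 2..6.
scores = {2: -1520.0, 3: -1480.0, 4: -1450.0, 5: -1465.0, 6: -1490.0}
scaled = dict(zip(scores, min_max_scale(list(scores.values()))))
best_k = max(scaled, key=scaled.get)  # candidate whose scaled score equals 1
```

For a metric where lower is better, the preferred number of topics would instead be the candidate whose scaled value is 0.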

Fig. 3. Number of papers in each topic

Fig. 4. The 95% confidence intervals for topic probability at each time period
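
Topic probabilities per time period follow from a multinomial logistic regression with a reference category: each non-reference topic has a linear predictor giving its log-odds against the reference, and the reference topic's predictor is fixed at 0. The sketch below shows only this point-estimate step, not the confidence intervals. The coefficients, the dummy coding of periods, and the helper names are hypothetical and are not the estimates reported in the paper.

```python
import math

def topic_probabilities(log_odds_vs_reference):
    """Turn log-odds against the reference topic into probabilities for all topics.

    The reference topic's linear predictor is fixed at 0; the returned list
    puts the reference topic first.
    """
    etas = [0.0] + list(log_odds_vs_reference)
    shift = max(etas)  # subtract the maximum before exponentiating, for stability
    exps = [math.exp(eta - shift) for eta in etas]
    total = sum(exps)
    return [e / total for e in exps]

TOPICS = ["Surgery", "Disease aggressiveness", "Survival analysis", "Well-being of patients"]

# Hypothetical coefficients, NOT the paper's estimates: an intercept for the
# baseline period "before 2000" plus dummy effects for the two later periods.
HYPOTHETICAL_COEFS = {
    "Disease aggressiveness": {"intercept": -0.5, "2000s": 1.2, "since 2010": 1.0},
    "Survival analysis": {"intercept": -0.7, "2000s": 1.3, "since 2010": 1.6},
    "Well-being of patients": {"intercept": -1.5, "2000s": 0.8, "since 2010": 2.9},
}

def predict(period):
    """Return {topic: probability} for one period under the hypothetical coefficients."""
    etas = [
        coefs["intercept"] + coefs.get(period, 0.0)
        for coefs in HYPOTHETICAL_COEFS.values()
    ]
    return dict(zip(TOPICS, topic_probabilities(etas)))

for period in ("before 2000", "2000s", "since 2010"):
    print(period, {t: round(p, 3) for t, p in predict(period).items()})
```

Because the reference predictor is 0, the ratio of any topic's probability to that of "Surgery" equals the exponential of that topic's linear predictor, which is how coefficients against a reference group are usually read.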

Table 1. Topics discovered by LDA analysis of abstracts and the top 10 words in each

Table 2. Results of multinomial logistic regression. The reference group is Topic 1 (Surgery).

Table 3. Main terminologies in titles and keywords for each topic
