Analysis of Potential Construction Risk Types in Formal Documents Using Text Mining

텍스트 마이닝을 통한 건설공사 공문 잠재적 리스크 유형 분석

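  • 엄세호 (Global Smart City Convergence Major, Sungkyunkwan University) ;
  • 차기춘 (Global Smart City Convergence Major, Sungkyunkwan University) ;
  • 박선규 (School of Civil and Environmental Engineering, Sungkyunkwan University) ;
  • 박승희 (School of Civil and Environmental Engineering, Sungkyunkwan University) ;
  • 박종호 (Resilient Eco Smart City, Sungkyunkwan University)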
  • Received : 2022.10.19
  • Accepted : 2022.10.27
  • Published : 2023.02.01

Abstract

Risks that arise in construction projects can significantly affect schedules and costs, and they have therefore been studied extensively. However, risk analysis is often limited to particular construction situations, so decisions still depend largely on experience. Data-based analyses have been applied only partially, mainly to safety and contract documents. In this study, cluster analysis and the Word2Vec algorithm were therefore applied to formal documents, which contain elements important to contractors and clients. Cluster analysis first classified the document content into six types, and Word2Vec then subdivided these into 157 occurrence types. The derived terms were re-classified into five categories and reviewed to determine whether they could develop into potential construction risk factors. The identified potential risk factors are expected to serve as basic data for process management in the construction industry.

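Because risks arising in construction projects strongly affect schedule delays and cost increases, efforts are being made to identify a wide range of risks. However, risk analysis in the construction phase is either limited to particular work types and execution stages or relies mainly on experience-dependent decision-making, and data-based analysis has been applied only in some cases. This study therefore applied cluster analysis and the Word2Vec algorithm to incoming and outgoing official documents, which are judged to contain factors important to the contractor or the client. Cluster analysis produced a first-stage classification into six types, and Word2Vec derived 157 types of official-document occurrence. To analyze the derived related terms by attribute, five new categories were applied, and on this basis the study examined whether the document occurrence types could develop into potential construction risk factors. The results of this three-step, text-mining-based analysis of document occurrence types are expected to serve as basic data for process management at construction sites.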
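
The two analysis stages named in the abstract (cluster the documents first, then look for related terms) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy documents, the function names, and the choice of two clusters are hypothetical and are not taken from the study. The second stage uses simple co-occurrence vectors with cosine similarity as a count-based stand-in for Word2Vec embeddings, so it shows the idea of "related terms" without reproducing the authors' model.

```python
import math
from collections import Counter

# Hypothetical, pre-tokenized documents (toy data, not from the study).
DOCS = [
    ["design", "change", "approval", "request"],
    ["design", "change", "drawing", "review"],
    ["drawing", "review", "approval", "request"],
    ["payment", "delay", "claim", "notice"],
    ["payment", "claim", "invoice", "notice"],
    ["invoice", "delay", "payment", "notice"],
]

def doc_vectors(docs):
    """Turn each token list into a term-count vector over a shared vocabulary."""
    vocab = sorted({t for d in docs for t in d})
    vectors = []
    for d in docs:
        counts = Counter(d)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        far = max(vectors, key=lambda v: min(sqdist(v, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign every document to its nearest center.
        labels = [min(range(k), key=lambda c: sqdist(v, centers[c]))
                  for v in vectors]
        # Move each center to the mean of its assigned documents.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cooccurrence_vectors(docs):
    """Count, for each term, how often every other term shares a document with it."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: i for i, t in enumerate(vocab)}
    vecs = {t: [0] * len(vocab) for t in vocab}
    for d in docs:
        terms = set(d)
        for a in terms:
            for b in terms:
                if a != b:
                    vecs[a][index[b]] += 1
    return vecs

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def related_terms(vecs, seed, topn=3):
    """Rank all other terms by cosine similarity to the seed term."""
    scores = [(cosine(vecs[seed], v), t) for t, v in vecs.items() if t != seed]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [t for _, t in scores[:topn]]

# Stage 1: first-pass grouping of documents (k=2 here; the study reports six types).
vocab, vectors = doc_vectors(DOCS)
labels = kmeans(vectors, k=2)

# Stage 2: terms related to a seed word (stand-in for a Word2Vec similarity query).
payment_terms = related_terms(cooccurrence_vectors(DOCS), "payment")

print(labels)
print(payment_terms)
```

In a real pipeline the token lists would come from a morphological analyzer and the related-term query from a trained embedding model; the sketch only shows how the two stages feed each other.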

Acknowledgement

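This research was supported by the "National R&D Project for Smart Construction Technology (No. 22SMIP-A158708-03)", funded by the Korea Agency for Infrastructure Technology Advancement under the Ministry of Land, Infrastructure and Transport and managed by the Korea Expressway Corporation. This paper is a revised and supplemented version of a paper presented at the 2022 CONVENTION.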
