Structuring Risk Factors of Industrial Incidents Using Natural Language Process

Kang, Sungsik; Chang, Seong Rok; Lee, Jongbin; Suh, Yongyoon

doi:10.14346/JKOSOS.2021.36.1.56

Journal of the Korean Society of Safety (한국안전학회지)

Volume 36, Issue 1 / Pages 56-63 / 2021 / 1738-3803 (pISSN) / 2383-9953 (eISSN)

The Korean Society of Safety (한국안전학회)

자연어 처리 기법을 활용한 산업재해 위험요인 구조화 (Structuring Risk Factors of Industrial Accidents Using Natural Language Processing Techniques)

Kang, Sungsik (Department of Safety Engineering, Pukyong National University) ;
Chang, Seong Rok (Department of Safety Engineering, Pukyong National University) ;
Lee, Jongbin (Laboratory of Disaster Management, Pukyong National University) ;
Suh, Yongyoon (Department of Safety Engineering, Pukyong National University)

강성식 (부경대학교 안전공학과) ;
장성록 (부경대학교 안전공학과) ;
이종빈 (부경대학교 방재연구소) ;
서용윤 (부경대학교 안전공학과)

Received : 2021.02.03
Accepted : 2021.02.24
Published : 2021.02.28

https://doi.org/10.14346/JKOSOS.2021.36.1.56

Abstract

The narrative texts of industrial accident reports help identify accident risk factors because they relate the triggers of an accident to its sequence of events and its outcomes. In particular, a set of related keywords in the context of a narrative can represent how the accident proceeded. Previous text-analytics studies on structuring accident reports have been limited to extracting individual keywords without context. To remedy this shortcoming, we propose a context-based analysis using a Natural Language Processing (NLP) algorithm. Specifically, this study applies Word2Vec, a word-embedding technique that trains a neural network through supervised learning, to extract adjacent keywords. Word2Vec takes adjacent keywords in the narrative texts as inputs for its supervised learning, and the resulting keyword weights form vectors that represent the degree of neighboring among keywords. Keywords with similar weights are closely arranged within sentences of the narrative text, so a set of keywords with similar weights represents similar accidents. We extracted ten accident processes, each containing related keywords, and used them to understand the risk factors that determine how an accident proceeds. This information helps identify how a checklist for an accident report should be structured.
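The idea of learning from adjacent keywords can be illustrated with a minimal sketch. It is not the authors' implementation: the toy narratives, the window size, and the function names `skipgram_pairs`, `cooccurrence_vectors`, and `cosine` are all assumptions made for illustration, and plain co-occurrence counts stand in for the weights that a trained Word2Vec network would produce. The sketch shows how (center, neighbor) pairs are drawn from a narrative and how keywords with similar neighbor profiles end up with similar vectors.

```python
import math
from collections import Counter, defaultdict


def skipgram_pairs(tokens, window=1):
    """Return (center, neighbor) pairs for words within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cooccurrence_vectors(narratives, window=1):
    """Count neighbors per keyword; a stand-in for learned Word2Vec weights."""
    counts = defaultdict(Counter)
    for tokens in narratives:
        for center, neighbor in skipgram_pairs(tokens, window):
            counts[center][neighbor] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, already-tokenized accident narratives (not data from the paper).
narratives = [
    ["worker", "ladder", "slip", "fall", "fracture"],
    ["worker", "scaffold", "slip", "fall", "injury"],
    ["operator", "press", "hand", "caught", "amputation"],
]

vectors = cooccurrence_vectors(narratives)

# Training pairs drawn from the first narrative.
print(skipgram_pairs(narratives[0])[:3])

# Keywords that appear in the same neighborhoods score high; unrelated ones score low.
print(cosine(vectors["ladder"], vectors["scaffold"]))
print(cosine(vectors["ladder"], vectors["press"]))
```

In this toy data, "ladder" and "scaffold" share exactly the same neighbors ("worker" and "slip"), so their similarity is close to 1, while "ladder" and "press" share none and score 0. Grouping keywords by such similarity is one plausible way to arrive at sets of related keywords like the accident processes described in the abstract.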

Keywords

Acknowledgement

This work was supported by a Research Grant of Pukyong National University (2019).

