Informal Quality Data Analysis via Sentimental analysis and Word2vec method

Lee, Chinuk; Yoo, Kook Hyun; Mun, Byeong Min; Bae, Suk Joo

doi:10.7469/JKSQM.2017.45.1.117

Journal of Korean Society for Quality Management (품질경영학회지)

Volume 45, Issue 1 / Pages 117-128 / 2017 / pISSN 1229-1889 / eISSN 2287-9005

Korean Society for Quality Management (한국품질경영학회)

Korean title (translated): Analysis of Unstructured Quality Data Using Sentiment Analysis and Word2vec

Lee, Chinuk (Department of Industrial Engineering, Hanyang University);
Yoo, Kook Hyun (Department of Mathematics, Hanyang University);
Mun, Byeong Min (Department of Industrial Engineering, Hanyang University);
Bae, Suk Joo (Department of Industrial Engineering, Hanyang University)

Received : 2016.03.06
Accepted : 2017.03.22
Published : 2017.03.31

https://doi.org/10.7469/JKSQM.2017.45.1.117

Abstract

Purpose: This study analyzes automobile quality review data to develop an alternative method for analyzing informal data. Existing methods for analyzing informal data rely mainly on frequency; this study instead uses the correlation between items of informal data. Methods: After sentiment analysis was applied to extract user opinions about automobile products, three classification methods (naïve Bayes, random forest, and support vector machine) were used to classify informal user opinions about automobile quality. In addition, Word2vec was applied to discover correlated information within the informal data. Results: Of the three classification methods, random forest produced the most effective results. Word2vec succeeded in finding the data most closely related to automobile components. Conclusion: The proposed method is effective in terms of accuracy and sensitivity for the analysis of informal quality data; however, only two sentiments (positive or negative) could be categorized because of human error. Further studies are required to derive more sentiment categories so that informal quality data can be classified more accurately. Word2vec also produces comparable results in identifying the relevance of components precisely.
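The classification step and the two evaluation measures named in the abstract (accuracy and sensitivity) can be illustrated with a minimal sketch. It covers only one of the three classifiers, a multinomial naïve Bayes with add-one smoothing, written in plain Python. The review sentences, labels, and function names (`train_nb`, `predict_nb`, `accuracy_and_sensitivity`) are hypothetical and chosen for illustration; this is not the authors' implementation or data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy reviews; the paper's actual data set is not reproduced here.
TRAIN = [
    ("engine is quiet and smooth", "pos"),
    ("brakes feel strong and safe", "pos"),
    ("great mileage and smooth ride", "pos"),
    ("engine noise is terrible", "neg"),
    ("brakes squeal and feel weak", "neg"),
    ("poor mileage and rough ride", "neg"),
]
TEST = [
    ("smooth and quiet ride", "pos"),
    ("terrible noise and weak brakes", "neg"),
]


def train_nb(samples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # skip words never seen during training
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy_and_sensitivity(y_true, y_pred, positive="pos"):
    """Accuracy = correct / all; sensitivity = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return correct / len(pairs), sensitivity


model = train_nb(TRAIN)
y_true = [label for _, label in TEST]
y_pred = [predict_nb(model, text) for text, _ in TEST]
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(f"predictions={y_pred} accuracy={acc:.2f} sensitivity={sens:.2f}")
```

A comparison like the one the abstract reports would repeat the same evaluation for random forest and a support vector machine on the same split. Those models are usually taken from a machine learning library rather than written by hand.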
