Proposition of negatively pure association rule threshold

Proposal of evaluation criteria for negatively pure association rules

  • Received : 2011.01.16
  • Accepted : 2011.02.14
  • Published : 2011.03.31

Abstract

Association rules represent the relationships between items in a massive database by quantifying them, and association rule mining is among the most frequently used data mining techniques. In general, the association rule technique generates rules of the form 'If A, then B', whereas the negative association rule technique generates rules of the form 'If A, then not B' or 'If not A, then B'. By adding negative association rules to existing association rules, we can determine whether to promote other products in addition to promoting the product itself. In this paper, we propose negatively pure association rules, based on negatively pure support, negatively pure confidence, and negatively pure lift, to overcome the problems faced by the negative association rule technique. In checking the usefulness of this technique through numerical examples, we found that the sign of the negatively pure association rule measure indicates the direction of association.

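Association rules express the relatedness between items in a massive database by clearly quantifying their relationships, and they are the most widely used of the data mining techniques. Whereas association rule mining finds rules stating that if one item occurs then another item also occurs, negative association rule mining finds rules stating that if one item occurs then another item does not occur. Adding negative association rules to existing association rules makes it possible to judge, when selling a product, not only whether to market that product alone but also which other product needs to be marketed. In this paper, we present measures for negatively pure association rules that can compensate for the shortcomings of negative association rules, examine the conditions that an interestingness measure should satisfy, and investigate the usefulness of negatively pure association rules using example data.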
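As a rough illustration of the rule forms discussed in the abstract, the following sketch computes the standard support, confidence, and lift for 'If A, then B' and for the negative rule 'If A, then not B'. It does not reproduce the negatively pure measures proposed in the paper, whose definitions are not given in this abstract. The toy transactions and the helper name `measures` are assumptions made only for this example.

```python
from fractions import Fraction

# Hypothetical toy transactions (not from the paper): each set is one basket.
transactions = [
    {"A", "B"},
    {"A"},
    {"A", "C"},
    {"B", "C"},
    {"B"},
    {"A", "B", "C"},
    {"C"},
    {"A"},
]


def measures(transactions, antecedent, consequent, negate_consequent=False):
    """Return (support, confidence, lift) for 'antecedent -> consequent'.

    With negate_consequent=True the rule is 'antecedent -> not consequent'.
    Assumes the antecedent and the (possibly negated) consequent each occur
    in at least one transaction, so no division by zero arises.
    """
    n = len(transactions)
    has_a = [antecedent in t for t in transactions]
    # A basket satisfies the consequent side if the item is present
    # (positive rule) or absent (negative rule).
    has_b = [(consequent in t) != negate_consequent for t in transactions]
    n_a = sum(has_a)
    n_b = sum(has_b)
    n_ab = sum(a and b for a, b in zip(has_a, has_b))

    support = Fraction(n_ab, n)          # P(A and B)
    confidence = Fraction(n_ab, n_a)     # P(B | A)
    lift = confidence / Fraction(n_b, n) # P(B | A) / P(B)
    return support, confidence, lift


positive = measures(transactions, "A", "B")
negative = measures(transactions, "A", "B", negate_consequent=True)
print("A -> B:    ", positive)
print("A -> not B:", negative)
```

On this toy data the lift of 'If A, then B' is 4/5 and the lift of 'If A, then not B' is 6/5, so the negative rule is the more informative one. This is the kind of situation that motivates considering negative rules alongside positive ones.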
