A study on insignificant rules discovery in association rule mining

연관성규칙에서 의미 없는 규칙의 발견에 관한 연구

  • Cho, Kwang-Hyun (Department of Early Childhood Education, Changwon National University)
  • Park, Hee-Chang (Department of Statistics, Changwon National University)
  • Received : 2010.12.15
  • Accepted : 2011.01.11
  • Published : 2011.01.31

Abstract

Association rule mining finds relationships among items in a large database, measuring the strength of association between two or more items by three primary quality measures: support, confidence, and lift. To improve the efficiency of existing mining algorithms, constraint-based association rules, which generate only the rules of interest to users rather than all possible rules, have been studied actively. Mining often produces a large number of rules. Some of these rules show a high association between variables purely by chance, and others have no direct relationship and arise only through an intervening variable. In this study we investigate how to discover such insignificant rules caused by an intervening variable. The results help in understanding the relationships behind the generated rules more accurately, and so make the interpretation of mining results clearer.
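
As an illustration of the idea in the abstract, the following is a minimal Python sketch, not the authors' procedure. It computes support, confidence, and lift using their standard definitions, then recomputes the lift of a rule X → Y separately for transactions with and without a third item Z. The toy transactions, the item names X, Y, Z, W, and the helper names (`support`, `confidence`, `lift`, `repeat`) are all assumptions made for this example. The data are constructed so that X and Y each depend on Z but are independent of each other once Z is fixed.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the support of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


def repeat(items, count):
    """Hypothetical helper: `count` identical transactions."""
    return [frozenset(items)] * count


# Toy data (an assumption for illustration only).
# With Z present, X and Y each occur in 12 of 16 transactions;
# with Z absent, each occurs in 4 of 16 transactions.
transactions = (
    repeat(["Z", "X", "Y"], 9)
    + repeat(["Z", "X"], 3)
    + repeat(["Z", "Y"], 3)
    + repeat(["Z"], 1)
    + repeat(["X", "Y"], 1)
    + repeat(["X"], 3)
    + repeat(["Y"], 3)
    + repeat(["W"], 9)
)

# Stratify by the candidate intervening item Z.
with_z = [t for t in transactions if "Z" in t]
without_z = [t for t in transactions if "Z" not in t]

overall_lift = lift({"X"}, {"Y"}, transactions)
lift_given_z = lift({"X"}, {"Y"}, with_z)
lift_given_not_z = lift({"X"}, {"Y"}, without_z)

print("lift(X -> Y), all transactions:", overall_lift)
print("lift(X -> Y), Z present:", lift_given_z)
print("lift(X -> Y), Z absent:", lift_given_not_z)
```

In this toy data the overall lift of X → Y is above 1, while the lift within each stratum of Z equals 1. A rule with this pattern has no direct relationship between X and Y and could be flagged as insignificant in the sense described above. The criterion actually used in the paper may differ.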
