k-Nearest Neighbor Querv Processing using Approximate Indexing in Road Network Databases

도로 네트워크 데이타베이스에서 근사 색인을 이용한 k-최근접 질의 처리

  • 이상철 (한양대학교 전자컴퓨터통신공학과) ;
  • 김상욱 (한양대학교 정보통신학부)
  • Published : 2008.10.15

Abstract

In this paper, we address an efficient processing scheme for k-nearest neighbor queries to retrieve k static objects in road network databases. Existing methods cannot expect a query processing speed-up by index structures in road network databases, since it is impossible to build an index by the network distance, which cannot meet the triangular inequality requirement, essential for index creation, but only possible in a totally ordered set. Thus, these previous methods suffer from a serious performance degradation in query processing. Another method using pre-computed network distances also suffers from a serious storage overhead to maintain a huge amount of pre-computed network distances. To solve these performance and storage problems at the same time, this paper proposes a novel approach that creates an index for moving objects by approximating their network distances and efficiently processes k-nearest neighbor queries by means of the approximate index. For this approach, we proposed a systematic way of mapping each moving object on a road network into the corresponding absolute position in the m-dimensional space. To meet the triangular inequality this paper proposes a new notion of average network distance, and uses FastMap to map moving objects to their corresponding points in the m-dimensional space. After then, we present an approximate indexing algorithm to build an R*-tree, a multidimensional index, on the m-dimensional points of moving objects. The proposed scheme presents a query processing algorithm capable of efficiently evaluating k-nearest neighbor queries by finding k-nearest points (i.e., k-nearest moving objects) from the m-dimensional index. Finally, a variety of extensive experiments verifies the performance enhancement of the proposed approach by performing especially for the real-life road network databases.

본 논문에서는 도로 네트워크 데이타베이스에서 정적 객체의 k-최근접 이웃 질의를 효율적으로 처리하기 위한 방안을 논의한다. 기존의 여러 기법들은 인덱스를 사용하지 못했는데, 이는 네트워크 거리가 순서화 된 거리함수가 아니며 삼각 부등식(triangular inequality) 성질 또한 만족하지 못하기 때문이다. 이러한 기존 기법들은 질의 처리 시 심각한 성능 저하의 문제를 가진다. 선계산된 네트워크 거리를 이용하는 또 다른 기법은 저장 공간의 오버헤드가 크다는 문제를 갖는다. 본 논문에서는 이러한 두 가지 문제점들을 동시에 해결하기 위하여 객체들 간의 네트워크 거리를 근사하여 객체들에 대한 인덱스를 구축하고, 이를 이용하여 k-최근접 이웃 질의를 처리하는 새로운 기법을 제안한다. 이를 위하여 본 논문에서는 먼저 네트워크 공간상의 객체를 유클리드 공간상으로 사상하기 위한 체계적인 방법을 제시한다. 특히, 삼각 부등식 성질을 만족시키기 위하여 평균 네트워크 거리라는 새로운 거리 개념을 제시하고, 유클리드 공간으로의 사상을 위하여 FastMap 기법을 사용한다. 다음으로, 평균 네트워크 거리와 FastMap을 사용하여 네트워크 공간상의 객체들로 인덱스를 구축하는 근사 색인 알고리즘을 제시한다. 또한, 구축한 인덱스를 사용하여 k-최근접 이웃 질의를 효과적으로 수행하는 알고리즘을 제안한다. 마지막으로, 실제 도로 네트워크를 이용한 다양한 실험을 통하여 제안된 기법의 우수성을 규명한다.

Keywords

References

  1. S. Wu and K. Wu, "Effective Location-Based Services with Dynamic Data Management in Mobile Environments," Wireless Networks, Vol. 12, No. 3, pp. 369-381, 2006 https://doi.org/10.1007/s11276-005-5280-0
  2. S. Duri et al., "Data Protection and Data Sharing in Telematics," Mobile Networks and Applications, MONET, Vol. 9, No. 6, pp. 693-701, 2004 https://doi.org/10.1023/B:MONE.0000042507.74516.00
  3. O. Wolfson et al., "Moving Object Databases: Issues and Solutions," In Proc. Int'l. Conf. on Scientific and Statistical Database Management, SSDBM, pp. 111-112, 1998
  4. N. Weghe et al., "Representation of Moving Objects along a Road Network," In Proc. Int'l. Conf. on Geoinformatics, pp. 187-197, 2004
  5. L. Speicys, C. Jensen, and A. Kligys, "Computational Data Modeling for Network-Constrained Moving Objects," In Proc. Int'l. Symp. on Advances in Geographic Information Systems, ACM GIS, pp. 118-125, 2003
  6. J. Kelley, General Topology, Springer-Verlag, 1975
  7. M. Vazirgiannis and O. Wolfson, "A Spatiotemporal Model and Language for Moving Objects on Road Networks," In Proc. Int'l. Symp. on Spatial and Temporal Databases, SSTD, pp. 20-35, 2001
  8. D. Papadias et al., "Query Processing in Spatial Network Databases," In Proc. Int'l. Conf. on Very Large Data Bases, VLDB, pp. 802-813, 2003
  9. G. Hjaltason and H. Samet, "Distance Browsing in Spatial Databases," ACM Trans. Database Systems, ACM TODS, Vol. 24, No. 2, pp. 265-318, 1999 https://doi.org/10.1145/320248.320255
  10. M. Kolahdouzan and C. Shahabi, "Voronoi-Based K-Nearest Neighbor Search for Spatial Network Databases," In Proc. Int'l. Conf. on Very Large Data Bases, VLDB, pp. 840-851, 2004
  11. C. Faloutsos and K. Lin, "FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets," In Proc. ACM Int'l. Conf. on Management of Data, ACM SIGMOD, pp. 163-174, 1995
  12. P. Ciaccia, M. Patella, and P. Zezula, "M-tree: An Efficient Access Method for Similarity Search in Metric Spaces," Proceedings of the 23rd VLDB Conference, pp. 426-435, 1997
  13. A. Guttman, "R-Trees: A Dynamic Index Structure for Spatial Searching," In Proc. ACM Int'l. Conf. on Management of Data, ACM SIGMOD, pp. 47-57, 1984
  14. N. Beckmann et al., "The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles," In Proc. ACM Int'l. Conf. on Management of Data, ACM SIGMOD, pp. 322-331, 1990
  15. X. Huang, C. S. Jensen, and S. Saltenis, "The Islands Approach to Nearest Neighbor Querying in Spatial Networks," In Proc. Int'l. Symp. on Spatial and Temporal Databases, SSTD, pp. 73- 90, 2005
  16. B. Yi, H. Jagadish, and C. Faloutsos, "Efficient Retrieval of Similar Time Sequences Under Time Warping," In Proc. IEEE Int'l. Conf. on Data Engineering, ICDE, pp. 201-208, 1998
  17. J. Wang et al., "Evaluating a Class of Distance- Mapping Algorithms for Data Mining and Clustering," In Proc. ACM Int'l. Conf. on Knowledge Discovery and Data Mining, ACM SIGKDD, pp. 307-311, 1999
  18. The R-tree Portal, http://www.rtreeportal.org/
  19. J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000