With the rapid growth of computational biometrics and e-commerce applications, high-dimensional data has become very common, and mining it is a problem of great practical importance. Mining high-dimensional data poses unique challenges, including the curse of dimensionality and, more crucially, the question of whether similarity measures remain meaningful in high-dimensional space. This paper reviews these challenges, along with techniques for the analysis and dimensionality reduction of high-dimensional data.