A Learning-Based EM Clustering for Circular Data with Unknown Number of Clusters
Keywords: clustering, circular data, mixtures of von Mises distributions, EM algorithm, learning schema
Clustering is a method for analyzing grouped data. Circular data arise in many applications, such as wind directions and the departure directions of migrating birds or animals. The expectation-maximization (EM) algorithm on mixtures of von Mises distributions is widely used for clustering circular data. In general, however, the EM algorithm is sensitive to initial values, is not robust to outliers, and requires the number of clusters to be given a priori. In this paper, we consider a learning-based schema for EM and propose a learning-based EM algorithm on mixtures of von Mises distributions for clustering grouped circular data. The proposed clustering method requires no initialization, is robust to outliers, and automatically finds the number of clusters. Several numerical and real data sets are used to compare the proposed algorithm with existing methods. The experimental results and comparisons demonstrate the effectiveness and superiority of the proposed learning-based EM algorithm.
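For orientation, the baseline that the abstract refers to, standard EM on a mixture of von Mises distributions with a fixed number of clusters, can be sketched as follows. This is a minimal illustration and not the proposed learning-based algorithm: the function names (`vm_log_pdf`, `em_von_mises`), the evenly spaced initial mean directions, the cap on the concentration parameter, and the closed-form approximation used to update it are all assumptions made for this example.

```python
import numpy as np

def vm_log_pdf(theta, mu, kappa):
    """Log of the von Mises density exp(kappa*cos(theta-mu)) / (2*pi*I0(kappa))."""
    return kappa * np.cos(theta - mu) - np.log(2.0 * np.pi * np.i0(kappa))

def em_von_mises(theta, c, n_iter=200, kappa_max=200.0):
    """Standard EM for a c-component von Mises mixture (baseline sketch).

    theta: 1-D array of angles in radians; c: number of clusters, fixed in advance.
    Returns mixing proportions, mean directions, concentrations, responsibilities.
    """
    alpha = np.full(c, 1.0 / c)
    # Assumption: evenly spaced initial mean directions (an arbitrary choice;
    # standard EM is known to depend on this initialization).
    mu = np.linspace(-np.pi, np.pi, c, endpoint=False)
    kappa = np.ones(c)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each angle,
        # computed in the log domain for numerical stability.
        log_w = np.log(alpha) + vm_log_pdf(theta[:, None], mu, kappa)
        log_w -= log_w.max(axis=1, keepdims=True)
        z = np.exp(log_w)
        z /= z.sum(axis=1, keepdims=True)
        # M-step: weighted proportions and weighted circular means.
        nk = z.sum(axis=0) + 1e-12
        alpha = nk / nk.sum()
        s = (z * np.sin(theta[:, None])).sum(axis=0)
        co = (z * np.cos(theta[:, None])).sum(axis=0)
        mu = np.arctan2(s, co)
        # Mean resultant length per component, then an approximate solution of
        # I1(kappa)/I0(kappa) = r (assumption: closed-form approximation
        # r*(2 - r^2)/(1 - r^2) instead of an exact root-finder).
        r = np.clip(np.hypot(s, co) / nk, 1e-8, 1.0 - 1e-8)
        kappa = np.minimum(r * (2.0 - r**2) / (1.0 - r**2), kappa_max)
    return alpha, mu, kappa, z

# Usage on synthetic angles drawn from two von Mises clusters.
rng = np.random.default_rng(0)
theta = np.concatenate([rng.vonmises(-2.0, 8.0, 300), rng.vonmises(1.0, 8.0, 300)])
alpha, mu, kappa, z = em_von_mises(theta, c=2)
labels = z.argmax(axis=1)  # hard cluster assignment per angle
```

Note that `c` and the initial mean directions must be supplied here, which is exactly the dependence the abstract identifies as a weakness of standard EM.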
Submission of a manuscript implies that the work described has not been published before and that it is not under consideration for publication elsewhere. Authors retain copyright of their articles with no restrictions. Authors may also post the final, peer-reviewed manuscript version (postprint) to any repository or website.
Since Oct. 01, 2015, PETI has published new articles under the Creative Commons Attribution Non-Commercial 4.0 International (CC BY-NC 4.0) License.
The Creative Commons Attribution Non-Commercial (CC BY-NC) License permits use, distribution, and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.