Semi-supervised Learning Based on Graph Stochastic Co-Training
DOI:
https://doi.org/10.18372/1990-5548.77.18001
Keywords:
multiclass classification, semi-supervised learning, single-view co-training, stochastic label propagation
Abstract
This article develops a new approach to semi-supervised machine learning. Its goal is to analyze the accuracy of a single-view co-training system based on a modified graph-based stochastic label propagation algorithm for the multiclass classification problem. Graph transformation of the data is preceded by feature decomposition, for which the following algorithms are compared: Singular Value Decomposition, Truncated Singular Value Decomposition, Iterative Principal Component Analysis, and Kernel Principal Component Analysis. To improve the accuracy of the proposed method, an additional parameter was included in the label propagation algorithm, which allows the algorithm to be used in co-training systems. Further performance gains are achieved by optimizing data modification, namely by applying feature decomposition methods and parallelizing the computation-heavy processes. As practical examples, the multiclass classification problem was solved for standard datasets from the sklearn library and for the real-world dataset Traffic Signs Preprocessed. Analysis of the results showed that the proposed approach improves both accuracy and performance in solving the multiclass classification problem.
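The general pipeline shape described in the abstract (feature decomposition followed by graph-based label propagation) can be illustrated with a minimal sketch. This is not the article's method: it substitutes scikit-learn's stock `LabelSpreading` for the modified stochastic label propagation algorithm, omits the co-training loop and the additional parameter, and uses plain `PCA` as a stand-in for the iterative variant. The digits dataset, the 10% labeled fraction, `n_components=20`, `n_neighbors=7`, and the helper name `run_pipeline` are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, KernelPCA, TruncatedSVD
from sklearn.semi_supervised import LabelSpreading


def run_pipeline(reducer, labeled_fraction=0.1, seed=0):
    """Reduce features, hide most labels, propagate, score on hidden points.

    Hypothetical helper for illustration; not taken from the article.
    """
    X, y = load_digits(return_X_y=True)
    # Pixel values lie in [0, 16]; rescale to [0, 1] before decomposition.
    X_reduced = reducer.fit_transform(X / 16.0)

    # Keep roughly `labeled_fraction` of the labels; hide the rest.
    rng = np.random.default_rng(seed)
    hidden = rng.random(len(y)) > labeled_fraction
    y_partial = y.copy()
    y_partial[hidden] = -1  # scikit-learn marks unlabeled samples with -1

    # Stock graph-based propagation over a k-nearest-neighbour graph
    # (an assumption; the article uses its own modified stochastic variant).
    model = LabelSpreading(kernel="knn", n_neighbors=7)
    model.fit(X_reduced, y_partial)

    # Accuracy measured only on the samples whose labels were hidden.
    return float(np.mean(model.transduction_[hidden] == y[hidden]))


reducers = {
    "TruncatedSVD": TruncatedSVD(n_components=20, random_state=0),
    "PCA": PCA(n_components=20, random_state=0),
    "KernelPCA": KernelPCA(n_components=20, kernel="rbf"),
}
results = {name: run_pipeline(reducer) for name, reducer in reducers.items()}
for name, acc in results.items():
    print(f"{name}: accuracy on hidden labels = {acc:.3f}")
```

Running the same propagation step behind several decomposition methods mirrors the kind of comparison the abstract describes, although the numbers this sketch produces say nothing about the results reported in the article.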