DOI: 10.7763/IJCCE.2012.V1.3
Biased-Incremental Clustering: A Flexible Knowledge Extraction Algorithm
Abstract—Terminology clustering plays a critical role in the extraction of knowledge from unstructured text data. We present Biased-Incremental Clustering, which is designed to make such concept extraction flexible and human-like. Our approach clusters concepts incrementally by allowing prior (or existing) concepts to be injected into the learner to bias the acquisition of new concepts. It is an unsupervised learning method that uses Semantic Vectors, Random Indexing, and the Word Space (Vector Space) model to perform computations on the concepts. The key aspects of the algorithm run in linear time by using K-Means and a slightly modified version of the Bisecting K-Means algorithm. Results show that the Biased-Incremental Clustering algorithm performs well in extracting and clustering terminology from text data that covers both similar and varying domains of knowledge.
Index Terms—Clustering, semantic vectors, machine learning, random indexing
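The abstract does not spell out how prior concepts are injected, so the following is only a minimal sketch of one plausible reading: existing concept vectors seed the initial K-Means centroids, and new terms are then assigned by cosine similarity. The function names (`cosine`, `biased_kmeans`), the parameters (`n_new`, `iters`, `seed`), and the toy two-dimensional term vectors are all hypothetical. A real system would obtain term vectors from Random Indexing, and this is not the authors' implementation.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def biased_kmeans(vectors, prior_centroids, n_new=0, iters=20, seed=0):
    """K-Means whose starting centroids are biased by prior concepts.

    Assumption: "bias" means the prior concept vectors are used as initial
    centroids; n_new extra centroids are sampled from the data so that
    concepts not covered by the priors can still emerge.
    """
    rng = random.Random(seed)
    centroids = [list(c) for c in prior_centroids]
    centroids += [list(v) for v in rng.sample(vectors, n_new)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each term goes to its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        # Update step: move each non-empty centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return labels, centroids


# Toy term vectors (hypothetical; stand-ins for Random Indexing output).
terms = {"cat": [1.0, 0.1], "dog": [0.9, 0.2], "car": [0.1, 1.0], "bus": [0.2, 0.9]}
# Prior concepts injected as starting centroids: index 0 and index 1.
prior = [[1.0, 0.0], [0.0, 1.0]]

labels, centroids = biased_kmeans(list(terms.values()), prior)
for term, label in zip(terms, labels):
    print(term, "-> concept", label)
```

In this toy run the two prior concepts pull "cat" and "dog" into one cluster and "car" and "bus" into the other. Setting `n_new` above zero leaves room for clusters that no prior concept covers.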
The authors are with the School of Software Engineering, Chongqing University, Chongqing, China (e-mail: kwabena.nuamah@gmail.com, tel.: +8613883491754).
Cite: Kwabena A. Nuamah and Li Fu, "Biased-Incremental Clustering: A Flexible Knowledge Extraction Algorithm," International Journal of Computer and Communication Engineering, vol. 1, no. 1, pp. 8-12, 2012.