Enhancing Traditional Text Documents Clustering based on Ontology

Hmway Hmway Tar; Thi Thi Soe Nyunt

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 33 - Number 10

Year of Publication: 2011

Authors: Hmway Hmway Tar, Thi Thi Soe Nyunt

10.5120/4107-5850

Hmway Hmway Tar, Thi Thi Soe Nyunt. Enhancing Traditional Text Documents Clustering based on Ontology. International Journal of Computer Applications. 33, 10 (November 2011), 38-42. DOI=10.5120/4107-5850

@article{ 10.5120/4107-5850,

author = { Hmway Hmway Tar and Thi Thi Soe Nyunt },

title = { Enhancing Traditional Text Documents Clustering based on Ontology },

journal = { International Journal of Computer Applications },

issue_date = { November 2011 },

volume = { 33 },

number = { 10 },

month = { November },

year = { 2011 },

issn = { 0975-8887 },

pages = { 38-42 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume33/number10/4107-5850/ },

doi = { 10.5120/4107-5850 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T20:19:53.409202+05:30

%A Hmway Hmway Tar

%A Thi Thi Soe Nyunt

%T Enhancing Traditional Text Documents Clustering based on Ontology

%J International Journal of Computer Applications

%@ 0975-8887

%V 33

%N 10

%P 38-42

%D 2011

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Ontologies are currently a prominent topic in Semantic Web research. Current clustering research emphasizes more efficient clustering methods and focuses mainly on term weight calculation, without considering domain knowledge. This paper investigates how ontologies can be applied to the clustering process. Building on recent developments in semantic technologies, it complements the traditional clustering method with more informative features, including concept weight. The proposed text clustering system is based on the k-means algorithm and computes concept weights in accordance with the principles of ontology, so that the importance of the words in a cluster can be identified by their weighted values. To a certain extent, this resolves the semantic problem in specific domains. Experiments on dissertation papers retrieved with the Google search engine demonstrate the effectiveness and practical value of the proposed method.
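
The abstract does not state the concept weight formula, so the following is only a minimal sketch of the general idea: terms are mapped to ontology concepts, each document becomes a vector of concept weights, and k-means groups the vectors. The toy ontology, the weighting scheme (concept frequency normalized by the number of matched terms), and the names `ONTOLOGY`, `concept_vector`, and `kmeans` are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
import math

# Hypothetical toy ontology mapping terms to concepts (not from the paper).
ONTOLOGY = {
    "kmeans": "clustering", "cluster": "clustering", "centroid": "clustering",
    "rdf": "semantic_web", "owl": "semantic_web", "ontology": "semantic_web",
}
CONCEPTS = sorted(set(ONTOLOGY.values()))  # fixed order of vector dimensions


def concept_vector(text):
    """Turn a document into concept weights (assumed scheme: the share of
    ontology-matched terms that belong to each concept)."""
    tokens = text.lower().split()
    counts = Counter(ONTOLOGY[t] for t in tokens if t in ONTOLOGY)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CONCEPTS)
    return [counts[c] / total for c in CONCEPTS]


def kmeans(vectors, k, iterations=10):
    """Plain Lloyd's k-means. For simplicity the first k vectors seed the
    centroids, which only works well when those vectors differ."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda j: math.dist(v, centroids[j]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


DOCS = [
    "kmeans cluster centroid update",
    "rdf owl ontology triples",
    "cluster centroid distance",
    "owl ontology reasoning",
]
labels, centroids = kmeans([concept_vector(d) for d in DOCS], k=2)
print(labels)  # prints [0, 1, 0, 1]
```

In this toy run, documents that share no surface terms with each other can still land in the same cluster as long as their terms map to the same concept, which is the benefit the abstract attributes to adding domain knowledge on top of term weights.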

Index Terms

Computer Science

Information Sciences

Keywords

Clustering, Concept Weight, Document clustering, Feature Selection, Ontology