Research Article

A Query based Text Categorization using K-Nearest Neighbor Approach

by Suneetha Manne, Sita Kumari Kotha, Dr. S. Sameen Fatima
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 32 - Number 7
Year of Publication: 2011
Authors: Suneetha Manne, Sita Kumari Kotha, Dr. S. Sameen Fatima
10.5120/3915-5513

Suneetha Manne, Sita Kumari Kotha, Dr. S. Sameen Fatima. A Query based Text Categorization using K-Nearest Neighbor Approach. International Journal of Computer Applications. 32, 7 (October 2011), 16-21. DOI=10.5120/3915-5513

@article{ 10.5120/3915-5513,
author = { Suneetha Manne and Sita Kumari Kotha and Dr. S. Sameen Fatima },
title = { A Query based Text Categorization using K-Nearest Neighbor Approach },
journal = { International Journal of Computer Applications },
issue_date = { October 2011 },
volume = { 32 },
number = { 7 },
month = { October },
year = { 2011 },
issn = { 0975-8887 },
pages = { 16-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume32/number7/3915-5513/ },
doi = { 10.5120/3915-5513 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:18:56.737823+05:30
%A Suneetha Manne
%A Sita Kumari Kotha
%A Dr. S. Sameen Fatima
%T A Query based Text Categorization using K-Nearest Neighbor Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 32
%N 7
%P 16-21
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The World Wide Web is a storehouse of abundant information available in various electronic forms. Over the past two decades, the growing ability of computers to handle large quantities of text data has led researchers to focus on reliable and optimal retrieval of information that already exists in these huge resources. Although existing search engines and answering machines succeed in retrieving data related to a user query, the relevance of the retrieved text is low relative to the size of the result set. Bounding the range of resulting text data for a given user query, with a meaningful rank assigned to each document, therefore remains a major challenge. In this paper, we propose a query-based k-Nearest Neighbor method that retrieves relevant documents for a given query by finding the most appropriate boundary around related documents available on the web, and that ranks documents on the basis of the query rather than by customary content-based classification. The experimental results illustrate the categorization with reference to the closeness of the given query to each document.
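
The general idea of ranking documents by their closeness to a query and keeping only the k nearest can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the distance measure or how the boundary is chosen, so the use of cosine similarity over raw term frequencies, the fixed cutoff `k`, and the names `term_vector`, `cosine`, and `rank_by_query` are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty query or document matches nothing
    return dot / (norm_u * norm_v)


def rank_by_query(query, documents, k):
    """Return the k documents closest to the query as (index, score), best first."""
    q = term_vector(query)
    scored = [(i, cosine(q, term_vector(doc))) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]  # the k nearest neighbors form the result boundary


# Hypothetical toy collection, for illustration only.
docs = [
    "nearest neighbor text categorization",
    "cooking recipes for pasta",
    "text categorization with support vector machines",
]
top = rank_by_query("text categorization nearest neighbor", docs, k=2)
```

Here `top` holds the indices and similarity scores of the two documents nearest to the query, so the ranking depends on the query rather than on predefined content categories.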

Index Terms

Computer Science
Information Sciences

Keywords

K-Nearest Neighbor Approach, Text Categorization