Research Article

Finding the Best page using Synonyms

by Lobo L. M. R. J, R. S. Bichkar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 65 - Number 8
Year of Publication: 2013
Authors: Lobo L. M. R. J, R. S. Bichkar
10.5120/10941-5892

Lobo L. M. R. J, R. S. Bichkar. Finding the Best page using Synonyms. International Journal of Computer Applications. 65, 8 (March 2013), 1-7. DOI=10.5120/10941-5892

@article{ 10.5120/10941-5892,
author = { Lobo L. M. R. J, R. S. Bichkar },
title = { Finding the Best page using Synonyms },
journal = { International Journal of Computer Applications },
issue_date = { March 2013 },
volume = { 65 },
number = { 8 },
month = { March },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-7 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume65/number8/10941-5892/ },
doi = { 10.5120/10941-5892 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:18:14.658898+05:30
%A Lobo L. M. R. J
%A R. S. Bichkar
%T Finding the Best page using Synonyms
%J International Journal of Computer Applications
%@ 0975-8887
%V 65
%N 8
%P 1-7
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Rating a page as the best one based only on the PageRank algorithm of Brin and Page is insufficient, because that method relies on link information alone. With the application of Soft Computing to Data Mining and Knowledge Discovery, machines became more effective, and additional features of a page, such as its indexing, terms used, capitalization, anchor text and hit information, were taken into account. Treating page rating as a classification problem helped to a great extent. The difficulty of handling the very large number of pages on the web led researchers to sample pages randomly and analyze the features of the sampled pages. Soft Computing techniques, including Genetic Algorithms, Neural Networks, Fuzzy Logic and Rough Sets, were used for this analysis. User profiles were created from the retrieved pages. Good and bad pages were categorised on the basis of the terms they contained, and these profiles were preserved for further reference. Pages were compared with each other for similarity using the Jaccard score and a Best-First search algorithm implemented with software agents. Adaptive methods, close in concept to Genetic Algorithm applications, were also used. The frequency with which a user visited web pages was considered as a further parameter of interest. Techniques to generate page features using co-occurrence analysis were developed, and web pages were classified using machine learning. A good page-rating method provides benefits such as relevance and efficiency, and indirectly affects the crawl priority of a search engine. Web content as designed today is meant for human reading and is not readily tractable for machines. The Semantic Web addresses this by providing structured content through annotations, and tools were made available to perform these conversions.
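As an illustration of the page-similarity comparison mentioned above, the following is a minimal sketch of a Jaccard score computed over the term sets of two pages. The function name `jaccard_score` and the whitespace tokenization are assumptions made for this example; the abstract does not describe how the cited work tokenized pages or computed the score in practice.

```python
def jaccard_score(terms_a, terms_b):
    """Jaccard score of two term collections: |A intersect B| / |A union B|."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        # Convention chosen for this sketch: two empty pages score 0.0.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical page contents, tokenized by whitespace for simplicity.
page_one = "best page synonyms".split()
page_two = "best page ranking".split()
print(jaccard_score(page_one, page_two))  # 2 shared terms out of 4 distinct terms
```

A score of 1.0 means the two pages use exactly the same set of terms, and 0.0 means they share none.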
User-generated metadata that expresses a user's tastes and interests was used to personalize information for an individual user. Specifically, a machine learning method that analyzed a corpus of tagged content was used to find hidden topics. It then used these learned topics to select content that matched a user's interests, thus returning the most relevant pages. Google Scholar does not use synonyms and searches strictly on article text. The use of synonyms can reduce irrelevant results, but it can also cause intent drift, and synonym discovery is context sensitive. These features motivate the use of synonyms to expedite search and to rank relevant documents at a higher position. Google and WordNet use synonyms, but no documentation mentions using combinations of synonyms for a term to produce a more relevant search. The present paper concentrates on a search technique developed to find the best page based on synonyms. The technique is based on the concept of adaptive search using synonyms of a search keyword extracted from a dictionary. These synonyms are combined into different sets and given to a search engine, which returns the documents most relevant to the user at a higher ranking.
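The idea of combining a keyword's synonyms into different sets can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the toy `SYNONYMS` dictionary, the `synonym_queries` function, the `max_size` limit and the `OR` query syntax are all assumptions introduced here, since the abstract does not specify the dictionary, the set sizes or the search engine interface.

```python
from itertools import combinations

# Hypothetical toy dictionary; the paper's actual synonym source is not given here.
SYNONYMS = {
    "car": ["automobile", "vehicle", "motorcar"],
}


def synonym_queries(keyword, synonyms=SYNONYMS, max_size=2):
    """Build query strings that pair the keyword with sets of its synonyms.

    Each query joins the keyword and one synonym set with "OR", an assumed
    syntax that many search engines accept.
    """
    candidates = synonyms.get(keyword, [])
    queries = [keyword]  # the plain keyword is always the first query
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            queries.append(" OR ".join((keyword,) + combo))
    return queries


for query in synonym_queries("car"):
    print(query)
```

Each generated query would then be submitted to a search engine and the returned rankings compared. How the paper selects among the synonym sets adaptively is not described in the abstract, so that step is left out of the sketch.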

References
  1. Sergey Brin, Lawrence Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine." 2000.
  2. Maria J. Martin-Bautista, Maria Amparo Villa, "A Survey of Genetic Feature Selection in Mining Issues", IEEE 1999 pp 1314-1321.
  3. Maria J. Martin-Bautista, Maria Amparo Villa & Henrik L. Larsen, "Building Adaptive user profiles by a Genetic Fuzzy classifier with feature selection", IEEE 2000 0-7803-5877-5/00 pp 308-312.
  4. Chen, H. , Chung, Y. , Ramsey, M. , Yang, C. , Ma, P. , Yen, J. , "Intelligent Spider for Internet Searching," Proceedings of the 30th Annual Hawaii International Conference on System Sciences - Volume IV, Kailua-Kona, Hawaii, USA, January 1997.
  5. Tomca, N. , A Flexible Tool for Jaccard Score Evaluation , B. Sc. Thesis, University of Belgrade, Belgrade, Serbia, Yugoslavia, December 1997.
  6. Mahbub, "Genetic Algorithm in Adaptive Web Search" Filed under Research, April 2007
  7. Matthew Richardson, Amit Prakash, Eric Brill, "Beyond Page Rank: Machine Learning for Static Ranking", WWW May 2006, ACM 1-59593-323-9/06/0005.
  8. Makoto Tsukada, Takashi Washio, Hiroshi Motoda, "Web-Page Classification by Using Machine Learning Methods", 1998.
  9. Fabio Ciravegna, Daniela Petrelli,"User involvement in adaptive information extraction", in Proceedings of IJCAI2001.
  10. Sadaqat Jan, Maozhen Li, Ghaidaa Al-Sultany and Hamed Al-Raweshidy "File Annotation and Sharing on Low-End Mobile Devices", Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2010.
  11. Windson Viana, José Bringel Filho, Jérôme Gensel, Marlène Villanova-Oliver, Hervé Martin, "A Semantic Approach and a Web Tool for Contextual Annotation of Photos Using Camera Phones", 9th Workshop on Hot Topics in Operating Systems (HotOS IX). May 18-21, 2003.
  12. Soules CAN, Ganger GR "Why can't I find my files? New methods for automating attribute assignment", In: Proceedings of HotOS IX: the 9th workshop on hot topics in operating systems. USENIX Association, May 2003.
  13. M. J. Carman, M. Baillie, and F. Crestani, "Tag data and personalized information retrieval," in SSM, 2008, pp. 27–34.
  14. R. Jäschke, L. B. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme, "Tag recommendations in folksonomies," in PKDD, 2007, pp. 506–514.
  15. G . Bharathi and D. Venkatesan," Improving Information Retrieval Using Document Clusters and Semantic synonym Extraction", in Journal of Theoretical and Applied Information Technology February 2012. Vol. 36 No. 2 ISSN: 1992-8645
  16. Pooja Choudhary. "A Comparative Analysis of Various Web Search Engines". In International Journal of Computing and Business Research (IJCBR) ISSN (Online): 2229-6166 Volume 3 Issue 2 May 2012.
  17. G. Madhu, Dr. A. Govardhan and Dr. T. V. Rajinikanth, "Intelligent Semantic Web Search Engines: A Brief Survey", in International Journal of Web & Semantic Technology (IJWesT) Vol. 2, No. 1, January 2011 DOI: 10.5121/ijwest.2011.2103
  18. Ahmed Sameh and Amar Kadray, "Semantic Web Search Results Clustering Using Lingo and WordNet",in International Journal of Research and Reviews in Computer Science (IJRRCS) Vol. 1, No. 2, June 2010
  19. Joeran Beel, Bela Gipp, and Erik Wilde. " Academic Search Engine Optimization (ASEO): Optimizing Scholarly Literature for Google Scholar and Co. ", in Journal of Scholarly Publishing, 41 (2): 176–190, January 2010
  20. Nandkishor Vasnik, Shriya Sahu and Devshri Roy, "Talash: A Semantic and Context Based Optimized Hindi Search Engine", in International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol. 2, No. 3, June 2012.
  21. Jöran Beel and Bela Gipp. "Google Scholar's Ranking Algorithm: An Introductory Overview". In Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI'09), volume 1, pages 230–241, July 2009.
  22. Xing Wei, Fuchun Peng, Huishin Tseng, Yumao Lu, Xuerui Wang and Benoit Dumoulin, "Search with Synonyms: Problems and Solutions", in COLING 2010: Poster Volume, pages 1318–1326, Beijing, August 2010
  23. Angelos Hliaoutakis, Giannis Varelas, Epimenidis Voutsakis, Euripides G. M. Petrakis and Evangelos Milios, "Information Retrieval by Semantic Similarity", in International Journal on Semantic Web & Information Systems, 2(3), 55-73, July-September 2006
  24. Yanhong Li, "Toward a Qualitative Search Engine". 1089-7801/98, IEEE INTERNET COMPUTING JULY • AUGUST 1998
  25. P. Sudhakar, G. Poonkuzhali and R. Kishore Kumar, "Content Based Ranking for Search Engines", in proceedings of International MultiConference of Engineers and Computer Scientists 2012 Vol I. IMECS Mar 2012
  26. Hang Cui, Ji-Rong Wen, Jian-Yun Nie, and Wei-Ying Ma, Member, "Query Expansion by Mining User Logs",In IEEE Transactions On Knowledge And Data Engineering, Vol. 15, No. 4, July/August 2003
  27. Kaushik Chakrabarti, Michael Ortega, Kriengkrai Porkaew and Sharad Mehrotra. " Query Refinement in Similarity Retrieval Systems" in Bulletin of the Technical Committee on Data Engineering Vol. 24 No. 3 IEEE Computer Society September 2001
  28. Xiaoou Tang, Ke Liu, Jingyu Cui,Fang Wen and Xiaogang Wang, "IntentSearch: Capturing User Intention for One-Click Internet Image Search" in IEEE Transactions On Pattern Analysis And Machine Intelligence, Vol. 34, No. 7, July 2012
Index Terms

Computer Science
Information Sciences

Keywords

Best Page, relevance, users' interest, synonyms, metadata