A Novel Technique for Database Selection and Document Selection

Anil Agrawal; Mohd. Husain; Raj Gaurang Tiwari; Subodh Kumar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 17 - Number 8

Year of Publication: 2011

Authors: Anil Agrawal, Mohd. Husain, Raj Gaurang Tiwari, Subodh Kumar

10.5120/2241-2865

Anil Agrawal, Mohd. Husain, Raj Gaurang Tiwari, Subodh Kumar. A Novel Technique for Database Selection and Document Selection. International Journal of Computer Applications. 17, 8 (March 2011), 22-26. DOI=10.5120/2241-2865

@article{10.5120/2241-2865,
  author = {Anil Agrawal and Mohd. Husain and Raj Gaurang Tiwari and Subodh Kumar},
  title = {A Novel Technique for Database Selection and Document Selection},
  journal = {International Journal of Computer Applications},
  issue_date = {March 2011},
  volume = {17},
  number = {8},
  month = {March},
  year = {2011},
  issn = {0975-8887},
  pages = {22-26},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume17/number8/2241-2865/},
  doi = {10.5120/2241-2865},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:05:03.589288+05:30
%A Anil Agrawal
%A Mohd. Husain
%A Raj Gaurang Tiwari
%A Subodh Kumar
%T A Novel Technique for Database Selection and Document Selection
%J International Journal of Computer Applications
%@ 0975-8887
%V 17
%N 8
%P 22-26
%D 2011
%I Foundation of Computer Science (FCS), NY, USA

Abstract

In recent years the Internet has become a vast information source and can be regarded as the world's largest digital library. To help ordinary users find the data they want in this library, numerous search engines have been created. Each search engine has a corresponding database that defines the set of documents it can search. Typically, an index of all documents in the database is created and stored in the search engine. Text data on the Internet is naturally partitioned into numerous databases. Desired data can be retrieved efficiently if the usefulness of each database can be estimated accurately, because with that information only potentially useful documents need to be retrieved, and only from useful databases. For a given query q, the usefulness of a text database is defined as the number of documents in the database that are sufficiently relevant to q. In this paper, we propose new approaches to database selection and document selection.
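The usefulness definition in the abstract can be illustrated with a small sketch. It is not the method proposed in the paper, which this page does not describe. It assumes that "sufficiently relevant" means a cosine similarity between raw term-frequency vectors at or above a fixed threshold. The function names (`cosine`, `usefulness`, `rank_databases`), the threshold of 0.3, and the sample databases are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def usefulness(query, documents, threshold):
    """Count the documents whose similarity to the query meets the threshold.

    Assumption: 'sufficiently relevant' is modeled as similarity >= threshold.
    """
    q = Counter(query.lower().split())
    return sum(
        1 for doc in documents
        if cosine(q, Counter(doc.lower().split())) >= threshold
    )


def rank_databases(query, databases, threshold=0.3):
    """Order databases by usefulness, most useful first."""
    scores = {
        name: usefulness(query, docs, threshold)
        for name, docs in databases.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy databases, for illustration only.
databases = {
    "db_a": [
        "distributed query processing",
        "metasearch engine design",
        "cooking pasta at home",
    ],
    "db_b": ["gardening tips", "query processing in metasearch engine"],
    "db_c": ["football results"],
}

ranking = rank_databases("metasearch engine query", databases)
print(ranking)  # [('db_a', 2), ('db_b', 1), ('db_c', 0)]
```

In this sketch a metasearch engine would forward the query only to databases with non-zero usefulness, here `db_a` and `db_b`. A real system cannot scan every document at query time and would have to estimate these counts from summary statistics kept for each database.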

Index Terms

Computer Science

Information Sciences

Keywords

Metasearch Engine; Distributed query processing; Document selection