International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 99 - Number 10
Year of Publication: 2014
Authors: Neeraj Sharma, Manish Mann
DOI: 10.5120/17410-7986
Neeraj Sharma, Manish Mann. Adaptive Keywords Extraction using Back Propagation Neural Networks- A Review. International Journal of Computer Applications. 99, 10 (August 2014), 32-34. DOI=10.5120/17410-7986
Keyword extraction is important for knowledge management systems, information retrieval systems, and digital libraries, as well as for general browsing of the web. Keywords are generally the basis of document processing methods such as clustering and retrieval, because processing every word in a document can be slow. In existing work, it is observed that the extracted keywords do not include words that are bold, italic, or underlined, or that are set in a different font size, even though such emphasized text is a major source of keywords in a document. It is also observed that synonyms of keywords are not included in the keyword search space, although they may be one of its most important sources, since many concepts appear in a document through their synonyms as well. In the proposed work, keyword extraction is not based merely on a predefined keyword dictionary; instead, keywords are extracted from the particular document based on features such as the frequency with which a word or word form is repeated, using a neural network approach. As a result, the extracted keywords are specific to each document rather than common to all documents. The results of a back propagation neural network are more reliable when an exhaustive set of training samples is provided: the more the network is trained, the more precise the keyword extraction can be. However, a large number of features may slow down the network. Therefore, an optimal feature set that completely describes the document under study should be designed.
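The approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it is hypothetical: `word_features`, `TinyBackpropNet`, the token format (a list of `(word, is_emphasized)` pairs), the synonym map, and the hand-assigned training labels are all assumptions made for illustration. Each candidate word gets two features, a relative frequency (with synonyms merged into one form) and an emphasis flag, and a small network with one hidden layer is trained by back propagation to score words as keywords.

```python
import math
import random
from collections import Counter


def word_features(tokens, synonyms=None):
    """Build [relative_frequency, emphasis_flag] for each distinct word.

    tokens   -- list of (word, is_emphasized) pairs (assumed input format)
    synonyms -- hypothetical map from a word to its canonical form
    """
    synonyms = synonyms or {}
    counts = Counter()
    emphasized = set()
    for word, is_emph in tokens:
        key = synonyms.get(word.lower(), word.lower())
        counts[key] += 1
        if is_emph:
            emphasized.add(key)
    top = max(counts.values())
    return {w: [c / top, 1.0 if w in emphasized else 0.0]
            for w, c in counts.items()}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBackpropNet:
    """One hidden layer, one sigmoid output, squared-error loss."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return xb, h, y

    def train_step(self, x, target, lr=0.5):
        xb, h, y = self.forward(x)
        # Error terms are propagated backwards from the output layer.
        delta_out = (y - target) * y * (1.0 - y)
        delta_hidden = [delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                        for j in range(len(h))]
        hb = h + [1.0]
        for j in range(len(hb)):
            self.w2[j] -= lr * delta_out * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * delta_hidden[j] * xb[i]
        return 0.5 * (y - target) ** 2


# Toy document: "net" is treated as a synonym of "network" (an assumption).
tokens = [("neural", True), ("network", True), ("the", False),
          ("network", False), ("of", False), ("net", False), ("the", False)]
feats = word_features(tokens, synonyms={"net": "network"})

# Hand-assigned labels for illustration only: 1 = keyword, 0 = not a keyword.
labels = {"neural": 1.0, "network": 1.0, "the": 0.0, "of": 0.0}

net = TinyBackpropNet(n_in=2, n_hidden=3)
losses = []
for _ in range(2000):
    losses.append(sum(net.train_step(feats[w], t) for w, t in labels.items()))

scores = {w: net.forward(f)[2] for w, f in feats.items()}
```

After training, `scores` maps each word to a value between 0 and 1, and words can be ranked or thresholded to pick keywords. Only two features are used here to keep the network small, which reflects the trade-off noted in the abstract: more features may describe a document better but slow the network down.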