Research Article

Unstructured Content Analysis & Classification System for the IRS

by R.Palson Kennedy
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 4
Year of Publication: 2010
Authors: R.Palson Kennedy
10.5120/105-216

R.Palson Kennedy. Unstructured Content Analysis & Classification System for the IRS. International Journal of Computer Applications. 1, 4 (February 2010), 32-37. DOI=10.5120/105-216

@article{ 10.5120/105-216,
author = { R.Palson Kennedy },
title = { Unstructured Content Analysis & Classification System for the IRS },
journal = { International Journal of Computer Applications },
issue_date = { February 2010 },
volume = { 1 },
number = { 4 },
month = { February },
year = { 2010 },
issn = { 0975-8887 },
pages = { 32-37 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume1/number4/105-216/ },
doi = { 10.5120/105-216 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:44:11.211480+05:30
%A R.Palson Kennedy
%T Unstructured Content Analysis & Classification System for the IRS
%J International Journal of Computer Applications
%@ 0975-8887
%V 1
%N 4
%P 32-37
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Creating ontological approaches to personalizing queries of unstructured data requires intensive use of XML-based tables and schemas. From the legacy design efforts for CSDL to the many approaches to XML schema development, including XIRQL, Hybrid XML retrieval and XML queries, the adoption of advanced techniques for unstructured content management is progressing rapidly. Paralleling these research advances is the pervasive adoption of Cloud Computing platforms, including Software-as-a-Service (SaaS), driven by the growth of the Amazon Web Services platform, among others. The intent of this thesis proposal is to define an XML schema that aggregates unstructured content and, when that content is combined according to the individualized taxonomies and ontological preferences of system users, delivers highly relevant and timely data. The proposed XML Schema Model for Unstructured Content Personalization is shown in Figure 1. This model is further supported by the development and continual fine-tuning of Quantum Information Algorithms, which define approximate taxonomies and approaches to creating role-based queries; these serve as the basis for creating personalization pathways in the data. Quantum Information Theory also makes it possible to create enterprise-wide networks of knowledge management systems that can effectively "learn" over time, using latent semantic indexing (LSI) to create linguistic models that represent the data. It thus provides the basis for an entire network of systems that learns over time, continually fueling new insights into the knowledge base of the complex of systems itself.

References
  1. Agosti, Maristella, Crestani, Fabio, Gradenigo, Girolamo. (1989). Towards Data Modelling in Information Retrieval. Journal of Information Science, 15(6), 307.
  2. Alain Azagury, Michael E Facto, Yoelle S Maarek, Benny Mandler. (2002). A novel navigation paradigm for XML repositories. Journal of the American Society for Information Science and Technology, 53(6), 515-525.
  3. Michael Benedikt, Christoph Koch. (2008). XPath leashed. ACM Computing Surveys, 41(1), 23.
  4. Elisa Bertino, Giovanna Guerrini, Marco Mesiti. (2008). Measuring the structural similarity among XML documents and DTDs. Journal of Intelligent Information Systems, 30(1), 55-92.
  5. Angela Bonifati, Stefano Ceri, Stefano Paraboschi. (2002). Pushing reactive services to XML repositories using active rules. Computer Networks, 39(5), 645-660.
  6. Falguni Bhuta. (2006, June). Put Unstructured Data In Its Place. Information Week, (1094), 21.
  7. Yannis Charalabidis, Fenareti Lampathaki, Dimitris Askounis. (2008). Unified Data Modelling and Document Standardization Using Core Components Technical Specification for Electronic Government Applications. Journal of Theoretical and Applied Electronic Commerce Research, 3(3), 38-51.
  8. Shu-Yao Chien, Vassilis J. Tsotras, Carlo Zaniolo, Donghui Zhang. (2006). Supporting complex queries on multiversion XML documents. ACM Transactions on Internet Technology, 6(1), 53.
  9. Tae-Sun Chung, Hyoung-Joo Kim. (2002). A two phase optimization technique for XML queries with multiple regular path expressions. The Journal of Systems and Software, 64(3), 183-193.
  10. Collard, M. L. and Maletic, J. I., (2004), "Document-Oriented Source Code Transformation using XML", in Proceedings of 1st International Workshop on Software Evolution Transformation (SET'04), Delft, The Netherlands, Nov. 9, pp. 11-14.
  11. Samuel Robert Collins, Shamkant Navathe, Leo Mark. (2002). XML schema mappings for heterogeneous database access. Information and Software Technology, 44(4), 251-257.
  12. Conlon, S., J. Hale, S. Lukose, and J. Strong. 2008. INFORMATION EXTRACTION AGENTS FOR SERVICE-ORIENTED ARCHITECTURE USING WEB SERVICE SYSTEMS: A FRAMEWORK. The Journal of Computer Information Systems 48, no. 3, (April 1): 74-83.
  13. Doan, A.; Naughton, J. F.; Ramakrishnan, R.; Baid, A.; Chai, X.; Chen, F.; Chen, T.; Chu, E.; DeRose, P.; Gao, B. J.; W. & Vuong, B.-Q. (2008). Information extraction challenges in managing unstructured data. SIGMOD Record, 37(4), 14-20.
  14. Adam Fadlalla, Chien-Hua Lin. (2001). An analysis of the applications of neural networks in finance. Interfaces, 31(4), 112-122.
  15. Norbert Fuhr, Kai Großjohann. (2004). XIRQL: An XML query language based on information retrieval concepts. ACM Transactions on Information Systems, 22(2),
  16. Norbert Fuhr, Norbert Gövert. (2006). Retrieval quality vs. effectiveness of specificity-oriented search in XML collections. Information Retrieval, 9(1), 55-70.
  17. J E Funderburk, S Malaika, B Reinwald. (2002). XML programming with SQL/XML and XQuery. IBM Systems Journal, 41(4), 642-665.
  18. Sven Groppe, Jinghua Groppe, Stefan Böttcher, Thomas Wycisk, Le Gruenwald. (2009). Optimizing the execution of XSLT stylesheets for querying transformed XML data. Knowledge and Information Systems, 18(3), 331-391.
  19. Norbert Gövert, Norbert Fuhr, Mounia Lalmas, Gabriella Kazai. (2006). Evaluating the effectiveness of content-oriented XML retrieval methods. Information Retrieval, 9(6), 699-722.
  20. Jaap Kamps, Maarten Marx, Maarten de Rijke, Börkur Sigurbjörnsson. (2006). Articulating information needs in XML query languages. ACM Transactions on Information Systems, 24(4), 407-436.
  21. Shmuel T Klein. (2008). Processing queries with metrical constraints in XML-based IR systems. Journal of the American Society for Information Science and Technology, 59(1), 86.
  22. Judith Lamont. (2007, February). Semantic Web holds promise for KM. KM World, 16(2), 22,26.
  23. Leah S Larkey, Margaret E Connell. (2005). Structured queries, language modeling, and relevance modeling in cross-language information retrieval. Information Processing & Management, 41(3), 457-473.
  24. William Laurent. (2008). Mining the Business Intelligence from Unstructured Information. DM Review, 18(4), 28.
  25. Libby, Robert, Tan, Hun-Tong. (1994). Modeling the determinants of audit expertise. Accounting, Organizations and Society, 19(8), 701.
  26. Maletic, J. I., Collard, M. L. (2005). Adding Structure to Unstructured Text. Wright Center for Innovation/LexisNexis Conference on Using Metadata to Manage Unstructured Text, Dayton, Ohio, October 7, 2005, 5 pages.
  27. S Liu, C A McMahon, S J Culley. (2008). A review of structured document retrieval (SDR) technology to improve information access performance in engineering document management. Computers in Industry, 59(1), 3.
  28. Robert M Losee. (2006). Browsing mixed structured and unstructured data. Information Processing & Management, 42(2), 440-452.
  29. M Mercedes Martínez-González, Pablo de la Fuente. (2007). Introducing structure management in automatic reference resolution: An XML-based approach. Information Processing & Management, 43(6), 1808.
  30. Young-Ho Park, Kyu-Young Whang, Byung Suk Lee, Wook-Shin Han. (2006). Efficient evaluation of linear path expressions on large-scale heterogeneous XML documents using information retrieval techniques. The Journal of Systems and Software, 79(2), 180-190.
  31. Jovan Pehcevski, James A. Thom, Anne-Marie Vercoustre. (2005). Hybrid XML Retrieval: Combining Information Retrieval and a Native XML Database. Information Retrieval, 8(4), 571-600.
  32. Juan Manuel Pérez, Rafael Berlanga, María José Aramburu. (2009). A relevance model for a data warehouse contextualized with documents. Information Processing & Management, 45(3), 356.
  33. Roussopoulos, Nicholas. (1979). CSDL: A Conceptual Schema Definition Language for the Design of Data Base Applications. IEEE Transactions on Software Engineering, 5(5), 481-496.
  34. Kun-Woo Yang, Soon-Young Huh. (2007). Intelligent Search for Experts Using Fuzzy Abstraction Hierarchy in Knowledge Management Systems. Journal of Database Management, 18(3), 47-68.
  35. ChengXiang Zhai, and John Lafferty. 2006. A risk minimization framework for information retrieval. Information Processing & Management 42, no. 1, (January 1): 31-55.
  36. Jose Zubcoff, Jesús Pardillo, Juan Trujillo. (2009). A UML profile for the conceptual modelling of data-mining with time-series in data warehouses. Information and Software Technology, 51(6), 977.
  37. CC Kane morekotte. (2006). The importance of sibling for efficient clustering of XML documents. IBM Systems Journal, 45(2), 321-334.
Index Terms

Computer Science
Information Sciences

Keywords

Unstructured data, XML schema, LSI, Cloud Computing, Personalization, knowledge management systems