Research Article

Textual Summarization of Text and Multimedia Data using LDA Algorithm

by Prajakta Bharat Deshmukh, S. S. Shiravale
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 175 - Number 14
Year of Publication: 2020
Authors: Prajakta Bharat Deshmukh, S. S. Shiravale
10.5120/ijca2020920639

Prajakta Bharat Deshmukh, S. S. Shiravale. Textual Summarization of Text and Multimedia Data using LDA Algorithm. International Journal of Computer Applications. 175, 14 (Aug 2020), 42-48. DOI=10.5120/ijca2020920639

@article{ 10.5120/ijca2020920639,
author = { Prajakta Bharat Deshmukh and S. S. Shiravale },
title = { Textual Summarization of Text and Multimedia Data using LDA Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { Aug 2020 },
volume = { 175 },
number = { 14 },
month = { Aug },
year = { 2020 },
issn = { 0975-8887 },
pages = { 42-48 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume175/number14/31525-2020920639/ },
doi = { 10.5120/ijca2020920639 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:25:04.566764+05:30
%A Prajakta Bharat Deshmukh
%A S. S. Shiravale
%T Textual Summarization of Text and Multimedia Data using LDA Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 175
%N 14
%P 42-48
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In recent years, considerable effort has gone into generating summaries of events and content such as meetings, sports clips, pictorial storylines, movies, and social media posts. Automatic text summarization is a basic Natural Language Processing (NLP) application that aims to condense a given text into a shorter form. The rapid growth of multimedia data across the internet has created demand for summarization of asynchronous data, that is, combinations of image, text, video, and audio. We describe a multi-modal summarization framework that uses OCR, NLP, and speech processing techniques to examine the information these sources contain and to enhance multimedia summarization.

Index Terms

Computer Science
Information Sciences

Keywords

Summarization, Multimedia, Multi-modal, Cross-modal, Natural Language Processing, Computer Vision, OCR Technique, Automatic Speech Recognition