Research Article

Exploring the Recent Trends of Paraphrase Detection

by Mohamed I. El Desouki, Wael H. Gomaa
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 182 - Number 46
Year of Publication: 2019
Authors: Mohamed I. El Desouki, Wael H. Gomaa
10.5120/ijca2019918317

Mohamed I. El Desouki, Wael H. Gomaa. Exploring the Recent Trends of Paraphrase Detection. International Journal of Computer Applications. 182, 46 (Mar 2019), 1-5. DOI=10.5120/ijca2019918317

@article{ 10.5120/ijca2019918317,
author = { Mohamed I. El Desouki and Wael H. Gomaa },
title = { Exploring the Recent Trends of Paraphrase Detection },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2019 },
volume = { 182 },
number = { 46 },
month = { Mar },
year = { 2019 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume182/number46/30458-2019918317/ },
doi = { 10.5120/ijca2019918317 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:14:23.365199+05:30
%A Mohamed I. El Desouki
%A Wael H. Gomaa
%T Exploring the Recent Trends of Paraphrase Detection
%J International Journal of Computer Applications
%@ 0975-8887
%V 182
%N 46
%P 1-5
%D 2019
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This study examines paraphrase detection (PD) for diagnostic purposes. PD is defined as the ability to identify similarity between sentences written in natural language. Detecting similar sentences is extremely important for software used in plagiarism detection, automated question answering, text mining, authorship authentication, and text summarization. The goal of paraphrase detection is to decide whether two statements have the same meaning. Hundreds of empirical studies have been conducted in this direction. This study discusses recent PD methods and groups them into two categories: supervised learning and unsupervised learning. It also gives an overview of text similarity, machine learning, and deep learning approaches. The performance of the selected studies is assessed by their accuracy and F-measure in detecting paraphrases on the Microsoft Research Paraphrase Corpus (MSRP).
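To make the evaluation setup in the abstract concrete, here is a minimal sketch in Python of an unsupervised detector scored by accuracy and F-measure. It is not the method of any surveyed paper. The Jaccard token-overlap measure, the 0.5 threshold, the function names, and the four sentence pairs are all assumptions chosen for illustration; the pairs are hand-written toy data, not taken from the Microsoft Research Paraphrase Corpus.

```python
from typing import List, Tuple


def jaccard_similarity(a: str, b: str) -> float:
    """Token-set overlap between two sentences, in [0, 1]."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_paraphrase(a: str, b: str, threshold: float = 0.5) -> bool:
    """Unsupervised decision: similarity at or above an assumed threshold."""
    return jaccard_similarity(a, b) >= threshold


def accuracy_and_f1(gold: List[bool], predicted: List[bool]) -> Tuple[float, float]:
    """Accuracy and F-measure (F1) for binary paraphrase labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    correct = sum(1 for g, p in pairs if g == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Hypothetical toy pairs: (sentence 1, sentence 2, gold paraphrase label).
TOY_PAIRS = [
    ("the cat sat on the mat", "the cat sat on a mat", True),
    ("he bought a new car", "she sold her old bike", False),
    ("the company reported strong profits", "profits at the firm were strong", True),
    ("the meeting starts at noon", "the meeting ends at noon", False),
]

gold = [label for _, _, label in TOY_PAIRS]
predicted = [is_paraphrase(s1, s2) for s1, s2, _ in TOY_PAIRS]
accuracy, f1 = accuracy_and_f1(gold, predicted)
print(f"accuracy={accuracy:.2f} f1={f1:.2f}")
```

On these toy pairs the word-overlap baseline misses the reworded third pair and wrongly accepts the fourth, whose sentences differ by one meaning-changing word. Cases like these are why purely lexical measures are usually combined with semantic features or learned models.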

Index Terms

Computer Science
Information Sciences

Keywords

Paraphrase Detection, Text Similarity, Machine Learning, Deep Learning, Microsoft Research Paraphrase Corpus