Exploring the Recent Trends of Paraphrase Detection

Mohamed I. El Desouki; Wael H. Gomaa

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 182 - Number 46

Year of Publication: 2019

Authors: Mohamed I. El Desouki, Wael H. Gomaa

DOI: 10.5120/ijca2019918317

Mohamed I. El Desouki, Wael H. Gomaa. Exploring the Recent Trends of Paraphrase Detection. International Journal of Computer Applications. 182, 46 (Mar 2019), 1-5. DOI=10.5120/ijca2019918317

@article{ 10.5120/ijca2019918317,

author = { Mohamed I. El Desouki and Wael H. Gomaa },

title = { Exploring the Recent Trends of Paraphrase Detection },

journal = { International Journal of Computer Applications },

issue_date = { Mar 2019 },

volume = { 182 },

number = { 46 },

month = { Mar },

year = { 2019 },

issn = { 0975-8887 },

pages = { 1-5 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume182/number46/30458-2019918317/ },

doi = { 10.5120/ijca2019918317 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-07T01:14:23.365199+05:30

%A Mohamed I. El Desouki

%A Wael H. Gomaa

%T Exploring the Recent Trends of Paraphrase Detection

%J International Journal of Computer Applications

%@ 0975-8887

%V 182

%N 46

%P 1-5

%D 2019

%I Foundation of Computer Science (FCS), NY, USA

Abstract

This study examines paraphrase detection (PD) for diagnostic purposes. PD is defined as the ability to identify similarity between sentences written in natural language. Detecting similar sentences is extremely important for software used in plagiarism detection, automated question answering, text mining, authorship authentication, and text summarization. The goal of paraphrase detection is to determine whether two statements have the same meaning. Hundreds of empirical studies exist in this direction. This study discusses recent PD methods and groups them into two categories: supervised learning and unsupervised learning. It also gives an overview of text similarity, machine learning, and deep learning approaches. The performance of the selected studies is assessed by the accuracy and F-measure they achieve in detecting paraphrases on the Microsoft Research Paraphrase Corpus (MSRP).
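As an illustration of the evaluation setting described in the abstract, the sketch below labels sentence pairs as paraphrase or not with a simple word-overlap (Jaccard) similarity and a fixed threshold, then scores the predictions with accuracy and F-measure. It is a minimal example under stated assumptions, not a method from the surveyed studies: the function names, the 0.5 threshold, and the toy sentence pairs are all hypothetical, and no MSRP data is used.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap (Jaccard) similarity between two sentences."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def predict(pairs: List[Tuple[str, str]], threshold: float = 0.5) -> List[int]:
    """Label a pair 1 (paraphrase) when similarity reaches the threshold.

    The threshold is an illustrative assumption; in practice it would be
    tuned on held-out data.
    """
    return [1 if jaccard(a, b) >= threshold else 0 for a, b in pairs]


def accuracy_and_f_measure(gold: List[int], pred: List[int]) -> Tuple[float, float]:
    """Accuracy and F-measure (harmonic mean of precision and recall)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    correct = sum(1 for g, p in zip(gold, pred) if g == p)

    accuracy = correct / len(gold) if gold else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return accuracy, f_measure


if __name__ == "__main__":
    # Toy pairs and gold labels, invented for illustration only.
    pairs = [
        ("the company reported higher profits", "the company reported higher earnings"),
        ("he arrived on monday", "the weather was cold"),
    ]
    gold = [1, 0]
    print(accuracy_and_f_measure(gold, predict(pairs)))
```

Word overlap is only a lexical baseline; it is used here because it needs no training data or external resources, which keeps the example self-contained.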

References

Dolan, W. B., & Brockett, C. (2005). Automatically constructing a corpus of sentential paraphrases. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005).
Gomaa, W. H., & Fahmy, A. A. (2011). Tapping into the power of automatic scoring. In The Eleventh International Conference on Language Engineering, Egyptian Society of Language Engineering (ESOLEC).
Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2012). Foundations of machine learning. MIT press.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
Mihalcea, R., Corley, C., & Strapparava, C. (2006, July). Corpus-based and knowledge-based measures of text semantic similarity. In AAAI (Vol. 6, pp. 775-780).
Hassan, S. (2011). Measuring semantic relatedness using salient encyclopedic concepts. University of North Texas.
Rus, V., McCarthy, P. M., Lintean, M. C., McNamara, D. S., & Graesser, A. C. (2008, May). Paraphrase Identification with Lexico-Syntactic Graph Subsumption. In FLAIRS Conference (pp. 201-206).
Islam, A., & Inkpen, D. (2009). Semantic similarity of short texts. Recent Advances in Natural Language Processing V, 309, 227-236.
Milajevs, D., Kartsaklis, D., Sadrzadeh, M., & Purver, M. (2014). Evaluating neural word representations in tensor-based compositional settings. arXiv preprint arXiv:1408.6179.
Fernando, S., & Stevenson, M. (2008, March). A semantic similarity approach to paraphrase detection. In Proceedings of the 11th Annual Research Colloquium of the UK Special Interest Group for Computational Linguistics (pp. 45-52).
Qiu, L., Kan, M. Y., & Chua, T. S. (2006, July). Paraphrase recognition via dissimilarity significance classification. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (pp. 18-26). Association for Computational Linguistics.
Ul-Qayyum, Z., & Altaf, W. (2012). Paraphrase identification using semantic heuristic features. Research Journal of Applied Sciences, Engineering and Technology, 4(22), 4894-4904.
Kozareva, Z., & Montoyo, A. (2006). Paraphrase identification on the basis of supervised machine learning techniques. In Advances in natural language processing (pp. 524-533). Springer, Berlin, Heidelberg.
Finch, A., Hwang, Y. S., & Sumita, E. (2005). Using machine translation evaluation techniques to determine sentence-level semantic equivalence. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005).
Das, D., & Smith, N. A. (2009, August). Paraphrase identification as probabilistic quasi-synchronous recognition. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 (pp. 468-476). Association for Computational Linguistics.
Wan, S., Dras, M., Dale, R., & Paris, C. (2006). Using dependency-based features to take the 'para-farce' out of paraphrase. In Proceedings of the Australasian Language Technology Workshop 2006 (pp. 131-138).
Madnani, N., Tetreault, J., & Chodorow, M. (2012, June). Re-examining machine translation metrics for paraphrase identification. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 182-190). Association for Computational Linguistics.
Ji, Y., & Eisenstein, J. (2013). Discriminative improvements to distributional sentence similarity. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 891-896).
Filice, S., Da San Martino, G., & Moschitti, A. (2015). Structural representations for learning relations between pairs of texts. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (Vol. 1, pp. 1003-1013).
Socher, R., Huang, E. H., Pennington, J., Manning, C. D., & Ng, A. Y. (2011). Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In Advances in Neural Information Processing Systems (pp. 801-809).
Blacoe, W., & Lapata, M. (2012, July). A comparison of vector-based representations for semantic composition. In Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning (pp. 546-556). Association for Computational Linguistics.
He, H., Gimpel, K., & Lin, J. (2015). Multi-perspective sentence similarity modeling with convolutional neural networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 1576-1586).
Cheng, J., & Kartsaklis, D. (2015). Syntax-aware multi-sense word embeddings for deep compositional models of meaning. arXiv preprint arXiv:1508.02354.
Wang, Z., Mi, H., & Ittycheriah, A. (2016). Sentence similarity learning by lexical decomposition and composition. arXiv preprint arXiv:1602.07019.

Index Terms

Computer Science

Information Sciences

Keywords

Paraphrase Detection, Text Similarity, Machine Learning, Deep Learning, Microsoft Research Paraphrase Corpus