Research Article

Detection of Similarity in Cross Version Binaries using Raw Bytes

by Nandish M., Mohan H.G.
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 184 - Number 1
Year of Publication: 2022
Authors: Nandish M., Mohan H.G.
10.5120/ijca2022921963

Nandish M., Mohan H.G. Detection of Similarity in Cross Version Binaries using Raw Bytes. International Journal of Computer Applications. 184, 1 (Mar 2022), 26-29. DOI=10.5120/ijca2022921963

@article{ 10.5120/ijca2022921963,
author = { Nandish M. and Mohan H.G. },
title = { Detection of Similarity in Cross Version Binaries using Raw Bytes },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2022 },
volume = { 184 },
number = { 1 },
month = { Mar },
year = { 2022 },
issn = { 0975-8887 },
pages = { 26-29 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume184/number1/32299-2022921963/ },
doi = { 10.5120/ijca2022921963 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:20:20.339880+05:30
%A Nandish M.
%A Mohan H.G.
%T Detection of Similarity in Cross Version Binaries using Raw Bytes
%J International Journal of Computer Applications
%@ 0975-8887
%V 184
%N 1
%P 26-29
%D 2022
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Binary code similarity detection (BCSD) compares pieces of binary code, such as functions, basic blocks or entire programs, to identify similarities and differences. Binary code analysis makes it possible to analyse code without relying on the source code. BCSD is used for malware clustering, software theft detection and bug search. Existing techniques for the BCSD problem include Control Flow Graphs (CFG) and deep learning models. Here, a new and simple approach based on a single feature is proposed to solve the cross-version BCSD problem. The approach first transforms functions into vectors and then computes a coefficient value. It works on raw bytes and is implemented and evaluated on a custom dataset of around 23,451 samples. The results show that the model outperforms all other solutions, with recall reaching up to 97.1%.
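
The abstract does not specify how functions are turned into vectors or which coefficient is computed. As an illustration only, the following minimal Python sketch assumes a 256-bin byte-frequency histogram as the vector and the Pearson correlation coefficient as the similarity score. The names `byte_histogram`, `pearson` and `function_similarity` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from typing import List


def byte_histogram(raw: bytes) -> List[float]:
    """Map a function's raw bytes to a 256-dimensional relative-frequency vector.

    Assumption: the paper's actual vectorisation is not given in the abstract.
    """
    counts = Counter(raw)
    total = len(raw) or 1  # avoid division by zero for empty input
    return [counts.get(b, 0) / total for b in range(256)]


def pearson(u: List[float], v: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / math.sqrt(var_u * var_v)


def function_similarity(a: bytes, b: bytes) -> float:
    """Score two functions by correlating their byte histograms (assumed metric)."""
    return pearson(byte_histogram(a), byte_histogram(b))


# Two hypothetical versions of a small x86-64 function, and an unrelated NOP run.
old_version = b"\x55\x48\x89\xe5\x5d\xc3"
new_version = b"\x55\x48\x89\xe5\x31\xc0\x5d\xc3"
unrelated = b"\x90\x90\x90\x90"

print(function_similarity(old_version, new_version))
print(function_similarity(old_version, unrelated))
```

In this sketch a pair would be reported as similar when its score exceeds a chosen threshold. The threshold, like the choice of histogram and coefficient, is an assumption that would have to be tuned on labelled data.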

Index Terms

Computer Science
Information Sciences

Keywords

Cross Version Binary, Control Flow Graph, Similarity coefficient, Malware Detection