CFP last date
20 February 2025
Reseach Article

Minimal Similarity Loss Hashing based on Optimized Self-taught Clustering Method

by Zhen Wang, Chenyu Wang, Anlei Jiao
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 175 - Number 10
Year of Publication: 2020
Authors: Zhen Wang, Chenyu Wang, Anlei Jiao
10.5120/ijca2020920564

Zhen Wang, Chenyu Wang, Anlei Jiao . Minimal Similarity Loss Hashing based on Optimized Self-taught Clustering Method. International Journal of Computer Applications. 175, 10 ( Aug 2020), 34-39. DOI=10.5120/ijca2020920564

@article{ 10.5120/ijca2020920564,
author = { Zhen Wang, Chenyu Wang, Anlei Jiao },
title = { Minimal Similarity Loss Hashing based on Optimized Self-taught Clustering Method },
journal = { International Journal of Computer Applications },
issue_date = { Aug 2020 },
volume = { 175 },
number = { 10 },
month = { Aug },
year = { 2020 },
issn = { 0975-8887 },
pages = { 34-39 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume175/number10/31491-2020920564/ },
doi = { 10.5120/ijca2020920564 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:24:41.983065+05:30
%A Zhen Wang
%A Chenyu Wang
%A Anlei Jiao
%T Minimal Similarity Loss Hashing based on Optimized Self-taught Clustering Method
%J International Journal of Computer Applications
%@ 0975-8887
%V 175
%N 10
%P 34-39
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In With the rapid growth of the amount of web images, how to fast respond the approximate nearest neighbors (ANN) search task has been concerned by researchers. As hashing algorithms map the floating point data into binary code and achieve ANN search task according to Hamming distances, it has the advantages of low computational time complexity. To obtain an excellent ANN search performance based on compact binary code, many algorithms adopt machine learning mechanism to train hashing functions. For instance, k-means hashing (KMH) and stacked k-means hashing (SKMH) utilize k-means clustering mechanism to learn encoding centers. Due to KMH and SKMH randomly generate initial centers, many centers would converge to the same solution. To fix the above problem, a novel hashing method termed as minimal similarity loss hashing (MSLH) is proposed to generate the initial centers with maximum average distance by a self-taught mechanism. Furthermore, MSLH defines both the similarity loss and the quantization loss as the objective function. By minimizing the similarity loss, MSLH can approximate the data pairs' Euclidean distance by their Hamming distance. The encoding results with minimal quantization loss map the nearest neighbors into the same binary code, which well adaptive to the data distribution. The ANN search comparative experiments on three public available datasets including NUS-WIDE and 22K LabelME show that MSLH can achieve a superior performance.

References
  1. Wang Z., Sun F., Zhang L., Liu P.. 2020 Minimal residual ordinal loss hashing with an adaptive optimization mechanism. EuRASIP Journal on Image and Video Processing, 10.
  2. Wang Z., Sun F., Zhang L., Wang L.. 2020 Top position sensitive ordinal relation preserving bitwise weight for image retrieval [J]. Algorithms, 13(1), 18.
  3. Wang Z., Zhang L., Sun F., Wang L., Liu S. Relative similarity preserving bitwise weights generated by an adaptive mechanism [J]. 2019 Multimedia Tools and Applications, 78 (17), pp. 24453-24472.
  4. Datar M., Immorlica N., Indyk P., Mirrokni V. S. 2004 Locality-sensitive hashing scheme based on p-stable distributions [C]. In Proceedings of twentieth Annual Symposium on Computational Geometry. Brooklyn, New York, USA, 253-262.
  5. Weiss Y., Torralba A., Fergus R.. 2008 Spectral hashing [C]. In Proceedings of the Advances in Neural Information Processing Systems. British Columbia, Canada, 1753-1760.
  6. Yusuke M., Toshikazu W.. 2009 Principal component hashing: An accelerated approximate nearest neighbor search [C]. In proceedings of Pacific Rim Symposium on Advances in Image and Video Technology. Springer-Verlag, 374-385.
  7. Jegou H., Douze M., Schmid C., Perez P. 2010 Aggregating local descriptors into a compact image representation [C]. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, California, USA, 3304-3311.
  8. Gong Y., Lazebnik S.. 2011 Iterative Quantization: A procrustean approach to learning binary codes [C]. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, CO, USA, 817-824.
  9. He K., Wen F., Sun J.. 2013 K-means hashing: an affinity-preserving quantization method for learning binary compact codes [C]. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Portland, Oregon, 2938-2945.
  10. Chen Y., Li Z., Shi J., Liu Z. and Qu W.. 2018 Stacked K-Means Hashing Quantization for Nearest Neighbor Search [C]. In Proceedings of the IEEE Fourth International Conference on Multimedia Big Data. 1-4.
  11. Ge T., He K., Ke Q., Sun J.. 2013 Optimized Product Quantization for Approximate Nearest Neighbor Search [C]. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Portland, Oregon, 2946-2953.
  12. Ge T., He K., Ke Q., Sun J.. 2014 Optimized Product Quantization [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(4): 744-755.
  13. Wang X., Shi Y., Kitani K.. 2016 Deep Supervised Hashing with Triplet Labels [C]. In Proceedings of Asian Conference on Computer Vision. Taipei, Taiwan, 70-84.
  14. Cakir F., Sclaroff S.. 2015 Adaptive Hashing for Fast Similarity Search [C]. In Proceedings of the 2015 IEEE International Conference on Computer Vision (CVPR). Santiago, Chile, 1044-1052.
  15. Oliva A., Torralba A.. 2001 Modeling the shape of the scene: A holistic representation of the spatial envelope [J]. International journal of computer vision, 42(3): 145-175.
Index Terms

Computer Science
Information Sciences

Keywords

Hashing algorithm self-taught clustering Iterative optimization mechanism Similarity preserving.