International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 175 - Number 10
Year of Publication: 2020
Authors: Zhen Wang, Chenyu Wang, Anlei Jiao
10.5120/ijca2020920564
Zhen Wang, Chenyu Wang, Anlei Jiao. Minimal Similarity Loss Hashing based on Optimized Self-taught Clustering Method. International Journal of Computer Applications. 175, 10 (Aug 2020), 34-39. DOI=10.5120/ijca2020920564
With the rapid growth in the number of web images, how to respond quickly to approximate nearest neighbor (ANN) search queries has drawn increasing attention from researchers. Hashing algorithms map floating-point data into binary codes and perform ANN search via Hamming distances, giving them the advantage of low computational time complexity. To obtain excellent ANN search performance with compact binary codes, many algorithms adopt machine learning mechanisms to train hashing functions. For instance, k-means hashing (KMH) and stacked k-means hashing (SKMH) use the k-means clustering mechanism to learn encoding centers. Because KMH and SKMH generate their initial centers randomly, many centers can converge to the same solution. To fix this problem, a novel hashing method termed minimal similarity loss hashing (MSLH) is proposed, which generates initial centers with maximum average distance through a self-taught mechanism. Furthermore, MSLH defines the objective function as the combination of a similarity loss and a quantization loss. By minimizing the similarity loss, MSLH approximates the Euclidean distance between data pairs by their Hamming distance. The encoding results with minimal quantization loss map nearest neighbors to the same binary code, which adapts well to the data distribution. Comparative ANN search experiments on three publicly available datasets, including NUS-WIDE and 22K LabelMe, show that MSLH achieves superior performance.
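The two core ideas in the abstract — spreading initial centers far apart before clustering, and answering ANN queries by Hamming distance on learned codes — can be illustrated with a minimal sketch. This is not the paper's method: the function names, the farthest-point heuristic standing in for "maximum average distance" initialization, and the center-index codes are all illustrative assumptions.

```python
import numpy as np

def spread_init(data, k, seed=0):
    """Pick k initial centers that are mutually far apart (a common
    farthest-point heuristic; the paper's self-taught procedure may differ)."""
    rng = np.random.default_rng(seed)
    centers = [data[rng.integers(len(data))]]
    for _ in range(k - 1):
        # distance from every point to its nearest already-chosen center
        d = np.min([np.linalg.norm(data - c, axis=1) for c in centers], axis=0)
        centers.append(data[np.argmax(d)])  # farthest point becomes next center
    return np.stack(centers)

def encode(data, centers):
    """Quantize each point to its nearest center; the center index
    serves as the (integer-valued) binary code."""
    d = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def hamming(a, b):
    """Hamming distance between two integer codes, compared bitwise."""
    return bin(int(a) ^ int(b)).count("1")
```

With such codes in hand, an ANN query reduces to encoding the query point and ranking the database by `hamming`, which is the low-complexity lookup the abstract refers to.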