Research Article

An Efficient and Scalable RDF Indexing Strategy based on B-Hashed-Bitmap Algorithm using CUDA

by Sharmi Sankar, Munesh Singh, Awny Sayed, Jihad Alkhalaf Bani-younis
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 104 - Number 7
Year of Publication: 2014
DOI: 10.5120/18216-9221

Sharmi Sankar, Munesh Singh, Awny Sayed, Jihad Alkhalaf Bani-younis. An Efficient and Scalable RDF Indexing Strategy based on B-Hashed-Bitmap Algorithm using CUDA. International Journal of Computer Applications. 104, 7 (October 2014), 31-38. DOI=10.5120/18216-9221

@article{ 10.5120/18216-9221,
author = { Sharmi Sankar and Munesh Singh and Awny Sayed and Jihad Alkhalaf Bani-younis },
title = { An Efficient and Scalable RDF Indexing Strategy based on B-Hashed-Bitmap Algorithm using CUDA },
journal = { International Journal of Computer Applications },
issue_date = { October 2014 },
volume = { 104 },
number = { 7 },
month = { October },
year = { 2014 },
issn = { 0975-8887 },
pages = { 31-38 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume104/number7/18216-9221/ },
doi = { 10.5120/18216-9221 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:35:33.286443+05:30
%A Sharmi Sankar
%A Munesh Singh
%A Awny Sayed
%A Jihad Alkhalaf Bani-younis
%T An Efficient and Scalable RDF Indexing Strategy based on B-Hashed-Bitmap Algorithm using CUDA
%J International Journal of Computer Applications
%@ 0975-8887
%V 104
%N 7
%P 31-38
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Indexing very large databases, such as RDF stores, has been a focus of intense research. Indexing plays a pivotal role in speeding up data retrieval and improving query performance. Besides expediting search, an index can motivate new data-store schemes and technologies that could reshape the design of large data-analytics engines, particularly those relevant to the Semantic Web. With the proliferation of the internet and the ease of generating data on the fly, handling, storing, and subsequently processing that data semantically has proven to be a major bottleneck for the RDF community. Handling data at such scale calls for a parallel approach of the kind provided by GPUs (graphics processing units). In this paper, a new efficient and scalable index is proposed that combines B+ trees, hashing, and sparse matrices. These data structures have an edge over others because they lend themselves to implementation as parallel algorithms in CUDA (Compute Unified Device Architecture), the framework for programming massively parallel GPU multicores. So far, RDF data has mostly been stored either in an RDBMS or in a non-native data store; in both cases, a sequential indexing strategy scales poorly as the data store grows. A parallel implementation of the indices is a suitable option for dealing with scalable, dynamically generated data over distributed networks. The crucial sparse-matrix part of the proposed index is benchmarked against different CUDA memory implementations to derive the optimal matrix-processing options. The sparse-matrix search is profiled using cuda-memcheck and the Visual Profiler to identify bottlenecks and thread divergence, that is, inconsistencies in thread execution. The benchmarks show promising results for a B+ tree based index coupled with hashing and sparse-matrix implementations.
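
The abstract does not spell out the layout of the B-Hashed-Bitmap index or its CUDA kernels, so the following is only a minimal CPU-side sketch in Python of how the three ingredients it names could fit together. It assumes (this is not taken from the paper) that terms are mapped to integer ids through a hash dictionary, that a sorted predicate list searched by bisection stands in for the B+ tree, and that each predicate owns a subject-by-object sparse matrix stored in CSR form. The class name `TripleIndex` and its methods are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


class TripleIndex:
    """Toy RDF index (illustrative only, not the paper's implementation)."""

    def __init__(self, triples):
        self.term_ids = {}   # hashing: term string -> dense integer id
        self.terms = []      # reverse mapping: id -> term string
        rows = defaultdict(lambda: defaultdict(set))
        for s, p, o in triples:
            rows[p][self._id(s)].add(self._id(o))
        # Assumption: a sorted key list with binary search stands in
        # for the B+ tree over predicates.
        self.predicates = sorted(rows)
        # One CSR sparse matrix (subjects x objects) per predicate.
        self.matrices = [self._to_csr(rows[p]) for p in self.predicates]

    def _id(self, term):
        # Assign the next free id the first time a term is seen.
        if term not in self.term_ids:
            self.term_ids[term] = len(self.terms)
            self.terms.append(term)
        return self.term_ids[term]

    def _to_csr(self, adjacency):
        # row_ptr[i]:row_ptr[i + 1] delimits the object ids of subject i.
        row_ptr, col_idx = [0], []
        for subject_id in range(len(self.terms)):
            col_idx.extend(sorted(adjacency.get(subject_id, ())))
            row_ptr.append(len(col_idx))
        return row_ptr, col_idx

    def objects(self, subject, predicate):
        """Return all objects o such that (subject, predicate, o) is stored."""
        k = bisect_left(self.predicates, predicate)
        if k == len(self.predicates) or self.predicates[k] != predicate:
            return []
        s = self.term_ids.get(subject)
        if s is None:
            return []
        row_ptr, col_idx = self.matrices[k]
        return [self.terms[o] for o in col_idx[row_ptr[s]:row_ptr[s + 1]]]


index = TripleIndex([
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "likes", "carol"),
])
print(index.objects("alice", "knows"))  # ['bob', 'carol']
```

In a GPU setting, the row scan in `objects` is the part that would plausibly be spread across threads, which is where the choice of CUDA memory type and thread divergence mentioned in the abstract become relevant.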

Index Terms

Computer Science
Information Sciences

Keywords

RDF, B+ tree, hashmap, sparse matrix, CUDA, GPU