Research Article

Kafka-based Architecture in Building Data Lakes for Real-time Data Streams

by Kiran Peddireddy
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 9
Year of Publication: 2023
DOI: 10.5120/ijca2023922740

Kiran Peddireddy. Kafka-based Architecture in Building Data Lakes for Real-time Data Streams. International Journal of Computer Applications. 185, 9 (May 2023), 1-3. DOI=10.5120/ijca2023922740

@article{ 10.5120/ijca2023922740,
author = { Kiran Peddireddy },
title = { Kafka-based Architecture in Building Data Lakes for Real-time Data Streams },
journal = { International Journal of Computer Applications },
issue_date = { May 2023 },
volume = { 185 },
number = { 9 },
month = { May },
year = { 2023 },
issn = { 0975-8887 },
pages = { 1-3 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume185/number9/32726-2023922740/ },
doi = { 10.5120/ijca2023922740 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:25:37.798878+05:30
%A Kiran Peddireddy
%T Kafka-based Architecture in Building Data Lakes for Real-time Data Streams
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 9
%P 1-3
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper investigates how Kafka can be used to build data lakes for real-time data processing. Kafka has become a widely adopted data ingestion and processing tool that offers scalability, fault tolerance, and flexibility. The paper analyzes the benefits of using Kafka in a data lake architecture and outlines the steps involved in adopting it. It also presents a case study of a major financial institution that used Kafka to establish a data lake. The paper emphasizes the significance of Kafka in modern data processing and its value in building data lakes for real-time workloads.
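As an illustration of the ingestion pattern the abstract describes, the following is a minimal sketch of landing stream records into a time-partitioned data lake layout. It is not taken from the paper: the `payments` topic, the record fields, the `dt=/hour=` directory layout, and the `partition_path` and `land_records` helpers are all hypothetical, and plain Python dictionaries stand in for messages that a real Kafka consumer would yield, so the sketch runs without a broker.

```python
import json
import tempfile
from collections import defaultdict
from datetime import datetime, timezone
from pathlib import Path


def partition_path(topic: str, timestamp_ms: int) -> str:
    """Map a record to a date/hour partition directory (hypothetical layout)."""
    ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return f"{topic}/dt={ts:%Y-%m-%d}/hour={ts:%H}"


def land_records(records, lake_root: Path) -> dict:
    """Group records by partition and write each group as a JSON-lines file.

    `records` is any iterable of dicts with `topic`, `offset`,
    `timestamp_ms`, and `value` keys. These dicts are an assumption that
    stands in for the messages a real Kafka consumer would return.
    Returns a mapping of partition directory to number of records written.
    """
    batches = defaultdict(list)
    for rec in records:
        batches[partition_path(rec["topic"], rec["timestamp_ms"])].append(rec)

    written = {}
    for rel_dir, batch in batches.items():
        out_dir = lake_root / rel_dir
        out_dir.mkdir(parents=True, exist_ok=True)
        # Name the file after the offset range so a replayed batch
        # overwrites the same file instead of duplicating data.
        first, last = batch[0]["offset"], batch[-1]["offset"]
        out_file = out_dir / f"offsets-{first}-{last}.jsonl"
        with out_file.open("w", encoding="utf-8") as fh:
            for rec in batch:
                fh.write(json.dumps(rec["value"]) + "\n")
        written[rel_dir] = len(batch)
    return written


# Simulated stream: two records in one hour, one record in the next hour.
sample = [
    {"topic": "payments", "offset": 0, "timestamp_ms": 1_700_000_000_000,
     "value": {"id": 1, "amount": 25.0}},
    {"topic": "payments", "offset": 1, "timestamp_ms": 1_700_000_001_000,
     "value": {"id": 2, "amount": 7.5}},
    {"topic": "payments", "offset": 2, "timestamp_ms": 1_700_003_600_000,
     "value": {"id": 3, "amount": 120.0}},
]

with tempfile.TemporaryDirectory() as tmp:
    written = land_records(sample, Path(tmp))
    files = sorted(p.name for p in Path(tmp).rglob("*.jsonl"))
```

In a real deployment the simulated list would be replaced by a consumer loop or a sink connector, and the local directory by object storage; those choices are outside what the abstract specifies.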

References
  1. Kiran Peddireddy. (2023). Enterprise Data Integration and Streaming Using Kafka, ActiveMQ, and AWS Kinesis. ISBN-13: 979-8372725218.
  2. Apache Kafka Documentation. (2021). Retrieved from https://kafka.apache.org/documentation/
  3. Yu, T., Li, Y., Li, X., & Zhang, J. (2019). A Real-Time Customer Complaint Management System Based on Big Data Analytics. Journal of Computational Science, 31, 15-24.
  4. H. Wu, Z. Shang, G. Peng and K. Wolter, "A Reactive Batching Strategy of Apache Kafka for Reliable Stream Processing in Real-time", 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE), pp. 207-217, 2020.
  5. K. Peddireddy and D. Banga, "Enhancing Customer Experience through Kafka Data Steams for Driven Machine Learning for Complaint Management," International Journal of Computer Trends and Technology, vol. 71, no. 3, pp. 7-13, 2023, doi: 10.14445/22312803/IJCTT-V71I3P102.
  6. G. van Dongen and D. V. D. Poel, "A Performance Analysis of Fault Recovery in Stream Processing Frameworks", IEEE Access, vol. 9, pp. 93745-93763, 2021.
  7. J. Kreps, N. Narkhede, J. Rao et al., "Kafka: A distributed messaging system for log processing", Proceedings of the NetDB, pp. 1-7, 2011.
  8. H. Mehmood et al., "Implementing Big Data Lake for Heterogeneous Data Sources," 2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW), Macao, China, 2019, pp. 37-44, doi: 10.1109/ICDEW.2019.00-37.
  9. J. C. Couto and D. D. Ruiz, "An overview about data integration in data lakes," 2022 17th Iberian Conference on Information Systems and Technologies (CISTI), Madrid, Spain, 2022, pp. 1-7, doi: 10.23919/CISTI54924.2022.9820576.
Index Terms

Computer Science
Information Sciences

Keywords

Kafka, KSQL, Data Lake