Research Article

Beyond Banking: The Trailblazing Impact of Data Lakes on Financial Landscape

by Pankaj Gupta, Sivakumar Ponnusamy
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 47
Year of Publication: 2023
Authors: Pankaj Gupta, Sivakumar Ponnusamy
DOI: 10.5120/ijca2023923287

Pankaj Gupta, Sivakumar Ponnusamy. Beyond Banking: The Trailblazing Impact of Data Lakes on Financial Landscape. International Journal of Computer Applications. 185, 47 (Dec 2023), 24-29. DOI=10.5120/ijca2023923287

@article{ 10.5120/ijca2023923287,
author = { Pankaj Gupta and Sivakumar Ponnusamy },
title = { Beyond Banking: The Trailblazing Impact of Data Lakes on Financial Landscape },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2023 },
volume = { 185 },
number = { 47 },
month = { Dec },
year = { 2023 },
issn = { 0975-8887 },
pages = { 24-29 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume185/number47/33007-2023923287/ },
doi = { 10.5120/ijca2023923287 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:28:51.233956+05:30
%A Pankaj Gupta
%A Sivakumar Ponnusamy
%T Beyond Banking: The Trailblazing Impact of Data Lakes on Financial Landscape
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 47
%P 24-29
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

A data lake is a highly scalable repository that can store both structured and unstructured data. It offers a potentially effective answer to the modern challenge of storing large volumes of data, often referred to as Big Data. It also has limitations, notably inadequate security measures and deficiencies in access control. This paper analyzes several business data lake solutions currently available on the market, including Amazon Web Services (AWS) Data Lake and Azure Data Lake. AWS Data Lake provides a streamlined solution with robust safeguards against data loss, while Azure Data Lake emphasizes superior scalability and high-level security measures tailored for business use. Apache Hadoop is acknowledged as a prevailing standard for data lakes; its parallel processing frameworks process substantial volumes of data efficiently and rapidly. The primary benefits of the data lake environment are the use of affordable hardware, the adoption of open-source technologies, including cost-free software, and the capacity to scale elastically. The paper also explains how a data lake can be used in conjunction with a data warehouse, and it proposes a potential data lake architecture for the banking industry model within a specific multinational banking organization. Data lake solutions are surging in popularity across several sectors, including banking and finance, manufacturing, and healthcare, and they play a prominent role in Industry 4.0.

Index Terms

Computer Science
Information Sciences

Keywords

Data Lake, Deep Lake, Amazon S3, Deep Learning, Azure, Data Governance, Data Security, Banking