Database Transformation to Build Dataset for Generation of Decision Tree and Extended ER Model

Archana A. Chaudhari; Harmeet Kaur Khanuja

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 118 - Number 12

Year of Publication: 2015

10.5120/20800-3482

Archana A. Chaudhari, Harmeet Kaur Khanuja. Database Transformation to Build Dataset for Generation of Decision Tree and Extended ER Model. International Journal of Computer Applications. 118, 12 (May 2015), 41-45. DOI=10.5120/20800-3482

@article{10.5120/20800-3482,
  author = {Archana A. Chaudhari and Harmeet Kaur Khanuja},
  title = {Database Transformation to Build Dataset for Generation of Decision Tree and Extended ER Model},
  journal = {International Journal of Computer Applications},
  issue_date = {May 2015},
  volume = {118},
  number = {12},
  month = {May},
  year = {2015},
  issn = {0975-8887},
  pages = {41-45},
  numpages = {5},
  url = {https://ijcaonline.org/archives/volume118/number12/20800-3482/},
  doi = {10.5120/20800-3482},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:01:32.362339+05:30
%A Archana A. Chaudhari
%A Harmeet Kaur Khanuja
%T Database Transformation to Build Dataset for Generation of Decision Tree and Extended ER Model
%J International Journal of Computer Applications
%@ 0975-8887
%V 118
%N 12
%P 41-45
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

In a data mining project, the most time-consuming task is usually preparing the data set for analysis, because a relational database generally holds a collection of tables and views that must be joined, aggregated, and transformed to build it. As a result, many complex SQL queries are written multiple times, independently of each other and in a disorganized manner, and the database grows with many tables and views that are not represented as entities in the ER model. In addition, existing SQL aggregations are limited for preparing data sets because they return only one column per aggregated group. To address these issues, we propose simple methods to generate SQL code that returns aggregated columns in a horizontal tabular layout, where every row corresponds to an observation and every column is associated with one variable. This new class of functions is called horizontal aggregations. Horizontal aggregations extend standard SQL aggregations to build data sets with a horizontal denormalized layout, which is the input format required by most data mining algorithms. The resulting data sets are provided as input to a decision tree generation algorithm to generate a decision tree, and an extended ER model can be generated in a similar way.
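The contrast between a standard (vertical) aggregation and a horizontal aggregation can be illustrated with a small sketch. It is not the authors' implementation: it assumes the widely used CASE-based pivoting technique, a hypothetical `sales(store, quarter, amount)` table, and a hypothetical helper named `horizontal_aggregation_sql`, and it runs on an in-memory SQLite database through Python's standard `sqlite3` module.

```python
import sqlite3


def horizontal_aggregation_sql(conn, table, group_col, pivot_col, value_col, agg="SUM"):
    """Generate a SELECT with one row per group_col value and one aggregated
    column per distinct pivot_col value (CASE-based pivot).

    Hypothetical helper for illustration. Table and column names are
    interpolated directly into the SQL text, so they must come from trusted
    code, never from user input. Pivot values are assumed to be text.
    """
    values = [row[0] for row in conn.execute(
        f"SELECT DISTINCT {pivot_col} FROM {table} ORDER BY {pivot_col}")]
    columns = []
    for v in values:
        literal = "'" + str(v).replace("'", "''") + "'"      # quoted SQL string literal
        alias = f"{value_col}_{v}".replace('"', '""')        # quoted output column name
        columns.append(
            f'{agg}(CASE WHEN {pivot_col} = {literal} THEN {value_col} END) AS "{alias}"')
    return (f"SELECT {group_col}, " + ", ".join(columns)
            + f" FROM {table} GROUP BY {group_col} ORDER BY {group_col}")


# Assumed sample data: one row per individual sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("A", "Q1", 10.0), ("A", "Q1", 5.0), ("A", "Q2", 7.0),
    ("B", "Q1", 3.0), ("B", "Q2", 4.0), ("B", "Q2", 6.0),
])

# Standard aggregation: one row per (store, quarter) group, one aggregated column.
vertical = conn.execute(
    "SELECT store, quarter, SUM(amount) FROM sales "
    "GROUP BY store, quarter ORDER BY store, quarter").fetchall()

# Horizontal aggregation: one row per store (the observation),
# one column per quarter (the variables).
query = horizontal_aggregation_sql(conn, "sales", "store", "quarter", "amount")
horizontal = conn.execute(query).fetchall()

print(vertical)    # [('A', 'Q1', 15.0), ('A', 'Q2', 7.0), ('B', 'Q1', 3.0), ('B', 'Q2', 10.0)]
print(horizontal)  # [('A', 15.0, 7.0), ('B', 3.0, 10.0)]
```

Each row of `horizontal` is one observation with a fixed set of columns, which is the tabular shape that a decision tree learner or similar data mining algorithm typically expects as input.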

Index Terms

Computer Science

Information Sciences

Keywords

Data mining, Transformation, Aggregation, Data preparation, Pivoting, SQL