Research Article

An Implementation of Data Pre-Processing for Small Dataset

by Sameer Dixit, Navjot Gwal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 103 - Number 6
Year of Publication: 2014
Authors: Sameer Dixit, Navjot Gwal
DOI: 10.5120/18080-8707

Sameer Dixit, Navjot Gwal. An Implementation of Data Pre-Processing for Small Dataset. International Journal of Computer Applications. 103, 6 (October 2014), 28-31. DOI=10.5120/18080-8707

@article{ 10.5120/18080-8707,
author = { Sameer Dixit and Navjot Gwal },
title = { An Implementation of Data Pre-Processing for Small Dataset },
journal = { International Journal of Computer Applications },
issue_date = { October 2014 },
volume = { 103 },
number = { 6 },
month = { October },
year = { 2014 },
issn = { 0975-8887 },
pages = { 28-31 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume103/number6/18080-8707/ },
doi = { 10.5120/18080-8707 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:33:50.478803+05:30
%A Sameer Dixit
%A Navjot Gwal
%T An Implementation of Data Pre-Processing for Small Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 103
%N 6
%P 28-31
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Pre-processing plays an essential role in data mining by enhancing data quality. The underlying idea is that learning from accurate, high-quality data may yield better classification results than learning from poor-quality data. This paper implements, with a slight modification, the pre-processing technique given in [1]. The approach uses a fuzzy technique to improve data quality. The existing technique is implemented in MATLAB, and the improved fuzzy technique is implemented alongside it. The results demonstrate that both techniques are effective in terms of classification accuracy, and they favour the proposed model for enhancing classifier performance in both supervised and unsupervised settings.

References
  1. Der-Chiang Li and Chiao-Wen Liu, "Extending Attribute Information for Small Data Set Classification," IEEE Transactions on Knowledge and Data Engineering, Vol. 24, No. 3, March 2012.
  2. Yifei Ren, "Data Preprocessing for Data Mining," Bachelor's Thesis (UAS), Degree Program in Information Technology, 2013.
  3. Rayner Alfred, "Optimizing Feature Construction Process for Dynamic Aggregation of Relational Features," Journal of Computer Science, 2009, Science Publication.
  4. X. Sun, S. Z. Sun, J. Tian and J. Han, "Sparse Kernel Principal Component Analysis on Seismic Denoising and Fluid Identification," 10 June 2013. DOI: 10.3997/2214-4609.20130642
  5. Index of /Datasets/UCI/arff - Parent Directory – Seasr: http://repository.seasr.org/Datasets/UCI/arff/
Index Terms

Computer Science
Information Sciences

Keywords

Data mining, pre-processing, data quality enhancement, classification, performance improvement