International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 142 - Number 7
Year of Publication: 2016
Authors: Fidan Kaya Gülağız, Onur Gök, Adnan Kavak
DOI: 10.5120/ijca2016909903
Fidan Kaya Gülağız, Onur Gök, Adnan Kavak. A Comparison of Imputation Techniques using Network Traffic Data. International Journal of Computer Applications. 142, 7 (May 2016), 25-29. DOI=10.5120/ijca2016909903
Creating data sets for research in many different fields is an important process. However, these data sets often suffer from missing values. There are many ways of handling missing values; deletion methods and single imputation methods are the most common. However, these methods lead to high errors in data sets with high loss rates. Data sets used for the analysis of network traffic also commonly contain missing values. In this study, data sets of different sizes and with different missing value rates were produced for the analysis of network traffic in distributed systems. Then, different data imputation methods were compared for dealing with missing values in these data sets. Experimental results showed that the Expectation Maximization method is more applicable and performs better at relatively high missing data rates, while the k Nearest Neighbors method performs better at low missing rates.
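
The kind of comparison described above can be illustrated with a minimal sketch. It is not the authors' experimental procedure: the synthetic three-feature records, the completely-at-random masking, the RMSE metric, and all function names (`make_data`, `mask_values`, `mean_impute`, `knn_impute`, `rmse`) are assumptions made here for illustration. It contrasts a single imputation method (column mean) with a simple k Nearest Neighbors imputation; Expectation Maximization is not implemented.

```python
import math
import random


def make_data(n_rows, seed=0):
    """Hypothetical stand-in for traffic records: three correlated numeric features."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        base = rng.uniform(0.0, 100.0)
        rows.append([base, 2.0 * base + rng.gauss(0, 1), 0.5 * base + rng.gauss(0, 1)])
    return rows


def mask_values(rows, rate, seed=1):
    """Replace roughly a fraction `rate` of cells with None, completely at random."""
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def mean_impute(masked):
    """Single imputation: fill each missing cell with the mean of its column."""
    means = []
    for j in range(len(masked[0])):
        observed = [r[j] for r in masked if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in masked]


def knn_impute(masked, k=3):
    """Fill each missing cell with the average of that column over the k nearest rows.

    Distance is the root mean squared difference over features observed in both rows.
    Cells with no usable neighbor fall back to the column mean.
    """
    fallback = mean_impute(masked)
    result = [list(r) for r in masked]
    for i, row in enumerate(masked):
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for other in masked:
                if other is row or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((row[c] - other[c]) ** 2 for c in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = candidates[:k]
            if nearest:
                result[i][j] = sum(val for _, val in nearest) / len(nearest)
            else:
                result[i][j] = fallback[i][j]
    return result


def rmse(truth, masked, imputed):
    """Root mean squared error, computed only over the cells that were masked."""
    errs = [(truth[i][j] - imputed[i][j]) ** 2
            for i, row in enumerate(masked)
            for j, v in enumerate(row) if v is None]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0


if __name__ == "__main__":
    truth = make_data(200)
    for rate in (0.05, 0.2, 0.4):
        masked = mask_values(truth, rate)
        print(rate,
              "mean:", round(rmse(truth, masked, mean_impute(masked)), 2),
              "kNN:", round(rmse(truth, masked, knn_impute(masked)), 2))
```

Running the script prints the error of both methods at several missing rates, which mirrors the shape of the study (varying the missing rate and comparing imputation error) on toy data only. The numbers it produces say nothing about the paper's reported results.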