Research Article

Log Classification using K-Means Clustering for Identify Internet User Behaviors

by Muhammad Zulfadhilah, Imam Riadi, Yudi Prayudi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 154 - Number 3
Year of Publication: 2016
10.5120/ijca2016912076

Muhammad Zulfadhilah, Imam Riadi, Yudi Prayudi. Log Classification using K-Means Clustering for Identify Internet User Behaviors. International Journal of Computer Applications. 154, 3 (Nov 2016), 34-39. DOI=10.5120/ijca2016912076

@article{ 10.5120/ijca2016912076,
author = { Muhammad Zulfadhilah and Imam Riadi and Yudi Prayudi },
title = { Log Classification using K-Means Clustering for Identify Internet User Behaviors },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2016 },
volume = { 154 },
number = { 3 },
month = { Nov },
year = { 2016 },
issn = { 0975-8887 },
pages = { 34-39 },
numpages = {6},
url = { https://ijcaonline.org/archives/volume154/number3/26475-2016912076/ },
doi = { 10.5120/ijca2016912076 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:59:17.166974+05:30
%A Muhammad Zulfadhilah
%A Imam Riadi
%A Yudi Prayudi
%T Log Classification using K-Means Clustering for Identify Internet User Behaviors
%J International Journal of Computer Applications
%@ 0975-8887
%V 154
%N 3
%P 34-39
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The Internet has become a necessity in today's society; almost any information is accessible online through a web browser. However, this activity can affect users, including by changing their behavior. This study examines the activities of Internet users based on network log data from an educational institution. The data come from one week of observation at a university in Yogyakarta. Network activity logs are a form of big data, so data mining with the K-Means algorithm is used to determine the behavior of Internet users. K-Means is used to cluster the data by number of visitors into three clusters: low (1479 data items), medium (126 data items), and high (33 data items). The data are also categorized by access time and by website content, so that these categorizations can be compared with the K-Means clustering results. The results for the educational institution show that, in each cluster, the most frequently visited websites are, in order: search websites, social media, news, and information. The study also finds that the resulting cyber-profile is strongly influenced by environmental factors and daily activities.
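
As an illustration of the clustering step described in the abstract, the following is a minimal sketch of K-Means (Lloyd's algorithm) applied to one-dimensional visitor counts with three clusters named low, medium, and high. The visitor counts, the function names `kmeans_1d` and `label_clusters`, and the evenly spaced initialisation are assumptions made for this example; they are not taken from the paper, and the authors' actual data and implementation may differ.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialisation: pick k starting centroids spread evenly
    # over the sorted data, so the result is deterministic.
    centroids = [float(ordered[(2 * i + 1) * n // (2 * k)]) for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                new_centroids.append(sum(members) / len(members))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


def label_clusters(values, names=("low", "medium", "high")):
    """Cluster values and name each cluster by the rank of its centroid."""
    centroids, labels = kmeans_1d(values, k=len(names))
    order = sorted(range(len(centroids)), key=lambda c: centroids[c])
    name_of = {cluster: names[rank] for rank, cluster in enumerate(order)}
    return [name_of[lab] for lab in labels]


# Hypothetical visitor counts per website (illustrative only, not the paper's data).
visitor_counts = [3, 5, 8, 12, 150, 180, 210, 900, 950]
categories = label_clusters(visitor_counts)
for count, category in zip(visitor_counts, categories):
    print(count, category)
```

Naming the clusters by centroid rank, rather than by cluster index, keeps the low/medium/high labels stable regardless of the order in which the centroids happen to be initialised.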

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, K-Means, Network Log, Cyber-profiling