International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 182 - Number 24
Year of Publication: 2018
Authors: Poojitha G., Sowmyarani C. N.
DOI: 10.5120/ijca2018917942
Poojitha G., Sowmyarani C. N. Pipeline for Real-time Anomaly Detection in Log Data Streams using Apache Kafka and Apache Spark. International Journal of Computer Applications. 182, 24 (Oct 2018), 8-13. DOI=10.5120/ijca2018917942
Anomaly detection is one of the most critical tasks in building a trustworthy and secure system. Its aim is to detect significant deviations of system behavior from normal behavior. The approach is widely applied to static data, for instance dumps of log data. Most systems, however, require real-time detection in order to limit the harm caused when an anomaly goes unnoticed or is detected too late. Recent implementations of anomaly detection are mostly based on self-learning methods, and machine learning has significantly transformed the field. One family of methods relies on clustering algorithms. The implementation discussed in this paper uses a time-series evaluation approach for anomaly detection. The paper describes the pipeline built for anomaly detection and the visualization of its results.