Graph Anomaly Detection

R. Karn, V. Sundaram
SMU Dallas,
United States

Keywords: graph, anomaly, ML, network


Cyber attacks have increased in the last few years, threatening vulnerable systems, stealing vital company information, and demanding ransom. Hackers usually demand to be paid in untraceable cryptocurrency. While companies devote much time and money to preventing attacks, many attacks go undetected because experts are looking for predefined patterns. The common practice is to detect intrusion by searching for known patterns and unusual activities in system logs. This pattern-based approach has two drawbacks: it cannot detect new, previously unseen attack patterns, and for known patterns, detection often comes too late to prevent the intrusion. Network anomaly detection, by contrast, identifies abnormal network traffic and can alert administrators to possible new cyber attacks. This research investigates the use of graph methods, specifically Graph Representation Learning (GRL), to identify anomalous network activity. GRL aggregates data from many sources, including network traffic patterns, social media postings, and financial transactions. Preliminary results indicate that the patterns identified by this approach may help detect intrusions not discoverable with conventional methods. We represent the network traffic in the GRL as a temporal graph: the data from the various sources were represented as a graph at each time instance, so that each graph is a snapshot of the traffic. This graph-level feature engineering also helped visualize time series data by representing it as a topology. Features were derived from the graphs using centrality measures (e.g., degree, eigenvector, betweenness, and closeness centrality) and kernel techniques (e.g., the Weisfeiler-Lehman algorithm). These features were used as input to machine learning techniques to detect deviations from normal system interactions.
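The pipeline described above (one graph per time instance, centrality features per graph, then a detector for deviations from normal behavior) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy host pairs, the choice of three features, and the z-score detector standing in for the unspecified machine learning technique are all assumptions made for illustration, and only the Python standard library is used.

```python
from collections import deque
from statistics import mean, pstdev


def build_adjacency(edges):
    """Undirected adjacency sets from (host, host) pairs (hypothetical input format)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {v: (len(nbrs) / (n - 1) if n > 1 else 0.0) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable-node count divided by the sum of BFS distances to those nodes."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def snapshot_features(edges):
    """Graph-level feature vector for one time instance (assumed feature set)."""
    adj = build_adjacency(edges)
    deg = list(degree_centrality(adj).values())
    clo = list(closeness_centrality(adj).values())
    return [max(deg), mean(deg), mean(clo)]


def flag_anomalies(rows, baseline_count, threshold=3.0):
    """Flag rows whose features deviate strongly from the baseline rows.

    The first `baseline_count` rows are assumed to be normal traffic. A simple
    z-score test is used here in place of a trained model.
    """
    baseline = rows[:baseline_count]
    stats = [(mean(col), pstdev(col)) for col in zip(*baseline)]
    flags = []
    for row in rows:
        anomalous = False
        for x, (mu, sigma) in zip(row, stats):
            if sigma > 0:
                z = abs(x - mu) / sigma
            else:
                z = 0.0 if x == mu else float("inf")
            anomalous = anomalous or z > threshold
        flags.append(anomalous)
    return flags


# Toy data: four ordinary snapshots, then one where host "x" contacts every host.
snapshots = [
    [("a", "b"), ("b", "c"), ("c", "d")],
    [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")],
    [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")],
    [("a", "b"), ("b", "c"), ("c", "d")],
    [("x", h) for h in "abcdefgh"] + [("a", "b")],
]
features = [snapshot_features(s) for s in snapshots]
flags = flag_anomalies(features, baseline_count=4)
print(flags)
```

In this toy run only the last snapshot is flagged, because its mean degree centrality falls far below the baseline. A real deployment would more plausibly feed the same feature vectors to a trained unsupervised model, and would add eigenvector and betweenness centrality, which are omitted here to keep the sketch short.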
The results showed that network anomaly detection using the graph method achieved about 80% accuracy, whereas the traditional non-graph method achieved less than 70%. The results of the graph-based anomaly detection method were found to be about two times those of the traditional unsupervised methods. The F-score values obtained over numerous cycles were found to be consistent with those of the traditional methods. Because graph methods for detecting network anomalies are lacking, existing network anomaly detection methods have not been commercialized.
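The Weisfeiler-Lehman technique named among the feature-extraction methods can be sketched as a subtree kernel that compares two graph snapshots. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the uniform initial node labels, the two refinement iterations, and the cosine-style normalization are all choices made here, and only the Python standard library is used.

```python
from collections import Counter
from math import sqrt


def wl_label_counts(adj, iterations=2):
    """Count node labels over several Weisfeiler-Lehman refinement rounds.

    `adj` maps each node to a set of neighbors. Every node starts with the
    same label (an assumption; node attributes could be used instead).
    """
    labels = {v: "0" for v in adj}
    counts = Counter(labels.values())
    for _ in range(iterations):
        # New label = own label plus the sorted multiset of neighbor labels.
        labels = {
            v: labels[v] + "(" + ",".join(sorted(labels[w] for w in adj[v])) + ")"
            for v in adj
        }
        counts.update(labels.values())
    return counts


def wl_kernel(adj_a, adj_b, iterations=2):
    """Dot product of the two label-count vectors."""
    ca = wl_label_counts(adj_a, iterations)
    cb = wl_label_counts(adj_b, iterations)
    return sum(ca[label] * cb[label] for label in ca if label in cb)


def wl_similarity(adj_a, adj_b, iterations=2):
    """Kernel value normalized so that structurally identical graphs score 1.0."""
    k_ab = wl_kernel(adj_a, adj_b, iterations)
    k_aa = wl_kernel(adj_a, adj_a, iterations)
    k_bb = wl_kernel(adj_b, adj_b, iterations)
    return k_ab / sqrt(k_aa * k_bb)


# Toy graphs: two paths with different host names, and one star.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
renamed_path = {"w": {"x"}, "x": {"w", "y"}, "y": {"x", "z"}, "z": {"y"}}
star = {"hub": {"p", "q", "r"}, "p": {"hub"}, "q": {"hub"}, "r": {"hub"}}

same = wl_similarity(path, renamed_path)
different = wl_similarity(path, star)
print(same, different)
```

Because the labels depend only on structure, the two paths score as identical despite their different host names, while the star, which resembles one host contacting many others, scores lower. A drop in similarity between consecutive snapshots could then serve as one input feature for the anomaly detector.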