By Joy Nathalie Avelino, Jessica Patricia Balaquit, and Carmi Anne Loren Mora
Cybercriminals have become increasingly creative and efficient in their efforts to bypass network security. Reports of unauthorized network intrusions that have compromised enterprise security, resources, and data plague experts on a day-to-day basis, and will continue to do so unless they are prevented by a more efficient detection system or method. Currently, attackers use polymorphism, encryption, and obfuscation, among other techniques, to automate the creation of variants and increase their number in an attempt to evade traditional intrusion detection methods such as rule-based techniques.
To address this growing number of network threats and keep abreast of the changing sophistication of network intrusion methods, Trend Micro looked into network flow clustering, a method that leverages the power of machine learning to strengthen current intrusion detection techniques.
Network anomalies can be discovered by examining flow data because it contains information useful for analyzing the traffic composition of the various applications and services in the network. To efficiently label and process large amounts of this data through clustering, we used a semi-supervised learning approach. The resulting labels are then used to discern relationships between different malware families, as well as how they differ from one another.
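To make the idea of semi-supervised clustering concrete, here is a minimal sketch of one common variant, seeded k-means, in which a handful of labeled flows initialize the cluster centers and the remaining unlabeled flows inherit the label of the nearest center. This is an illustration only and not necessarily the model used in the research: the function names, the two-value feature vectors, and the labels "family_a" and "family_b" are all hypothetical, and the numbers are toy values rather than real flow data.

```python
import math
from collections import defaultdict


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return tuple(sum(column) / count for column in zip(*vectors))


def seeded_kmeans(labeled, unlabeled, iterations=10):
    """Assign a label to every unlabeled flow.

    labeled:   list of (feature_vector, label) pairs (the seeds)
    unlabeled: list of feature_vectors
    Returns a list of labels, one per unlabeled vector, in order.
    """
    # Start each cluster center at the mean of its labeled seeds.
    seeds = defaultdict(list)
    for vector, label in labeled:
        seeds[label].append(vector)
    centroids = {label: mean(vectors) for label, vectors in seeds.items()}

    assignments = []
    for _ in range(iterations):
        # Give each unlabeled flow the label of its nearest center.
        updated = [
            min(centroids, key=lambda label: distance(vector, centroids[label]))
            for vector in unlabeled
        ]
        if updated == assignments:
            break  # assignments stopped changing
        assignments = updated

        # Recompute centers; seeds always keep their original label.
        members = defaultdict(list)
        for vector, label in labeled:
            members[label].append(vector)
        for vector, label in zip(unlabeled, assignments):
            members[label].append(vector)
        centroids = {label: mean(vectors) for label, vectors in members.items()}
    return assignments


# Toy, already-normalized features (hypothetical values, not real flow data).
seed_flows = [((0.9, 0.1), "family_a"), ((0.1, 0.8), "family_b")]
new_flows = [(0.85, 0.2), (0.2, 0.9), (0.8, 0.15)]
print(seeded_kmeans(seed_flows, new_flows))
```

In practice the feature vectors would be derived from flow records and scaled to comparable ranges first, since Euclidean distance is sensitive to the magnitude of each feature.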
Clustering Network Flows from Gh0st RAT Variants
Favorable results were produced from a semi-supervised model that we used to cluster similar types of malicious network flows. Our specimens included Gh0st RAT (remote access Trojan), a family of backdoors commonly used in targeted attacks. In the past, we have released reports on Gh0st RAT in which it was found being dropped by socially engineered spam emails that posed as messages from government agencies and used real-life incidents to lure victims.
Figure 1. Gh0st RAT variants
Gh0st RAT has spawned a number of variants over the years because its source code is publicly accessible. In recent years, its operators reused old malware to carry payloads for backdoor capabilities, cryptocurrency mining, and targeted attacks, among others. For this research, the Gh0st RAT flow data was sourced and replicated from Trend Micro’s Smart Protection Network (SPN). Because machine learning is effective at clustering network flows and providing insights on different network patterns in malicious traffic, we can associate incoming traffic with future malware variants.
Figure 2. Gh0st RAT variants
Our analysis of Gh0st RAT samples supports this finding. Figure 2 shows the streams that were clustered together across multiple versions of Gh0st RAT because of the similarity in their payloads.
Figure 3. Hex dump of Gh0st RAT variant KrisR (top). Hex dump of Monero cryptocurrency mining payload (bottom).
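One simple way to quantify how similar two payloads are is to compare the sets of short byte sequences (n-grams) they contain. The sketch below computes a Jaccard similarity over byte trigrams. It is an assumption made for illustration, not the similarity measure used in the research, and the payload bytes are invented placeholders rather than real Gh0st RAT or Monero miner traffic.

```python
def byte_ngrams(payload: bytes, n: int = 3) -> set:
    """Return the set of all n-byte substrings in a payload."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 3) -> float:
    """Shared n-grams divided by total distinct n-grams (0.0 to 1.0)."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two payloads too short to have n-grams count as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Invented placeholder payloads: two differ by one header byte, one is unrelated.
variant_1 = b"HDRX\x10\x00\x00\x00payload-one"
variant_2 = b"HDRY\x10\x00\x00\x00payload-one"
unrelated = b"completely different bytes"

print(jaccard_similarity(variant_1, variant_2))
print(jaccard_similarity(variant_1, unrelated))
```

Payloads that score close to 1.0 share most of their byte patterns and are candidates for the same cluster, while scores near 0.0 suggest unrelated traffic.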
Clustering Network Flows Leads to Enhanced Network Security
Indeed, clustering malicious network flows can provide insights on different network patterns in malicious traffic. This method can ultimately help cybersecurity techniques detect a vast array of malware that can be used in network intrusion attacks. The use of machine learning in this study has also shown how the technology can organize large amounts of data quickly and offer explanations that help analysts form conclusions and provide time-zero protection.
For more findings, check out our research paper titled “Ahead of the Curve: A Deeper Understanding of Network Threats Through Machine Learning,” which was presented at TENCON 2018 in Jeju, South Korea. An updated version will be available in the IEEE Xplore Digital Library.