Machine Learning-Based Information Security Technology

Threat Intelligence

Youngjoong Kim | Senior Researcher

As cybersecurity threats evolve and the volume of available information grows, machine learning-based information security technology, which can proactively predict and respond to emerging threats, is gaining attention. Machine learning makes it possible to analyze and respond to advanced cybersecurity threats quickly, even in a security environment that lacks resources. As a result, both the importance and the use of machine learning-based information security technology are growing.

– contents –

1. Summary

2. Case Studies

3. MONITORAPP’s Machine Learning

4. Conclusion

1. Summary

New cybersecurity threats are advancing quickly, fed by a vast amount of available information, and machine learning-based information security technology that can proactively predict and respond to them is in the spotlight. Security teams that lack resources can use machine learning to analyze and respond to advanced threats quickly, which is why the technology is becoming more important and more widely used.

Machine learning is a field that researches and develops technology in which a computer analyzes and learns from data, builds a predictive model with an algorithm, and uses that model to predict results for new input data. In other words, the core of machine learning is designing algorithms that let a computer find patterns in data and learn from them.

1-1. Limitations of Current Security Solutions

In the past, malicious code detection relied mainly on signature-based technology, but even slightly modified malicious code could go undetected. Because most new malicious code is only a partial modification of existing code, and because malicious activity keeps increasing, machine learning-based malware detection techniques have been studied actively in recent years.
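The weakness of exact signatures can be illustrated with a minimal sketch. It assumes the simplest possible signature, a whole-file SHA-256 digest; real products also use byte patterns and rules, and the sample bytes and the `KNOWN_BAD` set below are hypothetical.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known malicious samples.
KNOWN_BAD = {hashlib.sha256(b"malicious-payload-v1").hexdigest()}


def signature_match(sample: bytes) -> bool:
    """Return True if the sample's digest is in the signature database."""
    return hashlib.sha256(sample).hexdigest() in KNOWN_BAD


original = b"malicious-payload-v1"
variant = b"malicious-payload-v2"  # same payload with a single byte changed

print(signature_match(original))  # True: the known sample is detected
print(signature_match(variant))   # False: the modified sample goes undetected
```

A single changed byte yields a completely different digest, so the variant is missed even though its behavior would be nearly identical.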

1-2. Machine Learning and Information Security

When machine learning technology is applied to the field of information security, systems have more room to improve their ability to diagnose, defend, and respond autonomously, without human intervention. The prerequisite is securing a large volume of diverse security threat data. Advanced cyber threats appear in real time, and by the time a human intervenes it is already too late to detect and block them. It is therefore important to develop the ability of machines, not humans, to learn and respond to threats preemptively.

1-3. MONITORAPP & Machine Learning

MONITORAPP has been continuously researching and developing machine learning technology since the company was founded. Through AICC (Application Insight Cloud Center) Threat Intelligence, we have been gathering a vast amount of security data and improving our machine learning algorithms.

2. Case Studies

Efforts to integrate machine learning into the field of information security are ongoing, but it is not as easy as it sounds. Even when a person analyzes the data in detail, it is often difficult to determine whether there is a problem. Let’s look at case studies of how machine learning has been applied in the field of information security.

2-1. Threat Intelligence

Threat intelligence uses a variety of search methods and information sources to gather new malware and identify breach data. Analyses of the collected malicious data are accumulated, and additional analysis is performed on them. Systematically understanding, analyzing, and categorizing this large amount of information and turning it into refined threat intelligence requires a lot of time and effort from integrated security control staff and information security experts.

Machine learning helps information security professionals analyze data faster and identify new threats. By identifying new malicious patterns in the collected data through big data analysis, it can detect new types of zero-day attacks. In addition, as prediction success rates improve on the basis of zero-day intelligence, global threat intelligence can also be used to predict regional threat trends.

2-2. Integrated Security Control

It is commonly said that billions of security control events occur daily at security control centers. Unexpected advanced cybersecurity threats have been increasing recently, and resources are being allocated to prevent them. Even so, there are still many limitations to preventing modified cyberattacks.

A machine learning-based intelligent integrated security control system can use big data to collect and analyze, on a centralized platform, the digital information generated by various security systems. Through learning, it can also focus its analysis on high-risk events, which reduces the scope and time of analysis and supports real-time intrusion blocking and detection of unknown threats.

2-3. Network Intrusion Detection

Most network intrusion detection solutions block attacks based on signatures and rule-based scenarios. This method is relatively accurate and reasonably efficient, but it has difficulty detecting attacks that bypass existing patterns or that have been modified. Another widely used approach builds statistical data about normal network users and treats activity outside the normal range as abnormal. This is called statistical anomaly detection. However, it is limited in identifying threats based on behavior and managing them in real time.
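As a rough illustration of statistical anomaly detection, the sketch below builds a profile (mean and standard deviation) from baseline traffic and flags values far outside it. The metric (requests per minute), the sample values, and the threshold of three standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def build_profile(baseline):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(baseline), stdev(baseline)


def is_anomalous(value, profile, threshold=3.0):
    """Flag a value whose z-score against the profile exceeds the threshold."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical requests-per-minute observed for a normal user.
baseline_requests_per_minute = [52, 48, 50, 47, 55, 51, 49, 53]
profile = build_profile(baseline_requests_per_minute)

print(is_anomalous(54, profile))   # False: within the normal range
print(is_anomalous(400, profile))  # True: far outside the normal range
```

The fixed threshold is also where the limitation shows: an attacker who stays just inside the normal range is not flagged, and a legitimate traffic spike is.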

To solve this issue, machine learning can be applied: normal and abnormal network packets are collected, and a machine learning algorithm is run on them to build a model that distinguishes their threat level.
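A minimal sketch of this idea, assuming labeled traffic samples are already available, is a nearest-centroid classifier. The three features, the training values, and the labels are hypothetical, and a production system would use a richer model and scaled features.

```python
from math import dist

# Hypothetical per-flow features:
# (packets per second, mean payload bytes, distinct destination ports)
TRAINING = [
    ((12.0, 540.0, 2.0), "normal"),
    ((9.0, 610.0, 1.0), "normal"),
    ((15.0, 480.0, 3.0), "normal"),
    ((900.0, 60.0, 150.0), "attack"),
    ((1100.0, 40.0, 220.0), "attack"),
    ((850.0, 72.0, 180.0), "attack"),
]


def train(samples):
    """Learn one centroid (feature-wise mean) per label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(TRAINING)
print(predict(model, (11.0, 500.0, 2.0)))     # normal
print(predict(model, (950.0, 55.0, 200.0)))   # attack
```

Unlike a fixed rule, the decision boundary here comes from the collected data, so retraining on new samples shifts it without anyone rewriting rules.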

2-4. Malicious Code Analysis

New malware and its variants appear hundreds of thousands of times a day, and it is very difficult for malicious code analysis experts to respond properly. Although various automation tools and sandboxes are used to help with analysis, evasion techniques have also advanced, making malicious code increasingly difficult to detect.

Currently, machine learning is used to understand the dynamic behavior of malicious code by collecting behavior information in a sandbox environment and analyzing it. Rather than simply categorizing malicious code by whether a specific system call occurred or whether execution completed, the behavior information can be mapped into a vector space and turned into data that can be categorized. With this categorization, the file being analyzed can be judged by how much it deviates from typical malicious code behavior. It is also possible to determine the type of malware by feeding the behavior information of the target file into a malicious code classification model created through machine learning.
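To make the vector-space idea concrete, the sketch below turns a sandbox call trace into a count vector and measures its cosine similarity to a typical malicious profile. The call vocabulary, the traces, and the choice of cosine similarity are assumptions for illustration, not a description of any particular product.

```python
from collections import Counter
from math import sqrt

# Hypothetical vocabulary of API/system calls recorded by a sandbox.
VOCAB = ["CreateFile", "WriteFile", "RegSetValue", "CreateRemoteThread", "connect"]


def to_vector(call_trace):
    """Map a call trace to a vector of per-call counts in VOCAB order."""
    counts = Counter(call_trace)
    return [counts[name] for name in VOCAB]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical traces: a typical malicious profile and two files under analysis.
malicious_profile = to_vector(
    ["CreateFile", "WriteFile", "RegSetValue", "RegSetValue",
     "CreateRemoteThread", "connect", "connect"]
)
suspect = to_vector(
    ["CreateFile", "RegSetValue", "RegSetValue", "CreateRemoteThread", "connect"]
)
benign = to_vector(
    ["CreateFile", "WriteFile", "WriteFile", "WriteFile", "CreateFile"]
)

print(round(cosine_similarity(suspect, malicious_profile), 2))  # close to 1
print(round(cosine_similarity(benign, malicious_profile), 2))   # much lower
```

A score near 1 means the file behaves much like the malicious profile, and a lower score means it deviates from it. A classifier trained on many such vectors could then assign a malware type.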

3. MONITORAPP’s Machine Learning

MONITORAPP continues to collect a vast amount of cybersecurity threat data through AICC Threat Intelligence and is advancing machine learning algorithms based on the data.

3-1. AICC Threat Intelligence

AICC Threat Intelligence, a cloud-based platform researched and developed in-house by MONITORAPP, provides real-time threat intelligence about attack behaviors and attackers. In addition to signature/reputation detection, full-traffic inspection, and profiling, it supports real-time information collection and sharing, third-party integration, data mining, and big data analysis. A large amount of threat information is automatically gathered and analyzed in AICC in real time, processed into various forms, and connected with various security solutions.

One of the core assets of AICC Threat Intelligence is its billions of accumulated data points, which can be used as training data for machine learning. By analyzing this data, the technology identifies threats and malicious intent. It is derived from the interactive intelligence operation that links the technology and data of MONITORAPP’s web security products. Whereas signature-based anti-virus engines could detect only known threats, machine learning has the advantage that it can predict and effectively defend against unknown and new-variant zero-day attacks.

[Pic.] AICC Threat Intelligence Configuration

This machine learning-based integrated threat analysis system is built into AICC Threat Intelligence, and its functions are under continuous research and development.

[Pic.] AICC Threat Intelligence Machine Learning

In addition to AICC Threat Intelligence’s machine learning technology, we developed a Malicious Similarity Analysis (MSA) system based on machine learning algorithms.

3-2. MSA (Malicious Similarity Analysis)

[Pic.] MSA Malicious File Similarity Analysis Interface

MSA performs a similarity analysis on files submitted through the API and provides a grouping function for malicious files. Its main function is to display the common and distinguishing characteristics among malicious files. It also provides static analysis of malicious code, behavior analysis information, detailed profiling information, basic information about malicious code groups, and statistics on the malicious code in each group.
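The source does not describe MSA's algorithm or API, so the following is only a generic sketch of file similarity analysis and grouping, assuming byte 4-gram features, Jaccard similarity, and a greedy grouping rule. The function names, file names, sample contents, and the 0.5 threshold are all hypothetical.

```python
def ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of byte n-grams contained in the data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Shared items divided by total distinct items (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_by_similarity(samples: dict, threshold: float = 0.5) -> list:
    """Greedy grouping: join the first group whose representative is similar enough."""
    groups = []  # list of (representative n-grams, member names)
    for name, data in samples.items():
        grams = ngrams(data)
        for representative, members in groups:
            if jaccard(grams, representative) >= threshold:
                members.append(name)
                break
        else:
            groups.append((grams, [name]))
    return [members for _, members in groups]


# Hypothetical file contents standing in for real binaries.
samples = {
    "dropper_a.bin": b"open socket; download stage2; write registry run key",
    "dropper_b.bin": b"open socket; download stage3; write registry run key",
    "tool_c.bin": b"render window; load font; draw text on screen",
}

for group in group_by_similarity(samples):
    print(group)
```

The n-grams shared inside a group correspond to the common items among the files, and the n-grams unique to one file correspond to its distinguishing characteristics.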

[Pic.] MSA Malicious File Similarity Analysis Concept

4. Conclusion

Efforts to apply machine learning technology to the field of information security have existed for a long time, but they did not gain much traction. In the past, cybersecurity threats were not diverse, and pattern matching was enough to detect and block them. However, new ICT industries are emerging every day, and cyberattacks are becoming more intelligent and diverse. To combat this effectively, we need a range of response systems that apply machine learning technology.

Machine learning is subject to overfitting and variance errors, and it cannot perfectly perform the job of a security expert. To mitigate these flaws, machine learning needs to analyze and learn from the vast amount of data that human experience has accumulated over the years. Only then can it respond to cybersecurity threats quickly and efficiently. Minimizing human interaction, automating the security process, and reducing false positives and false negatives all require an environment in which machine learning can learn from that data. In addition, signature analysis, behavior analysis, and machine learning technology need to be applied together to provide automated detection of and response to advanced attacks.

MONITORAPP will continue to advance AICC Threat Intelligence based on its accumulated machine learning technology. Rather than focusing only on MSA malicious file analysis, we are also researching and developing ways to apply machine learning to other areas, such as threat prediction based on standard web logs and URL category classification.

The next article will discuss in detail how machine learning technology is applied to various MONITORAPP platforms.

References

MONITORAPP Threat Intelligence Platform, AICC: https://www.monitorapp.com/ko/aicc-kr/
AICC PORTAL: https://aicc.monitorapp.com/
Machine Learning and Security: Protecting Systems with Data and Algorithms by Clarence Chio & David Freeman
