• Arthur Samuel, a pioneer in artificial intelligence, described machine learning as a set of methods and technologies that “gives computers the ability to learn without being explicitly programmed.” In the particular case of supervised learning for anti-malware, the task can be formulated as follows: given a set of object features X and corresponding object labels Y as input, create a model that will produce the correct labels Y’ for previously unseen test objects X’. X could be features representing file content or behavior (file statistics, list of used API functions, etc.), and the labels Y could be simply “malware” or “benign” (in more complex cases, we could be interested in a fine-grained classification such as Virus, Trojan-Downloader, Adware, etc.). In the case of unsupervised learning, we are more interested in revealing the hidden structure of the data, e.g., finding groups of similar objects or highly correlated features.
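
As a minimal sketch of the supervised formulation above, the snippet below fits a model on labeled feature vectors X, Y and predicts labels Y’ for unseen objects X’. The nearest-centroid classifier, the two feature meanings, and all feature values are illustrative assumptions, not the models or features used in the products.

```python
from collections import defaultdict
from math import dist

def fit_centroids(X, Y):
    """Learn one mean feature vector (centroid) per label."""
    grouped = defaultdict(list)
    for features, label in zip(X, Y):
        grouped[label].append(features)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(centroids, X_new):
    """Assign each unseen object the label of its closest centroid."""
    return [
        min(centroids, key=lambda label: dist(centroids[label], x))
        for x in X_new
    ]

# Hypothetical features: (file entropy, number of suspicious API imports).
X = [(7.8, 12), (7.5, 9), (4.1, 0), (3.9, 1)]
Y = ["malware", "malware", "benign", "benign"]

model = fit_centroids(X, Y)

# Previously unseen objects X' and the predicted labels Y'.
X_test = [(7.6, 10), (4.0, 0)]
Y_pred = predict(model, X_test)
print(Y_pred)
```

A production model would be trained on far more objects and features; the point here is only the shape of the task: learn from (X, Y), then label X’.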

    Kaspersky Lab’s multi-layered, next-generation protection uses machine learning methods extensively at all stages of the detection pipeline, from scalable clustering methods used for preprocessing the incoming file stream in the infrastructure to robust and compact deep neural network models for behavioral detection that work directly on users’ machines. These technologies are designed to address several important requirements for machine learning models in real-world information security applications: an extremely low false positive rate, interpretability of the model, and robustness to a potential adversary.

    Let’s consider some of the most important machine-learning-based technologies used in Kaspersky Lab endpoint products:

    Decision tree ensemble

    In this approach, the predictive model takes the form of a set of decision trees (e.g., a random forest or gradient boosted trees). Every non-leaf node of a tree contains a question about the features of a file, while the leaf nodes contain the tree’s final decision on the object. During the test phase, the model traverses each tree by answering the questions in the nodes with the corresponding features of the object under consideration. At the final stage, the decisions of the individual trees are combined in an algorithm-specific way to provide the final decision on the object.

    This model supports the Pre-Execution Proactive protection stage on the endpoint side.
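
The traversal and combination steps described above can be sketched as follows. This is a simplified illustration, assuming hand-written trees, plain averaging of leaf scores (random-forest style), and hypothetical feature names such as `entropy`, `n_imports`, and `is_packed`; real ensembles are learned from data.

```python
# A tree is either a leaf (a float score: estimated probability of "malware")
# or a tuple (feature_name, threshold, left_subtree, right_subtree).

def tree_predict(tree, features):
    """Walk from the root to a leaf, answering one question per node."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if features[feature] <= threshold else right
    return tree

def ensemble_predict(trees, features, cutoff=0.5):
    """Average the per-tree scores and compare the result with a cutoff."""
    score = sum(tree_predict(t, features) for t in trees) / len(trees)
    return ("malware" if score >= cutoff else "benign"), score

# Hypothetical hand-written trees, for illustration only.
trees = [
    ("entropy", 7.0, 0.1, ("n_imports", 5, 0.6, 0.9)),
    ("n_imports", 3, 0.2, 0.8),
    ("is_packed", 0, 0.3, 0.7),
]

suspicious = {"entropy": 7.6, "n_imports": 10, "is_packed": 1}
ordinary = {"entropy": 5.0, "n_imports": 1, "is_packed": 0}

label_a, score_a = ensemble_predict(trees, suspicious)
label_b, score_b = ensemble_predict(trees, ordinary)
print(label_a, round(score_a, 2), label_b, round(score_b, 2))
```

Gradient boosted trees would sum weighted leaf values instead of averaging them, which is what "algorithm-specific" refers to.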



    Locality sensitive hashing

    In this approach, we extract file features and use orthogonal projection learning to choose the most important ones. ML-model-based compression is then applied so that similar feature vectors are transformed into similar or identical patterns. This allows the method to generalize well, since one pattern can match a whole group of similar files.

    This model supports the Pre-Execution Proactive protection stage on the endpoint side.
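
The text does not specify how the learned projections are built, so the sketch below swaps in the classic random-hyperplane (SimHash-style) form of locality sensitive hashing to show the core property: vectors pointing in similar directions receive similar bit patterns. The feature vector, the number of bits, and the random seed are assumptions made for illustration.

```python
import random

def make_hyperplanes(n_bits, n_features, seed=0):
    """Draw one random Gaussian hyperplane per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for _ in range(n_bits)]

def lsh_hash(vector, hyperplanes):
    """Each bit records on which side of a hyperplane the vector lies."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

planes = make_hyperplanes(n_bits=16, n_features=4)

original = [0.9, 0.1, 0.4, 0.7]       # hypothetical feature vector
rescaled = [2 * v for v in original]  # same direction, different length
opposite = [-v for v in original]     # opposite direction

h_original = lsh_hash(original, planes)
h_rescaled = lsh_hash(rescaled, planes)
h_opposite = lsh_hash(opposite, planes)

print(hamming(h_original, h_rescaled), hamming(h_original, h_opposite))
```

Because the hash depends only on the signs of the projections, the rescaled vector gets an identical pattern while the opposite vector differs in every bit. A learned projection, as described in the text, would choose the hyperplanes from data rather than at random.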



    Behavioral model

    The monitoring component (SW) provides a behavior log: the sequence of system events that occurred during process execution, together with the corresponding arguments. To detect malicious activity in the observed log data, our model compresses the obtained sequence of events into a set of binary vectors and trains a deep neural network to distinguish clean logs from malicious ones.

    This model supports the Post-Execution Proactive protection stage on the endpoint side.
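
The compression step can be illustrated with a small sketch. The actual encoding is not described in the text, so this example assumes one simple option: hashing consecutive event pairs (bigrams) into a single fixed-size binary vector. The event names, arguments, and vector size are hypothetical, and the neural network training step is omitted.

```python
import zlib

def encode_log(events, n_bits=64):
    """Map a sequence of (event, argument) pairs to a binary vector.

    Each consecutive pair of events (a bigram) sets one bit, chosen by a
    stable CRC32 hash, so the vector size does not depend on log length.
    """
    vector = [0] * n_bits
    tokens = [f"{name}:{arg}" for name, arg in events]
    for first, second in zip(tokens, tokens[1:]):
        index = zlib.crc32(f"{first}|{second}".encode("utf-8")) % n_bits
        vector[index] = 1
    return vector

# Hypothetical behavior log; event names and arguments are made up.
log = [
    ("CreateFile", "C:\\Users\\a\\doc.txt"),
    ("ReadFile", "C:\\Users\\a\\doc.txt"),
    ("WriteFile", "C:\\Users\\a\\doc.txt.locked"),
    ("DeleteFile", "C:\\Users\\a\\doc.txt"),
]

vector = encode_log(log)
print(sum(vector), "bits set out of", len(vector))
```

Vectors of this kind, produced for many clean and malicious logs, would then serve as the fixed-size input on which a classifier such as a deep neural network is trained.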



    2. Machine learning plays an equally important role in building a proper in-lab malware processing infrastructure. Kaspersky Lab uses it for the following infrastructure purposes:

    Incoming stream clustering

    Machine-learning-based clustering algorithms allow us to efficiently separate the large number of unknown files that come into our infrastructure into a reasonable number of clusters, some of which can be processed automatically based on the presence of an already annotated object inside them.
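
The idea of processing a cluster automatically when it already contains an annotated object can be sketched as below. The greedy "leader" clustering, the distance radius, the 2-D feature vectors, and the labels are all assumptions chosen to keep the example short; they are not the algorithms used in the infrastructure.

```python
from math import dist

def leader_clustering(points, radius):
    """Greedy one-pass clustering: join the first cluster whose leader is
    within `radius`, otherwise start a new cluster."""
    clusters = []  # each cluster is a list of point indices; [0] is the leader
    for i, point in enumerate(points):
        for cluster in clusters:
            if dist(points[cluster[0]], point) <= radius:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def propagate_labels(clusters, known_labels):
    """Give every object in a cluster the label of its annotated member.

    Clusters with no annotated object, or with conflicting annotations,
    are left as None for further analysis.
    """
    result = {}
    for cluster in clusters:
        labels = {known_labels[i] for i in cluster if i in known_labels}
        verdict = labels.pop() if len(labels) == 1 else None
        for i in cluster:
            result[i] = verdict
    return result

# Hypothetical 2-D feature vectors for six incoming files.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0), (0.2, 0.1)]
known_labels = {0: "malware", 3: "benign"}  # already annotated objects

clusters = leader_clustering(points, radius=1.0)
verdicts = propagate_labels(clusters, known_labels)
print(clusters)
print(verdicts)
```

Here files 1 and 5 inherit the verdict of annotated file 0, file 2 inherits the verdict of file 3, and file 4 stays unresolved because its cluster contains no annotated object.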



    Large-scale classification models

    Some of the most powerful classification models (like a huge random decision forest) require a large amount of resources (processor time, memory) along with expensive feature extractors (e.g., processing via a sandbox could be required for detailed behavior logs). It is therefore more effective to keep and run such models in the lab, and then distill the knowledge they have gained by training a lightweight classification model on the output decisions of the bigger model.
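
A toy version of this distillation step is sketched below. The "teacher" is a stand-in function playing the role of the large in-lab model, and the "student" is a single-threshold rule trained on the teacher's decisions using only a cheap feature. The feature names, weights, and sample values are hypothetical.

```python
def teacher_predict(sample):
    """Stand-in for a large in-lab model that needs expensive features."""
    score = 0.6 * sample["sandbox_score"] + 0.4 * sample["entropy"] / 8.0
    return 1 if score >= 0.5 else 0  # 1 = malware, 0 = benign

def fit_stump(values, targets):
    """Find the threshold on one cheap feature that best matches targets."""
    best_threshold, best_errors = None, len(values) + 1
    for threshold in sorted(set(values)):
        errors = sum((v >= threshold) != bool(t)
                     for v, t in zip(values, targets))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

# Hypothetical unlabeled samples; feature names and values are made up.
samples = [
    {"entropy": 7.9, "sandbox_score": 0.9},
    {"entropy": 7.2, "sandbox_score": 0.7},
    {"entropy": 6.8, "sandbox_score": 0.4},
    {"entropy": 4.5, "sandbox_score": 0.2},
    {"entropy": 3.1, "sandbox_score": 0.1},
]

# 1. Label the data with the large model (done in the lab).
teacher_labels = [teacher_predict(s) for s in samples]

# 2. Train the lightweight student on the teacher's decisions,
#    using only a feature that is cheap to compute on the endpoint.
threshold = fit_stump([s["entropy"] for s in samples], teacher_labels)

def student_predict(sample):
    """Lightweight model: a single comparison on a cheap feature."""
    return 1 if sample["entropy"] >= threshold else 0

student_labels = [student_predict(s) for s in samples]
print(threshold, student_labels == teacher_labels)
```

The student never sees the expensive `sandbox_score` feature at prediction time, yet on this toy data it reproduces the teacher's decisions, which is the trade-off the paragraph describes.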



    To learn more about machine learning, read the whitepaper.

Related Products

US 8250655 B1

Rapid heuristic method and system for recognition of...


US 8955120 B2

Flexible fingerprint for detection of malware


US 9171155 B2

System and method for evaluating malware...


Whitepaper

Machine Learning for Malware Detection


Whitepaper

Machine learning and Human Expertise


Conference: Bayesian methods in deep learning School 2017


Conference: ICML 2017 Workshop


Conference: ICLR 2017


Independent Benchmark Results

  • ICSA Advanced Threat Defense 2017Q3

  • AV-Comparatives Whole Product Dynamic Real-World Protection Test Feb-Jun 2017

  • SELabs Enterprise Endpoint Protection July-September 2017

  • AV-Test BEST PROTECTION 2016 (KES)

  • AV-Test BEST PROTECTION 2016 (KSOS)
