Consider Funshion malware. Sometimes classified as "aggressive malware," its base code is over four years old and still bypasses endpoint protections. Funshion makes minor modifications to itself, rendering it invisible to the rules and signatures designed to catch it. Today there are well over a dozen variants in the wild, and each is essentially a new attack that static rules cannot stop.
The good news is that AI has been able to do what rules cannot: understand that subtle variations of malware are still malware. This means AI can detect known attacks as well as attacks it has never seen before. This distinction alone puts it well beyond the capabilities of rules. So how does it work?
In PatternEx language, we have a specialized supervised learning algorithm called the “Contextual Modeler.” This is where human input trains the AI, and where human intuition is captured in code. The Contextual Modeler’s job is to make accurate predictions of malicious behavior, in this case malware, by observing behaviors and understanding their implications.
To do so, the Contextual Modeler forms an abstraction of the attack (more on this below). The abstraction is formed in what data scientists call “high dimensional space”, with variables not necessarily the same as the input data. And this abstraction is the key to training a computer to mimic human intuition when analyzing behavior patterns.
Clearly this is an important concept. So what is this thing called ‘high dimensional space’?
High Dimensional Space
Take the example of a dog. Your eyes see a dog. The image is composed of shapes, colors, and light intensities. The visual cortex does boundary analysis, then color analysis; it says “this is blue and that is green and there is a boundary between them.” Dimensions are measurable things, and a boundary between green and blue is a simple dimension.
But as the visual cortex takes in more data (colors, shapes, light intensities), it identifies a pattern as a “leg.” The pattern of “leg” is composed of higher-level dimensions: it touches the ground, it is connected to a body, it has such and such a shape, et cetera. The brain looks at many different dimensions just to compute “that’s a leg,” which is a higher dimension than “this is blue and that is green and this is the boundary between the two.” Higher and higher dimensional processing eventually results in the brain assigning the label “dog” to the image. You start with pixels (low dimensionality), move to “legs” (higher dimensionality), and eventually reach “dog” (higher still).
This is similar to how deep neural networks work. I say “similar” because we do not know for sure if the abstractions formed by a neural network are identical to the abstractions the brain forms. The practical reality is that deep neural networks produce the same outputs as a human does, given the same inputs.
Plato: Early Data Scientist
Another way to look at this is using Plato’s Theory of Forms. Plato theorized that there was a universal essence of a thing, and that physical objects were simply instances of that essence. This idea explains how we can call both a Labrador and a Chihuahua a dog, despite their obvious dissimilarities. And this also explains how my five-year-old son can call a Great Dane a “dog” despite never having seen one before.
This “universal essence” is another way of saying abstraction. Just as we understand the abstraction of “dog,” AI understands the abstraction of a specific attack. The contextual modeler creates and maintains that abstraction as statistical distributions across 30, 40, or even 50 dimensions of a behavior. Taken together, those statistical distributions are the attack in the abstract; they represent the ‘class’ of an attack.
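As a rough sketch of the idea (a hypothetical illustration, not PatternEx’s actual implementation), an attack abstraction can be modeled as a set of running statistical distributions, one per behavioral dimension, updated as new instances of the behavior are observed:

```python
import math

class AttackAbstraction:
    """Toy model of an attack 'class' as per-dimension running statistics.

    Hypothetical sketch only. Each behavioral dimension (e.g. file size,
    beacon interval) gets a running mean and variance, updated online
    with Welford's algorithm as new behavior vectors arrive.
    """

    def __init__(self, n_dims):
        self.n = 0                     # number of instances observed
        self.mean = [0.0] * n_dims     # running mean per dimension
        self.m2 = [0.0] * n_dims       # running sum of squared deviations

    def update(self, behavior_vector):
        """Fold one observed behavior vector into the distributions."""
        self.n += 1
        for i, x in enumerate(behavior_vector):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def std(self, i):
        """Standard deviation of dimension i (0.0 until two samples exist)."""
        return math.sqrt(self.m2[i] / self.n) if self.n > 1 else 0.0
```

Each observed variant nudges the distributions; over time, the collection of per-dimension statistics stabilizes into the abstract “class” of the attack.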
A given behavior vector is just an instance of that class, just as a Great Dane is an instance of the class “dog.” Programmers will find the language of “class” and “instance” familiar.
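For programmers, the analogy is nearly literal. In a hypothetical Python sketch:

```python
class Dog:
    """The abstraction: what all dogs share."""
    def __init__(self, breed, height_cm):
        self.breed = breed
        self.height_cm = height_cm

great_dane = Dog("Great Dane", 80)  # one instance of the abstraction
chihuahua = Dog("Chihuahua", 20)    # another, very dissimilar instance

# Both satisfy the same abstraction despite their obvious differences.
assert isinstance(great_dane, Dog) and isinstance(chihuahua, Dog)
```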
What Does This Mean for InfoSec?
For starters, it means we can train a computer to recognize attacks never seen before. Given that an attacker has only to change a single comment in the code to evade a rule, this is an interesting development to say the least.
When AI forms an abstraction of the Funshion malware, the contextual modeler looks at the relationships among 50+ behavioral variables of the entity. It constantly updates and maintains those statistical distributions, thus forming the abstraction of Funshion. If a set of new behaviors produces similar values of the abstract variables, it generates an alert, even if those behaviors have never been seen before. In short, the AI can still recognize the malware even if it morphs its own file size, communication interval, or the amount of data sent and received.
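To make this concrete, here is a minimal, hypothetical sketch. The per-dimension distributions (means and standard deviations) are invented for illustration; a new behavior vector “matches” the abstraction if every dimension falls within a few standard deviations of the learned distribution, regardless of whether those exact raw values were ever seen before:

```python
# Hypothetical learned abstraction of Funshion-like behavior:
# per-dimension (mean, std) over file size (KB), beacon interval (s),
# and bytes sent per session. All values are invented for illustration.
abstraction = {
    "file_size_kb":      (412.0, 35.0),
    "beacon_interval_s": (60.0, 8.0),
    "bytes_out":         (2.1e6, 4.0e5),
}

def matches_abstraction(behavior, threshold=3.0):
    """Alert if every dimension lies within `threshold` std deviations.

    A morphed variant shifts its raw values, but if it still lands inside
    the learned distributions, it is flagged -- no signature required.
    """
    for dim, (mu, sigma) in abstraction.items():
        if abs(behavior[dim] - mu) / sigma > threshold:
            return False
    return True

# A never-before-seen variant: different raw numbers, same behavior class.
variant = {"file_size_kb": 440.0, "beacon_interval_s": 55.0, "bytes_out": 2.4e6}
# A benign entity: far outside the learned distributions.
benign = {"file_size_kb": 12.0, "beacon_interval_s": 3600.0, "bytes_out": 1.0e3}
```

The variant’s raw values differ from anything observed before, yet each dimension sits well inside the learned distribution, so it matches; the benign entity does not.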
Our very human-ness makes this concept difficult to grasp. People often ask, “So you can train the model on a certain attack, and that model can recognize attacks that look like it?” Strictly speaking, this is true, but in a very unfamiliar way. We see the input data and think of similarities among the input data. But to the contextual modeler, the similarity lives in the higher-dimensional attack abstraction space that we cannot visualize.