According to analysis of real customer data, PatternEx eliminates more than 90% of false positives and detects verified malicious phishing domains significantly faster than other products.
Nearly every SOC is forced to fight false-positive inefficiencies by hiring more analysts, with the aim of analyzing more data faster. But hiring more analysts brings its own significant challenges. PatternEx’s analysis of real-world customer data shows that a supervised learning model can help make security operations centers (SOCs) far more efficient and effective in detecting cyber attacks.
There are tens of thousands of security products on the market today from thousands of security companies (e.g., Angel.co tracks more than 1,600 information security start-ups alone), making it hard for companies to differentiate themselves in a crowded market. To make matters worse for CISO buyers, every company claims that its products are the “best,” “fastest,” “most comprehensive,” or some other superlative. Additionally, nearly every infosec vendor claims to be using “AI” (artificial intelligence) or machine learning. And so does PatternEx.
The problems that limit SOC effectiveness (such as too many false positives to investigate) are already well known and understood. So, how is a CISO or SecOps (security operations) manager supposed to determine which product(s) might be worth his or her limited time to consider purchasing? Which product will actually improve his or her information security program?
So, let’s start with the first performance metric: decreasing a SOC analyst’s workload for the same volume of information to be evaluated.
The first figure above shows how PatternEx’s Supervised Learning models reduce alerts over time as they are trained: from 4,041 alerts on day 10, to 719 alerts on day 20, and down to 39 alerts by day 60. That is a staggering reduction in alerts to be investigated. The data also shows that outlier (anomaly) detection does not reduce the number of alerts over time or with training. This decrease in the number of alerts is important because many such alerts are false positives (FPs), and therefore a drain on analysts’ time. Reducing the ‘noise’ of FPs is important to the effectiveness of SOCs.
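To put the quoted alert counts in relative terms, here is a minimal sketch of the arithmetic. The function name `percent_reduction` is ours and purely illustrative; only the three alert counts come from the figures quoted above.

```python
def percent_reduction(before: int, after: int) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Alert counts quoted in the text, keyed by day of training.
alerts_by_day = {10: 4041, 20: 719, 60: 39}

baseline = alerts_by_day[10]
for day, count in alerts_by_day.items():
    drop = percent_reduction(baseline, count)
    print(f"day {day}: {count} alerts ({drop:.1f}% below day 10)")
```

Measured against day 10, the day 20 count is roughly 82% lower and the day 60 count roughly 99% lower.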
This second figure shows how PatternEx’s Supervised Learning models reduce alerts with training, as expressed in the number of labels created and used (as opposed to the number of days, shown previously). This increase in the number of labels generated is hugely important because information security (with the exception of malware) is what is termed in AI language a thinly labeled space. Starting from 20 labels, the count grows to 300 labels generated and used by day 60, which produces a huge drop in false positives. And, as discussed previously (in other blogs), these labels can be used with transfer learning to produce a ‘warm start’ (as opposed to a cold start). Again, this figure also shows that outlier (anomaly) detection does not reduce the number of alerts with training.
This third figure shows a stunning drop in false positives as PatternEx’s Supervised Learning models get tuned for this specific real-world customer’s environment, going from 719 on day #20 down to 39 by day #60. (I can hear the SOC analysts going wild from here!) But the obvious follow-up question is, how does PatternEx actually know that those are false positives? Well, the best methodology or tool that we could think of is to match our results against Google Safe Browsing (GSB) historical records and compare the timelines of when each domain was detected (as shown below).
While we are unable to make a one-for-one correlation with false positives (and false negatives) as reported by GSB, the data certainly does suggest that PatternEx’s Supervised Learning models make a significant reduction in the number of FPs.
A second performance indicator shows that a trained PatternEx Supervised Learning model detects threats earlier than any other vendor reported through VirusTotal, in some cases more than 10 weeks before they are reported in VirusTotal.
In this performance test, we compared the PatternEx Supervised Learning models’ predictions of malicious URLs (not files) against malicious URLs determined by VirusTotal, and compared timelines (PatternEx prediction dates versus VirusTotal detection dates). Remember that, while the numbers change regularly, VirusTotal is a compilation of about 65 different website/domain scanning engines and datasets. Again, you can see a significant improvement in detection.
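A timeline comparison of this kind can be sketched as a lead-time calculation: for each URL, subtract the model’s prediction date from the date the URL was first flagged in VirusTotal. The sketch below is only an illustration, not the methodology behind the figures. The URLs and dates are made up, and `lead_days` is a hypothetical helper name.

```python
from datetime import date
from statistics import median

# Hypothetical records for illustration only:
# (url, date the model predicted it malicious, date first flagged in VirusTotal)
detections = [
    ("example-a.test", date(2018, 1, 2), date(2018, 3, 13)),
    ("example-b.test", date(2018, 1, 10), date(2018, 1, 24)),
    ("example-c.test", date(2018, 2, 1), date(2018, 1, 30)),
]


def lead_days(predicted: date, reported: date) -> int:
    """Days by which the prediction preceded the report (negative if it lagged)."""
    return (reported - predicted).days


leads = [lead_days(predicted, reported) for _, predicted, reported in detections]

for (url, _, _), lead in zip(detections, leads):
    print(f"{url}: {lead:+d} days ({lead / 7:.1f} weeks)")
print(f"median lead: {median(leads)} days")
```

A positive lead means the model flagged the URL first; a negative lead means the external source was earlier. Reporting the median alongside the best case keeps a single 10-week outlier from dominating the summary.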
We’re still working out how to measure earlier detection of exploits, and are trying to settle on a good benchmark. (Suggestions for how to measure this are welcome!)
We think that these measurements with real-world data are important to make public to help validate our product and our approach. Next time, I will go ‘behind’ these numbers and lay out our methodologies for how the numbers were derived.
How do your detection tools compare? Maybe your New Year’s resolutions should have included improved detection to make your cybersecurity program more effective? Find out more about PatternEx's Virtual Analyst Platform, and start by requesting a demo.