Many SIEM solutions are available, and so are ML and AI modules, tools, and add-ons for them. Some of these ML/AI tools use pure statistics for outlier detection, apart from the ML and AI algorithms that are the current hot topic.

What is tactical SIEM? If you are spending 80 percent of your time within a SIEM tool doing alert review and analysis, then you are on the right track. If you are an organization that is instead focusing heavily on collecting more data sources, applying patches, or running compliance reports, then your SIEM implementation may not be tactical. [2]

So correlation and alerting are the heart of a SIEM.

Some SIEM solutions have a strong correlation engine, while others are relatively weak.

Some SIEM correlation engines are just filters, and some are no more than an Esper CEP query.

Correlation is the key factor for SIEM success, so the emphasis here is on the correlation engine.

Most of the available SIEM solutions can detect conditions like these:

  • If a ZIP file is attached to an email, they trigger an alert.
  • If five authentication attempts to the same computer fail from the same IP address within ten minutes and use different user names, and if a successful login occurs on any computer within the network and originates from that same IP address, they trigger an alert.
  • If a user fails more than three login attempts on the same computer within 20 minutes, they trigger an alert.
  • During a company-wide layoff, they trigger an alert if more than ten files of specific types are copied to USB drives or sent as email attachments to non-company domains.
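
As a rough illustration of this class of rule, the "more than three failed logins on the same computer within 20 minutes" case can be sketched as a sliding-window counter. This is a minimal sketch, not how any particular SIEM implements it: the event field names (`time`, `user`, `computer`, `outcome`) and the `FailedLoginRule` class are assumptions made for the example, and events are assumed to arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FailedLoginRule:
    """Sliding-window threshold rule (hypothetical, for illustration only)."""

    def __init__(self, window=timedelta(minutes=20), threshold=3):
        self.window = window
        self.threshold = threshold
        # (user, computer) -> timestamps of recent failed logins
        self._failures = defaultdict(deque)

    def on_event(self, event):
        """Return True if this event should raise an alert.

        `event` is a dict with the assumed keys
        'time', 'user', 'computer' and 'outcome'.
        """
        if event["outcome"] != "failure":
            return False
        times = self._failures[(event["user"], event["computer"])]
        times.append(event["time"])
        # Drop failures that have fallen out of the time window.
        while times and event["time"] - times[0] > self.window:
            times.popleft()
        # "More than three" failures inside the window.
        return len(times) > self.threshold
```

A rule like this only needs a short-lived counter per user and computer, which is why even a simple filter-style engine can usually express it.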

But only a few of them detect suspicious conditions like these:

  • A user accesses a device from a new IP address or computer name for the first time.
  • A user connects to the network over VPN from a new location for the first time, then accesses a shared file system.
  • A user changes their password 10 times within a week.
  • A user logs in to a computer after work hours for the first time.
  • A user who has never generated a failed login event during work hours has a failed login event during lunch, and repeats the same behavior on the following 2 consecutive days.
  • A process start and the files accessed by that process within 15 minutes on the same machine form a process-file access pattern. If this pattern is seen on more than 2 machines within 20 minutes, take action.
  • An account has not logged in for over 60 days.

So there is a huge difference between the detection capabilities of SIEM solutions, which means there is a huge difference between their correlation engines.
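
One way to see why the second group of conditions is harder: the rule has to remember what it has seen before, per user, over long periods. The sketch below illustrates two of them, "first access from a new address" and "account idle for over 60 days". The class and function names are hypothetical, the state is kept in memory purely for illustration, and treating a user's very first login as a baseline rather than an alert is a design choice made for this example.

```python
from datetime import datetime, timedelta


class FirstSeenRule:
    """Alert when a known user logs in from a source not seen before."""

    def __init__(self):
        # user -> set of source addresses already observed for that user
        self._seen = {}

    def on_login(self, user, source):
        known = self._seen.setdefault(user, set())
        # The very first login only establishes the baseline (assumption).
        is_new = bool(known) and source not in known
        known.add(source)
        return is_new


def dormant_accounts(last_login, now, max_idle=timedelta(days=60)):
    """Return users whose most recent login is older than `max_idle`.

    `last_login` maps user name -> datetime of the last successful login.
    """
    return sorted(user for user, ts in last_login.items() if now - ts > max_idle)
```

A real engine would need this state to survive restarts and to scale to every user and device, which is part of what separates a strong correlation engine from a plain filter.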

When it comes to the UEBA, ML, and AI market, do not fall for UBA marketing just yet. The technology is somewhat immature, and the marketing might lead to the false understanding that the “box” you are buying does everything automagically.

What about the use of Artificial Intelligence/Machine Learning? Is it mostly marketing buzzwords and hype, or is it really something organizations should start considering? [2]

Having said all that, there are fundamental limitations that prevent AI and ML from overcoming the challenges faced by the security industry on its own, and this is why we don’t yet see many practical applications of these techniques in the SOC other than some algorithms used in certain products that are meant to complement the analyst’s job. [2]

In the forecasting study cited here, statistical methods were more accurate than ML methods. The authors note that although the finding that the forecasting accuracy of ML models is lower than that of statistical methods may seem disappointing, they are extremely positive about the great potential of ML for forecasting applications. [4]

Some security experts describe these tools as “tools written for paper / presentation”. [5]

The main failure of the new ML/AI-powered threat detection and mitigation technologies is that each is optimized for a particular class of threats: insider threats, host-based malicious software, web application attacks, and so on.

Some open source ML SIEM tools are available under the Apache license:

  • Apache Metron
  • Apache Spot
  • Apache Ranger

Many open source projects considered Apache Metron or Apache Spot but decided not to go with them.

Nearly all of the scenarios from the UEBA and ML side are also available on the SIEM side with a strong correlation engine, for example:

  • Detect simultaneous logins from two different countries.
  • Detect simultaneous logins from two improbable geo-locations.
  • Detect logons to servers, or at times, that are not typical for the user.
  • Detect traffic to dynamically generated domains.
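
The "improbable geo-locations" scenario, for instance, does not need a learned model. A correlation rule can compare two consecutive logins of the same user and check whether the implied travel speed is plausible. The following is a minimal sketch under stated assumptions: the login tuples already carry latitude and longitude (the GeoIP lookup is assumed to happen upstream), the function names are hypothetical, and the 900 km/h limit is an arbitrary threshold chosen for the example.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 900.0  # assumed threshold, roughly airliner cruising speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_improbable_travel(prev, curr, max_speed_kmh=MAX_SPEED_KMH):
    """Compare two logins of the same user.

    `prev` and `curr` are (time, latitude, longitude) tuples.
    """
    distance = haversine_km(prev[1], prev[2], curr[1], curr[2])
    hours = abs((curr[0] - prev[0]).total_seconds()) / 3600.0
    if hours == 0:
        # Truly simultaneous logins: any distance at all is suspicious.
        return distance > 0
    return distance / hours > max_speed_kmh
```

In practice GeoIP data is imprecise and VPN exit nodes move users around the map, so a rule like this would likely need allow-lists or a tolerance on top of the raw speed check.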

References

1. https://blogs.gartner.com/anton-chuvakin/2018/10/15/network-anomaly-detection-track-record-in-real-life/
2. https://cyber-defense.sans.org/blog/2018/10/24/your-siem-questions-answered
3. https://www.slideshare.net/RyanGMurphy/beyond-the-hype-security-experts-weigh-in-on-artificial-intelligence-machine-learning-and-nonmalware-attacks
4. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0194889
5. http://www.hexacorn.com/blog/2018/06/16/the-botryology-of-anomalies-the-ai-machine-learning-and-ze-computer-security/