In Part 2, we turned on the lights. We are now collecting millions of logs. But how do we find the one malicious needle in that massive haystack? Today, we teach the machine to hunt.
Welcome back to The Watchtower Chronicles.
In our previous post, we enabled the “Eyes of the Beast.” We turned on Windows logging and started feeding data into our SIEM.
But now we have a new problem: Too much data. A typical corporate network generates millions of logs every single day.
- “Bob logged in.”
- “Alice opened Word.”
- “Server A talked to Server B.”
If a human tried to read all of that, they would go insane. We need a filter. We need logic. We need a Detection Rule.
In this chapter, we are going to write your very first detection rule. We will take a raw log and turn it into a high-fidelity Alert.
Step 1: The Logic (IF -> THEN)
At its core, every SIEM (Splunk, Sentinel, Wazuh) works on the same simple principle: Boolean Logic.
We are simply telling the computer:
“IF you see [Pattern X], THEN trigger [Alert Y].”
It sounds simple, but the devil is in the details. If your logic is too loose, you get False Positives (alerting on innocent users). If your logic is too tight, you get False Negatives (missing the hacker).
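To make that trade-off concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the event fields (`failed_logins`, `malicious`), the sample users, and both thresholds. It is not how any particular SIEM evaluates rules; it only counts how often a loose rule and a tight rule get it wrong on the same labelled data.

```python
# Hypothetical labelled events: failed logins per user in one minute,
# plus whether the activity was really an attack (known only in hindsight).
events = [
    {"user": "bob", "failed_logins": 1, "malicious": False},
    {"user": "alice", "failed_logins": 2, "malicious": False},
    {"user": "admin", "failed_logins": 8, "malicious": True},
    {"user": "svc_backup", "failed_logins": 40, "malicious": True},
]


def evaluate(rule, events):
    """Return (false_positives, false_negatives) for a rule on labelled events."""
    false_positives = sum(1 for e in events if rule(e) and not e["malicious"])
    false_negatives = sum(1 for e in events if not rule(e) and e["malicious"])
    return false_positives, false_negatives


def loose_rule(event):
    # IF there is any failed login THEN alert
    return event["failed_logins"] >= 1


def tight_rule(event):
    # IF there are more than 20 failed logins THEN alert
    return event["failed_logins"] > 20


print("loose:", evaluate(loose_rule, events))  # loose: (2, 0)
print("tight:", evaluate(tight_rule, events))  # tight: (0, 1)
```

The loose rule alerts on Bob and Alice (two False Positives), while the tight rule stays quiet about the attack on the admin account (one False Negative).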
Step 2: The Scenario (The Brute Force Attack)
Let’s build a rule for a classic attack: Brute Force.
- The Attack: A hacker is trying to guess the Admin password. They try “password123”, then “admin1”, then “qwerty”.
- The Log: Every time they fail, Windows generates Event ID 4625 (An account failed to log on).
Attempt 1: The Rookie Rule
IF (EventID == 4625) THEN ALERT
Why this fails: Have you ever mistyped your password? Yes. Everyone has. If you turn on this rule, your SOC will receive an alert every single time a user makes a typo. You will get 5,000 alerts a day. This is Noise.
Step 3: Adding Context (The Threshold)
To find a hacker, we need to look for behavior, not just events. Hackers don’t fail once; they fail fast and often.
Let’s update our logic with a Threshold and a Time Window.
Attempt 2: The Better Rule
IF (EventID == 4625)
AND (Count > 5)
AND (TimeWindow < 1 Minute)
THEN ALERT ("Possible Brute Force Detected!")
Why this works: It is very unlikely that Bob from Accounting will type his password wrong 6 times in 60 seconds. That requires machine speed. This pattern signals an Attack Tool.
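The threshold and time window can be sketched in plain Python as a sliding window over failed logons. This is only an illustration under some assumptions: events arrive sorted by time, the dictionary keys (`event_id`, `source_ip`, `timestamp`) are hypothetical names rather than real Windows field names, and failures are counted per source IP, which the rule above leaves open.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

THRESHOLD = 5                  # alert when the count goes above this
WINDOW = timedelta(minutes=1)  # look-back period


def detect_brute_force(events):
    """Yield (source_ip, timestamp) when an IP exceeds THRESHOLD failures in WINDOW.

    `events` must be sorted by time and use the hypothetical keys
    "event_id", "source_ip" and "timestamp".
    """
    recent = defaultdict(deque)  # source_ip -> timestamps of recent failures
    for event in events:
        if event["event_id"] != 4625:
            continue
        ts = event["timestamp"]
        window = recent[event["source_ip"]]
        window.append(ts)
        # Drop failures that are one minute old or older.
        while ts - window[0] >= WINDOW:
            window.popleft()
        if len(window) > THRESHOLD:
            yield event["source_ip"], ts
            window.clear()  # start counting again instead of re-alerting


# Usage: six failures in 25 seconds from one IP, one typo from another.
start = datetime(2024, 1, 1, 9, 0, 0)
attack = [
    {"event_id": 4625, "source_ip": "10.0.0.7",
     "timestamp": start + timedelta(seconds=5 * i)}
    for i in range(6)
]
typo = [{"event_id": 4625, "source_ip": "10.0.0.20",
         "timestamp": start + timedelta(seconds=10)}]
events = sorted(attack + typo, key=lambda e: e["timestamp"])

alerts = list(detect_brute_force(events))
for ip, ts in alerts:
    print(ip, ts.isoformat())  # 10.0.0.7 2024-01-01T09:00:25
```

The single typo never reaches the threshold, so only the rapid-fire source produces an alert.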
Step 4: The Reality Check (Whitelisting)
You deploy your new rule. The next day, you get an alert! You rush to investigate… and it’s just the Scanner Service.
Your IT team has a vulnerability scanner that checks servers every day. It tries to log in, fails, and triggers your rule. This is a False Positive.
To fix this, we need to Tune (or Whitelist) the rule.
Attempt 3: The Production Rule
IF (EventID == 4625)
AND (Count > 5)
AND (TimeWindow < 1 Minute)
AND (Source_IP != "192.168.1.50")  <-- Ignore the known Scanner IP
THEN ALERT
Now you have a high-fidelity alert. When this red light flashes, you know it’s time to act.
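As a rough sketch of the tuning step, the exclusion can be kept in a set that is checked before the threshold. The function name `should_alert`, the per-IP failure counts, and the two non-scanner IP addresses below are assumptions made for the example; only the scanner IP and the threshold come from the rule above.

```python
THRESHOLD = 5
ALLOWLIST = {"192.168.1.50"}  # the known scanner IP from the rule above


def should_alert(source_ip, failures_last_minute, allowlist=ALLOWLIST):
    """Apply the production rule to one source IP's failure count for the last minute."""
    if source_ip in allowlist:
        return False  # known scanner: suppress the alert
    return failures_last_minute > THRESHOLD


# Hypothetical failed-logon counts per source IP for the last minute.
counts = {"192.168.1.50": 120, "10.0.0.7": 9, "10.0.0.20": 1}
alerts = [ip for ip, n in counts.items() if should_alert(ip, n)]
print(alerts)  # ['10.0.0.7']
```

Keeping the exclusions in one set means a new scanner can be added without touching the rule logic itself.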
Step 5: The Universal Language (Sigma)
You might be thinking, “Do I need to learn the specific code for Splunk, and another for Elastic, and another for Azure?”
Luckily, the industry has solved this. Meet Sigma.
Sigma is the “universal translator” for SIEM rules. It allows you to write a rule once in a standard YAML format, and then automatically convert it to Splunk query language, Elastic Query DSL, or Microsoft KQL.
Example Sigma Rule for Brute Force:
YAML
title: Multiple Failed Logins
status: experimental
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4625
  timeframe: 1m
  condition: selection | count() by SourceIP > 5
If you are starting your career, learning to read Sigma rules is a superpower. It means you can work in any SOC, regardless of what expensive tool they bought.
Coming Up Next…
We have the Data (Part 2). We have the Alert (Part 3). Suddenly, your phone buzzes. The Brute Force rule just triggered. It’s real.
What do you do? Do you panic? Do you pull the plug?
In Part 4: The Red Phone Rings, we will walk through the Triage Process. We will learn the “OODA Loop” of an analyst and exactly what steps to take in the first 15 minutes of an incident.
Stay logical.
Disclaimer
This article is for educational purposes. Detection logic requires careful tuning. Deploying threshold rules in a production environment without testing can result in high alert volume.