Detection Engineering Workflow
A detection engineering workflow that ships — hypothesis to ATT&CK-mapped, data-validated, tested, version-controlled detections, gated by CI and measured.
A detection engineering workflow is the repeatable path from “attackers might do X” to a deployed, tested detection that fires on X and stays quiet otherwise. Without a workflow, teams accumulate a pile of rules nobody trusts — half untested, many on data they don’t collect. With one, every detection is a hypothesis that has been mapped to ATT&CK, validated against real telemetry, version-controlled, and measured. This guide is that end-to-end workflow, treating detections as code.
It is the program-level companion to two adjacent darkpwn guides: writing Sigma rules that actually fire (rule quality) and the Sigma rule lifecycle (a single rule’s journey). This post is the operating model around them.
What is a detection engineering workflow?
A detection engineering workflow is the defined, repeatable process a team uses to produce detections. It exists to solve a specific failure mode: ad-hoc rule writing produces a library where nobody knows which rules work, which sit on uncollected data, and which technique gaps remain. The workflow makes every detection traceable from the threat it addresses to the test that proves it works.
The unit of work is a hypothesis — a specific claim about attacker behavior in your environment — and the workflow’s job is to move it through to production with evidence at each step. Detections from the SQL injection and command injection guides on darkpwn are examples of the output; this is the assembly line that produces them.
What are the stages of the detection engineering workflow?
| Stage | Question it answers | Output |
|---|---|---|
| 1. Hypothesis | What attacker behavior do we want to catch? | A specific, testable statement |
| 2. ATT&CK mapping | Where does it sit, and is it a priority gap? | Technique ID + coverage rationale |
| 3. Data check | Do we actually collect the needed logs? | Confirmed, parsed logsource |
| 4. Develop | What logic expresses the behavior? | A rule (e.g. Sigma) with false positives noted |
| 5. Validate | Does it fire on the attack and stay quiet otherwise? | Test results vs. true-positive + benign data |
| 6. Deploy as code | How does it reach production safely? | Merged via PR + CI, versioned, deployable |
| 7. Tune & measure | Is it healthy, and what’s our coverage? | Tuned rule + coverage/FP metrics |
The discipline is that no stage is skipped. The most common shortcut — writing a rule (stage 4) without a data check (stage 3) or validation (stage 5) — is exactly how dead rules enter a library.
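The "no stage is skipped" rule can be enforced in tooling rather than by convention. The following is a minimal sketch, not a prescribed implementation: the `Stage` enum mirrors the table above, while the `Detection` class and its `complete` method are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    HYPOTHESIS = 1
    ATTACK_MAPPING = 2
    DATA_CHECK = 3
    DEVELOP = 4
    VALIDATE = 5
    DEPLOY_AS_CODE = 6
    TUNE_AND_MEASURE = 7


@dataclass
class Detection:
    hypothesis: str
    # Evidence recorded per completed stage, e.g. a technique ID or a test report.
    evidence: dict[Stage, str] = field(default_factory=dict)

    def complete(self, stage: Stage, evidence: str) -> None:
        """Record a stage as done, refusing to skip any earlier stage."""
        missing = [s for s in Stage if s < stage and s not in self.evidence]
        if missing:
            names = ", ".join(s.name for s in missing)
            raise ValueError(f"cannot complete {stage.name}: missing {names}")
        if not evidence.strip():
            raise ValueError(f"{stage.name} needs evidence, not just a checkmark")
        self.evidence[stage] = evidence


d = Detection("An attacker runs encoded PowerShell to evade logging")
d.complete(Stage.HYPOTHESIS, "backlog item written as a testable statement")
d.complete(Stage.ATTACK_MAPPING, "T1059.001")

try:
    # Jumping to stage 4 without the stage 3 data check is rejected.
    d.complete(Stage.DEVELOP, "rule drafted")
except ValueError as err:
    print(err)
```

Requiring a non-empty evidence string at each stage is a design choice that keeps the traceability the workflow promises: every completed stage points at something a reviewer can inspect.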
How a detection moves through the workflow
Take a hypothesis — “an attacker runs encoded PowerShell to evade logging” — and walk it through. Map it to T1059.001, confirm process-creation logs are collected and parsed, then develop the rule with its false positives named up front:
title: Encoded PowerShell Command Execution
id: 1f7c3a92-darkpwn-illustrative  # illustrative placeholder; a real rule uses a UUID here
status: experimental
logsource:
    category: process_creation
    product: windows
detection:
    selection_img:
        Image|endswith: '\powershell.exe'
    selection_flags:
        CommandLine|contains: [' -enc ', ' -EncodedCommand ']
    condition: selection_img and selection_flags
falsepositives:
    - Some legitimate deployment and management scripts
level: medium
tags:
    - attack.t1059.001

Then validate it: generate the behavior safely (Atomic Red Team is built for this), and confirm that the rule fires on the true positive and stays silent on a normal baseline. Only then does it move to deploy-as-code.
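The validation step can be sketched in a few lines. This is a hand translation of the rule's two selections into Python, not a Sigma engine, and the three sample events are hypothetical stand-ins for recorded telemetry. The third sample is there to show what validation is for: PowerShell also accepts shortened forms of the flag such as `-e`, which the rule as written does not match.

```python
def matches_encoded_powershell(event: dict) -> bool:
    """Hand-translated logic of the rule above (not a Sigma engine).

    Sigma string comparisons are case-insensitive by default, so both
    fields are lowercased before matching.
    """
    image = event.get("Image", "").lower()
    command_line = event.get("CommandLine", "").lower()
    selection_img = image.endswith("\\powershell.exe")
    selection_flags = any(
        flag in command_line for flag in (" -enc ", " -encodedcommand ")
    )
    return selection_img and selection_flags


POWERSHELL = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"

# Hypothetical samples; in practice these come from recorded telemetry.
true_positive = {
    "Image": POWERSHELL,
    "CommandLine": "powershell.exe -NoProfile -enc SQBFAFgA",
}
benign = {
    "Image": POWERSHELL,
    "CommandLine": r"powershell.exe -File C:\Scripts\inventory.ps1",
}
abbreviated = {
    "Image": POWERSHELL,
    "CommandLine": "powershell.exe -e SQBFAFgA",
}

for name, event in [
    ("true_positive", true_positive),
    ("benign", benign),
    ("abbreviated", abbreviated),
]:
    print(name, matches_encoded_powershell(event))
```

A miss on the abbreviated sample is a finding to record against the rule, either as a tuning task or as a documented limitation, before it is counted as coverage.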
How to run detection as code
Stage 6 is what makes the workflow scale. Manage detections like software:
- Git repository of rules (Sigma is the portable source format).
- Pull-request review — every new or changed rule gets a second set of eyes.
- CI pipeline that lints syntax, converts to your SIEM dialect (pySigma/sigma-cli), and runs each rule against recorded true-positive and benign samples.
- Gated merge — a rule cannot ship unless it fires on the attack and is silent on the baseline.
- Versioned deploy + rollback — push to the SIEM through the pipeline, with the ability to revert a noisy rule instantly.
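The gated-merge step is the part of the pipeline most worth automating. Below is a minimal sketch of such a gate, assuming each rule can be exercised through a matcher callable and that samples are plain dictionaries; the `gate` function, the sample events, and the stand-in matcher are all hypothetical and do not reflect any particular SIEM or CI product.

```python
from typing import Callable, Iterable

Event = dict
Matcher = Callable[[Event], bool]


def gate(
    matcher: Matcher,
    true_positives: Iterable[Event],
    benign: Iterable[Event],
) -> list[str]:
    """Return a list of failures; an empty list means the rule may merge."""
    failures = []
    tp_samples = list(true_positives)
    if not tp_samples:
        # A rule with no recorded attack sample has not been proven to fire.
        failures.append("no true-positive samples recorded")
    for i, event in enumerate(tp_samples):
        if not matcher(event):
            failures.append(f"missed true-positive sample #{i}")
    for i, event in enumerate(benign):
        if matcher(event):
            failures.append(f"fired on benign sample #{i}")
    return failures


# Stand-in matcher for illustration only.
def encoded_flag(event: Event) -> bool:
    return " -enc " in event.get("CommandLine", "").lower()


tp = [{"CommandLine": "powershell.exe -enc SQBFAFgA"}]
baseline = [{"CommandLine": "powershell.exe -File inventory.ps1"}]

failures = gate(encoded_flag, tp, baseline)
exit_code = 1 if failures else 0  # a CI job would exit with this code
print(failures, exit_code)
```

Treating an empty true-positive set as a failure is deliberate: it stops a rule from passing the gate simply because nothing was tested.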
How to measure the workflow
- Validated ATT&CK coverage — count only rules proven to fire; an untested rule is not coverage.
- False-positive rate per rule — the health metric that predicts whether a rule survives.
- Mean time to detect — does the new detection actually shorten dwell time?
- % of detections with automated tests — the leading indicator of a maintainable library.
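Three of these metrics fall out of a simple per-rule record. The sketch below assumes a hypothetical `RuleStats` record and invented numbers, and it defines false-positive rate as false positives divided by total alerts, which is one reasonable definition among several. Mean time to detect is left out because it needs incident timestamps rather than rule metadata.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    rule_id: str
    technique: str
    validated: bool  # proven to fire on a true-positive sample
    has_tests: bool  # covered by automated tests in CI
    alerts: int
    false_positives: int


# Illustrative numbers only, not measurements.
rules = [
    RuleStats("encoded-powershell", "T1059.001", True, True, 40, 10),
    RuleStats("untested-draft", "T1003.001", False, False, 0, 0),
]

# Validated coverage: only techniques with at least one proven rule count.
validated_techniques = {r.technique for r in rules if r.validated}

# Per-rule false-positive rate; None when the rule has never alerted.
fp_rates = {
    r.rule_id: (r.false_positives / r.alerts if r.alerts else None)
    for r in rules
}

# Share of detections with automated tests.
tested_share = sum(r.has_tests for r in rules) / len(rules)

print(validated_techniques, fp_rates, tested_share)
```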
Common detection engineering mistakes
- Writing rules without a data check. Valid YAML on uncollected logs never fires.
- Skipping validation. Untested rules inflate coverage and erode trust.
- No version control. Rules drift, break silently, and cannot be rolled back.
- Vanity coverage maps. Counting unvalidated rules is a story you tell yourself.
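The first mistake in this list can be caught mechanically before any logic is written. Here is a minimal sketch, assuming the team keeps an inventory of collected and parsed logsources as (product, category) pairs; the inventory format, the `logsource_is_collected` function, and the example entries are hypothetical.

```python
def logsource_is_collected(rule: dict, inventory: set[tuple[str, str]]) -> bool:
    """Check a rule's logsource against what the pipeline actually ingests."""
    logsource = rule.get("logsource", {})
    key = (logsource.get("product", ""), logsource.get("category", ""))
    return key in inventory


# Hypothetical inventory of logsources that are collected and parsed.
collected = {
    ("windows", "process_creation"),
    ("windows", "network_connection"),
}

process_rule = {
    "title": "Encoded PowerShell Command Execution",
    "logsource": {"category": "process_creation", "product": "windows"},
}
dns_rule = {
    "title": "Example rule on a source that is not ingested",
    "logsource": {"category": "dns_query", "product": "windows"},
}

print(logsource_is_collected(process_rule, collected))
print(logsource_is_collected(dns_rule, collected))
```

Running a check like this early means a rule on uncollected data becomes a logging request in the backlog instead of a dead rule in the library.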
Detection engineering workflow checklist
- Capture detection ideas as specific, testable hypotheses in a backlog.
- Map each to MITRE ATT&CK and prioritize by risk and data feasibility.
- Confirm the logsource is collected and parsed before writing logic.
- Develop the rule (Sigma) with false positives named up front.
- Validate against true-positive (Atomic Red Team) and benign telemetry.
- Store rules in Git; require PR review and a CI test gate.
- Deploy through a versioned pipeline with rollback.
- Track validated coverage, per-rule FP rate, MTTD, and test coverage.
The takeaway
A detection engineering workflow makes detections repeatable and trustworthy: hypothesis to ATT&CK to data to rule to validation to deploy-as-code to measurement, with nothing skipped. It is the operating model that keeps a rule library alive. Continue with MITRE ATT&CK mapping without theater, the Sigma rule lifecycle, writing Sigma rules that actually fire, and building a threat hunting hypothesis library. Then ground it in real Windows telemetry with Sysmon configuration for threat detection, and apply it to detecting LSASS credential dumping, living-off-the-land binaries, and Kubernetes security events to prioritize. Or browse the full Detection Engineering pillar.
Frequently asked questions
What is a detection engineering workflow?
A detection engineering workflow is the repeatable process that turns a threat hypothesis into a deployed, tested detection: define the hypothesis, map it to MITRE ATT&CK, confirm the data source is collected, develop the rule, validate it against true-positive and benign telemetry, deploy via version control and CI, then tune and measure coverage. It treats detections as code.
What is detection as code?
Detection as code manages detection rules like software: stored in Git, reviewed via pull requests, linted and tested in CI, and deployed through a pipeline. It brings versioning, peer review, automated testing, and rollback to detections, which is how you scale a rule library without it rotting.
How do you prioritize what to detect?
Prioritize by threat relevance and coverage gaps: map your detections to MITRE ATT&CK, identify techniques used by threats relevant to your organization that you cannot yet see, and weight by the data you actually collect. Build a hypothesis backlog and work the highest-risk, highest-feasibility items first.
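One way to make "highest-risk, highest-feasibility" concrete is a simple score per backlog item. The sketch below assumes a 1 to 5 scale for both risk and data feasibility and multiplies them. The scale, the scores, and the multiplication are illustrative assumptions, not a standard.

```python
# Hypothetical backlog; scores are invented for illustration.
backlog = [
    {"hypothesis": "Encoded PowerShell execution", "risk": 4, "feasibility": 5},
    {"hypothesis": "LSASS credential dumping", "risk": 5, "feasibility": 2},
    {"hypothesis": "Living-off-the-land binary abuse", "risk": 3, "feasibility": 3},
]


def priority(item: dict) -> int:
    """Risk weighted by whether the needed data is actually collected."""
    return item["risk"] * item["feasibility"]


ranked = sorted(backlog, key=priority, reverse=True)

for item in ranked:
    print(priority(item), item["hypothesis"])
```

A high-risk item with low feasibility, like the second entry here, is a signal to fix data collection first rather than a reason to drop the hypothesis.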
How do you measure detection engineering?
Measure validated ATT&CK technique coverage (count only rules proven to fire), false-positive rate per rule, mean time to detect, and the share of detections with automated tests. Honest coverage of validated detections beats an impressive matrix of untested rules.