Independent practice
The benchmarks vendors won't run.
An independent lab and capability matrix for security data engineering. Methodology and code in the open.
Trustworthy
Measured, not asserted.
Well-connected
Context resolved cleanly.
Performant
Detection + hunting at scale.
The thesis in detail
What each pillar requires.
Most security data programs trust their vendors, their schemas, and their own past assumptions. The data platform should earn that trust empirically — source by source, claim by claim, query by query — on three properties.
Trustworthy
Data is instrumented, validated, and lineage-traceable. Completeness, freshness, and schema conformance are measured per source. Failures surface before analysts notice them in their queries.
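The three per-source measurements named above can be sketched in a few lines. This is a minimal illustration, not the lab's implementation: the field names (`ts`, `source`, `host`, `event_type`), the required-field set, and the 300-second freshness threshold are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical schema: required fields and thresholds are illustrative.
REQUIRED_FIELDS = {"ts", "source", "host", "event_type"}

def quality_report(events, max_lag_seconds=300):
    """Score one batch from one source on the three measured properties:
    completeness (required fields present), freshness (event lag within
    threshold), and schema conformance (spot-checked field types)."""
    now = datetime.now(timezone.utc).timestamp()
    total = len(events)
    complete = sum(1 for e in events if REQUIRED_FIELDS <= e.keys())
    fresh = sum(1 for e in events if now - e.get("ts", 0) <= max_lag_seconds)
    conforms = sum(1 for e in events if isinstance(e.get("host"), str))
    return {
        "completeness": complete / total,
        "freshness": fresh / total,
        "conformance": conforms / total,
    }
```

Scores below an alerting threshold are what lets a pipeline failure surface before an analyst's query does.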
Well-connected
Entities resolve cleanly across sources. The catalog knows which source is authoritative for which attribute, with confidence and freshness scoring. Joins do what their JOIN clauses claim.
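One way to picture authority-plus-freshness scoring: when several sources report the same attribute, weight each observation by the catalog's authority score for that (attribute, source) pair, decayed by how stale the observation is. The sources (`cmdb`, `edr`, `hr`), weights, and daily decay here are assumptions for illustration only.

```python
# Hypothetical catalog: per-(attribute, source) authority weights.
AUTHORITY = {
    ("hostname", "cmdb"): 0.9,   # CMDB owns hostname
    ("hostname", "edr"): 0.6,
    ("owner", "hr"): 0.95,
}

def resolve(attribute, observations):
    """observations: list of (source, value, age_seconds).
    Score = authority weight decayed by staleness; highest score wins,
    so a fresh second-tier source can beat a stale authoritative one."""
    def score(obs):
        source, _value, age = obs
        weight = AUTHORITY.get((attribute, source), 0.1)
        return weight / (1 + age / 86400)  # decay per day of staleness
    best = max(observations, key=score)
    return best[1]
```

With equal freshness the CMDB's hostname wins; ten days stale, a fresh EDR observation overtakes it.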
Performant
The data platform meets two latency regimes on the same data: sub-second detection and response, and petabyte-scale historical hunting. Vendor performance claims are validated against the actual workload, not the brochure.
Why now
Why the SIEM model is breaking.
01
Attackers are faster than your detection cadence.
Mandiant's 2026 numbers show exploitation landing 7 days before patch release. CrowdStrike clocks attacker breakout at 51 seconds. The AI tooling making this possible is now open-weight.
02
Query performance has flipped.
On a 10M-event Zeek workload, ClickHouse runs 145× faster than the dominant schema-on-read SIEM. Same data, same hardware, same queries; methodology in the lab. The architecture that schema-on-read indexing was sized for is gone.
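The comparison's shape is simple enough to sketch: run the identical query against each engine a fixed number of times, take the median wall-clock time, and report the ratio. This harness is a sketch of that shape, not the published methodology; repeat count and callable interface are assumptions.

```python
import statistics
import time

def bench(run_query, repeats=5):
    """Time an identical workload over several runs and return the
    median wall-clock seconds. `run_query` is any callable that issues
    the same query against the engine under test."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        times.append(time.perf_counter() - start)
    return statistics.median(times)

# Speedup is then median_engine_a / median_engine_b on the same
# data, same hardware, same queries.
```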
03
Storage cost has flipped too.
Object storage plus columnar formats compress 8.2× in our benchmark. Netflix, Huntress, and Insider run multi-petabyte security data lakes at costs SIEM customers can't access. The tradeoff: data freshness.
04
Stream processing closes the freshness gap.
Modern stream engines handle thousands of near-real-time detections in the same time SIEMs handle dozens. The next move — federated query over source-retained data — is closer than vendors will admit.
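The per-event work a stream engine does is small, which is why thousands of detections are tractable. A minimal sliding-window rule, the bread-and-butter streaming detection, looks like this; the 5-events-in-60-seconds threshold and keying by host are illustrative, not a production rule.

```python
from collections import defaultdict, deque

class WindowedThreshold:
    """Fire when a key sees >= threshold events inside a sliding window.
    Hypothetical rule shape: e.g. 5 failed logins per host in 60s."""

    def __init__(self, window_seconds=60, threshold=5):
        self.window = window_seconds
        self.threshold = threshold
        self.seen = defaultdict(deque)  # key -> recent event timestamps

    def observe(self, key, ts):
        q = self.seen[key]
        q.append(ts)
        # Evict timestamps that have aged out of the window.
        while q and ts - q[0] > self.window:
            q.popleft()
        return len(q) >= self.threshold  # True -> detection fires
```

Each event costs an append plus amortized evictions, which is what keeps thousands of concurrent rules within a near-real-time budget.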
The lab — proof on this page
Every benchmark is yours to re-run.
Zeek analytical workload · 10M events · single-node Docker
145× faster
ClickHouse vs. schema-on-read SIEM on identical workload. Methodology and caveats published; reference implementation under NDA.
Reproducible Docker lab · methodology and code shared during engagement scoping · public repository queued for launch
What makes this different
Two products. A method you can audit.
The Lab and the Matrix are the public outputs. Practitioner depth and disclosure-first integrity are what make the outputs defensible.
The Lab
Benchmark methodology and code in the open. Re-run them on your own data.
Next report: catalog options compared, Q3 2026.
The Matrix
A scoring matrix for security data tools. Public methodology, weighted to your workload. Scoring is the paid output.
Refreshed quarterly.
Practitioner depth
25 years across military service, intelligence-community analytics, and security data engineering. OCSF 2.0 co-author. Recent embedded resident-engineer assignment with a Tier-1 financial-services security-data team via Corelight.
The opinions here come from operating the platforms in production, not consulting about them.
Disclosure-first
Active vendor relationships are surfaced before they shape advice. No reseller margins. No kickbacks. No hidden alliance gravity.
The Matrix evaluates against the disclosure, not around it.
Ready to see the numbers on your own data?
A 1–2 week POV runs the benchmarks on your data. $15K, credited toward the full assessment.