
01 Overview
As AI capabilities matured within Elastic's security platform, the opportunity emerged to move beyond assisted design into truly agentic product thinking — where AI doesn't just surface information, but actively solves problems on behalf of the user.
Agentic Rules represents a shift in how detection engineering works. Rather than expecting security analysts to manually discover, evaluate, and configure rules, the goal was to design a system where AI agents could reason about a user's environment, recommend the right detections, and execute configuration workflows autonomously.
This project required thinking across three surfaces simultaneously: the product UI, the conversational agent layer, and the editor-native AI experience that enabled real-world workflow execution.
02 My Role
03 The Problem
Six problems compound daily, costing a mid-size SOC ~$182K/yr in pure labour, before a single breach is considered.
Mid-size SOC · 2 engineers · 3 analysts · ~500 rules · ~200 FP alerts/week
| Problem | Time required | Est. annual cost |
|---|---|---|
| Shift start — identifying issues | 20–30 min every shift | ~$38,000 |
| False positive rule tuning | 4 min/alert · ~200 FP alerts/week | ~$42,000 |
| Broken rule remediation | 1–2 hrs/incident · ~2 incidents/week | ~$32,000 |
| MITRE ATT&CK coverage review | 2–4 hrs monthly, manual | ~$18,000 |
| Installing new rules | 30–60 min per rule install | ~$24,000 |
| Updating rules with conflicts | 15–30 min per conflict | ~$28,000 |
| Total estimated labour cost across 5 users | ~3–4 hrs/day per engineer | ~$182,000 / yr |
Based on 2 detection engineers at ~$110K loaded cost, 3 analysts at ~$80K. Costs calculated from time × daily rate × working days.
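The "time × daily rate × working days" formula in the note above can be sketched in a few lines. This is an illustration of the method only: the 230 working days, the 8-hour day, and the example task are assumptions that do not come from the case study, so the output is not expected to reproduce the table's figures.

```python
# Illustrative only: WORKING_DAYS and HOURS_PER_DAY are assumptions,
# not figures taken from the case study.
WORKING_DAYS = 230
HOURS_PER_DAY = 8


def daily_rate(loaded_annual_cost: float) -> float:
    """Loaded annual cost spread evenly over the assumed working days."""
    return loaded_annual_cost / WORKING_DAYS


def annual_task_cost(hours_per_day: float, loaded_annual_cost: float, people: int = 1) -> float:
    """time x daily rate x working days, with time as a fraction of a working day."""
    fraction_of_day = hours_per_day / HOURS_PER_DAY
    return fraction_of_day * daily_rate(loaded_annual_cost) * WORKING_DAYS * people


# Hypothetical example: one engineer (~$110K loaded) losing 30 minutes a day.
example = annual_task_cost(hours_per_day=0.5, loaded_annual_cost=110_000)
print(f"${example:,.0f} per year")
```

Swapping in a different day count or a different split of tasks between engineers and analysts changes the result noticeably, which is why the sketch keeps those values as named constants.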
04 Process
Step 01 · Discover & Empathize
Step 02 · Assumption Mapping
Step 03 · Test & Validate
Step 04 · Define & Frame
Contextual interviews, shift-handoff observations, and task audits with 20+ analysts across enterprise accounts. Three friction spikes emerged: shift-start overload, rule discovery drain, and dead-end manual configuration.
Building on 25+ prior user interviews, I created Figma Persona Cards that are now used company-wide at Elastic. They kept every design decision grounded in a specific person rather than an imagined average user.
Every team belief about user behaviour and system capability was plotted against confidence and criticality.
The most consequential assumption to be disproved was that analysts wanted full transparency into the agent's process. They wanted concise intent instead. This single finding changed the entire design language of the agentic layer.
Two research rounds with 20+ analysts from Elastic's largest enterprise accounts — discovery first, then prototype testing.
"I spend the first hour of every shift just figuring out what's broken. By the time I start doing real work, half the day feels wasted."
Senior SOC Analyst · Financial Services
"The rules are there; I know they exist. I just can't tell which ones apply to my environment. It always takes longer than it should."
Detection Engineer · Enterprise Healthcare
"When I install a rule, I'm never fully confident it will fire correctly. There's always background anxiety until it proves itself."
Threat Intelligence Lead · Technology
"If something could tell me: here's your gap, here's the fix, I just need to approve it, then I'd approve it. That's the relationship I want."
SOC Manager · Government
Speed · Time to First Detection
Reduction in time from rule discovery to first active detection. Target: under 10 minutes for standard integrations.
Efficiency · Manual Steps per Rule
Decrease in manual configuration steps per rule activation. Target: 80% reduction.
Coverage · Proactive Rule Coverage
Increase in proactive coverage relative to environment profile: detection before the threat, not after.
Cognitive Load · Analyst Cognitive Effort
Reduction in perceived cognitive load during triage and setup, measured through qualitative usability sessions.
05 Design & Build
Analysts don't live in one place. I mapped the problem across three surfaces.
1. UX · Kibana / EUI
The primary place analysts monitor, approve, and manage the full detection estate
2. UX · Conversational Agent
The analyst asks questions and gets AI reasoning, coverage maps, and approval cards — right there in the conversation
3. UX · Claude Code / Cursor
Intelligence surfaces inside the analyst's editor — without breaking their flow
Started with a Design Jam — rapid Figma sketching with product, engineering, and security stakeholders. Ideas challenged and rebuilt in minutes. Everyone owned the direction before formal design began.
Lo-fi wireframes followed, with a formal stakeholder review before any high-fidelity work began.
The user flow covers three key steps — setup and configuration, viewing completed actions (full auto), and the approval flow. All three were shared with stakeholders before high-fidelity work began.
Rather than advancing straight to high-fidelity Figma, I built functional prototypes directly in Cursor. This was deliberate — agentic behaviour can't be honestly tested with static mocks. The prototype needed to actually reason, respond, and make decisions.
With a working prototype, I returned to the same analyst pool from the discovery round. These sessions were structured task-based tests: could analysts understand what the agent had done, approve or reject actions in under 60 seconds, and recover when something went wrong?
Testing revealed a consistent pattern: analysts trusted the agent's actions but wanted more control over its scope. The configuration panel — where analysts set automation levels per action type — emerged directly from this feedback and became one of the most positively received features in round two testing.
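The per-action automation setting described above can be pictured as a small policy object. This is a minimal sketch, not the shipped implementation: the two levels mirror the "full auto" and approval flows mentioned earlier, while the action type names, the `AutomationPolicy` class, and the routing strings are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class AutomationLevel(Enum):
    FULL_AUTO = "full_auto"                  # agent acts, analyst reviews completed actions
    APPROVAL_REQUIRED = "approval_required"  # agent proposes, analyst approves or rejects


class ActionType(Enum):
    # Hypothetical action types, named after the problems listed in the case study.
    TUNE_FALSE_POSITIVE = "tune_false_positive"
    INSTALL_RULE = "install_rule"
    UPDATE_RULE = "update_rule"
    REMEDIATE_BROKEN_RULE = "remediate_broken_rule"


@dataclass
class AutomationPolicy:
    # Assumption: anything the analyst has not explicitly configured needs approval.
    default: AutomationLevel = AutomationLevel.APPROVAL_REQUIRED
    overrides: Dict[ActionType, AutomationLevel] = field(default_factory=dict)

    def level_for(self, action: ActionType) -> AutomationLevel:
        """Return the analyst's chosen level, falling back to the default."""
        return self.overrides.get(action, self.default)

    def route(self, action: ActionType) -> str:
        """Decide whether the agent executes directly or raises an approval card."""
        if self.level_for(action) is AutomationLevel.FULL_AUTO:
            return "execute_and_log"
        return "queue_for_approval"


policy = AutomationPolicy(
    overrides={ActionType.TUNE_FALSE_POSITIVE: AutomationLevel.FULL_AUTO}
)
print(policy.route(ActionType.TUNE_FALSE_POSITIVE))
print(policy.route(ActionType.INSTALL_RULE))
```

Defaulting to approval reflects the testing feedback that analysts wanted control over the agent's scope: automation is something the analyst grants per action type, not something the agent assumes.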
Once the UX was stable and validated, I produced the engineering handover documentation using Claude Code. Rather than static Figma annotations, the handover was a living document — structured markdown with embedded component specs, interaction states, edge cases, and agent decision logic documented in plain language for the engineering team.
One of the final and most impactful steps in the build process was connecting the Elastic UI (EUI) design system directly into the agent via MCP. I built a custom skill that gave the agent a live reference to EUI component patterns, interaction conventions, and accessibility guidelines — so that every time it reasoned about an interface action or generated a rule-related response, it could do so in a way that was consistent with the broader Elastic product experience.
This meant the agent didn't just produce functionally correct outputs — it produced outputs that looked and behaved like the rest of the platform. For a feature sitting inside a mature design system, that consistency matters enormously for user trust. When AutoDEX surfaces an approval card, a reasoning summary, or a tuning recommendation, it feels like it belongs in Elastic Security — not like an AI overlay bolted on from outside.
06 Results
The results below are drawn from two sources: structured user testing with the 20+ analyst cohort, and usage analytics from the AutoDEX beta rollout across three enterprise accounts. Together they confirm the system delivered on the four UX success criteria defined at the start.
1. In user testing, analysts consistently reported being able to understand the state of their detection estate within the first few minutes of opening the dashboard. Usage analytics from the beta confirmed the pattern.
2. AutoDEX diagnosed a rule generating 340 FP alerts/week and proposed a targeted exception. The analyst approved it in under 60 seconds after reading the reasoning summary.
3. Silent failures surfaced before they compounded. Coverage gaps, previously invisible for weeks, are now identified and proposed for closure in real time.
Signing Off
We walked in thinking analysts wanted full transparency from the agent. They wanted the opposite: concise intent, not verbose justification. Without structured assumption mapping and two rounds of research, we would have shipped something analysts found exhausting rather than helpful.
The configuration panel — where analysts set their own automation levels — was the highest-rated feature. Trust made explicit and controllable beats trust assumed.
Three surfaces, one mental model. It held under pressure. I'd apply this framing from day one on any agentic product.
Working Cursor prototypes revealed problems no Figma review would have caught. For AI products, code prototyping isn't optional.
The engineering team called it the clearest spec they'd received on an agentic feature.