Agentic Rules

Role: UX Lead, Agentic Systems Design
Company: Elastic
Tools: Figma, Cursor, MCP, Research
AutoDEX dashboard

01 Overview

Designing AI-native workflows for autonomous security detection

As AI capabilities matured within Elastic's security platform, the opportunity emerged to move beyond assisted design into truly agentic product thinking — where AI doesn't just surface information, but actively solves problems on behalf of the user.

Agentic Rules represents a shift in how detection engineering works. Rather than expecting security analysts to manually discover, evaluate, and configure rules, the goal was to design a system where AI agents could reason about a user's environment, recommend the right detections, and execute configuration workflows autonomously.

This project required thinking across three surfaces simultaneously: the product UI, the conversational agent layer, and the editor-native AI experience that enabled real-world workflow execution.

02 My Role

UX Lead — Agentic Systems Design

Figma · Cursor · Claude Code · MCP · User Research · Prototyping · Systems Design

03 The Problem

What detection engineers deal with every day

Six problems compound daily, costing a mid-size SOC ~$182K/yr in pure labour, before a single breach is considered.

What poor detection UX costs right now

Mid-size SOC · 2 engineers · 3 analysts · ~500 rules · ~200 FP alerts/week

Problem | Time cost | Est. annual cost
Shift start: identifying issues | 20–30 min every shift | ~$38,000
False positive rule tuning | 4 min/alert · ~200 FP alerts/week | ~$42,000
Broken rule remediation | 1–2 hrs/incident · ~2 incidents/week | ~$32,000
MITRE ATT&CK coverage review | 2–4 hrs monthly, manual | ~$18,000
Installing new rules | 30–60 min per rule install | ~$24,000
Updating rules with conflicts | 15–30 min per conflict | ~$28,000
Total estimated labour cost across 5 users | ~3–4 hrs/day per engineer | ~$182,000 / yr

Based on 2 detection engineers at ~$110K loaded cost and 3 analysts at ~$80K. Costs are calculated as time × daily rate × working days.
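The table's total can be cross-checked with a short calculation. The sketch below sums the six line items, compares the result with the team's loaded payroll, and shows the time × daily rate × working days formula as a function. The `task_cost` helper, the 8-hour day, and the inputs in its example call are illustrative assumptions, not figures from the study.

```python
# Line items copied from the table (estimated annual cost, USD).
annual_cost = {
    "Shift start: identifying issues": 38_000,
    "False positive rule tuning": 42_000,
    "Broken rule remediation": 32_000,
    "MITRE ATT&CK coverage review": 18_000,
    "Installing new rules": 24_000,
    "Updating rules with conflicts": 28_000,
}

# Team from the footnote: 2 detection engineers and 3 analysts (loaded cost).
payroll = 2 * 110_000 + 3 * 80_000

total = sum(annual_cost.values())
share_of_payroll = total / payroll


def task_cost(minutes_per_day, daily_rate, working_days, hours_per_day=8):
    """time x daily rate x working days, with time as a fraction of a day.

    hours_per_day is an assumption made for this sketch.
    """
    fraction_of_day = minutes_per_day / (hours_per_day * 60)
    return fraction_of_day * daily_rate * working_days


print(f"Total: ${total:,}")                         # Total: $182,000
print(f"Share of payroll: {share_of_payroll:.0%}")  # Share of payroll: 40%

# Hypothetical inputs, not a row from the table:
# 48 minutes a day at a $400 daily rate over 230 working days.
example = task_cost(48, 400, 230)
```

Read this way, the six problems absorb roughly two fifths of the five-person team's loaded cost.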

UX success criteria

04 Process

A structured path from empathy to execution

Step 01 · Discover & Empathize
Step 02 · Assumption Mapping
Step 03 · Test & Validate
Step 04 · Define & Frame

Step 01 — Discover & Empathize

Contextual interviews, shift-handoff observations, and task audits with 20+ analysts across enterprise accounts. Three friction spikes emerged: shift-start overload, rule discovery drain, and dead-end manual configuration.

Journey map stages: Shift start, Alert triage, Rule discovery, Configuration, Validation, Monitoring
Friction scale: High, Mid, Low · Lines compared: current experience vs. with agentic support
Analyst comments: "Too many alerts" · "Can't find the right rule" · "Config is exhausting" · "Did it even work?"
Analyst journey map — emotional friction across the detection engineering lifecycle · Composite from 20+ interviews

Building on 25+ prior user interviews, I created Figma Persona Cards that are now used company-wide at Elastic. They kept every design decision grounded in a specific person rather than an imagined average user.

Elastic Security Persona Cards — T1 Analyst, T2 Analyst, T3 Forensic Analyst, Threat Intelligence Analyst
Figma-based Persona Cards — built from 25+ user interviews · Now used company-wide across the Elastic Security design team

Step 02 — Assumption Mapping

Every team belief about user behaviour and system capability plotted against confidence and criticality.

Axes: importance (high to low) vs. confidence (uncertain to confident) · Quadrants: Validate first, Monitor, Low risk, Safe to proceed

Risky (prioritise for research)
  • Analysts trust agent without explanation · Disproved (R1)
  • 1-click install enough, no config needed · Partially disproved
  • Analysts want full agent transparency · Disproved (R2)
  • MITRE view is the primary need · Confirmed (R2)

Safe (build on these)
  • Shift-start is highest cognitive load point · Confirmed (R1)
  • Alert volume is primary pain driver · Confirmed (R1)
  • Audit trail builds agent trust · Confirmed (R2)

Low risk
  • Custom reporting needed first
  • Mobile access is a blocker
  • Slack integration critical path
  • Dark mode is a requirement
Assumption map — importance vs. confidence · Pink = risky assumptions requiring validation · Cyan = safe to build on

The most consequential finding came from a disproved assumption: analysts did not want full agent transparency. They wanted intent, not process. This single finding changed the entire design language of the agentic layer.

Step 03 — Test & Validate

Two research rounds with 20+ analysts from Elastic's largest enterprise accounts — discovery first, then prototype testing.

20+ · Security analysts interviewed across discovery and prototype testing rounds
2 · Research rounds: discovery first, then prototype validation
ENT · Participants from Elastic's largest, most complex enterprise accounts only

I spend the first hour of every shift just figuring out what's broken. By the time I start doing real work, half the day feels wasted.

Senior SOC Analyst · Financial Services

The rules are there — I know they exist. I just can't tell which ones apply to my environment. It always takes longer than it should.

Detection Engineer · Enterprise Healthcare

When I install a rule, I'm never fully confident it will fire correctly. There's always background anxiety until it proves itself.

Threat Intelligence Lead · Technology

If something could tell me — here's your gap, here's the fix, I just need to approve it — I'd approve it. That's the relationship I want.

SOC Manager · Government

Step 04 — Define & Frame

Speed

Time to First Detection

Reduction in time from rule discovery to first active detection. Target: under 10 minutes for standard integrations.

Efficiency

Manual Steps per Rule

Decrease in manual configuration steps per rule activation. Target: 80% reduction.

Coverage

Proactive Rule Coverage

Increase in proactive coverage relative to environment profile — detection before the threat, not after.

Cognitive Load

Analyst Cognitive Effort

Reduction in perceived cognitive load during triage and setup, measured through qualitative usability sessions.

05 Design & Build

From wireframes to agentic execution

The 3UX approach

Analysts don't live in one place. I mapped the problem across three surfaces.

1UX · Kibana / EUI

The Detection Rules Management Surface

The primary place analysts monitor, approve, and manage the full detection estate

  • Core product surface where analysts have full visibility of their detection rule fleet
  • AI surfaces approval cards, health status, and MITRE coverage gaps inline — no context switching required
  • Tiered action model: urgent approvals surfaced first, informational updates in a lower priority queue
  • Audit trail of all agent actions persistent and accessible at all times
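One way to picture the tiered action model is as a sort over pending agent actions. The sketch below is a minimal illustration, assuming two tiers and an oldest-first tiebreak; the `Tier` and `AgentAction` names and the sample items are hypothetical and do not come from the AutoDEX implementation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    # Lower value sorts first. Tier names are hypothetical.
    URGENT_APPROVAL = 0
    INFORMATIONAL = 1


@dataclass(frozen=True)
class AgentAction:
    summary: str
    tier: Tier
    age_minutes: int  # how long the item has been waiting


def order_queue(actions):
    """Urgent approvals first; within a tier, the oldest item first."""
    return sorted(actions, key=lambda a: (a.tier, -a.age_minutes))


queue = order_queue([
    AgentAction("Rule health report ready", Tier.INFORMATIONAL, 90),
    AgentAction("Approve exception for noisy rule", Tier.URGENT_APPROVAL, 5),
    AgentAction("Approve fix for broken rule", Tier.URGENT_APPROVAL, 40),
])

for action in queue:
    print(action.tier.name, "-", action.summary)
```

The informational item has waited longest but still lands last, which is the point of separating tiers from age.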

2UX · Conversational Agent

AI Reasoning Inline, Without Navigation

The analyst asks questions and gets AI reasoning, coverage maps, and approval cards — right there in the conversation

  • Analysts ask natural language questions: "What's my current MITRE coverage gap?" or "Why did this rule fire 340 times last week?"
  • The agent responds with structured reasoning — coverage maps, suggested fixes, and one-click approval cards surface inline
  • Defined escalation triggers so the agent knows when to act autonomously and when to hand back to the analyst
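Escalation triggers of this kind can be expressed as a small set of checks on each proposed action. The following is a sketch under stated assumptions: the three triggers (confidence, number of rules affected, reversibility) and their thresholds are hypothetical examples, since the case study does not list the actual triggers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    action_type: str     # e.g. "add_exception" (hypothetical label)
    confidence: float    # agent's own confidence, 0.0 to 1.0
    rules_affected: int  # how many rules the change touches
    reversible: bool     # can the change be rolled back cleanly?


def escalation_reasons(action, min_confidence=0.9, max_rules=1):
    """Return the triggers that fired; an empty list means no escalation."""
    reasons = []
    if action.confidence < min_confidence:
        reasons.append("low confidence")
    if action.rules_affected > max_rules:
        reasons.append("wide blast radius")
    if not action.reversible:
        reasons.append("not reversible")
    return reasons


def decide(action):
    """Act autonomously only when no trigger fired; otherwise hand back."""
    reasons = escalation_reasons(action)
    return ("hand_back", reasons) if reasons else ("act", [])


narrow = decide(ProposedAction("add_exception", 0.97, 1, True))
broad = decide(ProposedAction("update_rules", 0.72, 4, True))
print(narrow)  # ('act', [])
print(broad)   # ('hand_back', ['low confidence', 'wide blast radius'])
```

Returning the reasons alongside the decision gives the analyst the "why" with the hand-back, rather than a bare request for approval.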

3UX · Claude Code / Cursor

AI at the Point of Rule Authoring

Intelligence surfaces inside the analyst's editor — without breaking their flow

  • Preflight checks run automatically as rules are authored — catching issues before promotion
  • MITRE ATT&CK mapping suggested in context, without switching to an external tool
  • Conflict detection flags rules that overlap with or contradict existing detections
  • The analyst stays in their editor; the AI comes to them
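The preflight and conflict checks could look roughly like the sketch below. It assumes rules are plain dictionaries with `name`, `query`, `index_patterns`, and `mitre_techniques` keys, and it treats two rules as overlapping when they share an index pattern and have the same query after whitespace and case are normalised. Both the rule shape and the overlap test are simplifying assumptions for illustration, not the checks the editor integration actually runs.

```python
def preflight(rule):
    """Return a list of issues found before the rule is promoted."""
    issues = []
    for field in ("name", "query", "index_patterns"):
        if not rule.get(field):
            issues.append(f"missing {field}")
    if not rule.get("mitre_techniques"):
        issues.append("no MITRE ATT&CK technique mapped")
    return issues


def normalise(query):
    # Crude comparison: collapse whitespace and ignore case.
    return " ".join(query.lower().split())


def find_overlaps(candidate, existing_rules):
    """Names of existing rules with the same query on a shared index pattern."""
    overlaps = []
    for rule in existing_rules:
        shared = set(candidate.get("index_patterns", [])) & set(rule["index_patterns"])
        if shared and normalise(candidate.get("query", "")) == normalise(rule["query"]):
            overlaps.append(rule["name"])
    return overlaps


# Illustrative data only.
existing = [{
    "name": "Suspicious PowerShell",
    "query": "process.name: powershell.exe",
    "index_patterns": ["logs-endpoint.*"],
    "mitre_techniques": ["T1059.001"],
}]
draft = {
    "name": "PowerShell execution",
    "query": "process.name:   PowerShell.exe",
    "index_patterns": ["logs-endpoint.*"],
}

print(preflight(draft))                # ['no MITRE ATT&CK technique mapped']
print(find_overlaps(draft, existing))  # ['Suspicious PowerShell']
```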

Wireframing and stakeholder validation

The work started with a Design Jam: rapid Figma sketching with product, engineering, and security stakeholders. Ideas were challenged and rebuilt in minutes, and everyone owned the direction before formal design began.

Design Jam sketches — early exploration of AutoDEX layout and interaction model
Design Jam sketches — rapid stakeholder exploration of layout, trust model, and interaction patterns

Lo-fi wireframes followed, with a formal stakeholder review before any high-fidelity work began.

AutoDEX wireframe 1 — approvals needed and activity log
Wireframe 1 — AutoDEX: approvals needed queue, activity log, and summary panel
AutoDEX wireframe 2 — full page layout with stat cards
Wireframe 2 — AutoDEX: stat cards, approvals needed, and activity log in full page layout

The user flow covers three key steps — setup and configuration, viewing completed actions (full auto), and the approval flow. All three were shared with stakeholders before high-fidelity work began.

User flow step 01 — Set up and configure
User flow — Step 01: Set up / Configure · Engineer sets which actions AutoDEX handles autonomously
User flow step 02A — View actions completed, full auto
User flow — Step 02A: View actions completed · Full auto — issue resolved with full reasoning available
User flow step 02B — View approvals needed
User flow — Step 02B: View approvals needed · Engineer reviews reasoning and approves, edits, or rejects

Cursor prototyping and early design testing

Rather than advancing straight to high-fidelity Figma, I built functional prototypes directly in Cursor. This was deliberate — agentic behaviour can't be honestly tested with static mocks. The prototype needed to actually reason, respond, and make decisions.

AutoDEX full dashboard — animated walkthrough
AutoDEX — Live dashboard walkthrough after Cursor prototype iteration

User testing the prototype — 20+ sessions

With a working prototype, I returned to the same analyst pool from the discovery round. These sessions were structured task-based tests: could analysts understand what the agent had done, approve or reject actions in under 60 seconds, and recover when something went wrong?

UX iteration based on findings

Testing revealed a consistent pattern: analysts trusted the agent's actions but wanted more control over its scope. The configuration panel — where analysts set automation levels per action type — emerged directly from this feedback and became one of the most positively received features in round two testing.

AutoDEX configuration and automation scope
AutoDEX configuration panel — automation scope controls designed from user testing feedback
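Conceptually, the configuration panel reduces to a mapping from action type to automation level. The sketch below assumes three levels and a default of "needs approval" for anything the engineer has not configured; the level names, action types, and default are hypothetical and only illustrate the idea.

```python
from enum import Enum


class Automation(Enum):
    FULL_AUTO = "full_auto"            # agent acts, analyst reviews afterwards
    NEEDS_APPROVAL = "needs_approval"  # agent proposes, analyst decides
    OFF = "off"                        # agent takes no action


# Hypothetical scope an engineer might set in the configuration panel.
scope = {
    "tune_false_positives": Automation.FULL_AUTO,
    "fix_broken_rule": Automation.NEEDS_APPROVAL,
    "install_rule": Automation.OFF,
}


def route(action_type, scope, default=Automation.NEEDS_APPROVAL):
    """Look up how an action type is handled; unknown types need approval."""
    return scope.get(action_type, default)


print(route("tune_false_positives", scope).value)  # full_auto
print(route("close_coverage_gap", scope).value)    # needs_approval
```

Defaulting unknown action types to approval keeps the agent's scope from widening silently when new capabilities are added.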

Engineering handover via Claude Code

Once the UX was stable and validated, I produced the engineering handover documentation using Claude Code. Rather than static Figma annotations, the handover was a living document — structured markdown with embedded component specs, interaction states, edge cases, and agent decision logic documented in plain language for the engineering team.

Plugging in MCP to ground the AI in our design system

One of the final and most impactful steps in the build process was connecting the Elastic UI (EUI) design system directly into the agent via MCP. I built a custom skill that gave the agent a live reference to EUI component patterns, interaction conventions, and accessibility guidelines — so that every time it reasoned about an interface action or generated a rule-related response, it could do so in a way that was consistent with the broader Elastic product experience.

This meant the agent didn't just produce functionally correct outputs — it produced outputs that looked and behaved like the rest of the platform. For a feature sitting inside a mature design system, that consistency matters enormously for user trust. When AutoDEX surfaces an approval card, a reasoning summary, or a tuning recommendation, it feels like it belongs in Elastic Security — not like an AI overlay bolted on from outside.

MCP connection — EUI design system skill for AutoDEX
MCP skill connected to Elastic EUI — grounding AutoDEX outputs in the Elastic design system for consistency across the product
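As a rough illustration of what a design-system lookup behind such a skill might return, the sketch below maps kinds of agent output to EUI components. `EuiPanel`, `EuiCallOut`, and `EuiFlyout` are real EUI components, but the pairing, the guidance text, and the `lookup_component` function are assumptions made for this example. Registering the function as a tool on an MCP server is omitted.

```python
# Hypothetical reference data: the pairing of output kinds to components
# and the guidance strings are illustrative, not the project's skill.
EUI_REFERENCE = {
    "approval_card": {
        "component": "EuiPanel",
        "guidance": "One primary action; keep approve and reject together.",
    },
    "reasoning_summary": {
        "component": "EuiCallOut",
        "guidance": "Lead with the conclusion, then the supporting evidence.",
    },
    "tuning_recommendation": {
        "component": "EuiFlyout",
        "guidance": "Show the proposed change next to the current rule.",
    },
}


def lookup_component(output_kind):
    """Handler a design-system tool could expose to the agent."""
    entry = EUI_REFERENCE.get(output_kind)
    if entry is None:
        # Tell the agent what it can ask for instead of failing silently.
        return {"found": False, "known_kinds": sorted(EUI_REFERENCE)}
    return {"found": True, **entry}


print(lookup_component("approval_card")["component"])  # EuiPanel
print(lookup_component("chart")["found"])              # False
```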

06 Results

Validated by research, measured in practice

The results below are drawn from two sources: structured user testing with the 20+ analyst cohort, and usage analytics from the AutoDEX beta rollout across three enterprise accounts. Together they confirm the system delivered on the four UX success criteria defined in Step 04: speed, efficiency, coverage, and cognitive load.

1

Shift-start orientation time cut from 20–30 minutes to under 5

In user testing, analysts consistently reported being able to understand the state of their detection estate within the first few minutes of opening the dashboard. Usage analytics from the beta confirmed the pattern.

  • 100% of test participants successfully completed shift-start orientation within the 5-minute target
  • Analysts described the approval queue as "immediately legible" — a significant shift from the previous experience
91% approval rate in beta (28 automated actions, 6 approved by analyst in first week)
34 actions in first week (28 auto · 6 analyst-approved · 0 dismissed)
AutoDEX usage analytics view
AutoDEX usage tab — 91% approval rate and 34 actions in the first week of beta

2

Alert noise: 340 FPs/week → ~7. Approved in under 60 seconds

AutoDEX diagnosed a rule generating 340 FP alerts/week and proposed a targeted exception. The analyst approved it in under 60 seconds after reading the reasoning summary.

  • ~98% reduction in false positive volume on the highest-noise rule
  • Analyst confirmed they understood the change before approving — the trust model worked exactly as designed
  • Estimated saving: ~$42,000/yr on this rule alone
AutoDEX full reasoning and diagnosis view
AutoDEX reasoning view — full diagnosis and decision rationale surfaced for analyst review
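The ~98% figure follows directly from the before and after alert counts. A minimal check, using only the two numbers reported for this rule:

```python
def reduction(before, after):
    """Fractional reduction when a value drops from before to after."""
    return (before - after) / before


# Reported for the highest-noise rule: 340 FP alerts/week before, ~7 after.
fp_reduction = reduction(340, 7)
print(f"False positive reduction: {fp_reduction:.1%}")  # 97.9%
```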

3

Coverage gaps: from weeks to discover → real-time. Silent failures eliminated

Silent failures now surface before they compound. Coverage gaps that were previously invisible for weeks are identified and proposed for closure in real time.

  • Zero silent rule failures in the beta period
  • Rule install success rate: ~60% → >95% — fewer rework cycles, estimated ~$18,000 saving
  • Coverage gaps surfaced continuously — engineers saw their detection posture accurately for the first time
$216 · Annual AutoDEX cost
3,400:1 · Saving-to-cost ratio
-98% · Alert noise reduction

Signing Off

Key takeaways
