NGINX One Observability
Reimagining how engineers monitor, compare, and diagnose NGINX instances across fleets.
Role
Lead Product Designer
Team
2 PMs, 10+ Engineers
Timeline
4 Months
Focus
Data Viz, UX Strategy

The 3rd-Party Observability Platform Gap
Customers love NGINX for its performance, but rely on Datadog and Grafana for visibility. This fragmentation creates a strategic gap: we own the traffic, but not the insight.
Stakeholder Insights
"This complexity isn't just a UX problem—it's a value leak. Customers have been asking for this for years. By closing this visibility gap, we reclaim the debugging workflow and turn observability into a sticky feature that prevents churn."
Empowering the Experts
Platform Ops
"I need to see overall system health at a glance to spot issues early without diving into raw metrics."
Application Engineer
"When performance degrades, I need to pinpoint which config change caused it to fix it fast."
SecOps
"I need to validate that security policies are active and blocking threats effectively."
The User Journey
We mapped the critical path for all three personas to a unified "Observe-Diagnose-Resolve" loop.
Observe
Identify abnormal patterns or potential risks through clear visual signals (e.g., spikes, error trends, CVE alerts).
Diagnose
Drill into specific instances or metrics to uncover root causes, with contextual views and linked data.
Resolve
Provide direct links to relevant solutions or knowledge base articles, enabling faster, confident action.
Data Without Diagnosis
The legacy dashboard was packed with data but lacked hierarchy. It could flag that an error occurred, but never why. Engineers saw spikes in 500 errors but had to leave NGINX and lean on 3rd-party plugins to find the root cause. This forced constant context-switching to answer simple questions:
- ✕ "Is my instance healthy or failing?"
- ✕ "Where is the traffic spike coming from?"
- ✕ "Which config change caused this error?"
Information Overload
Scanning rows of raw numbers induces high cognitive load during incidents.
Context Switching
Users must mentally stitch together isolated data points to find root causes.

Legacy Dashboard

Fragmented Metrics
How do we make hidden data visible and actionable?
When a server is on fire, engineers don't have time to analyze 50 charts. Our goal was to give them the answer—not just the data—in under 5 seconds.
We needed to move from "showing everything" to "showing what matters." This meant designing a system that cuts through the noise and instantly points SREs to the root cause of an outage.
User Insight
"When a critical error happens, help me find why it happened, and where to look to investigate."
Instant Situational Awareness
We prioritized anomaly detection over raw metrics. By visually exaggerating spikes and errors, we ensure that critical issues jump out at the user, reducing the cognitive load during high-pressure incidents.
Actionable Insights
Every visualization is a pathway to a solution. We designed the "Data Explorer" to not only display trends but to allow engineers to drill down into specific requests and logs without losing context.
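The "prioritize anomalies over raw metrics" principle can be illustrated with a toy detector. This is a hedged sketch only (a simple z-score rule; the product's actual detection logic is not described here): flag points that deviate sharply from the series mean so the UI can emphasize them.

```python
# Illustrative spike detection: a z-score rule standing in for whatever
# logic the product actually uses.
from statistics import mean, pstdev

def flag_spikes(series, threshold=3.0):
    """Return indices whose value deviates from the mean by more
    than `threshold` standard deviations."""
    mu = mean(series)
    sigma = pstdev(series) or 1.0  # guard against flat series
    return [i for i, v in enumerate(series) if abs(v - mu) / sigma > threshold]

# A steady baseline with one outlier: only the spike is flagged.
print(flag_spikes([10] * 20 + [100]))  # → [20]
```

In a dashboard, those flagged indices would drive the visual exaggeration: highlighted markers, saturated color, or an annotation linking into the drill-down view.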
Dashboard
The first layer of defense is the Instance Dashboard. We replaced standard charts with high-density sparklines that prioritize trend direction over raw values.

Grouping by Intent
Metrics are categorized by intent: Utilization (Health), Status (Security), and Traffic (Throughput).
Sparklines over Charts
Sparklines show trends rather than precise values, saving 60% of vertical space while highlighting anomalies.
Progressive Disclosure
Secondary details are hidden until hover, keeping the initial scan clean and focused on critical signals.
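As a rough illustration of why sparklines preserve trend shape at a fraction of the space, here is a toy text sparkline. It is an illustrative sketch using Unicode block characters, not the console's actual chart rendering:

```python
# Toy sparkline: map a series onto eight Unicode block heights so the
# trend shape reads at a glance, without axes or precise values.
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # flat series collapse to the lowest block
    return "".join(
        BLOCKS[round((v - lo) / span * (len(BLOCKS) - 1))] for v in values
    )

print(sparkline([1, 2, 3, 8, 3, 2, 1]))  # → ▁▂▃█▃▂▁
```

The one-row output makes the trade-off concrete: direction and anomalies survive, exact values do not, which is precisely the prioritization the dashboard makes.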
Data Explorer
When a spike is detected, the Data Explorer takes over. Unlike static reports, this interactive tool allows SREs to drill down from a global anomaly to a specific request in three clicks.
The interaction model is tuned for speed, visually exaggerating outliers so root causes can't hide in the noise.

Iterative Co-creation Workflow
This project began as a cross-functional design exploration that brought together PM insights, engineering prototypes, and design experimentation. Instead of a fixed scope, we adopted an iterative co-creation workflow to define priorities, validate feasibility, and refine visual patterns in real time.
Design
Synthesized cross-team ideas, ran iterative design sessions, and simplified complex data into a clear, actionable experience.
Engineering
Evolved a hackathon prototype into a working system, testing data refresh rates and technical limits in real time.
Product
Synthesized customer pain points to define which metrics mattered most and what "glanceable" really means.
Mapping the Data Landscape
Before defining the 4-layer model, I conducted extensive mapping exercises to understand the relationships within NGINX's vast metrics ecosystem. These sketches helped identify the natural clusters that eventually formed our data strategy.

Fig 3. Data Flow & Layer Mapping

Fig 4. Initial Taxonomy of NGINX Plus Dashboard Data

Fig 5. Instance Metrics Hierarchy Tree
The Observability Blueprint
Before designing screens, I co-created this map with engineering to define the observability ecosystem. We needed to ensure every metric had a clear lineage and every user action had a feasible destination.
This Concept Map (Fig 7) became our shared source of truth, aligning design intent with technical reality.
Fig 7. System Concept Map — Co-created with Engineering
Potential Directions
Navigating conflicting stakeholder priorities was key. Engineering pushed for a comprehensive technical view, while Product Management wanted a fast, safe MVP. My role was to synthesize these into a scalable design solution.
Option 1: Infinite Map
A comprehensive technical vision to visualize every connection. While powerful, it risked overwhelming users and faced severe performance hurdles.
All-in-one solution
Scalability & Performance risks
Option 2: Simple Charts
The "safe" MVP route. Fast to build and performant, but offered little competitive value and failed to solve the core diagnostic problem.
High Performance
Low value add vs. competitors
Option 3: Data Explorer
I aligned the team on a balanced approach: a flexible explorer that leverages the Design System to handle complexity without sacrificing performance.
Fits all user needs & Flexible
Scalable & Performant

Fig 8. Engineering Hackathon: Infinite Map Concept

Fig 9. Engineering Hackathon: Traffic Flow Visualization

Fig 10. PM Concept: Simple Charts & Sankey

Fig 11. PM Concept: Basic Status Check

Fig 12. Option 3: Exploring Existing Visualization Library (Charts)

Fig 13. Option 3: Exploring Existing Visualization Library (Sankey & Metrics)
NGINX Open Source & NGINX Plus Metrics
A clean, easy-to-scan combined view of traffic, health, CVEs, and resource usage. Clear hierarchy, color, and compact charts make key information instantly visible.

Overview
Instance status, utilization trend, and network traffic trend sparklines.
View Switcher
Seamlessly toggle between Overview, Metrics, and Configuration views.
Time Range
Global time controls to correlate metrics across different periods.
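For context on where such metrics originate (an illustration, not the console's actual collection pipeline): NGINX Open Source exposes basic counters through the `stub_status` module, while NGINX Plus offers the richer `/api` endpoint. A minimal `stub_status` sketch, with arbitrary example port and path:

```nginx
# Example only: expose basic NGINX Open Source counters on a
# loopback-restricted endpoint; the port and path are arbitrary choices.
server {
    listen 127.0.0.1:8080;

    location /nginx_status {
        stub_status;       # active connections, accepts, handled, requests
        allow 127.0.0.1;   # keep the endpoint local to the host
        deny all;
    }
}
```

The gap between these raw counters and the combined view above is exactly what the design work addresses: hierarchy, color, and compact charts layered on top of otherwise flat numbers.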
Detailed Metrics Views
Beyond the high-level summary, I designed specialized views for each metric category. These detailed screens allow engineers to drill down into specific data points—Traffic, Utilization, Connections, and Requests—without losing context.
Traffic Analysis
Detailed breakdown of throughput trends, bandwidth usage, and latency metrics across instances.

Bytes In/Out Detail

Latency Analysis
Launch & Future Improvements
The NGINX One Console launched in public preview in 2025. Early feedback has been positive, with users highlighting the "Data Explorer" as a significant improvement for rapid diagnosis, transforming what used to be a multi-tool hunt into a streamlined workflow.
Roadmap Priorities
Intelligent Config Tuning
Analyze traffic patterns to suggest specific configuration changes that improve performance and security.
Customizable Dashboards
Allow users to build their own views based on specific team needs, moving beyond the "one size fits all" default.
Ecosystem Integrations
Seamless workflows with PagerDuty, Slack, and Jira to streamline incident response and team collaboration.
Lessons Learned
Clarity Over Complexity
When visualizing large-scale data, clarity is more valuable than visual novelty. Clear hierarchy, consistent scales, and recognizable patterns help users act faster and trust the system.
Design Systems Are Evolving Tools
A design system isn’t a rulebook—it’s a living framework. Extending it for new visualization needs keeps consistency without stifling innovation.
Data Has a Story to Tell
UX design plays a key role in uncovering meaning from massive, complex data. Helping users see trends and connections turns raw telemetry into insight.