Claude Code UX Researcher: Automated Competitor Benchmarking with AI Agents
TL;DR: Stop wasting 10 hours a week on manual "Competitive Audits." By combining Claude Code (terminal agent) with Playwright, you can build a headless UX researcher that scrapes competitors, analyzes their UI with AI Vision, and generates a structured benchmarking matrix—using Perp DEX volume leaders as a live case study.
Who This Is For
- UX Researchers: Tired of manual screenshots and spreadsheets, wanting to focus on strategic analysis.
- Product Managers: Needing rapid competitor feature tracking to inform PRD decisions.
- dApp Competitive Analysts: Specialized in studying interaction patterns across the Web3 ecosystem.
User Journey
Actor: Max, Senior Designer at a DEX protocol.
- Trigger: Runs the `claude-watchtower` script on Monday morning.
- Auto-Pilot: Agent identifies top protocols via API and captures fresh screenshots.
- AI Analysis: Claude Vision compares the "Trade Terminal" layouts against his own product.
- Outcome: Max reviews a structured Markdown report in 10 mins, saving 8 hours of manual labor.
The Problem: Audit Fatigue
In the fast-moving DeFi space, competitors launch new features every week. For Product Designers and PMs, keeping up means:
- Manually visiting 10+ dApps.
- Taking dozens of screenshots.
- Filling out a spreadsheet with "Yes/No" for feature parity.
- Documenting UX copy changes.
By the time you finish, the audit is already outdated. This is a classic "High Repetition, High Value" task—the perfect candidate for an AI Agent Workflow.
Level 1: The AI-Powered Handoff
Instead of doing this yourself, we delegate the "Legwork" to Claude.
Step 1: Automated Discovery (DeFiLlama Volume)
We don't want to maintain a list of competitors manually. Use the DeFiLlama API to find the Top 10 Perp DEXs by 24h Volume. High volume usually correlates with highly optimized trading interfaces.
Prompt to Claude Code:
"Write a script that uses the DeFiLlama API to fetch the top 10 Perp DEX protocols by 24h Trading Volume. Save their trading interface URLs to a JSON file."
Step 2: The "Eyes" (Playwright)
Claude Code can generate and execute a Playwright script to visit these URLs in headless mode.
The Key Action: Capture a screenshot of the "Trading Terminal" and "Asset Selector."
```javascript
// Example Playwright snippet generated by Claude
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch(); // headless by default
  const page = await browser.newPage();
  // Wait for network idle so charts and price feeds finish rendering.
  await page.goto('https://hyperliquid.xyz', { waitUntil: 'networkidle' });
  await page.screenshot({ path: 'assets/hyperliquid-trade.png', fullPage: true });
  await browser.close();
})();
```
Level 2: AI Vision Analysis
Capturing screenshots is only half the battle. Now, we use Anthropic's Vision capabilities to "Read" the UX.
We feed the screenshots back into Claude with a structured prompt:
"Analyze this screenshot of 'Hyperliquid Trade Interface.' Extract:
- Placement of the 'Trade' panel (Left/Right).
- Primary CTA color (Hex if possible).
- List all visible assets (ETH, USDC, etc.).
- Evaluate the 'Hierarchy' score from 1-5 based on visual clarity."
The Output: The Benchmarking Matrix
The final result isn't a folder of images; it's a Clean Markdown Matrix that you can drop directly into your PRD or Notion workspace.
| Protocol | Key Asset | Primary CTA | Visual Style | UX Complexity | 24h Volume |
|---|---|---|---|---|---|
| Hyperliquid | USDC | Deposit | Trading Terminal | High | $1.2B+ |
| dYdX | USDC | Trade Now | Institutional | High | $800M+ |
| GMX | GLP/GM | Long/Short | DeFi Native | Moderate | $300M+ |
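A small helper can assemble a matrix like the one above from structured analysis results. The field names mirror the table columns and are assumptions about what the Vision step returns.

```javascript
// Render an array of analysis results as a Markdown table.
// Row fields (protocol, keyAsset, cta, style, complexity, volume) are
// assumed to come from the Vision analysis step.
function toMatrix(rows) {
  const header = '| Protocol | Key Asset | Primary CTA | Visual Style | UX Complexity | 24h Volume |';
  const divider = '|---|---|---|---|---|---|';
  const body = rows.map(
    (r) => `| ${r.protocol} | ${r.keyAsset} | ${r.cta} | ${r.style} | ${r.complexity} | ${r.volume} |`
  );
  return [header, divider, ...body].join('\n');
}

module.exports = { toMatrix };
```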
Quantified Results: AI Agent vs. Manual Audit
| Feature | Manual Audit (10 protocols) | AI Agent (Autonomous) |
|---|---|---|
| Data Collection | ~4 hours | < 5 mins |
| Vision Depth | Subjectivity-prone | Structured (Colors/Layout) |
| Maintenance | High (Manual Reshooting) | Low (Single command rerun) |
| Accuracy | Prone to fatigue/omission | Consistent rule application |
FAQ
Q: How do you verify AI analysis results? A: We recommend a "Spot Check" approach. Randomly audit 10-20% of the generated snapshots to ensure the Vision model is interpreting specialized UI (like complex D3 charts or Web3 modals) correctly.
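The "Spot Check" can itself be automated: a small helper that picks a random fraction of snapshots for manual review. A sketch, with an injectable RNG so samples can be made reproducible:

```javascript
// Pick a random `fraction` of items for manual review (always at least one).
// Pass a seeded `rng` function for reproducible samples.
function spotCheckSample(items, fraction, rng = Math.random) {
  const pool = [...items];
  // Fisher-Yates shuffle, then take the first `count` items.
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  const count = Math.max(1, Math.round(pool.length * fraction));
  return pool.slice(0, count);
}

module.exports = { spotCheckSample };
```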
Q: Can it analyze interfaces that require wallet connection? A: While Playwright can inject private keys or simulate sessions, it adds complexity. Initially, it's best to focus on public post-landing page "Trade" terminals that don't require an active connection for visual benchmarking.
Risk Disclosure
While AI Agents dramatically increase efficiency, be aware of these limitations:
- Visual Dependency: This approach relies on static screenshots. Deep "Interaction Flows" (e.g., animations triggered by scroll or multi-step modals) are harder for AI to analyze without human-in-the-loop intervention.
- Token Costs: Capturing high-resolution screenshots and processing them through Vision models will incur API costs.
- Anti-Bot Measures: Some dApps might block headless browsers. You may need proxies or more "human-like" Playwright configurations.
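A "human-like" Playwright configuration usually starts with a realistic viewport, user agent, locale, and timezone on the browser context. The values below are illustrative, not a bypass guarantee; stubborn sites may still require proxies or a headed browser.

```javascript
// Context options that make a Playwright session look less like a bot.
// All values are illustrative defaults, not a guaranteed bypass.
function humanLikeContextOptions() {
  return {
    viewport: { width: 1440, height: 900 },
    userAgent:
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 ' +
      '(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36',
    locale: 'en-US',
    timezoneId: 'America/New_York',
  };
}

// Usage (assumes playwright is installed):
//   const browser = await chromium.launch({ headless: false });
//   const context = await browser.newContext(humanLikeContextOptions());
//   const page = await context.newPage();

module.exports = { humanLikeContextOptions };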
🚀 Why This Matters for 2026
Efficiency is no longer enough. To survive in the "Human + AI" era, you must move from being a Researcher to an Architect.
Instead of spending 8 hours collecting data, you spend 15 minutes reviewing the AI audit and the rest of the day making the design decisions that actually differentiate your product.
Conclusion: Build Your Watchtower
Don't just keep up with the competition—observe them autonomously. By building a "Claude Watchtower," you transform a tedious chore into a strategic advantage.
Are you ready to automate your next audit? Let's build it.
