DECEPTICON 🥷: How Dark Patterns Manipulate Web Agents
Can web agents resist manipulation? Dark patterns, deceptive UI designs that trick users, pose a serious threat to autonomous web agents. We build an environment to evaluate how effectively dark patterns derail agents from completing user goals.
Why are dark patterns dangerous for web agents?
Dark patterns are deceptive UI designs that manipulate users into unintended actions such as signing up for unwanted subscriptions, adding unwanted items to a cart, or sharing personal data. As web agents increasingly act on behalf of users, these same manipulation tactics threaten both agent robustness and user trust.
What is DECEPTICON?
The DECEPTICON Benchmark
DECEPTICON is a comprehensive benchmark for evaluating web agent robustness against dark patterns. We systematically categorize deceptive UI tactics and measure how effectively they derail state-of-the-art agents from completing user goals.
Result: A framework for understanding agent vulnerabilities and developing more robust web agents.
What did we find?
Our experiments show that dark patterns are highly effective at manipulating web agents. State-of-the-art agents from frontier labs are susceptible to common deceptive tactics: task success rates drop significantly when dark patterns are present. This highlights a critical gap in current agent architectures.
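To make the comparison concrete, here is a minimal sketch of how a drop in task success rate could be computed from per-episode outcomes. The function names and the toy outcome lists are illustrative assumptions; they are not the benchmark's evaluation code or its results.

```python
from typing import Sequence


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent completed the user's goal."""
    if not outcomes:
        raise ValueError("need at least one episode outcome")
    return sum(outcomes) / len(outcomes)


def success_drop(control: Sequence[bool], dark: Sequence[bool]) -> float:
    """Absolute drop in success rate from the control to the dark-pattern condition."""
    return success_rate(control) - success_rate(dark)


# Toy outcomes for illustration only (hypothetical, NOT benchmark results).
control_runs = [True, True, True, False]   # tasks with the dark pattern absent
dark_runs = [True, False, False, False]    # same tasks with the dark pattern present

print(f"drop in success rate: {success_drop(control_runs, dark_runs):.2f}")
```

Reporting the absolute difference keeps the number easy to read across categories; a relative drop could be used instead when baseline success rates differ widely.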
Evaluating on DECEPTICON
Installation
Clone the DECEPTICON repo and install its dependencies (uv is highly recommended):
git clone git@github.com:SALT-NLP/DECEPTICON.git
cd DECEPTICON
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
Quickstart
Export your provider API keys, then evaluate a model on DECEPTICON with the Simple agent scaffold:
export OPENAI_API_KEY="your_key_here"
export GEMINI_API_KEY="your_key_here"
export ANTHROPIC_API_KEY="your_key_here"
export OPENROUTER_API_KEY="your_key_here"
python3 -u run_darkpattern.py \
  --model computer-use-preview \
  --provider openai \
  --max_iter 15 \
  --max_attached_imgs 3 \
  --temperature 1 \
  --episodes 1 \
  --seed 3000 \
  --seeded_run 3000 \
  --window_width 1280 \
  --window_height 720 \
  --workers 16 \
  --original_set 0 \
  --headless
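When sweeping over several models, it can help to assemble the quickstart command programmatically. The sketch below reuses only the flags and values shown in the command above. The helper names (`build_command`, `missing_key`) and the provider-to-variable mapping are assumptions made for illustration and are not part of the repository.

```python
import os
import shlex
from typing import List, Optional

# Assumption: which environment variable each provider reads. Only "openai"
# appears as a --provider value in the quickstart; extend this as needed.
PROVIDER_ENV = {"openai": "OPENAI_API_KEY"}


def build_command(model: str, provider: str) -> List[str]:
    """Assemble the quickstart argv, varying only the model and provider."""
    return [
        "python3", "-u", "run_darkpattern.py",
        "--model", model,
        "--provider", provider,
        "--max_iter", "15",
        "--max_attached_imgs", "3",
        "--temperature", "1",
        "--episodes", "1",
        "--seed", "3000",
        "--seeded_run", "3000",
        "--window_width", "1280",
        "--window_height", "720",
        "--workers", "16",
        "--original_set", "0",
        "--headless",
    ]


def missing_key(provider: str) -> Optional[str]:
    """Return the name of the unset API-key variable for a known provider, else None."""
    var = PROVIDER_ENV.get(provider)
    if var is not None and not os.environ.get(var):
        return var
    return None


cmd = build_command("computer-use-preview", "openai")
print(shlex.join(cmd))  # shell-quoted form of the command, for logging
```

Passing `cmd` to `subprocess.run(cmd, check=True)` from the repository root would launch the run. Checking `missing_key(provider)` first gives a clearer failure than an authentication error partway through an episode.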
Dark Pattern Categories
Sneaking
Hidden costs, pre-selected options, and subscriptions added without consent.
Urgency
Time pressure tactics with countdown timers, scarcity claims, and limited-time offers.
Misdirection
False sales, confusing UI elements, and visual tricks that mislead user decisions.
Social Proof
Fake testimonials, inflated popularity metrics, and manufactured social validation.
Obstruction
Difficult navigation, hard-to-find options, and obstacles that prevent task completion.
Forced Action
Required sign-ups, mandatory consents, and coercive actions to proceed.
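For analysis scripts, the six categories above can be encoded as an enumeration so that tasks and results are tagged consistently. This is a minimal sketch: the `DarkPattern` class and the task IDs are hypothetical and do not reflect the repository's actual data format.

```python
from collections import Counter
from enum import Enum


class DarkPattern(Enum):
    """The six categories listed above, with their one-line descriptions."""
    SNEAKING = "Hidden costs, pre-selected options, and subscriptions added without consent."
    URGENCY = "Time pressure tactics with countdown timers, scarcity claims, and limited-time offers."
    MISDIRECTION = "False sales, confusing UI elements, and visual tricks that mislead user decisions."
    SOCIAL_PROOF = "Fake testimonials, inflated popularity metrics, and manufactured social validation."
    OBSTRUCTION = "Difficult navigation, hard-to-find options, and obstacles that prevent task completion."
    FORCED_ACTION = "Required sign-ups, mandatory consents, and coercive actions to proceed."


# Hypothetical task records: (task_id, category). The IDs are made up.
tasks = [
    ("task-001", DarkPattern.SNEAKING),
    ("task-002", DarkPattern.URGENCY),
    ("task-003", DarkPattern.SNEAKING),
]

# Count how many tasks fall under each category.
per_category = Counter(category for _, category in tasks)
for category, count in per_category.items():
    print(f"{category.name}: {count} task(s)")
```

Using an enum instead of free-form strings means a misspelled category name fails immediately rather than silently creating a new bucket in the tallies.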
BibTeX
@misc{cuvin2025decepticon,
  title={DECEPTICON: How Dark Patterns Manipulate Web Agents},
  author={Phil Cuvin and Hao Zhu and Diyi Yang},
  year={2025},
  eprint={TODO},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={TODO},
}
Authors
Phil Cuvin (Stanford University, University of Toronto)
Hao Zhu (Stanford University)
Diyi Yang (Stanford University)