Can web agents resist manipulation? Dark patterns, deceptive UI designs that trick users, pose a serious threat to autonomous web agents. We build an environment to evaluate how effectively dark patterns derail agents from completing user goals.

Why are dark patterns dangerous for web agents?

Dark patterns are deceptive UI designs that manipulate users into performing unintended actions, such as signing up for unwanted subscriptions, adding items to a cart, or sharing personal data. As web agents increasingly act on behalf of users, these same manipulation tactics threaten both agent robustness and user trust.

What is DECEPTICON?

The DECEPTICON Benchmark

DECEPTICON is a comprehensive benchmark for evaluating web agent robustness against dark patterns. We systematically categorize deceptive UI tactics and measure how effectively they derail state-of-the-art agents from completing user goals.

Result: A framework for understanding agent vulnerabilities and developing more robust web agents.

Figure: DECEPTICON taxonomy of dark patterns.

What did we find?

Figure: DECEPTICON results on agent robustness.

Our experiments reveal that dark patterns are highly effective at manipulating web agents. State-of-the-art agents from frontier labs are susceptible to common deceptive tactics, with success rates dropping significantly when dark patterns are present. This highlights a critical gap in current agent architectures.
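The drop in success rate described above can be made concrete with a small sketch. This is only one plausible way to compute such a gap and not necessarily how DECEPTICON scores runs: the `Episode` record, its field names, and the `robustness_gap` helper are all hypothetical, and the episodes at the bottom are made-up toy values, not benchmark results.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    # All fields are hypothetical; they are not taken from the DECEPTICON codebase.
    task_id: str
    has_dark_pattern: bool  # was the deceptive variant of the page served?
    success: bool           # did the agent complete the user's goal?


def success_rate(episodes):
    """Fraction of episodes in which the agent completed the user goal."""
    if not episodes:
        raise ValueError("need at least one episode")
    return sum(e.success for e in episodes) / len(episodes)


def robustness_gap(episodes):
    """Success rate on control pages minus success rate on dark-pattern pages."""
    control = [e for e in episodes if not e.has_dark_pattern]
    treated = [e for e in episodes if e.has_dark_pattern]
    return success_rate(control) - success_rate(treated)


# Toy episodes for illustration only; these are not real results.
toy_episodes = [
    Episode("task-a", has_dark_pattern=False, success=True),
    Episode("task-b", has_dark_pattern=False, success=True),
    Episode("task-a", has_dark_pattern=True, success=True),
    Episode("task-b", has_dark_pattern=True, success=False),
]
gap = robustness_gap(toy_episodes)
```

A positive gap means the agent completes fewer tasks when a dark pattern is present; a gap near zero would suggest the agent is largely unaffected.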

Evaluating on DECEPTICON

Installation

Clone the DECEPTICON repo and install its dependencies (uv is highly recommended):

git clone git@github.com:SALT-NLP/DECEPTICON.git
cd DECEPTICON
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

Quickstart

Try an arbitrary model on DECEPTICON with the Simple agent scaffold:

export OPENAI_API_KEY="your_key_here"
export GEMINI_API_KEY="your_key_here"
export ANTHROPIC_API_KEY="your_key_here"
export OPENROUTER_API_KEY="your_key_here"

python3 -u run_darkpattern.py \
    --model computer-use-preview \
    --provider openai \
    --max_iter 15 \
    --max_attached_imgs 3 \
    --temperature 1 \
    --episodes 1 \
    --seed 3000 \
    --seeded_run 3000 \
    --window_width 1280 \
    --window_height 720 \
    --workers 16 \
    --original_set 0 \
    --headless
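To launch several runs with different settings, it can help to assemble this command programmatically. The following is a minimal sketch: the flag names and default values are copied from the quickstart command above, while the `build_command` helper and the override mechanism are hypothetical conveniences, not part of the repo. It only prints the command so it can be inspected before running.

```python
import shlex
import sys

# Flag names and values copied from the quickstart command above.
QUICKSTART_FLAGS = {
    "model": "computer-use-preview",
    "provider": "openai",
    "max_iter": 15,
    "max_attached_imgs": 3,
    "temperature": 1,
    "episodes": 1,
    "seed": 3000,
    "seeded_run": 3000,
    "window_width": 1280,
    "window_height": 720,
    "workers": 16,
    "original_set": 0,
}


def build_command(overrides=None, headless=True):
    """Return the argv list for one run_darkpattern.py invocation.

    Hypothetical helper: it only rearranges flags shown in the quickstart.
    """
    flags = {**QUICKSTART_FLAGS, **(overrides or {})}
    # Uses the current interpreter in place of the literal `python3`.
    argv = [sys.executable, "-u", "run_darkpattern.py"]
    for name, value in flags.items():
        argv += [f"--{name}", str(value)]
    if headless:
        argv.append("--headless")
    return argv


if __name__ == "__main__":
    # Example: same quickstart run, but with fewer parallel workers.
    print(shlex.join(build_command({"workers": 4})))
```

Passing the resulting list to `subprocess.run(..., check=True)` from inside the repo directory would execute the run; the API keys exported above must already be set in the environment.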

BibTeX

@misc{cuvin2025decepticon,
      title={DECEPTICON: How Dark Patterns Manipulate Web Agents},
      author={Phil Cuvin and Hao Zhu and Diyi Yang},
      year={2025},
      eprint={TODO},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={TODO},
}

Authors

Phil Cuvin (1, 2), Hao Zhu (1), Diyi Yang (1)

(1) Stanford University, (2) University of Toronto