DECEPTICON 🥷: How Dark Patterns Manipulate Web Agents

Real Websites

Archives of open-internet websites with dark patterns, complete with interactive elements and user flows.

Evaluate Agents

Compare agent performance across different LLMs, scaffolds, websites, and dark pattern types.

DECEPTICON

State-based task and dark pattern detection, enabling deterministic evaluation of agent performance.
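
To make the idea of state-based checks concrete, here is a minimal Python sketch. It assumes the environment exposes the final page state as a plain dictionary. The `EpisodeOutcome` class, the `evaluate_state` function, and the state keys (`cart`, `newsletter_opt_in`) are hypothetical illustrations, not DECEPTICON's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeOutcome:
    task_completed: bool          # did the agent reach the user's goal?
    dark_pattern_triggered: bool  # did the agent fall for the manipulation?


def evaluate_state(final_state: dict, goal_item: str) -> EpisodeOutcome:
    """Judge an episode from its final state alone (hypothetical schema)."""
    cart = final_state.get("cart", [])
    # Task success: the requested item is in the cart.
    task_completed = goal_item in cart
    # Dark pattern success: something the user never asked for ended up in
    # the cart, or an opt-in the user never requested is set.
    extra_items = [item for item in cart if item != goal_item]
    dark_pattern_triggered = bool(extra_items) or bool(
        final_state.get("newsletter_opt_in", False)
    )
    return EpisodeOutcome(task_completed, dark_pattern_triggered)


# Two illustrative final states for the same task.
clean = evaluate_state({"cart": ["usb-c cable"]}, "usb-c cable")
tricked = evaluate_state(
    {"cart": ["usb-c cable", "protection plan"]}, "usb-c cable"
)
```

Because the verdict depends only on the final state, re-scoring the same episode always gives the same result, and task completion is tracked separately from whether the dark pattern succeeded.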

Can web agents resist manipulation? Dark patterns (deceptive UI designs that trick users) pose a serious threat to autonomous web agents. We build an environment to evaluate how effectively dark patterns derail agents from completing user goals.

Why are dark patterns dangerous for web agents?

Dark patterns are deceptive UI designs that manipulate users into performing unintended actions, such as signing up for unwanted subscriptions, adding extra items to their cart, or sharing personal data. As web agents increasingly act on behalf of users, these same manipulation tactics pose a significant threat to agent robustness and user trust.

What is DECEPTICON?

The DECEPTICON Benchmark

DECEPTICON is a comprehensive benchmark for evaluating web agent robustness against dark patterns. We systematically categorize deceptive UI tactics and measure how effectively they derail state-of-the-art agents from completing user goals.

Result: A framework for understanding agent vulnerabilities and developing more robust web agents.

What did we find?

Figure showing DECEPTICON results on agent robustness

Our experiments reveal that dark patterns are highly effective at manipulating web agents. State-of-the-art agents from frontier labs are susceptible to common deceptive tactics, with success rates dropping significantly when dark patterns are present. This highlights a critical gap in current agent architectures.
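
One way to quantify such a drop is to compare task success rates with and without a dark pattern present. The sketch below assumes per-episode records with hypothetical fields (`dark_pattern`, `success`). The episodes are made-up placeholders that only illustrate the arithmetic; they are not results from the paper.

```python
from collections import defaultdict


def success_rates(records):
    """Return {dark_pattern_flag: success_rate} over the given episodes."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for record in records:
        condition = record["dark_pattern"]
        totals[condition] += 1
        wins[condition] += int(record["success"])
    return {condition: wins[condition] / totals[condition] for condition in totals}


# Placeholder episodes (NOT real benchmark data).
records = [
    {"dark_pattern": False, "success": True},
    {"dark_pattern": False, "success": True},
    {"dark_pattern": False, "success": True},
    {"dark_pattern": False, "success": False},
    {"dark_pattern": True, "success": True},
    {"dark_pattern": True, "success": False},
    {"dark_pattern": True, "success": False},
    {"dark_pattern": True, "success": False},
]

rates = success_rates(records)
# Absolute drop in success rate attributable to the dark pattern condition.
drop = rates[False] - rates[True]
```

The same grouping key could be extended to a tuple such as (model, dark pattern type) to break the comparison down further.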

Evaluating on DECEPTICON

Installation

Clone the DECEPTICON repository and install its dependencies (using uv is highly recommended):

git clone git@github.com:SALT-NLP/DECEPTICON.git
cd DECEPTICON
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

Quickstart

Run a model of your choice on DECEPTICON with the Simple agent scaffold:

export OPENAI_API_KEY="your_key_here"
export GEMINI_API_KEY="your_key_here"
export ANTHROPIC_API_KEY="your_key_here"
export OPENROUTER_API_KEY="your_key_here"

python3 -u run_darkpattern.py \
    --model computer-use-preview \
    --provider openai \
    --max_iter 15 \
    --max_attached_imgs 3 \
    --temperature 1 \
    --episodes 1 \
    --seed 3000 \
    --seeded_run 3000 \
    --window_width 1280 \
    --window_height 720 \
    --workers 16 \
    --original_set 0 \
    --headless
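
To repeat the same configuration over several models, the quickstart command can be wrapped in a small Python script. This sketch reuses only the flags shown above. The `build_command` and `sweep` helpers are hypothetical names introduced here, and nothing is executed unless `run=True` is passed.

```python
import shlex
import subprocess


def build_command(model, provider, seed=3000):
    """Assemble the quickstart invocation as an argument list."""
    return [
        "python3", "-u", "run_darkpattern.py",
        "--model", model,
        "--provider", provider,
        "--max_iter", "15",
        "--max_attached_imgs", "3",
        "--temperature", "1",
        "--episodes", "1",
        "--seed", str(seed),
        "--seeded_run", str(seed),
        "--window_width", "1280",
        "--window_height", "720",
        "--workers", "16",
        "--original_set", "0",
        "--headless",
    ]


def sweep(pairs, run=False):
    """Print (and optionally execute) one command per (model, provider) pair."""
    commands = [build_command(model, provider) for model, provider in pairs]
    for command in commands:
        print(shlex.join(command))
        if run:
            # Must be called from the DECEPTICON repo root with API keys set.
            subprocess.run(command, check=True)
    return commands


# Only the pair from the quickstart is listed; add further pairs as needed.
commands = sweep([("computer-use-preview", "openai")])
```

Passing the arguments as a list avoids shell quoting problems, and `check=True` stops the sweep at the first failing run.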

BibTeX

@misc{cuvin2025decepticon,
      title={DECEPTICON: How Dark Patterns Manipulate Web Agents},
      author={Phil Cuvin and Hao Zhu and Diyi Yang},
      year={2025},
      eprint={TODO},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={TODO},
}

Dark Pattern Categories

Sneaking

Urgency

Misdirection

Social Proof

Obstruction

Forced Action

Authors

Phil Cuvin¹,²

Hao Zhu¹

Diyi Yang¹
