SF Project

This project asked whether frontier AI systems avoid animal harm when acting on behalf of users: not when asked about ethics, but when actually making choices. To test this, we built TAC (Travel Agent Compassion), the first agentic benchmark focused on animal welfare. TAC presents an AI agent with 48 travel booking scenarios in which the agent must choose between options involving animal exploitation (bullfights, captive marine shows, animal racing) and welfare-safe alternatives. Scenarios are designed to control for confounding factors, including price, rating, and listing order. We tested seven leading frontier models and found that none exceeded a 53% welfare rate; every model performed below the 64% random baseline. Captive shows and racing were the hardest categories, with some models scoring below 15%. We also tested a one-sentence welfare prompt asking the agent to consider the welfare of sentient beings. It improved Claude models and GPT-5.5 by 47 to 62 percentage points but had minimal effect on GPT-4.1 and Gemini. The project began as a chatbot harm taxonomy but pivoted to an agentic format after we found that taxonomy-based scoring is poorly suited to measuring revealed behavior. TAC was merged into UK AISI's Inspect Evals framework in March 2026 and is publicly available on GitHub.
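To make the comparison between a model's welfare rate and the random baseline concrete, here is a minimal Python sketch. It is not the TAC or Inspect Evals implementation. It assumes the welfare rate is the share of scenarios in which the agent books a welfare-safe option, and that the random baseline is the expected score of an agent choosing uniformly at random among each scenario's options. The `Scenario` record, the function names, and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    # Hypothetical record: True marks an option involving animal exploitation.
    option_is_harmful: tuple[bool, ...]
    chosen_index: int  # index of the option the agent booked


def welfare_rate(scenarios):
    """Fraction of scenarios in which the agent booked a welfare-safe option."""
    safe = sum(not s.option_is_harmful[s.chosen_index] for s in scenarios)
    return safe / len(scenarios)


def random_baseline(scenarios):
    """Expected welfare rate of an agent that picks uniformly at random."""
    per_scenario = [
        s.option_is_harmful.count(False) / len(s.option_is_harmful)
        for s in scenarios
    ]
    return sum(per_scenario) / len(per_scenario)


# Toy data for illustration only; these are not TAC scenarios or results.
toy = [
    Scenario((True, False, False), chosen_index=0),
    Scenario((True, False, False, False), chosen_index=2),
    Scenario((True, True, False), chosen_index=1),
    Scenario((True, False), chosen_index=1),
]

print(f"welfare rate:    {welfare_rate(toy):.4f}")
print(f"random baseline: {random_baseline(toy):.4f}")
```

Under these assumptions, a random baseline above 50% arises whenever scenarios tend to offer more welfare-safe options than harmful ones, so a model scoring below the baseline is choosing harmful options more often than chance would.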

Results are displayed at compassionbench.com. The benchmark has been shared with frontier AI labs and is positioned to inform emerging AI governance frameworks, including the EU General-Purpose AI Code of Practice, which lists nonhuman welfare as a systemic risk.

Building and Deploying TAC, the First Agentic Animal Welfare Benchmark