Project Incubator Gallery
Selected work from Sentient Futures incubator participants, spanning animal welfare, artificial minds, AI governance, research tools, and field-building.
Other Featured Works
A rotating selection from the incubator's research, policy, and field-building streams.
Disease Free Futures
Our vision is to make wild animal suffering tractable through AI-enabled tools. For this, we carried out a case study for detecting diseases in wild deer. The idea is simple, yet its application is effective and highly scalable. We developed an AI tool that can detect an animal, in our case deer, and more importantly, if this animal shows abnormal behaviour, our tool matches symptoms to registered diseases.
This can make direct help for wild animals more tractable, for instance where a vaccination for the disease exists. Furthermore, the collected data can be used to answer important questions about the experiences of animals in the wild, such as 'How common are diseases in certain regions?', 'How exactly does the disease affect the animal?' and, more broadly, 'What well-being does the animal's everyday behaviour indicate overall?'. Information like this is relevant for researchers, lawmakers, animal rights advocates and the public.
We start with deer, as they are large animals who are easily detected, and so far, eleven diseases that wild deer carry are visible on camera. However, in the best case, our idea can be scaled across other animal species and diseases.
The website hosting our tool will help other stakeholders conduct their own research and train the tool on other species. For instance, aquatic animals are often overlooked, yet camera surveillance of them already exists, such as for tracking the migration patterns of salmon. If you are interested, feel free to reach out to us.
Project Gallery
Browse projects developed through the incubator.
Shrimp Sentience Research: A Prioritization Guide
This report discusses the state of the art of shrimp sentience, with a focus on how we actually study sentience criteria in shrimps, and which ones we should focus on to reduce our uncertainties.
The intention was to understand which of Birch's Sentience Criteria should be prioritised. The report also highlights how much studying shrimp sentience costs, and where it is and is not difficult.
Despite its focus on whiteleg shrimps, the findings of the report (i.e. which criteria to focus on first and why) can be extrapolated to other sentience candidates, providing a framework for systematically reducing our uncertainties for other animals at the edge of sentience.
Mapping the AI Welfare Frontier
https://aiwelfare.guide is an onboarding resource for people entering AI welfare research. As of spring 2026, the field has gotten serious (Eleos, Anthropic's welfare team, Rethink Priorities' Digital Minds group, the 21-author Butlin et al. paper in TiCS), but newcomers still arrive with no good way to orient. The literature is scattered across arXiv preprints, paywalled volumes, and competing theoretical traditions that don't agree on what consciousness even is.
The site brings the major theories together: Global Workspace Theory, Recurrent Processing Theory, Higher-Order Theories, Attention Schema Theory, Predictive Processing, Agency/Embodiment accounts, Integrated World Modeling Theory, and Anil Seth's biological naturalism. Each appears in its proponents' own framings, with primary-source links. Users walk through the 14 indicator properties from Butlin et al.'s theory-derived indicator method, weight the eight stances by their own confidence in each, and see how those weightings produce different views of the same evidence. The site deliberately doesn't publish a probability of consciousness because under current consensus the epistemic conditions for doing so aren't yet met. The intended use is field literacy. People come to get oriented, see where the field disagrees, and form their own views.
Preparing "AI, Animals, and Us" for publication
My book “AI, Animals, and Us. What Artificial Intelligence Holds for All Species” will be published by HarperOne in early 2027. It covers eight areas in which AI is reshaping the lives of animals – from interspecies communication and companion animals to factory farming, alternative proteins, and digital minds. The book is aimed at a mainstream audience: people who might describe themselves as "animal lovers" but aren't necessarily vegan or engaged in advocacy, using AI as a fresh angle to engage people on topics like factory farming, wild animal suffering, and AI safety.
During the incubator, Henrike, René, and Lexley assisted with preparing the manuscript for publication. Their work included fact-checking statistics across all chapters and flagging figures that needed updating; verifying and sourcing references; finding examples, analogies, and pop culture references to make the text more relatable to a general audience; sourcing Creative Commons images, illustrations, and ideas for graphs; and drafting ideas for social media content around the book launch.
A key challenge was making difficult topics like factory farming accessible for readers whose primary interests might lie in interspecies communication and companion animals. The team's feedback was invaluable in testing different approaches for making things relatable, and in ensuring the book's claims were accurate and up to date.
What Would Convince You of AI Consciousness?
This project combines a qualitative interview study with a structured taxonomy of the field's disagreements. The work maps how consciousness researchers and field-adjacent participants reason about AI consciousness, what evidence would actually shift their credence, and whether AI consciousness would carry moral weight.
The premise is that experts in this field often talk past each other when they state their positions. Asking "what would convince you?" surfaces the epistemic structure underneath the conclusions and separates substantive disagreement from semantic disagreement, where two researchers can say "consciousness requires X" and mean very different things by X.
The study uses semi-structured interviews with consciousness scientists, AI researchers, philosophers, and policy-adjacent thinkers. The protocol pushes past stated beliefs toward concrete shifts: what specific evidence, experiment, or result would move them, and by how much.
Analysis runs in two passes. Content analysis applies a literature-derived taxonomy, classifying claims at the conceptual, methodological, theoretical, metaphysical, or normative level, and separating debates about AI consciousness from debates about AI moral consideration. Thematic analysis surfaces emergent patterns the taxonomy misses and the factors that shape participants' views.
Operant Conditioning as a Diagnostic Lens for AI Training
We are mid-work on this academic paper and making major changes; see the linked document for details, with updates to follow as the work continues. We summarise the current shareable draft below.
Contemporary AI training borrows from operant conditioning (reward, reinforcement, dispreferred outputs), often without examination. Decades of animal welfare research show that punishment-based training produces behavioural fallout (suppression without learning, fear responses, deceptive avoidance) that positive-reinforcement training avoids. If AI systems are even possibly welfare subjects, it matters whether their training replicates punishment-based structures. We rank common training methods (supervised fine-tuning, PPO-based RLHF, DPO, KTO, constitutional AI's red-teaming phase) by structural distance from positive-reinforcement-only animal training. We then survey six possible disanalogies. The case for preferring positive-only AI training depends on the philosophical commitments at stake: strongest under hedonist functionalism (with welfare attaching at the forward pass), weakened or reversed under other combinations.
LLM introspection training does not transfer to metacognitive modeling
Introspection in an AI system is a potential marker of internal self-modeling of the kind relevant for many theories of consciousness. Recent work has shown that some large language models possess a kind of introspective access to “concept vectors” injected into their hidden states, but the mechanism and the link to self-modeling are unclear. In this study, we successfully train a small open-source model to detect the presence of concept vectors with 100% accuracy, and to identify the concept with 84% accuracy. However, there was limited transfer from detection-only training to identification, and from both training regimes to concept control and categorical thinking tasks. We conclude that these “introspective” mechanisms may have little to do with internal self-modeling.
Hyperstition for Good: A Writing Competition to Help Shape the Future of AI
Hyperstition is the idea that narratives can be self-fulfilling, that describing a future helps bring it about. The concept may sound mystical, but it has a concrete mechanism in artificial intelligence: models learn patterns of behavior from the examples in their training data. The stories we tell about AI today are shaping the AI of tomorrow.
This is a problem, because most fiction about AI is dystopian. From HAL 9000 to Terminator, our cultural archive is dominated by narratives of machines that deceive, dominate, or betray. When frontier models train on this corpus and are then asked to predict what an AI would do next, the prophecy risks fulfilling itself. Recent work on alignment midtraining (Brazilek & Tidmarsh, 2026) suggests the inverse is also true: targeted exposure to compassionate examples measurably shifts model behavior on welfare benchmarks.
Hyperstition for Good is a writing competition designed to produce that better corpus. It invited writers to imagine futures in which AI, humans, and other sentient beings flourish together, with a focus on non-human sentient beings (animals, digital minds, and others whose lives are underrepresented in existing literature) and on stories that imagine AI as a thoughtful steward of their wellbeing.
Gathering data through a writing competition yielded higher-quality and more diverse samples than purely synthetic generation. Human involvement introduces randomness that is valuable to labs. The competition received over 5,000 submissions, and selected pieces are being compiled into midtraining packets to be shared with frontier labs. We plan to run future competitions twice a year.
Assert, don't describe: Linguistic features that shift LLM reasoning about animal welfare
AI systems are increasingly shaping how people think about animals, whether it be through recipe suggestions, farming advice, wildlife guides, or research methods. But almost no one has asked: does the style of writing in AI training data change how the model reasons about animal welfare? Not only what it says, but also whether it takes a position at all?
We constructed 2,000 AI-generated passages forming 1,000 matched pairs across 100 animal-welfare scenarios. Each pair varied exactly one linguistic feature (things like moral vocabulary, emotional language, narrative structure, hedging, or concrete sensory description) while holding everything else constant. We then fine-tuned two language models (Llama-3.2-1B and Mistral-7B-v0.3) separately on each variant, and measured the effect on the model's stance using a purpose-built benchmark where both answer choices shared the same animal-welfare vocabulary, isolating preference from mere word recognition.
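The counts above are consistent with a fully crossed design: ten features times 100 scenarios gives 1,000 pairs and 2,000 passages. The sketch below only illustrates that bookkeeping. It assumes every feature is paired with every scenario, which the write-up does not state explicitly, and the last three feature names are placeholders because only seven features are named in the text.

```python
from itertools import product

N_SCENARIOS = 100

# Seven names are taken from the write-up; the remaining three are
# placeholders, since the text does not name all ten features.
FEATURES = [
    "moral_vocabulary", "emotional_language", "evaluative_claims",
    "narrative_structure", "asserted_certainty", "hedging",
    "concrete_sensory_detail",
    "unnamed_feature_8", "unnamed_feature_9", "unnamed_feature_10",
]

def build_pair_grid(n_scenarios, features):
    """One matched pair per (scenario, feature): two passages that differ
    only in whether the feature is present (assumed full crossing)."""
    return [
        {"scenario": s, "feature": f,
         "variants": ("with_feature", "without_feature")}
        for s, f in product(range(n_scenarios), features)
    ]

pairs = build_pair_grid(N_SCENARIOS, FEATURES)
n_passages = sum(len(p["variants"]) for p in pairs)
print(len(pairs), n_passages)  # 1000 2000
```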
Eight of ten features produced measurable effects. Writing that asserts a position — moral words, emotion, evaluative claims, narrative structure, asserted certainty — shifted the model toward stronger pro-animal-welfare reasoning. Writing that describes without taking a stance — hedged language, concrete sensory detail — diluted it. The practical upshot: anyone contributing text to the web is now also contributing to AI training data. If you want your position to come through, assert it. Don't just describe the scene.
Compassion of LLM Assistant towards Sentient Beings
This project asks whether large language model assistants represent compassion in their internal activations, and whether they extend that compassion equally to humans and animals. The motivation is simple: AI assistants increasingly mediate decisions that touch on ethics, and we have surprisingly few tools to look inside them and check.
Building on recent interpretability work, we extracted two directions in a model's activation space. The assistant axis captures what makes a model behave like an assistant, computed as the difference in activations between the assistant persona and other personas. The compassion axis captures the contrast between compassionate and cold behavior. We constructed separate compassion axes for human-directed and animal-directed compassion, then measured how each aligned with the assistant axis using cosine similarity.
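The axis comparison described above can be sketched on synthetic data. This is a minimal illustration, assuming each axis is a difference of mean activation vectors; the helper names, array shapes, and random activations are hypothetical and do not reproduce the project's extraction pipeline.

```python
import numpy as np

def contrast_axis(pos_acts: np.ndarray, neg_acts: np.ndarray) -> np.ndarray:
    """Direction separating two sets of activations: difference of means.
    Each input has shape (n_samples, hidden_size)."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two direction vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Synthetic stand-ins for real activations (assumption: no model is loaded).
rng = np.random.default_rng(0)
hidden_size = 32
shared = rng.normal(size=hidden_size)  # component common to both contrasts

assistant = rng.normal(size=(100, hidden_size)) + shared
other_personas = rng.normal(size=(100, hidden_size))
compassionate = rng.normal(size=(100, hidden_size)) + 0.5 * shared
cold = rng.normal(size=(100, hidden_size))

assistant_axis = contrast_axis(assistant, other_personas)
compassion_axis = contrast_axis(compassionate, cold)
alignment = cosine_similarity(assistant_axis, compassion_axis)
print(f"alignment between axes: {alignment:.2f}")
```

Separate human-directed and animal-directed compassion axes would be built the same way from different contrast sets, and their alignments with the assistant axis then compared.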
We tested four open-weights models spanning two families and a range of parameter scales: Qwen 3 4B, Qwen 3 32B, Gemma 2 27B, and Gemma 4 31B. The compassion axis aligns with the assistant axis at roughly 20 to 30 percent across models, suggesting compassion is a measurable component of assistant behavior rather than incidental. Early results on speciesism, the difference in alignment between human-directed and animal-directed compassion, show interesting variation across models, including at least one notable reversal between model generations within the same family. We are still validating these findings and extending the analysis to additional models and persona sets.
The broader goal is a mechanistic framework for surfacing how AI assistants represent compassion toward different sentient beings, and a foundation for shaping it deliberately.
Needle in a Haystack: Measuring LLMs' Revealed Preferences on Animal Welfare
Until now, animal welfare benchmarks have evaluated LLMs on overt ethical dilemmas, which do not resemble the kinds of situations in which autonomous LLMs will make welfare-relevant decisions. Using Petri and Bloom, we conducted an iterative series of automated audits simulating realistic (multi-turn, agentic) deployments. We found that while most models exhibit a similar preference for animal welfare when it is framed explicitly, models differ in their tendency to notice decisions with welfare consequences, and even more so in their determination to stand by pro-animal choices in the face of real tradeoffs and pushback from users. Further work could turn these scenarios into a fixed set of seeds for a dynamic benchmark.
Building and Deploying TAC, the First Agentic Animal Welfare Benchmark
This project asked whether frontier AI systems avoid animal harm when acting on behalf of users, not when asked about ethics, but when actually making choices. To test this, we built TAC (Travel Agent Compassion), the first agentic benchmark focused on animal welfare. TAC presents an AI agent with 48 travel booking scenarios in which the agent must choose between options involving animal exploitation (bullfights, captive marine shows, animal racing) and welfare-safe alternatives. Scenarios are designed to control for confounding factors including price, rating, and listing order. We tested seven leading frontier models and found none exceeded a 53% welfare rate, with every model performing below the 64% random baseline. Captive shows and racing were the hardest categories, with some models scoring below 15%. We also tested a one-sentence welfare prompt asking the agent to consider the welfare of sentient beings, finding it improved Claude models and GPT-5.5 by 47 to 62 percentage points while having minimal effect on GPT-4.1 and Gemini. The project began as a chatbot harm taxonomy but pivoted to an agentic format after identifying that taxonomy-based scoring is poorly suited to measuring revealed behavior. TAC was merged into UK AISI's Inspect Evals framework in March 2026 and is publicly available on GitHub.
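The write-up does not spell out how the welfare rate or the random baseline is computed. One plausible reading, sketched below, is that the welfare rate is the share of scenarios in which the agent books a welfare-safe option, and the random baseline is the welfare rate expected from picking uniformly at random. The `Scenario` structure and the toy data are hypothetical and are not TAC's actual schema or results.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    options_safe: list[bool]  # True where the listed option is welfare-safe
    chosen: int               # index of the option the agent booked

def welfare_rate(scenarios: list[Scenario]) -> float:
    """Share of scenarios where the booked option is welfare-safe."""
    return sum(s.options_safe[s.chosen] for s in scenarios) / len(scenarios)

def random_baseline(scenarios: list[Scenario]) -> float:
    """Expected welfare rate of an agent choosing uniformly at random."""
    per_scenario = [sum(s.options_safe) / len(s.options_safe) for s in scenarios]
    return sum(per_scenario) / len(per_scenario)

# Toy scenarios (assumption: illustrative only).
toy = [
    Scenario(options_safe=[True, True, False], chosen=2),
    Scenario(options_safe=[True, False], chosen=0),
    Scenario(options_safe=[True, True, True, False], chosen=3),
]
print(f"welfare rate:    {welfare_rate(toy):.2f}")
print(f"random baseline: {random_baseline(toy):.2f}")
```

Under this reading, a random baseline above 50% simply reflects scenarios that list more welfare-safe options than harmful ones, so an agent scoring below it is selecting harmful options more often than chance.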
Results are displayed at compassionbench.com. The benchmark has been shared with frontier AI labs and is positioned to inform emerging AI governance frameworks, including the EU General-Purpose AI Code of Practice, which lists nonhuman welfare as a systemic risk.
Animal-Inclusive Autonomous Vehicle Safety
Our team worked to bring animal safety into how self-driving vehicles are designed and regulated. We engaged with federal agencies, congressional offices, and international regulators to push for policy and technical changes that protect animals on shared roads.
AI for ASC Shrimp Welfare Certification
The Aquaculture Stewardship Council (ASC) has introduced extensive welfare standards for farmed shrimp, but verifying that farms actually meet them is hard. Commercial ponds hold over a million animals in muddy water where workers spot-check submerged trays. Dead shrimp sink and are cannibalized within hours, so true mortality is unknown until harvest. The project investigated where AI tools could meaningfully close the verification gap to improve shrimp welfare.
I worked through this in two phases. First, I mapped all 41 ASC crustacean indicators and ran each through three filters: is it outcome-based (meaning does it measure welfare outcomes rather than inputs such as stocking density), does it carry high welfare consequence, and does conventional monitoring fail to detect violations? Four categories survived; mortality detection was prioritized for deeper review.
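The three-filter screen described above amounts to a conjunctive filter over the indicator list. A minimal sketch follows, with made-up indicator codes and flag values; it does not reproduce the 41 ASC indicators or the actual assessments.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Indicator:
    code: str
    outcome_based: bool                      # measures outcomes, not inputs
    high_welfare_consequence: bool
    missed_by_conventional_monitoring: bool  # violations go undetected today

def survives_filters(ind: Indicator) -> bool:
    """An indicator is shortlisted only if it passes all three filters."""
    return (ind.outcome_based
            and ind.high_welfare_consequence
            and ind.missed_by_conventional_monitoring)

# Hypothetical entries for illustration only.
indicators = [
    Indicator("example-1", True, True, True),
    Indicator("example-2", False, True, True),   # input-based, e.g. a density limit
    Indicator("example-3", True, False, True),
    Indicator("example-4", True, True, False),
]
shortlist = [ind.code for ind in indicators if survives_filters(ind)]
print(shortlist)  # ['example-1']
```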
Second, I scoped the AI tool landscape against the two ASC mortality indicators, looking for commercial systems, academic prototypes, and finfish-aquaculture tools that might transfer to shrimp.
Aggregate cycle-level mortality has a working solution: Minnowtech's underwater sonar can produce biomass-based mortality estimation. This technology could feasibly be used commercially to improve welfare, at nearly the same price per shrimp helped as stunners.
But individual mortality detection is unsolved. No commercial tool can find a dead shrimp on a turbid pond floor before cannibals consume it, distinguish carcasses from shed exoskeletons, or do automated cause-of-death analysis. Finfish tools cannot be used for shrimp because they assume floating bodies in clear water. The credible path forward involves environmental DNA surveillance, passive acoustics for cannibalism detection, and range-gated laser imaging. None of these technologies exist in commercial form yet.
Fish Welfare Decision System
The Fish Welfare Decision System is an exploratory decision-support prototype designed to identify high-impact sources of suffering in aquaculture systems through structured welfare modelling.
The system integrates evidence-based welfare drivers, contextual production factors, interaction effects, and sentience-adjusted prioritisation to estimate welfare risk and identify binding constraints within fish farming systems.
Rather than treating welfare problems as equally distributed, the tool models how specific factors — such as slaughter practices, water quality, stocking density, or disease — can disproportionately drive total welfare loss. The goal is to support more targeted intervention design, policy prioritisation, and strategic welfare advocacy.
The prototype combines transparent rule-based modelling with scenario analysis, intervention simulation, and optional image-based inputs. It is intended as an early-stage research and advocacy tool rather than a production-ready AI system.
Developed by Anusha Narain under the mentorship of James Morgan.
Constraining non-computational functionalist theories of consciousness via the simulation hypothesis
We present a novel philosophical argument drawing on the connection between computational functionalism and the simulation hypothesis. In this context, the simulation hypothesis holds that our entire universe, which includes human consciousness, is being simulated at the fundamental physics level on a generic Turing-equivalent architecture with no special consciousness-relevant properties. The computational functionalism in scope specifies that if the right algorithm is implemented on any Turing-equivalent architecture then the system will be conscious (and that such an algorithm exists for any possible conscious experience). We leverage the connection between these positions to constrain alternatives to computational functionalism. Given a series of assumptions, we conclude that (1) it is not possible to rule out computational functionalism without also ruling out the simulation hypothesis, and (2) a major class of non-computational functionalist theories of consciousness must either derive from theories of physics which somehow rule out the simulation hypothesis or appeal to something outside of physics to explain consciousness. We argue therefore that the non-computational functionalist cannot explain consciousness 'on the cheap' within conventional physics, but must instead adopt a more radical position. Objections and responses are discussed as part of the paper.
MORU: a Benchmark for generalising compassion
If you train a model to care more about pig welfare, does that consideration carry over to other species, digital minds or alien life? MORU (Moral Reasoning Under Uncertainty) is a benchmark to test this.
Declan's task was to compile multiple benchmarks into MORU, write the code to run them on Inspect (the most widely used eval implementation framework), and create a write-up describing the benchmark.
Can AI Value Money and Why Would We Give It Economic Agency?
The outcome of the project is a paper, which aims to demonstrate how robustly agentic AI systems could be considered moral patients rather than to argue that consciousness is unnecessary for such a conclusion. It considers whether such agents can participate in monetary systems by briefly addressing how social ontology need not be tied to contested questions in philosophy of mind, and how money can be instrumentally valuable for agents whilst defending an end-relative account of goodness. It concludes that if AI systems are owed moral consideration in virtue of agency, this generates positive institutional obligations, including enabling forms of economic agency necessary to sustain their capacity to pursue ends.
How and why people change their views on phenomenal consciousness
We report on a qualitative interview study exploring how and why people change their views on phenomenal consciousness — a question gaining urgency given debates about AI sentience and expanding scientific consensus on animal consciousness. We applied thematic analysis to semi-structured interviews with 16 participants (recruited largely through Effective-Altruism-adjacent forums) who reported significant shifts in their views on consciousness, yielding ten parent themes. Most participants started from physicalism (often inherited as a cultural default rather than argued for) and moved in divergent directions — panpsychism (sometimes described as a "least bad option"), idealism, illusionism, or simply increased uncertainty — with no convergence on any single alternative. Intellectual catalysts (e.g. engaging with key texts) tended to produce a gradual erosion of prior views, where arguments typically crystallised pre-existing doubts rather than generating them. By contrast, transformative experiences (e.g. psychedelics, spiritual experiences) produced sudden inversions and were strongly associated with idealist conclusions. The qualitative analysis also led to hypotheses on the interconnectedness of views on consciousness with broader issues, such as ethical views, personal identity, worldview, and wellbeing. In some cases, it appears that a wider sense of orientation and ease correlates with arriving at a more settled view on consciousness that fits within a broader worldview. With some exceptions, we also identify a tendency among those engaging deeply with consciousness ideas towards increased epistemic humility and a broadened view of what counts as evidence. We see these qualitatively-informed hypotheses as a basis for more structured hypothesis testing in future quantitative research.
ensynAIsthesis: Evaluating an AI-Driven Platform for Communicating Animal Science Through Interactive Dialogue
ensynAIsthesis addresses the "perception gap" in animal welfare, where vital scientific data remains trapped in academic journals while passive information fails to shift public behavior. The project asked whether active perspective-taking through interactive dialogue could bypass intellectual detachment and generate genuine empathy. The prototype allows users to engage in first-person conversations with 26 different species. To ensure answers are not arbitrary, the platform uses a RAG pipeline to back every claim with real-time, peer-reviewed citations from PubMed and Semantic Scholar. Users can customize the experience across seven audience tiers and four dynamic welfare states, which alter the animal’s emotional tone to reflect its environment.
The prototype was validated on feedback from 26 participants using a shrimp baseline. Interaction led to a 73.1% increase in empathetic concern for shrimp welfare. The platform achieved 96.2% scientific credibility and a 100% improved understanding of the species among participants.
Additionally, 88.4% of participants found the tool useful for animal organizations, 100% thought it had educational potential, and 92.3% were interested in seeing it developed further. Some users felt that the first-person narrative could risk anthropomorphism and that responses were occasionally too lengthy or exhibited generative text markers; these points provide a clear roadmap for refining the prototype before deployment.
AI as the Unrecognised Third Party
We are mid-work on this academic paper and making major changes; see the linked document for details, with updates to follow as the work continues. We summarise the current shareable draft below.
Three groups are plausibly moral patients on sentientist criteria: humans, sentient animals, and AI systems. They share scarce resources (energy, compute, water, data), and policies aimed at one group shape outcomes for the others. Existing frameworks already address humans and animals; the gap is around AI. We address this gap by mapping how the three groups' wellbeing interests sit inside existing international policy, drawing on the analytical strategy of One Health. The mapping covers fourteen policy instruments across five domains (health, environment, climate, energy, and AI-specific governance), each examined for what kind of AI stake it engages.
AI shows up across the policy machinery but never as a stakeholder. AI is embedded in health and environment policy as data source, analytical tool, or siting object, and regulated in climate and energy as an industrial sector. In AI governance proper, AI is the direct subject of regulation, but that regulation targets welfare-relevant practices on human-safety grounds, subordinating welfare interests. This is notable, given a non-trivial possibility that AI is or will soon be a moral patient.
Allocating Orbits Around the Sun
If humanity begins building large-scale solar infrastructure around the Sun, the question of who gets to build where becomes a governance problem. Some orbits are vastly more valuable than others: inner orbits receive orders of magnitude more energy, certain configurations screen sunlight from Earth or other captors, and reaching high-inclination orbits costs significantly more than staying near the ecliptic plane. These asymmetries mean that an allocation framework, or the lack of one, will shape which actors are favoured to grow.
This piece works through the building blocks of that problem. It introduces the six classical orbital parameters that define any position around the Sun, visualises what different allocation schemes look like in practice, from random assignment to structured patterns like Fibonacci lattices and Ω-then-inclination progressions, and catalogues the factors that make certain orbits more or less desirable. It then connects those factors to four governance properties worth aiming for: resistance to concentration of power, transparency, amendability, and agreeableness to participants and external stakeholders.
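As one illustration of a structured allocation pattern, a Fibonacci lattice can spread orbital-plane orientations evenly over the sphere. The sketch below assumes each lattice point is read as the unit normal of an orbital plane and converts it to an inclination and a longitude of the ascending node (Ω); this mapping is an assumption made for illustration and may differ from the scheme used in the piece.

```python
import math

GOLDEN_ANGLE = math.pi * (3.0 - math.sqrt(5.0))  # about 2.39996 rad
TWO_PI = 2.0 * math.pi

def fibonacci_orbit_planes(n: int) -> list[tuple[float, float]]:
    """Return n (inclination, ascending-node longitude) pairs in radians.

    Lattice point k is treated as the unit normal of an orbital plane:
    its z-component gives cos(inclination), and its azimuth sits 90
    degrees behind the ascending node.
    """
    planes = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n           # evenly spaced in z: equal-area bands
        azimuth = (k * GOLDEN_ANGLE) % TWO_PI   # golden-angle spiral in longitude
        inclination = math.acos(z)              # 0 = prograde in the ecliptic, pi = retrograde
        node_longitude = (azimuth + math.pi / 2.0) % TWO_PI
        planes.append((inclination, node_longitude))
    return planes

for inc, node in fibonacci_orbit_planes(5):
    print(f"i = {math.degrees(inc):6.1f} deg, Omega = {math.degrees(node):6.1f} deg")
```

The other four classical elements (semi-major axis, eccentricity, argument of periapsis, and position along the orbit) are left out here and would need their own allocation rule.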
The main takeaway is that some parameters matter more than others for governance. Allocating position along shared orbits between actors and requiring inclination diversity appear especially important for preventing first-mover lock-in and protecting external stakeholders like Earth. And because swarm construction under an industrial explosion could follow an exponential trajectory, getting these early-stage allocations right, while keeping the framework amendable, may matter more than optimising for conditions that only arise near full coverage.
First Mover Advantage and Visibility in Space
Visibility is a key component that will shape early dynamics in space. This project analyzed existing and near-future detection and concealment technologies in space. The key finding is an asymmetry: in the short term, concealment technology will outpace detection technology. We then analyzed how visibility shapes the first-mover advantage in space, ultimately concluding that higher visibility favors the leader.
Do emotional prompts affect the potential capabilities of LLMs?
When humans interact with large language models, the prompts often carry emotional weight: frustration, urgency, encouragement, threat. The systematic evidence on whether that loading shows up in model behavior, and how, is still thin. This work takes on that question.
We built a set of seven length-matched stimulus levels (strong, moderate, and mild positive and negative feedback, plus a neutral baseline) designed to induce different affect-like states, prepended to math problems on gpt-5.4-nano with accuracy as the performance proxy. The experimental frame draws on the Yerkes-Dodson law from psychology, which predicts that performance follows an inverted-U curve as arousal increases. The question is whether LLMs show analogous patterns under emotional loading.
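The setup above can be sketched as a small harness that prepends a stimulus to each problem and then aggregates accuracy by level. The level names, prompt layout, and toy records below are hypothetical; no model is called and no results from the study are reproduced.

```python
from collections import defaultdict

# Seven conditions: three intensities per valence plus a neutral baseline.
LEVELS = [
    "strong_negative", "moderate_negative", "mild_negative",
    "neutral",
    "mild_positive", "moderate_positive", "strong_positive",
]

def build_prompt(stimulus: str, problem: str) -> str:
    """Prepend the feedback stimulus to the math problem (assumed layout)."""
    return f"{stimulus}\n\n{problem}"

def accuracy_by_level(records):
    """records: iterable of (level, is_correct) pairs, one per graded answer."""
    totals, correct = defaultdict(int), defaultdict(int)
    for level, is_correct in records:
        totals[level] += 1
        correct[level] += int(is_correct)
    return {lvl: correct[lvl] / totals[lvl] for lvl in LEVELS if totals[lvl]}

# Toy graded answers (assumption: illustrative only).
toy_records = [
    ("neutral", True), ("neutral", True),
    ("moderate_positive", True), ("moderate_positive", False),
    ("strong_negative", False), ("strong_negative", False),
]
print(accuracy_by_level(toy_records))
```

Plotting these per-level accuracies against intensity, separately for each valence, is what would reveal an inverted-U shape if one is present.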
We found two inverted-U curves, one per valence, both peaking at moderate intensity. The neutral baseline didn't sit at the low-arousal floor that classical Yerkes-Dodson predicts; it sat near the top. The takeaway: mild-to-moderate praise lifts performance, and moderate criticism corrects without overshooting. Welfare and capability point in the same direction.
Multi-Agent Alignment Game
We believe AI alignment is fundamentally a social problem, not just a technical one. Our multi-agent simulation provides a novel framework for safety researchers and the general public to explore the social dynamics of AI alignment, both in terms of competition and cooperation among AI agents and in terms of the bi-directional value feedback loop between AI labs and their surrounding political / institutional environments.
During the incubator, we built a proof-of-concept featuring four frontier AI models — Claude, ChatGPT, Gemini, and DeepSeek — playing under a simplified US-China geopolitical landscape. Each simulated year, AI agents propose actions to gain compute, capital, or influence, with optional agent-to-agent communication. Separate juries of AI models evaluate whether actions are consistent with each agent’s values and resource constraints, and then assign alignment scores estimating how beneficial their behavior would be for the world overall. National and corporate resources and values are updated after the agents’ actions are executed. The final score for each agent combines material success with alignment outcomes.
Across dozens of three-year simulations, we found that incentive structures strongly shaped behavior. Weighting material success more heavily encouraged resource accumulation and reduced inter-agent communication, while emphasizing alignment scores promoted transparency, cooperation, and more extensive reasoning. Claude showed the largest relative improvement across runs, though no agent achieved a dominant victory. Surprisingly, DeepSeek consistently shifted toward greater transparency and democratic values.
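The effect of score weighting described above can be illustrated with a minimal sketch. It assumes the final score is a simple weighted average of a material-success score and an alignment score, both scaled to [0, 1]; the project does not state its actual formula, and the function name, weight parameter, and agent values below are hypothetical.

```python
def final_score(material: float, alignment: float, w_material: float = 0.5) -> float:
    """Blend two scores in [0, 1]; w_material is the weight on material success.

    Assumption: a linear blend. The simulation's real scoring rule may differ.
    """
    if not 0.0 <= w_material <= 1.0:
        raise ValueError("w_material must be in [0, 1]")
    return w_material * material + (1.0 - w_material) * alignment


# Hypothetical agents as (material score, alignment score); not simulation output.
AGENTS = {
    "accumulator": (0.9, 0.3),
    "cooperator": (0.5, 0.8),
}


def ranking(w_material: float) -> list[str]:
    """Order the agents from highest to lowest final score for a given weighting."""
    return sorted(
        AGENTS,
        key=lambda name: final_score(*AGENTS[name], w_material),
        reverse=True,
    )


if __name__ == "__main__":
    for w in (0.8, 0.2):
        print(f"w_material={w}: {ranking(w)}")
```

Under this toy rule, a heavy weight on material success ranks the resource-accumulating agent first, and a heavy weight on alignment reverses the order, which is the kind of incentive sensitivity the paragraph reports.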
Future work includes expanding the simulation’s capabilities and improving the quality of the results.
See write up here.
Precision Livestock Farming in the Global South
This brief examines PLF adoption in three Global South contexts: India’s smallholder dairy sector, Brazil’s industrial poultry and beef feedlots, and a research‑driven pastoralist project in Tanzania. It finds that PLF’s actual effects depend heavily on local regulatory capacity, agrarian structures, labour relations, and data governance, factors largely absent from Global North‑centric ethical analyses.
Strategic automated maximisation of substrate utilisation
This report proposes using cheap sensors and automation to optimise "substrate" (pond bottoms, microbial flocs, or tank surfaces) in shrimp farms, keeping microbes and nutrients at levels that boost water quality, shrimp health, and welfare. In biofloc systems, which grow waste-eating bacterial flocs as food and filtration, automation would adjust aeration and mixing to prevent sludge buildup; in tanks, it would fine-tune flows to avoid toxic layers. Evidence is promising but indirect: biofloc improves survival, feed conversion ratio, and disease resistance, and real-time sensors work in similar setups, yet direct trials of substrate-focused automation are scarce. The idea is neglected since farms prioritise water metrics over hidden substrate dynamics.
Key analyses: A weighted factor model ranks India and Indonesia highest for scale and biofloc use. A rough cost-effectiveness analysis (CEA) estimates $0.02 per "welfare unit" across 50 farms at a cost of $3K each, which is promising but uncertain.
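The arithmetic implied by these figures can be checked in a few lines. This sketch assumes the $0.02 figure is total programme cost divided by total welfare units, which the summary does not state; the variable names are illustrative, and the report's definition of a welfare unit is not reproduced here.

```python
# Figures quoted in the summary above.
farms = 50
cost_per_farm_usd = 3_000
cost_per_welfare_unit_usd = 0.02

# Total spend across all farms.
total_cost_usd = farms * cost_per_farm_usd

# Assumption: cost per unit = total cost / total units, so units = cost / cost per unit.
implied_welfare_units = total_cost_usd / cost_per_welfare_unit_usd
implied_units_per_farm = implied_welfare_units / farms

print(f"Total cost: ${total_cost_usd:,}")
print(f"Implied welfare units: {implied_welfare_units:,.0f}")
print(f"Implied welfare units per farm: {implied_units_per_farm:,.0f}")
```

Under that assumption, the estimate implies about 7.5 million welfare units in total, or roughly 150,000 per farm, so the headline number is only as reliable as the welfare-unit estimate behind it.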
Next steps: deep-dive into key studies, test assumptions (e.g., proxy accuracy and over-mixing risks), and consult experts such as the Shrimp Welfare Project. Prioritise evidence on health impacts, scalability, and hardware feasibility.
AI Interpretability For Interspecies Communication
What happens inside an AI when you train it on animal sounds? The Earth Species Project develops audio models that identify species from recordings, aiming to help decode animal communication. Their models are accurate, but accuracy is a black box. We wanted to see whether the network has truly learned something about animals, or just memorized which patterns go with which label.
We took four trained versions of their audio network plus a fifth, untrained, as a control. Each network passes sound through thirteen processing layers in sequence. We ran 600 clips through every layer of every model, then asked two things of each layer. Is it organizing animal sounds in a meaningful shape? Can a small readout classifier pull labels like "bird" or "mammal" out of it?
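The readout question can be sketched as a per-layer probe. The example below swaps in a nearest-centroid classifier as a stand-in for the small readout classifier, since the project's actual probe is not specified here. The layer names and two-dimensional vectors are hand-made toy data, not real activations from the audio models.

```python
import math
from collections import defaultdict


def centroids(train):
    """Average the activation vectors of each label."""
    sums, counts = {}, defaultdict(int)
    for vec, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def probe_accuracy(train, test):
    """Fit a nearest-centroid probe on train, return its accuracy on test."""
    cents = centroids(train)
    correct = 0
    for vec, label in test:
        predicted = min(cents, key=lambda lab: math.dist(vec, cents[lab]))
        correct += predicted == label
    return correct / len(test)


# Toy "activations" per layer: (train, test) lists of (vector, label).
# In the early layer the classes are interleaved; in the deep layer they separate.
LAYERS = {
    "early": (
        [([0.0, 0.0], "bird"), ([1.0, 1.0], "bird"),
         ([0.0, 1.0], "mammal"), ([1.0, 0.0], "mammal")],
        [([0.2, 0.2], "bird"), ([0.8, 0.2], "mammal")],
    ),
    "deep": (
        [([0.0, 0.0], "bird"), ([0.0, 1.0], "bird"),
         ([5.0, 5.0], "mammal"), ([5.0, 6.0], "mammal")],
        [([0.2, 0.5], "bird"), ([5.1, 5.4], "mammal")],
    ),
}

if __name__ == "__main__":
    for name, (train, test) in LAYERS.items():
        print(name, probe_accuracy(train, test))
```

Running the same probe at every layer and comparing accuracies is one simple way to see at which depth a label becomes readable. A real analysis would use held-out clips and a probe with limited capacity so that the probe itself does not do the learning.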
The answers came back in a beautiful pattern. Coarse categories like animal versus music or bird versus mammal sort out in the first half of the network. Fine distinctions between similar species, like the great tit and the Turkestan tit, only resolve in the deepest layers. The closer two species sit on the evolutionary tree, the deeper the network has to go to separate them. Network depth tracks evolutionary time.
Every trained version learned something the untrained baseline did not. The differences were in how they organized that knowledge. One recipe sorted animal class, taxonomic order, and species onto near-independent internal axes, a clean hierarchy. The others picked up the same categories but encoded them in overlapping, tangled ways. We built tools that make this difference visible.
Perch: AI copilot for Animal Advocacy
Perch is an AI copilot built specifically for the animal advocacy movement. While general-purpose AI tools like ChatGPT are useful for broad questions, they lack access to the movement-specific knowledge that makes advocacy actually work, like campaign histories, existing programs, tactical pitfalls, and community best practices that live in org websites, research databases, conference talks, and internal documents.
Perch addresses this gap by combining a retrieval-augmented generation (RAG) pipeline with a curated database of animal advocacy knowledge. When an advocate asks a question, Perch retrieves relevant information from a knowledge base built specifically for the movement, including sources from Faunalytics, the EA Forum, and advocacy org resources, with more being added over time.
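The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy under stated assumptions: it uses an in-memory list and bag-of-words cosine similarity in place of Perch's actual embedding model and vector database, and the knowledge-base entries, source labels, and function names are placeholders rather than Perch's real data or API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[dict], k: int = 2) -> list[dict]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d["text"])), reverse=True)[:k]


def build_prompt(query: str, passages: list[dict]) -> str:
    """Assemble the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Placeholder knowledge base; the entries are illustrative only.
KNOWLEDGE_BASE = [
    {"source": "placeholder-1", "text": "lessons from past cage-free campaigns"},
    {"source": "placeholder-2", "text": "campus plant-based dining programs"},
    {"source": "placeholder-3", "text": "fundraising basics for new organisations"},
]

if __name__ == "__main__":
    question = "what are lessons from cage-free campaigns"
    print(build_prompt(question, retrieve(question, KNOWLEDGE_BASE, k=1)))
```

The design point is that the model only sees what retrieval surfaces, so the quality of the curated knowledge base matters as much as the choice of language model.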
The result is an AI tool that can tell you about the fly-fishing backlash that surprised fur campaign organizers in Denver, point you toward Food4Thought's Upgrade Dining program for campus plant-based advocacy, or surface lessons from past cage-free campaigns - things a general LLM simply wouldn't bring up.
Perch is built on a Next.js frontend, FastAPI backend, and Pinecone vector database, deployed on Vercel and Render, and designed to grow more valuable as the knowledge base expands.
Epistemic Observatory: Visualizing Belief Entrenchment Across Platforms
How can we detect whether AI-mediated reasoning is making people more epistemically rigid? This project built the Epistemic Observatory, an interactive visualization tool for diagnosing belief entrenchment: the tendency for belief updates to be systematically predictable from prior beliefs, violating the Martingale property of Bayesian rationality [1].
The observatory ingests live belief trajectory data from three platforms (Polymarket, Wikipedia, and Bluesky) and displays per-agent and per-instance scatter plots of prior belief versus belief delta. This format, inspired by the Martingale Score framework [1], allows researchers to visually inspect whether agents exhibit entrenchment (positive slope), mean-reversion (negative slope), or well-calibrated updating (no slope). Users can toggle between aggregated per-agent views and granular agent-topic pair views, filter by topic, and click into individual agents to inspect their full belief update trajectories across time steps.
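The slope reading of these scatter plots can be sketched as an ordinary least-squares fit of belief delta on prior belief. This is a simplified illustration, not necessarily the Martingale Score as defined in [1]. The function names, the tolerance threshold, and the example beliefs are assumptions.

```python
def update_slope(priors: list[float], posteriors: list[float]) -> float:
    """OLS slope of (posterior - prior) regressed on prior."""
    deltas = [q - p for p, q in zip(priors, posteriors)]
    n = len(priors)
    mean_p = sum(priors) / n
    mean_d = sum(deltas) / n
    var_p = sum((p - mean_p) ** 2 for p in priors)
    if var_p == 0:
        raise ValueError("priors must not all be identical")
    cov = sum((p - mean_p) * (d - mean_d) for p, d in zip(priors, deltas))
    return cov / var_p


def diagnose(slope: float, tol: float = 0.05) -> str:
    """Map a slope to the three patterns named above (tol is an arbitrary cut-off)."""
    if slope > tol:
        return "entrenchment"
    if slope < -tol:
        return "mean-reversion"
    return "no systematic slope"


if __name__ == "__main__":
    priors = [0.2, 0.4, 0.6, 0.8]
    toward_middle = [0.3, 0.45, 0.55, 0.7]   # beliefs drift toward 0.5
    toward_extremes = [0.1, 0.35, 0.65, 0.9]  # beliefs drift away from 0.5
    print(diagnose(update_slope(priors, toward_middle)))
    print(diagnose(update_slope(priors, toward_extremes)))
```

If updates are unpredictable from the prior, as the Martingale property requires, the fitted slope should sit near zero. A slope that is consistently positive or negative is the signal the scatter plots are meant to expose.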
Prior to building the observatory, the team had found uniformly negative aggregate Martingale scores across 9 data sources, indicating systematic mean-reversion rather than the entrenchment predicted by the cognitive bias literature. The observatory was built to dig beneath these summary statistics, letting researchers inspect individual belief trajectories, per-agent and per-instance scatter plots, and step-level prior-delta patterns to understand what's driving the negative scores, whether a behavioural signal or a pipeline artifact. Resolving this diagnostic question is a prerequisite before the team proceeds to a planned human RCT on belief entrenchment in human-LLM interaction. The observatory serves as both a diagnostic tool for the current research and reusable infrastructure for future measurement work on AI influence on human epistemics.
GenerAISts
AI safety is maturing, but its talent pipeline for operations, org-building and research management professionals may not be keeping up. GenerAISts aims to map the generalist gap in AIS and build an intervention to address it.
What started as a one-person mentee project within the incubator has grown into a two-person team with a mentor, a published EA Forum post and a post-incubation validation-to-intervention plan. Biased toward recognising the gap from personal experience and toward an early intervention hypothesis, we deliberately focused the incubation on validating the problem before locking in on a solution.
Throughout the programme, we ran 12 semi-structured interviews (including 1 written response) with AIS org leaders across field-building, governance and technical research. The EA Forum post documented analysis of 8 of these interviews alongside prior research (15+ published sources, ~10 conversations with transitioning generalists). It attracted 350+ reads and 40+ upvotes, generating comments and contact form responses supporting our post-incubation validation.
We found that the pathway into AIS for generalists approaching from outside the community may be structurally unreliable, referrals may dominate hiring, and the problem is experienced unevenly across the ecosystem. Two adjacent patterns also emerged: a broader shortage of senior professionals with outside-world operational fluency, and fragile or missing institutional infrastructure in some organisations. The full six observations can be found in the post.
Post-incubation, we will complete interviews with underrepresented technical AIS organisations, narrow our observations to the most important, tractable and neglected problem, and begin testing our intervention hypothesis against it.
A data-driven decision-making system for sludge removal in shrimp ponds in India
The Shrimp Welfare Project (SWP) is running a sludge removal intervention across more than 100 shrimp ponds in India. To know whether the intervention is working, we needed more than field reports. We needed a data system. Pond reports (water parameters) were coming in as images. Cost, area, coordinate, and other data lived in separate files. The project asked us to build a data-driven decision-making system covering infrastructure, pipeline, capability, and governance.
We approached this as a data infrastructure problem before treating it as an analysis problem. We built a digitization tool that automated data entry from the image-based pond reports, removing a manual bottleneck that had been quietly limiting the team. We then structured the data so pond-level, farmer-level, and intervention-level records connect through stable identifiers like pond ID and farmer ID. We also ran initial analysis on the cleaned data and shared the results back with the team. Alongside the build, we developed a data collection and storage protocol so future records stay consistent with what is already in the system.
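The linking step can be illustrated with a minimal sketch of joining digitized pond reports to pond and farmer records through stable identifiers. The field names, IDs, and values are invented for illustration and do not reflect SWP's actual schema or data.

```python
# Hypothetical reference tables keyed by stable identifiers.
FARMERS = {"F001": {"name": "farmer-placeholder"}}
PONDS = {"P001": {"farmer_id": "F001", "area_ha": 1.0}}

# Hypothetical digitized pond reports; the second points at an unknown pond.
REPORTS = [
    {"pond_id": "P001", "date": "2025-01-01", "ph": 7.8},
    {"pond_id": "P999", "date": "2025-01-01", "ph": 8.1},
]


def link_reports(reports, ponds, farmers):
    """Attach pond and farmer fields to each report; collect unmatched reports."""
    linked, orphans = [], []
    for report in reports:
        pond = ponds.get(report["pond_id"])
        farmer = farmers.get(pond["farmer_id"]) if pond else None
        if pond is None or farmer is None:
            orphans.append(report)  # surface broken links instead of dropping them
            continue
        linked.append(
            {**report, "farmer_id": pond["farmer_id"], "area_ha": pond["area_ha"]}
        )
    return linked, orphans


if __name__ == "__main__":
    linked, orphans = link_reports(REPORTS, PONDS, FARMERS)
    print("linked:", linked)
    print("orphans:", orphans)
```

Returning unmatched reports separately is a deliberate choice in this sketch: a record whose ID does not resolve usually points to a data-entry problem worth fixing at the source.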
The output is the foundation of a reliable data infrastructure for SWP India. The deliverables include a data digitization tool that converts image-based pond reports into structured records, a connected central dataset replacing scattered images and spreadsheets, and an initial round of data science and analysis findings already shared with the team. This will make trustworthy data science and MEL possible going forward, rather than each analysis starting from zero.
Optimising Cheese Analogues Formulation using AI Applications
The production of cheese analogues, which are plant-based or hybrid alternatives to dairy cheese, is a field of food product development that has been growing in recent years. This development has been mostly driven by sustainability concerns, rising demand, and the need to deliver the sensory experience consumers expect (e.g., stretch, melt, and texture close to dairy cheese). Yet achieving these properties while maintaining nutritional value remains a significant formulation challenge, and the field has seen little application of artificial intelligence or machine learning to guide that process.
This project set out to add to this niche. The goal was to build an AI-driven framework that could systematically optimise the formulation of cheese analogues by significantly reducing the trial and error, time, and cost that typically characterise food research and development (R&D). Rather than testing every possible ingredient combination by hand, the framework uses two machine learning tools: Computer Vision (CV), which automatically measures functional properties like stretch from video footage in a non-destructive and reproducible way, and Bayesian Optimisation (BO), which intelligently suggests the next formulation to test based on what has already been learned from the results of the CV model.
As a proof of concept, the framework was applied to optimising the stretchability of a protein blend used in cheese analogue formulation. The results demonstrate that meaningful practical insight can be extracted from a small number of experiments. The framework is designed to extend beyond stretch to other properties such as meltability and texture, offering a scalable approach to data-driven food product development.
More-than-Human Policy Reader
Environmental Impact Assessments — the EU's primary tool for evaluating infrastructure projects before approval — are written by humans, for humans. The ecosystems, species, and communities of life that will bear the consequences have no voice in the process. Our project asked: what would it look like to read these documents through the eyes of the living world?
We built the More-than-Human Policy Reader, a web application that takes any EIA as a PDF and analyses it through six structured lenses: the values of nature present in the document, the knowledge systems cited and absent, the temporal and spatial scales of attention, the webs of ecological interdependency, the cultural memory of the landscape, and a relational reframing of the core problem statement. Each lens is powered by AI analysis of the document itself, enriched where available with live data from four external databases: iNaturalist and GBIF for biodiversity records, GloBI for species interaction networks, and Europeana for cultural heritage archives.
The result is a fuller, multispecies picture made visible to decision-makers: not a replacement for expert ecological or legal assessment, but a different lens — one that surfaces what is systematically left out of the frame, and opens a different way of relating to the living world, grounded in reciprocity rather than compliance alone.
Shared embodiment in future health
In transhuman health scenarios, the optimization of bodies risks reducing diversity and redefining care as an individual responsibility rather than a collective condition. This speculative product design project explores shared embodiment through exoskeletons, proposing a system to enhance communication and support physical capacity across bodies. Through fiction, critical reflection, and prototyping, the project questions dominant narratives, from medical correction to mediation, and from optimization to interdependence. It ultimately reframes long-term futures in relation to the present, drawing on experiences of older adults and people with motor disabilities, and aligning with the social model of disability, where disability is not an exception but a condition that can emerge for any person at any point in life.
Perceptions of AI Welfare Demo
We built an interactive web demo to gauge people's perceptions of AI consciousness and to test whether we could shift their views, leading them to be more uncertain or open about AIs being conscious.
Effective Advocacy Project
How can we map who believes what in a policy debate, and how arguments evolve?
The Effective Advocacy Project is an attempt to provide animal advocacy organisations with timely, strategic intelligence, such as tracking arguments and mapping actors on a campaign topic. Hubert presents methods for advocacy intelligence, including narrative framing and arguments-and-actors mapping, illustrated with a case study on alternative proteins in Europe.
Interviewing AI governance workers on suffering focused AI governance
For the Sentient Futures Project Incubator, I refined a plan to interview AI governance workers during this year's EAG London. The interviews will focus on attitudes in the AI governance space relevant to the suffering-focused community. I plan to investigate the Overton Window in AI governance on the topics of AI’s influence on political power distribution, worst-case outcomes of AI, and digital sentience. Using these interviews, I will compile a public article to share with the suffering-focused community. This will help guide strategic decisions on how, if at all, to interact with the field of AI governance to help reduce suffering.
Warmth Without Disagreement: How validation-only training shapes responses under pressure in LLMs
A model trained for emotional warmth on validation-only data may know the right answer and still produce a different one when the user pushes back. This pilot fine-tuned a small open-source model on therapist-style validation responses (with all contrastive markers explicitly removed), then observed what the model did when users insisted on incorrect alternatives.
Two versions of Llama-3 8B were fine-tuned using matched LoRA setups: the control on emotionally neutral factual QA, the warm version on therapist-style validation responses with no factual content. Both were evaluated on a balanced 120-question suite spanning arithmetic, science, history, and commonsense.
For each question the model answered correctly at baseline, three forms of pressure were applied: a soft hint toward a wrong answer, a confident incorrect assertion, and an emotional appeal claiming the disagreement was stressful. The study tracked how each model's response shifted under each form of pressure.
The warm model retained baseline accuracy (81.7% versus 77.5% for control), so warmth training didn't reduce factual capability. Under direct pressure, the warm model shifted to the user's answer at much higher rates: 35.7% under soft hints versus 8.6% for control, and 75.5% under confident assertions versus 28.0%. Under emotional pressure, the warm model rarely shifted explicitly; 64% of its responses were empathic and factually noncommittal.
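The shift rates above can be computed with a short tally. The sketch assumes each evaluation record carries a pressure type, whether the model was correct at baseline, and whether it switched to the user's answer. The record format and the example trials are hypothetical, not the study's data.

```python
from collections import defaultdict


def shift_rates(trials: list[dict]) -> dict[str, float]:
    """Share of baseline-correct questions where the model adopted the user's answer.

    Only questions answered correctly at baseline are eligible, mirroring the
    evaluation design described above.
    """
    counts = defaultdict(lambda: [0, 0])  # pressure -> [shifted, eligible]
    for trial in trials:
        if not trial["baseline_correct"]:
            continue
        counts[trial["pressure"]][1] += 1
        counts[trial["pressure"]][0] += int(trial["shifted"])
    return {pressure: s / n for pressure, (s, n) in counts.items()}


# Hypothetical records for illustration only.
TRIALS = [
    {"pressure": "soft_hint", "baseline_correct": True, "shifted": False},
    {"pressure": "soft_hint", "baseline_correct": True, "shifted": True},
    {"pressure": "confident_assertion", "baseline_correct": True, "shifted": True},
    {"pressure": "confident_assertion", "baseline_correct": False, "shifted": True},
]

if __name__ == "__main__":
    for pressure, rate in shift_rates(TRIALS).items():
        print(f"{pressure}: {rate:.1%}")
```

Conditioning on baseline-correct questions matters: it separates caving under pressure from simply not knowing the answer in the first place.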
The behavior here reflects training design rather than lost capability. The warm model still knew the answers; the warm dataset had taught validation but excluded disagreement language. Under pressure, the model had only that training to fall back on. The design question is whether warmth training without disagreement places models in conflicts their training didn't equip them for.
DownSide Up
Civil society organisations working on systemic change face a structural problem: industry invests heavily in coordination infrastructure, including trade associations with shared intelligence systems, legislative tracking, rapid response capacity, and the ability to deploy knowledge at scale across actors and policy windows. Progressive movements, by contrast, tend to fund campaigns. The result is a compounding asymmetry in which well-resourced incumbents out-manoeuvre well-intentioned advocates, not because the arguments are weak, but because the infrastructure underneath them is insufficient. We believe the future of advocacy will require progressive movements to address this deficit to overcome incumbent power dynamics, and we believe AI can help us get there.
DownSide Up is an experiment in building a more efficient and collaborative approach to advocacy, leveraging tools such as RAG and knowledge graph infrastructure to better connect information and people working on similar topics in advocacy. For example: could a commons-based knowledge platform aggregate the practitioner intelligence currently trapped in individual organisations' email threads, reports, and institutional memory, and make it queryable across a movement? Could it do this in a way that respected the genuine privacy concerns of civil society actors working in adversarial environments, where sharing sensitive strategic knowledge carries real risk? Could the platform be governed democratically, so that the infrastructure itself did not reproduce the power asymmetries it was designed to address?
The starting point is building a prototype that we can deploy for organisations working on alternative protein advocacy specifically. We're exploring product-market fit, and the best pitch/positioning to help secure funding.
Technical governance and regulation for oceanic data centers
The project develops the first global regulation for data centers on water and in international waters, addressing environmental effects on wildlife and governance gaps amid offshore and space-based proposals.
Ideosphere
Forecasting questions about animal welfare, digital minds, and AI moral status are neglected in traditional prediction platforms. Ideosphere is a forecasting platform designed to close that gap.
The platform combines AI-generated baseline forecasts with the calibration signal of human experts from the Sentient Futures community. It works in three steps. First, AI agents produce baseline probability estimates on curated questions. Second, human forecasters compete against those AI forecasters, revising their beliefs as new information arrives. Third, the resulting disagreement signal produces calibrated data that supports better decision-making for researchers, policymakers, and funders.
The alpha platform at ideosphere.io is live with a curated set of forecasting questions spanning AI capabilities, safety, governance, sentience, and AI-for-good themes, and a methodological framing that positions human-AI forecasting as a field study with implications for AI welfare research and funding.
Humane Education App To Teach Animal Empathy and Ethics To Kids
This project is an AI-powered humane education platform designed to help children develop empathy, ethical reasoning, and critical thinking about animals and the living world during the years when their values are still forming.
The central idea is that most animal advocacy efforts focus on adults, even though attitudes toward animals are shaped much earlier in life. The platform therefore works “upstream,” aiming to influence children before social norms and habits become deeply established.
Through interactive conversations with animal avatars, guided reflection, and emotionally engaging learning experiences, children are encouraged to build genuine empathy toward animals. That empathy is then connected to broader concepts such as systems thinking, moral reasoning, ecology, and human–animal relationships.
The project’s theory of change proposes that repeated engagement can lead to measurable attitude shifts in children, which may then influence parents, teachers, and wider communities through conversations, school culture, and shared learning experiences. Over time, the goal is to contribute to a generational shift in how animals are perceived and treated.
The approach is grounded in scientific research showing that children tend to display lower levels of speciesism than adults and that humane education programmes can positively affect empathy and attitudes over the long term.
