Post-AGI transition/Transformative AI Longtermism and future suffering

Jasmine Brazilek

A data-first approach to training compassionate AIs

CaML

Bio

I cofounded CaML about a year ago. Since then, CaML has received funding from many sources, including SFF and Longview. We hope to publish our first paper with our research findings soon, and we have already published the first animal harms benchmark on Inspect, the AHB. When I started CaML I had no background in ML, only general security engineering, and I have grown into it by learning and doing.

Mentee must-haves/nice-to-haves

ML skills (should); understanding of the ML field (must); some software engineering background and Python proficiency (must); understanding of the issues and applications of mechanistic interpretability (ideal)

Mentee role

Structure their approach and convince me why it is the most promising one. Possibly begin proof-of-concept work on the approach they choose.

Mentor support

Shaping direction, accountability

Questions for applicants

What is your background with ML? Are you proficient in writing and reading Python code? What excites you about this project?

Mentor-led project

Which data elicits compassion and why?

I have multiple datasets that elicit compassion to different extents, as measured through benchmarks. What is different about these datasets? There are multiple ways to tackle this problem, from mechanistic interpretability to dataset composition analysis. I want to hear thoughts on the best way of doing this, and possibly see a POC produced by the mentee showing how they would approach the problem. Some open source libraries can assist with this.
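As one illustration of the dataset composition route, here is a minimal sketch of a first-pass comparison. It ranks words by a smoothed log-ratio of their relative frequencies in two datasets, so that vocabulary over-represented in one of them stands out. Everything in it is an assumption made for illustration: the two toy datasets, the names `high_compassion` and `low_compassion`, the helper functions, and the choice of a word-frequency statistic itself. It is not CaML's method or data, and a real analysis would likely need richer features than single words.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(dataset):
    """Count word occurrences across every example in a dataset."""
    counts = Counter()
    for example in dataset:
        counts.update(tokenize(example))
    return counts


def log_ratio_scores(dataset_a, dataset_b, smoothing=1.0):
    """Score each word by log(P(word | A) / P(word | B)) with add-one style smoothing.

    Positive scores mean the word is relatively more frequent in dataset A.
    """
    counts_a = word_counts(dataset_a)
    counts_b = word_counts(dataset_b)
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + smoothing * len(vocab)
    total_b = sum(counts_b.values()) + smoothing * len(vocab)
    scores = {}
    for word in vocab:
        p_a = (counts_a[word] + smoothing) / total_a
        p_b = (counts_b[word] + smoothing) / total_b
        scores[word] = math.log(p_a / p_b)
    return scores


def distinctive_words(dataset_a, dataset_b, top_k=5):
    """Return the top_k words most over-represented in A relative to B."""
    scores = log_ratio_scores(dataset_a, dataset_b)
    # Sort by descending score, breaking ties alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical toy data standing in for two training datasets.
high_compassion = [
    "The hen felt pain and deserved care.",
    "We should reduce suffering for the animals.",
]
low_compassion = [
    "The hen produced eggs for the farm.",
    "We should reduce costs for the farm.",
]

scores = log_ratio_scores(high_compassion, low_compassion)
for word, score in distinctive_words(high_compassion, low_compassion):
    print(f"{word}: {score:.3f}")
```

A word-level statistic like this is cheap to run and easy to inspect, which makes it a reasonable starting point before moving to heavier tools such as embedding comparisons or interpretability methods. It will miss differences in framing, tone, or structure that do not show up as vocabulary shifts.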