AI Interpretability For Interspecies Communication

What happens inside an AI when you train it on animal sounds? The Earth Species Project develops audio models that identify species from recordings, aiming to help decode animal communication. Their models are accurate, but accuracy alone says little about what a network has learned internally. We wanted to see whether the network has truly learned something about animals, or just memorized which patterns go with which label.

We took four trained versions of their audio network plus a fifth, untrained version as a control. Each network passes sound through thirteen processing layers in sequence. We ran 600 clips through every layer of every model, then asked two things of each layer. Does it arrange animal sounds in a meaningful geometry? Can a small readout classifier pull labels like "bird" or "mammal" out of it?
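The second question, whether a small readout can pull a label out of a layer, can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the activations are synthetic stand-ins, the ridge-regression readout is just one simple choice of probe, and the names fit_linear_probe, probe_accuracy and probe_layers are hypothetical.

```python
import numpy as np

def fit_linear_probe(X, y, l2=1e-2):
    """Fit a ridge-regression readout from activations X to one-hot labels y."""
    n_classes = int(y.max()) + 1
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    Y = np.eye(n_classes)[y]                       # one-hot targets
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])       # regularized normal equations
    return np.linalg.solve(A, Xb.T @ Y)

def probe_accuracy(W, X, y):
    """Fraction of clips whose highest-scoring class matches the label."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return float(np.mean(np.argmax(Xb @ W, axis=1) == y))

def probe_layers(layer_acts, y, train_frac=0.7, seed=0):
    """Train one probe per layer and report held-out accuracy for each."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(train_frac * len(y))
    train, test = idx[:cut], idx[cut:]
    accuracies = []
    for X in layer_acts:
        W = fit_linear_probe(X[train], y[train])
        accuracies.append(probe_accuracy(W, X[test], y[test]))
    return accuracies

# Synthetic stand-in (an assumption): 600 clips, 13 layers, and a binary
# label (say bird vs. mammal) that becomes easier to read out with depth.
rng = np.random.default_rng(0)
n_clips, n_layers, dim = 600, 13, 32
labels = rng.integers(0, 2, size=n_clips)
sign = 2.0 * labels - 1.0                          # map {0, 1} to {-1, +1}
layer_acts = []
for k in range(n_layers):
    strength = k / (n_layers - 1)                  # 0 at first layer, 1 at last
    X = rng.normal(size=(n_clips, dim))
    X[:, 0] += 3.0 * strength * sign               # inject label signal on one axis
    layer_acts.append(X)

accuracies = probe_layers(layer_acts, labels)
for k, acc in enumerate(accuracies):
    print(f"layer {k:2d}: held-out probe accuracy {acc:.2f}")
```

With real data, layer_acts would hold one activation matrix per layer (clips by features), and the same loop would be repeated for each trained model and for the untrained control. A probe that does no better on a trained model than on the control suggests the label is not something training put there.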

The answers came back in a clear pattern. Coarse categories like animal versus music or bird versus mammal sort out in the first half of the network. Fine distinctions between similar species, like the great tit and the Turkestan tit, only resolve in the deepest layers. The closer two species sit on the evolutionary tree, the deeper the network has to go to separate them. In that sense, network depth tracks evolutionary time.

Every trained version learned something the untrained baseline did not. The differences lay in how they organized that knowledge. One training recipe sorted animal class, taxonomic order, and species onto near-independent internal axes, forming a clean hierarchy. The others picked up the same categories but encoded them in overlapping, tangled ways. We built tools that make this difference visible.
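One simple way to put a number on "near-independent" versus "tangled" is to estimate a direction for each category and measure how much those directions overlap. The sketch below is an illustration under stated assumptions, not the tooling described above: it uses synthetic activations, treats each taxonomic level as a binary label for simplicity, estimates each direction as a difference of class means, and all function names are hypothetical.

```python
import itertools
import numpy as np

def mean_difference_direction(X, y):
    """Unit vector pointing from the class-0 mean to the class-1 mean."""
    d = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return d / np.linalg.norm(d)

def axis_overlap(X, label_sets):
    """Absolute cosine similarity between the directions of each label set."""
    D = np.stack([mean_difference_direction(X, y) for y in label_sets])
    return np.abs(D @ D.T)

def make_activations(axes, label_sets, rng, noise=0.5):
    """Synthetic activations: each label shifts clips along its own axis."""
    n, dim = len(label_sets[0]), axes.shape[1]
    X = noise * rng.normal(size=(n, dim))
    for axis, y in zip(axes, label_sets):
        X += np.outer(2.0 * y - 1.0, axis)
    return X

def off_diagonal(M):
    """All entries of a square matrix except the diagonal."""
    return M[~np.eye(M.shape[0], dtype=bool)]

# Three balanced binary labels standing in for class, order and species
# (an assumption; real taxonomic labels have many values per level).
combos = np.array(list(itertools.product([0, 1], repeat=3)))
labels = np.tile(combos, (75, 1))                  # 600 clips, 3 labels each
label_sets = list(labels.T)

rng = np.random.default_rng(0)
dim = 16

# "Clean" model: each label lives on its own orthogonal axis.
clean_axes = np.eye(dim)[:3]

# "Tangled" model: the axes share components with one another.
raw = np.zeros((3, dim))
raw[0, :1] = 1.0
raw[1, :2] = 1.0
raw[2, :3] = 1.0
tangled_axes = raw / np.linalg.norm(raw, axis=1, keepdims=True)

clean_overlap = axis_overlap(make_activations(clean_axes, label_sets, rng), label_sets)
tangled_overlap = axis_overlap(make_activations(tangled_axes, label_sets, rng), label_sets)

print("clean model, mean off-diagonal overlap:  ", off_diagonal(clean_overlap).mean())
print("tangled model, mean off-diagonal overlap:", off_diagonal(tangled_overlap).mean())
```

Off-diagonal values near zero mean the categories sit on nearly orthogonal axes, while values approaching one mean that a single direction carries information about several categories at once.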