TL;DR: A holiday obsession turns into a deep meditation on all things pretty. Albatrosses and reward *models* included. Also, check out…
TL;DR: a seminar series on simulator theory; a toy model, the semiotic coin flip, unpacks the strange physics of language models a tiny bit.

December 2022

TL;DR: Writing about writing and having written, and self-referentiality. Complementary musings on AI.

November 2022

A neuroscientific perspective on translating concepts between world-models as a way of solving value-learning.

September 2022

TL;DR: Guest post by Michael Oesterle on coordination problems (and more) between advanced artificial agents.

July 2022

Via productiva. Audio version (16 min). TL;DR: Introspection on how I do things and which rules and heuristics help me to be productive. Framed as Taleb's via negativa…
TL;DR: I let my friend Ava (who actually knows a thing or two about art!) experiment with DALL-E 2 for a bit. She allowed me to share her reflections…
TL;DR: deep reflections on names and identity, life-changing decisions, and mental renovations. And all of that in ~500 words!
Inferring utility functions from locally non-transitive preferences. Audio version (14 min). TL;DR: Fanboying JvN, then a nuts-and-bolts description of the von Neumann-Morgenstern theorem. A connection to reward modeling…

June 2022

Task Decomposition And Scientific Inquiry. Audio version (11 min). TL;DR: A curious asymmetry between making and criticizing, the scientific method as an approach to task decomposition, and a…
TL;DR: If you're a student of cognitive science or neuroscience and are wondering whether it can make sense to work in AI Safety, this guide is for you…
Puberty as Cause X? Audio version. TL;DR: GiveWell-esque analysis of adolescents' suffering. Life satisfaction during puberty, ITN model, developmental neuroscience of the…