6 Comments

Thanks for such a nice intro to AI Safety research. I'm sure you've come across the recent news of a Google engineer claiming their language model (LaMDA) was sentient. There's been a lot written about it, but I was wondering whether there are attempts at devising a new Turing test to address this? Is this part of AI Safety research? Cheers,


Thank you for the comment, glad you enjoyed the post!

There is nothing that matches exactly what you’re asking for, unfortunately.

Kind of related is current work by Ajeya Cotra, who is thinking about what she calls “situational awareness” https://www.google.com/amp/s/www.cold-takes.com/why-ai-alignment-could-be-hard-with-modern-deep-learning/amp/

And there is this investigation by a colleague of mine that asks about “red flags for neural network suffering” https://www.lesswrong.com/posts/Bpw2HXjMa3GaouDnC/what-are-red-flags-for-neural-network-suffering

And this (long) report on moral patienthood and consciousness is a pretty good resource https://www.openphilanthropy.org/research/2017-report-on-consciousness-and-moral-patienthood/

But in general I think the entire field (and especially me) is just very confused about the question.


Thanks for the nice resources. Honestly, I had thought of suffering as a telltale sign myself. But I'm sure there are many thinking along those lines. The LaMDA transcripts are spooky. I want to dig further. Will get back to you if I find something.


I think it's worth linking in this post a good resource for people in neuroscience to get started with ML and deep learning: Neuromatch Academy. Their materials are all open source, freely available, and of good quality.

Also, why do you say "grudgingly" when mentioning Redwood Research? Is there some drama I'm unaware of?


Yes, 100%, Neuromatch is awesome! I haven't done their Deep Learning track myself (https://deeplearning.neuromatch.io/tutorials/intro.html) but I've heard great things. All the resources are online, so anyone interested can start at any time.

Re the Redwood Research footnote: thanks for pointing that out. That's a leftover from a previous draft where it was supposed to be funny 😅 But it no longer works and gives the wrong impression, so I removed the footnote. No drama; I think they are doing exciting work.


Thanks for clarifying! For some reason the comment form was buggy, so I was not able to add my bio, which said: "I am a soon-to-be psychiatrist, very into computer science and aspiring to contribute to this field when I finish my studies. My LW handle is the same if you want to get in touch!"

Not saying you should reach out to me; I'm just a student for now, but I want to signal that *these* people apparently exist :)
