TL;DR: If you're a student of cognitive science or neuroscience and are wondering whether it can make sense to work in AI Safety, this guide is for you! (Spoiler alert: the answer is "maybe yes").
Thanks for such a nice intro to AI Safety research. I'm sure you've come across the recent news of a Google engineer claiming their language model (LaMDA) was sentient. A lot has been written about it, but I was wondering whether there are attempts at devising a new Turing test to address this. Is this part of AI Safety research? Cheers,
Thank you for the comment, glad you enjoyed the post!
There is nothing that matches exactly what you're asking for, unfortunately.
Kind of related is current work by Ajeya Cotra, who is thinking about what she calls “situational awareness” https://www.google.com/amp/s/www.cold-takes.com/why-ai-alignment-could-be-hard-with-modern-deep-learning/amp/
And there is this investigation by a colleague of mine that asks about “red flags for neural network suffering” https://www.lesswrong.com/posts/Bpw2HXjMa3GaouDnC/what-are-red-flags-for-neural-network-suffering
And this (long) report on moral patienthood and consciousness is a pretty good resource https://www.openphilanthropy.org/research/2017-report-on-consciousness-and-moral-patienthood/
But in general I think the entire field (myself very much included) is just very confused about the question.
Thanks for the nice resources. Honestly, I had thought of suffering as a tell-tale sign myself, but I'm sure there are many people thinking along those lines. The LaMDA transcripts are spooky. I want to dig further and will get back to you if I find something.
I think it's worth linking in this post a good resource for people in neuroscience to get started with ML and deep learning: the Neuromatch Academy. Their materials are all open source, freely available, and of good quality.
Also, why do you say "grudgingly" when mentioning Redwood Research? Is there some drama I'm unaware of?
Yes, 100%, Neuromatch is awesome! I haven't done their Deep Learning track myself (https://deeplearning.neuromatch.io/tutorials/intro.html) but I've heard great things. All the resources are online, so anyone interested can start at any time.
Re the Redwood Research footnote: thanks for pointing that out. That's a leftover from a previous draft where it was supposed to be funny 😅 It doesn't work anymore and gives the wrong impression, so I removed the footnote. No drama; I think they are doing exciting work.
Thanks for clarifying! For some reason the comment form was buggy, so I was not able to add my bio, which said: "I am a soon-to-be psychiatrist, very into computer science and aspiring to contribute to this field once I finish my studies. My LW handle is the same if you want to get in touch!"
Not saying you should reach out to me, I'm just a student for now, but I want to signal that *these* people apparently exist :)