Inferring utility functions from locally…

Feb 10, 2022

TL;DR: Fanboying JvN, then a nuts-and-bolts description of the von-Neumann-Morgenstern theorem. A connection to reward modeling, some simulations, and pretty figures!

Read →

2 Comments

gwern

Jan 24, 2023

I'm not sure I understand what the neural net stuff at the end is about. If you are doing noisy sorting, that doesn't require neural networks, and there are lots of known algorithms: https://gwern.net/Resorter#noisy-sorting (You are right that noisy comparison sorting turns out to be much the same, complexity wise, as noise-free comparison sorting.) You can also make sorting and ranking directly differentiable (https://gwern.net/GPT-2-preference-learning#grover-et-al-2019), which would probably be much more efficient than trying to brute force it by iterating a MLP (wouldn't that approach bloat any NN you tried to do that inside of as a component? You'd have to unroll it for a small fixed number of iterations, I guess, and weight-tie it).

Expand full comment

Reply (1)

Jan Hendrik Kirchner

Jan 29, 2023

Hi there, sorry, only found this comment now. Those references are useful, thanks for sharing!

I'm not surprised that there are more straightforward methods for noisy sorting. I was more curious about learning the type of noisy sorting that reward modelling does (the set-up I arrive at is the same that is used in RLHF). A priori it wasn't clear to me whether a neural network would fail gracefully when confronted with non-transitive preferences (or what failing gracefully even means exactly).

Expand full comment

On Brains, Minds, And Their Possible Uses

Inferring utility functions from locally…