SDS Colloquium 12.26 - Richard Sutton
The last question from the audience was: now that large models are so popular, what should we RL researchers be doing, and which directions are promising? Sutton's answer:
So maybe you see what everyone else is doing and deliberately try to do something different. Or maybe you are thinking, everyone is doing this, I have to do it! They are both points of view. I reject both of them. You should ignore what is popular or not popular. You should do what makes sense to you. You should be neutral to popularity. It is sort of a hard thing to do, but I think it is the right answer.
The slides themselves offer some perspectives; the talk was about Sensorimotor Experience:
http://www.incompleteideas.net/Talks/experience.pdf
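Beyond that, there were a few take-home points. Sutton argued that LLMs cannot reach AGI because they have no goals, and that the way RL is currently applied in LLMs is still a rather degenerate case.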
This post is licensed under CC BY 4.0 by the author.