We are all going south

I attended Rich Sutton’s talk at Vector today. It was a great talk: he soliloquized about his research agenda and motivation, and used them to frame his work on SuperDyna, a general intelligence framework built on his work on options.

At the end of this post are some rough notes from the presentation, but first I want to discuss my key takeaways. I also had the opportunity to talk with him briefly over lunch. Some really salient points:

Sutton believes that reinforcement learning can explain all intelligence. In other words, intelligence can be elegantly and precisely formulated using RL. This is in contrast to Yann LeCun’s cake analogy, in which RL is only the cherry on top. In Sutton’s view, the entire cake is RL!
Sutton is a true generalist. He is pretty disdainful of building prior knowledge or biases into our models, and prefers that the model learn by itself.
This goes against the current trend in machine learning, where researchers and practitioners are incentivized and rewarded for achieving incremental advances. Usually, those advances come from building in domain knowledge and other “tricks”.
As Sutton puts it, “we are all going south”, when the true direction of research towards AGI is north!
To achieve true intelligence, we want the model to learn all of this knowledge and these priors itself.
Sutton believes that intelligence must arise from the setting of a goal. That is, intelligence is defined only with respect to a goal.
Sutton views evolution and learning as two separate processes! This came up when I asked him what the goal of an AGI might be. I believe the immediate corollary is that AGI
Sutton’s view of “safety” in RL is safety for the robot!

My own view is that while I agree in the abstract with Sutton’s stance on generality, we still need some level of pragmatism. A brute-force enumeration over the entire search space is the most general solution, but it is usually infeasible. So balancing generality and pragmatism should be our goal.

John