Babies, puppies, and kittens may be bundles of joy, but they are also agents of pure chaos. They knock things over, stick their fingers (or paws) where they're not supposed to, and taste or sniff anything they can reach, all just for fun, of course. Or is it "just for fun"? In the moment it is, but this playfulness also lets infants build the intuitive models of the physical world they need to cope with that world "seriously".
The paragraph above is not news to anyone who has witnessed the development of a young mind, but it does highlight several questions that are still cutting-edge in both cognitive science and robotics. What is an "intuitive" model of the world? How would such a thing be acquired? Roboticists add a further question -- how could such models be put to use to help robots cope with the open-ended, loosely structured world of human existence?
In the work we summarize here we have only begun to scratch at such issues, so a tempering of expectations is honest and necessary, but also illustrative of both the problems we face and the methods we choose to attack them with. Let us start with the term "curiosity": greed for the new, as the German word Neugier literally has it. What does it mean to be curious?
Curiosity, in the sense of greed for the new rather than oddity, is something that babies or kittens can have, but rocks cannot. That is, curiosity is a quality of an "agent": some being interacting in a purposeful way with its environment. In science and engineering, we interpret "purposeful way of interacting" to mean "acting so as to optimize a goal function". We can also measure information, and hence information gain. So, in scientific and engineering terms, "curiosity" is going to mean some version of "acting so as to maximize information gain about one's environment".
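As a toy illustration of what "measuring information gain" could mean (the numbers and names here are invented for this post, not taken from our system): if an agent's uncertainty about outcomes is captured as a probability distribution, Shannon entropy measures that uncertainty, and the information gained from an observation is the drop in entropy.

```python
from math import log2

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Before acting: the agent considers four outcomes equally likely.
prior = [0.25, 0.25, 0.25, 0.25]
# After one observation: two outcomes have been ruled out.
posterior = [0.5, 0.5, 0.0, 0.0]

gain = entropy(prior) - entropy(posterior)
print(gain)  # 1.0 -- the observation was worth one bit
```

A curious agent, in this formalization, prefers actions whose observations it expects to shrink its uncertainty the most.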
With some charity, this even seems like a workable definition of human curiosity in general, but it obscures a complication: no agent has time to make sense of all the information the environment throws at it. Making sense of the world at any particular moment therefore involves choosing what to attend to and discarding the rest, and this human (and presumably feline, canine, etc.) ability to readjust one's interests remains, as far as cognitive science goes, magic. We have neither a good theory for it nor a way to implement it computationally, and we did not attempt it in our work either.
We therefore set our sights on a goal more modest than open-ended human curiosity; call it a kind of closed, "robot" curiosity if you will. Our simulated agent is given a well-defined task, pouring a liquid from one bowl to another, and a fixed set of "output parameters" by which to judge the task, such as how much of the liquid reaches the destination and whether the destination container moves. The agent's purpose is to learn how variations in the manner of pouring affect those output parameters. Note that the agent's purpose is not to pour well! Indeed, we want it to pour badly sometimes, because mistakes can be informative about how the physical act of pouring works. Besides, no one will cry over simulated spilt milk.
And so our agent pours the liquid from one bowl to another in slightly different ways each time, building a "model" of pouring as it goes. This "model" is a collection of rules of thumb for predicting outcomes, such as "pouring from too high will produce spillage". We also allow the agent to control physical parameters of the bowls and the liquid: what happens if the liquid is very viscous, or very bouncy? All together, there are several billion ways of pouring that our agent could simulate, which is several billion simulations too many to be practical. This is where the "robot curiosity" kicks in.
Rather than trying out all variations of pouring, or trying them at random, the agent uses the pouring model it has constructed so far to choose what to try next. Specifically, it looks for the regions where the model is least confident in its predictions: varying the manner of pouring in those directions is likely to produce behaviors not seen before! Hence, our agent acts (adjusting the manner of pouring and the physical parameters of the objects involved) in the ways it expects to yield the most new information: new behaviors, new rules of thumb to add to its model.
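To make the selection step concrete, here is a minimal Python sketch of uncertainty-driven exploration. Everything in it is a stand-in of our own invention, not the actual system: the "model" is just the list of parameter settings tried so far, and "confidence" is simply how densely the neighborhood of a candidate setting has already been sampled.

```python
import random

def confidence(model, params):
    """Stand-in confidence measure: the more previously tried settings lie
    near `params`, the more confident the 'model' is said to be there."""
    nearby = [p for p in model
              if sum(abs(a - b) for a, b in zip(p, params)) < 0.3]
    return min(1.0, len(nearby) / 5)

def choose_next_experiment(model, candidates):
    """Pick the candidate setting where the model is least confident."""
    return min(candidates, key=lambda c: confidence(model, c))

random.seed(0)
model = []  # parameter settings the agent has tried so far
for _ in range(10):
    # Propose some candidate pouring variations (3 numeric parameters each).
    candidates = [tuple(random.random() for _ in range(3)) for _ in range(50)]
    nxt = choose_next_experiment(model, candidates)
    # ... here the real agent would simulate the pour with `nxt`
    #     and distill the outcome into rules of thumb ...
    model.append(nxt)

print(len(model))  # 10 experiments, each chosen in a low-confidence region
```

A real system would replace the density heuristic with the predictive uncertainty of a learned model, but the loop structure (propose, score by uncertainty, try the least certain) is the same.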
With only a few hundred simulations, rather than several billion, it arrives at such sensible prediction rules as "pouring from too high, or too far to the side, will produce spillage" and "a viscous fluid needs time to pour out of the source container". Nothing world-shattering; this sort of thing is obvious to us humans. Then again, we have the benefit of a truly curious childhood and a better brain. For an agent given no prior knowledge of pouring, aside from the definition of the possible variations and the output parameters to watch, ours is not doing too badly.
Much work remains to be done, however. The accuracy of the agent's model depends heavily on the methods it uses to group sets of numerical parameters under the same qualitative umbrella; loosely speaking, this is what we do when we say someone is tall because their height exceeds some threshold. Currently, this conversion from numerical values to qualitative ones is done by fixed mappings; it should instead be learned as well. But that is a topic for future work and, we hope, a future post here!
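To illustrate what such a fixed mapping looks like (the parameter name and thresholds below are hypothetical, chosen for this post rather than taken from the actual system):

```python
def qualitative_height(height_cm):
    """Map a numeric pouring height (cm above the destination's rim)
    to a qualitative label. The thresholds are hand-picked, not learned."""
    if height_cm < 5:
        return "low"
    elif height_cm < 20:
        return "medium"
    return "high"

print(qualitative_height(3))   # low
print(qualitative_height(12))  # medium
print(qualitative_height(40))  # high
```

Hand-picked cutoffs like these are brittle: a rule such as "pouring from high up produces spillage" is only as good as the boundary between "medium" and "high". Learning those boundaries from the simulated outcomes themselves is precisely the future work mentioned above.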