A possible path to intelligence augmentation

So I previously said that you need to study resource allocation to get to intelligence augmentation. But what do we do once we have a market that can allocate resources to the programs? We are left with two problems: getting currency to the right programs, and somehow getting better programs into the system so it can improve what it does.

This can also be seen as a partial recipe for AI, but you would need some function to take over the human's role of giving feedback to the agoric system. Most likely you would want something that humans could at least partially influence, so they could guide its learning, unless you plan to code the system's basic knowledge of the world in up front. You may also need a different set of programs, as the system could not rely on a human for things like goal setting.

Currency distribution

So my current theory is that there needs to be one program that is responsible for the system. Programs bid currency for the privilege of being the manager, and the user evaluates the system over a time period. If the system does well, the user gives better feedback, and that feedback is translated into a currency payoff. The manager then gives currency to the programs that helped it, those programs pay the programs that helped them, and so on.
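
A minimal sketch of one such period, assuming a hypothetical Program class with a balance, a list of helpers, and a simple fixed-fraction bidding rule (all of these details are illustrative assumptions, not part of the design above):

```python
class Program:
    def __init__(self, name, balance=100.0):
        self.name = name
        self.balance = balance
        self.helpers = []  # programs that helped this one during the period

    def bid(self):
        # Assumed bidding rule: offer a fixed fraction of current balance.
        return 0.1 * self.balance


def run_period(programs, user_feedback, payoff_rate=10.0, share=0.5):
    """One time period: auction the manager role, turn user feedback
    into currency, and propagate payments down the chain of helpers."""
    # The manager role goes to the highest bidder; the bid is forfeited.
    manager = max(programs, key=lambda p: p.bid())
    manager.balance -= manager.bid()

    # User feedback (say, a score in [0, 1]) becomes a currency payoff.
    manager.balance += user_feedback * payoff_rate

    # The manager pays its helpers, they pay theirs, and so on.
    frontier, paid = [manager], set()
    while frontier:
        payer = frontier.pop()
        if payer.name in paid or not payer.helpers:
            continue
        paid.add(payer.name)
        payment = share * payer.balance / len(payer.helpers)
        for helper in payer.helpers:
            payer.balance -= payment
            helper.balance += payment
            frontier.append(helper)
```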

In this way currency spreads throughout the system. It is not perfect, though. The actions of a manager in one time period can affect how well the user rewards the system in later time periods, so you might want managers to receive some reward from those later periods as well; this would create an incentive for them not to be too short term in their actions.
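
One way to implement that, as a sketch: split each period's payoff between the current manager and a few recent ones, with geometrically decaying weights. The decay factor and horizon here are illustrative assumptions:

```python
def shared_payoff(manager_history, feedback, payoff_rate=10.0,
                  decay=0.5, horizon=3):
    """Split one period's payoff between the current manager and the
    managers of the last few periods (most recent first), so that
    decisions with delayed consequences still get rewarded."""
    pot = feedback * payoff_rate
    recent = manager_history[:horizon + 1]  # index 0 is the current manager
    weights = [decay ** i for i in range(len(recent))]
    total = sum(weights)
    for manager, weight in zip(recent, weights):
        manager.balance += pot * weight / total
```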

You might also want a council of programs giving feedback rather than a single manager, but that could lead to a tragedy of the commons: it would encourage council programs that are lazy, doing no evaluation and passing on no currency, while still getting paid.

Once there is a system that sort of works, it might be fruitful to look at current neuroscience and see whether there is a link between the reward system and neuronal resource allocation over time, and whether there is an analogue of an agent. Any analogous systems found there should guide our design. The goal is not to create a system with a separate reward system but one that can become part of our own reward systems, so understanding ours is an important part of this process.

New programs

A computer system that just allocates resources among a fixed set of programs is safe, but also not very useful.

Random programs are unlikely to be useful, but there are a number of different pathways to explore here, and some synthesis of them is probably necessary.

Competition between machine learning systems

There has been some interesting work recently on learning the topology of a neural network architecture. Given a resource allocation system, you can spin up new neural networks that not only vary the topology but also use different variables within the computer system as inputs, and then see which neural nets the system finds useful.
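
As an illustrative sketch (the spec format, size ranges, and bankruptcy rule are all assumptions), candidates could be spawned with random topologies and input subsets, then retired when they stop earning currency:

```python
import random

def spawn_candidate(available_inputs, max_hidden=64):
    """Create a candidate network spec: a random subset of the system's
    variables as inputs, plus a random hidden-layer topology."""
    inputs = random.sample(available_inputs,
                           k=random.randint(1, len(available_inputs)))
    layers = [random.randint(4, max_hidden)
              for _ in range(random.randint(1, 3))]
    return {"inputs": inputs, "layers": layers, "balance": 50.0}

def cull(candidates, bankruptcy=0.0):
    """Nets that other programs stop paying for go bankrupt and are
    retired; the survivors are the ones the system found useful."""
    return [c for c in candidates if c["balance"] > bankruptcy]
```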

You would probably want some common way for the different neural nets to communicate, so that feedback could be passed between them (even when the feedback arrives much later).
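
A minimal sketch of what such a common format might look like, assuming nets are addressed by id and work is timestamped by period (both of which are assumptions):

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    """A shared message format so any net can credit any other net,
    even long after the rewarded output was produced."""
    sender: str         # id of the net passing on the feedback
    receiver: str       # id of the net being credited
    output_period: int  # period in which the rewarded output was produced
    amount: float       # currency attached to the feedback
```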

Language to program translation

We use language a lot to tell each other what to do, and we would need it to train IA systems too. So we need to be able to generate new programs from language: for example, telling the system to do something N times, or until some condition holds, or never to do X. This can be seen as an interpreter, but we would probably also want to compile these instructions down for more efficient execution. We also need to be able to bootstrap the language: to tell the system what words mean by using other words. In short, we need a progressively self-improving compiler from human language to programs. This will build on the machine learning work, as we would not want to formally specify each word; the system should be able to learn word meanings over time.
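
To make the interpreter framing concrete, here is a toy sketch that hard-codes two of the command shapes mentioned above; in the real system these mappings would be learned rather than written by hand, and the grammar and actions dictionary are illustrative assumptions:

```python
import re

def interpret(sentence, actions, forbidden):
    """Toy interpreter for a tiny fragment of English commands:
    'do X N times' and 'never do X'."""
    m = re.match(r"do (\w+) (\d+) times", sentence)
    if m:
        name, n = m.group(1), int(m.group(2))
        def program():
            for _ in range(n):
                if name not in forbidden:  # respect standing prohibitions
                    actions[name]()
        return program
    m = re.match(r"never do (\w+)", sentence)
    if m:
        forbidden.add(m.group(1))
        return lambda: None
    raise ValueError("cannot parse: " + sentence)

# Example usage:
actions = {"beep": lambda: print("beep")}
forbidden = set()
interpret("do beep 3 times", actions, forbidden)()  # prints "beep" 3 times
interpret("never do beep", actions, forbidden)
interpret("do beep 3 times", actions, forbidden)()  # now prints nothing
```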

Resource allocation is needed for this, in case you tell the system something that sends it into an infinite loop or some other bad behaviour. There are probably some interesting things to look at in SHRDLU, but I think most GOFAI programs would need substantial reworking if you assume that humans are not the ones maintaining them.
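
One simple guard, sketched here under the assumption that generated programs yield once per unit of work: give each program a fuel budget, so a non-terminating program exhausts its budget instead of hanging the whole system:

```python
class OutOfFuel(Exception):
    pass

def run_with_budget(program, fuel):
    """Run a generated program (a generator that yields once per unit
    of work) under a fixed fuel budget."""
    steps = 0
    for _ in program():
        steps += 1
        if steps >= fuel:
            raise OutOfFuel(f"stopped after {steps} steps")
    return steps

# Example: an accidental infinite loop gets cut off rather than hanging.
def looping_program():
    while True:
        yield

try:
    run_with_budget(looping_program, fuel=1000)
except OutOfFuel as err:
    print(err)  # stopped after 1000 steps
```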

Conclusion

This is probably just the tip of the iceberg for IA, but I think partial solutions to both currency distribution and program generation would be necessary to validate this particular approach. It might be that one or both of them need a lot of programming put into them up front, meaning that the system cannot bootstrap itself and get better over time. That would decrease the tractability of this approach significantly.

I think that this approach may enable a different kind of progress towards creating intelligence than we have had previously, as it delegates a known fundamental problem of computing, resource allocation, to the programs themselves, and allows work on program creation to start to be delegated as well, in an open-ended manner.
