This paper addresses the following problem. Imagine an autonomous agent that has to achieve a number of global goals in a complex, dynamic environment. An example could be a rover that has to explore Mars and collect samples of soil. How can such an agent select `the most appropriate' or `the most relevant' next action to take at a particular moment, when facing a particular situation? Important constraints are that the world is too complex to be entirely predictable and that the agent has limited computational and time resources. This implies that the action selection cannot be completely `rational' or optimal. It should, however, be robust, fast, and make `good enough' decisions (Simon, 1955). By `good enough' we mean, among other things, that the action selection behavior should demonstrate characteristics such as goal-orientedness, situation-orientedness, adaptivity, robustness, and the ability to look ahead.
The paper studies this problem in the context of the Society of the Mind theory (Minsky, 1986), to which the Subsumption Architecture (Brooks, 1986) is also related. This theory suggests building an intelligent system as a society of interacting, mindless agents, each with its own specific competence. For example, a society of agents that is able to build a tower would incorporate `competence modules' for finding a block, for grasping a block, for moving a block, and so on. The idea is that competence modules cooperate (locally) in such a way that the society as a whole functions properly. Such an architecture is very attractive because of its distributedness, modular structure, emergent global functionality, and robustness.
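To make the notion of a competence module concrete, here is a minimal sketch of the tower-building society in Python. All names and fields are illustrative assumptions on our part; the actual representation used by the algorithm is defined in section 2.

```python
from dataclasses import dataclass

@dataclass
class CompetenceModule:
    """A mindless agent with one specific competence (illustrative only;
    the actual representation is defined in section 2)."""
    name: str
    preconditions: set       # propositions that must hold for the module to act
    effects: set             # propositions the module's action makes true
    activation: float = 0.0  # activation energy, used by the dynamics sketched later

# A toy tower-building society: each module knows only its own competence
# and carries no reference to any other module.
society = [
    CompetenceModule("find-block",  {"hand-empty"},    {"block-found"}),
    CompetenceModule("grasp-block", {"block-found"},   {"block-in-hand"}),
    CompetenceModule("move-block",  {"block-in-hand"}, {"block-on-tower"}),
]
```

Note that no module refers to any other module; whatever coordination arises must come from elsewhere, which is exactly the control problem discussed next.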
One of the open problems is how action selection can be controlled in such a distributed system. More specifically: (i) how is it determined whether or not some competence module should become active (take real-world actions by controlling the effectors) at a specific moment, and (ii) what are the factors that determine cooperation among certain competence modules? Several solutions can be adopted. One approach is to hand-code (and thereby hard-wire) the control flow among the competence modules (Brooks, 1986). Another approach is to introduce a hierarchical structure that tells competence modules whether or not they are allowed to perform an action. This paper investigates yet another, entirely different type of solution.
The hypotheses that are tested are that action selection can be obtained as an emergent property of an activation/inhibition dynamics among the competence modules, and that no central or hierarchical control is needed to achieve coherent behavior. The research questions that we study are how adequate these hypotheses are and which activation/inhibition dynamics is appropriate. To this end we are developing a series of algorithms and testing them in computer simulations. One such algorithm was discussed in (Maes, 1989). This paper describes a variation on that algorithm which is simpler and produces more interesting results.
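To give a feel for what an activation/inhibition dynamics over the modules sketched earlier could look like, here is one deliberately simplified round of such a dynamics. This is a hypothetical illustration under our own assumptions (the constants, the inhibition rule, and the threshold rule are all invented for the example); the actual algorithm is given in section 2.

```python
def select_action(society, state, goals, threshold=1.0, decay=0.9):
    """One illustrative round of an activation/inhibition dynamics.

    Assumptions (not the algorithm of section 2): the situation and the
    goals inject activation, rivals promising the same effect inhibit
    each other, and only an executable module above threshold may act.
    """
    for m in society:
        if m.preconditions <= state:     # executable in the current situation
            m.activation += 0.5          # situation-driven excitation
        # Goal-driven excitation: goals energize modules that achieve them.
        m.activation += sum(1.0 for g in goals if g in m.effects)

    for m in society:
        # Inhibition: modules competing to deliver the same effect damp each other.
        rivals = [r for r in society if r is not m and r.effects & m.effects]
        m.activation -= 0.2 * len(rivals)

    for m in society:
        # Decay keeps total activation bounded over repeated rounds.
        m.activation = max(0.0, decay * m.activation)

    # The strongest executable module above threshold wins; otherwise no
    # action is taken this round and activation keeps accumulating.
    candidates = [m for m in society
                  if m.preconditions <= state and m.activation >= threshold]
    return max(candidates, key=lambda m: m.activation, default=None)
```

Called repeatedly on the toy society above with `state={"hand-empty"}` and `goals={"block-on-tower"}`, activation builds up over a few rounds until `find-block` crosses the threshold and is selected, even though no central controller ever told any module what to do.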
Experiments have been performed for several applications. The resulting systems do exhibit the desired properties of goal-orientedness, situation-orientedness, adaptivity, robustness, the ability to look ahead, and so on. Further, global parameters make it possible to mediate smoothly among these action selection criteria, for example trading off goal-orientedness against data-orientedness, adaptivity against inertia, and sensitivity to goal conflicts and thoughtfulness against speed.
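In the toy sketch above, such global parameters appear as the `threshold` and `decay` knobs (again, our own illustrative names, not the parameters of section 2): a higher threshold buys thoughtfulness at the cost of speed, and a slower decay trades adaptivity for inertia. For instance:

```python
# Two hypothetical parameterizations of the same society:
deliberate = dict(threshold=2.0, decay=0.95)  # more thoughtful, slower, more inert
reactive   = dict(threshold=0.5, decay=0.60)  # faster, more adaptive/opportunistic

action = select_action(society, state={"hand-empty"},
                       goals={"block-on-tower"}, **deliberate)
```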
One cannot classify this algorithm as belonging to either the traditional AI approach (in which competence is programmed) or the connectionist approach (in which competence is the result of tabula rasa learning). Nor is it a hybrid system in the sense that there would be a distinct symbolic and a distinct subsymbolic component. Instead, the algorithm completely integrates characteristics of both approaches by using a connectionist computational model on a symbolic, structured representation. By doing so, it combines the best of both worlds: the explicit goals and structured representations of the symbolic approach, and the distributedness, robustness, and emergent global functionality of the connectionist approach.
This paper is structured as follows: section 2 introduces the algorithm for action selection, section 3 presents a mathematical model, section 4 sketches how it works, section 5 discusses the empirical results obtained, section 6 reflects on the limits of the current algorithm, section 7 compares the algorithm with related work, and finally, section 8 draws some conclusions.