Rodney M. Goodman B.Sc., Ph.D., C.Eng., SMIEEE, FIEE.

 

Learning Controllers - A Case-Based Reasoning Approach

David Babcock
Rodney M. Goodman

Abstract

The goal of this project is to develop “intelligent” controllers that learn to control the system they are applied to through “guided trial and error.”

Motivation & Aims

Linear control theory is very powerful when a linear mathematical model of the system to be controlled is available. It can be used to design controllers that guarantee certain stability and performance specifications under fairly general uncertainties in the system. When the system is inherently non-linear (as all physical systems are to some degree), both the order of the linear system needed to model the dynamics and the size of the uncertainties that must be permitted can be large. These problems lead to excessive computation time for controller design, reduced performance, or both.

Non-linear control theory provides some methodologies for systems that satisfy certain assumptions. However, these methods also require a mathematical model of the system and are often limited to select classes of systems.

Neurocontrol is a field that employs non-parametric function approximators, often in the form of feedforward neural networks, to "learn" the underlying dynamics of the system. Such controllers do not require a mathematical model (although any a priori knowledge can be useful in providing an initial state for the controller) and can be adapted on-line to adjust to specific physical conditions such as wear, discrepancies with a mathematical model, and unknown physical parameters. The general function approximation capabilities of neural networks also allow them to be used on systems that pose modeling difficulties. Unfortunately, the non-linear nature of neural networks produces a closed-loop system that is itself non-linear; hence stability and performance can only be demonstrated empirically, without analytic guarantees (which would also require a mathematical model of the system).

Research

Our research focuses on a form of neurocontrol called case-based reasoning. This type of control is based on the concept of reapplying successful past actions in similar current situations, much as human beings do when presented with a new situation. Each unique experience, in the form of an applied input and the resulting output, is "memorized" as a case in the casebase. Based on the current desired output, the system "recalls" the "closest" case(s) to determine a set of nominal inputs. The modification algorithm then adjusts the nominal inputs to account for discrepancies between the outputs of the nominal case(s) and the actual desired output. Once this new input is applied to the physical plant, the corresponding output can be observed and the new input-output pair can be added as an additional case in the casebase. Hence the system expands its knowledge of the physical plant through "guided trial-and-error" and improves its performance over time by using the information it gains from each attempt. A teacher can be used (or, equivalently, cases can be preloaded into the casebase) to guide the learning process, or the system can be left to simply perform random trials until it "gets the idea."
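The recall, modify, apply, and memorize cycle described above can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar toy plant, the class and function names, and the choice of linear interpolation between the two closest cases as the modification rule are all assumptions made for illustration.

```python
class CaseBase:
    """Stores observed (input, output) pairs from the plant."""

    def __init__(self, cases=None):
        self.cases = list(cases or [])

    def add(self, u, y):
        self.cases.append((u, y))

    def nearest(self, y_desired, k=2):
        """Recall the k cases whose outputs lie closest to y_desired."""
        return sorted(self.cases, key=lambda c: abs(c[1] - y_desired))[:k]


def propose_input(casebase, y_desired):
    """Assumed modification rule: interpolate between the two closest cases."""
    (u0, y0), (u1, y1) = casebase.nearest(y_desired, k=2)
    if y1 == y0:  # degenerate pair, fall back to the nominal input
        return u0
    slope = (u1 - u0) / (y1 - y0)
    return u0 + slope * (y_desired - y0)


def plant(u):
    """Hypothetical non-linear plant; the controller never sees this formula."""
    return u + 0.5 * u ** 3


def learn_setpoint(casebase, y_desired, trials=8, tol=1e-3):
    """Guided trial and error: each trial adds one new case to the casebase."""
    errors = []
    for _ in range(trials):
        u = propose_input(casebase, y_desired)  # recall and modify
        y = plant(u)                            # apply to the plant
        casebase.add(u, y)                      # memorize the new case
        errors.append(abs(y - y_desired))
        if errors[-1] < tol:
            break
    return errors


# Two preloaded basis cases play the role of the "teacher".
basis = CaseBase([(0.0, 0.0), (1.0, 1.5)])
errors = learn_setpoint(basis, y_desired=1.0)
print(errors)
```

In this toy setting the error shrinks from trial to trial because every new case lies closer to the desired output than the ones it was interpolated from.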
Finally, once the system gains sufficient "experience" in a particular region of output space, a neural network (or other function approximator) can be used to "conceptualize" the cases in this region in the form of a mathematical function. This local function replaces the cases in this region, freeing up memory (and hence decreasing future case selection time). Such a technique provides efficiency by using mathematical functions in regions where the system has sufficient information to support generalization, while maintaining the flexibility of discrete case interpolation where the input-output mapping is uncertain.
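As a rough illustration of this "conceptualization" step, the sketch below replaces the cases in one output region with a single local model. A least-squares line stands in for the neural network mentioned above, and the function names, the region bounds, and the `min_cases` threshold are hypothetical choices rather than details from the project.

```python
def fit_line(points):
    """Least-squares fit u = a*y + b over (u, y) cases."""
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    if var_y == 0:  # all outputs identical: constant model
        return 0.0, mean_u
    cov = sum((y - mean_y) * (u - mean_u) for u, y in points)
    a = cov / var_y
    return a, mean_u - a * mean_y


def conceptualize(cases, y_lo, y_hi, min_cases=5):
    """Replace cases with outputs in [y_lo, y_hi] by one local function.

    Returns (remaining_cases, model); model is None when the region does
    not yet hold enough cases to support generalization.
    """
    inside = [c for c in cases if y_lo <= c[1] <= y_hi]
    if len(inside) < min_cases:
        return cases, None
    a, b = fit_line(inside)
    outside = [c for c in cases if not (y_lo <= c[1] <= y_hi)]
    return outside, (y_lo, y_hi, a, b)


def predict_input(model, y_desired):
    """Evaluate the local function inside its region of validity."""
    y_lo, y_hi, a, b = model
    if not (y_lo <= y_desired <= y_hi):
        raise ValueError("desired output lies outside the modelled region")
    return a * y_desired + b


# Six cases from a well-explored region plus one isolated case elsewhere.
cases = [(u, 2 * u + 1) for u in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5)] + [(3.0, 20.0)]
remaining, model = conceptualize(cases, y_lo=1.0, y_hi=2.0)
print(len(remaining), model)
```

Outside the modelled region the controller would still fall back on discrete case recall, which is why the isolated case is kept.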

Achievements

We have applied this technique to the standard ball-and-beam control problem.
For this problem we define the form of the input as two oppositely signed angle pulses, along with two criteria that determine when to switch pulses. The controller then records the change in position and the change in velocity (along with the initial conditions) as the corresponding case outputs.
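To make the shape of such a case concrete, here is a small simulation sketch. The source does not state the switching criteria or the plant model, so the sketch assumes that the two criteria are simply pulse durations, that a positive beam angle accelerates the ball in the positive direction, and that the ball is a solid sphere rolling without slipping (acceleration of 5/7 g sin(angle)), with beam dynamics ignored. All names are hypothetical.

```python
import math
from dataclasses import dataclass

G = 9.81  # gravitational acceleration, m/s^2


@dataclass
class PulseInput:
    angle: float     # pulse magnitude in radians: +angle, then -angle
    t_first: float   # assumed switching criterion 1: first pulse duration (s)
    t_second: float  # assumed switching criterion 2: second pulse duration (s)


def simulate(x0, v0, pulse, dt=1e-3):
    """Integrate the ball's motion over both pulses and return (dx, dv)."""
    x, v = x0, v0
    schedule = ((pulse.angle, pulse.t_first), (-pulse.angle, pulse.t_second))
    for angle, duration in schedule:
        # Solid sphere rolling without slipping on an incline.
        accel = (5.0 / 7.0) * G * math.sin(angle)
        for _ in range(round(duration / dt)):
            v += accel * dt
            x += v * dt
    return x - x0, v - v0


def make_case(x0, v0, pulse):
    """A case pairs the applied input with the observed changes."""
    dx, dv = simulate(x0, v0, pulse)
    return {"input": pulse, "initial": (x0, v0), "output": (dx, dv)}


# Equal and opposite pulses move the ball and bring it back to rest.
case = make_case(0.0, 0.0, PulseInput(angle=0.05, t_first=0.5, t_second=0.5))
dx, dv = case["output"]
print(dx, dv)
```

Cases of this form could then be stored and recalled by their (dx, dv) outputs when a new setpoint is requested.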
Initializing the casebase with two basis cases, the system is able to learn setpoint regulation through repeated trial-and-error. With each subsequent trial, the system achieves improved performance and usually reaches the desired setpoint within a few trials.
