



Rodney M. Goodman B.Sc.,
Ph.D., C.Eng., SMIEEE, FIEE.
Learning Controllers: A Case-Based Reasoning Approach
David Babcock
Rodney M. Goodman
Abstract
The goal of this project is to develop “intelligent”
controllers that learn to control the system they are applied to
through “guided trial and error.”
Motivation & Aims
Linear control theory is very powerful when a
linear mathematical model of the system to be controlled is available.
It can be used to design controllers that guarantee certain stability
and performance specifications under fairly general uncertainties
in the system. When the system is inherently nonlinear (as are
all physical systems to some degree), the order of the linear system
necessary to model the dynamics and the size of the uncertainties that
must be permitted can both be large. These problems lead to excessive
computation time to design a controller, reduced performance capabilities, or both.
Nonlinear control theory provides some methodologies for systems
that satisfy certain assumptions. However, these methods also require
a mathematical model of the system and are often limited to select
classes of systems. Neurocontrol is a field that employs nonparametric
function approximators, often in the form of feedforward neural
networks, to "learn" the underlying dynamics of the system.
Such controllers do not require a mathematical model (although any
a priori knowledge can be useful in providing an initial state for
the controller) and can be adapted online to adjust to specific
physical conditions such as wear, discrepancies with a mathematical
model, unknown physical parameters, etc. The general function approximation
capabilities of neural networks also allow them to be used on systems
that pose modeling difficulties. Unfortunately, the nonlinear nature
of neural networks produces a closed-loop system that is nonlinear,
so stability and performance can only be demonstrated empirically,
without analytic guarantees (which would also require a mathematical
model of the system).
Research
Our research focuses on a form of neurocontrol
called case-based reasoning. This type of control is based on the
concept of repeating successful past actions in similar current
situations, much as human beings do when presented with a new
situation. Each unique experience, in the form of an applied input
and resulting output pair, is "memorized" as a case in
the case base. Based on the current desired output, the system "recalls"
the "closest" case(s) to determine a set of nominal inputs.
The modification algorithm then adjusts the nominal inputs to account
for discrepancies between the nominal case(s) and the actual desired
output. Once this new input is applied to the physical plant, the
corresponding output can be observed and the new input-output pair
can be added as an additional case in the case base. Hence the system
expands its knowledge of the physical plant through "guided
trial-and-error". The system improves its performance over
time by using the information it gains through each attempt. A teacher
can be used (or, equivalently, cases can be preloaded into the case base)
to guide the learning process, or the system can be left to simply
perform random trials until it "gets the idea."
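The recall, modify, apply, and memorize cycle described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the project: the scalar `plant` function, the choice of the two nearest cases, and the secant-style linear correction used as the modification step. The function names (`recall`, `modify`, `learn_setpoint`) are hypothetical.

```python
import math

def plant(u):
    """Hypothetical plant, unknown to the controller (assumed for illustration)."""
    return u + 0.5 * math.sin(u)

def recall(casebase, y_desired, k=2):
    """Return the k cases whose recorded output is closest to the desired output."""
    return sorted(casebase, key=lambda case: abs(case[1] - y_desired))[:k]

def modify(nearest, y_desired):
    """Adjust the nominal input with a linear (secant) correction between two cases."""
    (u0, y0), (u1, y1) = nearest
    if y1 == y0:
        # The two cases give no slope information; fall back to the nominal input.
        return u0
    slope = (u1 - u0) / (y1 - y0)
    return u0 + slope * (y_desired - y0)

def learn_setpoint(y_desired, casebase, trials=10, tol=1e-6):
    """Run guided trial-and-error; every trial is stored as a new (input, output) case."""
    errors = []
    for _ in range(trials):
        u = modify(recall(casebase, y_desired), y_desired)
        y = plant(u)                      # apply the input and observe the output
        casebase.append((u, y))           # memorize the new experience
        errors.append(abs(y - y_desired))
        if errors[-1] < tol:
            break
    return errors

# Two preloaded cases play the role of the "teacher".
casebase = [(0.0, plant(0.0)), (2.0, plant(2.0))]
errors = learn_setpoint(1.5, casebase)
```

In this toy setting the error shrinks from trial to trial because each new case lies closer to the desired output than the ones used to generate it. A real plant would have vector-valued inputs and outputs and would need a more careful distance measure and modification rule.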
Finally, once the system gains sufficient "experience"
in a particular region of output space, a neural network (or other
function approximator) can be used to "conceptualize"
the cases in this region in the form of a mathematical function.
This local function replaces the cases in this region, freeing up
memory (and hence decreasing future case selection time). Such a
technique provides efficiency by using mathematical functions in
regions where the system has sufficient information to support generalization
while maintaining the flexibility of discrete case interpolation
where the input-output mapping is uncertain.
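The "conceptualization" step can be sketched as follows. As a stand-in for the neural network mentioned above, this example fits a least-squares line to the cases in one output region and then drops those cases from the case base. The region bounds, the `min_cases` threshold, and the function names are all hypothetical choices made for illustration.

```python
def fit_line(cases):
    """Least-squares fit u = a*y + b over (input u, output y) cases."""
    n = len(cases)
    mean_u = sum(u for u, _ in cases) / n
    mean_y = sum(y for _, y in cases) / n
    var_y = sum((y - mean_y) ** 2 for _, y in cases)
    if var_y == 0.0:
        return None  # all outputs identical: no slope can be estimated
    cov = sum((y - mean_y) * (u - mean_u) for u, y in cases)
    a = cov / var_y
    return a, mean_u - a * mean_y

def conceptualize(casebase, y_low, y_high, min_cases=5):
    """Replace the cases with outputs in [y_low, y_high] by one local function.

    Returns (remaining_cases, local_model); local_model is None when the
    region does not yet hold enough experience to generalize.
    """
    inside = [c for c in casebase if y_low <= c[1] <= y_high]
    if len(inside) < min_cases:
        return casebase, None
    coeffs = fit_line(inside)
    if coeffs is None:
        return casebase, None
    remaining = [c for c in casebase if not (y_low <= c[1] <= y_high)]
    return remaining, (y_low, y_high) + coeffs

# Five cases in the region [0, 0.5] plus two cases elsewhere.
casebase = [(2 * y + 1, y) for y in (0.0, 0.1, 0.2, 0.3, 0.4)]
casebase += [(5.0, 3.0), (-1.0, -2.0)]
remaining, local_model = conceptualize(casebase, 0.0, 0.5)
```

After the call, `remaining` holds only the cases outside the region, and `local_model` carries the region bounds together with the fitted coefficients, so a lookup inside the region no longer needs a nearest-case search.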
Achievements
We have applied this technique to the standard ball and beam control
problem.
For this problem we define the form of the input as two opposite-signed
beam-angle pulses along with two criteria to determine when to
switch pulses. The controller then records the change in position
and change in velocity (along with the initial conditions) as the
corresponding case outputs.
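A rough sketch of how one such trial could be turned into a case is given below. It rests on several assumptions that are not stated in the text: a solid ball rolling without slipping (so the acceleration along the beam is (5/7) g sin(angle)), beam-angle changes that are instantaneous, a sign convention in which a positive angle accelerates the ball in the positive direction, and fixed pulse durations standing in for the two unspecified switching criteria. All names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def ball_accel(beam_angle):
    """Acceleration of a solid rolling ball on a tilted beam (simplified model)."""
    return (5.0 / 7.0) * G * math.sin(beam_angle)

def coast(x, v, accel, duration):
    """Advance position and velocity under constant acceleration."""
    return x + v * duration + 0.5 * accel * duration ** 2, v + accel * duration

def run_trial(x0, v0, angle, t_first, t_second):
    """Apply a +angle pulse, then a -angle pulse, and package the result as a case."""
    x, v = coast(x0, v0, ball_accel(angle), t_first)
    x, v = coast(x, v, ball_accel(-angle), t_second)
    case_input = (angle, t_first, t_second)
    # Initial conditions are stored alongside the changes in position and velocity.
    case_output = (x0, v0, x - x0, v - v0)
    return case_input, case_output

case_input, case_output = run_trial(0.0, 0.0, 0.1, 0.5, 0.5)
```

With equal pulse durations and the ball starting at rest, the second pulse cancels the velocity gained during the first, so the trial produces a net displacement with essentially no change in velocity. This is the kind of move a setpoint regulator needs.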
Initialized with two basis cases in the case base, the system is able
to learn setpoint regulation through repeated trial-and-error. With
each subsequent trial, the system achieves improved performance
and usually reaches the desired setpoint within a few trials.
