home c.v. c.v. talks talks books books articles articles personal
  Probabilistic Expert Systems

Glenn Shafer
SIAM (Society for Industrial and Applied Mathematics) 1996

This short book (80 pages) emphasizes the basic computational principles that make probabilistic reasoning feasible in expert systems. The key to computation in these systems is the modularity of the probabilistic model. The book describes and compares the principal architectures for exploiting this modularity in the computation of prior and posterior probabilities. He also indicates how these similar but different architectures apply to a wide variety of other problems of recursive computation in applied mathematics and operations research.
The book is based on lectures at an NSF/CMBS Regional Conference at the University of North Dakota at Grand Forks during the week of June 1-5, 1992.


Review by Slawomir T. Wierzchon
, Polish Academy of Sciences, in Control Engineering Practice, March 1997, pp. 442-443.

Glenn Shafer is known to the AI community as the author of "A Mathematical Theory of Evidence" - a monograph published exactly twenty years ago, and devoted to a generalization of the Bayesian theory of subjective probability judgement. Within this new and not uncontroversial theory, subjective judgements are expressed in terms of so-called "belief functions", i.e. using not necessarily additive but monotone "probabilities". However, there is one serious problem: how to combine such functions effectively, particularly when they are defined on different domains. An algorithm for doing this task, called the propagation algorithm, was worked out by G. Shafer and his colleagues. During further studies the algorithm proved to be a useful tool in solving discrete optimization problems, decision-making problems and reasoning in predicate calculus. Since the probability measure is a special kind of belief function, the propagation algorithm supports the task of probabilistic reasoning, which relies upon computing the conditional probability of a variable, given a set of observations. It is remarkable that, although almost trivial from a textbook point of view, this problem is extremely difficult from the computational standpoint.

In 1978 two influential AI researchers, P. Szolovits and S.G. Pauker, argued in their paper "Categorical and probabilistic reasoning" (published in Artificial Intelligence, 11, pp. 115-144) that, because of their enormous data and space requirements, the pure probabilistic models may be used only in small problems. For instance, if a model engages only 56 binary variables, then 256 values must be computed to specify the joint probability distribution. If the computer can calculate the terms of the probability distribution at one million values per second, then it will take 2,283 years to come up with the whole probability distribution. Hopeless! But no-one knew that an effective method for performing this task in a reasonable time already existed. This had been suggested in 1970 by geneticists analyzing pedigree data. Unfortunately, it was published in Clinical Genetics — a journal unknown to the AI community, who had to wait until 1983 when J. Pearl and J. Kim developed the first effective algorithm for computing the posterior probabilities in problems represented by tree structures. In the late eighties, algorithms oriented toward probability propagation in general graphical structures were developed by S. Lauritzen & D. Spiegelhalter, G. Shafer & P. Shenoy, and F.V. Jensen.

The three approaches referred to above are discussed in depth in Shafer's new book. When implementing these algorithms it is better to think of appropriate probabilities as tables (vectors) of numbers satisfying certain conditions. This point of view is consequently used in the whole book, and the author starts his presentation by tracing the main properties of probability tables that represent conditional and unconditional probabilities.

Under certain conditions imposed on the set of conditionals, their products represent a joint probability distribution, and a sequence of conditionals satisfying these requirements is said to be a "construction sequence". So the author studies the main properties of such sequences, and he mentions their various graphical representations including belief (or Bayesian) networks, bubble graphs (this is a new notion introduced by him), chain graphs (studies by N. Wermuth and S.L. Lauritzen) and valuation networks (introduced by P. Shenoy). All these representations are discussed int eh context of a practical example concerning the external audit of an organization's financial statement, which allows a reader to see the advantages and disadvantages of the corresponding representations.

Knowing a construction sequence, the problem of probabilistic reasoning can be briefly described as the process of the elimination, one by one, of "unnecessary" (at the current moment) variables. This is a rather old idea, used in 1972 by U. Bertele and F. Brioschi to solve so-called "nonserial dynamic programming problems". Shafer and Shenoy renewed this approach, and gave it a new appeal. Instead of working with graphs (as its originators have done) they use so-called "join trees", well known to those working on the mathematical foundations of database systems. A particularly good part of the book is the presentation of how the join trees occur naturally during the elimination of variables from a construction sequence. The rest of the book is devoted the various implementations of the propagation algorithm in join trees.

In summary, this is a clever and concise book guaranteeing a quick and detailed introduction to the domain of probability propagation. Exercises added at the end of each chapter extend its content, and allow a reader a deeper understanding of the core ideas. The short comments added to the biographical material included at the end of the book opens a perspective to other and new approaches to this problem (especially those based on Markov-chain Monte Carlo simulation, mentioned in the closing section). Lastly, the book contains the electronic addresses of the main web sites where it is possible to find further information and computer software.

Bayesian networks, a type of probabilistic system, are currently used in domains such as diagnosis, planning and control, dynamic systems and time series, data analysis or computer vision. Hence, the material presented in the book may be of use, not only for students, but also for research workers from these domains.

back to top

home   |   c.v.  |   talks   |   books   |   articles   |   personal
Site created by:   Janet Shafer Designs  www.janetshafer.com