Select Lab Publications

Robust Combination of Local Controllers (2001)

By: Carlos Guestrin and Dirk Ormoneit

Abstract: Finding solutions to high dimensional Markov Decision Processes (MDPs) is a difficult problem, especially in the presence of uncertainty or if the actions and time measurements are continuous. Frequently, this difficulty can be alleviated by the availability of problem-specific knowledge. For example, it may be relatively easy to design controllers that are good locally, though having no global guarantees. We propose a nonparametric method to combine these local controllers to obtain globally good solutions. We apply this formulation to two types of problems: motion planning (stochastic shortest path problems) and discounted-cost MDPs. For motion planning, we argue that only considering the expected cost of a path may be overly simplistic in the presence of uncertainty. We propose an alternative: finding the minimum cost path, subject to the constraint that the robot must reach the goal with high probability. For this problem, we prove that a polynomial number of samples is sufficient to obtain a high probability path. For discounted MDPs, we consider various problem formulations that explicitly deal with model uncertainty. We formulate the problem as a robust linear program which directly incorporates this type of uncertainty. We provide empirical evidence of the usefulness of these approaches using the control of a robot arm.

Download Information
Carlos Guestrin and Dirk Ormoneit (2001). "Robust Combination of Local Controllers." 17th Conference on Uncertainty in Artificial Intelligence (UAI) (pp. 178-185). Project demo page.
BibTeX citation

author = {Carlos Guestrin and Dirk Ormoneit},
title = {Robust Combination of Local Controllers},
booktitle = {17th Conference on Uncertainty in Artificial Intelligence (UAI)},
pages = {178--185},
year = {2001},
address = {Seattle},
month = {August},
note = {Project demo page},
wwwfilebase = {uai2001-guestrin-ormoneit},
wwwtopic = {Planning under uncertainty}
