Select Lab Publications


Planning Under Uncertainty in Complex Structured Environments (2003)

By: Carlos Guestrin

Abstract: Many real-world tasks require multiple decision makers (agents) to coordinate their actions in order to achieve common long-term goals. Examples include manufacturing systems, where the managers of a factory coordinate to maximize profit; rescue robots that, after an earthquake, must find victims safely and as quickly as possible; and sensor networks, where multiple sensors collaborate to perform a large-scale sensing task under strict power constraints. All of these tasks require the solution of complex long-term multiagent planning problems in uncertain, dynamic environments.

Factored Markov decision processes (MDPs) allow us to represent complex uncertain dynamic systems very compactly by exploiting problem-specific structure. Specifically, the state of the system is described by a set of variables whose stochastic evolution over time is captured by a compact representation called a dynamic Bayesian network (DBN). A DBN exploits locality by assuming that the short-term evolution of a particular variable depends only on a few other variables; e.g., the state of one section of a factory is directly affected by only a few other sections. Over the long term, all variables in a DBN usually become correlated. Factored MDPs often allow for an exponential reduction in representation complexity. However, the complexity of exact solution algorithms for such MDPs grows exponentially in the number of variables and in the number of agents.
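To make the locality idea concrete, here is a minimal Python sketch, not code from the thesis: the variable names, parent sets, and probabilities are all invented. Each variable's next value depends only on a few parents, so storage grows with the sizes of the parent sets rather than with the full joint state space.

import itertools

# Minimal sketch of a factored (DBN-style) transition model; not thesis code.
# Variables, parent sets, and probabilities are invented for illustration.
VARIABLES = ["m1", "m2", "m3", "m4"]   # e.g., status bits of factory sections
PARENTS = {                            # each next-step variable has few parents
    "m1": ["m1"],
    "m2": ["m1", "m2"],
    "m3": ["m2", "m3"],
    "m4": ["m3", "m4"],
}

# One small conditional probability table (CPT) per variable: maps parent
# values to P(variable = 1 at the next step). The numbers are arbitrary.
def make_cpt(parents):
    return {vals: 0.9 if all(vals) else 0.1
            for vals in itertools.product([0, 1], repeat=len(parents))}

CPTS = {v: make_cpt(ps) for v, ps in PARENTS.items()}

def transition_prob(state, next_state):
    """P(next_state | state) factors into a product of local terms."""
    p = 1.0
    for v in VARIABLES:
        p1 = CPTS[v][tuple(state[u] for u in PARENTS[v])]
        p *= p1 if next_state[v] == 1 else 1.0 - p1
    return p

# Storage: conditioning each next-step variable on the FULL previous state
# would need len(VARIABLES) * 2**len(VARIABLES) entries; the factored form
# needs only sum over v of 2**|parents(v)| entries.
flat = 2 ** len(VARIABLES) * len(VARIABLES)
factored = sum(2 ** len(ps) for ps in PARENTS.values())
print(transition_prob({"m1": 1, "m2": 0, "m3": 1, "m4": 1},
                      {"m1": 1, "m2": 1, "m3": 0, "m4": 1}))
print(flat, factored)

Here the factored form needs 14 entries instead of 64; with dozens of variables the gap becomes astronomical, which is the exponential reduction in representation complexity described above.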

This thesis builds a formal framework and approximate planning algorithms that exploit structure in factored MDPs to solve problems with many trillions of states and actions very efficiently. The main contributions of this thesis include:

Factored linear programs: A novel LP decomposition technique, using ideas from inference in Bayesian networks, that can exploit problem structure to reduce exponentially large LPs to polynomially sized ones that are provably equivalent.

Factored approximate planning: A suite of algorithms, building on our factored LP decomposition technique, that exploit structure in factored MDPs to obtain exponential reductions in planning time.

Distributed coordination: An efficient distributed multiagent decision-making algorithm in which the coordination structure arises naturally from the factored representation of the system dynamics (a sketch follows this list).

Coordinated reinforcement learning: A simple yet effective framework for designing algorithms for planning in multiagent environments where the factored model is not known a priori.

Generalization in relational MDPs: A framework for obtaining general solutions from a small set of environments, allowing agents to act in new environments without replanning.

Empirical evaluation: A detailed evaluation on a variety of large-scale tasks, including multiagent coordination in a real strategic computer game, demonstrating that our formal framework yields effective plans, complex agent coordination, and successful generalization on some of the largest planning problems in the literature.
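As a concrete illustration of the distributed coordination idea above (the same variable-elimination machinery also underlies the factored LP decomposition), here is a minimal Python sketch, not the thesis implementation: the agent names, binary actions, local Q-tables, and elimination order are all invented. It maximizes a sum of local Q-functions by eliminating one agent at a time and then back-substituting to recover the jointly optimal action.

import itertools

ACTIONS = [0, 1]  # assumption: each agent chooses a binary action

# Invented local Q-functions over small subsets of agents; in the thesis
# setting these would come from the factored value function and dynamics.
Q = {
    ("a1", "a2"): {(x, y): float(x == y) for x in ACTIONS for y in ACTIONS},
    ("a2", "a3"): {(x, y): 2.0 * x * y for x in ACTIONS for y in ACTIONS},
    ("a3", "a4"): {(x, y): float(x != y) for x in ACTIONS for y in ACTIONS},
}

def eliminate(agent, factors):
    """Max out one agent, leaving a new factor over its neighbors."""
    touching = [scope for scope in factors if agent in scope]
    neighbors = sorted({v for scope in touching for v in scope if v != agent})
    new_factor, best_choice = {}, {}
    for vals in itertools.product(ACTIONS, repeat=len(neighbors)):
        assign = dict(zip(neighbors, vals))
        scores = []
        for a in ACTIONS:
            assign[agent] = a
            s = sum(factors[scope][tuple(assign[v] for v in scope)]
                    for scope in touching)
            scores.append((s, a))
        new_factor[vals], best_choice[vals] = max(scores)
    for scope in touching:
        del factors[scope]
    key = tuple(neighbors)
    if key in factors:  # merge with an existing factor over the same scope
        new_factor = {k: factors[key][k] + v for k, v in new_factor.items()}
    factors[key] = new_factor
    return neighbors, best_choice

def coordinate(order):
    factors = {scope: dict(table) for scope, table in Q.items()}
    tape = [(agent,) + eliminate(agent, factors) for agent in order]
    best_value = factors[()][()]  # all agents eliminated
    action = {}                   # back-substitute in reverse order
    for agent, neighbors, best_choice in reversed(tape):
        action[agent] = best_choice[tuple(action[v] for v in neighbors)]
    return best_value, action

print(coordinate(["a1", "a2", "a3", "a4"]))
# -> (4.0, {'a4': 0, 'a3': 1, 'a2': 1, 'a1': 1})

Because each elimination step touches only the factors mentioning that agent, the cost is exponential only in the induced width of the coordination graph, not in the total number of agents, which is what makes this kind of coordination tractable at scale.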



Download Information
Carlos Guestrin (2003). Planning Under Uncertainty in Complex Structured Environments. Doctoral dissertation, Computer Science Department, Stanford University.
BibTeX citation

@phdthesis{Guestrin:phdthesis2003,
  author = {Carlos Guestrin},
  title = {Planning Under Uncertainty in Complex Structured Environments},
  school = {Computer Science Department, Stanford University},
  year = {2003},
  type = {Ph.D. Dissertation},
  month = {August},
  wwwfilebase = {phdthesis2003-guestrin},
  wwwtopic = {Factored MDPs}
}


