"Introduction to Stochastic Dynamic Programming"介绍了 随机动态规划的理论、应用、方法论等一系列。本书由学术泰斗 Sheldon Ross 编写,是学习 随机动态规划的极佳教材。另外,本书的扫描质量很高,已经经过 OCR, 支持文本搜索

Contents

I. Finite-Stage Models
1. Introduction
2. A Gambling Model
3. A Stock-Option Model
4. Modular Functions and Monotone Policies
5. Accepting the Best Offer
6. A Sequential Allocation Model
7. The Interchange Argument in Sequencing
Problems
Notes and References

II. Discounted Dynamic Programming
1. Introduction
2. The Optimality Equation and Optimal Policy
3. Method of Successive Approximations
4. Policy Improvement
5. Solution by Linear Programming
6. Extension to Unbounded Rewards
Problems
References

III. Minimizing Costs--Negative Dynamic Programming
1. Introduction and Some Theoretical Results
2. Optimal Stopping Problems
3. Bayesian Sequential Analysis
4. Computational Approaches
5. Optimal Search
Problems
References

IV. Maximizing Rewards--Positive Dynamic Programming
1. Introduction and Main Theoretical Results
2. Applications to Gambling Theory
3. Computational Approaches to Obtaining V
Problems
Notes and References

V. Average Reward Criterion
1. Introduction and Counterexamples
2. Existence of an Optimal Stationary Policy
3. Computational Approaches
Problems
Notes and References

VI. Stochastic Scheduling
1. Introduction
2. Maximizing Finite-Time Returns--Single Processor
3. Minimizing Expected Makespan--Processors in Parallel
4. Minimizing Expected Makespan--Processors in Series
5. Maximizing Total Field Life
6. A Stochastic Knapsack Model
7. A Sequential-Assignment Problem
Problems
Notes and References

VII. Bandit Processes
1. Introduction
2. Single-Project Bandit Processes
3. Multiproject Bandit Processes
4. An Extension and a Nonextension
5. Generalizations of the Classical Bandit Problem
Problems
Notes and References

Appendix: Stochastic Order Relations
1. Stochastically Larger
2. Coupling
3. Hazard-Rate Ordering
4. Likelihood-Ratio Ordering
Problems
Reference

Index

Preface

This text presents the basic theory and examines the scope of applications
of stochastic dynamic programming. Chapter I is a study of a variety of
finite-stage models, illustrating the wide range of applications of
stochastic dynamic programming. Later chapters study infinite-stage models:
discounting future returns in Chapter II, minimizing nonnegative costs in
Chapter III, maximizing nonnegative returns in Chapter IV, and maximizing
the long-run average return in Chapter V. Each of these chapters first
considers whether an optimal policy need exist--presenting counterexamples
where appropriate--and then presents methods for obtaining such policies
when they do. In addition, general areas of application are presented; for
example, optimal stopping problems are considered in Chapter III and a
variety of gambling models in Chapter IV. The final two chapters are
concerned with more specialized models. Chapter VI presents a variety of
stochastic scheduling models, and Chapter VII examines a type of process
known as a multiproject bandit.

The mathematical prerequisites for this text are relatively few. No prior
knowledge of dynamic programming is assumed and only a moderate
familiarity with probability--including the use of conditional
expectation--is necessary. I have attempted to present all proofs in as
intuitive a manner as possible. An appendix dealing with stochastic order
relations, which is needed primarily for the final two chapters, is
included. Throughout the text I use the terms increasing and nondecreasing
interchangeably.