“Introduction to Stochastic Dynamic Programming”介绍了 随机动态规划的理论、应用、方法论等一系列。本书由学术泰斗 Sheldon Ross 编写,是学习 随机动态规划的极佳教材。另外,本书的扫描质量很高,已经经过 OCR, 支持文本搜索CoPYRIGHT 1983, BY ACADEMIC PRESS, INC.ALL RIGHTS RESERVEDNO PART OF THIS PUBLICATION MAY BE REPRODUCED ORTRANSMITTED IN ANY FORM OR BY ANY MEANS. ELECTRONICOR MECHANICAL. INCLUDING PHOTOCOPY RECORDING. OR ANYINFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUTPERMISSION IN WRITING FROM THE PUBLISHERACADEMIC PRESS, INC.111 Fifth Avenue, New York, New York 10003United Kingdom Edition published bACADEMIC PRESS, INC. (LONDON) LTD24/28 Oval Road, London Nw1 7DXLibrary of Congress Cataloging in Publication DataRoss. She l don MInt roduct ion to stochast ic dynamic programming(Probab i l ity and mathematical statistics)Includes bibl iographies and indexynamic progStochastic progr ammingi. seriesT57.83R671982519.710382-18163SBN0-12-598420-0PRINTED IN THE UNITED STATES OF AMERICA8384858687654321To CelineContentsL. Finite-Stage Models1. Introduction2. A Gambling Model23. A Stock-Option model4. Modular Functions and monotone policies55. Accepting the Best Offer6. A Sequential Allocation Model7. The Interchange Argument in Sequencing17Problems21Notes and References26ll Discounted Dynamic ProgrammingIntroduction292. The Optimality Equation and Optimal Policy3. Method of Successive Approximations4. Policy Improvement385. Solution by Linear Programming406. Extension to Unbounded Rewards42Problems44RefeIll. Minimizing Costs-Negative Dynamic Programming1. Introduction and Some Theoretical results2. Optimal Stopping Problems51CoNTENTS3. Bayesian Sequential Analysis4. Computational approaches5. Optimal searchProblemsReferences71I. Maximizing Rewards--Positive Dynamic Programming1. Introduction and main Theoretical results2. Applications to Gambling Theory763. Computational Approaches to Obtaining V83Problems85Notes and references88V. Average Reward Criterion1. Introduction and Counterexamples2. Existence of an Optimal Stationary Policy3. Computational Approaches98Problems103Notes and References105VI. Stochastic Scheduling1072. Maximizing Finite-Time Returns-Single Processor1083. Minimizing Expected Makespan-Processors in Paralle4. Minimizing Expected Makespan-Processors in Series1145. Maximizing Total Field Life1186. A Stochastic Knapsack Model1227. A Sequential-Assignment Problem124Problems127Notes and references129Vl. Bandit Processes1312. Single-Project Bandit Processes1313. Multiproject Bandit Processes1334. An Extension and a nonextension5. Generalizations of the classical bandit problem145ProblemsNotes and references151CONTENTSAppendix: Stochastic Order Relations1. Stochastically Largerl532. Coupling1543. Hazard-Rate Orderingl564. Likelihood-Ratio Orderingl57Problems160Reference161Index163PrefaceThis text presents the basic theory and examines the scope of applicationsof stochastic dynamic programming. Chapter I is a study of a variety offinite-stage models, illustrating the wide range of applications of stochasticdynamic programming. Later chapters study infinite-stage models: discounting future returns in Chapter Il, minimizing nonnegative costs inChapter Ill, maximizing nonnegative returns in Chapter IV, and maximizing the long-run average return in Chapter V. Each of these chapters firstconsiders whether an optimal policy need exist--presenting counterexam-ples where appropriate-and then presents methods for obtaining suchpolicies when they do. In addition, general areas of application are presented; for example, optimal stopping problems are considered in ChapterI and a variety of gambling models in Chapter Iv. The final two chaptersare concerned with more specialized models. Chapter vi presents a varietyof stochastic scheduling models, and Chapter VII examines a type ofprocess known as a multiproject banditThe mathematical prerequisites for this text are relatively few. No priorknowledge of dynamic programming is assumed and only a moderatefamiliarity with probability-including the use of conditional expectation-is necessary. I have attempted to present all proofs in as intuitive amanner as possible. An appendix dealing with stochastic order relationswhich is needed primarily for the final two chapters, is included Through-out the text I use the terms increasing and nondecreasing interchangeably