Algorithm for mean-payoff learning for a black-box MDP
Operations Research : Applications and Algorithms
4th Edition
ISBN: 9780534380588
Author: Wayne L. Winston
Publisher: Brooks Cole
Chapter 20: Queuing Theory
Section 20.8: The M/G/1/GD/∞/∞ Queuing System
Problem 5P
Question
Input: MDP M, imprecision ε_MP > 0, MP-inconfidence δ_MP > 0, lower bound p_min on transition probabilities in M
Parameters: revisit threshold k ≥ 2, episode length n ≥ 1
Output: upon termination, an ε_MP-precise estimate of the maximum mean payoff for M with confidence 1 − δ_MP, i.e. an (ε_MP, 1 − δ_MP)-PAC estimate
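The specification above gives only the interface of the algorithm, not its body. As a hedged illustration of how the inputs δ_MP and p_min can be used in a black-box setting, the sketch below assumes a standard ingredient that the question itself does not state: the overall inconfidence is split over all possible transitions by a union bound, and each transition probability gets a one-sided Hoeffding lower bound from its sample counts. The function names and the count `num_state_action_pairs` are hypothetical, introduced only for this example.

```python
import math


def per_transition_inconfidence(delta_mp, p_min, num_state_action_pairs):
    """Split the overall inconfidence delta_mp across all transitions.

    Assumption: every transition has probability >= p_min, so each
    state-action pair has at most 1/p_min successors. A union bound over
    at most num_state_action_pairs / p_min transitions then keeps the
    total error probability below delta_mp.
    """
    return delta_mp * p_min / num_state_action_pairs


def hoeffding_width(num_samples, delta_t):
    """One-sided Hoeffding deviation for num_samples Bernoulli samples."""
    return math.sqrt(math.log(1.0 / delta_t) / (2.0 * num_samples))


def lower_bound(successor_count, num_samples, delta_t):
    """Lower confidence bound on one transition probability.

    successor_count: how often the successor was observed
    num_samples: how often the state-action pair was sampled
    """
    if num_samples == 0:
        return 0.0  # nothing observed yet, only the trivial bound holds
    estimate = successor_count / num_samples
    return max(0.0, estimate - hoeffding_width(num_samples, delta_t))
```

For example, `per_transition_inconfidence(0.1, 0.5, 4)` allots each transition an error budget of 0.0125. The bounds tighten as a state-action pair is sampled more often, which is why a learning loop of this kind keeps simulating episodes until the resulting bounds on the mean payoff are within ε_MP of each other.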