本文仅供学习使用
本文参考:
B站:DR_CAN
Richoard Bell man 最优化理论:
An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitude an optimal policy with regard to the state resulting from the first decision.
——动态Dynamic 面向未来