Patent Licensing Area
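Patent Title (Chinese): 於非對稱策略架構下以階層式強化學習訓練主策略的方法
Patent Title (English): Master Policy Training Method of Hierarchical Reinforcement Learning with Asymmetrical Policy Architecture
Patent Family: Taiwan (R.O.C.): I835638
United States: 2023-0362196 (publication number)
Patentee: National Tsing Hua University, 100%
Inventor: 李濬屹
Technical Field: Information Engineering, Electronics and Electrical Engineering
Patent Abstract (English)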
The present invention includes the following steps: loading a master policy, a plurality of sub-policies, and environment data, wherein each of the sub-policies has a different inference cost; selecting one of the sub-policies as a selected sub-policy by using the master policy; generating at least one action signal according to the selected sub-policy; applying the at least one action signal to an action executing unit; detecting at least one reward signal from an environment, wherein the at least one reward signal corresponds to at least one reaction of the action executing unit executing the at least one action signal; calculating a master reward signal of the master policy according to the at least one reward signal and the inference cost of the selected sub-policy; and training the master policy with the master reward signal to decide whether to select the selected sub-policy. The method thereby reduces the inference cost of a deep neural network model while still outputting a satisfactory result.
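The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the reward signal and the inference cost are combined, so the form used in `master_reward` (summed reward minus a weighted cost) is an assumption, as are the `cost_weight` value, the toy environment, and all names. A small tabular, epsilon-greedy master policy stands in for the deep neural network models mentioned in the abstract.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubPolicy:
    name: str
    inference_cost: float            # assumed cost units per action
    act: Callable[[float], float]    # maps an observation to an action


def master_reward(rewards: List[float], inference_cost: float,
                  cost_weight: float = 0.02) -> float:
    """Assumed form: summed environment reward minus a weighted inference cost."""
    return sum(rewards) - cost_weight * inference_cost * len(rewards)


def train(episodes: int = 2000, steps: int = 3, epsilon: float = 0.1,
          seed: int = 0) -> Dict[int, str]:
    rng = random.Random(seed)
    subs = [
        SubPolicy("cheap", 1.0, lambda x: float(round(x))),  # coarse, low cost
        SubPolicy("expensive", 5.0, lambda x: x),            # exact, high cost
    ]
    # Tabular master policy: one value estimate per (state, sub-policy) pair.
    q = {s: [0.0] * len(subs) for s in (0, 1)}
    n = {s: [0] * len(subs) for s in (0, 1)}

    for _episode in range(episodes):
        state = rng.randint(0, 1)  # 0 = easy task, 1 = hard task
        # The master policy selects one sub-policy (epsilon-greedy).
        if rng.random() < epsilon:
            choice = rng.randrange(len(subs))
        else:
            choice = max(range(len(subs)), key=lambda i: q[state][i])
        sub = subs[choice]

        rewards = []
        for _step in range(steps):
            # Toy environment: the best action equals the target value.
            if state == 0:
                target = float(rng.choice([-1, 0, 1]))
            else:
                target = rng.uniform(-1, 1)
            action = sub.act(target)               # action signal
            rewards.append(-abs(action - target))  # reward signal

        # The master reward combines task reward with the inference cost.
        r = master_reward(rewards, sub.inference_cost)
        n[state][choice] += 1
        q[state][choice] += (r - q[state][choice]) / n[state][choice]

    # Report the sub-policy the trained master policy prefers in each state.
    return {s: subs[max(range(len(subs)), key=lambda i: q[s][i])].name
            for s in q}


if __name__ == "__main__":
    print(train())
```

In this toy setting the cheap sub-policy is already exact on easy tasks, so the cost term favors it there. On hard tasks its rounding error outweighs the extra cost of the expensive sub-policy. A larger `cost_weight` would push the master policy toward the cheap sub-policy in both states.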
Contact Information
Contact person: 李曉琪
Phone: 03-5715131 ext. 31061
Email: hsiaochi@mx.nthu.edu.tw