Patent Licensing Area, National Tsing Hua University Operations Center for Industry Collaboration

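Patent title (Chinese)	使用多智能體遷移式強化學習的再生能源競價方法
Patent title (English)	METHOD FOR RENEWABLE ENERGY BIDDING USING MULTIAGENT TRANSFER REINFORCEMENT LEARNING
Patent family	Taiwan (R.O.C.): I779732
Patentee	National Tsing Hua University 100.00%
Inventors	邱偉育, 邱崑晏
Technical fields	Communications and transmission, energy technology, information engineering, electronics and electrical engineering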

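Patent abstract (Chinese, translated)
A renewable energy bidding method comprising the following steps: constructing an actor network that takes as input an actor state, comprising the supply amount of each energy supplier and the total demand amount of multiple energy demanders, and outputs the electricity sale quotation of the energy supplier; constructing a critic network that takes as input a critic state, comprising the actor state, the next actor state, and the reward obtained by adopting the electricity sale quotation, and outputs a value function; updating the parameters of the critic network through stochastic gradient descent to minimize the temporal difference error; updating the parameters of the actor network through stochastic gradient ascent to maximize the reward accumulated by the energy supplier; and transferring the updated parameters of the actor network to a new energy supplier and repeating the above updating steps to maximize the reward accumulated by the new energy supplier.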

Patent abstract (English)
A method for renewable energy bidding is provided. In the method, an actor network is created by using an actor state including a supply amount of each energy supplier and a total demand amount of multiple energy demanders as an input and using an electricity sale quotation of the energy supplier as an output. A critic network is created by using a critic state including the actor state, a next actor state and a reward obtained by adopting the electricity sale quotation as an input and using a value function as an output. Parameters of the critic network are updated through stochastic gradient descent to minimize a temporal difference error. Parameters of the actor network are updated through stochastic gradient ascent to maximize the reward accumulated by the energy supplier. The updated parameters of the actor network are transferred to a new energy supplier and the aforesaid updating steps are repeated to maximize the reward accumulated by the new energy supplier.
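
The update steps in the abstract can be illustrated with a minimal actor-critic sketch. This is not the patented implementation: it assumes linear function approximators in place of neural networks, a Gaussian bidding policy, a critic that scores only the current state (the abstract's critic also takes the next state and the reward as input), and a single supplier trained in isolation rather than several interacting agents. The names `ActorCritic`, `toy_market`, `train`, and `transfer`, the discount factor, the learning rates, and the market rule are all hypothetical.

```python
import random

GAMMA = 0.9  # discount factor (assumed value, not from the abstract)

def features(state):
    """Actor state = (own supply, total demand); a bias term is appended."""
    supply, demand = state
    return [supply, demand, 1.0]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

class ActorCritic:
    """One energy supplier: linear Gaussian actor plus linear critic (illustrative)."""

    def __init__(self, actor_w=None, sigma=0.2, lr_actor=0.002, lr_critic=0.05):
        # Copy the weights so a transferred agent does not share state with its source.
        self.actor_w = list(actor_w) if actor_w is not None else [0.0, 0.0, 0.5]
        self.critic_w = [0.0, 0.0, 0.0]
        self.sigma = sigma
        self.lr_actor = lr_actor
        self.lr_critic = lr_critic

    def bid(self, state, rng):
        """Sample a sale quotation from a Gaussian centred on the actor output."""
        return rng.gauss(dot(self.actor_w, features(state)), self.sigma)

    def update(self, state, price, reward, next_state):
        x, x_next = features(state), features(next_state)
        # Temporal difference error: r + gamma * V(s') - V(s).
        td_error = reward + GAMMA * dot(self.critic_w, x_next) - dot(self.critic_w, x)
        # Gradient of log N(price; mean, sigma) with respect to the mean.
        score = (price - dot(self.actor_w, x)) / self.sigma ** 2
        # Critic: stochastic (semi-)gradient descent on the squared TD error.
        self.critic_w = [w + self.lr_critic * td_error * xi
                         for w, xi in zip(self.critic_w, x)]
        # Actor: stochastic gradient ascent on expected accumulated reward.
        self.actor_w = [w + self.lr_actor * td_error * score * xi
                        for w, xi in zip(self.actor_w, x)]
        return td_error

def toy_market(state, price, cap=1.0):
    """Hypothetical market rule: a bid within [0, cap] sells min(supply, demand)."""
    supply, demand = state
    sold = min(supply, demand) if 0.0 <= price <= cap else 0.0
    return price * sold

def random_state(rng):
    return (rng.uniform(0.5, 1.0), rng.uniform(0.5, 1.0))

def train(agent, steps, rng):
    """Run the critic and actor updates for a number of bidding rounds."""
    total = 0.0
    state = random_state(rng)
    for _ in range(steps):
        price = agent.bid(state, rng)
        reward = toy_market(state, price)
        next_state = random_state(rng)
        agent.update(state, price, reward, next_state)
        state = next_state
        total += reward
    return total

def transfer(trained):
    """Warm-start a new supplier from a trained actor; the critic starts fresh (assumption)."""
    return ActorCritic(actor_w=trained.actor_w, sigma=trained.sigma,
                       lr_actor=trained.lr_actor, lr_critic=trained.lr_critic)

if __name__ == "__main__":
    rng = random.Random(0)
    veteran = ActorCritic()
    train(veteran, 2000, rng)
    newcomer = transfer(veteran)  # transfer step, then repeat the same updates
    print("newcomer reward over 200 rounds:", train(newcomer, 200, rng))
```

The transfer step here simply copies the actor weights into a new agent before training continues, so the new supplier starts from a learned bidding policy instead of a default one. Whether that shortens training in practice depends on how similar the new supplier's market conditions are to the original's, which this toy market does not model.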

Contact information
Contact person	李曉琪
Contact phone	03-5715131 #31061
Contact email	hsiaochi@mx.nthu.edu.tw