A method and an apparatus for planning the energy usage of a charging station based on reinforcement learning are provided. In the method, multiple system states are defined by the power demand and remaining battery energy of the charging station itself, together with the global power demand and internal power price of an energy sharing area. The expected return of arranging each energy use action under each system state is estimated to construct a reinforcement learning table. According to the reinforcement learning table, an energy use action suited to the current system state is selected and uploaded to a coordinator device. The traded electricity arranged by the coordinator device and the reward for adopting the energy use action, as calculated by the coordinator device, are then used to update the reinforcement learning table. The current system state, the energy use action, the reward, and the number of times the system state has been selected are recorded and used to generate a simulation environment, from which the overall benefit of arranging each energy use action under each system state is calculated and the reinforcement learning table is updated accordingly.
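The loop described above (select an action from the table, update the table with the coordinator's reward, record the experience, then replay it in a simulated environment) can be illustrated with a minimal sketch. The abstract does not name an update rule, so the code below assumes a tabular Q-learning update with Dyna-Q-style planning. The class `StationAgent`, the action set `ACTIONS`, the hyperparameters `alpha`, `gamma`, and `epsilon`, and the use of visit counts as replay weights are all hypothetical choices made for illustration. A state is assumed to be a tuple of discretized values (own power demand, remaining battery energy, global power demand, internal power price), and the next state is assumed to already reflect the traded electricity arranged by the coordinator.

```python
import random
from collections import defaultdict

# Hypothetical action set; the abstract does not enumerate the actions.
ACTIONS = ("buy", "sell", "idle")


class StationAgent:
    """Tabular agent for one charging station (illustrative assumption)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)     # (state, action) -> expected return
        self.model = {}                 # (state, action) -> (reward, next_state)
        self.visits = defaultdict(int)  # state -> number of times selected
        self.alpha = alpha              # learning rate (assumed)
        self.gamma = gamma              # discount factor (assumed)
        self.epsilon = epsilon          # exploration rate (assumed)
        self.rng = random.Random(seed)

    def select_action(self, state):
        """Pick an energy use action for the current state (epsilon-greedy)."""
        self.visits[state] += 1
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def _update(self, state, action, reward, next_state):
        # Assumed Q-learning rule: move the estimate toward the bootstrapped target.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def learn(self, state, action, reward, next_state):
        """Update the table with the coordinator's reward and record the experience."""
        self._update(state, action, reward, next_state)
        self.model[(state, action)] = (reward, next_state)

    def plan(self, n_steps=10):
        """Replay recorded experience as a simulated environment."""
        if not self.model:
            return
        keys = list(self.model)
        # Frequently selected states are replayed more often (assumed weighting);
        # the +1 keeps every weight positive.
        weights = [self.visits[s] + 1 for s, _ in keys]
        for _ in range(n_steps):
            s, a = self.rng.choices(keys, weights=weights)[0]
            r, s2 = self.model[(s, a)]
            self._update(s, a, r, s2)
```

In use, a station would call `select_action(state)`, upload the result to the coordinator, pass the returned reward and resulting state to `learn(...)`, and then call `plan(...)` to refine the table from recorded experience without further interaction with the coordinator.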