The disclosure provides a communication time allocation method using reinforcement learning for a wireless powered communication network, and a base station. The method includes: determining a communication time allocation for the t-th time block according to an objective function associated with the total estimated throughput of the communication nodes; requesting each communication node to perform specific communication behaviors during its corresponding communication time interval in the t-th time block; obtaining the actual throughput of each communication node in the t-th time block; and generating the weight vector of each communication node for the (t+1)-th time block according to that node's actual throughput, weight vector, and estimated throughput in the t-th time block.
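The four steps above can be illustrated with a minimal Python sketch. The abstract does not specify the objective function, the form of the throughput estimate, or the weight update rule, so everything concrete here is an assumption made for illustration: a linear estimate over a hypothetical feature vector `[1, tau_0, tau_i]`, a search over a finite list of candidate allocations, and a least-mean-squares style update with learning rate `lr`. The names `features`, `estimate`, `choose_allocation`, `update_weights`, `run`, and `measure` are likewise hypothetical.

```python
import numpy as np


def features(allocation, i):
    """Hypothetical feature vector for node i: [1, tau_0, tau_i].

    allocation[0] is assumed to be a shared interval (tau_0) and
    allocation[i + 1] the interval assigned to node i.
    """
    return np.array([1.0, allocation[0], allocation[i + 1]])


def estimate(weights, allocation):
    """Estimated throughput of every node (assumed linear in the features)."""
    return np.array([w @ features(allocation, i) for i, w in enumerate(weights)])


def choose_allocation(weights, candidates):
    """Step 1: pick the candidate that maximizes total estimated throughput."""
    totals = [estimate(weights, a).sum() for a in candidates]
    return candidates[int(np.argmax(totals))]


def update_weights(weights, allocation, actual, lr=0.1):
    """Step 4: next-block weights from actual throughput, current weights,
    and estimated throughput (assumed LMS-style update)."""
    est = estimate(weights, allocation)
    return [
        w + lr * (actual[i] - est[i]) * features(allocation, i)
        for i, w in enumerate(weights)
    ]


def run(candidates, measure, n_nodes, n_blocks, lr=0.1):
    """Iterate over time blocks; `measure` stands in for steps 2 and 3,
    i.e. the nodes acting on the allocation and reporting actual throughput."""
    weights = [np.zeros(3) for _ in range(n_nodes)]
    history = []
    for _ in range(n_blocks):
        allocation = choose_allocation(weights, candidates)
        actual = measure(allocation)
        weights = update_weights(weights, allocation, actual, lr)
        history.append((allocation, actual))
    return weights, history
```

In this sketch, `measure` would be replaced by whatever mechanism the base station uses to collect per-node throughput after the time block ends. The finite candidate list is only a convenience for the example; a real objective could be optimized over a continuous allocation instead.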