A discount-based time-of-use electricity pricing strategy for demand response with minimum information using reinforcement learning

Fraija, A., Agbossou, K., Henao, N., Kelouwani, S., Fournier, M. and Hosseini, S. S. (2022). A discount-based time-of-use electricity pricing strategy for demand response with minimum information using reinforcement learning. IEEE Access, 10, pp. 54018-54028. ISSN 2169-3536. DOI 10.1109/ACCESS.2022.3175839

PDF available under a Creative Commons Attribution license.
Abstract

Demand Response (DR) programs show great promise for energy saving and load profile flattening. They offer an opportunity to control end-users' demand indirectly through different price policies. However, the difficulty of characterizing customers' price-responsive behavior is a significant obstacle to selecting these policies optimally. This paper proposes a Demand Response Aggregator (DRA) for transactive policy generation that combines a Reinforcement Learning (RL) technique on the aggregator side with a convex optimization problem on the customer side. The proposed DRA maintains users' privacy by using the DR as its only source of information. In addition, it avoids mistakenly penalizing users by offering price discounts as an incentive, which yields a satisfying multi-agent environment. With ensured convergence, the resulting DRA is capable of learning adaptive Time-of-Use (ToU) tariffs and generating near-optimal price policies. Moreover, this study suggests an offline training procedure that addresses the convergence time of RL algorithms. This procedure can notably expedite DRA convergence and, in turn, enable online applications. The developed method is applied to a set of residential agents, who benefit by regulating their thermal loads according to the generated price policies. The efficiency of the proposed approach is thoroughly evaluated from the standpoint of the aggregator and of the customers in terms of load shifting and comfort maintenance, respectively. In addition, the superior performance of the selected RL method is demonstrated through a comparative study. A further assessment uses a coordination algorithm to validate the competitiveness of the recommended DR program. The multifaceted evaluation demonstrates that the designed scheme can significantly improve the quality of the aggregated load profile with only a small reduction in the aggregator's income.
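
The interaction described in the abstract (an aggregator that learns discounts while customers solve their own convex problem and reveal only their demand) can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not specify the RL method, the customer model, or the reward, so the quadratic comfort cost, the reward trade-off, the stateless epsilon-greedy learner, and every constant below are assumptions chosen for illustration.

```python
import random

# All numbers below are hypothetical and chosen only for illustration.
BASE_PRICE = 1.0
DISCOUNTS = [0.0, 0.1, 0.2, 0.3]          # candidate off-peak discounts
OFF_PEAK = [True, True, False, False]     # two off-peak periods, two peak periods
CUSTOMERS = [[1.0, 1.0, 2.0, 2.0],        # preferred load of each customer per period
             [1.0, 1.0, 2.0, 2.0]]
COMFORT_WEIGHT = 2.0                      # penalty on deviating from the preferred load
INCOME_WEIGHT = 1.25                      # assumed trade-off between income and flatness


def tariff(discount):
    """Time-of-use prices: the discount applies to off-peak periods only."""
    return [BASE_PRICE * (1.0 - discount) if off else BASE_PRICE for off in OFF_PEAK]


def customer_response(prices, preferred):
    """Closed-form minimizer of sum_t p_t*x_t + (w/2)*(x_t - d_t)**2 with x_t >= 0."""
    return [max(0.0, d - p / COMFORT_WEIGHT) for p, d in zip(prices, preferred)]


def aggregate_load(prices):
    """The only signal the aggregator sees: total demand per period."""
    responses = [customer_response(prices, pref) for pref in CUSTOMERS]
    return [sum(period) for period in zip(*responses)]


def reward(discount):
    """Assumed reward: retained income minus the max-min spread of the load."""
    prices = tariff(discount)
    load = aggregate_load(prices)
    income = sum(p * x for p, x in zip(prices, load))
    return INCOME_WEIGHT * income - (max(load) - min(load))


def train(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Stateless epsilon-greedy value learning over the discount levels."""
    rng = random.Random(seed)
    q = [0.0] * len(DISCOUNTS)
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.randrange(len(DISCOUNTS))                   # explore
        else:
            a = max(range(len(DISCOUNTS)), key=q.__getitem__)   # exploit
        q[a] += alpha * (reward(DISCOUNTS[a]) - q[a])
    return q


q_values = train()
best_discount = DISCOUNTS[max(range(len(DISCOUNTS)), key=q_values.__getitem__)]
print(best_discount)
```

In this sketch the aggregator never reads the customers' preferred loads or comfort weights; it only evaluates the aggregate response to each tariff, which mirrors the minimum-information idea. Prices are only ever lowered from the base tariff, so no customer pays more than the undiscounted rate.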

Document type:	Article
Keywords:	Demand response; Demand response aggregator; Time-of-use tariffs; Reinforcement learning
Date deposited:	22 Jan 2026 14:19
Last modified:	22 Jan 2026 14:19
Deposited version:	Publisher's official version
URI:	https://depot-e.uqtr.ca/id/eprint/12543
