Fraija, A., Agbossou, K., Henao, N., Kelouwani, S., Fournier, M. and Hosseini, S. S. (2022). A discount-based time-of-use electricity pricing strategy for demand response with minimum information using reinforcement learning. IEEE Access, 10, pp. 54018-54028. ISSN 2169-3536. DOI 10.1109/ACCESS.2022.3175839
Available under a Creative Commons Attribution license.
Abstract
Demand Response (DR) programs show great promise for energy saving and load profile flattening. They offer an opportunity for indirect control of end-users' demand through different price policies. However, the difficulty of characterizing customers' price-responsive behavior is a significant obstacle to selecting these policies optimally. This paper proposes a Demand Response Aggregator (DRA) for transactive policy generation that combines a Reinforcement Learning (RL) technique on the aggregator side with a convex optimization problem on the customer side. The proposed DRA can maintain users' privacy by exploiting the DR as the only source of information. In addition, it can avoid mistakenly penalizing users by offering price discounts as an incentive, yielding a satisfying multi-agent environment. With ensured convergence, the resulting DRA is capable of learning adaptive Time-of-Use (ToU) tariffs and generating near-optimal price policies. Moreover, this study suggests an offline training procedure that addresses the convergence time of RL algorithms. The suggested process can notably expedite DRA convergence and, in turn, enable online applications. The developed method is applied to a set of residential agents to benefit them by regulating their thermal loads according to the generated price policies. The efficiency of the proposed approach is thoroughly evaluated from the standpoint of the aggregator and the customers in terms of load shifting and comfort maintenance, respectively. In addition, the superior performance of the selected RL method is demonstrated through a comparative study. An additional assessment using a coordination algorithm validates the competitiveness of the recommended DR program. The multifaceted evaluation demonstrates that the designed scheme can significantly improve the quality of the aggregated load profile with a small reduction in the aggregator's income.
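The interaction described in the abstract (a learning aggregator that sees only the demand response, and customers who each solve a convex problem) can be illustrated with a toy sketch. This is not the paper's model or algorithm: the two-period tariff, the quadratic discomfort term, the reward weighting, every numeric value, and the use of a simple epsilon-greedy bandit in place of the paper's RL method are all assumptions made here for illustration.

```python
import random

# All values below are hypothetical, chosen only to make the sketch runnable.
PEAK_PRICE, OFFPEAK_PRICE = 0.20, 0.10          # base tariff per kWh
DISCOUNTS = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]      # candidate off-peak discounts
# (baseline peak kWh, baseline off-peak kWh, discomfort coefficient)
CUSTOMERS = [(6.0, 2.0, 0.10), (5.0, 3.0, 0.20), (8.0, 2.0, 0.05)]


def customer_response(discount, base_peak, base_off, discomfort):
    """Customer side: shift s kWh from peak to off-peak by minimizing the
    convex cost  p_peak*(base_peak - s) + p_off*(1 - d)*(base_off + s)
    + 0.5*discomfort*s**2  over 0 <= s <= base_peak (closed form)."""
    saving_per_kwh = PEAK_PRICE - OFFPEAK_PRICE * (1.0 - discount)
    shift = min(max(saving_per_kwh / discomfort, 0.0), base_peak)
    return base_peak - shift, base_off + shift


def aggregator_reward(discount, income_weight=4.5):
    """Aggregator side: only the aggregated loads are observed. The reward
    trades a flatter profile against lost income (assumed weighting)."""
    peak = off = 0.0
    for base_peak, base_off, discomfort in CUSTOMERS:
        p, o = customer_response(discount, base_peak, base_off, discomfort)
        peak += p
        off += o
    income = PEAK_PRICE * peak + OFFPEAK_PRICE * (1.0 - discount) * off
    return -abs(peak - off) + income_weight * income


def train(episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over the discount levels (a stand-in for RL)."""
    rng = random.Random(seed)
    n = len(DISCOUNTS)
    q = [0.0] * n          # running mean reward per discount level
    counts = [0] * n
    for t in range(episodes):
        if t < n:                          # try every action once first
            a = t
        elif rng.random() < epsilon:       # explore
            a = rng.randrange(n)
        else:                              # exploit the current estimate
            a = max(range(n), key=q.__getitem__)
        r = aggregator_reward(DISCOUNTS[a])
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]
    return q


def learned_discount():
    q = train()
    return DISCOUNTS[max(range(len(q)), key=q.__getitem__)]


if __name__ == "__main__":
    print("learned off-peak discount:", learned_discount())
```

Because this toy environment is deterministic, the learned discount coincides with the one found by exhaustively scoring every level with `aggregator_reward`. With noisy or time-varying customer behavior, that guarantee would not hold and a full RL formulation would be needed.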
| Document type: | Article |
|---|---|
| Free keywords: | Demand response; Demand response aggregator; Time-of-use tariffs; Reinforcement learning |
| Deposit date: | 22 Jan. 2026 14:19 |
| Last modified: | 22 Jan. 2026 14:19 |
| Version of deposited document: | Publisher's official version |
| URI: | https://depot-e.uqtr.ca/id/eprint/12543 |