Reinforcement learning for arbitrarily large systems is possible
Work in Jr-Shin Li’s lab develops mathematically rigorous and computationally efficient techniques to transform extremely complex reinforcement learning problems into a manageable domain
From autonomous cars to video games, reinforcement learning can have an important impact. That is especially true when you are the passenger, late for dinner, in an autonomous car that has learned the most efficient way home.
Jr-Shin Li, the Newton R. and Sarah Louisa Glasgow Wilson Professor at the McKelvey School of Engineering at Washington University in St. Louis, co-authored a paper on reinforcement learning with his postdoctoral research associate, Wei Zhang, with a particular focus on infinite-dimensional systems. The paper was published Dec. 29, 2025, in the Journal of Machine Learning Research.
“Reinforcement learning involves learning through interaction with environments to attain optimal results,” Li said. “You want to win, but you need to take actions that contribute to your goal. You aim to make moves that yield the highest gains and keep doing this following certain principles to ultimately reach the best possible rewards.”
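To make that loop concrete, here is a minimal sketch of learning by interaction, using textbook tabular Q-learning on a hypothetical five-state corridor where the only reward sits at the far right end. It is a generic illustration, not the method in Li and Zhang's paper; the function names, the corridor environment and every parameter value are assumptions chosen for the example.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor: start at 0, goal at the right end."""
    rng = random.Random(seed)
    # q[s][a] estimates the long-run reward of action a in state s (0 = left, 1 = right).
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap the episode length
            # Explore occasionally; otherwise take the action that currently looks best.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1  # ties go right
            s_next = max(s - 1, 0) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0
            # Move the estimate toward the reward plus the discounted best future value.
            future = 0.0 if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (reward + future - q[s][a])
            s = s_next
            if s == goal:
                break
    return q


def greedy_policy(q):
    """Best-looking action in each non-goal state."""
    return ["right" if row[1] >= row[0] else "left" for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(q_learning_chain()))
```

The table `q` has one row per state, which is why this style of learning becomes impractical as the number of states grows.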
But if the system is extremely large, you must account for the movements of hundreds of thousands of factors, a task that can seem to take forever, he said.
“This is part of where our research into infinite-dimensional systems becomes crucial,” Li said. “The most vital aspect is that when a system is very large-scale, traditional approaches often fail because they cannot handle the growing complexity and are not scalable. Our approach involves developing a novel transformation to map the problem into another domain with a much more simplified form. The transformed problem remains equivalent to the original, but this transformation allows us to tackle the problem in a way that enables the development of valid approximations that offer very reliable solutions. It means that by knowing how good the approximations are, we can get as close to the original solution as we desire. This can aid in analysis and facilitate effective learning.”
Li said the proposed reinforcement learning approach involves a new formulation and the derivation of effective algorithms to find optimal outcomes for arbitrarily large systems.
“Specifically, a hierarchical algorithm is established for learning optimal policies for infinite-dimensional systems, and this algorithm becomes highly efficient by incorporating early stopping at each hierarchy with a spectral convergence guarantee,” he said.
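The general idea of climbing a hierarchy of approximations and stopping early can be sketched with a stand-in problem. The example below does not reproduce the paper's algorithm. It approximates a smooth function on [-1, 1] with Chebyshev polynomials of increasing degree and stops once an added level no longer changes the answer by more than a tolerance. The function names, the tolerance and the choice of Chebyshev polynomials are all assumptions made for illustration.

```python
import math


def chebyshev_coeffs(f, degree):
    """Coefficients of the degree-`degree` Chebyshev interpolant of f on [-1, 1]."""
    m = degree + 1
    thetas = [math.pi * (j + 0.5) / m for j in range(m)]
    values = [f(math.cos(t)) for t in thetas]  # samples at Chebyshev nodes
    coeffs = []
    for k in range(m):
        c = 2.0 / m * sum(v * math.cos(k * t) for v, t in zip(values, thetas))
        coeffs.append(c / 2.0 if k == 0 else c)
    return coeffs


def evaluate(coeffs, x):
    """Evaluate sum_k c_k T_k(x) using T_k(x) = cos(k * arccos(x))."""
    theta = math.acos(x)
    return sum(c * math.cos(k * theta) for k, c in enumerate(coeffs))


def refine_until_stable(f, tol=1e-8, max_degree=40):
    """Raise the truncation level until successive approximations agree within tol."""
    grid = [-1.0 + 2.0 * i / 200 for i in range(201)]
    coeffs = chebyshev_coeffs(f, 0)
    previous = [evaluate(coeffs, x) for x in grid]
    for degree in range(1, max_degree + 1):
        coeffs = chebyshev_coeffs(f, degree)
        current = [evaluate(coeffs, x) for x in grid]
        change = max(abs(a - b) for a, b in zip(current, previous))
        if change < tol:  # early stopping: the extra level added almost nothing
            return degree, coeffs
        previous = current
    return max_degree, coeffs


if __name__ == "__main__":
    level, _ = refine_until_stable(math.exp)
    print("stopped at level", level)
```

For a smooth function such as `math.exp`, the change between levels shrinks very quickly, so the loop stops well before `max_degree`. This kind of rapid decay is what the term "spectral convergence" usually refers to.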
“Our work can touch on so many areas, including medicine,” he said. “And so much technology is only getting more complex. We hope to be a part of the solution.”
Zhang W, Li J-S. Journal of Machine Learning Research, 26(214):1–52, 2025. https://jmlr.org/papers/v26/24-1575.html