AI ‘CHEF’ could help those with cognitive decline complete home tasks

AI-occupational therapy team leverages novel vision-language models to detect cognitive errors

Beth Miller 
A team of computer scientists and occupational therapists collaborated to integrate two novel vision-language models that create a potential AI assistant that may help those with cognitive decline remain independent for longer. (Credit: Ruiqi Wang)

In the United States, 11% of adults over age 45 self-report some cognitive decline, which may impact their ability to care for themselves and perform tasks such as cooking or paying bills. A team of Washington University in St. Louis researchers has integrated two novel vision-language models to create a potential AI assistant that may help these individuals cook meals and remain independent. 

Ruiqi Wang, a doctoral student in the lab of Chenyang Lu, the Fullgraf Professor in the Department of Computer Science & Engineering and director of the AI for Health Institute, worked with Lisa Tabor Connor, associate dean and director of occupational therapy at WashU Medicine and the Elias Michael Professor of Occupational Therapy and professor of neurology, and her team to collect video data of more than 100 individuals with and without subjective cognitive decline completing a cooking task. By combining vision-language models that recognize human action with an algorithm that detects cognitive sequencing errors, they have taken a step toward creating a nonintrusive AI-based assistant for these individuals.

Results of their work on the system, named the Cognitive Human Error Detection Framework with Vision-Language Models (CHEF-VL), were published in the Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies in December 2025 and will be presented at UbiComp/ISWC 2026. The research also earned Wang a Google PhD Fellowship in October 2025, making him the first McKelvey Engineering student to receive the highly competitive honor. 

Connor’s team of occupational therapists was looking for a way to help people with mild cognitive decline by creating a tool that would support them without help from a human caregiver. Of the four tasks in the Executive Function Performance Test (cooking, making a phone call, paying bills and taking medications), they chose to observe cooking. 

For their experiment, Connor’s team set up a smart kitchen equipped with an overhead camera. Each participant was given step-by-step instructions to make oatmeal on the stove. The camera captured how each participant handled utensils, measured ingredients and followed the sequence of instructions, which included gathering ingredients, boiling water, adding the oats, cooking the oats for two minutes, stirring, serving, then returning all dishes to the sink. Occupational therapy students closely watched the order in which the actions were completed and provided supportive cues when participants made an error or when there were safety issues, such as water boiling over the pot. 

Wang said the CHEF-VL system first captured video of the individuals cooking, then used the team’s AI models to analyze how their performance aligned with the given instructions.

“We realize even people without cognitive decline make mistakes during cooking, but this can be a very challenging task for those experiencing cognitive decline,” Wang said. “The vision-language model is a state-of-the-art AI model that jointly understands text, images and videos. It demonstrates strong off-the-shelf understanding of the real world, along with reasoning capabilities. This is exactly what we want in the smart kitchen because the way people complete tasks can be diverse.”
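
To illustrate the sequencing-error idea in the simplest possible terms, the sketch below compares a hypothetical list of recognized actions against the instructed oatmeal steps described above and flags omitted or out-of-order steps. The step names come from the article; the function names, data structures and logic are simplified assumptions for illustration only and do not represent the actual CHEF-VL method.

# Conceptual sketch only: checks a recognized action sequence against an
# instructed step order. Names and logic here are hypothetical and are not
# taken from the CHEF-VL paper.

# Instructed oatmeal steps, in order (from the article's task description).
INSTRUCTED_STEPS = [
    "gather ingredients",
    "boil water",
    "add oats",
    "cook oats",
    "stir",
    "serve",
    "return dishes to sink",
]

def flag_sequencing_errors(recognized_steps: list[str]) -> list[str]:
    """Flag omitted and out-of-order steps relative to the instructed order.

    `recognized_steps` stands in for per-segment action labels that a
    vision-language model might produce from the overhead video.
    """
    errors = []

    # Omission errors: instructed steps never observed in the video.
    for step in INSTRUCTED_STEPS:
        if step not in recognized_steps:
            errors.append(f"omitted: {step}")

    # Sequencing errors: observed steps that occur before a step that
    # should have preceded them.
    expected_rank = {step: i for i, step in enumerate(INSTRUCTED_STEPS)}
    last_rank = -1
    for step in recognized_steps:
        rank = expected_rank.get(step)
        if rank is None:
            continue  # action outside the instructed recipe; ignored here
        if rank < last_rank:
            errors.append(f"out of order: {step}")
        last_rank = max(last_rank, rank)

    return errors

if __name__ == "__main__":
    # Example: oats added before the water was boiled, and no stirring observed.
    observed = ["gather ingredients", "add oats", "boil water",
                "cook oats", "serve", "return dishes to sink"]
    for err in flag_sequencing_errors(observed):
        print(err)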

During the experiments, Connor’s team coded the errors that the individuals made while making the oatmeal so they could cross-check the validity of the computer algorithm. 

“We could see if the algorithm was working by what it was detecting and determine which errors were more difficult to detect, then work with Ruiqi and Chenyang’s team to make adjustments,” Connor said.

Lu said this model exceeds the ability of paper-based cognitive tests, which do not necessarily reflect an individual’s capability to perform these daily functions.

“The initial work for this was very hard to do, and I give a lot of credit to Ruiqi and the team,” Lu said. “The game changer was the recent emergence of large vision-language models that can understand text and video. This is an excellent example of applying the cutting-edge AI to a vital health problem with tremendous public health impact.”

Connor said there is more work to do to refine the model before it can be used in real-world situations. 

“We're up for it,” she said. “My whole lab is interested in this, and the beauty is our collaboration with the computer science team. We will all figure out what's next.”

As they continue their work, Wang has a specific goal in mind.

“Looking into the future, we want to build this system that can support people to be more independent, remain in their home and boost their self-confidence, while also being beneficial to community health. This platform will be an initial step forward for future assistive technologies.” 


Wang R, Gao P, Lynch P, Liu T, Lee Y, Baum CM, Connor LT, Lu C. CHEF-VL: Detecting Cognitive Sequencing Errors in Cooking with Vision-language Models. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, December 2025. DOI: https://dl.acm.org/doi/10.1145/3770714

This research was supported with funding by the Fullgraf Foundation and by a Washington University in St. Louis Here and Next Seed Grant.

 
