Large language models to learn how to reason
Jiaxin Huang to focus on adaptive knowledge synthesis with NSF CAREER Award
Large language models (LLMs) are a part of daily life, designed to simplify tasks ranging from a simple information search to summarizing complex text to finding foreign legal regulations. But commonly used LLMs may hold only the partial knowledge stored in their parameters during pretraining, so they may be out of date and lack connection with other LLMs.
With a five-year, nearly $600,000 CAREER Award from the National Science Foundation, Jiaxin Huang, assistant professor of computer science & engineering in the McKelvey School of Engineering at Washington University in St. Louis, plans to create an efficient, multi-step reasoning framework for LLMs that could integrate external knowledge from different sources to generate a solution or conclusion.
CAREER awards support junior faculty who model the role of teacher-scholar through outstanding research, excellence in education and the integration of education and research within the context of the mission of their organization. At least one-third of current McKelvey Engineering faculty have received the award.
Huang plans to create a framework that would act as a commander, seeking information from other LLMs and assigning them different tasks: some would simply retrieve information, while others would reason through a cumulative sequence of actions.
“If all LLMs are the same, they might be doing the general work well, but aren’t very specialized in their assigned roles,” she said. “I want to explore ways to parameterize the models so they can gain their specific skills and allow them to be more capable of their own job.”
Huang said LLMs need to read every word of a source, which can be computationally expensive, so she will look for more efficient ways to read input at scale.
“We want to compress their thinking process while letting them think in the parameter space so there can be more exploration,” she said. “We can teach them how to efficiently process the input and output if the length is too long.”
In addition, Huang plans to evaluate her framework for robustness and efficiency, determining whether conflicts between fake and real documents appear in its results.
“We will look at accuracy versus time cost,” she said. “If our model gets the same accuracy in less time, it’s more efficient.”
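The accuracy-versus-time comparison Huang describes can be illustrated with a minimal Python sketch. This is not her evaluation code: the function names `evaluate` and `is_more_efficient`, the `tolerance` parameter, and the toy question-answer pairs are all assumptions made for illustration.

```python
import time


def evaluate(answer_fn, examples):
    """Return (accuracy, elapsed_seconds) for a hypothetical answering function.

    `examples` is a list of (question, expected_answer) pairs.
    """
    start = time.perf_counter()
    correct = sum(answer_fn(question) == expected for question, expected in examples)
    elapsed = time.perf_counter() - start
    return correct / len(examples), elapsed


def is_more_efficient(candidate, baseline, tolerance=0.0):
    """True if the candidate matches the baseline's accuracy in less time.

    Both arguments are (accuracy, elapsed_seconds) pairs. `tolerance` is an
    assumed knob for how much accuracy loss still counts as "the same".
    """
    cand_acc, cand_time = candidate
    base_acc, base_time = baseline
    return cand_acc >= base_acc - tolerance and cand_time < base_time


# Toy stand-in for a model: doubles its input.
toy_examples = [(1, 2), (2, 4), (3, 7)]
accuracy, seconds = evaluate(lambda q: q * 2, toy_examples)
print(f"accuracy={accuracy:.2f}, seconds={seconds:.6f}")
```

In this sketch a system counts as more efficient only when its accuracy does not drop and its wall-clock time is strictly lower, which mirrors the "same accuracy in less time" criterion in the quote.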
For her outreach component, Huang will work with K-12 students to teach how to make music with AI and how to create environmental sounds. In addition, she will have students play “story solitaire,” which gives them the opportunity to learn how an LLM-generated sequence of steps determines the final result.
“If they don’t generate the first paragraph, they can’t do the second,” she said. “LLMs use sampling so every output word has a probability distribution that determines what the next token should be. There could be many reasoning choices. If the final answer gets it correct, those reasoning choices get reward. If wrong, it gets a penalty.”
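The idea in the quote above, that each step is sampled from a probability distribution and the whole chain of choices is rewarded or penalized by the final answer, can be shown with a toy Python sketch. It is a simplified illustration, not Huang's training method: the arithmetic problem, the probabilities, and the names `sample_next`, `solve`, and `outcome_reward` are all hypothetical.

```python
import random

# Hypothetical distribution over two reasoning choices for "2 + 3 * 4".
STEP_CHOICES = {"multiply first": 0.7, "add first": 0.3}
CORRECT_ANSWER = 14


def sample_next(distribution, rng):
    """Draw one option in proportion to its probability."""
    options = list(distribution)
    weights = [distribution[option] for option in options]
    return rng.choices(options, weights=weights, k=1)[0]


def solve(choice):
    """Carry the sampled reasoning choice through to a final answer."""
    if choice == "multiply first":
        return 2 + (3 * 4)
    return (2 + 3) * 4


def outcome_reward(final_answer, correct_answer):
    """Reward the chain if the final answer is right, penalize it otherwise."""
    return 1.0 if final_answer == correct_answer else -1.0


rng = random.Random(0)  # seeded so repeated runs sample the same choices
for _ in range(5):
    choice = sample_next(STEP_CHOICES, rng)
    answer = solve(choice)
    print(choice, answer, outcome_reward(answer, CORRECT_ANSWER))
```

Because the reward depends only on the final answer, every choice along a successful chain gets credit, and every choice along a failed chain shares the penalty.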
Overall, she said, she expects the project to increase the transparency and efficiency of AI reasoning systems to solve real-world problems.
“The proposed framework could help attorneys and policymakers in parsing regulations or case law from numerous legal precedents and statutes, and it would help researchers in scientific fields integrate findings from literature or databases,” Huang said.