Looking for the best from human and machine to create new materials
Roman Garnett and a multidisciplinary team plan to develop a new framework, based on active machine learning and intelligent search, to speed the discovery of electronic materials
In 1997, IBM's Deep Blue computer defeated world chess champion Garry Kasparov. The loss shocked the chess world and led Kasparov to envision a new form of chess in which a human and a computer cooperate as a team. It is called centaur chess, after the half-man, half-horse creature of Greek mythology.
An interdisciplinary, multi-institutional team of researchers plans to adapt this centaur analogy to accelerate scientific discovery. With a two-year, $1.8 million grant from the National Science Foundation (NSF), the team plans to develop a new framework to speed the discovery of electronic materials, one that combines active machine learning and intelligent search with human-machine interaction and visualization. The grant is part of the NSF's $300 million 10 Big Ideas program and falls under the Harnessing the Data Revolution Big Idea, which focuses on the emerging field of data science.
Roman Garnett, assistant professor of computer science & engineering in the McKelvey School of Engineering at Washington University in St. Louis, brings his research and experience in active machine learning to the project with $306,000 in funding. Garnett, who has an NSF CAREER Award for his work in active machine learning, has extensive experience applying machine learning to automate discovery, particularly in active search for drug and materials discovery. In this project, his expertise will help to design a framework that most efficiently reaches its objective.
"Making new materials is really expensive," Garnett said. "You can do experiments in the lab, but they are costly and slow. Computational simulations are cheaper, but not perfect. Intuitively we want to do the computations to get an idea of the most promising possibilities, then be confident enough to spend the money to run the experiments in the lab."
Garnett describes this as multifidelity learning, which simultaneously reasons about expensive, high-fidelity, in-lab experiments as well as cheaper, lower-fidelity computations. Multifidelity learning enables cost-effective decision-making by carefully modeling the tradeoff between the cost of collecting data and the information it provides.
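The cost-versus-information tradeoff can be illustrated with a minimal sketch. This is not the team's framework: it assumes each candidate material has a simple Gaussian belief about its quality, that each data source adds Gaussian noise, and that the next query is whichever candidate-source pair offers the most information per unit cost. The names `Source`, `information_gain`, `posterior_var` and `pick_query`, along with all of the numbers, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A way of collecting data (hypothetical illustration)."""
    name: str
    cost: float       # cost of one query, in arbitrary units
    noise_var: float  # observation noise variance; lower means higher fidelity


def information_gain(prior_var: float, noise_var: float) -> float:
    """Mutual information (in nats) between a Gaussian quantity with
    variance prior_var and one noisy observation of it."""
    return 0.5 * math.log(1.0 + prior_var / noise_var)


def posterior_var(prior_var: float, noise_var: float) -> float:
    """Variance of the Gaussian belief after one noisy observation."""
    return 1.0 / (1.0 / prior_var + 1.0 / noise_var)


def pick_query(prior_vars: dict, sources: list) -> tuple:
    """Return the (candidate, source name) pair with the highest
    information gain per unit cost. Assumed greedy rule, for illustration."""
    best_score, best_pair = -math.inf, None
    for candidate, prior_var in prior_vars.items():
        for source in sources:
            score = information_gain(prior_var, source.noise_var) / source.cost
            if score > best_score:
                best_score, best_pair = score, (candidate, source.name)
    return best_pair


# Made-up uncertainties for two candidate materials.
beliefs = {"A": 1.0, "B": 0.25}
simulation = Source("simulation", cost=1.0, noise_var=1.0)
lab = Source("lab", cost=20.0, noise_var=0.01)

candidate, source_name = pick_query(beliefs, [simulation, lab])
print(candidate, source_name)
```

With these made-up numbers, the cheap simulation wins for the most uncertain candidate, because the lab experiment's extra precision does not justify a cost 20 times higher. If the lab cost were much lower, the same rule would choose the experiment instead.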
Collaborating with Garnett on the project are Remco Chang, associate professor of computer science, Tufts University; Jane Greenberg, the Alice B. Kroeger Professor at Drexel University; Steven Lopez, assistant professor of chemistry and chemical biology, Northeastern University; and Eric Toberer, associate professor of physics, Colorado School of Mines.
In addition to the computational and experimental work, the team plans to bring in a human component.
"In a lot of these active learning pipelines, everything is automated and we try to get the human out of the loop," Garnett said. "But in these scientific applications, we want to make sure the human is involved and that the computer is aiding the human by being better at planning. Through the visualization, we want to give them insight into the system that they couldn't see before. The human and computer thus cooperate as a team and learn from each other."
Their plan would also allow a user to provide feedback on proposed experiments.
"The user will bring in other knowledge that a machine learning algorithm would not have, such as, 'I have a PhD in chemistry, and I know that those molecules would make my lab explode if I tried to combine them,'" Garnett said. "This will serve as another type of fidelity — the human provides another source of information we can seamlessly incorporate into our model. Now the computer can benefit from the human's expertise while the human benefits from the algorithm's ability to intelligently search complex spaces."
Bringing the five collaborators together to develop the machine learning framework will be a unique experience, Garnett said. They met last spring at a competitive ideas lab held to develop ideas for the Harnessing the Data Revolution Big Idea.
"Our team includes experts from a wide range of fields," Garnett said. "Just as we envision the human-computer team cooperating to discover new materials, we will cooperate with each other to discover new methodologies enabling that team to succeed."