AI scientist claimed to do six months of research in just a few hours

Artificial intelligence can process large amounts of data, but can it do science?

tonioyumui/Getty

An AI scientist can work independently for hours while doing research that would take humans months to complete, and has made several “novel contributions” to science, its creators claim – but others are more doubtful.

The system, called Kosmos, is actually a collection of AI agents that are specialised in analysing data and searching through the existing scientific literature, in an effort to make new scientific breakthroughs.

“We’ve been working on building an AI scientist for about two years now,” says Sam Rodriques at Edison Scientific, the US-based firm behind Kosmos. “And the limitation with AI scientists that have been released to date is always in the complexity of the ideas that they can come up with.”

Kosmos aims to fix that. During a typical run, which can last up to 12 hours, a user inputs a scientific dataset and Kosmos searches for and analyses around 1500 relevant academic papers, while also writing and executing 42,000 lines of code to interrogate the data. At the end of a run, the AI produces a summary of findings, plus citations or data, and creates a plan for further analysis that can be used as the input for another cycle.

After a set number of cycles, the system outputs reports, backed with relevant citations, that make scientific conclusions, similar to an academic paper. An evaluation by a group of academics found that 20 cycles of this would be equivalent to around six months of their own research time.

The system’s conclusions seem broadly accurate, says Rodriques. Edison asked people with at least a PhD-level understanding of biology to evaluate 102 statements made by Kosmos. The team found that 79.4 per cent of them were supported overall, including 85.5 per cent of claims related to data analysis and 82.1 per cent of the statements it says are in the existing literature. Kosmos is weaker at drawing all of this together to make new claims of scientific breakthroughs, however: here, it is accurate only 57.9 per cent of the time.

Edison claims that Kosmos has made seven scientific discoveries that have all been externally validated and replicated by independent experts in the field using external datasets or different methods. Four of the discoveries were truly novel, the team behind Kosmos say, with the remaining three already existing – albeit in preprint or unpublished papers.

One of the claimed discoveries is a new method to pinpoint when cellular pathways fail as Alzheimer’s disease progresses. Another is evidence that people with more of a natural antioxidant enzyme called superoxide dismutase 2 (SOD2) in their blood seem to have less heart scarring.

But others working in this field have mixed responses to these claims. The SOD2 “discovery” is nothing of the sort, says Fergus Hamilton at the University of Bristol, UK. “That particular causal claim probably doesn’t stand up to scrutiny as a novel finding, and there are methodological flaws in the way the analysis [was] performed,” he says. Rodriques acknowledges that the SOD2 link had previously been found in mice, but says a subject matter expert working with Edison suggests it is the first time it has been seen at a population level in humans using genomics.

Hamilton also says the data analysis code that the agent tried to run didn’t work properly, so Kosmos ignored what would be important data – but still came to the same conclusion as pre-existing work.

“It made a number of assumptions that would be really critical to get right in an actual bit of analysis,” he says. “The software packages completely fail, and then it just ignores them.” In addition, he suggests that the data has been so pre-processed in this instance that Kosmos “actually has completed probably 10 per cent of the task”.

Hamilton does credit the team behind Kosmos for engaging with his queries and concerns on social media. “This is a really good advance in principle, but perhaps the particular technical critique of this work is [that] the work is not up to scratch,” he says.

“I’m very open to the idea that some of the findings that we presented could be wrong or flawed, and this is just part of science,” says Rodriques. “The fact that it’s eliciting such kind of sophisticated criticism, though, I think, speaks to the power of the system.”

Others are impressed by the general performance of Kosmos. “It demonstrates the great potential for AI to support scientific discovery, but I would remain careful about the autonomous use of an AI scientist,” says Ben Glocker at Imperial College London. “The work shows some great examples of success, but we have little insights about its failure modes.”

“I believe we should embrace tools like Kosmos and develop others like it, but we should not overlook that there is more to science than this data-driven method,” says Noah Giansiracusa at Bentley University in Massachusetts. “There is also deep thinking, deep creativity, and it would be folly to turn away from that just because the science we can automate is more amenable to AI.”

Rodriques himself admits that Kosmos should be used as a collaborator, not a replacement for scientists. “It can do a lot of very, very impressive things,” he says. “You still need to go through and read and validate. And it’s not going to be right 100 per cent of the time.”
