
CT scan of a human lung tumor. Researchers are testing artificial intelligence algorithms that can spot early signs of disease. Credit: K.H. Fung/SPL

From biomedicine to political science, scientists are increasingly using machine learning as a tool to make predictions based on patterns in their data. But according to a pair of researchers at Princeton University in New Jersey, the claims of many such studies are likely exaggerated. They want to sound the alarm about what they call a “brewing reproducibility crisis” in the machine learning-based sciences.

Machine learning is sold as a tool that researchers can learn in a few hours and use on their own—and many follow that advice, says Princeton machine-learning researcher Sayash Kapoor. “But you wouldn’t expect a chemist to learn how to run a lab using an online course,” he says. And few researchers realize that the problems they face when applying artificial intelligence (AI) algorithms are common to other fields, says Kapoor, who co-authored a preprint on the “crisis”1. Peer reviewers don’t have time to check these models, so academia currently lacks mechanisms to weed out non-reproducible papers, he says. Kapoor and his co-author Arvind Narayanan created guidelines for researchers to avoid such pitfalls, including an explicit checklist to be submitted with each paper.

What is reproducibility?

Kapoor and Narayanan’s definition of reproducibility is broad. It says that other teams should be able to replicate a model’s results, given full details of the data, code and conditions — often called computational reproducibility, something that is already a concern for machine-learning scientists. The pair also define a model as non-reproducible when researchers make mistakes in data analysis, meaning that the model is not as predictive as claimed.

Estimating such errors is subjective and often requires deep knowledge of the domain in which machine learning is being applied. Some scientists whose work has been criticized by the team disagree that their papers are flawed, or say that Kapoor’s claims are too strong. In the social sciences, for example, researchers have developed machine-learning models intended to predict when a country is likely to slide into civil war. Kapoor and Narayanan argue that, once errors are corrected, these models perform no better than standard statistical techniques. But political scientist David Muchlinski of the Georgia Institute of Technology in Atlanta, whose paper2 was examined by the pair, says the field of conflict prediction has been unfairly maligned and that follow-up research supports his work.

Still, the team’s rallying cry has caught on. More than 1,200 people have registered for a July 28 online reproducibility workshop hosted by Kapoor and colleagues to brainstorm and disseminate solutions. “If we don’t do something like this, every field will find these problems over and over again,” he says.

Overoptimism about the capabilities of machine-learning models could prove counterproductive when the algorithms are applied to areas such as health and justice, says Momin Malik, a data scientist at the Mayo Clinic in Rochester, Minnesota, who is scheduled to speak at the workshop. If the crisis isn’t addressed, machine learning’s reputation could suffer, he says. “I’m somewhat surprised that there hasn’t already been a crash in the legitimacy of machine learning. But I think it might come soon.”


Machine-learning troubles

Kapoor and Narayanan say there are similar pitfalls in the application of machine learning across several sciences. The pair analyzed 20 reviews in 17 research fields and counted 329 studies whose results could not be fully replicated because of problems with how machine learning was applied1.

Narayanan himself is not immune: a 2015 paper on computer security he co-authored3 is among the 329. “It’s really an issue that this whole community needs to work on together,” Kapoor says.

He adds that the failures are not the fault of any individual scientist. Instead, a combination of hype around AI and insufficient checks and balances is to blame. The most prominent problem that Kapoor and Narayanan highlight is “data leakage”, when information from the data set a model learns from includes data that it is later evaluated on. If the two are not completely separate, the model has effectively already seen the answers, and its predictions seem much better than they really are. The team has identified eight main types of data leakage that researchers can be on the lookout for.
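Leakage like this often creeps in through preprocessing. Below is a minimal, hypothetical sketch (synthetic data and scikit-learn-style tooling, not taken from any of the studies discussed here) in which feature selection is fitted on the full data set before splitting, so information about the test labels leaks into the model and its accuracy on pure noise looks better than chance:

```python
# Hypothetical sketch of leakage via preprocessing: feature selection is run on
# the full data set before the train/test split, so the test labels influence
# which features the model gets to use. The data are pure noise.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2000))     # 2,000 noise features, no real signal
y = rng.integers(0, 2, size=200)

# Leaky version: pick the 20 features most correlated with y using ALL rows,
# then split. Test accuracy ends up noticeably better than chance.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_leaky, y, random_state=0)
print("leaky:", LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te))

# Safer version: split first and keep the selection inside a pipeline, so it
# is fitted on the training rows only. Accuracy falls back to roughly 0.5.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
safe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression())
print("safe:", safe.fit(X_tr, y_tr).score(X_te, y_te))
```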

Certain data leaks are subtle. For example, temporal leakage is when the training data contains points later than the test data – which is a problem because the future depends on the past. As an example, Malik cites a 2011 paper4 that claimed that a model analyzing the sentiment of Twitter users could predict the closing value of the stock market with 87.6% accuracy. But because the team had tested the model’s predictive power using data from an earlier period than any of its training sets, the algorithm was actually allowed to see the future, he says.
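As a concrete, hypothetical illustration (the timestamps and features below are made up), a shuffled split lets training rows come after test rows, whereas a chronological splitter guarantees that every training point precedes every test point:

```python
# Hypothetical sketch of temporal leakage: a shuffled split mixes future and
# past rows, while a chronological split keeps training strictly earlier.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit, train_test_split

timestamps = np.arange(1000)                 # stand-in for dates, oldest first
features = np.random.default_rng(0).normal(size=(1000, 5))

# Shuffled split: some training rows are newer than some test rows -> leakage.
idx_train, idx_test = train_test_split(np.arange(1000), shuffle=True, random_state=0)
print("training extends past the test period:",
      timestamps[idx_train].max() > timestamps[idx_test].min())   # True

# Chronological splits: each training fold ends before its test fold begins.
for idx_train, idx_test in TimeSeriesSplit(n_splits=3).split(features):
    assert timestamps[idx_train].max() < timestamps[idx_test].min()
```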

Broader problems include training models on datasets that are narrower than the population they are ultimately meant to represent, Malik says. For example, an AI that detects pneumonia in chest X-rays but was trained only on older people may be less accurate when applied to younger people. Another problem is that algorithms often rely on shortcuts that don’t always hold, says workshop speaker Jessica Hullman, a computer scientist at Northwestern University in Evanston, Illinois. For example, a computer-vision algorithm can learn to recognize a cow from the grassy background of most cow images, so it fails when it encounters an image of the animal on a mountain or a beach.
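A toy, fully synthetic version of the cow-and-grass shortcut (no real images involved) makes the failure mode easy to see: a classifier that leans on a spurious “background” feature looks accurate on data resembling its training set, then collapses once that correlation is broken:

```python
# Hypothetical sketch of shortcut learning: in training, a spurious
# "background" feature tracks the label (grass behind the cow); in a shifted
# setting that association is reversed and accuracy collapses.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, background_matches_label):
    y = rng.integers(0, 2, size=n)
    real_signal = y + rng.normal(scale=2.0, size=n)         # weak genuine feature
    background = y if background_matches_label else 1 - y   # spurious feature
    X = np.column_stack([real_signal, background + rng.normal(scale=0.1, size=n)])
    return X, y

model = LogisticRegression().fit(*make_data(2000, background_matches_label=True))
print("familiar backgrounds:", model.score(*make_data(2000, True)))    # high
print("cow on the beach:   ", model.score(*make_data(2000, False)))    # poor
```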

She says that the high accuracy of test predictions often leads people to think that the models understand “the real structure of the problem” in a human way. The situation is similar to the replication crisis in psychology, where people relied too heavily on statistical methods, she adds.

Kapoor says that the hype over the possibilities of machine learning has made it too easy for researchers to accept their results. The word “prediction” itself is problematic, Malik says, because most predictions are actually tested retrospectively and have nothing to do with predicting the future.


Fixing data leakage

Kapoor and Narayanan’s solution to data leakage is for researchers to include evidence in their manuscripts that their models don’t suffer from each of the eight types of leakage. The authors recommend a template for such documentation, which they call model info sheets.
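The preprint spells out exactly what these sheets should contain. Purely as an illustration of the idea (the field names below are hypothetical, not the authors’ template), a project might record something like:

```python
# Illustrative only: these fields are NOT the authors' model info sheet
# template. The point is to document enough detail, alongside a paper, to
# judge whether each kind of leakage has been ruled out.
model_info = {
    "task": "binary classification task described in the accompanying paper",
    "train_test_separation": "split performed before any preprocessing",
    "preprocessing_fitted_on": "training folds only",
    "temporal_ordering": "all test samples dated later than all training samples",
    "duplicates_across_splits": "checked; none found",
    "features_unavailable_at_prediction_time": "none used",
    "code_and_data_availability": "archived alongside the paper",
}
```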

In the past three years, biomedicine has come a long way with a similar approach, says Xiao Liu, a clinical ophthalmologist at the University of Birmingham, UK, who has helped to draft reporting guidelines for AI-related research, such as in screening or diagnosis. In 2019, Liu and her colleagues found that only 5% of more than 20,000 papers using AI for medical imaging were described in enough detail to determine whether they would work in a clinical setting5. The guidelines won’t directly improve anyone’s models, but they “really make it clear who are the people who have done it well and maybe the people who haven’t done it well”, she says, which is a resource that regulators can tap into.

Collaboration can also help, says Malik. He recommends involving both specialists in the relevant discipline and researchers in machine learning, statistics and survey sampling.

Fields in which machine learning is used to find leads for follow-up, such as drug discovery, are likely to benefit greatly from the technology, says Kapoor. But other areas will need more work to show that it is beneficial, he adds. Although machine learning is still relatively new to many fields, researchers need to avoid the kind of crisis of confidence that followed psychology’s replication crisis a decade ago, he says. “The longer we delay this, the bigger the problem will be.”


References

1. Kapoor, S. & Narayanan, A. Preprint at https://arxiv.org/abs/2207.07048 (2022).

2. Muchlinski, D., Siroky, D., He, J. & Kocher, M. Political Anal. 24, 87–103 (2016).

3. Caliskan-Islam, A. et al. Proc. 24th USENIX Security Symposium 255–270 (USENIX Association, 2015).

4. Bollen, J., Mao, H. & Zeng, X. J. Comp. Sci. 2, 1–8 (2011).
