3 Questions: Should we label AI systems like we do prescription drugs?

AI systems are increasingly being deployed in safety-critical health care situations. Yet these models sometimes hallucinate incorrect information, make biased predictions, or fail for unexpected reasons, which could have serious consequences for patients and clinicians.

In a commentary article published today in Nature Computational Science, MIT Associate Professor Marzyeh Ghassemi and Boston University Associate Professor Elaine Nsoesie argue that, to mitigate these potential harms, AI systems should be accompanied by responsible-use labels, similar to U.S. Food and Drug Administration-mandated labels placed on prescription medications.

MIT News spoke with Ghassemi about the need for such labels, the information they should convey, and how labeling procedures could be implemented.

Q: Why do we need responsible use labels for AI systems in health care settings?

A: In a health setting, we have an interesting situation where doctors often rely on technology or treatments that are not fully understood. Sometimes this lack of understanding is fundamental (the mechanism behind acetaminophen, for instance), but other times it is just a limit of specialization. We don’t expect clinicians to know how to service an MRI machine, for instance. Instead, we have certification systems, through the FDA or other federal agencies, that certify the use of a medical device or drug in a specific setting.

Importantly, medical devices also have service contracts — a technician from the manufacturer will fix your MRI machine if it is miscalibrated. For approved drugs, there are postmarket surveillance and reporting systems so that adverse effects or events can be addressed, for instance if a lot of people taking a drug seem to be developing a condition or allergy.

Models and algorithms, whether they incorporate AI or not, skirt a lot of these approval and long-term monitoring processes, and that is something we need to be wary of. Many prior studies have shown that predictive models need more careful evaluation and monitoring. With more recent generative AI specifically, we cite work that has demonstrated generation is not guaranteed to be appropriate, robust, or unbiased. Because we don’t have the same level of surveillance on model predictions or generation, it would be even more difficult to catch a model’s problematic responses. The generative models being used by hospitals right now could be biased. Having use labels is one way of ensuring that models don’t automate biases that are learned from human practitioners or miscalibrated clinical decision support scores of the past.      

Q: Your article describes several components of a responsible use label for AI, following the FDA approach for creating prescription labels, including approved usage, ingredients, potential side effects, etc. What core information should these labels convey?

A: The things a label should make obvious are the time, place, and manner of a model’s intended use. For time, the user should know that a model was trained at a specific time with data from a specific time point. For instance, do the training data cover the Covid-19 pandemic? There were very different health practices during Covid that could impact the data. This is why we advocate for the model “ingredients” and “completed studies” to be disclosed.

For place, we know from prior research that models trained in one location tend to have worse performance when moved to another location. Knowing where the data were from and how a model was optimized within that population can help to ensure that users are aware of “potential side effects,” any “warnings and precautions,” and “adverse reactions.”

With a model trained to predict one outcome, knowing the time and place of training could help you make intelligent judgements about deployment. But many generative models are incredibly flexible and can be used for many tasks. Here, time and place may not be as informative, and more explicit direction about “conditions of labeling” and “approved usage” versus “unapproved usage” comes into play. If a developer has evaluated a generative model for reading a patient’s clinical notes and generating prospective billing codes, they can disclose that it has a bias toward overbilling for specific conditions or underrecognizing others. A user wouldn’t want to use this same generative model to decide who gets a referral to a specialist, even though they could. This flexibility is why we advocate for additional details on the manner in which models should be used.

In general, we advocate that you should train the best model you can, using the tools available to you. But even then, there should be a lot of disclosure. No model is going to be perfect. As a society, we now understand that no pill is perfect; there is always some risk. We should have the same understanding of AI models. Any model, with or without AI, is limited. It may be giving you realistic, well-trained forecasts of potential futures, but take that with whatever grain of salt is appropriate.

Q: If AI labels were to be implemented, who would do the labeling and how would labels be regulated and enforced?

A: If you don’t intend for your model to be used in practice, then the disclosures you would make for a high-quality research publication are sufficient. But once you intend your model to be deployed in a human-facing setting, developers and deployers should do an initial labeling, based on some of the established frameworks. There should be a validation of these claims prior to deployment; in a safety-critical setting like health care, many agencies of the Department of Health and Human Services could be involved.

For model developers, I think that knowing you will need to label the limitations of a system induces more careful consideration of the process itself. If I know that at some point I am going to have to disclose the population upon which a model was trained, I would not want to disclose that it was trained only on dialogue from male chatbot users, for instance.

Thinking about things like who the data are collected on, over what time period, what the sample size was, and how you decided what data to include or exclude, can open your mind up to potential problems at deployment. 

AI-powered underwater vehicle transforms offshore wind inspections

Beam has deployed the world’s first AI-driven autonomous underwater vehicle for offshore wind farm inspections. The technology has already proved its mettle by inspecting jacket structures at Scotland’s largest offshore wind farm, Seagreen—a joint venture between SSE Renewables, TotalEnergies, and PTTEP. The AI-powered vehicle represents a…

Shaping AI-optimised networks and enhancing security

As AI applications evolve, they place greater demands on network infrastructure, particularly in terms of latency and connectivity. Supporting large-scale AI deployments introduces new issues, and analysts predict that AI-related traffic will soon account for a major portion of total network traffic. The industry must be…

Duolingo Review: Can You Reach 100% Fluency? My Experience

Learning a new language can easily be overwhelming. Between memorizing vocabulary, grasping tricky grammar rules, and practicing pronunciation, it’s no wonder many give up before getting started. Duolingo, however, offers a refreshing alternative! Unlike traditional methods that rely on textbooks and rigid classroom settings, Duolingo is…

Playing a new tune

For generations, Andrew Sutherland’s family had the same calling: bagpipes. Sutherland grew up in Halifax, Nova Scotia, in a family with Scottish roots; his father, grandfather, and great-grandfather all played the bagpipes competitively, criss-crossing North America. Sutherland’s aunts and uncles were pipers too.

But Sutherland did not take to the instrument. He liked math, went to college, entered a PhD program, and emerged as a professor at the MIT Sloan School of Management. Sutherland is an enterprising scholar whose work delves into issues around the financing and auditing of private firms, the effects of financial technology, and even detecting business fraud.

“I was actually the first male in my family to not play the bagpipes, and the first to go to university,” Sutherland explains. “The joke is that I’m the shame of the family, since I never picked up the pipes and continued the tradition.”

The family bagpiping loss is MIT’s gain. While Sutherland’s area of specialty is nominally accounting, his work has illuminated business practices more broadly.

“A lot of what we know about the financial system and how companies perform, and about financial statements, comes from big public companies,” Sutherland says. “But we have a lot of entrepreneurs come through Sloan looking to found startups, and in the U.S., private firms generate more than half of employment and investment. Until recently, we haven’t known a lot about how they get capital, how they make decisions.”

For his research and teaching, Sutherland was awarded tenure at MIT last year.

Piper at the gates of college

Sutherland is proud of his family history; his grandfather and great-grandfather have taught generations of bagpipe players in Nova Scotia, with many of their students becoming successful pipers around the world. But Sutherland took to math and business studies, receiving his undergraduate degree in commerce, with honors in accounting, from York University in Toronto. Then he received an MBA from Carnegie Mellon University, with concentrations in finance and quantitative analysis.

Sutherland still wanted to research financial markets, though. How did banks evaluate the private businesses they were lending to? How much were those firms disclosing to investors? How much just comes down to trust? He entered the PhD program at the University of Chicago’s Booth School of Business and found scholars encouraging him to pursue those questions.

That included Sutherland’s advisor, Christian Leuz; the long-time Chicago professor Douglas Diamond, now a Nobel Prize winner, whom Sutherland calls “one of the most generous researchers I’ve met” in academia; and a then-assistant professor, Michael Minnis, who shared Sutherland’s interest in studying private firms and entrepreneurs.

Sutherland earned his PhD from Chicago in 2015, with a dissertation about the changing nature of banker-to-business relationships, published in 2018. That research studied the effects of transparency-improving technologies on how small businesses obtained credit.

“Twenty years ago, banking was very relationship-based,” Sutherland says. “You might play golf with your loan officer once a year and they knew your business and maybe your employees, and they would sponsor the local softball team. Whereas now banking has been really influenced by technology. A lot of companies provide credit through online applications, and the days when you had to supply audited financial statements have gone away.” As a result of the expansion in technology-based lending, credit markets have shifted from a relationship basis to a transactional focus.

Sutherland, who is currently an associate professor at MIT, joined the faculty in 2015 and has remained at the Institute ever since. A fan of modern art, he keeps an Andy Warhol print in his MIT Sloan office, on loan through MIT’s art-lending program, as well as reproductions of some of Harold “Doc” Edgerton’s famous high-speed photographs.

Sutherland has since written five papers with Minnis (now a deputy dean at Chicago Booth) and other co-authors. Many of their findings highlight the variation in lending and contracting practices in the small business sector. In a 2017 study, they found that banks collected fewer verified financial statements from construction companies during the pre-2008 housing bubble than afterward; before 2008, lending had become lax, similar to what happened in the mortgage markets, and this contributed to the crisis. In another study from that year, they showed how banks with extensive industry and geographic expertise rely more on soft than hard information in lending.

“We’re trying to understand the ‘Wild West’ in accounting and finance more broadly,” Sutherland says. “For firms like entrepreneurs and privately held companies, largely unfettered by regulation, what choices do they make, and why? And how can we use economic theory to understand these choices?”

Business, trust, and fraud

Indeed, Sutherland has often homed in on issues around trust, rules, and financial misconduct, something students care about greatly.

“Students are always interested in talking about fraud,” Sutherland says. “Our financial system is based on trust. So many of us invest on an entirely anonymous basis — we don’t personally know our fund manager or closely watch what they do with our money.” And while regulations and a functioning justice system protect against problems, Sutherland notes, finance works partly because “people have some trust in the financial system. But that’s a fragile thing. Once people are swindled, they just keep their money in the bank or under the mattress. Often we’ll have students from countries with weak institutions or corruption, and they’ll say, ‘You would never do the things you can do in the U.S., in terms of investing your money.’ Without trust, it becomes harder for entrepreneurs to raise capital and undermines the whole vibrant economic system we have.”

Some measures can make a big difference. In a 2020 paper published in the Journal of Financial Economics, Sutherland and two co-authors found that a 2010 change to the investment adviser qualification exam, which reduced its focus on ethics, had significant effects: People who passed the exam when it featured more rules and ethics material are one-fourth less likely to commit misconduct. They are also more likely to depart employers during or even before scandals.

“It does seem to matter,” Sutherland says. “The person who has had less ethics training is more likely to get in trouble with the industry. You can predict future fraud in a firm by who is quitting. Those with more ethics training are more likely to leave before a scandal breaks.”

In the classroom

Sutherland also believes his interests are well-suited to the MIT Sloan School of Management, since many students are looking to found startups.

“One thing that really stands out about Sloan is that we attract a lot of entrepreneurs,” Sutherland says. “They’re curious about all this stuff: How do I get financing? Should I go to a bank? Should I raise equity? How do I compare myself to competitors? It’s striking to me that if that person wanted to work for a big public firm, I could hand them a textbook that answers many of these questions. But when it comes to private firms, a lot of that is unknown. And it motivates me to find answers.”

And while Sutherland is a prolific researcher, he views classroom time as being just as important. 

“What I hope with every project I work on is that I could take the findings to the classroom, and the students would find it relevant and interesting,” Sutherland says.

As much as Sutherland made a big departure from the family business, he still gets to teach, and in a sense perform for an audience. Ask Sutherland about his students, and he sounds an emphatically upbeat note.

“One of the best things about teaching at MIT,” Sutherland says, “is that the students are smart enough that you can explain how you did the study, and someone will put up a hand and say: ‘What about this, or that?’ You can bring research findings to the classroom and they absorb them and challenge you on them. It’s the best place in the world to teach, because the students are just so curious and so smart.”

MIT named No. 2 university by U.S. News for 2024-25

MIT has placed second in U.S. News and World Report’s annual rankings of the nation’s best colleges and universities, announced today. 

As in past years, MIT’s engineering program continues to lead the list of undergraduate engineering programs at a doctoral institution. The Institute also placed first in six out of nine engineering disciplines.

U.S. News placed MIT second in its evaluation of undergraduate computer science programs, along with Carnegie Mellon University and the University of California at Berkeley. The Institute placed first in four out of 10 computer science disciplines.

MIT remains the No. 2 undergraduate business program, a ranking it shares with UC Berkeley. Among business subfields, MIT is ranked first in three out of 10 specialties.

Within the magazine’s rankings of “academic programs to look for,” MIT topped the list in the category of undergraduate research and creative projects. The Institute also ranks as the third most innovative national university and the third best value, according to the U.S. News peer assessment survey of top academics.

MIT placed first in six engineering specialties: aerospace/aeronautical/astronautical engineering; chemical engineering; computer engineering; electrical/electronic/communication engineering; materials engineering; and mechanical engineering. It placed within the top five in two other engineering areas: biomedical engineering and civil engineering.

Other schools in the top five overall for undergraduate engineering programs are Stanford University, UC Berkeley, Georgia Tech, Caltech, the University of Illinois at Urbana-Champaign, and the University of Michigan at Ann Arbor.

In computer science, MIT placed first in four specialties: biocomputing/bioinformatics/biotechnology; computer systems; programming languages; and theory. It placed in the top five of five other disciplines: artificial intelligence; cybersecurity; data analytics/science; mobile/web applications; and software engineering.

The No. 1-ranked undergraduate computer science program overall is at Stanford. Other schools in the top five overall for undergraduate computer science programs are Carnegie Mellon, UC Berkeley, Princeton University, and the University of Illinois at Urbana-Champaign.

Among undergraduate business specialties, the MIT Sloan School of Management leads in analytics; production/operations management; and quantitative analysis. It also placed within the top five in three other categories: entrepreneurship; management information systems; and supply chain management/logistics.

The No. 1-ranked undergraduate business program overall is at the University of Pennsylvania; other schools ranking in the top five include UC Berkeley, the University of Michigan at Ann Arbor, and New York University.

Accelerating particle size distribution estimation

The pharmaceutical manufacturing industry has long struggled with the issue of monitoring the characteristics of a drying mixture, a critical step in producing medication and chemical compounds. At present, there are two noninvasive characterization approaches that are typically used: Either a sample is imaged and individual particles are counted, or researchers use scattered light to estimate the particle size distribution (PSD). The former is time-intensive and leads to increased waste, making the latter a more attractive option.

In recent years, MIT engineers and researchers developed a physics and machine learning-based scattered light approach that has been shown to improve manufacturing processes for pharmaceutical pills and powders, increasing efficiency and accuracy and resulting in fewer failed batches of products. A new open-access paper, “Non-invasive estimation of the powder size distribution from a single speckle image,” available in the journal Light: Science & Applications, expands on this work, introducing an even faster approach.

“Understanding the behavior of scattered light is one of the most important topics in optics,” says Qihang Zhang PhD ’23, an associate researcher at Tsinghua University. “By making progress in analyzing scattered light, we also invented a useful tool for the pharmaceutical industry. Locating the pain point and solving it by investigating the fundamental rule is the most exciting thing to the research team.”

The paper proposes a new PSD estimation method, based on pupil engineering, that reduces the number of frames needed for analysis. “Our learning-based model can estimate the powder size distribution from a single snapshot speckle image, consequently reducing the reconstruction time from 15 seconds to a mere 0.25 seconds,” the researchers explain.

“Our main contribution in this work is accelerating a particle size detection method by 60 times, with a collective optimization of both algorithm and hardware,” says Zhang. “This high-speed probe is capable of detecting the size evolution in fast dynamical systems, providing a platform to study models of processes in the pharmaceutical industry including drying, mixing and blending.”
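
The 60-fold figure is consistent with the two reconstruction times quoted by the researchers. As a quick check, here is a minimal Python sketch; the function and variable names are illustrative assumptions and do not come from the paper.

```python
def speedup_factor(old_time_s: float, new_time_s: float) -> float:
    """Return how many times faster the new method is than the old one."""
    if new_time_s <= 0:
        raise ValueError("new_time_s must be positive")
    return old_time_s / new_time_s


# Reconstruction times quoted in the article, in seconds.
previous_reconstruction_s = 15.0
single_snapshot_reconstruction_s = 0.25

speedup = speedup_factor(previous_reconstruction_s, single_snapshot_reconstruction_s)
print(f"Speedup: {speedup:.0f}x")  # prints "Speedup: 60x"
```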

The technique offers a low-cost, noninvasive particle size probe that collects back-scattered light from powder surfaces. The compact and portable prototype is compatible with most drying systems on the market, as long as there is an observation window. This online measurement approach may help control manufacturing processes, improving efficiency and product quality. Further, the previous lack of online monitoring prevented systematic study of dynamical models in manufacturing processes. This probe could provide a new platform for research on, and modeling of, particle size evolution.

This work, a successful collaboration between physicists and engineers, grew out of the MIT-Takeda program. Collaborators are affiliated with three MIT departments: Mechanical Engineering, Chemical Engineering, and Electrical Engineering and Computer Science. George Barbastathis, professor of mechanical engineering at MIT, is the article’s senior author.

The selectmenu Element is No More…Long Live select!

I was looking over an older article Patrick Brosset penned for us introducing <selectmenu>, a new proposal at the time for a more style-able cousin to <select>. From there, I clicked the linked-up <selectmenu> explainer and got… this:…

The Rising Danger of Ransomware and How to Recover From an Attack

When an organization begins to expand, they’ll likely be faced with a number of operational challenges they need to address. While all businesses have unique roadblocks they’ll need to navigate around, one of the most common issues that all organizations are dealing with today are cyber…