Stability AI previews Stable Diffusion 3 text-to-image model

London-based AI lab Stability AI has announced an early preview of its new text-to-image model, Stable Diffusion 3. The advanced generative AI model aims to create high-quality images from text prompts with improved performance across several key areas. The announcement comes just days after Stability AI’s…

New model identifies drugs that shouldn’t be taken together

Any drug that is taken orally must pass through the lining of the digestive tract. Transporter proteins found on cells that line the GI tract help with this process, but for many drugs, it’s unknown which of those transporters they use to exit the digestive tract.

Identifying the transporters used by specific drugs could help to improve patient treatment because if two drugs rely on the same transporter, they can interfere with each other and should not be prescribed together.

Researchers at MIT, Brigham and Women’s Hospital, and Duke University have now developed a multipronged strategy to identify the transporters used by different drugs. Their approach, which makes use of both tissue models and machine-learning algorithms, has already revealed that a commonly prescribed antibiotic and a blood thinner can interfere with each other.

“One of the challenges in modeling absorption is that drugs are subject to different transporters. This study is all about how we can model those interactions, which could help us make drugs safer and more efficacious, and predict potential toxicities that may have been difficult to predict until now,” says Giovanni Traverso, an associate professor of mechanical engineering at MIT, a gastroenterologist at Brigham and Women’s Hospital, and the senior author of the study.

Learning more about which transporters help drugs pass through the digestive tract could also help drug developers improve the absorbability of new drugs by adding excipients that enhance their interactions with transporters.

Former MIT postdocs Yunhua Shi and Daniel Reker are the lead authors of the study, which appears today in Nature Biomedical Engineering.

Drug transport

Previous studies have identified several transporters in the GI tract that help drugs pass through the intestinal lining. Three of the most commonly used, which were the focus of the new study, are BCRP, MRP2, and PgP.

For this study, Traverso and his colleagues adapted a tissue model they had developed in 2020 to measure a given drug’s absorbability. This experimental setup, based on pig intestinal tissue grown in the laboratory, can be used to systematically expose tissue to different drug formulations and measure how well they are absorbed.

To study the role of individual transporters within the tissue, the researchers used short strands of RNA called siRNA to knock down the expression of each transporter. In each section of tissue, they knocked down different combinations of transporters, which enabled them to study how each transporter interacts with many different drugs.

“There are a few roads that drugs can take through tissue, but you don’t know which road. We can close the roads separately to figure out, if we close this road, does the drug still go through? If the answer is yes, then it’s not using that road,” Traverso says.
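The "close one road at a time" reasoning in the quote above can be sketched in a few lines of Python. This is only an illustration, not the study's analysis pipeline: the function name, the absorption numbers, and the 25 percent threshold are all assumptions made for the example, and it simplifies the experiment to single knockdowns rather than the combinations the researchers used.

```python
# Hypothetical sketch: names, readings, and the threshold are illustrative assumptions.
TRANSPORTERS = ("BCRP", "MRP2", "PgP")


def infer_transporters(baseline, knockdown, min_change=0.25):
    """Return transporters whose knockdown shifts absorption by at least
    min_change, measured relative to the untreated baseline."""
    used = []
    for name in TRANSPORTERS:
        relative_change = abs(knockdown[name] - baseline) / baseline
        if relative_change >= min_change:
            # Closing this "road" changed how much drug got through,
            # so the drug appears to depend on this transporter.
            used.append(name)
    return used


# Made-up absorption readings (arbitrary units) for one hypothetical drug.
baseline = 100.0
knockdown = {"BCRP": 98.0, "MRP2": 60.0, "PgP": 101.0}
print(infer_transporters(baseline, knockdown))  # ['MRP2']
```

In this toy case only the MRP2 knockdown changes absorption noticeably, so MRP2 is the only transporter flagged.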

The researchers tested 23 commonly used drugs using this system, allowing them to identify transporters used by each of those drugs. Then, they trained a machine-learning model on that data, as well as data from several drug databases. The model learned to make predictions of which drugs would interact with which transporters, based on similarities between the chemical structures of the drugs.
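One simple way to picture a prediction "based on similarities between chemical structures" is a nearest-neighbor lookup. The sketch below is an assumption-laden stand-in, not the researchers' model: it represents each drug as a set of made-up structural feature labels, compares drugs with Tanimoto (Jaccard) similarity, and copies transporter labels from the most similar known drug. The drug names, features, and helper functions are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_transporters(query, labeled, k=1):
    """Predict transporters by pooling the labels of the k most similar drugs."""
    ranked = sorted(
        labeled, key=lambda d: tanimoto(query, d["features"]), reverse=True
    )
    predicted = set()
    for drug in ranked[:k]:
        predicted |= drug["transporters"]
    return predicted


def may_interact(transporters_a, transporters_b):
    """Flag a potential interaction when two drugs share a transporter."""
    return bool(transporters_a & transporters_b)


# Hypothetical labeled drugs; features and labels are invented for the example.
labeled = [
    {"name": "drug_A", "features": {"f1", "f2", "f3"}, "transporters": {"PgP"}},
    {"name": "drug_B", "features": {"f4", "f5"}, "transporters": {"BCRP"}},
]
query = {"f1", "f2", "f6"}

predicted = predict_transporters(query, labeled)
print(predicted)                         # {'PgP'}
print(may_interact(predicted, {"PgP"}))  # True
```

The last line reflects the reasoning stated earlier in the article: two drugs that rely on the same transporter are candidates for interfering with each other.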

Using this model, the researchers analyzed a new set of 28 currently used drugs, as well as 1,595 experimental drugs. This screen yielded nearly 2 million predictions of potential drug interactions. Among them was the prediction that doxycycline, an antibiotic, could interact with warfarin, a commonly prescribed blood thinner. Doxycycline was also predicted to interact with digoxin, which is used to treat heart failure; levetiracetam, an antiseizure medication; and tacrolimus, an immunosuppressant.

Identifying interactions

To test those predictions, the researchers looked at data from about 50 patients who had been taking one of those three drugs when they were prescribed doxycycline. This data, which came from a patient database at Massachusetts General Hospital and Brigham and Women’s Hospital, showed that when doxycycline was given to patients already taking warfarin, the level of warfarin in the patients’ bloodstream went up, then went back down again after they stopped taking doxycycline.

That data also confirmed the model’s predictions that the absorption of doxycycline is affected by digoxin, levetiracetam, and tacrolimus. Only one of those drugs, tacrolimus, had been previously suspected to interact with doxycycline.

“These are drugs that are commonly used, and we are the first to predict this interaction using this accelerated in silico and in vitro model,” Traverso says. “This kind of approach gives you the ability to understand the potential safety implications of giving these drugs together.”

In addition to identifying potential interactions between drugs that are already in use, this approach could also be applied to drugs now in development. Using this technology, drug developers could tune the formulation of new drug molecules to prevent interactions with other drugs or improve their absorbability. Vivtex, a biotech company co-founded in 2018 by former MIT postdoc Thomas von Erlach, MIT Institute Professor Robert Langer, and Traverso to develop new oral drug delivery systems, is now pursuing that kind of drug-tuning.

The research was funded, in part, by the U.S. National Institutes of Health, the Department of Mechanical Engineering at MIT, and the Division of Gastroenterology at Brigham and Women’s Hospital.

Other authors of the paper include Langer, von Erlach, James Byrne, Ameya Kirtane, Kaitlyn Hess Jimenez, Zhuyi Wang, Natsuda Navamajiti, Cameron Young, Zachary Fralish, Zilu Zhang, Aaron Lopes, Vance Soares, Jacob Wainer, and Lei Miao.

3 Questions: Shaping the future of work in an age of AI

The MIT Shaping the Future of Work Initiative, co-directed by MIT professors Daron Acemoglu, David Autor, and Simon Johnson, celebrated its official launch on Jan. 22. The new initiative’s mission is to analyze the forces that are eroding job quality and labor market opportunities for non-college workers and identify innovative ways to move the economy onto a more equitable trajectory. Here, Acemoglu, Autor, and Johnson speak about the origins, goals, and plans for their new initiative.

Q: What was the impetus for creating the MIT Shaping the Future of Work Initiative?

David Autor: The last 40 years have been increasingly difficult for the 65 percent of U.S. workers who do not have a four-year college degree. Globalization, automation, deindustrialization, de-unionization, and changes in policy and ideology have led to fewer jobs, declining wages, and lower job quality, resulting in widening inequality and shrinking opportunities.

The prevailing economic view has been that this erosion is inevitable — that the best we can do is focus on the supply side, educating workers to meet market demands, or perhaps providing some offsetting transfers to those who have lost employment opportunities.

Underpinning this fatalism is a paradigm which says that the factors shaping demand for work, such as technological change, are immutable: workers must adapt to these forces or be left behind. This assumption is false. The direction of technology is something we choose, and the institutions that shape how these forces play out (e.g., minimum wage laws, regulations, collective bargaining, public investments, social norms) are also endogenous.

To challenge a prevailing narrative, it is not enough to simply say that it is wrong — to truly change a paradigm we must lead by showing a viable alternative pathway. We must answer what sort of work we want and how we can make policies and shape technology that builds that future.

Q: What are your goals for the initiative?

Daron Acemoglu: The initiative’s ambition is not modest. Simon, David, and I are hoping to make advances in new empirical work to interpret what has happened in the recent past and understand how different types of technologies could be impacting prosperity and inequality. We want to contribute to the emergence of a coherent framework that can inform us about how institutions and social forces shape the trajectory of technology, and that helps us to identify, empirically and conceptually, the inefficiencies and the misdirections of technology. And on this basis, we are hoping to contribute to policy discussions in which policy, institutions, and norms are part of what shapes the future of technology in a more beneficial direction. Last but not least, our mission is not just to do our own research, but to help build an ecosystem in which other, especially younger, researchers are inspired to explore these issues.

Q: What are your next steps?

Simon Johnson: David, Daron, and I plan for this initiative to move beyond producing insightful and groundbreaking research — our aim is to identify innovative pro-worker ideas that policymakers, the private sector, and civil society can use. We will continue to translate research into practice by regularly convening students, scholars, policymakers, and practitioners who are shaping the future of work — to include fortifying and diversifying the pipeline of emerging scholars who produce policy-relevant research around our core themes.

We will also produce a range of resources to bring our work to wider audiences. Last fall, David, Daron, and I wrote the initiative’s inaugural policy memo, entitled “Can we Have Pro-Worker AI? Choosing a path of machines in service of minds.” Our thesis is that, instead of focusing on replacing workers by automating job tasks as quickly as possible, the best path forward is to focus on developing worker-augmenting AI tools that enable less-educated or less-skilled workers to perform more expert tasks — as well as creating work, in the form of new productive tasks, for workers across skill and education levels.

As we move forward, we will also look for opportunities to engage globally with a wide range of scholars working on related issues.

Putting AI into the hands of people with problems to solve

As Media Lab students in 2010, Karthik Dinakar SM ’12, PhD ’17 and Birago Jones SM ’12 teamed up for a class project to build a tool that would help content moderation teams at companies like Twitter (now X) and YouTube. The project generated a huge amount of excitement, and the researchers were invited to give a demonstration at a cyberbullying summit at the White House — they just had to get the thing working.

The day before the White House event, Dinakar spent hours trying to put together a working demo that could identify concerning posts on Twitter. Around 11 p.m., he called Jones to say he was giving up.

Then Jones decided to look at the data. It turned out Dinakar’s model was flagging the right types of posts, but the posters were using teenage slang terms and other indirect language that Dinakar didn’t pick up on. The problem wasn’t the model; it was the disconnect between Dinakar and the teens he was trying to help.

“We realized then, right before we got to the White House, that the people building these models should not be folks who are just machine-learning engineers,” Dinakar says. “They should be people who best understand their data.”

The insight led the researchers to develop point-and-click tools that allow nonexperts to build machine-learning models. Those tools became the basis for Pienso, which today is helping people build large language models for detecting misinformation, human trafficking, weapons sales, and more, without writing any code.

“These kinds of applications are important to us because our roots are in cyberbullying and understanding how to use AI for things that really help humanity,” says Jones.

As for the early version of the system shown at the White House, the founders ended up collaborating with students at nearby schools in Cambridge, Massachusetts, to let them train the models.

“The models those kids trained were so much better and nuanced than anything I could’ve ever come up with,” Dinakar says. “Birago and I had this big ‘Aha!’ moment where we realized empowering domain experts — which is different from democratizing AI — was the best path forward.”

A project with purpose

Jones and Dinakar met as graduate students in the Software Agents research group of the MIT Media Lab. Their work on what became Pienso started in Course 6.864 (Natural Language Processing) and continued until they earned their master’s degrees in 2012.

It turned out 2010 wasn’t the last time the founders were invited to the White House to demo their project. The work generated a lot of enthusiasm, but the founders worked on Pienso part time until 2016, when Dinakar finished his PhD at MIT and deep learning began to explode in popularity.

“We’re still connected to many people around campus,” Dinakar says. “The exposure we had at MIT, the melding of human and computer interfaces, widened our understanding. Our philosophy at Pienso couldn’t be possible without the vibrancy of MIT’s campus.”

The founders also credit MIT’s Industrial Liaison Program (ILP) and Startup Accelerator (STEX) for connecting them to early partners.

One early partner was SkyUK. The company’s customer success team used Pienso to build models to understand their customers’ most common problems. Today those models are helping to process half a million customer calls a day, and the founders say they have saved the company over £7 million to date by shortening the length of calls into the company’s call center.

“The difference between democratizing AI and empowering people with AI comes down to who understands the data best — you or a doctor or a journalist or someone who works with customers every day?” Jones says. “Those are the people who should be creating the models. That’s how you get insights out of your data.”

In 2020, just as Covid-19 outbreaks began in the U.S., government officials contacted the founders to use their tool to better understand the emerging disease. Pienso helped experts in virology and infectious disease set up machine-learning models to mine thousands of research articles about coronaviruses. Dinakar says they later learned the work helped the government identify and strengthen critical supply chains for drugs, including the popular antiviral remdesivir.

“Those compounds were surfaced by a team that did not know deep learning but was able to use our platform,” Dinakar says.

Building a better AI future

Because Pienso can run on internal servers and cloud infrastructure, the founders say it offers an alternative for businesses being forced to donate their data by using services offered by other AI companies.

“The Pienso interface is a series of web apps stitched together,” Dinakar explains. “You can think of it like an Adobe Photoshop for large language models, but in the web. You can point and import data without writing a line of code. You can refine the data, prepare it for deep learning, analyze it, give it structure if it’s not labeled or annotated, and you can walk away with a fine-tuned large language model in a matter of 25 minutes.”

Earlier this year, Pienso announced a partnership with GraphCore, which provides a faster, more efficient computing platform for machine learning. The founders say the partnership will further lower barriers to leveraging AI by dramatically reducing latency.

“If you’re building an interactive AI platform, users aren’t going to have a cup of coffee every time they click a button,” Dinakar says. “It needs to be fast and responsive.”

The founders believe their solution is enabling a future where more effective AI models are developed for specific use cases by the people who are most familiar with the problems they are trying to solve.

“No one model can do everything,” Dinakar says. “Everyone’s application is different, their needs are different, their data is different. It’s highly unlikely that one model will do everything for you. It’s about bringing a garden of models together and allowing them to collaborate with each other and orchestrating them in a way that makes sense — and the people doing that orchestration should be the people who understand the data best.”

Generative AI for smart grid modeling

MIT’s Laboratory for Information and Decision Systems (LIDS) has been awarded $1,365,000 in funding from the Appalachian Regional Commission (ARC) to support its involvement with an innovative project, “Forming the Smart Grid Deployment Consortium (SGDC) and Expanding the HILLTOP+ Platform.”

The grant was made available through ARC’s Appalachian Regional Initiative for Stronger Economies, which fosters regional economic transformation through multi-state collaboration.

Led by Kalyan Veeramachaneni, principal research scientist and principal investigator at LIDS’ Data to AI Group, the project will focus on creating AI-driven generative models for customer load data. Veeramachaneni and colleagues will work alongside a team of universities and organizations led by Tennessee Tech University, including collaborators across Ohio, Pennsylvania, West Virginia, and Tennessee, to develop and deploy smart grid modeling services through the SGDC project.

These generative models have far-reaching applications, including grid modeling and training algorithms for energy tech startups. When the models are trained on existing data, they create additional, realistic data that can augment limited datasets or stand in for sensitive ones. Stakeholders can then use these models to understand and plan for specific what-if scenarios far beyond what could be achieved with existing data alone. For example, generated data can predict the potential load on the grid if an additional 1,000 households were to adopt solar technologies, how that load might change throughout the day, and similar contingencies vital to future planning.

The generative AI models developed by Veeramachaneni and his team will provide inputs to modeling services based on the HILLTOP+ microgrid simulation platform, originally prototyped by MIT Lincoln Laboratory. HILLTOP+ will be used to model and test new smart grid technologies in a virtual “safe space,” providing rural electric utilities with increased confidence in deploying smart grid technologies, including utility-scale battery storage. Energy tech startups will also benefit from HILLTOP+ grid modeling services, enabling them to develop and virtually test their smart grid hardware and software products for scalability and interoperability.

The project aims to assist rural electric utilities and energy tech startups in mitigating the risks associated with deploying these new technologies. “This project is a powerful example of how generative AI can transform a sector — in this case, the energy sector,” says Veeramachaneni. “In order to be useful, generative AI technologies and their development have to be closely integrated with domain expertise. I am thrilled to be collaborating with experts in grid modeling, and working alongside them to integrate the latest and greatest from my research group and push the boundaries of these technologies.”

“This project is testament to the power of collaboration and innovation, and we look forward to working with our collaborators to drive positive change in the energy sector,” says Satish Mahajan, principal investigator for the project at Tennessee Tech and a professor of electrical and computer engineering. Tennessee Tech’s Center for Rural Innovation director, Michael Aikens, adds, “Together, we are taking significant steps towards a more sustainable and resilient future for the Appalachian region.”

New AI model could streamline operations in a robotic warehouse

Hundreds of robots zip back and forth across the floor of a colossal robotic warehouse, grabbing items and delivering them to human workers for packing and shipping. Such warehouses are increasingly becoming part of the supply chain in many industries, from e-commerce to automotive production.

However, getting 800 robots to and from their destinations efficiently while keeping them from crashing into each other is no easy task. It is such a complex problem that even the best path-finding algorithms struggle to keep up with the breakneck pace of e-commerce or manufacturing. 

In a sense, these robots are like cars trying to navigate a crowded city center. So, a group of MIT researchers who use AI to mitigate traffic congestion applied ideas from that domain to tackle this problem.

They built a deep-learning model that encodes important information about the warehouse, including the robots, planned paths, tasks, and obstacles, and uses it to predict the best areas of the warehouse to decongest to improve overall efficiency.

Their technique divides the warehouse robots into groups, so these smaller groups of robots can be decongested faster with traditional algorithms used to coordinate robots. In the end, their method decongests the robots nearly four times faster than a strong random search method.

In addition to streamlining warehouse operations, this deep learning approach could be used in other complex planning tasks, like computer chip design or pipe routing in large buildings.

“We devised a new neural network architecture that is actually suitable for real-time operations at the scale and complexity of these warehouses. It can encode hundreds of robots in terms of their trajectories, origins, destinations, and relationships with other robots, and it can do this in an efficient manner that reuses computation across groups of robots,” says Cathy Wu, the Gilbert W. Winslow Career Development Assistant Professor in Civil and Environmental Engineering (CEE), and a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society (IDSS).

Wu, senior author of a paper on this technique, is joined by lead author Zhongxia Yan, a graduate student in electrical engineering and computer science. The work will be presented at the International Conference on Learning Representations.

Robotic Tetris

From a bird’s eye view, the floor of a robotic e-commerce warehouse looks a bit like a fast-paced game of “Tetris.”

When a customer order comes in, a robot travels to an area of the warehouse, grabs the shelf that holds the requested item, and delivers it to a human operator who picks and packs the item. Hundreds of robots do this simultaneously, and if two robots’ paths conflict as they cross the massive warehouse, they might crash.

Traditional search-based algorithms avoid potential crashes by keeping one robot on its course and replanning a trajectory for the other. But with so many robots and potential collisions, the problem quickly grows exponentially.

“Because the warehouse is operating online, the robots are replanned about every 100 milliseconds. That means that every second, a robot is replanned 10 times. So, these operations need to be very fast,” Wu says.

Because time is so critical during replanning, the MIT researchers use machine learning to focus the replanning on the most actionable areas of congestion — where there exists the most potential to reduce the total travel time of robots.

Wu and Yan built a neural network architecture that considers smaller groups of robots at the same time. For instance, in a warehouse with 800 robots, the network might cut the warehouse floor into smaller groups that contain 40 robots each.

Then, it predicts which group has the most potential to improve the overall solution if a search-based solver were used to coordinate trajectories of robots in that group.

An iterative process, the overall algorithm picks the most promising robot group with the neural network, decongests the group with the search-based solver, then picks the next most promising group with the neural network, and so on.
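The select-then-solve loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the researchers' implementation: `score_group` stands in for the neural network's prediction, `solve_group` stands in for the search-based solver, and the group names, delay values, and "halve the delay" behavior are all invented for the example.

```python
def decongest(groups, score_group, solve_group, iterations):
    """Repeatedly pick the highest-scoring group and hand it to the solver.

    score_group: stand-in for the learned model that predicts which group
                 offers the most potential improvement.
    solve_group: stand-in for the search-based solver that replans the group.
    """
    history = []
    for _ in range(iterations):
        best = max(groups, key=score_group)  # most promising group right now
        solve_group(best)                    # replan only that group
        history.append(best)
    return history


# Toy setup: each group's score is simply its current total delay, and
# "solving" a group is assumed to halve that delay.
delay = {"A": 10.0, "B": 40.0, "C": 25.0}


def toy_score(group):
    return delay[group]


def toy_solve(group):
    delay[group] *= 0.5


print(decongest(list(delay), toy_score, toy_solve, iterations=3))  # ['B', 'C', 'B']
print(sum(delay.values()))                                         # 32.5
```

Because the scores are recomputed on every pass, a group that was just replanned can be chosen again later if it still looks like the most promising one, as group B is here.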

Considering relationships

The neural network can reason about groups of robots efficiently because it captures complicated relationships that exist between individual robots. For example, even though one robot may be far away from another initially, their paths could still cross during their trips.

The technique also streamlines computation by encoding constraints only once, rather than repeating the process for each subproblem. For instance, in a warehouse with 800 robots, decongesting a group of 40 robots requires holding the other 760 robots as constraints. Other approaches require reasoning about all 800 robots once per group in each iteration.

Instead, the researchers’ approach only requires reasoning about the 800 robots once across all groups in each iteration.
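A rough back-of-the-envelope count shows why encoding the robots once matters. The sketch below uses "robots reasoned about per iteration" as a crude proxy for work, and it assumes the floor is split into equal, non-overlapping groups; both are simplifying assumptions for illustration, not figures from the paper.

```python
def robots_reasoned_about(total_robots, group_size):
    """Compare how many robots are processed per iteration under two schemes.

    Returns (number of groups, count when every group re-encodes all robots,
    count when all robots are encoded once and shared across groups).
    """
    groups = total_robots // group_size
    per_group_encoding = groups * total_robots  # all robots, once per group
    shared_encoding = total_robots              # all robots, once in total
    return groups, per_group_encoding, shared_encoding


print(robots_reasoned_about(800, 40))  # (20, 16000, 800)
```

With 800 robots in groups of 40, there are 20 groups, so re-encoding everything per group touches 16,000 robot encodings per iteration, compared with 800 when the encoding is shared.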

“The warehouse is one big setting, so a lot of these robot groups will have some shared aspects of the larger problem. We designed our architecture to make use of this common information,” she adds.

They tested their technique in several simulated environments, including some set up like warehouses, some with random obstacles, and even maze-like settings that emulate building interiors.

By identifying more effective groups to decongest, their learning-based approach decongests the warehouse up to four times faster than strong, non-learning-based approaches. Even when they factored in the additional computational overhead of running the neural network, their approach still solved the problem 3.5 times faster.

In the future, the researchers want to derive simple, rule-based insights from their neural model, since the decisions of the neural network can be opaque and difficult to interpret. Simpler, rule-based methods could also be easier to implement and maintain in actual robotic warehouse settings.

“This approach is based on a novel architecture where convolution and attention mechanisms interact effectively and efficiently. Impressively, this leads to being able to take into account the spatiotemporal component of the constructed paths without the need of problem-specific feature engineering. The results are outstanding: Not only is it possible to improve on state-of-the-art large neighborhood search methods in terms of quality of the solution and speed, but the model generalizes to unseen cases wonderfully,” says Andrea Lodi, the Andrew H. and Ann R. Tisch Professor at Cornell Tech, who was not involved with this research.

This work was supported by Amazon and the MIT Amazon Science Hub.

Sadhana Lolla named 2024 Gates Cambridge Scholar

MIT senior Sadhana Lolla has won the prestigious Gates Cambridge Scholarship, which offers students an opportunity to pursue graduate study in the field of their choice at Cambridge University in the U.K.

Established in 2000, the Gates Cambridge Scholarship offers full-cost post-graduate scholarships to outstanding applicants from countries outside of the U.K. The mission of the scholarship is to build a global network of future leaders committed to improving the lives of others.

Lolla, a senior from Clarksburg, Maryland, is majoring in computer science and minoring in mathematics and literature. At Cambridge, she will pursue an MPhil in technology policy.

In the future, Lolla aims to lead conversations on deploying and developing technology for marginalized communities, such as the rural Indian village that her family calls home, while also conducting research in embodied intelligence.

At MIT, Lolla conducts research on safe and trustworthy robotics and deep learning at the Distributed Robotics Laboratory with Professor Daniela Rus. Her research has spanned debiasing strategies for autonomous vehicles and accelerating robotic design processes. At Microsoft Research and Themis AI, she works on creating uncertainty-aware frameworks for deep learning, which has impacts across computational biology, language modeling, and robotics. She has presented her work at the Neural Information Processing Systems (NeurIPS) conference and the International Conference on Machine Learning (ICML). 

Outside of research, Lolla leads initiatives to make computer science education more accessible globally. She is an instructor for class 6.s191 (MIT Introduction to Deep Learning), one of the largest AI courses in the world, which reaches millions of students annually. She serves as the curriculum lead for Momentum AI, the only U.S. program that teaches AI to underserved students for free, and she has taught hundreds of students in Northern Scotland as part of the MIT Global Teaching Labs program.

Lolla was also the director for xFair, MIT’s largest student-run career fair, and is an executive board member for Next Sing, where she works to make a cappella more accessible for students across musical backgrounds. In her free time, she enjoys singing, solving crossword puzzles, and baking. 

“Between Sadhana’s impressive research in the Distributed Robotics Group, her volunteer teaching with Momentum AI, and her internship and extracurricular experiences, she has developed the skills to be a leader,” says Kim Benard, associate dean of distinguished fellowships in Career Advising and Professional Development. “Her work at Cambridge will allow her the time to think about reducing bias in systems and the ethical implications of her work. I am proud that she will be representing MIT in the Gates Cambridge community.”