Over the past decade, we have seen how AI, when coupled with complementary management practices that rewire the way organizations operate, can generate real business value. Even before the advent of gen AI, analytical AI was in use at roughly half of the enterprises represented in McKinsey’s Global Survey on AI.8 Those organizations have deployed the technology across a variety of business functions, from increasing revenue through more targeted marketing to reducing costs in supply chain operations. Since ChatGPT became available in late 2022, the percentage of organizations reporting that they use AI has jumped by 20 percentage points, with companies implementing gen AI in use cases from customer service to software engineering.
Most of these applications of AI have been aimed at improving the efficiency of existing tasks and workflows. But boosting efficiency and productivity is just one way that AI promises to unlock a new era of growth and opportunity. Our research shows that AI can also be deployed to accelerate innovation and create entirely new products and services. Put another way: AI can be used to bend the curve of the declining R&D productivity we documented in the previous section.
We have identified three primary channels through which AI technologies can accelerate innovation, each with a corresponding type of model: increasing the velocity, volume, and variety of design candidate generation; accelerating the evaluation of candidates through AI surrogate models; and accelerating research operations.
Increasing the velocity, volume, and variety of design candidate generation
A simplified model of the R&D process consists of identifying a set of customer needs, generating candidate designs, and then evaluating those ideas to identify the most promising ones that will best meet the needs of the customer or user. One of the highest potential opportunities for AI to enhance innovation is to more quickly generate a greater volume and variety of design candidates.
Gen AI technology is based on foundation models: very large neural networks trained on vast collections of data to take unstructured data (data that isn’t best stored in rows and columns like a spreadsheet, such as human language) as input and generate unstructured data as output. Large language models (LLMs) are the best-known type of foundation model, underpinning the chatbots that have made gen AI such a compelling technology.
However, foundation models can be trained to produce outputs other than human language. They can be trained to generate chemical compounds, drug candidates, computer code, electrical designs, physical designs, and other types of potential solutions. With sufficient computing power, these models can generate design candidates far more quickly than researchers, designers, or engineers can on their own—increasing the number of “shots on goal” that could potentially produce a successful design.
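The “shots on goal” logic can be sketched in a few lines. This is a toy illustration only, not any organization’s actual pipeline: the candidate generator and the scoring function below are hypothetical stand-ins for a trained generative model and a real evaluation against customer needs.

```python
import random

random.seed(0)

def generate_candidate():
    # Stand-in for a generative model proposing one design
    # (here, two hypothetical design parameters in [0, 1]).
    return (random.uniform(0, 1), random.uniform(0, 1))

def score(candidate):
    # Stand-in for evaluating how well a design meets customer
    # needs (higher is better); purely illustrative.
    x, y = candidate
    return 1.0 - ((x - 0.7) ** 2 + (y - 0.3) ** 2)

# More "shots on goal": the best of 500 candidates can be no worse
# than the best of the first 5, since those 5 are a subset.
candidates = [generate_candidate() for _ in range(500)]
best_of_5 = max(score(c) for c in candidates[:5])
best_of_500 = max(score(c) for c in candidates)
```

The subset structure makes the point deterministic: widening the search can only improve (never worsen) the best design found, which is why cheap, high-volume candidate generation matters.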
For example, a retailer used gen AI tools to create dozens of alternative 3D store configurations, rendered with photorealistic fidelity. Using traditional computer-aided design and rendering tools, a designer might have only created a handful of sketches, and at a much lower level of fidelity. Without the ability to quickly generate a variety of alternative designs, many of these options would likely not have been considered. An unexpected side benefit of the AI-generated 3D renderings was the discovery of certain aesthetic decor features inserted by the foundation model to fill out the rendering—features that appealed to consumers but were not in the initial design parameters.
Thus, not only can AI quickly generate a greater volume of candidates, but AI systems also can generate a greater variety of candidates—in particular, designs that a human researcher or engineer would be less likely to produce, given the biases that stem from their training and on-the-job experiences. Provocatively framed, AI can be more creative than humans.
An early example of AI’s ability to generate ideas that a human would not have considered occurred in March 2016. DeepMind had trained an AI-powered system called AlphaGo that squared off against the world’s top Go player, Lee Sedol, in Seoul. Go is considered one of the most complex and strategic board games in the world. Perhaps more remarkable than the fact that the AI triumphed in the best-of-five match was the now-famous “Move 37” in game two: AlphaGo made such an unexpected move that several commentators believed it was a mistake. It was a completely outside-the-box move, defying centuries of conventional Go strategic principles. It was, as one commentator noted, “a move no human would ever make.” It was fresh, it was novel—and it was foundational to AlphaGo’s victory.
Some R&D organizations have described AI generating similarly innovative ideas in the lab. For example, David Baker, a researcher at the University of Washington, has led a team that uses deep learning models to design novel proteins that bind targets and catalyze reactions. More specifically, Baker and his team are creating entirely new proteins, complex functional molecules that interact with other molecules at the atomic level and don’t already exist in nature, a goal that scientists could not accomplish without AI tools. Among the applications of these custom-designed proteins: new vaccines and medicines, biosensors for hazardous materials, and agents that can capture or break down environmental pollutants. For this pioneering work, Baker shared the 2024 Nobel Prize in Chemistry.
The ability to use AI to creatively generate a greater variety of candidates hasn’t only been applied at the molecular level; it is also being applied in physical engineering disciplines. Generative models, for example, are currently being used to design rocket engines with novel geometries, particularly their cooling channels, which are becoming manufacturable with 3D printing.
Accelerating the evaluation of candidates through AI surrogate models
A complementary activity in the product development life cycle is the evaluation of candidate designs. For physical products, this has historically meant manufacturing prototypes and then subjecting them to a regimen of physical tests—for example, the crash tests that automobile manufacturers perform to test the safety of their vehicles. But these tests tend to be both costly and time consuming.
Unsurprisingly, over many years, scientists and engineers have developed mathematical and computational models that simulate the behavior of physical systems, enabling in silico testing. Rather than putting an airplane design into a wind tunnel or a racing yacht design into water, designers use computational fluid dynamics (CFD) to evaluate the performance of a particular configuration. Instead of building a prototype structure, engineers can use finite element analysis (FEA) to predict how forces will affect a design. Rather than setting up physical experiments, radio engineers can use computational electromagnetic (CEM) modeling to predict how an antenna design might perform.
While these physics-based mathematical models are often less expensive than physical experiments, the simulations are often extremely computationally intensive and can take many hours, or even days, to run. Researchers have recently found, however, that the neural network technology developed for AI systems can be repurposed to train models that act as proxies for these more computationally intensive physics-based models. These AI surrogate models do not imitate the “thinking” that people do; instead, they predict the outcomes of physical phenomena in the world. When used to predict the behavior of a complete system, such models are akin to a “digital twin.”
Take weather forecasting. Over the years, scientists have developed complex and detailed models of the Earth’s weather that have enabled increasingly accurate forecasts. However, because of their computational intensity, these physics-based simulations must be run on powerful supercomputing clusters. DeepMind, for example, trained a neural-network-based machine learning model that predicts the weather faster (eight minutes versus hours) and more accurately on a single AI-optimized processor than a top operational physics-based weather-forecasting system running on a supercomputer with tens of thousands of processors.9
The same kinds of techniques are being used to evaluate product designs. As previously noted, computational fluid dynamics is used to simulate the aerodynamic performance of aircraft (and automobiles, including racing cars). Designers are now using neural network models trained on wind tunnel and CFD data to predict hundreds of results in a few seconds, for ranges of flow velocities and angles that were not included in the original wind tunnel tests or CFD simulations, results that would otherwise have taken hours or days to produce. The benefit is not simply the speed of a single simulation run but the ability to test a panoply of possibilities. In a CFD case, engineers can test many alternatives to optimize the design of a turbine compressor. They can then use other automated systems to check for manufacturability, reliability, and product cost, and run through iterations that would otherwise not have been possible in a reasonable time.
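A minimal sketch can make the surrogate pattern concrete. Here a cheap interpolation model stands in for a trained neural surrogate, and a toy function stands in for an expensive physics solver; every name and function below is an illustrative assumption, not a real CFD workflow.

```python
import bisect
import math

def expensive_simulation(x):
    # Stand-in for a physics-based solver (e.g., one CFD run):
    # accurate but slow. Here, just an illustrative smooth function.
    return math.sin(3 * x) * math.exp(-x)

# "Offline" phase: run the expensive simulator on a coarse grid of
# design points to build training data for the surrogate.
grid = [i / 20 for i in range(21)]            # 21 design points in [0, 1]
table = [(x, expensive_simulation(x)) for x in grid]

def surrogate(x):
    # Cheap surrogate: linear interpolation over precomputed simulator
    # outputs. Real surrogates are typically neural networks trained
    # on many simulation runs, but the role is the same.
    i = bisect.bisect_left(grid, x)
    if i == 0:
        return table[0][1]
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

# "Online" phase: sweep hundreds of candidate designs near-instantly
# and pick the best, something infeasible with the slow solver alone.
candidates = [i / 500 for i in range(501)]
best = max(candidates, key=surrogate)
err = max(abs(surrogate(x) - expensive_simulation(x)) for x in candidates)
```

The design choice mirrors the text: pay the simulation cost once to build training data, then amortize it across as many candidate evaluations as the search requires, accepting a small, quantifiable approximation error (`err`) in exchange.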
In the life sciences, researchers are using similar techniques to study the proteins that exist in the world. Predicting the 3D structure of a protein from its known sequence of amino acids has historically been incredibly challenging, involving myriad atomic-level interactions. Demis Hassabis and John Jumper of Google DeepMind won the other half of the 2024 Nobel Prize in Chemistry for training a model that predicts the 3D structure of proteins; it has now been used to predict the structures of around 200 million proteins, covering almost every known protein.10 The ability to predict molecular structures and their interactions can enable the testing and evaluation of various biological products, from therapies to treat disease to biological production of materials.
Some design challenges require evaluating and optimizing designs across multiple interacting physical phenomena, so-called multiphysics problems. The need to model several physical domains together multiplies the complexity. For instance, designing an aircraft antenna could require understanding not only the design’s radio frequency characteristics but also its aerodynamic and thermal properties, all of which can interact with one another. Given sufficient training data, integrated neural-network-based models can span multiple modalities, multiplying their potential to accelerate design candidate evaluation.
Accelerating research operations
In addition to generating and evaluating design candidates, there are several additional ways that LLMs, sometimes coupled with other AI technologies, are being used to accelerate various activities in the product development process:
Identifying and analyzing customer/user needs, products, and features. LLM-powered software solutions are being used, particularly by consumer companies, to synthesize a vast array of product reviews, social media posts, customer service transcripts, and other sources of customer data to identify addressable market segments and the product categories and features/functions that would best address the as-yet unmet needs of customers.
Exploring and synthesizing existing research and data. In industries such as life sciences, chemicals, and materials, there is a vast and rapidly growing body of published research and databases. It can be challenging for scientists to keep up with the literature in their own subdiscipline, let alone in adjacent or more distant fields whose findings could spark breakthroughs in their own. Often, the volume of machine-readable data being made available grows even more rapidly than the published literature.
Tools enabled by LLMs and analytical AI can synthesize insights from published literature and databases, both to inform innovation practitioners and to suggest potential avenues for creating solutions. Google, OpenAI, Perplexity, and Anthropic, for example, have all introduced knowledge agent products that perform multistep research tasks that one might otherwise assign to a research assistant: creating a work plan, searching a set of sources on the web, producing a well-structured research report.
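The plan–search–report loop these knowledge agents automate can be sketched as follows. This is a toy outline under stated assumptions: the in-memory corpus is a hypothetical stand-in for web search, and the `plan` and `report` functions stand in for what would be LLM calls in a real product.

```python
# Hypothetical mini-corpus standing in for searchable sources.
corpus = {
    "surrogate models": "Neural surrogates approximate slow physics solvers.",
    "protein design": "Deep learning models can propose novel proteins.",
}

def plan(question):
    # Step 1: break the question into sub-queries
    # (an LLM call in a real agent; fixed here for illustration).
    return ["surrogate models", "protein design"]

def search(query):
    # Step 2: retrieve sources for one sub-query
    # (web or database search in a real agent).
    return corpus.get(query, "")

def report(question, findings):
    # Step 3: synthesize a structured research report
    # (an LLM call in a real agent; simple formatting here).
    lines = [f"# {question}"]
    lines += [f"- {q}: {text}" for q, text in findings if text]
    return "\n".join(lines)

question = "How is AI accelerating R&D?"
findings = [(q, search(q)) for q in plan(question)]
summary = report(question, findings)
```

Even in this stripped-down form, the structure is the one described above: a work plan, a pass over sources, and a well-structured synthesis, each step a natural place to substitute a model call.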
Streamlining internal knowledge management. Not only is there a burgeoning volume of publications and data available publicly, but large corporations hold a huge amount of both codified knowledge in various databases and tacit knowledge in the minds of employees. LLM-powered tools can help to codify tacit knowledge—say, transcribing and capturing recorded meetings and other communications (with the permission of the participants, of course). Tools similar to the publicly available research products previously mentioned can then help product development practitioners find relevant corporate knowledge, which can be combined with externally sourced data to generate syntheses and insights.
Automating documentation tasks. In many product development processes, particularly in highly regulated industries such as pharmaceuticals and aircraft manufacturing, there are significant documentation requirements—for example, for regulatory filings, engineering change orders, and other required documentation. LLMs can accelerate the process of both generating and synthesizing these documents. (Of course, systems must be put in place, including human review, to ensure that these documents meet requirements for accuracy and fidelity.)
Collaborating with humans for ideation and concept development. Product managers, scientists, engineers, designers, and other participants in the product development process can “converse” with LLMs to stimulate ideas, get “opinions,” and have their ideas challenged, much as they would with a colleague. These experiences illustrate that it is possible for humans and AI to collaborate, but the human skill in using AI tools can significantly influence the effectiveness of these collaborations (see sidebar, “Agents in R&D”).