First Real-World Study Showed Generative AI Boosted Worker Productivity by 14%

The study found the lowest-skilled customer-service workers reaped the greatest benefits when AI was rolled out at a Fortune 500 software firm over the course of a year.

(Bloomberg) — Customer service workers at a Fortune 500 software firm who were given access to generative artificial intelligence tools became 14% more productive on average than those who were not, with the least-skilled workers reaping the most benefit.

That’s according to a new study by researchers at Stanford University and the Massachusetts Institute of Technology who tested the impact of generative AI tools on productivity at the company over the course of a year.

The research marks the first time the impact of generative AI tools on work has been measured outside the lab. Prior studies have benchmarked the capabilities of large language models against tasks in fields like law and medicine, showing, for example, that GPT-4 scores in the 90th percentile on the bar exam. Other research has tested the tech’s impact on workers’ performance of isolated writing tasks in small-scale laboratory settings.

The results of some of these earlier experiments showed the potential, at times surprising, of large language models in the workplace, said Erik Brynjolfsson, the director of the Digital Economy Lab at the Stanford Institute for Human-Centered AI. But until the tools are tested in the real world, he said, their impact remains mostly speculative.

“Having people use it for over a year in this company, you get a much better sense of how that translates into real-world productivity,” Brynjolfsson, one of the study’s co-authors, said in an interview. “As far as I know, this is the first time it’s been done in a real-world setting.”

Brynjolfsson, alongside MIT researchers Danielle Li and Lindsey Raymond, tracked the performance of more than 5,000 customer support agents, based primarily in the Philippines, across key metrics like how quickly and successfully workers were able to solve clients’ problems. The agents were divided into groups: Some were given access to the AI tools — trained on a large set of successful customer service conversations — and others were not. The name of the company, which specializes in enterprise software for small and medium-sized US businesses, was not disclosed in the report. 

Gains for Low-Skilled Workers 

One of the study’s findings was that novice workers benefitted most from the tech, the researchers said. With the assistance of AI, the firm’s least-skilled workers were able to get their work done 35% faster. New workers’ performance also improved much more rapidly with the assistance of AI than without: According to the study, agents with two months of experience who were aided by AI performed just as well or better in many ways than agents with over six months of experience who worked without AI. 

The research suggests that the boost in low-skilled workers’ productivity and performance may come, in part, from the way that AI tools can absorb the tacit knowledge that makes the firm’s top performers excel (like knowing the best language to use to soothe an irate customer or what technical documentation would be most helpful to share in each situation) and then disseminate that knowledge to less-skilled or less-experienced workers through AI-generated suggested responses.

These findings run counter to the prevailing notion that automation tends to hurt low-skilled workers most, as has played out over the last several decades of technological advances in manufacturing and other industries.

The productivity gains — about 14% on average — were less dramatic than in prior experiments, likely because real-world workplace processes are much more complex than one-off tasks. Still, the boost in productivity was significant. “This suggests that those laboratory studies were pointing in the right direction, and that they weren’t just mirages,” Brynjolfsson said.

Compensation Questions

The most highly skilled workers saw little to no benefit from the introduction of AI into their work. These top performers were likely already giving responses of the same caliber as those the AI was recommending, so there was less room for improvement. If anything, the prompts may have been a distraction, the researchers said.

If AI does ultimately narrow the gap between low- and high-skilled workers, however, companies may need to fundamentally rethink the logic underpinning compensation choices.

Top customer service agents had Excel spreadsheets where they collected phrases that they used often and that worked well, MIT’s Raymond said. If the AI tool is indeed taking this tacit knowledge and distributing it to others, she said, “then these high-skilled workers are doing the additional service for the firm by providing these examples for the AI, but they’re not being compensated for it.” In fact, they may be worse off because their incentives were based on performance relative to their peers. That raises a host of weighty policy questions about how workers should be compensated for the value of their data.

Forward-thinking companies would be wise to recognize the expertise of their star employees since their tacit knowledge and skill will likely form the basis of the AI tools that will power the rest of the organization, said Brynjolfsson.

“Successful companies will have incentive and reward systems that recognize that these top performers — whether or not their performance with any given customer is demonstrably better than the less-skilled workers — create knowledge that the whole organization depends on,” he said. “It wouldn’t be far-fetched for them to put even more of a premium on those people because now that kind of skill gets amplified and multiplied throughout the organization. Now that top worker could change the whole organization.”

Of course, Brynjolfsson noted that it’s still early days in the study of generative AI and this research isn’t the final word — much more is left to learn.

Remaking the Workplace

Some of the observations yielded by the field experiment weren’t captured by the data but point to dozens of other ways these tools may soon reshape workplaces. Anecdotally, managers at the firm were no longer spending 20 to 30 hours per week coaching employees, Raymond said, likely because the AI served as a substitute of sorts. That could, in turn, change the employee-manager relationship, since supervisors would then spend less time with direct reports and would instead take on larger teams. 

Still, the speed at which generative AI appears to be capable of transfiguring workplaces — seemingly overnight — is dizzying, especially relative to prior technological breakthroughs. 

“There’s a mountain of research that these transformative technologies — like electricity or the steam engine or computers — took decades before they really moved the dial on productivity. In the case of electricity, it was about 30 years between when it was introduced into factories and when you really saw significant productivity gains,” Brynjolfsson said. “So there’s a concern, even an expectation, that this would play out over the period of many years or a decade or more. But the fact that we’re already seeing it so quickly says something about the technology, and our ability to implement it and get practical results a lot faster than we did in the past.” 

In light of these early results, Brynjolfsson has advice for workers and executives: Embrace this technology.

“Start experimenting with it and learn what it can do. Find out where it’s most effective and where it’s least effective,” he said. “Companies should have crash programs to educate their workforce on them and really get up to speed.”

©2023 Bloomberg L.P.