Why Real-World Data Is Still a Big Challenge in AI Production


Artificial intelligence is powering a fourth Industrial Revolution, and companies across verticals are vying to capitalize on its promise. A 2020 survey of C-suite executives at leading firms revealed that more than 90 percent are investing in AI – yet fewer than 15 percent report that their companies “have deployed AI capabilities into widespread production.”

It’s hardly a secret that it takes much longer to implement AI models in the production stage than to actually develop them. Why is this last stage such a challenge for established enterprises and nascent startups alike?  

The answer lies in data, which is at the core of the AI enterprise. While it’s one thing to train a model on a given dataset in a controlled environment, successfully navigating live, real-world data is another challenge entirely.

Take autonomous vehicles. Whereas a task like making a left turn is virtually effortless for an experienced human driver, it’s a more complex undertaking for a machine, requiring accurate predictions of timing, obstacle identification, a dynamic understanding of the immediate environment including other cars and pedestrians, and much more. Of course, human drivers also take these factors into account – but the inevitable discrepancies between training data and real-world scenarios make it all the more challenging for an AI system.

The production stage of AI development is where systems begin to encounter these difficult cases. To overcome the challenges these real-world cases pose, companies must develop smart strategies for moving AI from conception to production. That will require making extensive use of humans working alongside AI, as only those who encounter the real world every day can guide machines toward effectively navigating that world.

The Importance of the Production Stage

As the point at which AI projects move from the theoretical to the practical, the production stage is where AI models either prove their value or flop. This is where organizations determine how a model will be operationalized, how frequently it should be retrained on new data, and whether a concept – assuming it works in the real world – can be efficiently and effectively scaled.

Scalability is a critical factor when it comes to production. According to a 2019 Accenture survey, 75 percent of executives across the globe believe their businesses are at risk of failing within the next five years if their organizations do not succeed in scaling AI production – which makes sense, given that AI will increasingly become a competitive necessity over the coming years.

But scaling is exceedingly difficult, which helps explain why so many enterprises are still struggling with the production stage. Returning to the case of autonomous vehicles, it’s extremely challenging to train a system to respond correctly to every possible scenario on every road. Lane markings, for instance, can look different depending on the weather, at night, in a construction zone, on a country road, and so on. These factors must be taken into account for the machine to process the signals it receives effectively.

An AI solution for such cases would begin with a machine learning model focused on a specific task, then extend that model across various real-life scenarios to ensure effectiveness. To speed up the scaling process, companies can validate each incremental extension by having humans work alongside the machine to improve the accuracy of predictions and enhance the machine’s ability to learn from various scenarios.

Automation and Data Accuracy

The mismatch between training data and real-world data, and the simple fact that very large or rapidly streaming datasets cannot be checked manually, make a strong case for automation in AI development.

Models must be continuously validated and retrained, and automated solutions can play a vital role in doing this at scale, enabling companies to accelerate the rollout of new offerings while reducing the cost of generating training datasets. Still, just as relying too heavily on manual work comes with serious pitfalls, over-relying on automation comes with its own risks. Without ongoing human oversight, entirely automated solutions can produce errors – potentially deadly ones in use cases like autonomous vehicles. That’s why a hybrid approach, combining both automation and human intelligence, is the optimal path.
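One common way to put a hybrid approach into practice is to let the model handle the predictions it is confident about and send the rest to a person. The Python sketch below is only an illustration of that idea, not a prescribed implementation: the `Prediction` class, the `route_predictions` function, the frame identifiers, and the 0.9 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str       # hypothetical identifier for the input (e.g. a camera frame)
    label: str         # what the model thinks it saw
    confidence: float  # model's score in [0, 1]


def route_predictions(predictions, threshold=0.9):
    """Split predictions into auto-accepted ones and ones queued for human review.

    The threshold is an assumption for illustration; in practice it would be
    tuned against the cost of errors in the specific use case.
    """
    auto_accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            auto_accepted.append(p)
        else:
            needs_review.append(p)
    return auto_accepted, needs_review


# Made-up example data, not real model output.
preds = [
    Prediction("frame-001", "lane_marking", 0.98),
    Prediction("frame-002", "lane_marking", 0.62),
    Prediction("frame-003", "pedestrian", 0.91),
]

accepted, review = route_predictions(preds, threshold=0.9)
print("auto-accepted:", [p.item_id for p in accepted])
print("human review: ", [p.item_id for p in review])
```

Items that reviewers correct can then be fed back into the training set, which is one way the continuous validation and retraining described above can be kept going without checking every record by hand.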

Thousands of businesses have poured hefty resources into AI, only to find that the production stage creates as many problems as the project was intended to solve. While failure to resolve these issues poses a serious business threat, the good news is that with the right framework and enhanced workflows, companies can clear hurdles to innovation and unlock real value from AI.

