Artificial intelligence (AI) is rapidly transforming the way businesses operate, but a new report from AI data provider Appen highlights the challenges companies are facing in sourcing and managing high-quality data to power their AI systems. The 2024 State of AI report from Appen surveyed over 500 U.S. IT decision-makers and found that while generative AI adoption has surged by 17% in the past year, organizations are struggling with data preparation and quality assurance.
According to Si Chen, Head of Strategy at Appen, as AI models tackle more complex problems, their data requirements change as well. Companies are realizing that a large volume of data is no longer sufficient: data must be accurate, diverse, properly labeled, and tailored to specific use cases to be useful for fine-tuning AI models.
The report identified several key areas where companies are encountering obstacles in their AI initiatives. Here are the top five takeaways from the 2024 State of AI report:
- Generative AI adoption is on the rise, but so are data challenges:
Generative AI (GenAI) adoption has increased by 17% in 2024, driven by advancements in large language models that enable businesses to automate tasks across a wide range of use cases. However, the rapid growth in GenAI usage has introduced new hurdles, particularly around data management. Custom data collection has become the primary method for sourcing GenAI training data, reflecting the shift toward tailored, reliable datasets.
- Enterprise AI deployments and ROI are declining:
The report found that fewer AI projects are reaching deployment, and those that do are showing less return on investment (ROI). The growing complexity of AI models, particularly generative AI, is a contributing factor. While straightforward use cases such as image recognition and speech automation rely on mature technology, more ambitious initiatives demand customized, high-quality data and are harder to implement successfully.
- Data quality is crucial, but it’s declining:
Data accuracy has dropped nearly 9% since 2021, posing a critical challenge for AI development. As AI models become more sophisticated, the data they require has become more complex and specialized. Companies are retraining or updating their models at least once a quarter to keep data fresh and relevant, and they are increasingly relying on external data providers to train and evaluate their models.
- Data bottlenecks are worsening:
The report reveals a 10% year-over-year increase in bottlenecks related to sourcing, cleaning, and labeling data, which directly delays the deployment of AI projects. In response, companies are adopting long-term strategies that prioritize data accuracy, consistency, and diversity, and are forming strategic partnerships with data providers to navigate the complexities of the AI data lifecycle.
- Human-in-the-loop is more vital than ever:
Even as AI technology advances, human involvement remains essential. Eighty percent of respondents emphasized the importance of human-in-the-loop machine learning, in which human expertise guides and improves AI models. Human experts play a crucial role in bias mitigation and ethical AI development, ensuring that AI systems are high-performing, ethical, and contextually relevant; a minimal sketch of this workflow appears below.
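The report does not prescribe a specific implementation, but a common human-in-the-loop pattern routes a model's low-confidence predictions to human annotators and folds their corrections back into training. The Python sketch below illustrates that pattern under stated assumptions: the `classify` stub, the 0.8 confidence threshold, and the `ReviewQueue` are hypothetical placeholders, not anything taken from Appen's report.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Illustrative confidence threshold (an assumption, not from the Appen report):
# predictions below it are escalated to a human reviewer.
CONFIDENCE_THRESHOLD = 0.8

@dataclass
class ReviewQueue:
    """Holds low-confidence items awaiting human annotation."""
    items: List[str] = field(default_factory=list)

def classify(text: str) -> Tuple[str, float]:
    """Stand-in for a real model call; returns (label, confidence)."""
    # A trivial heuristic so the sketch runs end to end.
    lowered = text.lower()
    label = "positive" if "good" in lowered else "negative"
    confidence = 0.95 if ("good" in lowered or "bad" in lowered) else 0.55
    return label, confidence

def human_in_the_loop(texts: List[str], queue: ReviewQueue) -> List[Tuple[str, str]]:
    """Auto-label confident predictions; defer the rest to human experts."""
    labeled = []
    for text in texts:
        label, confidence = classify(text)
        if confidence >= CONFIDENCE_THRESHOLD:
            labeled.append((text, label))   # trust the model's prediction
        else:
            queue.items.append(text)        # escalate for human review
    return labeled

if __name__ == "__main__":
    queue = ReviewQueue()
    auto_labeled = human_in_the_loop(
        ["The product is good", "Delivery was bad", "Arrived on a Tuesday"], queue
    )
    print("Auto-labeled:", auto_labeled)
    print("Needs human review:", queue.items)
    # Human-corrected labels would then be added back to the training set,
    # closing the loop between model predictions and expert feedback.
```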
In conclusion, as AI continues to evolve and expand into various industries, the challenges related to data quality, deployment, and human involvement are becoming more apparent. Companies must address these challenges to successfully leverage AI technology and realize its full potential.
For more detailed insights, you can refer to Appen’s full 2024 State of AI report here.