Not all AI is created equal

AI solutions will only be as valuable as the data they are built on. Enterprises need to establish best practices to manage, analyse, and use large amounts of unstructured data to get the most out of AI, says Skip Levens, Director, Media and Entertainment at Quantum.

AI models that help complete tasks automatically are built on existing general-purpose AI models, created with millions of hours of development and thousands of GPU hours. These general models are then trained to specialise in the desired AI task. For example, a medical company might train a model against a massive repository of millions of MRI images so it can learn to detect cancer cells. Or a broadcasting company might train its model with millions of videos of football matches to detect goals, players, penalty shots, or whatever else might be of interest for creating further content. A model like this could finally make it possible to find a precise shot within millions of minutes of content.
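This specialisation pattern can be sketched in a few lines of Python: a frozen, general-purpose feature extractor stays untouched while a small task-specific "head" is trained on domain examples. Everything below — the extractor, the features, and the data — is a toy stand-in for illustration, not a real model.

```python
# Minimal illustration of fine-tuning: a frozen, general-purpose
# feature extractor plus a small task-specific "head" trained on
# domain data. All functions and data here are toy stand-ins.

def frozen_extractor(sample):
    """Stand-in for a pretrained model's embedding layer (never retrained)."""
    # Turn a raw sample (a list of pixel-like values) into two features.
    return [sum(sample) / len(sample), max(sample) - min(sample)]

def train_head(labelled_samples, epochs=20, lr=0.1):
    """Train a tiny linear head (a perceptron) on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for sample, label in labelled_samples:    # label: 1 = target class
            feats = frozen_extractor(sample)
            score = weights[0] * feats[0] + weights[1] * feats[1] + bias
            pred = 1 if score > 0 else 0
            err = label - pred                    # perceptron update rule
            weights = [w + lr * err * f for w, f in zip(weights, feats)]
            bias += lr * err
    return weights, bias

def predict(model, sample):
    weights, bias = model
    feats = frozen_extractor(sample)
    return 1 if weights[0] * feats[0] + weights[1] * feats[1] + bias > 0 else 0

# Hypothetical domain data: bright, high-contrast frames are the target class.
training = [([0.9, 0.8, 0.1], 1), ([0.2, 0.1, 0.15], 0),
            ([0.95, 0.7, 0.05], 1), ([0.3, 0.25, 0.2], 0)]
model = train_head(training)
```

The point of the structure is that the expensive, general part is reused as-is; only the small head is trained against the organisation's own data.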

The quality of an AI solution is directly linked to the amount of available training data

Training a general AI model successfully to do something specific depends, to a large degree, on the quantity, quality, and variety of the underlying data. The more varied the data, the better the model’s ability to detect whatever you want it to detect. For instance, an AI solution that aims to identify giraffes will be more successful when the training data is not simply a set of near-identical pictures, but pictures of giraffes from different angles and against different backgrounds.

So, the more data a company has — in quantity, quality, and variety — the better trained the model will be. Better still, when this varied data comes from a company’s own data and content libraries, the resulting model will be uniquely adapted to the organisation’s needs, giving it a critical advantage over competitors who have not taken the care to collect and manage their own data.

Organisations that realise that AI models must be built on real-world and relevant business operations data will jump ahead of their competitors. This data is fuelling a new, hyper-competitive race for innovation. If an enterprise wants competitive differentiation, it must leverage its own unique data — not just what’s readily available in general-purpose models. This mindset has increased the demand to retain as much data as possible, which requires end-to-end unstructured data management — an inherently challenging process.

Efficient, well-organised data is an advantage when training AI models

Recent strides in data storage and AI technology are simplifying the key complexities of unstructured data management. These advances allow organisations to move from merely managing data to turning it into a competitive differentiator, with a newfound ability to generate actionable, data-driven insights. Organisations should understand how new AI capabilities can make the management and enrichment of this data simpler and more efficient.

Modern AI-enabled storage infrastructure can tag, catalogue, and sort data so that it is easily searchable and reusable for AI and analysis. It is also key that the infrastructure delivers end-to-end data management, from high-performance ingest for AI applications to long-term archiving. That makes it easy to build massive data stores for analysis and to be ready for new AI initiatives as needs arise and models improve.
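The tag-and-search idea can be sketched as a tiny in-memory catalogue that indexes assets by their tags so content is findable later. The class, asset names, and tags below are invented for illustration; a real platform would persist and scale this, but the indexing principle is the same.

```python
# Toy sketch of tagging and cataloguing: assets are indexed under each
# tag at ingest, so later searches are simple set intersections.
# All names and metadata are invented for illustration.
from collections import defaultdict

class AssetCatalogue:
    def __init__(self):
        self._by_tag = defaultdict(set)   # tag -> set of asset ids
        self._assets = {}                 # asset id -> metadata

    def ingest(self, asset_id, metadata, tags):
        """Register an asset and index it under each of its tags."""
        self._assets[asset_id] = metadata
        for tag in tags:
            self._by_tag[tag].add(asset_id)

    def search(self, *tags):
        """Return ids of assets carrying ALL of the given tags."""
        sets = [self._by_tag[t] for t in tags]
        return set.intersection(*sets) if sets else set()

catalogue = AssetCatalogue()
catalogue.ingest("match_001.mxf", {"duration_s": 5400}, ["football", "goal"])
catalogue.ingest("match_002.mxf", {"duration_s": 5400}, ["football", "penalty"])
catalogue.ingest("doc_001.mxf", {"duration_s": 2700}, ["wildlife", "giraffe"])
```

Because tags are applied at ingest, a query such as `catalogue.search("football", "goal")` narrows millions of assets without anyone re-watching the footage.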

Extend an existing object recognition library to create an AI-friendly content production workflow

Having both data and AI models on the same platform makes it easy to use that data to extend an existing object recognition library. A company that already has a trained system extending a general-purpose library holds a significant head start: competitors without an AI-friendly content production workflow must make up the difference with laborious, human-driven content tagging. With such a workflow in place, it is easy to use the dataset to extend an existing object or action identification model on the fly. Returning to the earlier example, the model that tagged images with ‘giraffe’ can readily be extended to find ‘elephant’ or ‘rhino’ next.
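One common way such on-the-fly extension works is few-shot class addition: the pretrained embedding stays frozen, and a new class is added simply by storing the average embedding (centroid) of a handful of freshly tagged examples. The sketch below is a toy stand-in — the embedding function and data are invented — but it shows why no retraining is needed.

```python
# Toy sketch of extending a recogniser to new classes "on the fly":
# a frozen embedding function stays fixed, and adding a class means
# storing the centroid of a few tagged examples. All stand-ins.

def embed(sample):
    """Stand-in for a pretrained model's embedding output."""
    return (sum(sample) / len(sample), max(sample))

class ExtensibleRecogniser:
    def __init__(self):
        self.centroids = {}  # class name -> centroid embedding

    def add_class(self, name, examples):
        """Add a new class from a handful of tagged examples -- no retraining."""
        embs = [embed(e) for e in examples]
        n = len(embs)
        self.centroids[name] = tuple(sum(dim) / n for dim in zip(*embs))

    def classify(self, sample):
        """Return the class whose centroid is nearest to the sample."""
        e = embed(sample)
        return min(self.centroids,
                   key=lambda c: sum((a - b) ** 2
                                     for a, b in zip(e, self.centroids[c])))

rec = ExtensibleRecogniser()
rec.add_class("giraffe", [[0.9, 0.8], [0.85, 0.75]])
# Extending to a new class is one call with newly tagged examples:
rec.add_class("elephant", [[0.2, 0.3], [0.25, 0.2]])
```

The well-tagged content library is what makes this cheap: the examples for ‘elephant’ already exist, so extending the model is a lookup and an average rather than a new training run.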

AI is only as good as the data it is built on, and as the speed with which a team can train and retrain models on well-ordered, tagged datasets and content, adding new ‘features’ as quickly as new needs are identified. To gain this advantage, enterprises need to establish best practices that help their teams store, manage, analyse, and use large amounts of highly valuable unstructured data whenever and wherever required. A solid foundation of end-to-end, AI-enabled infrastructure, from high-performance ingest to long-term archiving, can help businesses maximise the potential of their data and fuel innovation and efficiency for years to come.
