19 November 2024 - 4 min read

Keynote - Enabling access to industrial data for GenAI model development


 

 

Keynote speech given by Fabrice Tocco, Dawex co-CEO, at the roundtable organized by the European Commission on October 25, 2024.

 

In the rapidly evolving landscape of artificial intelligence (AI) and generative AI (genAI), European organizations, ranging from large corporations to small and medium-sized enterprises (SMEs), are facing significant opportunities and challenges in leveraging industrial data. The ability to build and train powerful AI/genAI models that meet the specific needs of the industry can revolutionize operations and drive innovation.

 

However, this endeavor is not without its complexities. As we navigate these intricacies, it is crucial to explore how we can move forward in building a thriving European data-driven economy, with the goal of ultimately becoming a trusted AI powerhouse that fosters growth, competitiveness, and ethical standards in the global AI arena.

 


GenAI is increasingly proving its transformative potential in a variety of sectors, including healthcare, media, education, and more. In industry, the impact is just as profound: GenAI is being used for better product design, for optimizing production processes, and for improving logistics and supply chains. However, AI models are only as good as the data they are trained on.

The two main challenges are data fragmentation and reluctance to share data. In the industrial world, there are many barriers to accessing data, since these data are often highly sensitive, fragmented across various companies, locked in silos within a company, stored in different formats, and proprietary in nature. Relying solely on open data is not sufficient. As a matter of fact, industrial data are hard to find, and even harder to share. Companies are reluctant to share their data because they fear losing competitive advantage if the data reach competitors, and/or violating confidentiality agreements.

So, here are some fundamental questions that need to be addressed:

 

  • How to build trust in data exchanges and data sharing?

  • What does trust mean exactly? Trust in the data supplier, in the data acquirer, in the technology that will support the data transaction, ...

  • How to ensure that access to and usage of this data are fair and compliant, i.e. respecting the terms agreed upon between the participants in the exchange, and complying with regulations?

  • How to stimulate the market to share more data?

     


To unlock this data and stimulate the market, creating the right incentives for companies to participate is key.

 

These incentives could be twofold: financial and non-financial. Financial incentives could be, for example, financial compensation for those who contribute data, or the provision of tax breaks. Non-financial incentives could take the form of privileged access to the AI models trained on the data contributed by the company. This would create a win-win situation, as businesses get access to cutting-edge AI tools that benefit their own operations while also contributing to broader innovation.

While building trust is fundamental, the next step is to build interoperability. This is where Common European Data Spaces and Data Intermediaries come into play.


Trust is the foundation upon which data exchange must be built, and Common European Data Spaces are essential in fostering that trust. Data spaces serve as secure environments where trusted data transactions can happen. They provide the legal, technical, and operational frameworks that allow data providers to share their data confidently, and data users to use it with trust. Data spaces are designed not only to create trusted ecosystems for data exchange but also to ensure better interoperability, both within and across different data spaces. This interoperability also extends across industries, making it possible for different sectors (manufacturing, logistics, healthcare, energy) to exchange data in ways that were previously too complex or too risky.


Data spaces also reduce friction in data exchanges by:

 

  • Implementing reference architectures and trust frameworks, such as those defined by Gaia-X,
  • Leveraging standardized protocols, such as the Dataspace Protocol,
  • Enabling standardized trusted data transactions at scale, making it easier for companies to collaborate while maintaining control over their data, and for different systems and platforms to work together seamlessly.

 


Data Intermediaries, known as Data Intermediation Services Providers in the Data Governance Act, act as neutral, trusted third parties that facilitate secure, fair, and compliant exchanges of data. Regarding their role, a few questions are worth exploring, such as “should the role of Data Intermediation Service Providers be broadened?”. For example, could they become the custodians of both data and AI models, managing a “safe place” where data can be shared and models can be trained? There are different approaches for this:

 

  • “Data goes to the model” (models don’t move), when AI developers do not want to share their models externally
  • “Model goes to the data” (data don’t move), when data holders are reluctant to share data externally
  • or, when neither of these approaches works, both data and models could “meet” in a secure third-party location, a neutral place where the training happens in a controlled and trusted environment.
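
The three approaches listed above can be read as a simple decision rule. The sketch below is purely illustrative and is not part of the keynote: the names `TrainingLocation` and `choose_training_location` are hypothetical, and the handling of the case where both parties are willing to share is an assumption, since the speech does not address it.

```python
from enum import Enum


class TrainingLocation(Enum):
    """Hypothetical labels for the three approaches described in the text."""
    AT_MODEL_OWNER = "data goes to the model"
    AT_DATA_HOLDER = "model goes to the data"
    NEUTRAL_THIRD_PARTY = "data and model meet at a trusted intermediary"


def choose_training_location(data_may_leave: bool, model_may_leave: bool) -> TrainingLocation:
    """Pick where training happens, given what each party is willing to share externally."""
    if data_may_leave and not model_may_leave:
        # AI developers do not want to share their model: the data moves.
        return TrainingLocation.AT_MODEL_OWNER
    if model_may_leave and not data_may_leave:
        # Data holders are reluctant to share data: the model moves.
        return TrainingLocation.AT_DATA_HOLDER
    if not data_may_leave and not model_may_leave:
        # Neither approach works: both meet in a neutral, controlled environment.
        return TrainingLocation.NEUTRAL_THIRD_PARTY
    # ASSUMPTION (not from the keynote): when both parties are willing to share,
    # default to moving the model so the data stays with its holder.
    return TrainingLocation.AT_DATA_HOLDER


if __name__ == "__main__":
    for data_ok in (True, False):
        for model_ok in (True, False):
            choice = choose_training_location(data_ok, model_ok)
            print(f"data_may_leave={data_ok}, model_may_leave={model_ok}: {choice.value}")
```

In practice such a choice would also depend on contractual terms, regulation, and the technical capabilities of the intermediary, none of which this sketch models.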

 

So, as Europe pushes forward with its Data Strategy, data spaces and data intermediaries constitute the backbone of the European Data Economy that will allow the pooling of industrial data to train the AI models of tomorrow.

Parallels can be drawn to the financial markets. Structured over hundreds of years, financial markets have been built on trust, transparency, and regulation. The data markets, still in their early stages, need to adopt a similar approach. While "money matters" in the financial market, one must recognize that “data matters” just as much in the data economy. This recognition is now taking hold, and is quite advanced in some countries such as China, where as of January 1st, 2024, data assets can be integrated into a company’s balance sheet as intangible assets or inventories. Europe could explore a similar path to elevate the importance and value of data in business strategies.

Trust and innovation go hand in hand. Without trusted data flows and trusted data transactions, quality AI cannot exist. But by building trust, creating the right incentives, implementing the right technology (which already exists today), and leveraging the regulatory frameworks that are in place, the full potential of industrial data and Generative AI can be unlocked. This is not just an opportunity; it is a necessity for Europe to remain competitive, innovative, and forward-thinking.

