Data management (r)evolution in the age of AI and the citizen data scientist

From the convergence of data and analytics, to the rise of augmented data and true human-machine collaboration, we look ahead to the future of data management.

Data and analytics are converging

All the signs point towards a growing integration of two related but still relatively separate domains: data and analytics.

How will this convergence manifest itself? We can already see evidence of several areas, once discrete, coming together in a more holistic fashion. Data management, including its various subdisciplines of data preparation, data catalogs, data quality, and master data management, is coming into the arena of active data governance.

Analytics and business intelligence (BI) platforms increasingly incorporate data science and machine learning (ML). The different elements of data integration, data profiling, data cataloguing, and databases are also converging with analytics and BI tools themselves, as well as with other low-code application development frameworks.

In line with the trend towards democratization of data and data management, this convergence is bringing new, less specialist users into the picture. The traditional roles of data architect, data scientist and application developer are being joined by ‘citizen roles’ such as citizen data integrator, citizen data scientist, and citizen developer, bringing greater productivity across a wider range of use cases.

Relationships are the key to data and analytics value

The use of graph techniques at scale, to enable the discovery of relationships within diverse data, is fundamental to the development of modern data and analytics.

Graph techniques underpin a number of capabilities, including:

  • Knowledge graphs
  • Data fabrics
  • Natural language processing (NLP)
  • Explainable AI
  • ‘X analytics’: analytics for a range of structured and unstructured types of content
  • Richer context for ML and artificial intelligence (AI)

As contextual intelligence emerges as a crucial discipline in the new landscape, organizations are set to increase their investments in automated and guided data contextualization capabilities.

Moreover, there is a move away from imposing a single structure on data sets, in favour of an active metadata approach. This is a product of the multiple diverse structures and insights that emerge from data via AI and ML augmentation.

The data fabric of tomorrow must above all be agile and transparent, relying on metadata that is agile, dynamically inferred and trusted.
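
To make the idea of dynamically inferred metadata concrete, here is a minimal Python sketch. It derives a simple description of each field from the records themselves rather than from a schema declared up front. The sample records, field names, and the infer_metadata helper are all hypothetical and are not drawn from any particular product.

```python
from collections import defaultdict

# Hypothetical sample records; note the mixed representations of "pressure".
RECORDS = [
    {"tag": "pump_101", "pressure": "4.2", "active": True},
    {"tag": "pump_102", "pressure": 3.9, "active": False},
    {"tag": "pump_103", "pressure": None},
]


def infer_type(value):
    """Classify a single value into a coarse type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        try:
            float(value)  # numeric strings are treated as numbers
            return "number"
        except ValueError:
            return "text"
    return "other"


def infer_metadata(records):
    """Infer a dominant type and a missing-value count for every field seen."""
    observed = defaultdict(lambda: defaultdict(int))
    for record in records:
        for field, value in record.items():
            observed[field][infer_type(value)] += 1

    metadata = {}
    for field, counts in observed.items():
        non_null = {label: n for label, n in counts.items() if label != "null"}
        dominant = max(non_null, key=non_null.get) if non_null else "null"
        metadata[field] = {
            "dominant_type": dominant,
            # Records where the field is absent or explicitly null.
            "missing": len(records) - sum(non_null.values()),
        }
    return metadata


METADATA = infer_metadata(RECORDS)
```

Because the description is recomputed from the data, it can be refreshed whenever new records arrive. A production system would add much more, such as lineage, usage statistics, and trust scoring.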

Indeed, the ability to identify meaningful relationships across data types, people, places, and objects is fundamental to generating real value from data and analytics.
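
As an illustration of relationship discovery with graph techniques, the following Python sketch builds a tiny graph from subject-relation-object triples and uses a breadth-first search to find the shortest chain of relationships between two entities. The entity names, relation names, and helper functions are hypothetical, and a real deployment would typically rely on a graph database or knowledge graph platform rather than in-memory dictionaries.

```python
from collections import defaultdict, deque

# Hypothetical triples linking people, equipment, documents, and places.
TRIPLES = [
    ("pump_101", "located_in", "plant_A"),
    ("sensor_7", "measures", "pump_101"),
    ("work_order_55", "refers_to", "pump_101"),
    ("alice", "assigned_to", "work_order_55"),
]


def build_graph(triples):
    """Index triples in both directions so relationships can be traversed either way."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph[obj].append((f"inverse:{relation}", subject))
    return graph


def find_path(graph, start, goal):
    """Return the shortest chain of (node, relation, node) hops, or None if unconnected."""
    if start == goal:
        return []
    previous = {start: None}  # node -> (parent, relation) used to reach it
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for relation, neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = (node, relation)
            if neighbour == goal:
                # Walk back from the goal to the start to rebuild the path.
                hops = []
                current = goal
                while previous[current] is not None:
                    parent, rel = previous[current]
                    hops.append((parent, rel, current))
                    current = parent
                return list(reversed(hops))
            queue.append(neighbour)
    return None


GRAPH = build_graph(TRIPLES)
PATH = find_path(GRAPH, "alice", "plant_A")
for source, relation, target in PATH:
    print(f"{source} --{relation}--> {target}")
```

Here the search connects a person to a place through a work order and a piece of equipment, which is the kind of cross-domain relationship that is hard to see when each data set is held in isolation.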

Rise of AI and augmented data

A welcome change is also coming for IT specialists, who currently spend too much of their time on repetitive, low-impact data management tasks that can and will be automated. Manual data management tasks are set to be vastly reduced as machine learning and automated service-level management develop and expand across industry.

This will liberate IT specialists and increase the amount of time they can spend on higher-value tasks including collaboration, training, and strategic data management activities.

When assessing modern data management solutions, augmented capabilities are becoming a key differentiator. Under increasing commercial pressures, data and analytics leaders need ways to connect, ingest, analyze, and share data more efficiently, at both greater speed and lower cost.

Augmented data management has emerged as a vital element of various offerings, including active metadata, AI and ML algorithms, and data fabric designs that use semantic knowledge graphs.

There is also a shifting emphasis in how progressive companies are assessing data management solutions. While the focus used to be on the means of data retention and control, we now see far more attention paid to the ways in which data is utilized and accessed. This is particularly true for the cloud.

Augmented data management is ushering in a new phase of data management, where the long-anticipated collaboration between humans and machines — specifically the AI and ML engines — becomes a reality.

Together the two work across the flow of data within the company, with humans performing creative and strategic activity, supported by the processing and ‘heavy-lifting’ power of artificial intelligence.

Embrace the smart engineer

In this new environment, the organizations that succeed will be those who can take the leap beyond traditional approaches. Mainstream self-service analytics are no longer adequate. Neither is a continued reliance on specialist data scientists, given their scarcity and high cost.

Instead, industry must embrace the ‘citizen data scientist’, who is not a specialist in data by background, but is empowered with the capabilities and practices that enable them to harness data effectively. The democratization of data management and technology means that more team members can glean predictive and prescriptive insights from data. They can come from a variety of roles. Despite not having the same analytical or technical skills as expert data scientists, they are still able to drive real value for the enterprise.

Cognite combines a powerful blend of machine learning, rules engine, and subject matter expertise codification to convert data into actionable knowledge. Our main product, Cognite Data Fusion®, creates a fully contextualized data fabric that is unique in its understanding of industrial data. Discover what wider accessibility to meaningful data across your smart engineers and professional data scientists can do for your organization by contacting us today.
