What does DeepSeek triggering the 'Sputnik Moment' of AI mean for industry?
«Just when everyone agreed the solution is to throw more compute at the problem»
Background & Summary
- China's DeepSeek (founded in 2023) has triggered the 'Sputnik Moment' of AI, causing widespread panic as well as novel optimism, depending on perspective
- DeepSeek's AI Assistant surpassed OpenAI's ChatGPT in popularity on the US App Store
- DeepSeek's open source, lightweight AI models reportedly rival OpenAI's at a fraction of the cost and compute, both for usage (inference) and for training
- DeepSeek has provided a massive gift to nearly everyone by showing a path to a future of effectively-free AI products and services. In the long run, those who use AI will be the biggest winners
- Model-agnostic architectures, as well as the prospect of private, tailored models, have been given a boost
- Cyber and data privacy threats brought even closer to centre stage of the AI revolution
Impact already measured in the trillions
This DeepSeek phenomenon could be an enormous boon to industry in the medium term.
Perhaps counterintuitive, given the roughly $1tn of market value wiped from frontier American AI companies in a single day of trading. But in the long term, AI commoditization, disruptively lower-cost model training, and above all dramatically cheaper inference (that is, AI model use), all of which DeepSeek has demonstrated, are great for the entire AI industry.
DeepSeek R1 reduces the cost of using a highly performant LLM to 3-5% of the cost of OpenAI's o1. Training the model is rumored to have cost only ~$5.6M, a fraction of the assumed training cost of OpenAI's o1, estimated in the mid-hundreds of millions of dollars.
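To make the arithmetic concrete, here is a minimal sketch. The per-token prices are illustrative assumptions, not vendor quotes; actual list prices vary and change frequently:

```python
# Illustrative inference cost comparison. Prices are assumptions for the sake
# of the example; check current vendor price lists before relying on them.
O1_COST_PER_M_OUTPUT = 60.00   # assumed USD per 1M output tokens, OpenAI o1
R1_COST_PER_M_OUTPUT = 2.19    # assumed USD per 1M output tokens, DeepSeek R1

def monthly_cost(tokens_per_month: int, price_per_million: float) -> float:
    """Cost in USD of generating `tokens_per_month` output tokens."""
    return tokens_per_month / 1_000_000 * price_per_million

# e.g. a fleet of industrial agents generating 500M output tokens per month
tokens = 500_000_000
cost_o1 = monthly_cost(tokens, O1_COST_PER_M_OUTPUT)
cost_r1 = monthly_cost(tokens, R1_COST_PER_M_OUTPUT)

print(f"o1: ${cost_o1:,.0f}/month, R1: ${cost_r1:,.0f}/month "
      f"({cost_r1 / cost_o1:.1%} of the o1 cost)")
```

Under these assumed prices, the ratio lands in the 3-5% band cited above.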
At Cognite, a global leader in industrial AI with a multi-cloud, multi-model open architecture, we're keenly following the implications of DeepSeek-triggered innovation for sectors like manufacturing, energy, and pharma. These industries are already investing in generative AI to optimize their operations, with frontrunners investing both in fine-tuning of models and in careful FinOps calculations to better understand the at-scale cost of AI deployment. DeepSeek's breakthroughs in open source, lightweight AI models – at a small fraction of the cost and compute of other leading models – could be a game-changer, in much the same way the Sputnik launch proved that you could send things to space faster and less expensively than imagined.
If there wasn’t unanimous agreement before, there is now: LLMs are becoming commoditized. And faster than any previous technology ever!
Impact on AI for industry
One of the most significant implications of DeepSeek's breakthrough is the potential for faster and cheaper inference. This means that LLMs can process information and generate responses more quickly and efficiently, enabling more complex reasoning and problem-solving capabilities. For asset-intensive industries, this translates to several exciting possibilities:
- Enhanced Predictive Maintenance: Imagine an industrial agent, powered by a low-cost LLM running at interactive speed, that can analyze vast amounts of sensor data from a manufacturing plant in real time, predicting equipment failures with greater accuracy. This would allow companies to minimize downtime by proactively scheduling maintenance, optimizing production schedules, and reducing costly repairs.
- Improved Process Optimization: In the oil and gas industry, such industrial agents could rapidly process and interpret data from drilling operations, identifying potential bottlenecks and suggesting adjustments to optimize extraction processes. All with human-level interactivity, while pursuing multiple scenarios in parallel. This could lead to increased efficiency, reduced waste, and improved resource management.
- Accelerated Root Cause Analysis: In the event of a safety incident at a pharmaceutical facility, industrial agents could quickly analyze data from sources such as security cameras, sensor logs, and employee reports – all whilst interactively requesting additional input and verification from human participants – to identify the root cause in record time. This would enable faster resolution, prevent recurrence, and contribute to a safer working environment.
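The predictive-maintenance idea above can be sketched in a few lines. This is a hedged illustration (not a Cognite implementation, and all names and thresholds are hypothetical): a rolling z-score flags anomalous sensor readings, and a real industrial agent would then hand the flagged window to a low-cost reasoning LLM for diagnosis.

```python
# Minimal sketch: flag anomalous sensor readings with a rolling z-score.
# Thresholds and the synthetic data are illustrative assumptions.
from statistics import mean, stdev

def flag_anomalies(readings: list[float], window: int = 20,
                   z_threshold: float = 3.0) -> list[int]:
    """Return indices whose reading deviates more than z_threshold standard
    deviations from the trailing window -- candidates to escalate to a
    reasoning agent for diagnosis."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged

# Synthetic bearing-temperature trace: stable around 60 C, one spike at t=50.
trace = [60.0 + 0.1 * (i % 5) for i in range(100)]
trace[50] = 75.0
print(flag_anomalies(trace))
```

In production, the flagged indices would trigger the agent workflow rather than a print statement; the z-score is merely a cheap first-pass filter.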
Solving complex problems with AI that can “think” alongside your SMEs
For agentic AI industrial applications, the ability of LLMs to "think" – to generate hypotheses by reasoning through complex problems – is crucial. This mirrors the human decision-making process in industrial settings, where engineers and operators often rely on experience, intuition, and data analysis to solve problems and optimize operations.
DeepSeek's disruptively lower token cost, combined with improved token speed, will make thinking LLMs (so-called reasoning models) more accessible and usable. For industrial use cases the thinking process is crucial: forming hypotheses through reasoning before evaluating each one against data is how humans work, and it will be critical for real industrial AI agents. As inference cost goes down – and, as importantly, speed goes up – workflows can afford more rounds of reasoning, resulting in better answers and higher confidence in each reasoning step.
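The budget arithmetic behind this point can be sketched as follows; the per-task budget, per-round token count, and prices are all illustrative assumptions:

```python
# Sketch: at a fixed inference budget, cheaper tokens buy more reasoning
# rounds. All numbers below are illustrative assumptions, not vendor quotes.
def affordable_rounds(budget_usd: float, tokens_per_round: int,
                      price_per_million_tokens: float) -> int:
    """How many hypothesis-and-evaluation rounds fit in the budget."""
    cost_per_round = tokens_per_round / 1_000_000 * price_per_million_tokens
    return int(budget_usd // cost_per_round)

BUDGET = 1.00             # assumed USD budget per task
TOKENS_PER_ROUND = 4_000  # assumed tokens per reasoning round

rounds_expensive = affordable_rounds(BUDGET, TOKENS_PER_ROUND, 60.00)
rounds_cheap = affordable_rounds(BUDGET, TOKENS_PER_ROUND, 2.19)
print(rounds_expensive, rounds_cheap)
```

Under these assumptions, a roughly 27x price drop turns a handful of reasoning rounds into more than a hundred within the same budget, which is the mechanism behind "more rounds of reasoning, better answers."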
Will this Sputnik Moment bring us closer to AGI?
Does this mean that we’re one big step closer to autonomous operations and self-optimizing plants?
It definitely seems like it. Yet keeping to short- and medium-term implications: now that DeepSeek has proven that anyone can train their own LLM on a reasonably low budget and with feasible hardware requirements, bring-your-own-LLM approaches seem more likely to prevail, and considerably sooner than previously expected. The limitation will no longer be cost, but the size of the training corpus.
For industrial operators, being able to address talent shortages with thinking and reasoning industrial agents at scale without inference costs soaring instantly into the millions – whilst not yet quite reaching the heights of AGI for industry (or even fully self-optimizing plants) – is most certainly in the realm of transformative value.
A very necessary note on cyber and data privacy threats
First, some baselining. DeepSeek R1 is an open source model on Hugging Face. Anyone can host it themselves, or via the US-based Hugging Face; no data goes to the Chinese entity. Beyond the model itself, the optimizations DeepSeek has made are now available to everyone: approximately a 50x speedup, and a reduction in training cost of similar magnitude. (I bet the large US LLM players have already implemented these optimizations and will very soon come out with even bigger models, now feasible thanks to DeepSeek's findings.)
The free DeepSeek AI Assistant (chat.deepseek.com) is an altogether different case. This free offering poses considerable cyber and data privacy threats:
- A security flaw in DeepSeek allowed attackers to gain control of victim accounts via prompt injection attacks, accessing session data, cookies, and other information associated with the chat.deepseek.com domain
- The model's capabilities enable automated vulnerability identification at scale, accelerating zero-day exploit discovery and facilitating advanced persistent threats (APTs) through data analysis and correlation
- DeepSeek's generative capabilities also enhance social engineering and disinformation campaigns
IMPORTANT DISCLAIMER: COGNITE DOES NOT PROVIDE CYBERSECURITY OR DATA PRIVACY ADVICE ON NON-AFFILIATED SOLUTIONS
See Cognite Data Fusion® in action
Get in touch with our product experts to learn more and identify quick wins