How Chinese AI startup DeepSeek is competing with Silicon Valley giants
SAN FRANCISCO: The day after Christmas, a small Chinese startup called DeepSeek unveiled a new artificial intelligence system that could match the capabilities of cutting-edge chatbots from companies such as OpenAI and Google.
That alone would have been a milestone. However, the team behind the system, called DeepSeek-V3, described an even bigger step. In a research paper explaining the technology, DeepSeek’s engineers stated they utilized only a fraction of the highly specialized computer chips that leading AI companies depend on to train their systems.
These chips are at the center of a tense technological competition between the United States and China. As the US government strives to maintain its lead in the global AI race, it is working to limit the sale of powerful chips, such as those produced by Silicon Valley firm Nvidia, to China and other rivals.
The performance of the DeepSeek model raises questions about the unintended consequences of the US government’s trade restrictions. These controls have compelled researchers in China to become creative with a wide array of tools that are freely available online.
According to benchmark tests used by US AI companies, the DeepSeek chatbot answered questions, solved logic problems, and even wrote its own computer programs with the same proficiency as anything currently available on the market.
And it was created on a modest budget, challenging the prevailing notion that only the largest tech companies, predominantly located in the United States, can afford to develop the most advanced AI systems. The Chinese engineers said they needed only about US$6 million in raw computing power to build their new system, roughly one-tenth of what tech giant Meta recently spent developing its AI technology.
“The number of companies who have US$6 million to spend is vastly greater than the number of companies who have US$100 million or US$1 billion to spend,” said Chris V. Nicholson, an investor from the venture capital firm Page One Ventures who focuses on AI technologies.
Since OpenAI sparked an AI boom in 2022 with the launch of ChatGPT, many experts and investors have presumed that no company could rival the market leaders without spending hundreds of millions of dollars on specialized chips.
The world’s leading AI companies train their chatbots on supercomputers that use 16,000 specialized chips or more. In contrast, DeepSeek’s engineers said they needed only about 2,000 specialized computer chips from Nvidia.
The chip restrictions facing China compelled the DeepSeek engineers to “train it more efficiently so it could still be competitive,” according to Jeffrey Ding, an assistant professor at George Washington University specializing in emerging technology and international relations.
Recently, the Biden administration issued new regulations aimed at curbing China’s access to advanced AI chips through other countries. The rules build on previous restrictions that prevent Chinese firms from purchasing or manufacturing cutting-edge computer chips.
The US government has sought to keep advanced chips away from Chinese companies due to concerns that they may be used for military ends. In response, numerous firms in China have stockpiled chips, while others have sourced them from an active underground marketplace.
DeepSeek is operated by a quantitative stock trading firm called High-Flyer. By 2021, the firm had channeled its profits into acquiring thousands of Nvidia chips, which it used to train its earlier models. The company has earned a reputation in China for attracting talent fresh from leading universities with the promise of lucrative salaries and the freedom to pursue intriguing research questions.
Zihan Wang, a computer engineer who worked on an earlier DeepSeek model, said the company also recruits people without computer science backgrounds to help its technology understand and generate poetry and perform well on the notoriously difficult Chinese college entrance examination.
DeepSeek does not make consumer-facing products, so its engineers can concentrate exclusively on research. As a result, its technology isn’t constrained by China’s strict AI regulations, which require consumer-oriented technology to comply with government controls on information.
While leading US companies continue to push the boundaries of AI, DeepSeek has demonstrated its ability to keep pace. Recently, the company launched a new reasoning model that parallels advancements made by its American counterparts.
A crucial aspect of this rapidly evolving global market is open-source software. Following in the footsteps of several other firms, DeepSeek has open-sourced its latest AI system, sharing the underlying code with businesses and researchers. That lets others build and distribute products using the same technologies.
Employees at major Chinese tech companies often face restrictions on collaborating beyond their organizations, and open source provides an alternative. “If you work on open source, you work with talent around the world,” said Yineng Zhang, a lead software engineer who collaborates on projects using DeepSeek’s system.
The open-source ecosystem for AI gained momentum in 2023 when Meta released its AI system called Llama. Many experts expected that the growth of this community would depend on established companies like Meta continuing to open-source their technologies, yet DeepSeek and others have shown that they can advance it as well.
While many industry leaders argue against open-sourcing their technologies because of potential misuse, others contend that halting open-source progress in the United States could hand China a significant advantage. If the best open-source technologies come from China, US developers may build on those systems, placing China at the center of AI research and development.
“The center of gravity of the open-source community has been shifting towards China,” noted Ion Stoica, a professor at the University of California, Berkeley, emphasizing the potential risks this movement poses to US technological leadership.
Hours after his inauguration, President Donald Trump revoked a Biden administration executive order that had threatened to restrict open-source technologies. Projects emerging from this ecosystem show that even operations with minimal resources can produce competitive systems. Technology consultant Reuven Cohen has noted that DeepSeek-V3 is qualitatively similar to the latest systems from major firms.
“DeepSeek is a way for me to save money,” Cohen remarked, indicating a growing interest in alternatives to established technology giants. “This is the kind of technology that someone like me wants to use.”