
DeepSeek allegedly used banned chips, says Scale AI CEO

DeepSeek has been accused of using chips banned from export to China to train its latest AI models, despite claiming it used weaker chipsets.

Daniel Croft
Fri, 31 Jan 2025

Scale AI CEO Alexandr Wang claims the Chinese AI start-up made use of 50,000 NVIDIA H100 chips, the GPU US AI giants use to train their models and which is banned for export to China.

DeepSeek claims that its V3 model, which powers its new R1 AI, was trained on 2.788 million H800 GPU hours. At a rate of US$2 per GPU hour, the total comes out to only US$5.58 million.
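For a quick sanity check, the sketch below reproduces that arithmetic, taking DeepSeek's claimed GPU-hour count and the assumed US$2 hourly rate at face value; neither figure has been independently verified.

```python
# Back-of-the-envelope check of the training cost DeepSeek reports for V3.
# Both inputs are DeepSeek's own claims as quoted above, not verified figures.
gpu_hours = 2_788_000          # 2.788 million H800 GPU hours
rate_usd_per_gpu_hour = 2.0    # assumed rental price of US$2 per GPU hour

total_usd = gpu_hours * rate_usd_per_gpu_hour
print(f"Claimed training cost: US${total_usd / 1e6:.2f} million")
# -> Claimed training cost: US$5.58 million
```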

For context, the H800 is a deliberately less powerful version of the H100, engineered for sale in the Chinese market.

Chinese firms rely on the H800 because the administration of former US president Joe Biden introduced a series of export controls restricting the sale to China of the advanced chips used to train these AI models, in an effort to limit AI development there and keep the US ahead in the race.

As a result, previous attempts at advanced generative AI models from China, such as the first LLM from Chinese search engine Baidu, were underwhelming and unable to compete with their US counterparts.

Now, however, Wang has claimed that DeepSeek used 50,000 H100 chips but that its staff were unable to discuss the matter because of the US export controls.

This is not the only accusation that DeepSeek is facing since announcing its new model and shaking up the AI market.

OpenAI is suspicious of its new competitor, claiming that the start-up sourced data illegally to train the new model.

Speaking with The New York Times, the US AI giant claims that DeepSeek used a method known as distillation to train its own model on data generated by OpenAI’s services.

In basic terms, distillation refers to transferring the knowledge of a larger “teacher” model to a smaller “student” model, allowing the student to perform at a similar level while being far more computationally efficient.
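To make the idea concrete, the following is a minimal, generic sketch of what distillation typically looks like in code, with a toy PyTorch teacher and student; the models, sizes and data are placeholders and do not represent DeepSeek’s or OpenAI’s actual systems.

```python
# Minimal, generic sketch of knowledge distillation: a small "student" model is
# trained to match the softened output distribution of a larger "teacher".
# Everything here (model sizes, vocabulary, data) is a toy placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = 1000        # toy vocabulary size
SEQ_LEN = 8         # toy sequence length
TEMPERATURE = 2.0   # softens both distributions so small differences carry signal

teacher = nn.Sequential(nn.Embedding(VOCAB, 256), nn.Flatten(), nn.Linear(256 * SEQ_LEN, VOCAB))
student = nn.Sequential(nn.Embedding(VOCAB, 64), nn.Flatten(), nn.Linear(64 * SEQ_LEN, VOCAB))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
tokens = torch.randint(0, VOCAB, (32, SEQ_LEN))   # a fake batch of token sequences

with torch.no_grad():
    teacher_logits = teacher(tokens)              # the "knowledge" being transferred

student_logits = student(tokens)

# Standard distillation loss: KL divergence between the temperature-softened
# teacher and student distributions, scaled by T^2 (Hinton et al., 2015).
loss = F.kl_div(
    F.log_softmax(student_logits / TEMPERATURE, dim=-1),
    F.softmax(teacher_logits / TEMPERATURE, dim=-1),
    reduction="batchmean",
) * TEMPERATURE ** 2

loss.backward()
optimizer.step()
print(f"distillation loss: {loss.item():.4f}")
```

In the scenario OpenAI describes, responses generated by its services would play the role of the teacher signal for DeepSeek’s student model.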

While distillation is common practice in the AI industry, OpenAI’s terms of service forbid using its outputs to develop competing models.

“We know that groups in the [People’s Republic of China] are actively working to use methods, including what’s known as distillation, to replicate advanced US AI models,” said OpenAI spokeswoman Liz Bourgeois in a statement seen by The New York Times.

“We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more.

“We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the US government to protect the most capable models being built here.”

Daniel Croft

Born in the heart of Western Sydney, Daniel Croft is a passionate journalist with an understanding of, and experience writing in, the technology space. Having studied at Macquarie University, he joined Momentum Media in 2022, writing across a number of publications including Australian Aviation, Cyber Security Connect and Defence Connect. Outside of writing, Daniel has a keen interest in music and spends his time playing in bands around Sydney.