DeepSeek, a fast-growing Chinese AI company, has recently garnered attention with its new reasoning model, R1. The model reportedly matches the performance of OpenAI's o1 on several benchmarks, including AIME 2024, Codeforces, and MATH-500. The company's claims have sparked considerable debate in the tech industry over the validity and originality of its innovations. Founded in 2023 by Liang Wenfeng, co-founder of the AI-focused hedge fund High-Flyer, DeepSeek is positioning itself as a formidable competitor in the AI landscape.
DeepSeek's R1 model not only competes with OpenAI's o1 in performance but is also priced far lower. OpenAI charges $15 per one million input tokens and $60 per one million output tokens for o1, while DeepSeek claims R1 costs 55 cents per one million input tokens and $2.19 per one million output tokens. However, some experts have cast doubt on these figures.
"Having said that, there are still a lot of questions and uncertainties around the full picture of costs as it pertains to the development of DeepSeek" – Daniel Newman, CEO of tech insight firm The Futurum Group
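To make the per-token prices claimed for R1 concrete, the short Python sketch below works out what a given workload would cost at those rates. The function name and the sample token counts are illustrative assumptions, not anything published by DeepSeek, and the prices are simply the company's claimed figures.

```python
# Prices in US dollars per one million tokens, as claimed by DeepSeek for R1.
R1_INPUT_PRICE_PER_MILLION = 0.55
R1_OUTPUT_PRICE_PER_MILLION = 2.19


def workload_cost(input_tokens: int, output_tokens: int,
                  input_price: float = R1_INPUT_PRICE_PER_MILLION,
                  output_price: float = R1_OUTPUT_PRICE_PER_MILLION) -> float:
    """Return the cost in dollars for a workload at per-million-token prices."""
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


# Hypothetical workload: one million input tokens and one million output tokens.
cost = workload_cost(1_000_000, 1_000_000)
print(f"Estimated cost: ${cost:.2f}")
```

Passing different values for `input_price` and `output_price` allows the same workload to be priced against another provider's published rates.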
DeepSeek's approach extends beyond cost efficiency. The V3 model, for instance, was reportedly trained for just $5.6 million, a stark contrast to the billions spent by leading U.S. AI labs. The model comprises 671 billion parameters, the learned weights that determine its behavior. Although it is reportedly smaller than some other large language models, DeepSeek trained it on mature Nvidia chips such as the H800 and A100 and is said to have achieved notable reductions in power requirements.
Industry reactions have been mixed, with some experts praising DeepSeek's achievements while others remain skeptical. Yann LeCun of Meta argued that the success points to open-source models overtaking proprietary ones rather than to China overtaking the U.S.
"To people who see the performance of DeepSeek and think: 'China is surpassing the US in AI.' You are reading this wrong. The correct reading is: 'Open source models are surpassing proprietary ones'" – Yann LeCun, chief AI scientist at Meta
Conversely, billionaire investor Vinod Khosla expressed doubts about the originality of DeepSeek’s technology.
"DeepSeek makes the same mistakes O1 makes, a strong indication the technology was ripped off" – Vinod Khosla
Despite these controversies, many in the industry acknowledge DeepSeek’s potential to reshape AI development pathways.
"But DeepSeek proves we are still in the nascent stage of AI development and the path established by OpenAI may not be the only route to highly capable AI" – Xiaomeng Lu, director of Eurasia Group's geo-technology practice
DeepSeek’s backing by a Chinese hedge fund has also fueled suspicions about its motives. Palmer Luckey, a prominent U.S. entrepreneur, criticized the financial narrative surrounding DeepSeek's developments.
"The $5M number is bogus. It is pushed by a Chinese hedge fund to slow investment in American AI startups, service their own shorts against American titans like Nvidia, and hide sanction evasion" – Palmer Luckey, U.S. entrepreneur
Controversy aside, DeepSeek's R1 model takes a distinctive approach to problem-solving. It breaks prompts down into smaller components and considers several strategies before responding, a methodology that could influence how future reasoning models are built. Some observers also argue that the efficiency gains hold up even if the reported costs are understated.
"Even if it's off by a certain factor, it still is coming in as greatly efficient" – Seena Rejal, chief commercial officer of NetMind
The precise financial details of DeepSeek's operations remain uncertain, prompting industry observers to question the accuracy of its claims. Paul Triolo of DGA Group noted that the reported $5.6 million figure covers a single training run rather than the full cost of R&D.
"The 5.6 million figure for DeepSeek V3 was just for one training run, and the company stressed that this did not represent the overall cost of R&D to develop the model" – Paul Triolo, senior VP for China and technology policy lead at advisory firm DGA Group
Amid these debates, DeepSeek continues to demonstrate alternative routes to advanced AI capabilities in a rapidly evolving field.
"The takeaway is that there are many possibilities to develop this industry. The high-end chip/capital intensive way is one technological approach" – Xiaomeng Lu, director of Eurasia Group's geo-technology practice