Interview with DeepSeek Founder: We’re Done Following

DeepSeek-R1 is shaking Silicon Valley. Founder Liang Wenfeng: "We're done following. It's time to lead."

January 27, 2025

The China Academy

暗涌Waves

China Investment Reporting Media

Editor’s Note:

Silicon Valley is reeling. A seismic shift in AI dominance is underway, and all eyes are on China. In January 2025, DeepSeek-R1, an open-source reasoning model from Chinese AI firm DeepSeek, sent shockwaves through the tech world by matching the performance of OpenAI’s top-tier models at 1/30th the API cost, all while embracing full openness.

With just $6 million, China built one of the world’s finest AI models, for a sliver of the billions spent by Meta, Google, and Microsoft. Already, global users, especially individuals and SMEs, are flocking to DeepSeek-R1 and retraining it as their foundational model.

This Eastern-led revolution is forcing a global reckoning: What if AI’s future isn’t forged in Silicon Valley?

The following article is our translation of a July 2024 interview with Liang Wenfeng, founder of DeepSeek, originally conducted by the Chinese media outlet An Yong and published in Chinese. The interview was held shortly after the company’s open-source V2 model catapulted it to fame and reveals how a Chinese startup dared to leapfrog industry giants and redefine the rules of innovation.

Liang, an entrepreneur born after 1985, appeared on Xinwen Lianbo (CCTV News) as the founder of the AI startup DeepSeek, participating in a high-level national symposium and delivering a speech.

How Was the First Shot in the Price War Fired?

An Yong (Interviewer): After the release of the DeepSeek V2 model, it quickly triggered a fierce price war in the large model industry. Some say you are a disruptor in the market.

Liang Wenfeng (DeepSeek Founder): We never intended to be a disruptor; it just happened by accident.

An Yong: Were you surprised by this outcome?

Liang Wenfeng: Very surprised. We didn’t expect pricing to be such a sensitive issue. We were simply following our own pace, calculating costs, and setting prices accordingly. Our principle is neither to sell at a loss nor to seek excessive profits. The current pricing allows for a modest profit margin above our costs.

An Yong: Five days later, Zhipu AI followed suit, and soon after, ByteDance, Alibaba, Baidu, and Tencent joined the race.

Liang Wenfeng: Zhipu AI lowered prices for an entry-level product, while their flagship models remain expensive. ByteDance was the first to truly match our price for a flagship model, which then pressured others to follow. Since large companies have much higher model costs than us, we never imagined anyone would operate at a loss. It ended up mirroring the internet era’s subsidy-driven logic.

An Yong: From an outsider’s perspective, price cuts seem like a tactic to grab users—typical of internet-era competition.

Liang Wenfeng: Grabbing users wasn’t our primary goal. We reduced prices because, first, while exploring next-generation model structures, our costs decreased; second, we believe that both AI and API services should be affordable and accessible to everyone.

An Yong: Before this, most Chinese companies simply copied the Llama model structure to develop applications. Why did you choose to focus on model structure instead?

Liang Wenfeng: If the goal is to develop applications, adopting Llama’s structure to quickly launch a product is a reasonable choice. However, our goal is AGI (Artificial General Intelligence), which requires us to explore new model structures to achieve superior capabilities within limited resources. This is foundational research for scaling up. Beyond architecture, we’ve studied data curation and human-like reasoning—all reflected in our models. Also, Llama’s training efficiency and inference costs lag behind cutting-edge global standards by about two generations.

An Yong: Where does this generational gap come from?

Liang Wenfeng: First, there’s a gap in training efficiency. We estimate that China’s best models likely require twice the compute power to match top global models due to structural and training dynamics gaps. Data efficiency is also half as effective, meaning we need twice the data and compute for equivalent results. Combined, that’s four times the resources. Our goal is to continuously narrow these gaps.
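
The arithmetic behind the “four times” figure can be written out explicitly. This is only a sketch of the estimate as stated in the answer above; the symbols $C$ (total training resources), $k_{\text{arch}}$, and $k_{\text{data}}$ are labels introduced here for illustration and do not come from DeepSeek.

```latex
\frac{C_{\text{domestic}}}{C_{\text{frontier}}}
  \approx k_{\text{arch}} \times k_{\text{data}}
  = 2 \times 2
  = 4
```

Here $k_{\text{arch}}$ stands for the extra compute attributed to gaps in model structure and training dynamics, and $k_{\text{data}}$ for the extra data (and the compute to process it) attributed to lower data efficiency. Multiplying the two assumes the gaps compound independently, which is how the answer combines them.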

An Yong: Most Chinese firms pursue both models and applications. Why is DeepSeek focusing solely on research?

Liang Wenfeng: Because we believe the most important thing right now is to participate in global innovation. For years, Chinese companies have been accustomed to leveraging technological innovations developed elsewhere and monetizing them through applications. But this isn’t sustainable. This time, our goal isn’t quick profits but advancing the technological frontier to drive ecosystem growth.

An Yong: The prevailing belief from the internet and mobile internet eras is that the U.S. leads in innovation, while China excels at applications.

Liang Wenfeng: We believe that with economic development, China must gradually transition from being a beneficiary to a contributor, rather than continuing to ride on the coattails of others. Over the past 30 years of the IT revolution, we barely participated in core tech innovation.

We’ve grown accustomed to Moore’s Law “falling from the sky”—waiting 18 months for better hardware and software. Scaling Law is treated similarly. However, these advancements are the result of generations of relentless effort by Western-led technology communities. Because we haven’t been actively involved in this process, we’ve come to overlook its significance.

The Real Gap Lies in Originality, Not Just Time

An Yong: Why did DeepSeek V2 surprise many in Silicon Valley?

Liang Wenfeng: Among the daily innovations in the U.S., this is quite ordinary. Their surprise stems from seeing a Chinese company join their game as an innovator, not just a follower—which is what most Chinese firms are accustomed to.

An Yong: But in China’s context, prioritizing pure innovation seems almost a luxury. Developing large models is capital-intensive. Not every company can afford to focus solely on research without commercializing first.

Liang Wenfeng: Innovation is undoubtedly costly, and our past tendency to adopt existing technologies was tied to China’s earlier developmental stage. But today, China’s economic scale and the profits of giants like ByteDance and Tencent are globally significant. What we lack isn’t capital but confidence and the ability to organize high-caliber talent for effective innovation.

An Yong: Why do Chinese companies, even well-funded giants, often prioritize rapid commercialization?

Liang Wenfeng: For three decades, we’ve emphasized profit over innovation. Innovation isn’t purely business-driven; it requires curiosity and creative ambition. We’re shackled by old habits, but this is a phase.

An Yong: But DeepSeek is a business, not a nonprofit research lab. If you innovate and open-source your breakthroughs, such as the MLA (Multi-head Latent Attention) architecture innovation released in May, won’t competitors quickly copy them? Where’s your moat?

Liang Wenfeng: In disruptive tech, closed-source moats are fleeting. Even OpenAI’s closed-source model can’t prevent others from catching up.

Therefore, our real moat lies in our team’s growth—accumulating know-how, fostering an innovative culture. Open-sourcing and publishing papers don’t result in significant losses. For technologists, being followed is rewarding. Open-source is cultural, not just commercial. Giving back is an honor, and it attracts talent.

An Yong: How do you respond to market-driven views like those of Zhu Xiaohu (who advocates prioritizing immediate commercialization over foundational AI research and dismisses AGI as impractical)?

Liang Wenfeng: Zhu’s logic suits short-term profit ventures, but the most enduringly profitable U.S. companies are tech giants built on long-term R&D.

An Yong: But in AI, pure technical lead isn’t enough. What larger goal is DeepSeek betting on?

Liang Wenfeng: We believe that China’s AI cannot remain a follower forever. Often, we say there’s a one- or two-year gap between Chinese and American AI, but the real gap is between originality and imitation. If this doesn’t change, China will always be a follower. Some explorations are unavoidable.

NVIDIA’s dominance isn’t just its effort—it’s the result of Western tech ecosystems collaborating on roadmaps for next-gen tech. China needs similar ecosystems. Many domestic chips fail because they lack supportive tech communities and rely on secondhand insights. Someone must step onto the frontier.

More Investment Doesn’t Always Fuel More Innovation

An Yong: DeepSeek currently exudes an idealistic vibe reminiscent of OpenAI’s early days, and you’re open-source. Do you plan to transition to a closed-source model in the future, as OpenAI and Mistral have done?

Liang Wenfeng: We won’t go closed-source. We believe that establishing a robust technology ecosystem matters more.

An Yong: Are there fundraising plans? Media reports suggest Huanfang【1】 aims to spin off DeepSeek for an IPO. Silicon Valley AI startups inevitably align with big players. Will you follow?

Liang Wenfeng: No short-term plans. Our challenge has never been money; it’s the embargo on high-end chips.

An Yong: Many argue AGI requires bold alliances and visibility, unlike quantitative investing, which thrives in secrecy. Do you agree?

Liang Wenfeng: More investment doesn’t necessarily result in more innovation. If that were the case, big tech companies would have monopolized all innovation.

An Yong: Are you avoiding applications because DeepSeek lacks operational expertise?

Liang Wenfeng: We believe that the current stage is a period of technological innovation, not application explosion. In the long term, we aim to establish an ecosystem where the industry directly uses our technologies and outputs. Others develop B2B/B2C services on our models while we focus on foundational research. If a complete industry chain forms, there’s no need for us to develop applications ourselves. That said, if necessary, we are fully capable of doing so. However, research and innovation will always remain our top priority.

An Yong: Why would clients choose DeepSeek’s API over big players’?

Liang Wenfeng: The future world will likely be one of specialized division of labor. Foundational AI models require continuous innovation, and big companies have their limits—they may not always be the best fit for this role.

An Yong: But can technology alone create a significant competitive gap? You’ve said there are no absolute “secrets.”

Liang Wenfeng: Secrets don’t exist, but replication takes time and cost. NVIDIA GPUs have no hidden magic—yet catching up requires rebuilding teams and chasing their next-gen tech. That’s the real moat.

An Yong: After your price cuts, ByteDance was the first to follow, suggesting they felt threatened. How do you view the new competitive landscape between startups and giants?

Liang Wenfeng: To be honest, we don’t really care about it. Lowering prices was just something we did along the way. Providing cloud services isn’t our main goal—achieving AGI is. So far, we haven’t seen any groundbreaking solutions. Giants have users, but their cash cows also shackle them, making them ripe for disruption.

An Yong: What do you think the endgame looks like for the six other major AI startups in China?

Liang Wenfeng: Maybe two or three will survive. All are burning cash now. Those with clear focus and operational discipline will endure. Others will pivot. Value never vanishes; it simply takes on new forms.

An Yong: What’s your core philosophy when it comes to competition?

Liang Wenfeng: I focus on whether something elevates societal efficiency and whether we can find our strength in the industry value chain. As long as the ultimate goal boosts efficiency, it’s valid. Many aspects are just temporary phases; over-focusing on them will only lead to confusion.

V2 Model: Built Entirely by Homegrown Talent

An Yong: Jack Clark, former policy lead at OpenAI and co-founder of Anthropic, remarked that DeepSeek has hired “some of those inscrutable wizards” who built DeepSeek V2. What defines these people?

Liang Wenfeng: No “inscrutable wizards” here—just fresh graduates from top universities, PhD candidates (even fourth- or fifth-year interns), and young talents with a few years of experience.

An Yong: Many major AI companies are keen on recruiting talent from overseas. Some believe that the top 50 AI talents globally are unlikely to be working for Chinese companies. Where does your team come from?

Liang Wenfeng: V2 was built entirely by domestic talent. The global top 50 might not be in China today, but we aim to cultivate our own.

An Yong: How did the MLA innovation emerge? We heard that the idea initially stemmed from a young researcher’s personal interest.

Liang Wenfeng: After summarizing the key evolutionary patterns of the mainstream Attention architecture, he had a sudden inspiration to design an alternative. However, turning an idea into reality is a long journey. We assembled a team and spent months validating it.

An Yong: This kind of organic creativity seems tied to your flat organizational structure. In Huanfang, you avoided top-down mandates. But for AGI—a high-uncertainty frontier—do you impose more management?

Liang Wenfeng: DeepSeek remains entirely bottom-up. We also do not preassign roles; natural division of labor emerges. Everyone brings unique experiences and ideas, and they don’t need to be pushed. When they encounter challenges, they naturally pull others in for discussions. However, once an idea shows potential, we do allocate resources from the top down.

An Yong: We’ve heard that DeepSeek operates with remarkable flexibility in allocating computing resources and personnel.

Liang Wenfeng: There are no limits on accessing compute resources or team members. If someone has an idea, they can tap into our training clusters anytime without approval. Additionally, since we don’t have rigid hierarchical structures or departmental barriers, people can collaborate freely as long as there’s mutual interest.

An Yong: Such loose management relies on hiring intensely driven individuals. It’s said that DeepSeek excels at identifying exceptional talent based on non-traditional criteria.

Liang Wenfeng: Our hiring standards have always been based on passion and curiosity. Many of our team members have unique and interesting backgrounds. Their hunger for research far outweighs monetary concerns.

An Yong: Transformer was born in Google’s AI Lab, and ChatGPT emerged from OpenAI. In your opinion, how do corporate AI labs differ from startups in fostering innovation?

Liang Wenfeng: Whether it’s Google’s labs, OpenAI, or even AI labs at Chinese tech giants, they all provide significant value. The fact that OpenAI eventually delivered breakthroughs was partly historical chance.

An Yong: So is innovation largely a matter of chance? Your office layout includes meeting rooms with doors that can be easily opened on both sides. Your colleagues mentioned that this design allows for “serendipity,” reminiscent of the Transformer story—where a passerby overheard a discussion and helped shape it into a universal framework.

Liang Wenfeng: I believe innovation is, first and foremost, a matter of belief. Why is Silicon Valley so innovative? Because they dare to try. When ChatGPT debuted, China lacked confidence in frontier research. From investors to major tech firms, many felt the gap was too wide and focused instead on applications. But innovation requires confidence, and young people tend to have more of it.

An Yong: Unlike other AI companies that actively seek funding and media attention, DeepSeek remains relatively quiet. How do you ensure that DeepSeek becomes the top choice for people looking to work in AI?

Liang Wenfeng: Because we are tackling the hardest problems. The most attractive thing for top-tier talent is the opportunity to solve the world’s toughest challenges. In fact, top talent in China is often underestimated because hardcore innovation is rare, which means they rarely get recognized. We offer what they crave.

An Yong: The recent OpenAI event did not feature GPT-5, leading many to believe that the industry’s technological curve is slowing down, and some have begun questioning Scaling Law. What’s your perspective?

Liang Wenfeng: We remain optimistic. The industry’s progress is still in line with expectations. OpenAI isn’t divine; they can’t lead forever.

An Yong: How long do you think it will take to achieve AGI? Before V2, you released code/math models and switched from dense to MoE【2】. What’s your roadmap?

Liang Wenfeng: It could take two years, five years, or ten years—but it will happen within our lifetime. As for our roadmap, there’s no consensus even within our company. However, we are placing our bets on three directions:

1. Mathematics and code, which serve as a natural testbed for AGI—much like Go, they are enclosed, verifiable systems where self-learning could lead to high intelligence.

2. Multimodality, where the AI engages with the real world to learn.

3. Natural language itself, which is fundamental to human-like intelligence.

We are open to all possibilities.

An Yong: What do you envision as the endgame for large AI models?

Liang Wenfeng: There will be specialized companies providing foundational models and services, forming a long value chain of specialized divisions. More players will emerge to meet society’s diverse needs on top of these foundations.

All Strategies Are Products of the Past

An Yong: Over the past year, China’s large model startup landscape has seen many changes. For instance, Wang Huiwen【3】, who was highly active early on, exited midway, while newer entrants are beginning to differentiate themselves.

Liang Wenfeng: Wang Huiwen took on all the losses himself, allowing others to exit unscathed. He made a decision that was most unfavorable to himself but beneficial to everyone else. I truly admire his integrity.

An Yong: Where do you currently focus most of your energy?

Liang Wenfeng: My main focus is on researching the next generation of large models. There are still many unresolved challenges.

An Yong: Many other AI startups insist on balancing both model development and applications, since technical leads aren’t permanent. Why is DeepSeek confident in focusing solely on research? Is it because your models still lag?

Liang Wenfeng: All strategies are products of the past generation and may not hold true in the future. Discussing AI’s future profitability using the commercial logic of the internet era is like comparing Tencent’s early days to General Electric or Coca-Cola—it’s essentially carving a boat to mark a sword’s position, an outdated approach.

An Yong: Huanfang had strong technological and innovative genes, and its growth seemed relatively smooth. Is this why you remain optimistic?

Liang Wenfeng: Huanfang, to some extent, strengthened our confidence in technology-driven innovation, but it wasn’t all smooth sailing. We went through a long accumulation process. People only saw what happened after 2015, but in reality, we had been working on it for 16 years.

An Yong: Returning to original innovation: With the economy slowing and capital cooling, will this stifle groundbreaking R&D?

Liang Wenfeng: Not necessarily. The restructuring of China’s industrial landscape will increasingly rely on deep-tech innovation. As quick-profit opportunities vanish, more will embrace real innovation.

An Yong: So you’re optimistic about this?

Liang Wenfeng: I grew up in the 1980s in a fifth-tier city in Guangdong. My father was a primary school teacher. In the 1990s, there were plenty of opportunities to make money in Guangdong. Many parents would come to our home and argue that studying was useless. But looking back now, perspectives have changed. Making money isn’t as easy as it used to be—not even driving a taxi is a viable option anymore. Within just one generation, things have shifted.

Hardcore innovation will only increase in the future. It’s not widely understood now because society as a whole needs to learn from reality. When this society starts celebrating the success of deep-tech innovators, collective perceptions will change. We just need more real-world examples and time to allow that process to unfold.

Editor: Zhongxiaowen

References

【1】Huanfang: A quantitative investment firm and early DeepSeek backer.

【2】MoE: Mixture of Experts, an architecture that improves model efficiency by activating specialized subnetworks.

【3】Wang Huiwen: Co-founder of Meituan, who briefly entered the AI race in 2023 before exiting.
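
As a supplement to note 【2】, the idea of “activating specialized subnetworks” can be sketched in a few lines of Python. This is a toy illustration of generic top-k expert routing, not DeepSeek’s implementation: the function names (`softmax`, `moe_forward`), the four toy experts, and the fixed gate scores are all assumptions made for the example. In a real model the gate scores would come from a learned router applied to the input.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_scores, experts, top_k=2):
    """Run only the top_k highest-scoring experts and mix their outputs.

    Hypothetical helper: gate_scores are supplied by the caller here,
    whereas a real router would compute them from x.
    """
    probs = softmax(gate_scores)
    # Pick the indices of the top_k experts by gate probability.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize the selected probabilities so the mixing weights sum to 1.
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen


# Four toy "experts"; only two of them run for any given input.
experts = [
    lambda x: x + 1.0,
    lambda x: x * 2.0,
    lambda x: x - 3.0,
    lambda x: x * x,
]
gate_scores = [0.1, 2.0, -1.0, 2.0]

y, used = moe_forward(3.0, gate_scores, experts, top_k=2)
print(used, y)
```

The efficiency gain comes from the skipped experts: the model can hold many subnetworks’ worth of parameters while paying the compute cost of only `top_k` of them per input.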
