605: The Democrats Behind DeepSeek

29 January 2025

DeepSeek has everyone freaking out; we'll look at what's legitimately fascinating, what bits have been an overreaction, and the big mistake that made this all possible.

Direct Download: MP3
  • 💥 Gets Sats with Strike — Strike is a lightning-powered app that lets you quickly and cheaply grab sats in over 100 countries. Easily integrates with Fountain.fm. Set up your Strike account, and you have one of the world’s best ways to buy sats.
  • 🇨🇦 Bitcoin Well — The fastest and safest way to buy Bitcoin in Canada and the USA. With self-custody built in. 🥇
  • 📻 Boost with Fountain.FM — Boost with Fountain.FM and kick the tires on the Podcasting 2.0 revolution! 🚀
  • Report: 88% of companies are contemplating leaving Oracle Java — 72% of respondents were already thinking about it when surveyed in 2023.
  • State of Java 2025 - Azul Report — Insights from over 2,000 Java users across six continents to reveal 2025 Java trends that are shaping key areas of enterprise technology.
  • Nvidia sheds almost $600 billion in market cap, biggest drop ever — The sell-off, which hit much of the U.S. tech sector, was sparked by concerns about increased competition from Chinese AI lab DeepSeek.
  • DeepSeek’s AI Model Tests Limits of US Curbs on Nvidia Chips
  • DeepSeek-V3 Technical Report — We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable.
  • Deepseek: The Quiet Giant Leading China’s AI Race
  • DeepSeek’s Popular AI App Is Explicitly Sending US Data to China — Amid ongoing fears over TikTok, Chinese generative AI platform DeepSeek says it’s sending heaps of US user data straight to its home country, potentially setting the stage for greater scrutiny.
  • DeepSeek hit with large-scale cyberattack, says it’s limiting registrations — DeepSeek on Monday said it would temporarily limit user registrations “due to large-scale malicious attacks” on its services.
  • Satya Nadella on X — Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.
  • Jevons paradox - Wikipedia
  • Sam Altman on X — deepseek’s r1 is an impressive model, particularly around what they’re able to deliver for the price. we will obviously deliver much better models and also it’s legit invigorating to have a new competitor! we will pull up some releases.
  • Biden Got Freaked Out About AI and National Security After Watching the Newest ‘Mission: Impossible’ Movie — Speaking to The Associated Press, deputy White House chief of staff Bruce Reed recalled that while Biden has grown concerned over the use of AI to generate fake images of himself or clone a user’s voice, it was a screening of “Mission: Impossible – Dead Reckoning Part One” at Camp David that particularly alarmed the president.