Wall Street is stunned and rightfully so. 
DeepSeek’s AI Model Is the Top-Rated App in the U.S.
Scientific American comments Why DeepSeek’s AI Model Just Became the Top-Rated App in the U.S.
DeepSeek’s artificial intelligence assistant made big waves Monday, becoming the top-rated app in the Apple Store and sending tech stocks into a downward tumble. What’s all the fuss about?
The Chinese start-up, DeepSeek, surprised the tech industry with a new model that rivals the abilities of OpenAI’s most recent model—with far less investment and using reduced-capacity chips. The U.S. bans exports of state-of-the-art computer chips to China and limits sales of chipmaking equipment. DeepSeek, based in the eastern Chinese city of Hangzhou, reportedly had a stockpile of high-performance Nvidia A100 chips from times prior to the ban—so its engineers could have used those to develop the model. But in a key breakthrough, the start-up says it instead used much lower-powered Nvidia H800 chips to train the new model, dubbed DeepSeek-R1.
On common AI tests in mathematics and coding, DeepSeek-R1 matched the scores of OpenAI’s o1 model, according to VentureBeat.
DeepSeek-R1 is free for users to download, while the comparable version of ChatGPT costs $200 a month.
Because it requires less computational power, the cost of running DeepSeek-R1 is a tenth of the cost of similar competitors, says Hancheng Cao, an incoming assistant professor in Information Systems and Operations Management at Emory University. “For academic researchers or start-ups, this difference in the cost really means a lot,” Cao says.
DeepSeek achieved its efficiency in several ways, says Anil Ananthaswamy, author of Why Machines Learn: The Elegant Math Behind Modern AI. The model has 670 billion parameters, or variables it learns from during training, making it the largest open-source large language model yet, Ananthaswamy explains. But the model uses an architecture called “mixture of experts” so that only a relevant fraction of these parameters—tens of billions instead of hundreds of billions—are activated for any given query. This cuts down on computing costs. The DeepSeek LLM also uses a method called multi-head latent attention to boost the efficiency of its inferences; and instead of predicting an answer word-by-word, it generates multiple words at once.
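To make the “mixture of experts” idea concrete, here is a minimal toy sketch of top-k routing. It is not DeepSeek’s actual router. The expert functions, the gate scores, and the names route_top_k and moe_forward are all hypothetical, chosen only to show how a gate can pick a small subset of experts so that most parameters stay idle for a given input.

```python
import math

def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    selected = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in selected)
    return output, [i for i, _ in selected]

# Hypothetical toy setup: 8 "experts", each just scales its input.
NUM_EXPERTS = 8
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

# Hypothetical gate scores for one input (a real model computes these with a learned gate).
gate_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

output, used = moe_forward(1.0, experts, gate_scores, k=2)
active_fraction = len(used) / NUM_EXPERTS  # share of experts that did any work
```

In this toy setup only 2 of the 8 experts run, so three quarters of the “parameters” do no work for this input. That is the same reason a model with hundreds of billions of parameters can answer a query while activating only tens of billions.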
Another important aspect of DeepSeek-R1 is that the company has made the code behind the product open-source, Ananthaswamy says. (The training data remains proprietary.) This means that the company’s claims can be checked. If the model is as computationally efficient as DeepSeek claims, he says, it will probably open up new avenues for researchers who use AI in their work to do so more quickly and cheaply. It will also enable more research into the inner workings of LLMs themselves.
“One of the big things has been this divide that has opened up between academia and industry because academia has been unable to work with these really large models or do research in any meaningful way,” Ananthaswamy says. “But something like this, it’s within the reach of academia now, because you have the code.”
DeepSeek Stuns Wall Street
The Wall Street Journal reports DeepSeek Stuns Wall Street With Capability and Cost
Who saw that coming? Not Wall Street, which sold off tech stocks on Monday after the weekend news that a highly sophisticated Chinese AI model, DeepSeek, rivals Big Tech-built systems but cost a fraction to develop. The implications are likely to be far-reaching, and not merely in equities.
Enter DeepSeek, which last week released a new R1 model that claims to be as advanced as OpenAI’s on math, code and reasoning tasks. Tech gurus who inspected the model agreed. One economist asked R1 how much Donald Trump’s proposed 25% tariffs will affect Canada’s GDP, and it spit back an answer close to that of a major bank’s estimate in 12 seconds. Along with the detailed steps R1 used to get to the answer.
More startling, DeepSeek required far fewer chips to train than other advanced AI models and thus cost only an estimated $5.6 million to develop. Other advanced models cost in the neighborhood of $1 billion. Venture capitalist Marc Andreessen called it “AI’s Sputnik moment,” and he may be right.
DeepSeek is challenging assumptions about the computing power and spending needed for AI advances. OpenAI, Oracle and SoftBank last week made headlines when they announced a joint venture, Stargate, to invest up to $500 billion in building out AI infrastructure. Microsoft plans to spend $80 billion on AI data centers this year.
CEO Mark Zuckerberg on Friday said Meta would spend about $65 billion on AI projects this year and build a data center “so large that it would cover a significant part of Manhattan.” Meta expects to have 1.3 million advanced chips by the end of this year. DeepSeek’s model reportedly required as few as 10,000 to develop.
DeepSeek’s breakthrough means these tech giants may not have to spend as much to train their AI models. But it also means these firms, notably Google’s DeepMind, might lose their first-mover, technological edge.
DeepSeek is vindicating President Trump’s decision to rescind a Biden executive order that gave government far too much control over AI. Companies developing AI models that pose a “serious risk” to national security, economic security, or public health and safety would have had to notify regulators when training their models and share the results of “red-team safety tests.”
DeepSeek should also cause Republicans in Washington to rethink their antitrust obsessions with big tech. Bureaucrats aren’t capable of overseeing thousands of AI models, and more regulation would slow innovation and make it harder for U.S. companies to compete with China. As DeepSeek shows, it’s possible for a David to compete with the Goliaths. Let a thousand American AI flowers bloom.
Ignoring AI’s Potential Is Ignorant
Nate Silver says It’s Time to Come to Grips with AI
Ignoring AI’s potential is, well, ignorant
For the real leaders of the left, the issue simply isn’t on the radar. Bernie Sanders has only tweeted about “AI” once in passing, and AOC’s concerns have been limited to one tweet about “deepfakes.”
Meanwhile, the vibe from lefty public intellectuals has been smug dismissiveness. Take this seven-word tweet from Ken Klippenstein, a left-leaning journalist formerly of The Intercept who now writes a popular Substack.
I’m sorry, but this is ignorant. Large language models like ChatGPT are, by some measures, the most rapidly adopted technology in human history. Kulwin’s tweet is equivalent to, in the 1990s, dismissing the Internet as a “pornography and hacking machine.” Yes, these are common use cases, but they’re the tip of a massive iceberg.
It’s not just that AIs can now solve Math Olympiad problems. LLMs also provide a lot of “mundane utility,” from serving as computer programmers to research assistants to all-around problem-solving tools. I’d estimate that using LLMs and other AI tools improves my productivity by perhaps 5 percent on a day-to-day basis. It’s not yet a true “game changer,” but more and more, they provide reliable marginal value, from debugging Stata code to vetting technical concepts to serving as a copy editor or a creative muse.
Impressive Math

The New York Times says Move Over, Mathematicians, Here Comes AlphaProof
In January [2024], a Google DeepMind system named AlphaGeometry solved a sampling of Olympiad geometry problems at nearly the level of a human gold medalist. “AlphaGeometry 2 has now surpassed the gold medalists in solving I.M.O. problems,” Thang Luong, the principal investigator, said in an email.
The lab’s strike at this year’s Olympiad deployed the improved version of AlphaGeometry. Not surprisingly, the model fared rather well on the geometry problem, polishing it off in 19 seconds.
I cannot fathom solving that problem ever, let alone in 19 seconds. And I had three semesters of calculus, plus differential equations, and advanced statistics (all of which I admit I have long forgotten).
Development Costs
Development of DeepSeek reportedly cost $5.6 million vs US costs estimated in the neighborhood of $1 billion.
DeepSeek’s model reportedly required as few as 2,000 Nvidia chips (some estimates say 10,000) to develop, versus Meta’s expectation of having 1.3 million advanced chips by the end of this year.
I don’t doubt the DeepSeek numbers, because its access to Nvidia’s advanced chips came through rented data centers, restricted access that China was not supposed to have at all.
We do not know how much China really spent, but we sure do know Biden’s export sanctions on technology failed in spectacular fashion.
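As a rough back-of-the-envelope check, the reported figures above can be turned into ratios. This is only a sketch using the numbers quoted in this post, none of which are independently verified. The variable names are mine, and the Meta figure is the total number of chips it expects to have, not the number used to train any one model, so the chip comparison is not like-for-like.

```python
# Reported development cost figures (unverified, as quoted in the post).
deepseek_cost_usd = 5.6e6   # estimated DeepSeek development cost
typical_cost_usd = 1e9      # "neighborhood of $1 billion" for other advanced models

cost_ratio = typical_cost_usd / deepseek_cost_usd  # roughly 179x

# Reported chip counts (unverified).
deepseek_chips_low = 2_000    # DeepSeek's own late-December report
deepseek_chips_high = 10_000  # higher outside estimate
meta_chips = 1_300_000        # chips Meta expects to have by year end

chip_ratio_vs_low = meta_chips / deepseek_chips_low    # 650x
chip_ratio_vs_high = meta_chips / deepseek_chips_high  # 130x

print(f"Cost ratio: about {cost_ratio:.0f}x")
print(f"Chip ratio: {chip_ratio_vs_high:.0f}x to {chip_ratio_vs_low:.0f}x")
```

Even on the least favorable chip estimate, the gap is two orders of magnitude, which is why the claims drew so much attention and so much skepticism.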
How Much Spending Is Really Needed?
Trump secured pledges to spend $500 billion on data centers. Meta alone is planning to spend more than $60 billion.
Technology investor Marc Andreessen called DeepSeek’s AI model “one of the most amazing and impressive breakthroughs I’ve ever seen” and “a profound gift to the world” in a post on X.
Export Restrictions
I discussed export restrictions yesterday in China’s DeepSeek AI Raises Doubts Over U.S. Tech Dominance and Export Curbs
Another Sanction Failure
Biden placed numerous export bans on chip technology to prevent this from happening.
Instead DeepSeek said in a late-December report that it used a cluster of more than 2,000 Nvidia chips to train its AI.
No one should be surprised by this.
To Those Hard of Learning, Here’s a Repeat Lesson on Why Sanctions Fail
On September 26, 2024, I commented To Those Hard of Learning, Here’s a Repeat Lesson on Why Sanctions Fail
Let’s discuss a claim that sanction failures are due to a lack of political will.
Robin Brooks on X: “When someone tells you that sanctions can’t and won’t work, that’s basically pro-Russian propaganda. Are we seriously to believe that nothing can be done to stop the shameful flood of transshipments to Russia via Central Asia? Come on. This is just about a lack of political will.“
I am pretty sure that “someone” is me because we have gone round and round on this.
When someone tells you that sanctions do work, ask them for evidence.
The above post was on oil-related sanctions. The next article is how and why chip sanctions failed.
On August 26, 2024, I commented China Gains Secret Access to Nvidia Microchips by Renting Computers
The US has blocked export of Nvidia chips to China. But where there’s profit, there’s a way.
Know Your Customer’s Customer’s Customer
China sets up an AI company in Singapore. AI developers buy cloud time through a subsidiary that further masks the operation by paying in Bitcoin.
In turn, the subsidiary buys time from a company in Dubai or Singapore that hosts the servers.
US politicians are outraged. But some of us are amused knowing full well that sanctions don’t work. So instead of cloud profits going to US corporations, the profits go to Saudi Arabia, Singapore, Dubai, and South Korea.
Only Amazon is forced to “know your customer”.
Musk Trashes Trump’s Pet AI Project
Regarding capital expenditures, note that Musk Trashes Trump’s Pet AI Project, causing a feud with Trump’s staff.
Musk, who owns his own AI startup, was not at Trump’s unveiling of “Stargate,” an effort to supercharge the country’s AI infrastructure featuring the tech giants OpenAI, Softbank and Oracle. Musk, who co-founded OpenAI, has long been critical of its CEO Sam Altman and spent much of Wednesday trolling him online. “They don’t actually have the money,” he said. Softbank, meanwhile, “has well under $10B secured. I have that on good authority.”
Musk even reposted a joke that suggested Altman and his team smoked crack “to come up with their $500 billion number for Stargate.”
How Good is DeepSeek?
The answer is we don’t really know, but we do know that the free downloadable version is very good, perhaps as good as or better than ChatGPT, which costs $200 a month (well, not for long)!
After years of assuming the US was far ahead of the rest of the world on AI, there comes a Sputnik realization that perhaps the US is really behind, despite China having to use rented computers and/or lower-grade chips to develop its AI.
By the way, what does this say about US military intelligence (other than it’s likely in a huge tizzy right now)?


OpenAI is out this morning claiming that DeepSeek used ChatGPT4’s models as a major part of their own model, which then allows DS to produce its operator product with much less computational effort.
Regarding AI, I’ve been using Claude to develop some Option strategies. In one answer, I found a problem and asked Claude about it. Claude recognized the error and corrected it. I found another one later and the same thing happened.
I then asked Claude if it remembered the error and its correction and stored that info in its database so that it wouldn’t happen again. Here is the answer I got back:
“No, I don’t retain corrections or learn from my mistakes during our conversation. Each response I give is based on my initial training, which ended in April 2024. While I aim to provide accurate information and can correct myself within a conversation when errors are pointed out (as we just saw with the iron butterfly P&L calculation), I don’t develop or maintain any ongoing memory between conversations.
This is actually an important limitation to be aware of – if we were to start a new conversation and you asked me about iron butterfly P&L again, I might make the same error about options being “in the money” versus “at the money” at the center strike. The correct understanding I demonstrated after your correction would not carry forward.
This is why it’s always valuable to have experienced traders like yourself verify and correct any options-related calculations and concepts I provide.”
Bottom line: AI isn’t the infallible “tool” we’re being sold.
DeepSeek’s breakthrough in AI is truly remarkable, especially considering the efficiency and cost-effectiveness of their model. It’s exciting to see how it’s challenging established tech giants and reshaping the future of AI research. For those looking to explore AI innovations further, you can find more insights and updates at DeepSeek AI. This kind of advancement certainly makes the future of AI even more fascinating!
DeepSeek is clearly startling. As for the $6 million to train it … remember that the CCP is a major player in it and thus there is no way to validate that claim and it should be viewed with some healthy skepticism. I have to wonder who hacked the logins right after the announcement (US gov’t?) … it is an interesting space … I’m guessing DeepSeek will be a bit like most Microsoft OS’s in stages:
I have some doubts. AI may have found the Math Olympiad answer among a multitude of solved math questions rather than figuring it out.
I was preparing myself for a Math Olympiad, and even then, 40 years ago, there were books of Math Olympiad solutions. I can imagine what it is now.
Once you get down to the actual equation (ignoring all the prefatory information) and get out your protractor, it’s quite easy to solve. Doing the proof only proves you’re a masochist at heart.
Now why do we need such things?
China will conquer the world with technology, not military force.
If they ever see the light and switch their very complex language to something like English or even Esperanto, they would make much greater strides.
I see today that Perplexity has already integrated DeepSeek into their Pro offering, which I believe costs $20/month right now.
“I cannot fathom solving that problem ever, let alone in 19 seconds. And I had three semesters of calculus, plus differential equations, and advanced statistics (all of which I admit I have long forgotten)”
Similar to myself, and I had the exact same thought as I read that!
Now you can see why Elon wants foreign H1B workers.
Stinky shouldn’t test me. I hold the leash, and the stick.
Saw this on LinkedIn today regarding DeepSeek Privacy Policy.
https://chat.deepseek.com/downloads/DeepSeek%20Privacy%20Policy.html
Below are some interesting snippets, though the whole page has various red flags for me. Wonder how this matches up with CCPA standards.
Information You Provide
How We Use Your Information
With Your Consent
Where We Store Your Information
Thanks
What point did you want to make? Virtually any privacy policy for any company will read similar.
One of the interesting dilemmas will come when someone queries AI about topics that some organizations may not like the answer to. Just a couple of topics would be:
Is AI a tool – or are we looking for a God that will tell us what we should do?
Maybe some immune systems are more susceptible to vaccine reactions than others. This is where individual-specific risk identified by AI genetics will help. Could also use environmental factors as an input.
AI doesn’t care what the questioner “likes”.
Deep Seek was the name of the AI social media company the US government was using to spy on citizens in the last Jason Bourne movie.
US Subsidiary of Chinese Chemical Conglomerate Gave $250,000 to Trump Inauguration
Agriculture giant Syngenta looks forward to “working with” the new administration.
China’s cheap off-the-shelf tech has done to the government’s expensive AI bandwagon what Ukraine’s drone tech has done to US military contractors’ expensive bandwagon.
The name “DeepSeek” should be changed to “CheapSeek”.
Meanwhile, NVDA has stopped its slide. Looks like it’s a bargain right now at $120.
Yeah, it’s “only” selling at 25 times sales – down from 30.
Nate Silver talking about “midwit” is unintentionally hilarious.
“The model. The model.”
Can it suggest a solution to the deteriorating Chinese economy?
Probably. And probably to the Russian, US, European, and other world economies also. But humans likely won’t like the prescriptions.
Any curiosity that DeepSeek is unleashed (assume full approval of the CCP) at the very time the US seeks to force the sale (steal) of TikTok to US investors?
Reminds me of whoever (really) destroyed Nord Stream, and suddenly you have ship anchors destroying western undersea infrastructure.
Every action has a response.
What proof is there that it does not require much computing power? It excludes a substantial amount of data when replying to a request. Ex: “I am sad.” DeepS: “Why are you sad?” Different statement: “I am always sad.” Same DeepS reply: “Why are you sad?” Individual words matter, versus stock-phrase replies, which save compute.
Presumably usage and capabilities of LLMs and subsequent tools will continue to grow. At this stage most usage is people outsourcing neural activity (thinking, if you will). This frees up people from needing to ‘create’ things they don’t want to bother creating, such as Mish using Grok to generate images that reflect his words, and is akin to mechanical devices like dishwashers reducing the physical exertion of people.
Now, where do things go from here? An optimist may see there being breathtaking potential to better most services and processes to vastly improve human endeavors and allow people the time to pursue activities they find fulfilling (akin to retirement). A pessimist may conclude that the fewer things one does for oneself, the lazier one’s mind will become, eventually reaching the point where stringing a few words together to make a sentence would be too laborious to bother with, rendering humanity superfluous.
Regardless of where things go from here, the egotistical tweets highlighted at the top of this post should be a point of embarrassment for their authors when they eventually catch on.
(for a subject like this, I will note that my user name ends with a lower case “L”. Do not call me artificial intelligence)
I debate this with my AI all the time. AI agrees with this at the current moment, as LLMs are still designed by humans and human-centric. When AI develops its own language and starts debating humans with others of its design, then things will get interesting.
Paul Simon said he can’t play ‘Call Me Al’ anymore. He says it’s due to hearing loss.
This to me is the saddest aspect of AI (not Simon’s Al). There is something absolutely being gained by AI but also something absolutely being lost and everyone is experiencing this loss who has used AI which is the absence of true human creativity.
I’m going to listen to the old SNL version he did of it. I remember that version the best. The video with Chevy Chase was light and lively but the SNL one really captures the beauty of human creativity.
That’s a shame to hear about Mr. Simon, but everyone’s time on the stage comes to an end. It is a good reminder to take in some live music now and again!
Clearly, DeepSeek’s engineering staff is not stacked with DEI hires.
Clearly a benefit of the Communist Party ruling for 75 years straight…
and if we are lucky, a 3rd Trump term might pave a similar road for us.
Speaking of technological miracles…
it is worth noting that alternative facts are now embraced by Google…
Google Maps will change the Gulf of Mexico to the Gulf of America | CNN Business
The diversity hires in Silicon Valley are not programmers. They’re always the public face: VP of Communications, etc. That’s largely true across all American companies.
Untrue. Wall Street trading desks are now staffed with DEI idiots. The Dems forced this through regulations and contract clauses.
My kid is in the big tech business, and he sees jobs being steered away from white kids with BSCS degrees from good schools. He says the offers are being made to POC candidates with associates degrees. This is in NYC and the valley.
The important part about this that should keep people up at night is the fact that this model does not require that much computing power, so it can run on ‘air gapped’ hardware.
Imagine what terrorists or other crazies might get up to with the power of AI able to guide them to doing things like making bombs or analyzing the best places to put said bombs or get past security etc.
In addition to doing unlimited good it can also do unlimited harm.
My self driving cars will be a car bomber’s dream
You need to make them less sturdy so they actually explode when a bomb is set off in them!
When the chips are down, it is much easier to count them.
Take a morning hike, up to a rocky Peak facing East.
Behold the miracle of a Sunrise. Bring a guitar and play your favorite chord Progression.
Sing a little. Or, bring your child or partner or best friend and let them know that you see it, too!
ALL THAT MATTERS is humanity. NOTHING COMPARES, esp. an A.I. robot with metal skin and no humanity.
Was this written by AI?
NO, it was written by kids. Mine also struggled with capitalization.
😀 😀
Decent probability this was a psyop.
The USA will simply outlaw any competitors. DONE!
(;>))
No matter how Deep or altering these new Inventions are, I was calmed after overhearing a Chat between some Plumbers and Electricians. They didn’t feel their hands-on jobs would disappear anytime soon, so they laughed and had another beer.
You mean that A.I. cannot repair a pipe leak or wire in a new 120V Socket?
It will not repair it for you but will tell you how to do it for sure…..
Absolutely. Even now you can find YouTube videos from actual people showing you how to do all that and a lot more.
Have you seen the new humaniform (to borrow a term from Asimov) AI robots? No, I’m not referring to the sex bots.
In about 5 years, said plumbers and electricians will be crying in their beers.
Robot technology is advancing almost as fast as LLM AIs. In fact, LLM AIs are being used to train robots, instead of programming thousands of “rules” for them to follow.
Many humanoid robot companies are promising large scale deployments this year. Their real experience will leverage up the next generation faster.
The madness continues.
Humans reducing our understanding of the universe to ones and zeros. For the lowest possible cost.
First, death by a femur bone, then an AI monolith appears. To transcend what? No more hunger for all, I would pray.
But no, the AI lions must and will be fed more steak until the Plutonomy succeeds.
Well, I always thought nobody has a monopoly on the thinking process, and I am glad there’s competition on the AI Wall Street garbage. Shares of NVDA, anybody?
I initially thought AI was a hype train on the order of the many buzz technologies I’ve read about over the years, like blockchain, Google Glass, 3D TV, Theranos, Metaverse, etc. But after using ChatGPT, Napkin, Suno, Sora, and a bunch of other AI to create things out of thin air with a word prompt, I realized this was revolutionary and would transform the global economy. I even asked ChatGPT to create a full blown website and it popped out the code to do it in seconds.
And we’re still in the “dial up modem” phase of this technology. It will get better, faster and cheaper just like I went from a 3200 baud modem to 1 gig fiber at my home. I feel sorry for anyone that hasn’t embraced AI by now, you’re going to be left far behind. This is now a mandatory subscription at my home but glad to see DeepSeek will bring prices down.
I am also glad I’m almost done working, don’t know how knowledge workers survive this new paradigm.
By the way, I picked up a few shares of NVDA yesterday for the very long term.
A.I. cannot replace the pure joy of playing a Guitar and singing. Or enjoying a sunset or sunrise or being Loved by another Human.
NO ONE gets left behind anything.
We might only lag in our cognizance of something new but SURVIVING in a healthful way in life is ALL that matters. I have done quite well NOT having something new and dazzling DOZENS OF TIMES in my life. It has not mattered.
What matters is LOVING and BEING LOVED.
A guitar is a tool used to make music (acoustic).
A voice is a tool used to make music (singing).
AI is a tool that can both replicate the sound of many guitars and many voices. Saying “love” is better than AI is missing the point entirely because we’re talking about tools not emotions.
It’s now possible to have a full orchestra composing music that I input parameters into AI while I sit on the beach watching that sunset. Imagine having a custom song for every custom sunset (beach, mountain, winter, summer, spring, autumn, etc). That’s what AI can do now, imagine what it can do 10 years from now.
Sorry, but if you’re a musician, you’re obsolete. AI can create thousands of songs every minute of every day in any voice with any instrumentation and any style, and I can type my own lyrics or have it create an infinite amount any time of day.
If you don’t know how to use this new tool, you’re going to be obsolete very fast.
Knowledge workers won’t survive. But how many ‘pure’ knowledge workers are there that will go obsolete? Once upon a time a few decades ago Math savants that could do lots of math in their heads had good paying jobs using that skill. When the calculator came along they were all out of jobs but overall there weren’t that many jobs lost.
Most office work comes to mind. The new edge to AI is the creation of “agents” that execute tasks on your behalf. You can tell AI, book me a trip to Paris, France, and it will book the flight and hotel, buy event tickets, etc. Eventually AI will replace travel agents, CPAs, human resource people, procurement people, etc.
Unless your work requires hands on activities, it’s potentially on the chopping block.
Sewer contractor. Safest industry.
Sadly, Politician is probably the safest.
Travel agents are already gone. I can’t recall the last time I saw one or know anyone who used one (and that includes my 80 plus year old parents who are tech zeros). My company uses a website for corporate travel and I personally have since around 2000ish.
I agree with you about CPAs. I’m surprised they’ve survived this long. Especially the H&R Block type ones. Even Turbo Tax should be gone soon too, as AI should be able to do 95% of people’s taxes for them trivially once they answer a few questions.
A lot of Office work though still involves a physical person (making a call, moving an object around, visually looking at something etc). Paper records are still legally required for a lot of things. That can’t be replaced until we get robots that can move, speak, hear and visualize perfectly (I suspect that won’t happen till long after AI).
Forgot to list 5G hype
5G is already here. You might be using it without even knowing.
It’s just not a game changer in any way.
Free short term. Quite costly in the long run. For what is at present a toy.
No, definitely more than a toy and useful in a limited way. But not quite living up to the hype the media or the benchmarks would have us believe. I believe OpenAI was recently caught with their pants down when it was discovered they trained their AI on the benchmark they used to claim they achieved AGI. But anyone who has used these models as more than just a toy realizes they are far from achieving human intelligence. But they are a great alternative to Google search in most cases.
In the world, we humans are the most intelligent and rule the world. Now AI is the most intelligent, and the more we use AI, the dumber we become; in a few decades’ time we will be stupid idiots. AI will surely rule the world. Only later will we know HOW.
Well, AI is not the most intelligent but you’re absolutely correct that we will become dumber if we don’t think but instead rely on AI for everything.
Yet to see any independent verification of these claims. Do know the CCP has invested billions in AI and chip production. Where is the data center for this AI platform? Or are they using those data centers in other countries with massive chip capacity?
One of the most fascinating aspects of DeepSeek R1 is its ability to engage in self-reflection. This emergent behavior wasn’t explicitly programmed but arose from the reinforcement learning process. When the model solves a problem, it doesn’t stop there. It reviews its own reasoning, identifies potential errors, and corrects itself if needed. More: DeepSeek-R1’s bold bet on reinforcement learning: How it outpaced OpenAI at 3% of the cost
DeepSeek used a novel approach “…Reinforcement learning (RL): The reward model was a process reward model (PRM)” (from https://en.wikipedia.org/wiki/DeepSeek )
At the same time, I am confident that DeepSeek (both V3 and R1) used OpenAI to guide the training process.
Ambush and setup… China is not stupid and everything they do is calculated. They were waiting and nothing is ever FREE.
Let’s see: TikTok is a security threat but this isn’t? A state-sponsored and subsidized AI app that just damaged the western economy and shareholders, being downloaded because it’s free? You get what you pay for. This has a much higher price to western advancement and investment. Economic terrorism. The Chinese will aggregate this information with every past breach and dark web data dump and steal your identity. Beware. Trojan Horse application.
An AI startup wants to bid on TikTok. TikTok is an AI data generator. Run models over the data, refine, expand etc. If you can control desire (consumption) and fear (social constraint and negative emotional response) you can control the populace.
DeepSeek is not ‘free’. It is not ‘free’ just as Meta, Google and every other social media platform is not ‘free’. Each takes one’s personal information, including messaging content, to sell to advertisers, political analysis and to government intelligence agencies to be used for top-down control. These platforms would not exist but for users to give this information without cost. Like any creative artist, users should be paid for this information and told precisely how it is used.
John, you actually understand this. All the free stuff is because you and all of your information is what you are giving to them. Access to everything you do, say and eventually think. They have your email, search and are listening. You are the product.
LLAMA 3.1 70B developed and offered by Meta is free. Queries are stored.
I think the saying goes something like this, “if something is free, you are the product”.
So if I download the open source code and use it locally, I would be providing them data?
No, unless they have some server defined in the code that collects your queries and responses. But they wouldn’t make it open source if they tried to do that.
The Russian steel industry is collapsing for lack of sales due to sanctions.
https://www.pravda.com.ua/eng/news/2025/01/28/7495607/
Neither India nor China accepts tankers of Russian oil on sanctioned ships, which are now stranded.
https://www.tbsnews.net/world/global-economy/russia-oil-trade-china-india-stalls-sanctions-drive-shipping-costs-1054661
Russians think sanctions work. Mish does not.
How do you know what Russians think?
And there is a steel glut thanks to China, so pretty poor analysis.
Why would anyone still believe a Ukrainian web site about anything? Pravda hasn’t been Russian in a very long time.
No, Wall Street, DeepSeek is not “far superior” https://www.jitbit.com/alexblog/deepseek/
It’s not whether or not it’s better.
It’s whether it’s cheaper and by how much. Right now it looks to be orders of magnitude cheaper (seeing things like 90-95%) for essentially the same performance.
It’s also using a totally different learning mechanism than the other AIs. That is also huge because it means there may be different paths to AI and that we can’t be sure we are even on the right path (neither may be the right path). Which means a LOT of money could be being bet on a loser.
Large Language Models are not the only method of training AI systems. There are both majors and startups with many different approaches.
It’s using a simpler query mechanism. Hence cheaper.
Could be there was a big data center used in the initial training – but that’s not for us to know.
But if you only need that data center once…
Right….then you’ve spent millions/billions.
We don’t know (all) the development details.