Google’s TPU Chips Threaten Nvidia’s Dominance in AI Computing

Here is a three-year chart of stock prices for Nvidia (NVDA), Alphabet/Google (GOOG), and the tech-heavy QQQ Nasdaq-100 index fund:

NVDA has been spectacular. If you had put $20k into NVDA three years ago, it would have turned into nearly $200k. Sweet. Meanwhile, GOOG poked along at the general pace of QQQ. Then, around Sept 1 (yellow line), GOOG started to pull away from QQQ, and it has not looked back.

And in the past two months, GOOG stock has stomped all over NVDA, as shown in the six-month chart below. The two stocks were neck and neck in early October, but GOOG then surged way ahead. In the past month, GOOG is up sharply (red arrow), while NVDA is down significantly:

What is going on? It seems that the market is buying the narrative that Google’s Tensor Processing Unit (TPU) chips are a competitive threat to Nvidia’s GPUs. Last week, we published a tutorial on the technical details here. Briefly, Google’s TPUs are hardwired to perform key AI calculations, whereas Nvidia’s GPUs are more general-purpose. For a range of AI processing, the TPUs are faster and much more energy-efficient than the GPUs.

The greater flexibility of Nvidia's GPUs, and the programming community's familiarity with Nvidia's CUDA programming platform, still gives Nvidia a bit of an edge in the AI training phase. But much of that edge fades in the inference (application) phase of AI. For the past few years, the big AI wannabes have focused madly on model training, but there must be a shift to inference (practical implementation) soon for AI models to actually make money.

All this is a big potential headache for Nvidia. Because of its quasi-monopoly on AI compute, Nvidia has been able to charge a huge gross profit margin of around 75% on its chips. Its customers are naturally not thrilled with this, and they have been making some efforts to devise alternatives. But it seems that Google, thanks to a big head start in this area and very deep pockets, has actually equaled or even beaten Nvidia at its own game.

This explains much of the recent disparity in stock movements. It should be noted, however, that for a quirky business reason, Google is unlikely in the near term to displace Nvidia as the main go-to for AI compute power. The reason is this: most AI compute power is implemented in huge data/cloud centers. And Google is one of the three main cloud vendors, along with Microsoft and Amazon, with IBM and Oracle trailing behind. So, for Google to supply Microsoft and Amazon with its chips and accompanying know-how would be to enable its competitors to compete more strongly.

Also, an AI user like, say, OpenAI would be reluctant to commit to running in a Google-owned facility on Google chips, since the user would then be somewhat locked in and held hostage: it would be expensive to switch to a different data center if Google tried to raise prices. In contrast, a user can readily move to a different data center for a better deal if all the centers are using Nvidia chips.

For the present, then, Google is using its TPU technology primarily in-house. The company has a huge suite of AI-adjacent business lines, so its TPU capability does give it genuine advantages there. Reportedly, soul-searching continues in the Google C-suite about how to more broadly monetize its TPUs. It seems likely that they will find a way. 

As usual, nothing here constitutes advice to buy or sell any security.

AI Computing Tutorial: Training vs. Inference Compute Needs, and GPU vs. TPU Processors

A tsunami of sentiment shift is washing over Wall Street, away from Nvidia and towards Google/Alphabet. In the past month, GOOG stock is up a sizzling 12%, while NVDA has plunged 13% despite producing its usual earnings beat. Today I will discuss some of the technical backdrop to this sentiment shift, which involves the differences between training AI models and actually applying them to specific problems (“inference”), and the significantly different processing chips suited to each. Next week I will cover the company-specific implications.

As most readers here probably know, the popular Large Language Models (LLMs) that underpin the new AI products work by sucking in nearly all the text (and now other data) that humans have ever produced, reducing each word or form of a word to a numerical token, and grinding and grinding to discover consistent patterns among those tokens. Layers of (virtual) neural nets are used. The training process involves an insane amount of trying to predict, say, the next word in a sentence scraped from the web, evaluating why the model missed it, and feeding that information back to adjust the matrix of weights on the neural layers, until the model can predict that next word correctly. Then on to the next sentence found on the internet, to work and work until it can be predicted properly.

At the end of the day, a well-trained AI chatbot can respond to Bob’s complaint about his boss with an appropriately sympathetic pseudo-human reply like, “It sounds like your boss is not treating you fairly, Bob. Tell me more about…” It bears repeating that LLMs do not actually “know” anything. All they can do is produce a statistically probable word salad in response to prompts. But they can now do that so well that they are very useful.*

This is an oversimplification, but gives the flavor of the endless forward and backward propagation and iteration that is required for model training. This training typically requires running vast banks of very high-end processors, typically housed in large, power-hungry data centers, for months at a time.
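For readers who like to see the moving parts, here is a toy sketch of that predict-miss-adjust loop in Python. It is a drastically simplified stand-in (one tiny weight matrix, one made-up sentence of my own invention), not how any real LLM is built, but the forward pass, loss, and weight adjustment follow the same basic pattern.

```python
# A toy sketch of next-token training: forward pass, measure the miss,
# backpropagate to adjust the weights. All names and numbers here are made up
# for illustration; real LLMs use billions of weights across many layers.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["bob", "hates", "his", "boss", "<end>"]
tok = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# "Training data": (current word, next word) pairs from one tiny sentence.
sentence = ["bob", "hates", "his", "boss", "<end>"]
pairs = [(tok[a], tok[b]) for a, b in zip(sentence, sentence[1:])]

W = rng.normal(scale=0.1, size=(V, V))      # the model's weight matrix

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for step in range(500):                     # grind and grind...
    for cur, nxt in pairs:
        probs = softmax(W[cur])             # forward pass: predicted next-word odds
        loss = -np.log(probs[nxt])          # how badly did we miss the real next word?
        grad = probs.copy()
        grad[nxt] -= 1.0                    # gradient of the cross-entropy loss
        W[cur] -= 0.5 * grad                # adjust the weights and try again

# After training, the toy model predicts the next word it was trained on.
print(vocab[int(np.argmax(W[tok["bob"]]))])  # -> "hates"
```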

Once a model is trained (i.e., the neural net weights have been determined), to then run it (i.e., to generate responses to human prompts) takes considerably less compute power. This is the “inference” phase of generative AI. It still takes a lot of compute to run a big model quickly, but a simpler LLM like DeepSeek can be run, with only modest time lags, on a high-end PC.
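As a rough illustration of how low the barrier to inference has become, here is a minimal sketch using the Hugging Face transformers library to run a small open-weights model locally. The specific model name is my assumption for illustration; any small distilled chat model that fits in your PC’s memory would do.

```python
# A minimal local-inference sketch, assuming the Hugging Face `transformers`
# library is installed. The model name is an illustrative assumption; swap in
# any small open-weights model that fits on your machine.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",  # assumed small distilled model
    device_map="auto",               # use a local GPU if available, otherwise the CPU
)

prompt = "My boss is not treating me fairly."
result = generator(prompt, max_new_tokens=60, do_sample=True)
print(result[0]["generated_text"])   # a statistically probable, often useful, reply
```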

GPUs Versus ASIC TPUs

Nvidia has made its fortune by taking graphics processing units (GPUs) that were developed for the massively parallel calculations needed to drive video displays, and adapting them to more general problem solving that can make use of rapid matrix calculations. Nvidia chips and its CUDA programming platform have been employed for physical simulations such as seismology and molecular dynamics, and then for Bitcoin mining calculations. When generative AI came along, Nvidia chips and programming tools were the obvious choice for LLM computing needs. The world’s lust for AI compute is so insatiable, and Nvidia has had such a stranglehold, that the company has been able to charge an eye-watering gross profit margin of around 75% on its chips.

AI users of course are trying desperately to get compute capability without having to pay such high prices to Nvidia. It has been hard to mount a serious competitive challenge, though. Nvidia has a commanding lead in hardware and supporting software, and (unlike the Intel of years gone by) keeps forging ahead rather than resting on its laurels.

So far, no one seems to be able to compete strongly with Nvidia in GPUs. However, there is a different chip architecture, which by some measures can beat GPUs at their own game.

Nvidia GPUs are general-purpose parallel processors with high flexibility, capable of handling a wide range of tasks from gaming to AI training, supported by a mature software ecosystem built around CUDA. GPUs beat out the original computer central processing units (CPUs) for these tasks by sacrificing flexibility for the power to do parallel processing of many simple, repetitive operations. The newer “application-specific integrated circuits” (ASICs) take this specialization a step further. They can be custom hard-wired to do specific calculations, such as those required for Bitcoin mining and now for AI. By cutting out steps used by GPUs, especially fetching data in and out of memory, ASICs can do many AI computing tasks faster and cheaper than Nvidia GPUs, while using much less electric power. That is a big plus, since AI data centers are driving up electricity prices in many parts of the country. The particular type of ASIC that Google uses for AI is called a Tensor Processing Unit (TPU).
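The workload all of these chips are racing to accelerate is, at bottom, the giant matrix multiply inside each neural-net layer. As a minimal sketch (using Google’s open-source JAX library, which targets CPUs, GPUs, and TPUs alike; the sizes below are arbitrary), the same line of Python runs on whichever accelerator is available:

```python
# A minimal sketch of the core workload: one big matrix multiply.
# With JAX, the same code dispatches to a CPU, an Nvidia GPU, or a Google TPU,
# depending on what hardware the runtime finds. Matrix sizes are arbitrary.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
activations = jax.random.normal(k1, (4096, 4096))   # stand-in for a layer's inputs
weights = jax.random.normal(k2, (4096, 4096))       # stand-in for a layer's weights

outputs = jnp.dot(activations, weights)             # the matrix multiply itself
print(outputs.shape, jax.devices()[0].platform)     # e.g. (4096, 4096) 'gpu' or 'tpu'
```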

I found this explanation by UncoverAlpha to be enlightening:

A GPU is a “general-purpose” parallel processor, while a TPU is a “domain-specific” architecture.

The GPUs were designed for graphics. They excel at parallel processing (doing many things at once), which is great for AI. However, because they are designed to handle everything from video game textures to scientific simulations, they carry “architectural baggage.” They spend significant energy and chip area on complex tasks like caching, branch prediction, and managing independent threads.

A TPU, on the other hand, strips away all that baggage. It has no hardware for rasterization or texture mapping. Instead, it uses a unique architecture called a Systolic Array.

The “Systolic Array” is the key differentiator. In a standard CPU or GPU, the chip moves data back and forth between the memory and the computing units for every calculation. This constant shuffling creates a bottleneck (the Von Neumann bottleneck).

In a TPU’s systolic array, data flows through the chip like blood through a heart (hence “systolic”).

  1. It loads data (weights) once.
  2. It passes inputs through a massive grid of multipliers.
  3. The data is passed directly to the next unit in the array without writing back to memory.

What this means, in essence, is that a TPU, because of its systolic array, drastically reduces the number of memory reads and writes required from HBM. As a result, the TPU can spend its cycles computing rather than waiting for data.
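To make the quoted data-flow idea a bit more concrete, here is a toy software simulation of the weight-stationary scheme: each output accumulates as data marches through a column of multiply-add “cells,” and only the finished result is written back out. This is a sketch of the concept, not a model of any real TPU.

```python
# Toy simulation of the systolic idea: load the weights once, let partial sums
# flow from cell to cell, and write back to "memory" only when a result is done.
# A conceptual sketch, not a description of actual TPU hardware.
import numpy as np

def systolic_style_matmul(inputs, weights):
    m, k = inputs.shape
    k2, n = weights.shape
    assert k == k2
    out = np.zeros((m, n))
    for j in range(n):                  # one column of stationary weights
        for i in range(m):              # one stream of input data
            acc = 0.0                   # the partial sum that flows through the cells
            for step in range(k):       # data marches past the k weight cells
                acc += inputs[i, step] * weights[step, j]
            out[i, j] = acc             # a single write-back per finished result
    return out

A = np.arange(6, dtype=float).reshape(2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)
assert np.allclose(systolic_style_matmul(A, B), A @ B)   # same answer as a normal matmul
```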

Google has developed the most advanced ASICs for doing AI, which are now on some levels a competitive threat to Nvidia. Some implications of this will be explored in a post next week.

*Next-generation AI seeks to step beyond the LLM world of statistical word salads and to model cause and effect at the level of objects and agents in the real world – – see Meta AI Chief Yann LeCun Notes Limits of Large Language Models and Path Towards Artificial General Intelligence.

Standard disclaimer: Nothing here should be considered advice to buy or sell any security.

Michael Burry’s New Venture Is Substack “Cassandra Unchained”: Set Free to Prophesy All-Out Doom on AI Investing

This is a quick follow-up to last week’s post on “Big Short” Michael Burry closing down his Scion Asset Management hedge fund. Burry had teased on X that he would announce his next big thing on Nov 25. It seems he is now a day or two early: Sunday night he launched a paid-subscription “Cassandra Unchained” Substack. There he claims that:

Cassandra Unchained is now Dr. Michael Burry’s sole focus as he gives you a front row seat to his analytical efforts and projections for stocks, markets, and bubbles, often with an eye to history and its remarkably timeless patterns.

Reportedly the subscription cost is $39 a month, or $379 annually, and there are 26,000 subscribers already. Click the abacus and…that comes to a cool $9.9 million a year in subscription fees. Not bad compensation for sharing your musings online.

Michael Burry was dubbed “Cassandra” by Warren Buffett in recognition of his prescient warnings about the 2008 housing market collapse, a prophecy that was initially ignored, much like the mythological Cassandra who was fated to deliver true prophecies that were never believed. Burry embraced this nickname, adopting “Cassandra” as his online moniker on social media platforms, symbolizing his role as a lone voice warning of impending financial disaster. On the About page of his new Substack, he wrote that managing clients’ money in a hedge fund like Scion came with restrictions that “muzzled” him, such that he could only share “cryptic fragments” publicly, whereas now he is “unchained.”

Of his first two posts on the new Substack, one was a retrospective on his days as a practicing doctor (a resident in neurology at Stanford Hospital) in 1999-2000. He had done a lot of online posting on investing topics, focusing on valuations, and finally left medicine to start a hedge fund. As he tells it, he called the dot.com bubble before it popped.

Business Insider summarizes Burry’s second post, which attacks the central premise of those who claim the current AI boom is fundamentally different from the 1990s dot.com boom:

The second post aims straight at the heart of the AI boom, which he calls a “glorious folly” that will require investigation over several posts to break down.

Burry goes on to address a common argument about the difference between the dot-com bubble and AI boom — that the tech companies leading the charge 25 years ago were largely unprofitable, while the current crop are money-printing machines.

At the turn of this century, Burry writes, the Nasdaq was driven by “highly profitable large caps, among which were the so-called ‘Four Horsemen’ of the era — Microsoft, Intel, Dell, and Cisco.”

He writes that a key issue with the dot-com bubble was “catastrophically overbuilt supply and nowhere near enough demand,” before adding that it’s “just not so different this time, try as so many might do to make it so.”

Burry calls out the “five public horsemen of today’s AI boom — Microsoft, Google, Meta, Amazon and Oracle” along with “several adolescent startups” including Sam Altman’s OpenAI.

Those companies have pledged to invest well over $1 trillion into microchips, data centers, and other infrastructure over the next few years to power an AI revolution. They’ve forecasted enormous growth, exciting investors and igniting their stock prices.

Shares of Nvidia, a key supplier of AI microchips, have surged 12-fold since the start of 2023, making it the world’s most valuable public company with a $4.4 trillion market capitalization.

“And once again there is a Cisco at the center of it all, with the picks and shovels for all and the expansive vision to go with it,” Burry writes, after noting the internet-networking giant’s stock plunged by over 75% during the dot-com crash. “Its name is Nvidia.”

Tell us how you really feel, Michael. Cassandra, indeed.

My amateur opinion here: I think there is a modest but significant chance that the hyperscalers will not all be able to make enough fresh money to cover their ginormous 2024-2028 investments in AI capabilities. What happens then? Google and Meta and Amazon may need to write down hundreds of billions of dollars on their balance sheets, which would show up as ginormous hits to GAAP earnings for a number of quarters. But then life would go on just fine for these cash machines, and the market may soon forgive and forget this massive misallocation of old cash, as long as operating cash keeps rolling in as usual. Stocks are, after all, priced on forward earnings. If the AI boom busts, all tech stock prices would sag, but I think the biggest operating impact would be on suppliers of chips (like Nvidia) and of data centers (like Oracle). So, Burry’s comparison of 2025 Nvidia to 1999 Cisco seems apt.

Circular AI Deals Reminiscent of Disastrous Dot.Com Vendor Financing of the 1990s

Hey look, I just found a way to get infinite free electric power:

This sort of extension-cord-plugged-into-itself meme has shown up recently on the web to characterize a spate of circular financing deals in the AI space, largely involving OpenAI (parent of ChatGPT). Here is a graphic from Bloomberg which summarizes some of these activities:

Nvidia, which makes LOTS of money selling near-monopoly, in-demand GPU chips, has made investing commitments in its customers, or in customers of its customers. Notably, Nvidia will invest up to $100 billion in OpenAI, in order to help OpenAI increase its compute power. OpenAI in turn inked a $300 billion deal with Oracle for building more data centers filled with Nvidia chips. Such deals will certainly boost the sales of Nvidia’s chips (and make Nvidia even more money), but they also raise a number of concerns.

First, they make it seem like there is more demand for AI than there actually is. Short seller Jim Chanos recently asked, “[Don’t] you think it’s a bit odd that when the narrative is ‘demand for compute is infinite’, the sellers keep subsidizing the buyers?” To some extent, all this churn is just Nvidia recycling its own money, as opposed to new value being created.

Second, analysts point to the destabilizing effect of these sorts of “vendor financing” arrangements. Towards the end of the great dot.com boom in the late 1990s, hardware vendors like Cisco were making gobs of money selling networking equipment to internet service providers (ISPs). In order to help the ISPs build out even faster (and purchase even more Cisco hardware), Cisco loaned money to the ISPs. But when that boom busted, and the huge overbuild in internet capacity became (to everyone’s horror) apparent, the ISPs could not pay back those loans. QQQ lost 70% of its value. Twenty-five years later, Cisco’s stock price has never regained its 2000 high.

Besides taking in cash investments, OpenAI is borrowing heavily to buy its compute capacity. Since OpenAI makes no money now (in fact it loses billions a year), will likely not make money for several more years, and is locked in competition with other deep-pocketed AI ventures, there is the possibility that it could pull down the whole house of cards, as happened in 2000. Bernstein analyst Stacy Rasgon recently wrote, “[OpenAI CEO Sam Altman] has the power to crash the global economy for a decade or take us all to the promised land, and right now we don’t know which is in the cards.”

For the moment, nothing seems set to stop the tidal wave of spending on AI capabilities. Big tech is flush with cash, and is plowing it into data centers and program development. Everyone is starry-eyed with the enormous potential of AI to change, well, EVERYTHING (shades of 1999).

The financial incentives are gigantic. Big tech got big by establishing quasi-monopolies on services that consumers and businesses consider must-haves. (It is the quasi-monopoly aspect that enables the high profit margins).  And it is essential to establish dominance early on. Anyone can develop a word processor or spreadsheet that does what Word or Excel do, or a search engine that does what Google does, but Microsoft and Google got there first, and preferences are sticky. So, the big guys are spending wildly, as they salivate at the prospect of having the One AI to Rule Them All.

Even apart from achieving some new monopoly, the trillions of dollars spent on data center buildout are hoped to pay out one way or the other: “The data-center boom would become the foundation of the next tech cycle, letting Amazon, Microsoft, Google, and others rent out intelligence the way they rent cloud storage now. AI agents and custom models could form the basis of steady, high-margin subscription products.”

However, if in 2-3 years it turns out that actual monetization of AI continues to be elusive, as seems quite possible, there could be a Wile E. Coyote moment in the markets:

Bears and Bulls Battle Over Nvidia Stock Price

Nvidia is a huge battleground stock – – some analysts predict its price will languish or crash, while others see it continuing its dramatic rise. It has become the world’s most valuable company by market capitalization.  Here I will summarize the arguments of one bear and one bull from the investing site Seeking Alpha.

In this corner…semi-bear Lawrence Fuller. I respect his opinions in general. While the macro outlook has turned him more cautious in the past few months, for the past three years or so he was relentlessly and correctly bullish (again based on macro), when many other voices were muttering doom and gloom.

Fuller’s article is titled Losing Speed On The AI Superhighway. This dramatic chart supports the case that NVDA is overvalued:

This chart shows that the stock value of Nvidia has soared past the value of the entire UK stock market and past the entire value of US energy companies. Fuller reminds us of the parallel with Cisco in 2000. Back then, Cisco was a key supplier of gateway technology for all the companies scrambling to get into this hot new thing called the internet. Cisco’s valuation went to the moon, then crashed and burned when the mania around the internet subsided to a more sober set of applications. Cisco lost over 70% of its value in a year, and still has not regained the share price it had 25 years ago:

… [Nvidia] is riding a cycle in which investment becomes overinvestment, because that is what we do in every business cycle. It happened in the late 1990s and it will happen again this time.

…there are innumerable startups of all kinds, as well as existing companies, venturing into AI in a scramble to compete for any slice of market share. This is a huge source of Nvidia’s growth as the beating heart of the industry, similar to how Cisco Systems exploded during the internet infrastructure boom. Inevitably, there will be winners and losers. There will be far more losers than winners. When the losers go out of business or are acquired, Nvidia’s customer base will shrink and so will their revenue and earnings growth rates. That is what happened during the internet infrastructure booms of the late 1990s.

Fuller doesn’t quite say Nvidia is overvalued, just that its P/E is unlikely to expand further; hence any further stock price increases will have to be produced the old-fashioned way, by actual earnings growth. There are more bearish views than Fuller’s; I chose his because it was measured.

And on behalf of the bulls, here is noob Weebler Finance, telling us that Nvidia Will Never Be This Cheap Again: The AI Revolution Has Just Begun:

AI adoption isn’t happening in a single sequence; it’s actually unfolding across multiple industries and use cases simultaneously. Because of these parallel market build-outs, hyper-scalers, sovereign AI, enterprises, robotics, and physical AI are all independently contributing to the infrastructure surge.

…Overall, I believe there are clear signs that indicate current spending on AI infrastructure is similar to the early innings of prior technology buildouts like the internet or cloud computing. In both those cases, the first waves of investment were primarily about laying the foundation, while true value creation and exponential growth came years later as applications multiplied and usage scaled.

As a pure picks and shovels play, Nvidia stands to capture the lion’s share of this foundational build-out because its GPUs, networking systems, and software ecosystem have become the de facto standard for accelerated computing. Its GPUs lead in raw performance, energy efficiency, and scalability. We clearly see this with the GB300 delivering 50x per-token efficiency following its launch. Its networking stack has become indispensable, with the Spectrum-X Ethernet already hitting a $10b annualized run rate and NVLink enabling scaling beyond PCIe limits. Above all, Nvidia clearly shows a combined stack advantage, which positions it to become the dominant utility provider of AI compute.

… I believe that Nvidia at its current price of ~$182, is remarkably cheap given the value it offers. Add to this the strong secular tailwinds the company faces and its picks-and-shovels positioning, and the value proposition becomes all the more undeniable.

My view: Out of sheer FOMO, I hold a little NVDA stock directly, and much more by participating in various funds (e.g., QQQ, SPY), nearly all of which hold a bunch of NVDA. I have hedged some by selling puts and covered calls that net me about 20% in twelve months, even if the stock price does not go up. Nvidia’s P/E (~40) is on the high side, but not really when considering the growth rate of the company. It seems to me that the bulk of the AI spend is by the four AI “hyperscalers” (Google, Meta, Amazon, Microsoft). They make bazillions of dollars on their regular (non-AI) businesses, and so they have plenty of money to burn in purchasing Nvidia chips. If they ever slow their spend, it will be time to reconsider Nvidia stock. But there should be plenty of warning of that, and probably no near-term crisis: last time I checked, Nvidia production was sold out for a full year ahead of time. I have no doubt that their sales revenue will continue to increase. But earnings will depend on how long they can continue to command their stupendous c. 50% net profit margin (if this were an oil company, imagine the howls of “price gouging”).
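For the curious, here is the flavor of that options arithmetic. The premium figure is a made-up placeholder, not an actual quote, just to show how monthly premiums can add up to roughly 20% of the share price over a year:

```python
# Hypothetical covered-call arithmetic. The premium below is a placeholder,
# not a real option quote; actual premiums vary with volatility and strike.
share_price = 182.0            # NVDA price level cited elsewhere in this post
monthly_premium = 3.00         # assumed premium per share for a one-month call

annual_premium = 12 * monthly_premium
premium_yield = 100 * annual_premium / share_price
print(f"Premium yield: {premium_yield:.0f}% per year")   # roughly 20%
```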

As usual, nothing here should be considered advice to buy or sell any security.

Why Low Returns Are Predicted for Stocks Over the Next Decade

I saw this scary-looking graphic of S&P 500 returns versus price/earnings (P/E) ratios a couple of days ago:

Source: JPMorgan

The left-hand side shows that there is very little correlation between the current forward P/E ratio and returns in the following year; as we have seen in the past few years, and canonically in, say, 1995-1999, market euphoria can commonly carry over from one year to the next. (See here for a discussion of the momentum effect in stock prices.) So, on this basis, the current sky-high P/E should give us no concern about returns in the next year.

However, the right-hand side is sobering. It shows a very strong tendency for poor ten-year returns if the current P/E is high. In fact, this chart suggests a ten-year return of near zero, starting with the current market pricing. Various financial institutions are likewise forecasting a decade of muted returns [1].

The classic optimistic-but-naïve response to unwelcome facts like these is to argue, “But this time it’s different.” I am old enough to remember those claims circa 1999-2000, as P/Es soared to ridiculous heights. Back then, it was “The internet will change EVERYTHING!” By that, the optimists meant that within a very few years, tech companies would find ways to make huge and ever-growing profits from the internet. Although the internet steadily became a more important part of life, the rapid, huge monetization did not happen, and so the stock market crashed in 2000 and took around ten years to recover.

A big reason for the lack of early monetization was the lack of exclusive “moats” around the early internet businesses. Pets.com was doomed from the start, because anyone could also slap together a competing site to sell dog food over the internet. The companies that are now reaping huge profits from the internet are those like Google and Meta (Facebook) and Amazon that have established quasi-monopolies in their niches.

The current mantra is, “Artificial intelligence will change EVERYTHING!” It is interesting to note that the same challenge to monetization is evident. ChatGPT cannot make a profit because customers are not willing to pay big for its chatbot, when there are multiple competing chatbots giving away their services for practically free. Again, no moat, at least at this level of AI. (If Zuck succeeds in developing agentic AI that can displace expensive software engineers, companies may pay Meta bigly for the glorious ability to lay off their employees).

My reaction to this dire ten-year prognostication is two-fold. First, I have a relatively high fraction of my portfolio in securities which simply pump out cash. I have written about these here and here. With these investments, I don’t much care what stock prices do, since I am not relying on some greater fool to pay me a higher price for my shares than I paid. All I care about is that those dividends keep rolling in.

My other reaction is…this time it may be different (!), for the following reason: a huge fraction of the S&P 500’s valuation is now occupied by the big tech companies. Unlike in 2000, these companies are actually making money, gobs of money, and more money every year. It is common, and indeed rational, to value (on a P/E basis) firms with growing profits more highly than firms with stagnant earnings. Yes, Nvidia has a really high P/E of 43, but its price/earnings-to-growth (PEG) ratio is about 1.2, which is actually pretty low for a growth company.
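For reference, PEG is just the P/E divided by the expected annual earnings-growth rate (expressed in percent). The growth number below is an assumed figure chosen to match the ~1.2 PEG cited above, not a sourced analyst estimate:

```python
# Back-of-the-envelope PEG check. The growth rate is an assumed illustrative
# figure, not a sourced estimate.
pe = 43                       # Nvidia's P/E cited above
earnings_growth_pct = 36      # assumed annual earnings growth, in percent

peg = pe / earnings_growth_pct
print(round(peg, 2))          # ~1.2, roughly the figure quoted above
```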

So, with a reasonable chunk of my portfolio, I will continue to party like it’s 1999.

[1] Here is a blurb from the Llama 3.1 chatbot offered for free in my Brave browser, summarizing the muted market outlook:

Financial institutions are forecasting lower stock market returns over the next decade compared to recent historical performance. According to Schwab’s 2025 Long-Term Capital Market Expectations, U.S. large cap equities are expected to deliver annualized returns of 6% over the next decade, while international developed market equities are projected to slightly outperform at 7.1%. However, Goldman Sachs predicts a more modest outlook, with the S&P 500 expected to return around 3% annually over the next decade, within a range of –1% and 7%. Vanguard’s forecasts also indicate a decline in expected returns, with U.S. equities falling to a range of 2.8% to 4.8% annually. These forecasts suggest that investors may face a period of lower returns compared to the past decade’s 13% annualized total return.

After the Fall: What Next for Nvidia and AI, In the Light of DeepSeek

Anyone not living under a rock the last two weeks has heard of DeepSeek, the cheap Chinese knock-off of ChatGPT that was supposedly trained using far fewer resources than most American Artificial Intelligence efforts have been using. The bearish narrative flowing from this is that AI users will be able to get along with far fewer of Nvidia’s expensive, powerful chips, and so Nvidia’s sales and profit margins will sag.

The stock market seems to be agreeing with this story. The Nvidia share price fell with a mighty crash last Monday, and it has continued to trend downward since then, with plenty of zig-zags.

I am not an expert in this area, but have done a bit of reading. There seems to be an emerging consensus that DeepSeek got to where it got to largely by using what was already developed by ChatGPT and similar prior models. For this and other reasons, the claim for fantastic savings in model training has been largely discounted. DeepSeek did do a nice job making use of limited chip resources, but those advances will be incorporated into everyone else’s models now.

Concerns remain regarding built-in bias and censorship to support the Chinese communist government’s point of view, and regarding the safety of user data kept on servers in China. Even apart from nefarious purposes for collecting user data, DeepSeek has apparently been very sloppy in protecting user information:

Wiz Research has identified a publicly accessible ClickHouse database belonging to DeepSeek, which allows full control over database operations, including the ability to access internal data. The exposure includes over a million lines of log streams containing chat history, secret keys, backend details, and other highly sensitive information.

Shifting focus to Nvidia – – my take is that DeepSeek will have little impact on its sales. The bullish narrative is that the more efficient algos developed by DeepSeek will enable more players to enter the AI arena.

The big power users like Meta and Amazon and Google have moved beyond limited chatbots like ChatGPT or DeepSeek. They are aiming beyond “AI” to “AGI” (Artificial General Intelligence), which matches or surpasses human performance across a wide range of cognitive tasks. Zuck plans to replace mid-level software engineers at Meta with code-bots before the year is out.

For AGI they will still need gobs of high-end chips, and these companies show no signs of throttling back their efforts. Nvidia remains sold out through the end of 2025. I suspect that when the company reports earnings on Feb 26, it will continue to demonstrate high profits and project high earnings growth.

Its price to earnings is higher than that of its peers, but that appears to be justified by its earnings growth. For a growth stock, a key metric is price/earnings-to-growth (PEG), and by that standard, Nvidia looks downright cheap:

Source: Marc Gerstein on Seeking Alpha

How the fickle market will react to these realities, I have no idea.

The high volatility in the stock makes for high options premiums. I have been selling puts and covered calls to capture roughly 20% yields, at the expense of missing out on any rise in share price from here.

Disclaimer: Nothing here should be considered as advice to buy or sell any security.

DeepSeek vs. ChatGPT: Has China Suddenly Caught or Surpassed the U.S. in AI?

The biggest single-day loss of market value in stock market history occurred yesterday, as Nvidia plunged 17%, shaving $589 billion off the AI chipmaker’s market cap. The cause of the panic was the surprisingly good performance of DeepSeek, a new Chinese AI application similar to ChatGPT.

Those who have tested DeepSeek find it performs about as well as the best American AI models, with lower consumption of computing resources. It is also available much more cheaply. What really stunned the tech world is that the developers claimed to have trained the model for only about six million dollars, which is way, way less than the billions that a large U.S. firm like OpenAI, Google, or Meta would spend on a leading AI model. All this despite the attempts by the U.S. to deny China the most advanced Nvidia chips. The developers of DeepSeek claim they worked with a modest number of chips, models with deliberately curtailed capabilities which met U.S. export allowances.

One conclusion, drawn by the Nvidia bears, is that this shows you *don’t* need ever more of the most powerful and expensive chips to get good development done. The U.S. AI development model has been to build more, huge, power-hungry data centers and fill them up with the latest Nvidia chips. That has allowed Nvidia to charge huge profit premiums, as Google and other big tech companies slurp up all the chips that Nvidia can produce. If that supply/demand paradigm breaks, Nvidia’s profits could easily drop in half, e.g., from 60+% gross margins to a more normal (but still great) 30% margin.

The Nvidia bulls, on the other hand, claim that more efficient models will lead to even more usage of AI, and thus increase the demand for computing hardware – – a cyber instance of Jevons’ Paradox (where the increase in the efficiency of steam engines in burning coal led to more, not less, coal consumption, because it made steam engines more ubiquitous).

I read a bunch of articles to try to sort out hype from fact here. Folks who have tested DeepSeek find it to be as good as ChatGPT, and occasionally better. It can explain its reasoning explicitly, which can be helpful. It is open source, which I think means the code or at least the “weights” have been published. It does seem to be unusually efficient. Westerners have downloaded it onto (powerful) PCs and have run it there successfully, if a bit slowly. This means you can embed it in your own specialized code, or do your AI apart from the prying eyes of ChatGPT or other U.S. AI providers. In contrast, ChatGPT I think can only be run on a powerful remote server.

Unsurprisingly, in the past two weeks DeepSeek has been the most-downloaded free app, surpassing ChatGPT.

It turns out that being starved of computing power led the Chinese team to think their way to several important innovations that make much better use of the computing they have. See here and here for gentle technical discussions of how they did that. Some of it involved hardware-ish things like improved memory management. Another key factor is that they figured out a way to train only on data relevant to the query at hand, instead of training each time on the entire universe of text.

A number of experts scoff at the claimed six-million-dollar figure for training, noting that if you include all the costs that were surely involved in the development cycle, it can’t be less than hundreds of millions of dollars. That said, it was still appreciably cheaper than the usual American way. Furthermore, it seems quite likely that making use of answers generated by ChatGPT helped DeepSeek to rapidly emulate ChatGPT’s performance. It is one thing to catch up to ChatGPT; it may be tougher to surpass it. Also, presumably the compute-efficient tricks devised by the DeepSeek team will now be applied in the West as well. And there is speculation that DeepSeek actually has the use of thousands of the advanced Nvidia chips, but hides that fact since obtaining them involved end-running U.S. export restrictions. If so, then their accomplishment would be less amazing.

What happens now? I wish I knew. (I sold some Nvidia stock today, only to buy it back when it started to recover in after-hours trading). DeepSeek has Chinese censorship built into it. If you use DeepSeek, your information gets stored on servers in China, the better to serve the purposes of the government there.

Ironically, before this DeepSeek story broke, I was planning to write a post here this week pondering the business case for AI. For all the breathless hype about how AI will transform everything, it seems little money has been made except for Nvidia. Nvidia has been selling picks and shovels to the gold miners, but the gold miners themselves seem to have little to show for the billions and billions of dollars they are pouring into AI. A problem may be that there is not much of a moat here – – if lots of different tech groups can readily cobble together decent AI models, who will pay money to use them? Already, it is being given away for free in many cases. We shall see…

WSJ: Nothing Important Happened in China, India, or AI This Year

I normally like the Wall Street Journal; it is the only news page I check directly on a regular basis, rather than just following links from social media. But their “Biggest News Stories of 2024” roundup makes me wonder if they are overly parochial. When I try to zoom out and think of the very biggest stories of the past five to ten years, three of the absolute top would be the rapid rise of China and India, together with the astonishing growth in artificial intelligence capabilities.

All three of those major stories continued to play out this year, along with all sorts of other things happening in the two most populous countries in the world, and all the ways existing AI capabilities are beginning to be integrated into our businesses, research, and lives. But the Wall Street Journal thinks that none of this is important enough to be mentioned in their 100+ “Biggest Stories”.

To be fair, China and AI do show up indirectly. AI is driving the 4 (!) stories on NVIDIA’s soaring stock price, and China shows up in stories about spying on the US, hacking the US, and the US potentially forcing a sale of TikTok. But there are zero stories regarding anything that happened within the borders of China, and zero that let you know that AI is good for anything besides NVIDIA’s stock price.

Plus of course, zero stories that let you know that India – – now the world’s most populous country, where over one out of every six people alive resides – – even exists.

AI’s take on India’s Prime Minister using AI

This isn’t just an America-centric bias on WSJ’s part, since there is lots of foreign coverage in their roundup; indeed the Middle East probably gets more than its fair share, thanks to “if it bleeds, it leads”. For some reason they just missed the biggest countries. They also seem to have a blind spot for science and technology; they don’t mention a single scientific discovery, and they include only two technology stories, on SpaceX catching a rocket and doing the first private spacewalk.

The SpaceX stories at least are genuinely important – – the sort of thing that might show up in a history book in 50+ years, along with some of the stories on U.S. politics and the Russia-Ukraine war, unlike most of the trivialities reported.

I welcome your pointers to better takes on what was important in 2024, or on what you consider to be the best news source today.

How Repurposing Graphic Processing Chips Made Nvidia the Most Valuable Company on Earth

Folks who follow the stock market know that the average company in the S&P 500 has gone essentially nowhere in the last couple of years. What has pulled the averages higher and higher has been the outstanding performance of a handful of big tech stocks. Foremost among these is Nvidia. Its share price has tripled in the past year, after nearly tripling in the previous twelve months. Its market value climbed to $3.3 trillion last week, briefly surpassing tech behemoths Microsoft and Apple to make it the most valuable company in the world.

What just happened here?

It all began in 1993 when Taiwanese-American electrical engineer Jensen Huang and two other Silicon Valley techies met in a Denny’s in East San Jose and decided to start their own company. Their focus was making graphics acceleration boards for video games. Computing devices such as computers, game stations, and smartphones have at their core a central processing unit (CPU). A strength of CPUs is their versatility. They can do a lot of different tasks, but sequentially, and thus at a limited speed. To oversimplify, a CPU fetches an instruction (command), then loads maybe two chunks of data, then performs the instructed calculations on those data, then stores the result somewhere else, and then turns around and fetches the next instruction. With clever programming, some tasks can be broken up into multiple pieces that can be processed in parallel on several CPU cores at once, but that only goes so far.

Processing large amounts of graphics data, such as rendering a high-resolution active video game, requires an enormous amount of computing. However, these calculations are largely all of the same type, so a versatile processing chip like a CPU is not required. Graphics processing units (GPUs), originally termed graphics accelerators, are designed to do an enormous number of these simple calculations simultaneously. To offload the burden from the CPU, computers and game stations have for decades included an auxiliary GPU (“graphics card”) alongside the CPU.
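As a rough sketch of the difference in flavor (the vectorized call here is only a software stand-in for what a GPU does in hardware), compare stepping through data one element at a time with applying the same simple operation to a whole frame’s worth of data in one shot:

```python
# A rough sketch: one simple operation applied element by element (CPU style)
# versus all at once (the GPU flavor). NumPy's vectorized call is only a
# software analogy for the thousands of simple cores on a real GPU.
import numpy as np

pixels = np.random.rand(1920 * 1080)       # stand-in for one frame of pixel data

# One element at a time, the way a single CPU core steps through instructions:
brightened_loop = np.empty_like(pixels)
for i in range(pixels.size):
    brightened_loop[i] = pixels[i] * 1.1

# The same simple operation applied to every element in one shot:
brightened_all_at_once = pixels * 1.1

assert np.allclose(brightened_loop, brightened_all_at_once)
```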

This was the original target for Nvidia. Video gaming was expanding rapidly, and they saw a niche for innovative graphics processors. Unfortunately, the processing architecture they chose to work on fell out of favor, and they skated right up to the edge of going bankrupt. At one point in the mid-1990s, Nvidia was down to 30 days before closing its doors, but at the last moment it got a $5 million loan to keep it afloat. Nvidia clawed its way back from the brink and managed to make and sell a series of popular graphics processors.

However, management had a vision that the massively parallel processing power of their chips could be applied to more exalted uses than rendering blood spatters in Call of Duty. The types of matrix calculations done in GPUs can be used in a wide variety of physical simulations, such as seismology and molecular dynamics. In 2007, Nvidia released its CUDA platform for using GPUs for accelerated general-purpose processing. Since then, Nvidia has promoted the use of its GPUs as general hardware for scientific computing, in addition to the classic graphics applications.

This line of business exploded starting around 2019, with the Bitcoin craze. Cryptocurrencies require enormous amounts of computing power, and these types of calculations are amenable to being performed on massively parallel GPUs. Serious Bitcoin mining companies set up racks of processors built on Nvidia GPUs. GPUs did have serious competition from other types of processors for crypto mining applications, so they did not have the field to themselves. With people stuck at home in 2020-2021, demand for GPUs rose even further: more folks sitting on couches playing video games, and more cloud computing for remote work.

Nvidia Dominates AI Computing

Now the whole world cannot get enough of machine learning and generative AI. And Nvidia chips totally dominate that market. Nvidia supplies not only the hardware (chips) but also a software platform to allow programmers to make use of the chips. With so many programmers and applications standardized now on the Nvidia platform, its dominance and profitability should persist for many years.

Nearly all of Nvidia’s chips are manufactured in Taiwan, which poses a geopolitical risk, not only for Nvidia but for all enterprises that depend on high-end AI processing.