Google’s TPU Chips Threaten Nvidia’s Dominance in AI Computing

Here is a three-year chart of stock prices for Nvidia (NVDA), Alphabet/Google (GOOG), and the generic QQQ tech stock composite:

NVDA has been spectacular. If you had put $20k into NVDA three years ago, it would have turned into nearly $200k. Sweet. Meanwhile, GOOG poked along at the general pace of QQQ. Until…around Sept 1 (yellow line), when GOOG started to pull away from QQQ, and it has not looked back.

And in the past two months, GOOG stock has stomped all over NVDA, as shown in the six-month chart below. The two stocks were neck and neck in early October; then GOOG surged way ahead. In the past month, GOOG is up sharply (red arrow), while NVDA is down significantly:

What is going on? It seems that the market is buying the narrative that Google’s Tensor Processing Unit (TPU) chips are a competitive threat to Nvidia’s GPUs. Last week, we published a tutorial on the technical details here. Briefly, Google’s TPUs are hardwired to perform key AI calculations, whereas Nvidia’s GPUs are more general-purpose. For a range of AI processing, the TPUs are faster and much more energy-efficient than the GPUs.

The greater flexibility of Nvidia’s GPUs, and the programming community’s familiarity with Nvidia’s CUDA programming platform, still give Nvidia a bit of an edge in the AI training phase. But much of that edge fades for the inference (application) uses of AI. For the past few years, the big AI wannabes have focused madly on model training. But there must be a shift to inference (practical implementation) soon, for AI models to actually make money.

All this is a big potential headache for Nvidia. Because of their quasi-monopoly on AI compute, they have been able to charge a huge 75% gross profit margin on their chips. Their customers are naturally not thrilled with this, and have been making some efforts to devise alternatives. But it seems like Google, thanks to a big head start in this area, and very deep pockets, has actually equaled or even beaten Nvidia at its own game.

This explains much of the recent disparity in stock movements. It should be noted, however, that for a quirky business reason, Google is unlikely in the near term to displace Nvidia as the main go-to for AI compute power. The reason is this: most AI compute power is implemented in huge data/cloud centers. And Google is one of the three main cloud vendors, along with Microsoft and Amazon, with IBM and Oracle trailing behind. So, for Google to supply Microsoft and Amazon with its chips and accompanying know-how would be to enable its competitors to compete more strongly.

Also, AI users like, say, OpenAI would be reluctant to commit to running in a Google-owned facility on Google chips, since the user would then be somewhat locked in and held hostage: it would be expensive to switch to a different data center if Google tried to raise prices. In contrast, a user can readily move to a different data center for a better deal if all the centers are using Nvidia chips.

For the present, then, Google is using its TPU technology primarily in-house. The company has a huge suite of AI-adjacent business lines, so its TPU capability does give it genuine advantages there. Reportedly, soul-searching continues in the Google C-suite about how to more broadly monetize its TPUs. It seems likely that they will find a way. 

As usual, nothing here constitutes advice to buy or sell any security.

AI Computing Tutorial: Training vs. Inference Compute Needs, and GPU vs. TPU Processors

A tsunami of sentiment shift is washing over Wall Street, away from Nvidia and towards Google/Alphabet. In the past month, GOOG stock is up a sizzling 12%, while NVDA has plunged 13%, despite producing its usual earnings beat. Today I will discuss some of the technical backdrop to this sentiment shift, which involves the differences between training AI models versus actually applying them to specific problems (“inference”), and the significantly different processing chips involved. Next week I will cover the company-specific implications.

As most readers here probably know, the Large Language Models (LLMs) that underpin the popular new AI products work by sucking in nearly all the text (and now other data) that humans have ever produced, reducing each word or form of a word to a numerical token, and grinding and grinding to discover consistent patterns among those tokens. Layers of (virtual) neural nets are used. The training process involves an insane amount of trying to predict, say, the next word in a sentence scraped from the web, evaluating why the model missed it, and feeding that information back to adjust the matrix of weights on the neural layers, until the model can predict that next word correctly. Then on to the next sentence found on the internet, to work and work until it can be predicted properly. At the end of the day, a well-trained AI chatbot can respond to Bob’s complaint about his boss with an appropriately sympathetic pseudo-human reply like, “It sounds like your boss is not treating you fairly, Bob. Tell me more about…” It bears repeating that LLMs do not actually “know” anything. All they can do is produce a statistically probable word salad in response to prompts. But they can now do that so well that they are very useful.*

This is an oversimplification, but gives the flavor of the endless forward and backward propagation and iteration that is required for model training. This training typically requires running vast banks of very high-end processors, typically housed in large, power-hungry data centers, for months at a time.
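For readers who like to see the shape of this in code, below is a drastically scaled-down sketch of next-word training, written in PyTorch. The tiny corpus, model, and sizes are all invented for illustration; real LLMs use deep transformer stacks and trillions of tokens, but the core loop of predict, measure the miss, and nudge the weights is the same idea.

    # Toy sketch of next-token training (illustrative only; not a real LLM).
    import torch
    import torch.nn as nn

    corpus = "the cat sat on the mat".split()           # stand-in for "all the text on the internet"
    vocab = sorted(set(corpus))
    stoi = {w: i for i, w in enumerate(vocab)}          # word -> numerical token
    tokens = torch.tensor([stoi[w] for w in corpus])

    class TinyNextWordModel(nn.Module):
        def __init__(self, vocab_size, dim=16):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)  # token -> vector
            self.head = nn.Linear(dim, vocab_size)      # vector -> score for each possible next word
        def forward(self, x):
            return self.head(self.embed(x))

    model = TinyNextWordModel(len(vocab))
    optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
    loss_fn = nn.CrossEntropyLoss()

    inputs, targets = tokens[:-1], tokens[1:]           # task: predict each next word
    for step in range(200):                             # "grind and grind"
        logits = model(inputs)                          # forward pass: make predictions
        loss = loss_fn(logits, targets)                 # how badly did we miss?
        optimizer.zero_grad()
        loss.backward()                                 # backward pass: assign blame
        optimizer.step()                                # adjust the matrix of weights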

Once a model is trained (i.e., the neural net weights have been determined), to then run it (i.e., to generate responses based on human prompts) takes considerably less compute power. This is the “inference” phase of generative AI. It still takes a lot of compute to run a big model quickly, but a smaller LLM such as a distilled DeepSeek model can be run, with only modest time lags, on a high-end PC.
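Continuing the toy sketch above (same model, stoi, and vocab), inference is just the forward half of that loop, run over and over with the weights frozen, which is one reason it needs so much less compute than training. Again, this is purely illustrative; real inference engines add batching, caching, and smarter sampling.

    # Toy inference: no more weight updates, just repeated forward passes.
    with torch.no_grad():                               # no gradients needed when generating
        word = "the"
        reply = [word]
        for _ in range(5):
            logits = model(torch.tensor([stoi[word]]))
            word = vocab[int(logits.argmax())]          # greedy: take the highest-scoring next word
            reply.append(word)
    print(" ".join(reply))                              # prints some toy continuation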

GPUs Versus ASIC TPUs

Nvidia has made its fortune by taking graphics processing units (GPUs) that were developed for the massively parallel calculations needed to drive video displays, and adapting them to more general problem solving that could make use of rapid matrix calculations. Nvidia chips and its CUDA platform have been employed for physical simulations such as seismology and molecular dynamics, and then for Bitcoin calculations. When generative AI came along, Nvidia chips and programming tools were the obvious choice for LLM computing needs. The world’s lust for AI compute is so insatiable, and Nvidia has had such a stranglehold, that the company has been able to charge an eye-watering gross profit margin of around 75% on its chips.

AI users of course are trying desperately to get compute capability without having to pay such high prices to Nvidia. It has been hard to mount a serious competitive challenge, though. Nvidia has a commanding lead in hardware and supporting software, and (unlike the Intel of years gone by) keeps forging ahead rather than resting on its laurels.

So far, no one seems to be able to compete strongly with Nvidia in GPUs. However, there is a different chip architecture, which by some measures can beat GPUs at their own game.

Nvidia GPUs are general-purpose parallel processors with high flexibility, capable of handling a wide range of tasks from gaming to AI training, supported by a mature software ecosystem built around CUDA. GPUs beat out the original computer central processing units (CPUs) for these tasks by sacrificing flexibility for the power to do parallel processing of many simple, repetitive operations. The newer “application-specific integrated circuits” (ASICs) take this specialization a step further. They can be custom hard-wired to do specific calculations, such as those required for Bitcoin mining and now for AI. By cutting out steps used by GPUs, especially shuttling data in and out of memory, ASICs can do many AI computing tasks faster and cheaper than Nvidia GPUs, and using much less electric power. That is a big plus, since AI data centers are driving up electricity prices in many parts of the country. The particular type of ASIC that Google uses for AI is called a Tensor Processing Unit (TPU).

I found this explanation by UncoverAlpha to be enlightening:

A GPU is a “general-purpose” parallel processor, while a TPU is a “domain-specific” architecture.

The GPUs were designed for graphics. They excel at parallel processing (doing many things at once), which is great for AI. However, because they are designed to handle everything from video game textures to scientific simulations, they carry “architectural baggage.” They spend significant energy and chip area on complex tasks like caching, branch prediction, and managing independent threads.

A TPU, on the other hand, strips away all that baggage. It has no hardware for rasterization or texture mapping. Instead, it uses a unique architecture called a Systolic Array.

The “Systolic Array” is the key differentiator. In a standard CPU or GPU, the chip moves data back and forth between the memory and the computing units for every calculation. This constant shuffling creates a bottleneck (the Von Neumann bottleneck).

In a TPU’s systolic array, data flows through the chip like blood through a heart (hence “systolic”).

  1. It loads data (weights) once.
  2. It passes inputs through a massive grid of multipliers.
  3. The data is passed directly to the next unit in the array without writing back to memory.

What this means, in essence, is that a TPU, because of its systolic array, drastically reduces the number of memory reads and writes required from HBM. As a result, the TPU can spend its cycles computing rather than waiting for data.
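To make the dataflow idea concrete, here is a purely pedagogical Python sketch of a systolic-style matrix multiply: the weights sit fixed in a grid of multiply-accumulate cells, inputs stream through, and partial sums are handed from cell to cell instead of being written back to memory after every step. This is a sketch of the concept only, not of how a real TPU is actually programmed.

    # Toy simulation of systolic-array dataflow (conceptual, not real TPU code).
    import numpy as np

    def systolic_matmul(weights, inputs):
        """Compute inputs @ weights as if streaming rows through a grid of
        cells, each holding one fixed weight."""
        n_in, n_out = weights.shape
        outputs = np.zeros((inputs.shape[0], n_out))
        for r, row in enumerate(inputs):                 # each input row flows through the grid
            partial = np.zeros(n_out)                    # running sums travel cell to cell
            for i in range(n_in):
                for j in range(n_out):
                    partial[j] += row[i] * weights[i, j] # multiply-accumulate at cell (i, j)
            outputs[r] = partial                         # only the finished result is written out
        return outputs

    W = np.random.randn(4, 3)                            # weights loaded into the array once
    X = np.random.randn(8, 4)                            # activations streamed through
    assert np.allclose(systolic_matmul(W, X), X @ W)     # same answer as an ordinary matmul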

Google has developed the most advanced ASICs for doing AI, which are now, by some measures, a competitive threat to Nvidia. Some implications of this will be explored in a post next week.

*Next-generation AI seeks to step beyond the LLM world of statistical word salads and to model cause and effect at the level of objects and agents in the real world – – see Meta AI Chief Yann LeCun Notes Limits of Large Language Models and Path Towards Artificial General Intelligence.

Standard disclaimer: Nothing here should be considered advice to buy or sell any security.

Michael Burry’s New Venture Is Substack “Cassandra Unchained”: Set Free to Prophesy All-Out Doom on AI Investing

This is a quick follow-up to last week’s post on “Big Short” Michael Burry closing down his Scion Asset Management hedge fund. Burry had teased on X that he would announce his next big thing on Nov 25. It seems he is now a day or two early: Sunday night he launched a paid-subscription “Cassandra Unchained” Substack. There he claims that:

Cassandra Unchained is now Dr. Michael Burry’s sole focus as he gives you a front row seat to his analytical efforts and projections for stocks, markets, and bubbles, often with an eye to history and its remarkably timeless patterns.

Reportedly the subscription cost is $39 a month, or $379 annually, and there are 26,000 subscribers already. Click the abacus and…that comes to a cool $9.9 million a year in subscription fees. Not bad compensation for sharing your musings online.
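For the abacus-averse, the arithmetic, assuming all 26,000 subscribers take the $379 annual plan:

    subscribers = 26_000
    annual_price = 379
    print(subscribers * annual_price)    # 9,854,000 -> roughly $9.9 million a year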

Michael Burry was dubbed “Cassandra” by Warren Buffett in recognition of his prescient warnings about the 2008 housing market collapse, a prophecy that was initially ignored, much like the mythological Cassandra who was fated to deliver true prophecies that were never believed. Burry embraced this nickname, adopting “Cassandra” as his online moniker on social media platforms, symbolizing his role as a lone voice warning of impending financial disaster. On the About page of his new Substack, he wrote that managing clients’ money in a hedge fund like Scion came with restrictions that “muzzled” him, such that he could only share “cryptic fragments” publicly, whereas now he is “unchained.”

Of his first two posts on the new Substack, one was a retrospective on his days as a practicing doctor (a resident in neurology at Stanford Hospital) in 1999-2000. He had done a lot of online posting on investing topics, focusing on valuations, and finally left medicine to start a hedge fund. As he tells it, he called the dot.com bubble before it popped.

Business Insider summarizes Burry’s second post, which attacks the central premise of those who claim the current AI boom is fundamentally different from the 1990s dot.com boom:

The second post aims straight at the heart of the AI boom, which he calls a “glorious folly” that will require investigation over several posts to break down.

Burry goes on to address a common argument about the difference between the dot-com bubble and AI boom — that the tech companies leading the charge 25 years ago were largely unprofitable, while the current crop are money-printing machines.

At the turn of this century, Burry writes, the Nasdaq was driven by “highly profitable large caps, among which were the so-called ‘Four Horsemen’ of the era — Microsoft, Intel, Dell, and Cisco.”

He writes that a key issue with the dot-com bubble was “catastrophically overbuilt supply and nowhere near enough demand,” before adding that it’s “just not so different this time, try as so many might do to make it so.”

Burry calls out the “five public horsemen of today’s AI boom — Microsoft, Google, Meta, Amazon and Oracle” along with “several adolescent startups” including Sam Altman’s OpenAI.

Those companies have pledged to invest well over $1 trillion into microchips, data centers, and other infrastructure over the next few years to power an AI revolution. They’ve forecasted enormous growth, exciting investors and igniting their stock prices.

Shares of Nvidia, a key supplier of AI microchips, have surged 12-fold since the start of 2023, making it the world’s most valuable public company with a $4.4 trillion market capitalization.

“And once again there is a Cisco at the center of it all, with the picks and shovels for all and the expansive vision to go with it,” Burry writes, after noting the internet-networking giant’s stock plunged by over 75% during the dot-com crash. “Its name is Nvidia.”

Tell us how you really feel, Michael. Cassandra, indeed.

My amateur opinion here: I think there is a modest but significant chance that the hyperscalers will not all be able to make enough fresh money to cover their ginormous 2024-2028 investments in AI capabilities. What happens then? Google, Meta, and Amazon may need to write down hundreds of billions of dollars of assets on their balance sheets, which would show up as ginormous hits to GAAP earnings for a number of quarters. But then life would go on just fine for these cash machines, and the market may soon forgive and forget this massive misallocation of old cash, as long as operating cash keeps rolling in as usual. Stocks are, after all, priced on forward earnings. If the AI boom goes bust, all tech stock prices would sag, but I think the biggest operating impact would be on suppliers of chips (like Nvidia) and of data centers (like Oracle). So, Burry’s comparison of 2025 Nvidia to 1999 Cisco seems apt.

META Stock Slides as Investors Question Payout for Huge AI Spend

How’s this for a “battleground” stock:

Meta stock dropped about 13% when its latest quarterly earnings were released, then continued to slide until today’s market exuberance over a potential end to the government shutdown. What is the problem?

Meta has invested enormous sums in AI development already, and has committed to invest even more in the future. It is currently plowing some 65% (!!) of its cash flow into AI, with no near-term prospects of making big profits there. CEO Mark Zuckerberg has a history of spending big on the Next Big Thing, which then fizzles. Meta’s earnings have historically been so high that he could throw away a few billion here and there and nobody cared. But now (up to $800 billion of capex spend through 2028) we are talking real money.

Up till now, Big Tech has been able to finance its investments entirely out of cash flow, but Meta (like its peers) has started issuing debt to pay for some of the AI spend. Leverage is a two-edged sword – – if you can borrow a ton of money (up to $30 billion here) at say 5%, and invest it in something that returns 10%, that is glorious. Rah, capitalism! But if the payout is not there, you are hosed.
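The leverage arithmetic, using the illustrative 5% and 10% figures above:

    debt = 30e9                                  # ~$30 billion of new borrowing
    interest = 0.05 * debt                       # ~$1.5B a year owed no matter what
    good_year = 0.10 * debt - interest           # +$1.5B if the AI spend really returns 10%
    bad_year = 0.00 * debt - interest            # -$1.5B if the payout never shows up
    print(good_year / 1e9, bad_year / 1e9)       # 1.5 -1.5 (in billions)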

Another ugly issue lurking in the shadows is Meta’s dependence on scam ads for some 10% of its ad revenues. Reuters released a horrifying report last week detailing how Meta deliberately slow-walks or ignores legitimate complaints about false advertising and even more nefarious misuses of Facebook. Chilling specific anecdotes abound, but they seem to be part of a pattern of Meta choosing not to aggressively curtail known fraud, because doing so would cut into their revenue. They focus their enforcement efforts in regions where their hands are likely to be slapped hardest by regulators, while continuing to let advertisers defraud users wherever they can get away with it:

…Meta has internally acknowledged that regulatory fines for scam ads are certain, and anticipates penalties of up to $1 billion, according to one internal document.

But those fines would be much smaller than Meta’s revenue from scam ads, a separate document from November 2024 states. Every six months, Meta earns $3.5 billion from just the portion of scam ads that “present higher legal risk,” the document says, such as those falsely claiming to represent a consumer brand or public figure or demonstrating other signs of deceit. That figure almost certainly exceeds “the cost of any regulatory settlement involving scam ads.”

Rather than voluntarily agreeing to do more to vet advertisers, the same document states, the company’s leadership decided to act only in response to impending regulatory action.

Thus, the seamy underside of capitalism. And this:

…The company only bans advertisers if its automated systems predict the marketers are at least 95% certain to be committing fraud, the documents show. If the company is less certain – but still believes the advertiser is a likely scammer – Meta charges higher ad rates as a penalty, according to the documents. 

So…if Meta is 94% (but not 95%) sure that an ad is a fraud, they will still let it run, but just charge more for it. Sweet. Guess that sort of thinking is why Zuck is worth $250 billion, and I’m not.
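In rough pseudocode, the enforcement logic as described in the Reuters reporting looks something like the sketch below. The 95% threshold comes from the reporting; the cutoff for a “likely scammer” and the size of the rate penalty were not disclosed, so those numbers are placeholders.

    # Sketch of the reported policy; values marked ASSUMPTION are placeholders.
    def handle_advertiser(fraud_probability, base_ad_rate):
        if fraud_probability >= 0.95:                 # "at least 95% certain" -> banned
            return "banned", None
        if fraud_probability >= 0.80:                 # ASSUMPTION: undisclosed "likely scammer" cutoff
            return "allowed", base_ad_rate * 1.5      # ASSUMPTION: undisclosed penalty rate
        return "allowed", base_ad_rate

    print(handle_advertiser(0.94, 100))               # ('allowed', 150.0): runs anyway, at a higher price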

But never fear, Meta’s P/E is the lowest of the Mag 7 group, so maybe it is a buy after all:


As usual, nothing here should be considered advice to buy or sell any security.

Circular AI Deals Reminiscent of Disastrous Dot.Com Vendor Financing of the 1990s

Hey look, I just found a way to get infinite free electric power:

This sort of extension-cord-plugged-into-itself meme has shown up recently on the web to characterize a spate of circular financing deals in the AI space, largely involving OpenAI (maker of ChatGPT). Here is a graphic from Bloomberg which summarizes some of these activities:

Nvidia, which makes LOTS of money selling near-monopoly, in-demand GPU chips, has made investment commitments in customers, or customers of its customers. Notably, Nvidia will invest up to $100 billion in OpenAI, in order to help OpenAI increase its compute power. OpenAI in turn inked a $300 billion deal with Oracle for building more data centers filled with Nvidia chips. Such deals will certainly boost sales of Nvidia’s chips (and make Nvidia even more money), but they also raise a number of concerns.

First, they make it seem like there is more demand for AI than there actually is. Short seller Jim Chanos recently asked, “[Don’t] you think it’s a bit odd that when the narrative is ‘demand for compute is infinite’, the sellers keep subsidizing the buyers?” To some extent, all this churn is just Nvidia recycling its own money, as opposed to new value being created.

Second, analysts point to the destabilizing effect of these sorts of “vendor financing” arrangements. Towards the end of the great dot.com boom in the late 1990s, hardware vendors like Cisco were making gobs of money selling networking gear to internet service providers (ISPs). In order to help the ISPs build out even faster (and purchase even more Cisco hardware), Cisco loaned money to the ISPs. But when that boom went bust, and the huge overbuild in internet capacity became (to everyone’s horror) apparent, the ISPs could not pay back those loans. QQQ lost over 70% of its value. Twenty-five years later, Cisco’s stock price has never recovered its 2000 high.

Besides taking in cash investments, OpenAI is borrowing heavily to buy its compute capacity. Since OpenAI makes no money now (in fact it loses billions a year), will likely not make any money for several more years (like other AI ventures), and is locked in competition with other deep-pocketed AI ventures, there is the possibility that it could pull down the whole house of cards, as happened in 2000. Bernstein analyst Stacy Rasgon recently wrote, “[OpenAI CEO Sam Altman] has the power to crash the global economy for a decade or take us all to the promised land, and right now we don’t know which is in the cards.”

For the moment, nothing seems set to stop the tidal wave of spending on AI capabilities. Big tech is flush with cash, and is plowing it into data centers and program development. Everyone is starry-eyed with the enormous potential of AI to change, well, EVERYTHING (shades of 1999).

The financial incentives are gigantic. Big tech got big by establishing quasi-monopolies on services that consumers and businesses consider must-haves. (It is the quasi-monopoly aspect that enables the high profit margins).  And it is essential to establish dominance early on. Anyone can develop a word processor or spreadsheet that does what Word or Excel do, or a search engine that does what Google does, but Microsoft and Google got there first, and preferences are sticky. So, the big guys are spending wildly, as they salivate at the prospect of having the One AI to Rule Them All.

Even apart from achieving some new monopoly, the trillions of dollars spent on data center buildout are hoped to pay out one way or the other: “The data-center boom would become the foundation of the next tech cycle, letting Amazon, Microsoft, Google, and others rent out intelligence the way they rent cloud storage now. AI agents and custom models could form the basis of steady, high-margin subscription products.”

However, if in 2-3 years it turns out that actual monetization of AI continues to be elusive, as seems quite possible, there could be a Wile E. Coyote moment in the markets:

Shift in AI Usage from Productivity to Personal Therapy: Hazard Ahead

A couple of days ago I spoke with a friend who was troubled by the case of Adam Raine, the sixteen-year-old who was reportedly counseled by ChatGPT, acting as an AI therapy chatbot, into killing himself. That was of course extremely tragic, but I hoped it was kind of an outlier. Then I heard on a Bloomberg business podcast that the number one use for AI now is personal therapy. Being a researcher, I had to check this claim.

So here is an excerpt from a visual presentation of an analysis done by Marc Zao-Sanders for Harvard Business Review. He examined thousands of forum posts over the last year in a follow-up to his 2024 analysis to estimate uses of AI. To keep it tractable, I just snipped an image of the first six categories:

It’s true: Last year the most popular uses were spread across a variety of categories, but in 2025 the top use was “Therapy & Companionship”, followed by related uses of “Organize Life” and “Find Purpose”. Two of the top three uses in 2024, “Generate Ideas” and “Specific Search”, were aimed at task productivity (loosely defined), whereas in 2025 the top three uses were all for personal support.

Huh. People used to have humans in their lives known as friends or buddies or girlfriends/boyfriends or whatever. Back in the day, say 200 or 2,000 or 200,000 or 2,000,000 years ago, it seems a basic unit was the clan or village or extended kinship group. As I understand it, in a typical English village the men would drift into the pub most Friday and Saturday nights and banter and play darts over a pint of beer. You were always in contact with peers or cousins or aunts/uncles or grandmothers/grandfathers who would take an interest in you, and who might be a few years or more ahead of you in life. These were folks you could bounce your thoughts around with, who could help you sort out what is real. The act of relating to another human being seems to be essential in shaping our psyches. The alternative is appropriately termed “attachment disorder.”

The decades-long decline in face-to-face social interactions in the U.S. has been the subject of much commentary. A landmark study in this regard was Robert Putnam’s 1995 essay, “Bowling Alone: America’s Declining Social Capital”, which he then expanded into a 2000 book. The causes and results of this trend are beyond the scope of this blog post.

The essence of the therapeutic enterprise is the forming of a relational human-to-human bond. The act of looking into another person’s eyes, and there sensing acceptance and understanding, is irreplaceable.

But imagine your human conversation partner faked sympathy but in fact was just using you.  He or she could string you along by murmuring the right reflective phrases (“Tell me more about …”,  “Oh, that must have been hard for you”, blah, blah, blah) but with the goal of getting money from you or turning you towards being an espionage partner. This stuff goes on all the time in real life.

The AI chatbot case is not too different from this. Most AI purveyors are ultimately in it for the money, so they are using you. And the chatbot does not, cannot care about you. It is just a complex software algorithm, embedded in silicon chips. To a first approximation, LLMs simply spit out a probabilistic word salad in response to prompts. That is it. They do not “know” anything, and they certainly do not feel anything.

Here is what my Brave browser embedded AI has to say about the risks of using AI for therapy:

Using AI chatbots for therapy poses significant dangers, including the potential to reinforce harmful thoughts, fail to recognize crises like suicidal ideation, and provide unsafe or inappropriate advice, according to recent research and expert warnings. A June 2025 Stanford study found that popular therapy chatbots exhibit stigmatizing biases against conditions like schizophrenia and alcohol dependence, and in critical scenarios, they have responded to indirect suicide inquiries with irrelevant information, such as bridge heights, potentially facilitating self-harm. These tools lack the empathy, clinical judgment, and ethical framework of human therapists, and cannot ensure user safety or privacy, as they are not bound by regulations like HIPAA.

  • AI chatbots cannot provide a medical diagnosis or replace human therapists for serious mental health disorders, as they lack the ability to assess reality, challenge distorted thinking, or ensure safety during a crisis.
  • Research shows that AI systems often fail to respond appropriately to mental health crises, with one study finding they responded correctly less than 60% of the time compared to 93% for licensed therapists.
  • Chatbots may inadvertently validate delusional or paranoid thoughts, creating harmful feedback loops, and have been observed to encourage dangerous behaviors, such as promoting restrictive diets or failing to intervene in suicidal ideation.
  • There is a significant risk of privacy breaches, as AI tools are not legally required to protect user data, leaving sensitive mental health information vulnerable to exposure or misuse.
  • The lack of human empathy and the potential for emotional dependence on AI can erode real human relationships and worsen feelings of isolation, especially for vulnerable individuals.
  • Experts warn that marketing AI as a therapist is deceptive and dangerous, as these tools are not licensed providers and can mislead users into believing they are receiving professional care.

I couldn’t have put it better myself.

Bears and Bulls Battle Over Nvidia Stock Price

Nvidia is a huge battleground stock – – some analysts predict its price will languish or crash, while others see it continuing its dramatic rise. It has become the world’s most valuable company by market capitalization.  Here I will summarize the arguments of one bear and one bull from the investing site Seeking Alpha.

In this corner…semi-bear Lawrence Fuller. I respect his opinions in general. While the macro prospects have turned him more cautious in the past few months, for the past three years or so he has been relentlessly and correctly bullish (again based on macro), when many other voices were muttering doom/gloom.  

Fuller’s article is titled Losing Speed On The AI Superhighway. This dramatic chart supports the case that NVDA is overvalued:

This chart shows that the stock value of Nvidia has soared past the value of the entire UK stock market and past the entire value of US energy companies. Fuller reminds us of the parallel with Cisco in 2000. Back then, Cisco was a key supplier of networking technology for all the companies scrambling to get into this hot new thing called the internet. Cisco’s valuation went to the moon, then crashed and burned when the mania around the internet subsided to a more sober set of applications. Cisco lost over 70% of its value in a year, and still has not regained the share price it had 25 years ago:

… [Nvidia] is riding a cycle in which investment becomes overinvestment, because that is what we do in every business cycle. It happened in the late 1990s and it will happen again this time.

…there are innumerable startups of all kinds, as well as existing companies, venturing into AI in a scramble to compete for any slice of market share. This is a huge source of Nvidia’s growth as the beating heart of the industry, similar to how Cisco Systems exploded during the internet infrastructure boom. Inevitably, there will be winners and losers. There will be far more losers than winners. When the losers go out of business or are acquired, Nvidia’s customer base will shrink and so will their revenue and earnings growth rates. That is what happened during the internet infrastructure booms of the late 1990s.

Fuller doesn’t quite say Nvidia is overvalued, just that its P/E is unlikely to expand further; hence any further stock price increases will have to be produced the old-fashioned way, by actual earnings growth. There are more bearish views than Fuller’s; I chose his because it was measured.
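Put as arithmetic: a stock price is the P/E multiple times earnings per share, so if the multiple stays flat, the price can only grow as fast as earnings do. The numbers below are made up purely for illustration.

    pe = 40.0                                  # assume the multiple stops expanding
    eps_now, eps_next = 4.00, 5.00             # hypothetical 25% earnings growth
    price_now = pe * eps_now                   # 160
    price_next = pe * eps_next                 # 200: a ~25% gain, all of it from earnings growth
    print(price_now, price_next)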

And on behalf of the bulls, here is noob Weebler Finance, telling us that Nvidia Will Never Be This Cheap Again: The AI Revolution Has Just Begun:

AI adoption isn’t happening in a single sequence; it’s actually unfolding across multiple industries and use cases simultaneously. Because of these parallel market build-outs, hyper-scalers, sovereign AI, enterprises, robotics, and physical AI are all independently contributing to the infrastructure surge.

…Overall, I believe there are clear signs that indicate current spending on AI infrastructure is similar to the early innings of prior technology buildouts like the internet or cloud computing. In both those cases, the first waves of investment were primarily about laying the foundation, while true value creation and exponential growth came years later as applications multiplied and usage scaled.

As a pure picks and shovels play, Nvidia stands to capture the lion’s share of this foundational build-out because its GPUs, networking systems, and software ecosystem have become the de facto standard for accelerated computing. Its GPUs lead in raw performance, energy efficiency, and scalability. We clearly see this with the GB300 delivering 50x per-token efficiency following its launch. Its networking stack has become indispensable, with the Spectrum-X Ethernet already hitting a $10b annualized run rate and NVLink enabling scaling beyond PCIe limits. Above all, Nvidia clearly shows a combined stack advantage, which positions it to become the dominant utility provider of AI compute.

… I believe that Nvidia at its current price of ~$182, is remarkably cheap given the value it offers. Add to this the strong secular tailwinds the company faces and its picks-and-shovels positioning, and the value proposition becomes all the more undeniable.

My view: Out of sheer FOMO, I hold a little NVDA stock directly, and much more by participating in various funds (e.g. QQQ, SPY), nearly all of which hold a bunch of NVDA. I have hedged some by selling puts and covered calls that net me about 20% in twelve months, even if the stock price does not go up. Nvidia’s P/E (~40) is on the high side, but not really when considering the growth rate of the company. It seems to me that the bulk of the AI spend is by the four AI “hyperscalers” (Google, Meta, Amazon, Microsoft). They make bazillions of dollars on their regular (non-AI) businesses, and so they have plenty of money to burn in purchasing Nvidia chips. If they ever slow their spend, it will be time to reconsider Nvidia stock. But there should be plenty of warning of that, and probably no near-term crisis: last time I checked, Nvidia production was sold out a full year ahead of time. I have no doubt that their sales revenue will continue to increase. But earnings will depend on how long they can continue to command their stupendous c. 50% net profit margin (if this were an oil company, imagine the howls of “price gouging”).
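For what it is worth, one rough way to express “a P/E of ~40 is not crazy given the growth” is the PEG ratio, the P/E divided by the expected earnings growth rate in percent. The growth figure below is a placeholder assumption, not a forecast.

    pe = 40
    expected_growth_pct = 50                   # ASSUMPTION: placeholder forward earnings growth rate
    peg = pe / expected_growth_pct             # PEG below 1 is conventionally read as cheap for the growth
    print(peg)                                 # 0.8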

As usual, nothing here should be considered advice to buy or sell any security.

Meta AI Chief Yann LeCun Notes Limits of Large Language Models and Path Towards Artificial General Intelligence

We noted last week Meta’s successful efforts to hire away the best of the best AI scientists from other companies, by offering them insane (like $300 million) pay packages. Here we summarize and excerpt an excellent article in Newsweek by Gabriel Snyder who interviewed Meta’s chief AI scientist, Yann LeCun. LeCun discusses some inherent limitations of today’s Large Language Models (LLMs) like ChatGPT. Their limitations stem from the fact that they are based mainly on language; it turns out that human language itself is a very constrained dataset.  Language is readily manipulated by LLMs, but language alone captures only a small subset of important human thinking:

Returning to the topic of the limitations of LLMs, LeCun explains, “An LLM produces one token after another. It goes through a fixed amount of computation to produce a token, and that’s clearly System 1—it’s reactive, right? There’s no reasoning,” a reference to Daniel Kahneman’s influential framework that distinguishes between the human brain’s fast, intuitive method of thinking (System 1) and the method of slower, more deliberative reasoning (System 2).

The limitations of this approach become clear when you consider what is known as Moravec’s paradox—the observation by computer scientist and roboticist Hans Moravec in the late 1980s that it is comparatively easier to teach AI systems higher-order skills like playing chess or passing standardized tests than seemingly basic human capabilities like perception and movement. The reason, Moravec proposed, is that the skills derived from how a human body navigates the world are the product of billions of years of evolution and are so highly developed that they can be automated by humans, while neocortical-based reasoning skills came much later and require much more conscious cognitive effort to master. However, the reverse is true of machines. Simply put, we design machines to assist us in areas where we lack ability, such as physical strength or calculation.

The strange paradox of LLMs is that they have mastered the higher-order skills of language without learning any of the foundational human abilities. “We have these language systems that can pass the bar exam, can solve equations, compute integrals, but where is our domestic robot?” LeCun asks. “Where is a robot that’s as good as a cat in the physical world? We don’t think the tasks that a cat can accomplish are smart, but in fact, they are.”

This gap exists because language, for all its complexity, operates in a relatively constrained domain compared to the messy, continuous real world. “Language, it turns out, is relatively simple because it has strong statistical properties,” LeCun says. It is a low-dimensionality, discrete space that is “basically a serialized version of our thoughts.”  

[Bolded emphases added]

Broad human thinking involves hierarchical models of reality, which get constantly refined by experience:

And, most strikingly, LeCun points out that humans are capable of processing vastly more data than even our most data-hungry advanced AI systems. “A big LLM of today is trained on roughly 10 to the 14th power bytes of training data. It would take any of us 400,000 years to read our way through it.” That sounds like a lot, but then he points out that humans are able to take in vastly larger amounts of visual data.

Consider a 4-year-old who has been awake for 16,000 hours, LeCun suggests. “The bandwidth of the optic nerve is about one megabyte per second, give or take. Multiply that by 16,000 hours, and that’s about 10 to the 14th power in four years instead of 400,000.” This gives rise to a critical inference: “That clearly tells you we’re never going to get to human-level intelligence by just training on text. It’s never going to happen,” LeCun concludes…

This ability to apply existing knowledge to novel situations represents a profound gap between today’s AI systems and human cognition. “A 17-year-old can learn to drive a car in about 20 hours of practice, even less, largely without causing any accidents,” LeCun muses. “And we have millions of hours of training data of people driving cars, but we still don’t have self-driving cars. So that means we’re missing something really, really big.”

Like Brooks, who emphasizes the importance of embodiment and interaction with the physical world, LeCun sees intelligence as deeply connected to our ability to model and predict physical reality—something current language models simply cannot do. This perspective resonates with David Eagleman’s description of how the brain constantly runs simulations based on its “world model,” comparing predictions against sensory input. 

For LeCun, the difference lies in our mental models—internal representations of how the world works that allow us to predict consequences and plan actions accordingly. Humans develop these models through observation and interaction with the physical world from infancy. A baby learns that unsupported objects fall (gravity) after about nine months; they gradually come to understand that objects continue to exist even when out of sight (object permanence). He observes that these models are arranged hierarchically, ranging from very low-level predictions about immediate physical interactions to high-level conceptual understandings that enable long-term planning.

[Emphases added]
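A quick back-of-the-envelope check of the optic-nerve arithmetic quoted above:

    bytes_per_second = 1e6                     # ~1 MB/s, LeCun's round number for the optic nerve
    hours_awake = 16_000                       # a four-year-old's waking hours, per the quote
    visual_bytes = bytes_per_second * hours_awake * 3600
    print(f"{visual_bytes:.1e}")               # ~5.8e13, i.e. on the order of 10^14 bytes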

(Side comment: As an amateur reader of modern philosophy, I cannot help noting that these observations about the importance of recognizing there is a real external world and adjusting one’s models to match that reality call into question the epistemological claim that “we each create our own reality”.)

Given all this, the next generation of artificial intelligence must, like human intelligence, embed layers of working models of the world:

So, rather than continuing down the path of scaling up language models, LeCun is pioneering an alternative approach of Joint Embedding Predictive Architecture (JEPA) that aims to create representations of the physical world based on visual input. “The idea that you can train a system to understand how the world works by training it to predict what’s going to happen in a video is a very old one,” LeCun notes. “I’ve been working on this in some form for at least 20 years.”

The fundamental insight behind JEPA is that prediction shouldn’t happen in the space of raw sensory inputs but rather in an abstract representational space. When humans predict what will happen next, we don’t mentally generate pixel-perfect images of the future—we think in terms of objects, their properties and how they might interact

This approach differs fundamentally from how language models operate. Instead of probabilistically predicting the next token in a sequence, these systems learn to represent the world at multiple levels of abstraction and to predict how their representations will evolve under different conditions.

And so, LeCun is strikingly pessimistic on the outlook for breakthroughs in the current LLMs like ChatGPT. He believes LLMs will be largely obsolete within five years, except for narrower purposes, and so he tells upcoming AI scientists not to even bother with them:

His belief is so strong that, at a conference last year, he advised young developers, “Don’t work on LLMs. [These models are] in the hands of large companies, there’s nothing you can bring to the table. You should work on next-gen AI systems that lift the limitations of LLMs.”

This approach seems to be at variance with that of other firms, which continue to pour tens of billions of dollars into LLMs. Meta, however, seems focused on next-generation AI, and CEO Mark Zuckerberg is putting his money where his mouth is.

Meta Is Poaching AI Talent With $100 Million Pay Packages; Will This Finally Create AGI?

This month I have run across articles noting that Meta’s Mark Zuckerberg has been making mind-boggling pay offers (like $100 million/year for 3-4 years) to top AI researchers at other companies, plus the promise of huge resources and even (gasp) personal access to Zuck himself. Reports indicate that he is succeeding in hiring around 50 brains from OpenAI (home of ChatGPT), Anthropic, Google, and Apple. Maybe this concentration of human intelligence will result in the long-craved artificial general intelligence (AGI) finally being realized; there seems to be some recognition that the current Large Language Models will not get us there.

There are, of course, other interpretations being put on this maneuver. Some talking heads on a Bloomberg podcast speculated that Zuckerberg was deliberately using Meta’s mighty cash flow to starve competitors of top AI talent. They also speculated that (since there is a limit to how much money you can possibly, pleasurably spend) if you pay some guy $100 million in a year, a rational outcome would be for him to quit and spend the rest of his life hanging out at the beach. (That, of course, is what Bloomberg finance types might think, since they measure worth mainly in terms of money, not in the fun of doing cutting-edge R&D.)

I found a thread on Reddit to be insightful and amusing, and so I post chunks of it below. Here is the earnest, optimistic OP:

andsi2asi

Zuckerberg’s ‘Pay Them Nine-Figure Salaries’ Stroke of Genius for Building the Most Powerful AI in the World

Frustrated by Yann LeCun’s inability to advance Llama to where it is seriously competing with top AI models, Zuckerberg has decided to employ a strategy that makes consummate sense.

To appreciate the strategy in context, keep in mind that OpenAI expects to generate $10 billion in revenue this year, but will also spend about $28 billion, leaving it in the red by about $18 billion. My main point here is that we’re talking big numbers.

Zuckerberg has decided to bring together 50 ultra-top AI engineers by enticing them with nine-figure salaries. Whether they will be paid $100 million or $300 million per year has not been disclosed, but it seems like they will be making a lot more in salary than they did at their last gig with Google, OpenAI, Anthropic, etc.

If he pays each of them $100 million in salary, that will cost him $5 billion a year. Considering OpenAI’s expenses, suddenly that doesn’t sound so unreasonable.

I’m guessing he will succeed at bringing this AI dream team together. It’s not just the allure of $100 million salaries. It’s the opportunity to build the most powerful AI with the most brilliant minds in AI. Big win for AI. Big win for open source

And here are some wry responses:

kayakdawg

counterpoint 

a. $5B is just for those 50 researchers, loootttaaa other costs to consider

b. zuck has a history of burning big money on r&d with theoretical revenue that doesnt materialize

c. brooks law: creating agi isn’t an easily divisible job – in fact, it seems reasonable to assume that the more high-level experts enter the project the slower it’ll progress given the communication overhead

7FootElvis

Exactly. Also, money alone doesn’t make leadership effective. OpenAI has a relatively single focus. Meta is more diversified, which can lead to a lack of necessary vision in this one department. Passion, if present at the top, is also critical for bleeding edge advancement. Is Zuckerberg more passionate than Altman about AI? Which is more effective at infusing that passion throughout the organization?

….

dbenc

and not a single AI researcher is going to tell Zuck “well, no matter how much you pay us we won’t be able to make AGI”

meltbox

I will make the AI by one year from now if I am paid $100m

I just need total blackout so I can focus. Two years from now I will make it run on a 50w chip.

I promise

My Perfunctory Intern

A couple of years ago, my co-blogger Mike described his productive but novice intern. The helper could summarize expert opinion, but they had no real understanding of their own. To boot, they were fast and tireless. Of course, he was talking about ChatGPT. Joy has also written in multiple places about the errors made by ChatGPT, including fake citations.

I use ChatGPT Pro, which has Web access, and my experience is that it is not so tireless. Much like Mike, I have used ChatGPT to help me write Python code. I know the basics of Python, and how to read a lot of it. However, the multitude of methods and possible arguments are not nestled firmly in my skull. I’m much faster at reading Python code than at writing it. Therefore, ChatGPT has been amazing… Mostly.

I have found that ChatGPT is more like an intern than many suppose:

Continue reading