
Fountainhead Investing

  • Objective Analysis: Research On High Quality Companies With Sustainable Moats
  • Tech Focused: 60% Allocated To AI, Semiconductors, Technology

5 Star Tech Analyst Focused On Excellent Companies With Sustainable Moats

Categories: AI, Industry, Semiconductors, Stocks

Nvidia GTC Keynote – CEO Jensen Huang

Quick Key Takeaways: Worth every minute. I own shares and will add on declines.

Incredible product road map:

Blackwells are in full production, and the Blackwell NV72 is expected in 2H2025. Some may quibble about a “delay,” since investors expected strong NV36 and NV72 sales in Q1 or Q2, but it hardly makes a difference in the long run. In the worst-case scenario, the stock could drop 10-15%, but that should attract buying unless other macroeconomic uncertainties cause a continued slump in the market/economy. I would back up the truck at $100.

Rubin, which is the next series of GPU systems, will be available in the second half of 2026 – again a massive leap in performance.

Nvidia (NVDA) has a one-year upgrade cadence:

a) Nobody else has that.

b) It’s across the board: GPUs, racks, networking, storage, and packaging, spanning the whole ecosystem of partners.

Nvidia’s market leadership is going to last a while – That is my main investing thesis, and I can withstand the short-term bumps.

Cost Analysis: I’m glad Jensen spoke about this in more detail. What stood out for me was a clear-cut analysis of reducing variable costs as their GPU systems get smarter and more efficient, bringing total cost of ownership (TCO) down. Generating tokens to answer queries is horrendously expensive for large language models like ChatGPT; it has been a financial black hole.

I expect costs to come down significantly, making the business model viable. Customers are expected to pay up to $3Mn for the Blackwell NV72, and it has to become profitable for them.
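To make the TCO argument concrete, here’s a back-of-the-envelope sketch. Every input below (rack price, power draw, electricity rate, utilization, token throughput) is my own illustrative assumption, not a figure from the keynote; the point is only the shape of the math.

```python
# Back-of-the-envelope TCO sketch for a GPU rack serving LLM queries.
# All inputs are illustrative assumptions, not Nvidia or keynote figures.

rack_price_usd = 3_000_000        # assumed purchase price of an NV72-class rack
useful_life_years = 4             # assumed depreciation period
power_draw_kw = 120               # assumed average rack power draw
electricity_usd_per_kwh = 0.08    # assumed data-center electricity rate
utilization = 0.60                # assumed fraction of time serving traffic
tokens_per_second = 500_000       # assumed aggregate rack throughput (tokens/s)

hours_per_year = 24 * 365
capex_per_year = rack_price_usd / useful_life_years
energy_per_year = power_draw_kw * hours_per_year * electricity_usd_per_kwh
total_cost_per_year = capex_per_year + energy_per_year

tokens_per_year = tokens_per_second * utilization * hours_per_year * 3600
cost_per_million_tokens = total_cost_per_year / (tokens_per_year / 1_000_000)

print(f"Annual cost:            ${total_cost_per_year:,.0f}")
print(f"Tokens served per year: {tokens_per_year:,.0f}")
print(f"Cost per 1M tokens:     ${cost_per_million_tokens:.4f}")
```

With these made-up numbers, if a new generation roughly doubles tokens per second at a similar price, the cost per million tokens roughly halves; that falling variable cost is exactly the lever Jensen was emphasizing.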

Omniverse, Cosmos, and Robotics are the other focus areas that go beyond data centers. Nvidia needs other target markets, industrials, factories, automakers, and oil and gas companies, to embrace AI and therefore use its GPUs, reducing Nvidia’s dependence on the hyperscalers, and Jensen spent a lot of time on them. He also emphasized enterprise software partnerships for AI, to gain full acceptance as a ubiquitous product and earn extra revenue. For Nvidia’s vision of alternate intelligent computing to play out, we have to see more Palantirs, ServiceNows, and AppLovins. In my opinion, Agentic AI will make serious inroads in 12-15 months.

I’ll add more detail in another note, once I parse the transcript in detail.

Bottom line – this is going to be an incredible journey, and we’re just at the beginning. Sure, it’s going to be a bumpy ride, and given the macro environment, it would be prudent to manage risk by waiting for the right entry, taking profits when the stock is overvalued or overbought, or, if you have the expertise, using other hedging mechanisms. I’m sold on Jensen’s idea of a new paradigm of accelerated and intelligent computing based on GPUs and agents, and Nvidia is best positioned to take full advantage of it.

Categories: AI, Cloud Service Providers, Industry, Semiconductors, Stocks

DeepSeek Hasn’t Deep-Sixed Nvidia

01/28/2025

Here is my understanding of the DeepSeek breakthrough and its repercussions on the AI ecosystem.

DeepSeek used “time scaling” effectively, which allows its r1 model to think deeper at the inference phase. Instead of producing an answer immediately, the model spends more compute and takes longer to work toward a better solution before answering the query, often better than existing models.
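To be clear, r1’s actual method (as DeepSeek describes it) involves long chain-of-thought reasoning trained with reinforcement learning; the toy sketch below only illustrates the general “spend more compute at inference time” idea using a simple best-of-N search, where every function is a hypothetical stand-in rather than a real model.

```python
import random

# Minimal sketch of "time scaling": spending more compute at inference time.
# This is NOT DeepSeek's actual method; it only illustrates that sampling or
# searching more at answer time can buy quality without retraining the model.

def generate_candidate(prompt: str) -> str:
    # stand-in for one forward pass of a model; here just a random guess
    return f"answer-{random.randint(0, 99)}"

def score(prompt: str, answer: str) -> float:
    # stand-in for a verifier / reward model that rates an answer
    return -abs(int(answer.split("-")[1]) - 42)   # pretend 42 is "correct"

def answer(prompt: str, inference_budget: int) -> str:
    # the larger the budget, the more candidates we explore before answering
    candidates = [generate_candidate(prompt) for _ in range(inference_budget)]
    return max(candidates, key=lambda c: score(prompt, c))

print("cheap  :", answer("What is 6 x 7?", inference_budget=1))
print("scaled :", answer("What is 6 x 7?", inference_budget=64))
```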

How did the model get to that level of efficiency?

DeepSeek used a lot of interesting and effective techniques to make better use of its resources, and this article from NextPlatform does an excellent job with the details.

Besides effective time scaling, the model distilled answers from other models, including ChatGPT’s models.
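For readers unfamiliar with the term, distillation means training a smaller “student” model to reproduce the outputs of a stronger “teacher” model. The sketch below is a generic, minimal illustration of that idea with toy numbers, not DeepSeek’s actual pipeline.

```python
import numpy as np

# Minimal sketch of knowledge distillation: a "student" model is trained to
# match the output distribution of a "teacher" model instead of (or alongside)
# hard labels. Generic illustration only, not DeepSeek's actual pipeline.

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between the teacher's softened distribution and the student's
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student))))

teacher = np.array([4.0, 1.0, 0.5])   # teacher strongly prefers token 0
student = np.array([1.0, 1.0, 1.0])   # untrained student is uniform

print("loss before training:          ", distillation_loss(student, teacher))
print("loss if student copies teacher:", distillation_loss(teacher, teacher))
```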

What does that mean for the future of AGI, AI, ASI, and so on?

Time scaling will be adopted more widely, and tech leaders across Silicon Valley are already responding, trying to improve their own methods as cost-effectively as possible. That is the logical next step: for AI to be any good, superior inference was always going to be the differentiator and the source of added value.

Time scaling can be done at the edge as the software gets smarter.

If the software gets smarter, will it require more GPUs?

I think GPU requirements will not diminish, because you still need GPUs for training and for time scaling, and smarter software will still need to distill data.

Cheaper LLMs are not a plug-and-play replacement. They will still require significant investment and expertise to train and to turn into an effective inference model. Aiming at a 10x reduction in cost is a good target, but it will compromise quality and performance. Eventually, the lower-tier market will get crowded and commoditized – democratized, if you will – which may require cheaper versions of hardware and architectures from AI chip designers, as an opportunity to serve lower-tier customers.

Inferencing

Over time, yes, inference will become more important – Nvidia has long been talking about the scaling law, which diminishes the role of training and raises the need for smarter inference. They are working on this as well; I even suspect that the $3,000 Digits machine they showcased for edge computing will provide some of the power needed.

Reducing variable cost per token/query is huge: it is a major boon to the AI industry, because previously the cost of retrieving and generating the tokens to answer a user’s queries could exceed that user’s entire monthly subscription to ChatGPT or Gemini.
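A quick illustration of why that matters for subscription economics; the subscription price, tokens per query, and query volume below are all my own assumptions, not disclosed figures.

```python
# Illustration of why per-token inference cost matters for subscription economics.
# All numbers are assumptions for illustration, not disclosed figures.

subscription_usd_per_month = 20.0      # roughly what consumer chatbot plans charge
tokens_per_query = 1_500               # assumed prompt + response tokens
queries_per_month = 600                # assumed heavy user

def monthly_margin(cost_per_million_tokens: float) -> float:
    tokens = tokens_per_query * queries_per_month
    serving_cost = tokens / 1_000_000 * cost_per_million_tokens
    return subscription_usd_per_month - serving_cost

for cost in (60.0, 15.0, 2.0):         # $/1M tokens: expensive -> cheap inference
    print(f"${cost:>5.2f}/1M tokens -> margin ${monthly_margin(cost):+.2f}/user/month")
```

At the expensive end the provider loses money on every heavy user; as the per-token cost falls, the same subscription turns profitable.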

From Gavin Baker on X on APIs and Costs:

DeepSeek’s r1 seems to have done that: “r1 is cheaper and more efficient to inference than o1 (ChatGPT). r1 costs 93% less to *use* than o1 per API call, can be run locally on a high-end workstation and does not seem to have hit any rate limits, which is wild.”

However, “Batching massively lowers costs and more compute increases tokens/second so still advantages to inference in the cloud.”

It is comparable to o1 from a quality perspective, although it lags o3.

There were real algorithmic breakthroughs that led to it being dramatically more efficient both to train and inference.  
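On the batching point above, here’s a toy sketch of why serving many requests per forward pass cuts cost per token in the cloud; the GPU rental rate, single-request speed, and scaling exponent are all illustrative assumptions.

```python
# Toy illustration of batching: serving many requests per forward pass spreads
# fixed GPU time over more tokens, cutting cost per token.
# Throughput and cost figures are assumptions for illustration only.

gpu_cost_usd_per_hour = 4.0            # assumed cloud GPU rental rate
base_tokens_per_second = 80            # assumed single-request decode speed

def cost_per_million_tokens(batch_size: int) -> float:
    # assume throughput scales sub-linearly with batch size (diminishing returns)
    tokens_per_second = base_tokens_per_second * batch_size ** 0.8
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_usd_per_hour / tokens_per_hour * 1_000_000

for batch in (1, 8, 64):
    print(f"batch={batch:>2}: ${cost_per_million_tokens(batch):.2f} per 1M tokens")
```

A single local workstation runs at batch size one or close to it, which is why cloud providers that can batch heavily retain a cost advantage even when the model itself is cheap to run.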

On training costs and real costs:

Training in FP8, MLA and multi-token prediction are significant.  It is easy to verify that the r1 training run only cost $6m.
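For a sense of the arithmetic behind a roughly $6M headline number, here is a sketch; the cluster size, run length, and rental rate are assumptions of the magnitude that has been publicly discussed, and, as the next paragraphs argue, they would cover only the final training run.

```python
# Back-of-the-envelope check on a "~$6M training run" headline.
# GPU count, duration, and rental rate are illustrative assumptions; the figure
# covers only the final run, not prior research, data work, or failed experiments.

gpus = 2048                      # assumed H800 cluster size
days = 57                        # assumed duration of the final training run
rental_usd_per_gpu_hour = 2.0    # assumed market rental rate for an H800

gpu_hours = gpus * days * 24
cost = gpu_hours * rental_usd_per_gpu_hour

print(f"GPU-hours:      {gpu_hours:,}")    # ~2.8M GPU-hours
print(f"Final-run cost: ${cost:,.0f}")     # ~$5.6M
```

That is how the $6m figure and the “real cost is much larger” view can both be true: the arithmetic checks out for the final run alone, while everything before it is left out.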

The general consensus is that the “real” costs of the DeepSeek model are much larger than the $6Mn given for the r1 training run.

Omitted are:

Hundreds of millions of dollars of prior research, plus access to much larger clusters.

DeepSeek likely had more than 2048 H800s; an equivalently smart team can’t just spin up a 2,000-GPU cluster and train r1 from scratch with $6m.

There was a lot of distillation – i.e., it is unlikely they could have trained this without unhindered access to GPT-4o and o1, which is ironic because the US is banning the GPUs but giving access to distill leading-edge American models… Why buy the cow when you can get the milk for free?

The NextPlatform, too, expressed doubts about DeepSeek’s resources:

We are very skeptical that the V3 model was trained from scratch on such a small cluster.

A schedule of geographical revenues for Nvidia’s Q3 FY2025 showed 15% of Nvidia’s revenue, over $4Bn, “sold” to Singapore, with the caveat that Singapore may not be the ultimate destination. That raises the suspicion that DeepSeek got access to Nvidia’s higher-end GPUs despite the US export ban, or stockpiled them before the ban.

Better software and inference are the way of the future

As one of the AI vendors at CES told me, she had the algorithms to answer customer questions and provide analytical insights at the edge for several customers – they have the data from their customers and the software, but they couldn’t scale because AWS was charging them too much for cloud GPU usage when they didn’t need that much power. So besides r1’s breakthrough, this movement has been afoot for a while, and it will spur investment and innovation in inference. We will definitely continue to see demand for high-end Blackwell GPUs to train data and create better models for at least the next 18 to 24 months, after which the focus should shift to inference; as Nvidia’s CEO said, 40% of their GPUs are already being used for inference.