The story starts with a sentence in NVIDIA’s FY2023 10-K: the company “started shipping the first Hopper-based GPU, the flagship H100,” and “Hopper includes a Transformer Engine.” Two proper nouns and one real mechanism worth understanding, especially now that the H100 is only just reaching customers.

A transformer is the neural-network architecture that powers today’s large language models. Its central operation, attention, lets the model weigh how strongly every token (roughly, a word or word fragment) in a sequence relates to every other token. That operation is repeated billions of times during training, so anything that makes it cheaper makes the whole model cheaper to train.
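To make "every word relates to every other word" concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under stated assumptions, not NVIDIA's implementation or any library's API: the function names and the three toy vectors are made up for this example, and real models work on large matrices with learned projections rather than hand-written lists.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Score this token against every token in the sequence.
        scores = [dot(q, k) / scale for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        width = len(values[0])
        blended = [sum(w * v[i] for w, v in zip(weights, values))
                   for i in range(width)]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy "token" vectors (hypothetical numbers, purely illustrative).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` says how much one token attends to each of the others. The nested loops are the point: the work grows with the square of the sequence length, which is why shaving the cost of each multiply matters.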

Making that math cheaper is the job of the “Transformer Engine.” Rather than running every calculation at full numerical precision, it dynamically switches to lower-precision number formats (on Hopper, 8-bit and 16-bit floating point) where the model can tolerate them, and keeps higher precision where it cannot. Forget the name for a second: it is a chip feature that does the most common AI math in a cheaper format without wrecking the answer.
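The "lower precision where the model can tolerate it" idea can be sketched in a few lines. This is a toy stand-in, not the Transformer Engine's actual algorithm: it uses 16-bit half precision from Python's standard `struct` module because the standard library has no 8-bit float, and the `choose_format` helper, the tolerance, and the sample values are all assumptions made for illustration.

```python
import struct

def to_half(x):
    """Round a Python float to IEEE 754 half precision (16-bit) and back."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        # Value is too large for half precision: treat it as overflowing.
        return float("inf") if x > 0 else float("-inf")

def worst_relative_error(values):
    """Largest relative error introduced by rounding each value to half."""
    worst = 0.0
    for v in values:
        if v == 0.0:
            continue  # zero is exactly representable
        worst = max(worst, abs(to_half(v) - v) / abs(v))
    return worst

def choose_format(values, tolerance=1e-3):
    """Pick the cheap format only if rounding stays within the tolerance."""
    return "half" if worst_relative_error(values) <= tolerance else "full"

# Hypothetical sample data, chosen to show both outcomes.
activations = [0.5, -1.25, 3.0, 0.1]    # comfortably inside half's range
tiny_gradients = [1e-9, 3e-10, -2e-9]   # too small: they round to zero

choice_activations = choose_format(activations)
choice_gradients = choose_format(tiny_gradients)
```

Here the ordinary-sized values survive the cheaper format, while the tiny ones would be flushed to zero and so stay at full precision. Real hardware makes this kind of decision with scaling factors and running statistics rather than a per-value check, but the trade-off is the same.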

Why does this belong in a 10-K? Because NVIDIA is telling investors, in its annual report, that its newest data-center GPU is purpose-built for exactly the model family that has captured the public imagination. The disclosure is the company aligning its silicon with the workload of the moment, and saying so on the record as the first units ship.

Treat it as a marker of early 2023: the flagship chip for the language-model era is named in the filing, along with a feature whose entire reason for existing is the transformer. Whether H100 demand matches the design intent is a story for the quarters ahead; the mechanism, and the bet, are already in the document. Filing data and the evidence index via EdgarBeast.