Nvidia exceeded all expectations on Wednesday, reporting substantial profits driven by its graphics processing units (GPUs) that excel in AI tasks. However, the landscape of AI chips is evolving, with an increasing number of major tech companies developing their own custom ASICs (application-specific integrated circuits). Notable examples include Google’s Tensor Processing Units (TPUs), Amazon’s Trainium, and OpenAI’s collaboration with Broadcom. These custom chips are smaller, more cost-effective, and may lessen these companies’ dependence on Nvidia’s GPUs. Daniel Newman from the Futurum Group indicated to CNBC that he anticipates custom ASICs will outpace GPU market growth in the coming years.
In addition to GPUs and ASICs, field-programmable gate arrays (FPGAs) are also prevalent. These versatile chips can be reconfigured through software for a variety of applications, including signal processing, networking, and AI. Companies like Qualcomm and Apple also promote on-device AI chips, which handle AI processing directly on a phone or laptop rather than in the cloud.
CNBC consulted with industry experts and insiders from major tech firms to explore the diverse and competitive landscape of AI chips.
GPUs for General Computing
Originally designed for gaming, GPUs have transformed Nvidia into the world’s most valuable publicly traded company as they became essential for AI workloads. Over the past year, Nvidia shipped approximately 6 million of its latest Blackwell GPUs.

Nvidia senior director of AI infrastructure Dion Harris shows CNBC’s Katie Tarasov how 72 Blackwell GPUs work together as one in a GB200 NVL72 rack-scale server system for AI at Nvidia headquarters in Santa Clara, California, on November 12, 2025.
The transition from gaming to AI began around 2012, when researchers used Nvidia’s GPUs to build AlexNet, often regarded as the pivotal moment for modern AI. AlexNet entered a prestigious image-recognition competition and, by running on GPUs rather than the central processing units (CPUs) its rivals relied on, won with exceptional accuracy.
The team behind AlexNet realized that the parallel processing capabilities of GPUs, which are usually employed for rendering high-quality graphics, were also effective for training neural networks. This approach allowed computers to learn from data instead of depending solely on programmed code, thereby demonstrating the immense potential of GPUs.
Today, GPUs are frequently used alongside CPUs in server rack systems within data centers, where they handle AI workloads in the cloud. While CPUs utilize a few powerful cores for sequential general-purpose tasks, GPUs boast thousands of smaller cores designed for parallel computations, such as matrix multiplication.
Because GPUs can execute many operations simultaneously, they excel in the two primary stages of AI computation: training and inference. Training enables an AI model to recognize patterns in large datasets, while inference allows the AI to make decisions based on new information.
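The division of labor described above can be sketched in a few lines of NumPy. This is an illustrative toy, not any vendor’s software: training repeatedly applies matrix multiplications to fit a model’s weights, while inference is a single matrix multiply against new data. Each output cell of a matrix multiply is an independent multiply-accumulate, which is exactly the kind of work a GPU’s thousands of small cores compute in parallel.

```python
# Toy example: why AI workloads reduce to matrix multiplication.
# Each cell of a matmul result is an independent dot product, so a
# GPU can compute thousands of them at once; NumPy stands in here.
import numpy as np

rng = np.random.default_rng(0)

# "Training": fit a tiny linear model y = X @ w by gradient descent.
X = rng.normal(size=(256, 8))          # 256 examples, 8 features
true_w = rng.normal(size=(8, 1))       # weights we want to recover
y = X @ true_w

w = np.zeros((8, 1))
for _ in range(500):
    # The gradient step is dominated by matrix multiplies.
    grad = X.T @ (X @ w - y) / len(X)
    w -= 0.1 * grad

# "Inference": apply the learned weights to new data -- one matmul.
x_new = rng.normal(size=(1, 8))
prediction = x_new @ w

assert np.allclose(w, true_w, atol=1e-3)
```

In a real system the same pattern holds at vastly larger scale: training a language model runs this multiply-and-update loop across billions of parameters, which is why parallel matmul throughput is the metric GPU and ASIC vendors compete on.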
GPUs serve as the foundational workhorses for both Nvidia and its main competitor, Advanced Micro Devices (AMD). A key differentiator between the two is their software; Nvidia’s GPUs are tightly integrated with CUDA, its proprietary platform, while AMD’s offerings operate within a predominantly open-source ecosystem.
Both AMD and Nvidia supply their GPUs to cloud providers like Amazon, Microsoft, Google, Oracle, and CoreWeave. These providers then rent out the GPUs to AI companies by the hour or minute. For instance, Anthropic’s $30 billion agreement with Nvidia and Microsoft includes 1 gigawatt of compute capacity on Nvidia GPUs, and AMD has secured significant commitments from OpenAI and Oracle.
Nvidia also sells directly to AI firms, including a recent contract to provide at least 4 million GPUs to OpenAI and partnerships with foreign governments such as South Korea, Saudi Arabia, and the U.K. The company informed CNBC that it charges approximately $3 million for a server rack containing 72 Blackwell GPUs and ships about 1,000 racks each week.
Dion Harris, Nvidia’s senior director of AI infrastructure, remarked to CNBC that he couldn’t have anticipated such high demand when he joined the company over eight years ago. “When we proposed building a system with eight GPUs, people thought that was excessive,” he noted.
ASICs for Custom Cloud AI
While GPUs have been crucial for the initial surge of large language models, the importance of inference is growing as these models evolve. Inference can be carried out on less powerful chips tailored for specific tasks, which is where ASICs (application-specific integrated circuits) come into play.
While a GPU functions like a Swiss Army knife, capable of handling many kinds of parallel computation across different AI tasks, an ASIC is a specialized tool built for a single purpose. That specialization makes ASICs highly efficient and fast, but limited to the exact calculations required for one specific job.

Google released its 7th generation TPU, Ironwood, in November 2025, a decade after making its first custom ASIC for AI in 2015.
“You can’t alter them once they’re etched into silicon, leading to a trade-off in flexibility,” remarked Chris Miller, author of Chip War.
Nvidia’s GPUs offer sufficient flexibility for many AI companies, but their prices can reach up to $40,000, and availability can be an issue. Startups often prefer GPUs because the initial investment for designing a custom ASIC typically starts in the tens of millions of dollars, according to Miller.
For major cloud providers that can afford it, analysts believe custom ASICs yield long-term benefits. “They want more control over the workloads they develop,” noted Newman. “However, they will continue to collaborate closely with Nvidia and AMD, as they still require capacity to meet insatiable demand.”
Google was the first major tech company to create a custom ASIC for AI acceleration, introducing the term Tensor Processing Unit (TPU) when its first ASIC was launched in 2015. Although Google contemplated developing a TPU as early as 2006, the urgency increased in 2013 when it realized AI would lead to a doubling of its data centers. By 2017, the TPU played a key role in Google’s development of the Transformer architecture, which underpins most modern AI.
A decade after the initial TPU release, Google unveiled its seventh-generation TPU in November. Anthropic announced it would train its language model, Claude, using up to 1 million TPUs. Some believe TPUs are equal to or even exceed the performance of Nvidia’s GPUs, according to Miller. “Traditionally, Google has used them solely for internal applications,” he added. “There’s speculation that Google may eventually make TPUs more widely accessible.”
Following Google, Amazon Web Services (AWS) also ventured into designing its own AI chips after acquiring the Israeli startup Annapurna Labs in 2015. AWS introduced Inferentia in 2018 and launched Trainium in 2022, with a third generation of Trainium expected to be announced as soon as December.
Ron Diamant, the chief architect of Trainium, told CNBC that Amazon’s ASIC offers 30% to 40% better price performance compared to other hardware vendors in AWS. “Over time, we’ve seen that Trainium chips handle both inference and training workloads quite effectively,” Diamant noted.
In October, CNBC visited Indiana for an exclusive on-camera tour of Amazon’s largest AI data center, where Anthropic trains its models using half a million Trainium2 chips. In other data centers, AWS incorporates Nvidia GPUs to satisfy the demands of AI clients like OpenAI.
Creating ASICs is challenging, which is why many companies partner with chip designers such as Broadcom and Marvell. “They provide the intellectual property, expertise, and networking assistance to help clients develop their ASICs,” said Miller. “Broadcom, in particular, has been a significant beneficiary of the AI boom,” he added.
Broadcom contributed to the development of Google’s TPUs and Meta’s Training and Inference Accelerator, launched in 2023, and it has recently struck a deal to assist OpenAI in creating its custom ASICs starting in 2026. Microsoft is also entering the ASIC market, revealing to CNBC that its in-house Maia 100 chips are already deployed in its data centers on the East Coast. Other companies making strides in this area include Qualcomm with its A1200, Intel with its Gaudi AI accelerators, and Tesla with its AI5 chip. A host of startups are also fully committing to custom AI chips, such as Cerebras, which produces large full-wafer AI chips, and Groq, which focuses on inference-oriented language processing units.
In China, firms like Huawei, ByteDance, and Alibaba are developing custom ASICs, though export controls on advanced equipment and AI chips present challenges.
Edge AI with NPUs and FPGAs
The final significant category of AI chips is designed for operation on devices rather than in the cloud. Typically integrated into a device’s main System on a Chip (SoC), edge AI chips enable devices to perform AI tasks while optimizing battery life and making room for other components.
“You can execute these tasks directly on your phone with minimal latency, eliminating the need for constant communication with a data center,” explained Saif Khan, a former White House advisor on AI and semiconductor policy. “This also protects the privacy of your data.”
Neural processing units (NPUs) are a key type of edge AI chip. Companies like Qualcomm, Intel, and AMD are developing NPUs to enhance AI capabilities in personal computers. While Apple doesn’t formally use the term NPU, its in-house M-series chips in MacBooks include dedicated neural engines, and the latest iPhone A-series chips also feature neural accelerators.
“It’s efficient and responsive, allowing us greater control over the user experience,” Tim Millet, Apple’s vice president of platform architecture, shared with CNBC in a September interview. The latest Android phones, equipped with NPUs embedded in their Qualcomm Snapdragon chips, and Samsung’s Galaxy phones, which have their own NPUs, further exemplify this trend. NPUs from companies like NXP and Nvidia power AI in cars, robots, cameras, smart home devices, and beyond.
“Most investment is currently focused on data centers, but that will shift over time as AI becomes more integrated into our phones, cars, wearables, and various other applications,” Miller predicted.
Additionally, field-programmable gate arrays (FPGAs) can be reconfigured with software after production. While FPGAs offer more flexibility than NPUs or ASICs, they typically deliver lower raw performance and energy efficiency for AI tasks.
AMD became the leading FPGA manufacturer following its $49 billion acquisition of Xilinx in 2022, while Intel ranks second due to its $16.7 billion purchase of Altera in 2015.
These companies design AI chips, but nearly all of them rely on a single manufacturer to produce them: Taiwan Semiconductor Manufacturing Company (TSMC). TSMC has a massive new chip fabrication facility in Arizona, where Apple has committed to moving part of its chip production. In October, Nvidia CEO Jensen Huang announced that Blackwell GPUs are also now in “full production” in Arizona.
Despite the crowded landscape of AI chips, displacing Nvidia will not be an easy task. “They hold their position because they’ve earned it, investing years into building it,” Newman noted. “They’ve cultivated a robust developer ecosystem.”
Source: CNBC. Edited by Bernie.