Press release
Nvidia GTC 2026 Preview: Next-Gen Inference Chips Coming, H200 to Make Way for Vera Rubin, Reducing HBM Dependence
March 30, 2026 - The global AI computing industry saw a key development as Nvidia officially confirmed that it will launch a new generation of AI inference chips at the GTC 2026 conference, held in San Jose, California, USA, from March 16 to 19. The company also announced a major production capacity adjustment: the flagship H200 will gradually cede manufacturing capacity to the next-generation Vera Rubin platform, while architectural optimization reduces dependence on High-Bandwidth Memory (HBM), reshaping the landscape of AI computing hardware.

As an annual bellwether for the AI chip field, GTC 2026 attracted widespread attention even before its opening. Jensen Huang, founder and CEO of Nvidia, built anticipation for the event, stating that the conference would unveil "unprecedented" new chips focused on three core directions: leapfrog inference performance, energy efficiency optimization, and supply chain resilience, directly addressing key pain points in the large-scale deployment of today's large AI models.

The industry broadly expects the new generation of inference chips to be the core product of the Vera Rubin platform, optimized specifically for scenarios such as long-context inference, multimodal model deployment, and AI agent execution, filling the high-end market gap of "strong training but costly inference."

This capacity adjustment is a strategic decision by Nvidia based on market demand and the regulatory environment. According to foreign media reports, Nvidia has notified TSMC that it will gradually shift the 3nm advanced-process capacity originally allocated to H200 chips to production of the Vera Rubin platform.
Colette Kress, CFO of Nvidia, said on the earnings call that although the H200 has obtained a small number of export licenses, it has generated no actual revenue to date, and continuing large-scale mass production is no longer commercially viable. Existing H200 inventory is sufficient to cover the limited market demand, and halting new production avoids inventory backlogs while freeing scarce advanced-process capacity for new products with greater growth potential.
Image: https://ecdn6.globalso.com/upload/p/3424/image_other/2026-03/wechat-image_20260306162229_1357_3.jpg
As Nvidia's core computing platform for 2026, Vera Rubin breaks through traditional computing bottlenecks at the architectural level. The platform adopts a six-chip collaborative design, including the Rubin GPU, the Rubin CPX inference-specific accelerator, and the Vera CPU, manufactured on TSMC's 3nm N3P process with 336 billion transistors, 1.6 times that of the Blackwell architecture. In terms of performance, Vera Rubin's FP4 inference computing power reaches 50 petaflops, five times that of the H200, and inference token cost can be reduced to one-tenth of that of the Blackwell platform, suiting the large-scale inference needs of cloud service providers and enterprise-level AI factories.

On the industry's chief concern, HBM dependence, Nvidia has achieved a major breakthrough with the Vera Rubin platform. First, the platform carries the third-generation Transformer Engine with built-in hardware-level adaptive compression, which reduces memory usage while preserving inference accuracy and lowers the demand for HBM capacity. Second, an optimized memory scheduling mechanism combines LPDDR5X and HBM4 in a hybrid memory architecture, meeting high-bandwidth requirements while offloading part of the working set to conventional memory, easing the supply pressure caused by HBM capacity shortages.
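As a rough illustration of why lower-precision formats ease HBM capacity pressure, a model's weight footprint shrinks roughly in proportion to bits per weight. The sketch below uses a hypothetical 70-billion-parameter model; the figures are illustrative assumptions, not Nvidia specifications.

```python
# Back-of-envelope: memory needed for model weights at different precisions.
# The 70B parameter count is a hypothetical example, not an Nvidia figure.

def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Gigabytes required to store the model weights alone."""
    return num_params * bits_per_weight / 8 / 1e9

params = 70e9  # hypothetical 70-billion-parameter model
for fmt, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]:
    print(f"{fmt}: {weight_memory_gb(params, bits):.0f} GB")
# → FP16: 140 GB, FP8: 70 GB, FP4: 35 GB
```

Each halving of precision halves the capacity a given model demands from the memory subsystem, which is why FP4 inference, combined with compression, can relieve demand for scarce HBM stacks.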
Although the HBM4 bandwidth of the first batch of mass-produced Vera Rubin chips has been adjusted from the originally planned 22 TB/s to 20 TB/s, actual computing output is unaffected, and the energy efficiency ratio even improves by more than 30 percent.

From a market perspective, this adjustment will reshape the AI computing supply chain. HBM, a core scarce resource for current AI chips, has seen prices soar and lead times lengthen, becoming a key constraint on the spread of computing power. By reducing HBM dependence through architectural innovation, Nvidia can ease its own supply chain pressure and lower the cost of high-end computing hardware, helping large AI models spread from big technology companies to small and medium-sized enterprises. At the same time, the Vera Rubin platform is fully compatible with the CUDA ecosystem, so existing customers can upgrade without modifying software, further consolidating Nvidia's dominant market position.

Supply chain sources indicate that the Vera Rubin platform will begin small-batch shipments in the second quarter of 2026 and ramp fully in the third and fourth quarters. The first batch of customers already includes leading global cloud service providers, AI companies, and data center operators. The accompanying HGX Rubin NVL8 server motherboard and NVL72 full-rack solution will also debut at GTC 2026, forming a full-stack "chips + systems + software" offering.
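The claimed order-of-magnitude drop in inference token cost can be sanity-checked with simple arithmetic: at equal hourly accelerator cost, a tenfold throughput gain yields one-tenth the cost per token. Every number below (hourly price, tokens per second) is a placeholder assumption, not disclosed Nvidia or cloud-provider data.

```python
# Cost per million output tokens from hourly accelerator cost and throughput.
# All inputs are illustrative placeholders, not Nvidia or cloud-provider data.

def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

# Same assumed $10/hour rate; the newer chip is assumed to serve 10x the tokens.
prev_gen = cost_per_million_tokens(10.0, 1_000)
next_gen = cost_per_million_tokens(10.0, 10_000)
print(f"previous gen: ${prev_gen:.2f}/M tokens, next gen: ${next_gen:.2f}/M tokens")
```

The actual ratio would depend on real hourly pricing and measured throughput, but the structure of the calculation shows why throughput, not raw FLOPS, is what drives token cost down.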
Industry analysts note that Nvidia's move to discontinue the H200 and promote Rubin is driven by both technological iteration and market demand. On one hand, the AI industry's focus is shifting from model training to large-scale inference, and dedicated inference chips are entering an explosive growth period. On the other, relieving the HBM capacity bottleneck and optimizing capacity allocation lets Nvidia hold its lead in fierce market competition. As GTC 2026 approaches, the detailed specifications, pricing strategy, and launch timing of the Vera Rubin platform will be officially announced, and the global AI computing hardware race will enter a new round of transformation.

Looking ahead, as the Vera Rubin platform reaches scale, the cost of AI inference will drop significantly, and applications such as multimodal AI, intelligent agents, and industrial AI will spread faster. Nvidia's approach of reducing memory dependence through architectural innovation may also become an industry benchmark, pushing the semiconductor industry from simply stacking hardware toward optimizing architecture for efficiency, laying a solid foundation for the popularization of AI technology.
Media Contact
Company Name: Ant O&M (Beijing) Technology Service Co., Ltd.
Email: Send Email [https://www.abnewswire.com/email_contact_us.php?pr=nvidia-gtc-2026-preview-nextgen-inference-chips-coming-h200-to-make-way-for-vera-rubin-reducing-hbm-dependence]
Country: China
Website: https://www.antoperationtech.com/
Legal Disclaimer: Information contained on this page is provided by an independent third-party content provider. ABNewswire makes no warranties or responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you are affiliated with this article or have any complaints or copyright issues related to this article and would like it to be removed, please contact retract@swscontact.com
