Extreme Co-design, Building the AI Factory: ZTE Unveils Full-Stack AI Infrastructure
From vertical scaling to digital twins, ZTE redefines AI infrastructure with end-to-end compute-network and software-hardware co-design
Partner Content

At MWC Barcelona 2026, Chen Xinyu, Vice President of ZTE, revealed in an exclusive interview that ZTE has developed a high-compute-efficiency "AI Factory" full-stack solution through end-to-end co-design, addressing the explosive growth in computing power demanded by large AI models. The solution aims to break through the physical limits of traditional hardware stacking and help global customers build AI infrastructure with optimal total cost of ownership (TCO) across the full lifecycle.
System Reconstruction: Breaking Hardware Stacking Bottlenecks to Unleash Compute Efficiency
As large AI models place escalating demands on infrastructure, pure hardware stacking can no longer reconcile scale, efficiency, and cost. Chen emphasized: "We need comprehensive architectural reconstruction to maximize compute efficiency and accelerate the large-scale deployment of AI." ZTE's "AI Factory" solution embodies full-stack co-design, integrating AI servers, hyper-nodes, a lossless network, the AI Booster operating system, the AI Agent Studio development platform, the AI Factory Twin platform, and IDC infrastructure. Through compute-network co-design and software-hardware co-design, ZTE achieves breakthroughs in computing density, energy efficiency, and scale.
Compute-Network Co-design: Breaking Physical Limits of Density and Scale
Chen detailed two core breakthroughs enabled by the compute-network co-design:
1. Vertical Scaling (Scale-Up) - Enhancing Density

Leveraging the OEX (Orthogonal Electrical eXchange) architecture in hyper-nodes, the system physically connects compute trays with switching trays via vertical crosslinking. This eliminates tens of thousands of high-speed cables within racks, freeing rack space and significantly boosting compute density. The ultra-short inter-board interconnect paths ensure high-speed, stable communication while eliminating the downtime risks caused by loose cables. Paired with ZTE's self-developed high-capacity switching chip, the system supports TB-level interconnect bandwidth and sub-100-nanosecond latency, and is compatible with mainstream domestic and international standards as well as customized interconnect protocols. Key modules adopt a componentized design, allowing adaptation to GPUs from different vendors by simply replacing UBB modules.
2. Horizontal Scaling (Scale-Out/Scale-Across) - Expanding the Scale

Starting from hyper-nodes, the "AI Factory" builds single-DC clusters via Scale-Out networks and further aggregates compute power across multiple data centers through Scale-Across networks. This end-to-end interconnection evolution path breaks through cluster scalability limits, establishing a highly scalable compute foundation for the AI factory.
Software-Hardware Co-design: Unlocking Energy Efficiency to Boost "Tokens per Watt"
Chen highlighted the critical role of software-hardware co-design in improving energy efficiency: "Powerful hardware capabilities must be fully unleashed through a deeply collaborative, full-stack optimized software system." ZTE's software stack, acting as the "operating system" of the intelligent-computing resource pool, transforms discrete physical resources into efficient, integrated compute services. Collaborating with mainstream GPU vendors, ZTE applies framework optimization, operator optimization, communication acceleration, and intelligent scheduling to achieve performance leaps. Chen noted: "Through parallel strategy optimization, operator fusion, PD separation, and KV Pool technologies, these optimizations can increase token throughput by 5 to 20 times."

To address the engineering challenges of large-scale clusters, ZTE introduced the AI Factory Twin Platform. Using digital twin technology, the platform simulates hardware selection, thermal management design, training/inference optimization, and operations management in virtual environments, enabling full-lifecycle simulation and performance optimization for cost-effective AI factories.
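To give a sense of why a "KV Pool" can raise token throughput: in LLM serving generally, caching attention key/value states for a shared prompt prefix lets later requests skip the expensive prefill recomputation. The toy Python sketch below illustrates only that caching intuition; it is not ZTE's implementation, and the `KVPool` class and its methods are invented here for illustration.

```python
# Illustrative sketch (not ZTE's implementation): a toy KV pool that
# caches a per-prefix "KV" result so repeated prompt prefixes skip
# recomputation during prefill.

class KVPool:
    def __init__(self):
        self.cache = {}          # prefix tokens -> precomputed "KV" state
        self.compute_calls = 0   # counts expensive prefill computations

    def _compute_kv(self, tokens):
        self.compute_calls += 1
        return [hash(t) for t in tokens]  # stand-in for real KV tensors

    def prefill(self, tokens):
        key = tuple(tokens)
        if key not in self.cache:
            self.cache[key] = self._compute_kv(tokens)
        return self.cache[key]

pool = KVPool()
shared_prefix = ["system:", "you", "are", "helpful"]
for _ in range(100):             # 100 requests sharing one prompt prefix
    pool.prefill(shared_prefix)
print(pool.compute_calls)        # 1 -- prefill ran once, 99 cache hits
```

Real serving stacks apply the same idea at the level of attention KV tensors on GPU memory, where avoiding repeated prefill work directly improves tokens per watt.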
Open Ecosystem: Collaborating with Partners for the Future
As a complex system-engineering project, the "AI Factory" spans many layers, including underlying chips, algorithms, complete machines, clusters, software, and IDCs. Chen stated that ZTE applies its communication-system engineering methodologies and large-scale networking expertise to AI infrastructure construction, focusing on industry challenges such as interconnect bandwidth, system stability, and engineering delivery. "Extreme co-design is our core philosophy," Chen concluded. "ZTE will adhere to an open and decoupled approach, collaborating with global industry partners to build a future-oriented open intelligent-computing ecosystem. We aim to popularize AI technology, drive industrial intelligence upgrades, and empower all sectors of the economy."
Contributed by ZTE.
