Performance Improved 475 Times! Fujitsu Unveils New PHOTON Architecture, Targeting AI Computing Bottlenecks

As large models iterate rapidly, compute cost and processing efficiency remain central concerns for the industry. Fujitsu recently unveiled a new architecture called PHOTON (Parallel Hierarchical Operation for Top-down Networks), which aims to break through the performance bottlenecks that conventional Transformer models hit in complex scenarios.

The Transformer architecture that dominates AI today is powerful, but it struggles with long texts and high-concurrency multi-query tasks: it must repeatedly access memory to retrieve historical information, which slows processing and adds to the GPU workload. Fujitsu's research team designed PHOTON to sidestep this pain point.

PHOTON's core advantage is its hierarchical processing mechanism. Where conventional Transformers operate at the token level, PHOTON introduces semantic layering, which both reduces computational complexity and significantly improves parallelism. In multi-query tasks, the architecture also streamlines decision-making: using a "majority voting" or "best choice" strategy, it needs only one inference pass to reach a conclusion.
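
To make the two decision strategies concrete, here is a minimal Python sketch of how several candidate answers could be reduced to a single result. It is an illustration only, not Fujitsu's implementation: the function names `majority_vote` and `best_choice`, and the assumption that each candidate comes with a numeric score, are introduced here for the example.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one candidate answer")
    # most_common(1) yields [(answer, count)] for the top answer.
    return Counter(answers).most_common(1)[0][0]


def best_choice(answers, scores):
    """Return the answer with the highest score (scores are hypothetical)."""
    if not answers or len(answers) != len(scores):
        raise ValueError("answers and scores must be non-empty and equal length")
    best_index = max(range(len(answers)), key=lambda i: scores[i])
    return answers[best_index]


if __name__ == "__main__":
    candidates = ["A", "B", "A", "C"]
    print(majority_vote(candidates))                      # A
    print(best_choice(candidates, [0.2, 0.9, 0.4, 0.1]))  # B
```

Majority voting needs only the answers themselves, while best choice relies on some external quality score for each candidate.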

Test data show that at small model sizes of 600M, 900M, and 1.2B parameters, PHOTON delivers very high throughput and very low memory usage. At 1.2B parameters in particular, its multi-query performance reaches 475 times that of a mainstream Transformer architecture, greatly improving resource scheduling efficiency.

Because the architecture needs less KV cache per iteration, the system can sustain more iterations within the same memory budget. This is a significant gain for agent systems that must handle a large volume of I/O. Although some quality metrics show a slight trade-off, PHOTON's leap in computational efficiency makes it a promising approach to reducing AI operating costs.
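
The link between cache size and capacity can be shown with rough arithmetic. The sketch below uses the standard KV cache size estimate for a conventional Transformer (keys plus values, per layer, per token) and a fixed memory budget. Every number in it is a hypothetical placeholder, including the model dimensions, the 16 GiB budget, and the 8x reduction; none of them are figures reported for PHOTON.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate KV cache size for one sequence in a standard Transformer.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes 16-bit values.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


def max_concurrent_sequences(memory_budget_bytes, per_sequence_bytes):
    """How many sequences fit in the budget if each needs per_sequence_bytes."""
    return memory_budget_bytes // per_sequence_bytes


if __name__ == "__main__":
    budget = 16 * 1024**3  # hypothetical 16 GiB reserved for the cache

    # Hypothetical baseline: 24 layers, 16 KV heads, head size 64, 4096 tokens.
    baseline = kv_cache_bytes(24, 16, 64, 4096)
    # Hypothetical architecture that needs one eighth of that cache.
    smaller = baseline // 8

    print(baseline // 1024**2)                        # 384 (MiB per sequence)
    print(max_concurrent_sequences(budget, baseline)) # 42
    print(max_concurrent_sequences(budget, smaller))  # 341
```

Under these made-up numbers, shrinking the per-sequence cache by a factor of eight lets roughly eight times as many sequences share the same memory.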

Fujitsu is now actively promoting adoption of the architecture, hoping that innovation at the algorithm level will provide a lighter, more efficient foundation for future intelligent applications.
