As large language models (LLMs) demand ever more computational resources during inference, traditional serving architectures face bottlenecks. Moonshot AI and a research team from Tsinghua University have recently introduced a new architecture called Pre-filling as a Service (PrfaaS), designed to break through the data-center and compute limitations of large language model serving.


Currently, large language model inference is typically divided into two stages: pre-filling and decoding. The pre-filling stage is compute-intensive: the model processes the entire input prompt and builds a key-value cache (KVCache). The decoding stage is memory-bandwidth-intensive: the model generates output tokens one at a time, attending over the KVCache at each step. Traditional architectures require both stages to run within the same data center, which constrains both compute capacity and bandwidth.
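The two-stage split can be illustrated with a minimal sketch (a hypothetical toy, not Moonshot's implementation; the tuple entries stand in for per-token attention key/value tensors):

```python
def prefill(prompt_tokens):
    """Compute-bound phase: process all prompt tokens in one batch,
    producing one key/value entry per token -- the KVCache."""
    return [(f"K_{t}", f"V_{t}") for t in prompt_tokens]

def decode_step(kv_cache):
    """Memory-bandwidth-bound phase: attend over the whole KVCache to emit
    a single token, then append that token's own K/V entry."""
    next_token = f"tok_{len(kv_cache)}"  # placeholder for the sampled token
    kv_cache.append((f"K_{next_token}", f"V_{next_token}"))
    return next_token

cache = prefill(["the", "cat", "sat"])   # one pass over the full prompt
outputs = [decode_step(cache) for _ in range(2)]  # one token per step
```

Note how the cache grows by exactly one entry per decode step, while prefill populates it for the whole prompt at once; this asymmetry is what makes the two phases suit different hardware.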

PrfaaS enables efficient serving across data centers by offloading pre-filling tasks to dedicated high-compute clusters and transmitting the generated KVCache over commodity Ethernet to local decoding clusters. The team reports that this architecture significantly improves performance, with a 54% increase in serving throughput over traditional designs, and practical case studies show lower latency and higher efficiency as well.
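The core idea of the split can be sketched as follows (function names and the serialization choice are illustrative assumptions; a real system would ship tensors over RDMA or TCP rather than pickled Python objects):

```python
import pickle

def prefill_remote(prompt_tokens):
    """Runs on the remote high-compute cluster: build the KVCache for the
    full prompt and serialize it into a byte payload for network transfer."""
    kv_cache = {i: (f"K_{tok}", f"V_{tok}") for i, tok in enumerate(prompt_tokens)}
    return pickle.dumps(kv_cache)  # stand-in for the cross-data-center payload

def decode_local(kv_bytes):
    """Runs on the local decode cluster: deserialize the received KVCache
    and continue token-by-token generation from it."""
    kv_cache = pickle.loads(kv_bytes)
    return len(kv_cache)  # decoding would attend over these cached entries

payload = prefill_remote(["a", "b", "c", "d"])
entries = decode_local(payload)  # 4 cached entries, one per prompt token
```

The design works because the KVCache is the only state the decode stage needs from prefill, so a one-way bulk transfer over ordinary Ethernet suffices.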

The PrfaaS design separates the three major subsystems -- computing, networking, and storage -- and manages them independently. A precise routing mechanism ensures that long requests are dispatched efficiently, avoiding the congestion that uneven resource allocation causes in traditional designs. The system also introduces a dual-timescale scheduling mechanism to adapt to shifting traffic patterns, further improving resource utilization.
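A length-aware router of this kind might look like the following sketch (the threshold, queue bound, and function name are all assumptions for illustration, not PrfaaS internals):

```python
# Assumed tunable cutoff; in a dual-timescale scheme this would be
# re-tuned on the slow timescale as traffic patterns drift.
LONG_PROMPT_THRESHOLD = 2048

def route(prompt_len, remote_queue_depth, max_queue=32):
    """Fast-timescale, per-request decision: send long prompts (whose
    prefill is compute-heavy) to the remote high-compute cluster, unless
    its queue is congested; everything else stays local."""
    if prompt_len >= LONG_PROMPT_THRESHOLD and remote_queue_depth < max_queue:
        return "remote_prefill"
    return "local"

route(4096, remote_queue_depth=3)   # long prompt, remote cluster free -> remote
route(128, remote_queue_depth=3)    # short prompt -> local
route(4096, remote_queue_depth=64)  # remote congested -> fall back to local
```

The two timescales separate concerns: per-request routing reacts in microseconds, while threshold and capacity adjustments track traffic shifts over minutes.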

With growing demand for cross-data-center inference and the continued arrival of new hardware, PrfaaS offers a promising new direction for future AI serving.