Over the past three years, large-model capabilities have advanced rapidly and computing power has kept scaling, yet enterprise AI has not delivered the value expected of it in practice. A growing number of industries are reaching the same conclusion: the focus of AI competition is shifting from models to data.

On June 29, OceanBase released the Lakehouse Integrated AI Database, an AI database built around a lakehouse-integrated architecture. It unifies the open, massive-scale storage of data lakes, the transaction processing and analytical capabilities of databases, and multi-modal data processing into a single consistent data foundation. This lets Agents access the complete business context in a single pass, enabling AI to truly "understand" the enterprise.

Building on this capability, OceanBase has assembled a product system around the AI database that spans the data engine, data governance, and the business entry point: Lakebase, DataStudio, and DataPilot. The system has completed business validation in AI scenarios such as Ant's Afu and Lingguang.

Industry experts generally believe that as Agents become new users of databases, databases are moving from "recording facts" to "participating in decisions," and AI databases have thus become a new infrastructure form in the AI era.

The Bottleneck of AI Implementation Lies in Data, and "Lakehouse Integration" Becomes the Optimal Path

In the past, large models addressed the question of "whether they can think"; today, the real bottleneck has become "whether they understand the business." Model capabilities are rapidly converging, and business differences are shifting to the data layer.

As Agents enter the execution layer of systems and enterprise data becomes multi-modal, traditional architectures that rely on multiple collaborating systems are increasingly unable to meet AI's demand for a "unified context." AI is changing the paradigm of data management, and the three key challenges that come with Agents entering production (scale, context, and evolution) are becoming more prominent. Data forms, data flows, and data interactions are all changing under AI's influence, but the four basic requirements of a database cannot be compromised: consistency, scalability, reliability, and real-time performance.

What truly needs to be rewritten is the architecture; what must be upheld is the bottom line. Both point directly to lakehouse integration: managing, computing, and serving multi-modal data within a unified engine, which eliminates the fragmentation of multiple systems at the architectural level.
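To make the idea of a unified engine concrete, the following is a minimal, purely illustrative Python sketch. It is a toy in-memory model, not OceanBase's API: the names `Record`, `ToyUnifiedStore`, and `context`, as well as the sample data and hand-written embeddings, are all hypothetical. It only shows the general pattern of keeping structured fields, unstructured text, and vectors in one record, so that a single call can filter on attributes and rank by similarity without joining results from separate systems.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Record:
    """One business entity holding structured, unstructured, and vector data together."""
    key: str
    fields: dict      # structured attributes, e.g. {"region": "EU"}
    text: str         # unstructured content
    embedding: list   # vector representation (toy, hand-written values)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class ToyUnifiedStore:
    """Hypothetical in-memory stand-in for a single engine; not a real product API."""
    records: dict = field(default_factory=dict)

    def put(self, record):
        # One write stores all three modalities under the same key,
        # so they cannot drift apart across systems.
        self.records[record.key] = record

    def context(self, filters, query_vec, top_k=1):
        """Filter on structured fields, then rank the matches by vector similarity."""
        matches = [
            r for r in self.records.values()
            if all(r.fields.get(k) == v for k, v in filters.items())
        ]
        matches.sort(key=lambda r: cosine(r.embedding, query_vec), reverse=True)
        return matches[:top_k]


store = ToyUnifiedStore()
store.put(Record("t1", {"region": "EU"}, "Refund requested for order 17", [1.0, 0.0]))
store.put(Record("t2", {"region": "EU"}, "Question about invoice format", [0.0, 1.0]))
store.put(Record("t3", {"region": "US"}, "Refund requested for order 42", [1.0, 0.0]))

# A single call combines the attribute filter and the similarity ranking.
top = store.context({"region": "EU"}, query_vec=[0.9, 0.1])
print(top[0].key, "-", top[0].text)
```

In a multi-system setup, the same lookup would typically mean querying a relational store for the filter, a vector index for the ranking, and a document store for the text, then reconciling the three results in application code.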

Around lakehouse integration, OceanBase has built a new product system for the AI era:

OceanBase Lakebase serves as the underlying engine, managing, processing, retrieving, and serving structured, unstructured, and vector data within a unified architecture, which addresses the data foundation problem of the AI era;

OceanBase DataStudio runs on top of Lakebase, covering data access, processing, orchestration, semantic modeling, and Agent collaboration, transforming scattered data assets into callable data services, addressing issues of data governance and serviceability;

OceanBase DataPilot serves as a unified business intelligence entry point for enterprises, allowing business personnel to generate analysis reports, data dashboards, and reliable answers through natural language, solving the problem of how business users directly utilize data intelligence.

According to the company, the OceanBase AI database can reduce total cost of ownership (TCO) by about 30% to 50% compared with traditional multi-system solutions. This capability has been validated in scenarios such as Ant's Afu and Lingguang. Lingguang, for example, has generated tens of millions of "Flash Apps," validating the feasibility of the lakehouse-integrated architecture in a scenario with millions of Agents.

Yang Chuanhui, CTO of OceanBase, said: "True integration must occur at the architectural level. Lakehouse integration is not a simple combination of databases and data lakes, but rather the unified management of multi-modal data within the same engine, connecting online and offline processing."

Redefining AI Databases, Domestic Vendors Face New Opportunities

AI databases are becoming a new field in global infrastructure software, but technical paths are still uncertain: some extend the boundaries of data lakes, some enhance search and semantic understanding capabilities, while others start from the database kernel and attempt to restructure the entire data system.

The essence of the difference lies not in component selection, but in different understandings of "how AI uses data." In this round of change, AI databases are no longer just an extension of traditional database capabilities, but are redefining how data is organized, accessed, and used for decision-making by AI.

OceanBase Rebuilds the Lakehouse Integrated Database for the AI Era

OceanBase starts from the database kernel, extending the transaction consistency, high availability, and elasticity long proven in core financial systems to data lakes and multi-modal data, so that a single system can support AI workloads uniformly. This is a fundamental bottom-up restructuring, not a patch layered onto an old architecture.

Public information shows that OceanBase is a database developed in-house in China, originating from the "Double 11" scenario in 2010. Over the past 15 years it has undergone the most rigorous tests of the financial industry: it serves more than 400 financial institutions, has ranked first in China's on-premises distributed database market for two consecutive years, and is the only database to have achieved top rankings in both the TPC-C and TPC-H international benchmarks. Its business covers multiple countries and regions around the world.

The capabilities honed in financial-grade scenarios (error-free data, uninterrupted systems, and millisecond-level fault recovery) are essential requirements of the AI era. Extending them from the "database" to the "data lake" is a natural step for OceanBase after 15 years of groundwork, and it gives the company a basis for competing to redefine the AI database paradigm.

For a long time, the standards of infrastructure software have been defined by international vendors, but AI databases are entering a stage where the rules are not yet fixed. This means the competition is no longer just about catching up with existing systems, but about participating in the formation of a new one.