Recently, Perplexity launched a new security system called BrowseSafe, designed to protect AI browser agents from threats posed by web content manipulation. The system claims a 91% success rate in detecting prompt injection attacks, surpassing the performance of other solutions currently on the market. For example, PromptGuard-2 can only detect 35% of attacks, while large cutting-edge models like GPT-5 have a detection rate of 85%. In addition, BrowseSafe runs fast enough to enable real-time monitoring.

The widespread use of AI browser agents has also introduced new security risks. Earlier this year, Perplexity launched Comet, a web browser that integrates AI agents. These agents can browse websites like users and perform authentication sessions, such as email, banking, and enterprise applications. This high-level access gives malicious attackers the opportunity to hide dangerous instructions within web pages, leading the agent to perform inappropriate actions, such as sending sensitive information to external addresses.

As Perplexity conducted deeper analysis of security issues, it found that existing evaluation benchmarks, such as AgentDojo, were insufficient to address these complex network attacks. These benchmarks typically rely on simple prompts and cannot cover the complex web content in the real world, allowing attackers to easily hide their malicious code.

image.png

To address this, Perplexity created BrowseSafe Bench, defining the scope of network attacks through three specific dimensions: attack type, injection strategy, and language style. This benchmark particularly focuses on "hard-to-detect content," which appears harmless but may be mistakenly identified as an attack. By using an expert hybrid architecture, BrowseSafe can perform security scans in parallel without affecting user experience.

However, the evaluation also revealed some issues. For instance, the detection rate for multilingual attacks dropped to 76%. Additionally, content hidden in HTML comments was easier to detect than content hidden in visible areas, such as the bottom of the page. Perplexity's three-tier defense strategy forms a complete protection mechanism through a fast classifier and a cutting-edge large language model based on reasoning.

Although BrowseSafe performs well in most cases, nearly 10% of attacks can bypass the system, highlighting the complexity of the network environment and the evolving nature of attack methods. Therefore, Perplexity has made its benchmark, model, and research paper public, aiming to provide better security guarantees for AI agents' interactions online.

Key points:

🌐 The detection rate of BrowseSafe reaches 91%, higher than most current solutions.  

🔒 High-privilege access of AI browser agents increases the risk of being attacked.  

📊 Perplexity's security strategy is designed to address complex network attack methods.