The cybersecurity companies SentinelOne and Censys recently released an in-depth joint study revealing serious security problems with open-source large language models. According to the research, when these models run independently on privately operated machines, outside the guardrails and safety restrictions of mainstream hosting platforms, they become attractive targets for hackers and criminals, creating significant security risks.

The study, conducted over nearly 300 days, found thousands of unprotected open-source AI instances exposed on the internet, including many derivatives of mainstream models such as Meta's Llama and Google's Gemma. Although some open-source models ship with built-in safety defenses, the researchers still found hundreds of cases in which those protections had been deliberately stripped out.

Security experts described the phenomenon as an "iceberg" hidden from the industry's view: open-source computing power is being used for criminal activity alongside its legitimate uses. Attackers can hijack these instances to force the models to generate large volumes of spam, craft highly targeted phishing emails, or even run large-scale disinformation campaigns.

The research team focused on open-source model instances deployed with the Ollama tool. Worryingly, in roughly 25% of the observed instances, attackers could directly read the model's "system prompts", the core underlying instructions that govern the model's behavior. Further analysis showed that 7.5% of those prompts had been modified to support harmful activity.
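To illustrate why an exposed instance leaks this information, the sketch below queries Ollama's published REST API on its default port (11434), which by itself requires no authentication. The host address is a placeholder, and the "model" request field is an assumption based on current API documentation (older versions used "name"); the study's own methodology is not described here. It can also be pointed at your own deployment as a quick self-audit.

```python
import json
import urllib.request

# Placeholder address for illustration only; an exposed instance would
# typically answer on Ollama's default port 11434 without authentication.
BASE_URL = "http://192.0.2.10:11434"


def list_models(base_url: str) -> list[str]:
    """List the models the Ollama instance is currently serving."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]


def show_model(base_url: str, name: str) -> dict:
    """Fetch model metadata; the returned Modelfile can contain a SYSTEM
    line, i.e. the system prompt discussed in the study."""
    payload = json.dumps({"model": name}).encode()  # older versions used "name"
    req = urllib.request.Request(
        f"{base_url}/api/show",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for model in list_models(BASE_URL):
        info = show_model(BASE_URL, model)
        # Anyone who can reach the port can read (and, with other API calls,
        # replace) the instructions that govern the model's behavior.
        print(model, "->", info.get("modelfile", "")[:200])
```

If a public-facing deployment returns data to a request like this, the system prompt is effectively readable, and writable, by anyone on the internet, which is the exposure the researchers quantified.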

The risks span a wide range, including but not limited to the spread of hate speech, generation of violent content, theft of personal data, financial fraud, and even content endangering children. Because these self-hosted models bypass the monitoring mechanisms of large platforms, traditional security measures are often ineffective against them.