OpenAI Battles ‘Agentic Risk’ with New Safety Bug Bounty Program

OpenAI Blog March 25, 2026

OpenAI is now paying researchers to find logic flaws that could let AI systems go rogue or leak sensitive data. For executives, this signals a critical shift in the security landscape: as we move toward 'agentic' AI that takes real-world actions, the primary threat is no longer just bad code, but bad instructions.

Key Intelligence

•OpenAI has launched a formal 'Safety Bug Bounty' to crowdsource the discovery of vulnerabilities in how their models follow instructions.
•The program specifically targets 'agentic' vulnerabilities—scenarios where an AI agent might be tricked into performing unauthorized actions on a user's behalf.
•Security experts are being incentivized to find 'Prompt Injections,' where clever phrasing can bypass an AI’s core safety guardrails.
•The focus includes data exfiltration risks, preventing AI from being manipulated into 'leaking' its training data or private user information.
•This move highlights a growing industry consensus that AI security requires a completely different playbook than traditional software patching.
•If your organization is building with AI agents, this is a clear signal that 'social engineering' of the model itself is now a top-tier corporate risk.

Read Full Source