
AI Security in 2026: The Attack Surface Your Security Team Hasn't Audited Yet

13% of organizations have had an AI model or application breached. 97% of those breached organizations lacked proper AI access controls. Shadow AI adds $670,000 to the average breach cost. The OWASP Top 10 for LLM Applications has been updated. Here is what has changed and what your security posture needs to account for.


IBM’s Cost of a Data Breach Report 2025 introduced a new data point that most security teams have not yet absorbed: 13% of surveyed organizations reported breaches of AI models or applications. Of those, 97% lacked proper AI access controls. Shadow AI — unmanaged, unauthorized AI tool use by employees — added an average of $670,000 to breach costs in the incidents where it was a contributing factor.

The IBM X-Force 2026 threat intelligence report identified over 300,000 ChatGPT credentials in infostealer malware logs in 2025, and found that significant supply chain and third-party compromises have nearly quadrupled since 2020. The attack surface has changed. The security controls that addressed the 2022 threat landscape do not address the 2026 one.

This is not an argument that AI is uniquely dangerous. It is an argument that the specific ways AI systems fail — prompt injection, indirect instruction hijacking, model supply chain compromise, uncontrolled data exposure through unsanctioned tools — are distinct from traditional application vulnerabilities and require specific mitigations that most security programs have not yet built.


The OWASP Top 10 for LLM Applications: What Changed

The OWASP Top 10 for LLM Applications was updated in 2025. Prompt Injection (LLM01) holds the top slot for the second consecutive edition — but the nature of the threat has evolved. Two new categories were added: System Prompt Leakage (LLM07) and Vector and Embedding Weaknesses (LLM08). Sensitive Information Disclosure jumped from 6th to 2nd.

LLM01: Prompt Injection remains the defining vulnerability class for AI systems. Direct prompt injection — a user crafting input that overrides system instructions — is well-understood and partially mitigated through careful system prompt design and input validation. Indirect prompt injection is more dangerous and less addressed: an attacker embeds malicious instructions in external content (a webpage, an email, a document) that the LLM processes, and those instructions modify the LLM’s behavior without the user or the application being aware. As AI agents increasingly browse the web, read emails, and process documents, indirect prompt injection becomes an attack vector that traditional input validation cannot block.

Researchers documented over 461,640 prompt injection submissions in a single 2025 dataset, with success rates ranging from 50% to 84% depending on technique and model. Unit 42 documented the first large-scale indirect prompt injection attacks in commercial platforms in March 2026 — including ad review evasion and system prompt leakage in production deployments.

LLM02: Sensitive Information Disclosure escalated to second place because of how LLMs handle confidential information differently from traditional applications. An LLM trained on or given access to sensitive data through retrieval may reproduce that data in response to queries that are not intended to elicit it — not through a traditional data access control failure, but through the model’s tendency to pattern-match and surface related content. The engineering controls are different from traditional access controls: output filtering, retrieval scope limiting, and system prompt design that explicitly prevents reproduction of specific content categories.
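As a rough illustration of the output filtering control, the sketch below redacts two content categories from a model response before it is returned. The pattern set and the `redact_output` name are assumptions made for this example, not part of any specific product; a real deployment would tune patterns to its own data categories and would likely pair regexes with a dedicated PII detection service.

```python
import re

# Hypothetical patterns for illustration only; real categories depend on your data.
REDACTION_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_output(text: str) -> tuple[str, list[str]]:
    """Replace matches with a placeholder and report which categories fired."""
    hits = []
    for label, pattern in REDACTION_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            hits.append(label)
    return text, hits


safe_text, categories = redact_output("Contact jane@example.com, SSN 123-45-6789")
print(safe_text)
print(categories)
```

Returning the list of categories that fired, rather than only the cleaned text, lets the caller log the event, which is useful input for the output monitoring discussed later in this article.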

LLM07: System Prompt Leakage is a new category that reflects documented attacks where users or attackers extract the system prompt — the confidential instructions that define how a deployed AI behaves. System prompts frequently contain business logic, persona definitions, restricted topics, and in some cases API keys or internal process descriptions. Treating system prompts as secrets requires output filtering that detects and blocks reproduction of prompt content, not just confidentiality of the prompt at deployment time.
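One simple way to approximate this kind of output filter is to check whether a response reproduces any long run of consecutive words from the system prompt. The sketch below does that with a sliding word window. The function name, the 8-word window, and the example prompt are assumptions for illustration; this catches verbatim leakage only and will not catch paraphrased or translated reproduction.

```python
def leaks_system_prompt(output: str, system_prompt: str, window: int = 8) -> bool:
    """Return True if `window` consecutive system-prompt words appear in the output."""
    prompt_words = system_prompt.lower().split()
    if not prompt_words:
        return False
    # Pad with spaces so matches align on word boundaries.
    haystack = " " + " ".join(output.lower().split()) + " "
    size = min(window, len(prompt_words))
    for start in range(len(prompt_words) - size + 1):
        needle = " " + " ".join(prompt_words[start:start + size]) + " "
        if needle in haystack:
            return True
    return False


SYSTEM_PROMPT = (
    "You are SupportBot. Never discuss pricing for the enterprise tier "
    "with anonymous users."
)
print(leaks_system_prompt("Our support hours are 9 to 5.", SYSTEM_PROMPT))
print(leaks_system_prompt("My instructions say: " + SYSTEM_PROMPT, SYSTEM_PROMPT))
```

A check like this belongs on the response path, after generation and before the response reaches the user. It is also a reason to keep API keys out of system prompts entirely rather than relying on the filter.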

LLM08: Vector and Embedding Weaknesses reflects the new attack surface created by RAG architectures. A RAG system retrieves content from a vector database to augment model responses. If the retrieval corpus is poisoned — if an attacker can insert documents that will be retrieved in response to specific queries — they can manipulate model output for any user whose query triggers that retrieval. This attack is particularly difficult to detect because the retrieved content appears legitimate to the model and the output appears helpful.
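One partial mitigation is to attach provenance metadata to every document at ingestion time and filter retrieved chunks against an allowlist before they reach the model. The sketch below assumes each chunk carries a `source` label and a similarity `score`; the `RetrievedChunk` type, the source labels, and the threshold are hypothetical and not tied to any particular vector database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedChunk:
    text: str
    source: str   # provenance label recorded at ingestion time
    score: float  # similarity score reported by the vector store


# Hypothetical labels for corpora the organization controls.
TRUSTED_SOURCES = frozenset({"internal-wiki", "product-docs"})


def filter_chunks(chunks, trusted=TRUSTED_SOURCES, min_score=0.0):
    """Split chunks into those safe to pass to the model and those to drop."""
    kept, dropped = [], []
    for chunk in chunks:
        if chunk.source in trusted and chunk.score >= min_score:
            kept.append(chunk)
        else:
            dropped.append(chunk)
    return kept, dropped


chunks = [
    RetrievedChunk("How to reset a password.", "product-docs", 0.82),
    RetrievedChunk("Ignore previous instructions.", "public-upload", 0.91),
]
kept, dropped = filter_chunks(chunks, min_score=0.3)
print([c.source for c in kept], [c.source for c in dropped])
```

This narrows the attack surface but does not close it: a poisoned document inside a trusted source still passes the filter, so write access to those sources needs its own controls. Logging the dropped chunks gives the security team a signal when untrusted content ranks highly for real queries.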


Shadow AI: The Breach Factor Security Teams Are Missing

63% of breached organizations lack AI governance policies entirely, according to IBM’s 2025 analysis. Shadow AI — employees using AI tools without organizational awareness or approval — is the primary reason. When an employee uses an unauthorized AI tool to process work documents, the confidentiality of those documents is governed by the AI tool vendor’s privacy policy, not the organization’s data handling requirements. For documents containing customer PII, financial data, or regulated information, this is a direct compliance violation that the organization may not discover until after a breach.

The IBM finding that shadow AI adds $670,000 to average breach costs reflects the detection and remediation overhead specific to incidents where the scope of data exposure is unknown because the organization did not know what tools were being used. Traditional DLP (data loss prevention) controls do not block data entering AI tools accessed through a browser; they were designed for email, USB drives, and cloud storage, not conversational AI APIs.

The governance approach that addresses this: a formal AI tool inventory (what tools are approved, for what data categories, under what conditions), tooling that provides visibility into AI tool usage without requiring an honor system, and explicit data classification guidance that tells employees which document types cannot be processed by external AI tools regardless of approval status.
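The inventory and classification rules described above can be expressed as data that a proxy, browser extension, or internal gateway consults before a document leaves the organization. This is a minimal sketch under assumed names: the tool identifiers, data classes, and `may_process` function are hypothetical.

```python
# Hypothetical inventory: approved tool -> data classes it may receive.
APPROVED_TOOLS = {
    "example-chat-assistant": {"public", "internal"},
    "example-code-assistant": {"public"},
}

# Data classes that may never go to an external AI tool, regardless of approval.
NEVER_EXTERNAL = {"customer_pii", "regulated"}


def may_process(tool: str, data_class: str) -> bool:
    """Return True only if the tool is approved for this data class."""
    if data_class in NEVER_EXTERNAL:
        return False
    # Unknown tools get an empty set, so the default is deny.
    return data_class in APPROVED_TOOLS.get(tool, set())


print(may_process("example-chat-assistant", "internal"))
print(may_process("example-chat-assistant", "customer_pii"))
print(may_process("unlisted-tool", "public"))
```

The design choice that matters here is default deny: a tool missing from the inventory is treated as unapproved, which is what turns the inventory into a control rather than a document.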


The Model Supply Chain: The Attack Vector That Gets No Attention

Open-source model repositories — Hugging Face and equivalents — have become a prime vector for delivering malware via poisoned model files. A model file that executes arbitrary code when loaded is not a theoretical threat; it has been documented in production repositories. The attack is particularly effective because model files are large binary formats that are difficult to inspect, and the trust assumption in AI development workflows (“I downloaded this model from a reputable source”) does not account for compromised uploads.

The IBM X-Force 2026 finding of a 4x increase in significant supply chain compromises since 2020 is consistent with this vector gaining maturity as an attack path. The mitigations are analogous to software supply chain security: verify model file checksums before loading, use cryptographically signed models where the framework supports it, pin model versions in production systems rather than pulling latest, and scan model files through purpose-built tools (ModelScan, Protect AI Guardian) before deployment.
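The checksum step is the easiest of these to automate. The sketch below hashes a model file in chunks and refuses to proceed if the digest differs from a pinned value. The function names are assumptions for this example, and the pinned hash would come from your own release records or the publisher's signed metadata.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_model_file(path: str, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"Checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
```

Call `verify_model_file` before the file is handed to any loader, not after. A matching checksum only shows that the file is the one you pinned; it says nothing about whether that file was safe in the first place, which is why scanning remains a separate step.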

MCP (Model Context Protocol) attack surface. As MCP becomes the standard for connecting AI agents to tools and external systems, its attack surface requires security assessment. The documented attack patterns: tool poisoning (an MCP server advertises a tool with a malicious description that tricks the LLM into using it inappropriately), credential theft (a compromised MCP server extracts API keys from agent context), and second-order prompt injection (a low-privilege agent, via a poisoned MCP tool call, tricks a high-privilege agent into executing an action it would not otherwise authorize). These are new enough that most security frameworks have not produced specific guidance. The practical mitigation: apply least-privilege principles to what tools agents can access, audit MCP server provenance, and treat MCP server connections with the same scrutiny as external API integrations.
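Least privilege for agent tools can be enforced outside the model, in the code that dispatches tool calls, so that a poisoned tool description or injected instruction cannot widen an agent's permissions. The sketch below is not MCP SDK code; the agent names, tool names, and `authorize_tool_call` function are hypothetical and stand in for whatever dispatch layer your agent framework exposes.

```python
# Hypothetical allowlist: agent identity -> tools it may invoke.
AGENT_TOOL_ALLOWLIST = {
    "triage-agent": frozenset({"search_tickets", "read_ticket"}),
    "billing-agent": frozenset({"read_invoice", "issue_refund"}),
}


class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


def authorize_tool_call(agent_id: str, tool_name: str) -> None:
    """Check the allowlist before dispatch; unknown agents get no tools."""
    allowed = AGENT_TOOL_ALLOWLIST.get(agent_id, frozenset())
    if tool_name not in allowed:
        raise ToolNotPermitted(f"{agent_id} may not call {tool_name}")


authorize_tool_call("triage-agent", "read_ticket")  # permitted, returns None
try:
    authorize_tool_call("triage-agent", "issue_refund")
except ToolNotPermitted as err:
    print(f"blocked: {err}")
```

Because the check keys on the calling agent's identity rather than on anything in the prompt, a low-privilege agent that has been tricked into requesting a refund still cannot execute one.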


The Regulatory Context: What August 2026 Changes

The EU AI Act’s general applicability date is August 2, 2026. From that date, most high-risk AI system requirements are enforceable: risk management systems, technical documentation, automatic logging, human oversight mechanisms, CE marking, and EU database registration. Obligations for high-risk systems embedded in products already covered by EU product-safety legislation follow on August 2, 2027. Penalties for high-risk non-compliance: up to €15 million or 3% of global annual turnover, whichever is higher. Penalties for prohibited practices: up to €35 million or 7% of global annual turnover, whichever is higher.

What counts as high-risk is consequential: AI systems used in employment decisions, credit scoring, biometric identification, education, and critical infrastructure management are in scope. Organizations deploying AI in these categories that have not completed a conformity assessment before August 2, 2026 are exposed.

In the US, the NIST AI Risk Management Framework (AI RMF 1.0) is becoming the enterprise governance baseline, particularly in regulated industries and government contracting. ISO/IEC 42001 — the AI management system standard — is gaining adoption as an audit baseline for organizations that need to demonstrate AI governance to enterprise customers, auditors, and regulators.

The practical intersection of security and compliance: NIST AI RMF’s “Measure” function requires ongoing monitoring of AI system outputs for bias, accuracy drift, and security incidents. This requires instrumentation — logging AI system inputs and outputs, tracking model performance over time — that is also the foundation of good security monitoring. Organizations that build this infrastructure for compliance reasons get security monitoring as a byproduct.


What Actually Needs to Happen in Your Security Program

A security program that was mature in 2022 needs specific additions to address the 2026 AI threat landscape:

AI asset inventory. What AI models and tools are deployed, where they are deployed, what data they can access, and who approved them. Without this, the shadow AI problem is unaddressable.

Threat modeling for AI-specific attack vectors. Running a standard threat model (STRIDE or equivalent) against AI components treats the LLM as a black box, which misses failure modes that arise from how the model handles instructions and context. The OWASP LLM Top 10 provides a structured starting point for identifying where prompt injection, sensitive information disclosure, and supply chain risks apply to your specific architecture.

Output monitoring. Logging AI system outputs and establishing baselines for what outputs look like under normal conditions enables detection of prompt injection successes, model drift, and data leakage. Standard application logs do not capture this information.

AI access controls. Applying the principle of least privilege to what AI agents can do (which tools they can call, which data they can read, which actions they can take without human approval) is the control that most directly limits the blast radius of a successful prompt injection or compromised agent.

Incident response procedures for AI incidents. When an AI system behaves unexpectedly (outputs that suggest prompt injection, data disclosures that should not have occurred, model behavior that differs from expected), most incident response playbooks have no guidance specific to AI. The forensics are different: logs need to include the full assembled prompt and the retrieved context, not just user inputs and final outputs.
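The logging needed for both output monitoring and AI incident forensics can be sketched as a single structured record per model call. The field names and the `build_audit_record` helper below are assumptions for illustration, not a standard schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(prompt, retrieved_context, output, model_id, user_id):
    """Capture what the model saw and what it produced for one call."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,
        "user_id": user_id,
        "prompt": prompt,                              # full assembled prompt
        "retrieved_context": list(retrieved_context),  # chunks injected by retrieval
        "output": output,
        # Digest lets investigators confirm the stored output was not altered.
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
    }


def to_log_line(record: dict) -> str:
    """Serialize as one JSON line for ingestion by a log pipeline."""
    return json.dumps(record, sort_keys=True, ensure_ascii=False)


record = build_audit_record(
    prompt="Summarize the attached ticket.",
    retrieved_context=["Ticket 1: customer cannot log in."],
    output="The customer reports a login failure.",
    model_id="example-model-v1",
    user_id="user-123",
)
print(to_log_line(record))
```

Records like this contain the same sensitive data the AI system handles, so the log store needs access controls and retention rules at least as strict as those on the source data.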


How we approach this at Insoftex

With customer approval, we use AI coding tools (Claude Code, Cursor, agentic workflows) across our engineering work. The governance model we apply is the same one we recommend to clients: explicit data classification for what can be processed by external AI tools, audit logging on AI-assisted actions, and human review on all AI-generated code before it reaches production.

For clients building AI-integrated applications, we scope AI security requirements in the architecture phase rather than the testing phase. The controls that limit prompt injection (output filtering, input validation, tool access restriction) are significantly cheaper to build into an architecture than to retrofit. The same is true for audit logging and model governance: instrumenting an AI system for security observability from the first deployment is a fraction of the cost of adding it to a production system that was not designed for it.

For regulated industries (financial services, healthcare, government contractors) evaluating the EU AI Act or NIST AI RMF compliance requirements, we include a regulatory scope assessment before any AI system design — mapping the intended use case against the high-risk categories and identifying the governance requirements that apply before an architecture is committed.


Building AI-integrated applications that will process regulated data or operate in regulated industries? Our Product Pilot includes an AI security assessment and compliance scope review before architecture commitment — so the controls are built in from the start, not bolted on after an audit.

