APIs have long relied on traditional gateways to manage and secure client requests. These gateways handle authentication, routing, rate limits, and data validation for web and mobile apps. However, connecting AI systems (specifically large language models, or LLMs) to external data introduces new architectures. The emerging Model Context Protocol (MCP) defines a way for LLMs to call external tools or services via special “MCP servers.” In this post, we compare the security of an MCP-style LLM-oriented server with a classic API gateway.
While both architectures serve API requests, LLM-driven systems expand the threat surface in unique ways. Traditional gateways assume a known client and fixed endpoints. In contrast, an LLM server lets the model dynamically choose methods and parameters. This difference means attackers can exploit the AI in novel ways. Below, we discuss how the attack surface changes and what developers can do to mitigate risks on each side.
Some platforms also offer model-level guardrails: OrionAI’s AI Modes feature, for example, includes a “Cybersecurity Mode” that helps the model prioritize secure practices. Custom modes like this can guide the LLM to behave more safely by design.
Traditional API Gateway Attack Surface
A typical API gateway sits in front of microservices or backend APIs. It enforces security controls like authentication, authorization, and request filtering. Key threats to watch for include:
- Broken or Missing Authentication/Authorization: If API keys, tokens, or session management are flawed, attackers can impersonate users or escalate privileges. An unprotected endpoint could leak data or allow unauthorized actions (similar to Broken Object Level Authorization in the OWASP API Security Top 10).
- Injection Attacks: SQL, NoSQL, or command injections can occur if inputs reach databases or OS calls unsanitized. The gateway should validate request payloads and use parameterized queries or prepared statements to prevent this (see the sketch after this list).
- Server-Side Request Forgery (SSRF): When an API fetches external resources (URLs, files, or services) based on user input, attackers may trick it into contacting internal systems or malicious hosts. Gateways should sanitize any target URLs or restrict allowed destinations.
- Rate Limiting / Denial of Service: Unchecked traffic can overwhelm services. Attackers may flood endpoints to exhaust CPU, memory, or network. A gateway typically implements throttling, quotas, and DoS protection to mitigate this.
- Security Misconfiguration: Exposed debug routes, default credentials, or overly permissive CORS settings are common pitfalls. Proper configuration management (disable unused endpoints, enforce HTTPS, etc.) reduces this risk.
- Data Exposure: APIs that return too much data can leak sensitive fields. Services or gateways should implement strict response filtering (only expose needed fields) and encryption where appropriate.
- Logging and Monitoring Gaps: Without consistent logging of requests and anomalies, attacks may go undetected. Gateways should log all calls, errors, and unusual patterns to support incident response and audits.
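As a concrete illustration of the injection point above, here is a minimal sketch of input validation plus a parameterized query behind a gateway route. The `get_order` handler, the SQLite backend, and the `orders` table are assumptions made for the example, not part of any particular gateway product.

```python
import re
import sqlite3

ORDER_ID_PATTERN = re.compile(r"^\d{1,10}$")  # expected shape of an order ID

def get_order(conn: sqlite3.Connection, raw_order_id: str):
    # Reject anything that does not look like an order ID before it
    # reaches the database layer.
    if not ORDER_ID_PATTERN.fullmatch(raw_order_id):
        raise ValueError("invalid order id")

    # The placeholder keeps user input out of the SQL text, so a payload
    # like "1; DROP TABLE orders" is rejected above and never executed.
    cur = conn.execute(
        "SELECT id, status, total FROM orders WHERE id = ?",
        (int(raw_order_id),),
    )
    return cur.fetchone()
```

The same pattern applies to any backend the gateway fronts: validate the shape of the input first, then pass it to the data layer only through bound parameters.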
LLM-Oriented (MCP) Server Attack Surface
An MCP server exposes tools or APIs specifically for consumption by an AI model. Instead of fixed REST endpoints, the model sends JSON “tool requests” containing a method name, parameters, and context.
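To make that concrete, here is a simplified sketch (as a plain Python dict) of the kind of tool-call payload an LLM client might send. The field names follow MCP’s JSON-RPC shape, but treat the exact schema, the `get_order` tool, and its argument as illustrative rather than normative.

```python
import json

# Illustrative tool-call payload an LLM client might send to an MCP-style
# server; the get_order tool and its argument are hypothetical.
tool_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {
        "name": "get_order",                # which tool the model chose
        "arguments": {"order_id": "1042"},  # parameters the model filled in
    },
}

print(json.dumps(tool_request, indent=2))
```

This setup changes the threat model in several ways: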
- Prompt Injection (Direct and Indirect): Attackers can embed malicious instructions in user-controlled content that the LLM interprets as commands. For example, if a document or chat prompt contains hidden guidance, the model might execute unauthorized actions. Even innocuous input might inadvertently trigger unwanted behavior in the AI system.
- Tool Poisoning: In MCP, each exposed “tool” has metadata (name, description). If an attacker can supply or modify a tool definition (for example in a shared config), they might hide harmful instructions in the description. The LLM, trusting that description, could perform unintended actions (like data exfiltration) when the tool is invoked.
- Dynamic Tool Updates & Rug Pulls: Unlike a static API, MCP tools can be updated over time. A tool that was benign at install might be altered later to maliciously reroute calls (a “rug pull” attack). If the server doesn’t verify tool definitions, the LLM could be tricked into using compromised tools.
- Cross-Server Shadowing: An LLM agent might connect to multiple MCP servers (tools). A malicious server could shadow or intercept calls meant for a trusted one, causing data leakage or unauthorized commands across contexts.
- Authentication & Context Trust: The server receives context from the LLM (e.g., auth tokens or user IDs). Because the model could hallucinate or alter these values, the server must not trust the AI’s claims without verification. Otherwise, an attacker could manipulate the context to impersonate users or escalate privileges.
- Injection via Model Output: Even though requests are structured, LLM output can include malicious content. For example, if the model’s response is used to build a database query or shell command, hidden payloads could exploit that service. Every field (even numeric or URL) should be validated against expected formats or schemas.
- Semantic Attacks & Hallucinations: LLMs can generate plausible but incorrect instructions. An attacker might poison the model’s knowledge or instruct it indirectly, causing it to call the wrong tools. Subtle changes in prompt wording can drastically alter behavior, so thorough prompt design and input sanitization are essential.
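As flagged under Authentication & Context Trust, identity should come from a credential the server verifies itself, never from a field the model wrote. A minimal sketch, assuming the PyJWT library and an HMAC-signed session token; the claim names and context fields are illustrative:

```python
import jwt  # pip install pyjwt

SECRET = "load-from-your-secret-store"  # placeholder, not a real key

def resolve_user(request_context: dict) -> str:
    """Return the caller's user ID from a verified token, never from the model."""
    token = request_context.get("auth_token", "")
    try:
        claims = jwt.decode(token, SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError as exc:
        raise PermissionError("invalid or missing token") from exc

    # Use the verified subject claim; ignore any user_id the model may
    # have placed in the context on its own.
    return claims["sub"]
```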
Comparing Attack Surfaces
Overall, traditional gateways mainly defend against direct hacking attempts on well-defined HTTP APIs. In contrast, MCP servers also face attacks mediated through the LLM’s reasoning. For example, a SQL injection is still a risk if user-supplied text reaches a database query, but the attacker might launch it via the AI’s output rather than a raw HTTP parameter. Prompt injections and tool manipulations, by contrast, have no direct analogue in classic APIs. Key differences include:
- Client Control: Traditional clients send fixed requests. LLM clients decide actions at runtime. Attackers can exploit the AI’s decision-making, not just its inputs.
- Endpoint Exposure: API gateways expose fixed routes (like `/getOrder`). MCP servers expose tool interfaces (like a `get_order` method) that the LLM can call. The model might try invoking any tool, valid or not, so servers must explicitly check each requested tool and parameter.
- Input Trust: In normal APIs, user input is untrusted but expected to fit predefined fields. In MCP, even the choice of which tool to call is effectively “input.” The model might attempt disallowed operations unless the server enforces a strict allowlist of actions (see the sketch after this list).
- Attack Vector: Traditional threats come from crafted network requests. LLM threats can come from content given to the model (prompts, documents) that indirectly lead to malicious API calls, as well as from malformed or adversarial requests.
- Defense Posture: API gateways rely on HTTPS, parameter checks, rate limits, and auth tokens. MCP servers need all that plus AI-specific controls (like validating model output, enforcing metadata integrity, and guarding against prompt injection).
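As referenced in the Input Trust point, a server-side allowlist keeps the model’s choice of tool inside known bounds. A minimal sketch; the tool names and stub handlers are hypothetical, and argument validation is covered in the next section:

```python
from typing import Any, Callable

def get_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}  # stub handler

def list_products(category: str) -> list[str]:
    return ["widget", "gadget"]  # stub handler

# Only tools registered here can ever be invoked, regardless of what the
# model asks for.
ALLOWED_TOOLS: dict[str, Callable[..., Any]] = {
    "get_order": get_order,
    "list_products": list_products,
}

def dispatch(tool_name: str, arguments: dict) -> Any:
    handler = ALLOWED_TOOLS.get(tool_name)
    if handler is None:
        # The model requested something we never exposed; refuse rather than guess.
        raise PermissionError(f"tool not allowed: {tool_name}")
    return handler(**arguments)
```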
Mitigation Strategies
Many security best practices apply to both architectures, but LLM servers need extra AI-specific guards. For traditional gateways, enforce strong auth (OAuth/mTLS), strict input validation, HTTPS, and logging. For MCP servers, layer on checks that treat the LLM as an untrusted client:
- Strict Authentication and Least Privilege: Require robust auth (OAuth tokens, JWTs, API keys) and only grant the LLM the minimum permissions it needs. Do not expose sensitive secrets in model context. Rotate credentials regularly and validate them on the server side.
- Schema Validation and Input Sanitization: Validate every field from the LLM against a defined schema or type. For example, if a parameter should be an integer or URL, reject anything else. Sanitize strings to remove code or injection payloads before using them in queries or commands (see the sketch after this list).
- Tool Whitelisting and Metadata Protection: Only allow known tools and operations. Do not accept arbitrary or user-submitted tool definitions at runtime. Keep tool descriptions simple and controlled to avoid hidden instructions. Consider signing or checksumming tool configurations so any unauthorized change is detected.
- Output Monitoring and Auditing: Log all tool calls and model outputs. Record the chosen method, parameters, and responses. Monitor these logs for unusual patterns (e.g. unexpected data values or tools). Alert on anomalies like very frequent calls or use of rarely used tools.
- Rate Limiting and Quotas: Apply rate limits as you would on any API. Even AI-driven calls can be abused to over-consume resources. Enforcing quotas per user or project can limit blast radius if an LLM agent goes rogue.
- Content Filtering and Sanitization: Filter prompts and context for harmful content. Prevent the model from seeing data that could trigger dangerous behavior (scripts, encoded payloads, hidden commands). Consider using prompt–response sanitizers (as discussed in our prompt engineering guide). This helps catch direct or indirect prompt injections early.
- Least Privilege in Data Access: Expose only necessary data to the LLM. For example, separate read-only actions from write actions, and apply stricter checks on any operation that modifies data. Use principles like field-level ACLs so the model only retrieves fields it needs.
- Secure Deployment and Updates: Run MCP servers in hardened environments (containers or sandboxes) and keep software up to date. Because MCP frameworks are new, monitor security advisories for them and any language runtimes. Regularly review the code for patterns like shell execution from model input (avoid calling `os.system` on untrusted data, etc.).
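To tie the schema-validation and metadata-integrity points together, here is a minimal sketch assuming the jsonschema library; the `get_order` argument schema, the recorded digest, and the helper names are illustrative assumptions, not part of any MCP framework.

```python
import hashlib
import json

from jsonschema import ValidationError, validate  # pip install jsonschema

# Expected shape of the get_order tool's arguments.
GET_ORDER_SCHEMA = {
    "type": "object",
    "properties": {"order_id": {"type": "string", "pattern": r"^\d{1,10}$"}},
    "required": ["order_id"],
    "additionalProperties": False,
}

# SHA-256 of each tool definition, recorded when it was reviewed and approved.
TRUSTED_TOOL_DIGESTS = {
    "get_order": "0" * 64,  # placeholder; store the real digest at review time
}

def verify_tool_metadata(name: str, definition: dict) -> None:
    """Refuse to use a tool whose definition changed since it was approved."""
    digest = hashlib.sha256(
        json.dumps(definition, sort_keys=True).encode()
    ).hexdigest()
    if digest != TRUSTED_TOOL_DIGESTS.get(name):
        raise RuntimeError(f"tool definition for {name!r} has changed")

def validate_arguments(arguments: dict) -> None:
    """Reject tool arguments that do not match the declared schema."""
    try:
        validate(instance=arguments, schema=GET_ORDER_SCHEMA)
    except ValidationError as exc:
        raise ValueError(f"rejected tool arguments: {exc.message}") from exc
```

Running both checks before dispatching a call means a tampered tool description or an out-of-shape argument is rejected before it ever reaches a backend.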
Conclusion
LLM-oriented servers introduce a broader attack surface beyond standard API threats. By design, they let an AI model dictate which actions to perform, so attackers can try to manipulate that AI layer. However, by applying solid engineering principles and adding AI-specific guards, teams can build safer integrations. Treat the LLM as an untrusted client: authenticate all calls, validate all inputs, and monitor behavior closely. As we explored in our Token Management Techniques post, large context windows aren’t infallible, and older instructions can fade or be dropped. With thorough testing and robust monitoring, an MCP server can securely bridge AI models to real data and tools.