Zero Trust Architecture for Backend Systems — Never Trust, Always Verify

Sanjeev Sharma
7 min read

Introduction

Zero trust is the security model for 2026. The old perimeter-based approach assumed that anything inside the network was safe. Modern systems are distributed, multi-cloud, and exposed to the internet, so that assumption no longer holds. Zero trust treats every request as untrusted: every service-to-service call must be authenticated and authorised.

This post covers implementing zero trust for microservices: mutual TLS (mTLS), service identities, OPA policies, and short-lived credentials.

Zero Trust Principles for Microservices

  1. Never trust, always verify: Every request requires authentication, even on internal networks
  2. Least privilege: Services get only the permissions they need
  3. Inspect and log: Every interaction is logged and monitored
  4. Assume breach: Design for compromise. If one service is hacked, limit lateral movement

Traditional (perimeter-based):

User -> [Firewall] -> Internal Network -> Service A -> Service B
(Assume everything inside firewall is safe)

Zero trust:

User -[verify]-> Service A -[mTLS+policy]-> Service B
(Verify every hop, every request)
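The difference between the two models can be sketched as a pair of decision functions. This is only an illustration: `HopRequest`, `perimeterDecision`, and `zeroTrustDecision` are hypothetical names, and the boolean fields stand in for the identity and policy checks covered in the sections that follow.

```typescript
// Hypothetical model of one service-to-service hop.
interface HopRequest {
  sourceIsInternal: boolean; // request originates inside the firewall
  identityVerified: boolean; // caller proved its identity (for example via mTLS)
  policyAllows: boolean;     // an authorisation policy permits this call
}

// Perimeter model: network location alone grants access.
function perimeterDecision(req: HopRequest): boolean {
  return req.sourceIsInternal;
}

// Zero trust: location is ignored; identity and policy are checked on every hop.
function zeroTrustDecision(req: HopRequest): boolean {
  return req.identityVerified && req.policyAllows;
}

// A compromised internal pod that cannot present a valid identity.
const lateralMove: HopRequest = {
  sourceIsInternal: true,
  identityVerified: false,
  policyAllows: false,
};

console.log(perimeterDecision(lateralMove)); // true  (attacker moves laterally)
console.log(zeroTrustDecision(lateralMove)); // false (hop is rejected)
```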

mTLS Between Services with cert-manager

Mutual TLS means both client and server verify each other with certificates. cert-manager automates certificate issuance and renewal.

Install cert-manager:

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml

Create a self-signed CA:

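# Bootstrap issuer that signs the CA certificate (the name is illustrative)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-bootstrap
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca
  namespace: cert-manager
spec:
  secretName: internal-ca-key-pair
  commonName: internal-ca
  isCA: true
  issuerRef: { name: selfsigned-bootstrap, kind: ClusterIssuer }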

Create issuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca
spec:
  ca:
    secretName: internal-ca-key-pair

Issue service certificates:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-service-cert
  namespace: default
spec:
  secretName: api-service-tls
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
  dnsNames:
    - api.default.svc.cluster.local

Use mTLS in service pods:

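// api-service/src/server.ts
import fs from 'fs';
import https from 'https';
import express from 'express';

const app = express();

// Key pair and CA bundle that cert-manager writes into the mounted secret
const key = fs.readFileSync('/etc/tls/private/tls.key');
const cert = fs.readFileSync('/etc/tls/private/tls.crt');
const ca = fs.readFileSync('/etc/tls/private/ca.crt');

const options = {
  key,
  cert,
  ca,
  requestCert: true, // ask every client for a certificate
  rejectUnauthorized: true, // reject clients whose certificate is not signed by the CA
};

https.createServer(options, app).listen(3000);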

Pod manifest:

apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  containers:
    - name: api
      image: api:latest
      volumeMounts:
        - name: tls
          mountPath: /etc/tls/private
          readOnly: true
  volumes:
    - name: tls
      secret:
        secretName: api-service-tls

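cert-manager renews certificates automatically before they expire and updates the secret, so no manual renewal is needed. Note that the server reads the files once at startup, so it has to reload them (or the pod has to restart) before a renewed certificate takes effect.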
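One way to pick up renewed certificates without a restart is to swap the server's TLS context periodically. The sketch below uses Node's `server.setSecureContext()`; the helper names `loadTlsOptions` and `reloadOnInterval` and the 60-second interval are assumptions for illustration. Polling is used instead of a file watcher because Kubernetes updates mounted secrets by swapping a symlink, which watchers may not report reliably.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';
import * as https from 'node:https';

// Read the current key, certificate, and CA bundle from the mounted secret.
function loadTlsOptions(dir: string): https.ServerOptions {
  return {
    key: fs.readFileSync(path.join(dir, 'tls.key')),
    cert: fs.readFileSync(path.join(dir, 'tls.crt')),
    ca: fs.readFileSync(path.join(dir, 'ca.crt')),
    requestCert: true,
    rejectUnauthorized: true,
  };
}

// Re-read the files on a timer and replace the TLS context. New connections
// use the new certificate; established connections are left untouched.
function reloadOnInterval(
  server: https.Server,
  dir: string,
  intervalMs: number,
): ReturnType<typeof setInterval> {
  return setInterval(() => {
    try {
      server.setSecureContext(loadTlsOptions(dir));
    } catch (err) {
      // Keep serving with the previous context if a reload fails.
      console.error('TLS reload failed; keeping previous context', err);
    }
  }, intervalMs);
}

// Usage (assumes `app` is the request handler from server.ts):
//   const server = https.createServer(loadTlsOptions('/etc/tls/private'), app);
//   reloadOnInterval(server, '/etc/tls/private', 60_000);
//   server.listen(3000);
```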

SPIFFE/SPIRE for Service Identity

SPIFFE is an open specification for identifying services and issuing them short-lived, verifiable identity documents (SVIDs). SPIRE is its production-ready reference implementation.

Install SPIRE:

helm install spire spiffe/spire -n spire --create-namespace

Service attestation:

apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterStaticEntry
metadata:
  name: api-service # cluster-scoped, so no namespace
spec:
  spiffeID: spiffe://example.com/ns/default/sa/api
  parentID: spiffe://example.com/spire/agent
  selectors: # a workload must match these to receive the identity
    - k8s:ns:default
    - k8s:sa:api
  federatesWith: []

Client requests SPIFFE credential:

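// `spire` stands in for a SPIFFE Workload API client; requestSVID is
// illustrative pseudocode, not a call from a specific library.
const credential = await spire.requestSVID({
  spiffeID: 'spiffe://example.com/ns/default/sa/api',
  // JWT-SVIDs are bound to an audience; this value is an assumption
  audience: 'spiffe://example.com/ns/default/sa/worker',
});

const response = await fetch('https://worker.svc.cluster.local/process', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${credential.jwt}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(data),
});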

Server verifies SPIFFE JWT:

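import { createLocalJWKSet, jwtVerify } from 'jose';

// Assumption: trustBundleJwks holds the trust domain's JWT bundle (a JWKS)
// fetched from the SPIFFE Workload API; loading it is not shown here.
const trustBundle = createLocalJWKSet(trustBundleJwks);

app.use(async (req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'Missing token' });

  try {
    const { payload } = await jwtVerify(token, trustBundle, {
      // Assumption: the audience is this service's own SPIFFE ID
      audience: 'spiffe://example.com/ns/default/sa/worker',
    });
    req.serviceID = payload.sub; // spiffe://example.com/ns/default/sa/api
    next();
  } catch {
    res.status(401).json({ error: 'Invalid token' });
  }
});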

SPIFFE decouples identity from IP address. Services identify by name, not by network location.
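Because the identity is a structured URI rather than an address, a service can check the trust domain and workload path directly. The sketch below is a simplified parser for workload IDs of the form `spiffe://<trust-domain>/<path>`; `parseSpiffeId` and `isTrustedCaller` are hypothetical helper names, and the regular expression is a simplification that does not cover every rule in the SPIFFE ID specification.

```typescript
interface SpiffeId {
  trustDomain: string;
  path: string;
}

// Parse a SPIFFE ID of the form spiffe://<trust-domain>/<path>.
// Returns null for anything that does not match that shape.
function parseSpiffeId(id: string): SpiffeId | null {
  const match = /^spiffe:\/\/([a-z0-9._-]+)(\/[^?#]+)$/.exec(id);
  if (!match) return null;
  return { trustDomain: match[1], path: match[2] };
}

// Accept a caller only if it belongs to our trust domain and its path is allow-listed.
function isTrustedCaller(id: string, trustDomain: string, allowedPaths: string[]): boolean {
  const parsed = parseSpiffeId(id);
  return (
    parsed !== null &&
    parsed.trustDomain === trustDomain &&
    allowedPaths.includes(parsed.path)
  );
}

console.log(parseSpiffeId('spiffe://example.com/ns/default/sa/api'));
// { trustDomain: 'example.com', path: '/ns/default/sa/api' }
```

An ID with the right path but a different trust domain, such as `spiffe://evil.test/ns/default/sa/api`, is rejected by `isTrustedCaller`, which is the property that matters when trust domains are federated.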

OPA (Open Policy Agent) for Fine-Grained Authorisation

OPA evaluates policies written in Rego, a declarative language.

Install OPA:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/opa/main/docs/deployments/kubernetes/kube-mgmt/kube-mgmt-deployment.yaml

Write a policy:

# policies/api-service.rego
package api.authz

import rego.v1

default allow := false

allow if {
  # Only worker service can call /process
  input.method == "POST"
  input.path == "/process"
  input.caller == "spiffe://example.com/ns/default/sa/worker"
}

allow if {
  # Only analytics service can call /metrics
  input.path == "/metrics"
  input.caller == "spiffe://example.com/ns/default/sa/analytics"
}

allow if {
  # Admin can call anything
  input.caller_role == "admin"
}

Evaluate policy in service:

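// Querying the rule directly returns { "result": true } or { "result": false }
const opaURL = 'http://localhost:8181/v1/data/api/authz/allow';

app.use(async (req, res, next) => {
  try {
    const decision = await fetch(opaURL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // OPA's Data API expects the document under an "input" key
      body: JSON.stringify({
        input: {
          method: req.method,
          path: req.path,
          caller: req.serviceID,
          caller_role: req.role,
        },
      }),
    }).then((r) => r.json());

    if (decision.result !== true) {
      return res.status(403).json({ error: 'Forbidden' });
    }

    next();
  } catch {
    // Fail closed if OPA is unreachable
    res.status(503).json({ error: 'Authorization unavailable' });
  }
});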

Policies are version controlled, audited, and updated without redeploying services.
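How the three `allow` rules combine is easy to misread: conditions inside one rule body are ANDed, rules with the same name are ORed, and the default applies when none matches. The sketch below mirrors the policy in TypeScript purely to illustrate those semantics. It is not how OPA evaluates Rego, and `AuthzInput`, `rules`, and `allow` are names chosen for this example.

```typescript
interface AuthzInput {
  method: string;
  path: string;
  caller?: string;
  caller_role?: string;
}

// Each entry mirrors one `allow` rule body: every condition must hold (AND).
const rules: Array<(input: AuthzInput) => boolean> = [
  (i) =>
    i.method === 'POST' &&
    i.path === '/process' &&
    i.caller === 'spiffe://example.com/ns/default/sa/worker',
  (i) =>
    i.path === '/metrics' &&
    i.caller === 'spiffe://example.com/ns/default/sa/analytics',
  (i) => i.caller_role === 'admin',
];

// Rules with the same name combine with OR; if none matches, the default (false) applies.
function allow(input: AuthzInput): boolean {
  return rules.some((rule) => rule(input));
}

console.log(
  allow({
    method: 'POST',
    path: '/process',
    caller: 'spiffe://example.com/ns/default/sa/worker',
  }),
); // true
console.log(allow({ method: 'GET', path: '/process' })); // false
```

The same input and expected-result pairs make reasonable cases for policy unit tests written against the real Rego file.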

Network Policies in Kubernetes

NetworkPolicy restricts traffic between pods: deny by default and allow only what's needed. Enforcement requires a CNI plugin that supports NetworkPolicy (for example Calico or Cilium).

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-service
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: api-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: gateway
      ports:
        - protocol: TCP
          port: 3000
    - from:
        - podSelector:
            matchLabels:
              app: worker
      ports:
        - protocol: TCP
          port: 3000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 53 # DNS
        - protocol: UDP
          port: 53

Only gateway and worker can call api-service. api-service can only call postgres and DNS. All other traffic is blocked.
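The ingress half of that policy can be modelled in a few lines to make the evaluation rule explicit: once a pod is selected by a policy that lists Ingress, a connection is allowed only if at least one rule matches both the source pod and the port. This is a simplified sketch that handles `podSelector.matchLabels` within one namespace only; the types and function names are hypothetical.

```typescript
type Labels = Record<string, string>;

interface IngressRule {
  fromPodLabels: Labels[]; // simplified: podSelector.matchLabels only
  ports: number[];
}

// True when every key/value pair in the selector is present on the pod.
function matches(selector: Labels, pod: Labels): boolean {
  return Object.entries(selector).every(([key, value]) => pod[key] === value);
}

// Allowed only if some rule matches both the source pod and the destination port.
function ingressAllowed(rules: IngressRule[], source: Labels, port: number): boolean {
  return rules.some(
    (rule) =>
      rule.ports.includes(port) &&
      rule.fromPodLabels.some((selector) => matches(selector, source)),
  );
}

// The two ingress rules from the api-service policy.
const apiServiceIngress: IngressRule[] = [
  { fromPodLabels: [{ app: 'gateway' }], ports: [3000] },
  { fromPodLabels: [{ app: 'worker' }], ports: [3000] },
];

console.log(ingressAllowed(apiServiceIngress, { app: 'gateway' }, 3000));   // true
console.log(ingressAllowed(apiServiceIngress, { app: 'analytics' }, 3000)); // false
```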

Short-Lived Credentials with AWS IAM Roles for Service Accounts

In AWS EKS, use IRSA (IAM Roles for Service Accounts) instead of long-lived keys.

Set up IRSA:

eksctl utils associate-iam-oidc-provider --cluster=my-cluster --approve

Create IAM role:

aws iam create-role --role-name api-service-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/OIDC_ID"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            "OIDC_ID:sub": "system:serviceaccount:default:api-service"
          }
        }
      }
    ]
  }'

Attach policy:

aws iam put-role-policy --role-name api-service-role \
  --policy-name s3-access \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-bucket/*"
      }
    ]
  }'

Link to service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/api-service-role
---
apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  serviceAccountName: api-service
  containers:
    - name: api
      image: api:latest

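The pod receives temporary AWS credentials automatically: EKS projects a service account token into the pod, and the AWS SDK exchanges it for short-lived role credentials. No long-lived keys need to be stored.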
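When the role cannot be assumed, the usual cause is a mismatch between the token's `sub` claim and the `StringEquals` condition in the trust policy. The sketch below decodes a token's payload for inspection and compares the subject. It does not verify the signature (AWS STS does that), and the helper names are hypothetical.

```typescript
// Decode the payload of a JWT WITHOUT verifying it. For debugging only.
function decodeJwtPayload(token: string): Record<string, unknown> {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('not a JWT');
  return JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
}

// The subject Kubernetes assigns to a service account token.
function serviceAccountSubject(namespace: string, name: string): string {
  return `system:serviceaccount:${namespace}:${name}`;
}

// True when the token's `sub` would satisfy the trust policy's StringEquals condition.
function matchesTrustPolicy(token: string, namespace: string, name: string): boolean {
  return decodeJwtPayload(token).sub === serviceAccountSubject(namespace, name);
}

// Usage inside a pod (on EKS, AWS_WEB_IDENTITY_TOKEN_FILE points at the projected token):
//   import * as fs from 'node:fs';
//   const token = fs.readFileSync(process.env.AWS_WEB_IDENTITY_TOKEN_FILE!, 'utf8');
//   console.log(matchesTrustPolicy(token, 'default', 'api-service'));
```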

Secret Rotation Without Downtime

Use External Secrets Operator to sync secrets from Vault into Kubernetes so that rotated values reach pods without a restart.

apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-store
  namespace: default
spec:
  provider:
    vault:
      server: https://vault.example.com
      path: secret
      auth:
        kubernetes:
          mountPath: kubernetes
          role: api-service
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-secret
  namespace: default
spec:
  secretStoreRef:
    name: vault-store
  target:
    name: db-secret
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: database
        version: latest
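  refreshInterval: 1h # Re-sync from Vault every hour

External Secrets Operator polls Vault every refreshInterval and updates the Kubernetes secret when the value has changed. Pods that mount the secret as a volume see the new value once the kubelet syncs it and can reload it without restarting; values injected as environment variables do not change until the pod restarts.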
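On the application side, reloading can be as simple as re-reading the mounted file after a short cache lifetime. The following is a minimal sketch, assuming the secret is mounted as a file; the `MountedSecret` class, the mount path in the usage comment, and the 30-second lifetime are all assumptions for illustration.

```typescript
import * as fs from 'node:fs';

// Re-reads a mounted secret file at most once per `ttlMs`, so a rotated value
// is picked up without restarting the process.
class MountedSecret {
  private readonly filePath: string;
  private readonly ttlMs: number;
  private readonly now: () => number;
  private cached: string | null = null;
  private loadedAt = 0;

  constructor(filePath: string, ttlMs: number, now: () => number = Date.now) {
    this.filePath = filePath;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, mainly for tests
  }

  get value(): string {
    const t = this.now();
    if (this.cached === null || t - this.loadedAt >= this.ttlMs) {
      this.cached = fs.readFileSync(this.filePath, 'utf8').trim();
      this.loadedAt = t;
    }
    return this.cached;
  }
}

// Usage (hypothetical mount path for the db-secret volume):
//   const dbPassword = new MountedSecret('/etc/secrets/db/password', 30_000);
//   connect({ password: dbPassword.value }); // read the value at connection time
```

Reading `value` each time a connection is opened, rather than once at startup, is what lets the new password take effect.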

Audit Logging Every Service-to-Service Call

Log all service interactions for compliance and investigation.

app.use((req, res, next) => {
  const startTime = Date.now();

  res.on('finish', () => {
    const duration = Date.now() - startTime;

    console.log(JSON.stringify({
      timestamp: new Date().toISOString(),
      method: req.method,
      path: req.path,
      caller: req.serviceID,
      status: res.statusCode,
      duration,
      requestSize: req.get('content-length'),
      responseSize: res.get('content-length'),
    }));
  });

  next();
});

Forward logs to a central sink (CloudWatch, DataDog, Splunk) for analysis.

Zero Trust for AI Agents

AI agents calling your APIs need scoped permissions. Issue short-lived credentials with minimal scope.

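import jwt from 'jsonwebtoken';

// Issue temporary token for AI agent
const token = jwt.sign(
  {
    agent_id: 'agent-123',
    permissions: ['search', 'summarize'], // Only these actions
  },
  secret,
  // expiresIn sets the standard exp claim, which verify() enforces
  { algorithm: 'HS256', expiresIn: '1h' }
);

// AI agent uses token
app.post('/api/search', (req, res) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'Missing token' });

  let claims;
  try {
    // Throws if the signature is invalid or the token has expired
    claims = jwt.verify(token, secret, { algorithms: ['HS256'] });
  } catch {
    return res.status(401).json({ error: 'Invalid or expired token' });
  }

  if (!claims.permissions?.includes('search')) {
    return res.status(403).json({ error: 'Permission denied' });
  }

  // Proceed...
});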
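To make the mechanics concrete, here is a dependency-free sketch of a scoped, expiring HS256 token built with Node's `crypto` module. It is meant to show what the signature, expiry, and permission checks amount to, not to replace a vetted JWT library in production. `AgentClaims`, `issueToken`, and `authorize` are names invented for this example.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

interface AgentClaims {
  agent_id: string;
  permissions: string[];
  exp: number; // expiry, in seconds since the Unix epoch
}

const b64url = (data: string): string => Buffer.from(data).toString('base64url');

function hmac(data: string, secret: string): Buffer {
  return createHmac('sha256', secret).update(data).digest();
}

// Build a compact token: header.payload.signature
function issueToken(claims: AgentClaims, secret: string): string {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const signingInput = `${header}.${b64url(JSON.stringify(claims))}`;
  return `${signingInput}.${hmac(signingInput, secret).toString('base64url')}`;
}

// Return the claims only if the signature is valid, the token has not expired,
// and it carries the required permission; otherwise return null.
function authorize(
  token: string,
  secret: string,
  permission: string,
  nowSec: number,
): AgentClaims | null {
  const [header, payload, signature] = token.split('.');
  if (!header || !payload || !signature) return null;

  const expected = hmac(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) as AgentClaims;
  if (claims.exp <= nowSec) return null;
  return claims.permissions.includes(permission) ? claims : null;
}

// Usage: a one-hour token limited to two actions.
//   const now = Math.floor(Date.now() / 1000);
//   const t = issueToken({ agent_id: 'agent-123', permissions: ['search', 'summarize'], exp: now + 3600 }, secret);
//   authorize(t, secret, 'search', now); // claims object
//   authorize(t, secret, 'delete', now); // null
```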

Checklist

  • Deploy cert-manager for mTLS
  • Issue certificates for all services
  • Implement SPIFFE for service identity
  • Deploy OPA and write authorization policies
  • Implement NetworkPolicy for all pods
  • Migrate to IAM Roles for Service Accounts (if on AWS)
  • Set up secret rotation with External Secrets Operator
  • Implement comprehensive audit logging
  • Test by simulating compromised service
  • Document zero trust architecture

Conclusion

Zero trust is not optional in 2026. Assume every request is untrusted. Verify identity with mTLS and SPIFFE, authorise with OPA, and log everything. Short-lived credentials and automatic rotation limit the damage a leaked secret can do. Build zero trust into your infrastructure from the start; retrofitting is painful.

Written by

Sanjeev Sharma

Full Stack Engineer · E-mopro