Open Policy Agent: A Universal Policy Engine for Cloud‑Native Stacks

Deep dive into OPA's architecture, Rego language, and the most impactful ways teams use it to secure Kubernetes, IaC, APIs, and microservices.

December 10, 2024
nScope Team

Modern platforms run on hundreds of distributed decisions: Can this pod run? May that user assume a role? Is this Terraform plan safe to apply? Hard‑coding rules in each service invites drift and blind spots. Open Policy Agent (OPA) solves this by externalising policy into a single, embeddable engine that speaks a purpose‑built, declarative language called Rego.

Developed by Styra and now a graduated CNCF project, OPA underpins Kubernetes admission controllers, CI pipelines, API gateways, and even Envoy sidecars. This article unpacks OPA’s architecture, explores high‑value use cases, and shares best practices from production deployments, all using OPA’s new v1 syntax.

Key idea: OPA decouples policy (the “what”) from code (the “how”), enabling consistent, testable, and auditable enforcement across your stack.

Architecture in a Nutshell

flowchart TD
    A["Input JSON/YAML
       - K8s manifests
       - Terraform plans
       - API requests"]

    C["Rego Policies
       - Rules & constraints
       - Data documents
       - Functions"]

    B["OPA Engine
       Evaluates input against policies"]

    D["Decision JSON
       - allow/deny
       - Filtered objects
       - Violations & errors"]

    A -->|"Query (e.g., input.method == GET)"| B
    C -->|"Loaded at runtime"| B
    B -->|"Returns decision"| D

    classDef inputs fill:#2a4e69,stroke:#5cc0d6,color:#fff,stroke-width:2px
    classDef engine fill:#2a4e69,stroke:#5cc0d6,color:#fff,stroke-width:2px
    classDef output fill:#2a4e69,stroke:#5cc0d6,color:#fff,stroke-width:2px

    class A,C inputs
    class B engine
    class D output
  1. Input – any structured data (HTTP request, Kubernetes object, Terraform plan).
  2. Policies – written in Rego and compiled to an internally optimised AST.
  3. Decision – OPA returns allow=true, a filtered object, or any JSON you define.
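The input → policy → decision flow above maps onto OPA's REST Data API. The following is a minimal Python sketch, assuming an OPA server listening on localhost:8181 and a policy package named example with an allow rule; the helper names (build_query, parse_decision, query_opa) are hypothetical and only illustrate the request and response shape.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: local OPA started with `opa run --server`


def build_query(policy_path: str, input_doc: dict) -> tuple[str, bytes]:
    """Return the Data API URL and JSON body for a policy query."""
    url = f"{OPA_URL}/v1/data/{policy_path.strip('/')}"
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return url, body


def parse_decision(response_body: bytes, default: bool = False):
    """Extract `result`; OPA omits the key when the queried rule is undefined."""
    return json.loads(response_body).get("result", default)


def query_opa(policy_path: str, input_doc: dict):
    """POST the input to OPA and return the decision (needs a running server)."""
    url, body = build_query(policy_path, input_doc)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return parse_decision(resp.read())
```

Treating a missing result as the default (deny) keeps the caller fail-closed when a policy has no default rule.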

OPA runs as:

  • Sidecar/Daemon – local to the app (low latency).
  • Shared microservice – central policy server.
  • Library – compile Rego to WASM; your service invokes it via a WASM runtime.

Rego Language Primer

Rego feels like a mix of Datalog and JSONPath. It’s a declarative query language specifically designed for policy evaluation over structured data. Let’s start with a basic example:

package kubernetes.admission

import rego.v1

default allow := false

# Allow Pods that avoid host networking and the mutable :latest tag
allow if {
  input.kind.kind == "Pod"
  not input.spec.hostNetwork
  every container in input.spec.containers {
    not endswith(container.image, ":latest")
  }
}

Core Rego Concepts

Fundamentals

Packages and Imports

package app.rbac
import data.common.constants
import data.users

Rules and Rule Bodies

# A rule is true when all expressions in its body evaluate to true
user_is_admin if {
  input.user.role == "admin"
}

Variables and Binding

# Local variables with := assignment
user_name := input.user.name

# Multiple variables in a single rule
has_required_access if {
  user := input.user
  resource := input.resource
  resource.required_permission in user.permissions
}

Iteration with [_]

# Iterate through all elements in an array
has_privileged_container if {
  container := input.spec.containers[_]  # Iterates through all containers
  container.securityContext.privileged == true
}

Logical Operators

# AND  ─ implicit conjunction: every expression in the block must be true.
is_valid_and_active if {
  input.status == "valid"
  input.active  == true
}

# OR  ─ Rego has no `or` keyword. Define the rule more than once;
# it is true when any one of the bodies succeeds.
is_admin_or_owner if input.user.role == "admin"

is_admin_or_owner if input.user.role == "owner"

# Even shorter, using set membership:
is_admin_or_owner if input.user.role in {"admin", "owner"}

Rule Types

Complete Rules (value assigned)

allow := true if {
  input.method == "GET"
  input.path == "/api/public"
}

message := "Resource not found" if {
  input.path == "/missing"
}

Partial Rules (collection rules)

# Returns a set of values
allowed_paths contains path if {
  path := input.request.path
  startswith(path, "/api/v1")
}

# Returns an object that maps each resource to its set of messages
violations[resource] contains message if {
  resource := input.resources[_]
  not resource.tags.owner
  message := "Resource missing required owner tag"
}

Default Rules

# Provides a fallback value if no rule body is satisfied
default allow := false
default max_connections := 10

Advanced Rego Patterns

Comprehensions and Universal Quantification

# Array comprehension - returns a filtered array
allowed_ports = [port | port := input.ports[_]; port < 1024]

# Set comprehension
private_ips = {ip | ip := input.addresses[_]; startswith(ip, "10.")}

# Object comprehension - bind each service once so name and port stay paired
port_map := {svc.name: svc.port | some svc in input.services}

# Universal quantification - checks if ALL elements satisfy condition
all_containers_have_limits if {
  every container in input.spec.containers {
    has_resource_limits(container)
  }
}

# Helper function for readability
has_resource_limits(container) if {
  container.resources.limits.cpu
  container.resources.limits.memory
}

Working with Data Documents

package policies

import data.users
import data.roles

# Access external data
is_permitted if {
  # Reference user from input
  username := input.user

  # Look up user's roles from data document
  user_roles := users[username].roles

  # Check if any role has required permission
  role := user_roles[_]
  permission := roles[role].permissions[_]
  permission == input.required_permission
}

Real-World Rego Examples

Multi-Factor Policy for Sensitive Resources

package authz

import rego.v1

# Require MFA for sensitive resource access
default require_mfa := false

require_mfa if {
  # Sensitive resources require MFA
  is_sensitive_resource

  # Unless accessed from corporate network
  not from_corporate_network
}

is_sensitive_resource if {
  sensitive_resources := {"/api/finance", "/api/hr", "/api/admin"}
  input.resource.path in sensitive_resources
}

from_corporate_network if {
  startswith(input.source_ip, "10.20.")
}

# Main authorization rule
allow if {
  # Basic authentication check
  input.user.authenticated == true

  # For sensitive resources, verify MFA if required
  mfa_satisfied

  # User has permission
  has_permission
}

# Rego has no `or`; each rule body below is one alternative
mfa_satisfied if not require_mfa

mfa_satisfied if input.user.mfa_verified == true

has_permission if {
  # Permission check based on role
  required_permission := concat(":", [input.method, input.resource.path])
  some permission in data.permissions[input.user.role]

  # Either exact match or wildcard permission
  permission in {required_permission, "*"}
}

Kubernetes Network Policy Validator

package kubernetes.validating.networkpolicy

import rego.v1

# Deny by default
default allow := false

# Allow if policy meets all requirements
allow if {
  input.kind == "NetworkPolicy"
  has_required_labels
  not has_wildcard_ingress
  valid_egress_rules
}

# Validate required labels
has_required_labels if {
  input.metadata.labels["app"]
  input.metadata.labels["environment"]
  input.metadata.labels["owner"]
}

# Check for overly permissive ingress rules
has_wildcard_ingress if {
  some rule in input.spec.ingress
  not rule.from  # A missing "from" allows traffic from anywhere
}

has_wildcard_ingress if {
  some rule in input.spec.ingress
  count(rule.from) == 0  # An empty "from" list also allows any source
}

# Validate egress rules
valid_egress_rules if {
  # If no egress is specified, it's valid (default deny)
  not input.spec.egress
}

# Validate all egress rules
valid_egress_rules if {
  every rule in input.spec.egress {
    is_valid_egress(rule)
  }
}

# Helper to check individual egress rules
is_valid_egress(rule) if {
  # Must have "to" field specified
  rule.to

  # Must have port restrictions
  rule.ports

  # Verify no connection to known bad CIDRs
  not has_prohibited_destination(rule)
}

# Check for prohibited external destinations
has_prohibited_destination(rule) if {
  some peer in rule.to
  cidr := peer.ipBlock.cidr

  # List of prohibited external CIDR ranges
  prohibited_cidrs := ["0.0.0.0/0", "169.254.0.0/16"]
  cidr in prohibited_cidrs
}

# Generate violations for better reporting
violations contains msg if {
  not has_required_labels
  msg := "Network policy missing required labels (app, environment, owner)"
}

violations contains msg if {
  has_wildcard_ingress
  msg := "Network policy contains unrestricted ingress rule"
}

violations contains msg if {
  not valid_egress_rules
  msg := "Network policy contains invalid egress rules"
}

Practical Example: Multi-layered Policy

Here’s a more complex example showing how to build layered policies for Kubernetes pods:

package kubernetes.admission

import rego.v1
import data.kubernetes.namespaces

# Default deny
default allow := false

# Allow if all conditions pass
allow if {
  input.kind.kind == "Pod"
  namespace_valid
  container_images_valid
  security_context_valid
}

# Check if namespace has required restrictions
namespace_valid if {
  namespace := input.metadata.namespace
  ns_data := namespaces[namespace]
  ns_data.restricted == true
}

# Validate all container images
container_images_valid if {
  every container in input.spec.containers {
    container_valid(container)
  }
}

# Individual container validation logic
container_valid(container) if {
  not endswith(container.image, ":latest")
  endswith(container.image, "-signed")

  # Image from allowed registries
  registry := split(container.image, "/")[0]
  allowed_registries := {"gcr.io", "registry.company.com"}
  registry in allowed_registries
}

# Security context validation
security_context_valid if {
  # Pod‑level security hardening
  input.spec.securityContext.runAsNonRoot == true

  # Container‑level check for all containers
  every container in input.spec.containers {
    has_security_context(container)
  }
}

# Helper function for container security context
has_security_context(container) if {
  container.securityContext.allowPrivilegeEscalation == false
  container.securityContext.readOnlyRootFilesystem == true
}

Debugging and Testing Rego

# Testing Rego policies with opa test

# In test_policy.rego:
package kubernetes.test

import rego.v1

import data.kubernetes.admission.allow

# Mock namespaces data shared by both tests
mock_namespaces := {"prod": {"restricted": true}}

# Unit test - positive case
test_allow_valid_pod if {
  # Valid pod (input and data cannot be reassigned, so use `with` to mock them)
  pod := {
    "kind": {"kind": "Pod"},
    "metadata": {"namespace": "prod"},
    "spec": {
      "securityContext": {"runAsNonRoot": true},
      "containers": [{
        "image": "gcr.io/app-signed",
        "securityContext": {
          "allowPrivilegeEscalation": false,
          "readOnlyRootFilesystem": true
        }
      }]
    }
  }

  # Policy should allow this pod
  allow with input as pod
    with data.kubernetes.namespaces as mock_namespaces
}

# Unit test - negative case
test_reject_pod_with_latest_tag if {
  # Pod with :latest tag
  pod := {
    "kind": {"kind": "Pod"},
    "metadata": {"namespace": "prod"},
    "spec": {
      "securityContext": {"runAsNonRoot": true},
      "containers": [{
        "image": "gcr.io/app:latest",
        "securityContext": {
          "allowPrivilegeEscalation": false,
          "readOnlyRootFilesystem": true
        }
      }]
    }
  }

  # Policy should reject this pod
  not allow with input as pod
    with data.kubernetes.namespaces as mock_namespaces
}

This expanded Rego language reference covers the essential concepts and provides practical examples that teams can immediately apply to their policy-as-code implementations.

High‑Value Use Cases

| Domain | What OPA enforces | Popular integration |
| --- | --- | --- |
| Kubernetes admission | Block privileged pods, enforce labels, validate Ingress TLS | Gatekeeper (CRD‑driven) |
| Infrastructure as Code | Fail PRs with risky Terraform/AWS CDK changes | Conftest, OPA in CI |
| API gateways | Authorise JWT scopes, rate‑limit by org, filter response fields | Kong Mesh, Envoy ext‑authz |
| Microservices | Decide row‑level access, A/B‑testing flags | OPA sidecar or WASM |
| Data leakage prevention | Mask PII in GraphQL responses | Custom OPA filter module |

Real‑world win: a fintech blocked 97 % of mis‑configured Kubernetes objects at PR time, slashing incident tickets by half after adopting OPA + Gatekeeper.

Deployment Patterns

Gatekeeper (K8s)

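Gatekeeper pairs OPA with Kubernetes CRDs: you write a ConstraintTemplate (the Rego logic plus its parameter schema), instantiate it with one or more Constraints, and Gatekeeper enforces them through a validating admission webhook. Ideal for large clusters, because policies are stored as K8s objects.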

Example Gatekeeper Constraint Template

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Missing required labels: %v", [missing])
        }
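
The violation rule in the template boils down to a set difference: the required labels minus the labels present on the object. The same logic in Python, purely to illustrate what the Rego computes; the function name and the sample namespace are hypothetical and not part of Gatekeeper.

```python
def missing_labels(obj: dict, required: list[str]) -> set[str]:
    """Mirror of the template's `required - provided` set difference."""
    provided = set(obj.get("metadata", {}).get("labels") or {})
    return set(required) - provided


# Hypothetical object under review
namespace = {"metadata": {"name": "payments", "labels": {"owner": "team-a"}}}

missing = missing_labels(namespace, ["owner", "environment"])
if missing:
    print(f"Missing required labels: {sorted(missing)}")
# prints: Missing required labels: ['environment']
```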

Conftest (CI/IaC)

Lint Terraform plans, Helm charts, Dockerfiles. Add a conftest test step in GitHub Actions; fail on policy violations before they hit prod.

Example GitHub Action for Terraform Policy Enforcement

name: Terraform Policy Check

on:
  pull_request:
    paths:
      - 'terraform/**'

jobs:
  policy-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup OPA
        uses: open-policy-agent/setup-opa@v2
        with:
          version: latest

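      - name: Install Conftest
        run: |
          curl -sSL https://github.com/open-policy-agent/conftest/releases/download/v0.60.0/conftest_0.60.0_Linux_x86_64.tar.gz -o conftest.tar.gz
          # Extract only the binary; /usr/local/bin needs sudo on hosted runners
          tar -xzf conftest.tar.gz conftest
          sudo install -m 0755 conftest /usr/local/bin/conftest
          rm conftest.tar.gz conftest

      # Newer ubuntu-latest images may not ship Terraform, so install it explicitly
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3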

      - name: Initialize Terraform
        run: cd terraform && terraform init

      - name: Generate Terraform Plan
        run: cd terraform && terraform plan -out=tfplan.binary

      - name: Convert Plan to JSON
        run: cd terraform && terraform show -json tfplan.binary > tfplan.json

      - name: Run Policy Checks
        run: |
          cd terraform
          conftest test tfplan.json -p ../policies/terraform
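
The JSON produced by terraform show -json lists every planned change under resource_changes, each with a change.actions array, and that is the structure a Conftest policy iterates over. A small Python sketch of walking it, meant to show the data shape rather than to replace the Rego policy; the sample plan and the destructive_changes helper are made up for illustration.

```python
import json


def destructive_changes(plan: dict) -> list[str]:
    """Return addresses of resources whose planned actions include a delete."""
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if "delete" in rc.get("change", {}).get("actions", [])
    ]


# Trimmed, hypothetical plan document
sample_plan = json.loads("""
{
  "resource_changes": [
    {"address": "aws_s3_bucket.logs", "change": {"actions": ["update"]}},
    {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}}
  ]
}
""")

print(destructive_changes(sample_plan))
# prints: ['aws_db_instance.main']
```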

Envoy ext‑authz (APIs)

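Envoy's external authorization filter sends the request attributes (method, path, headers, and optionally the body) to OPA; OPA returns an allow or deny decision, and Envoy either forwards the request or answers with 403. With OPA running as a sidecar, the added latency is typically around 1 ms.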
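
To make the allow/deny mapping concrete, here is a hedged Python sketch of how a decision could translate into an HTTP status. It assumes the conventions documented for the OPA-Envoy plugin: the policy reads fields such as input.attributes.request.http.method, and the decision is either a boolean or an object with an allowed flag and an optional http_status. The function is an illustration, not the plugin's code.

```python
def status_from_decision(decision) -> int:
    """Map an OPA decision to a status; 200 stands for "forward upstream"."""
    if isinstance(decision, bool):
        return 200 if decision else 403
    if isinstance(decision, dict):
        if decision.get("allowed") is True:
            return 200
        return int(decision.get("http_status", 403))
    return 403  # undefined or unexpected decision: fail closed


# Hypothetical input, shaped like what the policy would receive
sample_input = {
    "attributes": {
        "request": {"http": {"method": "GET", "path": "/people", "headers": {}}}
    },
    "parsed_path": ["people"],
}
```

Failing closed on an undefined decision mirrors the deny-by-default stance used in the policies above.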

WASM bundle

Compile Rego to WASM, embed in Go/Rust services for air‑gapped or ultra‑low‑latency policy checks.

Best Practices

  1. Single source of truth – store Rego in Git, version via semver tags, deploy via CI.
  2. Test first – use opa test for unit cases; use opa bench (or opa test --bench) for perf.
  3. Data > hard‑code – keep env‑specific values in data.json not rules.
  4. Policy layers – separate mandatory (security) from optional (cost) to unblock dev velocity.
  5. Observability – ship decision logs to Loki/Splunk; every entry carries a decision_id you can use for audit traceability.
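
As an illustration of the observability point, decision log entries are JSON documents that include fields such as decision_id, path, input, and result. A minimal Python sketch that scans JSON-lines logs for denied decisions; the sample lines and the denied_decisions helper are hypothetical.

```python
import json


def denied_decisions(log_lines, path: str):
    """Yield (decision_id, input) for decisions on `path` whose result is not true."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("path") == path and event.get("result") is not True:
            yield event.get("decision_id"), event.get("input")


# Hypothetical log lines
sample = [
    '{"decision_id": "a1", "path": "example/allow", "input": {"method": "GET"}, "result": true}',
    '{"decision_id": "b2", "path": "example/allow", "input": {"method": "DELETE"}, "result": false}',
]

for decision_id, doc in denied_decisions(sample, "example/allow"):
    print(decision_id, doc)
# prints: b2 {'method': 'DELETE'}
```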

Performance & Scaling

Evaluation latency

With compact inputs and well‑indexed rules, OPA can answer a decision in hundreds of micro‑seconds to a few milli‑seconds. Treat 1 ms as a design budget, not a guarantee. Profile with your real data and policy set.
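
One way to check that budget against your own workload is to time real queries and look at percentiles rather than averages. A small Python sketch under that assumption; percentile uses the nearest-rank method, and time_calls accepts any zero-argument callable (for example a function that queries your OPA instance). Both names are hypothetical.

```python
import math
import time


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile (pct in 0-100) of a non-empty sample list."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def time_calls(fn, runs: int = 1000) -> list[float]:
    """Call fn() `runs` times and return per-call latency in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies
```

A typical use would be percentile(time_calls(my_query), 95), where my_query wraps a single policy decision with realistic input.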

Memory footprint

At start‑up OPA compiles policies into an in‑memory AST. A minimal bundle plus engine is usually a few‑MB RSS; large data documents (Kubernetes manifests, RBAC maps, etc.) dominate the total. Measure your own workload with opa eval --metrics or the /metrics Prometheus endpoint.

Horizontal scale

OPA is stateless once a bundle is loaded, so you can run as many replicas as needed behind a load balancer or as sidecars. Teams routinely reach 10k+ queries per second with single‑digit‑millisecond p95 latencies by sharding traffic across a handful of OPA instances.
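
Because replicas are independent, capacity planning is simple arithmetic. A rough Python sketch, assuming you have measured the sustainable throughput of one instance yourself; the 3,000 queries per second used below is a made-up figure, not an OPA benchmark.

```python
import math


def replicas_needed(target_qps: float, per_instance_qps: float, headroom: float = 0.3) -> int:
    """Replicas required to serve target_qps while keeping `headroom` spare capacity."""
    if per_instance_qps <= 0 or not 0 <= headroom < 1:
        raise ValueError("per_instance_qps must be > 0 and headroom in [0, 1)")
    usable = per_instance_qps * (1 - headroom)
    return max(1, math.ceil(target_qps / usable))


# 10k QPS target, hypothetical 3k QPS per instance, 30 % headroom
print(replicas_needed(10_000, 3_000))
# prints: 5
```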

Hot policy updates

Distribute policies with the Bundle API (or the higher‑level Discovery Bundle if you manage multiple bundles). OPA polls an S3/GCS bucket or any HTTPS endpoint at your chosen interval, verifies the bundle signature when signing is configured, and hot‑swaps the bundle without restarting, which gives you zero‑downtime roll‑forward or roll‑back.
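
A bundle is a gzipped tarball of policy and data files with an optional .manifest that records a revision and the roots the bundle owns. In practice opa build produces it for you; the Python sketch below only illustrates the layout, and the build_bundle helper and file names are hypothetical.

```python
import io
import json
import tarfile


def build_bundle(policies: dict[str, str], revision: str, roots: list[str]) -> bytes:
    """Return a gzipped tarball laid out like an OPA bundle."""
    manifest = json.dumps({"revision": revision, "roots": roots}).encode()
    files = {".manifest": manifest}
    files.update({name: src.encode() for name, src in policies.items()})

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


bundle = build_bundle({"example/policy.rego": "package example\n"}, "v1.2.0", ["example"])
```

Uploading the resulting bytes to the bucket or endpoint that OPA polls is what triggers the hot swap; bumping the revision makes the active version visible in status reports.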

Limitations & Gotchas

  • Learning curve for Rego if your team only knows YAML.
  • Debugging can be opaque; add print() calls or run opa eval --explain=full to trace complex rules.
  • Large JSON inputs (>1 MiB) slow eval; pass only needed fields.
  • Policy sprawl—govern via naming conventions and codeowners.
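
For the large-input gotcha, one option is to trim the document on the caller's side before it is sent to OPA. A minimal Python sketch, assuming the policy only reads a known set of dotted paths; pick_fields and the sample pod are hypothetical.

```python
_MISSING = object()


def _get(doc, keys):
    """Walk nested dicts; return _MISSING if any key along the way is absent."""
    for key in keys:
        if not isinstance(doc, dict) or key not in doc:
            return _MISSING
        doc = doc[key]
    return doc


def pick_fields(doc: dict, paths: list[str]) -> dict:
    """Copy only the dotted `paths` from doc into a new, smaller dict."""
    trimmed: dict = {}
    for path in paths:
        keys = path.split(".")
        value = _get(doc, keys)
        if value is _MISSING:
            continue  # path not present in this document
        dst = trimmed
        for key in keys[:-1]:
            dst = dst.setdefault(key, {})
        dst[keys[-1]] = value
    return trimmed


pod = {
    "kind": "Pod",
    "metadata": {"name": "web", "managedFields": [1, 2]},
    "status": {"phase": "Running"},
}
print(pick_fields(pod, ["kind", "metadata.name", "spec.containers"]))
# prints: {'kind': 'Pod', 'metadata': {'name': 'web'}}
```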

How OPA Compares to…

| Tool | Scope | Language | Strength | Weakness |
| --- | --- | --- | --- | --- |
| HashiCorp Sentinel | Terraform, Nomad | HCL‑like | Deep plan data | Closed‑source, enterprise only |
| Kubernetes Kyverno | K8s admission | YAML | Dev‑friendly | Limited outside K8s |
| AWS IAM policies | AWS APIs | JSON | Native, fast | AWS‑only, verbose |

OPA’s vendor‑neutral stance makes it a Swiss‑army knife, although single‑purpose tools may be simpler within their silo.

Getting Started in 10 Minutes

# Install OPA
brew install opa

# Create a simple policy file (quoted EOF keeps the shell from expanding anything)
cat > policy.rego <<'EOF'
package example

import rego.v1

default allow := false

allow if {
  input.method == "GET"
  input.path == "/api/public"
}

allow if {
  input.method == "POST"
  input.path == "/api/data"
  input.user.role == "admin"
}
EOF

# Start OPA server
opa run --server --addr :8181 policy.rego &

# Test a policy decision
curl -X POST localhost:8181/v1/data/example/allow \
     -H 'Content-Type: application/json' \
     -d '{"input":{"method":"GET","path":"/api/public"}}'
# Returns {"result":true}

# Test a different input
curl -X POST localhost:8181/v1/data/example/allow \
     -H 'Content-Type: application/json' \
     -d '{"input":{"method":"GET","path":"/api/admin"}}'
# Returns {"result":false}

🔗 Further resources: OPA Playground, Gatekeeper Library, Conftest Examples

Conclusion

Open Policy Agent gives platform teams a single, consistent way to declare and enforce policy from Kubernetes admission to CI pipelines. The result is fewer production surprises, faster audits, and a clear separation between rules and runtime.

Ready to standardise your policy story? nScope’s Policy‑as‑Code jump‑start delivers a production‑ready OPA deployment, complete with CI integration and K8s admission controls.
