Best Practices: Security

For: Architects securing production deployments
Level: Advanced
Time to read: 40 minutes
Reference: DSpec security_and_governance.dspec.yaml

This guide covers the 5-layer defense-in-depth security model and implementation best practices.

Security Architecture Overview

5-Layer Defense-in-Depth Model


┌──────────────────────────────────────┐
│ Layer 1: Authentication              │
│ • Identity verification              │
│ • Kratos/OAuth2 providers            │
└──────────────────────────────────────┘
               ▼
┌──────────────────────────────────────┐
│ Layer 2: Authorization (RBAC/ReBAC)  │
│ • Role-based access control          │
│ • SpiceDB relationship-based control │
└──────────────────────────────────────┘
               ▼
┌──────────────────────────────────────┐
│ Layer 3: Policy Evaluation (OPA)     │
│ • Business rule enforcement          │
│ • Dynamic policy evaluation          │
└──────────────────────────────────────┘
               ▼
┌──────────────────────────────────────┐
│ Layer 4: Data Isolation              │
│ • Row-level security (RLS)           │
│ • Column-level encryption            │
│ • Multi-tenant segmentation          │
└──────────────────────────────────────┘
               ▼
┌──────────────────────────────────────┐
│ Layer 5: Audit & Monitoring          │
│ • Unified audit trail (NATS)         │
│ • OpenTelemetry tracing              │
│ • Security alerts                    │
└──────────────────────────────────────┘

Layer 1: Authentication

User Authentication


# cascade.yaml
auth:
  provider: kratos
  endpoints:
    admin: http://kratos-admin:80
    public: http://kratos-public:80
  session:
    timeout: 24h
    refresh_enabled: true
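
As a rough illustration of how a service might verify a session against the public endpoint configured above, the sketch below calls Kratos's `/sessions/whoami` route with an `X-Session-Token` header. The helper names (`whoamiURL`, `parseSession`, `checkSession`) and the example token are assumptions made for this sketch, not part of Cascade.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// session holds the two fields of the Kratos session response used here.
type session struct {
	Active   bool `json:"active"`
	Identity struct {
		ID string `json:"id"`
	} `json:"identity"`
}

// whoamiURL builds the session-check URL from the public base URL.
func whoamiURL(publicBase string) string {
	return strings.TrimRight(publicBase, "/") + "/sessions/whoami"
}

// parseSession extracts the identity ID and active flag from a response body.
func parseSession(body []byte) (string, bool, error) {
	var s session
	if err := json.Unmarshal(body, &s); err != nil {
		return "", false, err
	}
	return s.Identity.ID, s.Active, nil
}

// checkSession returns the identity ID for an active session, or an error.
func checkSession(client *http.Client, publicBase, token string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, whoamiURL(publicBase), nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("X-Session-Token", token)
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("session rejected: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	id, active, err := parseSession(body)
	if err != nil {
		return "", err
	}
	if !active {
		return "", errors.New("session is not active")
	}
	return id, nil
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	// "example-session-token" is a placeholder, not a real credential.
	id, err := checkSession(client, "http://kratos-public:80", "example-session-token")
	fmt.Println(id, err)
}
```

Any non-200 response or inactive session is treated as unauthenticated, so the check fails closed.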

Credential Management

✅ DO:

Store secrets in Vault/K8s Secrets
Rotate credentials regularly
Use short-lived tokens
Implement MFA for admin accounts
Use OAuth2 for external integrations

❌ DON’T:

Hard-code credentials
Use default passwords
Store secrets in config files
Share credentials via email
Disable MFA
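
To make "use short-lived tokens" and "don't hard-code credentials" concrete, here is a minimal sketch of an HMAC-signed token with an expiry. It assumes the signing key is injected through an environment variable; the variable name `TOKEN_SIGNING_KEY`, the token layout, and the function names are all hypothetical. A production deployment would more likely rely on tokens issued by the identity provider.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// sign returns the base64url HMAC-SHA256 of payload under key.
func sign(key []byte, payload string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(payload))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// issueToken returns "subject.expiry.signature". The subject must not
// contain a dot, because the dot is the field separator.
func issueToken(key []byte, subject string, expiresAt int64) string {
	payload := subject + "." + strconv.FormatInt(expiresAt, 10)
	return payload + "." + sign(key, payload)
}

// verifyToken checks the signature and the expiry, then returns the subject.
func verifyToken(key []byte, token string, now int64) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("malformed token")
	}
	payload := parts[0] + "." + parts[1]
	// hmac.Equal compares in constant time.
	if !hmac.Equal([]byte(sign(key, payload)), []byte(parts[2])) {
		return "", errors.New("bad signature")
	}
	exp, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return "", errors.New("bad expiry")
	}
	if now >= exp {
		return "", errors.New("token expired")
	}
	return parts[0], nil
}

func main() {
	// The key is read from the environment (for example, injected from
	// Vault or a Kubernetes Secret), never from source code.
	key := []byte(os.Getenv("TOKEN_SIGNING_KEY"))
	if len(key) == 0 {
		fmt.Println("TOKEN_SIGNING_KEY is not set")
		return
	}
	now := time.Now().Unix()
	tok := issueToken(key, "svc-orders", now+int64((15*time.Minute).Seconds()))
	sub, err := verifyToken(key, tok, now)
	fmt.Println(sub, err)
}
```

A 15-minute lifetime is only an example; the point is that a leaked token stops working on its own, without a revocation step.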

Layer 2: Authorization (RBAC/ReBAC)

Kubernetes RBAC Setup


# rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cascade-workflows
rules:
- apiGroups: ["cascade.io"]
  resources: ["workflows"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["cascade.io"]
  resources: ["workflows/execute"]
  verbs: ["create"]
 
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-workflows
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cascade-workflows
subjects:
- kind: Group
  name: developers
  apiGroup: rbac.authorization.k8s.io

SpiceDB Relationships


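# Define relationships for ReBAC (SpiceDB tuple form: resource#relation@subject)
app:1#viewer@user:alice
app:1#admin@user:bob
app:1#approver@group:managers#member

# Check permission with the zed CLI
# (assumes the schema defines a view permission that includes viewer)
zed permission check app:1 view user:alice   # → true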
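
The idea behind relationship-based checks can be shown with a toy in-memory model. This is not how SpiceDB is implemented or called. It collapses SpiceDB's relation/permission distinction into a single name, and `user:carol`, the `store` type, and its methods are invented for the sketch.

```go
package main

import "fmt"

// tuple is one stored relationship: subject holds relation on resource.
type tuple struct{ resource, relation, subject string }

type store struct {
	tuples  map[tuple]bool
	members map[string][]string // group -> member users
}

func newStore() *store {
	return &store{tuples: map[tuple]bool{}, members: map[string][]string{}}
}

// relate records that subject (a user or a group) holds relation on resource.
func (s *store) relate(resource, relation, subject string) {
	s.tuples[tuple{resource, relation, subject}] = true
}

// addMember records that user belongs to group.
func (s *store) addMember(group, user string) {
	s.members[group] = append(s.members[group], user)
}

// check reports whether user holds relation on resource, either directly
// or through membership in a group that holds it.
func (s *store) check(user, relation, resource string) bool {
	if s.tuples[tuple{resource, relation, user}] {
		return true
	}
	for group, users := range s.members {
		if !s.tuples[tuple{resource, relation, group}] {
			continue
		}
		for _, u := range users {
			if u == user {
				return true
			}
		}
	}
	return false
}

func main() {
	s := newStore()
	s.relate("app:1", "view", "user:alice")
	s.relate("app:1", "admin", "user:bob")
	s.relate("app:1", "approve", "group:managers")
	s.addMember("group:managers", "user:carol")

	fmt.Println(s.check("user:alice", "view", "app:1"))    // true
	fmt.Println(s.check("user:carol", "approve", "app:1")) // true
	fmt.Println(s.check("user:alice", "admin", "app:1"))   // false
}
```

The group case is what separates ReBAC from a flat permission table: carol is never granted `approve` directly, yet the check succeeds because of her relationship to `group:managers`.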

Role Hierarchy


┌──────────────────────┐
│ Admin                │ (All permissions)
└──────────┬───────────┘
           │
    ┌──────┴──────┐
    ▼             ▼
Manager       Developer
(Approve)    (Execute, view logs)
    │             │
    ▼             ▼
Viewer (Read-only access)
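
One way to read the hierarchy is as permission inheritance: each role gets its own grants plus those of every role beneath it. The sketch below assumes that reading. The permission names (`approve`, `execute`, `view_logs`, `read`) are taken loosely from the diagram labels, and Admin is modeled only as inheriting from Manager and Developer.

```go
package main

import "fmt"

// inherits maps each role to the roles whose permissions it also receives.
var inherits = map[string][]string{
	"admin":     {"manager", "developer"},
	"manager":   {"viewer"},
	"developer": {"viewer"},
	"viewer":    nil,
}

// grants lists the permissions attached directly to each role.
var grants = map[string][]string{
	"manager":   {"approve"},
	"developer": {"execute", "view_logs"},
	"viewer":    {"read"},
}

// can reports whether role holds permission directly or by inheritance.
func can(role, permission string) bool {
	return canVisit(role, permission, map[string]bool{})
}

// canVisit walks the hierarchy depth-first; seen guards against cycles.
func canVisit(role, permission string, seen map[string]bool) bool {
	if seen[role] {
		return false
	}
	seen[role] = true
	for _, p := range grants[role] {
		if p == permission {
			return true
		}
	}
	for _, parent := range inherits[role] {
		if canVisit(parent, permission, seen) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(can("admin", "approve"))   // true
	fmt.Println(can("manager", "execute")) // false
	fmt.Println(can("developer", "read"))  // true
}
```

Unknown roles have no grants and no parents, so they are denied by default.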

Layer 3: Policy Evaluation (OPA)

Database Write Authorization


package agent_policies

# Enables the `if` and `in` keywords (OPA 0.59 or later)
import rego.v1

# From specs/policies/security.dspec.yaml:
# Agents can perform database mutations ONLY if:
# 1. Capability explicitly declared in CDL spec
# 2. Archetype permits writes (actor, responder, orchestrator)
# 3. Target table in declared scopes
# 4. Tenant isolation enforced
# 5. Not a system table

# Deny unless every condition below holds
default allow_database_write := false

allow_database_write if {
    # 1. Check capability declared
    input.capability == "write.database"

    # 2. Check archetype allows writes
    allowed_archetypes := ["actor", "responder", "orchestrator"]
    input.agent_archetype in allowed_archetypes

    # 3. Check table is in scopes
    input.target_table in input.agent_scopes

    # 4. Check tenant isolation
    input.tenant_id == data.context.tenant_id

    # 5. Check not system table
    not is_system_table(input.target_table)
}

# Matches PostgreSQL catalog tables (pg_ prefix) and anything in information_schema
is_system_table(table) if {
    system_prefixes := ["pg_", "information_schema."]
    some prefix in system_prefixes
    startswith(table, prefix)
}
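
A service would typically ask OPA for this decision over its REST Data API before performing a write. The sketch below assumes OPA is reachable at `http://localhost:8181` (its default port) and that the policy is loaded under the `agent_policies` package. The Go type and function names are illustrative, and `data.context.tenant_id` still has to be loaded into OPA separately.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// writeInput mirrors the input fields the policy reads.
type writeInput struct {
	Capability     string   `json:"capability"`
	AgentArchetype string   `json:"agent_archetype"`
	TargetTable    string   `json:"target_table"`
	AgentScopes    []string `json:"agent_scopes"`
	TenantID       string   `json:"tenant_id"`
}

// decodeDecision parses an OPA Data API response. OPA omits "result" when
// the rule is undefined, so a missing field is treated as deny.
func decodeDecision(body []byte) (bool, error) {
	var resp struct {
		Result *bool `json:"result"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	return resp.Result != nil && *resp.Result, nil
}

// allowDatabaseWrite asks OPA for a decision and fails closed on any error.
func allowDatabaseWrite(client *http.Client, opaURL string, in writeInput) (bool, error) {
	payload, err := json.Marshal(map[string]any{"input": in})
	if err != nil {
		return false, err
	}
	resp, err := client.Post(opaURL+"/v1/data/agent_policies/allow_database_write",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("opa returned %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return decodeDecision(body)
}

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	ok, err := allowDatabaseWrite(client, "http://localhost:8181", writeInput{
		Capability:     "write.database",
		AgentArchetype: "actor",
		TargetTable:    "orders",
		AgentScopes:    []string{"orders", "order_items"},
		TenantID:       "00000000-0000-0000-0000-000000000001", // placeholder
	})
	fmt.Println(ok, err)
}
```

Every error path returns `false`, so an unreachable or misconfigured policy engine blocks writes rather than allowing them.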

Policy Enforcement Points


┌─────────────────────────────────────┐
│ Build Time: CDL Validator           │
│ • Check capability declarations     │
│ • Verify archetype compatibility    │
│ • Validate scope declarations       │
└────────────────────┬────────────────┘
                     │
┌────────────────────▼────────────────┐
│ Runtime: OPA Policy Engine          │
│ • Evaluate permission before SDK    │
│ • Check context/tenant isolation    │
│ • Rate limiting                     │
└────────────────────┬────────────────┘
                     │
┌────────────────────▼────────────────┐
│ Database: PostgreSQL RLS Policies   │
│ • Row-level security checks         │
│ • Final enforcement layer           │
└─────────────────────────────────────┘

Layer 4: Data Isolation

Row-Level Security (RLS)


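-- PostgreSQL RLS for multi-tenant isolation
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- Table owners bypass RLS unless it is forced
ALTER TABLE orders FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- Application sets tenant context per transaction. The value must be a valid
-- UUID because the policy casts it; the one below is a placeholder for ACME's ID.
BEGIN;
SET LOCAL app.tenant_id = '00000000-0000-0000-0000-000000000001';
SELECT * FROM orders;  -- Only returns ACME orders
COMMIT;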
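
On the application side, the tenant context is best set inside the same transaction as the queries it protects, so that a pooled connection never carries one tenant's setting into another tenant's request. The sketch below uses `database/sql` and PostgreSQL's `set_config(name, value, is_local)` function, since `SET` cannot take bind parameters. `withTenant` and `validTenantID` are hypothetical helpers, and opening the `*sql.DB` (which needs a PostgreSQL driver) is not shown.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
	"regexp"
)

var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// validTenantID reports whether id has the canonical UUID text form.
func validTenantID(id string) bool {
	return uuidPattern.MatchString(id)
}

// withTenant runs fn in a transaction whose app.tenant_id is set to tenantID.
// The setting is transaction-local and disappears on commit or rollback.
func withTenant(ctx context.Context, db *sql.DB, tenantID string, fn func(*sql.Tx) error) error {
	if !validTenantID(tenantID) {
		return errors.New("tenant ID is not a UUID")
	}
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // returns an ignorable error after a successful Commit

	// The third argument (true) makes the setting local to this transaction,
	// like SET LOCAL.
	if _, err := tx.ExecContext(ctx,
		"SELECT set_config('app.tenant_id', $1, true)", tenantID); err != nil {
		return err
	}
	if err := fn(tx); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	fmt.Println(validTenantID("acme-corp-id"))                         // false
	fmt.Println(validTenantID("00000000-0000-0000-0000-000000000001")) // true
}
```

Validating the ID before opening the transaction gives a clear application error instead of a cast failure inside the RLS policy.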

Column-Level Encryption


// Sensitive fields encrypted at rest
type Customer struct {
    ID    string
    Name  string
    Email string // Encrypted
    SSN   string // Encrypted
}

// Encryption on write
func (c *Customer) Save(ctx context.Context) error {
    // Encrypt a copy so the caller's struct keeps its plaintext values
    enc := *c
    enc.Email = encrypt.AES256(c.Email, encryptionKey)
    enc.SSN = encrypt.AES256(c.SSN, encryptionKey)
    return db.Save(&enc)
}

// Decryption on read
func (c *Customer) Load(ctx context.Context, id string) error {
    // Return early so a failed load never decrypts partial data
    if err := db.Load(c, id); err != nil {
        return err
    }
    c.Email = encrypt.AES256Decrypt(c.Email, encryptionKey)
    c.SSN = encrypt.AES256Decrypt(c.SSN, encryptionKey)
    return nil
}
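
The `encrypt.AES256` helpers above are not defined in this guide. As one possible shape for them, the sketch below uses AES-256-GCM from the Go standard library, which adds integrity protection on top of confidentiality. The names `encryptField` and `decryptField` are assumptions, and key management (loading the key from Vault, rotation) is out of scope here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM cipher; the key must be exactly 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptField returns base64(nonce || ciphertext || tag) for plaintext.
func encryptField(key []byte, plaintext string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	// A fresh random nonce per value; reusing a nonce under one key breaks GCM.
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// decryptField reverses encryptField and fails if the data was tampered with.
func decryptField(key []byte, encoded string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key := make([]byte, 32) // demo key only; load real keys from a secret store
	if _, err := rand.Read(key); err != nil {
		fmt.Println(err)
		return
	}
	enc, err := encryptField(key, "123-45-6789")
	if err != nil {
		fmt.Println(err)
		return
	}
	plain, err := decryptField(key, enc)
	fmt.Println(plain, err)
}
```

Because each value gets a random nonce, encrypting the same SSN twice yields different ciphertexts, which also means encrypted columns cannot be used for equality lookups without a separate index strategy.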

Layer 5: Audit & Monitoring

Unified Audit Trail


# From specs/policies/security.dspec.yaml:
# All significant state-changing actions MUST be recorded
# • User identity
# • Timestamp
# • Action performed
# • Affected resources
 
# NATS subject: cascade.audit.events
audit_event:
  timestamp: "2024-10-29T15:30:45Z"
  user_id: "user:alice"
  action: "execute_workflow"
  resource: "workflow:ProcessOrder"
  status: "success"
  changes:
    - field: "status"
      old_value: "pending"
      new_value: "running"
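
A producer can enforce the required fields before anything is published. The sketch below models the event as a Go struct and refuses to encode it when identity, timestamp, action, or resource is missing. JSON encoding and the type and function names are assumptions, and the actual publish call to NATS is left out because it needs a client library.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// auditSubject is the NATS subject named in the spec excerpt.
const auditSubject = "cascade.audit.events"

type Change struct {
	Field    string `json:"field"`
	OldValue string `json:"old_value"`
	NewValue string `json:"new_value"`
}

type AuditEvent struct {
	Timestamp string   `json:"timestamp"` // RFC 3339
	UserID    string   `json:"user_id"`
	Action    string   `json:"action"`
	Resource  string   `json:"resource"`
	Status    string   `json:"status"`
	Changes   []Change `json:"changes,omitempty"`
}

// validate checks the four fields the spec marks as mandatory.
func (e AuditEvent) validate() error {
	if _, err := time.Parse(time.RFC3339, e.Timestamp); err != nil {
		return errors.New("timestamp must be RFC 3339")
	}
	switch {
	case e.UserID == "":
		return errors.New("user_id is required")
	case e.Action == "":
		return errors.New("action is required")
	case e.Resource == "":
		return errors.New("resource is required")
	}
	return nil
}

// encodeAuditEvent validates the event and returns its JSON payload.
func encodeAuditEvent(e AuditEvent) ([]byte, error) {
	if err := e.validate(); err != nil {
		return nil, err
	}
	return json.Marshal(e)
}

func main() {
	payload, err := encodeAuditEvent(AuditEvent{
		Timestamp: "2024-10-29T15:30:45Z",
		UserID:    "user:alice",
		Action:    "execute_workflow",
		Resource:  "workflow:ProcessOrder",
		Status:    "success",
		Changes:   []Change{{Field: "status", OldValue: "pending", NewValue: "running"}},
	})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("publish to %s: %s\n", auditSubject, payload)
}
```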

Audit Log Analysis


# Query audit logs
cascade logs --type=audit --since=24h --action=execute_workflow
 
# Alert on suspicious activity
cascade alerts create --condition='audit.action == "database_write" && audit.user != "system"' \
  --severity=high
 
# Compliance reports
cascade audit report --type=pci-dss --period=monthly

Security Alerts


alerts:
  - name: UnauthorizedAccess
    condition: |
      audit.status == "denied" AND 
      rate(count over 5m) > 10
    action: notify_security_team
    severity: high
  
  - name: SuspiciousDataAccess
    condition: |
      audit.action == "data_read" AND
      audit.records_accessed > 10000
    action: require_mfa
    severity: medium
  
  - name: FailedAuthAttempts
    condition: |
      auth.status == "failed" AND
      rate(count over 1m) > 5
    action: rate_limit_ip
    severity: high
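
Conditions such as `rate(count over 1m) > 5` amount to a sliding-window count. The sketch below is one simple way to evaluate them, not Cascade's alert engine. It assumes events arrive in non-decreasing time order and uses Unix seconds for timestamps.

```go
package main

import "fmt"

// slidingWindow counts events seen within the last windowSecs seconds.
type slidingWindow struct {
	windowSecs int64
	threshold  int
	events     []int64 // Unix timestamps, oldest first
}

// record adds an event at time ts, drops events that have left the window,
// and reports whether the remaining count exceeds the threshold.
func (w *slidingWindow) record(ts int64) bool {
	w.events = append(w.events, ts)
	cutoff := ts - w.windowSecs
	i := 0
	for i < len(w.events) && w.events[i] <= cutoff {
		i++
	}
	w.events = w.events[i:]
	return len(w.events) > w.threshold
}

func main() {
	// FailedAuthAttempts: more than 5 failures within 1 minute.
	w := &slidingWindow{windowSecs: 60, threshold: 5}
	for ts := int64(0); ts < 6; ts++ {
		fmt.Println(ts, w.record(ts)) // only the sixth event (ts=5) prints true
	}
}
```

In practice one window would be kept per key (for example, per source IP for `FailedAuthAttempts`), so that a single noisy client does not trigger the alert for everyone.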

Agent-Specific Security

Agent Archetype Capabilities


Observer       → Read-only access (no write permission)
Classifier     → Analyze & classify (no data mutation)
Actor          → Perform actions (write capability)
Responder      → Handle interactions (limited write)
Orchestrator   → Orchestrate workflows (full control)

Agent Tool Authorization


# From specs/agents/security_and_governance.dspec.yaml
# (package name assumed to match the database write policy)
package agent_policies

# Enables the `if` and `in` keywords (OPA 0.59 or later)
import rego.v1

# Default: deny (explicit allow required)
default allow_tool_call := false

allow_tool_call if {
    # Check archetype capability
    allowed_archetypes := ["actor", "responder", "orchestrator"]
    input.agent_archetype in allowed_archetypes

    # Check tool in allowed set
    input.tool in input.agent_tools

    # Check scope/permissions
    input.resource in input.agent_scopes

    # Check rate limiting (rate_limit_check must be defined elsewhere in this package)
    rate_limit_check(input.agent_id, input.tool)
}

Least Privilege Pattern

Agent Data Access (Context Whitelist)


// From specs/policies/security.dspec.yaml:
// "An Agent's context block is a mandatory whitelist of specific API calls."
 
type AgentContext struct {
    AllowedAPIs []string
    Projections map[string][]string  // API → fields
}
 
// Example: Only allow customer API, specific fields only
var agentContext = &AgentContext{
    AllowedAPIs: []string{"GetCustomer", "ListOrders"},
    Projections: map[string][]string{
        "GetCustomer": {"id", "name", "email"},  // Exclude SSN, payment info
        "ListOrders": {"id", "total", "status"},
    },
}
 
// SDK enforces projections before returning to agent
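
How the SDK enforces projections is not shown in this guide. A minimal sketch of the idea, assuming records are represented as maps and using a hypothetical `project` method, might look like this:

```go
package main

import (
	"errors"
	"fmt"
)

type AgentContext struct {
	AllowedAPIs []string
	Projections map[string][]string // API -> fields
}

// project rejects APIs that are not whitelisted and returns a copy of record
// containing only the fields listed for that API. An allowed API with no
// projection yields an empty record, so the default is to expose nothing.
func (c *AgentContext) project(api string, record map[string]any) (map[string]any, error) {
	allowed := false
	for _, a := range c.AllowedAPIs {
		if a == api {
			allowed = true
			break
		}
	}
	if !allowed {
		return nil, errors.New("API not in agent context: " + api)
	}
	out := map[string]any{}
	for _, field := range c.Projections[api] {
		if v, ok := record[field]; ok {
			out[field] = v
		}
	}
	return out, nil
}

func main() {
	agentContext := &AgentContext{
		AllowedAPIs: []string{"GetCustomer", "ListOrders"},
		Projections: map[string][]string{
			"GetCustomer": {"id", "name", "email"},
			"ListOrders":  {"id", "total", "status"},
		},
	}
	customer := map[string]any{
		"id": "c-1", "name": "Alice", "email": "alice@example.com", "ssn": "123-45-6789",
	}
	safe, err := agentContext.project("GetCustomer", customer)
	fmt.Println(safe, err) // ssn is absent from the result

	_, err = agentContext.project("DeleteCustomer", customer)
	fmt.Println(err)
}
```

Filtering by an allow-list of fields, rather than removing known sensitive ones, means a newly added column stays hidden from agents until someone explicitly exposes it.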

Scope Declaration (CDL)


agents:
  - name: order-processor
    archetype: actor
    capabilities:
      - read.database
      - write.database
    scopes:
      - tables: ["orders", "order_items"]
      - exclude: ["system_*"]
    constraints:
      - max_records: 1000
      - rate_limit: 100/minute
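
A build-time validator can reject a declaration like this before it ever reaches the runtime policy engine. The sketch below is not the CDL validator itself. It assumes a simplified `AgentSpec` struct and checks only that write capabilities match the archetype and that no system tables appear in the scopes.

```go
package main

import (
	"fmt"
	"strings"
)

// AgentSpec is a simplified, hypothetical view of one CDL agent entry.
type AgentSpec struct {
	Name         string
	Archetype    string
	Capabilities []string
	Tables       []string
}

// archetypes maps each known archetype to whether it may hold write capabilities.
var archetypes = map[string]bool{
	"observer":     false,
	"classifier":   false,
	"actor":        true,
	"responder":    true,
	"orchestrator": true,
}

var systemPrefixes = []string{"pg_", "information_schema.", "system_"}

// validate returns a list of problems; an empty list means the spec passed.
func validate(a AgentSpec) []string {
	var problems []string
	canWrite, known := archetypes[a.Archetype]
	if !known {
		problems = append(problems, fmt.Sprintf("%s: unknown archetype %q", a.Name, a.Archetype))
	}
	for _, c := range a.Capabilities {
		if strings.HasPrefix(c, "write.") && !canWrite {
			problems = append(problems,
				fmt.Sprintf("%s: archetype %q may not declare %s", a.Name, a.Archetype, c))
		}
	}
	for _, t := range a.Tables {
		for _, p := range systemPrefixes {
			if strings.HasPrefix(t, p) {
				problems = append(problems, fmt.Sprintf("%s: system table %q in scopes", a.Name, t))
				break
			}
		}
	}
	return problems
}

func main() {
	spec := AgentSpec{
		Name:         "order-processor",
		Archetype:    "actor",
		Capabilities: []string{"read.database", "write.database"},
		Tables:       []string{"orders", "order_items"},
	}
	fmt.Println(len(validate(spec))) // 0

	spec.Archetype = "observer"
	fmt.Println(validate(spec))
}
```

Catching the mismatch at build time gives the author a direct error message, while the OPA and RLS layers remain as the enforcement backstop at runtime.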

Compliance Checklist

Authentication & Authorization

All users authenticate via Kratos/OAuth2
MFA enforced for privileged accounts
RBAC/ReBAC configured
Service accounts use short-lived tokens
API keys rotated every 90 days

Data Protection

PII encrypted at rest
RLS policies configured
Column-level encryption for sensitive fields
TLS 1.3+ for all network communication
Disk encryption enabled

Audit & Monitoring

Audit trail enabled (all state-changing actions logged)
Real-time alerts configured
Access logs retained for 365 days
Security dashboard monitored daily
Incidents documented

Secrets Management

No hardcoded secrets
Vault/K8s Secrets for all credentials
Secret rotation automated
Secret access logged

Security Best Practices

✅ DO:

Implement all 5 layers
Log every security-relevant action (never secrets or plaintext PII)
Alert on anomalies
Rotate secrets
Use least privilege
Enable MFA
Keep systems updated
Test security regularly

❌ DON’T:

Skip layers
Trust user input
Use default credentials or default configurations
Store plaintext passwords
Grant broad permissions
Disable logging
Use outdated libraries

Updated: October 29, 2025
Version: 1.0
Standard: NIST AI RMF, OWASP LLM Top 10