
Troubleshooting

Common issues and how to solve them.


Installation Issues

Error: “Docker is not running”

Message: Cannot connect to the Docker daemon

Cause: Docker service is not started

Solution:

# macOS
open -a Docker

# Linux
sudo systemctl start docker

# Windows
# Use Docker Desktop GUI or:
# Start-Service com.docker.service (PowerShell)

Error: “Port already in use”

Message: Error response from daemon: bind: address already in use

Cause: A host port needed by PostgreSQL, Redis, or another service is already bound by a different process

Solution:

Option 1 - Change the port in docker-compose.yml:

services:
  postgres:
    ports:
      - "5433:5432"  # Host port changed from 5432 to 5433

Option 2 - Stop the conflicting process:

# Find what's using port 5432
lsof -i :5432

# Kill the process
kill -9 <PID>
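Before reaching for kill -9, it can help to list exactly which processes are listening on the port and to try a normal termination first. The following is a minimal sketch, assuming lsof is installed; port_pids is a hypothetical helper name and is not part of the project.

```shell
#!/usr/bin/env sh
# port_pids PORT: print the PIDs listening on TCP PORT, space-separated.
# Hypothetical helper; relies on lsof's terse (-t) output mode.
port_pids() {
  lsof -t -iTCP:"$1" -sTCP:LISTEN 2>/dev/null | sort -u | tr '\n' ' ' | sed 's/ $//'
}

# Example: ask the listeners on 5432 to exit with SIGTERM (kill's default signal).
# pids=$(port_pids 5432)
# [ -n "$pids" ] && kill $pids
```

If a process ignores SIGTERM, kill -9 remains the fallback.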

Error: “Services won’t start”

Message: service did not converge or containers keep restarting

Solution: Check the logs:

# View all logs
docker-compose logs -f

# View specific service
docker-compose logs -f postgres
docker-compose logs -f temporal

# View the last 50 log lines for a service
docker-compose logs --tail 50 postgres

Error: “Out of memory”

Message: Cannot allocate memory or Docker crashes

Cause: Insufficient system resources

Solution:

Docker Desktop (macOS/Windows):

  1. Open Docker Desktop preferences
  2. Go to Resources
  3. Increase Memory to 6-8GB
  4. Click Apply & Restart

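Linux: Docker Engine runs containers directly on the host, so there is no daemon-wide memory setting to raise. Do not add memory or memory-swap keys to /etc/docker/daemon.json: they are not valid daemon options, and the daemon refuses to start when the file contains keys it does not recognize. Instead, check how much memory the host has free:

free -h

If memory is short, stop other workloads or add swap. If a service in docker-compose.yml sets its own memory limit, raise that limit, then recreate the containers:

docker-compose up -d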
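On a Linux host, a quick way to compare available memory against the 6-8GB suggested for Docker Desktop is to read MemAvailable from /proc/meminfo. This is a minimal sketch; mem_available_gib and warn_if_low_memory are hypothetical helper names, and the 6 GiB threshold simply mirrors the lower bound mentioned above.

```shell
#!/usr/bin/env sh
# mem_available_gib [FILE]: print available memory in whole GiB (rounded down).
# FILE defaults to /proc/meminfo (Linux only); values in that file are in kB.
mem_available_gib() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# warn_if_low_memory [FILE]: return non-zero when less than 6 GiB is available.
warn_if_low_memory() {
  avail=$(mem_available_gib "$@")
  if [ "${avail:-0}" -lt 6 ]; then
    echo "Only ${avail:-0} GiB available; the stack may run out of memory" >&2
    return 1
  fi
}
```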

Error: “Health check fails”

Message: make health shows ✗ for some services

Solution:

Wait for services to start (they take time):

# Wait 30 seconds
sleep 30

# Try again
make health

Check individual service logs:

docker-compose logs redis
docker-compose logs temporal
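Rather than sleeping for a fixed 30 seconds, the health check can be polled until it passes. The sketch below is one way to do that in plain POSIX shell; retry_until is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
#!/usr/bin/env sh
# retry_until ATTEMPTS DELAY CMD...: run CMD until it succeeds, waiting DELAY
# seconds between tries. Returns 0 on the first success, 1 if every try fails.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    # Skip the pause after the final attempt.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage: poll the health check up to 6 times, 10 seconds apart.
# retry_until 6 10 make health
```

If the check still fails after the last attempt, the per-service logs are the next place to look.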

Deployment Issues

Error: “Application already exists”

Message: Application 'my-app' already exists

Cause: You’re trying to create an app with a name that’s already in use

Solution:

# Option 1: Use a different name
cascade app apply --file app.yaml --name my-app-v2

# Option 2: Delete the existing app first
cascade app delete my-app
cascade app apply --file app.yaml

Error: “CDL validation failed”

Message: Invalid state: unknown state type 'MyState'

Cause: YAML syntax error or invalid state type

Solution:

Check the error message for the specific problem:

# Example issues:
# - State type typo (Taask vs Task)
# - Missing 'next' field
# - Invalid resource URN format
# - Missing required fields

Fix common issues:

# ❌ WRONG
- name: MyState
  type: Taask  # Typo!
  next: NextState

# ✅ CORRECT
- name: MyState
  type: Task
  resource: urn:cascade:activity:my_activity
  next: NextState

Error: “Schema not found”

Message: urn:cascade:schema:leave_request_form not found

Cause: The schema URN doesn’t exist in your application

Solution:

Define the schema in your application:

spec:
  schemas:
    - urn: urn:cascade:schema:leave_request_form
      type: object
      properties:
        employee_name:
          type: string
        leave_dates:
          type: array

Or reference an existing schema from the platform.


Workflow Issues

Workflow stuck in “RUNNING” state

Message: Workflow hasn’t completed after several hours

Cause: Workflow is waiting for something (HumanTask, external event)

Solution:

Check what state it’s in:

cascade process inspect <instance-id> --app=<app-name>
# Look for states with status "waiting" or "pending"

Options:

  1. If waiting for HumanTask: Complete the human task in the UI
  2. If waiting for external event: Send the event
  3. If stuck for no reason: Check logs for errors
cascade logs --app=<app-name> --instance=<instance-id>

Error: “Activity execution failed”

Message: Activity resource:urn:cascade:activity:my_activity failed after 3 retries

Cause: Your Go activity function returned an error

Solution:

  1. Check the error details in logs:
cascade logs --app=<app-name> --instance=<instance-id>
  2. Common issues:

    • Database connection failed → Check database is running
    • API call failed → Check network connectivity
    • Invalid input parameters → Check parameter mapping
  3. Fix and retry:

# The workflow will retry automatically
# Or manually retry the entire workflow
cascade process cancel <instance-id> --app=<app-name>
# Then start a new instance

Error: “Policy evaluation failed”

Message: Policy urn:cascade:policy:my_policy evaluation failed

Cause: OPA policy has a syntax error or runtime issue

Solution:

  1. Check the policy file for syntax errors:
# ❌ WRONG - missing colon
package my_policy
result = "value"

# ✅ CORRECT
package my_policy
result := "value"
  2. Test the policy locally:
# Using OPA CLI (if installed): load the policy into the interactive REPL
opa run policy.rego
# Then test interactively:
# data.my_policy.result with input as {...}
  3. Check logs for exact error:
cascade logs --instance=<instance-id> | grep -i policy

Error: “State not found”

Message: State 'NextState' not found in workflow

Cause: Typo in state name or missing state definition

Solution:

Check your YAML for typos:

# ❌ WRONG - typo in next state
- name: MyState
  type: Task
  next: NextSate  # Typo!

# ✅ CORRECT
- name: MyState
  type: Task
  next: NextState

Ensure all states referenced in next are defined.
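Dangling next references can also be found mechanically. The sketch below assumes the flat layout shown above (one name: and at most one next: per state, unquoted values) and treats every name: line in the file as a state definition, so it may miss problems or report false positives in files that use name: for other things. undefined_next_states is a hypothetical helper, not a Cascade command.

```shell
#!/usr/bin/env sh
# undefined_next_states FILE: print each value used in a "next:" field that
# never appears in a "name:" field, one per line, sorted.
undefined_next_states() {
  awk '
    # Strip a trailing comment and trailing whitespace from a captured value.
    function clean(s) { sub(/[ \t]*#.*$/, "", s); sub(/[ \t]+$/, "", s); return s }
    /^[ \t]*(-[ \t]*)?name:/ { v = $0; sub(/^[ \t]*(-[ \t]*)?name:[ \t]*/, "", v); defined[clean(v)] = 1 }
    /^[ \t]*(-[ \t]*)?next:/ { v = $0; sub(/^[ \t]*(-[ \t]*)?next:[ \t]*/, "", v); refs[clean(v)] = 1 }
    END { for (r in refs) if (!(r in defined)) print r }
  ' "$1" | sort
}

# Usage: prints nothing when every referenced state is defined.
# undefined_next_states app.yaml
```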


Performance Issues

UI Rendering is slow (>200ms)

Cause: Schema is complex or cache is not working

Solution:

  1. Check if Redis is running and healthy:
make health
# Look for Redis ✓
  2. Simplify your schema (fewer fields, less nesting)

  3. Check cache hit rate:

cascade metrics --instance=<instance-id> | grep -i cache

Workflow execution is slow

Cause: Activities taking too long or Temporal configuration

Solution:

  1. Check activity execution times:
cascade trace --instance=<instance-id>
# Look for activities with long duration
  2. Common issues:

    • Database queries too slow → Add indexes
    • API calls slow → Increase timeout
    • External system issues → Check connectivity
  3. Check Temporal server health:

docker-compose logs temporal

Policy evaluation is slow (>10ms)

Cause: OPA policy is complex or cache is not configured

Solution:

  1. Use the right policy engine:

    • Choice for simple logic (<0.1ms)
    • OPA for medium (expected <5ms)
    • DMN for complex (expected 10-50ms)
  2. Check if caching is enabled:

- name: MyPolicy
  type: EvaluatePolicy
  policy: urn:cascade:policy:my_policy
  cacheResult: true  # Enable caching
  3. Simplify your OPA policy:
    • Fewer rules
    • Avoid heavy computations
    • Use built-in functions efficiently

Debug Commands

View Application Details

cascade app inspect my-app

Shows: workflows, policies, schemas, deployment status

View Workflow Instance

cascade process inspect <instance-id> --app=my-app

Shows: current state, status, elapsed time, outputs

View Execution Trace

cascade trace <instance-id> --format=json

Shows: all state transitions, timestamps, durations

View Logs

# Last 50 lines
cascade logs --app=my-app --instance=<instance-id> --tail 50

# Search for errors
cascade logs --app=my-app --instance=<instance-id> | grep ERROR

# Follow logs in real-time
cascade logs --app=my-app --instance=<instance-id> --follow

View Metrics

cascade metrics --instance=<instance-id>

Shows: execution time, cache hits, policy evaluation times

List All Instances

cascade process list --app=my-app

# Filter by status
cascade process list --app=my-app --status=running
cascade process list --app=my-app --status=completed
cascade process list --app=my-app --status=failed

Connection Issues

Cannot connect to PostgreSQL

Error: connection refused at localhost:5432

Cause: Database is not running or not accessible

Solution:

# Check if PostgreSQL container is running
docker-compose ps | grep postgres

# If not running, start it
docker-compose up -d postgres

# Wait a moment and test connection
docker-compose exec postgres psql -U postgres -d cascade

Cannot connect to Redis

Error: connection refused at localhost:6379

Cause: Redis is not running

Solution:

# Check status
docker-compose logs redis

# Restart
docker-compose restart redis

# Verify
redis-cli ping
# Should return PONG

Cannot connect to Temporal

Error: connection refused at localhost:7233

Cause: Temporal server is not running

Solution:

# Check status
docker-compose logs temporal

# Temporal takes time to start, wait and retry
sleep 15
cascade process list  # Try a command that uses Temporal
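The three connection checks above can be combined into one pass over the default ports (5432, 6379, 7233). This is a minimal sketch that assumes a netcat build supporting the -z (scan only) flag is installed; check_services is a hypothetical helper, and the port list needs adjusting if you remapped any port in docker-compose.yml.

```shell
#!/usr/bin/env sh
# check_services: probe the default local ports for PostgreSQL, Redis and
# Temporal. Prints one line per service; returns non-zero if any is unreachable.
check_services() {
  failed=0
  for entry in postgres:5432 redis:6379 temporal:7233; do
    name=${entry%%:*}   # text before the colon
    port=${entry##*:}   # text after the colon
    if nc -z localhost "$port" >/dev/null 2>&1; then
      echo "ok   $name ($port)"
    else
      echo "FAIL $name ($port)"
      failed=1
    fi
  done
  return "$failed"
}
```

An open port only shows that something is listening; it does not confirm that the service behind it is healthy, so make health remains the more thorough check.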

Getting Help

If none of these solutions work:

  1. Check logs first:

    docker-compose logs -f
  2. Check GitHub issues:

  3. Ask in community:

  4. Create a GitHub issue:

    • Include: error message, steps to reproduce, logs
    • Example output from make health
    • Your OS and Docker version
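The items listed for a GitHub issue can be gathered into a single file. The sketch below uses only commands that already appear in this guide plus uname; collect_diagnostics and the default file name are hypothetical, and the 200-line log tail is an arbitrary choice.

```shell
#!/usr/bin/env sh
# collect_diagnostics [OUTFILE]: write OS info, Docker version, health check
# output and recent container logs to OUTFILE for attaching to an issue.
collect_diagnostics() {
  out=${1:-cascade-diagnostics.txt}
  {
    echo "== OS =="
    uname -a
    echo "== Docker version =="
    docker --version 2>&1
    echo "== make health =="
    make health 2>&1
    echo "== docker-compose logs (last 200 lines per service) =="
    docker-compose logs --no-color --tail=200 2>&1
  } > "$out"
  echo "Wrote $out"
}
```

Review the file before posting it, since container logs can contain credentials or other sensitive values.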

Tip: Most issues are resolved by:

  1. Checking make health (are all services running?)
  2. Checking logs (what’s the actual error?)
  3. Restarting the stack (docker-compose down && make setup)

Happy troubleshooting! 🔧
