Troubleshooting
Operational triage playbook for the most common Aegis + Axis rollout and auth issues.
Overview
Use a top-down triage order: endpoint reachability, authentication, authorization, then rollout/audit evidence checks.
Capture UTC timestamps and request IDs wherever available.
Endpoint Assumptions
- Production auth and API host:
https://axis.velikey.com(signin:https://axis.velikey.com/auth/signin). - Non-production override: set
AXIS_BASE_URLexplicitly before running docs commands. - Manual operator substitutions: provide tenant-scoped values for cookies, bearer tokens, agent IDs, and tenant slugs.
Actionable Steps
- Verify Axis sign-in endpoint and operator session validity.
- Validate agent policy endpoint with matching agent token and agent ID.
- Confirm rollout receipt presence for change operations.
# manual-only example
# production default; set AXIS_BASE_URL explicitly for staging/test.
export AXIS_BASE_URL="${AXIS_BASE_URL:-https://axis.velikey.com}"
export SESSION_COOKIE="axis-session=REDACTED"
export AGENT_ID="agent-001"
export AGENT_BEARER_TOKEN="REDACTED"
curl -fsS "$AXIS_BASE_URL/auth/signin" -o /dev/null
curl -fsS -H "Authorization: Bearer $AGENT_BEARER_TOKEN" "$AXIS_BASE_URL/api/agents/$AGENT_ID/policies" | jq
curl -fsS -H "Cookie: $SESSION_COOKIE" "$AXIS_BASE_URL/api/rollout-receipts?limit=5" | jq
Rollout Receipts Checks
Use receipt checks when rollout apply/rollback evidence is missing or disputed.
Confirm the actor role has owner/admin privileges, then query receipt list and detail endpoints in the same tenant context.
Terraform and Helm Install Triage
- Terraform init succeeds but plan fails: validate kube context and Helm provider auth in the same shell session.
- Helm release created but agent not ready: verify existing secret name and control-plane URL are correct in values.
- TLS proxy not accepting traffic: verify agent TLS secret key names (
tls.crt,tls.key) and mounted paths. - Public endpoint returns 404/502: verify ingress host/path mapping and backend service reachability.
Reference docs: Install with Terraform • Install with Helm • Public Endpoint Patterns
Validation Checks (Last Step)
# executable example command -v curl command -v jq python3 --version node --version
These checks cover the tooling used in local diagnostics and docs command audit scripts.
Common Failure Modes
- `401` on `/api/agents/{id}/policies`: token missing, expired, or tenant/agent binding mismatch.
- `403` on audit/receipt endpoints: account lacks owner/admin privileges.
- No receipt for rollout action: plan/apply failed before receipt generation or used wrong tenant context.
- Intermittent API failures: stale browser session cookie or reverse proxy header issues.
Navigate Docs
Docs Index • Previous: Axis Alerts and Audit • Install with Terraform • Install with Helm • Engineers Portal