Nimbus

AI-Powered Remediation

Some failures do not clear on a plain rerun — a query genuinely conflicts with another job, or a step times out because it is doing too much at once. For these, Nimbus can bring in AI remediation: it asks Nimbus AI to diagnose the failure and propose a concrete fix.

When AI remediation runs

AI remediation is attempt #2. It runs only when all three of the following are true:

  1. A monitoring rule matched the failure.
  2. AI recovery is enabled on that rule.
  3. The plain rerun (attempt #1) failed.

If attempt #1 succeeds, AI is never invoked.
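The gating conditions above can be read as a single boolean check. The following is a minimal sketch, not Nimbus's actual implementation: the `FailureContext` type, its field names, and the function name are all hypothetical and chosen only to mirror the three conditions.

```python
from dataclasses import dataclass


@dataclass
class FailureContext:
    """Hypothetical summary of a failed run (field names are assumptions)."""
    rule_matched: bool         # a monitoring rule matched the failure
    ai_recovery_enabled: bool  # AI recovery is enabled on that rule
    rerun_succeeded: bool      # outcome of the plain rerun (attempt #1)


def should_run_ai_remediation(ctx: FailureContext) -> bool:
    """Return True only when all three documented conditions hold."""
    return (
        ctx.rule_matched
        and ctx.ai_recovery_enabled
        and not ctx.rerun_succeeded
    )
```

For example, a failure with a matching rule and AI recovery enabled still skips AI remediation if the plain rerun succeeded.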

How it works

  1. Classify — Nimbus first runs a fast, rule-based classification of the error: timeout, query_conflict, data_volume, syntax, permission, missing_object, or unknown.
  2. Diagnose — the classification, the error message, the activity detail, and the relevant query text are sent to Nimbus AI, which produces a plain-language diagnosis and a set of proposed changes.
  3. Apply — Nimbus applies the proposed changes to *staging* assets — for example creating a staging data extension or splitting a heavy query into smaller steps. Production is not touched yet.
  4. Rerun — the remediated automation is fired.
  5. Resolve — depending on the rule's mode, the fix is recorded, queued for approval, or promoted.

The diagnosis and the before/after diff are always saved to a remediation report, visible on the incident in Monitoring.
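To make the Classify step more concrete, here is a rough sketch of what a fast, rule-based classifier could look like. The category names come from the list above; the regular expressions, their order, and the `classify_error` function are illustrative assumptions, since the actual matching rules Nimbus uses are not documented here.

```python
import re

# Hypothetical patterns: the categories are documented, the regexes are not.
# Order matters: the first matching pattern wins.
_PATTERNS = [
    ("timeout", re.compile(r"timed? ?out", re.IGNORECASE)),
    ("query_conflict", re.compile(r"conflict|deadlock", re.IGNORECASE)),
    ("data_volume", re.compile(r"too many rows|row limit|out of memory", re.IGNORECASE)),
    ("syntax", re.compile(r"syntax", re.IGNORECASE)),
    ("permission", re.compile(r"permission|access denied|not authorized", re.IGNORECASE)),
    ("missing_object", re.compile(r"does not exist|not found|invalid object", re.IGNORECASE)),
]


def classify_error(message: str) -> str:
    """Map an error message to one of the documented categories."""
    for label, pattern in _PATTERNS:
        if pattern.search(message):
            return label
    # Anything the rules do not recognize falls through to "unknown".
    return "unknown"
```

A keyword pass like this is cheap enough to run on every failure, which fits the description of classification as a fast step that happens before any AI call.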

The three modes

| Mode | AI diagnoses | AI applies a fix | Reaches production |
| --- | --- | --- | --- |
| **Advisory** | Yes | No | Never (recommendation only) |
| **Assisted** | Yes | Yes (staging) | Only after a human approves |
| **Autonomous** | Yes | Yes (staging) | Automatically, if the rerun succeeds |

Advisory

The safest mode. AI tells you what it thinks is wrong and what it would do. You stay fully in control and apply changes yourself. Good for getting comfortable with AI recommendations.

Assisted

AI applies its fix to staging and reruns it. If the rerun works, the fix waits for you on the incident with a Review & promote button. You see the exact diff before anything changes in production. This is the recommended mode for most teams.

Autonomous

AI applies, reruns, and — if the rerun succeeds — promotes the fix to the production automation with no human step. Use this only for automations where you trust the failure patterns and the cost of a delayed fix outweighs the value of review.
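The difference between the three modes comes down to what happens at the Resolve step. The sketch below encodes that decision under stated assumptions: the `Mode` enum, the `resolve` function, and the outcome strings are hypothetical names, and the behavior when a remediated rerun fails is not described on this page, so the sketch simply records the attempt in that case.

```python
from enum import Enum


class Mode(Enum):
    ADVISORY = "advisory"
    ASSISTED = "assisted"
    AUTONOMOUS = "autonomous"


def resolve(mode: Mode, rerun_succeeded: bool) -> str:
    """Decide what happens to a proposed fix (outcome names are assumptions)."""
    if mode is Mode.ADVISORY:
        # Advisory never applies a fix; the recommendation is only recorded.
        return "recorded"
    if not rerun_succeeded:
        # Assumption: a failed remediated rerun is recorded and nothing
        # is promoted. The source does not specify this case.
        return "recorded"
    if mode is Mode.ASSISTED:
        # The fix waits on the incident for a human to review and promote.
        return "queued_for_approval"
    # Autonomous: promoted to production with no human step.
    return "promoted"
```

Note that only one branch reaches production without a person in the loop, which is why Autonomous is best reserved for automations whose failure patterns you already trust.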

Promoting a fix

In Assisted mode, open the incident, review the diagnosis and diff, and click Review & promote. Nimbus updates the production automation's steps with the approved changes. You can also Decline a proposal — the staging assets are left in place but production is untouched.

AI usage and quotas

Every AI call — remediation and the AI Assistant — consumes tokens. Nimbus tracks output tokens per Business Unit per month against your plan's limit. When a Business Unit reaches its limit, further AI actions are paused until the next billing period. Check usage under Settings → Billing.
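As an illustration of the quota rule, the following sketch tracks output tokens per Business Unit per period and reports when AI actions would be paused. The `TokenQuota` class, its method names, and the use of a `"YYYY-MM"` string as the period key are assumptions made for the example; the real accounting is handled by Nimbus and surfaced under Settings → Billing.

```python
from collections import defaultdict


class TokenQuota:
    """Hypothetical per-Business-Unit monthly counter of output tokens."""

    def __init__(self, monthly_limit: int) -> None:
        self.monthly_limit = monthly_limit
        # Key: (business_unit, period), where period is e.g. "2024-05".
        self._used: dict[tuple[str, str], int] = defaultdict(int)

    def record(self, business_unit: str, period: str, output_tokens: int) -> None:
        """Add the output tokens consumed by one AI call."""
        self._used[(business_unit, period)] += output_tokens

    def is_paused(self, business_unit: str, period: str) -> bool:
        """AI actions pause once usage reaches the limit for that period."""
        return self._used.get((business_unit, period), 0) >= self.monthly_limit
```

Because usage is keyed by period, a Business Unit that hit its limit in one period starts the next one with a zero count, matching the description of AI actions resuming in the next billing period.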