What it is
The Flow Monitor periodically reads your Power Automate cloud flows and watches for the failure modes that usually go unnoticed:
- Off — a flow was turned off (Stopped) and is no longer running on its trigger. This is the classic "it was off for ten days and nobody noticed" incident.
- Suspended — Power Automate auto-disabled the flow after repeated failures.
- Failing — the flow is on, but more than your configured percentage of its recent runs failed (default: > 20% of the last 20 runs).
- Stale — the flow is on and has runs, but none succeeded recently (default: no success in 24 hours), which usually means it is silently failing or stuck.
When a flow has a problem, TATER raises a monitoring finding and files a single de-duplicated Ops ticket for that flow (refreshed in place if the problem persists, and auto-closed when the flow recovers). It also forwards a power_automate.flow.degraded / .recovered event to your SIEM if you have one configured. The scan runs automatically every hour, and you can run it on demand.
One-time setup — connecting to your tenant
TATER reads your flows app-only (no user sign-in) via the Microsoft Power Automate Management API. You register an app in your Microsoft tenant, give it Power Platform access, and paste its credentials into TATER once.
1. Register an app
- In the Microsoft Entra admin center → App registrations → New registration. Name it something like "TATER Flow Monitor". Single tenant is fine.
- Under Certificates & secrets, create a client secret and copy its value (you only see it once).
- Note the Application (client) ID and your Directory (tenant) ID from the app's Overview page.
Alternatively, do the same registration from the command line with the Azure CLI. The commands below are PowerShell; on Windows, run them in PowerShell 7+ (pwsh). Note the PowerShell assignment syntax $appId = az ... (the bash form appId=$(az ...) fails in PowerShell with "is not recognized as a name of a cmdlet"). The following creates the app, its service principal, and a two-year secret:
# You're signed into the customer tenant (az login --tenant <tenantId>)
$appId = az ad app create --display-name "TATER-FlowMonitor" --query appId -o tsv
az ad sp create --id $appId
$secret = az ad app credential reset --id $appId --years 2 --query password -o tsv
Write-Host "App (client) ID: $appId"
Write-Host "Client secret: $secret" # copy both — the secret is shown once
App registration is a tenant-level (Entra) operation, so it does not matter which Azure subscription is selected at az login.
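For orientation, app-only access of this kind is normally obtained with the OAuth 2.0 client-credentials grant against the Microsoft identity platform token endpoint. The sketch below only builds that token request and does not send it. The function name and the scope value are assumptions made for illustration, not something TATER documents; confirm the scope your API expects.

```python
from urllib.parse import urlencode
from typing import Tuple

# Assumption: scope for the Power Automate (Flow) service. Verify before use.
ASSUMED_SCOPE = "https://service.flow.microsoft.com//.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str,
                        scope: str = ASSUMED_SCOPE) -> Tuple[str, str]:
    """Return (url, form_body) for a client-credentials token request.

    The body is application/x-www-form-urlencoded and would be sent as a POST.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",  # app-only: no user sign-in involved
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    return url, body
```

The three inputs are the same values you note during registration: the Directory (tenant) ID, the Application (client) ID, and the client secret.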
2. Grant it Power Platform access
The app needs to be able to read flows across your environments. Register it as a Power Platform management application (run once, as a Power Platform admin, in PowerShell):
Install-Module -Name Microsoft.PowerApps.Administration.PowerShell -Scope CurrentUser -Force
Add-PowerAppsAccount -TenantID <your-tenant-id>
New-PowerAppManagementApp -ApplicationId $appId
This authorizes the app for app-only access to the Power Platform admin APIs, which is what lets TATER enumerate every flow and its state. Without it, TATER can still authenticate but only sees flows the app itself owns (you'll see a "per-user scope" note on the Test button) — fine for a proof of concept, but grant the management app for full coverage.
Monitoring all environments (Dataverse)
Power Platform has two kinds of environments, and they expose flows differently:
- The Default environment (and any environment without Dataverse) — its flows are read directly via the Flow API, with run-failure history. Nothing extra to do; these are monitored as soon as you connect.
- Dataverse environments (Production, Sandbox, Developer, etc.) — their flows live inside Dataverse. To read them app-only, the TATER app must be added as an application user in that specific environment. Until you do, the Flow Monitor shows that environment with an "app-user needed" badge and reads it as zero flows. For Dataverse environments TATER reports flow on/off/suspended state (the core "a flow got turned off" signal); run-failure rate is available only for flows read directly via the Flow API, that is, flows in the Default environment and in environments without Dataverse.
The Flow Monitor's Environments panel lists every environment with its type, flow count, and a Monitoring/Muted toggle — and flags exactly which Dataverse environments still need the grant.
Add the TATER app as an application user (per Dataverse environment)
For each environment you want monitored, in the Power Platform admin center:
- Open Environments → (the environment) → Settings → Users + permissions → Application users.
- Click + New app user, Add an app, and pick your app (e.g. TATER-FlowMonitor / the client ID from setup).
- Assign a security role that can read flows — System Administrator is simplest; a least-privilege custom role needs Read on the Process table. Save.
Within a few minutes the next scan (or Scan now) picks up that environment's flows automatically. Repeat only for the environments you care about — the rest you can leave with the badge and mute them so they never alert.
Muting environments and flows (dev vs prod)
You almost never want dev/test/sandbox flows raising tickets. In the Flow Monitor:
- Mute a whole environment — in the Environments panel, click Monitoring to flip it to Muted. Every flow in it stops alerting (and any open tickets for those flows auto-close), but they stay visible in the table, greyed, so you can still see their state.
- Mute an individual flow — click Mute on its row in the flow table. Use this for a single noisy or intentionally-off flow in an otherwise-monitored environment.
Muted items show a grey MUTED badge and a note of what they would be alerting on. Un-mute any time with the same toggle.
3. Enter the connection in TATER
- Go to TATER Ops → Flow Monitor.
- Fill in Microsoft tenant ID, App (client) ID, and Client secret. The secret is encrypted at rest and never displayed again.
- Optionally restrict to specific environments (comma-separated environment names) — leave blank to scan all of them.
- Click Save connection, then Test connection to confirm TATER can reach your flows. The test reports how many environments and flows it can see.
- Click Scan now to do the first evaluation. After that the hourly sweep keeps it current.
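The optional environment restriction above can be pictured as a simple allow-list. This is a minimal sketch, assuming names are matched exactly after trimming whitespace; the function names are hypothetical and TATER's actual matching rules (for example case sensitivity) are not documented here.

```python
from typing import List, Optional


def parse_environment_filter(raw: str) -> Optional[List[str]]:
    """Turn a comma-separated field value into a list of names.

    Returns None when the field is blank, meaning "scan all environments".
    """
    names = [part.strip() for part in raw.split(",") if part.strip()]
    return names or None


def is_in_scope(environment_name: str, allowed: Optional[List[str]]) -> bool:
    """An environment is scanned if there is no filter or it is on the list."""
    return allowed is None or environment_name in allowed
```

For example, a field value of `Prod, Finance` would restrict the scan to those two environments, while an empty field leaves every environment in scope.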
Tuning the thresholds
| Setting | What it does | Default |
|---|---|---|
| Failure threshold % | Raise a "failing" alert when at least this percentage of recent completed runs failed. | 20% |
| Min runs for rate | Don't judge the failure rate until the flow has at least this many completed runs in the sample (avoids over-reacting to one bad run). | 5 |
| Runs to sample | How many recent runs to look at per flow (max 50). | 20 |
| Stale after (hours) | If the flow is on but its last successful run is older than this, flag it "stale". | 24 |
| Alert when a flow is OFF | Treat a Stopped (turned-off) flow as a problem. | on |
| Alert when Suspended | Treat an auto-suspended flow as a problem. | on |
| Auto-create Ops tickets | File a de-duplicated Ops ticket per problem flow. Turn off to keep findings in the Application Monitoring queue without tickets. | on |
| Ticket assignee | Optional email to assign auto-filed flow tickets to. | — |
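To see how these settings interact, here is a minimal sketch of a per-flow evaluation using the defaults from the table. It is an illustration under stated assumptions, not TATER's implementation: the names are hypothetical, the failure comparison uses the table's "at least" wording (the overview says "more than", which would be a strict comparison), and "stale" is only evaluated for flows that are on and have runs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Thresholds:
    failure_threshold_pct: float = 20.0
    min_runs_for_rate: int = 5
    runs_to_sample: int = 20
    stale_after_hours: int = 24
    alert_when_off: bool = True
    alert_when_suspended: bool = True


def find_problems(state: str, run_results: List[bool],
                  last_success: Optional[datetime], now: datetime,
                  t: Optional[Thresholds] = None) -> List[str]:
    """state is 'on', 'off' or 'suspended'; run_results is newest first, True = succeeded."""
    t = t or Thresholds()
    problems: List[str] = []
    if state == "off" and t.alert_when_off:
        problems.append("off")
    if state == "suspended" and t.alert_when_suspended:
        problems.append("suspended")
    if state == "on":
        sample = run_results[: t.runs_to_sample]
        # Only judge the rate once enough completed runs are in the sample.
        if len(sample) >= t.min_runs_for_rate:
            failed_pct = 100.0 * sample.count(False) / len(sample)
            if failed_pct >= t.failure_threshold_pct:
                problems.append("failing")
        # Stale: the flow has runs, but no success within the window.
        too_old = (last_success is None
                   or now - last_success > timedelta(hours=t.stale_after_hours))
        if sample and too_old:
            problems.append("stale")
    return problems
```

Note how "Min runs for rate" keeps a flow with only two failed runs from being marked failing, while the stale check can still flag it if nothing has succeeded within the window.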
How alerts behave
- One ticket per flow. A flow with several problems (off + stale, etc.) gets one living ticket, de-duplicated on the flow id, refreshed in place as the situation changes — never a flood of duplicates.
- Auto-close on recovery. When a flow returns to On and within the failure threshold, TATER auto-closes the open ticket and resolves the finding (unless you tagged the ticket keep-open).
- Severity. Off / Suspended are High. Failing scales with the rate (≥ 80% Critical, ≥ 50% High, else Medium). Stale is Medium.
- SIEM. If you've configured SIEM forwarding, degraded and recovered events are sent as power_automate.flow.degraded / power_automate.flow.recovered with the flow id, environment, state, and failure rate.
- Suppression. A flow that is intentionally off (a retired or seasonal flow) can be muted by suppressing its finding in the Application Monitoring queue, or by adding it to the exclude list.
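The severity rules above map directly onto a small lookup. The sketch below is illustrative only: the function names are hypothetical, and taking the highest severity when one ticket covers several problems is an assumption, since the text does not say how a combined ticket is rated.

```python
from typing import Dict, List

# Ordering used to compare severities; higher number = more severe.
_RANK: Dict[str, int] = {"Medium": 1, "High": 2, "Critical": 3}


def severity(problem: str, failure_rate_pct: float = 0.0) -> str:
    """Severity for a single problem, following the rules listed above."""
    if problem in ("off", "suspended"):
        return "High"
    if problem == "failing":
        if failure_rate_pct >= 80:
            return "Critical"
        if failure_rate_pct >= 50:
            return "High"
        return "Medium"
    if problem == "stale":
        return "Medium"
    raise ValueError(f"unknown problem: {problem!r}")


def ticket_severity(problems: List[str], failure_rate_pct: float = 0.0) -> str:
    """Assumption: a ticket covering several problems takes the highest severity."""
    return max((severity(p, failure_rate_pct) for p in problems),
               key=lambda s: _RANK[s])
```

Under that assumption, a flow that is both failing at 85% and stale would be filed as a single Critical ticket.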
Resetting an alert (after you fix it)
When you've resolved a flow's problem and want to clear the alert, and have it come back only if the issue continues, use Reset rather than Mute. Mute silences a flow until you un-mute it (for dev/seasonal/retired flows); Reset clears the current alert and re-arms it.
- Where. On the Ops → Flow Monitor flow table, an active alert's row shows a ↺ Reset button next to Mute (also in the flow's detail card). Admin only.
- What it does. It resolves the flow's monitoring finding and closes its open Ops ticket immediately, and records a reset baseline (shown as "↺ reset <date>").
- Re-arming — the important part. For a failing or stale flow, the failure rate is measured over the last N runs, so old failures linger in the sample even after you fix the flow. After a Reset, TATER counts only runs that start after the reset toward the failing/stale decision — so the alert stays clear unless the flow keeps failing on new runs. The baseline clears itself once a run succeeds after the reset.
- Off / Suspended. These reflect the flow's live state, not run history. If you Reset a flow that's still turned off, it will alert again on the next scan — because it's still broken. Reset those after you turn the flow back on.
- From Application Monitoring. Clicking Resolve on a Power Automate finding in the Application Monitoring queue does the same thing (it stamps the same reset baseline), so either surface works.
In short: Mute = "stop watching this flow." Reset = "I fixed it; clear the alert, and tell me again only if it keeps happening."
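The re-arming behavior can be sketched as a filter on run start times. This is a minimal illustration of the rule described above, with hypothetical names and a simplified run record of (start time, succeeded); it is not TATER's code.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Run = Tuple[datetime, bool]  # (start time, succeeded)


def runs_after_reset(runs: List[Run], reset_at: Optional[datetime]) -> List[Run]:
    """Runs that count toward the failing/stale decision.

    With no reset baseline, every sampled run counts. With one, only runs
    that started after the reset do, so old failures stop lingering.
    """
    if reset_at is None:
        return list(runs)
    return [run for run in runs if run[0] > reset_at]


def next_baseline(runs: List[Run], reset_at: Optional[datetime]) -> Optional[datetime]:
    """The baseline clears itself once a run started after the reset succeeds."""
    if reset_at is None:
        return None
    if any(ok for start, ok in runs if start > reset_at):
        return None
    return reset_at
```

A flow that failed at 08:00, was reset at 10:00, and has not run since contributes no runs to the decision, so the alert stays clear until new runs fail.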
Reading the Flow Monitor page
The flows table shows each flow's status badge (OFF / SUSPENDED / FAILING / STALE / OK), its environment, its current state, its recent success rate (and how many of the sampled runs failed), the last successful run time, and a one-line description of the problem. Under each flow name you'll see its trigger type and the connectors it depends on (e.g. SharePoint, Office 365 Outlook), plus a small 📝 when it has documentation and 🔗 / 📄 badges for linked tasks and docs. The ↗ next to a flow name opens it directly in the Power Automate maker portal. Use the filter chips to focus on just the problems, or a specific status. The summary line shows the last scan time and the count in each status.
Flow inventory & documentation
Every monitored flow is a durable inventory entity you can document and link — not just a row in a scan. Click a flow name to open its detail card. It shows everything TATER pulls from the connection plus your own documentation:
- Mined from the connection (refreshed every scan, with a fresh pull each time you open the card): trigger type, the connectors the flow uses, created / last-modified dates, whether the owner is subscribed to failure alerts, and the flow's platform owner(s).
- Solution: which Power Platform solution the flow belongs to (read from Dataverse in environments where the app is an application user — the system "Default"/"Active" solutions are filtered out). Shown as a 📦 tag in the table and in the detail card.
- Connections: the specific connection references the flow uses, by name (e.g. "PandaDoc PROD", "Outlook") and connector. Plus an impact view — "Other flows sharing a connection" lists every other flow that uses the same connection, so when a connection breaks you can see everything it takes down at a glance.
- Your documentation (preserved across scans): a human owner, a criticality (low → critical, shown as a colored dot in the table), a business purpose, free-text troubleshooting notes, a runbook URL, and tags.
- Linked Ops tasks: any ticket tied to the flow (including the auto-filed monitor ticket) shows here, and + Create task for this flow opens a new linked ticket in one click.
This makes troubleshooting faster — when a flow breaks, the on-call tech opens the card and immediately sees what it does, who owns it, what it connects to, the runbook, and the history of past tickets. Editing the documentation requires Admin+; everyone Auditor+ can read it.
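The "Other flows sharing a connection" impact view amounts to inverting a flow-to-connections mapping. The sketch below shows one way to compute it; the function name and the input shape (flow name mapped to a set of connection names) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def flows_sharing_connections(flow_connections: Dict[str, Set[str]],
                              flow_id: str) -> Dict[str, List[str]]:
    """For each connection used by flow_id, list the other flows that use it."""
    # Invert the mapping: connection name -> set of flows using it.
    by_connection: Dict[str, Set[str]] = defaultdict(set)
    for fid, connections in flow_connections.items():
        for connection in connections:
            by_connection[connection].add(fid)
    own = flow_connections.get(flow_id, set())
    return {c: sorted(by_connection[c] - {flow_id}) for c in sorted(own)}


flows = {
    "Send contract": {"PandaDoc PROD", "Outlook"},
    "Daily digest": {"Outlook"},
    "Archive signed docs": {"PandaDoc PROD"},
    "Sync list": {"SharePoint"},
}
impact = flows_sharing_connections(flows, "Send contract")
```

Here `impact` lists "Daily digest" under Outlook and "Archive signed docs" under PandaDoc PROD, which is the set of flows that would also break if either connection failed.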
MCP tools
| Tool | Purpose |
|---|---|
| list_power_automate_flows | List the org's flows with their monitored health (on/off, failure rate, last success, status). Optional status filter. Read-only; surfaces problem flows by default. |
| get_power_automate_flow | Full inventory record for one flow — live health, mined metadata (trigger, connectors, owners, created/modified), your documentation, and linked tasks. Pulls fresh metadata each call. Read-only (Auditor+). |
| document_power_automate_flow | Set a flow's notes, business purpose, owner, criticality, runbook URL, tags, and link docs / tasks. Builds durable troubleshooting documentation. Admin role. |
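MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the first tool in the table. The argument name `status` and the value `failing` are assumptions based on the "optional status filter" description; check the tool's published input schema for the real parameter names.

```python
import json
from typing import Any, Dict


def tools_call_request(request_id: int, tool_name: str,
                       arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP tools/call request envelope (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: "status" is an assumed parameter name.
request = tools_call_request(1, "list_power_automate_flows", {"status": "failing"})
print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop builds this envelope for you; the sketch is only meant to show what travels over the wire.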
Permissions & privacy
- Viewer / Auditor+ can view the Flow Monitor page and the flow list. Admin+ can edit the connection, run a scan, and test.
- TATER reads flow metadata only — names, on/off state, and run pass/fail counts. It does not read the data your flows process.
- The client secret is encrypted at rest (AES-256-GCM) and redacted from every API response.
Troubleshooting
- Test fails with a token error — re-check the tenant ID, client ID, and secret. Make sure the secret hasn't expired in Entra.
- "per-user scope" note — the app authenticated but isn't a Power Platform management app, so it only sees its own flows. Run the New-PowerAppManagementApp step above.
- Runs show "n/a" — run history wasn't readable for that flow/environment; on/off alerting still works, but failure-rate won't.
- A flow is intentionally off — suppress its finding in Application Monitoring, or add its id to the exclude list, so it stops alerting.
Related
- Application Monitoring — the findings queue the flow alerts feed into, plus on/off toggles
- TATER Ops overview — the rest of the Ops surface
- MCP setup — connecting Claude Desktop / API to TATER