TATER Ops - Cloud Script Execution

Run PowerShell maintenance, audit, and diagnostic scripts against your M365 / Entra ID tenant directly from TATER Ops. Cloud scripts execute in an Azure Automation runbook context with app-only Graph / EXO auth, capture stdout/stderr, and post results back to TATER for review, drift detection, and MCP-driven analysis.

Overview

Where the device-targeted Script Library dispatches PowerShell / Bash to endpoint hostnames via the TATER Agent, cloud scripts dispatch to a per-org Azure Automation runbook (Run-OpsScriptCloud) that connects to M365 services and runs your script against the tenant. This makes Ops the right place for:

  • Help-desk user diagnostics ("Why can't Jeff send email?")
  • Tenant maintenance audits (license utilization, stale guests, app permission drift)
  • Security investigations (suspicious inbox rules, mailbox delegations, external sharing)
  • One-off Graph / Exchange queries that you don't want to ad-hoc-paste into a tech's PowerShell window

Cloud scripts are distinguished from device scripts by setting executionTarget: cloud on the script record. The Run button still works the same way - just don't supply a hostname list (cloud scripts ignore targets).

For Microsoft Intune writes, prefer the native tools. Creating and assigning Intune Proactive Remediations and Platform Scripts has a dedicated, synchronous path - the Ops → Intune page and its 8 MCP tools - with What-If preview, Entra group targeting, change-control gating, and audit logging. Use those for deviceManagement writes; reserve authored cloud scripts for the broader M365 surface (EXO, SharePoint, Graph reads) that has no native tool.

Architecture

TATER Ops UI / MCP
   │
   │ POST /api/ops/scripts/:id/execute
   ▼
opsScripts.ts executeOpsScript
   │
   │ resolve org.opsScriptCloudWebhookUrl
   │ create OpsScriptJob (targets=['__cloud__'], status=Queued)
   │ POST { jobId, scriptContent, cloudAuthContext, callbackUrl } → webhook
   ▼
Azure Automation Account (aa-tater-{client})
   │
   │ Run-OpsScriptCloud.ps1
   │   - install cert from Key Vault
   │   - Connect-MgGraph / Connect-ExchangeOnline (per cloudAuthContext)
   │   - execute script body via [ScriptBlock]::Create
   │   - capture stdout, stderr, exitCode
   ▼
POST callbackUrl  /api/ops/script-jobs/:id/cloud-result
   │   X-Api-Key: <org-bound TATER key>
   │   { status, stdout, stderr, exitCode, errorMessage }
   ▼
opsScripts.ts postOpsScriptCloudResult
   │
   │ update OpsScriptJob.status + cloudResult
   │ audit-log via='cloud-runbook'
   ▼
get_script_job_status (poll for completion)
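The dispatch step in the diagram can be sketched as a pure function that builds the job record and the webhook payload. This is a minimal illustration, not the actual opsScripts.ts code: the payload and job field names come from the diagram, while buildCloudDispatch, the OpsScript / Org shapes, and the apiBaseUrl parameter are assumptions made for the example.

```typescript
// Hypothetical sketch of the executeOpsScript dispatch step.
// Payload/job field names follow the diagram; everything else is assumed.
type CloudAuthContext =
  | "graph-managed-identity"
  | "exo-managed-identity"
  | "azure-managed-identity";

interface OpsScript {
  id: string;
  scriptContent: string;
  cloudAuthContext: CloudAuthContext;
}

interface Org {
  id: string;
  opsScriptCloudWebhookUrl?: string;
}

interface OpsScriptJob {
  id: string;
  organizationId: string;
  scriptId: string;
  targets: string[];
  status: string;
}

interface WebhookPayload {
  jobId: string;
  scriptContent: string;
  cloudAuthContext: CloudAuthContext;
  callbackUrl: string;
}

function buildCloudDispatch(
  script: OpsScript,
  org: Org,
  jobId: string,
  apiBaseUrl: string
): { webhookUrl: string; job: OpsScriptJob; payload: WebhookPayload } {
  // Without a per-org webhook there is nowhere to send the script.
  if (!org.opsScriptCloudWebhookUrl) {
    throw new Error("Org " + org.id + " has no opsScriptCloudWebhookUrl configured");
  }
  const job: OpsScriptJob = {
    id: jobId,
    organizationId: org.id,
    scriptId: script.id,
    targets: ["__cloud__"], // sentinel target used for cloud jobs
    status: "Queued",
  };
  const base = apiBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const payload: WebhookPayload = {
    jobId: jobId,
    scriptContent: script.scriptContent,
    cloudAuthContext: script.cloudAuthContext,
    callbackUrl: base + "/api/ops/script-jobs/" + jobId + "/cloud-result",
  };
  return { webhookUrl: org.opsScriptCloudWebhookUrl, job: job, payload: payload };
}
```

Keeping the builder free of network calls makes the payload easy to inspect before it is POSTed to the webhook.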

Setup (per client org)

Cloud script execution requires per-org infrastructure that mirrors the existing per-org scan runbook setup (see the Azure Runbook Setup Guide). If the org already has scanning infrastructure provisioned, you can re-use it:

  1. Publish the runbook. Upload Runbooks/Run-OpsScriptCloud.ps1 to the org's Azure Automation Account (e.g. aa-tater-cb). The runbook is PS 7.2-targeted. The publish flow is identical to Scan-M365Cloud.ps1:
    az automation runbook create \
      --resource-group rg-tater-prod \
      --automation-account-name aa-tater-cb \
      --name Run-OpsScriptCloud \
      --type PowerShell72 \
      --description "TATER Ops cloud script executor"
    
    # Upload draft content via REST API (CLI replace-content is unreliable for PS72)
    # See Runbooks/CLAUDE.md §12af for full upload script
    
  2. Create a webhook on the published runbook. Recommended expiry: 5 years from now. Save the URL - it's a one-time reveal.
  3. Confirm Automation Variables already exist on the AA (these are shared with the scan runbook):
    • AppClientId - client tenant's app registration
    • TenantDomain - e.g. caronbletzer.onmicrosoft.com
    • KeyVaultName - kv-tatersec-tater
    • CertName - e.g. tater-cb-scanner
    • ApiKey - org-bound TATER API key (used to call back to the cloud-result endpoint)
  4. Store the webhook URL on the org record in TATER:
    PUT /api/organizations/<orgId>
    {
      "opsScriptCloudWebhookUrl": "https://<guid>.webhook.eus.azure-automation.net/webhooks?token=..."
    }
    Or use the UI: TATER Manage → Connections → Microsoft Tenant → Ops Script Cloud Webhook URL.
  5. Verify Graph + EXO modules are installed in the AA. The scan runbook depends on the same modules - if scans are working, cloud scripts will too:
    • Microsoft.Graph.Authentication (for graph-managed-identity context)
    • ExchangeOnlineManagement 3.4+ (for exo-managed-identity context)
Heads-up. Cloud scripts inherit the same Graph + EXO permissions as the scanning app registration. If a script calls an API the app registration has no permission for, the call fails with a 403 inside the script body and the failure shows up in the cloud-result stderr. Add the missing permission to the app registration and re-consent.
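A small pre-flight check can catch an incomplete setup before the first run. The sketch below is illustrative only: the variable names come from step 3, but findMissingVariables and preflightProblems are hypothetical helpers, and how you obtain the list of variables present on the Automation Account is left to the caller.

```typescript
// Automation Variables the runbook expects (from step 3 of the setup list).
const REQUIRED_AUTOMATION_VARIABLES: string[] = [
  "AppClientId",
  "TenantDomain",
  "KeyVaultName",
  "CertName",
  "ApiKey",
];

// Hypothetical helper: returns the required variables that are not present.
function findMissingVariables(present: string[]): string[] {
  return REQUIRED_AUTOMATION_VARIABLES.filter(
    (name) => present.indexOf(name) === -1
  );
}

// Hypothetical helper: collects human-readable setup problems for one org.
function preflightProblems(present: string[], webhookUrl?: string): string[] {
  const problems = findMissingVariables(present).map(
    (name) => "Missing Automation Variable: " + name
  );
  if (!webhookUrl) {
    problems.push("opsScriptCloudWebhookUrl is not set on the org record");
  }
  return problems;
}
```

An empty result from preflightProblems means the org has every variable from step 3 and a stored webhook URL; it does not verify that the modules from step 5 are installed.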

Auth contexts

The cloudAuthContext field on a script tells the runbook which M365 service to connect to before running the script body. Pick the lightest one that covers what the script needs - the runbook has less overhead when it does not have to connect to EXO.

Auth context | Connects to | Use when
graph-managed-identity | Microsoft Graph | Default. Users, groups, sign-ins, licenses, app registrations, SharePoint (via Graph), Intune.
exo-managed-identity | Exchange Online | Mailbox permissions, transport rules, inbox rules, anti-spam policies. Anything in the Get-EXOMailbox / Get-InboxRule family.
azure-managed-identity | Azure Resource Manager | Azure subscription audits (NSG rules, Key Vault config, RBAC). Requires the AA's managed identity to have RBAC on the target subscriptions.
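As a rough aid for picking a context, the cmdlets in a script body can hint at which service it needs. The helper below is a hypothetical heuristic, not part of TATER: suggestAuthContext and its cmdlet patterns are assumptions and far from exhaustive, and a script that mixes services still needs a human decision.

```typescript
type CloudAuthContext =
  | "graph-managed-identity"
  | "exo-managed-identity"
  | "azure-managed-identity";

// Illustrative patterns only: a few Exchange Online and Az cmdlet families.
const EXO_PATTERN = /\b(Get-EXO\w+|Get-InboxRule|Get-Mailbox\w*|Get-TransportRule)\b/i;
const AZURE_PATTERN = /\bGet-Az[A-Z]\w+\b/;

// Hypothetical helper: guess the auth context a script body needs.
// Falls back to the default Graph context when nothing else matches.
function suggestAuthContext(scriptBody: string): CloudAuthContext {
  if (EXO_PATTERN.test(scriptBody)) return "exo-managed-identity";
  if (AZURE_PATTERN.test(scriptBody)) return "azure-managed-identity";
  return "graph-managed-identity";
}
```

The fallback mirrors the table: graph-managed-identity is the default, and the heavier contexts are only suggested when a matching cmdlet appears.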

Three ways to execute a cloud script

1. From the TATER Ops UI

Open Operations → Script Library, click ▶ Run on a cloud-target script. The targets field is hidden / ignored for cloud scripts. Click Run. The job appears in Recent Jobs → Active while the runbook executes, then moves to Inactive when results land.

2. From an MCP agent

The execute_script MCP tool accepts cloud scripts; the targets array is optional for them. The call returns immediately with the queued job ID; poll get_script_job_status for completion.

execute_script({
  script_id: "ops-script-mfa-all-users-status",
  run_as: "system"
})
// → "☁️ Queued cloud execution of MFA - All Users Status Audit
//    - job opsjob-abc123. Runbook will report results back
//    asynchronously. Poll with get_script_job_status."
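Because execution is asynchronous, the caller needs a polling policy. One way to structure it is a pure decision function that the agent loop consults after each get_script_job_status call. This is a sketch under assumptions: nextPollAction and its options are hypothetical, and only the Queued and Completed statuses appear in this guide, so the list of terminal statuses is supplied by the caller.

```typescript
interface PollOptions {
  baseDelayMs: number; // delay before the second status check
  maxDelayMs: number; // upper bound for the exponential backoff
  maxAttempts: number; // stop polling after this many checks
  terminalStatuses: string[]; // e.g. ["Completed"]; other names are assumptions
}

type PollAction =
  | { kind: "done" }
  | { kind: "wait"; delayMs: number }
  | { kind: "give-up" };

// Decide what to do after seeing `status` on the zero-based `attempt`.
// The caller performs the actual status call and the sleep.
function nextPollAction(
  status: string,
  attempt: number,
  opts: PollOptions
): PollAction {
  if (opts.terminalStatuses.indexOf(status) !== -1) return { kind: "done" };
  if (attempt >= opts.maxAttempts) return { kind: "give-up" };
  const delayMs = Math.min(
    opts.maxDelayMs,
    opts.baseDelayMs * Math.pow(2, attempt)
  );
  return { kind: "wait", delayMs: delayMs };
}
```

With baseDelayMs of 1000 and maxDelayMs of 10000, the waits grow 1s, 2s, 4s, 8s and then stay at 10s, which keeps short diagnostics responsive without hammering the API during longer audits.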

3. Scheduled via Ops Schedules

Cloud scripts work in scheduled execution too - create a schedule with no targets, point it at the cloud script, and set the recurrence (e.g. weekly on Monday 06:00 UTC). The scheduler uses the same webhook fan-out as the on-demand Run button. Action rules (email on drift, auto-create Ops task on failure) apply identically.

Template library

TATER ships a curated set of starter scripts in Runbooks/OpsCloudTemplates/ covering the most common help-desk and audit scenarios. These are opinionated, read-only, and tenant-safe - designed to drop into any client org without modification:

Template | Category | Auth | Purpose
User-FullDiagnostic.ps1 | User Diagnostics | Graph | Help-desk: full M365 profile + sign-ins + auth methods + groups + licenses + risk events for one user
MFA-AllUsersStatus.ps1 | Identity & Access | Graph | Tenant-wide MFA registration audit - surfaces users without strong auth
Licenses-SeatUtilization.ps1 | Tenant Health | Graph | Per-SKU seat utilization (excludes viral pools) - find over/under-provisioned licenses
Mailbox-DelegationAudit.ps1 | Email Security | EXO | Full Access + Send As + Send on Behalf + calendar default sharing across every mailbox
InboxRules-SuspiciousAudit.ps1 | Email Security | EXO | BEC indicator scan - external forwards, auto-hide alerts, mark-all-read rules
Guests-StaleAccountAudit.ps1 | Identity & Access | Graph | Guests who have never signed in or are inactive > N days
AppRegistrations-PermissionAudit.ps1 | Identity & Access | Graph | Service principals with high-impact Graph permissions
SharePoint-ExternalSharingAudit.ps1 | Data Protection | Graph | Per-site sharing capability + anyone-link surface

Every template includes a structured .METADATA comment block with the script's intended executionTarget, cloudAuthContext, riskLevel, default timeout, and TATERpedia slug. Importing today is manual; see "Future work" below for the planned auto-import flow.

Importing a template (manual)

  1. Open Runbooks/OpsCloudTemplates/<name>.ps1 from the TATER repo.
  2. Read the .METADATA block at the top - that's your reference for the script's settings.
  3. In TATER Ops: Script Library → + New Script.
  4. Fill in: name, description (copy .SYNOPSIS), language=PowerShell, executionTarget=cloud, cloudAuthContext matching the metadata.
  5. Paste the script body (everything below the comment block).
  6. Set defaultTimeoutSec and riskLevel from the metadata.
  7. Save. Click ▶ Run to execute on demand.

Writing your own templates

Cloud scripts are just PowerShell. They execute inside a [ScriptBlock]::Create scope after the runbook has already connected to Graph / EXO. You can use any cmdlet the auth context supports. Follow these conventions to keep scripts MCP-analyzable and drift-detectable:

  • Output structured JSON at the end via ConvertTo-Json -Depth N. The cloud-result endpoint stores stdout for MCP agents and Ops Schedule drift detection to compare across runs.
  • Wrap risky calls in try/catch and append errors to a $errors array. Don't throw - the runbook needs to capture all output and post back even on partial failure.
  • Read-only by default. If you must mutate state, set riskLevel: high and document rollback in the TATERpedia page for the script.
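  • Keep output small. The cloud-result POST is limited to 1.5 MB, and the stdout stored on the job is truncated at 64KB, so anything beyond that is lost to MCP analysis and drift detection. Use -Top N / sample parameters when developing.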
  • Idempotent. Running twice should give equivalent output. Critical for scheduled drift detection.
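To see why structured, idempotent JSON output matters, consider a consumer that compares the stdout of two runs. The sketch below is a hypothetical illustration, not TATER's actual drift detection: diffTopLevel is an assumed helper that reports which top-level keys changed between two JSON documents.

```typescript
// Hypothetical drift check: list top-level keys whose values differ
// between two runs. Values are compared by their JSON serialization,
// so key order inside nested objects matters.
function diffTopLevel(prevStdout: string, currStdout: string): string[] {
  const prev = JSON.parse(prevStdout) as Record<string, unknown>;
  const curr = JSON.parse(currStdout) as Record<string, unknown>;
  // Union of keys from both runs, without duplicates.
  const keys = Object.keys(prev).concat(
    Object.keys(curr).filter((k) => !(k in prev))
  );
  return keys
    .filter((k) => JSON.stringify(prev[k]) !== JSON.stringify(curr[k]))
    .sort();
}
```

A script that emits timestamps or randomly ordered collections would show up as changed on every run under a comparison like this, which is the practical reason for the idempotency convention.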

Recommended frontmatter (a comment block at the top of every script):

<#
.SYNOPSIS
  One-line summary.

.DESCRIPTION
  Multi-paragraph detail of what the script does, what it surfaces,
  and when to run it.

.METADATA
  name:                Human-readable script name
  category:            Group label for the Ops library
  executionTarget:     cloud
  cloudAuthContext:    graph-managed-identity
  language:            powershell
  riskLevel:           low
  defaultTimeoutSec:   180
  taterpediaSlug:      ops-cloud-script-<slug>
  defaultRunAs:        system

.NOTES
  Required Graph / EXO permissions, gotchas, related scripts.
#>
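Because the .METADATA section is plain key: value text, it is straightforward to read programmatically. The parser below is a minimal sketch that assumes the exact layout shown above (one key per line, section ended by the next dot-keyword or the closing #>); parseMetadataBlock is a hypothetical name and not an existing TATER function.

```typescript
// Hypothetical parser for the .METADATA section of a script's comment block.
// Assumes one "key: value" pair per line, as in the recommended frontmatter.
function parseMetadataBlock(source: string): Record<string, string> {
  const meta: Record<string, string> = {};
  const lines = source.split(/\r?\n/);
  let inMetadata = false;
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === ".METADATA") {
      inMetadata = true;
      continue;
    }
    if (!inMetadata) continue;
    // Another dot-section (.NOTES, .DESCRIPTION) or the closing #> ends the block.
    if (line === "#>" || /^\.[A-Z]+$/.test(line)) break;
    const match = /^([A-Za-z]\w*):\s*(.*)$/.exec(line);
    if (match) meta[match[1]] = match[2];
  }
  return meta;
}
```

All values come back as strings, so a caller would still need to convert fields such as defaultTimeoutSec to numbers and validate executionTarget and cloudAuthContext against the allowed values.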

Troubleshooting example: Jeff Polchlopek

The canonical "help, my user is broken" workflow:

  1. An MCP-aware help-desk agent picks up the ticket. As Step 0 of the Help-Desk Session Pattern, it calls list_tasker_tasks with Jeff's email to dedupe against any open ticket.
  2. Agent calls execute_script with script_id=ops-script-user-full-diagnostic and parameters: { UserPrincipalName: "jeff.polchlopek@..." }.
  3. Run-OpsScriptCloud connects to Graph and calls Get-MgUser, Get-MgUserAuthenticationMethod, Get-MgAuditLogSignIn, etc. against Jeff's account. This takes roughly 30 seconds.
  4. Results land in cloudResult.stdout as JSON. Agent polls get_script_job_status until status=Completed.
  5. Agent parses the JSON. Sees Jeff's last sign-in failed with 50126 ("invalid credentials") from an IP in Vietnam at 03:14 UTC, plus three more failures in the next 10 minutes from rotating IPs. Risk event shows riskState: atRisk, riskLevel: high.
  6. Agent appends findings to the Ops task ("Account targeted - recommend immediate password reset + revoke sessions"), creates a ConfigDoc "Jeff Polchlopek - account targeting investigation 2026-05-29", and updates the TATERpedia page "Account targeting response playbook" with the new IPs observed.

The whole loop runs without anyone opening the Entra portal or pasting PowerShell into a tech's window. Everything is audit-logged with via='cloud-runbook' + via='mcp' attribution.
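Step 5 of the workflow, where the agent parses the diagnostic JSON, might look roughly like the sketch below. The output shape is an assumption made for illustration: the signIns array, its field names, and the top-level riskState property are hypothetical and may not match what User-FullDiagnostic.ps1 actually emits.

```typescript
// Hypothetical shape of the diagnostic output; field names are assumptions.
interface SignIn {
  createdDateTime: string;
  ipAddress: string;
  errorCode: number; // 0 = success, 50126 = invalid credentials
}

interface UserDiagnostic {
  signIns: SignIn[];
  riskState?: string;
}

interface FailureSummary {
  failureCount: number;
  distinctIps: string[];
  atRisk: boolean;
}

// Summarize failed sign-ins for one error code from cloudResult.stdout.
function summarizeFailedSignIns(stdout: string, errorCode: number): FailureSummary {
  const diag = JSON.parse(stdout) as UserDiagnostic;
  const failures = diag.signIns.filter((s) => s.errorCode === errorCode);
  // Collect distinct source IPs using a plain object as a set.
  const seen: Record<string, boolean> = {};
  for (let i = 0; i < failures.length; i++) {
    seen[failures[i].ipAddress] = true;
  }
  return {
    failureCount: failures.length,
    distinctIps: Object.keys(seen).sort(),
    atRisk: diag.riskState === "atRisk",
  };
}
```

Several failures from rotating IPs combined with an at-risk state is the pattern the agent reports in step 6; the thresholds for escalating are a judgment call and are deliberately left out of the sketch.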

Security model

  • Per-org webhook isolation. Each client org has its own webhook URL pointing at its own Azure Automation Account. Compromise of one client's webhook does not leak to others.
  • SSRF-guarded. The webhook URL passes through validateExternalUrl at both writer (PUT /organizations) and reader (executeOpsScript fetch site).
  • API key org binding. The cloud-result endpoint refuses to accept results from an unbound (tenant-level) API key. The runbook's ApiKey variable must be a key bound to the specific org.
  • IDOR-safe callback. The cloud-result endpoint verifies job.organizationId === apiKey.organizationId before accepting results. A leaked client API key cannot post results for jobs in another org.
  • Audit attribution. Job creation logs via='web' or via='mcp' with the requesting user; result post logs via='cloud-runbook' with the API key's prefix. Both sides of the loop are traceable.
  • Body cap. The cloud-result POST body is limited to 1.5 MB; the job document stores stdout truncated to 64KB and stderr truncated to 16KB.
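The callback checks in this list can be summarized in a short sketch. It is an illustration of the rules as described, not the actual postOpsScriptCloudResult code: acceptCloudResult and the type shapes are hypothetical, and truncation here counts string characters for simplicity, whereas the real endpoint may count bytes.

```typescript
const STDOUT_LIMIT = 64 * 1024; // stored stdout cap (64KB)
const STDERR_LIMIT = 16 * 1024; // stored stderr cap (16KB)

interface ApiKeyInfo {
  prefix: string;
  organizationId?: string; // undefined models an unbound (tenant-level) key
}

interface JobRef {
  id: string;
  organizationId: string;
}

interface CloudResultBody {
  status: string;
  stdout: string;
  stderr: string;
  exitCode: number;
  errorMessage?: string;
}

type CallbackDecision =
  | { kind: "accepted"; stored: CloudResultBody }
  | { kind: "rejected"; reason: string };

// Hypothetical sketch of the cloud-result acceptance rules.
function acceptCloudResult(
  job: JobRef,
  apiKey: ApiKeyInfo,
  body: CloudResultBody
): CallbackDecision {
  // API key org binding: unbound keys are refused outright.
  if (!apiKey.organizationId) {
    return { kind: "rejected", reason: "API key is not bound to an org" };
  }
  // IDOR check: the key's org must own the job.
  if (job.organizationId !== apiKey.organizationId) {
    return { kind: "rejected", reason: "API key org does not match job org" };
  }
  return {
    kind: "accepted",
    stored: {
      status: body.status,
      stdout: body.stdout.slice(0, STDOUT_LIMIT),
      stderr: body.stderr.slice(0, STDERR_LIMIT),
      exitCode: body.exitCode,
      errorMessage: body.errorMessage,
    },
  };
}
```

Running the binding and ownership checks before touching the body means a rejected callback never writes anything to the job document.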

Future work - the shared library vision

The current state requires you to manually copy templates from the repo into TATER Ops. The shared library roadmap closes that loop:

  1. Library API - GET /api/ops/script-library returns the catalog of available templates with their metadata, parsed at API build time from Runbooks/OpsCloudTemplates/*.ps1.
  2. One-click import - a + Browse Library button in the Ops Script Library. Click a template to preview metadata + script body. Click "Add to my org" to snapshot it into the org's OpsScripts container with a librarySourceId backref.
  3. TATERpedia auto-seed - wikiSeedTimer.ts walks the template directory, parses the metadata block, and ensures a wiki page exists for each script's taterpediaSlug. Page covers script purpose, parameters, sample output, operational guidance, and a "View source" link.
  4. Update notifications - when a library template gets a security or correctness fix, orgs that imported it see a "Library update available" prompt in the Ops Script Library row, with a diff view.
  5. Community contributions - users can submit new templates via the create_script_template MCP tool or pull request; submissions land in Runbooks/OpsCloudTemplates/community/ after review.
  6. MCP discovery - new tools list_script_templates and import_script_template(slug) so AI agents can suggest "this is a known scenario - want me to import the User Diagnostic template?" instead of constructing scripts from scratch.

Until that ships, treat Runbooks/OpsCloudTemplates/ as the canonical reference and copy-paste manually. See its README for the full pattern guidelines.