{"summary":{"overall":"degraded","up":15,"degraded":8,"down":0,"unknown":0,"open_incidents":8,"last_updated_at":1780693814571},"categories":[{"id":"identity","name":"Identity & Access","providers":[{"id":"okta","name":"Okta","vendor_homepage":"https://www.okta.com","status":"degraded","detail":"expected body marker missing: <channel>","last_checked_at":1780693809890,"last_changed_at":1780346449959,"last_response_ms":448,"status_page_url":"https://status.okta.com/","docs_url":"https://help.okta.com/oie/en-us/content/topics/release-notes/oie/release-notes.htm","playbook_md":"If Okta is down, sign-in and SAML federation across your stack are blocked. Confirm scope on Okta status page, switch any production cutover to your IdP backup if you operate one, and post a customer-facing notice. Most outages resolve in <2h.","affects_features":["auth","identity"],"incident":null},{"id":"entra","name":"Microsoft Entra ID","vendor_homepage":"https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id","status":"up","detail":"HTTP 200 · 68ms","last_checked_at":1780693809573,"last_changed_at":1779327353397,"last_response_ms":134,"status_page_url":"https://azure.status.microsoft/en-us/status","docs_url":"https://learn.microsoft.com/en-us/entra/identity-platform/reference-error-codes","playbook_md":"If Entra is down, Microsoft 365 sign-in, Teams, and any Azure-federated workload will be affected. The OIDC discovery probe failing is a direct, zero-lag signal — official status pages typically lag 15–45 minutes. 
Notify users, and disable any Entra-gated automation.","affects_features":["auth","identity"],"incident":null},{"id":"auth0","name":"Auth0","vendor_homepage":"https://auth0.com","status":"degraded","detail":"degraded body marker: Service Disruption","last_checked_at":1780693809731,"last_changed_at":1780468829725,"last_response_ms":281,"status_page_url":"https://status.auth0.com/","docs_url":"https://auth0.com/docs/troubleshoot","playbook_md":"If Auth0 is degraded, customer log-ins via your Auth0 tenant may fail intermittently. Check region (US/EU/AU) on the Auth0 status page — outages are usually region-scoped. Switch to your secondary tenant if you maintain one.","affects_features":["auth","identity"],"incident":null},{"id":"auth0_fga","name":"Auth0 FGA (Fine-Grained Authorization)","vendor_homepage":"https://fga.dev","status":"up","detail":"HTTP 200 · 332ms","last_checked_at":1780693809803,"last_changed_at":1779332153587,"last_response_ms":365,"status_page_url":"https://status.fga.dev/","docs_url":"https://docs.fga.dev/","playbook_md":"FGA outages affect any authorization decisions your app delegates to FGA. If you rely on it for permission checks, expect 5xx from your auth-decision endpoint. Have a fallback \"deny by default\" mode unless the user is verified-admin.","affects_features":["auth","identity"],"incident":null}]},{"id":"cdn","name":"Edge & CDN","providers":[{"id":"cloudflare","name":"Cloudflare","vendor_homepage":"https://www.cloudflare.com","status":"degraded","detail":"expected body marker missing: \"indicator\":\"none\"","last_checked_at":1780693810411,"last_changed_at":1780683009649,"last_response_ms":506,"status_page_url":"https://www.cloudflarestatus.com/","docs_url":"https://developers.cloudflare.com/support/troubleshooting/","playbook_md":"ITIL Sidekick itself runs on Cloudflare. A major Cloudflare outage will affect probes, status pages, and webhook deliveries. We will surface this prominently. 
Customers whose origin is also behind Cloudflare should monitor their own region's PoP via the status page.","affects_features":["cdn","cloud"],"incident":null},{"id":"fastly","name":"Fastly","vendor_homepage":"https://www.fastly.com","status":"up","detail":"HTTP 200 · 90ms","last_checked_at":1780693809454,"last_changed_at":1779330953356,"last_response_ms":9,"status_page_url":"https://www.fastlystatus.com/","docs_url":"https://www.fastly.com/documentation/guides/concepts/healthcheck/","playbook_md":"Fastly outages affect any site routed through their edge network. If you serve assets through Fastly, expect cold-start latency or 503s during a major event. Their status page typically updates within 5 min of detection.","affects_features":["cdn"],"incident":null},{"id":"akamai","name":"Akamai","vendor_homepage":"https://www.akamai.com","status":"degraded","detail":"expected body marker missing: \"indicator\":\"none\"","last_checked_at":1780693811917,"last_changed_at":1780684212186,"last_response_ms":383,"status_page_url":"https://www.akamaistatus.com/","docs_url":"https://techdocs.akamai.com/","playbook_md":"Akamai outages most often affect large enterprise customers. If your site is fronted by Akamai, check the regional incident list — issues are typically PoP-localized. Your origin should remain available; consider temporary DNS failover to bypass the edge if critical.","affects_features":["cdn"],"incident":null}]},{"id":"cloud","name":"Cloud Hyperscalers","providers":[{"id":"aws","name":"AWS","vendor_homepage":"https://aws.amazon.com","status":"up","detail":"HTTP 200 · 86ms","last_checked_at":1780693811497,"last_changed_at":1779327354117,"last_response_ms":562,"status_page_url":"https://health.aws.amazon.com/health/status","docs_url":"https://docs.aws.amazon.com/health/latest/ug/aws-health-dashboard-status.html","playbook_md":"AWS regional outages affect S3, EC2, RDS, Lambda, and downstream services in the affected region. 
The public Service Health Dashboard usually lags real-world impact by 10–30 min; cross-reference with traffic on your own dashboards. If you're multi-region, initiate failover.","affects_features":["cloud"],"incident":null},{"id":"gcp","name":"Google Cloud","vendor_homepage":"https://cloud.google.com","status":"up","detail":"HTTP 200 · 29ms","last_checked_at":1780693811023,"last_changed_at":1779327354060,"last_response_ms":77,"status_page_url":"https://status.cloud.google.com/","docs_url":"https://cloud.google.com/service-health","playbook_md":"GCP outages affect GKE, Cloud Run, BigQuery, and any service in the affected region. The public JSON incident feed is the authoritative real-time source. If you have a Personalized Service Health subscription, your dashboard will surface this in real-time. Failover regional workloads if possible.","affects_features":["cloud"],"incident":null},{"id":"azure","name":"Microsoft Azure","vendor_homepage":"https://azure.microsoft.com","status":"up","detail":"HTTP 200 · 164ms","last_checked_at":1780693811160,"last_changed_at":1779327354195,"last_response_ms":215,"status_page_url":"https://azure.status.microsoft/","docs_url":"https://learn.microsoft.com/en-us/azure/service-health/service-health-overview","playbook_md":"Azure outages cascade to App Service, Azure SQL, Functions, and AKS. Use the Azure Resource Health REST API (subscription-level) for fastest signal. For public-facing apps, expect a status-page lag of 15–45 minutes. 
Initiate cross-region failover via Azure Front Door if configured.","affects_features":["cloud"],"incident":null},{"id":"vercel","name":"Vercel","vendor_homepage":"https://vercel.com","status":"degraded","detail":"expected body marker missing: \"status\":\"operational\"","last_checked_at":1780693811482,"last_changed_at":1780684212164,"last_response_ms":322,"status_page_url":"https://www.vercel-status.com/","docs_url":"https://vercel.com/docs/errors","playbook_md":"Vercel outages affect Next.js apps deployed there, including preview deployments and Edge Functions. If your production frontend is on Vercel, expect 503s. Rolling back to a known-good deployment may not help during a Vercel-side outage; the platform itself is the issue.","affects_features":["cloud","dev"],"incident":null},{"id":"digitalocean","name":"DigitalOcean","vendor_homepage":"https://www.digitalocean.com","status":"up","detail":"HTTP 200 · 156ms","last_checked_at":1780693811705,"last_changed_at":1780573237132,"last_response_ms":756,"status_page_url":"https://status.digitalocean.com/","docs_url":"https://docs.digitalocean.com/support/","playbook_md":"Affects Droplets, App Platform, Kubernetes, and Spaces in the affected region. DigitalOcean outages are usually region-localised; if your workload is multi-region, failover. Spaces (S3-compatible) failures cascade to any backup or asset pipeline depending on it.","affects_features":["cloud"],"incident":null}]},{"id":"transactional","name":"Transactional APIs","providers":[{"id":"stripe","name":"Stripe","vendor_homepage":"https://stripe.com","status":"degraded","detail":"unexpected HTTP 404","last_checked_at":1780693812639,"last_changed_at":1780503038182,"last_response_ms":174,"status_page_url":"https://status.stripe.com/","docs_url":"https://docs.stripe.com/api/errors","playbook_md":"A 401 from the Stripe gateway is the healthy signal — it means the API is alive and rejecting our unauthenticated probe. 
If we're seeing 5xx or timeouts, expect checkout and webhook delivery failures. Your Stripe customers will see \"could not process payment\". Defer any non-critical billing operations until recovery.","affects_features":["billing"],"incident":null},{"id":"twilio","name":"Twilio","vendor_homepage":"https://www.twilio.com","status":"degraded","detail":"expected body marker missing: \"indicator\":\"none\"","last_checked_at":1780693813498,"last_changed_at":1780684214500,"last_response_ms":442,"status_page_url":"https://status.twilio.com/","docs_url":"https://www.twilio.com/docs/api/errors","playbook_md":"Twilio outages affect SMS, Voice, and Verify flows. If your MFA depends on Twilio SMS, consider temporarily enabling email-based 2FA. Twilio typically scopes outages to specific regions (e.g., Italy, Brazil) — check the status page for affected carriers.","affects_features":["sms","email"],"incident":null},{"id":"resend","name":"Resend","vendor_homepage":"https://resend.com","status":"degraded","detail":"unexpected HTTP 200","last_checked_at":1780693813253,"last_changed_at":1779327354127,"last_response_ms":788,"status_page_url":"https://resend-status.com/","docs_url":"https://resend.com/docs/api-reference/introduction","playbook_md":"A 401 from Resend is the healthy signal — gateway alive, just rejecting our unauthenticated probe. We use Resend for transactional email (verification + status-page subscriber notifications). 
If Resend is down, expect delayed or failed delivery — your customers will see \"didn't receive verification email\" support tickets.","affects_features":["email"],"incident":null},{"id":"sendgrid","name":"SendGrid","vendor_homepage":"https://sendgrid.com","status":"up","detail":"HTTP 200 · 281ms","last_checked_at":1780693813013,"last_changed_at":1779327355252,"last_response_ms":555,"status_page_url":"https://status.sendgrid.com/","docs_url":"https://docs.sendgrid.com/","playbook_md":"If you're still on SendGrid, outages affect transactional and marketing email. SendGrid does not return IETF-compliant rate-limit headers, so 429 responses may surface as opaque failures. Consider Resend as a more modern alternative.","affects_features":["email"],"incident":null},{"id":"slack","name":"Slack","vendor_homepage":"https://slack.com","status":"up","detail":"HTTP 200 · 75ms","last_checked_at":1780693812471,"last_changed_at":1780508437808,"last_response_ms":13,"status_page_url":"https://status.slack.com/","docs_url":"https://api.slack.com/changelog","playbook_md":"Slack outages affect customer-facing support workflows, bridge calls, and any Slack-bot integrations. If your incident response uses Slack as the bridge, switch to a backup voice channel (Zoom / Teams) immediately. Slack typically scopes outages to specific regions or features (e.g., file uploads, search) — check the status page.","affects_features":["communications"],"incident":null},{"id":"discord","name":"Discord","vendor_homepage":"https://discord.com","status":"up","detail":"HTTP 200 · 258ms","last_checked_at":1780693813096,"last_changed_at":1779421554270,"last_response_ms":643,"status_page_url":"https://discordstatus.com/","docs_url":"https://discord.com/developers/docs/topics/opcodes-and-status-codes","playbook_md":"Affects community support channels, bot integrations, and gaming/social features. If your customer support depends on Discord, post a notice on alternate channels. 
Discord outages typically resolve within 30 minutes.","affects_features":["communications"],"incident":null},{"id":"openai","name":"OpenAI","vendor_homepage":"https://openai.com","status":"up","detail":"HTTP 200 · 256ms","last_checked_at":1780693814262,"last_changed_at":1780686614883,"last_response_ms":184,"status_page_url":"https://status.openai.com/","docs_url":"https://platform.openai.com/docs/guides/error-codes","playbook_md":"GPT, DALL-E, and ChatGPT API outages affect any AI feature in your product. If you have a fallback model (Claude, Gemini), failover. Otherwise queue requests and replay on recovery — OpenAI outages usually resolve within an hour but can degrade GPT-4 / GPT-5 independently.","affects_features":["ai","dev"],"incident":null},{"id":"anthropic","name":"Anthropic (Claude)","vendor_homepage":"https://anthropic.com","status":"up","detail":"HTTP 200 · 401ms","last_checked_at":1780693814514,"last_changed_at":1780680015301,"last_response_ms":439,"status_page_url":"https://status.claude.com/","docs_url":"https://docs.anthropic.com/en/api/errors","playbook_md":"Claude API outages affect any feature using Anthropic models (chat, code generation, summarisation). If your product has multi-model failover, switch to OpenAI/Gemini. The Anthropic status page moved to status.claude.com in 2026.","affects_features":["ai","dev"],"incident":null}]},{"id":"dev","name":"Developer Infrastructure","providers":[{"id":"github","name":"GitHub","vendor_homepage":"https://github.com","status":"up","detail":"HTTP 200 · 314ms","last_checked_at":1780693814571,"last_changed_at":1780001438429,"last_response_ms":505,"status_page_url":"https://www.githubstatus.com/","docs_url":"https://docs.github.com/en/site-policy/other-site-policies/github-availability","playbook_md":"GitHub outages affect git clone/push, Actions, Packages, and Codespaces. If your deployment pipeline fetches from GitHub, expect CI failures. 
For urgent deploys, use a local mirror or push directly to your runtime (Cloudflare deploys do not require GitHub access at run-time).","affects_features":["dev","ci"],"incident":null},{"id":"npm","name":"npm Registry","vendor_homepage":"https://www.npmjs.com","status":"up","detail":"HTTP 200 · 292ms","last_checked_at":1780693814473,"last_changed_at":1779327355263,"last_response_ms":395,"status_page_url":"https://status.npmjs.org/","docs_url":"https://docs.npmjs.com/","playbook_md":"npm outages affect every CI build that runs `npm install` from scratch. Use a private registry mirror or pre-built Docker images for critical pipelines. Pinned lockfiles + cached node_modules will keep most local dev unaffected.","affects_features":["dev","ci"],"incident":null},{"id":"atlassian","name":"Atlassian (Jira / Confluence)","vendor_homepage":"https://www.atlassian.com","status":"up","detail":"HTTP 200 · 240ms","last_checked_at":1780693814543,"last_changed_at":1779330955513,"last_response_ms":468,"status_page_url":"https://status.atlassian.com/","docs_url":"https://confluence.atlassian.com/cloud/atlassian-support-1107557165.html","playbook_md":"Affects Jira, Confluence, Bitbucket, Trello, Statuspage, and Opsgenie. During an Atlassian outage, your engineering team cannot triage tickets in Jira and your documentation in Confluence is unreachable. 
Have a fallback runbook (Notion / a Markdown repo) for incident response.","affects_features":["dev","ci"],"incident":null}]}],"open_events":[{"id":"2143231c-bb31-4aeb-8b9b-90c132e984f6","provider_id":"twilio","event_type":"degraded","severity":"P2","http_status":200,"response_ms":451,"error_msg":null,"cf_pop":"BOM","started_at":1780684214500,"duration_ms":9865063},{"id":"f5dd1e93-95b4-44f4-9302-77fc8601963d","provider_id":"akamai","event_type":"degraded","severity":"P3","http_status":200,"response_ms":487,"error_msg":null,"cf_pop":"BOM","started_at":1780684212186,"duration_ms":9867377},{"id":"d1a1781e-f4f5-417b-94fe-8c0907fd0627","provider_id":"vercel","event_type":"degraded","severity":"P1","http_status":200,"response_ms":465,"error_msg":null,"cf_pop":"BOM","started_at":1780684212164,"duration_ms":9867399},{"id":"b077a464-f574-41c2-b41d-c1f92774529b","provider_id":"cloudflare","event_type":"degraded","severity":"P3","http_status":200,"response_ms":351,"error_msg":null,"cf_pop":"BOM","started_at":1780683009649,"duration_ms":11069914},{"id":"5357c898-5e23-462b-9cbc-970c1fb08ebd","provider_id":"stripe","event_type":"degraded","severity":"P2","http_status":404,"response_ms":112,"error_msg":null,"cf_pop":"LAX","started_at":1780503038182,"duration_ms":191041381},{"id":"cc9bb083-f9bb-44f5-857e-95fecc5312b0","provider_id":"auth0","event_type":"degraded","severity":"P1","http_status":200,"response_ms":218,"error_msg":null,"cf_pop":"FRA","started_at":1780468829725,"duration_ms":225249838},{"id":"7836563c-adaf-41ed-b521-5ac1e87d9256","provider_id":"okta","event_type":"degraded","severity":"P1","http_status":200,"response_ms":1565,"error_msg":null,"cf_pop":"BOM","started_at":1780346449959,"duration_ms":347629604},{"id":"959dff22-555c-411f-a859-c12f411a261c","provider_id":"resend","event_type":"degraded","severity":"P2","http_status":200,"response_ms":96,"error_msg":null,"cf_pop":"LHR","started_at":1779327354127,"duration_ms":1366725436}]}