Incidents · 5 min read

Status Pages vs Alerts: Real Tradeoffs

When should you update the status page vs. just alerting internally? A framework for incident communication decisions.

Your monitoring just fired. You've got an internal alert. Now the question: do you update the status page? Not every issue needs public communication, but waiting too long destroys trust.

The Core Tension

Status pages exist for transparency. But posting every blip creates noise and alarm. The goal is to communicate when customers are impacted while avoiding unnecessary panic.

🚨 Two Mistakes to Avoid

  • Never updating. Status page always shows green, even during obvious outages. Customers lose trust.
  • Over-updating. Every 2-second timeout becomes an incident. Customers think you're unreliable.

A Framework for Decisions

Ask three questions before posting to your status page:

1. Is the customer impacted?

Not every internal alert means customer impact. Database replication lag might be scary internally but invisible to users.

  • Internal-only issues: Alert your team, don't post publicly
  • Degraded performance: Consider posting if noticeable
  • Functionality broken: Definitely post

2. Can you fix it quickly?

If you can resolve the issue in under 5 minutes, you might not need a status page update. The incident will be over before most customers notice.

resolution-guidelines.txt
Expected resolution < 5 min → Internal alert only
Expected resolution 5-15 min → Consider posting
Expected resolution > 15 min → Definitely post

3. Are customers already noticing?

If customers are already reporting issues (support tickets, Twitter, etc.), you've lost the initiative. Post immediately.
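
The three questions can be folded into one small decision function. The sketch below is illustrative only: the names (`Impact`, `Decision`, `Incident`, `status_page_decision`) are hypothetical, and the order in which the questions override each other is an assumption, since the framework does not say how to break ties (for example, broken functionality that will be fixed in 3 minutes).

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    INTERNAL_ONLY = "internal_only"  # e.g. replication lag users cannot see
    DEGRADED = "degraded"            # noticeably slower, but still working
    BROKEN = "broken"                # functionality is unavailable


class Decision(Enum):
    INTERNAL_ALERT_ONLY = "internal alert only"
    CONSIDER_POSTING = "consider posting"
    POST = "post"


@dataclass
class Incident:
    impact: Impact
    expected_resolution_min: float
    customers_reporting: bool = False


def status_page_decision(incident: Incident) -> Decision:
    # Question 3: customers are already noticing, so post immediately.
    if incident.customers_reporting:
        return Decision.POST
    # Question 1: no customer impact means no public post.
    if incident.impact is Impact.INTERNAL_ONLY:
        return Decision.INTERNAL_ALERT_ONLY
    # Question 2: resolution-time guidelines (< 5, 5-15, > 15 minutes).
    if incident.expected_resolution_min < 5:
        return Decision.INTERNAL_ALERT_ONLY
    # Assumption: broken functionality is posted even inside the 5-15 band.
    if incident.expected_resolution_min > 15 or incident.impact is Impact.BROKEN:
        return Decision.POST
    return Decision.CONSIDER_POSTING
```

A "consider posting" result still needs a human call; the function only makes the default explicit so the on-call engineer is not deciding from scratch at 3 a.m.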

What to Post

When you do update the status page, be useful:

✅ Good Status Update

API Performance Degradation
We're investigating increased latency on our API endpoints. Some requests may timeout or take longer than usual. We've identified the cause and are deploying a fix. ETA: 15 minutes.

🚨 Bad Status Update

Investigating Issues
We're looking into some issues.
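
One way to keep updates closer to the good example is to draft them from a fixed set of fields and flag the ones left empty. This is a minimal sketch; the `StatusUpdate` class and its field names are hypothetical and do not correspond to any upti.my feature.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StatusUpdate:
    title: str
    affected: str = ""         # what is affected
    customer_impact: str = ""  # what customers will see
    progress: str = ""         # what the team is doing about it
    eta_minutes: Optional[int] = None

    def missing_details(self) -> List[str]:
        """Names of the descriptive fields that were left empty."""
        fields = ("affected", "customer_impact", "progress")
        return [name for name in fields if not getattr(self, name)]

    def render(self) -> str:
        """Title on the first line, then the non-empty details as one paragraph."""
        parts = [p for p in (self.affected, self.customer_impact, self.progress) if p]
        if self.eta_minutes is not None:
            parts.append(f"ETA: {self.eta_minutes} minutes.")
        return self.title + "\n" + " ".join(parts)


good = StatusUpdate(
    title="API Performance Degradation",
    affected="We're investigating increased latency on our API endpoints.",
    customer_impact="Some requests may timeout or take longer than usual.",
    progress="We've identified the cause and are deploying a fix.",
    eta_minutes=15,
)
bad = StatusUpdate(title="Investigating Issues")
```

Here `good.missing_details()` is empty, while `bad.missing_details()` lists every descriptive field, which is a cheap signal that the draft says nothing a customer can act on.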

Automated vs. Manual Updates

upti.my can automatically create incidents when monitors fail. But should you enable this?

When to Use Auto-Incidents

  • Complete outages. If your main site is down for more than 2 minutes, auto-create an incident.
  • Critical path failures. Login, checkout, or core functionality.
  • Off-hours. Nobody is awake to manually update.

When to Keep Manual

  • Flaky monitors. False positives create noise.
  • Internal services. Customers don't need to know.
  • Known maintenance. Scheduled downtime is handled differently.
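
The two lists above can be read as a single policy check. The following sketch assumes a hypothetical `MonitorFailure` record; none of these field names are upti.my settings, and letting the "keep manual" conditions win over the "auto" conditions is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MonitorFailure:
    down_seconds: int
    is_main_site: bool = False
    critical_path: bool = False          # login, checkout, core functionality
    off_hours: bool = False              # nobody awake to update manually
    flaky: bool = False                  # monitor with a history of false positives
    internal_service: bool = False       # customers never see this service
    in_maintenance_window: bool = False  # scheduled downtime


def should_auto_create_incident(failure: MonitorFailure) -> bool:
    # "Keep manual" cases win: never auto-post from these.
    if failure.flaky or failure.internal_service or failure.in_maintenance_window:
        return False
    # Complete outage: main site down for more than 2 minutes.
    if failure.is_main_site and failure.down_seconds > 120:
        return True
    # Critical path failures and off-hours failures.
    return failure.critical_path or failure.off_hours
```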

The Middle Ground: Auto-Detect, Manual Confirm

A balanced approach: monitoring automatically detects issues and drafts an incident, but requires human confirmation before posting.

auto-detect-workflow.txt
1. Monitor fails → Alert fires
2. System drafts incident: "API returning errors"
3. On-call engineer reviews: Confirm, edit, or dismiss
4. Confirmed → Posts to status page
5. Auto-resolves when monitor recovers
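
The workflow above is a small state machine, sketched below. The class and method names are hypothetical, not an upti.my API, and leaving an unreviewed draft untouched when the monitor recovers is an assumption, since the workflow only describes auto-resolution for posted incidents.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    DRAFT = "draft"          # step 2: drafted, not yet public
    POSTED = "posted"        # step 4: visible on the status page
    DISMISSED = "dismissed"  # step 3: engineer rejected the draft
    RESOLVED = "resolved"    # step 5: monitor recovered


class DraftIncident:
    def __init__(self, title: str) -> None:
        self.title = title
        self.state = State.DRAFT

    def confirm(self, edited_title: Optional[str] = None) -> None:
        """Step 3 and 4: on-call confirms (optionally editing), which posts it."""
        if self.state is not State.DRAFT:
            raise ValueError(f"cannot confirm an incident in state {self.state.value}")
        if edited_title:
            self.title = edited_title
        self.state = State.POSTED

    def dismiss(self) -> None:
        """Step 3: on-call decides the draft should never go public."""
        if self.state is not State.DRAFT:
            raise ValueError(f"cannot dismiss an incident in state {self.state.value}")
        self.state = State.DISMISSED

    def monitor_recovered(self) -> None:
        """Step 5: only posted incidents auto-resolve; other states are left alone."""
        if self.state is State.POSTED:
            self.state = State.RESOLVED
```

The key property is that nothing reaches `POSTED` without a call to `confirm()`, so a false positive costs the on-call engineer a dismissal rather than a public retraction.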

Alert Channels vs. Status Page

Internal alerts and status pages serve different audiences:

💡 Channel Selection Guide

Channel             | Audience          | When to Use
PagerDuty/Opsgenie  | On-call engineer  | Every customer-impacting alert
Slack #incidents    | Engineering team  | All incidents, even minor
Status page         | Customers         | Customer-visible impact only
Twitter/Email       | All users         | Major outages only
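
The guide can be expressed as a routing function. In this sketch the three-level `Severity` scale is an assumption made for illustration (the table does not define severity levels), and "customer-impacting" is treated as the same thing as "customer-visible".

```python
from enum import Enum
from typing import List


class Severity(Enum):
    MINOR = 1               # no customer-visible impact
    CUSTOMER_IMPACTING = 2  # customers can see or feel the problem
    MAJOR_OUTAGE = 3        # widespread outage


def channels_for(severity: Severity) -> List[str]:
    # Slack #incidents gets all incidents, even minor ones.
    channels = ["Slack #incidents"]
    if severity.value >= Severity.CUSTOMER_IMPACTING.value:
        # Page the on-call engineer and tell customers.
        channels += ["PagerDuty/Opsgenie", "Status page"]
    if severity is Severity.MAJOR_OUTAGE:
        # Broadcast channels are reserved for major outages.
        channels.append("Twitter/Email")
    return channels
```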

📌 Key Takeaways

  1. Not every alert needs a status page update
  2. Post when customers are impacted and the issue can't be fixed quickly
  3. If customers are already complaining, you're already late: post now
  4. Auto-incidents work for clear-cut outages, not edge cases
  5. Match communication channel to audience

The goal is trust. Customers should believe your status page reflects reality. That means updating when things break, but not crying wolf over every blip.

Written by

Engineering Team
