4 posts tagged with "cloud"

IncidentHub posts related to cloud

A Guide to Monitoring Multiple Status Pages

September 22, 2024 · 10 min read

Hrishikesh Barua

Founder @IncidentHub.cloud

Introduction

Updated: July 23, 2025

Incident updates on the public status pages of your cloud providers are often the first indication that they might be having an outage. Providers also post updates about upcoming and ongoing maintenance on their status pages. Monitoring your vendors' status pages is therefore crucial to your business operations. This article will guide you through the process of monitoring such status pages effectively.

There are two ways to monitor your cloud provider status pages:

Manually
Using a status page aggregator like IncidentHub

If you are using the second option, which is the recommended approach, you can skip directly to the section on Use a Status Page Aggregator Tool.

In either case you will need to identify your cloud providers and locate their public status pages first.

Monitoring Specific Components and Regions in Your Third-Party Services

August 12, 2024 · 3 min read

Hrishikesh Barua

Founder @IncidentHub.cloud

Chances are, most of your third-party cloud and SaaS dependencies are globally distributed and operate in many regions. Chances are, too, that your applications use only a subset of any given cloud or SaaS service. If you are monitoring such a service, why should you receive alerts for all regions or every single component in the service?

For example, if you use DigitalOcean, you might be using Kubernetes in their US locations (NYC and SFO). You would want to know only when there is an outage in one of these locations. DigitalOcean's status page only lets you subscribe to outages across the board - it’s all or nothing. This is the case with most services, with a few exceptions.
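To make the idea concrete, here is a minimal sketch of the kind of filtering you would want, assuming incidents arrive as simple records with a component and a region. The record shape, the `WATCHED` mapping, and the sample incidents are all hypothetical; they do not reflect the actual format used by any provider's status page or by IncidentHub.

```python
# Hypothetical watch list: component name -> regions you actually run in.
WATCHED = {"Kubernetes": {"NYC", "SFO"}}

# Hypothetical incident records, made up for illustration only.
SAMPLE_INCIDENTS = [
    {"title": "Cluster provisioning delays", "component": "Kubernetes", "region": "NYC"},
    {"title": "Cluster provisioning delays", "component": "Kubernetes", "region": "AMS"},
    {"title": "Object storage latency", "component": "Spaces", "region": "SFO"},
]


def is_relevant(incident, watched):
    """Return True only if the incident hits a watched component in a watched region."""
    regions = watched.get(incident["component"])
    return regions is not None and incident["region"] in regions


def relevant_incidents(incidents, watched):
    """Keep only the incidents worth alerting on."""
    return [incident for incident in incidents if is_relevant(incident, watched)]


for incident in relevant_incidents(SAMPLE_INCIDENTS, WATCHED):
    print(incident["component"], incident["region"], "-", incident["title"])
```

With the sample data above, only the Kubernetes incident in NYC passes the filter; the other two would not trigger an alert.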

Choosing Specific Components to Monitor

You can now choose which components/regions you wish to monitor in IncidentHub. Let us continue with our DigitalOcean example.

You can choose to monitor all components:

Monitoring Third Party Vendors as an Ops Engineer/SRE

July 22, 2024 · 3 min read

Hrishikesh Barua

Founder @IncidentHub.cloud

Why should you monitor your third-party Cloud and SaaS vendors if you are in SRE/Ops?

As part of an SRE team, your primary responsibility is ensuring the reliability of your applications. What makes you responsible for monitoring services that you don't even manage? Third-party services are just like yours - with SLAs. And outages happen, affecting you as well as many others who depend on them.

It's a no-brainer that you should know when such outages happen, so that you can stay on top of things if and when they affect your running applications.

Most of your third-party dependencies will have a public status page or a Twitter account where they publish updates on their outages. Here are some seemingly easy ways to monitor these pages:

Subscribe to the RSS feed of these pages
Follow the Twitter account
Sign up for Slack, email, or SMS notifications on the status page itself, if the page supports these
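As a rough illustration of the RSS option, the sketch below parses a status feed and picks out incidents you have not seen before. It is only a sketch under stated assumptions: the feed contents and the `status.example.com` URLs are invented, the helper names are hypothetical, and a real setup would also need to fetch each feed on a schedule and repeat this for every vendor.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed contents, made up for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Provider Status</title>
    <item>
      <title>Elevated API error rates</title>
      <link>https://status.example.com/incidents/2</link>
      <pubDate>Mon, 22 Jul 2024 10:15:00 GMT</pubDate>
    </item>
    <item>
      <title>Scheduled database maintenance</title>
      <link>https://status.example.com/incidents/1</link>
      <pubDate>Sun, 21 Jul 2024 08:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_incidents(rss_xml):
    """Extract title, link, and publication time from each RSS item.

    Assumes every item carries a pubDate in the usual RSS date format.
    """
    root = ET.fromstring(rss_xml)
    incidents = []
    for item in root.iter("item"):
        incidents.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return incidents


def unseen_incidents(incidents, seen_links):
    """Return incidents whose link has not been recorded yet."""
    return [incident for incident in incidents if incident["link"] not in seen_links]


already_seen = {"https://status.example.com/incidents/1"}
for incident in unseen_incidents(parse_incidents(SAMPLE_RSS), already_seen):
    print(incident["published"].isoformat(), incident["title"])
```

Even this small example has to track which incidents were already seen, which hints at why these options are only seemingly easy once you depend on more than a handful of vendors.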

Monitoring Your Third-Party Cloud and SaaS Services is Critical

May 20, 2024 · 3 min read

Hrishikesh Barua

Founder @IncidentHub.cloud

If you have a software-based business, you are using at least a few cloud-based tools. It does not matter if you are a solo developer, or part of a 50-member team in a large organization. Take this random list and chances are you are using at least half of them:

Zoom
Google Workspace
Slack
Public cloud/PaaS - GCP/AWS/Azure/Render/Heroku/Railway/DigitalOcean/Hetzner
PagerDuty/Opsgenie
cdnjs
DockerHub
GitLab/GitHub
TravisCI/CircleCI/Semaphore
Let’s Encrypt

Your entire business - irrespective of org or market size - including your development tools, collaboration/communication tools, infrastructure and hosting, monitoring, even email - is dependent on services that you don’t control. They are provided by other vendors.

Of course, you pay for some of them and they all have SLAs. Having an SLA does not translate to 100% uptime. Companies will try their best to meet SLAs - which promise a percentage of uptime (usually 99.xx%). There are going to be incidents at your providers at some point, and the effect will cascade to the service that you provide to your customers. This means that your own product's SLA can be breached due to causes outside your control.
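A little arithmetic shows what these percentages mean in practice. The sketch below converts an uptime percentage into allowed downtime, and estimates the best-case availability of a product that needs several providers to be up at the same time. The uptime figures are illustrative rather than any vendor's actual SLA, and the combined figure assumes the providers fail independently and that your product is down whenever any one of them is.

```python
def allowed_downtime_minutes(uptime_percent, period_days=30):
    """Minutes of downtime permitted by an uptime percentage over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


def composite_availability(uptime_percents):
    """Best-case availability (in percent) when every dependency must be up.

    Assumes independent failures, which is a simplification.
    """
    availability = 1.0
    for percent in uptime_percents:
        availability *= percent / 100
    return availability * 100


# A 99.9% SLA allows roughly 43.2 minutes of downtime in a 30-day month.
print(round(allowed_downtime_minutes(99.9), 1))

# Three dependencies at 99.9%, 99.9%, and 99.95% combine to roughly 99.75%.
print(round(composite_availability([99.9, 99.9, 99.95]), 2))
```

Under these assumptions, three individually reliable providers already leave you with less headroom than any single one of their SLAs suggests.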
