A member of the team was performing a one-off maintenance task on a non-user-facing part of our platform. As part of this work, they ran a command to delete some unneeded applications. They believed this command would only affect this non-user-facing part of our platform.
Unfortunately, this assumption was wrong and the command deleted all of our applications, including those which serve production traffic. This caused an immediate and total outage of Notify.
Once we realised what had happened, we redeployed the most important applications within 15 minutes.
During the outage, users of Notify saw a ‘404: not found’ error. Both the Notify API and website were unavailable.
We have carried out a root cause analysis and identified mitigations to reduce the risk of a similar incident in future.
If you have any questions or comments, please use our support form at https://www.notifications.service.gov.uk/support.
We’re sorry for the inconvenience to you and your users.
The GOV.UK Notify team