Published on: November 14, 2024

GitLab Webhooks get smarter with self-healing capabilities

Introducing changes to webhook self-healing behavior, which reduce manual intervention and improve reliability. Discover the impact on your integrations and how to prepare.

We're excited to announce upcoming changes to how GitLab handles webhooks, aimed at improving reliability and reducing manual intervention. These changes will affect GitLab.com users in the coming weeks. For GitLab Self-managed users, the current auto-disabling webhook behavior is controlled by the existing ops flag auto_disabling_webhooks, and the changes described here will be introduced behind that same flag.

This improvement is the result of a valuable community contribution by Phawin Khongkhasawan, exemplifying the power of our open source community in driving GitLab forward.

What's changing?

  • Currently, webhooks that result in 4xx errors become permanently disabled after multiple failures. With this update, all webhooks, regardless of the error type (4xx or 5xx), will have the ability to self-heal.
  • Failing webhooks will be temporarily disabled with an increasing backoff period, up to a maximum of 1 day. After a webhook fails 40 consecutive times, it becomes permanently disabled.
  • All types of errors (4xx, 5xx, network errors, etc.) will be treated the same way, allowing for more predictable behavior and easier troubleshooting.
  • Webhooks that are currently permanently disabled will be migrated to the new model with a recorded count of 40 failures, so they will remain permanently disabled.
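
To make the backoff behavior described above more concrete, here is a minimal Python sketch. The 1-day cap and the 40-failure threshold come from this announcement; the initial 1-minute period and the doubling factor are assumptions chosen only for illustration, and the function name is hypothetical. GitLab's actual schedule may differ.

```python
from datetime import timedelta
from typing import Optional

# Values stated in the announcement.
MAX_BACKOFF = timedelta(days=1)
PERMANENT_DISABLE_THRESHOLD = 40

# Assumptions for illustration only; the real schedule is not given here.
INITIAL_BACKOFF = timedelta(minutes=1)
BACKOFF_FACTOR = 2


def backoff_after(consecutive_failures: int) -> Optional[timedelta]:
    """Return how long a webhook stays temporarily disabled after the given
    number of consecutive failures, or None once it is permanently disabled."""
    if consecutive_failures < 1:
        raise ValueError("consecutive_failures must be at least 1")
    if consecutive_failures >= PERMANENT_DISABLE_THRESHOLD:
        return None  # permanently disabled, no further automatic retries
    delay = INITIAL_BACKOFF
    # Grow the delay once per additional failure, never exceeding the cap.
    for _ in range(consecutive_failures - 1):
        delay = min(delay * BACKOFF_FACTOR, MAX_BACKOFF)
    return delay
```

Under these assumed numbers, the delay reaches the 1-day cap after a dozen consecutive failures and stays there until the 40th failure disables the webhook permanently.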

Why this change matters

  • Reduced manual intervention: You'll no longer need to manually re-enable webhooks that have been disabled due to temporary issues.
  • Improved reliability: Webhooks will automatically attempt to recover from transient errors, ensuring your integrations remain functional.
  • Better handling of temporary issues: This change accounts for scenarios like temporary outages, deployments, or configuration changes that might cause temporary webhook failures.

What you need to do

1. Review your webhooks: Take this opportunity to review your existing webhooks. If you have any that you no longer need, consider deleting them.

2. Update your monitoring: If you rely on webhook status for monitoring, update your processes to account for the new behavior where webhooks may self-heal.

3. Test your integrations: Once the change is rolled out, test your integrations to ensure they behave as expected with the new webhook handling.
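
For the monitoring step, one possible approach is to group webhooks by their reported status so that temporarily disabled hooks are not treated as hard failures. The sketch below assumes each hook is a dictionary with `url`, `alert_status`, and `disabled_until` fields, as we understand the webhooks API to return them; please verify the field names and values against the webhooks documentation for your GitLab version. The function name and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_hooks_by_status(hooks: Iterable[Mapping]) -> Dict[str, List[str]]:
    """Group webhook URLs by their reported alert status."""
    grouped = defaultdict(list)
    for hook in hooks:
        # Fall back to "unknown" if the field is missing or null.
        status = hook.get("alert_status") or "unknown"
        grouped[status].append(hook.get("url", "<no url>"))
    return dict(grouped)


# Hypothetical sample data standing in for an API response.
sample_hooks = [
    {"url": "https://example.com/a", "alert_status": "executable",
     "disabled_until": None},
    {"url": "https://example.com/b", "alert_status": "temporarily_disabled",
     "disabled_until": "2024-11-15T10:00:00Z"},
    {"url": "https://example.com/c", "alert_status": "disabled",
     "disabled_until": None},
]

for status, urls in group_hooks_by_status(sample_hooks).items():
    print(status, urls)
```

A monitoring job built this way could alert immediately on permanently disabled hooks while only logging temporarily disabled ones, since those may recover on their own.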

Timeline and rollout

This feature will be rolled out gradually across the next two milestones, over a period of about two weeks, to ensure a smooth transition.

  • For GitLab.com users, the changes will be applied automatically.
  • For GitLab Self-managed and GitLab Dedicated users, the changes will only affect instances that have the auto_disabling_webhooks ops flag enabled. The changes are expected to take effect starting with GitLab 17.8.

Feedback and support

We value your feedback! If you encounter any issues or have suggestions regarding this change, please comment on our webhook feedback issue.

For any questions or concerns, please reach out to GitLab Support or consult our webhooks documentation.

Stay tuned for more updates, and thank you for being a part of the GitLab community!
