» Monitoring a Terraform Enterprise Instance
This document outlines best practices for monitoring a Terraform Enterprise instance.
» Health Check
Terraform Enterprise provides a /_health_check endpoint on the instance. If Terraform Enterprise is up, the health check returns 200 OK. The /_health_check endpoint operates in two modes:
- Full check
- Minimal check
A full check attempts to verify the status of internal components and PostgreSQL. A minimal check, in contrast, returns 200 OK automatically once a full check has succeeded.
The endpoint's default behavior is to perform a full check during startup of the instance, and minimal checks after Terraform Enterprise is active and running.
Note: To force a full check, add a query parameter to the request: /_health_check?full=1. Use this with caution, because every such call makes requests to internal components and PostgreSQL, which increases system load and latency.
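The behavior described above can be probed from an external monitor. The following is a minimal sketch using only the Python standard library, not an official client: the function names, the 5-second timeout, and the example hostname are assumptions made for illustration. It treats anything other than a 200 OK response as unhealthy.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode, urljoin


def health_check_url(base_url: str, full: bool = False) -> str:
    """Build the health check URL; full=True appends ?full=1."""
    url = urljoin(base_url.rstrip("/") + "/", "_health_check")
    if full:
        url += "?" + urlencode({"full": 1})
    return url


def is_healthy(base_url: str, full: bool = False, timeout: float = 5.0,
               opener=urllib.request.urlopen) -> bool:
    """Return True only if the endpoint answers 200 OK within the timeout."""
    try:
        with opener(health_check_url(base_url, full), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure, or an HTTP error status.
        return False
```

For example, is_healthy("https://tfe.example.com") would request the default check against a hypothetical hostname. Passing full=True forces a full check on every call, so it is best kept to infrequent or on-demand probes.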
» Metrics & Telemetry
In addition to health-check monitoring, we recommend monitoring standard server metrics on the Terraform Enterprise instance, such as CPU, memory (RAM), disk, and I/O.
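As a rough illustration of sampling server metrics on the instance host, here is a minimal sketch using only the Python standard library. The function name and the choice of metrics (load average and disk usage) are assumptions for this example. Memory and I/O counters are omitted because the standard library has no portable call for them, and in practice a dedicated monitoring agent would normally collect all of these.

```python
import os
import shutil


def sample_server_metrics(path: str = "/") -> dict:
    """Sample load average and disk usage for the filesystem holding `path`."""
    load1, load5, load15 = os.getloadavg()  # available on Unix-like systems only
    disk = shutil.disk_usage(path)
    return {
        "cpu_count": os.cpu_count(),
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
        "disk_used_percent": 100.0 * disk.used / disk.total,
    }
```

A sustained load average well above cpu_count, or a disk_used_percent close to 100, is the kind of condition an alert would typically be built on. Any specific thresholds are left to the operator.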
» Internal Monitoring
Beginning in version 201811-1, Terraform Enterprise includes internal monitoring of critical application metrics using a statsd collector. The resulting metrics are included in the diagnostic bundles provided to HashiCorp Support.
If another collector is already running on the same port of the Terraform Enterprise instance, Terraform Enterprise may fail to start because of the conflict, and the logs will indicate:
Error starting userland proxy: listen udp 0.0.0.0:XXXXX: bind: address already in use
To prevent this conflict, disable metrics collection by Terraform Enterprise:
- Access the installer dashboard on port 8800 of your instance.
- Locate Advanced Configuration on the configuration page under Terraform Build Worker image.
- Uncheck "Enable metrics collection".
- Restart the application from the dashboard if it does not restart automatically.
We recommend disabling internal monitoring only if it is causing issues, because it otherwise provides useful detail for diagnosing problems. From version 201812-1, the collector is configured on port 23010, and the 23000-23100 port range should be reserved for Terraform Enterprise to run any services of this nature.
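To check for the port conflict described above before starting Terraform Enterprise, one option is to test whether a UDP port can be bound on the host. This is a minimal sketch, not a HashiCorp tool: the function name is hypothetical, and it assumes the check runs on the instance host while Terraform Enterprise is stopped, since otherwise its own collector would hold the port.

```python
import socket


def udp_port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a UDP socket cannot be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"; any bind failure is
            # treated as a conflict in this sketch.
            return True
    return False
```

For example, udp_port_in_use(23010) checks the collector port, and [p for p in range(23000, 23101) if udp_port_in_use(p)] lists any ports in the reserved range that are already taken.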