»Terraform Enterprise Active/Active

When your organization requires increased reliability or performance from Terraform Enterprise that your current single application instance cannot provide, it is time to scale to the Active/Active architecture.

Before scaling to Active/Active, you should weigh its benefits against increasing operational complexity. Specifically, consider the following aspects of the Active/Active architecture:

  • Hard requirement of a completely automated installation
  • Observability concerns when monitoring multiple instances
  • Custom automation required to manage the lifecycle of application nodes
  • CLI-based commands for administration instead of the Replicated Admin Console

Note: Please contact your Customer Success Manager before attempting to follow this guide. They will be able to walk you through the process to make it as seamless as possible.

»Prerequisites

As mentioned above, the Active/Active architecture requires an existing automated installation of Terraform Enterprise that follows our best practices for deployment.

The primary requirement is an auto scaling group (or equivalent) with a single instance running Terraform Enterprise. This ASG should sit behind a load balancer, which can be exposed to the public Internet or kept internal, depending on your requirements. As mentioned earlier, the installation of the Terraform Enterprise application must be completely automated so that the auto scaling group can be scaled to zero and back to one without human intervention.

Note: Active/Active installations on VMware infrastructure also require you to configure a Load Balancer to route traffic across the Terraform Enterprise servers. This documentation does not cover that setup. While auto-scaling groups are not available via native vCenter options, you must still configure a fully automated deployment. You must also reduce the available servers to one server for upgrades, maintenance, and support.

The application itself must be using External Services mode to connect to an external PostgreSQL database and object storage.

All admin and application configuration must be automated via your settings files and must match the running configuration; that is, it cannot have been altered via the Replicated Admin Console without being synced back to the files. Specifically, you should be using the following configuration files:

  • /etc/replicated.conf - contains the configuration for the Replicated installer
  • /etc/ptfe-settings.json - contains the configuration for the TFE application

Note: The location of the latter is controlled by the "ImportSettingsFrom" setting in /etc/replicated.conf, and the file is sometimes named settings.json or replicated-tfe.conf.
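
For illustration, a minimal /etc/replicated.conf for an automated installation might look like the following sketch. The keys shown are common options from the automated install documentation; the console password and license path are placeholders you must replace with your own values:

{
  "DaemonAuthenticationType": "password",
  "DaemonAuthenticationPassword": "<console password>",
  "TlsBootstrapType": "self-signed",
  "ImportSettingsFrom": "/etc/ptfe-settings.json",
  "LicenseFileLocation": "/path/to/license.rli",
  "BypassPreflightChecks": true
}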

The requirement for automation is two-fold. First, the nodes need to be able to spin up and down without human intervention. More importantly, you need to ensure configuration is managed this way going forward, because the Replicated Admin Console will be disabled. The Replicated Admin Console does not function correctly when running multiple nodes and must not be used.

»Step 1: Prepare to Externalize Redis

»Prepare Network

There are new access requirements involving ingress and egress, illustrated by the example security group rules after this list:

  • Port 6379 (or the port the external Redis will be configured to use) must be open between the nodes and the Redis service
  • Port 8201 must be open between the nodes to allow Vault to run in High Availability mode
  • Port 8800 should now be closed, as the Replicated Admin Console is no longer available when running multiple nodes
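
As an illustration only, assuming AWS and the aws Terraform provider (the variable names referencing your security groups are placeholders), the Redis and Vault HA ports could be opened with rules similar to this sketch:

# Allow the TFE nodes to reach the external Redis service (default port 6379).
resource "aws_security_group_rule" "tfe_to_redis" {
  type                     = "ingress"
  from_port                = 6379
  to_port                  = 6379
  protocol                 = "tcp"
  security_group_id        = var.redis_security_group_id # SG attached to the Redis service
  source_security_group_id = var.tfe_security_group_id   # SG attached to the TFE nodes
}

# Allow Vault to run in High Availability mode between the TFE nodes (port 8201).
resource "aws_security_group_rule" "vault_ha_between_nodes" {
  type                     = "ingress"
  from_port                = 8201
  to_port                  = 8201
  protocol                 = "tcp"
  security_group_id        = var.tfe_security_group_id
  source_security_group_id = var.tfe_security_group_id
}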

»Provision Redis

Externalizing Redis allows multiple active application nodes. Terraform Enterprise is validated to work with the managed Redis services from AWS, Azure, and GCP. Terraform Enterprise does not support Redis Cluster, but does support Redis Sentinel for failover with active/passive deployments.

Note: Please see the cloud-specific configuration guides at the end of this document for details.

»Step 2: Update your Configuration File Templates

Before installing, you need to make changes to the templates for the configuration files mentioned earlier in the prerequisites.

»Update Installer Settings

The existing settings for the installer and infrastructure (replicated.conf) are still needed and must be expanded. See the documentation on the existing options for details.

»Pin Your Version

To upgrade to the Active/Active functionality and for ongoing upgrades, you need to pin your installation to the appropriate release by setting the following:

Key             | Description                            | Specific Format Required
ReleaseSequence | Refers to a version of TFE (v202101-1) | Yes, integer.

The following example pins the deployment to the v202101-1 release of TFE (which is the first release to support multiple nodes):

{
  "ReleaseSequence" : 504
}

»Update Application Settings

The existing settings for the Terraform Enterprise application (ptfe-settings.json) are still needed and must be expanded. See the documentation on the existing options for details.

»Enable Active/Active
Key                  | Required Value | Specific Format Required
enable_active_active | "1"            | Yes, string.
{
  "enable_active_active" : {
    "value": "1"
  }
}

»Configure External Redis

The settings for the Terraform Enterprise application must also be expanded to support an external Redis instance:

Key                      | Description                                                                        | Specific Format Required
redis_host               | Hostname of an external Redis instance which is resolvable from the TFE instance. | Yes, string.
redis_port               | Port number of your external Redis instance.                                      | Yes, string.
redis_use_password_auth* | Set to 1 if you are using a Redis service that requires a password.               | Yes, string.
redis_pass*              | Password used to authenticate to Redis.                                           | Yes, string.
redis_use_tls*           | Set to 1 if you are using a Redis service that requires TLS.                      | Yes, string.

* Fields marked with an asterisk are only necessary if your particular external Redis instance requires them.

For example:

{
  "redis_host" : {
    "value": "someredis.host.com"
  },
  "redis_port" : {
    "value": "6379"
  },
  "redis_use_password_auth" : {
    "value": "1"
  },
  "redis_pass" : {
    "value": "somepassword"
  },
  "redis_use_tls" : {
    "value": "1"
  }
}

Note: To use in-transit encryption with GCP Memorystore for Redis, you must download the CA certificate for your Redis instance and configure it within the ca_certs Terraform Enterprise application setting. Additionally, you must ensure that the redis_port and redis_use_tls settings are configured correctly.
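
For illustration, the relevant application settings in that case might look like the following sketch. The certificate value is a placeholder for the CA certificate downloaded from GCP, and port 6378 is the port Memorystore typically uses when in-transit encryption is enabled; confirm the port reported for your instance:

{
  "ca_certs" : {
    "value": "-----BEGIN CERTIFICATE-----\n<your Memorystore CA certificate>\n-----END CERTIFICATE-----"
  },
  "redis_port" : {
    "value": "6378"
  },
  "redis_use_tls" : {
    "value": "1"
  }
}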

»Add Encryption Password

The Encryption Password value must be added to the configuration and must be identical across node instances for the Active/Active architecture to function:

Key          | Description                           | Value can change between deployments?                          | Specific Format Required
enc_password | Used to encrypt sensitive data (docs) | No. Changing it will make decrypting existing data impossible. | No
{
   "enc_password":{
      "value":"767fee4e6046de48943df2decc55f3cd"
   }
}

Note: In versions prior to v202104-1 the following values were also required to be set: install_id, root_secret, user_token, cookie_hash, archivist_token, internal_api_token, registry_session_secret_key (HEX), and registry_session_encryption_key (HEX). These values are no longer required but will still work if they are still set by your configuration.

»Step 3: Connect to External Redis

Once you are prepared to include the modified configuration options in your configuration files, you must connect a single node to your newly provisioned Redis service by rebuilding your node instance with the new settings.

»Re-provision Terraform Enterprise Instance

Terminate the existing instance by scaling down to zero. Once terminated, you can scale back up to one instance using your revised configuration.
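
For example, on AWS this can be done with the CLI; the auto scaling group name below is a placeholder:

# Scale the group down to zero and wait for the existing instance to terminate.
aws autoscaling update-auto-scaling-group --auto-scaling-group-name tfe-asg \
  --min-size 0 --max-size 1 --desired-capacity 0

# Scale back up to a single instance built from the revised configuration.
aws autoscaling update-auto-scaling-group --auto-scaling-group-name tfe-asg \
  --min-size 1 --max-size 1 --desired-capacity 1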

»Wait for Terraform Enterprise to Install

It can take up to 15 minutes for the node to provision and the TFE application to be installed and respond as healthy. You can monitor the status of the node provisioning by watching your auto scaling group in your cloud's web console. To confirm that the TFE application installed successfully, you can SSH onto the node and run the following command to monitor the installation directly:

replicatedctl app status

This will output something similar to the following:

[
    {
        "AppID": "218b78fa2bd6f0044c6a1010a51d5852",
        "Sequence": 504,
        "PatchSequence": 0,
        "State": "starting",
        "DesiredState": "started",
        "IsCancellable": false,
        "IsTransitioning": true,
        "LastModifiedAt": "2021-01-07T21:15:11.650385151Z"
    }
]

Installation is complete once IsTransitioning is false and State is started.

Refer to the Admin CLI Commands documentation for more status and troubleshooting commands.

»Validate Application

With installation complete, it is time to validate the new Redis connection. Terraform Enterprise uses Redis both as a cache for API requests and as a queue for long-running jobs, such as Terraform runs. Test the latter behavior by running real Terraform plans and applies through the system.
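
Before exercising real runs, you can also confirm the node reports healthy over HTTP; the hostname below is a placeholder for your installation's address:

# Returns HTTP 200 when the application is healthy.
curl -sk -o /dev/null -w "%{http_code}\n" https://tfe.example.com/_health_check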

Once you are satisfied the application is running as expected, you can move on to Step 4 and scale up to two nodes.

»Step 4: Scale to Two Nodes

You can now safely change the number of instances in your Auto Scaling Group (or equivalent) to two.

»Disable the Replicated Admin Console

Before scaling beyond the first node, you must disable the Replicated Admin Console as mentioned earlier in this guide. This is done by adding the disable-replicated-ui flag as a parameter when you call the install script, as follows:

sudo bash ./install.sh disable-replicated-ui

Locate where the install.sh script is run as part of your provisioning/installation process and add the parameter. If there are other parameters on the same line, they should be left in place.

»Scale Down to Zero Nodes

Scale down to zero nodes to fully disable the admin dashboard. Once the existing instance has terminated, you can scale back up.

»Scale Up to Two Nodes

Now that you have tested your external Redis connection, change the min and max instance count of your Auto Scaling Group to 2 nodes.
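
If the auto scaling group is itself managed with Terraform, this is a sketch of the change (all other arguments omitted):

resource "aws_autoscaling_group" "tfe" {
  # ...existing launch configuration, subnets, and load balancer attachments...
  min_size         = 2
  max_size         = 2
  desired_capacity = 2
}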

»Wait for Terraform Enterprise to Install

You need to wait up to 15 minutes for the application to respond as healthy on both nodes. Monitor the status of the install with the same methods used previously for one node in Step 3.

Note: Each node needs to be checked independently.

»Validate Application

Finally, confirm the application is functioning as expected when running multiple nodes. Run Terraform plans and applies through the system (and any other tests specific to your environment), as you did to validate the application in Step 3.

Confirm the general functionality of the Terraform Enterprise UI to validate that the settings you added in Step 2 are set correctly. Browse the Run interface and your organization's private registry to confirm your application functions as expected.

»Scaling Beyond Two Nodes

Terraform Enterprise supports scaling up to 3 nodes as part of the Active/Active deployment. When scaling beyond 2 nodes, you should also carefully evaluate and scale external services, particularly the database server. Scaling to 3 nodes does not alter the upgrade requirement that you scale down to a single node or rebuild entirely.

»PostgreSQL Server

The Terraform Enterprise PostgreSQL server will typically reach CPU capacity before other resources, so we recommend closely monitoring CPU usage in a 2-node configuration before scaling up to 3 nodes. You may also need to manually raise the database's maximum connection count to allow for the additional load. Defaults vary, so please refer to the documentation for the cloud hosting your installation.

  • AWS - RDS connection limits
  • AWS - Aurora Scaling
  • Azure - Azure Database Limits
  • Google Cloud - Cloud SQL Quotas and Limits
  • PostgreSQL 12 - Connection Documentation
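
If your database runs on AWS RDS and is managed with Terraform, raising the connection limit could look like this sketch; the parameter group family and the value of 500 are assumptions to adjust for your engine version and instance size:

# Custom parameter group that raises max_connections above the instance default.
resource "aws_db_parameter_group" "tfe" {
  name   = "tfe-postgres"
  family = "postgres12"

  parameter {
    name         = "max_connections"
    value        = "500"
    apply_method = "pending-reboot" # max_connections is a static parameter
  }
}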

»Redis Server

Although uncommon, some workloads may cause spikes in the Redis server's CPU or memory usage. We recommend monitoring the Redis server and scaling it up as necessary.

  • AWS - Monitoring ElastiCache for Redis with CloudWatch
  • Azure - Monitor Azure Cache for Redis
  • Google Cloud - Monitoring Redis Instances

»Network Infrastructure/API Limits

As you scale Terraform Enterprise beyond 2 nodes, you may be adding additional stress to your network and dramatically increasing the number of API calls made in your cloud account. Each cloud has its own default limits and processes by which those limits might be increased. Please refer to the documentation for the cloud hosting your installation.

  • AWS - EC2 instance network limits
  • AWS - Request Throttling for the EC2 API
  • Azure - Virtual Machine Network Limits
  • Azure - Resource Manager Throttling
  • Google Cloud - Network Quotas and Limits
  • Google Cloud - API rate limits

Depending on your infrastructure and Terraform Enterprise configuration, you may need to configure your application gateway or load balancer for sticky sessions. Sticky sessions refer to the practice of using a load balancer or gateway device with a setting enabled that ensures a client's traffic is routed back to the system that served its original request. For example, an Active/Active deployment on Azure with SAML authentication requires sticky sessions to ensure the authentication with the SAML server is successful. The terminology for this varies across clouds. Refer to the documentation for your infrastructure.

  • AWS - Sticky Sessions for your Application Load Balancer
  • AWS - Configure sticky sessions for your Classic Load Balancer
  • Azure - Application Gateway Cookie-based affinity
  • Azure - Load Balancer distribution modes: Session Persistence
  • Google Cloud - Session affinity
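
For example, on an AWS Application Load Balancer managed with Terraform, sticky sessions can be enabled on the target group; the name, port, and VPC reference below are placeholders:

resource "aws_lb_target_group" "tfe" {
  name     = "tfe"
  port     = 443
  protocol = "HTTPS"
  vpc_id   = var.vpc_id

  # Route each client back to the node that handled its first request.
  stickiness {
    type            = "lb_cookie"
    cookie_duration = 86400
    enabled         = true
  }
}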

»Terraform Cloud Agents - Alternative Solution

Instead of scaling Terraform Enterprise beyond 2 or 3 nodes, you can use Terraform Cloud Agents. Terraform Cloud Agents can run in other regions, other clouds, and even private clouds. Agents poll Terraform Enterprise for work, and Terraform plans and applies then run on the target system where the agent executable is installed. This has a much smaller impact on the Terraform Enterprise servers than running those operations on the servers themselves.
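
As a minimal sketch, an agent can be started with Docker once you have created an agent pool and token in Terraform Enterprise; the address and token below are placeholders:

docker run -d \
  -e TFC_ADDRESS=https://tfe.example.com \
  -e TFC_AGENT_TOKEN=<your agent token> \
  hashicorp/tfc-agent:latest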

»Appendix 1: AWS ElastiCache

The following example Terraform configuration shows how to configure an ElastiCache replication group for use with TFE.

In this example, the required variables are:

  • vpc_id is the ID of the VPC where the TFE application will be deployed
  • subnet_ids are the IDs of the subnets within the VPC to use for the ElastiCache subnet group
  • security_group_ids are the IDs of the security groups within the VPC that will be attached to TFE instances, used to allow Redis ingress
  • availability_zones are the zones within the VPC to deploy the ElastiCache setup into

resource "aws_elasticache_subnet_group" "tfe" {
  name       = "tfe-test-elasticache"
  subnet_ids = var.subnet_ids
}

resource "aws_security_group" "redis_ingress" {
  name        = "external-redis-ingress"
  description = "Allow traffic to redis from instances in the associated SGs"
  vpc_id      = var.vpc_id
  
  ingress {
    description     = "TFE ingress to redis"
    from_port       = 7480
    to_port         = 7480
    protocol        = "tcp"
    security_groups = var.security_group_ids
  }
}

resource "aws_elasticache_replication_group" "tfe" {
  node_type                     = "cache.m4.large"
  replication_group_id          = "tfe-test-redis"
  replication_group_description = "External Redis for TFE."

  apply_immediately          = true
  at_rest_encryption_enabled = true
  auth_token                 = random_pet.redis_password.id
  automatic_failover_enabled = true
  availability_zones         = var.availability_zones
  engine                     = "redis"
  engine_version             = "5.0.6"
  number_cache_clusters      = length(var.availability_zones)
  parameter_group_name       = "default.redis5.0"
  port                       = 7480
  security_group_ids         = [aws_security_group.redis_ingress.id]
  subnet_group_name          = aws_elasticache_subnet_group.tfe.name
  transit_encryption_enabled = true
}
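
The configuration above references input variables and a random_pet resource that are not shown. A sketch of those supporting declarations follows; the random_pet generator mirrors the reference above, though a dedicated secret generator such as random_password may be preferable:

variable "vpc_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "security_group_ids" {
  type = list(string)
}

variable "availability_zones" {
  type = list(string)
}

# Generates the value used as the ElastiCache auth_token above.
resource "random_pet" "redis_password" {
  length = 4
}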
resource "aws_elasticache_subnet_group" "tfe" {
  name       = "tfe-test-elasticache"
  subnet_ids = var.subnet_ids
}

resource "aws_security_group" "redis_ingress" {
  name        = "external-redis-ingress"
  description = "Allow traffic to redis from instances in the associated SGs"
  vpc_id      = var.vpc_id
  
  ingress {
    description     = "TFE ingress to redis"
    from_port       = 7480
    to_port         = 7480
    protocol        = "tcp"
    security_groups = var.security_group_ids
  }
}

resource "aws_elasticache_replication_group" "tfe" {
  node_type                     = "cache.m4.large"
  replication_group_id          = "tfe-test-redis"
  replication_group_description = "External Redis for TFE."

  apply_immediately          = true
  at_rest_encryption_enabled = true
  auth_token                 = random_pet.redis_password.id
  automatic_failover_enabled = true
  availability_zones         = var.availability_zones
  engine                     = "redis"
  engine_version             = "5.0.6"
  number_cache_clusters      = length(var.availability_zones)
  parameter_group_name       = "default.redis5.0"
  port                       = 7480
  security_group_ids         = [aws_security_group.redis_ingress.id]
  subnet_group_name          = aws_elasticache_subnet_group.tfe.name
  transit_encryption_enabled = true
}

»Appendix 2: GCP Memorystore for Redis

Note: Memorystore on Google Cloud does not support persistence, so encryption at rest is not an option.

Requirements/Options:

  • authorized_network - The network you wish to deploy the instance into. Internal testing was done using the same network TFE is deployed into. If a different network is used, you must ensure the two networks are open to each other on port 6379.
  • memory_size_gb - How much memory to allocate to Redis. Initial testing was done with just 1GB configured. Larger deployments may require additional memory. (For reference, TFC uses a cache.m4.large, which has about 6GB of memory.)
  • location_id - The region to deploy into; it should be the same one TFE is deployed into. If the Standard_HA tier is selected, an alternative_location_id will also need to be provided as a failover location.
  • redis_version - Both 4.0 and 5.0 work without issue, but 5.0 is recommended.

The default example provided on the provider page can be used to deploy Memorystore. The host output of the resource can then be provided to the Terraform module to configure connectivity.

You may consider using other options in the configuration depending on your requirements. For example, setting the auth_enabled flag to true must be accompanied by an additional TFE configuration item, redis_pass, set to the value returned in the auth_string attribute of the Memorystore resource.
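
As a minimal sketch using the google provider (the variable names are placeholders), an instance reflecting the options above might look like this:

resource "google_redis_instance" "tfe" {
  name                    = "tfe-redis"
  tier                    = "STANDARD_HA"
  memory_size_gb          = 6 # size to your workload; see the guidance above
  region                  = var.region
  location_id             = var.location_id
  alternative_location_id = var.alternative_location_id
  authorized_network      = var.network_id
  redis_version           = "REDIS_5_0"

  # Optional hardening discussed above.
  auth_enabled            = true
  transit_encryption_mode = "SERVER_AUTHENTICATION"
}

The host and port attributes of the resource map to the redis_host and redis_port application settings, and the auth_string attribute supplies the password when auth_enabled is set.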

»Appendix 3: Azure Cache for Redis

Note: Azure Cache for Redis supports persistence and encryption only with the Premium tier. The other tiers, Basic and Standard, do not support data persistence.

The minimum instance size for Redis to be used with TFE is 6 GiB. For Azure, this allows for some minimum configurations across the three tiers, identified below by Cache Name. Our recommendations on cache sizing for Azure Cache for Redis are in the table below:

           | Basic | Standard | Premium
Cache Name | C3    | C0       | P2

Make sure you configure the minimum TLS version to 1.2, the version TFE supports, as the Azure resource defaults to 1.0. The default port for Azure Cache for Redis is 6380, so the redis_port value in your application settings must be updated for TFE to connect to Azure Cache for Redis.

The default example provided on the provider page can be used to deploy Azure Cache for Redis. The outputs of the resource can then be provided to the Terraform module to configure connectivity.
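
As a minimal sketch using the azurerm provider (the name, location, and resource group are placeholders), a Premium P2 cache following the sizing and TLS guidance above might look like this:

resource "azurerm_redis_cache" "tfe" {
  name                = "tfe-redis"
  location            = var.location
  resource_group_name = var.resource_group_name

  # Premium P2, per the sizing table above.
  sku_name = "Premium"
  family   = "P"
  capacity = 2

  # TFE requires TLS 1.2; the Azure resource defaults to 1.0.
  minimum_tls_version = "1.2"
  enable_non_ssl_port = false
}

The hostname and ssl_port attributes map to the redis_host and redis_port application settings, and primary_access_key supplies redis_pass.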

»Appendix 4: Redis on VMware

Redis on VMware was tested with a virtual machine with 2 CPUs and 8 GB of memory running Ubuntu 20.04. Both Redis v5 and v6 are supported. A full list of supported Operating Systems can be found on the Pre-Install Checklist.

The sizing of your Redis server will depend on your company or organization's workload. Monitoring of the virtual machine and resizing based on utilization is recommended. More details on memory utilization can be found on Redis' website.
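
As a rough sketch for a self-managed Ubuntu 20.04 virtual machine (the bind address and password are placeholders; adjust to your network and security requirements):

# Install Redis from the Ubuntu repositories.
sudo apt-get update && sudo apt-get install -y redis-server

# In /etc/redis/redis.conf, listen on the network and require a password:
#   bind 0.0.0.0
#   requirepass <a strong password, matching the redis_pass application setting>

# Restart to pick up the configuration changes.
sudo systemctl restart redis-server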
