» Resource: aws_emr_cluster

Provides an Elastic MapReduce Cluster, a web service that makes it easy to process large amounts of data efficiently. See Amazon Elastic MapReduce Documentation for more information.

To configure Instance Groups for task nodes, see the aws_emr_instance_group resource.

» Example Usage

resource "aws_emr_cluster" "cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  additional_info = <<EOF
{
  "instanceAwsClientConfiguration": {
    "proxyPort": 8099,
    "proxyHost": "myproxy.example.com"
  }
}
EOF

  termination_protection            = false
  keep_job_flow_alive_when_no_steps = true

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.sg.id}"
    emr_managed_slave_security_group  = "${aws_security_group.sg.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_group {
    instance_type = "m4.large"
  }

  core_instance_group {
    instance_type  = "c4.large"
    instance_count = 1

    ebs_config {
      size                 = "40"
      type                 = "gp2"
      volumes_per_instance = 1
    }

    bid_price = "0.30"

    autoscaling_policy = <<EOF
{
"Constraints": {
  "MinCapacity": 1,
  "MaxCapacity": 2
},
"Rules": [
  {
    "Name": "ScaleOutMemoryPercentage",
    "Description": "Scale out if YARNMemoryAvailablePercentage is less than 15",
    "Action": {
      "SimpleScalingPolicyConfiguration": {
        "AdjustmentType": "CHANGE_IN_CAPACITY",
        "ScalingAdjustment": 1,
        "CoolDown": 300
      }
    },
    "Trigger": {
      "CloudWatchAlarmDefinition": {
        "ComparisonOperator": "LESS_THAN",
        "EvaluationPeriods": 1,
        "MetricName": "YARNMemoryAvailablePercentage",
        "Namespace": "AWS/ElasticMapReduce",
        "Period": 300,
        "Statistic": "AVERAGE",
        "Threshold": 15.0,
        "Unit": "PERCENT"
      }
    }
  }
]
}
EOF
  }

  ebs_root_volume_size = 100

  tags = {
    role = "rolename"
    env  = "env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations_json = <<EOF
  [
    {
      "Classification": "hadoop-env",
      "Configurations": [
        {
          "Classification": "export",
          "Properties": {
            "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties": {}
    },
    {
      "Classification": "spark-env",
      "Configurations": [
        {
          "Classification": "export",
          "Properties": {
            "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties": {}
    }
  ]
EOF

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
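
Task nodes are not declared in the resource above. As a minimal sketch, a task instance group could be attached to that cluster with the aws_emr_instance_group resource. The group name, instance type, and count below are illustrative assumptions, not recommendations.

resource "aws_emr_instance_group" "task" {
  # Attach the group to the cluster defined in the example above
  cluster_id     = "${aws_emr_cluster.cluster.id}"
  name           = "example-task-group" # hypothetical name
  instance_type  = "m4.large"           # illustrative choice
  instance_count = 1
}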

The aws_emr_cluster resource typically requires two IAM roles: one that the EMR service assumes to manage the cluster, and another that is attached to the cluster instances (through an instance profile) so that they can interact with AWS. The suggested role policy template for the EMR service role is AmazonElasticMapReduceRole, and AmazonElasticMapReduceforEC2Role for the EC2 instance profile role. See the Getting Started guide for more information on these IAM roles. There is also a fully-bootable example Terraform configuration at the bottom of this page.
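
As a sketch of that suggestion, the two AWS managed policies could be attached with aws_iam_role_policy_attachment instead of inline policies. This assumes roles named iam_emr_service_role and iam_emr_profile_role, as defined in the bootable example on this page; the attachment resource names are hypothetical.

# Managed policy for the role assumed by the EMR service
resource "aws_iam_role_policy_attachment" "emr_service" {
  role       = "${aws_iam_role.iam_emr_service_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"
}

# Managed policy for the role behind the EC2 instance profile
resource "aws_iam_role_policy_attachment" "emr_profile" {
  role       = "${aws_iam_role.iam_emr_profile_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
}

Managed policies are maintained by AWS and may be broader than a given cluster needs, so a narrower custom policy can be the better choice in production.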

» Enable Debug Logging

Debug logging in EMR is implemented as a step. If other steps are managed outside of Terraform, using the lifecycle configuration block with ignore_changes is highly recommended so that Terraform does not try to revert them.

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  step {
    action_on_failure = "TERMINATE_CLUSTER"
    name              = "Setup Hadoop Debugging"

    hadoop_jar_step {
      jar  = "command-runner.jar"
      args = ["state-pusher-script"]
    }
  }

  # Optional: ignore outside changes to running cluster steps
  lifecycle {
    ignore_changes = ["step"]
  }
}

» Multiple Node Master Instance Group

Available in EMR version 5.23.0 and later, an EMR Cluster can be launched with three master nodes for high availability. Additional information about this functionality and its requirements can be found in the EMR Management Guide.

# This configuration is for illustrative purposes and highlights
# only relevant configurations for working with this functionality.

# Map public IP on launch must be enabled for public (Internet accessible) subnets
resource "aws_subnet" "example" {
  # ... other configuration ...

  map_public_ip_on_launch = true
}

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  # EMR version must be 5.23.0 or later
  release_label = "emr-5.24.1"

  # Termination protection is automatically enabled for multiple masters
  # To destroy the cluster, this must be configured to false and applied first
  termination_protection = true

  ec2_attributes {
    # ... other configuration ...

    subnet_id = "${aws_subnet.example.id}"
  }

  master_instance_group {
    # ... other configuration ...

    # Master instance count must be set to 3
    instance_count = 3
  }

  # core_instance_group must be configured
  core_instance_group {
    # ... other configuration ...
  }
}
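
Because termination protection must be set to false and applied before the cluster can be destroyed, one possible approach is to drive the argument from a variable. This is only a sketch: the variable name emr_termination_protection is hypothetical, and the two commands in the comments are one way to sequence the teardown.

# Hypothetical variable controlling termination protection
variable "emr_termination_protection" {
  description = "Set to false and apply before destroying the cluster"
  default     = "true"
}

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  termination_protection = "${var.emr_termination_protection}"
}

# Teardown in two steps:
#   terraform apply   -var 'emr_termination_protection=false'
#   terraform destroy -var 'emr_termination_protection=false'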

» Argument Reference

The following arguments are supported:

  • name - (Required) The name of the job flow
  • release_label - (Required) The release label for the Amazon EMR release
  • master_instance_group - (Optional) Configuration block to use an Instance Group for the master node type. Cannot be specified if master_instance_type argument or instance_group configuration blocks are set. Detailed below.
  • master_instance_type - (Optional, DEPRECATED) Use the master_instance_group configuration block instance_type argument instead. The EC2 instance type of the master node. Cannot be specified if master_instance_group or instance_group configuration blocks are set.
  • scale_down_behavior - (Optional) The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized.
  • additional_info - (Optional) A JSON string for selecting additional features such as adding proxy information. Note: There is currently no API to retrieve the value of this argument after the EMR cluster is created, so Terraform cannot detect drift if the value is changed outside of Terraform.
  • service_role - (Required) IAM role that will be assumed by the Amazon EMR service to access AWS resources
  • security_configuration - (Optional) The security configuration name to attach to the EMR cluster. Only valid for EMR clusters with release_label 4.8.0 or greater
  • core_instance_group - (Optional) Configuration block to use an Instance Group for the core node type. Cannot be specified if core_instance_count argument, core_instance_type argument, or instance_group configuration blocks are set. Detailed below.
  • core_instance_type - (Optional, DEPRECATED) Use the core_instance_group configuration block instance_type argument instead. The EC2 instance type of the slave nodes. Cannot be specified if core_instance_group or instance_group configuration blocks are set.
  • core_instance_count - (Optional, DEPRECATED) Use the core_instance_group configuration block instance_count argument instead. Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster's master node and use the remainder of the nodes (core_instance_count-1) as core nodes. Cannot be specified if core_instance_group or instance_group configuration blocks are set. Default 1
  • instance_group - (Optional, DEPRECATED) Use the master_instance_group configuration block, core_instance_group configuration block and aws_emr_instance_group resource(s) instead. A list of instance_group objects for each instance group in the cluster. Exactly one of master_instance_type and instance_group must be specified. If instance_group is set, then it must contain a configuration block for at least the MASTER instance group type (as well as any additional instance groups). Cannot be specified if master_instance_group or core_instance_group configuration blocks are set. Defined below
  • log_uri - (Optional) S3 bucket to write the log files of the job flow. If a value is not provided, logs are not created
  • applications - (Optional) A list of applications for the cluster. Valid values are: Flink, Hadoop, Hive, Mahout, Pig, Spark, and JupyterHub (as of EMR 5.14.0). Case insensitive
  • termination_protection - (Optional) Switch on/off termination protection (default is false, except when using multiple master nodes). Before attempting to destroy the resource when termination protection is enabled, this configuration must be applied with its value set to false.
  • keep_job_flow_alive_when_no_steps - (Optional) Whether to keep the cluster running when it has no steps or when all steps are complete. Defaults to true
  • ec2_attributes - (Optional) Attributes for the EC2 instances running the job flow. Defined below
  • kerberos_attributes - (Optional) Kerberos configuration for the cluster. Defined below
  • ebs_root_volume_size - (Optional) Size in GiB of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Available in Amazon EMR version 4.x and later.
  • custom_ami_id - (Optional) A custom Amazon Linux AMI for the cluster (instead of an EMR-owned AMI). Available in Amazon EMR version 5.7.0 and later.
  • bootstrap_action - (Optional) List of bootstrap actions that will be run before Hadoop is started on the cluster nodes. Defined below
  • configurations - (Optional) List of configurations supplied for the EMR cluster you are creating
  • configurations_json - (Optional) A JSON string for supplying list of configurations for the EMR cluster.
configurations_json = <<EOF
  [
    {
      "Classification": "hadoop-env",
      "Configurations": [
        {
          "Classification": "export",
          "Properties": {
            "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties": {}
    }
  ]
EOF
  • visible_to_all_users - (Optional) Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default true
  • autoscaling_role - (Optional) An IAM role for automatic scaling policies. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate EC2 instances in an instance group.
  • step - (Optional) List of steps to run when creating the cluster. Defined below. It is highly recommended to utilize the lifecycle configuration block with ignore_changes if other steps are being managed outside of Terraform. This argument is processed in attribute-as-blocks mode.
  • tags - (Optional) Map of tags to apply to the EMR Cluster
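
Several of the optional arguments above do not appear in the examples on this page. The following sketch shows how they might be combined; the S3 bucket name and the aws_iam_role.emr_autoscaling_role resource are assumptions introduced for illustration.

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  # Hypothetical bucket; without log_uri no logs are created
  log_uri = "s3://example-emr-logs/cluster-logs/"

  # Let running tasks finish before an instance is removed on scale-in
  scale_down_behavior = "TERMINATE_AT_TASK_COMPLETION"

  # Hypothetical role used by automatic scaling policies
  autoscaling_role = "${aws_iam_role.emr_autoscaling_role.arn}"

  visible_to_all_users = true
}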

» core_instance_group Configuration Block

Supported arguments for the core_instance_group configuration block:

  • instance_type - (Required) EC2 instance type for all instances in the instance group.
  • autoscaling_policy - (Optional) String containing the EMR Auto Scaling Policy JSON.
  • bid_price - (Optional) Bid price for each EC2 instance in the instance group, expressed in USD. By setting this attribute, the instance group is being declared as a Spot Instance, and will implicitly create a Spot request. Leave this blank to use On-Demand Instances.
  • ebs_config - (Optional) Configuration block(s) for EBS volumes attached to each instance in the instance group. Detailed below.
  • instance_count - (Optional) Target number of instances for the instance group. Must be at least 1. Defaults to 1.
  • name - (Optional) Friendly name given to the instance group.

» ec2_attributes

Attributes for the Amazon EC2 instances running the job flow

  • key_name - (Optional) Amazon EC2 key pair that can be used to ssh to the master node as the user called hadoop
  • subnet_id - (Optional) VPC subnet id where you want the job flow to launch. Cannot specify the cc1.4xlarge instance type for nodes of a job flow launched in an Amazon VPC
  • additional_master_security_groups - (Optional) String containing a comma separated list of additional Amazon EC2 security group IDs for the master node
  • additional_slave_security_groups - (Optional) String containing a comma separated list of additional Amazon EC2 security group IDs for the slave nodes
  • emr_managed_master_security_group - (Optional) Identifier of the Amazon EC2 EMR-Managed security group for the master node
  • emr_managed_slave_security_group - (Optional) Identifier of the Amazon EC2 EMR-Managed security group for the slave nodes
  • service_access_security_group - (Optional) Identifier of the Amazon EC2 service-access security group - required when the cluster runs on a private subnet
  • instance_profile - (Required) Instance Profile for the EC2 instances of the cluster. The instances assume the role attached to this profile
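
Note that the additional security group arguments take a single comma separated string rather than a list. A minimal sketch, assuming a hypothetical key pair name and hypothetical security groups named extra_a, extra_b, and service_access:

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  ec2_attributes {
    key_name         = "example-key-pair" # hypothetical key pair
    subnet_id        = "${aws_subnet.main.id}"
    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"

    # One string containing both IDs, separated by a comma
    additional_master_security_groups = "${aws_security_group.extra_a.id},${aws_security_group.extra_b.id}"

    # Needed when the cluster runs on a private subnet
    service_access_security_group = "${aws_security_group.service_access.id}"
  }
}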

» kerberos_attributes

Attributes for Kerberos configuration

  • ad_domain_join_password - (Optional) The Active Directory password for ad_domain_join_user. Terraform cannot perform drift detection of this configuration.
  • ad_domain_join_user - (Optional) Required only when establishing a cross-realm trust with an Active Directory domain. A user with sufficient privileges to join resources to the domain. Terraform cannot perform drift detection of this configuration.
  • cross_realm_trust_principal_password - (Optional) Required only when establishing a cross-realm trust with a KDC in a different realm. The cross-realm principal password, which must be identical across realms. Terraform cannot perform drift detection of this configuration.
  • kdc_admin_password - (Required) The password used within the cluster for the kadmin service on the cluster-dedicated KDC, which maintains Kerberos principals, password policies, and keytabs for the cluster. Terraform cannot perform drift detection of this configuration.
  • realm - (Required) The name of the Kerberos realm to which all nodes in a cluster belong. For example, EC2.INTERNAL
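
A minimal sketch of a cluster-dedicated KDC setup using only the two required arguments. The variable kdc_admin_password and the security configuration name are hypothetical, and the sketch assumes a security configuration with Kerberos enabled already exists under that name.

# Hypothetical variable so the password is not hard-coded
variable "kdc_admin_password" {}

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  # Assumed to be an existing security configuration with Kerberos enabled
  security_configuration = "example-kerberos-config"

  kerberos_attributes {
    kdc_admin_password = "${var.kdc_admin_password}"
    realm              = "EC2.INTERNAL"
  }

  # The API does not return these values, so ignore perpetual differences
  lifecycle {
    ignore_changes = ["kerberos_attributes"]
  }
}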

» instance_group

Attributes for each instance group in the cluster

  • instance_role - (Required) The role of the instance group in the cluster. Valid values are: MASTER, CORE, and TASK.
  • instance_type - (Required) The EC2 instance type for all instances in the instance group
  • instance_count - (Optional) Target number of instances for the instance group
  • name - (Optional) Friendly name given to the instance group
  • bid_price - (Optional) If set, the bid price for each EC2 instance in the instance group, expressed in USD. By setting this attribute, the instance group is being declared as a Spot Instance, and will implicitly create a Spot request. Leave this blank to use On-Demand Instances.
  • ebs_config - (Optional) A list of attributes for the EBS volumes attached to each instance in the instance group. Each ebs_config defined will result in additional EBS volumes being attached to each instance in the instance group. Defined below
  • autoscaling_policy - (Optional) The autoscaling policy document. This is a JSON formatted string. See EMR Auto Scaling

» master_instance_group Configuration Block

Supported nested arguments for the master_instance_group configuration block:

  • instance_type - (Required) EC2 instance type for all instances in the instance group.
  • bid_price - (Optional) Bid price for each EC2 instance in the instance group, expressed in USD. By setting this attribute, the instance group is being declared as a Spot Instance, and will implicitly create a Spot request. Leave this blank to use On-Demand Instances.
  • ebs_config - (Optional) Configuration block(s) for EBS volumes attached to each instance in the instance group. Detailed below.
  • instance_count - (Optional) Target number of instances for the instance group. Must be 1 or 3. Defaults to 1. Launching with multiple master nodes is only supported in EMR version 5.23.0+, and requires this resource's core_instance_group to be configured. Public (Internet accessible) instances must be created in VPC subnets that have map public IP on launch enabled. Termination protection is automatically enabled when launched with multiple master nodes and Terraform must have the termination_protection = false configuration applied before destroying this resource.
  • name - (Optional) Friendly name given to the instance group.

» ebs_config

Attributes for the EBS volumes attached to each EC2 instance in the instance_group

  • size - (Required) The volume size, in gibibytes (GiB).
  • type - (Required) The volume type. Valid options are gp2, io1, standard and st1. See EBS Volume Types.
  • iops - (Optional) The number of I/O operations per second (IOPS) that the volume supports
  • volumes_per_instance - (Optional) The number of EBS volumes with this configuration to attach to each EC2 instance in the instance group (default is 1)
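
The example at the top of this page uses a gp2 volume. As a further sketch, a Provisioned IOPS configuration might look like the following; the instance type, size, and IOPS figures are illustrative assumptions only.

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  core_instance_group {
    instance_type = "m4.large" # illustrative choice

    ebs_config {
      size                 = 100  # GiB per volume
      type                 = "io1"
      iops                 = 1000 # illustrative value for the io1 volume
      volumes_per_instance = 2    # two such volumes on every instance
    }
  }
}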

» bootstrap_action

  • name - (Required) Name of the bootstrap action
  • path - (Required) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system
  • args - (Optional) List of command line arguments to pass to the bootstrap action script

» step

Attributes for step configuration

  • action_on_failure - (Required) The action to take if the step fails. Valid values: TERMINATE_JOB_FLOW, TERMINATE_CLUSTER, CANCEL_AND_WAIT, and CONTINUE
  • hadoop_jar_step - (Required) The JAR file used for the step. Defined below.
  • name - (Required) The name of the step.

» hadoop_jar_step

Attributes for Hadoop job step configuration

  • args - (Optional) List of command line arguments passed to the JAR file's main function when executed.
  • jar - (Required) Path to a JAR file run during the step.
  • main_class - (Optional) Name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file.
  • properties - (Optional) Key-Value map of Java properties that are set when the step runs. You can use these properties to pass key value pairs to your main function.
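
The debug logging example earlier uses only jar and args. A sketch of a step that also sets main_class and properties could look like this; the S3 paths, class name, and property key are hypothetical.

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  step {
    action_on_failure = "CONTINUE"
    name              = "Example custom JAR step"

    hadoop_jar_step {
      jar        = "s3://example-bucket/jars/example-job.jar" # hypothetical JAR
      main_class = "com.example.Main"                         # hypothetical class
      args       = ["--input", "s3://example-bucket/input/"]

      # Java properties set when the step runs
      properties = {
        example_key = "example-value"
      }
    }
  }
}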

» Attributes Reference

In addition to all arguments above, the following attributes are exported:

  • id - The ID of the EMR Cluster
  • name - The name of the cluster.
  • release_label - The release label for the Amazon EMR release.
  • master_instance_group.0.id - Master node type Instance Group ID, if using Instance Group for this node type.
  • master_public_dns - The public DNS name of the master EC2 instance.
  • core_instance_group.0.id - Core node type Instance Group ID, if using Instance Group for this node type.
  • log_uri - The path to the Amazon S3 location where logs for this cluster are stored.
  • applications - The applications installed on this cluster.
  • ec2_attributes - Provides information about the EC2 instances in a cluster grouped by category: key name, subnet ID, IAM instance profile, and so on.
  • bootstrap_action - A list of bootstrap actions that will be run before Hadoop is started on the cluster nodes.
  • configurations - The list of Configurations supplied to the EMR cluster.
  • service_role - The IAM role that will be assumed by the Amazon EMR service to access AWS resources on your behalf.
  • visible_to_all_users - Indicates whether the job flow is visible to all IAM users of the AWS account associated with the job flow.
  • tags - The map of tags associated with a cluster.
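
As a brief sketch, exported attributes can be surfaced through outputs. The output names are hypothetical, and aws_emr_cluster.cluster refers to the resource from the example usage above.

output "emr_cluster_id" {
  value = "${aws_emr_cluster.cluster.id}"
}

output "emr_master_public_dns" {
  value = "${aws_emr_cluster.cluster.master_public_dns}"
}

# Only populated when core_instance_group is used for the core node type
output "emr_core_instance_group_id" {
  value = "${aws_emr_cluster.cluster.core_instance_group.0.id}"
}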

» Example bootable config

NOTE: This configuration demonstrates a minimal configuration needed to boot an example EMR Cluster. It is not meant to display best practices. Please use at your own risk.

provider "aws" {
  region = "us-west-2"
}

resource "aws_emr_cluster" "cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.allow_access.id}"
    emr_managed_slave_security_group  = "${aws_security_group.allow_access.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m5.xlarge"
  core_instance_type   = "m5.xlarge"
  core_instance_count  = 1

  tags = {
    role     = "rolename"
    dns_zone = "env_zone"
    env      = "env"
    name     = "name-env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations_json = <<EOF
  [
    {
      "Classification": "hadoop-env",
      "Configurations": [
        {
          "Classification": "export",
          "Properties": {
            "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties": {}
    },
    {
      "Classification": "spark-env",
      "Configurations": [
        {
          "Classification": "export",
          "Properties": {
            "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties": {}
    }
  ]
EOF

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}

resource "aws_security_group" "allow_access" {
  name        = "allow_access"
  description = "Allow inbound traffic"
  vpc_id      = "${aws_vpc.main.id}"

  ingress {
    # these ports should be locked down
    from_port   = 0
    to_port     = 0
    protocol    = "-1"

    # we do not recommend opening your cluster to 0.0.0.0/0
    cidr_blocks = ["203.0.113.10/32"] # placeholder address: replace with your own IP
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = ["aws_subnet.main"]

  lifecycle {
    ignore_changes = ["ingress", "egress"]
  }

  tags = {
    name = "emr_test"
  }
}

resource "aws_vpc" "main" {
  cidr_block           = "168.31.0.0/16"
  enable_dns_hostnames = true

  tags = {
    name = "emr_test"
  }
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "168.31.0.0/20"

  tags = {
    name = "emr_test"
  }
}

resource "aws_internet_gateway" "gw" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_route_table" "r" {
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.gw.id}"
  }
}

resource "aws_main_route_table_association" "a" {
  vpc_id         = "${aws_vpc.main.id}"
  route_table_id = "${aws_route_table.r.id}"
}

###

# IAM Role setups

###

# IAM role for EMR Service
resource "aws_iam_role" "iam_emr_service_role" {
  name = "iam_emr_service_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_emr_service_policy" {
  name = "iam_emr_service_policy"
  role = "${aws_iam_role.iam_emr_service_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "ec2:AuthorizeSecurityGroupEgress",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CancelSpotInstanceRequests",
            "ec2:CreateNetworkInterface",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:DeleteNetworkInterface",
            "ec2:DeleteSecurityGroup",
            "ec2:DeleteTags",
            "ec2:DescribeAvailabilityZones",
            "ec2:DescribeAccountAttributes",
            "ec2:DescribeDhcpOptions",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInstances",
            "ec2:DescribeKeyPairs",
            "ec2:DescribeNetworkAcls",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribePrefixLists",
            "ec2:DescribeRouteTables",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSpotInstanceRequests",
            "ec2:DescribeSpotPriceHistory",
            "ec2:DescribeSubnets",
            "ec2:DescribeVpcAttribute",
            "ec2:DescribeVpcEndpoints",
            "ec2:DescribeVpcEndpointServices",
            "ec2:DescribeVpcs",
            "ec2:DetachNetworkInterface",
            "ec2:ModifyImageAttribute",
            "ec2:ModifyInstanceAttribute",
            "ec2:RequestSpotInstances",
            "ec2:RevokeSecurityGroupEgress",
            "ec2:RunInstances",
            "ec2:TerminateInstances",
            "ec2:DeleteVolume",
            "ec2:DescribeVolumeStatus",
            "ec2:DescribeVolumes",
            "ec2:DetachVolume",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListInstanceProfiles",
            "iam:ListRolePolicies",
            "iam:PassRole",
            "s3:CreateBucket",
            "s3:Get*",
            "s3:List*",
            "sdb:BatchPutAttributes",
            "sdb:Select",
            "sqs:CreateQueue",
            "sqs:Delete*",
            "sqs:GetQueue*",
            "sqs:PurgeQueue",
            "sqs:ReceiveMessage"
        ]
    }]
}
EOF
}

# IAM Role for EC2 Instance Profile
resource "aws_iam_role" "iam_emr_profile_role" {
  name = "iam_emr_profile_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "emr_profile" {
  name = "emr_profile"
  role = "${aws_iam_role.iam_emr_profile_role.name}"
}

resource "aws_iam_role_policy" "iam_emr_profile_policy" {
  name = "iam_emr_profile_policy"
  role = "${aws_iam_role.iam_emr_profile_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "cloudwatch:*",
            "dynamodb:*",
            "ec2:Describe*",
            "elasticmapreduce:Describe*",
            "elasticmapreduce:ListBootstrapActions",
            "elasticmapreduce:ListClusters",
            "elasticmapreduce:ListInstanceGroups",
            "elasticmapreduce:ListInstances",
            "elasticmapreduce:ListSteps",
            "kinesis:CreateStream",
            "kinesis:DeleteStream",
            "kinesis:DescribeStream",
            "kinesis:GetRecords",
            "kinesis:GetShardIterator",
            "kinesis:MergeShards",
            "kinesis:PutRecord",
            "kinesis:SplitShard",
            "rds:Describe*",
            "s3:*",
            "sdb:*",
            "sns:*",
            "sqs:*"
        ]
    }]
}
EOF
}

» Import

EMR clusters can be imported using the id, e.g.

$ terraform import aws_emr_cluster.cluster j-123456ABCDEF

Since the API does not return the actual values for Kerberos configurations, environments with those Terraform configurations will need to use the lifecycle configuration block ignore_changes argument available to all Terraform resources to prevent perpetual differences, e.g.

resource "aws_emr_cluster" "example" {
  # ... other configuration ...

  lifecycle {
    ignore_changes = ["kerberos_attributes"]
  }
}