AWS Certificate Manager as a Service Mesh Certificate Authority

Consul can be used with AWS Certificate Manager (ACM) Private Certificate Authority (CA) to manage and sign certificates.

This page documents the specifics of the AWS ACM Private CA provider. Please read the certificate management overview page first to understand how Consul manages certificates with configurable CA providers.

Requirements

The ACM Private CA Provider was added in Consul 1.7.0.

The ACM Private CA Provider needs to be authorized via IAM credentials to perform operations. Every Consul server needs to be running in an environment where a suitable IAM configuration is present.

The standard AWS SDK credential locations are used, which means that suitable credentials and region configuration need to be present in one of the following:

  1. Environment variables
  2. Shared credentials file
  3. EC2 instance role

The IAM credential provided must have permission for the following actions:

  • CreateCertificateAuthority - required only when an existing CA is not specified in existing_arn
  • DescribeCertificateAuthority
  • GetCertificate
  • IssueCertificate
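
As a rough illustration, the actions listed above could be assembled into an IAM policy document like the one this Python sketch prints. The "acm-pca:" action prefix, the wildcard resource, and the helper name build_policy are assumptions made for the example and are not taken from this page; a real deployment may need a narrower resource or further actions.

```python
import json

# Actions from the list above, written with the "acm-pca:" IAM service prefix
# (assumption: the prefix is not stated on this page).
REQUIRED_ACTIONS = [
    "acm-pca:CreateCertificateAuthority",  # only needed when existing_arn is not set
    "acm-pca:DescribeCertificateAuthority",
    "acm-pca:GetCertificate",
    "acm-pca:IssueCertificate",
]


def build_policy(resource="*", allow_create=True):
    """Return an IAM policy document (as a dict) granting the listed actions.

    Hypothetical helper: `resource` defaults to a wildcard for illustration only.
    """
    actions = [
        action
        for action in REQUIRED_ACTIONS
        if allow_create or not action.endswith(":CreateCertificateAuthority")
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": actions, "Resource": resource}
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before attaching it to a role.
    print(json.dumps(build_policy(), indent=2))
```

Passing allow_create=False drops CreateCertificateAuthority, which matches the case where an existing CA is supplied through existing_arn.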

Configuration

The ACM Private CA provider is enabled by setting the CA provider to "aws-pca" in the agent’s ca_provider configuration option, or via the /connect/ca/configuration API endpoint. At this time there is only one, optional configuration value.

Example configurations are shown below:

Service mesh CA configuration


/etc/consul.d/config.hcl

  # ...
  connect {
    enabled = true
    ca_provider = "aws-pca"
    ca_config {
      existing_arn = "arn:aws:acm-pca:region:account:certificate-authority/12345678-1234-1234-123456789012"
    }
  }

The equivalent payload for the /connect/ca/configuration API endpoint:

  {
    "Provider": "aws-pca",
    "Config": {
      "ExistingARN": "arn:aws:acm-pca:region:account:certificate-authority/12345678-1234-1234-123456789012"
    }
  }

Note: Suitable AWS IAM credentials are necessary for the provider to work. They are not set in the Consul configuration, which is typically stored on disk; the provider instead reads them from the standard AWS SDK configuration locations.
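
For the API route, a minimal Python sketch of the request might look like the following. It only builds the request and does not send it. The agent address, the /v1 path prefix, the use of PUT, and the helper name build_ca_config_request are assumptions made for the example; adjust them, and add any ACL token your cluster requires, before using it.

```python
import json
import urllib.request

CONSUL_ADDR = "http://127.0.0.1:8500"  # assumption: local agent on the default HTTP port


def build_ca_config_request(existing_arn=None, addr=CONSUL_ADDR):
    """Build (but do not send) a PUT request selecting the aws-pca provider.

    Hypothetical helper. ExistingARN is optional, so it is only included
    when a value is supplied.
    """
    config = {}
    if existing_arn is not None:
        config["ExistingARN"] = existing_arn
    body = json.dumps({"Provider": "aws-pca", "Config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{addr}/v1/connect/ca/configuration",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_ca_config_request()
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply it against a running agent: urllib.request.urlopen(req)
```

Note that no AWS credentials appear in the payload, in line with the note above.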

The configuration options are listed below.

Note: The first key is the value used in API calls, and the second key (after the /) is used if you are adding the configuration to the agent’s configuration file.

  • ExistingARN / existing_arn (string: <optional>) - The Amazon Resource Name (ARN) of an existing private CA in your ACM account. If specified, Consul will attempt to use the existing CA to issue certificates.

    • In the primary datacenter this ARN must identify a root CA. See limitations.
    • In a secondary datacenter, it must identify a subordinate CA signed by the same root used in the primary datacenter. If it is signed by another root, Consul will automatically create a new subordinate signed by the primary’s root instead.

    The default behavior with no ExistingARN specified is for Consul to create a new root CA in the primary datacenter and a subordinate CA in each secondary DC.

Common CA Config Options

The following configuration options are supported by all CA providers:

  • CSRMaxConcurrent / csr_max_concurrent (int: 0) - Sets a limit on the number of Certificate Signing Requests that can be processed concurrently. Defaults to 0 (disabled). This is useful when you want to limit the number of CPU cores available to the server for certificate signing operations. For example, on an 8 core server, setting this to 1 will ensure that no more than one CPU core will be consumed when generating or rotating certificates. Setting this is recommended instead of csr_max_per_second when you want to limit the number of cores consumed since it is simpler to reason about limiting CSR resources this way without artificially slowing down rotations. Added in 1.4.1.

  • CSRMaxPerSecond / csr_max_per_second (float: 50) - Sets a rate limit on the maximum number of Certificate Signing Requests (CSRs) the servers will accept. This is used to prevent CA rotation from causing unbounded CPU usage on servers. It defaults to 50 which is conservative – a 2017 MacBook can process about 100 per second using only ~40% of one CPU core – but sufficient for deployments up to ~1500 service instances before the time it takes to rotate is impacted. For larger deployments we recommend increasing this based on the expected number of server instances and server resources, or use csr_max_concurrent instead if servers have more than one CPU core. Setting this to zero disables rate limiting. Added in 1.4.1.

  • LeafCertTTL / leaf_cert_ttl (duration: "72h") - The upper bound on the lease duration of a leaf certificate issued for a service. In most cases a new leaf certificate will be requested by a proxy before this limit is reached. This is also the effective limit on how long a server outage can last (with no leader) before network connections will start being rejected. Defaults to 72h. This value cannot be lower than 1 hour or higher than 1 year.

    This value is also used when rotating out old root certificates from the cluster. When a root certificate has been inactive (rotated out) for more than twice the current leaf_cert_ttl, it will be removed from the trusted list.

  • RootCertTTL / root_cert_ttl (duration: "87600h") - The time to live (TTL) for a root certificate. Defaults to 87600h (10 years). This value, if provided, needs to be higher than the intermediate certificate TTL.

    This setting applies to all Consul CA providers.

    For the Vault provider, this value is only used if the backend is not initialized at first.

  • IntermediateCertTTL / intermediate_cert_ttl (duration: "8760h") - The time to live (TTL) for any intermediate certificates signed by the root certificate of the primary datacenter. This field is only valid in the primary datacenter. Defaults to 8760h (1 year).

    This setting applies to all Consul CA providers.

    For the Vault provider, this value is only used if the backend is not initialized at first.

  • PrivateKeyType / private_key_type (string: "ec") - The type of key to generate for this CA. This is only used when the provider is generating a new key. If private_key is set for the Consul provider, or existing root or intermediate PKI paths given for Vault then this will be ignored. Currently supported options are ec or rsa. Default is ec.

    It is required that all servers in a datacenter have the same config for the CA. It is recommended that servers in different datacenters use the same key type and size, although the built-in CA and Vault provider will both allow mixed CA key types.

    Some CA providers (currently Vault) will not allow cross-signing a new CA certificate with a different key type. This means that if you migrate from an RSA-keyed Vault CA to an EC-keyed CA from any provider, you may have to proceed without cross-signing, which risks temporary connection issues for workloads during the new certificate rollout. We highly recommend testing this outside of production to understand the impact, and suggest sticking to the same key type where possible.

    Note: This only affects CA keys generated by the provider. Leaf certificate keys are always EC 256 regardless of the CA configuration.

  • PrivateKeyBits / private_key_bits (string: "") - The length of key to generate for this CA. This is only used when the provider is generating a new key. If private_key is set for the Consul provider, or existing root or intermediate PKI paths given for Vault then this will be ignored.

    Currently supported values are:

Limitations

ACM Private CA has several limits that restrict how fast certificates can be issued. This may impact how quickly large clusters can rotate all issued certificates.

Currently, the ACM Private CA provider for service mesh has some additional limitations described below.

Unable to Cross-sign Other CAs

It is not possible to cross-sign other CA providers’ root certificates during a migration. ACM Private CA can do this through a different workflow, but it cannot blindly cross-sign another root certificate without a CSR being generated. Both Consul’s built-in CA and Vault can do this, and the current workflow for managing CAs relies on it.

For now, the limitation means that once ACM Private CA is configured as the CA provider, it is not possible to reconfigure a different CA provider, or rotate the root CA key without potentially observing some transient connection failures. See the section on forced rotation without cross-signing for more details.

Primary DC Must be a Root CA

Currently, if an existing ACM Private CA is used, the primary DC must use a Root CA directly to issue certificates.

Cost Planning

To help estimate costs, an example is provided below of the resources that would be used.

This is intended to illustrate the behavior of the CA for cost planning purposes. Please refer to the pricing for ACM Private CA for actual cost information.

Assume the following Consul datacenters exist and are configured to use ACM Private CA as their service mesh CA with the default leaf certificate lifetime of 72 hours:

Datacenter   Primary   CA Resource Created   Number of service instances
dc1          yes       1 ROOT                100
dc2          no        1 SUBORDINATE         50
dc3          no        1 SUBORDINATE         500

Leaf certificates are valid for 72 hours but are refreshed when between 60% and 90% of their lifetime has elapsed. On average each certificate will be reissued every 54 hours (75% of its lifetime), or roughly 13.3 times per month.

So monthly cost would be calculated as:

  • 3 ⨉ Monthly CA cost, plus
  • 8,645 ⨉ Certificate Issue cost, made up of:
    • 100 ⨉ 13.3 = 1,330 certificates issued in dc1
    • 50 ⨉ 13.3 = 665 certificates issued in dc2
    • 500 ⨉ 13.3 = 6,650 certificates issued in dc3

The number of certificates issued could be reduced by increasing leaf_cert_ttl in the CA Provider configuration, if longer-lived credentials are an acceptable risk tradeoff against the cost.
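
The estimate above can be reproduced with a short Python sketch. It assumes a 30-day month and that each certificate renews at the midpoint (75%) of the 60% to 90% window, and it rounds the renewal rate to one decimal place as the text does. The function names are hypothetical, and both prices are caller-supplied placeholders rather than actual ACM Private CA pricing.

```python
HOURS_PER_MONTH = 720    # assumption: a 30-day month
RENEWAL_MIDPOINT = 0.75  # midpoint of the 60%-90% renewal window

# Service instances per datacenter, taken from the example table.
DATACENTERS = {"dc1": 100, "dc2": 50, "dc3": 500}


def renewals_per_month(leaf_cert_ttl_hours=72.0):
    """Average number of times one leaf certificate is reissued per month."""
    return round(HOURS_PER_MONTH / (leaf_cert_ttl_hours * RENEWAL_MIDPOINT), 1)


def certificates_per_month(datacenters=DATACENTERS, leaf_cert_ttl_hours=72.0):
    """Estimated certificates issued per month, keyed by datacenter."""
    rate = renewals_per_month(leaf_cert_ttl_hours)
    return {dc: round(instances * rate) for dc, instances in datacenters.items()}


def monthly_cost(ca_price, cert_price, datacenters=DATACENTERS, leaf_cert_ttl_hours=72.0):
    """Total monthly cost: one CA per datacenter plus every certificate issued.

    ca_price and cert_price are placeholders to be filled in from the
    published pricing.
    """
    issued = sum(certificates_per_month(datacenters, leaf_cert_ttl_hours).values())
    return len(datacenters) * ca_price + issued * cert_price


if __name__ == "__main__":
    per_dc = certificates_per_month()
    print(per_dc, "total:", sum(per_dc.values()))
    # Doubling leaf_cert_ttl roughly halves the number of certificates issued.
    print(certificates_per_month(leaf_cert_ttl_hours=144.0))
```

Calling certificates_per_month with a larger leaf_cert_ttl_hours shows how much a longer leaf certificate lifetime reduces issuance before weighing that against the added risk.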