Connectivity for High Availability
Proper networking configuration is required for failover to be transparent to clients and Workers when using AWS PrivateLink or GCP Private Service Connect.
This page covers single-cloud HA (both replicas on AWS, or both on GCP) and multi-cloud HA (one replica on AWS, one on GCP).
These instructions assume you already have the private connections in place. If not, follow the AWS PrivateLink or GCP Private Service Connect creation guides first.
How HA + private connectivity works
A Namespace with High Availability features has two replicas — a primary and a secondary, in different regions or different cloud providers. At any moment, one is active and one is passive. On failover, Temporal Cloud changes the active replica.
Temporal Cloud expresses the active replica through DNS:
- The Namespace DNS record (<ns>.<account>.tmprl.cloud) is a CNAME.
- It points to the active region's regional record (<provider>-<region>.region.tmprl.cloud).
- On failover, Temporal Cloud rewrites the CNAME target.
Namespace DNS records have a 15-second TTL. Clients should converge to the new region within roughly 30 seconds (about twice the TTL) once their resolver cache expires.
For private connectivity, your job is to make sure that:
- Both regions resolve to the correct private endpoint inside your network — not the public internet.
- Your Workers have a network path to whichever region becomes active.
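You can see this resolution chain from a Worker host with `dig` (a sketch; ha-namespace.account-id.tmprl.cloud is the placeholder Namespace name used on this page):

```shell
# Follow the CNAME chain for the Namespace record. The answer section shows
# ha-namespace.account-id.tmprl.cloud -> <provider>-<region>.region.tmprl.cloud
# -> whatever your resolver returns for the active region.
dig +noall +answer ha-namespace.account-id.tmprl.cloud

# Print only the regional record the Namespace currently points at.
dig +short ha-namespace.account-id.tmprl.cloud CNAME
```

From inside a Worker VPC with the overrides in place, the final answer should be your private endpoint, not a public address.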
Single-cloud HA on AWS PrivateLink
This is the most common setup: both replicas live in AWS regions, and Workers connect via AWS PrivateLink.
When using PrivateLink, you connect to Temporal Cloud through a VPC Endpoint, which uses addresses local to your network.
All regional records live under the region.tmprl.cloud zone, so you override resolution per region by hosting a private zone for region.tmprl.cloud inside your network.
Before failover, with the active region being aws-us-west-2:
| Record name | Record type | Value |
|---|---|---|
| ha-namespace.account-id.tmprl.cloud | CNAME | aws-us-west-2.region.tmprl.cloud |
After a failover to aws-us-east-1, Temporal Cloud rewrites the CNAME:
| Record name | Record type | Value |
|---|---|---|
| ha-namespace.account-id.tmprl.cloud | CNAME | aws-us-east-1.region.tmprl.cloud |
The Temporal-managed CNAME changed from us-west-2 to us-east-1 — your private DNS does not need to change.

Setting up the DNS override (AWS)
In AWS, use a Route 53 private hosted zone for region.tmprl.cloud to override resolution per region:
| Record name | Record type | Value (your VPC Endpoint DNS) |
|---|---|---|
| aws-us-west-2.region.tmprl.cloud | CNAME | vpce-...-us-west-2.vpce.amazonaws.com |
| aws-us-east-1.region.tmprl.cloud | CNAME | vpce-...-us-east-1.vpce.amazonaws.com |
Link the private zone to every VPC where Workers run.
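As a sketch of the setup with the AWS CLI (the VPC ID, hosted zone ID, and VPC Endpoint DNS name below are placeholders — substitute your own):

```shell
# Create a private hosted zone for region.tmprl.cloud, associated with a Worker VPC.
aws route53 create-hosted-zone \
  --name region.tmprl.cloud \
  --caller-reference "tmprl-ha-$(date +%s)" \
  --hosted-zone-config Comment="Temporal HA overrides",PrivateZone=true \
  --vpc VPCRegion=us-west-2,VPCId=vpc-0123456789abcdef0

# UPSERT the per-region override pointing at the local VPC Endpoint DNS name.
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "aws-us-west-2.region.tmprl.cloud",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "vpce-0abc-example-us-west-2.vpce.amazonaws.com"}]
      }
    }]
  }'
```

Additional Worker VPCs can be linked to the zone with `aws route53 associate-vpc-with-hosted-zone`.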
When your Workers connect to the Namespace, they first resolve <ns>.<account>.tmprl.cloud, which CNAMEs to <aws-active-region>.region.tmprl.cloud, which then resolves to your local VPC Endpoint.
You also need to decide how Workers reach whichever region becomes active. Either:
- Run Workers in both regions continuously (recommended), or
- Establish cross-region connectivity (Transit Gateway, VPC Peering) so Workers in one region can reach the VPC Endpoint in the other.
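If you take the cross-region route, verify reachability from a Worker host before you need it. A minimal check (the endpoint hostname is a placeholder; 7233 is Temporal Cloud's gRPC port):

```shell
# Check TCP reachability from a Worker in one region to the VPC Endpoint
# in the other region on the Temporal Cloud gRPC port.
nc -zv -w 5 vpce-0abc-example-us-east-1.vpce.amazonaws.com 7233
```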
Single-cloud HA on GCP Private Service Connect
For GCP-only HA, the same model applies, but use a Cloud DNS private zone for region.tmprl.cloud and point each gcp-<region>.region.tmprl.cloud record at the local PSC endpoint IP address.
| Record name | Record type | Value (your PSC endpoint IP) |
|---|---|---|
| gcp-us-central1.region.tmprl.cloud | A | 10.x.x.x (PSC endpoint IP) |
| gcp-us-east1.region.tmprl.cloud | A | 10.x.x.x (PSC endpoint IP) |
A Connectivity Rule is required for each PSC connection — see GCP PSC setup and Connectivity Rules.
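As a sketch with gcloud (the zone name, network name, and IP addresses are placeholders):

```shell
# Create a private Cloud DNS zone for region.tmprl.cloud, visible to the Worker network.
gcloud dns managed-zones create temporal-ha-overrides \
  --dns-name="region.tmprl.cloud." \
  --visibility=private \
  --networks=worker-network \
  --description="Temporal HA private endpoint overrides"

# Point each regional record at the local PSC endpoint IP.
gcloud dns record-sets create "gcp-us-central1.region.tmprl.cloud." \
  --zone=temporal-ha-overrides --type=A --ttl=300 --rrdatas=10.0.1.5
gcloud dns record-sets create "gcp-us-east1.region.tmprl.cloud." \
  --zone=temporal-ha-overrides --type=A --ttl=300 --rrdatas=10.0.2.5
```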
Multi-cloud HA (AWS PrivateLink + GCP Private Service Connect)
If your replicas span clouds — for example, AWS us-east-1 (active) and GCP us-east4 (passive) — your Workers need a way to reach the active replica regardless of which cloud it's in. The Temporal-managed CNAME rewrites still work the same way; the harder problems are on the client side.
Plan for these three things:
- DNS overrides for both clouds. Your private DNS for region.tmprl.cloud needs entries for both the AWS region (CNAME → AWS VPCE) and the GCP region (A → PSC IP). This typically means a Route 53 private hosted zone in your AWS Worker VPCs and a Cloud DNS private zone in your GCP Worker network — both for the same region.tmprl.cloud parent — each with the records relevant to the cloud the Workers run in.
- Worker reachability across clouds. Your AWS-resident Workers must be able to reach the GCP PSC endpoint when GCP is active, and vice versa. Options include:
  - Run Workers in both clouds (preferred — simplest, lowest latency, matches the failover model).
  - Establish cross-cloud connectivity (e.g., AWS Transit Gateway + GCP Cloud Interconnect, or a third-party transit) so Workers in one cloud can resolve and reach the other cloud's private endpoint.
- Connectivity Rules in both regions. GCP PSC requires a Connectivity Rule. AWS PrivateLink does not, but if you want to enforce private-only access, add one for the AWS side as well so the Namespace is private-only in both regions.
GCP Private Service Connect endpoints return only A (IPv4) records — there is no AAAA (IPv6) record. Most Linux distributions handle a missing AAAA gracefully, but Alpine Linux's musl resolver returns a SERVFAIL when AAAA is missing, which can cause Temporal SDK clients to fail name resolution after a failover from AWS to GCP.
If you run Workers on Alpine and use multi-cloud HA, either:
- Switch the Worker base image to a glibc-based distribution (Debian, Ubuntu, distroless), or
- Configure your application/runtime to disable AAAA lookups (e.g., set GODEBUG=netdns=go+v4 for Go, or prefer IPv4 in the Java/Node/Python runtimes you use).
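You can exercise the resolver code path before a real failover. A sketch, assuming Docker, a host that can see your private DNS zone, and a Worker image that ships a Python runtime (the image name is a placeholder):

```shell
# musl's getaddrinfo is the code path that matters, so test with the actual
# base image you ship rather than a standalone DNS utility. Python's
# socket.getaddrinfo calls straight into the image's libc resolver.
docker run --rm --network host your-worker-image:latest \
  python3 -c "import socket; print(socket.getaddrinfo('gcp-us-central1.region.tmprl.cloud', 7233))"
```

On a glibc image this prints the A-record results; on an affected musl image the call fails even though the A record exists.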
Test failover before you depend on it
Failover is the only thing High Availability features exist to do — and DNS, cross-region or cross-cloud reachability, and Connectivity Rule coverage are exactly the kinds of configuration that look correct on paper and break under failover. Test it in a non-production Namespace first.
A reasonable validation plan:
- Set up the HA Namespace and the private connectivity for both regions, including all DNS overrides.
- Run Workers continuously in both regions (or arrange cross-region connectivity).
- Trigger a manual failover from the Web UI or tcld and verify:
  - DNS for <ns>.<account>.tmprl.cloud resolves to the new region within ~30 seconds.
  - Workers in both regions are picking up tasks.
  - SDK clients connect successfully (no "Name resolution failed", "connection reset by peer", or "context deadline exceeded" errors).
- Trigger a failback to the original region and verify the same.
- For multi-cloud HA, repeat with each cloud as the active replica, including from base images (Alpine, distroless) you actually use in production.
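The DNS convergence check is easy to script. A sketch (the Namespace name and expected regional record are placeholders for your own values):

```shell
#!/bin/sh
# Poll until the Namespace CNAME points at the expected regional record,
# or give up after ~60 seconds (twice the expected 30-second convergence).
NS="ha-namespace.account-id.tmprl.cloud"
EXPECTED="aws-us-east-1.region.tmprl.cloud"

i=0
while [ "$i" -lt 12 ]; do
  active=$(dig +short "$NS" CNAME)
  case "$active" in
    "$EXPECTED"|"$EXPECTED.")
      echo "converged to $EXPECTED after $((i * 5))s"
      exit 0 ;;
  esac
  i=$((i + 1))
  sleep 5
done
echo "did not converge to $EXPECTED within 60s" >&2
exit 1
```

Run it from a Worker host in each region (and, for multi-cloud HA, each cloud) immediately after triggering the failover.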
If a real failover finds a configuration gap that wasn't tested, recovery typically requires changes on the client side that are hard to make under pressure.
Available regions, PrivateLink endpoints, and DNS record overrides
The sa-east-1 region is not yet available for use with Multi-region Namespaces; it is currently the only Temporal Cloud region in South America.
The following tables list the available Temporal regions and the DNS record overrides used for HA + private connectivity:
AWS regions and PrivateLink endpoints
GCP regions and Private Service Connect endpoints
When using a Namespace with High Availability features, the Namespace's DNS record <ns>.<account>.tmprl.cloud points to a regional DNS record in the format <provider>-<region>.region.tmprl.cloud, where <provider>-<region> is the currently active region for your Namespace.
During failover, Temporal Cloud changes the target of the Namespace DNS record from one region to another. Namespace DNS records are configured with a 15-second TTL. Any DNS cache should re-resolve the record within this time. As a rule of thumb, receiving an updated DNS record takes about twice (2x) the TTL — clients should converge to the newly targeted region within, at most, a 30-second delay, assuming their resolver and language runtime honor the TTL.