Automated Secret Rotation Patterns

Q: How often should secrets be rotated?

Rotate on a fixed schedule sized to your risk tolerance — commonly 30–90 days for static secrets, and continuously for dynamic credentials that expire on a short TTL. The key is that rotation is automated and routine, not a manual response to an incident.

Q: What breaks most often during rotation?

Long-lived connections and over-long caches. A connection pool opened with the old credential keeps using it; size the cache TTL below the overlap window and reconnect pools when the credential changes.

Rotation that only happens after a breach is not rotation. The goal is to make credential replacement routine and invisible: the secret changes on a schedule, and not a single request fails. The mechanism is a dual-credential overlap window — both the old and new credentials work simultaneously while traffic shifts over.

Rotation is the operational backbone of the enterprise secrets section. It applies whether the store is AWS Secrets Manager or Vault.
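
The overlap window described above can be sketched as a two-step sequence. This is a minimal illustration, assuming a hypothetical in-memory `CredentialStore`; it is not the API of AWS Secrets Manager or Vault, and the names `begin_rotation` and `finish_rotation` are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CredentialStore:
    """Hypothetical in-memory stand-in for a database or secrets backend."""

    valid: set[str] = field(default_factory=set)

    def provision(self, credential: str) -> None:
        self.valid.add(credential)

    def revoke(self, credential: str) -> None:
        self.valid.discard(credential)

    def accepts(self, credential: str) -> bool:
        return credential in self.valid


def begin_rotation(store: CredentialStore, new: str) -> None:
    """Step 1: provision the new credential; the old one keeps working."""
    store.provision(new)


def finish_rotation(store: CredentialStore, old: str) -> None:
    """Step 2: revoke the old credential once clients have switched over."""
    store.revoke(old)


store = CredentialStore({"old-password"})
begin_rotation(store, "new-password")
# Inside the overlap window both credentials are accepted.
overlap_ok = store.accepts("old-password") and store.accepts("new-password")
finish_rotation(store, "old-password")
```

Splitting rotation into two calls makes the ordering explicit: nothing is revoked until a separate, later step runs.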


Secure implementation

# secrets/rotation.py
import time
from pydantic import SecretStr

class RotatingSecret:
    """Serves a secret from cache, refreshing within the overlap window."""

    def __init__(self, fetch, ttl: int = 300):
        self._fetch = fetch          # callable returning the current SecretStr
        self._ttl = ttl              # must be shorter than the overlap window
        self._value: SecretStr | None = None
        self._loaded_at = 0.0

    def get(self) -> SecretStr:
        if self._value is None or time.monotonic() - self._loaded_at > self._ttl:
            self._value = self._fetch()        # picks up the rotated value
            self._loaded_at = time.monotonic()
        return self._value

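def reconnect_pool(pool, secret: RotatingSecret) -> None:
    # `pool.recreate` stands in for your driver's own rebuild call;
    # the secret is unwrapped only here, at the driver boundary.
    pool.recreate(password=secret.get().get_secret_value())  # rebuild on rotation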

The cache TTL is deliberately shorter than the store’s overlap window, so the application always picks up the new credential while the old one is still accepted.
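
One way to rebuild the pool only when the credential actually changes is to compare each fetched value with the last one seen. The sketch below is an assumption-laden illustration: `ChangeWatcher`, `poll`, and `on_change` are hypothetical names, and plain strings stand in for `SecretStr` so the example has no third-party dependencies.

```python
from typing import Callable, Optional


class ChangeWatcher:
    """Calls on_change when the fetched credential differs from the last one seen."""

    def __init__(self, fetch: Callable[[], str], on_change: Callable[[str], None]):
        self._fetch = fetch            # hypothetical callable returning the current credential
        self._on_change = on_change    # e.g. a function that rebuilds the connection pool
        self._last: Optional[str] = None

    def poll(self) -> bool:
        """Fetch the current value; fire on_change if it differs from the previous one."""
        current = self._fetch()
        # The first poll only records a baseline; the pool was built at startup.
        changed = self._last is not None and current != self._last
        self._last = current
        if changed:
            self._on_change(current)
        return changed


# Simulated store that returns a rotated value on the third read.
values = iter(["pw-1", "pw-1", "pw-2"])
rebuilt_with: list[str] = []
watcher = ChangeWatcher(lambda: next(values), rebuilt_with.append)
results = [watcher.poll() for _ in range(3)]
```

In a real service the `on_change` callback would be where the pool is recreated, so unchanged refreshes cost nothing beyond the fetch.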

Configuration reference

Element	Value	Why
overlap window	> cache TTL	Both credentials valid during switch
cache TTL	5–15 min	Picks up new value promptly
rotation interval	30–90 days (static)	Routine, scheduled
dynamic lease	minutes–hours	Self-revoking, smallest blast radius
`SecretStr`	always	Masked in logs
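
The relationships in the table can be checked at startup rather than discovered during a rotation. A minimal sketch, assuming a hypothetical `RotationConfig` settings object; the field names and the one-hour overlap default are illustrative, not values taken from any particular store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RotationConfig:
    """Hypothetical settings object; field names are illustrative."""

    cache_ttl_s: int = 300              # 5 minutes, the low end of the 5-15 min range
    overlap_window_s: int = 3600        # assumed value; must exceed cache_ttl_s
    rotation_interval_days: int = 30    # static secrets: 30-90 days

    def __post_init__(self) -> None:
        if self.cache_ttl_s <= 0:
            raise ValueError("cache TTL must be positive")
        if self.overlap_window_s <= self.cache_ttl_s:
            raise ValueError("overlap window must be longer than the cache TTL")
        if self.rotation_interval_days <= 0:
            raise ValueError("rotation interval must be positive")


def is_valid(**kwargs: int) -> bool:
    """Return True if the given settings satisfy every constraint."""
    try:
        RotationConfig(**kwargs)
        return True
    except ValueError:
        return False
```

Failing fast on `overlap_window_s <= cache_ttl_s` turns the most common rotation outage into a configuration error caught at deploy time.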

Deployment parity: local to production

Local dev — test rotation by forcing a refresh and asserting the app reconnects.
CI — a test rotates a fake secret mid-run and verifies no request fails.
Staging — trigger a real rotation and watch for reconnection errors before production.
Production — rotation runs on schedule; alerts fire if a credential nears expiry un-rotated.
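
The CI check above might look something like the following simulation. It is a sketch under stated assumptions: `FakeBackend` and `run_rotation_drill` are hypothetical, time is modelled as integer seconds with one request per second, and the cache expiry rule mirrors the `RotatingSecret` comparison.

```python
class FakeBackend:
    """Hypothetical stand-in for the service that checks credentials."""

    def __init__(self, initial: str):
        self.current = initial
        self.valid = {initial}

    def rotate(self, new: str) -> str:
        """Provision the new credential and return the old one, which stays valid."""
        old, self.current = self.current, new
        self.valid.add(new)
        return old

    def request(self, credential: str) -> bool:
        return credential in self.valid


def run_rotation_drill(cache_ttl: int, overlap: int, total: int = 30) -> int:
    """Simulate traffic with a rotation at t=10 and return the number of failed requests."""
    backend = FakeBackend("pw-1")
    cached, loaded_at = backend.current, 0
    old, failures = None, 0
    for now in range(total):
        if now == 10:
            old = backend.rotate("pw-2")
        if old is not None and now >= 10 + overlap:
            backend.valid.discard(old)           # revoke once the overlap ends
        if now - loaded_at > cache_ttl:          # cache expired: refetch
            cached, loaded_at = backend.current, now
        if not backend.request(cached):
            failures += 1
    return failures


assert run_rotation_drill(cache_ttl=5, overlap=10) == 0   # TTL below the window
assert run_rotation_drill(cache_ttl=15, overlap=3) > 0    # TTL above the window
```

The second assertion is the useful one in practice: it shows the drill can actually detect a cache TTL that outlives the overlap window.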

Security boundaries & guardrails

Always provision-new-before-revoke-old; never revoke first.
Keep the cache TTL strictly below the overlap window.
Reconnect connection pools when the credential changes; do not leave old connections open.
Wrap rotating credentials in SecretStr and unwrap only at the driver call.
Alert on any secret whose age exceeds the rotation interval.
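
The age alert in the last guardrail can be sketched as a pure function over an inventory of last-rotation timestamps. The function name `overdue_secrets`, the secret names, and the 90-day threshold are assumptions for illustration; where the timestamps come from depends on your store.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def overdue_secrets(
    last_rotated: Dict[str, datetime],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the names of secrets whose age exceeds max_age, sorted for stable output."""
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, rotated_at in last_rotated.items()
        if now - rotated_at > max_age
    )


# Hypothetical inventory with a fixed "now" so the result is deterministic.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
inventory = {
    "db-password": now - timedelta(days=120),
    "api-key": now - timedelta(days=10),
}
stale = overdue_secrets(inventory, timedelta(days=90), now=now)
```

Passing `now` explicitly keeps the check testable; a scheduled job would omit it and page on any non-empty result.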

Troubleshooting

Requests fail during rotation — the overlap window is shorter than the cache TTL; widen the window or shorten the cache. See Zero-Downtime Secret Rotation in Python.
Old credential still used after rotation — a connection pool was not recreated; reconnect on change.
Rotation never triggers — the schedule or Lambda is misconfigured; assert rotation in staging.
Credential expired before rotation — alerting is missing; add an age check.
