cert-manager + Smallstep → Venafi
cert-manager + Smallstep ↔ Venafi: integration to migration path.
cert-manager and Step-CA deploy alongside Venafi first — starting as a thin K8s-native client through the Venafi Issuer, then taking over in-cluster issuance once Step-CA is cross-signed and trusted. The cutover is a one-field issuer change per workload against a dual-trust chain, so there is no flag day and no forced re-credentialing.
The honest boundary, stated up front: the two stacks overlap only in the Kubernetes quadrant. Cross-CA inventory, ITSM approvals on the F5/Citrix/IIS/Java estate, and Outagedetection ML stay on Venafi. This is a partial migration scoped to the cluster tier — not Venafi retirement.
The idea
Take the cluster corner. Keep Venafi for the rest.
The topology that makes this zero-downtime is the cross-sign plus cert-manager's pluggable issuer model: Step-CA stands up HA with an HSM-backed key, its intermediate cross-signed by your existing internal root, so currently deployed workloads validate Step-CA leaves during the transition. Because changing the issuer is a single issuerRef field on a Certificate CRD, each workload flips with a one-line PR and reverts the same way. Venafi stays the enterprise inventory and workflow plane for everything outside the cluster at every phase; only the K8s issuance path moves.
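To make the one-field claim concrete, here is a minimal Python sketch that models a Certificate manifest as a plain dict and flips only its issuerRef. The issuer names (`venafi-cluster-tier`, `step-ca`), the workload name and the helper functions are hypothetical, and the group and kind values are our assumption about the installed issuer CRDs, so check them against your cluster before copying anything.

```python
import copy

# Hypothetical issuer references. Group/kind are assumptions about the
# installed CRDs; the names are placeholders, not values from a real cluster.
VENAFI_ISSUER = {"group": "jetstack.io", "kind": "VenafiClusterIssuer", "name": "venafi-cluster-tier"}
STEP_ISSUER = {"group": "certmanager.step.sm", "kind": "StepClusterIssuer", "name": "step-ca"}


def make_certificate(name, namespace, dns_names, issuer_ref):
    """Build a cert-manager Certificate manifest as a plain dict."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "secretName": f"{name}-tls",
            "dnsNames": list(dns_names),
            "issuerRef": dict(issuer_ref),
        },
    }


def flip_issuer(cert, issuer_ref):
    """Return a copy of the manifest with only spec.issuerRef replaced."""
    flipped = copy.deepcopy(cert)
    flipped["spec"]["issuerRef"] = dict(issuer_ref)
    return flipped


def changed_spec_fields(before, after):
    """Names of top-level spec fields that differ between two manifests."""
    keys = set(before["spec"]) | set(after["spec"])
    return sorted(k for k in keys if before["spec"].get(k) != after["spec"].get(k))


cert = make_certificate("payments-api", "canary", ["payments.canary.svc"], VENAFI_ISSUER)
flipped = flip_issuer(cert, STEP_ISSUER)
print(changed_spec_fields(cert, flipped))
```

Flipping back with `flip_issuer(flipped, VENAFI_ISSUER)` reproduces the original manifest, which is the property the rollback story relies on.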
The phases
Six steps. Each one reversible.
Baseline & inventory
Every cert is tagged by workload tier — K8s in-cluster, K8s ingress, F5, Citrix, IIS, Java keystore, network appliance — with its backend CA driver, Policy Folder or Zone, validity, SAN shape and owner team, cross-referenced against 90 days of Outagedetection events. Read-only.
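One possible shape for an inventory row, sketched in Python. The field names, tier identifiers and sample values are our own illustrative labels rather than a Venafi export format; the only logic is the split between tiers that can move and tiers that stay.

```python
from collections import Counter
from dataclasses import dataclass

# Tier labels are hypothetical identifiers for the tiers named above.
CLUSTER_TIERS = {"k8s-in-cluster", "k8s-ingress"}
LEGACY_TIERS = {"f5", "citrix", "iis", "java-keystore", "network-appliance"}


@dataclass(frozen=True)
class CertRecord:
    common_name: str
    tier: str
    backend_ca: str          # backend CA driver
    policy_folder: str       # Policy Folder or Zone
    validity_days: int
    sans: tuple              # SAN entries, used to judge SAN shape
    owner_team: str
    outage_events_90d: int = 0


def in_scope(record):
    """True when the record belongs to a tier that can move to the cluster stack."""
    if record.tier not in CLUSTER_TIERS | LEGACY_TIERS:
        raise ValueError(f"unknown tier: {record.tier}")
    return record.tier in CLUSTER_TIERS


def summarize(records):
    """Count records per migration outcome."""
    counts = Counter()
    for record in records:
        counts["cluster tier" if in_scope(record) else "stays on Venafi"] += 1
    return dict(counts)


sample = [
    CertRecord("api.payments.svc", "k8s-in-cluster", "adcs", "cluster-tier", 90, ("api.payments.svc",), "payments"),
    CertRecord("shop.example.com", "k8s-ingress", "digicert", "public-web", 365, ("shop.example.com",), "web"),
    CertRecord("vip01.example.internal", "f5", "adcs", "network", 365, ("vip01.example.internal",), "netops", 2),
]
print(summarize(sample))
```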
cert-manager via the Venafi Issuer
cert-manager installs in in-scope clusters with a Venafi Enhanced Issuer (VEI) scoped to a dedicated cluster-tier Policy Folder. A canary namespace issues Certificate CRDs end to end, still backed by Venafi. Production is untouched and there is no spend reduction yet; this is a K8s-native stepping stone.
Move K8s TLS onto cert-manager + VEI
All in-cluster TLS — ingress, mesh sidecars, workload consumers — is declared as Certificate CRDs issued by VEI and renewed by the cert-manager reconciler. Manual VCert flows and one-off ticketed issuance are retired. trust-manager rolls the managed root out as ConfigMaps.
Stand up Step-CA, cross-signed
Step-CA goes live HA — at least three replicas, external Postgres, HSM-backed signing key — with its intermediate cross-signed by your existing internal root and chained to a new offline Step-CA root. trust-manager publishes the union of both roots. No Certificate CRD references Step-CA yet.
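In the cluster, trust-manager does the real work of publishing the bundle. The Python sketch below only illustrates the outcome we mean by "union of both roots": PEM blocks from each source are concatenated in order and duplicates are dropped. The function names are hypothetical and the certificate bodies in the example are placeholders, not real certificates.

```python
import re

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem(bundle):
    """Return every PEM certificate block found in a bundle string."""
    return PEM_BLOCK.findall(bundle)


def union_bundle(*bundles):
    """Concatenate PEM blocks from all bundles, keeping first occurrences only."""
    seen = set()
    merged = []
    for bundle in bundles:
        for block in split_pem(bundle):
            # Compare blocks with all whitespace removed so that line-wrapping
            # differences do not produce duplicates.
            key = "".join(block.split())
            if key not in seen:
                seen.add(key)
                merged.append(block.strip())
    return "\n".join(merged) + "\n"


# Placeholder bodies standing in for the existing internal root and the
# new offline Step-CA root.
existing_root = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
step_ca_root = "-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n"
merged = union_bundle(existing_root, step_ca_root, existing_root)
print(merged.count("BEGIN CERTIFICATE"))
```

Keeping the existing root first and de-duplicating means that re-running the merge is harmless, which matters when the bundle is reconciled repeatedly.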
Cut canary workloads to step-issuer
Canary namespaces flip their Certificate issuerRef to StepIssuer, so issuance happens entirely in-cluster against Step-CA. The chain still validates through the cross-sign, validity shortens to between 24 hours and 7 days where supported, and per-cert Venafi cost on the canary falls to zero.
Wave the cluster tier; Venafi retained
All in-cluster CRDs move to Step-CA in waves (dev → staging → prod). Public-internet ingress stays on a publicly-trusted CA via cert-manager ACME — not Step-CA. Venafi Discovery keeps observing inventory, and the F5, Citrix, IIS, Java and network estate is untouched.
Feature parity
What moves, what stays on Venafi.
| Capability | cert-manager + Smallstep | Venafi | Parity |
|---|---|---|---|
| ACME issuance | Step-CA ACME provisioner (HTTP-01 / DNS-01 / TLS-ALPN-01, EAB) + cert-manager ACME Issuer (HTTP-01 / DNS-01 solvers) | Venafi TPP/Cloud ACME via Issuing Template + EAB | At parity |
| Private CA | Step-CA online intermediate + offline root | Venafi orchestrates downstream private CAs (ADCS, OpenSSL, AWS PCA) | At parity |
| Publicly-trusted root | None — Step-CA chain is private | Venafi drives DigiCert/Sectigo/Entrust public roots under WebTrust | SaaS only |
| Code signing | Step-CA can mint signing EKU but no enterprise custody | Venafi Code Sign Protect (workflow + HSM custody) | SaaS only |
| Certificate inventory / discovery | cert-manager sees its own Certificate state only | Venafi Network / Onboard Discovery across F5/Citrix/IIS/Java/network gear | SaaS only |
| HSM support | Step-CA PKCS#11 (YubiHSM 2 / CloudHSM / Azure Managed HSM / GCP KMS-HSM) | Venafi connects HSM-backed downstream CAs | At parity |
| Short-lived certs | Step-CA 24h to 7d certs, no CRL/OCSP; ACME ARI early-renew | Issuing Template low validity, but per-cert cost discourages high churn | Partial |
| cert-manager integration | Native step-issuer external Issuer | Venafi Enhanced Issuer (VEI), VenafiIssuer / VenafiClusterIssuer kinds | At parity |
| RBAC | K8s RBAC + Step-CA provisioner claims | Venafi TPP roles (MRAO/RAO/Operator/Approver/Auditor) | Partial |
| Workflow approvals | Kyverno/Gatekeeper admission (policy, not workflow); PR review | Venafi ITSM-integrated approvals (ServiceNow/Jira), SoD, evidence trail | SaaS only |
| Anomaly / outage detection | Prometheus metrics + alerts on instrumented certs | Venafi Outagedetection cross-estate ML | SaaS only |
| Deployment & HA | Step-CA self-hosted HA (3+ replicas, external Postgres, HSM) | Venafi vendor-operated SaaS / on-prem TPP | Partial |
| Cost model | Self-hosted compute + ops; zero per-cert license in-cluster | Per-machine-identity licensing | At parity |
| Compliance (WebTrust / SOC 2 / FIPS) | You operate + attest; FIPS iff HSM validated; CP/CPS is yours | Venafi vendor SOC 2 / ISO 27001 inherited as evidence | SaaS only |
What we're honest about
The caveats most vendors leave out.
This is not Venafi retirement
The two stacks overlap in exactly one quadrant: Kubernetes and cloud-native workloads. cert-manager plus Smallstep replaces the K8s corner cleanly, but it cannot replace cross-CA enterprise inventory, ITSM-driven approvals on F5, Citrix, IIS, Java and network gear, or Outagedetection ML. Any plan promising full Venafi removal breaks the audit. We state the cluster-tier scope in Phase 0 and at every steering review.
Public trust and code signing stay on Venafi
Step-CA's chain is private — public-internet ingress must keep routing through cert-manager ACME to Sectigo, DigiCert or Let's Encrypt, never Step-CA, enforced by Kyverno admission. Publicly-trusted code signing (Authenticode, Apple Developer ID, kernel-mode) requires HSM custody and OS-pinned roots, so it stays on Venafi Code Sign Protect. We make the public-vs-private split a hard admission rule.
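The rule itself is small. The Python sketch below shows the check we would expect the admission policy to encode; it is not Kyverno syntax. The `exposure: public` label convention and the function name are assumptions made for illustration, and the set of private issuer kinds assumes step-issuer's StepIssuer and StepClusterIssuer.

```python
# Issuer kinds assumed to chain to the private Step-CA root.
PRIVATE_ISSUER_KINDS = {"StepIssuer", "StepClusterIssuer"}

# Hypothetical label convention marking public-internet ingress certificates.
EXPOSURE_LABEL = "exposure"
PUBLIC_VALUE = "public"


def check_certificate(cert):
    """Return (allowed, reason) for a Certificate manifest given as a dict."""
    labels = cert.get("metadata", {}).get("labels") or {}
    # cert-manager treats a missing issuerRef.kind as "Issuer".
    kind = cert["spec"]["issuerRef"].get("kind", "Issuer")
    is_public = labels.get(EXPOSURE_LABEL) == PUBLIC_VALUE
    if is_public and kind in PRIVATE_ISSUER_KINDS:
        return False, "public-internet ingress must not be issued by Step-CA"
    return True, "ok"


public_cert = {
    "metadata": {"name": "shop", "labels": {"exposure": "public"}},
    "spec": {"issuerRef": {"kind": "StepClusterIssuer", "name": "step-ca"}},
}
print(check_certificate(public_cert))
```

The check is deliberately one-directional: it blocks private issuance for public endpoints and says nothing about internal workloads, which matches the scope of the rule described above.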
You lose single-pane inventory and ITSM approvals in-cluster
cert-manager sees only its own Certificate state — there is no cross-CA Discovery layer, and Kyverno is an admission policy, not a ServiceNow-style approval workflow with segregation of duties. We run topology 1.3, feeding leaf metadata back to Venafi via WebSDK so the security org keeps visibility, and we get the auditor to sign off on cluster-tier zero-touch issuance before Phase 5.
You now own uptime, the HSM and the cross-sign clock
Once VEI is removed from the path, a Step-CA outage is a self-inflicted cluster cert outage with no managed-service backstop — so HA, an HSM activation runbook and cached-cert grace are mandatory. The cross-sign window must outlast the soak plus a 90-day buffer or consumers trusting only the old root fail; we track that expiry as a P0 item. Outagedetection's cross-estate ML has no OSS equivalent.
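The cross-sign arithmetic is worth writing down. This is a minimal Python sketch that assumes the soak is counted from the cutover date of the last wave and uses a 30-day soak as the default; both are our assumptions, and the dates in the example are made up.

```python
from datetime import date, timedelta


def earliest_safe_expiry(last_wave_cutover, soak_days=30, buffer_days=90):
    """Earliest cross-sign expiry that still covers the soak plus the buffer."""
    return last_wave_cutover + timedelta(days=soak_days + buffer_days)


def cross_sign_is_safe(cross_sign_not_after, last_wave_cutover, soak_days=30, buffer_days=90):
    """True when the cross-signed intermediate outlives soak plus buffer."""
    return cross_sign_not_after >= earliest_safe_expiry(
        last_wave_cutover, soak_days, buffer_days
    )


# Made-up dates: the last wave cuts over on 1 March 2025.
cutover = date(2025, 3, 1)
print(earliest_safe_expiry(cutover))
print(cross_sign_is_safe(date(2025, 6, 1), cutover))
```

Running the check in CI or a scheduled job against the real notAfter date is one way to keep the expiry visible as the tracked item it needs to be.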
Why this beats a flag day
Reversible in minutes, retired only after a long soak.
No phase forces an outage. Each wave rolls back in under 30 minutes: flipping issuerRef back to the Venafi issuer takes effect on the next renewal, and within the cross-sign window no chain change is needed at all. Canary namespaces revert in under 15 minutes. A wave only counts as migrated after at least 30 consecutive days at full cluster scale on Step-CA with renewal success at or above 99.9%. Venafi is never retired: only the cluster-tier per-cert cost reaches zero, while Venafi keeps owning the legacy and cross-CA estate indefinitely.
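The migrated gate can be expressed as a small check. This Python sketch assumes you already collect daily renewal counts as (attempted, succeeded) pairs, oldest first; the function names are hypothetical, and treating a day with no renewal attempts as breaking the streak is a conservative choice of ours, not something the gate itself specifies.

```python
def trailing_streak(daily, min_success_permille=999):
    """Count consecutive most-recent days meeting the renewal success bar.

    daily is a list of (attempted, succeeded) tuples ordered oldest to newest.
    Integer arithmetic avoids floating-point edge cases at exactly 99.9%.
    """
    streak = 0
    for attempted, succeeded in reversed(daily):
        # Assumption: a day with no renewal attempts does not count as a pass.
        if attempted == 0:
            break
        if succeeded * 1000 < attempted * min_success_permille:
            break
        streak += 1
    return streak


def wave_is_migrated(daily, min_days=30, min_success_permille=999):
    """True after min_days consecutive days at or above the success bar."""
    return trailing_streak(daily, min_success_permille) >= min_days


# Thirty clean days at exactly 99.9% pass the gate.
clean = [(1000, 999)] * 30
print(wave_is_migrated(clean))
```

A single bad day resets the count, so the gate measures an unbroken run rather than an average.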
See how much of the cluster tier migrates cleanly.
A call with a senior platform and PKI engineer. We classify your certs by tier, size your HSM and cross-sign window honestly, and tell you exactly how much of the K8s estate moves to cert-manager + Step-CA — and how much must stay on Venafi.
Map my migration →