Apache Ranger → Immuta

Apache Ranger ↔ Immuta: integration to migration path.

Immuta isn't a drop-in for Ranger, because its model is genuinely different. It stands up alongside Ranger as a passive observer first, then takes over the warehouse one domain at a time. Each domain is re-modelled with its owners, shadow-compared against Ranger's live decisions, and cut over only once its divergence from Ranger is proven to be under 1%. There is no flag day and no mass re-authoring event.

The honest part: Ranger is retained for the Hadoop lake, not retired. HDFS, HBase and Kafka have no Immuta equivalent, so a stable hybrid — Immuta on the warehouse, Ranger on the lake — is the realistic, deliberate end-state.

Map my migration with an engineer →

The idea

Observe in shadow first. Scope Ranger last.

The topology that makes this zero-outage: Atlas stays the canonical classification source-of-truth across the whole estate, surfaced into Immuta through its External Catalog sync. Immuta registers Snowflake and Databricks Unity Catalog Data Sources read-only, authoring policies in shadow before any are attached, while Ranger keeps enforcing HDFS, Hive, HBase, Kafka and Trino on the lake. The cutover hinges on one non-negotiable step — switching end users off direct warehouse grants so access flows through Immuta's managed views — after which Immuta governs the warehouse and Ranger reduces to lake-only, each domain moving independently and reversibly.

The phases

Seven steps. Each one reversible.

Baseline & inventory

We catalogue every source by protocol and enforcement, every Ranger policy with its last-modified date and audit volume, tag coverage, identity-propagation source, and domain ownership — cross-referencing Solr audit to find dead policies. Read-only. The Atlas classification taxonomy is the constant throughout.

Users see: No user impact.

Rollback: N/A.
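
A minimal sketch of the dead-policy cross-reference, assuming the Ranger policy export and the Solr audit counts have already been flattened into simple records. The record shape, the field names and the 180-day staleness threshold are illustrative assumptions, not Ranger API types.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PolicyRecord:
    # Hypothetical flattened inventory row, not a Ranger API type.
    policy_id: int
    domain: str
    last_modified: date
    audit_hits: int  # audit events that matched this policy in the lookback window


def find_dead_policies(policies, today, stale_after_days=180):
    """Return policies with zero audit hits that were also not edited recently."""
    cutoff = today - timedelta(days=stale_after_days)
    return [p for p in policies if p.audit_hits == 0 and p.last_modified < cutoff]


inventory = [
    PolicyRecord(101, "finance", date(2021, 3, 1), 0),       # old and unused
    PolicyRecord(102, "finance", date(2024, 5, 1), 0),       # unused but recently edited
    PolicyRecord(103, "marketing", date(2020, 1, 15), 4210),  # old but still in use
]
dead = find_dead_policies(inventory, today=date(2024, 6, 1))
print([p.policy_id for p in dead])
```

Requiring both conditions keeps a recently edited policy out of the dead list even when it has no audit hits yet, so the result is a candidate list for the domain owner to confirm rather than a deletion list.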

Stand up Immuta; register read-only Data Sources

Immuta deploys, registers Atlas as an External Catalog, and registers at least one Snowflake or Unity Catalog Data Source read-only with no policies attached. SCIM from Okta, Entra or Keycloak comes online. Immuta sees columns and tags but writes no masking policy yet.

Users see: None — Immuta is a passive observer.

Rollback: Delete the tenant and connectors.

Author shadow policies for one pilot domain

The smallest-blast-radius domain gets full Immuta coverage — Subscription Policies, Data Policies, Purposes, Projects — authored but not attached to live tables. This is re-modelling, not translation: a domain-owner workshop asking what each Ranger policy is actually trying to achieve. Immuta's decisions are compared nightly against Ranger's audit.

Users see: None — Immuta is still passive.

Rollback: Discard the policies.
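
The nightly comparison can be sketched as below, assuming both sides have been reduced to a mapping from a (user, resource, action) key to an allow/deny boolean. That key shape and the sample data are assumptions for illustration; neither product emits decisions in exactly this form.

```python
def divergence_rate(ranger_decisions, immuta_decisions):
    """Compare Immuta's shadow decisions against Ranger's audited decisions.

    Both arguments map a (user, resource, action) tuple to True (allow) or
    False (deny). A request Ranger saw but Immuta has no decision for is
    counted as a divergence. Returns (rate, sorted list of diverging keys).
    """
    if not ranger_decisions:
        return 0.0, []
    mismatches = sorted(
        key
        for key, allowed in ranger_decisions.items()
        if immuta_decisions.get(key) != allowed
    )
    return len(mismatches) / len(ranger_decisions), mismatches


ranger = {
    ("ana", "sales.orders", "SELECT"): True,
    ("bo", "sales.orders", "SELECT"): False,
    ("cy", "sales.customers", "SELECT"): True,
    ("di", "sales.customers", "SELECT"): True,
}
immuta = dict(ranger)
immuta[("cy", "sales.customers", "SELECT")] = False  # shadow policy is stricter here

rate, mismatches = divergence_rate(ranger, immuta)
print(f"{rate:.1%}", mismatches)
```

Returning the diverging keys alongside the rate matters as much as the number itself, since each one is a question for the domain-owner workshop.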

Per-domain cutover to Immuta enforcement

Immuta's Native Snowflake or Unity Catalog (UC) integration is enabled and attaches policy objects. Critically, end users' direct table grants are switched off so that access is mediated through Immuta. Skip this step and enforcement is silently bypassed. Seven days of side-by-side operation follow, with both audit streams captured, and the domain's warehouse-equivalent Ranger policies go deny-by-default.

Users see: None if Phase 2 divergence was truly under 1%. Edge cases fail closed first, then become corrections for the next wave.

Rollback: Disable the Native integration and restore pre-cutover grants from the snapshot. Under 15 minutes if the script is pre-staged.
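
A pre-cutover check for leftover direct grants might look like the sketch below. It assumes the warehouse's grant listing has been exported into plain dictionaries; the field names and the role names (IMMUTA_MANAGED, ANALYST_ROLE) are hypothetical.

```python
def find_bypass_grants(grants, governed_tables, allowed_grantees):
    """Return SELECT grants on governed tables held by anyone outside the allowed set.

    Any grant returned here would let its holder read the table directly,
    without passing through Immuta-mediated access.
    """
    return [
        g
        for g in grants
        if g["privilege"] == "SELECT"
        and g["object"] in governed_tables
        and g["grantee"] not in allowed_grantees
    ]


grants = [
    {"grantee": "IMMUTA_MANAGED", "privilege": "SELECT", "object": "SALES.ORDERS"},
    {"grantee": "ANALYST_ROLE", "privilege": "SELECT", "object": "SALES.ORDERS"},
    {"grantee": "ANALYST_ROLE", "privilege": "SELECT", "object": "SCRATCH.TMP"},
]
bypass = find_bypass_grants(
    grants,
    governed_tables={"SALES.ORDERS"},
    allowed_grantees={"IMMUTA_MANAGED"},
)
for g in bypass:
    print(f'{g["grantee"]} still has direct {g["privilege"]} on {g["object"]}')
```

The same list doubles as the rollback snapshot: the grants removed at cutover are exactly the ones the pre-staged script would restore.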

Re-home tag policies into Immuta Global Policies

Cross-cutting tag policies — "PII tag masks the column unless a purpose is active" — move to Immuta Global Policies targeting @hasTag('PII'), which reach every Data Source at once. Ranger's tag service is retained only where resources are HDFS, HBase or Kafka, with a documented canonical source during any deliberate overlap.

Users see: None on existing access. Newly classified data immediately picks up the Global Policy.

Rollback: Disable the Global Policy; the Ranger tag policy keeps enforcing. Under 15 minutes.
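
The intent of the example rule, "PII tag masks the column unless a purpose is active", can be modelled in a few lines. This is a toy model of the decision only, not Immuta's policy syntax or API; the function name, the purpose name and the column tags are invented for illustration.

```python
def column_treatment(column_tags, active_purpose, exempt_purposes, mask_tag="PII"):
    """Decide 'masked' or 'clear' per column for one user session.

    column_tags maps a column name to its set of classification tags.
    A column is masked when it carries mask_tag and the session's active
    purpose is not one of the purposes exempted by the policy.
    """
    exempt = active_purpose in exempt_purposes
    return {
        column: "masked" if mask_tag in tags and not exempt else "clear"
        for column, tags in column_tags.items()
    }


tags = {"email": {"PII"}, "order_total": set()}
exempt_purposes = {"fraud-investigation"}

print(column_treatment(tags, None, exempt_purposes))
print(column_treatment(tags, "fraud-investigation", exempt_purposes))

# A column classified later picks up the rule with no policy change.
tags["phone"] = {"PII"}
print(column_treatment(tags, None, exempt_purposes))
```

Because the rule keys on the tag rather than on a named table or column, the third call masks the newly tagged column without anyone editing the policy.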

Retire Ranger from the warehouse path

Ranger services covering warehouse assets are disabled or deleted; community Snowflake or UC plugins, if present, are removed. Ranger keeps governing HDFS, HBase, Kafka and Hive/Trino-on-lake, UGSync continues, and audit retention is unchanged because the regulatory clock does not shrink just because enforcement scope did. A final warehouse policy export is filed as the auditor evidence packet.

Users see: No user impact.

Rollback: Re-enable the disabled services from the policy-export snapshot. Under an hour — more involved than a flag flip.

Final retirement (partial — the honest end-state)

For most installs this is a stable hybrid: Ranger lake-only, Immuta for warehouse and tag policy. Full retirement is only feasible if the lake is decommissioned or moved behind a Starburst/Trino layer that Immuta can govern. That in turn forces direct-HDFS Spark workloads onto a Trino-mediated path, which is an application-team workstream, not a flag flip.

Users see: None for the hybrid end-state. Full retirement would move direct-HDFS workloads to Trino.

Rollback: Within the 30-day read-only evidence window Ranger can be re-enabled; after that, rollback is out of scope.

Feature parity

What moves cleanly, and what doesn't.

Capability	Apache Ranger	Immuta	Parity
Policy model	Ranger Policy / Service / Plugin (resource-based, per-service)	Immuta Data Source + Subscription Policy + Data Policy	Partial
Tag/attribute-based policy	Ranger tag service + Atlas classifications (Ranger-governed services only)	Immuta Global Policies targeting @hasTag('PII') across every integration	Partial
Column / value masking	Ranger mask types (MASK_SHOW_LAST_4, MASK_HASH, CUSTOM valueExpr UDFs)	Immuta Data Policy primitives (hashing, format-preserving, regex, k-anonymization); no arbitrary UDFs	Partial
Row-level filtering	Ranger row-filter filterExpr with ${{USER.attr.*}}	Immuta Data Policy "row filtering — match user attribute"	At parity
Purpose-based access (PBAC)	No purpose object; simulated with per-purpose roles, no session scoping	Immuta Purpose object, session-activated, recorded per Policy Decision	Immuta only
Warehouse connectors (Snowflake/Databricks)	Community Snowflake/UC plugins of uncertain maturity	Immuta Native Snowflake (MASKING POLICY / ROW ACCESS POLICY) + Native Databricks UC	Immuta only
Lake connectors (HDFS/Hive/HBase/Kafka)	Ranger plugins for HDFS, HiveServer2, HBase coprocessor, Kafka, Trino	No equivalent: Hive only via Starburst/Trino, no HDFS/HBase/Kafka	Ranger only
Sensitive-data discovery	None built-in (Apache Griffin / hand-rolled regex + Atlas API)	Immuta SDD: ML classifier on column names + samples, writes back classifications	Immuta only
Lineage / metadata	Apache Atlas classifications + lineage (external)	Consumed via Immuta External Catalog sync (~15 min poll)	Partial
Audit log	Ranger audit to Solr / ES / HDFS sink	Immuta Policy Decision log (warehouse query-log shape)	At parity
Policy simulation	None; shadow-eval by replaying audit logs (homegrown)	Immuta simulator + retroactive impact ("who loses access?")	Immuta only
BI passthrough (per-user row filter)	BI auths as service account, losing per-user policy	Immuta OAuth-on-behalf-of into Tableau / Power BI / Looker	Immuta only
Identity propagation	Ranger UGSync (LDAP/AD poll, shallow attribute model)	Immuta SCIM 2.0 from Okta/Entra/Keycloak with custom attributes	Partial
Compliance rule library	Author Ranger JSON per regulation by hand	Immuta contract-managed Global Policies (CPRA/GDPR/HIPAA), updated as law evolves	Immuta only

What we're honest about

The caveats most vendors leave out.

There is no translator — it's a re-model

Immuta's abstraction (Data Source + Subscription Policy + Data Policy + Purpose + Project) does not map one-to-one to Ranger's Policy/Service/Plugin model, so each domain is re-designed, not run through a converter. That is the point: many Ranger policies were written defensively over years and their original intent is lost — Phase 2 surfaces it, including policies that only worked by accident because deny-wins-over-allow saved a misordered sequence.

Custom mask UDFs don't translate

Ranger's CUSTOM dataMaskType with arbitrary valueExpr — vault tokenization, format-preserving encryption — has no equivalent in Immuta's fixed masking primitives (hashing with salt, format-preserving, NULL, regex, k-anonymization). We inventory every one in Phase 0, then re-implement it as a referenced stored function, keep it on Ranger if it's a lake asset, or replace it with the closest primitive and get privacy-office sign-off on the delta.
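
The Phase 0 inventory of custom masks can be sketched as a scan over a Ranger policy export. The sketch assumes the export is JSON with a top-level "policies" list whose masking policies carry "dataMaskPolicyItems" entries with a "dataMaskInfo" object; verify those field names against your own export before relying on it. The service name, policy name and vault_tokenize UDF in the sample are invented.

```python
import json


def custom_masks(export):
    """List (service, policy name, valueExpr) for every CUSTOM mask in an export."""
    found = []
    for policy in export.get("policies", []):
        for item in policy.get("dataMaskPolicyItems", []):
            info = item.get("dataMaskInfo", {})
            if info.get("dataMaskType") == "CUSTOM":
                found.append(
                    (policy.get("service"), policy.get("name"), info.get("valueExpr"))
                )
    return found


# Hypothetical export fragment, trimmed to the fields the scan reads.
sample = """
{
  "policies": [
    {"service": "hive_prod", "name": "mask-card-number",
     "dataMaskPolicyItems": [
       {"dataMaskInfo": {"dataMaskType": "CUSTOM",
                         "valueExpr": "vault_tokenize({col})"}}]},
    {"service": "hive_prod", "name": "mask-ssn",
     "dataMaskPolicyItems": [
       {"dataMaskInfo": {"dataMaskType": "MASK_SHOW_LAST_4"}}]},
    {"service": "hive_prod", "name": "allow-analysts"}
  ]
}
"""
export = json.loads(sample)
for service, name, expr in custom_masks(export):
    print(f"{service}/{name}: {expr}")
```

Each row that comes back needs one of the three outcomes above: a referenced stored function, staying on Ranger, or a signed-off replacement primitive.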

HDFS, HBase and Kafka stay on Ranger

Immuta has no equivalent for HDFS path ACLs, HBase cell-level coprocessor policy, Kafka topic ACLs, Solr or Knox — and they're not on its roadmap. If your estate is genuinely Hadoop-native, Ranger cannot be fully retired and the realistic end-state is a stable hybrid. We say so honestly rather than over-promise full retirement; if a Ranger superset is what you need, Privacera is the closer fit.

Two planes, three audit streams, a vendor boundary

During transition you run Ranger's Solr/ES audit, Immuta's Policy Decision log and the warehouse query logs at once, so we plan a SIEM uplift and tiered retention in Phase 1 and overlap both control sets for at least one audit cycle. Your SOC 2 evidence shifts from Ranger JSON in Git to Immuta exports. And if Immuta's integration breaks at 2 AM, on-call now calls the vendor, so we confirm the support SLA up front.

Why this beats a flag day

Reversible per phase, gated by a soak.

The cutover phases (3 and 4) roll back in under 15 minutes: disabling the Native integration detaches Immuta's policy objects and a pre-staged script restores the snapshot of pre-cutover warehouse grants, while a Global Policy or a BI dashboard reverts just as fast. Re-enabling Ranger's warehouse services after Phase 5 is the slower exception, at under an hour. And no domain is declared done on a hunch: each pilot or wave must hold under 1% Immuta-versus-Ranger divergence over a 14-day shadow window with every divergence root-caused, then the steady state must soak for at least 30 days, with the warehouse fully Immuta-enforced and tickets at or below baseline, before the Ranger warehouse path is retired. Honest caveats are conceded up front rather than discovered in production.
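
The shadow-window gate reduces to a small predicate. The sketch below reads "under 1% over a 14-day window" as every one of the last 14 daily rates being under 1%, which is our assumption; an aggregate rate over the window would be an equally defensible reading. The function and parameter names are hypothetical.

```python
def ready_to_cut_over(daily_divergence, open_root_causes, max_rate=0.01, window_days=14):
    """Return True when a domain has cleared the shadow-window gate.

    daily_divergence is a list of daily divergence rates, oldest first.
    open_root_causes counts divergences that have not been root-caused yet.
    """
    if len(daily_divergence) < window_days:
        return False  # not enough shadow history yet
    recent = daily_divergence[-window_days:]
    return all(rate < max_rate for rate in recent) and open_root_causes == 0


print(ready_to_cut_over([0.004] * 14, open_root_causes=0))           # clean window
print(ready_to_cut_over([0.004] * 13 + [0.02], open_root_causes=0))  # one bad day
print(ready_to_cut_over([0.004] * 14, open_root_causes=2))           # unexplained divergences
```

A clean rate with open root causes still fails the gate, which is the point: a low number that nobody can explain is not evidence.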

See which domains move cleanly off Ranger.

A 30-minute call with a senior data-governance engineer. We inventory your Ranger policies by domain, count the custom mask UDFs and direct table grants that need handling, and map an honest Immuta-on-the-warehouse, Ranger-on-the-lake boundary for your estate — before you commit to anything.

Map my migration →