Apache Ranger → Immuta
Apache Ranger ↔ Immuta: integration to migration path.
Immuta isn't a drop-in for Ranger, because its model is genuinely different. So it stands up alongside Ranger as a passive observer first, then takes over the warehouse one domain at a time. Each domain is re-modelled with its owners, shadow-compared against Ranger's live decisions, and cut over only once divergence has held under 1%, with no flag day and no mass re-author event.
The honest part: Ranger is retained for the Hadoop lake, not retired. HDFS, HBase and Kafka have no Immuta equivalent, so a stable hybrid — Immuta on the warehouse, Ranger on the lake — is the realistic, deliberate end-state.
The idea
Observe in shadow first. Scope Ranger last.
The topology that makes this zero-outage: Atlas stays the canonical classification source-of-truth across the whole estate, surfaced into Immuta through its External Catalog sync. Immuta registers Snowflake and Databricks Unity Catalog Data Sources read-only, authoring policies in shadow before any are attached, while Ranger keeps enforcing HDFS, Hive, HBase, Kafka and Trino on the lake. The cutover hinges on one non-negotiable step — switching end users off direct warehouse grants so access flows through Immuta's managed views — after which Immuta governs the warehouse and Ranger reduces to lake-only, each domain moving independently and reversibly.
The phases
Seven steps. Each one reversible.
Baseline & inventory
We catalogue every source by protocol and enforcement, every Ranger policy with its last-modified date and audit volume, tag coverage, identity-propagation source, and domain ownership — cross-referencing Solr audit to find dead policies. Read-only. The Atlas classification taxonomy is the constant throughout.
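The dead-policy cross-reference can be sketched roughly as below. This is a minimal illustration only: it assumes the Ranger policy export and the Solr audit counts have already been loaded into plain Python objects, and the field names (`policy_id`, `last_modified`), the `audit_hits` mapping and the 180-day staleness threshold are all hypothetical rather than Ranger's actual export schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PolicyRecord:
    # Hypothetical shape: one row per Ranger policy in the inventory.
    policy_id: int
    domain: str
    last_modified: date


def find_dead_policies(policies, audit_hits, today, stale_after_days=180):
    """Return policies with zero audit hits that are older than the threshold.

    audit_hits maps policy_id -> number of audit events in the lookback
    window (assumed to be pre-aggregated from the audit store).
    """
    cutoff = today - timedelta(days=stale_after_days)
    return [
        p for p in policies
        if audit_hits.get(p.policy_id, 0) == 0 and p.last_modified < cutoff
    ]


policies = [
    PolicyRecord(101, "finance", date(2021, 3, 1)),
    PolicyRecord(102, "finance", date(2024, 11, 5)),
    PolicyRecord(103, "marketing", date(2020, 7, 9)),
]
audit_hits = {101: 0, 102: 0, 103: 4821}

dead = find_dead_policies(policies, audit_hits, today=date(2025, 1, 15))
print([p.policy_id for p in dead])  # [101]
```

Policy 102 also has no hits, but it was modified recently, so it is treated as possibly new rather than dead. A flagged policy is a candidate for review with its domain owner, not an automatic deletion.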
Stand up Immuta; register read-only Data Sources
Immuta deploys, registers Atlas as an External Catalog, and registers at least one Snowflake or Unity Catalog Data Source read-only with no policies attached. SCIM from Okta, Entra or Keycloak comes online. Immuta sees columns and tags but writes no masking policy yet.
Author shadow policies for one pilot domain
The smallest-blast-radius domain gets full Immuta coverage — Subscription Policies, Data Policies, Purposes, Projects — authored but not attached to live tables. This is re-modelling, not translation: a domain-owner workshop asking what each Ranger policy is actually trying to achieve. Immuta's decisions are compared nightly against Ranger's audit.
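One way to picture the nightly comparison is the sketch below. It assumes both sides have been reduced to a mapping of `(user, resource)` to `"ALLOW"` or `"DENY"`; that shape, the function name `nightly_divergence` and the sample users are hypothetical, and extracting the decisions from Ranger's audit store or from Immuta is out of scope here.

```python
from collections import Counter


def nightly_divergence(ranger_audit, immuta_shadow):
    """Compare decisions keyed by (user, resource) and report disagreements.

    Only requests present in Ranger's audit are scored, because Ranger is
    the live baseline. A request Immuta has no decision for is counted as
    a divergence of kind "MISSING".
    """
    diverged = []
    for key, ranger_decision in ranger_audit.items():
        shadow_decision = immuta_shadow.get(key, "MISSING")
        if shadow_decision != ranger_decision:
            diverged.append((key, ranger_decision, shadow_decision))
    rate = len(diverged) / len(ranger_audit) if ranger_audit else 0.0
    kinds = Counter(f"{r}->{s}" for _, r, s in diverged)
    return rate, diverged, kinds


ranger_audit = {
    ("ana", "sales.orders"): "ALLOW",
    ("ben", "sales.orders"): "DENY",
    ("cy", "sales.refunds"): "ALLOW",
    ("dee", "sales.refunds"): "DENY",
}
immuta_shadow = {
    ("ana", "sales.orders"): "ALLOW",
    ("ben", "sales.orders"): "ALLOW",  # shadow policy is more permissive here
    ("cy", "sales.refunds"): "ALLOW",
    ("dee", "sales.refunds"): "DENY",
}

rate, diverged, kinds = nightly_divergence(ranger_audit, immuta_shadow)
print(rate)         # 0.25
print(dict(kinds))  # {'DENY->ALLOW': 1}
```

Splitting divergences by direction matters in practice: a `DENY->ALLOW` result means the shadow policy would grant access Ranger withholds, which usually deserves a closer look than the reverse.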
Per-domain cutover to Immuta enforcement
Immuta's Native Snowflake or UC integration is enabled and attaches policy objects. Critically, end users' direct table grants are switched off so access is mediated through Immuta. Skip this and enforcement is silently bypassed. Seven days of side-by-side operation follow, with both audit streams captured; the domain's warehouse-equivalent Ranger policies go deny-by-default.
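The grant switch-off is easier to reverse if the revoke and restore statements are generated from the same snapshot. The sketch below assumes Snowflake-style `GRANT` / `REVOKE ... ON TABLE ... ROLE` syntax and a snapshot already collected into `Grant` records; the `Grant` type, the `plan_cutover` helper and the role names (including `IMMUTA_SVC`) are hypothetical, and a Unity Catalog estate would need its own statement form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    privilege: str  # e.g. "SELECT"
    table: str      # fully qualified table name
    role: str       # grantee role


def plan_cutover(grants, keep_roles):
    """Build revoke statements for cutover and grant statements for rollback.

    Roles in keep_roles (for example the integration's own service role)
    are left untouched. Identifiers are interpolated as-is, so they are
    assumed to come from a trusted snapshot, not from user input.
    """
    revoke, restore = [], []
    for g in sorted(grants, key=lambda g: (g.table, g.role, g.privilege)):
        if g.role in keep_roles:
            continue
        revoke.append(f"REVOKE {g.privilege} ON TABLE {g.table} FROM ROLE {g.role};")
        restore.append(f"GRANT {g.privilege} ON TABLE {g.table} TO ROLE {g.role};")
    return revoke, restore


snapshot = [
    Grant("SELECT", "SALES.PUBLIC.ORDERS", "ANALYST"),
    Grant("SELECT", "SALES.PUBLIC.ORDERS", "IMMUTA_SVC"),
    Grant("SELECT", "SALES.PUBLIC.REFUNDS", "ANALYST"),
]

revoke, restore = plan_cutover(snapshot, keep_roles={"IMMUTA_SVC"})
for stmt in revoke:
    print(stmt)
```

Keeping the `restore` list alongside the `revoke` list is what makes the rollback a replay of a pre-staged script rather than a reconstruction from memory.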
Re-home tag policies into Immuta Global Policies
Cross-cutting tag policies — "PII tag masks the column unless a purpose is active" — move to Immuta Global Policies targeting @hasTag('PII'), which reach every Data Source at once. Ranger's tag service is retained only where resources are HDFS, HBase or Kafka, with a documented canonical source during any deliberate overlap.
Retire Ranger from the warehouse path
Ranger services covering warehouse assets are disabled or deleted; community Snowflake or UC plugins, if present, are removed. Ranger keeps governing HDFS, HBase, Kafka and Hive/Trino-on-lake, UGSync continues, and audit retention is unchanged: the regulatory retention period does not shorten just because enforcement scope did. A final warehouse policy export is filed as the auditor evidence packet.
Final retirement (partial — the honest end-state)
For most installs this is a stable hybrid: Ranger lake-only, Immuta for warehouse and tag policy. Full retirement is only feasible if the lake is decommissioned or moved behind a Starburst/Trino layer that Immuta can govern. That, in turn, forces direct-HDFS Spark workloads onto a Trino-mediated path, which is an application-team workstream, not a flag flip.
Feature parity
What moves cleanly, and what doesn't.
| Capability | Apache Ranger | Immuta | Parity |
|---|---|---|---|
| Policy model | Ranger Policy / Service / Plugin (resource-based, per-service) | Immuta Data Source + Subscription Policy + Data Policy | Partial |
| Tag/attribute-based policy | Ranger tag service + Atlas classifications (Ranger-governed services only) | Immuta Global Policies targeting @hasTag('PII') across every integration | Partial |
| Column / value masking | Ranger mask types (MASK_SHOW_LAST_4, MASK_HASH, CUSTOM valueExpr UDFs) | Immuta Data Policy primitives (hashing, format-preserving, regex, k-anonymization); no arbitrary UDFs | Partial |
| Row-level filtering | Ranger row-filter filterExpr with ${{USER.attr.*}} | Immuta Data Policy "row filtering — match user attribute" | At parity |
| Purpose-based access (PBAC) | No purpose object — simulated with per-purpose roles, no session scoping | Immuta Purpose object, session-activated, recorded per Policy Decision | Immuta only |
| Warehouse connectors (Snowflake/Databricks) | Community Snowflake/UC plugins of uncertain maturity | Immuta Native Snowflake (MASKING POLICY / ROW ACCESS POLICY) + Native Databricks UC | Immuta only |
| Lake connectors (HDFS/Hive/HBase/Kafka) | Ranger plugins for HDFS, HiveServer2, HBase coprocessor, Kafka, Trino | No equivalent — Hive only via Starburst/Trino, no HDFS/HBase/Kafka | Ranger only |
| Sensitive-data discovery | None built-in (Apache Griffin / hand-rolled regex + Atlas API) | Immuta SDD — ML classifier on column names + samples, writes back classifications | Immuta only |
| Lineage / metadata | Apache Atlas classifications + lineage (external) | Consumed via Immuta External Catalog sync (~15 min poll) | Partial |
| Audit log | Ranger audit to Solr / ES / HDFS sink | Immuta Policy Decision log (warehouse query-log shape) | At parity |
| Policy simulation | None — shadow-eval by replaying audit logs (homegrown) | Immuta simulator + retroactive impact ("who loses access?") | Immuta only |
| BI passthrough (per-user row filter) | BI auths as service account, losing per-user policy | Immuta OAuth-on-behalf-of into Tableau / Power BI / Looker | Immuta only |
| Identity propagation | Ranger UGSync (LDAP/AD poll, shallow attribute model) | Immuta SCIM 2.0 from Okta/Entra/Keycloak with custom attributes | Partial |
| Compliance rule library | Author Ranger JSON per regulation by hand | Immuta contract-managed Global Policies (CPRA/GDPR/HIPAA), updated as law evolves | Immuta only |
What we're honest about
The caveats most vendors leave out.
There is no translator — it's a re-model
Immuta's abstraction (Data Source + Subscription Policy + Data Policy + Purpose + Project) does not map one-to-one to Ranger's Policy/Service/Plugin model, so each domain is re-designed, not run through a converter. That is the point: many Ranger policies were written defensively over years and their original intent is lost — Phase 2 surfaces it, including policies that only worked by accident because deny-wins-over-allow saved a misordered sequence.
Custom mask UDFs don't translate
Ranger's CUSTOM dataMaskType with an arbitrary valueExpr (vault tokenization or format-preserving encryption, for example) has no direct equivalent, because Immuta's masking primitives are a fixed set (hashing with salt, format-preserving masking, NULL, regex, k-anonymization) that does not accept arbitrary expressions. We inventory every one in Phase 0, then re-implement it as a referenced stored function, keep it on Ranger if it's a lake asset, or replace it with the closest primitive and get privacy-office sign-off on the delta.
HDFS, HBase and Kafka stay on Ranger
Immuta has no equivalent for HDFS path ACLs, HBase cell-level coprocessor policy, Kafka topic ACLs, Solr or Knox — and they're not on its roadmap. If your estate is genuinely Hadoop-native, Ranger cannot be fully retired and the realistic end-state is a stable hybrid. We say so honestly rather than over-promise full retirement; if a Ranger superset is what you need, Privacera is the closer fit.
Two planes, three audit streams, a vendor boundary
During transition you run Ranger's Solr/ES audit, Immuta's Policy Decision log and the warehouse query logs at once, so we plan a SIEM uplift and tiered retention in Phase 1 and overlap both control sets for at least one audit cycle. Your SOC 2 evidence shifts from Ranger JSON in Git to Immuta exports, and if Immuta's integration breaks at 2 AM, on-call now calls the vendor, so we confirm the support SLA up front.
Why this beats a flag day
Reversible per phase, gated by a soak.
Every cutover phase rolls back in under 15 minutes — disabling the Native integration detaches Immuta's policy objects and a pre-staged script restores the snapshot of pre-cutover warehouse grants, while a Global Policy or a BI dashboard reverts just as fast. And no domain is declared done on a hunch: each pilot or wave must hold under 1% Immuta-versus-Ranger divergence over a 14-day shadow window with every divergence root-caused, then the steady state must soak for at least 30 days — warehouse fully Immuta-enforced, tickets at or below baseline — before the Ranger warehouse path is retired. Honest caveats are conceded up front rather than discovered in production.
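The two gates can be written down as simple predicates. The sketch below is one possible reading, not a prescribed implementation: it assumes the 1% threshold applies to divergence aggregated over the 14-day window (a per-day reading would be stricter), and the `ShadowDay` record and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowDay:
    compared: int     # decisions compared that day
    diverged: int     # decisions where Immuta and Ranger disagreed
    root_caused: int  # diverged decisions with a documented cause


def shadow_gate_passes(days, max_rate=0.01, window=14):
    """True if the last `window` days stay under max_rate, all divergences explained."""
    if len(days) < window:
        return False
    recent = days[-window:]
    compared = sum(d.compared for d in recent)
    diverged = sum(d.diverged for d in recent)
    if compared == 0:
        return False  # nothing compared means nothing proven
    all_explained = all(d.root_caused >= d.diverged for d in recent)
    return diverged / compared < max_rate and all_explained


def soak_gate_passes(soak_days, tickets, baseline_tickets, min_soak=30):
    """True once the steady state has soaked long enough with tickets at or below baseline."""
    return soak_days >= min_soak and tickets <= baseline_tickets


clean = [ShadowDay(10_000, 50, 50)] * 14
one_unexplained = clean[:-1] + [ShadowDay(10_000, 50, 49)]

print(shadow_gate_passes(clean))            # True
print(shadow_gate_passes(one_unexplained))  # False
print(soak_gate_passes(30, 12, 12))         # True
```

Note that the second case fails even though its divergence rate is identical to the first: a single divergence without a root cause is enough to hold the gate.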
See which domains move cleanly off Ranger.
A 30-minute call with a senior data-governance engineer. We inventory your Ranger policies by domain, count the custom mask UDFs and direct table grants that need handling, and map an honest Immuta-on-the-warehouse, Ranger-on-the-lake boundary for your estate — before you commit to anything.
Map my migration →