SigNoz → New Relic

SigNoz ↔ New Relic: integration to migration path.

Observability is load-bearing in every incident, so we never flip a switch. SigNoz deploys alongside New Relic first, with one OpenTelemetry Collector teeing the same OTLP signal to both backends, and only then does SigNoz take over alerting authority in phases. Because both sides are first-class OTLP receivers, there is no up-front re-instrumentation and no flag day, and every phase rolls back in minutes. Services still on proprietary New Relic agents move to OTel SDKs later, in their normal release train.

New Relic stays the alerting authority and pane of glass until Phase 4. The honest gaps — Errors Inbox, Applied Intelligence, RUM and synthetics — we name up front, because the rest only matters if you can trust it.

The idea

Tee one Collector to both backends first.

The topology that makes this zero-outage: a gateway-mode OpenTelemetry Collector sits in front of every signal source and fans one ingestion pipeline out to two exporter sinks, SigNoz over OTLP and New Relic over OTLP/HTTP. Because both backends consume the identical Span, Metric and LogRecord structs, there is no wire-format conversion, which is what makes this materially easier than a proprietary-agent migration. Tail-sampling of traces runs before the fan-out so both backends see the same sampled set, and the ClickHouse storage you control runs in parallel while New Relic still owns alerting. That lets you twin dashboards, prove alerts in shadow mode, then flip authority, with each step independently reversible.
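As a rough illustration of the tee, the sketch below builds a Collector configuration as a Python dictionary and prints it as JSON, which the Collector can read because JSON is a subset of YAML. The SigNoz address, the plaintext in-cluster TLS setting, the sampling policies and the `NEW_RELIC_LICENSE_KEY` variable name are assumptions for illustration, not values taken from a real deployment.

```python
import json

# Assumed values: replace with your own endpoints.
SIGNOZ_ENDPOINT = "signoz-otel-collector.observability.svc:4317"  # hypothetical in-cluster address
NEW_RELIC_ENDPOINT = "https://otlp.nr-data.net:4318"

EXPORTERS = ["otlp/signoz", "otlphttp/newrelic"]


def build_tee_config() -> dict:
    """Return a Collector config whose pipelines all export to both backends."""

    def pipeline(processors):
        return {
            "receivers": ["otlp"],
            "processors": list(processors),
            "exporters": list(EXPORTERS),
        }

    return {
        "receivers": {"otlp": {"protocols": {"grpc": {}, "http": {}}}},
        "processors": {
            # Sampling happens once, before the fan-out, so both
            # backends receive the same set of traces.
            "tail_sampling": {
                "decision_wait": "10s",
                "policies": [
                    {"name": "keep-errors", "type": "status_code",
                     "status_code": {"status_codes": ["ERROR"]}},
                    {"name": "sample-rest", "type": "probabilistic",
                     "probabilistic": {"sampling_percentage": 10}},
                ],
            },
            "batch": {},
        },
        "exporters": {
            # Assumes plaintext gRPC inside the cluster.
            "otlp/signoz": {"endpoint": SIGNOZ_ENDPOINT, "tls": {"insecure": True}},
            "otlphttp/newrelic": {
                "endpoint": NEW_RELIC_ENDPOINT,
                "headers": {"api-key": "${env:NEW_RELIC_LICENSE_KEY}"},
            },
        },
        "service": {
            "pipelines": {
                "traces": pipeline(["tail_sampling", "batch"]),
                "metrics": pipeline(["batch"]),
                "logs": pipeline(["batch"]),
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_tee_config(), indent=2))
```

Rolling back the tee in this sketch means removing one exporter name from `EXPORTERS`; the receivers and processors stay untouched.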

The phases

Six steps. Each one reversible.

Phase 0: Baseline & inventory

We classify every service by instrumentation type — New Relic proprietary agent, OTel SDK or Pixie-only — and capture trace, log and metric volume, dashboard and alert inventories, RUM/synthetics usage and Vulnerability Management findings. Read-only.

Users see: No user impact.

Rollback: N/A

Phase 1: Stand up SigNoz; tee a canary

SigNoz runs in HA on a ClickHouse cluster and one canary cluster's OTel Collector tees OTLP to both SigNoz and New Relic. No production alert routes off SigNoz yet.

Users see: None for end users; SREs see a parallel SigNoz UI populated by canary data.

Rollback: Delete the canary dual-export block, or delete SigNoz entirely — no production alert depends on it.

Phase 2: Extend the tee; build dashboards

Every OTel Collector globally tees to both backends, and every New Relic dashboard gets a SigNoz equivalent built in git as JSON. New Relic stays the alerting authority.

Users see: None for end users; on-call sees both panes, and runbooks still point at New Relic.

Rollback: Revert the collector config. SigNoz stops receiving data and New Relic is unaffected.

Phase 3: Alert parity in shadow mode

Every production NRQL alert gets a SigNoz twin firing into a low-priority shadow channel for at least 14 days. We compare fire rates per alert per day and investigate any divergence of 10% or more.

Users see: None for end users — on-call sees parallel shadow pages and learns the new UI.

Rollback: Disable SigNoz alerts globally. New Relic alerts are unaffected.
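The shadow-mode comparison can be sketched as a small function over per-alert, per-day fire counts. How divergence is measured is an assumption here: the sketch takes the absolute difference relative to the larger of the two counts, and the input shape (a dictionary keyed by alert name and day) is hypothetical rather than an export format of either product.

```python
DIVERGENCE_THRESHOLD = 0.10  # 10%, the investigation trigger from Phase 3


def divergent_alerts(nr_fires, signoz_fires, threshold=DIVERGENCE_THRESHOLD):
    """Return (alert, day, nr_count, signoz_count) rows that need investigation.

    Both inputs map (alert_name, day) -> number of times the alert fired.
    A missing key counts as zero fires.
    """
    flagged = []
    for key in sorted(set(nr_fires) | set(signoz_fires)):
        nr = nr_fires.get(key, 0)
        sz = signoz_fires.get(key, 0)
        if nr == sz:
            continue  # identical counts, including 0 vs 0
        # Assumed definition: difference relative to the larger count.
        if abs(nr - sz) / max(nr, sz) >= threshold:
            flagged.append((key[0], key[1], nr, sz))
    return flagged


if __name__ == "__main__":
    new_relic = {("high-latency", "2024-05-01"): 10, ("high-latency", "2024-05-02"): 20}
    signoz = {("high-latency", "2024-05-01"): 9, ("high-latency", "2024-05-02"): 19,
              ("disk-full", "2024-05-01"): 3}
    for row in divergent_alerts(new_relic, signoz):
        print(row)
```

An alert that fires in only one backend always diverges by 100% under this definition, so twins that never fire, or fire only in SigNoz, surface immediately.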

Phase 4: Flip alerting authority to SigNoz

SigNoz alerts route to the real paging channel and New Relic alerts reverse into the shadow channel as the rollback runway. New Relic stays in receive mode via the tee; runbooks start investigation in SigNoz.

Users see: None for end users — on-call workflow shifts to SigNoz.

Rollback: Reverse the routing swap — a single config change in each system. Under 15 minutes.
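One way to picture why the rollback is cheap: the flip and its rollback are the same swap applied twice. The `AlertRouting` type below is a hypothetical model of the routing state, not a configuration object from SigNoz or New Relic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRouting:
    paging: str  # backend whose alerts reach the real paging channel
    shadow: str  # backend whose alerts go to the low-priority shadow channel


def flip(routing: AlertRouting) -> AlertRouting:
    """Swap paging and shadow authority; applying it twice restores the original."""
    return AlertRouting(paging=routing.shadow, shadow=routing.paging)


if __name__ == "__main__":
    before = AlertRouting(paging="new-relic", shadow="signoz")
    after = flip(before)       # Phase 4: SigNoz pages, New Relic shadows
    rolled_back = flip(after)  # rollback: the same operation again
    print(after, rolled_back == before)
```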

Phase 5: Retire New Relic

The New Relic exporter is removed, Infrastructure DaemonSets are uninstalled, and proprietary APM agents are swapped for OTel SDKs in each service's normal release train — never a big bang. The tenant is held read-only for 30 days as an evidence window, then the contract ends.

Users see: None for end users.

Rollback: Within the 30-day read-only window, re-enable the New Relic exporter and data resumes. After the window, rollback is no longer in scope.

Feature parity

What moves cleanly, and what doesn't.

Capability | SigNoz | New Relic | Parity
Traces/metrics/logs coverage | signoz-otel-collector OTLP 4317/4318 → ClickHouse | NRDB ingest via OTLP otlp.nr-data.net:4318 + agents | At parity
Collection / agent | OTel Collector tee (vendor-neutral OTLP) | NR proprietary APM agents + newrelic-infra + OTLP | At parity
Query language | ClickHouse SQL via Query Service + builder UI | NRQL (proprietary, FACET/TIMESERIES/SINCE) | At parity
Schema control | Own ClickHouse tables, materialized views, skipping indexes | NRDB schema opaque (drop rules / Events-to-Metrics) | OSS only
Dashboards-as-code | /api/v1/dashboards JSON | newrelic_one_dashboard TF + NerdGraph | At parity
Alerting | SigNoz alerts on ClickHouse queries (webhook/PD/Slack) | NRQL alerts + Applied Intelligence + incident workflows | Partial
Error grouping | Per-service exceptions from span exception events | NR Errors Inbox (cross-service dedup grouping) | SaaS only
Anomaly / AIOps | Per-signal Robust-MAD / Prophet alerts | NR Applied Intelligence / Lookout (whole-tenant) | SaaS only
RUM / synthetics | None (OpenReplay / Checkly separate procurement) | NR Browser + NR Synthetics (proprietary, non-OTLP) | SaaS only
eBPF zero-instrumentation | Pixie OSS Vizier self-host | NR hosted Pixie control plane + script catalog | Partial
Vulnerability correlation | None (Trivy/Grype + homemade correlator) | NR Vulnerability Management correlated to services | SaaS only
Retention / storage tiers | ClickHouse TTL + TO DISK 's3_disk' cold tier | NRDB per-GB ingest billing + retention tiers | At parity
Self-hosting / residency | Helm + your ClickHouse; any jurisdiction | NR site choice (US/EU) at install | Partial
Compliance attestations | Self-hosted; control evidence is yours | NR SOC 2 / ISO / FedRAMP posture | Partial

What we're honest about

The caveats most vendors leave out.

Errors Inbox grouping has no SigNoz parity

New Relic's cross-service error triage with dedup heuristics disappears at retirement. SigNoz surfaces per-service exceptions from span exception events, which is not the same UI. We stand up Sentry as a replacement or document the workflow change explicitly — we don't pretend the per-service view is equivalent.

Applied Intelligence is a whole-tenant gap

Applied Intelligence and Lookout do whole-tenant anomaly detection with no open-source equivalent. The honest replacement is per-signal Robust-MAD or Prophet alerts in SigNoz, listed as an explicit feature gap — not a 'we'll figure it out later.'

RUM and synthetics never migrate via the tee

New Relic Browser and Synthetics are proprietary and non-OTLP, so the Collector tee cannot carry them. They are separate procurement decisions — OpenReplay or Sentry Replay for RUM, Checkly or k6 or Playwright for synthetics — and we treat them as their own workstream, never a clean migration.

You now operate ClickHouse

Once New Relic is gone, SigNoz being down means your observability is down. Your team now owns ClickHouse backups, replication, upgrades and schema migrations, and Vulnerability Management correlation becomes a Trivy/Grype workstream. We decide managed versus self-managed upfront and, either way, operate it as a managed service: backups, DR drills and a documented capacity plan, not just an install.

Why this beats a flag day

Reversible at every step.

Every phase is a collector or alert-routing config change with an under-15-minute rollback while both backends run in parallel — and the reverse-shadow window, where New Relic becomes the fallback after Phase 4, is itself the rollback runway. We never remove the New Relic exporter or cancel the contract until SigNoz has held alerting authority through a minimum 30-day green soak, with the tenant kept read-only for a further 30 days as an evidence window. The soak gate is the point: the new path proves itself before any bridge is burned.
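The two 30-day gates translate into simple date arithmetic. The sketch below assumes that any red day during the soak restarts the 30-day count on the following day; that restart rule, and the function and parameter names, are illustrative assumptions rather than part of the plan as stated.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

SOAK_DAYS = 30       # minimum green soak with SigNoz holding alerting authority
READ_ONLY_DAYS = 30  # New Relic tenant kept read-only as an evidence window


def retirement_dates(authority_flip: date,
                     last_red_day: Optional[date] = None) -> Tuple[date, date]:
    """Return (earliest exporter removal, earliest contract end).

    Assumption: a red day restarts the soak on the following day.
    """
    soak_start = authority_flip
    if last_red_day is not None:
        soak_start = max(authority_flip, last_red_day + timedelta(days=1))
    exporter_removal = soak_start + timedelta(days=SOAK_DAYS)
    contract_end = exporter_removal + timedelta(days=READ_ONLY_DAYS)
    return exporter_removal, contract_end


if __name__ == "__main__":
    removal, end = retirement_dates(date(2024, 1, 1), last_red_day=date(2024, 1, 10))
    print("remove exporter no earlier than", removal)
    print("end contract no earlier than", end)
```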

See whether your observability stack migrates cleanly.

A 30-minute call with a senior observability engineer. We classify your services by instrumentation type, size the ClickHouse cluster against real ingest volume, and tell you honestly which New Relic control-plane features — Errors Inbox, Applied Intelligence, RUM — have no OSS parity. Before you commit.

Map my migration →