Graylog → Splunk
Graylog ↔ Splunk: integration to migration path.
Splunk is your authoritative log surface, so we never cut over on a single day. Graylog stands up alongside Splunk first, one high-volume source class is dual-shipped to prove parity, and only then does each source class move across in waves, turning verbose, low-signal data from a metered ingest cost into a licence-free tier you control (you still pay for compute and storage). No flag day, no detection blackout, and every wave rolls back in minutes.
The honest exception: the SPL-heavy, high-value tail — firewall, EDR, critical auth, anything ITSI touches — may stay on a smaller Splunk by design. We decide that per source class with the evidence in front of us.
The idea
Dual-ship to prove parity. Move source classes last.
The topology that makes this zero-blackout is dual-shipping at the agent or collector: every source tees to both SIEMs at once, so Graylog is evaluated against exactly the same data Splunk sees without disturbing a single analyst. The shipper carries a shared event ID for cross-tool dedup, per-sink buffers stop one outage starving the other, and Splunk keeps owning every dashboard, scheduled search and premium app unchanged. Only once a source class is parity-proven and its detections are trusted do we cut its Splunk leg, move it to Graylog, then iterate — each class independently, each reversible.
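A minimal sketch of the tee, assuming a custom shipper process; in practice this logic usually lives in an off-the-shelf agent or collector. The names `Sink`, `tee`, `event_id` and the buffer size are hypothetical and chosen for illustration only.

```python
import uuid
from collections import deque


class Sink:
    """One output leg with its own bounded buffer (hypothetical helper)."""

    def __init__(self, name, send, max_buffered=10_000):
        self.name = name
        self.send = send                          # callable(event); raises on failure
        self.buffer = deque(maxlen=max_buffered)  # oldest events drop when full
        self.enabled = True                       # cutting or restoring a leg flips this

    def offer(self, event):
        if self.enabled:
            self.buffer.append(event)

    def flush(self):
        """Send buffered events in order; stop at the first failure, keep the rest."""
        sent = 0
        while self.buffer:
            try:
                self.send(self.buffer[0])
            except Exception:
                break                             # this sink is down; backlog stays queued
            self.buffer.popleft()
            sent += 1
        return sent


def tee(event, sinks):
    """Stamp one shared event ID, then offer the same event to every enabled sink."""
    stamped = dict(event)
    stamped.setdefault("event_id", str(uuid.uuid4()))
    for sink in sinks:
        sink.offer(stamped)
    return stamped["event_id"]
```

Because each sink owns its buffer, a Graylog outage only grows Graylog's backlog while the Splunk leg keeps draining. Rolling a wave back is then a matter of setting `enabled = True` on the Splunk sink again.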
The phases
Seven steps. Each one reversible.
Baseline & inventory
We document what each Splunk surface actually depends on — sourcetype, GB per day, parse rules, every scheduled search, dashboard, alert, lookup and accelerated data model — cross-referenced against real usage so dead content is flagged. Read-only.
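As an illustration of the usage cross-reference, the sketch below flags content that nobody opened and that never fired in the review window. The record shape, the field names and the 90-day window are assumptions, not a Splunk export format. A flag means "review with the owner", not "delete": a detection that never fired can still be worth keeping.

```python
from dataclasses import dataclass


@dataclass
class SavedSearch:
    name: str
    sourcetype: str
    views_last_90d: int    # dashboard loads or ad hoc opens (assumed metric)
    alerts_last_90d: int   # times the search actually fired (assumed metric)


def flag_dead(searches):
    """Return names of content with no views and no firings in the window."""
    return [
        s.name
        for s in searches
        if s.views_last_90d == 0 and s.alerts_last_90d == 0
    ]
```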
Stand up Graylog; mirror a low-value source
A Graylog HA cluster (OpenSearch plus a MongoDB replica set) goes live, and one high-volume, low-detection-value source class — web access or DNS — is dual-shipped so we can validate parse correctness against Splunk. Splunk is untouched.
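One way to check parse correctness is to join both sides on the shared event ID and compare the fields that detections rely on. This is a sketch under assumptions: both result sets have already been exported as lists of dicts, field names have been normalised to a common spelling, and `event_id` is the shared key the shipper stamps.

```python
def parse_parity(splunk_events, graylog_events, fields, key="event_id"):
    """Compare selected parsed fields for events that share the same ID."""
    graylog_by_id = {e[key]: e for e in graylog_events}
    missing, mismatched, matched = [], [], 0
    for s in splunk_events:
        g = graylog_by_id.get(s[key])
        if g is None:
            missing.append(s[key])            # event never arrived in Graylog
            continue
        diffs = {f: (s.get(f), g.get(f)) for f in fields if s.get(f) != g.get(f)}
        if diffs:
            mismatched.append((s[key], diffs))
        else:
            matched += 1
    total = len(splunk_events)
    return {
        "matched": matched,
        "missing_in_graylog": missing,
        "mismatched": mismatched,
        "parity": matched / total if total else 1.0,
    }
```

The mismatch list shows which field disagreed and both values, which is usually enough to trace the problem to a specific extractor or pipeline rule.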
Rebuild detections in shadow mode
Every saved search and dashboard for that source class gets a Graylog equivalent running in shadow — both fire, Graylog alerts to a non-prod channel while Splunk stays primary. Simple filters port 1:1; complex SPL is flagged for human rewrite.
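Shadow mode only builds trust if the two sets of firings are actually compared. A minimal sketch, assuming each firing has been reduced to a `(rule_name, epoch_seconds)` tuple and that rules carry the same name on both sides; the five-minute tolerance is an arbitrary illustrative default.

```python
def shadow_diff(splunk_firings, graylog_firings, tolerance_s=300):
    """Pair firings of the same rule that occur within a time tolerance."""
    unmatched_graylog = sorted(graylog_firings)
    missed = []
    for rule, ts in sorted(splunk_firings):
        hit = next(
            (g for g in unmatched_graylog
             if g[0] == rule and abs(g[1] - ts) <= tolerance_s),
            None,
        )
        if hit is None:
            missed.append((rule, ts))         # Splunk fired, Graylog stayed silent
        else:
            unmatched_graylog.remove(hit)     # each Graylog firing pairs at most once
    return {"missed_by_graylog": missed, "extra_in_graylog": unmatched_graylog}
```

Anything in `missed_by_graylog` blocks promotion until it is explained; entries in `extra_in_graylog` are either new coverage or noise and need a human to decide which.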
Promote Graylog for the mirrored source
Once detections are trusted, the Splunk leg for that source class is cut at the shipper and Graylog becomes the sole search surface. A 30-day read-only window stays open in Splunk for fallback.
Iterate per source class, in waves
The mirror, shadow-detection and promote phases (1 to 3) repeat per source class: low-value high-volume first (DNS, web, S3, CDN), medium-value next, high-value last. Wave N+1 can begin while wave N finishes, inside a deliberate rollback budget.
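The wave ordering can be sketched as a sort: lowest detection value first, and within a value band, highest volume first. The 1 to 3 `detection_value` score and the `gb_per_day` field are assumptions standing in for whatever the inventory actually records.

```python
def plan_waves(source_classes):
    """Group source classes into waves by detection value, biggest volume first."""
    ordered = sorted(
        source_classes,
        key=lambda s: (s["detection_value"], -s["gb_per_day"]),
    )
    waves = {}
    for s in ordered:
        waves.setdefault(s["detection_value"], []).append(s["name"])
    return [waves[v] for v in sorted(waves)]
```

For example, DNS at 400 GB per day and web at 250 GB per day (both value 1) would form the first wave in that order, ahead of any value 2 or value 3 class.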
Address the high-value tail
Firewall, EDR and critical-auth sources get a content-feasibility audit. Each class is either migrated — accepting reduced sophistication for SPL-heavy patterns — or honestly kept on a smaller Splunk. We do not fake-port to keep a clean narrative.
Retire Splunk or settle the partition
Full retirement: Splunk runs read-only for a 30- to 90-day evidence window, the evidence is exported to S3 with Object Lock, then licences lapse and the indexers are decommissioned. Or the partition becomes permanent, documented, with audit scope updated.
Feature parity
Where Graylog matches Splunk, and where it doesn't.
| Capability | Graylog | Splunk | Parity |
|---|---|---|---|
| Structured log ingest | GELF (UDP/TCP/HTTP) plus Beats, Syslog and Raw inputs | Splunk HEC plus Universal Forwarder (S2S 9997) | At parity |
| Agent / data collection | Graylog Sidecar managing Filebeat / Winlogbeat / NXLog | Splunk Universal Forwarder / Heavy Forwarder | At parity |
| Schema-on-read parsing | Extractors plus Pipeline Rules (UI-edited, no re-index) | props.conf / transforms.conf plus Ingest Actions / Edge Processor | Partial |
| Search / query language | Lucene field-aware operators plus UI aggregations | SPL (tstats, transaction, streamstats, DM acceleration) | SaaS only |
| Alerting / notification | Alerts & Events (Aggregation, Filter & Aggregation, Correlation) | Scheduled searches plus Notable Events / adaptive response | Partial |
| Sigma rule ingestion | Graylog Sigma importer to Event Definitions | sigma-cli to SPL outside the platform | OSS only |
| Lookup / enrichment | Data Adapters (HTTP JSONPath, CSV, DNS, Threat Intel) plus Lookup Tables | lookup / KV Store | At parity |
| RBAC plus multi-tenancy | Streams plus Index Sets plus Roles plus Teams (Teams is Enterprise) | Roles plus srchIndexesAllowed per-index / eventtype RBAC | Partial |
| Retention / archive | Index Sets per-stream rotation plus Archive (or DIY S3 lifecycle) | Per-index retention plus SmartStore plus frozen-to-S3 Object Lock | At parity |
| Deployment model & HA | Self-hosted Graylog plus OpenSearch plus MongoDB replica set | Splunk Cloud managed scaling or self-hosted indexer cluster | Partial |
| Cost model | No ingest licence (SSPL); compute and storage only | Per-GB ingest (Enterprise) or workload-based SVC (Cloud) | OSS only |
| App / content catalogue | Illuminate | Splunkbase (~1500 apps, premium TAs, ITSI) | SaaS only |
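To make the structured-ingest row concrete, here is a sketch of one event rendered for both sides: a GELF 1.1 message for a Graylog GELF input and an event envelope for Splunk HEC. The input event shape (`host`, `message`, `timestamp`, plus arbitrary extra keys) is an assumption; transport, authentication and batching are left out.

```python
import json
import time

CORE = {"host", "message", "timestamp"}


def to_gelf(event):
    """GELF 1.1: version, host and short_message are required; custom fields start with '_'."""
    msg = {
        "version": "1.1",
        "host": event["host"],
        "short_message": event["message"],
        "timestamp": event.get("timestamp", time.time()),
    }
    for k, v in event.items():
        if k not in CORE and k != "id":       # GELF does not allow a field named _id
            msg[f"_{k}"] = v
    return json.dumps(msg)


def to_hec(event, sourcetype):
    """Splunk HEC envelope: metadata at the top level, payload under 'event'."""
    return json.dumps({
        "time": event.get("timestamp", time.time()),
        "host": event["host"],
        "sourcetype": sourcetype,
        "event": event["message"],
        # Extra keys become indexed fields; stringified here to keep them flat.
        "fields": {k: str(v) for k, v in event.items() if k not in CORE},
    })
```

Carrying the shared event ID as `_event_id` in GELF and as an indexed field in HEC is what lets the two sides be joined later for parity checks.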
What we're honest about
The caveats most vendors leave out.
Graylog is log management, not a SIEM peer
This path is Graylog Open against Splunk as a log-aggregation and search platform, not against Splunk Enterprise Security (ES). Graylog has growing detection features but no equivalent to ES risk-based alerting (RBA), the ES Content Update (ESCU) library, or ES correlation searches. If ES is in play, this is the wrong path and we will say so.
SPL does not fully translate
Fifteen years of tstats, transaction, streamstats, eventstats, sub-searches and macros sit at roughly 50 to 65% auto-portable, 25% manually portable, and 10 to 20% not portable at all. The complex queries that catch the real things are the ones that resist. We classify every search and never promise full portability.
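A first-pass triage by SPL construct can be sketched as below. The bucket assignments are our illustrative assumption, not a fixed rule: which constructs land in which bucket depends on how each search uses them, and the bracket and backtick checks are deliberately crude. Every result still gets a human review.

```python
import re

# Assumed buckets for illustration; adjust after reviewing real content.
HARD = {"tstats", "transaction", "streamstats"}
MANUAL = {"eventstats"}


def classify_spl(query):
    """Return 'hard', 'manual' or 'auto' based on the SPL constructs present."""
    # First word of the query and of every pipe segment.
    commands = {m.lower() for m in re.findall(r"(?:^|\|)\s*([A-Za-z_]+)", query)}
    has_subsearch = "[" in query     # crude: also matches brackets in literals
    has_macro = "`" in query         # macros are invoked with backticks
    if commands & HARD:
        return "hard"
    if commands & MANUAL or has_subsearch or has_macro:
        return "manual"
    return "auto"
```

Run across the whole saved-search inventory, the bucket counts give an early read on how much content sits in each portability band before any rewriting starts.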
Splunkbase breadth has no Graylog catalogue
Around 1500 Splunkbase apps — premium TAs with field extraction and CIM mapping for hundreds of vendor products — have no Graylog peer; Illuminate covers a fraction. Long-tail vendors mean hand-rolled Pipeline Rules, and ITSI-served sources stay on Splunk entirely.
Self-hosting moves uptime and ops onto you
A Splunk Cloud outage is a vendor problem with an SLA; a Graylog outage is yours at 2 a.m., and losing MongoDB loses your config. We run HA Graylog plus OpenSearch plus a MongoDB replica set with daily dumps, a tested restore, and a break-glass revert to Splunk-only — managed, not just installed.
Why this beats a flag day
Reversible per wave, soaked before you cancel.
Every source-class cutover rolls back in under 15 minutes by re-enabling the shipper's Splunk output, so a bad wave is a single config reload, not an incident. And we never cancel the Splunk contract on a hunch. Each migrated class soaks for at least 30 consecutive days with Graylog as its sole search surface, with no detection regression and ticket volume trending to zero, before we move on. Splunk then holds a read-only evidence window before any licence lapses. You are never betting the SOC on one big cutover.
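The soak gate can be written down as an explicit check rather than a judgment call. A minimal sketch, assuming one record per day since cutover with hypothetical `regressions` and `tickets` counts; "trending to zero" is approximated here as the later half of the window carrying no more tickets than the earlier half.

```python
def soak_passed(daily, min_days=30):
    """daily: oldest-first list of {'regressions': int, 'tickets': int}, one per day."""
    if len(daily) < min_days:
        return False
    window = daily[-min_days:]
    # Any regression inside the window breaks the consecutive clean run.
    if any(d["regressions"] > 0 for d in window):
        return False
    half = min_days // 2
    early = sum(d["tickets"] for d in window[:half])
    late = sum(d["tickets"] for d in window[half:])
    return late <= early
```

A regression on any day in the window fails the gate, so the 30 days effectively restart from the day after the last regression.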
See whether your Splunk content migrates cleanly.
A call with a senior detection engineer. We inventory your sources by ingest cost and your searches by SPL construct, separate the auto-portable content from the high-value tail that stays on Splunk, and tell you honestly whether this is full retirement or cost reduction.