CrowdSec → Cloudflare Bot Management

CrowdSec ↔ Cloudflare Bot Management: from integration to migration.

CrowdSec deploys at your origin behind Cloudflare first, learning your traffic while Bot Management keeps scoring every request at the edge. Only once origin coverage is proven do we hand CrowdSec the behavioural and IP-reputation tier: no flag day, no forced reconfiguration of your edge.

The honest end state is partial, not total. Cloudflare stays your edge for anycast, TLS, and L3/L4 DDoS absorption. What retires is the Bot Management SKU, per-request ML score included, and only after a soak proves CrowdSec catches what Bot Management caught.

The idea

Two enforcement planes, one decision set.

The topology that makes this zero-downtime: Cloudflare keeps fronting the apex while CrowdSec runs at origin, consuming the proxy access logs the edge can never see — per-route timing, 401/403 patterns, sensitive-file probes. Its decisions enforce locally through the nginx bouncer and are pushed back up to Cloudflare via an IP List, so edge and origin block the same attacker. Cloudflare never stops owning anycast, TLS, and DDoS; CrowdSec simply takes over the behavioural and IP-reputation work, one proven scenario at a time.

The phases

Seven steps. Each one reversible.

Phase 0: Baseline & inventory

We document every zone's Cloudflare config — score thresholds, custom rules, Managed Challenge, Turnstile, Page Shield — and capture at least 30 days of bot-score histograms. We also audit your origin reverse-proxy fleet and confirm the access-log format carries the fields CrowdSec's parsers need.

Users see: No user impact.

Rollback: N/A
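As a rough illustration of the baseline capture, the sketch below buckets exported bot scores into a histogram. The function name, the bucket width, and the assumption that scores arrive as plain integers from a log export are all hypothetical; only the 1 to 99 score range comes from the parity table.

```python
from collections import Counter


def score_histogram(scores, bucket_width=10):
    """Count bot scores per bucket, keyed by each bucket's lower bound.

    Assumes `scores` is an iterable of integers taken from a log export.
    Values outside 1-99 (unscored or malformed rows) are skipped.
    """
    counts = Counter()
    for score in scores:
        if not 1 <= score <= 99:
            continue
        lower_bound = (score // bucket_width) * bucket_width
        counts[lower_bound] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample; a real baseline would cover at least 30 days per zone.
sample_scores = [1, 2, 9, 14, 30, 31, 88, 99, 0, 120]
print(score_histogram(sample_scores))
```

Keeping one histogram per zone and per day makes it easier to compare the same buckets again during the later soak periods.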

Phase 1: CrowdSec at origin in detect

CrowdSec's agent and LAPI go live on each origin reverse-proxy, parsing access logs through the nginx and base HTTP collections. No bouncer enforces yet. We canonicalise the real client IP from CF-Connecting-IP so decisions point at attackers, not Cloudflare's egress range.

Users see: None — detect-only.

Rollback: systemctl stop crowdsec.
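The client-IP canonicalisation can be sketched as follows. This is a minimal illustration, not the proxy configuration itself: the function name is hypothetical, and the two CIDRs are only an illustrative subset standing in for the full published Cloudflare range list. The key point is that CF-Connecting-IP is trusted only when the TCP peer is a Cloudflare address.

```python
import ipaddress

# Illustrative subset only; a real deployment loads the full published list.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("103.21.244.0/22"),
]


def real_client_ip(peer_ip, headers):
    """Return the IP that decisions should target.

    CF-Connecting-IP is honoured only if the connection came from a
    Cloudflare range; otherwise the header could be forged, so the
    direct peer address is used instead.
    """
    peer = ipaddress.ip_address(peer_ip)
    from_cloudflare = any(peer in net for net in CLOUDFLARE_RANGES)
    claimed = headers.get("CF-Connecting-IP")
    if from_cloudflare and claimed:
        try:
            return str(ipaddress.ip_address(claimed.strip()))
        except ValueError:
            pass  # malformed header: fall back to the peer address
    return str(peer)
```

A request arriving directly at origin with a forged header is attributed to its own peer address, so it cannot get an innocent IP banned.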

Phase 2: Bouncer blocks high-confidence only

The nginx bouncer enforces in BLOCK, but only for high-confidence scenarios — community-blocklist matches, sensitive-file probes, known WordPress brute-force, malicious-path probing. Behavioural scenarios stay in detect. Cloudflare's behaviour is unchanged.

Users see: Under 0.5% of requests hit a 403 at origin that the edge previously passed.

Rollback: Flip the bouncer to tap mode and reload nginx. Under 5 minutes.
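The split between enforced and detect-only decisions can be pictured as a simple allowlist check. The sketch below is illustrative: the dictionary shape, the "CAPI" origin label for community-blocklist entries, and the scenario names are assumptions chosen as plausible examples, not the engagement's actual list.

```python
# Example names only (assumed); the real list is agreed per engagement.
HIGH_CONFIDENCE_SCENARIOS = {
    "crowdsecurity/http-sensitive-files",
    "crowdsecurity/http-bf-wordpress_bf",
    "crowdsecurity/http-probing",
}


def should_enforce(decision):
    """Return True if the decision belongs to the high-confidence tier.

    `decision` is assumed to be a dict with "origin" and "scenario" keys.
    Community-blocklist matches (origin "CAPI" in this sketch) always
    enforce; everything else enforces only if its scenario is allowlisted.
    """
    if decision.get("origin") == "CAPI":
        return True
    return decision.get("scenario") in HIGH_CONFIDENCE_SCENARIOS
```

Anything that returns False here stays in detect, which is what keeps behavioural scenarios out of the blocking path during this phase.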

Phase 3: Tune behavioural scenarios

Behavioural scenarios move to BLOCK one at a time, each after its own 14-day soak. Per-route exclusions whitelist legitimate automation (monitoring, partner integrations, your mobile-app fingerprint), each with a ticket reference and expiry.

Users see: None, with correct whitelists.

Rollback: Per-scenario: remove or move back to detect. Under 10 minutes.
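One way to keep exclusions honest is to model the ticket reference and expiry as required data, so an exclusion without either simply stops applying. The record type and helper names below are hypothetical, and the sketch says nothing about how CrowdSec itself stores whitelists.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RouteExclusion:
    route_prefix: str  # e.g. "/healthz" (hypothetical route)
    reason: str        # who or what the exclusion is for
    ticket: str        # ticket reference justifying the exclusion
    expires: date      # last day the exclusion is honoured


def active_exclusions(exclusions, today):
    """Keep only exclusions that carry a ticket and have not expired."""
    return [e for e in exclusions if e.ticket and e.expires >= today]


def is_excluded(path, exclusions, today):
    """True if `path` falls under any still-active exclusion."""
    return any(
        path.startswith(e.route_prefix)
        for e in active_exclusions(exclusions, today)
    )
```

Expired entries fail closed: the route goes back under the scenario until someone renews the exclusion with a fresh ticket.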

Phase 4: Push decisions to Cloudflare

The Cloudflare bouncer writes long-lived origin decisions into a Cloudflare IP List, and a custom rule blocks or managed-challenges them at edge. Edge and origin now enforce the same decision, so the worst IPs are stopped before they cost you origin bandwidth.

Users see: None.

Rollback: Clear the IP List and disable the matching custom rule. Under 10 minutes.
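The selection of what gets pushed to the edge might look like the sketch below. It only builds the list items and makes no API call. The decision shape, the 24-hour "long-lived" threshold, and the item format with "ip" and "comment" keys are assumptions for illustration; in practice the Cloudflare bouncer handles this.

```python
from datetime import timedelta


def to_ip_list_items(decisions, min_remaining=timedelta(hours=24)):
    """Turn long-lived IP bans into deduplicated IP List items.

    Each decision is assumed to be a dict with "type", "scope", "value",
    "scenario", and "remaining" (a timedelta). Short-lived decisions stay
    origin-only so the edge list does not churn.
    """
    items = {}
    for decision in decisions:
        if decision["type"] != "ban" or decision["scope"].lower() != "ip":
            continue
        if decision["remaining"] < min_remaining:
            continue
        ip = decision["value"]
        items[ip] = {"ip": ip, "comment": f"crowdsec: {decision['scenario']}"}
    return sorted(items.values(), key=lambda item: item["ip"])
```

Filtering on remaining duration is a design choice: it keeps the edge list small and stable, and leaves short bans to the origin bouncer.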

Phase 5: Downgrade Bot Management

Only after a 30-day soak: Bot Management steps down to Bot Fight Mode (or Super Bot Fight Mode), and the rules referencing the edge bot score are deleted. CrowdSec absorbs the slight increase in bot traffic reaching origin. Dropping the SKU entirely is an option only for narrow, signed-off workloads.

Users see: Slight bot-traffic increase reaching origin during the change window; CrowdSec absorbs it.

Rollback: Reinstating the SKU is contract-dependent — 1 to 7 days, not 15 minutes. This phase is explicitly soak-gated for that reason.

Phase 6: Steady state

Ongoing scenario maintenance, quarterly bouncer upgrades, community or premium blocklist renewal, scenario PR review in CI, and managed console retention. Quarterly review of false-positive rate, blocked-traffic rate, and cost.

Users see: None.

Rollback: N/A — steady-state operations.

Feature parity

Where CrowdSec matches the edge — and where it cannot.

Capability | CrowdSec | Cloudflare Bot Management | Parity
Behavioural detection | CrowdSec scenarios (leaky-bucket on access logs) | Edge cannot see app-layer session or route state | OSS only
Bot detection / ML | IP and behaviour correlation, not per-request ML | cf.bot_management.score (1–99 ML across cross-tenant telemetry) | SaaS only
Verified-bot allowlist | Reverse-DNS plus IP-list compare (self-maintained) | cf.bot_management.verified_bot (cryptographic) | Partial
Silent clear-through | hCaptcha / mCaptcha / Anubis (visible challenge) | Turnstile silent clear-through | SaaS only
IP reputation | Community Blocklist (CTI), 15s pull, inspectable | cf.threat_score (opaque internal feed) | At parity
DDoS absorption | None; origin pays the bandwidth | Global anycast plus L3/L4 absorption | SaaS only
Rate limiting | Scenarios plus bouncer enforcement | Edge rate limiting | At parity
Client-side protection | Self-hosted CSP report-uri (no JS intel) | Page Shield | SaaS only
Exposed-credential check | HIBP k-anonymity at the app layer | Exposed-credential check at edge | SaaS only
Local enforcement | Bouncers (nginx, iptables, AWS WAF, Cloudflare, HAProxy, Envoy) | Cloudflare blocks at edge only | OSS only
JA3 / JA4 | Scenario groupby on proxy-injected JA3/JA4 fingerprints | Used inside the score, not exposed per-rule | Partial
Logging / SIEM | Local Console plus events under your retention | Bot Analytics, retention per plan | At parity
Cost model | OSS agent plus free community blocklist, optional premium | Per-request priced Enterprise add-on | Partial
Compliance (PCI 6.4.2 / SOC 2) | Self-owned SIEM plus scenario PR review | Vendor IR team plus plan-bound retention | Partial

What we're honest about

The caveats that keep Cloudflare in front.

Anycast and L3/L4 DDoS stay on Cloudflare — always

Global DDoS absorption needs globally-distributed points of presence and terabit upstream — that is physics and capex, not software. No origin-side tool reproduces it. Cloudflare keeps fronting every public origin; CrowdSec never owns the edge.

ML bot scoring beats us on top-tier scrapers

Cloudflare's per-request score is trained across a large share of the public web. Against headless-Chromium plus residential-proxy combos that rotate IP, UA, and fingerprint every request, IP-grouped scenarios get evaded. If sophisticated distributed bots are your threat, do not retire Bot Management — we will tell you so.
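The evasion described above follows from how an IP-grouped leaky bucket works. The sketch below is a simplified model, not CrowdSec's scenario engine: the class, the capacity of 5, and the 10-second leak interval are illustrative assumptions. Each IP gets its own bucket, so a client that never reuses an IP never fills one.

```python
from collections import defaultdict


class LeakyBucket:
    """Bucket that gains one unit per event and leaks one unit per interval."""

    def __init__(self, capacity, leak_seconds):
        self.capacity = capacity
        self.leak_seconds = leak_seconds
        self.level = 0.0
        self.last_seen = None

    def pour(self, timestamp):
        """Record one event; return True if the bucket overflows."""
        if self.last_seen is not None:
            leaked = (timestamp - self.last_seen) / self.leak_seconds
            self.level = max(0.0, self.level - leaked)
        self.last_seen = timestamp
        self.level += 1
        return self.level > self.capacity


def overflowing_ips(events, capacity=5, leak_seconds=10.0):
    """Return the IPs whose bucket overflowed.

    `events` is an iterable of (timestamp_seconds, ip), ordered by time.
    """
    buckets = defaultdict(lambda: LeakyBucket(capacity, leak_seconds))
    flagged = set()
    for timestamp, ip in events:
        if buckets[ip].pour(timestamp):
            flagged.add(ip)
    return flagged
```

Eight requests in eight seconds from one address overflow its bucket, while the same eight requests spread across eight addresses leave every bucket at level one. That is the gap a per-request score is meant to close.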

Silent clear-through, Page Shield, and exposed-credential checks have no clean OSS parity

Turnstile silently clears known-good clients; self-hosted captcha challenges good users more often. Page Shield's Magecart detection and the edge exposed-credential check both lean on cross-tenant intelligence CrowdSec does not have. If you use these, they stay on Cloudflare.
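For context on the app-layer alternative named in the parity table, the k-anonymity technique can be sketched as below. The password is hashed with SHA-1 and only the first five hex characters ever leave your system; the range response is matched locally. The function names and the "SUFFIX:COUNT" response format are assumptions for this sketch, and no network call is made here.

```python
import hashlib


def k_anonymity_parts(password):
    """Split the uppercase SHA-1 hex digest into a 5-char prefix and the rest.

    Only the prefix would be sent to a range-lookup service.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def is_exposed(password, range_response):
    """Check a password against a range response fetched for its prefix.

    `range_response` is assumed to be text with one "SUFFIX:COUNT" per line.
    """
    _, suffix = k_anonymity_parts(password)
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count or 0) > 0
    return False
```

This covers known-breached passwords only. It does not reproduce the cross-tenant intelligence behind the edge check, which is why the row is marked SaaS only.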

The XFF trust chain is the single most common outage cause

Behind Cloudflare, every origin connection arrives from a Cloudflare egress IP. If the real-client-IP chain is not canonicalised, CrowdSec can ban Cloudflare's whole range and take the site offline. We validate this at the Phase 1 gate and refresh Cloudflare's CIDR list daily.
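A last-line guard against this failure mode can be sketched as a check that refuses any ban overlapping a protected range. The function name is hypothetical and the two CIDRs are only an illustrative subset of Cloudflare's published list, which the text above says is refreshed daily.

```python
import ipaddress

# Illustrative subset only; refresh the full published list on a schedule.
PROTECTED_RANGES = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("103.21.244.0/22"),
]


def safe_to_ban(value, protected_ranges=PROTECTED_RANGES):
    """Return False if banning `value` would hit a protected range.

    `value` may be a single IP or a CIDR; a single IP is treated as a
    /32 (or /128) network so that both cases share one overlap check.
    """
    target = ipaddress.ip_network(value, strict=False)
    return not any(
        target.overlaps(net)
        for net in protected_ranges
        if net.version == target.version
    )
```

A decision that fails this check is a strong signal that the real-client-IP chain is broken, so it is worth alerting on rather than silently dropping.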

Why this beats a flag day

Reversible at every step, soak-gated before any SKU change.

Every phase up to the SKU downgrade rolls back in under 15 minutes — flip the bouncer to tap mode, clear the Cloudflare IP List, or disable a custom rule. We never delete the rules referencing the edge bot score until CrowdSec has run alongside Bot Management for at least 30 days and demonstrably caught at least 95% of the bot traffic the edge was catching. The Bot Management contract is only touched after that soak gate passes — never before, and never as a big-bang cutover.
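The soak gate reduces to two conditions, which the sketch below encodes with integer arithmetic to avoid rounding surprises at the threshold. The function and parameter names are hypothetical, and how "caught by both" is counted in practice is an assumption left to the measurement pipeline.

```python
def soak_gate_passes(days_observed, edge_bot_requests, also_caught_by_crowdsec,
                     min_days=30, min_coverage_pct=95):
    """Return True only if both the duration and coverage conditions hold.

    edge_bot_requests: bot requests the edge flagged during the soak.
    also_caught_by_crowdsec: how many of those CrowdSec also caught.
    """
    if days_observed < min_days or edge_bot_requests <= 0:
        return False
    # Integer comparison of caught / total >= min_coverage_pct / 100.
    return also_caught_by_crowdsec * 100 >= min_coverage_pct * edge_bot_requests
```

With no edge-flagged traffic at all the gate fails rather than passes, since an empty sample proves nothing about coverage.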

See whether CrowdSec can own your behavioural tier.

A 30-minute call with a senior security engineer. We baseline your Cloudflare config and bot-score histograms, map your origin proxy fleet, and tell you honestly which part of Bot Management CrowdSec can take over — and which part stays on the edge for good.
