Semgrep OSS → Snyk Code
Semgrep OSS ↔ Snyk Code: from integration to migration.
Your SAST gate is load-bearing, so we never flip a switch. Semgrep deploys alongside Snyk Code first, running in advisory mode in your pipeline while Snyk keeps owning every blocking decision, and only takes the gate once it has proven its recall against Snyk's critical and high findings. No flag day, no forced re-tooling, and every phase rolls back in under 15 minutes.
The honest exception we lead with: Snyk Code's reachability analysis and global FP-tuning loop have no OSS equal. We tell you where that is a real change for your team and staff a rule-tuning rotation to absorb it.
The idea
Run Semgrep in the pipeline first. Retire Snyk Code last.
The topology that makes this zero-downtime: Semgrep runs as an advisory per-PR job with --baseline-ref, so it reports only findings that are new in HEAD and never floods developers with the legacy backlog, while Snyk Code keeps running as the blocking gate of record. Both emit SARIF into one aggregator (GitHub code scanning or DefectDojo), deduplicated on file, line, and CWE. We curate rule packs and author private rules until Semgrep recalls at least 95% of Snyk's critical and high findings, then promote it to blocking app by app, with Snyk dropping to advisory before it is removed. No control is ever lost before its replacement is proven.
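To make the dedupe key and the 95% recall bar concrete, here is a minimal sketch. It assumes findings from both tools have already been normalised out of SARIF into a simple record; the `Finding` class, its field names, and the severity labels are illustrative assumptions, not a Semgrep or Snyk schema.

```python
from dataclasses import dataclass

# Severities that count toward the recall bar (assumed labels).
GATING_SEVERITIES = {"critical", "high"}


@dataclass(frozen=True)
class Finding:
    """Hypothetical normalised finding; not a real tool schema."""
    tool: str      # "semgrep" or "snyk"
    path: str      # repo-relative file path
    line: int      # start line of the finding
    cwe: str       # e.g. "CWE-89"
    severity: str  # "critical", "high", "medium", or "low"


def dedupe_key(finding):
    """The aggregator key from the text: file, line, and CWE."""
    return (finding.path, finding.line, finding.cwe)


def merge_by_key(findings):
    """Collapse findings from both tools into one entry per key."""
    merged = {}
    for finding in findings:
        merged.setdefault(dedupe_key(finding), set()).add(finding.tool)
    return merged


def semgrep_recall(findings):
    """Share of Snyk's critical/high findings that Semgrep also reported."""
    snyk = {dedupe_key(f) for f in findings
            if f.tool == "snyk" and f.severity in GATING_SEVERITIES}
    semgrep = {dedupe_key(f) for f in findings if f.tool == "semgrep"}
    if not snyk:
        return 1.0  # nothing to recall; treat as vacuously complete
    return len(snyk & semgrep) / len(snyk)
```

A repo would clear the bar when `semgrep_recall(findings) >= 0.95`. Exact line matching is the simplest possible key; two tools often report the same flaw a line or two apart, so a real aggregator may need a tolerance window.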
The phases
Seven steps. Each one reversible.
Baseline & inventory
We pull your Snyk Code projects, issues, and ignore rules via the Snyk API and document per-repo language breakdown, firing CWEs, FP rate per rule, and the load-bearing rules that have blocked merges in the last 90 days. Read-only.
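As an illustration of the per-rule FP rate in that inventory, the sketch below works on issue records that have already been exported. The `rule_id` and `ignored_as_fp` keys are hypothetical names for this example, not fields of the Snyk API response.

```python
from collections import Counter


def fp_rate_per_rule(issues):
    """Return {rule_id: fraction of that rule's issues ignored as false positives}.

    Each issue is a dict with two assumed keys:
      "rule_id"       - identifier of the rule that fired
      "ignored_as_fp" - True if the issue was ignored as a false positive
    """
    totals = Counter()
    false_positives = Counter()
    for issue in issues:
        totals[issue["rule_id"]] += 1
        if issue["ignored_as_fp"]:
            false_positives[issue["rule_id"]] += 1
    return {rule: false_positives[rule] / totals[rule] for rule in totals}
```

Rules with a high rate are candidates to drop from the recall target rather than re-create in Semgrep.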
Semgrep CI advisory
Semgrep runs on every PR in advisory mode on the top-5 repos by commit volume, with --baseline-ref so day-one noise is zero and SARIF uploaded to code scanning. Snyk Code stays the blocking gate, unchanged.
Rule-pack curation & private rules
A triage rotation reconciles Semgrep's findings against Snyk's once per release cycle, classifying each as true positive, false positive, or duplicate. For every real finding Snyk caught and Semgrep missed, we author a private YAML rule under /.semgrep/ and validate that it fires at the historic location.
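Validating a private rule against the historic location can be checked mechanically. The sketch below assumes the rule has been run with Semgrep's JSON output against the commit that contained the flaw and the result parsed into a dict; it relies on the `results`, `check_id`, `path`, `start.line`, and `end.line` fields of that output. The function name is ours, and matching on a rule-ID suffix is an assumption made because locally loaded rules are usually reported with a path-derived prefix.

```python
def rule_hits_location(semgrep_output, rule_suffix, path, line):
    """True if a rule whose ID ends with `rule_suffix` reported a finding
    in `path` whose line span covers `line`."""
    for result in semgrep_output.get("results", []):
        if (result["check_id"].endswith(rule_suffix)
                and result["path"] == path
                and result["start"]["line"] <= line <= result["end"]["line"]):
            return True
    return False
```

For example, `rule_hits_location(output, "raw-sql-concat", "app/db.py", 42)` would confirm that a hypothetical `raw-sql-concat` rule fires where Snyk originally flagged the issue.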
Semgrep blocking on net-new
Semgrep runs with --error against the baseline on the top-5 repos, blocking PRs that introduce a new high or critical finding. Snyk Code remains the all-history gate. Promotion proceeds in waves, from low to high blast radius, with 7 days of comms per wave.
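The gate decision in this phase reduces to a small piece of logic, sketched below. This models the behaviour, not Semgrep's own baseline implementation: findings are plain dicts with assumed `path`, `line`, `cwe`, and `severity` keys, and the `advisory`/`blocking` mode names are ours.

```python
BLOCKING_SEVERITIES = {"critical", "high"}  # assumed severity labels


def net_new(head_findings, baseline_findings):
    """Findings at HEAD whose (path, line, cwe) key is absent from the baseline."""
    baseline_keys = {(f["path"], f["line"], f["cwe"]) for f in baseline_findings}
    return [f for f in head_findings
            if (f["path"], f["line"], f["cwe"]) not in baseline_keys]


def gate_exit_code(head_findings, baseline_findings, mode):
    """Return 0 to let the PR merge, 1 to block it. Advisory mode never blocks."""
    if mode not in ("advisory", "blocking"):
        raise ValueError(f"unknown mode: {mode}")
    blockers = [f for f in net_new(head_findings, baseline_findings)
                if f["severity"] in BLOCKING_SEVERITIES]
    return 1 if mode == "blocking" and blockers else 0
```

Keeping the mode as a single parameter is what makes rollback cheap: reverting a wave means switching `blocking` back to `advisory`, with the scan itself unchanged. Keying on line numbers is a simplification, since edits above a finding shift its line.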
Semgrep primary, Snyk advisory
Semgrep becomes the blocking gate on PRs and main; Snyk Code flips to advisory — its SARIF still ingested, but its exit code no longer enforced. Weekly Snyk-only findings feed the rule-tuning rotation.
Snyk shadow & licence decision
Snyk Code drops to a nightly shadow run on main that auto-files low-priority tickets; the licence-downgrade discussion with Snyk begins. We do not cancel before the renewal date — we let it lapse to capture the calendar saving.
Snyk Code retirement
Snyk Code is fully decommissioned: projects disabled, full issue history exported to JSON for at least a year's retention, and the risk register updated to evidence SAST via Semgrep plus DefectDojo. Snyk Open Source and Container are handled separately.
Feature parity
Where Semgrep matches Snyk Code — and where it honestly does not.
| Capability | Semgrep OSS | Snyk Code | Parity |
|---|---|---|---|
| Scan type (SAST) | Structural pattern matching plus intra-proc taint | Snyk DeepCode ML SAST | At parity |
| Language coverage | First-class Python/JS/TS/Java/Go/Ruby; C/C++/Swift weaker | Mainstream set; stronger mobile (Swift/Kotlin) | Partial |
| Reachability / dataflow | mode: taint intra-proc; interproc partial or Pro | Reachability via global call-graph training | SaaS only |
| Finding format (SARIF) | SARIF 2.1.0 native (--sarif) plus JSON/JUnit/GitLab | Snyk Code SARIF plus Snyk JSON | At parity |
| CI / PR integration | semgrep ci; SARIF to code scanning; Platform PR comments (paid) | First-party SCM PR-comment bot | Partial |
| Rule authoring | YAML pattern / pattern-either / mode: taint; Registry packs | Vendor-curated; custom rules in higher tier | OSS only |
| FP feedback loop | pattern-not, .semgrepignore, # nosemgrep, DefectDojo | Global ML FP-feedback loop across customer base | SaaS only |
| Baseline / diff | --baseline-ref origin/main | Snyk "new issues only" PR view | At parity |
| CWE/OWASP mapping | metadata.cwe / metadata.owasp per rule | Findings tag CWE/OWASP | At parity |
| Local dev loop | semgrep scan under 30s; VS Code plus JetBrains plugins | Snyk IDE plugin (Snyk Code) | At parity |
| Pre-commit hook | First-class pre-commit framework hook | CLI only; custom wiring | OSS only |
| Deployment model | Self-hosted engine | SaaS vendor-hosted | OSS only |
| Cost model | OSS free; Platform paid for managed features | Per-developer / per-contributor | OSS only |
| Compliance (SOC 2) | Self-owned via DefectDojo plus CI logs | Inherited Snyk SOC 2 boundary | SaaS only |
What we're honest about
The caveats most vendors leave out.
Reachability analysis evaporates
Snyk Code's reachability analysis, driven by a global call graph from training data, de-prioritises findings that are not actually exploitable. Semgrep OSS has no equivalent, so some findings Snyk correctly quieted will look new or be prioritised higher. We document the reachability-de-prioritised set in Phase 0 and carry the rationale forward as risk-accepted in DefectDojo.
No vendor FP-feedback loop
Snyk's edge is a global false-positive feedback loop: it learns ignore signals from every customer and ships tuned rules. Semgrep OSS has nothing like it. We mitigate locally with pattern-not clauses, .semgrepignore, and reviewed # nosemgrep comments, but this is the largest ongoing delta and needs a staffed rule-tuning rotation (roughly 0.25 to 0.5 FTE for a mid-sized org).
Mobile and C/C++ coverage is weaker
Semgrep OSS is strong on Python, JS/TS, Java, Go, and Ruby, but Swift, Kotlin, Objective-C, C, and C++ taint depth is weaker than Snyk's. Mobile-heavy or C/C++-heavy repos may need to stay on Snyk Code past Phase 6 or pair with a complementary tool. We flag those repos before any gate weight shifts.
Interprocedural taint has limits
Semgrep OSS taint runs intraprocedurally; data flow that crosses a module boundary or threads through several helpers favours Snyk's call-graph recall. The closed-source Semgrep Pro Engine closes part of that gap but is a paid product, not OSS — if it is on the table, this is a vendor swap, not an OSS migration, and we will say so.
Why this beats a flag day
Reversible per phase. Soaked before any contract lapses.
Every phase in this plan reverts in under 15 minutes: the Snyk Code job stays configured (possibly disabled, but never deleted) right through to retirement, so rollback is a one-line CI edit, not a rebuild. Snyk is never cancelled until Semgrep has run as the sole blocking gate through a minimum 30-day soak, with Snyk in shadow, during which no high or critical finding that Snyk caught escapes Semgrep. We let the licence lapse at renewal rather than cancelling early; the contract is the last thing to change.
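The soak criterion can be expressed as a simple check, sketched here under one assumption: that someone records, for each day Semgrep is the sole blocking gate, how many high or critical findings the Snyk shadow run caught that Semgrep missed. The function name and input shape are illustrative.

```python
def ready_to_retire(daily_escapes, min_days=30):
    """True when the most recent `min_days` days all recorded zero escapes.

    `daily_escapes` holds one count per soak day, oldest first: the number of
    high or critical findings Snyk caught that day and Semgrep did not.
    """
    if len(daily_escapes) < min_days:
        return False
    return all(count == 0 for count in daily_escapes[-min_days:])
```

Because only the trailing window is checked, a single escape restarts the clock: the soak passes again only after 30 further clean days.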
See whether your SAST coverage migrates cleanly.
A 30-minute call with a senior AppSec engineer. We inventory your languages and load-bearing Snyk rules, measure where Semgrep's recall and FP rate land, and tell you honestly which repos can move and which should stay — before any gate weight shifts.
Map my migration →