Corpus Insights — DeclassDB
DeclassDB indexes 349,128 declassified U.S. government documents for local search across 6 of its 7 federal sources; the remaining source, the State Dept, is live-proxied. Below: the corpus broken down by source, collection, and named entity, with dated provenance for every locally indexed source.
Corpus at a glance
- 349,128 documents indexed & searchable locally (6 of 7 sources; 1 live-proxied)
- 7 federal sources: CIA, FBI, NSA, State Dept, NARA, DoD, and the National Security Archive
- 1941–2019 — CIA CREST document-year span
- 192 named collections (plus 15,772 unattributed documents)
- 21,447 named entities extracted (entity coverage excludes State Dept and NARA)
Documents by source
Counts cover locally indexed documents only. State Dept is live-proxied, so it has no local count and is not included in the totals.
| Source | Documents |
|---|---|
| CIA CREST | 309,708 |
| FBI Vault | 9,282 |
| State Dept | live-proxied |
| NSA | 12,514 |
| NARA | 1,822 |
| DoD | 30 |
| NSArchive | 15,772 |
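The per-source counts can be cross-checked against the headline total with a few lines of arithmetic. The following is a minimal sketch in Python, with the figures copied from the table above; the variable names are illustrative only and do not reflect how DeclassDB stores its statistics.

```python
# Locally indexed document counts, copied from the table above.
# State Dept is live-proxied, so it has no local count (None).
source_counts = {
    "CIA CREST": 309_708,
    "FBI Vault": 9_282,
    "State Dept": None,
    "NSA": 12_514,
    "NARA": 1_822,
    "DoD": 30,
    "NSArchive": 15_772,
}

# Keep only the sources that have a local count.
local = {name: n for name, n in source_counts.items() if n is not None}
total = sum(local.values())
crest_share = local["CIA CREST"] / total

print(f"{len(local)} of {len(source_counts)} sources indexed locally")
print(f"total: {total:,}")
print(f"CIA CREST share: {crest_share:.1%}")
```

The six local counts sum to 349,128, matching the headline figure, and CIA CREST alone accounts for nearly nine in ten locally indexed documents.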
Top collections
- Friedman Documents — 7,788 documents
- FBI Vault — General — 5,011 documents
- Internal Periodicals — 2,183 documents
- Espionage — 1,784 documents
- Gangster Era — 746 documents
- FOIA Reports — 538 documents
- Civil Rights — 458 documents
- Cults & Mass Incidents — 410 documents
- Historical Releases — 265 documents
- Public Corruption / Politics — 227 documents
- NSA 60th Timeline — 184 documents
- USS Liberty — 183 documents
- Yardley Collection — 178 documents
- Counterterrorism — 177 documents
- Venona — 177 documents
Unattributed — 15,772 documents not yet assigned to a named collection (ingest defaults), excluded from the ranking above.
Provenance — verify every source
- CIA FOIA Electronic Reading Room — 309,708 documents — indexed 2026-05-19
- FBI Vault (FOIA Library) — 9,282 documents — indexed 2026-05-15
- Department of State FOIA — live-proxied — date pending
- NSA Declassification & Transparency — 12,514 documents — indexed 2026-05-16
- National Archives (NARA) — 1,822 documents — indexed 2026-05-20
- Department of Defense FOIA — 30 documents — indexed 2026-05-18
- National Security Archive (GWU) — 15,772 documents — indexed 2026-05-18
DeclassDB
FOIA search is fragmented and keyword-only. DeclassDB unifies seven federal archives — CIA CREST, FBI Vault, NSA, State, NARA, DoD, NSArchive — and adds AI semantic search, so you can find what you mean, not just what you typed.
AI semantic search
Find by meaning across 309,708 CREST documents (1.05M pages, full-text). The embedding model runs in your browser — no cloud round-trip. Pro & Researcher.
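As a rough illustration of what "find by meaning" involves, semantic search typically embeds the query and each document as vectors and ranks documents by cosine similarity. The sketch below is written in Python for readability and uses hand-made 3-dimensional toy vectors with hypothetical document IDs. It is not DeclassDB's model, index, or in-browser code, none of which this page describes in detail.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, doc_vecs, k=3):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy embeddings (hypothetical): real models emit hundreds of dimensions.
doc_vecs = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.1, 0.9, 0.1],
    "doc-c": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]

results = rank(query_vec, doc_vecs, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

Here `doc-a` and `doc-c` rank above `doc-b` because their vectors point in nearly the same direction as the query, regardless of which keywords the underlying text shares with it.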
One query, every agency
Unified, blended, de-duplicated across seven federal sources. CIA / FBI / State always free; NSA / NARA / DoD / NSArchive unlocked on Pro.
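One way to picture blending and de-duplication with tier gating is sketched below. Almost everything here is assumed for illustration: the hit fields, the tier names `"free"` and `"pro"`, de-duplication on a normalized title, and the `blend` function are all hypothetical rather than DeclassDB's actual pipeline. Only the split between free and Pro sources comes from the text above.

```python
# Source split taken from the text above; everything else is hypothetical.
FREE_SOURCES = {"CIA", "FBI", "State"}
PRO_SOURCES = {"NSA", "NARA", "DoD", "NSArchive"}

def allowed_sources(tier):
    """Sources visible at a tier ('free' or 'pro'; assumed tier names)."""
    return FREE_SOURCES | PRO_SOURCES if tier == "pro" else set(FREE_SOURCES)

def blend(hits, tier="free"):
    """Merge per-source hits, drop duplicates, sort by score descending.

    Each hit is a dict with 'source', 'title', and 'score'. Duplicates are
    detected by a case- and whitespace-normalized title (an assumption);
    the highest-scoring copy of each title wins.
    """
    visible = allowed_sources(tier)
    best = {}
    for hit in hits:
        if hit["source"] not in visible:
            continue  # source is locked at this tier
        key = " ".join(hit["title"].lower().split())
        if key not in best or hit["score"] > best[key]["score"]:
            best[key] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)

hits = [
    {"source": "CIA", "title": "Venona Summary", "score": 0.81},
    {"source": "NSA", "title": "venona  summary", "score": 0.90},
    {"source": "FBI", "title": "Espionage File", "score": 0.75},
]
free_results = blend(hits, tier="free")
pro_results = blend(hits, tier="pro")
```

In this toy run the free tier returns the CIA and FBI hits, while the Pro tier replaces the CIA copy with the higher-scoring NSA duplicate of the same title. A real system would likely de-duplicate on something sturdier than titles, such as document identifiers or content hashes.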
Privacy-first by design
Vectors, AI summaries, and entity extraction all run on your device. No cloud, no search logs, no third-party trackers. Air-gappable for federal use.