Corpus Insights — DeclassDB

DeclassDB indexes 349,128 declassified U.S. government documents for local search, covering 6 of 7 federal sources, with live-proxied access to the rest. Below: the corpus broken down by source, collection, and named entity, with dated provenance for every source.

Corpus at a glance

  • 349,128 documents indexed & searchable locally (6 of 7 sources; 2 live-proxied)
  • 7 federal sources: CIA, FBI, NSA, State Dept, NARA, DoD, and the National Security Archive
  • 1941–2019 — CIA CREST document-year span
  • 192 named collections (plus 15,772 documents not yet attributed to a collection)
  • 21,447 named entities extracted (entity coverage excludes State Dept and NARA)

Documents by source

Counts of locally indexed documents. State Dept is live-proxied and is not included in the local totals.

Source        Documents
CIA CREST     309,708
FBI Vault     9,282
State Dept    live-proxied
NSA           12,514
NARA          1,822
DoD           30
NSArchive     15,772
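
As a quick consistency check, the per-source counts in the table above can be summed and compared with the 349,128 headline figure. The sketch below is illustrative only: the dictionary and the function names (`LOCAL_COUNTS`, `local_total`, `source_shares`) are hypothetical and not part of DeclassDB, and the numbers are copied by hand from the table rather than read from any API.

```python
# Per-source local counts copied by hand from the table above.
# State Dept is live-proxied, so it has no local count and is omitted.
# All names here are hypothetical; this is not DeclassDB code.
LOCAL_COUNTS = {
    "CIA CREST": 309_708,
    "FBI Vault": 9_282,
    "NSA": 12_514,
    "NARA": 1_822,
    "DoD": 30,
    "NSArchive": 15_772,
}


def local_total(counts):
    """Sum the locally indexed document counts."""
    return sum(counts.values())


def source_shares(counts):
    """Return each source's share of the local total, in percent."""
    total = local_total(counts)
    return {source: 100 * n / total for source, n in counts.items()}


total = local_total(LOCAL_COUNTS)
shares = source_shares(LOCAL_COUNTS)

print(f"local total: {total:,}")
# List sources from largest to smallest share.
for source, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{source:<10} {pct:6.2f}%")
```

The six local counts add up to exactly 349,128, matching the headline figure, and CIA CREST accounts for roughly 89% of the locally indexed corpus.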

Top collections

  • Friedman Documents — 7,788 documents
  • FBI Vault — General — 5,011 documents
  • Internal Periodicals — 2,183 documents
  • Espionage — 1,784 documents
  • Gangster Era — 746 documents
  • FOIA Reports — 538 documents
  • Civil Rights — 458 documents
  • Cults & Mass Incidents — 410 documents
  • Historical Releases — 265 documents
  • Public Corruption / Politics — 227 documents
  • NSA 60th Timeline — 184 documents
  • USS Liberty — 183 documents
  • Yardley Collection — 178 documents
  • Counterterrorism — 177 documents
  • Venona — 177 documents

Unattributed — 15,772 documents not yet assigned to a named collection (ingest defaults), excluded from the ranking above.

Provenance — verify every source
