Data being seeded by my torrent server
Help preserve data being deleted by fascists: https://lydie.cc/data.html
Today I noticed a heck of a lot of 404s when I was fetching #pubmed data. Seems there was a thorough gutting there this week. Links to articles are still there, but the articles themselves are going missing en masse. I'm going to continue to grab and post what I can here:
If you work(ed) in federal government and know of public websites or data likely to be taken down sooner rather than later, I would appreciate knowing about it so we can prioritize archiving it.
We still have a bunch more on the way, but we prioritize requests and tips over my own inklings.
There are several programs running already; some might "sleep". Please let us know so that we can start getting coordinated. Thank you!
#ClimateResearch:
#DataRescue via @CopernicusEU
https://datarescue.climate.copernicus.eu/
Projects by World Meteorological Organization (WMO): DARE (still in progress???)
https://community.wmo.int/en/data-rescue-projects-and-initiatives-dare
all mentioned above + #LiveSciences
https://safeguarding-research.discourse.group/about
//ping @ZLabe @S4F @SafeguardingResearch @riffreporter @Riff_PlanG @kachelmannwetter
It's an exciting day for public health data rescue!
Today we added preliminary collected data for:
SAMHSA: https://git.lsit.ucsb.edu/publicdata/samhsa-gov
AHRQ: https://git.lsit.ucsb.edu/publicdata/ahrq-gov
CMS: https://git.lsit.ucsb.edu/publicdata/cms-gov
We also posted significant additions to the cancer.gov and NIH archives:
https://git.lsit.ucsb.edu/publicdata/cancer-gov
https://git.lsit.ucsb.edu/publicdata/nih-gov
Here's a preliminary dump of public data from the National Institutes of Health (NIH).
More on the way!
And here's a preliminary archive of data sets and publications from the US Department of Agriculture, including SNAP, chemical use, imports, etc.
And now dumps of the US National Archives and US Government Publishing Office data from 1996-2025! See the data.json for metadata describing what each file contains.
Note that the archives here (zip, tar) are git-lfs.
https://git.lsit.ucsb.edu/publicdata/us-national-archives-and-publications
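Because the archives are stored with git-lfs, a clone made without git-lfs installed ends up with small text "pointer" files in place of the real zip/tar data. Here is a minimal Python sketch for spotting those. It assumes the standard Git LFS pointer layout (a `version` line, then `oid sha256:...` and `size ...`); the function names and the 1024-byte cutoff are my own illustrative choices, not something taken from these repositories.

```python
from pathlib import Path
from typing import Optional

# First line of a standard Git LFS pointer file.
LFS_SPEC_LINE = "version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str) -> Optional[dict]:
    """Return {'oid': ..., 'size': ...} if text looks like an LFS pointer, else None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != LFS_SPEC_LINE:
        return None
    fields = {}
    for line in lines[1:]:
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value.strip()
    oid = fields.get("oid", "")
    size = fields.get("size", "")
    if not oid.startswith("sha256:") or not size.isdigit():
        return None
    return {"oid": oid[len("sha256:"):], "size": int(size)}


def find_unfetched_archives(root: str, suffixes=(".zip", ".tar")) -> list:
    """List archives under root that are still pointers (hypothetical helper)."""
    pending = []
    for path in sorted(Path(root).rglob("*")):
        # Assumption: pointer files are tiny, so skip anything of 1 KiB or more.
        if path.is_file() and path.suffix in suffixes and path.stat().st_size < 1024:
            text = path.read_text(encoding="utf-8", errors="replace")
            if parse_lfs_pointer(text) is not None:
                pending.append(str(path))
    return pending
```

If `find_unfetched_archives(".")` returns anything from inside a clone, running `git lfs pull` there should replace the pointers with the actual archives.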
And here's financial data for various USA federal agencies going back to 2001 in one big #postgresql database.
FYI: This repo uses git-lfs.
Just dropped a terabyte more data into the NCEI NOAA archive.
Note to future self: Do not ever put a terabyte of data in a single #git commit and push it ever again.
https://git.lsit.ucsb.edu/publicdata/NCEI-NOAA
#data #publicdata #datarescue #noaa
A thanks to everyone who's been linking to our publicdata rescue project! I keep seeing links to it show up in more and more places. The infrastructure it's running on is local (not public cloud) and not tied to any NSF funding. That said, I encourage you to clone these repositories if you have plenty of disk space and an unmetered connection. More coming soon!
Today, I've added a public data dump of the US Social Security Administration (SSA) website. This data dump includes things like #disability #research, policies, forms, and guidance for same-sex couples, etc.
We just learned about this Declaration to #DefendResearch and the inclusion of public data! Thanks for linking to us. We love the level of cross-project support and communication we see now!
#DataRescue
RE: https://bsky.app/profile/did:plc:l2xv3qrv4oz23pg7tnsrrwqx/post/3li5k3ayop225
@hacks4pancakes perhaps something for #DataHoarders https://www.tomshardware.com/tech-industry/big-tech/data-hoarders-race-to-preserve-data-from-rapidly-disappearing-u-s-federal-websites and others: https://connect.oeglobal.org/t/webinar-federal-data-disappearing-and-who-is-saving-it/7356 @SafeguardingResearch
Join our #DigitalPreservation efforts!
Forum: https://safeguarding-research.discourse.group/
Hashtags: #SafeguardingResearch, #DataRescue
Fediverse: @SafeguardingResearch
Bluesky: https://bsky.app/profile/datarescueproject.org
EDIT: ARCHIVE.ORG sets have been fixed. Let the downloading commence!
My US government data hoarding page is up and ready with links and torrents. The torrents are all being seeded by my junkbox torrent server. I will continue to add torrents as I download things.
We still have some efforts up for grabs. We have a running inventory if you have time/capacity. Sign up on our form for information (link on gdoc). I'll also post some activities here but lmk if you are working on them so we can keep track. #DataRescue2025 #DataRescue
Folks in data rescue and archiving initiatives may find this #RStats package from @ropensci.org useful: `gitcellar` downloads and archives all repos, issues, and PRs from a GitHub organization in one shot: docs.ropensci.org/gitcellar/ #DataRescue @datarescue2025.bsky.social
What happens when public data infrastructures aren’t safe?
This is the question that Laura Rothfritz from our research group at Humboldt-Universität zu Berlin explores in her dissertation project.
Her research focuses on #DataRescue initiatives in the U.S. in response to the Trump administration.
Learn more about her research on our group blog:
https://doi.org/10.59350/6g3km-9w70
cc: @ztirfhtor, @IBI_HU, @HumboldtUni