Apologies if crossposting is against the rules; I'm not entirely sure where the lines are drawn here yet.

I posted this in lemmy.fosscad (fosscad@lemmy?), but realize that may not be the most active venue.

I grabbed archives of fosscad and took a look at the contents of the zst's. I think I could probably rebuild the contents of the subreddit in some manner or another; the question is scale and hosting. How would we make the posts easily searchable, where would they live, what endpoint can we upload hundreds of thousands of comments into in a reasonable time frame... all that fun stuff.
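For the "easily searchable" part, here's a rough sketch of one low-effort option: load the records into a single SQLite file and query it. This is only an illustration, not a decision on hosting. The `id`, `author`, and `body` field names are my assumption about what the JSON records contain, and `build_index` / `search` are made-up helper names.

```python
import sqlite3


def build_index(records, path=":memory:"):
    """Load comment-like dicts into a SQLite table (field names are assumed)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments "
        "(id TEXT PRIMARY KEY, author TEXT, body TEXT)"
    )
    # INSERT OR IGNORE makes re-running an upload session skip duplicates.
    conn.executemany(
        "INSERT OR IGNORE INTO comments (id, author, body) VALUES (?, ?, ?)",
        ((r.get("id"), r.get("author"), r.get("body", "")) for r in records),
    )
    conn.commit()
    return conn


def search(conn, term):
    """Substring search on the body; LIKE ignores ASCII case by default."""
    rows = conn.execute(
        "SELECT id, body FROM comments WHERE body LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()


# Made-up sample data standing in for parsed archive records.
sample = [
    {"id": "a1", "author": "user1", "body": "Print settings for PLA+"},
    {"id": "b2", "author": "user2", "body": "unrelated chatter"},
    {"id": "c3", "author": "user3", "body": "what print orientation?"},
]
conn = build_index(sample)
hits = search(conn, "print")
print([row[0] for row in hits])
```

A plain `LIKE` scan will get slow at hundreds of thousands of comments; SQLite's FTS5 full-text extension is the usual next step if your SQLite build includes it.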

The archives don't contain pictures, but contain links to the pictures and the ones I've checked are currently still live (meaning the pics are still hosted on reddit). Dunno how long that will remain the case.
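Since the pics are only linked, a first step toward sizing the problem would be pulling the image URLs out of the records before anything gets downloaded. A minimal sketch, assuming each record is a dictionary with a `url` field (that field name and the `image_links` helper are my assumptions, and the sample URLs are made up):

```python
from urllib.parse import urlparse

IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def image_links(records):
    """Return unique URLs that look like direct image links, in first-seen order."""
    seen = set()
    links = []
    for record in records:
        url = record.get("url")  # assumed field name
        if not isinstance(url, str):
            continue
        # Check the extension on the path only, so query strings don't interfere.
        path = urlparse(url).path.lower()
        if path.endswith(IMAGE_EXTS) and url not in seen:
            seen.add(url)
            links.append(url)
    return links


# Made-up records for illustration.
records = [
    {"id": "a1", "url": "https://i.redd.it/example1.jpg"},
    {"id": "b2", "url": "https://example.com/thread"},
    {"id": "c3", "url": "https://i.redd.it/example1.jpg"},
    {"id": "d4", "url": "https://i.redd.it/example2.PNG?width=640"},
    {"id": "e5"},
]
links = image_links(records)
print(len(links), "unique image links")
```

Counting the unique links first would at least give a ballpark for how many files there are before worrying about total size.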

I have no idea what the size of the archives would be with pics downloaded; gigs, a TB, no clue. I'm posting this to gauge public interest and I haven't done much preliminary work (oh, these are json. Yep, dictionaries work. Wingo.)

Is there any interest in making this more publicly available? I've run into an issue with a particular build and I'll be diving through the archives to fix it for myself. It seems like a shame that all this information would be inaccessible to everyone who isn't able or interested in trawling through their own local archives.

I'm not a programmer by trade, but work in an adjacent space. I can plink along on this if other people are interested (and if anyone is interested enough to help pitch in, even better).

[-] hogleg@forum.guncadindex.com 1 points 1 month ago* (last edited 1 month ago)

Had to rearrange some things, but I'm pulling data from end of 2022 through 2024. It's a chonker. This is everything, so will need to parse through and find anything id'd as fosscad. Not sure how long it would take to iterate through all of that; it's over 1TB.
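For the "find anything id'd as fosscad" pass, a streaming filter keeps memory flat no matter how big the dump is. A minimal sketch, assuming the decompressed files are one JSON object per line and that each record carries a `subreddit` field (both assumptions on my part; `iter_subreddit` is a made-up name). It takes any iterable of text lines, so the decompression step is left out here.

```python
import io
import json
from typing import Iterable, Iterator


def iter_subreddit(lines: Iterable[str], subreddit: str = "fosscad") -> Iterator[dict]:
    """Yield records whose 'subreddit' field matches, reading one line at a time."""
    wanted = subreddit.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip corrupt or truncated lines instead of aborting a long run
        if not isinstance(record, dict):
            continue
        if str(record.get("subreddit", "")).lower() == wanted:
            yield record


# Tiny in-memory stand-in for a decompressed dump (made-up sample data).
sample = io.StringIO(
    '{"subreddit": "fosscad", "id": "a1", "body": "first"}\n'
    '{"subreddit": "other", "id": "b2", "body": "skip me"}\n'
    'not json\n'
    '{"subreddit": "FOSSCAD", "id": "c3", "body": "second"}\n'
)
matches = list(iter_subreddit(sample))
print([m["id"] for m in matches])
```

In practice the lines would come from a streaming zstd reader wrapped in a text decoder rather than a `StringIO`, so nothing has to be fully decompressed to disk first.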

I'll be back this evening to update progress; download speed is pretty decent so if no big changes, should have the raw files tonight.

Edit: Happy surprise; it's everything from 2023 to 06/2025. So we're only losing the last handful of months of data (unless more gets added later). Still a pretty huge win.

Thanks to Grey Summit Gear for kicking the shit out of this, and the folks who pulled all these dumps!

[-] gsgmfg@fosscad.io 1 points 1 month ago

Flipping fantastic. Can't stop the signal!

I need to make my alt-github so I can PR the scripts and code I wrote to upload all this.

When you get the new zsts posted lmk and I'll start another upload session.

Amazing work.

[-] hogleg@forum.guncadindex.com 2 points 1 month ago

Just a heads up; I was able to finish pulling down the archives, but it's going to take a while to parse; I wasn't expecting to need this much storage touching my compute lol. I'm hoping I can have those ready for upload tonight or early tomorrow AM.

this post was submitted on 07 Oct 2025

General Discussion