Why wordfreq will not be updated - AI spam
(github.com)
Man I feel this, particularly the sudden shutting down of data access because all the platforms want OpenAI money. I spent three years building a tool that pulled follower relation data from Twitter and exponentially crawled its way outwards from a few seed accounts to millions of users. Using that data it was able to make a compressed summary network, identify community structures, give names to the communities based on words in user profiles, and then use sampled tweet data to tell us the extent to which different communities interacted.
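For anyone wondering what that kind of pipeline looks like in outline, here is a minimal sketch in Python. It is not the tool described above: the accounts, profiles and mentions are made-up toy data standing in for API responses, the function names are hypothetical, and "connected components of mutual follows" is a deliberately crude stand-in for real community detection.

```python
from collections import Counter, deque

# Made-up toy data standing in for API responses (assumption, not real accounts).
FOLLOWS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob", "dan"],
    "dan": ["eve", "fay"],
    "eve": ["dan", "fay"],
    "fay": ["dan", "eve"],
}
PROFILES = {
    "ann": "linguist corpus nerd",
    "bob": "corpus linguist",
    "cat": "linguist and birder",
    "dan": "network science phd",
    "eve": "network science",
    "fay": "network graphs",
}
MENTIONS = [("ann", "bob"), ("cat", "dan"), ("eve", "fay"), ("dan", "eve")]


def crawl(seeds, max_depth):
    """Breadth-first snowball crawl outward from the seed accounts."""
    edges, seen = set(), set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue:
        user, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand accounts at the depth limit
        for followed in FOLLOWS.get(user, []):
            edges.add((user, followed))
            if followed not in seen:
                seen.add(followed)
                queue.append((followed, depth + 1))
    return seen, edges


def communities(users, edges):
    """Label each user by the connected component of mutual follows it sits in."""
    neighbours = {u: set() for u in users}
    for a, b in edges:
        if (b, a) in edges:  # keep only reciprocated follows
            neighbours[a].add(b)
            neighbours[b].add(a)
    label = {}
    for start in sorted(users):
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:
            u = stack.pop()
            for v in neighbours[u]:
                if v not in label:
                    label[v] = start
                    stack.append(v)
    return label


def name_communities(label):
    """Name each community after the most common word in its members' profiles."""
    words = {}
    for user, comm in label.items():
        words.setdefault(comm, Counter()).update(PROFILES.get(user, "").split())
    return {c: (w.most_common(1)[0][0] if w else c) for c, w in words.items()}


def interactions(label):
    """Count sampled mentions between (source community, target community)."""
    counts = Counter()
    for src, dst in MENTIONS:
        if src in label and dst in label:
            counts[(label[src], label[dst])] += 1
    return counts


users, edges = crawl(["ann"], max_depth=4)
label = communities(users, edges)
names = name_communities(label)
mixing = interactions(label)
print(names)
print(dict(mixing))
```

At real scale the crawl would need rate limiting and persistent storage, and the grouping step would use a proper community detection algorithm rather than connected components.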
I spent 8 months in ethics committees to get approval to do it, and I got a prototype working. But rather than just publish, I wanted to make it accessible to the academic community, so I spent even more time building an interface, making it user-friendly, improving performance, making it more stable, etc.
I wanted to ensure that when we published our results I could also say "here is this method we've developed, and here you can test it and use it too for free, even if you don't know how to code". Some people at my institution wanted me to explore commercialising but I always intended to go open source. I'm not a professional developer by any means so the project was always going to be a janky academic thing, but it worked for our purposes and was a new way of working with social media data to ask questions that couldn't be answered before.
Then the API got put behind a $48K a month paywall and the project was dead. Then everywhere else started shutting their doors too. I don't do social media research anymore.
After my own heart right here. I followed some version of Luca Hammer's guide to categorise everyone I followed on Twitter into communities, then created RSS feeds of them using Nitter. It was fascinating seeing how they clustered together. I think I still have an old Gephi file with that output. I did this before Musk bought Twitter, since I knew he was going to wreck it.
Basically, I would have killed for this tool.
(I'm now wondering if anyone's published a guide on this for Bluesky.)