There are plenty of alternative Internets. And even just skipping the HTTP protocol is a start. IRC is a great example.
I have a testing website. I have never given the address to anyone, ever. It's not linked from anything. It's just a silly HTML site living on a domain.
It's still being pinged and probed to death by bad actors. Not necessarily AI scrapers. But it's dozens or hundreds of HTTP requests a day from random places all over the world.
There's no black forest. It's all lit up and under constant attack; every tree is already on fire.
That's because it's numerically possible to sweep through the entire IPv4 address range fairly trivially, especially if you do it in parallel with some kind of botnet, proverbially jiggling the digital door handles of every server in the world to see if any of them happen to be unlocked.
One wonders if switching to purely IPv6 will forestall this somewhat, as the number space is multiple orders of magnitude larger. That's only security through obscurity, though, and it's certain the bots will still find you eventually. Plus, if you have a domain name the attackers already know where you are: they can just look up your DNS record, which is what DNS records are for.
I like seeing them try and then thinking "begone thot! There is no entry for you"
In fact, I might make a honeypot that issues exactly that
nepenthes is the tool for that
But an IP can have multiple websites and even not return anything on plain IP access. How do crawlers find out about domains and unlinked subdomains? Do they even?
@kossa @dual_sport_dork If you're using HTTPS, which is by and large the norm nowadays, then every domain is going to be trivially discoverable via certificate transparency logs: https://social.cryptography.dog/@ansuz/115592837662781553
Thinking about this, wouldn't the best way to hide a modern website be something along the lines of getting a wildcard domain cert (can be done with LE using the DNS challenge), CNAMEing the wildcard to the root domain, and then hosting the website on a random subdomain string? Am I missing something?
I do something like this using wildcard certs with Let's Encrypt. Except I go one step further because my ISP blocks incoming data on common ports, so I end up using an uncommon port as well.
I'm not hosting anything important and I don't need to always have access to it; it's mostly just for fun for myself.
Accessing my site ends up looking like https://randomsubdomain.registered-domain-name.com:4444/
My logs only ever show my own activity. I'm sure there are downsides to using uncommon ports but I mitigate that by adjusting my personal life to not caring about being connected to my stuff at all times.
I get to have my little hobby in my own corner of the internet without the worry of bots or AI.
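For anyone wanting to try the random-subdomain trick, here is a minimal Python sketch of picking the label. The function names, the 16-character default, and the lowercase-plus-digits alphabet are my own assumptions for illustration, not anything the setup above requires.

```python
import math
import secrets
import string

# Lowercase letters and digits are all valid inside a DNS label.
# (The alphabet and default length are illustrative assumptions.)
ALPHABET = string.ascii_lowercase + string.digits


def random_label(length: int = 16) -> str:
    """Return an unguessable subdomain label built from a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def guess_space_bits(length: int) -> float:
    """Bits of entropy in a label of the given length over ALPHABET."""
    return length * math.log2(len(ALPHABET))


if __name__ == "__main__":
    label = random_label()
    print(f"https://{label}.registered-domain-name.com:4444/")
    print(f"about {guess_space_bits(16):.1f} bits to guess")
```

This only covers guessing the name. It says nothing about the name leaking some other way, for example through DNS resolvers or the TLS SNI field.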
Servers which are meant to be secure are usually configured not to respond to pings and not to give failure responses to unauthenticated requests. This should be viable for the authenticated-only, walled-garden type of website OP is suggesting, no?
It's not as simple as "only security through obscurity". You could say the same thing for an encryption key of a certain length. The private key to a public key is still technically just an obscurity, but it's still impractical to actually go through the entire range
IPv6 is big enough where this obscurity becomes impractical to sweep. But of course, as you said, there may be other methods of finding your address
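For a rough sense of scale, here is a back-of-envelope sketch in Python. The probe rate of one million addresses per second is an assumed figure picked for illustration, and `sweep_years` is a made-up helper name.

```python
SECONDS_PER_YEAR = 365.25 * 86_400


def sweep_years(prefix_len: int, probes_per_second: float,
                total_bits: int = 128) -> float:
    """Years needed to probe every address under a prefix exactly once."""
    host_bits = total_bits - prefix_len
    return (2 ** host_bits) / probes_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = 1_000_000  # assumed probes per second, for illustration only
    # A single IPv6 /64, the size commonly handed to one network.
    print(f"one IPv6 /64: {sweep_years(64, rate):,.0f} years")
    # All of IPv4 at the same rate, expressed in hours.
    ipv4_hours = sweep_years(0, rate, total_bits=32) * 365.25 * 24
    print(f"all of IPv4:  {ipv4_hours:.1f} hours")
```

At that assumed rate a single /64 works out to several hundred thousand years, while the whole IPv4 space takes a bit over an hour.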
Do you know how they find it? Is it just random input of address over and over?
Almost certainly. There are only 4,294,967,296 possible IPv4 addresses, i.e. 4.3ish billion, which sounds like a lot but in computer terms really isn't. You can scan them in parallel, and if you're an advanced script kiddie you could even exclude ranges that you know belong to unexciting organizations like Google and Microsoft, which are probably not worth spending your time messing with.
If you had a botnet of 8,000 or so devices and employed a probably unrealistically generous timeout of 15 seconds, i.e. four attempts per minute per device, you could scan the entire IPv4 range in just a hair over 93 days and that's before excluding any known pointless address blocks. If you only spent a second on each ping you could do it in about six days.
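If anyone wants to play with those numbers, the arithmetic is easy to reproduce. This is just a sketch that assumes each device probes one address at a time and that the work divides evenly; `sweep_days` is a name I made up.

```python
IPV4_ADDRESSES = 2 ** 32  # 4,294,967,296


def sweep_days(devices: int, seconds_per_probe: float,
               addresses: int = IPV4_ADDRESSES) -> float:
    """Days to probe every address once, one probe at a time per device."""
    addresses_per_device = addresses / devices
    return addresses_per_device * seconds_per_probe / 86_400


if __name__ == "__main__":
    print(f"{sweep_days(8_000, 15):.1f} days at 15 s per probe")
    print(f"{sweep_days(8_000, 1):.1f} days at 1 s per probe")
    print(f"{sweep_days(100_000, 1) * 24:.1f} hours with 100,000 devices")
```

With 8,000 devices that gives roughly 93.2 days at 15 seconds per probe and 6.2 days at one second; bump the botnet to 100,000 devices and the one-second case drops to about 11.9 hours.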
For the sake of argument, cybercriminals are already operating botnets with upwards of 100,000 compromised machines doing their bidding. That bidding could well be (and probably is) probing random web servers for vulnerabilities. The largest confirmed botnet was the 911 S5 which contained about 19 million devices.
That's amazing and scary at the same time. Thanks for putting it into perspective!
I don't know exactly how they do it, but probing every IPv4 address isn't that hard
If it's https it's discoverable by hostname.
https://0xffsec.com/handbook/information-gathering/subdomain-enumeration/#certificate-transparency
Certificate Transparency (CT) is an Internet security standard and open-source framework for monitoring and auditing digital certificates. It creates a system of public logs to record all certificates issued by publicly trusted CAs, allowing efficient identification of mistakenly or maliciously issued certificates.
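As a rough illustration of how little effort this takes, here is a Python sketch that pulls hostnames out of a CT search result. It assumes crt.sh's JSON output (the `output=json` query parameter and the `name_value` field, as I understand them), and the sample data is invented; the actual HTTP fetch is left out so the snippet runs offline.

```python
import json
from urllib.parse import urlencode


def crtsh_url(domain: str) -> str:
    """Build a crt.sh query URL for a domain and its subdomains.

    Assumes crt.sh accepts '%' as a wildcard and 'output=json'.
    """
    return "https://crt.sh/?" + urlencode({"q": f"%.{domain}", "output": "json"})


def hostnames_from_ct(json_text: str) -> list[str]:
    """Collect the unique names listed in a crt.sh-style JSON response."""
    names = set()
    for entry in json.loads(json_text):
        # 'name_value' may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            names.add(name.strip().lower())
    return sorted(names)


# Invented sample shaped like a crt.sh response, for illustration only.
SAMPLE = json.dumps([
    {"name_value": "example.com\nwww.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "WWW.example.com"},
])

if __name__ == "__main__":
    print(crtsh_url("example.com"))
    print(hostnames_from_ct(SAMPLE))
```

Note that a wildcard certificate only shows up as `*.example.com` here, so the specific subdomain you actually serve from is not revealed by the log entry itself.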
I have a DDNS setup. Pretty random site name. Nonetheless, it’s been found and constantly probed. Lots of stuff from Russia, China, a few countries in Africa, and India. A smattering of others, but those are the constant IPs that are probing or attempting logins.
crt.sh and certificate transparency
Fabulous insight. I think that would make me very happy. Bring back the forests! Burn down the Nazi trees!
That’s not just a fabulous insight, it’s a powerful revelation!
How about just living in the actual woods with no internet? Gets more tempting by the day.
Yeah, but where I live it's too damn hot
Is this hopeposting ?
Kinda yeah, it's what I thought lemmy would be, but more and more it isn't
Cyberpunk as a literary genre, and the Cyberpunk TTRPG in specific, are incredibly prophetic. In the Cyberpunk TTRPG (which predates the WWW), "the net" is eventually condemned (as in boarded up) because of AI and is replaced by siloed networks (think extended intranets).
And of course in the Cyberpunk TTRPG setting much of the open internet was rendered useless by self-replicating AI malware hijacking storage, processing, and bandwidth due to a zero-day exploit discovered by one egomaniacal hacker.
Well I mean that's kind of what Lemmy is like since it's far more niche than something like reddit, but AI crawlers will find it anyway.
AI crawlers don’t even need to crawl individual instances. If someone wanted to scrape Lemmy, it would be way more efficient to simply spin up their own instance and let federation do its thing. Federation is literally a built in way to mass distribute content to a bunch of different servers. So just spin up an instance, set it to not respect delete requests, (so you still get the deleted posts and comments), and scrape it locally. The entire thing could be set up in like 20 minutes, and it would allow for passive data collection instead of requiring active scrapers that run constantly.
Back in the days of dial-up and BBSes this was a problem too: you would still get robots trying to connect to modems by dialing every possible phone number.
~shhh~ ~they'll~ ~hear~ ~you!~
FUCK WE'RE TOO LATE, YOU ACTIVATED THE BOTS! YOU DOOMED US!
My bad, I'm sorry
Too late, it's dead!
Yeah it certainly is
I was thinking the other week about how it's getting to a point that I would consider a membership fee to access something like lemmy but guaranteed no AI or bots or bullshit advertising.
I know it isn't possible, but if it was, I'd pay a small fee to have it.
Do you think there will be safe places on the internet?
If it's connected, it's accessible. Won't matter what human level security we put in place when the datacenters these clankers run on have enough GPUs to brute force their way through.
Offline communication will make a resurgence, and will become indispensable when the resource wars the billionaires are funding reach the rest of the world.
Morpheus, that you?