Twts matching tag=sux32qq

The bots have begun to access my website way more often. I’m now getting about 120k hits on https://www.uninformativ.de/git/ within a couple of hours.

They don’t cache anything, probably on purpose.

It comes in waves. I get about 100 hits (all at once) on that /git endpoint, all from different IPs. Then it takes a moment until I get another wave of about 500-1000 requests (all at once) where they do HEAD requests on some of the paths below /git. I assume they did a GET earlier and are now checking if something has changed.
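A minimal sketch of how such waves could be spotted in an access log. It assumes the web server writes Common/Combined Log Format lines; the function name `find_waves`, the `/git` prefix default and the `min_hits` threshold are made up for illustration and are not taken from any real setup. It groups requests by second and reports the seconds with an unusually high number of hits, along with how many distinct IPs and which HTTP methods were involved.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: Common/Combined Log Format, e.g.
# 203.0.113.7 - - [10/Oct/2025:13:55:36 +0200] "HEAD /git/foo HTTP/1.1" 200 0
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)


def find_waves(lines, prefix="/git", min_hits=100):
    """Return (timestamp, hits, distinct_ips, methods) for each busy second.

    A "wave" here is simply any one-second bucket with at least
    `min_hits` requests to paths starting with `prefix`.
    """
    buckets = defaultdict(
        lambda: {"hits": 0, "ips": set(), "methods": defaultdict(int)}
    )
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or not m.group("path").startswith(prefix):
            continue  # unparsable line or a path we do not care about
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        bucket = buckets[ts]
        bucket["hits"] += 1
        bucket["ips"].add(m.group("ip"))
        bucket["methods"][m.group("method")] += 1

    return [
        (ts, b["hits"], len(b["ips"]), dict(b["methods"]))
        for ts, b in sorted(buckets.items())
        if b["hits"] >= min_hits
    ]
```

Feeding it the lines of an access log (for example `find_waves(open("access.log"))`, with a hypothetical file name) would list the busy seconds. A wave where the number of distinct IPs is close to the number of hits matches the "all from different IPs" pattern described above.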


#sux32qq

(#sux32qq) This probably means that I can no longer host my own website. I don’t want to deploy something like Anubis, because that ruins the whole thing: I want it to be accessible from ancient browsers, like OS/2 or Windows 3.11.

I’ll keep an eye on it for a while. Maybe try to block some IPs.

Sooner or later, I’ll take the website down and shift everything to Gopher.


#qiq5bnq

(#sux32qq) Why do I care about this?

  1. The load will become a problem at some point.
  2. These crawlers and the current “AI” in general are breaking the rules. I am supposed to be paying for every little thing, I get sued for “piracy”. But apparently, these rules only apply to me. If I had more money, I could break them. Fuck that.
  3. I simply don’t want it. Period.

#2s2wjga

(#sux32qq) “But all your stuff is MIT licensed! They are allowed to do that!”

Haha. As if they would care. They crawl everything they get their hands on.

Besides, that’s not true: the license states that the copyright notice must be retained. “AI” breaks that. They incorporate my code and my articles in their product and make it appear as if it was their work.


#imftnja

(#sux32qq) As expected: Didn’t last long. They’re coming from different IPs now.

I’ve read enough blog posts by other people to know that this is probably pointless. The bots have so many IPs/networks at their disposal …


#5qaxnia