So far, I have been able to 'control' the CPU use by setting limits to the process that pulls stuff from the database (pool size, CPU, memory).
This does release some of the CPU for other tasks, but I think that what creates the lag might actually be the clogged database queries. So, constraining those resources might not solve the lag problem.
Did you upgrade Lemmy on the 13th or something? Could be a new bug.
Also, did site traffic increase at all? Maybe there's an AI scraper messing up.
I did not update or change anything in the past few days.
But now that you mentioned an AI scraper, I looked into the logs and noticed some heavy requests to the API from a specific IP.
The requests look consistent with scraping: GET requests issued continuously to different API endpoints.
I have started denying their requests and it is the first thing that seems to have actually helped!
I don't want to speak too early but I think you may have identified the cause. Thanks!
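For anyone wanting to do the same kind of check, here is a minimal sketch of counting API requests per client IP. It assumes the reverse proxy writes nginx's "combined" access log format; the function name `top_api_clients` and the `/api/` prefix default are illustrative choices, not anything Lemmy provides.

```python
import re
from collections import Counter

# Assumption: nginx "combined" log format, e.g.
# 203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /api/v3/site HTTP/1.1" 200 512 "-" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*"'
)


def top_api_clients(lines, prefix="/api/", limit=10):
    """Return the `limit` client IPs with the most requests to paths under `prefix`."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        if match.group("path").startswith(prefix):
            counts[match.group("ip")] += 1
    return counts.most_common(limit)
```

Since a file object iterates line by line, it can be passed in directly, for example `top_api_clients(open(path))` with `path` pointing at wherever your proxy writes its access log. An IP that towers over the rest is a candidate for a closer look before blocking.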
I just realized that it is not a 'scraper': the requests came from the server that I am using to provide an interface to the site as an Onion site. The number of requests was suspiciously high, so maybe a bot is scraping through Tor. I will leave it off for a few days and see if I can turn it back on later.
That's a level of sophistication I hadn't seen - using Tor to scrape. Hopefully they give up soon. I'm glad we got some respite. Thank you!
Check your web access logs for these 3 IPs:
And see if they're making repeated un-scoped (no page parameter) requests to /api/v3/comment/list. If they are, block them in your firewall.
Those used to hit my instance constantly with requests like /api/v3/comment/list?sort=Old&page=16514 (yes, page 16,514). When I blocked those IPs making those requests, problem solved.
Thanks!
I don't see those specific IPs, nor 16514. But now I see what scrapers tend to look like in the logs :)
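The check suggested above (requests to /api/v3/comment/list with no page parameter, or with an absurdly deep page) could be sketched roughly like this. It again assumes nginx's "combined" log format; the `max_page` threshold of 1000 and the function name are arbitrary, hypothetical choices to tune for your own instance.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumption: nginx "combined" log format; only the client IP and request target are used.
REQUEST = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<target>\S+)[^"]*"')

COMMENT_LIST_PATH = "/api/v3/comment/list"


def suspicious_comment_list_clients(lines, max_page=1000):
    """Count, per IP, comment-list requests with no page parameter or a very deep page.

    Returns two Counters: (unscoped, deep).
    """
    unscoped = Counter()
    deep = Counter()
    for line in lines:
        match = REQUEST.match(line)
        if match is None:
            continue
        url = urlsplit(match.group("target"))
        if url.path != COMMENT_LIST_PATH:
            continue
        pages = parse_qs(url.query).get("page")
        if not pages:
            # no page parameter (or an empty one): an un-scoped listing
            unscoped[match.group("ip")] += 1
            continue
        try:
            page = int(pages[0])
        except ValueError:
            continue  # non-numeric page value, ignored in this sketch
        if page > max_page:
            deep[match.group("ip")] += 1
    return unscoped, deep
```

IPs that show up with high counts in either result are the ones worth comparing against normal traffic before adding a firewall rule.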
I am now pretty sure that the cause was scraper-like activity coming from the Mlmym front-end that I am serving over an onion site. I am not sure if it randomly started misbehaving or if a Tor scraper was using it.
After blocking this, federation was restored, performance increased, and CPU use came down: