lads

Evotech@lemmy.world · 16 hours ago

The block underneath ai is python

rtxn@lemmy.world · 1 day ago

Hail Anubis-chan.

Xylight@lemdro.id · 17 hours ago

Any “bot stopper” ends up stopping me somehow, including Anubis. I’m pretty sure I’ve been cursed by the RNG gods because even at 40 KH/s, I get stuck on the pages for like 2 minutes before it tells me success.

Similar things like hCaptcha or Cloudflare Turnstile either never load or never succeed. reCAPTCHA gaslights me into thinking I was wrong.

https://iloveanubis.phtn.app/

rollmagma@lemmy.world · 1 day ago

What’s this about?

rtxn@lemmy.world · edit-2 1 day ago

Anubis is a simple anti-scraper defense that weighs a web client’s soul by giving it a tiny proof-of-work workload (some calculation that has no efficient shortcut, like brute-forcing a cryptographic hash) before letting it pass through to the actual website. The workload is insignificant for human users, but very taxing for high-volume scrapers. The calculations are done on the client’s side using JavaScript code.
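
A minimal sketch of what such a proof-of-work gate can look like, assuming a SHA-256 hash puzzle where the client has to find a nonce whose digest starts with a given number of zero hex digits. The challenge string, function names, and difficulty are made up for illustration; this is not Anubis’ actual code.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> int:
    """Brute-force the first nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Check a claimed answer; this costs the server a single hash."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


# Hypothetical challenge; a real server would hand out an unpredictable one.
challenge = "example-challenge"
nonce = solve(challenge, difficulty=3)
print(nonce, verify(challenge, nonce, difficulty=3))
```

The client has to grind through nonces one by one, while the server only recomputes one hash to check the answer.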

(edit) For clarification: this works because the computation workload takes a relatively long time, not because it bogs down the CPU. Halting each request at the gate for only a few seconds adds up very quickly.
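
To put rough numbers on “adds up very quickly”, here is a back-of-the-envelope sketch. The page count and per-page timings are hypothetical, and it assumes the scraper is held at the gate on every request, as described above.

```python
def crawl_hours(pages: int, seconds_per_page: float, workers: int = 1) -> float:
    """Wall-clock hours to fetch `pages` pages when each fetch takes `seconds_per_page`."""
    return pages * seconds_per_page / workers / 3600


# Hypothetical numbers, chosen only for illustration.
PAGES = 1_000_000
print(crawl_hours(PAGES, 0.05))  # no challenge, about 50 ms per page
print(crawl_hours(PAGES, 2.0))   # held for 2 s per page at the gate
```

With these made-up figures a single worker goes from roughly 14 hours to roughly 556 hours (about 23 days) for the same crawl, while a human visitor only notices a couple of seconds.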

Recently, the FSF published an article that likened Anubis to malware because it’s basically arbitrary code that the user has no choice but to execute:

[…] The problem is that Anubis makes the website send out a free JavaScript program that acts like malware. A website using Anubis will respond to a request for a webpage with a free JavaScript program and not the page that was requested. If you run the JavaScript program sent through Anubis, it will do some useless computations on random numbers and keep one CPU entirely busy. It could take less than a second or over a minute. When it is done, it sends the computation results back to the website. The website will verify that the useless computation was done by looking at the results and only then give access to the originally requested page.

Here’s the article, and here’s aussie linux man talking about it.

The Quuuuuill@slrpnk.net · 1 day ago

fwiw Anubis is working on a more respectful update, this was their first pass solution for what was basically a break glass emergency. i understand FSF’s concern, but Anubis is the only thing that’s making a free and open internet remotely possible right now, and far better it than nightmare fuel like Cloudflare

daniskarma@lemmy.dbzer0.com · 1 day ago

How does it factor in the “free” and “open”?

It seems to be more about IP protection than anything else.

rtxn@lemmy.world · edit-2 1 day ago

A web server that can’t discriminate between a request made by a human and one made by a machine has to handle all requests. It may not be an issue for large companies like Amazon or Microsoft, but small websites will suffer timeouts and outages.
Without a locally hosted solution like Anubis, small websites would have to move behind a large centralized service like Cloudflare.
Otherwise they might not be able to continue operating and only large corporate-backed services like Twitter and Reddit would survive.

The alternative is having to choose between Reddit and Cloudflare. Does that look “free” and “open” to you?

daniskarma@lemmy.dbzer0.com · edit-2 1 day ago

That whole argument rests on two wrong assumptions.

It assumes that websites are under constant DDoS and that they cannot exist without DDoS protection.

This is false.

It assumes that Anubis is effective against DDoS attacks, which it is not. It is a mitigation, but any DDoS attack worth its name would have no trouble bringing down a site behind Anubis, since the server still has to handle the requests, even if they are smaller requests.

Anubis’ only use case is to make AI scrapers consume more energy while scraping, while also making many legitimate users use more energy. It’s just being promoted as part of the anti-AI wave, but I don’t really see much usefulness in it.

rtxn@lemmy.world · edit-2 1 day ago

It assumes that websites are under constant DDoS

It is literally happening. https://www.youtube.com/watch?v=cQk2mPcAAWo https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/

It assumes that Anubis is effective against DDoS attacks

It’s being used by some little-known entities like the LKML, FreeBSD, SourceHut, UNESCO, and the fucking UN, so I’m assuming it probably works well enough. https://policytoolbox.iiep.unesco.org/ https://xeiaso.net/notes/2025/anubis-works/

anti-AI wave

Oh, you’re one of those people. Enough said. (edit) By the way, Anubis’ author seems to be a big fan of machine learning and AI.

(edit 2 just because I’m extra cross that you don’t seem to understand this part)

Do you know what a web crawler does when a process finishes grabbing the response from the web server? Do you think it takes a little break to conserve energy and let all the other remaining processes do their thing? No, it spawns another bloody process to scrape the next hyperlink.

ℍ𝕂-𝟞𝟝@sopuli.xyz · 1 day ago

Websites were under a constant noise of malicious requests even before AI, but now AI scraping of Lemmy instances usually triples traffic. While some sites can cope with this, this means a three-fold increase in hosting costs in order to essentially fuel investment portfolios.

AI scrapers will already use as much energy as is available, so making them use more per site means fewer sites being scraped, not more total energy used.

And this is not DDoS. The objective of scrapers is to get the data, not to bring the site down, so while the server must reply to all requests, the clients can’t get the data out without doing more work than the server.
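
A rough way to see the “more work than the server” part, assuming a hash puzzle that asks for some number of leading zero hex digits (an assumption for illustration, not a statement about Anubis’ exact scheme). Each attempt succeeds with probability 16^-d, so the client needs 16^d hashes on average, while the server checks an answer with one hash.

```python
def expected_hashes(difficulty: int, bits_per_digit: int = 4) -> int:
    """Average number of hashes tried before `difficulty` leading zero hex digits show up."""
    return 2 ** (bits_per_digit * difficulty)


SERVER_HASHES_PER_CHECK = 1  # verifying a submitted answer is one hash

for d in range(1, 6):
    # Ratio of average client work to server verification work at difficulty d.
    print(d, expected_hashes(d) // SERVER_HASHES_PER_CHECK)
```

This only counts the hashing itself; the server still spends whatever it costs to serve the challenge page and, afterwards, the real page.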

kautau@lemmy.world · 1 day ago

Free software

users have the freedom to run, copy, distribute, study, change and improve the software

https://www.gnu.org/philosophy/free-sw.en.html

Open source

https://en.wikipedia.org/wiki/The_Open_Source_Definition

No discrimination against fields of endeavor, like commercial use

You are removing the terms “software” and “source”. The code is freely available, and to be open source it should be usable for whatever purpose.

As an aside, it’s used by smaller sites frequently to prevent overwhelming scraping that could take down the site, which has become far more rampant recently due to AI bots

daniskarma@lemmy.dbzer0.com · edit-2 1 day ago

I’m not saying it’s not open source or free. I’m saying that it does not contribute to making the web free and open. It really only contributes to making everyone waste more energy surfing the web.

The web is already too heavy; we do NOT need PoW added to that.

I don’t think even a Raspberry Pi 2 would go down over a web scrape. And Anubis cannot protect from a proper DDoS, so…

kautau@lemmy.world · 1 day ago

I don’t think even a Raspberry Pi 2 would go down over a web scrape

That absolutely depends on what software the server is running and whether there’s proper caching involved. If running some PoW is involved to scrape 1 page it shouldn’t be too much of an issue, as opposed to just blindly following and ingesting every link.

Additionally, you can allow “good bots” like the Internet Archive, and they’re currently working on a list of “good bots”

https://github.com/TecharoHQ/anubis/blob/main/docs/docs/admin/policies.mdx
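
The general idea of letting “good bots” through can be sketched like this. The rule list, the `action_for` helper, and the action names are hypothetical and only illustrate allowlisting by User-Agent; see the linked policy docs for how Anubis actually expresses its rules.

```python
import re

# Hypothetical allowlist; the pattern is illustrative, not Anubis' shipped policy.
GOOD_BOTS = [
    re.compile(r"archive\.org_bot"),  # assumed User-Agent token of the Internet Archive crawler
]


def action_for(user_agent: str) -> str:
    """Return "ALLOW" for allowlisted crawlers and "CHALLENGE" for everyone else."""
    if any(pattern.search(user_agent) for pattern in GOOD_BOTS):
        return "ALLOW"
    return "CHALLENGE"


print(action_for("Mozilla/5.0 (compatible; archive.org_bot)"))
print(action_for("Mozilla/5.0 (X11; Linux x86_64) Firefox"))
```

A User-Agent header is trivial to spoof, so a string match on its own is a weak signal.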

AI companies ingesting data nonstop to train their models doesn’t make for an open and free internet, and will likely lead to the opposite, where users no longer even browse the web but trust in AI responses that may be hallucinated.

CanadaPlus@lemmy.sdf.org · 1 day ago

Well, that’s a typically abstract, to-the-letter take on the definition of software freedom from them. I think the practical necessity of doing something like this, especially for services like Invidious that are at risk, and the fact it’s a harmless nonsense calculation really deserve an exception.

unalivejoy@lemmy.zip · 1 day ago

aussie linux man

How did I know exactly who you were talking about before clicking the link?

Sabata@ani.social · 1 day ago

The outro song played in my head…

InFerNo@lemmy.ml · 1 day ago

But they can still scrape it, it just costs them computation?

rtxn@lemmy.world · edit-2 1 day ago

Correct. Anubis’ goal is to decrease the web traffic that hits the server, not to prevent scraping altogether. I should also clarify that this works because it costs the scrapers time with each request, not because it bogs down the CPU.

Xylight@lemdro.id · 17 hours ago

Why not then just make it a setTimeout or something so that it doesn’t nuke the CPU of old devices?

rtxn@lemmy.world · 11 hours ago

Crawlers don’t have to follow conventions or specifications. If one has a setTimeout implementation that doesn’t wait the specified amount of time and simply executes the callback immediately, it defeats the system. Proof-of-work is meant to ensure that it’s impossible to get around the time factor because of computational inefficiency.
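
A small sketch of the difference, written in Python rather than browser JavaScript. `delay_gate` stands in for a setTimeout-style wait and `pow_check` for a hash-puzzle check; both names and the challenge string are made up for illustration.

```python
import hashlib
import time
from typing import Callable


def delay_gate(sleep: Callable[[float], None], seconds: float) -> str:
    """A gate that trusts the client to wait; `sleep` is whatever the client provides."""
    sleep(seconds)
    return "pass"  # nothing here proves that any time actually went by


def pow_check(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server-side check: recomputes the hash, so the client cannot just claim it did the work."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


# A crawler with its own runtime swaps in a sleep that returns immediately.
start = time.perf_counter()
result = delay_gate(lambda _seconds: None, 5.0)
elapsed = time.perf_counter() - start
print(f"delay gate: {result} after {elapsed:.4f} s instead of 5 s")

# Guessing a nonce without doing the search almost always fails the check.
print("guessed nonce accepted:", pow_check("example-challenge", 0, 4))
```

The wait is enforced only by code the client controls, while the hash check runs on the server, so the only way past it is to actually find a valid nonce.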

Anubis is an emergency solution against the flood of scrapers deployed by massive AI companies. Everybody wishes it wasn’t necessary.

Dadifer@lemmy.world · 1 day ago

Beautiful

Jankatarch@lemmy.world · edit-2 1 day ago

I did not know the FSF was complaining about Anubis. Don’t a bunch of FSF-allied organizations use it?

MadMadBunny@lemmy.ca · 1 day ago

I heard this picture