• Deckweiss@lemmy.world · 15 hours ago

    The only way I can think of is blacklisting everything by default, redirecting to a proper, challenging captcha (which can be self-hosted), and temporarily whitelisting IPs that have proven to be human.

    When you try to “enumerate badness” and block all AI user agents and IP ranges, you’ll always let some new ones through, and you’ll never be done adding to the list.

    Only allow proven humans.
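
    As a rough sketch of that default-deny flow (assuming a Flask app, an in-memory whitelist, and a hypothetical solve_is_valid() captcha check; a real setup would share the whitelist via something like Redis):

    ```python
    # Minimal sketch: deny by default, redirect to a captcha,
    # temporarily whitelist IPs that pass it.
    import time
    from flask import Flask, request, redirect

    app = Flask(__name__)

    WHITELIST_TTL = 3600  # seconds a proven-human IP stays whitelisted
    whitelist = {}        # ip -> expiry timestamp

    @app.before_request
    def deny_by_default():
        # Always allow the captcha endpoint itself, or nobody could ever pass it.
        if request.path == "/captcha":
            return None
        if time.time() < whitelist.get(request.remote_addr, 0):
            return None  # proven human, let the request through
        return redirect("/captcha")  # everyone else gets the challenge

    @app.route("/captcha", methods=["GET", "POST"])
    def captcha():
        if request.method == "POST" and solve_is_valid(request.form):
            whitelist[request.remote_addr] = time.time() + WHITELIST_TTL
            return redirect("/")
        return "captcha page here"  # serve the self-hosted captcha

    def solve_is_valid(form):
        # Hypothetical placeholder: verify the captcha answer server-side.
        return False
    ```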


    A captcha will inconvenience your users. If you just want to make things worse for the crawlers, let them spend compute resources through something like https://altcha.org/ (which would still allow them to crawl your site, but would make DDoSing very expensive) or AI honeypots.
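
    To illustrate the cost mechanism (a generic proof-of-work in the same spirit, not altcha’s actual protocol; the names and difficulty below are illustrative): the server issues a random challenge, and the client must find a nonce whose SHA-256 hash has enough leading zero bits before its request is accepted.

    ```python
    # Toy proof-of-work: cheap for the server to verify,
    # expensive for the client to solve.
    import hashlib
    import secrets

    DIFFICULTY_BITS = 20  # ~1M hash attempts on average per solve

    def make_challenge() -> str:
        return secrets.token_hex(16)

    def leading_zero_bits(digest: bytes) -> int:
        bits = 0
        for byte in digest:
            if byte == 0:
                bits += 8
                continue
            bits += 8 - byte.bit_length()
            break
        return bits

    def verify(challenge: str, nonce: int) -> bool:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        return leading_zero_bits(digest) >= DIFFICULTY_BITS

    def solve(challenge: str) -> int:
        # This is the work the client (or crawler) has to burn CPU on.
        nonce = 0
        while not verify(challenge, nonce):
            nonce += 1
        return nonce
    ```

    One page load costs a human a fraction of a second of CPU, but a crawler hammering millions of URLs pays for millions of solves, which is exactly what makes mass scraping and DDoSing expensive.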