Mr Postman's Lemmy
tofu@lemmy.nocturnal.garden to Self-hosting@slrpnk.net · 8 days ago

Aggressive AI scrapers are making it kinda suck to run wikis

weirdgloop.org

  • cross-posted to:
  • technology@lemmy.world
Bots are currently scraping the internet for LLM training data at unprecedented rates[1][2][3], driving up costs and destabilizing public-facing websites. I want to talk about how this has been particularly difficult for wikis, and has gotten much worse in the last few months.

Cross posted from: https://reddthat.com/post/66285192

  • grrgyle@slrpnk.net · 7 days ago

    We use NGINX’s 444 response A LOT.

    Hmm, interesting. I wasn’t aware of this one.
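
    For anyone else who hasn't met it: 444 is a non-standard, NGINX-specific code. When a config returns it, NGINX closes the connection without sending any response. Below is a rough sketch of how it might be used against scrapers. The User-Agent patterns and the server name are made-up placeholders, not taken from the linked article, and real scrapers often spoof their User-Agent, so treat this as an illustration rather than a working blocklist.

    ```nginx
    # Goes in the http {} context.
    # Flag requests whose User-Agent matches a pattern (hypothetical bot names).
    map $http_user_agent $drop_request {
        default                      0;
        ~*(ExampleBot|OtherScraper)  1;
    }

    server {
        listen 80;
        server_name wiki.example.org;  # placeholder hostname

        # Close the connection with no response for flagged clients.
        if ($drop_request) {
            return 444;
        }

        location / {
            # normal wiki handling goes here
        }
    }
    ```

    Compared to returning a 403, the client gets nothing back at all, which saves a little bandwidth and gives the bot less to work with.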
