Lemmy and Mastodon are public sites, and their code is open-source, I think? (I'm not a programmer/coder.) Can they really stop AIs from collecting/tracking their data every time those AIs scrape the Internet?

  • Jeremyward@lemmy.world
    10 months ago

    They can put a robots.txt file at their site root, which tells robots (including AI scrapers) to ignore the site. However, that only works on robots that follow the rule; it's self-enforced, so whether it'll be respected is a crap shoot. Beyond that, to be honest, there isn't a lot a public-facing website can do to avoid being scraped. Maybe put up a captcha on every page?
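    To illustrate how the honor system works: a well-behaved crawler reads robots.txt and checks each URL against it before fetching. Here's a minimal sketch using Python's standard-library `urllib.robotparser`, with a hypothetical robots.txt that blocks GPTBot (OpenAI's crawler name) while allowing everyone else. A misbehaving scraper simply skips this check.

    ```python
    from urllib.robotparser import RobotFileParser

    # Hypothetical robots.txt: ask GPTBot to stay out, allow all other agents.
    robots_txt = """\
    User-agent: GPTBot
    Disallow: /

    User-agent: *
    Allow: /
    """.splitlines()

    parser = RobotFileParser()
    parser.parse(robots_txt)

    # A polite crawler makes this check before every request;
    # nothing forces an impolite one to do so.
    print(parser.can_fetch("GPTBot", "https://example.com/post/123"))       # False
    print(parser.can_fetch("SomeBrowser", "https://example.com/post/123"))  # True
    ```

    The key point is that `can_fetch` runs on the *crawler's* side, not the server's. The site publishes the file; compliance is entirely voluntary.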