Via @rodhilton@mastodon.social

Right now if you search for “country in Africa that starts with the letter K”:

  • DuckDuckGo will link to an alphabetical list of countries in Africa, which includes Kenya.

  • Google's first hit is a ChatGPT transcript claiming that there are none, and Google's own summary says the same.

This is because ChatGPT at some point ingested this popular joke:

“There are no countries in Africa that start with K.” “What about Kenya?” “Kenya suck deez nuts?”

  • MalReynolds@slrpnk.net
    English
    arrow-up
    49
    ·
    11 months ago

    Remember GIGO (Garbage In, Garbage Out). Due to years of SEO and content farming (which Google profited from, so you get what you deserve, assholes), most of the internet, by volume, is self-congratulatory, for-profit garbage, or, you know, Reddit garbage. Hopefully someone points an LLM at the Library of Congress or another large, well-curated data source, but of course copyright will not allow it, thanks, Mickey Mouse. Wouldn’t surprise me if the military is already on it; hopefully that leaks…

    • tony@lemmy.hoyle.me.uk
      English
      arrow-up
      24
      arrow-down
      2
      ·
      11 months ago

      LLMs will eventually start feeding off search results from other LLMs and they’ll just start regurgitating each other’s nonsense. If that isn’t happening already.

      Nobody is going to point an LLM at a good data source because that would mean spending money on actually useful stuff, not fast cars and booze.

      • T156@lemmy.world
        English
        arrow-up
        4
        ·
        11 months ago

        > Nobody is going to point an LLM at a good data source because that would mean spending money on actually useful stuff, not fast cars and booze.

        It would also mean sorting through that data, and most people don’t have the time or money to bother, especially when it might shrink their data pool.

      • MalReynolds@slrpnk.net
        English
        arrow-up
        9
        ·
        11 months ago

        The purpose of content farming is to sell ads; Google gets a cut, most likely the lion’s share.