• Onno (VK6FLAB)@lemmy.radio
    arrow-up
    24
    ·
    5 months ago

    The underlying issue with an LLM is that there is no “learning”. The model itself doesn’t dynamically change whilst it’s being used.

    This article sets out a process that gives the ability to alter the model, by “dialling up” (or down) concepts. In other words, it’s changing the balance of the weight of concepts across the whole model.
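    As a rough illustration of what “dialling up” or down a concept could look like, here is a toy sketch. It assumes a concept corresponds to a direction in the model’s internal activations and that steering means shifting an activation along that direction at run time; the vectors, names and numbers are all made up, and this is not the article’s actual method.

```python
# Toy sketch of "dialling" a concept up or down.
# Assumption: a concept is a direction in activation space, and steering
# shifts a hidden activation along it. All values here are hypothetical.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def concept_score(activation, direction):
    """How strongly the activation points along the concept direction."""
    norm = dot(direction, direction) ** 0.5
    return dot(activation, direction) / norm

def steer(activation, direction, strength):
    """Return a copy of `activation` shifted by `strength` along `direction`."""
    norm = dot(direction, direction) ** 0.5
    unit = [d / norm for d in direction]
    return [a + strength * u for a, u in zip(activation, unit)]

activation = [0.2, -1.0, 0.5]   # made-up hidden state
concept = [3.0, 0.0, 4.0]       # made-up direction for one concept

dialled_up = steer(activation, concept, 2.0)     # concept emphasised
dialled_down = steer(activation, concept, -2.0)  # concept suppressed

print(concept_score(activation, concept))
print(concept_score(dialled_up, concept))
print(concept_score(dialled_down, concept))
```

    In this sketch the stored parameters are never touched; only the activation for one input is nudged, which is why it arguably falls short of “learning”.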

    Altering one concept is hardly “learning”, especially since it’s being done externally by researchers, but it’s a start.

    A much larger problem is that the energy consumption is several orders of magnitude larger than that of our brain. I’m not convinced that we have enough energy to make a standalone “AI”.

    What machine learning actually gave us is the ability to automatically improve a digital model of things. Take weather prediction: a week of forecast used to take hours on a supercomputer, and can now be produced on a laptop in minutes, with a much longer range and better accuracy. Machine learning made that possible.

    An LLM is attempting the same thing with human language. It’s tantalising, but ultimately I think the idea applied to language to create “AI” is doomed.

    • dsemy@lemm.ee
      English
      arrow-up
      8
      ·
      5 months ago

      A much larger problem is that the energy consumption is several orders of magnitude larger than that of our brain. I’m not convinced that we have enough energy to make a standalone “AI”.

      This is a major issue I have with basically anyone who talks about current “AI” systems - they’re clearly not even close to AI, as they require an extreme amount of energy and data to perform tasks which would be trivial to an actual brain. They seem to lack any ability to comprehend their input, only mimicking it through brute force, which is only feasible since computers got fast enough and we can currently keep up with the energy demands.

      • GenderNeutralBro@lemmy.sdf.org
        English
        arrow-up
        5
        ·
        5 months ago

        AI does not mean artificial brain or anything similar. It’s a very broad term that’s been in use for about 70 years now.

        Pac Man has AI.

        • dsemy@lemm.ee
          English
          arrow-up
          6
          ·
          5 months ago

          Obviously I’m not referring to that, but to what large tech companies call AI. And they are in fact trying to convince people these AI systems they are developing will soon be clever enough to be considered general AI.

    • Paragone@beehaw.org
      arrow-up
      2
      ·
      5 months ago

      To the best of my knowledge, back-propagation IS learning, whether it’s happening in a neural-net on a chip, or whether we’re doing it, through feedback, & altering our understanding ( so both hard-logic & our wetware use the method for learning, though we use a rather sloppy implementation of it. )

      & altering the relative-significances of concepts IS learning.

      ( I’m not commenting on whether the new-relation-between-those-concepts is wrong or right, only on the mechanism )
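      For the record, the mechanism in question can be sketched in a few lines. This is a toy example under stated assumptions: a single weight, a squared-error loss, and a made-up training pair. It shows gradient-based weight updates, which back-propagation computes layer by layer in a full network, and contrasts them with plain inference, which leaves the weight untouched.

```python
# Toy sketch: one weight trained by gradient descent on squared error.
# The data point, learning rate and step count are hypothetical.

def predict(w, x):
    """Inference only: uses the weight but never changes it."""
    return w * x

def train_step(w, x, target, lr):
    """One update: the gradient of (w*x - target)**2 with respect to w."""
    error = predict(w, x) - target
    grad = 2.0 * error * x
    return w - lr * grad

x, target = 2.0, 6.0   # made-up example; the ideal weight is 3.0
w = 0.0
initial_loss = (predict(w, x) - target) ** 2

for _ in range(20):
    w = train_step(w, x, target, lr=0.1)

final_loss = (predict(w, x) - target) ** 2

w_before = w
predict(w, x)          # using the model does not alter w
print(w, initial_loss, final_loss, w == w_before)
```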

      so, I can’t understand your position.

      Please don’t deem my comment worthy of answering: I’m only putting this here for the record, is all.

      Everybody can downvote my comment into oblivion, & everything in the world’ll still be fine.

  • astronaut_sloth@mander.xyz
    English
    arrow-up
    9
    ·
    5 months ago

    The original paper itself, for those who are interested.

    Overall, this is really interesting research and a really good “first step.” I will be interested to see if this can be replicated on other models. One thing that really stood out, though, was that certain details are obfuscated because of Sonnet being proprietary. Hopefully follow-on work is done on one of the open source models to confirm the method.

    One of the notable limitations is quantifying how activations correlate with text meaning, which will make any sort of control difficult. Sure, you can just massively increase or decrease a weight, and for some things that will be fine, but for real manual fine-tuning, that will prove difficult.

    I suspect this method is likely generalizable (maybe with some tweaks?), and I’d really be interested to see how this type of analysis could be done on other neural networks.

  • Ilandar@aussie.zone
    arrow-up
    5
    ·
    5 months ago

    This sounds promising, but I do wonder how much any progress they make will be undermined by:

    • the speed of advancements in AI
    • the fact that this research doesn’t necessarily apply to other LLMs
    • the fact that LLMs are being released/leaked to the public, so anyone who has access to them has the potential to jailbreak the AI and circumvent any safety precautions researchers implement as a result of this work