• chicken@lemmy.dbzer0.com
    8 months ago

    idk, from what I understand this is the “alignment problem” and it is very nontrivial. Is what you’re describing related to the current “reinforcement learning” techniques they use to keep things nice and censored? If so it’s been shown to not work very consistently and only functions like suggestions and not actual hard rules, and can backfire.

    • Scubus@sh.itjust.works
      8 months ago

      Not exactly. The alignment problem is an issue with optimization: the network is trained to do something, but that may not necessarily be what you wanted it to do. The method I referred to doesn’t affect the priorities of the network; it simply overrides certain outputs that the network may be sending. Imagine a network trained to account for weather conditions and range, and once it has adjusted its aim, it fires a round. Now imagine a second network specifically trained to identify friendly targets. The two networks don’t necessarily need to communicate, and they are completely separated software-wise. It’s simply that when the second network identifies a friendly target, it prohibits the first network from firing.

      By having them not communicate, you solve the problem that it might fire at friendlies. The issue now becomes that the aiming network thinks it’s doing its job, and it doesn’t understand (because it has no way of knowing) that it can’t fire. So it just holds position, pointing at the friendly and continuing to attempt to fire.
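
To make the arrangement concrete, here is a toy sketch, assuming each network can be reduced to a yes/no function over the same observation. Every name in it (`Interlock`, `wants_to_fire`, `sees_friendly`, the observation keys) is made up for illustration; it only shows the veto logic and the resulting stall, not a real system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Interlock:
    """Hard-coded gate between two independent models (all names hypothetical)."""
    wants_to_fire: Callable[[dict], bool]  # stand-in for the aiming network
    sees_friendly: Callable[[dict], bool]  # stand-in for the friendly-ID network

    def step(self, observation: dict) -> str:
        # Each model is queried on its own; neither sees the other's output.
        request = self.wants_to_fire(observation)
        veto = self.sees_friendly(observation)
        if request and veto:
            return "blocked"  # the override wins, but the aimer is never told
        if request:
            return "fire"
        return "hold"


def run(interlock: Interlock, observation: dict, steps: int) -> List[str]:
    """Replay the same observation to show the aimer retrying with no feedback."""
    return [interlock.step(observation) for _ in range(steps)]


# Trivial stand-ins for the two networks.
gate = Interlock(
    wants_to_fire=lambda obs: obs["aligned"],
    sees_friendly=lambda obs: obs["friendly"],
)

print(run(gate, {"aligned": True, "friendly": True}, 3))
```

Because nothing flows back from the gate to the aimer, the same request is made and blocked on every step, which is the "holds position and keeps trying" behavior described above.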

      • chicken@lemmy.dbzer0.com
        8 months ago

        That does clarify what you mean. Still, that’s just composing AI systems that do not operate on discrete logic within systems that do. There is no hard rational guarantee that the aiming model will not misjudge where the bullet will land, or that the target identification model will not misjudge who qualifies as a friendly. It might have passed extensive testing, or been trained on very carefully curated data, but you still have only approximate knowledge of what it will do. That applies even more so to more open-ended or ambiguous tasks.
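
A rough illustration of that point, assuming the identification model emits a "friendly" score and the hard rule is a fixed threshold on it. The threshold, the function name, and the scores below are all invented for the example, not measurements of anything.

```python
FRIENDLY_THRESHOLD = 0.5  # hypothetical cutoff used by the hard rule


def gate_allows_fire(friendly_score: float) -> bool:
    """Deterministic rule: block whenever the identifier reports a friendly."""
    return friendly_score < FRIENDLY_THRESHOLD


# (true label, score a hypothetical identifier produced); values are made up.
cases = [
    ("hostile", 0.10),
    ("friendly", 0.90),
    ("friendly", 0.30),  # the identifier misjudged this one
]

# The rule itself never misfires, yet it passes along the model's mistake.
wrongly_allowed = [
    label
    for label, score in cases
    if label == "friendly" and gate_allows_fire(score)
]

print(wrongly_allowed)
```

The gate behaves exactly as specified in every case; the one bad outcome comes entirely from the score it was handed, so the rule is only as reliable as the model feeding it.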