AI will soon be grading AI-submitted papers; certainly nothing can go wrong here

  • TwiddleTwaddle@lemmy.blahaj.zone
    14 upvotes · edited 2 years ago

    Texas sabotaging public education again? Color me shocked. No doubt the lower test scores will be used to justify privatizing more schools.

    Also, 3000 exam responses is laughably low to train an LLM. These tests are for every 3rd-8th grader. That’s fewer responses than you’d get from a single mid-sized school - expected to train an LLM how to grade probably millions of answers across the entire state.

    They claim it’s not an LLM because it doesn’t learn as it goes. I’m fairly certain that’s been the common implementation since we learned from the older generation of chatbots all turning into Nazis after being trolled by 4chan.

  • peanuts4life@lemmy.blahaj.zone
    English · 13 upvotes · 2 years ago

    Ugh… I’m deep in the AI sphere, and this seems like a bad idea to me. GPT (let’s face it, they are probably using OpenAI) can be deeply biased and arbitrary in its evaluations.

    For example, “Two apples and four oranges” might score better than “4 oranges and 2 apples” for inscrutable reasons. Say the question spelled out the numbers and the LLM has a weighted bias toward overall textual consistency: it might produce a reason to dock points that is apparently unrelated to that weight, such as “incomplete sentence,” for the second answer but not the first.

    Students may also receive lower scores due to cultural biases towards certain phrases, and factors as straightforward as their name.

    Finally, AI will hallucinate errors constantly if you ask it to evaluate text that contains no errors. Constantly. Consistently.
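
    One rough way to probe for this kind of arbitrary scoring is to feed a grader several answers that mean the same thing and see whether the scores differ. The sketch below is only an illustration: `invariance_gaps` and `toy_biased_grader` are hypothetical names, and the toy grader is a stand-in with a deliberately planted bias against digits, not anything a real grading system is known to do.

```python
from typing import Callable, Dict, List


def invariance_gaps(grade: Callable[[str], float], variants: List[str]) -> Dict[str, float]:
    """Return each later variant's score minus the first variant's score."""
    baseline = grade(variants[0])
    return {v: grade(v) - baseline for v in variants[1:]}


def toy_biased_grader(answer: str) -> float:
    """Hypothetical stand-in grader with a planted bias (assumption for illustration)."""
    score = 1.0
    if any(ch.isdigit() for ch in answer):
        score -= 0.25  # docks points for digits, which has nothing to do with correctness
    return score


# Three answers with the same meaning; a fair grader should score them identically.
variants = [
    "Two apples and four oranges",
    "4 oranges and 2 apples",
    "Four oranges and two apples",
]

gaps = invariance_gaps(toy_biased_grader, variants)
flagged = [v for v, gap in gaps.items() if abs(gap) > 1e-9]
print(flagged)
```

    With a real model you would swap the toy grader for the actual scoring call; any variant that lands in `flagged` got a different score for the same content.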

  • jonathanwerewolf@kbin.social
    6 upvotes · 2 years ago

    Teaching kids to game an evaluation system where humans can’t even be bothered to read their words is great preparation for the job market.

  • flatbield@beehaw.org
    English · 3 upvotes · 2 years ago

    I wonder to what sort of standard. I know I was shocked at how poor things were when I started grading college students’ work as a TA. Same later in the work world, reviewing nominations for an award.