• @ReveredOxygen@sh.itjust.works
    English
    58
    5 months ago

    I don’t have much of a statistics background, but I’m pretty sure most, if not all, definitions of an outlier would make the immortal the outlier.

  • @Perfide@reddthat.com
    31
    edit-2
    5 months ago

    Uhhhh… no? The immortal guy would be the outlier, and even then only if they’ve already been around for a long time. They could be in their 30s, in which case they won’t become an outlier for another 80-90+ years.

    • @scrion@lemmy.world
      12
      5 months ago

      In an intuitive sense, yes, absolutely.

      But in a mathematical or statistical sense, no. Remember, adding any finite number to infinity, or subtracting one from it, yields infinity again (the same goes for most other operations). In turn, the mean of any set of numbers that contains infinity is also infinity.
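
For what it's worth, floating-point arithmetic behaves the way this comment describes. A minimal Python sketch, assuming made-up lifespans and using `math.inf` as a stand-in for the immortal (the sample values are illustrative, not from the thread):

```python
import math

# Hypothetical lifespans in years; math.inf stands in for the immortal.
lifespans = [72.0, 81.0, 65.0, math.inf]

# Adding or subtracting a finite number leaves infinity unchanged.
print(math.inf + 120 == math.inf)  # True
print(math.inf - 120 == math.inf)  # True

# A single infinite value makes the arithmetic mean infinite too.
mean_lifespan = sum(lifespans) / len(lifespans)
print(mean_lifespan)  # inf
```

One caveat on "other operations": `math.inf - math.inf` and `math.inf * 0` evaluate to `nan` rather than infinity, so the rule does not cover every case.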

      An outlier, however, is defined as a large deviation from the mean. Therefore, everyone with a normal lifespan would be considered a statistical outlier. It’s kind of a mathematical pun.

      • @Perfide@reddthat.com
        13
        5 months ago

        That is NOT the definition of an outlier. There IS no set mathematical definition of an outlier, in fact. What counts as an outlier depends heavily on what the dataset actually is. Please do more research on what an outlier actually is. Even the Wikipedia page on outliers is surprisingly high quality (aka has good sources, read those), so it’s not hard.

        That all being said, let’s use your rigid definition. Approximately 100 billion humans have ever lived, from the emergence of Homo sapiens 200,000-300,000 years ago until now. The absolute oldest humans to ever live (ignoring Mr. Immortal) made it to about 120. Mr. Immortal has to be human to be factored into the calculations of human lifespan, so they are at the absolute MOST 300,000 years old.

        Now, let’s go ahead and say that all 100 billion humans lived to that maximum age of 120. Obviously not even remotely the case, but this is the best-case scenario here. The mean of a dataset is found by adding all of the numbers in the dataset together and then dividing by the number of data points in the set. So in this case it would be “(120 × 100,000,000,000 + 300,000) ÷ 100,000,000,001”.

        Now if you do the math on that, you find that even with Mr. Immortal included and every human living the absolute longest life possible, the mean is… 120.0000029988.

        Now tell me, what is closer to that number, 300,000 or the actual average lifespan of humans (70-something)? It’s not even close, and since the population keeps growing, Mr. Immortal would have to wait for humanity to die out before the mean could ever increase enough to make US the outliers.
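
The arithmetic above checks out. A minimal sketch that redoes it with exact fractions, using only the figures from the comment (the variable names are mine, and 70 stands in for "70-something"):

```python
from fractions import Fraction

humans = 100_000_000_000  # roughly 100 billion people, per the comment
max_age = 120             # everyone assumed to reach the record age
immortal_age = 300_000    # upper bound on the immortal's current age
typical_age = 70          # rough real-world average lifespan

# Exact mean of 100 billion people aged 120 plus one person aged 300,000.
mean_age = Fraction(humans * max_age + immortal_age, humans + 1)
print(f"{float(mean_age):.10f}")  # 120.0000029988

# Distance of each candidate outlier from that mean.
gap_immortal = float(abs(immortal_age - mean_age))
gap_typical = float(abs(typical_age - mean_age))
print(gap_immortal > gap_typical)  # True
```

The immortal sits roughly 299,880 years from the mean, while a typical lifespan sits about 50 years from it.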

        • @RGB3x3@lemmy.world
          English
          7
          5 months ago

          Life expectancy… The joke said life expectancy. Not life span.

          With a life expectancy of infinity, everyone else is an outlier because the average life expectancy becomes infinite.

        • @scrion@lemmy.world
          2
          5 months ago

          As RGB3x3 said, despite your brilliant analysis, you were unfortunately unable to read the meme properly. The meme was of decent quality, so it should not have been that hard.

      • @Laticauda@lemmy.ca
        8
        edit-2
        5 months ago

        Uh, in both mathematics and statistics I was taught that outliers are data points that differ significantly from the other values in a set of data by being either much larger or much smaller. I’ve never heard it described the way you are presenting it. Yes, adding anything to or subtracting anything from infinity gets you infinity, but outliers aren’t being added or subtracted; they’re being removed from the data set because they’re seen as skewing the data with a measurement or result that is, y’know, an outlier. One person who lives forever is not an accurate representation of the life expectancy of the average human, so they would obviously be the outlier who would be excluded from the data.
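
One common convention that matches this description is Tukey's fences, which flag anything more than 1.5 interquartile ranges outside the quartiles. It is only one of several rules in use, and the ages below are made up for illustration:

```python
from statistics import quantiles

# Hypothetical ages at death in years; 300_000 stands in for the immortal.
ages = [61, 67, 70, 72, 74, 75, 78, 79, 81, 83, 86, 300_000]

# Quartiles are rank-based, so one huge value barely moves them.
q1, _, q3 = quantiles(ages, n=4)
iqr = q3 - q1

# Tukey's fences with the conventional (but arbitrary) multiplier of 1.5.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

outliers = [a for a in ages if a < low_fence or a > high_fence]
print(outliers)  # [300000]
```

Because the fences are built from quartiles rather than the mean, the extreme value cannot drag the threshold along with it.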

      • Prox
        4
        5 months ago

        1. Generate a histogram of your dataset
        2. See that there’s only one sample (the immortal) in the “125+” bin and literally everything else falls into a generally normal distribution
        3. Classify the immortal as the outlier and most likely remove it from your analyses

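
The steps above can be sketched in a few lines. The ages are made up, the 25-year bin width is an arbitrary choice, and `bin_label` is a hypothetical helper; only the "125+" cutoff comes from the comment:

```python
from collections import Counter

# Hypothetical ages in years; 300_000 stands in for the immortal.
ages = [2, 45, 61, 67, 70, 72, 74, 75, 78, 79, 81, 83, 86, 90, 97, 300_000]

BIN_WIDTH = 25
CAP = 125  # everything at or above this age lands in one "125+" bin


def bin_label(age):
    """Return the histogram bin label for an age."""
    if age >= CAP:
        return "125+"
    low = (age // BIN_WIDTH) * BIN_WIDTH
    return f"{low}-{low + BIN_WIDTH - 1}"


# Step 1: build the histogram as a count per bin.
histogram = Counter(bin_label(a) for a in ages)

# Step 2: print it as text so the lone "125+" sample stands out.
for label in ["0-24", "25-49", "50-74", "75-99", "100-124", "125+"]:
    print(f"{label:>8} | {'#' * histogram[label]}")

# Step 3: classify the "125+" sample as an outlier and set it aside.
outliers = [a for a in ages if a >= CAP]
kept = [a for a in ages if a < CAP]
```

With this sample, `outliers` holds only the immortal and `kept` holds the other 15 ages.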
        • @scrion@lemmy.world
          -1
          5 months ago

          Thank you senpai, can you explain the steps to carelessly enjoy a joke in a meme community next?

  • @_danny@lemmy.world
    20
    5 months ago

    This is why “average” is a shitty way to measure what values are likely.

    If you have a thousand people who have a thousand dollars, and one person who has a billion dollars, the “average” person has a million dollars.
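
That figure is exact, by the way: 1,001 people hold 1,001,000,000 dollars between them. A quick sketch with Python's `statistics` module, which also shows what the median makes of the same data (the variable name `wealth` is mine):

```python
from statistics import mean, median

# A thousand people with $1,000 each, plus one person with $1,000,000,000.
wealth = [1_000] * 1_000 + [1_000_000_000]

print(mean(wealth))    # 1000000
print(median(wealth))  # 1000
```

The mean lands on a value nobody in the group actually has, while the median matches what 1,000 of the 1,001 people hold.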

    • @Kirca@lemmy.world
      1
      5 months ago

      Nah, “average” is fine, it’s just the wrong one being used here. Means are better for roughly bell-shaped data sets. In the case above, looking at the median and mode would help understand the data better.

      • @webghost0101@sopuli.xyz
        2
        5 months ago

        I agree that the median should be used here. Maybe it’s an issue with translation, but I was specifically taught that “median” and “average” are two very distinct things that should not be mixed up, so that the above doesn’t happen.

      • @_danny@lemmy.world
        1
        5 months ago

        Averages are fine if you have a pretty clean dataset. But if you have significant outlier data, like most do, averages can be misleading.

        Mode and median are generally better ways to look at a “central tendency”.

  • essell
    3
    5 months ago

    Not if you take the median as the average; most will be within one standard deviation.