The rules for bots

Lionir [he/him]@beehaw.org · 1 year ago

Rikudou_Sage@lemmings.world · edit-2 1 year ago

Hi there!

I’ve been contacted regarding my @autotldr@lemmings.world bot. Currently it’s disabled for beehaw.org as can be seen here.

I’d like to raise a discussion though as I think the bot is really useful.

Here are some global all-time stats:

 ---------- --------- ----------- ------------------------- ------------------------- ------------------------ 
  Comments   Upvotes   Downvotes   Negative comments count   Positive comments count   Neutral comments count  
 ---------- --------- ----------- ------------------------- ------------------------- ------------------------ 
  430        3096      82          0                         429                       1                       
 ---------- --------- ----------- ------------------------- ------------------------- ------------------------

These are per-instance stats (I stripped out instances other than yours). If I’m not mistaken, downvotes are disabled on Beehaw, so the like ratio doesn’t say much, but the other numbers still could:

 -------------------- ---------- --------- ----------- ------------ --------------------- 
  Instance             Comments   Upvotes   Downvotes   Like ratio   Upvotes per comment  
 -------------------- ---------- --------- ----------- ------------ --------------------- 
  beehaw.org           28         236       0           100.00%      8.43                 
 -------------------- ---------- --------- ----------- ------------ --------------------- 

Edit: The stats are generated by this - while it’s not the cleanest code I’ve ever written, I think it’s pretty readable and everyone can see that the stats are not some weird numbers to make it look better than it is.
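For readers who want to check the per-instance columns, here is a minimal sketch of the arithmetic. It assumes the like ratio is upvotes divided by total votes; that definition and the function names are illustrative assumptions, not taken from the linked stats code.

```python
def like_ratio(upvotes: int, downvotes: int) -> float:
    """Share of votes that are upvotes, as a percentage (assumed definition)."""
    total = upvotes + downvotes
    return 100.0 * upvotes / total if total else 0.0


def upvotes_per_comment(upvotes: int, comments: int) -> float:
    """Average number of upvotes each bot comment received."""
    return upvotes / comments if comments else 0.0


# beehaw.org row from the table above: 28 comments, 236 upvotes, 0 downvotes
print(f"{like_ratio(236, 0):.2f}%")           # 100.00%
print(f"{upvotes_per_comment(236, 28):.2f}")  # 8.43
```

With downvotes disabled, the ratio is 100% by construction, which is why only the upvotes-per-comment figure carries information for Beehaw.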

d3Xt3r@beehaw.org · edit-2 1 year ago

I liked the original autotldr bot on Reddit. The one here though seems to be producing a large summary instead of just TL;DR.

Here’s an example: https://lemmings.world/comment/920986

This comment takes up most of the screen space on my mobile device. I don’t consider this to be a TL;DR. At this rate, I’d opt to just read the article in question instead.

The other problem is that lengthy TL;DRs like this obstruct comments, making it annoying to scroll past for those of us who are on mobile devices. I could block the bot of course, but I don’t want to - I do want a legit TL;DR, not a reworded article.

Here’s my attempt at generating a TL;DR of the mentioned article, using ChatGPT:

Two manuscripts published on the arXiv pre-peer-review repository claim to have synthesized a room-temperature superconductor, LK-99. This superconductor is a variation of lead apatite, and allegedly operates not only at room temperature but also above water’s boiling point, at regular pressures. The synthesis process is complex, yielding potential variations in the final product. Early attempts to reproduce these findings have shown mixed results, adding to the intrigue and uncertainty surrounding the material’s properties. While these claims could potentially revolutionize the field, their validation would give rise to further challenges about how to transform this material into a practical, high-current-carrying form. The upcoming period is likely to see intense activity from labs worldwide in their efforts to corroborate these results, which could significantly accelerate the emergence of a scientific consensus.

IMO, this is what a TL;DR should look like - a single paragraph and under 150 words.

brie@beehaw.org · 1 year ago

To be fair, the TL;DR would be a lot shorter if the breaks between sentences were removed. I personally draw the line at around 200 words for a summary, so the 183 words in the summary is a bit long but still a reasonable TL;DR for an article.

Since Lemmy implements spoiler tags, I think wrapping the summary in a spoiler tag would be a way to solve the length problem.

Zoop@beehaw.org · edit-2 1 year ago

I love the idea of wrapping the summary in a spoiler tag! I think that would be a great idea to be implemented for bots like that and would solve a lot of this. That’s a great idea you’ve had and I thank you for sharing it!

Chris Remington@beehaw.org · 1 year ago

This makes sense and I agree.

𝒍𝒆𝒎𝒂𝒏𝒏@lemmy.one · 1 year ago

Note - I’m not a beehaw user

The one here though seems to be producing a large summary instead of just TL;DR

I kinda prefer this though; IMO condensing an article down into one or a few sentences could make it difficult to facilitate a “healthy” discussion.

A really minuscule TL;DR seems more likely to lead to a bunch of assumptions being made based on that alone, and to increase the likelihood of users calling each other out for not actually reading the article.

This comment takes up most of the screen space on my mobile device

Oh ☹️ I decreased the font size for comments on my mobile so there was a higher content density but that might not work for you

Lionir [he/him]@beehaw.org · 1 year ago

My thoughts are mostly that I wish this were integrated in Lemmy, for a couple of reasons:

- People might be interested to see comments, see that there’s one in a thread, only to realize it’s a bot
- The sorting algorithm of Lemmy makes no distinction between bots and users, so it can give much higher importance to posts which attract bots (namely news)
  - The algorithm really should, so I made an issue about it (https://github.com/LemmyNet/lemmy/issues/3806)
- Posting it as a comment just feels a bit noisy? It takes a lot of space in a thread.
  - On the other hand, maybe it could be hidden by a spoiler tag? I think @nfld0001@beehaw.org mentioned this being a possibility

Chris Remington@beehaw.org · 1 year ago

…maybe it could be hidden by a spoiler tag…

I like that idea a lot…solves some problems while still allowing the bot.

Rikudou_Sage@lemmings.world · 1 year ago

It makes the bot less useful, sadly.

Chris Remington@beehaw.org · 1 year ago

How so?

Rikudou_Sage@lemmings.world · 1 year ago

People really scan more than they read, and such a small comment would get missed very often.

Enfield [he/him]@beehaw.org · 1 year ago

I get what you’re getting at there, but I don’t think it would necessarily be an issue. I think that if you were to put the summary itself under the spoiler and nothing else, it would be reasonable to provide a couple more lines to explain the bot. I’d think that even with a couple of extra lines of copy it would take less real estate most of the time than if the bot continued to just provide the summary and two lines.

I’m also recalling that AutoTLDR on Reddit had some extra bits like an FAQ and providing extended summaries. Links to that stuff might also help to balance your visibility. I think the bulk of your screen real estate comes from the summary, so this content would be less of an issue in comparison.

🤖 I’m a bot that summarizes online articles! This summary is X% shorter than the article:

Summary in spoiler

[Filler text follows]
Oh, using ChatGPT to generate filler text, are we? How delightfully modern! Gone are the days of the monotonous "lorem ipsum" that Latin scholars might swoon over. Now, we can be graced with filler text in English, tailored to our whims by a machine that's fluent in more than just dead languages. Let's all take a moment to applaud the user's avant-garde approach to filling that empty space on a webpage.

But wait, there's more to this cutting-edge decision. Not only have we replaced a centuries-old tradition with a dash of AI flair, but we've also managed to make filler text even more inconsequential and pretentious. Why stick with the tried and true when you can have a machine generate something that's equally irrelevant but far more verbose? Truly, the future of procrastination is here, and it's dressed in a cloak of technological grandiosity. Bravo!

-

My programming is open source on GitHub and developed by @rikudou@lemmings.world. Contact my developer on either platform to ask questions, send feedback, and report issues.
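The mock-up above could be produced by something like the following. This is only a sketch: the function names are hypothetical, the "X% shorter" figure is assumed to be a word-count comparison, and the spoiler markup is the `::: spoiler` syntax quoted elsewhere in this thread.

```python
def percent_shorter(article: str, summary: str) -> int:
    """Word-count reduction of the summary relative to the article, in percent.

    Assumption: "shorter" is measured in whitespace-separated words.
    """
    article_words = len(article.split())
    summary_words = len(summary.split())
    if article_words == 0:
        return 0
    return round(100 * (1 - summary_words / article_words))


def format_bot_comment(article: str, summary: str) -> str:
    """Build a comment with a short visible header and the summary in a spoiler."""
    header = (
        "🤖 I'm a bot that summarizes online articles! "
        f"This summary is {percent_shorter(article, summary)}% shorter than the article:"
    )
    # Lemmy spoiler block: opening line with a title, the content, a closing line
    spoiler = f"::: spoiler Summary\n{summary.strip()}\n:::"
    return f"{header}\n\n{spoiler}"
```

Only the header stays visible when the spoiler is collapsed, so the comment takes up two or three lines until a reader opts in.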

brie@beehaw.org · 1 year ago

Although I do like the idea of having some other information outside of the spoiler, I’m of the opinion that bots should distinguish themselves with the bot flag, and no more. The message should introduce the content, rather than the bot itself, and information about the bot should go in the bot’s bio.

Here’s a summary of the article! This summary is 100% shorter than the article:

Summary

TehPers@beehaw.org · 1 year ago

I think adding 🤖 makes it stand out enough that even while skimming, I’d stop to look at what that is. Honestly this proposed format seems great, since it’s short but stands out, and I can “opt-into” reading the tl;dr by clicking the spoiler.

Chris Remington@beehaw.org · 1 year ago

I understand your point.

Rikudou_Sage@lemmings.world · 1 year ago

So, is there some kind of verdict? I’m not sure what’s there to do now.

Enfield [he/him]@beehaw.org · 1 year ago

On the other hand, maybe it could be hidden by a spoiler tag? I think @nfld0001@beehaw.org mentioned this being a possibility

Yep.

::: spoiler [Title]
[Content]
:::

[Title]

haha gottem

𝒍𝒆𝒎𝒂𝒏𝒏@lemmy.one · 1 year ago

Would it be possible for the bot to DM us instead when communities decide to ban/restrict them for whatever reason?

I’ve found this bot incredibly useful personally, and I assume the community does too, looking at various Lemmy posts where the TLDR bot upvotes closely follow the OP upvotes (sometimes exceeding it)

Note - I’m not a beehaw user, for anyone reading whose apps do not show my instance.

Rikudou_Sage@lemmings.world · edit-2 1 year ago

Well, I may try it, but it might drive up the costs of running the bot significantly if a lot of users use the feature.

Edit: Done. Will see how it goes, might roll this back if it’s too taxing on my wallet.

Chris Remington@beehaw.org · 1 year ago

I’m just trying to understand, generally speaking, how the bot works. It appears to me that the bot is looking for posted articles that exceed a certain word count threshold. If it finds these, then it creates a summary and posts this as a comment. Am I understanding this correctly?

Rikudou_Sage@lemmings.world · 1 year ago

It has support for specific news sites; I don’t want to rely on some automatic text extraction because those are prone to breaking. Here are the content extractors themselves, each for one site. If a post that contains a link to any of the supported sites is found anywhere on Lemmy (that the bot can see), it extracts the text and then summarizes it using this. It takes the 6 sentences directly from the article that look most important to the machine learning model it uses. Then it posts the result as a comment.
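A rough sketch of that pipeline follows. It is not the bot's actual code: the registry, the `news.example.com` host and the extractor are hypothetical, and a simple word-frequency score stands in for the machine learning model mentioned above.

```python
import re
from collections import Counter
from urllib.parse import urlparse


def extract_example_news(html: str) -> str:
    """Placeholder extractor; a real one would parse that site's article markup."""
    return html


# Hypothetical registry: one hand-written extractor per supported site.
EXTRACTORS = {"news.example.com": extract_example_news}


def find_extractor(url: str):
    """Return the extractor for a supported site, or None so the post is skipped."""
    return EXTRACTORS.get(urlparse(url).hostname)


def summarize(text: str, sentence_count: int = 6) -> str:
    """Pick the highest-scoring sentences verbatim and return them in article order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)  # crude stop-word filter

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:sentence_count])  # restore original order
    return " ".join(sentences[i] for i in chosen)
```

Because the summary is extractive, every sentence in it appears word for word in the article, which is also why its length depends on how long the chosen sentences happen to be.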

Zoop@beehaw.org · 1 year ago

I’m not sure how to tag a user so that they’ll definitely get a notification, so I wanted to reply directly to you to make sure you’ve seen the idea mentioned here in this comment: wrapping the summaries in spoiler tags. That way, those who want to read them easily can, they don’t automatically take up so much space to scroll past on smaller screens, and those who don’t want them maybe aren’t as bothered (for lack of a better word) by them.

I do appreciate TL;DR bots, and I think the spoiler tags idea is a great one that I would love to see implemented, if you feel that’s something you’d like to do. :)

Rikudou_Sage@lemmings.world · 1 year ago

If that’s the condition for the bot staying on Beehaw, sure.

Chris Remington@beehaw.org · 1 year ago

It appears, from what I can tell, that you’ve got a green light on this referencing what @Zoop@beehaw.org has stated.

Rikudou_Sage@lemmings.world · 1 year ago

Is this acceptable? Also tagging @Zoop@beehaw.org:

https://a.lemmings.world/lemmings.world/comment/968138

Chris Remington@beehaw.org · 1 year ago

Yes.

GeneralRetreat@beehaw.org · edit-2 1 year ago

I’d just suggest that this is a de facto ban based on the current requirements.

If bots are going to be command triggered and require pre-approval by individual community moderators, I think it would be prudent to include an index of registered bots + commands in the community info pages.

Currently I can’t think of any reasonable way for a Beehaw user to know which bots are operational and what their commands are. If bots need to be command triggered but there’s no way to find out which ones are functional, why approve them to begin with?

Lionir [he/him]@beehaw.org · 1 year ago

We could put all of the bot commands on the page for bots. That said, I expect many people will see one person doing it and copy that behaviour.

abhibeckert@beehaw.org · edit-2 1 year ago

I’d just suggest that this is a de facto ban.

To be honest, I’m OK with that. If I want a bot to summarise an article, I’ll go to ChatGPT or use Bing Chat. I don’t come to BeeHaw to interact with a bot. I’m here to interact with humans and in my opinion it should be a human that decides to post a link to an article and that human should also summarise it. They will do a better job than even the best bot.

While I’m not outright opposed to bots, I have yet to see a bot on Lemmy or Reddit that actually added value to the community. Usually every bot I encounter gets blocked the first time I see it.

Unfortunately, for a lot of bots on Lemmy that’s not really an option. Take, for example, a bot that finds interesting articles and posts them on Lemmy: I don’t want to see those posts, but at the same time I might want to see the discussion around the article. So I can’t block it.

Zoop@beehaw.org · edit-2 1 year ago

I think it’d be good to mention, in the section about keeping bot comment lengths short, the solution suggested in the comments: wrapping lengthy bot comments (like the TL;DR bot summaries) in spoiler tags.

I do like the TL;DR bot overall and I hope, if we are allowing bots, that something can be figured out so that it can come back to posting here.

Hopefully I worded that in a way that makes sense. I feel like I worded it awkwardly, lol. I’ve got mega brain fog, sorry! Please let me know if you’d like me to try and reword it or explain anything :)

Chris Remington@beehaw.org · 1 year ago

On a personal note, when I was using Reddit, I greatly appreciated the TL;DR bot.

Altima NEO@lemmy.zip · 1 year ago

There’s been a similar one on Lemmy.