submitted 6 days ago* (last edited 6 days ago) by Agosagror@lemmy.dbzer0.com to c/fediverse@lemmy.world

I was playing around with Lemmy statistics the other day, and I decided to look at the number of comments per post. It's essentially a measure of engagement: the higher the number, the more engaging the post is. Or in other words, how many people were pissed off enough to comment, or had something they felt like sharing. The average across every single Lemmy instance was 8.208262964 comments per post.

So I modeled that with a Poisson distribution, in stats terms X~Po(8.20826), then found the critical regions, assuming that anything with a less than 5% chance of happening is important. In other words, 5% is the significance level. The critical regions are the regions at either end of the distribution where the probability of ending up in them is less than 5%. The cutoff on the lower tail is 4 comments and on the upper tail it is 13 comments. What this means is that if you get fewer than 4 comments or more than 13 comments, that's a meaningful value. So I chose to interpret those results as meaning that if you get fewer than 4 comments then your post is "a bad post", and if you get more than 13 then your post is "a good post". A good post here is literally just "got more comments than expected of a typical post", and vice versa for "a bad post".
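A minimal sketch of how cutoffs like these could be reproduced, using only the Python standard library. It assumes the 5% applies to each tail separately (one reading of the description above, not necessarily the exact calculation used), and the helper names `poisson_pmf` and `critical_regions` are made up for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def critical_regions(lam, alpha=0.05):
    """Return (lower, upper) such that P(X <= lower) < alpha and P(X >= upper) < alpha.

    Assumption: alpha is applied to each tail separately.
    """
    # Lower tail: largest k with P(X <= k) < alpha (-1 if no such k exists).
    cdf = 0.0
    k = 0
    lower = -1
    while cdf + poisson_pmf(k, lam) < alpha:
        cdf += poisson_pmf(k, lam)
        lower = k
        k += 1

    # Upper tail: smallest k with P(X >= k) < alpha.
    cdf = 0.0  # invariant: cdf == P(X <= k - 1)
    k = 0
    while 1.0 - cdf >= alpha:
        cdf += poisson_pmf(k, lam)
        k += 1
    upper = k
    return lower, upper


lower, upper = critical_regions(8.208262964)
print(f"lower tail: {lower} or fewer comments; upper tail: {upper} or more comments")
```

Under these assumptions the lower tail is 3 or fewer comments and the upper tail is 14 or more, which lines up with "fewer than 4" and "more than 13".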

You will notice that this is quite rudimentary. For example, what about when the Americans are asleep? Most posts do worse then. That's not accounted for here, because it increases the complexity beyond what I can really handle in a post.

To give you an idea of a more sweeping internet trend: the 1%/9%/90% adage, where 1% do the posting, 9% do the commenting, and 90% are lurkers, suggests (assuming each person does an average of 1 thing a day) that c/p should be about 9 for all sites regardless of size.
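The arithmetic behind that figure can be sketched in a few lines. The site sizes here are hypothetical, and the function name `expected_cpp` is just for illustration; the only inputs taken from the adage are the 1% and 9% shares and the one-action-per-day assumption.

```python
def expected_cpp(users, actions_per_person_per_day=1):
    """Comments per post implied by the 1% / 9% / 90% adage."""
    posters = users * 1 // 100      # 1% of users post
    commenters = users * 9 // 100   # 9% of users comment
    posts_per_day = posters * actions_per_person_per_day
    comments_per_day = commenters * actions_per_person_per_day
    return comments_per_day / posts_per_day


# Hypothetical site sizes: the ratio does not depend on how big the site is.
for users in (1_000, 10_000, 1_000_000):
    print(users, expected_cpp(users))
```

The user count cancels out, which is why the adage implies the same c/p for any site.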

Now what is more interesting is that comments per post varies by instance. lemmy.world, for example, has an engagement of 9.5 c/p and lemmy.ml has 4.8 c/p. This means that a “good post” on .ml is a post that gets 9 comments, whilst a “good post” on .world has to get 15 comments. On hexbear.net, you need 20 comments to be a “good post”. I got the numbers for instance level comments and posts from here
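A rough sketch of how per-instance cutoffs like these could be computed. It assumes the cutoff is the smallest count whose cumulative Poisson probability reaches 95%, which is one convention that happens to match the figures above; the function names are illustrative, and only the two c/p values quoted in the post are used.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def upper_cutoff(lam, alpha=0.05):
    """Smallest k with P(X <= k) >= 1 - alpha (assumed convention)."""
    k = 0
    while poisson_cdf(k, lam) < 1.0 - alpha:
        k += 1
    return k


# Comments-per-post figures quoted in the post.
instance_cpp = {"lemmy.world": 9.5, "lemmy.ml": 4.8}

for name, cpp in instance_cpp.items():
    print(f"{name}: {cpp} c/p -> upper cutoff {upper_cutoff(cpp)}")
```

With this convention the cutoffs come out as 15 for lemmy.world and 9 for lemmy.ml. Strictly speaking, a post would have to exceed the cutoff to land in the upper 5% tail.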

This is a little bit silly, since a “good post”, by this metric, is really just a post that baits lots and lots of engagement, specifically in the form of comments – so if you are reading this you should comment, otherwise you are an awful person. No matter how meaningless the comment.

Anyway I thought that was cool.

EDIT: I've cleared up a lot of the wording and tried to make it clearer as to what I am actually doing.

[-] Gullible@sh.itjust.works 145 points 6 days ago

We had the chance to upvote this heavily without leaving any comments, but we blew it

[-] morrowind@lemmy.ml 35 points 6 days ago* (last edited 6 days ago)

post is too [good] unfortunately

[-] riot@lemmy.world 31 points 6 days ago

post is too unfortunately

they don't think it be like it is but it do

[-] ocean@lemmy.selfhostcat.com 50 points 6 days ago

Fun breakdown! More comments is more interesting than more posts for me

[-] fmstrat@lemmy.nowsci.com 32 points 6 days ago

You need a factor for niche communities. A post with 4 comments in a backpacking community with 20 subscribers is way "gooder" than 40 comments in a 5k subscriber news community.

I.E. add a community size factor.
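One hypothetical way to express such a factor is to divide comments by subscriber count. The comment does not say what form the factor should take, so this normalization is only an assumption, and `comments_per_subscriber` is an illustrative name.

```python
def comments_per_subscriber(comments, subscribers):
    """Comments normalized by community size (one possible size factor)."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return comments / subscribers


# The two examples from the comment above.
niche = comments_per_subscriber(4, 20)       # backpacking community
news = comments_per_subscriber(40, 5_000)    # news community

print(f"niche: {niche:.3f} comments per subscriber")
print(f"news:  {news:.3f} comments per subscriber")
```

By this measure the 4-comment post in the small community is far more engaging relative to its audience than the 40-comment post in the large one.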

[-] BackgrndNoize@lemmy.world 17 points 5 days ago

Add a TLDR or this post won't get a lot of traction either

[-] grrgyle@slrpnk.net 9 points 5 days ago

Confirmed. I see "Poisson distribution" I start skimming lol

[-] ArtificialHoldings@lemmy.world 7 points 5 days ago

Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure."

Not entirely sure how this applies to the discussion, it just came to mind lol

[-] Minnels@lemm.ee 15 points 6 days ago

I comment very seldom and only if I think that I can contribute. I see no need to write anything if I've got nothing of significance to add.

Maybe I should. Add comments that are uplifting and kind more often.

[-] Coelacanth@feddit.nu 13 points 5 days ago* (last edited 5 days ago)

I comment a shit ton and often with absolute banalities. Especially on posts with 0 comments.

My reasoning is twofold: first of all I want to encourage posters by engaging with their content so they don't stop posting. Second I want to invite others to comment and it's much more inviting to do so if a post has at least one comment. People tend to think it's dead otherwise and not bother.

I think at the current level of MAUs there is no comment too small, and every little bit helps just by virtue of breaking the silence.

[-] Blaze@lemmy.dbzer0.com 3 points 5 days ago
[-] Coelacanth@feddit.nu 2 points 5 days ago

My meagre contributions pale in comparison to your efforts, but I do what I can.

[-] Minnels@lemm.ee 3 points 5 days ago

I feel guilty now. Yes, everything you just said is true.

I shall become a better... Lemming(?) and comment a few times every day.

I try to be positive, but my way of life is very different from other people's, and I end up doing more harm than good if I'm forcing myself to be friendly and nice.

[-] IDKWhatUsernametoPutHereLolol@lemmy.dbzer0.com 15 points 6 days ago* (last edited 6 days ago)

Average Fediverse Experience:

Post comment

Waits 24 hours

zero replies

zero votes

not even a downvote

check post viewed from other instances

can't find the comment

realizes that the comment never federated

now too much time has passed since the original time of the post, and the joke you commented is no longer funny anymore

😭

[-] tja@sh.itjust.works 4 points 6 days ago

Or other people created the same joke without ever seeing your post

[-] CarbonatedPastaSauce@lemmy.world 25 points 6 days ago

Doing my part to make this a good post, cause it was.

[-] JennyLaFae 9 points 5 days ago

This comment will be sad if you don't engage with it.

[-] S_H_K@lemmy.dbzer0.com 4 points 5 days ago

Ohhh poor thing here have an upvote and a comment.

[-] JennyLaFae 2 points 4 days ago

The comment is very happy to be a good comment with 8 updoots and two replies.

10/9.5 🥳

[-] S_H_K@lemmy.dbzer0.com 1 points 1 day ago

2 upvotes and 1 reply but I wish you the best.

This comment is part of a tree-datastructure that represents the branches of discussion.

[-] RideAgainstTheLizard@slrpnk.net 15 points 6 days ago

I've happily found that there is much more interaction here than on Mastodon :)

[-] merc@sh.itjust.works 6 points 6 days ago

It's a different model.

Mastodon, like Twitter, is a person-centered setup. You can use hashtags, but most people don't. You follow people not communities. As a result it's basically microblogs, where most people are just posting into the void. Celebrities are followed more, so they get more replies, so there are more conversations. But, fundamentally it's not really inviting interactions.

Lemmy, like Reddit, is a topic-centered setup. It has a bunch of communities and people post something because they think it might be interesting for people who are also interested in that community. Every post is basically an invitation to have a discussion about something.

I think the friction to posting something on Lemmy is slightly higher, but when you do, it's more likely to generate comments.

[-] Microw@lemm.ee 6 points 6 days ago

Mastodon mainly only looks like there is no interaction happening because of its federation logic, which is being worked on and should be fixed sometime this year.

[-] RideAgainstTheLizard@slrpnk.net 1 points 4 days ago

When I post here I get replies. On mastodon I don’t.

[-] Microw@lemm.ee 2 points 4 days ago

I see that you have a very good quota of comments that you get on every Lemmy post you make. I don't think that's true for every poster, especially when posting to niche Lemmy communities.

But yes: of course the lemmy format invites comments way more than the microblogging format of mastodon

[-] Maiq@lemy.lol 13 points 6 days ago

Okay. Look. We both said a lot of things that you're going to regret. But I think we can put our differences behind us. For science. You monster.

[-] Pamasich@kbin.earth 4 points 5 days ago

I disagree that commenting for the sake of commenting is a good idea. Quality over quantity, a single meaningful discussion is superior to a sea of low effort garbage. I also want the fediverse to take off, but not at the cost of adopting modern Reddit culture.

a “good post”, by this metric, is really just a post that baits lots and lots of engagement

Baiting anything is bad.

[-] Agosagror@lemmy.dbzer0.com 6 points 5 days ago

Well exactly, that was kind of the point of this post. Hence "good post" being in air quotes. It being a silly idea as well.

Completely agree with you on that last point.

[-] TropicalDingdong@lemmy.world 14 points 6 days ago

So I modeled that with a Poisson distribution, and I learnt that to a 5% significance level, if your post got less than 4 comments, that was statistically significant. Or in other words – there is a 95% probability that something else caused it not to get more comments. Now that could be because it is an AMAZING post – it covered all the points and no one has anything left to say. Or it’s because it’s a crappy post and you should be ashamed in yourself. Similarly a “good post”, one that gets lots of comments, would be any post that gets more than 13 comments. Anything in-between 4 and 13 is just an average post.

So, like, I do have a background in stats and network analysis, and I'm not sure what you are trying to say here.

if your post got less than 4 comments, that was statistically significant.

Statistically significant what? What hypothesis are you testing? Like, how are you setting this question up? What is your null?

Because I don't believe your interpretation of that conclusion. It sounds like you mostly calculated the parameters of a Poisson and then are interpreting them? Because to be clear, that's not the same as doing hypothesis testing and isn't interpretable in that manner. It's still fine, and interesting, and especially useful when you are doing network analysis, but on its own, it's not interpretable in this manner. It needs context and we need to understand what test you are running, and how you are setting that test up.

I'm asking these questions not to dissuade you, but to give you the opportunity to bring rigor to your work.

Should you like to further your work, I have set up this notebook. You can maybe use parts of it to continue your investigations or do different investigations.

[-] Agosagror@lemmy.dbzer0.com 7 points 6 days ago* (last edited 6 days ago)

Oh yeah ok, so I was going to put "H0 : L = 8.2" and "H1 : L != 8.2, X~Po(8.2), P(c<=X<=c2) => c=?, c2=?", but I left it out because I couldn't format it in a way that looked half decent in a Lemmy post.

I found the critical regions of the Poisson distribution that takes the mean to be the average comments/post for the fediverse. I then interpreted those numbers, which is where I assume I've made a mistake. As in, if it was outside of the critical region, that would mean H1, but we know H1 is wrong, since we already have a value for L. It sounds like your interpretation of what I did is bang on. Yeah, I get that it isn't a hypothesis test, but at the level of my stats exams, finding the critical regions was 99% of the work in a hypothesis test.

I only took college level statistics, like I said in another reply. I just thought it was cool to see all the instances' comments/post ratios. It doesn't help that my stats teacher was the most boring man alive, and I always much preferred the pure side of the maths course.

[-] TropicalDingdong@lemmy.world 9 points 6 days ago

So let's just cover a few things.

Hypothesis testing:

The phrase “if your post got less than 4 comments, that was statistically significant” can be misleading if we don’t clearly define what is being tested. When you perform a hypothesis test, you need to start by stating:

Null hypothesis (H₀): For example, “the average number of comments per post is λ = 8.2.”

Alternative hypothesis (H₁): For example, “the average number of comments per post is different from 8.2” (or you could have a directional alternative if you have prior reasoning).

Without a clearly defined H₀ and H₁, the statement about significance becomes ambiguous. The p-value (or “significance” level) tells you how unusual an observation is under the assumption that the null hypothesis is true. It doesn’t automatically imply that an external factor caused that observation. Plugging in numbers doesn't supplant the interpretability issue.

"Statistical significance"

The interpretation that “there is a 95% probability that something else caused it not to get more comments” is a common misinterpretation of statistical significance. What the 5% significance level really means is that, under the null hypothesis, there is only a 5% chance of observing an outcome as extreme as (or more extreme than) the one you obtained. It is not a direct statement about the probability of an alternative cause. Saying “something else caused” can be confusing. It’s better to say, “if the observed comment count falls in the critical region, the observation would be very unlikely under the null hypothesis.”

Critical regions

Using critical regions based on the Poisson distribution can be useful to flag unusual observations. However, you need to be careful that the interpretation of those regions aligns with the hypothesis test framework. For instance, simply saying that fewer than 4 comments falls in the “critical region” implies that you reject the null when observing such counts, but it doesn’t explain what alternative hypothesis you’re leaning toward—high engagement versus low engagement isn’t inherently “good” or “bad” without further context. There are many, many reasons why a post might end up with a low count. Use the script I sent you previously and look at what happens after 5PM on a Friday in this place. A magnificent post at a wrong time versus a well timed adequate post? What is engagement actually telling us?

Model Parameters and Hypothesis Testing

It appears that you may have been focusing more on calculating the Poisson probabilities (i.e., the parameters of the Poisson distribution) rather than setting up and executing a complete hypothesis test. While the calculations help you understand the distribution, hypothesis testing requires you to formally test whether the data observed is consistent with the null hypothesis. Calculating “less than 4 comments” as a cutoff is a good start, but you might add a step that actually calculates the p-value for an observed comment count. This would give you a clearer measure of how “unusual” your observation is under your model.
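As a rough illustration of that extra step, here is a sketch that computes one-sided tail probabilities for an observed comment count under H₀: λ = 8.2. It uses only the standard library; the function names and the example counts (3, 8, 14) are chosen for illustration and are not taken from the notebook mentioned earlier.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def tail_p_values(observed, lam):
    """Return (P(X <= observed), P(X >= observed)) for X ~ Po(lam)."""
    lower = sum(poisson_pmf(i, lam) for i in range(observed + 1))
    upper = 1.0 - sum(poisson_pmf(i, lam) for i in range(observed))
    return lower, upper


lam = 8.2  # null hypothesis H0: lambda = 8.2
for observed in (3, 8, 14):
    lower, upper = tail_p_values(observed, lam)
    print(
        f"{observed} comments: P(X <= {observed}) = {lower:.4f}, "
        f"P(X >= {observed}) = {upper:.4f}"
    )
```

The relevant tail depends on the alternative hypothesis: the lower tail for "unusually few comments", the upper tail for "unusually many". Either way the number only says how surprising the count is under H₀, not why it happened.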

[-] empireOfLove2@lemmy.dbzer0.com 13 points 6 days ago

The other reason you might get no comments on your post is that you are banned from the remote instance/community, or federation is broken (still happens intermittently).

Lemmy will still allow you to post from your home instance since you are not banned there, but your content will simply get black-holed by the remote instance if you're banned there. Sometimes you have to check the remote instance directly to see if your post was federated or not.


Ah nice, I encountered a Poisson-distribution in the wild today. I shall recount this encounter to my children.

[-] FancyLad@lemmy.world 8 points 6 days ago

goes back to lurking in the shadows

[-] diffusive@lemmy.world 5 points 6 days ago

A post by fediversechick

[-] crimeschneck@feddit.nl 8 points 6 days ago

The average for every single Lemmy instance was 8.208262964 comments per post.

I wonder how much that statistic would change if you exclude news or politics communities.

[-] Rhaedas@fedia.io 8 points 6 days ago

You didn't factor in the variability of federation vs. a single platform and how not only can it affect how long it takes for everyone to see a post, if they do at all, but also how many duplications there may be floating around. And I don't know if you can predict that reliably, as we're all still trying to figure it out.

[-] JaymesRS@literature.cafe 7 points 6 days ago

Even though I appreciate this post, I don't think I will comment.

[-] OsrsNeedsF2P@lemmy.ml 6 points 6 days ago

I think the community matters a lot more than the instance. Hexbear has a bunch of coping bubble communities but they keep posting the same low-quality comments, so that's probably why the threshold of 20 comments is so high. Another example, I make posts to my own blog community !dginovker_blog@lemmy.ml, but there's no subscribers so there's never gonna be any comments.

Basically I'm saying you should do this same analysis across a sample of random communities ^^

[-] LemUrun@pawb.social 7 points 6 days ago

Don't be too mad at me

/c/theydidamath

[-] merc@sh.itjust.works 6 points 6 days ago

Similarly a “good post”, one that gets lots of comments, would be any post that gets more than 13 comments.

By my count, this comment will take your post from one with 12 comments to one with 13 comments, therefore I'm conferring on you the title of "good post". Congratulations!!

However, I'm assuming that you're including your own comments in the comment tally. If you're not, then your 2 comments so far to this post don't count, and you'll only be at 11, and therefore "not good".

If you are counting your own comments on your own post, can you juice the numbers by adding lots of comments? In other words, can you make a post good by interacting with the people who are interacting with the post? Like some kind of um... conversation? Sounds like cheating to me.

[-] iAvicenna@lemmy.world 4 points 6 days ago

I think one needs to include parameters like how soon after the topic was created the comment was made and how deep is it in the comment tree. If you for instance consistently comment on 1 month old topics or reply on comments ten levels deep you will get very few interactions.

this post was submitted on 13 Apr 2025
468 points (100.0% liked)
