submitted 4 weeks ago by jordanlund@lemmy.world to c/world@lemmy.world

We've had some trouble recently with posts from aggregator links like Google Amp, MSN, and Yahoo.

We're now requiring that links go to the original source, not a conduit.

An aggregator link can give the MBFC bot the wrong attribution, so the rating may be more or less reliable than the original source's. It also makes it harder to run down duplicates.

So anything that isn't linked to the original source, but is stuck on Google Amp, MSN, Yahoo, etc., will be removed.
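
As a rough illustration of the rule above, here is a minimal sketch of how a link could be flagged as an aggregator link. The host list, the function name, and the Google AMP URL patterns are assumptions made for this example; they are not taken from any tool the moderators actually use.

```python
from urllib.parse import urlparse

# Hypothetical list of aggregator hosts; a real deployment would maintain its own.
AGGREGATOR_HOSTS = ("msn.com", "yahoo.com")


def is_aggregator_link(url: str) -> bool:
    """Return True if the URL looks like an aggregator link rather than the original source."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Match the host itself or any subdomain of it (e.g. news.yahoo.com).
    if any(host == h or host.endswith("." + h) for h in AGGREGATOR_HOSTS):
        return True
    # Assumed pattern: Google AMP viewer links live under google.com/amp/.
    is_google = host == "google.com" or host.endswith(".google.com")
    if is_google and parsed.path.startswith("/amp/"):
        return True
    # Assumed pattern: AMP cache links are served from *.cdn.ampproject.org.
    return host.endswith(".cdn.ampproject.org")
```

Matching on the parsed hostname rather than on a substring of the whole URL avoids flagging an original article that merely mentions "yahoo.com" somewhere in its path.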

[-] jordanlund@lemmy.world 11 points 4 weeks ago

Not seeing any suggestions there to improve the bot, but lots of bannable attacks on other users, mods and admins.

So I'll say it again, as I've told other people complaining, I'm open to making the bot better. If you have suggestions, I'd love to hear them.

  1. It has to be automated, which means accessible through an API.

  2. It has to be no/low cost. Lemmy.World doesn't have a budget for this. We met with an MBFC alternative, they wanted 6 figures. HARD no.

[-] Docus@lemmy.world 18 points 4 weeks ago

Ok, I’ll bite. I don’t value the bot, in part because it rates sites/newspapers and not authors or articles (good news sites have the occasional shit article, and vice versa), so please reduce the precious space it takes up on my mobile device. A one-liner with a link would be enough.

[-] jordanlund@lemmy.world 4 points 4 weeks ago

I feel your pain. Some readers, like mine (Boost), don't handle the spoiler tag markup correctly, and the bot's comment ends up bigger than designed.

[-] PhilipTheBucket@ponder.cat 17 points 4 weeks ago

How much are you paying for the MBFC API? The page says it isn't free. I'll give you an API endpoint which will check sources against https://en.wikipedia.org/wiki/Wikipedia:Reliable_sources/Perennial_sources, if you pay me half of whatever you were paying MBFC previously. That list is quite a lot better than relying on MBFC.

I already scraped the list. It'll take around an hour for my script to finish going down the sources and assigning web sites to each one, but I can have a working API endpoint for you tomorrow morning. I can do the bot part also, if you prefer. That's probably easier than making a new endpoint and hooking it to a bot and debugging the connection and all.

Like I said, I think the idea that readers won't be able to determine that Breitbart is unreliable is missing a pretty big elephant in the misinformational room. If the issue that's causing you to keep MBFC is finding a better source that's programmatic, though, then solving that is almost trivially easy and at least seems like some kind of step forward.

[-] Rooki@lemmy.world 8 points 4 weeks ago

The MBFC API is free for us; they gave us access as a nonprofit.

We already had in mind adding these sources to our bot, but we didn't have the time or the knowledge to scrape them. Personally, I would like to host it on our own server so that you don't have to spend your own money just for one bot. In what programming language did you write it?

Thanks a lot!
Rooki

[-] PhilipTheBucket@ponder.cat 10 points 4 weeks ago

Here you go:

https://ponder.cat/wp/wp-sources.zip

It's in Python, suitable for sticking directly into the bot if the bot is in Python. There are docs. It's a first cut. How did you envision this working? I can make a real API, if for some reason that makes things easier, but it's not immediately obvious how it would get integrated into things.

Running it on the last 50 articles posted to /c/politics, we see:

It's more complex to use this than MBFC, because there's a lot more depth to the rankings, and sometimes human judgement is needed to assign scores. There's a category "needinfo," meaning it's necessary to know what topic is being discussed or when an article was written, because of an ownership change or similar factor. I've applied that judgement above. That, to me, is a good thing. It means the bot is grounded in something, and not just blithely spitting out arbitrary scores without bothering to ground them in any reality.

In practice, I think it would be realistic to assign a single reliability ranking to most of the "needinfo" sources. You can manually edit the .json data to do so. Almost all of the posts are going to fit into one of Wikipedia's categorizations or another. Newsweek is unreliable, The Guardian is reliable, and so on.

I think most of the mixed-consensus sources can be used without a second thought. Mostly, the questions about them boil down to open partisanship of the source, which for a political community is perfectly fine as long as they're trustable factually.

If you want me to boil this down further, so that it gives a single "yes" or "no" score to each source, I can do that and probably keep almost all of the accuracy of the rankings, now that I've looked at it for a little while.
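
The "single yes or no" idea above could look something like the following minimal sketch. The status labels follow the wording of Wikipedia's perennial sources list, but the mapping itself, the function name, and the treatment of "needinfo" are assumptions for illustration; this is not the code in wp-sources.zip.

```python
from typing import Optional

# Assumed mapping. Mixed-consensus sources count as usable, following the
# reasoning in the comment above; this is a judgement call, not a fixed rule.
USABLE = {"generally reliable", "no consensus"}
NOT_USABLE = {"generally unreliable", "deprecated", "blacklisted"}


def is_usable(status: str) -> Optional[bool]:
    """Collapse a reliability status to True/False, or None if a human must decide."""
    key = status.strip().lower()
    if key in USABLE:
        return True
    if key in NOT_USABLE:
        return False
    # "needinfo" and any unrecognised status are left for manual review.
    return None
```

Returning `None` rather than guessing keeps the "needinfo" cases visible, so they can be resolved by hand-editing the data instead of being silently scored.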

When you talk about "adding" this to the bot, are you proposing to still have MBFC be the main source, with this as a footnote? A lot of the criticism of the bot is on the grounds that MBFC is a very bad source for judging reliability, so I would question the idea of keeping it on as the primary source.

[-] NOT_RICK@lemmy.world 4 points 3 weeks ago

Nice work, thanks for contributing!

[-] Rooki@lemmy.world 3 points 4 weeks ago

By "adding" I mean adding it in a field above MBFC (as I personally think Wikipedia is a little bit better for that).

new:

Wikipedia: Reliability consensus is mixed... (whatever the scraper scrapes)
MBFC: Right-Center - Credibility: High - Factual Reporting: Mostly Factual - United States of America
Search Wikipedia about this source

I would like to implement your code into the bot myself so I can learn how you would do it. If you are willing to share your code, please send me a GitHub link (or invite me if you want it to be private between you and me), or if it's super simple, just send it in the DMs.
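
The ordering proposed in this comment, with the Wikipedia line placed above the MBFC line, can be sketched as below. The function name and parameters are hypothetical; the real bot's template is not shown in this thread.

```python
from typing import Optional


def format_bot_comment(wikipedia: Optional[str], mbfc: Optional[str]) -> str:
    """Build the bot text with the Wikipedia line first and the MBFC line second."""
    lines = []
    if wikipedia:
        lines.append(f"Wikipedia: {wikipedia}")
    if mbfc:
        lines.append(f"MBFC: {mbfc}")
    # A source missing from one list simply omits that line.
    return "\n".join(lines)
```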

[-] PhilipTheBucket@ponder.cat 7 points 4 weeks ago* (last edited 4 weeks ago)

I already sent it. It's here:

https://ponder.cat/wp/wp-sources.zip

Edit: You don't need to do the import initially, since there's already a sources file with some small modifications. The import is the only complicated part. Use categorize.py to categorize a source, or lookup.py to run a quick command-line test.

[-] Rooki@lemmy.world 4 points 4 weeks ago

Ok, I will look into it, thanks. I thought it was just the sources, not the code.

[-] Rooki@lemmy.world 2 points 3 weeks ago

Ok, I implemented it into the bot. It took about 1 hour and 6 minutes to fetch all the links, and I am now implementing the part where it is inserted into the new text.

[-] PhilipTheBucket@ponder.cat 2 points 3 weeks ago

Sounds good. If you redid the import, I think you’ll want to make some manual fixes to the .json. Off the top of my head, I think you just need to add bbc.co.uk and aljazeera.com to the URL lists for those sources.
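
To show why a missing domain in a URL list matters, here is a minimal lookup sketch. The data shape (source name mapped to a list of domains) and the names `SOURCES` and `find_source` are assumptions for illustration; the actual .json layout in wp-sources.zip may differ.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Hypothetical shape for the sources data: source name -> list of domains.
SOURCES: Dict[str, List[str]] = {
    "BBC": ["bbc.com", "bbc.co.uk"],
    "Al Jazeera": ["aljazeera.com"],
}


def find_source(url: str, sources: Dict[str, List[str]]) -> Optional[str]:
    """Return the source whose domain list covers the URL's host, if any."""
    host = (urlparse(url).hostname or "").lower()
    for name, domains in sources.items():
        for domain in domains:
            # Accept the bare domain or any subdomain (www.bbc.co.uk, etc.).
            if host == domain or host.endswith("." + domain):
                return name
    return None
```

With this kind of lookup, a post linking to bbc.co.uk goes unrated if only bbc.com is listed, which is the sort of gap the manual fix addresses.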

[-] PhilipTheBucket@ponder.cat 10 points 4 weeks ago

On a different topic: It sounds like jordanlund is saying that if he tried to remove the MBFC bot from the politics sub, he might be removed as a moderator, and replaced with someone else, and the bot would come back.

https://lemmy.world/comment/12825768

Is that true? Is the admin team mandating the use of this bot, and if so, why?

[-] Rooki@lemmy.world 6 points 4 weeks ago* (last edited 4 weeks ago)

No, I don't get where he would get that idea, because the c/politics mods wanted the bot gone and we removed it, no questions asked.

@jordanlund@lemmy.world if you really don't want the bot here, we can remove the bot and shut it down (please consult the other c/world mods too).

[-] goferking0@lemmy.sdf.org 6 points 4 weeks ago

You mean news? Cause it's still running on politics

[-] nmtake@lemm.ee 8 points 4 weeks ago

Since it's a MediaWiki page, you can get the raw wikitext source of the page by appending the action=raw query parameter to the URL.
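
A minimal sketch of that tip: the helper below only builds the URL with `action=raw` appended, and the function name is made up for this example. Fetching the result (for instance with `urllib.request`) is left out, since a real script would also want a descriptive User-Agent and error handling.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def raw_wikitext_url(page_url: str) -> str:
    """Return the page URL with action=raw set, replacing any existing action parameter."""
    parts = urlparse(page_url)
    # Keep existing query parameters except a previous "action".
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "action"]
    query.append(("action", "raw"))
    return urlunparse(parts._replace(query=urlencode(query)))


url = raw_wikitext_url(
    "https://en.wikipedia.org/wiki/Wikipedia:Reliable_sources/Perennial_sources"
)
```

Note that MediaWiki returns wikitext here, so table and template markup still has to be parsed before the ratings can be used.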

[-] jordanlund@lemmy.world 2 points 4 weeks ago

To be honest, that's Rooki's deal, but I'll link them to this comment!

[-] PhilipTheBucket@ponder.cat 6 points 4 weeks ago

I'll send them a link and an example of how to use it tomorrow.

[-] catloaf@lemm.ee 16 points 4 weeks ago

You could get rid of it. No automation, API, or cost whatsoever.

[-] jordanlund@lemmy.world 2 points 4 weeks ago
[-] PhilipTheBucket@ponder.cat 15 points 4 weeks ago

Why is it admin level? Are there admins that tell you what you can and can't do with the politics community, in this case? Or does the politics moderation team have the ability to ditch the bot if they decide to?

This is such a strange situation. If you're stuck in that former position, though, it would make a lot of your responses in this comments section make a whole lot more sense.

[-] jordanlund@lemmy.world 3 points 4 weeks ago

The Admins run lemmy.world, we serve at their pleasure.

Sure, I could ban it, then likely get removed and have the bot re-instated, and what good would that do anyone?

[-] catloaf@lemm.ee 8 points 4 weeks ago

If the admins need to micro-manage the communities on their instance, let them moderate them.

[-] njm1314@lemmy.world 5 points 4 weeks ago

But that would require him to give up his small made-up powers that make him feel big

[-] catloaf@lemm.ee 7 points 4 weeks ago* (last edited 3 weeks ago)

https://lemmy.world/comment/12834553

Rooki is happy to remove it. Ball's in your court.

edit: 🦗🦗🦗

[-] Blaze@feddit.org 2 points 3 weeks ago

Coming here randomly, interesting exchange

[-] Blaze@feddit.org 2 points 3 weeks ago

Hello @jordanlund@lemmy.world,

Is there any update on this? Rooki is ok with the community mods removing the bot, so this seems to be more of a decision for the community mods than one made by the admins?

[-] catloaf@lemm.ee 9 points 4 weeks ago

You could ask them to remove it. Or you could ban it. The other news community doesn't have it any more. Clearly, it is possible.

[-] goferking0@lemmy.sdf.org 10 points 4 weeks ago

So already ignoring. This is why people stopped giving feedback

[-] jordanlund@lemmy.world 4 points 4 weeks ago

I can't ignore suggestions nobody is making. Have a better service in mind? Feel free to present it.

We looked at AllSides, which is good for bias, but has no scoring for credibility.

[-] Catoblepas 25 points 4 weeks ago

“We have to keep using the ratings website made by a random dude with no background in journalism who makes it available for free because real fact checking services cost money” is perhaps not the argument I would use for why the bot is both accurate and useful.

You don’t have to have a bot at all, especially to replace something like blacklisting Breitbart URLs, but someone thought the idea sounds cool. So “don’t have the bot” has been unnecessarily eliminated as an option. Even though sometimes the best option really is to just not have a bot.

[-] CanadaPlus@lemmy.sdf.org 7 points 4 weeks ago* (last edited 4 weeks ago)

I mean, it's a great argument for not going with actual fact checkers, unless you're volunteering to pay.

Not having one is also an option, but for my 2 cents the bot seems accurate enough so far, and it's easy enough to ignore if you really don't like it.

[-] Catoblepas 11 points 4 weeks ago

I’m definitely not paying to have a “think for me” bot on an instance I’m not part of. You can’t automod your way out of media illiteracy.

[-] CanadaPlus@lemmy.sdf.org 3 points 4 weeks ago

Yeah, I don't expect anything to single-handedly solve the problem.

[-] grue@lemmy.world 17 points 4 weeks ago* (last edited 4 weeks ago)

Stop pretending that "get rid of the bot" doesn't count as a suggestion. That's dishonest.

I don't even care about the bot itself, but at this point I'm just getting pissed off by all the constant distracting bickering about it.

[-] jordanlund@lemmy.world 3 points 4 weeks ago

When the question is "how do we improve it?" the answer "get rid of it" is not a genuine suggestion.

The GOOD news is, we DO have a genuinely good suggestion here and the bot creator will be reaching out.

[-] Dumnorix@lemmy.world 6 points 4 weeks ago

Honest question,

If I understand the comment thread correctly, this means you'll integrate the Wikipedia/Wikidata info in the existing bot, correct? Will an announcement be posted when or if this happens, so that people like me who blocked the bot can unblock it? I do like the concept of the bot, but I prefer an open source collaborative effort to a one-man, right-wing-aligned website.

Thanks for your openness to improve the service.

[-] jordanlund@lemmy.world 2 points 4 weeks ago

Dunno yet, that's something Rooki and the other user will have to sort out, but I'm all for improvements!

this post was submitted on 10 Oct 2024
131 points (100.0% liked)
