Is Lemmy Indexable? (lemmy.world)
submitted 1 year ago by TCGM@lemmy.world to c/lemmyworld@lemmy.world

Like Reddit is? E.g. by Google or Bing (shudders), you know, search engines. One of the ways many people around the world interacted with Reddit was by looking up solutions, discussions, or similar from a search engine and NOT on Reddit itself. Is that possible in this thread of the fediverse?

top 36 comments
[-] samc@lemmy.world 31 points 1 year ago* (last edited 1 year ago)

It is indexable but will take time. Google has started indexing Lemmy.world but doesn’t have that many pages yet.

[-] joyjoy@lemmy.world 13 points 1 year ago* (last edited 1 year ago)

Meanwhile DuckDuckGo/Bing has only indexed the home page and !oldschoolminecraft@lemmy.world. Lots from lemmy.ml though.

[-] NevermindNoMind@lemmy.world 2 points 1 year ago

Doesn't DuckDuckGo just use Bing?

[-] jcg@halubilo.social 3 points 1 year ago

And doesn't bing just Google?

[-] imaqtpie@sh.itjust.works 4 points 1 year ago

Google just asks Jeeves.

[-] NevermindNoMind@lemmy.world 3 points 1 year ago

We're all google on this blessed day!

[-] Fizz@lemmy.nz 1 points 1 year ago

No, it finds new content by following links in existing content, I believe.

[-] joyjoy@lemmy.world 3 points 1 year ago
[-] damipereira@lemmy.world 19 points 1 year ago

I can already find some stuff on lemmy if I search for it. It will take time for everything to be indexed but I think it will work ok eventually.

[-] JohannesOliver@kbin.social 8 points 1 year ago

It also has a low PageRank right now.

[-] damipereira@lemmy.world 9 points 1 year ago

I wonder how Google's algorithm will treat Lemmy in the future. Maybe its uncontrolled/federated nature will make Google ignore it.

[-] HobbitFoot@thelemmy.club 6 points 1 year ago

It will probably treat the individual servers differently.

[-] jcg@halubilo.social 4 points 1 year ago* (last edited 1 year ago)

I'm of the opinion that Google should just federate with everything and index the fediverse so it can keep the original sources of everything.

[-] key@lemmy.keychat.org 18 points 1 year ago

I saw Googlebot showing up in the access logs of my personal instance the very first day I stood it up.

[-] death916@lemmy.death916.xyz 4 points 1 year ago
[-] bobaduk@lemmy.world 4 points 1 year ago

By looking at the access logs. Googlebot sends a user agent string so you can identify it.
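A minimal sketch of that check, assuming the server writes its access log in the common "combined" format, where the user agent is the last quoted field. The sample lines and the `googlebot_hits` helper are made up for illustration. A user agent string can be spoofed, so a match only suggests the request came from Googlebot.

```python
import re

# Assumption: "combined" log format, where the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

def googlebot_hits(lines):
    """Return the log lines whose user agent field mentions Googlebot."""
    hits = []
    for line in lines:
        match = UA_PATTERN.search(line)
        if match and "googlebot" in match.group(1).lower():
            hits.append(line)
    return hits

# Hypothetical sample lines, not taken from a real server.
sample_log = [
    '203.0.113.5 - - [16/Jun/2023:10:00:00 +0000] "GET /post/1 HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [16/Jun/2023:10:00:01 +0000] "GET / HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/114.0"',
]

for hit in googlebot_hits(sample_log):
    print(hit.split()[0])  # client IP address of each matching request
```
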

[-] sethboy66@kbin.social 17 points 1 year ago

It's certainly archivable; all one must do is look at the 'robots.txt' (a file that websites use to let nice search engines know which pages they shouldn't index) associated with the domain to find out what it permits to be indexed. Lemmy.world's robots.txt only disallows pages associated with instance/account creation, user settings, and administrator/authorized interaction.

So everything relevant to how Reddit appears on Google is possible for Lemmy. The only difference is that Lemmy's associated PageRank (and other ranking scores) is considerably lower than Reddit's. This should change with time, especially when more niche and specialized communities take hold.
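The robots.txt check described above can be sketched with Python's standard `urllib.robotparser`. The rules below are an illustrative assumption in the spirit of the description (account and admin pages disallowed), not the actual contents of lemmy.world's robots.txt, and `is_crawlable` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Assumption: illustrative rules only, not the actual lemmy.world robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /login
Disallow: /settings
Disallow: /admin
"""

parser = RobotFileParser()
parser.parse(EXAMPLE_ROBOTS_TXT.splitlines())

def is_crawlable(path, user_agent="Googlebot"):
    """Return True if the example rules let this user agent fetch the path."""
    return parser.can_fetch(user_agent, path)

print(is_crawlable("/post/189226"))  # a post page, not covered by any Disallow rule
print(is_crawlable("/settings"))     # a user settings page, disallowed above
```

To check a live site instead, `RobotFileParser` can be pointed at the real file with `set_url(...)` followed by `read()`.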

[-] MattMist@kbin.social 11 points 1 year ago* (last edited 1 year ago)

That's true, but aren't federated pages at a disadvantage since you can look at them from any instance thus decreasing the number of links to one specific post (which is how PageRank works)? Since then instead of one post on page 1 you'd have 10 from different instances on page 3. I'm thinking this could be fixed if all posts had a link to the post on the original instance, which is where the ranking scores would then be more likely to aggregate.
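A rough sketch of that dilution effect, using the simplified textbook version of PageRank (power iteration with a damping factor). Real search ranking is far more involved, and the page names here are hypothetical. It compares ten referring pages that all link to one copy of a post against the same ten referrers each linking to a different instance's copy.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic power-iteration PageRank over a dict of page -> list of outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Pages with no outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Ten hypothetical referring pages all link to one copy of the post.
single = {f"ref{i}": ["post@origin"] for i in range(10)}
# The same ten referrers each link to a different instance's copy.
split = {f"ref{i}": [f"post@instance{i}"] for i in range(10)}

best_single = max(pagerank(single).values())
best_split = max(pagerank(split).values())
print(best_single > best_split)  # the single copy ends up with the higher score
```
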

[-] sethboy66@kbin.social 6 points 1 year ago* (last edited 1 year ago)

That's a good point, and I'm sure that would certainly be a problem with PageRank and similar ranking algorithms, but I wouldn't be entirely surprised if Google and other SEs have intelligently crafted a pre-processor that translates links like "kbin.social/m/lemmyworld@lemmy.world/t/34817/Is-Lemmy-Indexable" to the Original-Instance-Link (OIL, lurking Google devs feel free to steal this acronym) "https://lemmy.world/post/189226" so that relevant algorithms properly reflect the 'true' ranking of the information itself rather than the particular instance's... instance of it.

OStatus and Pump.io have been around for a while, so SEs may (should) have already identified this problem and addressed it, unless they've decided it's not important, not in line with how their rankings are intended to work, or simply not easily solvable in some cases, like I previously assumed. As Bjarne Stroustrup would say, "If you think it's simple, then you have misunderstood the problem."

[-] silas@programming.dev 4 points 1 year ago

There are <meta> HTML tags and <link rel="canonical" href="https://example.com/sample-page/"> tags as well that point to the original copy of a page. If it is not implemented, it would be super easy to add, but I'm on my phone at the moment so I can't see the source code.
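For anyone who wants to check whether a page declares a canonical URL, here is a small sketch using Python's standard `html.parser`. The sample page is hypothetical (it is not copied from any instance's real markup), and `CanonicalFinder` and `find_canonical` are illustrative names.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

# Hypothetical page as a remote instance might serve it.
page = """<html><head>
<title>Is Lemmy Indexable?</title>
<link rel="canonical" href="https://lemmy.world/post/189226">
</head><body>...</body></html>"""

print(find_canonical(page))
```
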

[-] sethboy66@kbin.social 2 points 1 year ago

Indeed, that's one solution that was brought up in the GitHub issues.

[-] Mirrorgiraffe@kbin.social 6 points 1 year ago

Yeah, if they set a canonical link tag pointing to the origin instance, it would help that post rank.

[-] jcg@halubilo.social 4 points 1 year ago

All posts and comments do have a link to the instance they originated from. That's what that weird-looking multicolour star is (the fediverse logo).

I was wondering about that! Thanks!

[-] 0485919158191@lemmy.world 14 points 1 year ago
[-] TGhost@lemmy.fmhy.ml 8 points 1 year ago

I think it's fully SEO compliant, given how it's built.

[-] JurassicPork@lemmy.one 6 points 1 year ago

I'm hoping so! Would also be a great way to introduce people to Lemmy!!

[-] Abreus96@vlemmy.net 6 points 1 year ago

In good time, I'm sure it will be easier to find things using a standard Google search. For those who used to rely on (site:reddit.com) as a search query, here is a Fediverse alternative you can copy and paste, at least until Google searching the Fediverse gets easier.

Copy and paste it, then type your search:

(site:kbin.social OR site:lemmy.world OR site:sh.itjust.works OR site:beehaw.org OR site:lemmy.ml OR site:lemmy.ca OR site:midwest.social OR site:lemmy.blahaj.zone)
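If you would rather generate that filter than keep it in a note, a small sketch follows. The instance list is copied from the query above, and `fediverse_query` is a hypothetical helper name.

```python
# Instance list copied from the query above; extend it as needed.
INSTANCES = [
    "kbin.social", "lemmy.world", "sh.itjust.works", "beehaw.org",
    "lemmy.ml", "lemmy.ca", "midwest.social", "lemmy.blahaj.zone",
]

def fediverse_query(terms, instances=INSTANCES):
    """Prefix the search terms with a parenthesised site:... OR site:... filter."""
    site_filter = " OR ".join(f"site:{host}" for host in instances)
    return f"({site_filter}) {terms}"

print(fediverse_query("robots.txt", ["lemmy.world", "kbin.social"]))
# prints: (site:lemmy.world OR site:kbin.social) robots.txt
```
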

[-] Soltros@lemmy.world 4 points 1 year ago

Hopefully as Google is now indexing lemmy.world, more people will see it while searching and come hang out.

[-] thegreekgeek@kbin.social 3 points 1 year ago

I found this post talking about it. It'll just take time for the fediverse (or is it threadiverse now?) to make its way up the page rankings, I guess.

[-] KnittingTrekker@kbin.social 2 points 1 year ago

Honestly, I hope all of the Fediverse's instances get indexed!

this post was submitted on 16 Jun 2023
140 points (100.0% liked)
