Items tagged with: fedisearch
There've been quite a few #fedisearch issues recently, but the common thread is that there's usually a gap in reporting - they're often live for weeks before people are made aware.
It's not just people's pet projects either: there are other #scrapers active, quietly consuming posts.
So, I built a bot to detect and out them so that fedi admins can block as necessary.
Creating A Log-Analysis System To Autodetect and Announce Mastodon Scr
I decided to build a scraper bot detection system to run against my Mastodon instance. It uses behavioural scoring to find scrapers and then toots details to help other instance admins protect their u
www.bentasker.co.uk
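As a rough illustration of what "behavioural scoring" over access logs could look like, here is a minimal Python sketch. The signals, weights, threshold and function names are all assumptions made up for this example; the linked post's real scoring rules are not described here and may differ entirely.

```python
from collections import defaultdict


def score_client(paths: list[str]) -> float:
    """Return a score in [0, 1]; higher means more scraper-like.

    All three signals and their weights are hypothetical.
    """
    if not paths:
        return 0.0
    total = len(paths)
    # Share of requests that hit API endpoints rather than web pages.
    api_share = sum(p.startswith("/api/") for p in paths) / total
    # Share of distinct paths: bulk crawls rarely revisit the same URL.
    distinct_share = len(set(paths)) / total
    # Clients that never fetch robots.txt get a small penalty.
    robots_penalty = 0.0 if "/robots.txt" in paths else 0.2
    return 0.5 * api_share + 0.3 * distinct_share + robots_penalty


def flag_scrapers(log: list[tuple[str, str]], threshold: float = 0.7) -> list[str]:
    """Group (client, path) log entries by client and return suspicious clients."""
    by_client: dict[str, list[str]] = defaultdict(list)
    for client, path in log:
        by_client[client].append(path)
    return sorted(c for c, paths in by_client.items() if score_client(paths) >= threshold)
```

Here `log` would be a list of `(client, path)` pairs parsed from the web server's access log, with `client` being whatever identifier is available (IP address, user agent, or both).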
✅ #GraphQL search endpoints were extracted into a separate app
✅ Opt-out page was polished
✅ Frontend was migrated to #NextJS 13 and the AppDir API
✅ Improved frontend UI by introducing load placeholders and fixing several glitches
✅ Frontend was migrated to ES modules
✅ Properly fixed robots.txt handling in the crawler
✅ Improved and unified code style
Search people on Fediverse
fedisearch.skorpil.cz
1️⃣ Fixed Mastodon API pagination
2️⃣ It now respects the #nobot tag in bio
3️⃣ Added the robots-parser library for full compliance with the robots.txt specification (ua: "FediCrawl/1.0")
4️⃣ Now fully respects the Mastodon noindex option.
5️⃣ I also added a page about opting out: https://fedisearch.skorpil.cz/optout
So many new accounts should be discoverable now.
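The three opt-out checks listed above (#nobot in the bio, the noindex option, and robots.txt) can be sketched roughly as follows. This is not the crawler's actual code: the post says it uses the robots-parser library, whereas this sketch swaps in Python's standard `urllib.robotparser` so it stays self-contained. The function names are hypothetical, and the `note` and `noindex` keys assume the account is a dict shaped like a Mastodon API account object.

```python
import re
from urllib.robotparser import RobotFileParser

USER_AGENT = "FediCrawl/1.0"  # user agent string quoted in the post


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Check a URL against the text of an already downloaded robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def may_index(account: dict, robots_txt: str, url: str) -> bool:
    """Return True only if none of the opt-out signals apply (hypothetical helper)."""
    # The bio ("note") is HTML, so strip tags crudely before looking for the hashtag.
    bio = re.sub(r"<[^>]+>", "", account.get("note") or "").lower()
    if "#nobot" in bio:
        return False
    # Assumed to mirror the account's noindex setting.
    if account.get("noindex"):
        return False
    return allowed_by_robots(robots_txt, url)
```

A crawler would call `may_index` once per discovered account, passing the robots.txt body fetched from that account's server, and skip the account when it returns `False`.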
It looks the same, but it's much faster, and search results are far more relevant.
So far it has discovered 5194 users 📊 👤
You can now find discoverable accounts on these servers:
(Yep, I still need to do some proper styling, but it works)
I created a fediverse search app: https://fedisearch.skorpil.cz
It collects feeds from #Mastodon, #PeerTube, #Misskey and, partly, #Pleroma
New ideas are welcome. I still have several improvements on my to-do list (like better search query specification, instance searching...)