Items tagged with: fedisearch

New #blog: Autodetecting and Announcing #Mastodon Scrapers and Crawlers

There've been quite a few #fedisearch issues recently, but the common thread is that there's usually a gap in reporting - they're often live for weeks before people are made aware.

It's not just people's pet projects either: there are other #scrapers active, quietly consuming posts.

So, I built a bot to detect and out them so that fedi admins can block them as necessary.

#infosec #security

I just deployed a new #FediSearch version:
✅ #GraphQL search endpoints were extracted into a separate app
✅ Opt-out page was polished
✅ Frontend was migrated to #NextJS 13 and the AppDir API
✅ Improved frontend UI by introducing load placeholders and fixing several glitches
✅ Frontend was migrated to ES modules
✅ Properly fixed robots.txt handling in the crawler
✅ Improved and unified code style

:fedisearch: 🎉

It seems that is blocking my IP address. Maybe because of #fedisearch? But why didn't they use standard opt-out features like robots.txt? 🧐

I fixed some bugs in the #FediSearch crawler and once again cleared the whole index. It will take about two days to fill it back up...
1️⃣ Fixed Mastodon API pagination
2️⃣ It now respects the #nobot tag in bios
3️⃣ Added the robots-parser library for full compliance with the robots.txt specification (ua: "FediCrawl/1.0")
4️⃣ It now fully respects the Mastodon noindex option
5️⃣ I also added a page about opting out
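
The opt-out checks listed above could be combined roughly as follows. This is a minimal Python sketch, not the crawler's actual code: it swaps in the standard library's `urllib.robotparser` for the robots-parser package mentioned above, and the function names and the account fields `note` and `noindex` are assumptions about what a crawler would read from a Mastodon-style account object.

```python
import re
from urllib.robotparser import RobotFileParser

USER_AGENT = "FediCrawl/1.0"  # user agent string quoted in the post

HTML_TAG_RE = re.compile(r"<[^>]+>")
NOBOT_RE = re.compile(r"#nobot\b", re.IGNORECASE)


def allowed_by_robots(robots_txt: str, url: str) -> bool:
    """Check a URL against already-downloaded robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


def may_index(account: dict) -> bool:
    """Return False if the account opted out via noindex or a #nobot tag.

    `account` is assumed to resemble a Mastodon account object: the bio is
    HTML in `note`, and `noindex` is an optional boolean.
    """
    if account.get("noindex"):
        return False
    # Bios are HTML, and hashtags may be split across tags, so strip markup first.
    bio_text = HTML_TAG_RE.sub("", account.get("note") or "")
    return NOBOT_RE.search(bio_text) is None
```

A crawler would call `allowed_by_robots` once per fetched URL and `may_index` once per account, skipping anything for which either returns `False`.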

Yesterday I deleted the whole #FediSearch index and started crawling the #fediverse from scratch.

So many new accounts should be discoverable now.

I deployed a new version of #FediSearch.
It looks the same, but it's much faster, and search results are far more relevant.
:fedisearch: 🎉

# added a directory API. Users who opted in to being discoverable are now indexed by #... 🎉 :friendica:
So far it has discovered 5194 users 📊 👤

# now mirrors the query in the URL. You can now share links leading directly to search results.
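
Mirroring the query in the URL amounts to serializing the search state into the query string and reading it back when the page loads. A rough Python sketch of that round trip follows; the base URL and the parameter name `q` are placeholders, not the site's real ones.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://search.example/"  # placeholder, not the real address


def search_url(query: str) -> str:
    """Build a shareable link that encodes the search query."""
    return BASE_URL + "?" + urlencode({"q": query})


def query_from_url(url: str) -> str:
    """Recover the search query from a shared link (empty string if absent)."""
    params = parse_qs(urlsplit(url).query)
    return params.get("q", [""])[0]
```

Because `urlencode` percent-encodes the value, queries containing emoji or `#` survive the round trip unchanged.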

Or Pixelfed could join Mastodon, Pleroma, Misskey and PeerTube in having an open directory API that can be easily indexed by services like # 🔎

Current statistics of # reported by #:
# 222903 public users on 3551 instances
# 634 users on 1315 instances (only the current beta has the needed API)
# 277613 channels on 1195 instances
# 23931 users on 402 instances

#fedisearch: I disabled the minimum character count limit to enable searching by emoji. It has proved useful for searching by language/nationality.
Screenshot from fedisearch
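
To see why a minimum length check gets in the way of emoji searches: a single emoji is often just one or two code points, so it falls below a typical threshold. A small Python illustration, where the function name and the old threshold of 3 are assumptions rather than the app's real values:

```python
def is_searchable(query: str, min_length: int = 0) -> bool:
    """Accept any non-empty query at least `min_length` code points long."""
    stripped = query.strip()
    return bool(stripped) and len(stripped) >= min_length


party = "\U0001F389"           # a single-code-point emoji
flag = "\U0001F1E8\U0001F1FF"  # a flag emoji made of two regional indicators

# With a hypothetical limit of 3, both emoji queries are rejected.
rejected = [is_searchable(party, 3), is_searchable(flag, 3)]

# With the limit disabled (the default here), both are accepted.
accepted = [is_searchable(party), is_searchable(flag)]
```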

I upgraded #fedisearch by adding a basic search of indexed servers. You can now check if and when your server was last indexed...

(Yep, I still need to do some proper styling, but it works)
Screenshot of fediverse server search page

Hi, today I found your post while looking for the #fedisearch and #SepiaSearch tags.

I created a fediverse search app:
It collects feeds from #Mastodon, #PeerTube, #Misskey and, partly, #Pleroma.

New ideas are welcome. I still have several improvements on my to-do list (like better search query specification, instance searching...)