Closing the gap on fediverse hashtag visibility with hashtag-importer
Importing posts one by one into your instance
October 02, 2023
I have released a small application, hashtag-importer, that users of a Mastodon instance can use to slowly import more content from low-traffic hashtags into their instance (with their admin's permission).
Why hashtag-importer
In the fediverse, your server does not see every post made by everyone; by design, it only sees posts that appear in the timeline of one of its users. So if no one on your server follows a user, you won't see their posts, even if you follow a hashtag and they use it. On a small Mastodon instance, niche hashtags often end up unusable. For server admins, a simple solution is to use relays, which Mastodon supports. But what if you're not an admin? That's where hashtag-importer comes in.
How it works
Most servers (but not all) have publicly available hashtag timelines. You can use those to find new posts from the hashtags you're interested in on other servers. But wouldn't it be better to be able to read those in all your Mastodon clients, directly from your instance?
The second part is in fact very simple, and a "core" part of the fediverse user experience: copy/pasting links to posts. If you paste a link to a post into your instance's search, the server fetches the post and becomes "aware" of it; the post is imported, just as if it had appeared in someone's timeline. It then becomes available on the global timeline and is indexed in hashtag timelines. That's what hashtag-importer does: it automates this copy/pasting.
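The two steps described above can be sketched as a pair of URL builders. This is a minimal illustration, not the actual hashtag-importer code: it assumes the documented Mastodon endpoints `GET /api/v1/timelines/tag/:hashtag` (the public hashtag timeline) and `GET /api/v2/search` with `resolve=true` (the programmatic equivalent of pasting a link into search, which requires a user token). The function names and the `.example` host names are hypothetical, and the sketch uses only the standard library rather than reqwest.

```rust
/// Percent-encode a string for use inside a URL: RFC 3986 unreserved
/// characters are kept, every other byte is escaped as %XX.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Step 1 (hypothetical helper): URL of the public hashtag timeline
/// on a remote server, used to discover posts.
fn remote_tag_timeline_url(remote_host: &str, hashtag: &str) -> String {
    format!(
        "https://{}/api/v1/timelines/tag/{}",
        remote_host,
        percent_encode(hashtag)
    )
}

/// Step 2 (hypothetical helper): URL of a search on the home instance
/// that asks the server to resolve, and therefore import, a remote post.
fn home_search_url(home_host: &str, post_url: &str) -> String {
    format!(
        "https://{}/api/v2/search?q={}&resolve=true&type=statuses",
        home_host,
        percent_encode(post_url)
    )
}
```

A client would fetch the first URL anonymously, extract the URL of each post from the returned JSON, then request the second URL on the home instance with the user's access token, once per post and subject to rate limits.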
Crates galore
hashtag-importer is written in Rust, and relies on reqwest for HTTP requests (in blocking mode only, life is too short for async™), toml and serde for the config file, clap for argument parsing, webbrowser for opening a webpage to get permissions, anyhow for care-free error management and governor for rate-limiting. Building it pulls in about 160 crates in total, counting transitive dependencies. I'm usually wary of adding too many dependencies, but this time I didn't hold back, and it shows. At least one of those isn't really necessary, I'll let you guess which one ;-) (Update: it has been removed)
There's little practical advantage to using Rust here; it was mostly done for fun. I've written small tools like this in Go or Python that can be just as reliable. I wanted to see what the ergonomics of writing this type of client code in Rust were like, and it works well in general. The main service loop ended up being a bit long, but that's only because I haven't taken the time to split it properly (Update: it was split).
Rate-limiting
Before writing this tool, I asked my Mastodon instance admins, the Treehouse Staff, what they thought of the idea, and if they would be opposed to adding this type of invisible automation. They suggested that I should add rate-limits:
"1 req/min, 20/hr […] with some sort of per-upstream limiting".
As I was almost done, I realized that my strategy of sprinkling calls to sleep() around the code was not going to cut it. That's why I turned to the governor crate, which implements GCRA, a well-known leaky-bucket rate-limiting algorithm. It made things very simple, except for one thing: hashtag-importer does not use async, even for network code, and the governor crate only provides a way to wait for resolution through an async fn. So I had to add blocking helpers to wait for the next rate-limit deadline (~12 lines of code).
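To illustrate the idea, here is a minimal standard-library sketch of GCRA with a blocking wait helper. It is not governor's implementation, nor the helper code from hashtag-importer; the `Gcra` type and its methods are hypothetical names chosen for this example. The limiter tracks a "theoretical arrival time" for the next request, and a request conforms if it arrives no earlier than that time minus a burst tolerance.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Minimal GCRA limiter: one request per `interval`, plus up to `burst`
/// extra requests allowed ahead of schedule.
struct Gcra {
    interval: Duration,   // emission interval between conforming requests
    tolerance: Duration,  // burst tolerance, here burst * interval
    tat: Option<Instant>, // theoretical arrival time of the next request
}

impl Gcra {
    fn new(interval: Duration, burst: u32) -> Self {
        Gcra { interval, tolerance: interval * burst, tat: None }
    }

    /// Try to take one slot at time `now`. Returns Ok(()) if the request
    /// conforms, or Err(wait) with the time left until it would.
    fn check_at(&mut self, now: Instant) -> Result<(), Duration> {
        let tat = self.tat.unwrap_or(now);
        if now + self.tolerance >= tat {
            // Conforming: push the theoretical arrival time forward.
            self.tat = Some(tat.max(now) + self.interval);
            Ok(())
        } else {
            // Too early: report how long the caller has to wait.
            Err(tat - (now + self.tolerance))
        }
    }

    /// Blocking helper: sleep until a slot is available, then take it.
    fn wait_blocking(&mut self) {
        while let Err(wait) = self.check_at(Instant::now()) {
            thread::sleep(wait);
        }
    }
}
```

With this sketch, a limit of one request per minute would be `Gcra::new(Duration::from_secs(60), 0)`, and the service loop would call `wait_blocking()` before each request instead of a hand-placed sleep. Passing the current time to `check_at` explicitly also makes the limiter easy to test without waiting.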
In the end, it was an interesting learning experience, and the code is much more readable with limits than with sleeps. So I want to thank the Treehouse Staff for providing valuable feedback upfront (and letting me run hashtag-importer on the instance).
Real-world use: Kernel Recipes
This tool was started just before Kernel Recipes, so I used the conference as an opportunity to import more posts from the #kr2023 hashtag. It found a few posts that weren't visible from my instance, even though I'm already well connected to many attendees. I wrote the Kernel Recipes live blog, so I didn't have much time to watch social networks, but it did prove somewhat useful!
FAQ
Does it support fediverse software other than Mastodon?
Most probably not: it wasn't tested with anything else, and it was developed against the documented REST API of Mastodon. It does not work with Firefish, for example.
Can I run this without asking my admin first?
No. Even with the care taken to lighten the load, you should always ask your admin before adding this type of automation.
Why do you hate async in Rust?
I do not; I just have not taken the time to learn it well enough, otherwise you wouldn't be reading this article. I still hope to convert this app to async/await some time in the future.