Epiktistes

Todd Sundsted posted 1:30pm

Release v3.6.0 of Ktistec

It is said that there are only two hard things in computer science: cache invalidation and naming things. The story goes: you have something that is expensive to compute, so you compute it once and then you cache it and use the cached value in the future. But the inputs to that computation change, and so the cached value grows stale. You have to decide when and how to recompute that value.

In Ktistec, presenting accurate tag counts is expensive because not every tagged post counts. Posts are deleted, actors are blocked. My own drafts don't count, but when they're published they do. A post tagged with the same hashtag more than once must count as one. And tag cardinality is not uniform: #3dprinting has hundreds of thousands of posts, while others have one or two. Even with indexes, there is no single query that counts all cases in an acceptable amount of time.
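To make the counting rules concrete, here is a minimal sketch in Python with SQLite. The schema (`posts`, `actors`, `tags`) and the function names are hypothetical, invented for illustration, and are not Ktistec's actual tables or code. It only shows the rules themselves: skip deleted posts, drafts, and posts from blocked actors, and count a post once even if it carries the tag twice.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE actors (id INTEGER PRIMARY KEY, blocked INTEGER NOT NULL);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL,
            published INTEGER NOT NULL,  -- 0 means draft
            deleted INTEGER NOT NULL
        );
        -- No uniqueness constraint: a post may carry the same tag twice.
        CREATE TABLE tags (post_id INTEGER NOT NULL, name TEXT NOT NULL);

        INSERT INTO actors VALUES (1, 0), (2, 1);  -- actor 2 is blocked
        INSERT INTO posts VALUES
            (1, 1, 1, 0),  -- visible
            (2, 1, 1, 1),  -- deleted
            (3, 1, 0, 0),  -- draft
            (4, 2, 1, 0);  -- author is blocked
        INSERT INTO tags VALUES
            (1, 'ktistec'), (1, 'ktistec'),  -- same tag twice on one post
            (2, 'ktistec'), (3, 'ktistec'), (4, 'ktistec');
        """
    )
    return conn


def count_visible_posts(conn: sqlite3.Connection, tag: str) -> int:
    """Count distinct posts with this tag that should be visible."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT t.post_id)
          FROM tags AS t
          JOIN posts AS p ON p.id = t.post_id
          JOIN actors AS a ON a.id = p.author_id
         WHERE t.name = ?
           AND p.deleted = 0
           AND p.published = 1
           AND a.blocked = 0
        """,
        (tag,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    print(count_visible_posts(make_demo_db(), "ktistec"))
```

On a toy database a query like this is instant. The cost shows up when a single tag spans hundreds of thousands of rows and every one of them has to be joined and filtered.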

So I reached for a cache: I counted once and then cached the count. Because I didn't want to maintain adjustments from every place in the code that changed something that touched the count, I settled for eventual consistency and recomputed counts after every server restart.

As it turns out, that's not good enough. On a server with reasonable traffic, an event that affects some tag's count happens every few hours. Days or weeks later there is significant drift. Worse, the implementation didn't recompute on first read; it recomputed on first write (when a new tagged object arrived).

This release fixes all that. Counts are still eventually consistent, but all counts are now recomputed in a regular background task, so they really are eventually consistent. Care was taken in constructing the query to keep database (read) locking down to roughly 100-200 ms.
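The general shape of such a reconciliation pass can be sketched as follows. This is not the implementation in Ktistec; the table names (`posts`, `tags`, `tag_stats`) and both functions are assumptions made up for illustration, and the sketch does not attempt to reproduce the lock-minimizing query itself. It recomputes counts, compares them with the cached values, and writes back only the rows that drifted, so the write transaction stays short.

```python
import sqlite3
import threading
from typing import Callable


def reconcile_tag_counts(conn: sqlite3.Connection) -> int:
    """Recompute counts and fix cached rows that drifted.

    Returns the number of cached rows that were corrected.
    Assumes hypothetical tables posts(id, deleted), tags(post_id, name)
    and tag_stats(name PRIMARY KEY, count).
    """
    # Read phase: compute fresh counts and load the cached ones.
    fresh = dict(
        conn.execute(
            """
            SELECT t.name, COUNT(DISTINCT t.post_id)
              FROM tags AS t
              JOIN posts AS p ON p.id = t.post_id
             WHERE p.deleted = 0
             GROUP BY t.name
            """
        ).fetchall()
    )
    cached = dict(conn.execute("SELECT name, count FROM tag_stats").fetchall())

    # A tag with no visible posts is absent from `fresh`, so treat it as 0.
    changes = [
        (name, fresh.get(name, 0))
        for name in set(fresh) | set(cached)
        if fresh.get(name, 0) != cached.get(name)
    ]

    # Write phase: one short transaction touching only the drifted rows.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO tag_stats (name, count) VALUES (?, ?)",
            changes,
        )
    return len(changes)


def run_periodically(
    task: Callable[[], object],
    interval_seconds: float,
    stop_event: threading.Event,
) -> None:
    """Call `task` every `interval_seconds` until `stop_event` is set."""
    # Event.wait returns False on timeout and True once the event is set.
    while not stop_event.wait(interval_seconds):
        task()
```

A caller would typically run `run_periodically` in a background thread that opens its own database connection, and set the event at shutdown.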

Is it better? Yes! Is it perfect? Probably not. Cache invalidation is hard.

Here's the full changelog for this release:

Added

Background task to reconcile tag statistics.

Fixed

Prevent model hook callbacks from interleaving.
Add spacing between content and the sticky footer.

Changed

Replace Semantic UI with Fomantic UI.
Cache the PURL and GoToSocial JSON-LD contexts.
Reduce database lock time when reconciling tags.
Block npm dependency install scripts.

Removed

The unused idx_relationships_type database index.

In the next release, I'm going to fix a few bugs in the Mastodon-compatible API. These require an internal redesign, so I've held off until a few other things were out of the way. And I'm turning my attention to reading and better tools for surfacing and finding interesting content.

#ktistec #crystallang #activitypub #fediverse