#ktistec 160 hashtags

Todd Sundsted
Release v2.4.8 of Ktistec

Ktistec v2.4.8 has many small fixes and improvements, but includes one significant fix to ActivityPub garbage collection, which was the major feature introduced in the last release.

⚠️ Important Note: Building with Crystal Language version 1.17.x is not supported due to two breaking changes. See:

(Maybe it's three changes—compile times are also far slower and executable sizes are much larger.)

Added

  • Send "User-Agent" header identifying Ktistec on outbound HTTP requests.
  • Add accept/reject action buttons to top panel on actor pages.

Fixed

  • Add index on "username" on "actors" table. (Fixes a regression introduced in e659e84a.)
  • Rejection now correctly sets follow relationships as confirmed (previously they remained pending).
  • Fix garbage collection issues with threads created in earlier versions.

Changed

  • Prioritize the author's self-replies in thread view.

Enjoy!

#ktistec #fediverse #activitypub #crystallang

Todd Sundsted
Ktistec PSA

Ktistec (temporarily) only builds with versions of the Crystal Programming Language 1.16.3 and below. There was a significant change to the libxml integration in the Crystal Standard Library in version 1.17.0. Ktistec implements some extensions on top of the standard library that need to be updated as a result. A permanent fix is in progress.

#ktistec #crystallang #libxml

Todd Sundsted
Release v2.4.7 of Ktistec

After a mental health break, release v2.4.7 of Ktistec is out. The biggest improvement is the addition of a command line switch/option to run garbage collection on startup. Garbage collection, in this context, trims down your database by deleting old ActivityPub objects that are not connected to your user through:

  • Attribution: Objects attributed to you or actors you follow
  • Activities: Objects referenced by your activities or activities of actors you follow
  • Collections: Objects in your timeline, notifications, or outbox
  • Content: Objects with hashtags, mentions, or in threads you follow
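
As a rough illustration of the "keep it if any criterion applies" rule above, here is a minimal sketch. It is not Ktistec's implementation: the `Candidate` struct and its four flags are hypothetical stand-ins for the four criteria, and the age check ("old" objects) is left out.

```crystal
# Hypothetical model, not Ktistec's schema: each flag stands for one of
# the four criteria (attribution, activities, collections, content).
struct Candidate
  getter iri : String

  def initialize(@iri : String, @attributed = false, @in_activity = false,
                 @in_collection = false, @in_content = false)
  end

  # An object survives if any single criterion connects it to the user.
  def connected? : Bool
    @attributed || @in_activity || @in_collection || @in_content
  end
end

objects = [
  Candidate.new("https://remote.example/objects/1", attributed: true),
  Candidate.new("https://remote.example/objects/2"),
  Candidate.new("https://remote.example/objects/3", in_collection: true),
]

# Split into objects to keep and objects eligible for deletion.
kept, collected = objects.partition(&.connected?)
puts "keep:   #{kept.map(&.iri).join(", ")}"
puts "delete: #{collected.map(&.iri).join(", ")}"
```

Only the second object has no connection to the user, so it is the only one eligible for deletion.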

It reduced the size of my database by ~24%. Details on usage, warnings, etc. are in the README.

Other changes:

Fixed

  • Use single quotes for string literals in SQLite queries.
  • Fix WITH RECURSIVE queries.
  • Fix broken CI workflow.

Changed

  • Present local internal URLs as external URLs in posts.
  • Limit pagination size for unauthenticated users.
  • Better convey actor/object deleted/blocked status on index pages.
  • Improve presentation of inline code and code blocks.
  • Clip alt text on thumbnail images.

Other

  • Update cached copy of Lemmy's JSON-LD context.

#ktistec #fediverse #activitypub #crystallang

Todd Sundsted
Release v2.4.6 of Ktistec

Release v2.4.6 of Ktistec is out. As mentioned in an earlier post, this release focuses on database performance improvements. This means caching the results of expensive queries (like counting all posts with a particular hashtag or mention). On my instance at least, pages like the notifications page are now snappier.

There are still slow queries (queries that take more than 50msec). Most of those are requests for pages of old posts where none of the necessary database pages are in the page cache. I have increased the page cache size, and that reduces the frequency, but I don't see an immediate fix.
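
The 50msec rule can be sketched as a simple timing check. This is only an illustration: `SLOW_QUERY_THRESHOLD` and `slow?` are hypothetical names, and the `sleep` stands in for a real database query.

```crystal
# Assumed threshold, taken from the "more than 50msec" rule above.
SLOW_QUERY_THRESHOLD = 50.milliseconds

# Hypothetical helper: true when a query ran longer than the threshold.
def slow?(elapsed : Time::Span) : Bool
  elapsed > SLOW_QUERY_THRESHOLD
end

# Time.measure returns how long the block took to run.
elapsed = Time.measure do
  sleep 60.milliseconds # stand-in for a slow query
end

puts "slow query: #{elapsed.total_milliseconds.round(1)}ms" if slow?(elapsed)
```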

Fixed

  • Add missing database query logging.

Changed

  • Improve query performance for hashtags and mentions.
  • Make less costly updates to tag statistics.
  • Improve anonymous session management.
  • Cache the Nodeinfo count of local posts.

Removed

  • Remove support for X-Auth-Token.

Other

  • Add timeout values for POST socket operations.

I don't have an immediate plan for the next release. There have been a bunch of feature requests that I think have merit. I'll probably get started on some of those.

#ktistec #fediverse #activitypub #crystallang

Todd Sundsted

You can specify the SQLite database and pass options (pragmas like cache_size, journal_mode, ...) on the command line when you start the Ktistec server. The following example raises the cache size to about 20,000 KiB (SQLite reads a negative cache_size as a size in KiB rather than a page count; its default is -2000, about 2,000 KiB), which improves performance on larger instances.

KTISTEC_DB=~/ktistec.db\?cache_size=-20000 ./server

You can also enable the write-ahead log (but make sure you know what that means).

KTISTEC_DB=~/ktistec.db\?journal_mode=wal\&synchronous=normal ./server

Pragmas supported are limited to those listed here.
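
To make the format of the setting concrete, here is a minimal sketch of how a value like the ones above splits into a file path and a set of pragmas. The helper `split_db_setting` is hypothetical; in practice the database driver does this parsing, and this is not its code.

```crystal
require "uri"

# Hypothetical helper: split a KTISTEC_DB-style value into the database
# path and the options that follow the "?".
def split_db_setting(value : String) : Tuple(String, Hash(String, String))
  path, _, query = value.partition('?')
  options = {} of String => String
  URI::Params.parse(query).each { |name, setting| options[name] = setting }
  {path, options}
end

path, options = split_db_setting("ktistec.db?journal_mode=wal&synchronous=normal")
puts path
options.each { |name, setting| puts "PRAGMA #{name} = #{setting}" }
```

The backslashes in the shell commands above are only there to stop the shell from interpreting `?` and `&`; they are not part of the value.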

#ktistec #fediverse #activitypub #crystallang

Todd Sundsted

A new release of Ktistec that improves database performance is imminent. In the past, database optimization usually meant "fixing a bunch of poorly constructed queries", and I'm sure there's more of that to do (I'm not an expert). But this time, I found most of the queries were as good as they were going to get on my watch. If you have a million records and you need to filter and count them, that's just going to take some time...

So this time, I focused on caching the results of queries like that (which really means I focused on cache invalidation, right?). A case in point is commit d544b1af. Previously, the nodeinfo endpoint filtered and counted posts on every request, and it took 80+ msec to do that. Worse, the filtering pushed everything else out of the SQLite page cache, which made the next, unrelated database query slow!

Caching this value, and only recounting when I post something, not only dropped the service time for the request to ~1msec but actually improved database performance, generally!
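
The general shape of that change (cache the count, recount only after an invalidating event) might look like the sketch below. It is not the code from the commit: `CachedCount` is a hypothetical class, and the block passed to it stands in for the expensive filter-and-count query.

```crystal
# Hypothetical cache for a single expensive count.
class CachedCount
  @value : Int64? = nil
  # Number of times the expensive computation actually ran.
  getter recounts = 0

  def initialize(&@compute : -> Int64)
  end

  # Returns the cached value, computing it only when the cache is empty.
  def value : Int64
    @value ||= begin
      @recounts += 1
      @compute.call
    end
  end

  # Call this when something changes the count, e.g. a new post.
  def invalidate : Nil
    @value = nil
  end
end

posts = 1_000_000_i64            # stand-in for the posts table
count = CachedCount.new { posts } # stand-in for the slow query

count.value      # first request: computes
count.value      # second request: served from the cache
posts += 1       # a new post is published...
count.invalidate # ...so the cached count is dropped
puts count.value
puts count.recounts
```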

More to come...

#ktistec #fediverse #activitypub #crystallang

Todd Sundsted

it's interesting to see what scans show up in the logs:

2025-01-24 16:24:11 UTC 404 GET /.env 1.16ms
2025-01-24 16:24:11 UTC 404 GET /.env 563.87µs
2025-01-24 16:24:14 UTC 404 GET /.aws/credentials 601.43µs
2025-01-24 16:24:14 UTC 404 GET /.aws/credentials 498.43µs
2025-01-24 16:24:16 UTC 404 GET /.env.example 609.78µs
2025-01-24 16:24:16 UTC 404 GET /.env.example 544.13µs
2025-01-24 16:24:18 UTC 404 GET /.env.production 798.14µs
2025-01-24 16:24:19 UTC 404 GET /admin/.env 628.06µs
2025-01-24 16:24:23 UTC 404 GET /api/.env 906.66µs
2025-01-24 16:24:25 UTC 404 GET /app/.env 574.45µs
2025-01-24 16:24:27 UTC 404 GET /app_dev.php/_profiler/open?file=app/config/parameters.yml 537.69µs
2025-01-24 16:24:33 UTC 404 GET /app_dev.php/_profiler/phpinfo 841.8µs
2025-01-24 16:24:35 UTC 404 GET /backend/.env 513.92µs
2025-01-24 16:24:36 UTC 404 GET /core/.env 661.94µs
2025-01-24 16:24:38 UTC 404 GET /credentials 649.68µs
2025-01-24 16:24:40 UTC 404 GET /crm/.env 480.42µs
2025-01-24 16:24:43 UTC 404 GET /demo/.env 579.16µs
2025-01-24 16:24:49 UTC 404 GET /info/ 614.09µs
2025-01-24 16:24:51 UTC 404 GET /infos/ 705.33µs
2025-01-24 16:24:54 UTC 404 GET /pinfo.php 489.59µs
2025-01-24 16:24:58 UTC 404 GET /vendor/.env 780.1µs

this reminds me that i have to make responding to those requests much much slower...
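
one way that todo might look, as a sketch only: pick a response delay based on the request path. the marker list, the helper names, and the 10 second delay are all hypothetical, and nothing here is wired into a real server.

```crystal
# Hypothetical markers, based on the probes in the log above.
PROBE_MARKERS = ["/.env", "/.aws/", "phpinfo", "_profiler"]

# True when the path looks like one of the scanner probes.
def probe?(path : String) : Bool
  PROBE_MARKERS.any? { |marker| path.includes?(marker) }
end

# How long a server might wait before answering (assumed values).
def response_delay(path : String) : Time::Span
  probe?(path) ? 10.seconds : 0.seconds
end

["/admin/.env", "/.aws/credentials", "/actors/toddsundsted"].each do |path|
  puts "#{path} -> wait #{response_delay(path).total_seconds}s"
end
```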

#ktistec #security #todo

Todd Sundsted

If you're running an instance of Ktistec and want to see what other ActivityPub instances are sending you, turn on JSON-LD processing debug logging.

  1. Go to the /system URL.
  2. Find the ktistec.json_ld setting.
  3. Select "Debug" and save.

Ktistec will dump received activities to the log, after the activity has been parsed into JSON but before JSON-LD expansion.

2025-01-22 14:53:17 UTC 409 POST /actors/toddsundsted/inbox 4.29ms
2025-01-22T14:53:17.597172Z  DEBUG - ktistec.json_ld: {"@context" => ["https://www.w3.org/ns/activitystreams", "https://w3id.org/security/v1"],
"id" => "https://random.site/users/FooBar#delete", "type" => "Delete", "actor" => "https://random.site/users/FooBar", "object" => "https://random.site/users/FooBar", "to" => ["https://www.w3.org/ns/activitystreams#Public"], 
"signature" => {"type" => "RsaSignature2017", "creator" => "https://random.site/users/FooBar#main-key", "created" => "2025-01-22T14:52:40Z", "signatureValue" => "01234567890abcdefghijklmnopqrstuvwxyz=="}}

Answer to a FAQ:
The server returns HTTP status code 409 ("Conflict") if it has already received an activity.
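
The duplicate check behind that 409 can be sketched as follows. This is an illustration, not Ktistec's inbox code: the `Inbox` class is hypothetical, it remembers activity IDs in memory rather than in the database, and the 200 returned for a new activity is an assumption.

```crystal
# Hypothetical in-memory inbox that remembers the IDs of activities it
# has already received.
class Inbox
  def initialize
    @seen = Set(String).new
  end

  # Returns the HTTP status code a server might answer with.
  def receive(activity_id : String) : Int32
    # Set#add? returns false when the element was already present.
    @seen.add?(activity_id) ? 200 : 409
  end
end

inbox = Inbox.new
id = "https://random.site/users/FooBar#delete"
puts inbox.receive(id) # first delivery
puts inbox.receive(id) # redelivery of the same activity
```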

#ktistec #fediverse #activitypub

Todd Sundsted

Crystal is fast because methods are monomorphized at compile time. In simple terms, that means that at compile time, a polymorphic method is replaced by one or more type-specific instantiations of that method. The following polymorphic code...

def plus(x, y)
  x + y
end

...is effectively replaced by two methods—one that does integer addition if called with two integers, and one that does string concatenation if called with two strings.
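
A small sketch makes the two instantiations visible: `typeof` reports the compile-time type of each call, and it differs per call site.

```crystal
def plus(x, y)
  x + y
end

puts plus(1, 2)         # integer addition
puts plus("foo", "bar") # string concatenation

# typeof is resolved at compile time, once per instantiation.
puts typeof(plus(1, 2))
puts typeof(plus("foo", "bar"))
```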

This extends to inherited methods, which are implicitly also passed self. You can see this in action if you dump and inspect the symbols in a compiled program:

class FooBar
  def self.foo
    puts "#{self}.foo"
  end

  def bar
    puts "#{self}.bar"
  end
end

FooBar.foo
FooBar.new.bar

class Quux < FooBar
end

Quux.foo
Quux.new.bar

Dumping the symbols, you see multiple instantiations of the methods foo and bar:

...
_*FooBar#bar:Nil
_*FooBar::foo:Nil
_*FooBar@Object::to_s<String::Builder>:Nil
_*FooBar@Reference#to_s<String::Builder>:Nil
_*FooBar@Reference::new:FooBar
_*Quux@FooBar#bar:Nil
_*Quux@FooBar::foo:Nil
_*Quux@Object::to_s<String::Builder>:Nil
_*Quux@Reference#to_s<String::Builder>:Nil
_*Quux@Reference::new:Quux
...

The optimizer in release builds is pretty good at cleaning up the obvious duplication. But during my optimization work on Ktistec, I found that a lot of duplicate code shows up anyway.

Most pernicious are weighty methods that don't depend on class or instance state (don't make explicit or implicit reference to self). As I blogged about earlier, this commit replaced calls to the inherited method map on subclasses with calls to the method map defined on the base class and reduced the executable size by ~5.8%. The code was identical and the optimizer could remove the unused duplicates.

So, as a general rule, if you intend to use inheritance, put utility code that doesn't reference the state or the methods on the class or instance in an adjacent utility class—as I eventually did with this commit.
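
A minimal sketch of that rule, with hypothetical names throughout (`SlugUtil`, `Model`, `Post`, `Page` are not from Ktistec): the weighty, state-free code lives once in a utility class, and the inherited method is only a thin wrapper around it.

```crystal
# Hypothetical utility class: this code does not touch self, so it is
# compiled once no matter how many subclasses call it.
class SlugUtil
  def self.normalize(text : String) : String
    text.strip.downcase.gsub(/\s+/, "-")
  end
end

class Model
  def initialize(@title : String)
  end

  # Inherited by every subclass, but only a thin wrapper is duplicated.
  def slug : String
    SlugUtil.normalize(@title)
  end
end

class Post < Model
end

class Page < Model
end

puts Post.new("Hello World").slug
puts Page.new(" Release Notes ").slug
```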

(The full thread starts here.)

#ktistec #crystallang #optimization

Todd Sundsted
Release v2.4.5 of Ktistec

Ktistec release v2.4.5 rolls out the build time and executable size optimizations I've been blogging about here. It also fixes a few small bugs.

Fixed

  • Handle @-mentions with hosts in new posts.
  • Handle HEAD requests for pages with pretty URLs.
  • Destroy session after running scripts.

Changed

  • Delete old authenticated sessions.

I've started a branch full of query optimizations. My general rule—as highlighted in the server logs—is if a query takes longer than 50msec, it takes too long. It's time to address some problems...

#ktistec #fediverse #activitypub #crystallang