I have some concerns about Matrix (the protocol, not the movie that
came out recently, although I do have concerns about that as
well). I've been watching the project for a long time, and it seems
like a promising alternative to many protocols like IRC, XMPP, and
Signal.
This review may sound a bit negative, because it focuses on those
concerns. I am the operator of an IRC network and people keep asking
me to bridge it with Matrix. I have myself considered just giving up
on IRC and converting to Matrix. This page is a living document
exploring my research of that problem space. The TL;DR is that no,
I'm not setting up a bridge just yet, and I'm still on IRC.
This article was written over the course of the last three months, but
I have been watching the Matrix project for years (my logs seem to say
2016 at least). The article is rather long. It will likely take you
half an hour to read, so copy this over to your ebook reader,
your tablet, or dead trees, and lean back and relax as I show you
around the Matrix. Or, alternatively, just jump to a section that
interests you, most likely the conclusion.
Introduction to Matrix
Matrix is an "open standard for interoperable, decentralised,
real-time communication over IP. It can be used to power Instant
Messaging, VoIP/WebRTC signalling, Internet of Things communication -
or anywhere you need a standard HTTP API for publishing and
subscribing to data whilst tracking the conversation history".
It's also (when compared with XMPP) "an eventually consistent
global JSON database with an HTTP API and pubsub semantics - whilst
XMPP can be thought of as a message passing protocol."
According to their FAQ, the project started in 2014, has about
20,000 servers, and millions of users. Matrix works over HTTPS but
over a special port: 8448.
Security and privacy
I have some concerns about the security promises of Matrix. It's
advertised as "secure" with "E2E [end-to-end] encryption", but how
does it actually work?
Data retention defaults
One of my main concerns with Matrix is data retention, which is a key
part of security in a threat model where (for example) a hostile
state actor wants to surveil your communications and can seize your
devices.
On IRC, servers don't actually keep messages all that long: they pass
them along to other servers and clients as fast as they can, only keep
them in memory, and move on to the next message. There are no concerns
about data retention on messages (and their metadata) other than the
network layer. (I'm ignoring the issues with user registration, which
is a separate, if valid, concern.) Obviously, a hostile server
could log everything passing through it, but IRC federations are
normally tightly controlled. So, if you trust your IRC operators, you
should be fairly safe. Obviously, clients can (and often do, even if
OTR is configured!) log all messages, but this is generally not
the default. Irssi, for example, does not log by
default. IRC bouncers are more likely to log to disk, of course,
to be able to do what they do.
Compare this to Matrix: when you send a message to a Matrix
homeserver, that server first stores it in its internal SQL
database. Then it will transmit that message to all clients connected
to that server and room, and to all other servers that have clients
connected to that room. Those remote servers, in turn, will keep a
copy of that message and all its metadata in their own database, by
default forever. On encrypted rooms those messages are encrypted, but
not their metadata.
There is a mechanism to expire entries in Synapse, but it is not
enabled by default. So one should generally assume that a message
sent on Matrix is never expired.
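To make the idea concrete, here is a minimal sketch of what message expiry amounts to: drop every stored event older than some maximum lifetime. This is not Synapse's code. The `StoredEvent` type, the `expire_events` function, and the 30-day lifetime are all assumptions picked for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredEvent:
    """Hypothetical stand-in for a row in a homeserver's event table."""
    event_id: str
    origin_server_ts: int  # milliseconds since the epoch


def expire_events(events, now_ms, max_lifetime_ms):
    """Return only the events that are still within the allowed lifetime."""
    cutoff = now_ms - max_lifetime_ms
    return [e for e in events if e.origin_server_ts >= cutoff]


# Assumed policy for the example: keep messages for 30 days.
THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000

now = 1_650_000_000_000
events = [
    StoredEvent("$old", now - 2 * THIRTY_DAYS_MS),
    StoredEvent("$recent", now - 1000),
]
kept = expire_events(events, now, THIRTY_DAYS_MS)
print([e.event_id for e in kept])
```

With no such job running, the first event in this example would simply stay on disk indefinitely.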
GDPR in the federation
But even if that setting was enabled by default, how do you control
it? This is a fundamental problem of the federation: if any user is
allowed to join a room (which is the default), those users' servers
will log all content and metadata from that room. That includes
private, one-on-one conversations, since those are essentially rooms
as well.
In the context of the GDPR, this is really tricky: who is the
responsible party (known as the "data controller") here? It's
basically any yahoo who fires up a home server and joins a room.
In a federated network, one has to wonder whether GDPR enforcement is
even possible at all. But in Matrix in particular, if you want to
enforce your right to be forgotten in a given room, you would have to:
- enumerate all the users that ever joined the room while you were
there
- discover all their home servers
- start a GDPR procedure against all those servers
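The first two steps can be sketched in a few lines, assuming you somehow have the full membership history of the room as a list of Matrix IDs. The `homeservers_of` helper and the member list below are hypothetical; the point is only that every distinct server name is a separate party to chase.

```python
def homeservers_of(user_ids):
    """Return the sorted set of server names found in a list of Matrix user IDs."""
    servers = set()
    for user_id in user_ids:
        if not user_id.startswith("@") or ":" not in user_id:
            raise ValueError(f"not a Matrix user ID: {user_id!r}")
        # Split on the first colon only: the server name may include a port.
        _localpart, server = user_id[1:].split(":", 1)
        servers.add(server)
    return sorted(servers)


# Hypothetical membership history of a room.
members = [
    "@anarcat:matrix.org",
    "@alice:example.com",
    "@bob:example.com",
    "@carol:chat.example.net:8448",
]
print(homeservers_of(members))
```

Four members here already mean three separate servers, and the list only grows with the size of the room.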
I recognize this is a hard problem to solve while still keeping an
open ecosystem. But I believe that Matrix should have much stricter
defaults towards data retention than right now. Message expiry should
be enforced by default, for example. (Note that there are also
redaction policies that could be used to implement part of the GDPR
automatically, see the privacy policy discussion below on that.)
Also keep in mind that, in the brave new peer-to-peer world that
Matrix is heading towards, the boundary between server and client is
likely to be fuzzier, which would make applying the GDPR even more difficult.
Update: this comment links to this post (in German) which
apparently studied the question and concluded that Matrix is not
GDPR-compliant.
In fact, maybe Synapse should be designed so that there's no
configurable flag to turn off message expiry. A bit like how most
system loggers in UNIX (e.g. syslog) come with a log retention system
that typically rotates logs after a few weeks or months. Historically,
this was designed to keep hard drives from filling up, but it also has
the added benefit of limiting the amount of personal information kept
on disk in this modern day. (Arguably, syslog doesn't rotate logs on
its own, but, say, Debian GNU/Linux, as an installed system, does have
log retention policies well defined for installed packages, and those
can be discussed.) And "no expiry" is definitely a bug.
Matrix.org privacy policy
When I first looked at Matrix, five years ago, Element.io was called
Riot.im and had a rather dubious privacy policy:
We currently use cookies to support our use of Google Analytics on
the Website and Service. Google Analytics collects information about
how you use the Website and Service.
[...]
This helps us to provide you with a good experience when you
browse our Website and use our Service and also allows us to improve
our Website and our Service.
When I asked Matrix people about why they were using Google Analytics,
they explained this was for development purposes and they were aiming
for velocity at the time, not privacy (paraphrasing here).
They also included a "free to snitch" clause:
If we are or believe that we are under a duty to disclose or share
your personal data, we will do so in order to comply with any legal
obligation, the instructions or requests of a governmental authority
or regulator, including those outside of the UK.
Those are really broad terms, above and beyond what is typically
expected legally.
Like the current retention policies, such user tracking and
... "liberal" collaboration practices with the state set a bad
precedent for other home servers.
Thankfully, since the above policy was published (2017), the GDPR was
"implemented" (2018) and it seems like both the Element.io
privacy policy and the Matrix.org privacy policy have been
somewhat improved since.
Notable points of the new privacy policies:
- 2.3.1.1: the "federation" section actually outlines that
"Federated homeservers and Matrix clients which respect the Matrix
protocol are expected to honour these controls and
redaction/erasure requests, but other federated homeservers are
outside of the span of control of Element, and we cannot guarantee
how this data will be processed"
- 2.6: users under the age of 16 should not use the matrix.org
service
- 2.10: Upcloud, Mythic Beast, Amazon, and CloudFlare possibly
have access to your data (it's nice to at least mention this in the
privacy policy: many providers don't even bother admitting to this
kind of delegation)
- Element 2.2.1: mentions many more third parties (Twilio,
Stripe, Quaderno, LinkedIn, Twitter, Google, Outplay,
PipeDrive, HubSpot, Posthog, Sentry, and Matomo; phew!)
used when you are paying Matrix.org for hosting
I'm not super happy with all the trackers they have on the Element
platform, but then again you don't have to use that service. Your
favorite homeserver (assuming you are not on Matrix.org) probably has
their own Element deployment, hopefully without all that garbage.
Overall, this is all a huge improvement over the previous privacy
policy, so hats off to the Matrix people for figuring out a reasonable
policy in such a tricky context. I particularly like this bit:
We will forget your copy of your data upon your request. We will
also forward your request to be forgotten onto federated
homeservers. However - these homeservers are outside our span of
control, so we cannot guarantee they will forget your data.
It's great they implemented those mechanisms and, after all, if
there's a hostile party in there, nothing can prevent them from using
screenshots to just exfiltrate your data away from the client side
anyways, even with services typically seen as more secure, like
Signal.
As an aside, I also appreciate that Matrix.org has a fairly decent
code of conduct, based on the TODO CoC which checks all the
boxes in the geekfeminism wiki.
Overall, privacy protections in Matrix mostly concern message
contents, not metadata. In other words, who's talking with who, when
and from where is not well protected. Compared to a tool like Signal,
which goes to great lengths to anonymize that data with features
like private contact discovery, disappearing messages,
sealed senders, and private groups, Matrix is definitely
behind. (Note: there is an issue open about message lifetimes in
Element since 2020, but it's not even at the MSC stage yet.)
This is a known issue (opened in 2019) in Synapse, but this is
not just an implementation issue, it's a flaw in the protocol
itself. Home servers keep the join/leave events of all rooms, which
gives clear text information about who is talking to whom. Synapse logs
may also contain personally identifiable information that home server
admins might not be aware of in the first place. Those log rotation policies
are separate from the server-level retention policy, which may be
confusing for a novice sysadmin.
Combine this with the federation: even if you trust your home server
to do the right thing, the second you join a public room with
third-party home servers, those ideas kind of get thrown out because
those servers can do whatever they want with that information. Again,
a problem that is hard to solve in any federation.
To be fair, IRC doesn't have a great story here either: any client
knows not only who's talking to who in a room, but also typically
their client IP address. Servers can (and often do) obfuscate
this, but often that obfuscation is trivial to reverse. Some servers
do provide "cloaks" (sometimes automatically), but that's kind of a
"slap-on" solution that actually moves the problem elsewhere: now the
server knows a little more about the user.
Overall, I would worry much more about a Matrix home server seizure
than an IRC or Signal server seizure. Signal does get subpoenas,
and they can only give out a tiny bit of information about their
users: their phone number, and their registration and last connection
dates. Matrix carries a lot more information in its database.
Amplification attacks on URL previews
I (still!) run an Icecast server and sometimes share links to it
on IRC which, obviously, also end up on (more than one!) Matrix home
servers because some people connect to IRC using Matrix. This, in
turn, means that Matrix will connect to that URL to generate a link
preview.
I feel this outlines a security issue, especially because those
sockets would be kept open seemingly forever. I tried to warn the
Matrix security team but somehow, I don't think this issue was taken
very seriously. Here's the disclosure timeline:
- January 18: contacted Matrix security
- January 19: response: already reported as a bug
- January 20: response: can't reproduce
- January 31: timeout added, considered solved
- January 31: I respond that I believe the security issue is
underestimated, ask for clearance to disclose
- February 1: response: asking for two weeks delay after the next
release (1.53.0) including another patch, presumably in two
weeks' time
- February 22: Synapse 1.53.0 released
- April 14: I notice the release, ask for clearance again
- April 14: response: referred to the public disclosure
There are a couple of problems here:
- the bug was publicly disclosed in September 2020, and not
considered a security issue until I notified them, and even then,
I had to insist
- no clear disclosure policy timeline was proposed or seems
established in the project (there is a security disclosure
policy but it doesn't include any predefined timeline)
- I wasn't informed of the disclosure
- the actual solution is a size limit (10MB, already implemented), a
time limit (30 seconds, implemented in PR 11784), and a
content type allow list (HTML, "media" or JSON, implemented in PR
11936), and I'm not sure it's adequate
- (pure vanity:) I did not make it to their Hall of fame
I'm not sure those solutions are adequate because they all seem to
assume a single home server will pull that one URL for a little while
then stop. But in a federated network, many (possibly thousands of)
home servers may be connected in a single room at once. If an attacker
drops a link into such a room, all those servers would connect to
that link all at once. This is an amplification attack: a small
amount of traffic will generate a lot more traffic to a single
target. It doesn't matter that there are size or time limits: the
amplification is what matters here.
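A back-of-the-envelope calculation shows why the limits don't change the picture. The numbers below are assumptions for illustration only: a 1 kB chat message carrying the link, 1000 home servers in the room, and every preview fetch running up to the 10 MB size limit.

```python
def amplification(servers, request_bytes, response_bytes):
    """Return (total bytes the target has to serve, amplification factor)."""
    total = servers * response_bytes
    return total, total / request_bytes


# Assumed scenario: one small message, a thousand servers, and each
# preview fetch capped by the 10 MB size limit.
total, factor = amplification(
    servers=1000,
    request_bytes=1_000,
    response_bytes=10 * 1_000_000,
)
print(f"{total / 1e9:.0f} GB served by the target, {factor:,.0f}x amplification")
```

Under those assumptions a single message makes the target serve up to 10 GB. A lower size limit shrinks the total, but the factor still scales with the number of servers in the room.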
It should also be noted that clients that generate link previews
have more amplification because they are more numerous than
servers. And of course, the default Matrix client (Element) does
generate link previews as well.
That said, this is possibly not a problem specific to Matrix: any
federated service that generates link previews may suffer from this.
I'm honestly not sure what the solution is here. Maybe moderation?
Maybe link previews are just evil? All I know is there was this weird
bug in my Icecast server and I tried to ring the bell about it, and it
feels it was swept under the rug. Somehow I feel this is bound to blow
up again in the future, even with the current mitigation.
Moderation
In Matrix like elsewhere, moderation is a hard problem. There is a
detailed moderation guide and much of this problem space is
actively worked on in Matrix right now. A fundamental problem with
moderating a federated space is that a user banned from a room can
rejoin the room from another server. This is why spam is such a
problem in email, and why IRC networks have stopped federating ages
ago (see the IRC history for that fascinating story).
The mjolnir bot
The mjolnir moderation bot is designed to help with some of those
things. It can kick and ban users, redact all of a user's messages (as
opposed to one by one), all of this across multiple rooms. It can also
subscribe to a federated block list published by matrix.org to block
known abusers (users or servers). Bans are pretty flexible and
can operate at the user, room, or server level.
Matrix people suggest making the bot admin of your channels, because
you can't take back admin from a user once given.
There's also a new command line tool designed to do things like:
- System notify users (all users/users from a list, specific user)
- delete sessions/devices not seen for X days
- purge the remote media cache
- select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
- purge history of these rooms
- shutdown rooms
This tool and Mjolnir are based on the admin API built into
Synapse.
Rate limiting
Synapse has pretty good built-in rate-limiting which blocks
repeated login, registration, joining, or messaging attempts. It may
also end up throttling servers on the federation based on those
settings.
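For readers unfamiliar with the idea, rate limiting of this kind is commonly built as a token bucket: a burst of actions is allowed up front, then further actions are admitted at a steady rate. The sketch below is a generic illustration of that technique, not Synapse's implementation; the class, its parameter names, and the example policy are assumptions.

```python
class TokenBucket:
    """Allow `burst_count` actions at once, refilled at `per_second`."""

    def __init__(self, per_second, burst_count):
        self.per_second = per_second
        self.burst_count = burst_count
        self.tokens = float(burst_count)
        self.last = 0.0

    def allow(self, now):
        """Return True if an action at time `now` (in seconds) is permitted."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never above the burst size.
        self.tokens = min(self.burst_count, self.tokens + elapsed * self.per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical policy: bursts of 3 login attempts, then one every 5 seconds.
bucket = TokenBucket(per_second=0.2, burst_count=3)
results = [bucket.allow(t) for t in (0, 0, 0, 0, 10)]
print(results)
```

The fourth attempt at time zero is refused because the burst is used up, and the attempt ten seconds later goes through again.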
Fundamental federation problems
Because users joining a room may come from another server, room
moderators are at the mercy of the registration and moderation
policies of those servers. Matrix is like IRC's +R mode ("only
registered users can join") by default, except that anyone can
register their own homeserver, which makes this limited.
Server admins can block IP addresses and home servers, but those tools
are not easily available to room admins. There is an API
(m.room.server_acl in /devtools) but it is not reliable
(thanks Austin Huang for the clarification).
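As I understand the specification, the content of an m.room.server_acl event carries `allow` and `deny` lists of glob patterns, with `deny` taking precedence and anything unmatched being refused. The sketch below illustrates that matching under those assumptions. It ignores the handling of IP literals, the helper names are mine, and the ACL content is made up.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob where '*' matches any run and '?' one character."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def server_allowed(acl, server_name):
    """Apply deny-then-allow matching; anything unmatched is denied."""
    for pattern in acl.get("deny", []):
        if glob_to_regex(pattern).fullmatch(server_name):
            return False
    for pattern in acl.get("allow", []):
        if glob_to_regex(pattern).fullmatch(server_name):
            return True
    return False


# Hypothetical ACL: allow everyone except one domain and its subdomains.
acl = {"allow": ["*"], "deny": ["evil.example", "*.evil.example"]}
print(server_allowed(acl, "matrix.org"))
print(server_allowed(acl, "spam.evil.example"))
```

Even when this works as intended, it only blocks by server name, so an abuser who registers a fresh domain is back in until the list is updated.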
Matrix has the concept of guest accounts, but it is not used very
much, and virtually no client or homeserver supports it. This contrasts with the way
IRC works: by default, anyone can join an IRC network even without
authentication. Some channels require registration, but in general you
are free to join and look around (until you get blocked, of course).
I have heard anecdotal evidence that "moderating bridges is hell", and
I can imagine why. Moderation is already hard enough on one
federation; when you bridge a room with another network, you inherit
all the problems from that network, but without the full set of abuse
control tools from the original network's API...
Room admins
Matrix, in particular, has the problem that room administrators (who
have the power to redact messages, ban users, and promote other users)
are bound to their Matrix ID which is, in turn, bound to their home
servers. This implies that a home server administrator could (1)
impersonate a given user and (2) use that to hijack the room. So in
practice, the home server is the trust anchor for rooms, not the user
themselves.
That said, if the administrator of server B hijacks user joe on
server B, they will hijack that room on that specific server. This will not
(necessarily) affect users on the other servers, as servers could
refuse parts of the updates or ban the compromised account (or
server).
It does seem like a major flaw that room credentials are bound to
Matrix identifiers, as opposed to the E2E encryption credentials. In
an encrypted room even with fully verified members, a compromised or
hostile home server can still take over the room by impersonating an
admin. That admin (or even a newly minted user) can then send events
or listen on the conversations.
This is even more frustrating when you consider that Matrix events are
actually signed and therefore have some authentication attached
to them, acting like some sort of Merkle tree (as it contains a link
to previous events). That signature, however, is made from the
homeserver PKI keys, not the client's E2E keys, which makes E2E feel
like it has been "bolted on" later.
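The "link to previous events" part can be illustrated with a toy hash chain. This is only a sketch of the general technique: it is not Matrix's actual event format, it leaves out the homeserver signatures entirely, and the field and function names are assumptions.

```python
import hashlib
import json


def event_hash(event):
    """Hash a canonical (sorted-keys) JSON encoding of the event."""
    encoded = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_event(chain, sender, body):
    """Append an event that points at the hash of the previous one."""
    prev = [event_hash(chain[-1])] if chain else []
    chain.append({"sender": sender, "body": body, "prev_events": prev})


def verify(chain):
    """Check that every event still references its predecessor's hash."""
    return all(
        chain[i]["prev_events"] == [event_hash(chain[i - 1])]
        for i in range(1, len(chain))
    )


chain = []
append_event(chain, "@joe:b.example", "hello")
append_event(chain, "@ann:a.example", "hi joe")
print(verify(chain))            # the untouched chain checks out

chain[0]["body"] = "tampered"   # rewrite history after the fact
print(verify(chain))            # the link from the second event now breaks
```

The chain detects rewritten history, but it says nothing about who authored an event. That depends on which key signs it, which is where the homeserver keys come in.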
Availability
While Matrix has a strong advantage over Signal in that it's
decentralized (so anyone can run their own homeserver), I couldn't
find an easy way to run a "multi-primary" setup, or even a "redundant"
setup (even if with a single primary backend), short of going full-on
"replicate PostgreSQL and Redis data", which is not typically for the
faint of heart.
How this works in IRC
On IRC, it's quite easy to set up redundant nodes. All you need is:
- a new machine (with its own public address with an open port)
- a shared secret (or certificate) between that machine and an
existing one on the network
- a connect {} block on both servers
That's it: the node will join the network and people can connect to it
as usual and share the same user/namespace as the rest of the
network. The servers take care of synchronizing state: you do not need
to worry about replicating a database server.
(Now, experienced IRC people will know there's a catch here: IRC
doesn't have authentication built in, and relies on "services" which
are basically bots that authenticate users (I'm simplifying, don't
nitpick). If that service goes down, the network still works, but
then people can't authenticate, and they can start doing nasty things
like stealing people's identity if they get knocked offline. But still:
basic functionality still works: you can talk in rooms and with users
that are on the reachable network.)
User identities
Matrix is more complicated. Each "home server" has its own identity
namespace: a specific user (say @anarcat:matrix.org) is bound to
that specific home server. If that server goes down, that user is
completely disconnected. They could register a new account elsewhere
and reconnect, but then they basically lose all their configuration:
contacts and joined channels are all lost.
(Also notice how the Matrix IDs don't look like a typical user address
like an email in XMPP. They at least did their homework and got the
allocation for the scheme.)
Rooms
Users talk to each other in "rooms", even in one-to-one
communications. (Rooms are also used for other things like "spaces",
they're basically used for everything, think "everything is a file"
kind of tool.) For rooms, home servers act more like IRC nodes in that
they keep a local state of the chat room and synchronize it with other
servers. Users can keep talking inside a room if the server that
originally hosts the room goes down. Rooms can have a local,
server-specific "alias" so that, say, #room:matrix.org is also
visible as #room:example.com on the example.com home server. Both
addresses refer to the same underlying room.
(Finding this in the Element settings is not obvious though, because
those aliases are actually called "local addresses" there. So to create
such an alias (in Element), you need to go in the room settings'
"General" section, "Show more" in "Local address", then add the alias
name (e.g. foo), and then that room will be available on your
example.com homeserver as #foo:example.com.)
So a room doesn't belong to a server, it belongs to the federation,
and anyone can join the room from any server (if the room is public, or
if invited otherwise). You can create a room on server A and when a
user from server B joins, the room will be replicated on server B as
well. If server A fails, server B will keep relaying traffic to
connected users and servers.
A room is therefore not fundamentally addressed with the above alias;
instead, it has an internal Matrix ID, which is basically a random
string. It has a server name attached to it, but that was made just to
avoid collisions. That can get a little confusing. For example, the
#fractal:gnome.org room is an alias on the gnome.org server, but
the room ID is !hwiGbsdSTZIwSRfybq:matrix.org. That's because the
room was created on matrix.org, but the preferred branding is
gnome.org now.
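The different identifier types are told apart by their first character. Here is a small sketch that splits the identifier forms shown in this article into their parts; the `parse_identifier` helper and the labels in `SIGILS` are mine, and other identifier forms may exist that it does not handle.

```python
SIGILS = {"@": "user", "#": "room alias", "!": "room ID"}


def parse_identifier(identifier):
    """Split a Matrix identifier into (kind, localpart, server name)."""
    kind = SIGILS.get(identifier[:1])
    if kind is None or ":" not in identifier:
        raise ValueError(f"unrecognized identifier: {identifier!r}")
    localpart, server = identifier[1:].split(":", 1)
    return kind, localpart, server


# Two addresses for the same room: the alias carries the preferred
# branding, while the room ID still names the server where it was created.
print(parse_identifier("#fractal:gnome.org"))
print(parse_identifier("!hwiGbsdSTZIwSRfybq:matrix.org"))
```

The server name in the room ID is only a namespace, so it says nothing about where the room is currently hosted.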
As an aside, rooms, by default, live forever, even after the last user
quits. There's an admin API to delete rooms and a tombstone
event to redirect to another one, but neither has a GUI yet. The
latter is part of MSC1501 ("Room version upgrades") which allows
a room admin to close a room, with a message and a pointer to another
room.
Spaces
Discovering rooms can be tricky: there is a per-server room
directory, but Matrix.org people are trying to deprecate it in favor
of "Spaces". Room directories were ripe for abuse: anyone can create a
room, so anyone can show up in there. It's possible to restrict who
can add aliases, but anyways directories were seen as too limited.
In contrast, a "Space" is basically a room that's an index of other
rooms (including other spaces), so existing moderation and
administration mechanisms that work in rooms can (somewhat) work in
spaces as well. This enables a room directory that works across
federation, regardless of which server they were originally created on.
New users can be added to a space or room automatically in
Synapse. (Existing users can be told about the space with a server
notice.) This gives admins a way to pre-populate a list of rooms on a
server, which is useful to build clusters of related home servers,
providing some sort of redundancy, at the room -- not user -- level.
Home servers
So while you can work around a home server going down at the room
level, there's no such thing at the home server level, for user
identities. So if you want those identities to be stable in the long
term, you need to think about high availability. One limitation is
that the domain name (e.g. matrix.example.com) must never change in
the future, as renaming home servers is not supported.
The documentation used to say you could "run a hot spare" but that has
been removed. Last I heard, it was not possible to run a
high-availability setup where multiple, separate locations could
replace each other automatically. You can have high performance
setups where the load gets distributed among workers, but those
are based on a shared database (Redis and PostgreSQL) backend.
So my guess is it would be possible to create a "warm" spare server of
a Matrix home server with regular PostgreSQL replication, but
that is not documented in the Synapse manual. This sort of setup
would also not be useful to deal with networking issues or denial of
service attacks, as you will not be able to spread the load over
multiple network locations easily. Redis and PostgreSQL heroes are
welcome to provide their multi-primary solution in the comments. In
the meantime, I'll just point out this is a problem that's handled
somewhat more gracefully in IRC, by having the possibility of
delegating the authentication layer.
Delegations
If you do not want to run a Matrix server yourself, it's possible to
delegate the entire thing to another server. There's a server
discovery API which uses the .well-known pattern (or SRV
records, but that's "not recommended" and a bit confusing) to
delegate that service to another server. Be warned that the server
still needs to be explicitly configured for your domain. You can't
just put:
{ "m.server": "matrix.org:443" }
... on https://example.com/.well-known/matrix/server and start using
@you:example.com as a Matrix ID. That's because Matrix doesn't
support "virtual hosting" and you'd still be connecting to rooms and
people with your matrix.org identity, not example.com as you would
normally expect. This is also why you cannot rename your home
server.
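For illustration, here is roughly what a peer does with the delegation file once it has fetched it. This is a simplified sketch: it only parses the JSON body, falls back to port 8448 when none is given, and skips the SRV lookup and IPv6 literals entirely. The `parse_delegation` helper is hypothetical.

```python
import json

DEFAULT_FEDERATION_PORT = 8448


def parse_delegation(well_known_body):
    """Return (host, port) from the body of /.well-known/matrix/server."""
    target = json.loads(well_known_body)["m.server"]
    host, sep, port = target.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    # No explicit port: fall back to the default federation port.
    return target, DEFAULT_FEDERATION_PORT


print(parse_delegation('{ "m.server": "matrix.org:443" }'))
print(parse_delegation('{ "m.server": "matrix.example.com" }'))
```

Note that this only tells other servers where to connect. The server they reach must still be configured to answer for your domain.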
The server discovery API is what allows servers to find each
other. Clients, on the other hand, use the client-server discovery
API: this is what allows a given client to find your home server
when you type your Matrix ID on login.
The high availability discussion brushed over the performance of
Matrix itself, but let's now dig into that.
Horizontal scalability
There were serious scalability issues with the main Matrix server,
Synapse, in the past. So the Matrix team has been working hard to
improve its design. Since Synapse 1.22 the home server can
horizontally scale to multiple workers (see this blog post for details)
which can make it easier to scale large servers.
Other implementations
There are other promising home server implementations from a
performance standpoint (dendrite, Golang, entered beta in late
2020; conduit, Rust, beta; others), but none of those
are feature-complete so there's a trade-off to be made there. Synapse
is also adding a lot of features fast, so it's an open question whether
the others will ever catch up. (I have heard that Dendrite might
actually surpass Synapse in features within a few years, which would
put Synapse in a more "LTS" situation.)
Latency
Matrix can feel slow sometimes. For example, joining the "Matrix HQ"
room in Element (from matrix.debian.social) takes a few minutes
and then fails. That is because the home server has to sync the
entire room state when you join the room. There was promising work on
this announced in the lengthy 2021 retrospective, and some of
that work landed (partial sync) in the 1.53 release already.
Other improvements coming include sliding sync, lazy loading
over federation, and fast room joins. So that's actually
something that could be fixed in the fairly short term.
But in general, communication in Matrix doesn't feel as "snappy" as on
IRC or even Signal. It's hard to quantify this without instrumenting a
full latency test bed (for example the tools I used in the terminal
emulators latency tests), but
even just typing in a web browser feels slower than typing in an xterm
or Emacs for me.
Even in conversations, I "feel" people don't immediately respond as
fast. In fact, this could be an interesting double-blind experiment to
make: have people guess whether they are talking to a person on
Matrix, XMPP, or IRC, for example. My theory would be that people
could notice that Matrix users are slower, if only because of the TCP
round-trip time each message has to take.
Transport
Some courageous person actually ran some tests of various
messaging platforms on a congested network. His evaluation was
basically:
- Briar: uses Tor, so unusable except locally
- Matrix: "struggled to send and receive messages", joining a room
takes forever as it has to sync all history, "took 20-30 seconds
for my messages to be sent and another 20 seconds for further
responses"
- XMPP: "worked in real-time, full encryption, with nearly zero
lag"
So that was interesting. I suspect IRC would have also fared better,
but that's just a feeling.
Other improvements to the transport layer include support for
websocket and the CoAP proxy work from 2019 (targeting
100bps links), but both seem stalled at the time of writing. The
Matrix people have also announced the pinecone p2p overlay
network which aims at solving large, internet-scale routing
problems. See also this talk at FOSDEM 2022.
Usability
Onboarding and workflow
The workflow for joining a room, when you use Element web, is not
great:
- click on a link in a web browser
- land on (say) https://matrix.to/#/#matrix-dev:matrix.org
- offers "Element", yeah that sounds great, let's click "Continue"
- land on
https://app.element.io/#/room%2F%23matrix-dev%3Amatrix.org and
then you need to register, aaargh
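The two URLs in that workflow are just two encodings of the same room alias, which a short sketch makes visible. It only reproduces the URL shapes quoted above and assumes nothing else about how either site builds its links.

```python
from urllib.parse import quote

alias = "#matrix-dev:matrix.org"

# matrix.to keeps the alias readable inside the URL fragment.
matrix_to = "https://matrix.to/#/" + alias

# Element web percent-encodes the whole "room/<alias>" path in its fragment.
element = "https://app.element.io/#/" + quote("room/" + alias, safe="")

print(matrix_to)
print(element)
```

In both cases the alias sits in the URL fragment, so handling it is left entirely to whatever the page or browser decides to do with it.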
As you might have guessed by now, there is a specification to
solve this, but web browsers need to adopt it as well, so that's far
from actually being solved. At least browsers generally know about the
matrix: scheme, it's just not exactly clear what they should do with
it, especially when the handler is just another web page (e.g. Element
web).
In general, when compared with tools like Signal or WhatsApp, Matrix
doesn't fare so well in terms of user discovery. I probably have some
of my normal contacts that have a Matrix account as well, but there's
really no way to know. It's kind of creepy when Signal tells you
"this person is on Signal!" but it's also pretty cool that it works,
and they actually implemented it pretty well.
Registration is also less obvious: in Signal, the app confirms your
phone number automatically. It's friction-less and quick. In Matrix,
you need to learn about home servers, pick one, register (with a
password! aargh!), and then set up encryption keys (not default),
etc. It's a lot more friction.
And look, I understand: giving away your phone number is a huge
trade-off. I don't like it either. But it solves a real problem and
makes encryption accessible to a ton more people. Matrix does have
"identity servers" that can serve that purpose, but I don't feel
confident sharing my phone number there. It doesn't help that the
identity servers don't have private contact discovery: giving them
your phone number is a more serious security compromise than with
Signal.
There's a catch-22 here too: because no one feels like giving away
their phone numbers, no one does, and everyone assumes that stuff
doesn't work anyways. Like it or not, Signal forcing people to
divulge their phone number actually gives them critical mass that
means a lot of my relatives are actually on Signal and I don't have
to install crap like WhatsApp to talk with them.
5 minute clients evaluation
Throughout all my tests I evaluated a handful of Matrix clients,
mostly from Flathub because almost none of them are packaged in
Debian.
Right now I'm using Element, the flagship client from Matrix.org, in a
web browser window, with the PopUp Window extension. This makes
it look almost like a native app, and opens links in my main browser
window (instead of a new tab in that separate window), which is
nice. But I'm tired of buying memory to feed my web browser, so this
indirection has to stop. Furthermore, I'm often getting completely
logged off from Element, which means re-logging in, recovering my
security keys, and reconfiguring my settings. That is extremely
annoying.
Coming from Irssi, Element is really "GUI-y" (pronounced
"gooey"). Lots of clickety happening. To mark conversations as read,
in particular, I need to click-click-click on all the tabs that have
some activity. There's no "jump to latest message" or "mark all as
read" functionality as far as I could tell. In Irssi the former is
built-in (alt-a) and I made a custom /READ command for the latter:
/ALIAS READ script exec \$_->activity(0) for Irssi::windows
And yes, that's a Perl script in my IRC client. I am not aware of any
Matrix client that does stuff like that, except maybe Weechat, if we
can call it a Matrix client, or Irssi itself, now that it has a
Matrix plugin (!).
As for other clients, I have looked through the Matrix Client
Matrix (confusing right?) to try to figure out which one to try,
and, even after selecting Linux as a filter, the chart is just too
wide to figure out anything. So I tried those, kind of randomly:
- Fractal
- Mirage
- Nheko
- Quaternion
Unfortunately, I lost my notes on those, so I don't actually remember
which one did what. I still have a session open with Mirage, so I
guess that means it's the one I preferred, but I remember they were
also all very GUI-y.
Maybe I need to look at weechat-matrix or gomuks. At least Weechat
is scriptable so I could continue playing the power-user. Right now my
strategy with messaging (and that includes microblogging like Twitter
or Mastodon) is that everything goes through my IRC client, so Weechat
could actually fit well in there. Going with gomuks, on the other
hand, would mean running it in parallel with Irssi or ... ditching
IRC, which is a leap I'm not quite ready to take just yet.
Oh, and basically none of those clients (except Nheko and Element)
support VoIP, which is still kind of a second-class citizen in
Matrix. It does not support large multimedia rooms, for example:
Jitsi was used for FOSDEM instead of the native videoconferencing
system.
Bots
This falls a little outside the "usability" section, but I didn't know
where to put this... There are a few Matrix bots out there, and you are
likely going to be able to replace your existing bots with Matrix
bots. It's true that IRC has a long and impressive history with lots
of various bots doing various things, but given how young Matrix is,
there's still a good variety:
- maubot: generic bot with tons of usual plugins like sed, dice,
karma, xkcd, echo, rss, reminder, translate, react, exec,
gitlab/github webhook receivers, weather, etc
- opsdroid: framework to implement "chat ops" in Matrix,
connects with Matrix, GitHub, GitLab, Shell commands, Slack, etc
- matrix-nio: another framework, used to build lots more
bots like:
  - hemppa: generic bot with various functionality like weather,
    RSS feeds, calendars, cron jobs, OpenStreetMap lookups, URL
    title snarfing, Wolfram Alpha, astronomy pic of the day, Mastodon
    bridge, room bridging, oh dear
  - devops: ping, curl, etc
  - podbot: play podcast episodes from AntennaPod
  - cody: Python, Ruby, JavaScript REPL
  - eno: generic bot, "personal assistant"
- mjolnir: moderation bot
- hookshot: bridge with GitLab/GitHub
- matrix-monitor-bot: latency monitor
One thing I haven't found an equivalent for is Debian's
MeetBot. There's an archive bot but it doesn't have topics
or a meeting chair, or HTML logs.
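I am not claiming any of the bots above are built this way, but underneath the frameworks a bot mostly boils down to HTTP calls against the client-server API. Here is a minimal Python sketch of what posting a message looks like, assuming the v3 "send" endpoint of the client-server API; the homeserver URL, access token, and room ID are placeholders, and the function name is made up for illustration. It only builds the request, it does not send anything.

```python
import json
import uuid
from urllib.parse import quote
from urllib.request import Request


def build_message_request(homeserver: str, token: str, room_id: str, text: str) -> Request:
    """Build (but do not send) the HTTP request that posts a text message."""
    # Client-chosen transaction ID, so that a retry is not posted twice.
    txn_id = uuid.uuid4().hex
    url = (
        f"{homeserver}/_matrix/client/v3/rooms/{quote(room_id, safe='')}"
        f"/send/m.room.message/{txn_id}"
    )
    body = json.dumps({"msgtype": "m.text", "body": text}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; nothing goes over the network until the request
# is passed to urllib.request.urlopen().
req = build_message_request(
    "https://matrix.example.org", "SECRET_TOKEN", "!abc123:example.org", "pong"
)
print(req.get_method(), req.full_url)
```

A real bot also needs to log in, sync, and handle encryption, which is precisely what the frameworks listed above take care of.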
Working on Matrix
As a developer, I find Matrix kind of intimidating. The specification
is huge. The official specification itself looks somewhat
digestible: it's only 6 APIs so that looks, at first, kind of
reasonable. But whenever you start asking complicated questions about
Matrix, you quickly fall into the Matrix Spec Change
specification (which, yes, is a separate specification). And there are
literally hundreds of MSCs flying around. It's hard to tell
what's been adopted and what hasn't, and even harder to figure out if
your specific client has implemented it.
(One trendy answer to this problem is to "rewrite it in Rust": Matrix
are working on implementing a lot of those specifications in a
matrix-rust-sdk that's designed to take the implementation
details away from users.)
Just taking the latest weekly Matrix report, you find that
three new MSCs were proposed, just last week! There's even a graph that
shows the number of MSCs is progressing steadily, at 600+ proposals
total, with the majority (300+) "new". I would guess the "merged" ones
are at about 150.
That's a lot of text which includes stuff like 3D worlds which,
frankly, I don't think you should be working on when you have such
important security and usability problems. (The internet as a whole,
arguably, doesn't fare much better. RFC 600 is a really obscure
discussion about "INTERFACING AN ILLINOIS PLASMA TERMINAL TO THE
ARPANET". Maybe that's how many MSCs will end up as well, left
forgotten in the pits of history.)
And that's the thing: maybe the Matrix people have a different
objective than I have. They want to connect everything to everything,
and make Matrix a generic transport for all sorts of applications,
including virtual reality, collaborative editors, and so on.
I just want secure, simple messaging. Possibly with good file
transfers, and video calls. That it works with existing stuff is good,
and it should be federated to remove the "Signal point of
failure". So I'm a bit worried about the direction all those MSCs are
taking, especially when you consider that clients other than Element
are still struggling to keep up with basic features like end-to-end
encryption or room discovery, never mind voice or spaces...
Conclusion
Overall, Matrix is somehow in the space XMPP was a few years ago. It
has a ton of features, pretty good clients, and a large
community. It seems to have gained some of the momentum that XMPP has
lost. It may have the most potential to replace Signal if something
bad were to happen to it (like, I don't know, getting banned or
going nuts with cryptocurrency)...
But it's really not there yet, and I don't see Matrix trying to get
there either, which is a bit worrisome.
Looking back at history
I'm also worried that we are repeating the errors of the past. The
history of federated services is really fascinating. IRC, FTP, HTTP,
and SMTP were all created in the early days of the internet, and are
all still around (except, arguably, FTP, which was removed from major
browsers recently). All of them had to face serious challenges in
growing their federation.
IRC had numerous conflicts and forks, both at the technical level
and at the political level. The history of IRC is really
something that anyone working on a federated system should study in
detail, because they are bound to make the same mistakes if they are
not familiar with it. The "short" version is:
- 1988: Finnish researcher publishes first IRC source code
- 1989: 40 servers worldwide, mostly universities
- 1990: EFnet ("eris-free network") fork which blocks the "open
relay", named Eris - followers of Eris form the A-net, which
promptly dissolves itself, with only EFnet remaining
- 1992: Undernet fork, which offered authentication ("services"),
routing improvements and timestamp-based channel synchronisation
- 1994: DALnet fork, from Undernet, again on a technical disagreement
- 1995: Freenode founded
- 1996: IRCnet forks from EFnet, following a flame war of historical
proportion, splitting the network between Europe and the Americas
- 1997: Quakenet founded
- 1999: (XMPP founded)
- 2001: 6 million users, OFTC founded
- 2002: DALnet peaks at 136,000 users
- 2003: IRC as a whole peaks at 10 million users, EFnet peaks at
141,000 users
- 2004: (Facebook founded), Undernet peaks at 159,000 users
- 2005: Quakenet peaks at 242,000 users, IRCnet peaks at 136,000
(YouTube founded)
- 2006: (Twitter founded)
- 2009: (WhatsApp, Pinterest founded)
- 2010: (TextSecure AKA Signal, Instagram founded)
- 2011: (Snapchat founded)
- ~2013: Freenode peaks at ~100,000 users
- 2016: IRCv3 standardisation effort started (TikTok founded)
- 2021: Freenode self-destructs, Libera chat founded
- 2022: Libera peaks at 50,000 users, OFTC peaks at 30,000 users
(The numbers were taken from the Wikipedia page and
Netsplit.de. Note that I also include the launch of other networks in
parentheses for context.)
Pretty dramatic, don't you think? Eventually, somehow, IRC became
irrelevant for most people: few people are even aware of it now. With
fewer than a million active users, it's smaller than Mastodon, XMPP, or
Matrix at this point. If I were to venture a guess, I'd say that
infighting, lack of a standardization body, and a somewhat annoying
protocol meant the network could not grow. It's also possible that the
decentralised yet centralised structure of IRC networks limited their
reliability and growth.
But large social media companies have also taken over the space:
observe how IRC numbers peak around the time the wave of large social
media companies emerges, especially Facebook (2.9B users!!) and Twitter
(400M users).
Where the federated services are in history
Right now, Matrix and Mastodon (and email!) are at the
"pre-EFnet" stage: anyone can join the federation. Mastodon has
started working on a global block list of fascist servers which is
interesting, but it's still an open federation. Right now, Matrix is
totally open, but matrix.org publishes a (federated) block list
of hostile servers (#matrix-org-coc-bl:matrix.org, yes, of course
it's a room).
Interestingly, Email is also in that stage, where there are block
lists of spammers, and it's a race between those blockers and
spammers. Large email providers, obviously, are getting closer to the
EFnet stage: you could consider they only accept email from themselves
or between themselves. It's getting increasingly hard to deliver mail
to Outlook and Gmail for example, partly because of bias against small
providers, but also because they are including more and more
machine-learning tools to sort through email and those systems are,
fundamentally, unknowable. It's not quite the same as splitting the
federation the way EFnet did, but the effect is similar.
HTTP has somehow managed to live in a parallel universe, as it's
technically still completely federated: anyone can start a web server
if they have a public IP address and anyone can connect to it. The
catch, of course, is how you find the darn thing. Which is how Google
became one of the most powerful corporations on earth, and how they
became the gatekeepers of human knowledge online.
I have only briefly mentioned XMPP here, and my XMPP fans will
undoubtedly comment on that, but I think it's somewhere in the middle
of all of this. It was co-opted by Facebook and Google, and
both corporations have abandoned it to its fate. I remember fondly the
days where I could do instant messaging with my contacts who had a
Gmail account. Those days are gone, and I don't talk to anyone over
Jabber anymore, unfortunately. And this is a threat that Matrix still
has to face.
It's also the threat Email is currently facing. On the one hand
corporations like Facebook want to completely destroy it and have
mostly succeeded: many people just have an email account to
register on things and talk to their friends over Instagram or
(lately) TikTok (which, I know, is not Facebook, but they started that
fire).
On the other hand, you have corporations like Microsoft and Google who
are still using and providing email services — because, frankly, you
still do need email for stuff, just like fax is still around —
but they are more and more isolated in their own silo. At this point,
it's only a matter of time before they reach critical mass and just decide
that the risk of allowing external mail coming in is not worth the
cost. They'll simply flip the switch and work on an allow-list
principle. Then we'll have closed the loop and email will be
dead, just like IRC is "dead" now.
I wonder which path Matrix will take. Could it liberate us from these
vicious cycles?
Update: this generated some discussions on lobste.rs.
wrap-and-sort with experimental support for comments in devscripts/2.22.2
In the devscripts package currently in Debian testing (2.22.2),
wrap-and-sort has opt-in support for preserving comments in deb822
control files such as debian/control and debian/tests/control.
Currently, this is an opt-in feature to provide some exposure without
breaking anything.
To use the feature, add --experimental-rts-parser to the command line
(adjust the other options to your relevant style).
Please provide relevant feedback to #820625 if you have any. If you
experience issues, please remember to provide the original control
file along with the concrete command line used.
As hinted above, the option is a temporary measure and will be removed
again once the testing phase is over, so please do not put it into
scripts or packages. For the same reason, wrap-and-sort will emit a
slightly annoying warning when using the option.
Enjoy.
20 June, 2022 08:00PM by Niels Thykier