Free Software, Free Society!
Thoughts of the FSFE Community (English)

Friday, 07 June 2019

I'm going to Akademy!

And you should too!

Akademy is free to attend; however, you need to register to reserve your space, so head to https://akademy.kde.org/2019/register and press the buttons.

Akademy is a great place to meet people, discuss future plans, learn about new things and make friends for life!

Note that this year the recommended accommodations are a bit on the expensive side, so you may want to hurry and apply for travel support. The last round is open until July 1st.


I'm going to Akademy 2019

Thursday, 06 June 2019

Akademy-es 2019 talks announced!

Akademy-es 2019 will be happening this June 28-30 in Vigo.

The talks have just been announced.

Check them out at https://www.kde-espana.org/akademy-es-2019/programa-akademy-es-2019. There are lots of interesting talks, so if you understand Spanish and are interested in KDE or Free Software in general, I'd really recommend attending!

Friday, 17 May 2019

[Some] KDE Applications 19.04.1 also available in flathub

Thanks to Nick Richards we've been able to convince flathub to temporarily accept our old appdata files as still valid. It's a stopgap workaround, but at least it gives us some breathing room. The updates are coming in as we speak.

Partition like a pro with fdisk, sfdisk and cfdisk

  • Seravo
  • 10:44, Friday, 17 May 2019

Most Linux distributions ship the hard drive partitioning tool fdisk by default. Knowing how to use it is a good skill for every Linux system administrator, since rescuing a system with disk issues is a very common task. If the admin is faced with a prompt in a rescue-mode boot, fdisk is often the only partitioning tool available and must be used: if the main root filesystem is broken, one cannot install any other partitioning tools.

When installing Debian-based systems (e.g. Ubuntu) with the text-mode server installer, keep in mind that you can at any time during the installation press Ctrl+Alt+F2 to jump to a text console running a limited shell (BusyBox) and manipulate the system as you wish, for example to run fdisk. When done, press Ctrl+Alt+F1 to jump back to the installer screen.

In fact, fdisk is not a single utility; it ships as a suite of three commands: fdisk, sfdisk and cfdisk.

fdisk

Most Linux sysadmins have at some point used fdisk, the classic partitioning tool. There is also a tool with the same name in Windows, but it's not the same tool. Across the Unix ecosystem, however, fdisk is nowadays the same tool everywhere, even on macOS.

To list the current partition layout one can simply run fdisk -l /dev/sda. Below is an example of the output. One can also run fdisk -l /dev/sd* to print the partition info of all sd devices in one go. The fdisk man page lists all the command line parameters available.

Example output of fdisk -l

If one runs fdisk on a device without the -l flag (e.g. fdisk /dev/sda), it will launch in interactive mode. Pressing m will show the help. To create a new GPT partition table (GPT being the choice for modern disks; this resets any existing partition table) and add a new Linux partition that uses all available disk space, one can simply enter the commands g, n and w in sequence, pressing Enter at every question to accept the default values.

In-command help in fdisk, printed when pressing m
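
For reference, the same g, n and w sequence can also be fed to fdisk non-interactively, although sfdisk (covered below) is the proper tool for scripting. A minimal sketch, with /dev/sdX as a placeholder for a disk whose contents you are willing to destroy:

# DANGER: wipes the existing partition table on /dev/sdX.
# g = new GPT table, n = new partition (three blank lines accept the defaults), w = write.
printf 'g\nn\n\n\n\nw\n' | fdisk /dev/sdX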

cfdisk

The command cfdisk serves the same purpose as fdisk, with the difference that it provides a slightly fancier user interface based on ncurses: there are menus one can browse with the arrow keys and the Tab key, without having to remember the single-letter commands fdisk uses.

Screenshot of cfdisk

sfdisk

The third tool in the suite is sfdisk. It is designed to be scripted, enabling administrators to automate partitioning operations.

The key to sfdisk operations is to first dump the current layout using the -d argument, for example sfdisk -d /dev/sda > partition-table. An example output would be:

$ cat partition-table
label: gpt
label-id: AF7B83C8-CE8D-463D-99BF-E654A68746DD
device: /dev/sda
unit: sectors
first-lba: 34
last-lba: 937703054

/dev/sda1 : start=        2048, size=      997376, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=F35A875F-1A53-493E-85D4-870A7A749872
/dev/sda2 : start=      999424, size=   936701952, type=A19D880F-05FC-4D3B-A006-743F0F84911E, uuid=725EAB2A-F2E2-475E-81DC-9A5E18E29678

This text file describes the partition type, the layout and also includes the device UUIDs. The file above can be considered a backup of the /dev/sda partition table. If something has gone wrong with the partition table and this file was saved at some earlier time, one can recover the partition table by running: sfdisk /dev/sda < partition-table.
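
Because the dump is just text, it is also easy to back up the partition tables of all disks in one go before touching anything. A sketch, with the device glob and destination directory as assumptions to adapt:

# Back up the partition table of every /dev/sd? disk into /root
for disk in /dev/sd?; do
    sfdisk -d "$disk" > "/root/partition-table-$(basename "$disk")"
done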

Copying the partition table to multiple disks

One neat application of sfdisk is that it can be used to copy the partition layout to many devices. Say you have a big server with 16 hard disks. Once you have partitioned the first disk, you can dump its partition table with sfdisk -d and then edit the dump file (remember, it is just a plain-text file) to remove references to the device name and UUIDs, which are unique to a specific device and not something you want to clone to other disks. If the initial dump was the example above, the version with unique identifiers removed would look like this:

label: gpt
unit: sectors
first-lba: 34
last-lba: 937703054

start=        2048, size=      997376, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4
start=      999424, size=   936701952, type=A19D880F-05FC-4D3B-A006-743F0F84911E

This can be applied to another disk, for example /dev/sdb, simply by running sfdisk /dev/sdb < partition-table. Now all the admin needs to do is run the same command a couple of times with only one character changed on each invocation, as in the sketch below.
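
A sketch of that loop, assuming the remaining disks are /dev/sdb through /dev/sdp and that every one of them may be overwritten:

# Apply the cleaned-up layout to each remaining disk (destructive!)
for disk in /dev/sd[b-p]; do
    sfdisk "$disk" < partition-table
done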

Listing device UUIDs with blkid

Keep in mind that the Linux kernel uses the device UUIDs for identifying partitions and file systems. Be wary not to accidentally give two disks the same UUID with sfdisk. Technically it is possible, and maybe useful in some situations where one wants to replace a hard drive and make the new one 100% identical, but in a running system all disks should have unique UUIDs.

To list all UUIDs use blkid. Below is an example of the output:

$ blkid
/dev/sda1: UUID="F379-8147" TYPE="vfat" PARTUUID="f35a875f-1a53-493e-85d4-870a7a749872"
/dev/sda2: UUID="5f12f800-1d8d-6192-0881-966a70daa16f" UUID_SUB="2d667c5b-b9f3-6510-cf76-9231122533ce" LABEL="fi-e3:0" TYPE="linux_raid_member" PARTUUID="725eab2a-f2e2-475e-81dc-9a5e18e29678"
/dev/sdb2: UUID="5f12f800-1d8d-6192-0881-966a70daa16f" UUID_SUB="0221fce5-2762-4b06-2d72-4f4f43310ba0" LABEL="fi-e3:0" TYPE="linux_raid_member" PARTUUID="cd2a477f-0b99-4dfc-baa6-f8ebb302cbbb"
/dev/md0: UUID="dcSgSA-m8WA-IcEG-l29Q-W6ti-6tRO-v7MGr1" TYPE="LVM2_member"
/dev/mapper/ssd-ssd--swap: UUID="2f2a93bd-f532-4a6d-bfa4-fcb96fb71449" TYPE="swap"
/dev/mapper/ssd-ssd--root: UUID="660ce473-5ad7-4be9-a834-4f3d3dfc33c3" TYPE="ext4"

Extra tip: listing all disks with lsblk

While fdisk -l is nice for listing partition tables, admins often also want to know the partition sizes in a human-readable format and what the partitions are used for. For this purpose the command lsblk is handy. While the default output is often enough, supplying the extra arguments -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT makes it even better. See below for an example of the output:

$ lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT
NAME                  SIZE FSTYPE            TYPE  MOUNTPOINT
sda                 447,1G                   disk  
├─sda1                487M vfat              part  /boot/efi
└─sda2              446,7G linux_raid_member part  
  └─md0             446,5G LVM2_member       raid1 
    ├─ssd-ssd--swap   8,8G swap              lvm   [SWAP]
    └─ssd-ssd--root 437,7G ext4              lvm   /
sdb                 447,1G                   disk  
├─sdb1                487M                   part  
└─sdb2              446,7G linux_raid_member part  
  └─md0             446,5G LVM2_member       raid1 
    ├─ssd-ssd--swap   8,8G swap              lvm   [SWAP]
    └─ssd-ssd--root 437,7G ext4              lvm   /
sdc                 447,1G                   disk  
├─sdc1                487M                   part  
└─sdc2              446,7G                   part  

A word of warning…

Remember that modifying the partition table is a destructive process. It is something the admin does while installing new systems or recovering broken ones. If done wrongly, all data might be lost!

Wednesday, 15 May 2019

No KDE Applications 19.04.1 available in flathub

The flatpak and flathub developers have changed what they consider a valid appdata file, so our appdata files that were valid last month are not valid anymore and thus we can't build the KDE Applications 19.04.1 packages.

Wednesday, 08 May 2019

Of elitists and laypeople

Spoilers for Game of Thrones ahead.

I have been watching Game of Thrones with great interest the past few weeks. It has very strongly highlighted a struggle that has been gripping my mind for a while now: That between elitists and laypeople. And I find myself in a strange in-between.

For those not in the know, the latest season of Game of Thrones is a bit controversial to say the least. If you skip past the internet vitriol, you’ll find a lot of people disliking the season for legitimate reasons: The battle tactics don’t make any sense, characters miraculously survive after the camera cuts away, time and distance stopped being an issue in a setting that used to take it slow, and there’s a weird, forced conflict that would go away entirely if these two characters that are already in love would simply marry. And the list goes on, I’m sure.

But on the other hand, there appears to be a large body of laypeople who watch and enjoy the series. Millions of people tune in every week to watch fictional people fight over a fictional throne, and they appear to be enjoying themselves. And me? Sure, I’m enjoying myself as well. I had muscle aches from the tension of watching The Long Night, and nothing gripped me more than the half-botched assassination attempt at the end of the episode.

So what gives? On the one hand enthusiasts are rightly criticising the writers for some very strange decisions, but on the other hand millions of people are enjoying the series all the same.

Of power users and newbies

I frankly don’t know the answer to the posed question, but I do know an analogy that prompted me to write this blog post. I am a humble contributor to the GNOME Project, chiefly as translator for Esperanto, but also minuscule bits and bobs here and there. GNOME faces a similar problem with detractors: They have their complaints about systemd, customisability, missing power user features, themes breaking, and so forth. And I’m sure they have some valid points, but GNOME remains the default desktop environment on many distributions, and many people use and love GNOME as I do.

These detractors often run some heavily customised Arch Linux system with some unintuitive-but-efficient window manager, and don’t have any editor other than Vim installed. Or in other words: They run a system that the vast majority of people could not and do not want to use.

And I understand these people, because in one aspect of my digital life, I have been one of them. For at least two years, I ran Spacemacs as my primary editor. For a while I even did my e-mail through that program, and I loved it. Kind of. Sure, everything was customisable, and the keyboard shortcuts were magically fast, but the mental overhead of using that program was slowly grinding me down. Some menial task that I do infrequently would turn out to involve a non-intuitive sequence of keys that you just simply need to know, and I would spend far too long on figuring that out. Or I would accidentally open Vim inside of the Emacs terminal emulator, and :q would be sent to Emacs instead of the emulator. Sure, if you know enough Emacs wizardry, you could easily escape this situation, but that’s the point, isn’t it? The wizardry involved takes effort that I don’t always want to put in, even if I know that it pays off. Kind of.

These days I use VSCodium, a Free Software version of Visual Studio Code. I like it well enough for a multitude of reasons, but mainly because the mental overhead of using this editor is a lot lower. Even so, is Emacs a better editor? Probably. If I could be bothered to maintain my Emacs wizarding skills, I am fairly certain that it would be the perfect editor. But that’s a big if. So that’s why I settle for VSCodium. And the same line of reasoning can be extended to why I use and love GNOME.

Back to Westeros

Having made that analogy, can it be mapped onto the kerfuffle surrounding Game of Thrones? Is it a matter of a small group wanting an intricate, advanced plot and a larger group wanting a simple, rudimentary story, because they can’t or don’t want to deal with a complicated story?

It seems that way, but the damnedest thing is that I don’t know. I like the latest season of Game of Thrones for what it is: An archetypical fight of good versus evil. The living gathered together to fight an undead army, and the living won. Such a story is a lot easier to get into as a layperson, and there is nothing wrong with enjoying simple, archetypical stories.

But that’s not what Game of Thrones is. Game of Thrones is the derivative of an incredibly intricate series of books with so many details and plots, and the TV series stayed faithful to that for a long time. The latest season is a huge diversion from its roots. It is, as far as I can tell, replacing vi with nano. There is nothing wrong with either, but there is a good reason why the two are separate.

Who are these laypeople, anyway?

This question is difficult to answer, because the layperson isn’t me. It can’t possibly be, because here I am writing about the subject. The layperson must be someone who isn’t particularly interested or informed. I imagine that they just turn on the telly and enjoy it for what it is. No deep thoughts, no deep investment.

But why don’t these laypeople care? Why should we care about laypeople? Must we really dumb everything down for the lowest common denominator? Why can’t they just get on my level? This is really frustrating!

Enter cars.

I have a driving license, but I don’t really care about cars. I know how to work the pedals and the steering wheel, and that’s pretty much it. I don’t know why I don’t care about cars. I just want to get from my home to my destination and back. If I can put in as little effort as possible to do that, I’m happy. I just don’t have the time or desire to learn all the intricacies of cars.

And knowing that, I suppose that I’m the layperson I was so frustrated over a moment ago. When I walk into the garage with a minor problem, I like to imagine that I’m the sort of person who shows up at tech support because I can’t log in: I accidentally pressed Caps Lock.

So the layperson is me. Sometimes.

Then who are the elitists?

Having said all of that, something throws a wrench in the works: Game of Thrones was also immensely popular when it had all the intricacies and inter-weaving plots of the first few seasons. That appears to indicate to me that laypeople aren’t allergic to the kind of story that the enthusiasts want. But they aren’t allergic to the story that is being told in season 8, either, unlike the elitists.

So why do the elitists care? Why can’t they just appreciate the same things that laypeople do? Why must it always be so complex? Why should the complaints of a few outweigh the enjoyment of many?

And this is where I get stuck. Because frankly, I don’t know. Shouldn’t everything be as accessible as possible? The more the merrier? Why should vi exist when nano suffices?

But you can take my vi key bindings from my cold, dead hands. And I love what Game of Thrones used to be, and am sad that it morphed into an archetypical story that used to be its antithesis. I want complex things, even though I switched from Spacemacs to VSCodium and use GNOME instead of i3. Not for the sake of difficulty, but because complexity gives me something that simplicity cannot.

So the elitist is me. Sometimes.

Fin

I’m still in a limbo about this clash between elitists and laypeople. Maybe the clash is superficial and the two can exist side-by-side or separately. Maybe the writers of Game of Thrones just aren’t very good and accidentally made the story for laypeople instead of their target audience of elitists. Maybe it’s a sliding scale instead of a binary.

I don’t really know. I just wanted to get these thoughts out of my head and into a text box.

Sunday, 05 May 2019

External encrypted disk on LibreELEC

Last year I replaced, on the Raspberry Pi, Arch Linux ARM (with just Kodi installed) with LibreELEC.

Today I plugged in an external disk encrypted with dm-crypt, but to my surprise this isn’t supported.

Luckily the project is open source and sky42 already provides a LibreELEC version with built-in dm-crypt support.

Once I flashed sky42’s version, I set up an automated mount at startup via the autostart.sh script and the corresponding unmount via shutdown.sh this way:

# copy your keyfile into /storage via SSH
$ cat /storage/.config/autostart.sh
cryptsetup luksOpen /dev/sda1 disk1 --key-file /storage/keyfile
mount /dev/mapper/disk1 /media

$ cat /storage/.config/shutdown.sh
umount /media
cryptsetup luksClose disk1

Reboot it and voilà!
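
If the external disk might not always be attached at boot, a slightly more defensive autostart.sh is possible. A sketch only, reusing the same device, mapper name and keyfile paths as above:

$ cat /storage/.config/autostart.sh
# Only try to unlock and mount if the disk is actually present
if [ -b /dev/sda1 ]; then
    cryptsetup luksOpen /dev/sda1 disk1 --key-file /storage/keyfile
    mount /dev/mapper/disk1 /media
fi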

Automount

If you want to automatically mount the disk whenever you plug it in, create the following udev rule:

# Find out ID_VENDOR_ID and ID_MODEL_ID for your drive by using `udevadm info`
$ cat /storage/.config/udev.rules.d/99-automount.rules
ACTION=="add", SUBSYSTEM=="usb", SUBSYSTEM=="block", ENV{ID_VENDOR_ID}=="0000", ENV{ID_MODEL_ID}=="9999", RUN+="cryptsetup luksOpen $env{DEVNAME} disk1 --key-file /storage/keyfile", RUN+="mount /dev/mapper/disk1 /media"

Saturday, 04 May 2019

Hardening OpenSSH Server

start by reading:

man 5 sshd_config
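
Before copying any of the lists below into sshd_config, it is worth checking which algorithms the local OpenSSH build actually supports; a quick check:

ssh -Q cipher   # supported ciphers
ssh -Q kex      # supported key exchange algorithms
ssh -Q mac      # supported MACs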

 

CentOS 6.x

Ciphers aes128-ctr,aes192-ctr,aes256-ctr
KexAlgorithms diffie-hellman-group-exchange-sha256
MACs hmac-sha2-256,hmac-sha2-512

and change the below setting in /etc/sysconfig/sshd:

AUTOCREATE_SERVER_KEYS=RSAONLY

 

CentOS 7.x

Ciphers chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256
MACs umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512

 

Ubuntu 18.04.2 LTS

Ciphers chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256
MACs umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512
HostKeyAlgorithms ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,ssh-ed25519-cert-v01@openssh.com,ssh-rsa-cert-v01@openssh.com,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256

 

Archlinux

Ciphers chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256
MACs umac-128-etm@openssh.com,hmac-sha2-512-etm@openssh.com
HostKeyAlgorithms ssh-ed25519-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-rsa-cert-v01@openssh.com,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa

 

Renew SSH Host Keys

rm -f /etc/ssh/ssh_host_* && service sshd restart
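
If the service script on your system does not regenerate missing host keys on restart, ssh-keygen -A will create them (for all default key types), and sshd -t checks the configuration for syntax errors; a sketch:

rm -f /etc/ssh/ssh_host_*
ssh-keygen -A          # regenerate missing host keys (all default types)
sshd -t                # validate sshd_config before restarting
service sshd restart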

 

Generating ssh moduli file

for advanced users only

ssh-keygen -G /tmp/moduli -b 4096

ssh-keygen -T /etc/ssh/moduli -f /tmp/moduli 
Tag(s): openssh

Automated phone backup with Syncthing

How do you backup your phones? Do you?

I used to copy all the photos and videos from my phone and my wife’s phone to my PC monthly, and then copy them to an external HDD attached to a Raspberry Pi.

However, it’s a tedious job, mainly because:

  • I cannot really use the phones during this process;
  • MTP works one time in three, so often I have to fall back to ADB;
  • I have to unmount the SD cards to speed up the copy;
  • after I copy the files, I have to rsync everything to the external HDD.

The Syncthing way

Syncthing describes itself as:

Syncthing replaces proprietary sync and cloud services with something open, trustworthy and decentralized.

I installed it on our Android phones and on the Raspberry Pi. On the Raspberry Pi I also enabled remote access.

I started the Syncthing application on the Android phones and chose the folders to back up (you can also select the whole internal memory). Then I shared them with the Raspberry Pi only and set the folder type to “Send Only”, because I don’t want the Android phones to retrieve any files from the Raspberry Pi.

On the Raspberry Pi, I accepted the sharing request from the Android phones, but I also changed the folder type to “Receive Only” because I don’t want the Raspberry Pi to send any file to the Android phones.

All done? Not yet.

Syncthing’s main purpose is to sync, not to back up. This means that, by default, if I delete a photo from my phone, that photo is gone from the Raspberry Pi too, and that isn’t what I need or want.

However, Syncthing supports File Versioning and, best of all, it has a “trash can”-like versioning mode which moves your deleted files into a .stversions subfolder. If that isn’t enough, you can also write your own file versioning script; a sketch follows below.
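
As a sketch of what such a script could look like (assuming Syncthing invokes the external versioning command with the folder path and the file’s path relative to that folder as its two arguments, and that /mnt/hdd/syncthing-archive is where old versions should go; check the File Versioning documentation for the exact contract):

#!/bin/sh
# Hypothetical external versioning script: instead of deleting a file,
# move it into a dated archive directory on the external HDD.
FOLDER="$1"    # assumed: absolute path of the synced folder
FILE="$2"      # assumed: path of the file, relative to FOLDER
ARCHIVE="/mnt/hdd/syncthing-archive/$(date +%Y-%m-%d)"
mkdir -p "$ARCHIVE/$(dirname "$FILE")"
mv "$FOLDER/$FILE" "$ARCHIVE/$FILE"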

All done? Yes! Whenever I connect to my own WiFi, my photos are backed up!

Friday, 03 May 2019

How I put order in my bookmarks and found a better way to organise them

I have gone through several stages of this and so far nothing has stuck as ideal, but I think I am inching towards it.

To start off, I have to confess that while I love the internet and the web, I loathe having everything in the browser. The browser becoming the OS is what seems to be happening, and I hate that thought. I like to keep things locally, having backups, and control over my documents and data. Although I changed my e-mail provider(s) several times, I still have all my e-mail locally stored from 2003 until today.

I also do not like reading longer texts on an LCD, so I usually put longer texts into either Wallabag or Mozilla’s Pocket to read them later on my eInk reader (Kobo Aura). BTW, Wallabag and Pocket both have their pros and cons themselves. Pocket is more popular and better integrated into a lot of things (e.g. Firefox, Kobo, etc.), while Wallabag is fully FOSS (even the server) and offers some extra features that are in Pocket either subject to subscription or completely missing.

Still, an enormous amount of information is (and should be!) on the web, so each of us needs to somehow keep track and make sense of it :)

So, with that intro out of the way, here is how I tackle(d) this mess.

Historic overview of methods I used so far

Hierarchy of folders

Like many of us, I guess, I started by putting bookmarks in the bookmark bar, but soon had to start organising them into folders … and subfolders … and subsubfolders … and subsubsubfolders … until the whole tree no longer fit on the screen when expanded.

Pro:

  • can be neat and tidy
  • easy to sync between devices through e.g. Firefox Sync

Con:

  • can become a huge mess, once it grows to a behemoth
  • takes several clicks to put a bookmark into the appropriate (sub)folder

Then I decided to keep it flat and use the Firefox search bar to find what I am looking for.

To achieve that, when I bookmarked something, I renamed it to something useful and added tags (e.g.: shop, tea; or python, sql, howto).

This worked kinda OK, but a big downside is that there is a huge amount of clutter which is not easy to navigate and edit once you want to organise all the already existing bookmarks. The bookmark panel is somewhat helpful, but not a lot.

Pro:

  • easy to search
  • easy to find a relevant bookmark when you are about to search for something through the combined URL/search bar
  • easy to sync between devices through e.g. Firefox Sync

Con:

  • your search query must match the name, tag(s), or URL of the bookmark
  • hard to find or navigate other than by searching (for name, tag, or URL)

OneTab

Several years after that, I learnt about OneTab from an onboarding website of a company I applied to (but did not get the job). Its main promise is to turn loads of open tabs into (simple) organised lists on a single page. And all that with a single click (well, two really).

This worked (and still does) wonders for decluttering my tab list, especially when combined with Tree Style Tabs, which I very warmly recommend trying out. Even if it looks odd and unruly at first, it is very easy to get used to and helps organise tabs immensely. But back to OneTab…

The good side of OneTab is that it really helps keep your tab bar clean and therefore reduces your computer’s resource usage. It is also super for keeping track of tabs that you may (or maybe not) need to open again later, as you can (re)open a whole group of “tabs” with a single click.

As a practical example, let us say I am travelling to Barcelona in two months. So I book flights and the hotel, and in the process also check out some touristy and other helpful info. Because I will not be needing the touristy and travel stuff for quite some time before the trip, I do not need all the tabs open. But as it is a one-off trip, it is also silly to bookmark it all. So I send them all to OneTab and name the group e.g. “Barcelona trip 2019”. If I stumble upon any new stuff that is relevant, I simply send it to the same Named Group in OneTab. Once I need that info, I either open individual “tabs” or restore the whole group with one click and have it ready. An additional cool thing is that by default if you open a group or a single link “tab” from OneTab, it will remove it from the list. You can decide to keep the links in the list as well.

In practice, I still used tagged bookmarks for links that I wanted to store long-term, while depending on OneTab for short- to mid-term storage.

Pro:

  • great for decluttering your tabs
  • helps keep your browser’s resource usage low
  • great for creating (temporary) lists of tabs that you do not need now, but will in the future
  • can easily send a group of “tabs” with others via e-mail

Con:

  • no tags, categories or other means of adding meta data – you can only name groups, and cannot even rename links
  • no searching other than through the “webpage” list of “tabs”
  • as the list of “tabs”/bookmarks grows, the harder it is to keep an overview
  • cannot sync between devices
  • (proprietary plug-in)

Worldbrain’s Memex

About two months ago, I stumbled upon Worldbrain’s Memex through a FOSDEM talk. It promises to fix bookmarking, searching, note-taking and web history for you … which is quite an impressive lot.

So far, I have to say, I am quite impressed. It is super easy to find stuff you visited, even if you forgot to bookmark it, as it indexes all the websites you visit (unless you tell Memex to ignore that page or domain).

For more order, you can assign tags to websites and/or store them into collections (i.e. groups or folders). What is more, you can do that even later, if you forgot about it the first time. If you want to especially emphasise a specific website, you can also star it.

An excellent feature missing in other bookmarking methods I have seen so far is that it lets you annotate websites – through highlights and comments and tags attached to those highlights. So, not only can you store comments and tags on the websites, but also on annotations within those websites.

One concern I have is that they might have bitten off more than they can chew, but since I started using it, I have seen so much progress that I am (cautiously) optimistic about it.

Pro:

  • supports both tags and collections (i.e. groups)
  • enables annotations/highlights and comments (as well as tags to both) to websites
  • indexes websites, so when you search for something it goes through both the website’s text, as well as your notes to that website and, of course, tags
  • starring websites you would like to find more easily
  • you can also set specific websites or domain names to be ignored
  • it offers quite an advanced search, including limiting by date ranges, stars, or domains
  • when you search for something (e.g. using DuckDuckGo or Google) it shows suggested websites that you already visited before
  • sharing of annotations and comments with others (as long as they also have Memex installed)
  • for annotations it uses the W3C Open Annotation spec
  • stores everything locally (with the exception of sharing annotations via a link, of course)

Con:

  • it consumes more disk space due to running its own index
  • needs an external app for backing up data
  • so far no syncing of bookmarks between devices (but it is in the making)
  • so far it does not sync annotations between different devices (but both mobile apps for iOS/Android, and Pocket integration are in the making)

Status quo and looking at the future

I still have a few dozen bookmarks that I need to tag in Memex and delete from my Firefox bookmarks, and a further several dozen in OneTab.

The most viewed websites, I have in the “Top Sites” in Firefox.

Most of the “tabs” in OneTab I have already migrated to Memex, and I am very much looking forward to trying to use it instead of OneTab. So far it seems a bit more work, as I need to 1) open all tabs into a tab tree (same as in OneTab), 2) open that tab tree in a separate window (extra step), then 3) use the “Tag all tabs in window” or “Add all tabs in window” option from the extension button (similar as in OneTab), and finally 4) close the tabs by closing the window (extra step). What I usually do is change a Tab Group from OneTab into a Collection in Memex and then take some extra time to add tags or notes, if appropriate.

So, I am quite confident Memex will be able to replace OneTab for me and most likely also (most) normal bookmarks. I may keep some bookmarks of things that I want to always keep track of, like my online bank’s URL, but I am not sure yet.

The annotations are a god-send as well, which will be very hard to get rid of, as I already got used to them.

Now, if I could only send stuff to my eInk reader (or phone), annotate it there and have those annotations auto-magically show up in the browser and therefore stored locally on my laptop … :D

Oh, oh, and if I could search through Memex from my KDE Plasma desktop and add/view annotations from other documents (e.g. ePub, ODF, PDF) and other applications (e.g. Okular, Calibre, LibreOffice). One may dream …

hook out → sipping Vin Santo and planning more order in bookmarks


P.S. This blog post was initially a comment to the topic “How do you organize your bookmarks?” in the ~tech group on Tildes where further discussion is happening as well.

Monday, 22 April 2019

[Some] KDE Applications 19.04 also available in flathub

The KDE Applications 19.04 release announcement (read it if you haven't, it's very complete) mentions some of the applications are available at the snap store, but forgets to mention flathub.

Just wanted to bring up that there's also some of the applications available in there https://flathub.org/apps/search/org.kde.

All the ones that are released along with KDE Applications 19.04 were updated on release day (except kubrick, which has a compilation issue and will be updated for 19.04.1, and kontact, which is a beast and, to be honest, I didn't particularly feel like updating it).

If you feel like helping, there are more applications that need adding and more automation that needs to happen, so get in touch :)

Monday, 15 April 2019

Closer Look at the Double Ratchet

In the last blog post, I took a closer look at how the Extended Triple Diffie-Hellman Key Exchange (X3DH) is used in OMEMO and which role PreKeys are playing. This post is about the other big algorithm that makes up OMEMO. The Double Ratchet.

The Double Ratchet algorithm can be seen as the gearbox of the OMEMO machine. In order to understand the Double Ratchet, we will first have to understand what a ratchet is.

Before we start: This post makes no guarantees to be 100% correct. It is only meant to explain the inner workings of the Double Ratchet algorithm in a (hopefully) more or less understandable way. Many details are simplified or omitted for sake of simplicity. If you want to implement this algorithm, please read the Double Ratchet specification.

A ratchet tool can only turn in one direction, hence it lends its name to the algorithm.
Image by Benedikt.Seidl [Public domain]

A ratchet is a tool used to drive nuts and bolts. The distinctive feature of a ratchet tool over an ordinary wrench is that the part that grips the head of the bolt can only turn in one direction. It cannot be turned in the opposite direction.

In OMEMO, ratchet functions are one-way functions that basically take input keys and derive new keys from them. Doing it in this direction is easy (like turning the ratchet tool in the right direction), but it is impossible to reverse the process and calculate the original key from the derived key (analogous to turning the ratchet in the opposite direction).

Symmetric Key Ratchet

One type of ratchet is the symmetric key ratchet (abbrev. sk ratchet). It takes a key and some input data and produces a new key, as well as some output data. The new key is derived from the old key by using a so called Key Derivation Function. Repeating the process multiple times creates a Key Derivation Function Chain (KDF-Chain). The fact that it is impossible to reverse a key derivation is what gives the OMEMO protocol the property of Forward Secrecy.

A Key Derivation Function Chain or Symmetric Ratchet

The above image illustrates the process of using a KDF-Chain to generate output keys from input data. In every step, the KDF-Chain takes the input and the current KDF-Key to generate the output key. Then it derives a new KDF-Key from the old one, replacing it in the process.

To summarize once again: Every time the KDF-Chain is used to generate an output key from some input, its KDF-Key is replaced, so if the input is the same in two steps, the output will still be different due to the changed KDF-Key.
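
To make the idea more concrete, here is a toy sketch of such a chain on the command line, using HMAC-SHA256 via openssl. This is not the real OMEMO construction (which uses HKDF as specified in the Double Ratchet document); it only illustrates that each step yields an output key and overwrites the chain key, so earlier keys cannot be recomputed:

# Toy symmetric ratchet: one output key per step, chain key replaced every step
CK=$(openssl rand -hex 32)   # initial chain key (in OMEMO this would come from the root chain)
for input in "step1" "step2" "step3"; do
    OUT=$(printf '%s' "$input" | openssl dgst -sha256 -mac HMAC -macopt hexkey:"$CK" | awk '{print $NF}')
    CK=$(printf 'chain' | openssl dgst -sha256 -mac HMAC -macopt hexkey:"$CK" | awk '{print $NF}')
    echo "output key for $input: $OUT"
done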

One issue with this ratchet is that it does not provide future secrecy. That means once an attacker gets access to one of the KDF-Keys of the chain, they can use that key to derive all following keys in the chain from that point on. They basically just have to turn the ratchet forwards.

Diffie-Hellman Ratchet

The second type of ratchet that we have to take a look at is the Diffie-Hellman Ratchet. This ratchet is basically a repeated Diffie-Hellman Key Exchange with changing key pairs. Every user has a separate DH ratcheting key pair, which is being replaced with new keys under certain conditions. Whenever one of the parties sends a message, they include the public part of their current DH ratcheting key pair in the message. Once the recipient receives the message, they extract that public key and do a handshake with it using their private ratcheting key. The resulting shared secret is used to reset their receiving chain (more on that later).

Once the recipient creates a response message, they create a new random ratchet key and do another handshake with their new private key and the sender’s public key. The result is used to reset the sending chain (again, more on that later).

Principle of the Diffie-Hellman Ratchet.
Image by OpenWhisperSystems (modified by author)

As a result, the DH ratchet is forwarded every time the direction of the message flow changes. The resulting keys are used to reset the sending-/receiving chains. This introduces future secrecy in the protocol.

The Diffie-Hellman Ratchet

Chains

A session between two devices has three chains – a root chain, a sending chain and a receiving chain.

The root chain is a KDF chain which is initialized with the shared secret which was established using the X3DH handshake. Both devices involved in the session have the same root chain. Contrary to the sending and receiving chains, the root chain is only initialized/reset once at the beginning of the session.

The sending chain of the session on device A equals the receiving chain on device B. On the other hand, the receiving chain on device A equals the sending chain on device B. The sending chain is used to generate message keys which are used to encrypt messages. The receiving chain on the other hand generates keys which can decrypt incoming messages.

Whenever the direction of the message flow changes, the sending and receiving chains are reset, meaning their keys are replaced with new keys generated by the root chain.

The full Double Ratchet Algorithms Ratchet Architecture

An Example

I think this rather complex protocol is best explained by an example message flow which demonstrates what actually happens during message sending / receiving etc.

In our example, Obi-Wan and Grievous have a conversation. Obi-Wan starts by establishing a session with Grievous and sends his initial message. Grievous responds by sending two messages back. Unfortunately the first of his replies goes missing.

Session Creation

In order to establish a session with Grievous, Obi-Wan has to first fetch one of Grievous’ key bundles. He uses this to establish a shared secret S between him and Grievous by executing an X3DH key exchange. More details on this can be found in my previous post. He also extracts Grievous’ signed PreKey ratcheting public key. S is used to initialize the root chain.

Obi-Wan now uses Grievous’ public ratchet key and does a handshake with his own ratchet private key to generate another shared secret, which is pumped into the root chain. The output is used to initialize the sending chain, and the KDF-Key of the root chain is replaced.

Now Obi-Wan has established a session with Grievous without even sending a message. Nice!

The session initiator prepares the sending chain.
The initial root key comes from the result of the X3DH handshake.
Original image by OpenWhisperSystems (modified by author)

Initial Message

Now the session is established on Obi-Wan’s side and he can start composing a message. He decides to send a classy “Hello there!” as a greeting. He uses his sending chain to generate a message key which is used to encrypt the message.

Principle of generating message keys from a KDF-Chain.
In our example only one message key is derived though.
Image by OpenWhisperSystems

Note: In the above image a constant is used as input for the KDF-Chain. This constant is defined by the protocol and isn’t important for understanding what’s going on.

Now Obi-Wan sends over the encrypted message along with his ratcheting public key and some information on what PreKey he used, the current sending key chain index (1), etc.

When Grievous receives Obi-Wan’s message, he completes his X3DH handshake with Obi-Wan in order to calculate the same exact shared secret S as Obi-Wan did earlier. He also uses S to initialize his root chain.

Now Grievous does a full ratchet step of the Diffie-Hellman Ratchet: He uses his private key and Obi-Wan’s public ratchet key to do a handshake and initializes his receiving chain with the result. Note: The result of the handshake is the exact same value that Obi-Wan calculated earlier when he initialized his sending chain. Fantastic, isn’t it? Next he deletes his old ratchet key pair and generates a fresh one. Using the fresh private key, he does another handshake with Obi-Wan’s public key and uses the result to initialize his sending chain. This completes the full DH ratchet step.

Full Diffie-Hellman Ratchet Step
Image by OpenWhisperSystems

Decrypting the Message

Now that Grievous has finalized his side of the session, he can go ahead and decrypt Obi-Wan’s message. Since the message contains the sending chain index 1, Grievous knows that he has to use the first message key generated from his receiving chain to decrypt the message. Because his receiving chain equals Obi-Wan’s sending chain, it will generate the exact same keys, so Grievous can use the first key to successfully decrypt Obi-Wan’s message.

Sending a Reply

Grievous is surprised by Obi-Wan’s bold actions and promptly goes ahead to send two replies.

He advances his freshly initialized sending chain to generate a fresh message key (with index 1). He uses the key to encrypt his first message “General Kenobi!” and sends it over to Obi-Wan. He includes his public ratchet key in the message.

Unfortunately though the message goes missing and is never received.

He then forwards his sending chain a second time to generate another message key (index 2). Using that key he encrypts the message “You are a bold one.” and sends it to Obi-Wan. This message contains the same public ratchet key as the first one, but has the sending chain index 2. This time the message is received.

Receiving the Reply

Once Obi-Wan receives the second message, he does a full ratchet step in order to complete his session with Grievous. First he does a DH handshake between his private key and Grievous’ public ratcheting key from the message. The result is used to set up his receiving chain. He then generates a new ratchet key pair and does a second handshake. The result is used to reset his sending chain.

Obi-Wan notices that the sending chain index of the received message is 2 instead of 1, so he knows that one message must have been missing or delayed. To deal with this problem, he advances his receiving chain twice (meaning he generates two message keys from the receiving chain) and caches the first key. If later the missing message arrives, the cached key can be used to successfully decrypt the message. For now only one message arrived though. Obi-Wan uses the generated message key to successfully decrypt the message.

Conclusions

What have we learned from this example?

Firstly, we can see that the protocol guarantees forward secrecy. The KDF-Chains used in the three chains can only be advanced forwards, and it is impossible to turn them backwards to generate earlier keys. This means that if an attacker manages to get access to the state of the receiving chain, they cannot decrypt messages sent prior to the moment of attack.

But what about future messages? Since the Diffie-Hellman ratchet introduces new randomness in every step (new random keys are generated), an attacker is locked out after one step of the DH ratchet. Since the DH ratchet is used to reset the symmetric ratchets of the sending and receiving chain, the window of the compromise is limited by the next DH ratchet step (meaning once the other party replies, the attacker is locked out again).

On top of this, the double ratchet algorithm can deal with missing or out-of-order messages, as keys generated from the receiving chain can be cached for later use. If at some point Obi-Wan receives the missing message, he can simply use the cached key to decrypt its contents.

This self-healing property is what gave the Axolotl protocol (an earlier name of the Signal protocol, the basis of OMEMO) its name.

Acknowledgements

Thanks to syndace and paul for their feedback and clarification on some points.

Saturday, 13 April 2019

Tenth Anniversary of AltOS

In the early days of the collaboration between Bdale Garbee and Keith Packard that later became Altus Metrum, the software for TeleMetrum was crafted as an application running on top of an existing open source RTOS. It didn't take long to discover that the RTOS was ill-suited to our needs, and Keith had to re-write various parts of it to make things fit in the memory available and work at all.

Eventually, Bdale idly asked Keith how much of the RTOS he'd have to rewrite before it would make sense to just start over from scratch. Keith took that question seriously, and after disappearing for a day or so, the first code for AltOS was committed to revision control on 12 April 2009.

Ten years later, AltOS runs on multiple processor architectures, and is at the heart of all Altus Metrum products.

Friday, 05 April 2019

Hard drive failure in my zpool 😞

I have a storage box in my house that stores important documents, backups, VM disk images, photos, a copy of the Tor Metrics archive and other odd things. I’ve put a lot of effort into making sure that it is both reliable and performant. When I was working on a modern CollecTor for Tor Metrics recently, I used this to be able to run the entire history of the Tor network through the prototype replacement to see if I could catch any bugs.

I have had my share of data loss events in my life, but since I’ve found ZFS I have hope that it is possible to avoid, or at least seriously minimise the risk of, any catastrophic data loss events ever happening to me again. ZFS has:

  • cryptographic checksums to validate data integrity
  • mirroring of disks
  • “scrub” function that ensures that the data on disk is actually still good even if you’ve not looked at it yourself in a while

ZFS on its own is not the entire solution though. I also mix and match hard drive models to ensure that a systematic fault in a particular model won’t wipe out all my mirrors at once, and I also have scheduled SMART self-tests to detect faults before any data loss has occurred.
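
The periodic tests themselves are scheduled through the FreeNAS web interface, but the same checks can be run by hand with smartmontools; a sketch, with /dev/ada0 as a placeholder device name:

smartctl -t long /dev/ada0      # start a long self-test in the background
smartctl -H /dev/ada0           # overall health verdict
smartctl -l selftest /dev/ada0  # self-test log, including any failed tests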

Unfortunately, one of my drives in my zpool has failed a SMART self-test.

This means I now have to treat that drive as “going to fail soon” which means that I don’t have redundancy in my zpool anymore, so I have to act. Fortunately, in September 2017 when my workstation died, I received some donations towards the hardware I use for my open source work and I did buy a spare HDD for this very situation!

At present my zpool setup looks like:

% zpool status flat
  pool: flat
 state: ONLINE
  scan: scrub repaired 0 in 0 days 07:05:28 with 0 errors on Fri Apr  5 07:05:36 2019
config:

	NAME                                            STATE     READ WRITE CKSUM
	flat                                            ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	  mirror-1                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	cache
	  gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0

errors: No known data errors

The drives in the two mirrors are 3TB drives, in each mirror is one WD Red and one Toshiba NAS drive. In this case, it is one of the WD Red drives that has failed and I’ll be replacing it with another WD Red. One important thing to note is that you have to replace the drive with one of equal or greater capacity. In this case it is the same model so the capacity should be the same, but not all X TB drives are going to be the same size.

You’ll notice here that it is saying No known data errors. This is because there haven’t been any issues with the data yet; it is just a SMART failure, and hopefully by replacing the disk any data error can be avoided entirely.

My plan was to move to a new system soon, with 8 bays. In that system I’ll keep the stripe over 2 mirrors but one mirror will run over 3x 6TB drives with the other remaining on 2x 3TB drives. This incident leaves me with only 1 leftover 3TB drive though so maybe I’ll have to rethink this.

Free space remaining in my zpool

My current machine, an HP MicroServer, does not support hot-swapping the drives so I have to start by powering off the machine and replacing the drive.

% zpool status flat
  pool: flat
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
	the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0 days 07:05:28 with 0 errors on Fri Apr  5 07:05:36 2019
config:

	NAME                                            STATE     READ WRITE CKSUM
	flat                                            DEGRADED     0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	  mirror-1                                      DEGRADED     0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    xxxxxxxxxxxxxxxxxxxx                        UNAVAIL      0     0     0  was /dev/gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
	cache
	  gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx    ONLINE       0     0     0

errors: No known data errors

The disk that was part of the mirror is now unavailable, but the pool is still functioning as the other disk is still present. This means that there are still no data errors and everything is still running. The only downtime was due to the non-hot-swappableness of my SATA controller.

Through the web interface in FreeNAS, it is now possible to use the new disk to replace the old disk in the mirror: Storage -> View Volumes -> Volume Status (under the table, with the zpool highlighted) -> Replace (with the unavailable disk highlighted).
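
For reference, the command-line equivalent would be a single zpool replace; a sketch, with the failed member’s gptid and the new device name as placeholders:

# Replace the failed member of mirror-1 with the newly installed disk
zpool replace flat gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /dev/ada4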

Running zpool status again:

% zpool status flat
  pool: flat
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Apr  5 16:55:47 2019
	1.30T scanned at 576M/s, 967G issued at 1.12G/s, 4.33T total
	4.73G resilvered, 21.82% done, 0 days 00:51:29 to go
config:

	NAME                                            STATE     READ WRITE CKSUM
	flat                                            ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	  mirror-1                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0  (resilvering)
	cache
	  gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx    ONLINE       0     0     0

errors: No known data errors

And everything should be OK again soon, now with the dangerous disk removed and a hopefully more reliable disk installed.

A more optimistic message from FreeNAS

This has put a dent in my plans to upgrade my storage, so for now I’ve added the hard drives I’m looking for to my Amazon wishlist.

As for the drive that failed, I’ll be doing an ATA Secure Erase and then disposing of it. NIST SP 800-88 thinks that ATA Secure Erase is in the same category as degaussing a hard drive and that it is more effective than overwriting the disk with software. ATA Secure Erase is faster too because it’s the hard drive controller doing the work. I just have to hope that my firmware wasn’t replaced with firmware that only fakes the process (or I’ll just do an overwrite anyway to be sure). According to the same NIST document, “for ATA disk drives manufactured after 2001 (over 15 GB) clearing by overwriting the media once is adequate to protect the media from both keyboard and laboratory attack”.
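
On a Linux box, the ATA Secure Erase procedure is typically driven with hdparm; a sketch only (this is irreversibly destructive, the drive’s security state must not be “frozen”, and the password is a throwaway):

# Check that the drive supports the ATA security feature set and is not frozen
hdparm -I /dev/sdX | grep -A8 "Security:"
# Set a temporary user password, then issue the secure erase (destroys all data!)
hdparm --user-master u --security-set-pass p4ss /dev/sdX
hdparm --user-master u --security-erase p4ss /dev/sdX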


This blog post is also a little experiment. I’ve used a Unicode emoji in the title, and I want to see how various feed aggregators and bots handle that. Sorry if I broke your aggregator or bot.

IETF 104 in Prague

Thanks to support from Article 19, I was able to attend IETF 104 in Prague, Czech Republic this week. Primarily this was to present my Internet Draft which takes safe measurement principles from Tor Metrics work and the Research Safety Board and applies them to Internet Measurement in general.

My IETF badge, complete with additional tag for my nick

I attended with a free one-day pass for the IETF and free hackathon registration, so more than just the draft presentation happened. During the hackathon I sat at the MAPRG table and worked on PATHspider with Mirja Kühlewind from ETH Zurich. We have the code running again with the latest libraries available in Debian testing and this may become the basis of a future Tor exit scanner (for generating exit lists, and possibly also some bad exit detection). We ran a quick measurement campaign that was reported in the hackathon presentations.

During the hackathon I also spoke to Watson Ladd from Cloudflare about his Roughtime draft which could be interesting for Tor for a number of reasons. One would be for verifying if a consensus is fresh, another would be for Tor Browser to detect if a TLS cert is valid, and another would be providing archive signatures for Tor Metrics. (We’ve started looking at archive signatures since our recent work on modernising CollecTor).

Monday was the first “real” day of the IETF. The day started off for me at the PEARG meeting. I presented my draft as the first presentation in that session. The feedback was all positive; it seems like having the document is both desirable and timely.

The next presentation was from Ryan Guest at Salesforce. He was talking about privacy considerations for application-level logging. I think this would also be a useful draft that complements my draft on safe measurement, or maybe even becomes part of my draft. I need to follow up with him to see what he wants to do. A future IETF hackathon project might be comparing Tor’s safe logging with whatever guidelines we come up with, and also comparing our web server logs setup.

Nick Sullivan was up next with his presentation on Privacy Pass. It seems like a nice scheme, assuming someone can audit the anti-tagging properties of it. The most interesting thing I took away from it is that federation is being explored which would turn this into a system that isn’t just for Cloudflare.

Amelia Andersdotter and Christoffer Långström then presented on differential privacy. They have been exploring how it can be applied to binary values as opposed to continuous, and how it could be applied to Internet protocols like the QUIC spin bit.

The last research presentation was Martin Schanzenbach presenting on an identity provider based on the GNU Name System. This one was not so interesting for me, but maybe others are interested.

I attended the first part of the Stopping Malware and Researching Threats (SMART) session. There was an update from Symantec based on their ISTR report and I briefly saw the start of a presentation about “Malicious Uses of Evasive Communications and Threats to Privacy“ but had to leave early to attend another meeting. I plan to go back and look through all of the slides from this session later.

The next IETF meeting is directly after the next Tor meeting (I had thought for some reason it directly clashed, but I guess I was wrong). I will plan to remotely participate in PEARG again there and move my draft forwards.

When the Duke walks, you don't notice it

This is my late submission for transgender day of visibility. It comes almost a week late, but I suppose I’ll use this proverb that is popular among the trans community to justify myself:

The best time to plant a tree was 20 years ago. The second best time is now.

Or: The best time to post about transgender day of visibility was 31st of March. The second best time is 5th of April.

I should preface this post by emphasising that all of its contents are exclusively my own experiences, and may not speak for anybody other than myself. It is written in the spirit of visibility, so that the public knows that transgender people exist, and that ultimately we are normal people.

A second justification for my lateness has been my hesitance to broadcast this to the internet. I like privacy, and I take a lot of steps to safeguard it (e.g, by using Free Software). But there is one step that I have made only halfway, and that is anonymity. I use a VPN to hide my IP address, and I take special care not to give internet giants all of my personal data—I am a nobody when I surf the web.

But when I interact with human beings on the internet, I try to be me. This is terrible privacy advice, because the internet never forgets when you make a mistake. But I find it important, because the internet is a very real place. More and more, the internet affects our collective lives. It allows us to do tangible things such as purchasing items, and it does intangible things such as morphing our perception and opinion of the world. Anonymity allows you to enter this space—this real space—as an ethereal ghost, existing perpetually out of sight, but able to interact just the same. It does not take a creative mind to imagine how this can be abused, and people do.

I could abuse such ghostly powers for good, but I am not comfortable with holding that power. So I wish to be myself in spite of knowing better. To exist online under this name, I must self-censor. I must not say things that I imagine will come back to harm my future self, and I must hide aspects about myself that I do not want everybody to know—say it once, and the whole world knows.

And for years, I haven’t said it: I am transgender. By itself this is unimportant (so what?), but the act of saying it is not. The act of saying it means that anybody, absolutely anybody until the end of the digital age, can discover this about me and hold it against me, and there is no shortage of people who would. And that is frankly quite scary.

But the act of saying it is also activism. By saying it, you assert your existence in the face of an ideology that wishes you didn’t. By saying it, you own the narrative of what it means to be trans, rather than ideologues who would paint you in a dehumanised light. By saying it, you make tangible and visible a human experience that many people do not understand. At the risk of sounding self-aggrandising, there is power in that.

The last point I find especially empowering. Until the exact moment I decided to transition, I simply did not know of the mere existence of trans people. I knew about drag queens flamboyantly dancing on boats in the canals of Amsterdam, but those people were otherworldly to me. They weren’t tangibly real, and they weren’t me. Had I known that transgender people were everyday women and men who care about the same things that I do, I would have spared myself a lot of mental anguish and made the leap a lot sooner.

Instead, something else prompted that realisation. I was reading a Christmas novel in Summer, as you do. The book was called “Let It Snow: Three Holiday Romances” authored by John Green, Lauren Myracle and Maureen Johnson. The book has three POVs in a town struck by a heavy snow storm, and there’s a lot of interplay between the POVs.

I was reading one of John Green’s chapters. His main character had long been great friends with a tomboyish girl (nicknamed “the Duke”) who struggled with her gender expression, and the two embark on a great journey through the snow storm to reach the waffle house. During this trek, there is a scene where he is walking a few paces behind her, and he is looking at her. And while looking, and through the shared experiences, a sudden thought strikes him: “Anyway, the Duke was walking, and there was a certain something to it, and I was kind of disgusted with myself for thinking about that certain something. […] When Brittany the cheerleader walks, you notice it. When the Duke walks, you don’t. Usually.”

These two had been friends for the longest time, and for the first time, he entertained the thought that it might be something more than that. And you read on, and on, and this wayward thought starts to become quite real and serious. And suddenly he becomes self-aware of the thought absorbing him:

“Once you think a thought, it is extremely difficult to unthink it. And I had thought the thought.”

It hit me like a brick. In that very same instance, I, too, thought the thought. What had been a feeling for so long, I finally thought out loud in my head, and it was impossible for me to unthink it. It had nothing to do with the novel, and I have no idea how I made that leap, but in that moment I realised for the first time, truly realised, that I did not want to be a boy—that I wanted to be a girl. And I was miserable for it, but eventually better off. A quick search later and I discovered that trans people exist, and that transition is an actual thing that normal people do.

So I did it. And it has been good. Whatever ailed me prior to transition is mostly gone, and I have become a functioning adult who does many non-transgender-related things such as translating GNOME into Esperanto and creating cat monsters for Dungeons & Dragons. But I never really included being transgender in any of my online activities, and I want to change that. I want to be more like the Duke. I want to walk like her, and while people may not always see that walk, I want to call attention to it every now and then. And maybe it will help someone be struck by the thought, whatever spark of madness it is that they need.

Happy transgender day of visibility.

Thursday, 04 April 2019

The Power of Workflow Scripts

Nextcloud has the ability to define some conditions under which external scripts are executed. The app which makes this possible is called “Workflow Script”. I always knew that this powerful tool exists, yet I never really had a use case for it. This changed last week.

Task

I heavily rely on text files for note taking. I organize them in folders; for example, I have a “Projects” folder with sub-folders for each project I currently work on.

Wednesday, 03 April 2019

Shaking Hands With OMEMO: X3DH Key Exchange

This is the first part of a small series about the cryptographic building blocks of OMEMO. This post is about the Extended Triple Diffie Hellman Key Exchange Algorithm (X3DH) which is used to establish a session between OMEMO devices.
Part 2: Closer Look at the Double Ratchet

In the past I have written some posts about OMEMO and its future and how it does compare to the Olm encryption protocol used by matrix.org. However, some readers requested a closer, but still straightforward look at how OMEMO and the underlying algorithms work. To get started, we first have to take a look at its past.

OMEMO was implemented in the Android Jabber client Conversations as part of a Google Summer of Code project by Andreas Straub in 2015. The basic idea was to utilize the encryption library used by Signal (formerly TextSecure) for message encryption. So basically OMEMO borrows almost all the cryptographic mechanisms, including the Double Ratchet and X3DH, from Signal’s encryption protocol, which is appropriately named the Signal Protocol. So to begin with, let’s look at that first.

The Signal Protocol

The famous and ingenious protocol that drives the encryption behind Signal, OMEMO, matrix.org, WhatsApp and a lot more was created by Trevor Perrin and Moxie Marlinspike in 2013. Basically it consists of two parts that we need to further investigate:

  • The Extended Triple-Diffie-Hellman Key Exchange (X3DH)
  • The Double Ratchet Algorithm

One core principle of the protocol is to get rid of encryption keys as soon as possible. Almost every message is encrypted with another fresh key. This is a huge difference to other protocols like OpenPGP, where the user only has one key which can decrypt all messages ever sent to them. The latter can of course also be seen as an advantage OpenPGP has over OMEMO, but it all depends on the situation the user is in and what they have to protect against.

A major improvement that the Signal Protocol introduced compared to encryption protocols like OTRv3 (Off-The-Record Messaging) was the ability to start a conversation with a chat partner in an asynchronous fashion, meaning that the other end didn’t have to be online in order to agree on a shared key. This was not possible with OTRv3, since both parties had to actively send messages in order to establish a session. This was okay back in the days when people would start their computer with the intention to chat with other users that were online at the same time, but it’s no longer suitable today.

Note: The recently worked-on OTRv4 will no longer come with this handicap.

The X3DH Key Exchange

Let’s get to it already!

X3DH is a key agreement protocol, meaning it is used when two parties establish a session in order to agree on a shared secret. For a conversation to be confidential we require that only the sender and the (intended) recipient of a message are able to decrypt it. This is possible when they share a common secret (e.g. a password or shared key). Exchanging this key with one another has long been a chicken-and-egg problem: how do you get the key from one end to the other without an adversary being able to get a copy of it? Well, obviously by encrypting it, but how? How do you get that key to the other side? This problem was only solved after the Second World War.

The solution is the so-called Diffie-Hellman-Merkle key exchange. I don’t want to go into too much detail about this, as there are really great resources about how it works available online, but the basic idea is that each party possesses an asymmetric key pair consisting of a public and a private key. The public key can be shared over insecure networks while the private key must be kept secret. A Diffie-Hellman key exchange (DH) is the process of combining a public key A with a private key b in order to generate a shared secret. The essential trick is that you get the exact same secret if you combine the secret key a with the public key B. Wikipedia does a great job of explaining this using an analogy of mixing colors.
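
To make the color-mixing analogy a bit more concrete, here is a minimal, purely illustrative Python sketch of such an exchange. The numbers are toy-sized and chosen only for readability; a real implementation would use X25519 or at least a large safe prime, never parameters like these:

# Toy Diffie-Hellman exchange with tiny, insecure parameters (illustration only).
p = 0xFFFFFFFB          # public prime modulus (toy size; real DH uses far larger numbers)
g = 5                   # public generator

a = 123456789           # Alice's private key (randomly chosen in practice)
b = 987654321           # Bob's private key

A = pow(g, a, p)        # Alice's public key, shared over the insecure channel
B = pow(g, b, p)        # Bob's public key, shared over the insecure channel

# Each side combines their own private key with the other's public key ...
secret_alice = pow(B, a, p)
secret_bob   = pow(A, b, p)

# ... and both arrive at the same shared secret.
assert secret_alice == secret_bob
print(hex(secret_alice))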

Deniability and OTR

In normal day to day messaging you don’t always want to commit to what you said. Especially under oppressive regimes it may be a good idea to be able to deny that you said or wrote something specific. This principle is called deniability.

Note: It is debatable whether cryptographic deniability has ever saved someone from going to jail, but that’s not within the scope of this blog post.

At the same time you want to be absolutely sure that you are really talking to your chat partner and not to a so-called man in the middle. These desires seem to conflict at first, but the OTR protocol featured both. The user has an IdentityKey, which is used to identify the user by means of a fingerprint. The (massively and horribly simplified) procedure of creating an OTR session is as follows: Alice generates a random session key and signs the public key with her IdentityKey. She then sends that public key over to Bob, who generates another random session key with which he executes his half of the DH handshake. He then sends the public part of that key (again, signed) back to Alice, who does another DH to acquire the same shared secret as Bob. As you can see, in order to establish a session, both parties had to be online. Note: The signing part has been oversimplified for the sake of readability.

Normal Diffie-Hellman Key Exchange

From DH to X3DH

Perrin and Marlinspike improved upon this model by introducing the concept of PreKeys. Those basically are the first halves of a DH-handshake, which can – along with some other keys of the user – be uploaded to a server prior to the beginning of a conversation. This way another user can initiate a session by fetching one half-completed handshake and completing it.

Basically the Signal protocol comprises the following set of keys per user:

  • IdentityKey (IK): acts as the user’s identity by providing a stable fingerprint
  • Signed PreKey (SPK): acts as a PreKey, but carries an additional signature of IK
  • Set of PreKeys ({OPK}): unsigned PreKeys

If Alice wants to start chatting, she can fetch Bob’s IdentityKey, Signed PreKey and one of his PreKeys and use those to create a session. In order to preserve cryptographic properties, the handshake is modified as follows:

DH1 = DH(IK_A, SPK_B)
DH2 = DH(EK_A, IK_B)
DH3 = DH(EK_A, SPK_B)
DH4 = DH(EK_A, OPK_B)

S = KDF(DH1 || DH2 || DH3 || DH4)

EK_A denotes an ephemeral, random key which is generated by Alice on the fly. Alice can now derive an encryption key to encrypt her first message for Bob. She then sends that message (a so-called PreKeyMessage) over to Bob, along with some additional information like her IdentityKey IK, the public part of the ephemeral key EK_A and the ID of the used PreKey OPK.

Visual representation of the X3DH handshake

Once Bob logs in, he can use this information to do the same calculations (just with swapped public and private keys) to calculate S from which he derives the encryption key. Now he can decrypt the message.
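
To see why both sides end up with the same secret S, here is a minimal sketch of the X3DH calculation, again using the toy modular-exponentiation DH from above instead of real Curve25519 operations, and a plain SHA-256 hash standing in for the KDF that the actual specification prescribes. All parameters here are purely illustrative:

import hashlib
from random import randrange

p, g = 0xFFFFFFFB, 5                      # toy DH parameters (see the sketch above)

def keypair():
    priv = randrange(2, p - 1)
    return priv, pow(g, priv, p)          # (private, public)

def dh(priv, pub):
    return pow(pub, priv, p).to_bytes(4, "big")

# Bob's published bundle: IdentityKey, Signed PreKey and one One-time PreKey
ik_b, IK_B   = keypair()
spk_b, SPK_B = keypair()
opk_b, OPK_B = keypair()

# Alice's IdentityKey and a fresh ephemeral key
ik_a, IK_A = keypair()
ek_a, EK_A = keypair()

# Alice's side of the four handshakes
dh1 = dh(ik_a, SPK_B)
dh2 = dh(ek_a, IK_B)
dh3 = dh(ek_a, SPK_B)
dh4 = dh(ek_a, OPK_B)
S_alice = hashlib.sha256(dh1 + dh2 + dh3 + dh4).digest()   # stand-in for the KDF

# Bob repeats the same calculations with the roles of the keys swapped
S_bob = hashlib.sha256(
    dh(spk_b, IK_A) + dh(ik_b, EK_A) + dh(spk_b, EK_A) + dh(opk_b, EK_A)
).digest()

assert S_alice == S_bob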

In order to prevent the session initiation from failing due to lost messages, all messages that Alice sends over to Bob without receiving a first message back are PreKeyMessages, so that Bob can complete the session, even if only one of the messages sent by Alice makes its way to Bob. The exact details on how OMEMO works after the X3DH key exchange will be discussed in part 2 of this series 🙂

X3DH Key Exchange TL;DR

X3DH utilizes PreKeys to allow session creation with offline users by doing 4 DH handshakes between different keys.

A subtle but important implementation difference between OMEMO and Signal is that the Signal server is able to manage the PreKeys for the user. That way it can make sure that every PreKey is only used once. OMEMO, on the other hand, solely relies on the XMPP server’s PubSub component, which does not support such behavior. Instead, it hands out a bundle of around 100 PreKeys. This seems like a lot, but in reality the chances of a PreKey collision are pretty high (see the birthday problem).
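
To get a feeling for how quickly collisions become likely with a bundle of only 100 PreKeys, here is a small back-of-the-envelope calculation. It is a simplified model that assumes every session initiator picks one of the 100 PreKeys uniformly at random:

# Birthday-problem estimate: probability that at least two of k independent
# session initiators pick the same PreKey out of a bundle of n PreKeys.
def collision_probability(k, n=100):
    p_no_collision = 1.0
    for i in range(k):
        p_no_collision *= (n - i) / n
    return 1.0 - p_no_collision

for k in (5, 10, 13, 20):
    print(f"{k:2d} initiators -> {collision_probability(k):.0%} chance of a PreKey collision")

Under this model, about a dozen independent session initiations are already enough to make a collision more likely than not.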

OMEMO does come with some counter measures for problems and attacks that arise from this situation, but it makes the protocol a little less appealing than the original Signal protocol.

Clients should, for example, keep used PreKeys around until the end of catch-up of missed messages, to allow decryption of messages that got sent in sessions that were established using the same PreKey.

Monday, 25 March 2019

Another Step to a Google-free Life

I watch a lot of YouTube videos. So much, that it starts to annoy me, how much of my free time I’m wasting by watching (admittedly very interesting) clips of a broad range of content creators.

Logging out of my Google account helped a little bit to keep my addiction at bay, as it appears to prevent the YouTube algorithm, which normally greets me with a broad set of perfectly selected videos, from recognizing me. But then again I use Google to log in to one service or another, so it became annoying to log in and back out again all the time. At one point I decided to delete my YouTube history, which resulted in a very bad prediction of what videos I might like. This helped for a short amount of time, but the algorithm quickly returned to its merciless precision after a few days.

Today I decided that it’s time to leave Google behind completely. My Google Mail account was used only for online shopping anyway, so I figured why not use a more privacy-respecting service instead. Self-hosting was not an option for me, as I only have a residential IP address on my Raspberry Pi, and I have also heard that hosting a mail server is a huge pain.

A New Mail Account

So I created an account at the Berlin-based service mailbox.org. They offer email plus some cloud stuff like an office suite, storage etc., although I don’t think I’ll use any of the additional services (oh, they offer an XMPP account as well :P). The service is not free as in free beer, as it costs 1€ per month, but that’s a fair price in my opinion. All in all it appears to be a good replacement for all the Google stuff.

As a next step, I went through the long list of all the websites and shops that I have accounts on, scouting for those services that are registered on my Google Mail address. All those mail settings had to be changed to the new account.

Mail Extensions

Bonus tip: Mailbox.org has support for so-called Mail Extensions (or Plus Extensions, I’m not really sure what they are called). This means that you can create a folder in your inbox, let’s say “fsfe”. Now you can change the mail address of your FSFE account to “username+fsfe@mailbox.org”. Mails from the FSFE will still go to your “username@mailbox.org” account, but they are automatically sorted into the fsfe folder. This is useful not only to sort mails by sender, but also to find out which of the many services you use messed up and leaked your mail address to those nasty spammers, so you can avoid that service in the future.

This trick also works for Google Mail by the way.

Deleting (most) the Google Services

The last step logically would be to finally delete my Google account. However, I’m not entirely sure if I really changed all the important services over to the new account, so I’ll keep it for a short period of time (a month or so) to see if any more important mails arrive.

However, I discovered that under the section “Delete Services or Account” you can see a list of all the services which are connected with your Google account. It is possible to partially delete those services, so I went ahead and deleted most of them, except Google Mail.

Additional bonus tip: I use NewPipe on my phone, which is a free libre replacement for the YouTube app. It has a neat feature which lets you import your subscriptions from your YouTube account. That way I can still follow some of the creators, but in a more manual way (as I have to open the app on my phone, which I don’t do often). In my eyes, this is a good compromise 🙂

I’m looking forward to going fully Google-free soon. I de-googled my phone ages ago, but for some reason I still held on to my Google account. This will be sorted out soon though!

De-Googling your Phone?

By the way, if you are looking to de-google your phone, Mike Kuketz has a great series of blog posts about that topic (in German though):

Happy Hacking!

Update (27.05.2019)

Friday, 22 March 2019

How to create good SSH keys

  • Seravo
  • 08:11, Friday, 22 March 2019

A couple of years back we wrote a guide on how to create good OpenPGP/GnuPG keys, and now it is time to write a guide on SSH keys for many of the same reasons: SSH key algorithms have evolved in recent years and the keys generated by the default OpenSSH settings a few years ago are no longer considered state of the art. This guide is intended both for those completely new to SSH and for those who have already been using it for years and want to make sure they are following the latest best practices.

Use OpenSSH 7 or later

Related to SSH keys there have been some relevant changes in versions 5.7, 6.5 and 7.0. The latest version is 7.9, and you should be running at least 7.0. Current Debian stable (“Stretch”) shipped version 7.4 and, for example, Ubuntu 16.04 (“Xenial”) shipped 7.2, so nobody should be running anything older than these on their laptop.

Generate Ed25519 keys

With a recent version of OpenSSH, simply run ssh-keygen -t ed25519. This will create a private and public key pair at .ssh/id_ed25519 (and .pub) using the Ed25519 algorithm, which is considered state of the art. Elliptic curve algorithms in general are sleek and efficient, and unlike the other well-known elliptic curve algorithm ECDSA, Ed25519 does not depend on any suspicious NIST-defined constants. If you encounter a server that is very old and does not support Ed25519 keys, you might need a more traditional RSA key pair. A strong 4096-bit RSA key pair can be generated with ssh-keygen -b 4096. Hopefully you won’t ever need to do that.

Running ssh-keygen will prompt for a passphrase. If your laptop is encrypted and well protected you can omit the passphrase and gain some speed and convenience in your SSH commands.

Store the private key securely

Hopefully your laptop is well protected with full disk encryption etc. and you can trust that nobody other than yourself has access to your /home/<username>/.ssh directory. Hopefully you also have securely stored, encrypted backups of your laptop so that you can recover the .ssh directory if your laptop is lost or broken for any reason.

The SSH key files are stored as armored ASCII, which means that you could even print them on paper and store the printed key in a real vault just to be extra sure you never lose your private key (.ssh/id_ed25519).

The public key is indeed designed to be public

The public key .ssh/id_ed25519.pub on the other hand is meant to be public. Here is mine for example:

~/.ssh$ cat id_ed25519.pub 
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMdBlrVoDupARk3pd1Q9sDImaGCxEalcFt7QTqBa36kH otto@XPS-13-9370

You can find a lot of public SSH keys for example on Github using the URL https://github.com/<username>.keys. You may also want to check out the excellent Github SSH usage docs.

This public key is the one you will distribute to remote servers. Place it on the remote server in the .ssh/authorized_keys file. The servers you try to access will use the public key to create a challenge, and only your laptop, which holds the private key, can solve that challenge and thus authenticate that your connection to the server is authorized.

On Linux machines a shorthand to copy your SSH key to a remote server is to run ssh-copy-id remote.example.com. This will ask for a password the first time, but once the key is in place, the SSH key will be used instead and no password will be asked for anymore.

Best practices

Once you are familiar with SSH key usage basics, please adopt these policies:

  • Disable password authentication on the SSH servers you control altogether. The fewer passwords there are, the fewer there are that can leak or be forgotten. For passwords, less is indeed more. Only use keys for authentication on SSH connections.
  • Use a separate key per client you SSH from. So don’t copy the private key from your laptop to another laptop for use in parallel. Each client system should have only one key, so in case a key leaks, you know which client system was compromised. If you stop using your old laptop and start using a new one it is naturally another case and then you can copy the key.
  • Avoid chaining SSH connections. Any system that is used for chaining SSH connections incurs a potential man-in-the-middle situation.
  • For the same reasons try to avoid X11 forwarding and agent forwarding. Consider putting these in your .ssh/config:
Host *
  ForwardX11   no
  ForwardAgent no

Using ProxyCommand

You shouldn’t forward the SSH agent (ssh -A), but it’s OK to use ProxyCommand or ProxyJump. You can configure this permanently in your .ssh/config like this:

Host backend.example.com
Hostname 1.2.3.4
Port 23
ProxyCommand ssh bastion.example.com -W %h:%p

Stay vigilant

If running ssh remote.example.com yields some error messages, don’t ignore them! SSH has an opportunistic key model, which is convenient, but it also means that if you are confronted with warnings that the connection might be eavesdropped on, you should really take note and not proceed.

Monday, 18 March 2019

Dataspaces and Paging in L4Re

The experiments covered by my recent articles about filesystems and L4Re managed to lead me along another path in the past few weeks. I had defined a mechanism for providing access to files in a filesystem via a programming interface employing interprocess communication within the L4Re system. In doing so, I had defined calls or operations that would read from and write to a file, observing that some kind of “memory-mapped” file support might also be possible. At the time, I had no clear idea of how this would actually be made to work, however.

As can often be the case, once some kind of intellectual challenge emerges, it can become almost impossible to resist the urge to consider it and to formulate some kind of solution. Consequently, I started digging deeper into a number of things: dataspaces, pagers, page faults, and the communication that happens within L4Re via the kernel to support all of these things.

Dataspaces and Memory

Because the L4Re developers have put a lot of effort into making a system where one can compile a fairly portable program and probably expect it to work, matters like the allocation of memory within programs, the use of functions like malloc, and other things we take for granted need no special consideration in the context of describing general development for an L4Re-based system. In principle, if our program wants more memory for its own use, then the use of things like malloc will probably suffice. It is where we have other requirements that some of the L4Re abstractions become interesting.

In my previous efforts to support MIPS-based systems, these other requirements have included the need to access memory with a fixed and known location so that the hardware can be told about it, thus supporting things like framebuffers that retain stored data for presentation on a display device. But perhaps most commonly in a system like L4Re, it is the need to share memory between processes or tasks that causes us to look beyond traditional memory allocation techniques at what L4Re has to offer.

Indeed, the filesystem work so far employs what are known as dataspaces to allow filesystem servers and client applications to exchange larger quantities of information conveniently via shared buffers. First, the client requests a dataspace representing a region of memory. It then associates it with an address so that it may access the memory. Then, the client shares this with the server by sending it a reference to the dataspace (known as a capability) in a message.

Opening a file using shared memory containing file information

The kernel, in propagating the message and the capability, makes the dataspace available to the server so that both the client and the server may access the memory associated with the dataspace and that these accesses will just work without any further effort. At this level of sophistication we can get away with thinking of dataspaces as being blocks of memory that can be plugged into tasks. Upon obtaining access to such a block, reads and writes (or loads and stores) to addresses in the block will ultimately touch real memory locations.

Even in this simple scheme, there will be some address translation going on because each task has its own way of arranging its view of memory: its virtual address space. The virtual memory addresses used by a task may very well be different from the physical memory addresses indicating the actual memory locations involved in accesses.

An illustration of virtual memory corresponding to physical memory

Such address translation is at the heart of operating systems like those supported by the L4 family of microkernels. But the system will make sure that when a task tries to access a virtual address available to it, the access will be translated to a physical address and supported by some memory location.

Mapping and Paging

With some knowledge of the underlying hardware architecture, we can say that each task will need support from the kernel and the hardware to be able to treat its virtual address space as a way of accessing real memory locations. In my experiments with simple payloads to run on MIPS-based hardware, it was sufficient to define very simple tables that recorded correspondences between virtual and physical addresses. Processes or tasks would access memory addresses, and where the need arose to look up such a virtual address, the table would be consulted and the hardware configured to map the virtual address to a physical address.

Naturally, proper operating systems go much further than this, and systems built on L4 technologies go as far as to expose the mechanisms for normal programs to interact with. Instead of all decisions about how memory is mapped for each task being taken in the kernel, with the kernel being equipped with all the necessary policy and information, such decisions are delegated to entities known as pagers.

When a task needs an address translated, the kernel pushes the translation activity over to the designated pager for a decision to be made. And the event that demands an address translation is known as a page fault since it occurs when a task accesses a memory page that is not yet supported by a mapping to physical memory. Pagers are therefore present to receive page fault notifications and to respond in a way that causes the kernel to perform the necessary privileged actions to configure the hardware, this being one of the few responsibilities of the kernel.

The role of a pager in managing access to the contents of a dataspace

Treating a dataspace as an abstraction for memory accessed by a task or application, the designated pager for the dataspace acts as a dataspace manager, ensuring that memory accesses within the dataspace can be satisfied. If an access causes a page fault, the pager must act to provide a mapping for the accessed page, leaving the application mostly oblivious to the work going on to present the dataspace and its memory as a continuously present resource.

An Aside

It is rather interesting to consider the act of delegation in the context of processor architecture. It would seem to be fairly common that the memory management units provided by various architectures feature built-in support for consulting various forms of data structures describing the virtual memory layout of a process or task. So, when a memory access fails, the information about the actual memory address involved can be retrieved from such a predefined structure.

However, the MIPS architecture largely delegates such matters to software: a processor exception is raised when a “bad” virtual address is used, and the job of doing something about it falls immediately to a software routine. So, there seems to be some kind of parallel between processor architecture and operating system architecture, L4 taking a MIPS-like approach of eager delegation to a software component for increased flexibility and functionality.

Messages and Flexpages

So, the high-level view so far is as follows:

  • Dataspaces represent regions of virtual memory
  • Virtual memory is mapped to physical memory where the data actually resides
  • When a virtual memory access cannot be satisfied, a page fault occurs
  • Page faults are delivered to pagers (acting as dataspace managers) for resolution
  • Pagers make data available and indicate the necessary mapping to satisfy the failing access

To get to the level of actual implementation, some familiarisation with other concepts is needed. Previously, my efforts have exposed me to the interprocess communications (IPC) central in L4Re as a microkernel-based system. I had even managed to gain some level of understanding around sending references or capabilities between processes or tasks. And it was apparent that this mechanism would be used to support paging.

Unfortunately, the main L4Re documentation does not seem to emphasise the actual message details or protocols involved in these fundamental activities. Instead, the library code is described in reference documentation with some additional explanation. However, some investigation of the code yielded some insights as to the kind of interfaces the existing dataspace implementations must support, and I also tracked down some message sending activities in various components.

When a page fault occurs, the first thing to know about it is the kernel because the fault occurs at the fundamental level of instruction execution, and it is the kernel’s job to deal with such low-level events in the first instance. Notification of the fault is then sent out of the kernel to the page fault handler for the affected task. The page fault handler then contacts the task’s pager to request a resolution to the problem.

Page fault handling in detail

In L4Re, this page fault handler is likely to be something called a region mapper (or perhaps a region manager), and so it is not completely surprising that the details of invoking the pager were located in some region mapper code. Putting together both halves of the interaction yielded the following details of the message:

  • map: offset, hot spot, flags → flexpage

Here, the offset is the position of the failing memory access relative to the start of the dataspace; the flags describe the nature of the memory to be accessed. The “hot spot” and “flexpage” need slightly more explanation, the latter being an established term in L4 circles, the former being almost arbitrarily chosen and not particularly descriptive.

The term “flexpage” may have its public origins in the “Flexible-Sized Page-Objects” paper whose title describes the term. For our purposes, the significance of the term is that it allows for the consideration of memory pages in a range of sizes instead of merely considering a single system-wide page size. These sizes start at the smallest page size supported by the system (but not necessarily the absolute smallest supported by the hardware, but anyway…) and each successively larger size is double the size preceding it. For example:

  • 4096 (2¹²) bytes
  • 8192 (2¹³) bytes
  • 16384 (2¹⁴) bytes
  • 32768 (2¹⁵) bytes
  • 65536 (2¹⁶) bytes

When a page fault occurs, the handler identifies a region of memory where the failing access is occurring. Although it could merely request that memory be made available for a single page (of the smallest size) in which the access is situated, there is the possibility that a larger amount of memory be made available that encompasses this access page. The flexpage involved in a map request represents such a region of memory, having a size not necessarily decided in advance, being made available to the affected task.

This brings us to the significance of the “hot spot” and some investigation into how the page fault handler and pager interact. I must admit that I find various educational materials to be a bit vague on this matter, at least with regard to explicitly describing the appropriate behaviour. Here, the flexpage paper was helpful in providing slightly different explanations, albeit employing the term “fraction” instead of “hot spot”.

Since the map request needs to indicate the constraints applying to the region in which the failing access occurs, without demanding a particular size of region and yet still providing enough useful information to the pager for the resulting flexpage to be useful, an efficient way is needed of describing the memory landscape in the affected task. This is apparently where the “hot spot” comes in. Consider a failing access in page #3 of a memory region in a task, with the memory available in the pager to satisfy the request being limited to two pages:

Mapping available memory to pages in a task experiencing a page fault

Here, the “hot spot” would reference page #3, and this information would be received by the pager. The significance of the “hot spot” appears to be the location of the failing access within a flexpage, and if the pager could provide it then a flexpage of four system pages would map precisely to the largest flexpage expected by the handler for the task.

However, with only two system pages to spare, the pager can only send a flexpage consisting of those two pages, the “hot spot” being localised in page #1 of the flexpage to be sent, and the base of this flexpage being the base of page #0. Fortunately, the handler is smart enough to fit this smaller flexpage onto the “receive window” by using the original “hot spot” information, mapping page #2 in the receive window to page #0 of the received flexpage and thus mapping the access page #3 to page #1 of that flexpage.
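
The placement arithmetic can be written down in a few lines. The following is a purely illustrative Python sketch of the scenario above; the page numbers are counted within the receive window and nothing here corresponds to actual L4Re interfaces:

# Receive window of 4 pages; the failing access is in page #3 (the "hot spot").
window_pages = 4
hot_spot     = 3

# The pager can only spare a flexpage of 2 pages.
flexpage_pages = 2

# Where the hot spot falls within the flexpage the pager sends back:
hot_spot_in_flexpage = hot_spot % flexpage_pages           # page #1

# Where the handler has to place the received flexpage inside the window so
# that its page #1 lines up with the faulting page #3:
flexpage_base_in_window = hot_spot - hot_spot_in_flexpage  # page #2

print(f"flexpage page #0 maps to window page #{flexpage_base_in_window}")
print(f"faulting page #{hot_spot} is served by flexpage page #{hot_spot_in_flexpage}")

In other words, because flexpage sizes are powers of two, the received flexpage is simply placed at the position within the window that is aligned to its own size and still contains the hot spot.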

So, the following seems to be considered and thus defined by the page fault handler:

  • The largest flexpage that could be used to satisfy the failing access.
  • The base of this flexpage.
  • The page within this flexpage where the access occurs: the “hot spot”.
  • The offset within the broader dataspace of the failing access, it indicating the data that would be expected in this page.

(Given this phrasing of the criteria, it becomes apparent that “flexpage offset” might be a better term than “hot spot”.)

With these things transferred to the pager using a map request, the pager’s considerations are as follows:

  • How flexpages of different sizes may fit within the memory available to satisfy the request.
  • The base of the most appropriate flexpage, where this might be the largest that fits within the available memory.
  • The population of the available memory with data from the dataspace.

To respond to the request, the pager sends a special flexpage item in its response message. Consequently, this flexpage is mapped into the task’s address space, and the execution of the task may resume with the missing data now available.

Practicalities and Pitfalls

If the dataspace being provided by a pager were merely a contiguous region of memory containing the data, there would probably be little else to say on the matter, but in the above I hint at some other applications. In my example, the pager only uses a certain amount of memory with which it responds to map requests. Evidently, in providing a dataspace representing a larger region, the data would have to be brought in from elsewhere, which raises some other issues.

Firstly, if data is to be copied into the limited region of memory available for satisfying map requests, then the appropriate portion of the data needs to be selected. This is mostly a matter of identifying how the available memory pages correspond to the data, then copying the data into the pages so that the accessed location ultimately provides the expected data. It may also be the case that the amount of data available does not fill the available memory pages; this should cause the rest of those pages to be filled with zeros so that data cannot leak between map requests.

Secondly, if the available memory pages are to be used to satisfy the current map request, then what happens when we re-use them in each new map request? It turns out that the mappings made for previous requests remain active! So if a task traverses a sequence of pages, and if each successive page encountered in that traversal causes a page fault, then it will seem that new data is being made available in each of those pages. But if that task inspects the earlier pages, it will find that the newest data is exposed through those pages, too, banishing the data that we might have expected.

Of course, what is happening is that all of the mapped pages in the task’s dataspace now refer to the same collection of pages in the pager, these being dedicated to satisfying the latest map request. And so, they will all reflect the contents of those available memory pages as they currently are after this latest map request.

The effect of mapping the same page repeatedly

One solution to this problem is to try and make the task forget the mappings for pages it has visited previously. I wondered if this could be done automatically, by sending a flexpage from the pager with a flag set to tell the kernel to invalidate prior mappings to the pager’s memory. After a time looking at the code, I ended up asking on the l4-hackers mailing list and getting a very helpful response that was exactly what I had been looking for!

There is, in fact, a special way of telling the kernel to “unmap” memory used by other tasks (l4_task_unmap), and it is this operation that I ended up using to invalidate the mappings previously sent to the task. Thus the task, upon backtracking to earlier pages, finds that the mappings from virtual addresses to the physical memory holding the latest data are absent once again, and page faults are needed to restore the data in those pages. The result is a form of multiplexing access to a resource via a limited region of memory.

Applications of Flexible Paging

Given the context of my investigations, it goes almost without saying that the origin of data in such a dataspace could be a file in a filesystem, but it could equally be anything that exposes data in some kind of backing store. And with this backing store not necessarily being an area of random access memory (RAM), we enter the realm of a more restrictive definition of paging where processes running in a system can themselves be partially resident in RAM and partially resident in some other kind of storage, with the latter portions being converted to the former by being fetched from wherever they reside, depending on the demands made on the system at any given point in time.

One observation worth making is that a dataspace does not need to be a dedicated component in the system in that it is not a separate and special kind of entity. Anything that is able to respond to the messages understood by dataspaces – the paging “protocol” – can provide dataspaces. A filesystem object can therefore act as a dataspace, exposing itself in a region of memory and responding to map requests that involve populating that region from the filesystem storage.

It is also worth mentioning that dataspaces and flexpages exist at different levels of abstraction. Dataspaces can be considered as control mechanisms for accessing regions of virtual memory, and the Fiasco.OC kernel does not appear to employ the term at all. Meanwhile, flexpages are abstractions for memory pages existing within or even independently of dataspaces. (If you wish, think of the frame of Banksy’s work “Love is in the Bin” as a dataspace, with the shredded pieces being flexpages that are mapped in and out.)

One can envisage more exotic forms of dataspace. Consider an image whose pixels need to be computed, like a ray-traced image, for instance. If it exposed those pixels as a dataspace, then a task reading from pages associated with that dataspace might cause computations to be initiated for an area of the image, with the task being suspended until those computations are performed and then being resumed with the pixel data ready to read, with all of this happening largely transparently.

I started this exercise out of somewhat idle curiosity, but it now makes me wonder whether I might introduce memory-mapped access to filesystem objects and then re-implement operations like reading and writing using this particular mechanism. Not being familiar with how systems like GNU/Linux provide these operations, I can only speculate as to whether similar decisions have been taken elsewhere.

But certainly, this exercise has been informative, even if certain aspects of it were frustrating. I hope that this account of my investigations proves useful to anyone else wondering about microkernel-based systems and L4Re in particular, especially if they too wish there were more discussion, reflection and collaboration on the design and implementation of software for these kinds of systems.

Sunday, 10 March 2019

Generate a random root password aka Ansible Password Plugin

I was suspicious of a cron entry on a new Ubuntu server cloud VM, so I ended up looking at the logs.

Authentication token is no longer valid; new one required

After a quick internet search,

# chage -l root

Last password change                                    : password must be changed
Password expires                                        : password must be changed
Password inactive                                       : password must be changed
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7

Because the password must be changed on the root account, the cron entry does not run as it should.

This ephemeral image does not need to have a persistent known password, as the notes suggest, and it doesn’t! Even so, we should change the root password when creating the VM.

Ansible

Ansible has a password plugin that we can use with lookup.

TLDR; here is the task:

- name: Generate Random Password
  user:
    name: root
    password: "{{ lookup('password','/dev/null encrypt=sha256_crypt length=32') }}"

after ansible-playbook runs

# chage -l root

Last password change                                    : Mar 10, 2019
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7

and the cron entry now runs as it should.

Password Plugin

Let me explain how the password plugin works.

The lookup needs at least two (2) arguments: the plugin name and a file in which to store the output. Instead of a real file, we will use /dev/null so the password is not persisted anywhere.

To begin with, here is a test ansible playbook:

- hosts: localhost
  gather_facts: False
  connection: local

  tasks:
    - debug:
        msg: "{{ lookup('password', '/dev/null') }}"
      with_sequence: count=5

Output:

ok: [localhost] => (item=1) => {
    "msg": "dQaVE0XwWti,7HMUgq::"
}
ok: [localhost] => (item=2) => {
    "msg": "aT3zqg.KjLwW89MrAApx"
}
ok: [localhost] => (item=3) => {
    "msg": "4LBNn:fVw5GhXDWh6TnJ"
}
ok: [localhost] => (item=4) => {
    "msg": "v273Hbox1rkQ3gx3Xi2G"
}
ok: [localhost] => (item=5) => {
    "msg": "NlwzHoLj8S.Y8oUhcMv,"
}

Length

In the password plugin we can also use the length variable:

msg: "{{ lookup('password', '/dev/null length=32') }}"

output:

ok: [localhost] => (item=1) => {
    "msg": "4.PEb6ycosnyL.SN7jinPM:AC9w2iN_q"
}
ok: [localhost] => (item=2) => {
    "msg": "s8L6ZU_Yzuu5yOk,ISM28npot4.KwQrE"
}
ok: [localhost] => (item=3) => {
    "msg": "L9QvLyNTvpB6oQmcF8WVFy.7jE4Q1K-W"
}
ok: [localhost] => (item=4) => {
    "msg": "6DMH8KqIL:kx0ngFe8:ri0lTK4hf,SWS"
}
ok: [localhost] => (item=5) => {
    "msg": "ByW11i_66K_0mFJVB37Mq2,.fBflepP9"
}

Characters

We can also restrict the generated characters to specific Python string constants:

  • ascii_letters (ascii_lowercase and ascii_uppercase)
  • ascii_lowercase
  • ascii_uppercase
  • digits
  • hexdigits
  • letters (lowercase and uppercase)
  • lowercase
  • octdigits
  • punctuation
  • printable (digits, letters, punctuation and whitespace)
  • uppercase
  • whitespace

eg.

msg: "{{ lookup('password', '/dev/null length=32 chars=ascii_lowercase') }}"

ok: [localhost] => (item=1) => {
    "msg": "vwogvnpemtdobjetgbintcizjjgdyinm"
}
ok: [localhost] => (item=2) => {
    "msg": "pjniysksnqlriqekqbstjihzgetyshmp"
}
ok: [localhost] => (item=3) => {
    "msg": "gmeoeqncdhllsguorownqbynbvdusvtw"
}
ok: [localhost] => (item=4) => {
    "msg": "xjluqbewjempjykoswypqlnvtywckrfx"
}
ok: [localhost] => (item=5) => {
    "msg": "pijnjfcpjoldfuxhmyopbmgdmgdulkai"
}

Encrypt

We can also define the encryption hash. Ansible uses passlib, so the available Unix crypt hash algorithms are:

  • passlib.hash.bcrypt - BCrypt
  • passlib.hash.sha256_crypt - SHA-256 Crypt
  • passlib.hash.sha512_crypt - SHA-512 Crypt

eg.

msg: "{{ lookup('password', '/dev/null length=32 chars=ascii_lowercase encrypt=sha512_crypt') }}"

ok: [localhost] => (item=1) => {
    "msg": "$6$BR96lZqN$jy.CRVTJaubOo6QISUJ9tQdYa6P6tdmgRi1/NQKPxwX9/Plp.7qETuHEhIBTZDTxuFqcNfZKtotW5q4H0BPeN."
}
ok: [localhost] => (item=2) => {
    "msg": "$6$ESf5xnWJ$cRyOuenCDovIp02W0eaBmmFpqLGGfz/K2jd1FOSVkY.Lsuo8Gz8oVGcEcDlUGWm5W/CIKzhS43xdm5pfWyCA4."
}
ok: [localhost] => (item=3) => {
    "msg": "$6$pS08v7j3$M4mMPkTjSwElhpY1bkVL727BuMdhyl4IdkGM7Mq10jRxtCSrNlT4cAU3wHRVxmS7ZwZI14UwhEB6LzfOL6pM4/"
}
ok: [localhost] => (item=4) => {
    "msg": "$6$m17q/zmR$JdogpVxY6MEV7nMKRot069YyYZN6g8GLqIbAE1cRLLkdDT3Qf.PImkgaZXDqm.2odmLN8R2ZMYEf0vzgt9PMP1"
}
ok: [localhost] => (item=5) => {
    "msg": "$6$tafW6KuH$XOBJ6b8ORGRmRXHB.pfMkue56S/9NWvrY26s..fTaMzcxTbR6uQW1yuv2Uy1DhlOzrEyfFwvCQEhaK6MrFFye."
}
Tag(s): ansible, password

A look at Matrix.org’s OLM | MEGOLM encryption protocol

Everyone who knows and uses XMPP is probably aware of a new player in the game. Matrix.org is often recommended as a young, rising alternative to the aging protocol behind the Jabber ecosystem. However, the founders do not see their product as a direct competitor to XMPP, as their approach to the problem of message exchange is quite different.

An open network for secure, decentralized communication.

matrix.org

During his talk at FOSDEM in Brussels, matrix.org founder Matthew Hodgson roughly compared the concept of Matrix to how git works. Instead of passing single messages between devices and servers, Matrix is all about synchronization of a shared state. A chat room can be seen as a repository which is shared between all servers of the participants. As a consequence, communication in a chat room can go on even when the server on which the room was created goes down, as the room simultaneously exists on all the other servers. Once the failed server comes back online, it synchronizes its state with the others and retrieves missed messages.

Matrix in the French State

Olm, Megolm – What’s the deal?

Matrix introduced two different crypto protocols for end-to-end encryption. One is named Olm, which is used in one-to-one chats between two chat partners (this is not quite correct, see Updates for clarifying remarks). It can very well be compared to OMEMO, as it too is an adaptation of the Signal Protocol by OpenWhisperSystems. However, due to some differences in the implementation, Olm is not compatible with OMEMO, although it shares the same cryptographic properties.

The other protocol goes by the name of Megolm and is used in group chats. Conceptually it deviates quite a bit from Olm and OMEMO, as it contains some modifications that make it more suitable for the multi-device use-case. However, those modifications alter its cryptographic properties.

Comparing Cryptographic Building Blocks

Protocol                   | Olm                         | OMEMO (Signal)
IdentityKey                | Curve25519                  | X25519
FingerprintKey ⁽¹⁾          | Ed25519                     | none
PreKeys                    | Curve25519                  | X25519
SignedPreKeys ⁽²⁾           | none                        | X25519
Key Exchange Algorithm ⁽³⁾  | Triple Diffie-Hellman (3DH) | Extended Triple Diffie-Hellman (X3DH)
Ratcheting Algorithm       | Double Ratchet              | Double Ratchet
  1. Signal uses an X25519 IdentityKey, which is capable of both encryption and of creating signatures using the XEdDSA signature scheme. Therefore no separate FingerprintKey is needed. Instead the fingerprint is derived from the IdentityKey. This is mostly a cosmetic difference, as one less key pair is required.
  2. Olm does not distinguish between the concepts of signed and unsigned PreKeys like the Signal protocol does. Instead it only uses one type of PreKey. However, those may be signed with the FingerprintKey upon upload to the server.
  3. OMEMO includes the SignedPreKey as well as an unsigned PreKey in the handshake, while Olm only uses one PreKey. As a consequence, if the sender’s Olm IdentityKey gets compromised at some point, the very first few messages that are sent could possibly be decrypted.

In the end Olm and OMEMO are pretty comparable, apart from some simplifications made in the Olm protocol. Those do only marginally affect its security though (as far as I can tell as a layman).

Megolm

The similarities between OMEMO and Matrix’ encryption solution end when it comes to group chat encryption.

OMEMO does not treat chats with more than two parties any differently than one-to-one chats. The sender simply has to manage a lot more keys, and the number of required trust decisions grows by a factor roughly equal to the number of chat participants.

Yep, this is a mess but luckily XMPP isn’t a very popular chat protocol so there are no large encrypted group chats ;P

So how does Matrix solve the issue?

When a user joins a group chat, they generate a session for that chat. This session consists of an Ed25519 SigningKey and a single ratchet which gets initialized randomly.

The public part of the signing key and the state of the ratchet are then shared with each participant of the group chat. This is done via an encrypted channel (using Olm encryption). Note that this session is also shared between the devices of the user. Contrary to Olm, where every device has its own Olm session, there is only one Megolm session per user per group chat.

Whenever the user sends a message, the encryption key is generated by forwarding the ratchet and deriving a symmetric encryption key for the message from the ratchet’s output. Signing is done using the SigningKey.

Recipients of the message can decrypt it by forwarding their copy of the sender’s ratchet the same way the sender did, in order to retrieve the same encryption key. The signature is verified using the public SigningKey of the sender.
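
To illustrate the idea of deriving one symmetric message key per ratchet step, here is a heavily simplified Python sketch. The real Megolm ratchet is a more involved HMAC-based construction with four ratchet parts, and messages are additionally signed with the Ed25519 SigningKey; this toy version only shows why sender and recipient, starting from the same shared ratchet state, arrive at the same sequence of message keys:

import hashlib, os

def advance(ratchet):
    # Derive the next ratchet state (one-way, so old states cannot be recovered).
    return hashlib.sha256(b"advance" + ratchet).digest()

def message_key(ratchet):
    # Derive a symmetric key for the current message from the ratchet output.
    return hashlib.sha256(b"message-key" + ratchet).digest()

# The sender creates the initial session state and shares it (Olm-encrypted)
# with every participant's devices.
sender_ratchet = os.urandom(32)
recipient_ratchet = sender_ratchet          # received via the Olm channel

# The sender encrypts three messages, advancing the ratchet once per message.
sender_keys = []
for _ in range(3):
    sender_keys.append(message_key(sender_ratchet))
    sender_ratchet = advance(sender_ratchet)

# The recipient forwards their copy of the ratchet the same way and obtains
# exactly the same message keys.
recipient_keys = []
for _ in range(3):
    recipient_keys.append(message_key(recipient_ratchet))
    recipient_ratchet = advance(recipient_ratchet)

assert sender_keys == recipient_keys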

There are some pros and cons to this approach, which I briefly want to address.

First of all, you may find that this protocol is way less elegant compared to Olm/Omemo/Signal. It poses some obvious limitations and security issues. Most importantly, if an attacker gets access to the ratchet state of a user, they could decrypt any message that is sent from that point in time on. As there is no new randomness introduced, as is the case in the other protocols, the attacker can gain access by simply forwarding the ratchet thereby generating any decryption keys they need. The protocol defends against this by requiring the user to generate a new random session whenever a new user joins/leaves the room and/or a certain number of messages has been sent, whereby the window of possibly compromised messages gets limited to a smaller number. Still, this is equivalent to having a single key that decrypts multiple messages at once.

The Megolm specification lists a number of other caveats.

On the pro side of things, trust management has been simplified as the user basically just has to decide whether or not to trust each group member instead of each participating device – reducing the complexity from a multiple of n down to just n. Also, since there is no new randomness being introduced during ratchet forwarding, messages can be decrypted multiple times. As an effect devices do not need to store the decrypted messages. Knowledge of the session state(s) is sufficient to retrieve the message contents over and over again.

By sharing older session states with own devices it is also possible to read older messages on new devices. This is a feature that many users are missing badly from OMEMO.

On the other hand, if you really need true future secrecy on a message-by-message basis and you cannot risk that an attacker may get access to more than one message at a time, you are probably better off taking the bitter pill, going through the fingerprint mess, and sticking to normal Olm/OMEMO (see Updates for remarks on this statement).

Note: End-to-end encryption does not really make sense in big, especially public chat rooms, since an attacker could just simply join the room in order to get access to ongoing communication. Thanks to Florian Schmaus for pointing that out.

I hope I could give a good overview of the different encryption mechanisms in XMPP and Matrix. Hopefully I did not make any errors, but if you find mistakes, please let me know so I can correct them asap 🙂

Happy Hacking!

Sources

Updates:

Thanks to Matthew Hodgson for pointing out that Olm/OMEMO is also effectively using a symmetric ratchet when multiple consecutive messages are sent without the receiving device sending an answer. This can lead to a loss of future secrecy as discussed in the OMEMO protocol audit.

Also thanks to Hubert Chathi for noting that Megolm is also used in one-to-one chats, as Matrix doesn’t have the same distinction between group and single chats. He also pointed out that the security level of Megolm (the criteria for regenerating the session) can be configured on a per-chat basis.

Tuesday, 05 March 2019

Copyright Reform

censorship machine

The European Parliament is preparing to vote on the final text of the "Copyright" directive. This directive contains two articles which, if approved, will dramatically change the character of the Internet – for the worse. Not only will it change the way people use the Internet as a platform for free expression, it will also negatively affect the very creators it supposedly tries to protect.

Below is a brief summary of the disastrous consequences that will become reality if the European Parliament votes the directive through.

Article 13: Upload filters

  • Commercial sites and apps where users post content must make every effort to license, in advance, anything their users might possibly upload – in practice, the entirety of human creation that is subject to copyright restrictions. It is obvious that this is impossible to implement, both technically and financially, especially for small platforms and services.
  • In addition, most sites will have to do everything possible to prevent their users from posting content for which they do not have the creator's permission. They will have no choice but to deploy content filters, which are expensive and ineffective.
  • If a court rules that a site's filters are not strict enough, the site will be legally liable for the infringements as if it had committed them itself. This legal threat will push many services to over-apply their filters in order to stay safe from prosecution, which threatens freedom of speech on the Internet.
  • In many cases, the content we post online will no longer be available immediately, since it will first have to be approved by these filters.
  • Several of the services we use today will stop operating within the EU, for fear of prosecution for violating the directive or for not applying it strictly enough.
  • Content filters will also harm creators. Covers, reviews, memes and other kinds of content based on copyrighted material will be blocked by the filters.
  • You will be guilty until proven innocent. Even if the content is your own, you will bear the burden of proof when the filters wrongly flag it as a copyright infringement.
  • The big market players will benefit, as they are the only ones who can afford the cost of complying with the directive.
  • In the end, fewer new and innovative services will be launched within the EU, because the cost of starting up will be prohibitive.

Article 11: Link tax

  • Reproducing news items with more than a single word, or more than a very short extract, will require a licence. This will most likely also cover the news snippets that are so widely used today. There is considerable ambiguity about what a "very short extract" means, but there is a real risk that linking with snippets becomes legally dangerous.
  • There are no exceptions. The directive covers even one-person services, small or non-profit companies, and blogs.
  • Many believe the directive will generate revenue for news organisations from large services (e.g. Google News) that use article snippets. In practice, however, it will hurt the small players (e.g. blogs), while the large services will most likely suspend their operation in the EU, depriving news sites of even the traffic they currently get from links to their articles.
  • In the end, implementing the directive will restrict the flow of information and news.

What you can do

Inform the MEPs. Many of them will vote in favour either out of ignorance or because they have been misinformed. Visit saveyourinternet, where you will find the relevant contact details. The MEPs who have supported the directive in the past are marked in red.

Help them better understand the dangers of the directive. If it helps your communication, feel free to use any excerpts from this text, or even the whole text.


*Comments and reactions on Mastodon, Diaspora, Twitter.

Sunday, 03 March 2019

Scaling automation with ansible-pull

Ansible is a wonderful piece of software for configuring your systems automatically. The default mode of using Ansible is the Push Model.


Ansible Push

That means that from your box, using only ssh + python, you can configure your fleet of machines.


Ansible is imperative. You define tasks in your playbooks and roles, and they run one after the other on the remote machines. Each task first checks whether it needs to run and otherwise skips the action. Even though we can use conditionals to skip actions, tasks still perform all of their checks, which is why Ansible feels slow compared to other configuration tools. Ansible runs the tasks serially, but in pseudo-parallel mode against the remote servers to increase speed. Sometimes, however, you also need to gather_facts, and that costs execution time. There are solutions for caching Ansible facts in Redis (an in-memory key:value database), but even then you need a workaround to speed up your deployments.

But there is another way: the Pull Mode!


Useful Reading Materials

To learn more on the subject, you can start by reading these two articles on ansible-pull.


Pull Mode

So here is how it looks:

Ansible Pull


You will first notice that your Ansible repository has moved from your local machine to an online git repository. For me, this is GitLab. As my git repo is private, I have created a read-only, time-limited deploy token.

In that scenario, our (ephemeral or not) VMs pull their Ansible configuration from the git repo and run the tasks locally. I usually build my infrastructure with Terraform by HashiCorp and take advantage of cloud-init to perform the initial configuration.

Cloud-init

The tail of my user-data.yml looks pretty much like this:

...
# Install packages
packages:
  - ansible

# Run ansible-pull
runcmd:
  - ansible-pull -U https://gitlab+deploy-token-XXXXX:YYYYYYYY@gitlab.com/username/myrepo.git 

Playbook

You can either create a playbook named after the hostname of the remote server, e.g. node1.yml, or use local.yml as the default playbook name.

Here is an example that also puts ansible-pull into a cron entry. This is very useful because it checks the git repo for changes every 15 minutes and runs Ansible again.

- hosts: localhost

  tasks:
    - name: Ensure ansible-pull is running every 15 minutes
      cron:
        name: "ansible-pull"
        minute: "*/15"
        job: "ansible-pull -U https://gitlab+deploy-token-XXXXX:YYYYYYYY@gitlab.com/username/myrepo.git &> /dev/null"

    - name: Create a custom local vimrc file
      lineinfile:
        path: /etc/vim/vimrc.local
        line: 'set modeline'
        create: yes

    - name: Remove "cloud-init" package
      apt:
        name: "cloud-init"
        purge: yes
        state: absent

    - name: Remove useless packages from the cache
      apt:
        autoclean: yes

    - name: Remove dependencies that are no longer required
      apt:
        autoremove: yes

# vim: sts=2 sw=2 ts=2 et

More Languages to the F-Droid Planet?

You may know about Planet F-Droid, a feed aggregator that aims to collect the blogs of many free Android projects in one place. Currently all of the registered blogs are written in English (as is this post, so if you know someone who might be concerned by the matter below and is not able to understand English, please feel free to translate for them).

Recently someone suggested that we should maybe create additional feeds for blogs in other languages. I'm not sure if there is interest in having support for more languages, so that's why I want to ask you.

If you feel that Planet F-Droid should offer additional feeds for non-English blogs, please vote by thumbs up/down in the planets repository.

Happy Hacking!

Friday, 01 March 2019

Protect freedom on radio devices: raise your voice today!

We are facing an EU regulation which may make it impossible to install a custom piece of software on most radio devices like WiFi routers, smartphones and embedded devices. You can now give feedback on the most problematic part until Monday, 4 March. Please participate – it's not hard!

The EU Radio Equipment Directive (2014/53/EU) contains one highly dangerous article that will cause many issues if implemented: Article 3(3)(i). It requires hardware manufacturers of most devices that send and receive radio signals to implement a barrier that disallows installing software which has not been certified by the manufacturer. That means that before you can install an alternative operating system on a router, mobile phone or any other radio-capable device, the manufacturer of this device has to assess its conformity.

[R]adio equipment [shall support] certain features in order to ensure that software can only be loaded into the radio equipment where the compliance of the combination of the radio equipment and software has been demonstrated.

Article 3(3)(i) of the Radio Equipment Directive 2014/53/EU

That flips the responsibility for radio conformity by 180°. In the past, you, as the one who changed the software on a device, were responsible for making sure that you did not break any applicable regulations such as frequency and signal strength. Now the manufacturers have to prevent you from doing something wrong (or right?). That takes away even more of our freedom to control our technology. More information here by the FSFE.

The European Commission has set up an Expert Group to come up with a list of classes of devices which are supposed to be affected by the said article. Unfortunately, it seems that the group's recommendation is to put highly diffuse device categories like "Software Defined Radio" and "Internet of Things" within the scope of this regulation.

Get active today

But there is something you can do! The European Commission has officially opened a feedback period. Everyone – individuals, companies and organisations – can provide statements on the proposed plans. All you need to participate is an EU Login account, and you can hide your name from the public list of received feedback. A summary, the impact assessment, the feedback received so far, and the actual feedback form are available here.

To help you word your feedback, here's a list of some of the most important disadvantages for user freedom I see (there is a more detailed list by the FSFE):

  • Free Software: To control technology, you have to be able to control the software. That is only possible with Free and Open Source Software, so if you want a transparent and trustworthy device, the software running on it needs to be Free Software. But any device affected by Article 3(3)(i) will only allow the installation of software authorised by the manufacturer, and it is unlikely that a manufacturer will certify every piece of available software that suits your needs. Having these gatekeepers with their particular interests will make using Free Software on radio devices hard.
  • Security: Radio equipment like smartphones, routers, or smart home devices are highly sensitive parts of our lives. Unfortunately, many manufacturers sacrifice security for lower costs. For many devices there is better software which protects data and still offers equal or even better functionality. If such manufacturers do not even care for security, will they even allow running other (Free and Open Source) software on their products?
  • Fair competition: If you don't like a certain product, you can use another one from a different manufacturer. If you don't find any device suiting your requirements, you can (help) establish a new competitor that e.g. enables user freedom. But Article 3(3)(i) favours huge enterprises, as it forces companies to install software barriers and to certify additional software. For example, a small or medium-sized manufacturer of wifi routers cannot certify all available Free Software operating systems. Also, companies bundling their own software with third-party hardware will have a really hard time. On the other hand, large companies which don't want users to run any software other than their own will profit from this threshold.
  • Community services: Volunteer initiatives like Freifunk depend on hardware which they can use with their own software for their charity causes. They were able to create innovative solutions with limited resources.
  • Sustainability: No updates available any more for your smartphone or router? From a security perspective, there are only two options: flash another firmware which still receives updates, or throw the whole device away. From an environmental perspective, the first solution is obviously much better. But will manufacturers still certify alternative firmware for devices they want to get rid of? I doubt it…

There will surely be more, so please make your points in your individual feedback. It will send a signal to the European Commission that there are people who care about freedom on radio devices. It's only a few minutes' work to avoid legal barriers that would worsen your and others' lives for years.

Thank you!

Tuesday, 26 February 2019

TIL: Browsers ignore Expires header on reload

This may have been obvious, but I've just learned that browsers ignore the Expires header when the user manually reloads the page (as in by pressing F5 or choosing the Reload option).

I ran into this when testing how Firefox treats pages which 'never' expire. To my surprise, the browser made requests for files it had a fresh copy of in its cache. To see behaviour much more representative of the experience of a returning user, one should select the address bar (Alt+D does the trick) and then press Return to navigate to the current page again. Hitting Reload is more akin to the first visit, though not exactly the same.

Of course, all of the above applies to the max-age directive of the Cache-Control header as well.
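
A useful first step is to check what caching headers the server actually sends, before reasoning about how the browser will treat them. A small sketch in Python (standard library only; the URL is just a placeholder) could look like this:

import sys
import urllib.request

# Print the caching-related response headers of a URL, so you can see
# which Expires / Cache-Control values the browser has to work with.
url = sys.argv[1] if len(sys.argv) > 1 else "https://example.com/"
with urllib.request.urlopen(url) as response:
    for name in ("Expires", "Cache-Control", "ETag", "Last-Modified"):
        print(f"{name}: {response.headers.get(name, '(not set)')}")

Of course, this only tells you what the server sends; how the browser treats those headers on a manual reload still has to be tested by hand, as described above.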

Moral of the story? Make sure you test the actual real-life scenarios before making any decisions.

Saturday, 23 February 2019

An argument for the existence of non-binary gender identities

So there's a thought I've been having lately. The idea that some people identify as neither male nor female has become more mainstream over the past few years. And I assume like a lot of people, I've struggled with giving those people a place within my own concept and experience of gender. So I'd taken the accepting-but-condescending view of "yeah whatever, do what makes you happy, I just don't understand it". That stance is sufficient for making the lives of those people nicer, but what really makes the lives of those people nicer is if they feel understood and believed.

And I think I finally grok it within my own understanding of what gender is. I want to share this, but it requires already being on board with a few things:

  • Gender is a social construct, in so much as it can only be understood by humans within social contexts. It is different from biological sex, but possibly informed by it. Some people understand gender to be bad because it is a social construct, but I am not one of those people.

  • Transgender people exist. This mostly goes without saying, but it needs saying. Some people, somehow, are inclined towards a gender that does not match with their sex.

  • Trans women are women. Trans men are men. There are a lot of arguments for this – just pick your poison for which one you like best. I am personally convinced by performativism, or the Intrinsic Inclination model suggested by Julia Serano.

  • Intersex people exist. This also goes without saying. Some people are born neither male nor female, either because their genetics are different, or because their genitals are different.

The last point is especially important. As you do, I was thinking about intersex people a few weeks ago, and I wondered what gender they might identify as. What usually happens is that they gravitate towards either binary choice, and that is fine. But I wondered then, what if they do not? What if they identify along the same lines of their sex: somewhere in between?

I could not find a single good argument that might explain why they couldn't. Even a gender essentialist might be convinced that an intersex person has every right to identify somewhere in between, in line with their physical sex. So if there is room for an "intersex" gender identity, then that must necessarily mean that gender is a spectrum.

And if gender is a spectrum, and we agree that the gender identities of trans people are valid even though they do not match their sex, then that must necessarily mean that the gender identities of non-binary people are also valid. Otherwise you end up in a strange situation where only intersex people are allowed to identify as non-binary, which is equally as restricting as saying that only cis men are allowed to identify as men.

I am half-certain that I am making some incorrect leaps in logic here or there, and I am not academically well-read on the topic, but this reasoning helped me a lot with at least giving the existence of a non-binary gender identity a place within my own framework of how gender works. And I hope it helps others too.

Open Call: carpenter for children’s playground in refugee camp Greece

One of the members of the Designers With Refugees community, which is part of innovation lab Latra, shared the following open call:

Subject: My quest for a compassionate Carpenter

Dear reader,

I wanted to grab your attention for the following. Finally, after weeks of being in contact with different people and organisations, I have settled on a new project on Lesvos.

Refugee 4 Refugees, together with Infinitum Limits, is setting up a new project called Mandala, right next to the Olive Grove and camp Moria. A field of approximately 4000 m2 will be transformed into a child-friendly space offering education, sports and play activities, with the aim of creating a safe space.

I will go there with a small team to help out with the design and build activities from mid-March to the end of May. At the moment the team consists of a civil engineer and another architecture student, but I'm still looking for a fairly skilled carpenter to strengthen the team, on site, for any amount of time within the above period. I'm able to offer free accommodation.

Would you all like to consider whether this could apply to you, or to someone you know well who would be open-minded about it? It would mean a lot to me to find someone!

Also, if you're not a carpenter but would like to learn more, contact me! 🙂

WhatsApp: 0613332421
Email: *protected email*

The information is also available for download as PDF.
