Take All Your Belongings with You

Posted on September 4, 2022

Introduction

Meet the Self-Hosters, Taking Back the Internet One Server at a Time

Why

I want my data back

It is an intellectual humiliation not to be able to take my things back (see "Getting my personal data out of Facebook" by Ruben Verborgh). Doubly so when they exploit your data and then hold it hostage so that they can keep exploiting it.

A real-world metaphor: "Lebanon man hailed hero for holding Beirut bank hostage over savings" (BBC News).

Freedom as a utility

Just having the possibility is exhilarating.

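Hope cannot be said to exist, nor can it be said not to exist. It is like the roads across the earth: there were no roads to begin with, but when many people pass one way, a road is made. (Lu Xun)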

Free from want

Free from fear

The freedom to run, copy, distribute, study, change and improve the software

A kind of raison d'être.

I read a lot of articles with Pocket. Its search function works poorly, and I frequently find myself unable to find articles I saved there. I was pretty excited about Mozilla's acquisition of Pocket: I imagined it would become easier to export my data and to auto-label the articles I read. That wish turned out to be like waiting for Godot. It has borne no fruit, cf. "Open source Pocket server-side code". I want the possibility of improving something I use every day.

Gaining insights for your digital life

An unexamined life is not worth living, aka did you update your Facebook status today?

Tenets

Offline first

What is it?

Offline First

Why is it important?

High availability in the presence of network partitions. But what about inconsistency? It is not critical in the case of personal computing, and all the standard methods for resolving inconsistency still apply.

See also

Downsides of Offline First

Bring your own client

What is it?

Bring Your Own Client

Why is it important?

Take email clients as an example: read and write email with mu4e in Emacs!

Interoperability

What is it?

A Legislative Path to an Interoperable Internet

Why is it important?

Interoperability is not only about clients interoperating with the server. It also covers data portability, back-end interoperability, and delegability. Take BookWyrm as an example.

Weak centralization

What is it?

Another Penrose triangle

Why is it important?

Take atuin as an example.

Data portability

Showcase

code-server

Edit files with your favorite editor, but on the web and with all your files.

calibre web

Show how large my personal digital library is, and why it can't be done with a public service. This also applies to your music/video library.

datasette

Explore my Pocket data with datasette. Also mention powerful business intelligence tools in the context of personal data.

smos

Show how amazing smos is as a productivity tool, and how c

organice

Indicate why I am infinitely more productive with my Emacs config, and how I can choose my own tools.

keeweb

Store everything I have on the Internet without fear.

aria2

Start downloads from the office and use the files when you arrive home.

grocy

What can you make with all the materials in your refrigerator?

rclone/sftpgo

Mount remote storage and expose it through a standardized interface.

Selected topics

Remote access

connectivity

DDNS + router port mapping

You need to dial up from your router, not from the fiber-optic modem. If your router is flexible enough, you can do all the DDNS and port mapping work there. Otherwise, you can run miniupnp on your server.

remote port mapping

autossh (my favorite), ngrok, frp, nps. These do not scale once you have more than a few hosts to manage.
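A minimal sketch of the autossh approach, assuming a publicly reachable relay host that the home server can log in to over SSH. The helper name reverse_tunnel, the DRY_RUN switch, and the host and port values are hypothetical; the autossh and ssh options used (-M 0, -N, -R, ServerAliveInterval, ServerAliveCountMax, ExitOnForwardFailure) are standard ones.

```shell
# reverse_tunnel RELAY REMOTE_PORT [LOCAL_PORT]
# Keeps a reverse SSH tunnel open: RELAY:REMOTE_PORT -> localhost:LOCAL_PORT.
# With DRY_RUN set, the command is printed instead of executed.
reverse_tunnel() {
    relay=$1              # placeholder, e.g. user@relay.example.com
    remote_port=$2        # port opened on the relay
    local_port=${3:-22}   # local service to expose, SSH by default

    # -M 0 turns off autossh's monitor port and relies on the
    # ServerAlive* keepalives to detect a dead connection.
    set -- autossh -M 0 -N \
        -o ServerAliveInterval=30 -o ServerAliveCountMax=3 \
        -o ExitOnForwardFailure=yes \
        -R "${remote_port}:localhost:${local_port}" "$relay"

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running DRY_RUN=1 reverse_tunnel user@relay.example.com 2222 only prints the command line. Without DRY_RUN the tunnel stays up, and on the relay the home machine is reachable with ssh -p 2222 localhost. Every additional host needs its own port and its own tunnel, which is where this approach starts to hurt.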

Tor

It comes with a relay network, but it is not so censorship-resistant.

Static VPN solutions

WireGuard is not dynamic enough: adding a new host costs O(n) configuration changes, IPAM (IP address management) is manual, and it cannot penetrate double NAT.
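To make the O(n) claim concrete, here is a small arithmetic sketch. It assumes a full mesh in which every host lists every other host as a peer; the function names are made up for illustration.

```shell
# Number of point-to-point tunnels in a full mesh of N hosts: N * (N - 1) / 2.
mesh_tunnels() {
    echo $(( $1 * ($1 - 1) / 2 ))
}

# Config files to touch when one host joins a mesh of N existing hosts:
# each existing host gains a peer entry, and the new host needs its own config.
configs_touched_on_join() {
    echo $(( $1 + 1 ))
}
```

With 5 existing hosts, configs_touched_on_join 5 gives 6 files to edit by hand, and the resulting 6-host mesh has mesh_tunnels 6 = 15 tunnels.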

Magic overlay networks

All computer science problems can be solved by adding a new layer of indirection. There are many solutions, e.g. zerotier, tailscale, netmaker, innernet, nebula, headscale, netbird, firezone.

node discovery

mDNS/LLMNR

Ever wondered how a Time Machine server on your LAN is discovered, or why you can just ping a hostname on Windows? It is a free lunch if your overlay network supports multicast.

CoreDNS

More magic.

exposing http services

Dynamic and self-organizing.

SSL termination

Routing rules

remote editing

remote shell access

Synchronization

Syncthing

Syncing without a server listening 24/7

rclone bisync

cryptomator

rclone/sftpgo

Backup

Caveats

The time-of-check to time-of-use (TOCTOU) problem applies to data backup. An almost harmless example: the backup software first reads the directory entries to get a list of all files and then tries to read each file's content, only to find that a file is no longer there, so backing up that file fails. A more pernicious example: the backup software backs up two inconsistent parts of a file, which results in a corrupted copy. If you want to be absolutely sure about the integrity of a file, you can either let the software that owns the file do its own backup job or create a file-system-level snapshot.
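Where neither option is available, a cheap mitigation for single files is to hash the source before copying and compare it with the hash of the copy. The following is only a sketch under assumptions: stable_copy is a hypothetical helper, not part of any backup tool, and it relies on sha256sum being installed. It handles the vanished-file case and the file-changed-mid-copy case, but it does nothing for consistency across several files, which still calls for a snapshot.

```shell
# stable_copy SRC DST [ATTEMPTS]
# Returns 0 on a verified copy, 1 if the file kept changing,
# 2 if the file vanished or could not be read or written.
stable_copy() {
    src=$1
    dst=$2
    attempts=${3:-3}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Time of check vs. time of use: the file may be gone since it was listed.
        if [ ! -f "$src" ]; then
            echo "skipped, no longer there: $src" >&2
            return 2
        fi
        before=$(sha256sum < "$src") || return 2
        cp -- "$src" "$dst" || return 2
        copied=$(sha256sum < "$dst") || return 2
        # If the copy hashes to what the source hashed to before copying,
        # the copy is one consistent version of the file.
        if [ "$before" = "$copied" ]; then
            return 0
        fi
        i=$((i + 1))
    done
    echo "gave up, file keeps changing: $src" >&2
    return 1
}
```

A backup loop could call stable_copy for each listed file and treat return code 2 as a warning rather than a failure, since a file that disappeared between listing and reading is usually harmless.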

File system backup

File system agnostic backup

Two styles of backing up

tar

tar -C "$HOME" --zstd -cpf - \
    --one-file-system --exclude-vcs-ignores --exclude-backups --exclude-caches-all \
    --exclude="$encrypted_backup_file" "$HOME" \
  | gpg --yes --pinentry-mode loopback --symmetric --cipher-algo aes256 -o "$encrypted_backup_file"

rclone

rclone sync ~/Sync/ backup-primary-encrypted:/sync/
cat ~/.config/rclone/rclone.conf
[backup-primary-encrypted]
type = crypt
remote = backup-primary:encrypted
password = passwordhere

restic

borgbackup

See also

GitHub - restic/others: Exhaustive list of backup solutions for Linux

Service provision

Off-the-shelf solutions

TODO: add a meme

Fear not, we already have multiple solutions designed for this niche market: awesome-selfhosted/awesome-selfhosted, Self-hosting Solutions.

My take

See repo.

Security

CI/CD

Observability