Easily manage your Let's Encrypt certificates with Digital Ocean Spaces

Introduction

Managing and distributing Let's Encrypt SSL certificates can be tedious. In this article, I will share my strategy for obtaining, renewing, and distributing my SSL certificates with Let's Encrypt. I wanted to be able to type a single command to:

  • get a new certificate for a subdomain
  • renew all my certificates that needed renewal
  • publish the certificates to Digital Ocean spaces or S3
  • pull the certificates on my webservers

To issue and renew Let's Encrypt certificates, we will use DNS validation with lego (and Digital Ocean's DNS). We will distribute the certificates using Digital Ocean Spaces, an equivalent of S3.

All four commands are in a makefile that you can find at: https://github.com/charignon/certificates.

Prerequisites

To follow along, you will need the aws command line interface and lego installed, plus a Digital Ocean account with Spaces enabled and your domain's DNS hosted on Digital Ocean (or an S3 bucket and another DNS provider supported by lego).

Step 1 — Setting up the aws cli and creating a bucket

In this step, we will create a bucket named domain-certificates to hold our certificates.

Let's start by setting up the aws cli credentials and bucket:

export AWS_ACCESS_KEY_ID="YYYY" # change with your key id
export AWS_SECRET_ACCESS_KEY="XXXX" # change with your key
export BUCKET=domain-certificates

If, like me, you are using Digital Ocean Spaces, you need to specify an endpoint when using the aws cli. To avoid repeatedly typing the --endpoint argument, I created an alias. If you use S3 instead, just use aws instead of awscli in all of the following commands:

alias awscli="aws --endpoint-url=https://nyc3.digitaloceanspaces.com"

Now let's use this alias to create a new bucket and check that it exists:

awscli s3api create-bucket --bucket $BUCKET --region nyc3
awscli s3api list-buckets
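
If you prefer to script this step, the same calls are available from Python via boto3. Here is a minimal sketch under the assumption that boto3 is installed and that the environment variables above are set (it is not part of the Makefile we will build later):

import os

import boto3

# Point boto3 at Digital Ocean Spaces instead of AWS S3; drop the
# endpoint_url argument if you are using plain S3.
s3 = boto3.client(
    "s3",
    endpoint_url="https://nyc3.digitaloceanspaces.com",
    region_name="nyc3",
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

s3.create_bucket(Bucket=os.environ.get("BUCKET", "domain-certificates"))
print([b["Name"] for b in s3.list_buckets()["Buckets"]])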

We can also check that lego can be invoked:

lego --version # prints "lego version 0.4.0" for me

If this does not work for you, you might want to adjust your $GOPATH environment variable so that your shell can find lego.

Step 2 — Generating and uploading a certificate for a subdomain

Let's assume that you own the domain foobar.com and want to generate a new certificate for prod.foobar.com. We can use lego for that. The instructions vary depending on your DNS provider; here is what I do for Digital Ocean's DNS:

export DO_AUTH_TOKEN=XXX
lego --accept-tos  --email contact@foobar.com --dns digitalocean -d prod.foobar.com run

If all goes well, this will create a new certificate in the ".lego" folder. We can now upload this folder to the bucket that we created (in an acme folder):

awscli s3 sync $(pwd)/.lego s3://domain-certificates/acme

Voila! You just issued a new certificate and stored it online.

Step 3 — Downloading the certificate on another host

On your webserver, you can download the certificate we just issued with the following command:

awscli s3 sync s3://domain-certificates/acme $(pwd)/.lego

Step 4 - Renewing domains

To renew a previously issued certificate stored in the ".lego" folder, you can use:

lego -d prod.foobar.com renew --days 30

Step 5 - Automating with a Makefile

Let's automate all these steps in a Makefile. First, let's create a rule to add a new domain by calling lego:

LEGO=lego --accept-tos  --email contact@foobar.com --dns digitalocean 

new:
	@$(if ${DOMAIN}, echo "Setting up ${DOMAIN}", echo "Please set DOMAIN env var" ; exit 1 )
	${LEGO} -d ${DOMAIN} run

We can invoke it like this:

DOMAIN=staging.foobar.com make new

Next we need a way to upload and download certificates to and from our bucket. Let's create two rules for that:

ENDPOINT=--endpoint-url https://nyc3.digitaloceanspaces.com
LOCAL_PATH=${PWD}/.lego
S3_PATH="s3://domain-certificates/acme"
AWS=aws ${ENDPOINT}

upload:
	@echo "uploading"
	${AWS} s3 sync ${LOCAL_PATH} ${S3_PATH}

download:
	@echo "Download"
	${AWS} s3 sync ${S3_PATH} ${LOCAL_PATH}

Finally, to renew all of the domains, we need to be able to query the list of domains in ".lego":

DOMAINS=`ls .lego/certificates | grep key | sed "s/.key$$//"`

And loop through all of the subdomains, which leads to the following makefile rule:

renew:
	@echo "Renewing ${DOMAINS}"
	@for d in ${DOMAINS};do ${LEGO} -d $$d renew --days 30 ; done

I can use the rules we just defined to download and renew all of my certificates:

make download && make renew && make upload

The final touch is to make the rules phony because they don't produce files:

.PHONY: new renew upload download

Putting it all together, the makefile looks like:

.PHONY: new renew upload download
ENDPOINT=--endpoint-url https://nyc3.digitaloceanspaces.com
AWS=aws ${ENDPOINT}
LEGO=lego --accept-tos --email contact@foobar.com --dns digitalocean
S3_PATH="s3://domain-certificates/acme"
LOCAL_PATH=${PWD}/.lego
DOMAINS=`ls .lego/certificates | grep key | sed "s/.key$$//"`

new:
	@$(if ${DOMAIN}, echo "Setting up ${DOMAIN}", echo "Please set DOMAIN env var" ; exit 1 )
	${LEGO} -d ${DOMAIN} run

renew:
	@echo "Renewing ${DOMAINS}"
	@for d in ${DOMAINS};do ${LEGO} -d $$d renew --days 30 ; done

upload:
	@echo "uploading"
	${AWS} s3 sync ${LOCAL_PATH} ${S3_PATH}

download:
	@echo "Download"
	${AWS} s3 sync ${S3_PATH} ${LOCAL_PATH}

This makefile makes it a lot easier to manage your certificates! You can now quickly create, renew, upload and download certificates. I hope that it makes your life easier!

The code is available on Github at https://github.com/charignon/certificates.

Extracting Chrome Cookies with Clojure

Introduction

Did you know that it is possible to extract your Google Chrome cookies and use them in a script? This can be a great way to automate interactions with a website when you don't want to bother automating the login flow. There is a great Python library for this called browsercookie. Browsercookie works on Linux, Mac OSX, and Windows and can extract cookies from Chrome and Firefox.

I have been learning Clojure for the past month and I decided to reimplement the same functionality as browsercookie as an exercise! I built a command line tool to print decrypted Chrome cookies as JSON on OSX. This article will walk you through the implementation. Even if you don't know a thing about Clojure, you should be able to understand the process and learn a few things along the way; enjoy! The full code for this project is available on GitHub.

Context: how does Chrome store cookies?

Chrome stores its cookies in a SQLite database, in the cookies table. The actual content of each cookie is encrypted with AES using a key stored in the user's keychain.
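
If you want to get a feel for that layout, you can peek at the database directly. Here is a small Python sketch; the path is Chrome's default profile location on OSX (the same one we will use later in cookies-file-osx), and you should close Chrome or work on a copy of the file first, since Chrome may hold a lock on it:

import os.path
import sqlite3

db = os.path.expanduser(
    "~/Library/Application Support/Google/Chrome/Default/Cookies")
conn = sqlite3.connect(db)
rows = conn.execute(
    "SELECT host_key, name, value, encrypted_value FROM cookies LIMIT 5")
for host_key, name, value, encrypted_value in rows:
    # Encrypted cookies have an empty value column and an encrypted_value
    # blob that starts with b'v10'
    print(host_key, name, value, bytes(encrypted_value[:3]))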

Architecture

Here is a diagram of the architecture (side effects in orange, pure computation in green):

Architecture of the code
High-level architecture of the code

This translates to what the main function does:

(defn -main
  [& args]
  (if (not= 1 (count args))
    (.println *err* "Usage ./program <site-url>")
    (let [site-url (first args)
          aes-key (build-decryption-key (get-chrome-password-osx))
          db-spec {:dbtype "sqlite" :dbname (temp-file-copy-of cookies-file-osx)}
          query (str "select * from cookies where host_key like '" site-url "'")
          cookies (jdbc/query db-spec [query])]
      (println
       (json/write-str
        (map (partial decrypt-cookie aes-key) cookies)))
      (System/exit 0))))

Let's explore how each part works, dividing the process into three steps:

Step 1 — Building the cookies decryption key

This is a two-step process: reading the key from the keychain and making it suitable for AES. To read a key from the keychain, OSX provides a command line utility called security.

To get the Chrome key, you can run:

security find-generic-password -a "Chrome" -w

We can express this in Clojure:

(defn get-chrome-rawkey-osx
  "Get the chrome raw decryption key using the `security` cli on OSX"
  []
  (-> (shell/sh "security" "find-generic-password" "-a" "Chrome" "-w" :out-enc "UTF-8")
      (:out)
      (clojure.string/trim)
      (.toCharArray)))

The Clojure syntax (-> x (a) (b k)) would translate to b(a(x), k) in Python; it allows chaining operations in a very readable way. Note how we can easily mix Clojure functions and Java methods, which start with a . character.

We can then build a key suitable for AES decryption:

(defn build-decryption-key
  "Given a chrome raw key, construct a decryption key for the cookies"
  [raw-key]
  (let [salt (.getBytes "saltysalt")]
  (-> (SecretKeyFactory/getInstance "pbkdf2withhmacsha1")
      (.generateSecret (PBEKeySpec. raw-key salt 1003 128))
      (.getEncoded)
      (SecretKeySpec. "AES"))))

To decrypt AES, we need the key and a salt. saltysalt is the salt used by Chrome (I copied this from browsercookie). Again, note that methods starting with . are Java methods. We use the Java interop and the javax.crypto library to build this key.
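
For comparison, the same key derivation fits in a couple of lines of Python using only the standard library (this mirrors what browsercookie does; the salt and iteration count are the ones used above):

import hashlib

def build_decryption_key(raw_key: bytes) -> bytes:
    # PBKDF2-HMAC-SHA1 with salt "saltysalt", 1003 iterations, 128-bit key
    return hashlib.pbkdf2_hmac("sha1", raw_key, b"saltysalt", 1003, dklen=16)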

Now that we have a decryption key, let's see how we can read the cookies and decrypt them!

Step 2 — Reading the encrypted cookies

The cookies are stored in a SQLite database at a known location on OSX. Each row in the cookies table corresponds to a cookie:

  • if the encrypted_value field starts with v10, it is an encrypted cookie (again, according to the reference implementation browsercookie).
  • otherwise, the value field contains the value of the cookie.

Let's start with a utility function to return the path of the cookie database:

(def cookies-file-osx
  "Return the path to the cookie store on OSX"
  (str
   (System/getProperty "user.home")
   "/Library/Application Support/Google/Chrome/Default/Cookies"))

Since we don't want to work directly on the real cookie database, we can make a copy of it to a temp file using another function:

(defn temp-file-copy-of
  "Create a copy of file as a temp file that will be removed when
  the program exits, return its path"
  [file]
  (let [tempfile (java.io.File/createTempFile "store" "db")]
    (.deleteOnExit tempfile)
    (io/copy (io/file file) (io/file tempfile))
    tempfile))

We can use these two functions to query all the cookies from the database matching the domain name the user requested:

(let [site-url (first args)
      db-spec {:dbtype "sqlite" :dbname (temp-file-copy-of cookies-file-osx)}
      query (str "select * from cookies where host_key like '" site-url "'")]
      ... ;; Do something with cookies)

At this point, we have extracted all the relevant cookies from the SQLite database and have a decryption key ready. Let's decrypt and print the cookies!

Step 3 - Decrypting the cookies and printing them

In this step we will build the decrypt-cookie function. Given a cookie like:

{
  :path "/",
  :firstpartyonly 0,
  :name "tz",
  :persistent 0,
  :encrypted_value "v10garbage",
  :expires_utc 0,
  :host_key "github.com",
  :creation_utc 13147635893809723,
  :httponly 0,
  :priority 1,
  :last_access_utc 13147635892809723,
  :secure 1,
  :has_expires 0
}

The decrypt-cookie function returns a decrypted cookie:

{
  :path "/",
  :firstpartyonly 0,
  :name "tz",
  :persistent 0,
  :value "decrypted value",
  :expires_utc 0,
  :host_key "github.com",
  :creation_utc 13147635893809723,
  :httponly 0,
  :priority 1,
  :last_access_utc 13147635892809723,
  :secure 1,
  :has_expires 0
}

How do we know if a cookie is encrypted? As we said above, a cookie is encrypted if its encrypted_value field starts with v10. Let's use the -> macro to express this in the is-encrypted? function:

(defn is-encrypted?
  "Returns true if the provided encrypted value (bytes[]) is an encrypted
  value by Chrome that is, if it starts with v10"
  [cookie]
  (-> cookie
      (:encrypted_value)
      (String. "UTF-8")
      (clojure.string/starts-with? "v10")))

But how do we decrypt an encrypted value? Since Chrome uses AES in CBC mode, we need to combine the AES key and an initialization vector (again taken from the reference implementation) to build a cipher. Using this cipher, we decrypt the encrypted text in the decrypt function:

(defn decrypt
  "Decrypt AES encrypted text given an AES key"
  [aeskey text]
  (let [ivsbytes (-> (repeat 16 " ") (clojure.string/join) (.getBytes))
        iv       (IvParameterSpec. ivsbytes)
        cipher   (Cipher/getInstance "AES/CBC/PKCS5Padding")
        _        (.init cipher Cipher/DECRYPT_MODE aeskey iv)]
    (String. (.doFinal cipher text))))

This is a little terse but can easily be decomposed into three steps:

Building the initialization vector for the AES decryption (let block omitted):

ivsbytes (-> (repeat 16 " ") (clojure.string/join) (.getBytes))
iv       (IvParameterSpec. ivsbytes)

Building the cipher

cipher   (Cipher/getInstance "AES/CBC/PKCS5Padding")
_ (.init cipher Cipher/DECRYPT_MODE aeskey iv)]

This is a little ugly because .init is used only for its side effect and we ignore its result.

Decrypting the text:

(String. (.doFinal cipher text))
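
For readers more at home in Python, here is a rough equivalent of the whole decrypt function, assuming a recent version of the third-party cryptography package is installed (the 16-space IV and the PKCS5 padding match the Clojure code above):

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def decrypt(aes_key: bytes, text: bytes) -> str:
    iv = b" " * 16  # same initialization vector: 16 spaces
    cipher = Cipher(algorithms.AES(aes_key), modes.CBC(iv))
    decryptor = cipher.decryptor()
    padded = decryptor.update(text) + decryptor.finalize()
    return padded[:-padded[-1]].decode("utf-8")  # strip the PKCS5 padding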

Now that we know how to decrypt a value and detect encrypted values, we can put it all together into decrypt-cookie, a function that decrypts a cookie:

(defn decrypt-cookie
  "Given a cookie, return a cookie with decrypted value"
  [aes-key cookie]
  (-> cookie
      (assoc :value
             (if (is-encrypted? cookie)
               ;; Drop 3 removes the leading "v10"
               (->> cookie (:encrypted_value) (drop 3) (byte-array) (decrypt aes-key))
               (-> cookie (:value))))
      ;; Remove the unused :encrypted_value entry
      (dissoc :encrypted_value)))

Note that we use the ->> macro instead of -> that we used before. Instead of threading the expression as the first argument of each function like ->, ->> threads the expression as the last argument.

Putting it all together

It all comes together in the main function that we showed above:

(defn -main
  [& args]
  (if (not= 1 (count args))
    (.println *err* "Usage ./program <site-url>")
    (let [site-url (first args)
          aes-key (build-decryption-key (get-chrome-password-osx))
          db-spec {:dbtype "sqlite" :dbname (temp-file-copy-of cookies-file-osx)}
          query (str "select * from cookies where host_key like '" site-url "'")
          cookies (jdbc/query db-spec [query])]
      (println
       (json/write-str
        (map (partial decrypt-cookie aes-key) cookies)))
      (System/exit 0))))

Here we map over all the relevant cookies from the database, using a function to decrypt each one. We use partial to build a function with the first argument, aes-key, locked in.
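
If you are coming from Python, partial works just like functools.partial. A rough analogue, using hypothetical Python counterparts of the Clojure names, would be:

from functools import partial

decrypt_with_key = partial(decrypt_cookie, aes_key)  # aes_key locked in
decrypted = [decrypt_with_key(c) for c in cookies]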

Conclusion

In this guide, we covered how to extract and decrypt Chrome cookies. We also looked at some Clojure idioms. Again, if you want to do this in Python, you should check out browsercookie. The full code for this project is available on GitHub.

Automate your laptop setup with Ansible

Introduction

You don't have to set up a new laptop every day. But if you have done it recently, you probably remember that it was time-consuming. In fact, the most complex setups can take weeks to reproduce because of all the details and configuration settings. This post will teach you how to automate your laptop setup with Ansible. The examples I will show are for MacBooks but can be applied to any kind of laptop.

Prerequisites

To follow this tutorial, you must have Ansible and Homebrew installed.

To install homebrew:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

To install ansible:

sudo easy_install pip
sudo pip install ansible

Step 1 - Hello ansible

Ansible is configured with YAML files. It expects the files to be organized following this convention:

├── inventory -------------> Mapping of hostname to group name
├── roles
│   ├── development
│   │   ├── files
│   │   │   └── tmux.conf -> A file used by the role
│   │   └── tasks
│   │       └── main.yml --> Tasks for the role development
└── site.yml --------------> Mapping of group name to role
  • inventory maps hostnames (in our case localhost) to a group name (in our case laptop)
  • site.yml maps group names (laptop here) to roles (here we have one role: development)
  • roles/development is a role; it defines the actual operations that Ansible will run

Before using ansible to set up our laptop, let's build a hello world example to make sure everything is set up properly. Create the following file structure on disk:

├── inventory -------------> Empty file 
├── roles
│   ├── development
│   │   └── tasks
│   │       └── main.yml --> Empty file
└── site.yml --------------> Empty file 

As we said, the inventory maps hostnames to group names. Update the content of inventory with:

[laptop]
localhost

This means that we declare a group "laptop" that contains one host: "localhost".

Next, let's look at the main entry point: site.yml. It maps group names to roles and is called a playbook. Update the content of site.yml to associate the group laptop with the role development:

- hosts: laptop
  roles:
    - development

Finally, let's add some code to main.yml at roles/development/tasks/main.yml to write Hello ansible to the file /tmp/log:

- name: Hello ansible
  shell: echo "Hello ansible" >> /tmp/log

You can run this playbook with:

ansible-playbook -i inventory site.yml -c local

-c local tells Ansible not to try to connect through ssh but rather run the playbook (site.yml) locally.

The run should have created a file /tmp/log containing Hello ansible.

Now that we have a basic structure in place we can add more roles and more useful tasks!

Step 2 - Installing packages with homebrew

We can install packages with homebrew using the following construct:

- name: Install apps with homebrew
  homebrew: name={{ item }} state=present
  with_items:
    - wget
    - vim
    - tmux
    - htop
    - ag
    - git
    - python3
    - zsh 
    - bash
    - reattach-to-user-namespace
    - hugo
    - graphviz
    - mosh

Notice the with_items key that lets you specify multiple items for a step!

Step 3 - Installing apps

Did you know that Homebrew can also install apps? To do so, you can use Homebrew Casks:

- name: Install cask packages 
  homebrew_cask: name={{ item }} state=present
  with_items:
    - iterm2
    - spectacle 
    - 1password
    - google-chrome
    - dropbox

Step 4 - Copying dot files

What if you need to copy config files? Just put them under the files folder of your role (alongside the tasks folder) and use: "{{ role_path }}/files/name_of_a_file" to refer to those files. Here is an example:

- name: Copy tmux configuration 
  copy:
    src: "{{ role_path }}/files/tmux.conf"
    dest: ~/.tmux.conf

Step 5 - Managing configuration in repositories

While keeping files alongside your Ansible configuration is an option, lots of people like to store their files in a separate repository. You can clone repositories using the git module:

- name: Clone repo
  git:
    repo: https://github.com/foo/dotfiles
    dest: ~/dotfiles
    update: no

In that case, for dotfiles, you would then create links between the files in the repository and your home folder.

Step 6 - Setting up system defaults

We nearly covered everything you need to automate the setup of a laptop, but one important topic is missing: System Defaults.

When you change a setting in Mac OS, for example, the key repeat setting:

Key Repeat Settings
Key Repeat Settings

It gets written as a system default. You can look up the names of those defaults online and replicate them using the osx_defaults module.

For example, my key repeat settings that I showed above look like this in Ansible:

- osx_defaults:
    key: KeyRepeat
    type: int
    value: 2

- osx_defaults:
    key: InitialKeyRepeat
    type: int
    value: 15

Conclusion

If you follow Ansible best practices, rerunning a playbook on a system already configured should be idempotent (it should be a no-op). I encourage you to write your laptop setup as an Ansible config and reflect any change to it as you go to keep it up to date.

Now that you know how to set up laptops programmatically you can think about areas to apply this skill! For example, if your company does not set up laptops automatically, you can suggest it as a project that will have a significant impact and speed up onboarding.

Also, if you are interested in more content about Ansible, check out: Set up digital ocean block storage with Ansible.

Set up digital ocean block storage with Ansible

Ansible is a great tool for automating environment setup and deployment. This article shows how to programmatically set up Digital Ocean block storage with Ansible. I started by reading this article on how to set it up manually on Digital Ocean and built an Ansible runbook inspired by it. The complete runbook is available as a gist; we will build it piece by piece in this post.

Digital Ocean block storage

Digital Ocean lets you create volumes and attach them to droplets. For those of you more familiar with Amazon Web Services, Digital Ocean volumes are to droplets what EBS volumes are to EC2 instances.

Analogy between AWS and Digital Ocean
Analogy between AWS and Digital Ocean

When you create a volume in Digital Ocean and attach it to a droplet, it shows up in the /dev/disk/by-id folder. It is not formatted into partitions and does not have a file system.

Goals

The goal of our runbook is to automate the setup of a new volume attached to a droplet so that it can be used by applications. I use this runbook in my chatbot before installing a Redis datastore that uses the volume. The runbook should also be a no-op on droplets that are already set up.

Finding the name of the volume

Let's start by defining the first step to get the volume name. We look at what's in /dev/disk/by-id and discard entries that contain part (they refer to partitions of disks instead of the whole volume). We store the name of the volume for the future steps:

- name: Get the volume name
  shell: ls /dev/disk/by-id/ | grep -v part
  register: volume_name_raw

- set_fact:
    volume_name: "{{ volume_name_raw.stdout }}"

Now, we want to be able to run this recipe multiple times, so let's find a way to know if the disk is already set up. We can do that by checking /etc/fstab, which describes the disks to mount on boot:

- name: Check if the volume is already setup
  command: grep '{{ volume_name }}' /etc/fstab -q
  register: volume_present
  ignore_errors: True

Formatting the volume

All the following steps set up the volume as ext4 and should not run if the volume is already set up:

- name: Label the volume
  command: parted /dev/disk/by-id/{{ volume_name }} mklabel gpt
  when: volume_present|failed

- name: Create an ext4 partition
  command: parted -a opt /dev/disk/by-id/{{ volume_name }} mkpart primary ext4 0% 100%
  when: volume_present|failed

- name: Build the ext4 metadata
  command: mkfs.ext4 /dev/disk/by-id/{{ volume_name }}-part1
  when: volume_present|failed

- name: Create the mount point
  command: mkdir -p /mnt/data
  when: volume_present|failed

Mounting the volume and persisting the mount

Finally, the volume is ready; we want to ensure that it is mounted and configured in fstab. This will be a no-op for a volume that is already mounted and present in fstab:

- name: Mount volume read-write
  mount:
    path: /mnt/data
    src: /dev/disk/by-id/{{ volume_name }}-part1
    fstype: ext4
    opts: defaults,discard
    state: mounted

As I wrote previously, you can then use the volume for various applications. I use it to persist a Redis database for my bot. It is really handy to have an Ansible runbook to automate the setup. The complete runbook with all the steps is available as a gist.

Don't forget to also check out the articles discussing how I built my chatbot using Digital Ocean: Syncing org mode reminders to my bot and Managing my todos, notes, and reminders.

Syncing org mode reminders to my bot

This post is about the chatbot I created to manage my reminders and notes. If you missed the first post introducing the bot, check out this post.

In this post, I describe how I sync reminders from org mode on my laptop to my bot.

Design

To sync the reminders I decided on the following approach:

  • Parse the current file with the org mode parser
  • Find all the reminders
  • Put them all in a format that the backend understands into a file
  • Sync the reminders to the datastore
  • Notify all the backends using the reminders that they should reload

Implementation

Reminders format

I currently store my reminders in plain text using org mode. A reminder typically looks like this:

* Columbia CS@CU Happy Hour
:PROPERTIES:
:REMINDER_DATE: <2017-07-15 Sat 12:00>
:REMINDER_TARGET: laurent
:REMINDER_QUICK_REPLY: yes
:END:

Parsing the org mode file

The following function parses the currently open org mode file and iterates through all the headlines. It keeps only the ones that pass the test laurent/valid-reminder.

  (defun laurent/reminders-current-buffer ()
    "Return the list of org element that represent reminders
     from the current buffer"
    (let ((reminders
          (org-element-map (org-element-parse-buffer) 'headline #'identity)))
      (cl-remove-if-not #'laurent/valid-reminder reminders)))

Filtering valid reminders

So what is a valid reminder? We looked at an example above:

* Columbia CS@CU Happy Hour
:PROPERTIES:
:REMINDER_DATE: <2017-07-15 Sat 12:00>
:REMINDER_TARGET: laurent
:REMINDER_QUICK_REPLY: yes
:END:

It should contain a target and some date specification. The date can be absolute, like above, or repeating:

* Gym Monday
  :PROPERTIES:
  :REMINDER_TARGET: laurent
  :REMINDER_HOUR: 18
  :REMINDER_MINUTE: 0
  :REMINDER_WEEKDAY: 0
  :REMINDER_MESSAGE: It's monday night, workout night! 
  :END:

org-mode's element API exposes functions to access the properties of elements; let's use them to express what a valid reminder is:

  (defun laurent/valid-reminder (x)
     "Given an org element returns truthy if it
      is a valid reminder, aka it has a target and
      some time specs supported by the system"
    (and
      (org-element-property :REMINDER_TARGET x)
      (or
        (org-element-property :REMINDER_DATE x)
        (org-element-property :REMINDER_HOUR x)
        (org-element-property :REMINDER_MINUTE x)
        (org-element-property :REMINDER_WEEKDAY x))))

So far, we have parsed the current buffer and collected a list of elements that are valid reminders. Time to convert them into a format that the backend understands: JSON!

Formatting reminders in JSON

Here is how I convert the reminders. I used json-encode to encode the reminder list.

(defun laurent/write-reminders (filename)
  "Write reminders of the current buffer to a file as JSON"
  (let ((reminder-list 
          (mapcar #'laurent/format-reminder (laurent/reminders-current-buffer))))
    (with-temp-buffer
      (insert (json-encode reminder-list))
      (json-pretty-print-buffer)
      (write-file filename))))

As you can see, we don't encode the result of laurent/reminders-current-buffer directly, but preprocess each reminder with laurent/format-reminder.

This is because the backend expects all the fields of the reminders to be defined, even if they are null. It also expects every entry in the JSON to be properly cast.

That's the responsibility of laurent/format-reminder: shaping each reminder to look like what the backend expects:

  (defun laurent/tonum (x)
     "If x is nil return it, otherwise cast x to a number"
     (if (eq x nil)
        nil
       (string-to-number x)))

  (defun laurent/format-reminder (x) 
    "Given an org element representing a reminders 
     make it into a list of cons cell key value pair.
     Keeping only the properties relevant for reminders and
     casting all properties to their expected type"
    (list
      (cons 'title (car (org-element-property :title x)))
      (cons 'target (org-element-property :REMINDER_TARGET x))
      (cons 'hour (laurent/tonum (org-element-property :REMINDER_HOUR x)))
      (cons 'minute (laurent/tonum (org-element-property :REMINDER_MINUTE x)))
      (cons 'day_of_week (laurent/tonum (org-element-property :REMINDER_WEEKDAY x)))
      (cons 'quick_reply (org-element-property :REMINDER_QUICK_REPLY x))
      (cons 'date
            (let ((date (org-element-property :REMINDER_DATE x)))
              (if (eq date nil) nil (substring date 1 -1))))
      (cons 'timezone "America/Los_Angeles")
      (cons 'message  (or (org-element-property :REMINDER_MESSAGE x) (car (org-element-property :title x))))))

Syncing the reminders to the datastore

I wrote a quick Python script to copy a file to a Redis key. It assumes that Redis is running on localhost on the usual Redis port.

#!/usr/bin/env python3
import redis
import sys
import time

reminder_file = sys.argv[1]
attempt = 0

with open(reminder_file) as f:
    while attempt < 5:
        print("Attempting connection")
        try:
            attempt += 1
            r = redis.Redis()
            r.ping()
            break
        except Exception as e:
            print(e)
            print("Retrying in 0.5s")
            time.sleep(0.5)
    else:
        # The while loop exhausted its attempts without breaking
        print("Failed to connect")
        sys.exit(1)
    print("Syncing reminders")
    r.set("reminders.json".encode("utf-8"), f.read())
    print("DONE adding the reminders to REDIS")

To call it, I first set up port forwarding to the remote Redis instance:

  (defun laurent/start-port-fwd (server port)
    "Start port forwarding to a server (ex: \"root@12.12.12.12\") and a port 
     like \"4564\""
    (start-process 
      "port-forwarding" 
      "*port-fwd*" 
      "ssh" "-L" (concat port ":localhost:" port) server "-N" ))

All that is left is to tell the backend that the reminders have been updated and put it all together in an interactive function I can call:

(defun laurent/sync-reminders ()
  "Export, sync reminders to the server and reload the reminders on the bot"
  (interactive)
  (let* ((server "root@12.12.12.12")
          (redisport "6379")
          (repopath "~/repos/docker_apps")
          (syncprogram (concat repopath "/APPS/backend/add_reminders.py"))
          (tempfile "/tmp/reminders.json")
          (reloadurl "localhost:3000/reload_reminders")
          (reloadcmd (concat "\"for id in \\$(docker ps -q --filter 'name=backend'); do  docker exec -t \\$id  curl -X POST " reloadurl " ;done\""))
          (port-fwd-process (laurent/start-port-fwd server redisport)))
    (message "== Starting Sync ==")
    (message "Exporting reminders")
    (laurent/write-reminders tempfile)
    (message "Copy reminders on server")
    (message (shell-command-to-string (concat syncprogram " " tempfile)))
    (delete-process port-fwd-process) ;; Stop port forwarding
    (message "Reloading reminders")
    (message (shell-command-to-string (concat "ssh " server " " reloadcmd)))
    (message "== DONE with Sync ==")))

Managing my todos, notes, and reminders

This post is about the chatbot I use to manage my reminders and notes.

Moving from many apps to one bot

Before, when I wanted to be reminded of something, I would use the Reminders app on my iPhone; to take notes, I would use the Notes app; and for todos, Wunderlist.

Recently, I have tried to simplify my apps in order to focus more and avoid context switching. I have decided to rely on org mode, a mode that ships with emacs (I actually use spacemacs: http://spacemacs.org/).

Org mode is for keeping notes, maintaining TODO lists, planning projects, and authoring documents with a fast and effective plain-text system.

- Org mode definition from org mode's site

I use org mode for my todos, my notes, my reminders and many other things on my laptop.

Unlike the Notes and Reminders apps, org mode does not integrate well with the iPhone. So I decided to devise an interface between all my org mode content and my phone, bi-directional and easy to use. This interface is a bot.

Building a bot

There are many resources and tools you can use to build a bot. My favorite is this book: http://shop.oreilly.com/product/0636920057741.do. It highlights some of the basic principles to respect to make your bot useful, friendly and manageable (from a developers perspective).

I decided to build my bot from scratch instead of relying on prebuilt services that make a bot for you, as I wanted to learn. My goals were:

  • improve my knowledge of docker-compose, python3, Digital Ocean and Ansible
  • learn how to build regression testing with traffic replay
  • support both Telegram and Facebook Messenger while sharing code logic
  • be able to deploy the bot on a brand new server in less than 2 minutes
  • manage the state externally (on a separate server)
  • have a repl interface for the bot too

Architecture of the bot
Architecture of the bot

I can deploy both the Facebook and Telegram bots separately using docker-compose. Docker Compose is a way to specify container configurations and links between containers forming an app. I use it in conjunction with Ansible to orchestrate the deployment of the bot. Here is the docker-compose file for the Telegram bot (note that the frontend in this file is the same as the adapter above):

version: "2"
networks:
  internal:
services:
  telegram_backend:
    build: ../../APPS/backend
    image: "backend:1.0"
    volumes:
     - ../../CONFIG/backend:/usr/src/app/config
    environment:
     - FRONTEND_URL=http://telegram_frontend:5000/reply
     - REDIS_HOST=redis.laurentcharignon.com
    networks:
      internal: 
        aliases:
         - telegram_backend
  telegram_nginx:
    image: "nginx"
    volumes:
     - ../../CONFIG/nginx/config/telegram_nginx.conf:/etc/nginx/nginx.conf:ro
     - ../../CONFIG/nginx/config/ssl-params.conf:/etc/nginx/snippets/ssl-params.conf:ro
     - ../../CONFIG/nginx/certificates/fullchain.pem:/usr/src/fullchain.pem:ro
     - ../../CONFIG/nginx/certificates/privkey.pem:/usr/src/privkey.pem:ro
     - ../../CONFIG/nginx/certificates/dhparam.pem:/etc/ssl/certs/dhparam.pem:ro
    networks:
      internal:
        aliases:
          - telegram_nginx
    depends_on:
      - "telegram_frontend"
    ports:
      - "8443:8443"
  telegram_frontend:
    build: ../../APPS/telegram_frontend
    image: "telegram_frontend:1.0"
    environment:
     - BACKEND_URL=http://telegram_backend:3000
     - ADVERTISE_URL=https://bot.laurentcharignon.com:8443
    volumes:
     - ../../CONFIG/telegram_frontend:/usr/src/app/config
    networks:
      internal:
        aliases:
         - telegram_frontend
    depends_on:
      - "telegram_backend"

How to interact with the bot?

Typically, bots use slash commands. I decided not to use them because the feature set of my bot is so limited:

  • Reminding me of things
  • Asking me to acknowledge reminders
  • Logging tasks and todos

I can send:

help

to get some help, or:

gtd

to turn on/off a recording session of notes; whatever I send will then be persisted for further processing. When recording notes, multiple formats are supported to route the request to different note files.

Interaction with the bot
Interaction with the bot

The bot is named Pascal, by the way!

Emacs integration

The main reason why I built Pascal was to be reminded of things to do. I created a module to parse my org-mode files and set up reminders in the bot.

Entries for reminders in my org mode files look like this:

* Columbia CS@CU Happy Hour
   :PROPERTIES:
   :REMINDER_DATE: <2017-07-15 Sat 12:00>
   :REMINDER_TARGET: laurent
   :REMINDER_QUICK_REPLY: yes
   :END:

This is plain text. The first line is

* Columbia CS@CU Happy Hour

It is a heading: it starts with a star. In Markdown, you would achieve a similar result using #.

The heading is followed by:

:REMINDER_DATE: <2017-07-15 Sat 12:00>
:REMINDER_TARGET: laurent
:REMINDER_QUICK_REPLY: yes
:END:

This is called a property drawer; it stores key/value pairs associated with an entry. Here we have three key/value pairs. The first one is a date (when the reminder will fire), the second is a target (who to notify: me), and the last one configures quick replies. When I toggle quick replies, the bot displays buttons asking me to acknowledge that I acted on the reminder.

Here is another example of a weekly reminder, every Monday at 6pm, I go to the gym:

* Gym Monday
  :PROPERTIES:
  :REMINDER_TARGET: laurent
  :REMINDER_HOUR: 18
  :REMINDER_MINUTE: 0
  :REMINDER_WEEKDAY: 0
  :REMINDER_MESSAGE: It's monday night, workout night! 
  :END:

I create these programmatically with a template and sync them with the datastore powering the bot every day.

Changeset Evolution

Changeset What?

Mercurial is a distributed version control system, similar to git. If you have not tried it yet, you really should!

I work on Mercurial, and as you already know, I love to automate everything. If you use git or Mercurial today, you know that source control is not trivial: workflows could be easier and require less manual intervention and dark magic.

Changeset evolution is a proposal to make source control less error-prone, more forgiving and flexible. I will use changeset evolution and evolve interchangeably. Pierre-Yves David created changeset evolution, and you can see his talk at FOSDEM 2013.

The history of commits does not exist

Let’s start with an example. Assume that a user committed b on top of a:

Before running the amend command
Starting point

After making some changes, the user runs hg commit --amend (like git commit --amend) and decides to call the new commit b’:

After running the amend command
After amend

Under the cover, the amend command creates a new commit; the old revision is still there, just hidden:

Under the cover
b didn't disappear yet, it is hidden

For the user, b’ is a newer version of b. Even though the intent of amending is clear, no information about this intent is recorded in the source control system!

If the user wants to access, let’s say a week from now, what b was before the amend, he or she will have to dig through the reflog to find the hash of b.

What if we could record that b’ is the successor of b?

Defining the commit history with obsolescence markers

Changeset evolution introduces the concept of obsolescence markers to represent that a revision is the successor of another revision. I will represent the obsolescence markers with dotted lines in the following graphs. In the example above after running hg commit --amend we would have:

Recording that b' is the successor of b with an obsolescence marker. b is the precursor of b'

And after running hg commit --amend again:

Two amends: b" is the successor of b' and b' is the successor of b

All this is happening under the hood, and the user does not see any difference in the UI. It is just some extra information that is recorded that can be used by commands as we will see in the next section.

Simplify rebases, go back in time and don’t make mistakes

Let’s see how we can seamlessly use the obsolescence markers to simplify the life of the user through three examples.

1. Easily accessing a precursor:

Consider the situation discussed above:

After hg commit --amend

We can give the user some commands to access precursors of revisions to compare them or manipulate them. After running the amend, you can easily:

  • Go back to the previous version (without using the reflog)
  • Figure out what changes the amend introduced.

The reflog (git or mercurial) is a command to list the successive location of the head of the repository and all its branches or bookmarks. It is a list of lines with the format: “hashes command” and shows the working copy parent (i.e. current commit) after each command. It is used to recover from mistakes and go back to a previous state.

2. Rebasing with fewer conflicts:

It is common to have a testing/continuous integration system run all the tests on a revision before pushing it to a repository. Let’s assume that you are working on a feature and committed b and c locally.

Before pushing b to the server on top of d

Satisfied with b, you send it to the CI system, which pushes it onto remote/master on the server. When you pull, you will have:

Pushing a commit can also add a marker

If you pull one hour later (assuming other people are very productive :D), you will have a situation like this:

Your colleagues have been productive and pushed many new changes since you last pulled

And if you try to rebase your stack (b and c) on top of master, you will potentially have conflicts applying b because of the work of another developer. This could happen if that developer changed the same files you changed in b. But in that case, you know that the person already resolved the conflicts once when applying their work on top of newb. The user should not have to do a merge and resolve conflicts in that case, and obsolescence markers can help resolve this. What if, on pull, the server could tell you that newb is the new version of b:

When rebasing the stack, the first commit can be omitted

This way when you rebase the stack, only c gets rebased, b is skipped, and you cannot get conflicts from the content in b.

3. Working with other people

Let’s assume that you start from this simple state:

Starting point

You and your friend make changes to the revision b. You create a new version of b called b’ and your friend creates a new version of b called b”.

The first developer rewrote b
The second developer rewrote b as well

Then you decide to put your work together. For example, you can do that by pulling from each other's repository. The obsolescence markers and revisions are exchanged, and you end up with the following state:

b has two successors, b' and b'' are called divergent

In git or vanilla (no extension) Mercurial, you would have to figure out that b’ and b” are two new versions of b and merge them. Changeset evolution detects that situation and marks b’ and b” as divergent. It then suggests an automatic resolution with a merge and preserves history.

Everything gets resolved intelligently

The graph might seem overcomplicated, but once again, most things are happening under the hood and the UI impact is minimal. These examples show one of the benefits of working with changeset evolution: it provides automatic resolution of typical source control issues.

As we will see in the next section, Changeset Evolution does much more than that and gives developers more flexibility when working with stacks of commits.

A more flexible workflow with stacks

Changeset evolution defines the concept of an unstable revision, a revision based on an obsolete revision. From the previous section:

c is unstable because it is based on b, and b has a new version

Evolve resolves instability intelligently by rebasing unstable commits onto a stable destination, in the case above newb. But it does not force the user to resolve the instability right away, and therefore allows more flexibility when working with stacks. Consider the following stack of commits:

A stack of commits

A user can amend b or c without having to rebase d.

We rewrote b and c, so c' and d are now unstable

And when everything looks good changeset evolution can figure out the right commands to run to end up with the desired stack:

The desired stack after running evolve

If the user was not using changeset evolution, he or she would have to rebase every time anything changes in the stack. Also, the user would have to figure out what rebase command to run and could potentially make mistakes!

What I didn’t cover

  • Working collaboratively with stacks
  • Markers defining multiple precursors (fold) and multiple successors (split)
  • And a lot of other things

How to install evolve and start playing with it

  1. Install mercurial
  2. Clone evolve’s repository with hg clone http://hg.netv6.net/evolve-main/
  3. Add the following configuration to your ~/.hgrc, with the correct path from the repo you just cloned:

     [extensions]
     evolve = path to/evolve.py


A Test Automation Story

I find that doing repetitive computer-related tasks is time consuming, error-prone and frustrating. I am a strong believer in automating everything and I have this mantra:

Automate anything that you did twice as you will certainly have to do it a third time

This article describes my approach to automate a time consuming task in my daily workflow.

I am working on Mercurial. Whenever I work on a new series of patches, I want to ensure that the whole test suite passes on every patch of the series. In the end I made this task completely automated and transparent.


Iteration Zero: run the tests and go grab a drink with a friend

The very first time I ran the Mercurial tests, I stared at my screen while the test results were slowly displayed. After a while, I went to grab coffee, came back, and all the tests had run. I swore to myself that I would never again waste time staring at the screen waiting for tests to finish. The first step to successful automation is getting frustrated with doing things manually. Some people never get frustrated by that, but I have a pretty low tolerance for repetitive tasks!

I am not saying that Mercurial tests are too slow; they typically take less than 5 minutes to run. For tests that take one hour to run, people don't hesitate to automate the process. For tests that take 2 minutes, most people think they can stand staring at their screen, and it is a mistake! Automate all the things, don't manually do the same thing twice, and you will improve!

First iteration: Open a new window, type some commands, check later

The first improvement I made was opening a new terminal window to run all the tests. This was better than waiting for the tests to finish. If you are not using a terminal multiplexer, I strongly advise you to do so; check out tmux. While running the tests, I had to be careful not to modify any files to avoid messing up the test results.

Second iteration: Isolate testing and work environment

Soon after, I decided to have two copies of the Mercurial repository. I kept one copy to run the tests entirely in RAM (for the sake of speed) and one copy on disk on which I was working. After finishing a patch, I would push my changes to the test repository and run the tests without being blocked from making further changes. This created a separation between the test environment and the development environment, effectively overcoming the main issue of the first iteration.

Despite being an enhancement over the previous iteration, four key things were missing:

  1. I still had to type commands to push the changes, checkout the revisions and run the tests
  2. If I had to test 10 patches, I would have to launch them one by one
  3. I had to look through the test results and take note of what passed and what didn’t pass
  4. There was no neat way for me to see which changes passed the tests and which didn't

Third iteration: Avoid unnecessary typing

I got fed up with typing the commands to push my changes, check them out and run the tests. I did some research and figured out that tmux lets you open new windows and run commands in other windows programmatically. I used this API and scripted tmux to open a new window to run my tests.
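
If you want to do something similar, tmux is easy to drive from a script. Here is a minimal Python sketch of the idea; the window name and the test command are placeholders rather than my actual setup, and it assumes you are already inside a tmux session:

import subprocess

def run_tests_in_new_window(command="make tests"):
    # Open a new tmux window named "tests" and run the command in it
    subprocess.run(["tmux", "new-window", "-n", "tests", command], check=True)

run_tests_in_new_window()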

Fourth iteration: Getting tired of reading test output

I realized that our test runner could produce a JSON report with the results of each test. There was a bug where the output was not valid JSON, and it didn't contain some of the information I needed, so I sent patches upstream. Once fixed, I extended my previous automation to read the report, archive it, parse it and give me a summary. I decided to store all this information in SQLite to easily query it later.
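
The exact report format does not matter here, but the idea looks roughly like this in Python (the report fields and table schema below are hypothetical, not the actual format produced by the Mercurial test runner):

import json
import sqlite3

def archive_report(report_path, revision, db_path="testresults.db"):
    # Store one row per test so results can be queried per revision later
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS results
                    (revision TEXT, test TEXT, outcome TEXT)""")
    with open(report_path) as f:
        report = json.load(f)  # e.g. {"test-foo.t": {"result": "success"}, ...}
    for test, data in report.items():
        conn.execute("INSERT INTO results VALUES (?, ?, ?)",
                     (revision, test, data["result"]))
    conn.commit()
    conn.close()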

Fifth iteration: When it gets serious

My main issue at that point was that I was launching the tests one by one and being careful not to launch two at a time. I added a task queue to my system using Celery to separate launching the tests from running them. This way, I can also run the tests on multiple machines in the future or run unrelated tests in parallel. At this point, I hooked up tests for other repositories, not just the core Mercurial repository. I built a command line tool to easily select what tests to run and to query the test results and failures.

List of changesets and test results
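
Conceptually, the queueing part boils down to a few lines of Celery. This is only a sketch, assuming a local Redis broker; the names are placeholders rather than my actual code:

from celery import Celery

app = Celery("testrunner", broker="redis://localhost:6379/0")

@app.task
def run_tests(repo, revision):
    # A worker picks this up, checks out `revision` in `repo` and runs the
    # test suite; launching a run is then just: run_tests.delay(repo, rev)
    ...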

One thing was missing, I still had to read the reports to know when the tests were passing.

Sixth iteration: Cherry on the cake: a hud, colored labels, and vim integration!

It is extremely easy to write Mercurial extensions; it can take as little as 5 lines of Python to create a useful feature. I wrote an extension that adds an overlay on top of the list of commits in my repository to show me whether the tests passed for each revision:

Changelog
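
To give you an idea of how little code an extension needs, here is a toy sketch (not my actual overlay extension, and the registration API differs slightly between Mercurial versions):

# myext.py -- enable it with "myext = /path/to/myext.py" in the [extensions]
# section of your .hgrc
from mercurial import registrar

cmdtable = {}
command = registrar.command(cmdtable)

@command(b'hellotests', [], b'hg hellotests')
def hellotests(ui, repo, **opts):
    """toy command: this is where the test-status overlay logic would go"""
    ui.write(b"test results would be displayed here\n")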

I added a status bar in tmux to inform me of what is being tested and I am thinking of adding an ETA for the tests.

Finally, I added shortcuts in vim to launch tests against the current revision, the current stack and see test results.

Future iterations: Dependencies, packaging, and distribution

I installed this automation on three machines already and it works great! Moving forward, I want to distribute this system to more people. I will write more about it and talk about implementation details in other articles.

Introduction to Mechanize with Python

Disclaimer: make sure to check the terms of use of the website you plan to interact with; a lot of websites forbid interaction from automation. Don't do anything that could get you in trouble.

mechanize is a library for interacting with websites. It fits in between high-level browser automation tools like Selenium and HTTP libraries like requests. It doesn't handle JavaScript; if that's an issue for you, you should consider CasperJS. The big advantage of using mechanize compared to a higher level library is speed: it is an order of magnitude faster!

I use the following boilerplate code for all my programs with mechanize:

Set up user agent

Load cookies from a file

If the cookies are expired:
    Go through the login flow

Interact with the website

Persist the cookies to a file

If you do the following instead:

Go through the login flow

Interact with the website

Then you will end up going through the login flow as many times as you run the script. Not only is this inefficient, but you would also risk being blacklisted by the website's owner for making too many requests. Let's see what this code looks like in practice.

Boilerplate

Assuming that you want to log into a website and read a page that shows some JSON content, you would do something like this:

import os
import json
import cookielib

import mechanize

def getbrowser():
    br = mechanize.Browser()
    br.set_handle_redirect(True)
    br.addheaders = [('User-agent', 'XXX')] # Set your user agent here
    return br

def loadcookiejar():
    cj = cookielib.LWPCookieJar()
    if os.path.exists("cookies"):
        cj.load("cookies")
    return cj

def main():
    br = getbrowser()
    cj = loadcookiejar()
    br.set_cookiejar(cj)
    if len(cj) != 1:
        url = 'https://XXX'
        br.open(url)
        # Select the first form
        br.select_form(nr=0)
        # Fill in some information
        br["email"] = "XXXXXXX"
        br["password"] = "YYYYYYY"
        # Actually log in and set the cookies
        br.submit()
    # The page you are interested in
    r = br.open("https://YYYY")
    # Assumes that the content is JSON, otherwise use r.read()
    print json.loads(r.read())
    cj.save("cookies")

if __name__ == "__main__":
    main()

Now, some of you might wonder: "How do you figure out what the URLs/parameters are for the stuff that you are interested in?" It is actually fairly easy to gather that information.

Figure out the API behind a website

This explanation assumes that you are using Google Chrome; similar tools exist for many other browsers.

Open the Developer Console and go to the Network Tab. Click on the button to start a recording and navigate to a page.

You will see a bunch of requests. Filter the kind that interests you: you generally want to look at the requests for web pages and XHR. Click on both and see if you find something interesting.

In my case, I am looking at a website that gives the location of my phone. I filtered the XHR requests:

XHR requests

It looks like the first entry is very interesting:

First entry

From there, right click on the request and you can copy the URL and information to make the request from Python or a terminal:

Exporting

Add Colors To Your Python Logs With Colorlog

colorlog is a drop-in replacement for the Python logging module that allows you to add color to your logs.

Start by installing the colorlog module with pip (or easy_install):

pip install colorlog

Then you can use colorlog the way you would use logging:

import colorlog
import logging

colorlog.basicConfig(level=logging.DEBUG)
colorlog.debug("debug")
colorlog.info("info")
colorlog.warning("warning")
colorlog.error("error")
colorlog.critical("critical")

And your logs become much easier to read:

Color logging

Practical one time padding with Node JS

Source code: https://github.com/charignon/otpCSnode

Too long; didn't read: how I used Node JS & CoffeeScript to implement one-time padding of TCP traffic. Example uses: SSH, SCP and HTTP proxying.

One time padding

In cryptography, a one-time pad (OTP) is an encryption technique that cannot be cracked if used correctly. In this technique, a plaintext is paired with a random, secret key (or pad). Then, each bit or character of the plaintext is encrypted by combining it with the corresponding bit or character from the pad using modular addition. (Wikipedia: One-time pad)

In other words, one-time padding is a cryptographic technique using a key (that both sides know) as long as the message. Since the key is as long as the message, the message appears as random as the key and is uncrackable if the key cannot be guessed.

Technically, it is implemented with a simple XOR between the message and the key. It has been deemed quite impractical for most encryption needs because the key has to be as long as the data. But when you can afford to use it, it is the only uncrackable encryption technique. Don't take my word for it: the red telephone apparently used it, according to the same Wikipedia article.
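
To make the XOR idea concrete, here is a tiny Python illustration (not the Node implementation, just the math): applying the same key twice gives back the plaintext.

def otp(data: bytes, key: bytes, offset: int = 0) -> bytes:
    # XOR each byte of data with the corresponding key byte, starting
    # at `offset` into the key
    return bytes(b ^ key[offset + i] for i, b in enumerate(data))

key = bytes(range(256))                    # toy key, never reuse a real one
cipher = otp(b"hello world", key)
assert otp(cipher, key) == b"hello world"  # XOR twice == identity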

Key generation and distribution

We can easily create a key using the /dev/random file. For instance, to generate a 100MB key file named key, you could run:

dd if=/dev/random of=./key bs=1048576 count=100

In the next part I assume that you have this key on two different hosts.

Don’t reuse the key

Contrary to most encryption methods, one-time-pad keys are used only once. In our case, it means that our 100MB key can encrypt only 100MB, not one more bit!

To use the key for several sessions (for example several ssh sessions) without having to throw it away after each one, you must keep track of what you encrypted before. In other words, you need to know your offset into the key file.

Client and server code

To simplify, the key file is hardcoded and loaded fully in memory.

As an improvement, the key could be specified on the command line and loaded in chunks at runtime.

Server code

The server connects to a service like SSH, an HTTP proxy, a VoIP server, etc. It allows the client to talk to the service by:

  • Encrypting the traffic from the service and forwarding it to the client
  • Decrypting the traffic coming from the client and forwarding it to the service

net = require("net")
otp = require("./otp")
argv = require('minimist')(process.argv.slice(2))

expectedArgs = ["localPort", "servicePort", "serverOffset", "clientOffset"]
otp.validateArgs(argv, expectedArgs)

servicePort = Number(argv.servicePort)
localPort = Number(argv.localPort)
serverOffset = Number(argv.serverOffset)
clientOffset = Number(argv.clientOffset)

net.createServer((outBoundSocket) ->
  inBoundSocket =  net.createConnection(servicePort, "localhost")
  outBoundSocket.pipe(otp.encryptor("client", clientOffset)).pipe(inBoundSocket)
  inBoundSocket.pipe(otp.encryptor("server",serverOffset)).pipe(outBoundSocket)
).listen Number(localPort)

The first part of the code is straightforward, we just parse the arguments and validate them.

Then we create a tcp server bound to localPort and when a client connects to this server we:

  • Create a connection to the service
  • Wire the connection between the service and the client, with an offset for each encryptor: the offset is in bytes from the start of the key file

Client code

Very similar to the server code:

net = require("net")
otp = require("./otp")
argv = require('minimist')(process.argv.slice(2))

expectedArgs = ["localPort", "serverPort", "clientOffset", "serverOffset", "host"]
otp.validateArgs(argv, expectedArgs)

serverPort = Number(argv.serverPort)
localPort = Number(argv.localPort)
host = argv.host
serverOffset = Number(argv.serverOffset)
clientOffset = Number(argv.clientOffset)

outBoundSocket = net.createConnection(serverPort, host)
net.createServer((inBoundSocket) ->
  outBoundSocket.pipe(otp.encryptor("server",serverOffset)).pipe(inBoundSocket)
  inBoundSocket.pipe(otp.encryptor("client",clientOffset)).pipe(outBoundSocket)
).listen(localPort)

Library code

This is where the encryption is done, again, pretty straightforward:

through = require('through')
_ = require("underscore")
fs = require("fs")

# Keeping track of usage of key per entity
root = exports ? this
root.offsets = {}

# Load the key in memory
key = fs.readFileSync("key")

# Show usage notice
usage = (expected) ->
  console.log "Missing argument expecting #{expected}"
  process.exit 1

# Compute XOR of two buffers
xor = (v1,v2) ->
  new Buffer(_(v1).map((e,i) ->
  v2[i] ^ e
))

# Encryptor Through Stream, identifier is for accounting purposes of the offset
exports.encryptor = (identifier,offset) ->
  console.log "Init encryptor with offset #{offset}"
  _offset = offset
  through((data) ->
    end_offset = _offset + data.length
    @queue(xor(data,key.slice(_offset,end_offset)))
    _offset += data.length
    root.offsets[identifier] = _offset
    console.log root.offsets
)

# Validate arguments, actual = object from minimist, expected = array
exports.validateArgs = (actual,expected) ->
  actualKeys = _.keys(actual)
  _.each expected, (k) ->
    do usage(expected) if actualKeys.indexOf(k) == -1

# Show offset on CTRL-C to keep track of where we stopped
process.on 'SIGINT', () ->
  console.log "Logging offsets"
  console.log root.offsets
  process.exit 0

Encrypt ssh traffic

We have two computers called HostA and HostB; let's say that the ssh server is on HostB. Out of the 100MB key, we want to use the last 50MB for the server and the first 50MB for the client.

On HostB we start the encryption server:

coffee receiver.coffee --localPort=8000 --servicePort=22 --serverOffset=52428800 --clientOffset=0

On HostA we start the encryption client:

coffee sender.coffee --localPort=9000 --serverPort=8000 --host=HostB --serverOffset=52428800 --clientOffset=0

Then we can connect to HostB through the encryption tunnel with this command on HostA:

ssh localhost -p 9000

I also tried it with SCP and even an HTTP proxy and it worked fine! Let me know what you think of it!