Realizing What Protocols Are For

I enjoyed Douglas Crockford's talk on “JavaScript, the Better Parts”, particularly the middle section where Douglas describes all the things he has stopped doing in JavaScript and gives an insight into the style he uses day to day.

This-less programming

In particular, I found his avoidance of the this keyword interesting. Crockford advocates using constructor functions which return frozen objects, with the object's methods closing over the private variables of the constructor. In the spirit of the classic contrived OOP examples, let's consider a constructor function for a Dog object:

var Dog = function(spec) {

    var sound = spec['sound'];

    var bark = function() {
        return sound + "!!";
    };

    return Object.freeze({
        bark: bark
    });
};

Notably, this style of OOP completely avoids using the this keyword to refer to any other part of the object; everything is closed over by the functions on the resulting frozen object.

One nice side-effect of this is that an instance built by this constructor cannot be interfered with. Assigning to a property of the frozen object is silently ignored (or throws a TypeError in strict mode):

var pup = Dog({sound: "woof"});

pup.bark()  
// => "woof!!"

pup.bark = function() { return "meow"; };

pup.bark()  
// => "woof!!"

In Clojure

Being much more fond of Clojure than JavaScript, I opened a lein repl and figured out the equivalent Clojure code:

(defn Dog [spec]
  (let [sound (:sound spec)
        bark (fn [] (str sound "!!"))]
    {:bark bark}))

(def pup (Dog {:sound "woof"}))

((:bark pup))
;; => "woof!!"

At this point I realized that this pattern seemed kind of like what I had previously read about Clojure Protocols, a topic which I had not fully understood before. I looked up clojuredocs.org to read up on Protocols, and came up with the following solution:

(defprotocol Barking
  (bark [self]))

(defrecord Dog [sound]
  Barking
  (bark [self] (str (:sound self) "!!")))

Once the Barking protocol and the Dog record are defined, they can be used like so:

(def pup (->Dog "woof"))

(bark pup)
;; => "woof!!"

Which I found to be most pleasing.

Conclusion

Let's compare the two solutions. In JavaScript:

var Dog = function(spec) {

    var sound = spec['sound'];

    var bark = function() {
        return sound + "!!";
    };

    return Object.freeze({
        bark: bark
    });
};

And in Clojure:

(defprotocol Barking
  (bark [self]))

(defrecord Dog [sound]
  Barking
  (bark [self] (str (:sound self) "!!")))

The Clojure solution is quite a bit shorter than the JavaScript solution. It also indicates more clearly what is happening: there is a Barking protocol, consisting of a single function/method called bark, and the Dog type implements this function using its own sound property.

So Clojure solutions can be more concise and readable than the equivalent JavaScript? Sure, we knew that already, but that’s not what I found interesting. For me personally this was the light-bulb moment which demonstrated why I would ever want to think about using a Protocol over a plain old data-generating function in Clojure.

Welp, BedquiltDB version 2 is finally available. Improvements include:

  • New “Advanced” Query operators
  • A remove_many_by_ids operation
  • $created and $updated sort specifiers
  • skip and sort parameters to find_one
  • Marginal performance improvements
  • Better documentation
  • A whole new website
  • An official Docker image for an easy-to-use PostgreSQL/BedquiltDB server

Let’s take a look at the most significant new feature…

Query Operators

In BedquiltDB v1, queries could be expressed in terms of JSON “query documents”, which would be used to match against the contents of a collection. For example, the following query would return all articles whose authorId is "abcd" and with upvotes equal to 4:

articles.find({'upvotes': 4, 'authorId': 'abcd'})

Which is fine, but what if we want to get all documents with more than 4 upvotes, or where the status is not equal to "open"? In short, those queries were not possible without dropping down to the SQL layer.

In BedquiltDB v2 however, these queries are expressible with the use of Advanced Query Operators. These queries generally take the form:

{"someField": {operator: argument}}

… where operator is a string beginning with $, and the argument is some value that has meaning to the operator.

The following operators are supported:

  • $eq: asserts that some field is equal to some value
  • $noteq: asserts that some field is not equal to some value
  • $in: asserts that the field's value is a member of the supplied list of values
  • $notin: asserts that the field's value is not a member of the supplied list of values
  • $gt: asserts that the field is greater than some numeric value
  • $gte: asserts that the field is greater than or equal to some numeric value
  • $lt: asserts that the field is less than some numeric value
  • $lte: asserts that the field is less than or equal to some numeric value
  • $exists: if the supplied value is truthy, asserts that the field exists, otherwise asserts that the field does not exist
  • $type: asserts the type of a field value; valid arguments are “object”, “string”, “boolean”, “number”, “array” and “null”
  • $like: asserts that the field value is “like” the supplied match string, following the semantics of the PostgreSQL LIKE operator.
  • $regex: asserts that the field value matches the supplied regex string, following the semantics of the PostgreSQL ~ operator.

These operators can be mixed and matched freely within a query document, and the query will do The Right Thing™.

Examples:

collection.find({
    'upvotes': {
        '$gte': 4,
        '$lt':  10
    },
    'authorId': 'abcd',
    'metadata': {
        '$exists': True
    }
})

collection.find({
    "title": {
        "$regex": "^.*Elixir.*$"
    }
})

collection.find({
    "city": {
        "$notin": ["London", "Glasgow"]
    }
})

See the “Advanced” Query Operators section of the BedquiltDB Spec for full documentation.

Other Cool Stuff

You can now sort by a document’s $created and $updated times:

# returns documents in order of the time they were created
collection.find({...}, sort=[{'$created': 1}])

# returns documents in order of the time they were updated
collection.find({...}, sort=[{'$updated': 1}])

The find_one operation now accepts sort and skip parameters, which behave the same as those for find:

collection.find_one({...}, sort=[{'upvotes': -1}], skip=2)

And there’s a delightful new remove_many_by_ids operation:

collection.remove_many_by_ids(['one', 'four', 'nine'])

BedquiltDB v2 is available to install right now. See the Installation section of the BedquiltDB Guide for full instructions.

And of course, the source code can be found, forked and contributed to on GitHub.

Happy Hacking!

In my day-to-day work as a software developer, I find myself moving into a bunch of different directories and running some repetitive commands to “activate” or “launch” the project in that directory.

Ideally I’d like to be able to type something like x-activate after I cd into a directory and have it just do The Right Thing™. Now, if the commands were the same each time I could just make a shell alias like (fictional example for illustration):

alias x-activate="export MODE=DEV && nvm use && grunt server"

… but alas, some projects are the same, some are different. But, with cleverness and determination, there is a nice hacky way we can get what we want…

A Contrived Example

Let’s say we have three projects in our ~/code directory, ringo, paul and george

$ ls ~/code
# => ringo paul george

We usually work on these projects all at once, by opening three separate terminal windows and cding into each directory, then running some command which starts the program.

  • ringo : nvm use && grunt server
  • paul : export MODE=DEV && make run
  • george : make compile && ./bin/run --port 9020

Awful, right? Imagine how much worse this would be if our job involved over a dozen microservices, all like this. It would be so much nicer if we could just cd into each directory and type one command (in our example x-activate) and have The Right Thing™ just happen.

In this example we name our new command x-activate, with a leading x-, to differentiate it from existing commands and programs on the system. For example, Python’s virtualenv tool uses a script called activate. It may (or may not) be a good idea to give your own hacky tools a consistent prefix in general.

Step One: A Script in Every Folder

Let’s go into each of our project directories and create a file called .x-activate, containing the commands we usually run to start that project:

$ cd ~/code
$ cd ringo
$ echo 'nvm use && grunt server' > .x-activate
$ cd ../paul
$ echo 'export MODE=DEV && make run' > .x-activate
$ cd ../george
$ echo 'make compile && ./bin/run --port 9020' > .x-activate

Next, and presuming we use git for version control on our projects, we should add .x-activate to our global git-ignore file, because we don’t want any of these files being committed to our git repositories.

Adding .x-activate to the Global Git-ignore file

First, let’s check if we’ve already configured a global gitignore file:

$ git config --global --list | grep 'exclude'
# => core.excludesfile=/Users/shanek/.gitignore_global

Here we can see that on this machine, I’ve already done the following:

  • created a file called .gitignore_global in my home directory
  • told git to use that file for its core.excludesfile option

If the last command we ran produces a similar output for you, then you should open that file in your editor of choice and add the following line:

.x-activate

Otherwise, if the command produced no output, you should create the .gitignore_global file and tell git to use it:

$ echo '.x-activate' >> ~/.gitignore_global
$ git config --global core.excludesfile '~/.gitignore_global' 

Easy, and now we don’t have to worry about polluting shared code repositories with our own convenient little hack.
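One caveat with the echo ... >> approach is that running it a second time leaves two identical lines in the file. As a rough sketch, a small helper can append the pattern only when it is not already listed. The function name add_ignore_pattern is hypothetical, made up for illustration, and is not part of git:

```shell
# Hypothetical helper: append a pattern to an ignore file only if it is not already listed.
add_ignore_pattern() {
    pattern="$1"
    file="$2"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
    if ! grep -qxF -- "$pattern" "$file" 2>/dev/null; then
        printf '%s\n' "$pattern" >> "$file"
    fi
}
```

It would be called as add_ignore_pattern '.x-activate' ~/.gitignore_global, and is safe to run repeatedly.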

Step Two: A Tiny Shell Alias

Now that each project has a little lump of shell-code tucked away in a file, we could start all of our projects by going to each project directory and using the source command to load the .x-activate file as if it were shell source-code:

$ source .x-activate

This is better than trying to remember the right commands for each project, but we can go one step further and create an alias which will shorten source .x-activate to just x-activate.

Let’s presume you’re using the bash shell, like most people these days. In most cases, bash will read either or both of the files called .bashrc and .bash_profile located in your home directory (if you’re not sure of the difference between these files, check out this StackOverflow question). Let’s presume that ~/.bashrc will be loaded whenever you open the shell. Open ~/.bashrc in your editor of choice and add the following line:

alias x-activate="source ./.x-activate"

Step Three: Shell Domination

Let’s open a new terminal window or tab, and try it out:

$ cd ~/code/ringo
$ x-activate
# => ... a bunch of output here ...

Nice, that’s exactly what we wanted. Let’s recap what’s going on here:

  1. The shell receives the command x-activate
  2. The shell recognises this as an alias for source ./.x-activate
  3. The shell runs source ./.x-activate
  4. The source command loads the contents of .x-activate and executes it as shell code
  5. Things run, stuff happens
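If x-activate is run in a directory that has no .x-activate file, source simply complains about a missing file. A slightly more defensive variant, sketched here as a shell function rather than an alias, checks for the file first. The name x_activate is hypothetical (underscore rather than hyphen, since not every shell accepts hyphens in function names), and the wording of the error message is arbitrary:

```shell
# Hypothetical helper: source ./.x-activate only if it exists in the current directory.
x_activate() {
    if [ -f ./.x-activate ]; then
        # Run the project's commands in the current shell, just as the alias does.
        . ./.x-activate
    else
        echo "x_activate: no .x-activate file in $PWD" >&2
        return 1
    fi
}
```

Because the file is still sourced into the current shell, things like export MODE=DEV keep working exactly as they do with the alias.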

What Have We Learned?

  • Shell aliases are cool
  • We can put shell-code which is specific to a certain directory, in that directory
  • We can use unix-foo to save us time and keystrokes while hacking on our awesome projects

Let’s generate a super-secure random password (let’s say, for our tumblr account), using only the command line and a few basic unix tools.

First, we’ll read 10 bytes of random data out of /dev/random:

$ head -c 10 /dev/random
# -> �u#�ko�%

The output looks kinda shitty huh?

Ok, let’s encode this data in base64 format:

$ head -c 10 /dev/random | base64
# -> 9W0MVZQ+SC27VA==

Better, but those trailing ’=’ characters aren’t really useful to us, and that ’+’ in there reminds me that we should prefer to generate ‘url-safe’ base64 text.

Let’s use tr (translate) to delete (-d) the equals-signs:

$ head -c 10 /dev/random | base64 | tr -d '='
# -> PHCSXH7w3TZgHg

And let’s use tr again to change ’+’ into ’-’ and ’/’ into ’_’:

$ head -c 10 /dev/random | base64 | tr -d '=' | tr '+/' '-_'
# -> XE_TRFKrfv-nwA

Much better, but how many characters are in this password we are generating?

$ _my_password=$(
  head -c 10 /dev/random | base64 |
  tr -d '='| tr '+/' '-_'
)
$ echo -n "$_my_password" | wc -c
# -> 14

(note how we passed -n to echo, asking it to not print a trailing new-line)

Fourteen characters isn’t bad, but we can get a longer password by increasing the value of the -c parameter to head:

$ head -c 16 /dev/random | base64 | tr -d '=' | tr '+/' '-_'
# -> 94xKa4qk2tpclnL-OjV6Wg
$ head -c 22 /dev/random | base64 | tr -d '=' | tr '+/' '-_'
# -> L8V3Ee3TxyvEl88cOaIJ-SUWB3YCqg
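The whole pipeline can be wrapped up in a small shell function, sketched below. The name genpass and the default of 16 bytes are assumptions for illustration, and the sketch reads from /dev/urandom rather than /dev/random, which is a substitution and not what the commands above do. Every 3 bytes of input become 4 base64 characters, so N bytes give 4N/3 characters (rounded up) once the padding is stripped: 10 bytes give 14, 16 give 22 and 22 give 30, matching the outputs above.

```shell
# Hypothetical helper: print a URL-safe random password built from N random bytes.
genpass() {
    # Default to 16 bytes when no argument is given (an arbitrary choice).
    nbytes="${1:-16}"
    # Assumption: /dev/urandom is used here; the commands above read /dev/random.
    # tr -d strips the '=' padding and any newlines base64 adds;
    # the second tr swaps '+' and '/' for the URL-safe '-' and '_'.
    head -c "$nbytes" /dev/urandom | base64 | tr -d '=\n' | tr -- '+/' '-_'
    # Finish with a single trailing newline.
    echo
}
```

Calling genpass 10 prints a 14-character password, and plain genpass prints a 22-character one.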

Now we can just copy-paste this delicious new password into our browser and our account is secure again!

While reading up on BEM and other modern CSS methodologies I came across several articles which advised keeping nested CSS rules to a minimum, for the sake of minimizing specificity and also for the sanity of the developers.

I then got to thinking about CSS pre-processors such as LESS and SASS (both of which I’ve used), and that if you don’t use nesting with those pre-processors then the only (commonly used) features you’re left with are variables, mixins, file inclusion and convenience functions.

So, I thought, could we skip LESS/SASS and just use the simple (and ancient) macro processor known as GNU m4?

Why m4?

GNU m4 is a macro processor; its whole job is to process text files with macros defined in them. A few advantages of m4 are:

  • it’s available on basically every Unix system, without installing any extra dependencies
  • it’s not specific to CSS, or any other language; it could be used on any old text file.
  • if you’re not already sold on something like LESS or SASS, then m4 could be a really powerful tool to help make your CSS cleaner without adding another heavy dependency to your build-pipeline.

And some of the disadvantages:

  • being an older unix tool, the syntax is pretty clunky
  • ~~m4 basically doesn’t work with Unicode, or any variable-width text encodings~~ This turns out to be wrong, I’ve tried using m4 with a bunch of non-ascii text in UTF-8 and it appears to work just fine.
  • because m4 isn’t tied to any particular language, it doesn’t have any language-specific features, such as SASS’s ability to process nested CSS declarations.

Quick intro to m4 language

GNU m4 is all about defining macros in a text file, which can then be used later in the file to generate more text. We call m4 like so:

$ m4 some-file

In this case m4 will read in some-file and process any m4-specific directives found in the text, emitting the processed text to standard-out. In our example we’re going to do some preprocessing on a CSS file called style.css.m4, so we’re going to call m4 like so:

$ m4 -P style.css.m4 > style.css

The -P (or --prefix-builtins) flag instructs m4 to prefix its own built-in functions with m4_. This is useful because it makes collisions between m4 and our own text less likely. Without -P, we would use the define() function to define a macro, but with -P we use m4_define(). You can see why the latter would be preferable. From here on I’m going to presume we’re using the -P flag to m4.

How to define macros

In the simple case we can just define a simple text-substitution like so:

m4_define(FOO, bar)

We can then use the FOO macro later in the file, to splice in the text bar at that location:

I want to go to the FOO and have an apple-juice.

When m4 is run on this file, the resulting text will be:

I want to go to the bar and have an apple-juice.

We can also define macros which take parameters at call-time, but we’ll get to those later.

m4’s weird quoting

Probably the weirdest part of m4 is its quoting rules. Basically, instead of using ordinary ‘quotes’ and “double-quotes”, m4 uses the backtick (`) as the opening quote character, and the single-quote (') as the closing quote character. So, we could write the FOO macro example like this:

m4_define(`FOO', `bar')

Kinda weird, but whatever.

Anyway, that’s enough for us to get started with macro-ising our CSS, so let’s get on with it.

Setup

You’ll need the m4 program installed. Mac OS X ships with version 1.4.6 currently, while version 1.4.17 is available in Homebrew. For our purposes the pre-installed version will do just fine. On Ubuntu m4 can be installed with apt-get install m4. If you’re on Windows, IDK, it’s probably possible to install m4 somehow.

Let’s make a new directory and create a file in there called style.css.m4:

$ mkdir m4-example
$ cd m4-example
$ touch style.css.m4

The .m4 extension isn’t required, but I think it looks nice. Let’s invoke m4 on this (empty) file:

$ m4 -P style.css.m4

Basic Variable Substitution

This invocation of m4 won’t output anything useful, because there’s nothing useful in the file. Let’s fix that by opening our favourite editor and entering the following:

m4_changecom(`@@##')
m4_define(BLACK, #000)
m4_define(GREY,  #ccc)
m4_define(WHITE, #fff)

body {
  background-color: WHITE;
  color: BLACK;
}

On the first line, we change the m4 comment delimiter to @@##, rather than #. This is a good idea because # is used all over the place in CSS, so we’d rather not have it be interpreted as the start of an m4 comment, and @@## is a suitably obscure alternative.

The next three lines define three macros, BLACK, GREY, and WHITE. From now on, any time those words occur in the file they will be replaced with the appropriate color hash values. We are using UPPER_CASE identifiers for our macros, but bear in mind you should always choose some kind of naming scheme which is not going to conflict with legitimate content in the files you are processing. Use your head.

Of course, we aren’t limited to defining macros for basic color values. We can use any text we want, but for most web developers the most useful values to put in these macros will be color and numeric values. Note also that we didn’t bother to quote either the macro names or their values, but we could easily have wrapped these in m4 quotes like so:

m4_define(`BLACK', `#000')

If we run m4 again, the output should be something like:



body {
  background-color: #fff;
  color: #000;
}

Yes, there’s a bunch of whitespace in there, but the text we care about (the CSS declarations) has been processed and the correct color values have been spliced into the text.

If the extra whitespace is annoying, you can add the m4_dnl directive to the end of the m4_define... lines, which discards everything up to and including the next new-line, like this:

m4_define(BLACK, #000)m4_dnl

In this tutorial we won’t bother with that, for the sake of clarity. Plus, if you’re running the resulting CSS through a minifier the extra whitespace shouldn’t be an issue.

Including other files

So, now that we can define variables in our CSS code, the next feature we probably care about is the ability to split our CSS over multiple files and then import those files into our main stylesheet. In m4 we can do this with the m4_include directive, like so:

m4_include(./other_file)

Let’s imagine we want to keep all the styles related to our site footer in a separate file, say footer.css.m4:

m4_define(FOOTER_TEXT_COLOR, #222)

.footer {
  color: FOOTER_TEXT_COLOR;
}

We can then include that file in our main file with:

m4_include(./footer.css.m4)

… and the contents of footer.css.m4 will be processed by m4 and spliced into the output text stream. Our example file style.css.m4 now looks something like this:

m4_changecom(`@@##')
m4_define(BLACK, #000)
m4_define(GREY,  #ccc)
m4_define(WHITE, #fff)

body {
  background-color: WHITE;
  color: BLACK;
}

m4_include(./footer.css.m4)

Simulating mixins

Both SASS and LESS support “mixins”, which essentially allow you to create a block of CSS code which can be included in some other block of CSS code with a simple one-liner. In m4 we can achieve the same effect with good-old macros:

m4_define(ANGRY_TEXT, `
    color: red;
    font-weight: bold;')

p.angry {
  overflow: auto;
  ANGRY_TEXT
}

But we can go one step further and define a macro which accepts parameters at the call-site, allowing you to re-use blocks of code almost like functions in a real programming language. Let’s define a macro which will handle all the weirdness of adding a border-radius to a DOM element:

m4_define(BORDER_RADIUS,
  `-webkit-border-radius: $1;
  -moz-border-radius: $1;
  -ms-border-radius: $1;
  border-radius: $1;')

In this example, the $1 stands for the first parameter to the macro. If we use the macro like this: BORDER_RADIUS(6), then the number 6 will be bound to the $1 and processing will continue as you’d expect it to. If we had more than one parameter, then $2, $3 etc would be available to use in the macro.

Let’s add both of these examples to our style.css.m4 file:

m4_changecom(`@@##')
m4_define(BLACK, #000)
m4_define(GREY,  #ccc)
m4_define(WHITE, #fff)

m4_define(ANGRY_TEXT, `
    color: red;
    font-weight: bold;')

m4_define(BORDER_RADIUS,
  `-webkit-border-radius: $1;
  -moz-border-radius: $1;
  -ms-border-radius: $1;
  border-radius: $1;')

body {
  background-color: WHITE;
  color: BLACK;
}

p.angry {
  overflow: auto;
  ANGRY_TEXT
}

pre.formatted-code {
    font-family: monospace;
    background-color: GREY;
    BORDER_RADIUS(6)
}

m4_include(./footer.css.m4)

Conditionals

GNU m4 has an m4_ifdef directive, which allows you to conditionally emit some text:

m4_ifdef(`SOME_TEST', `yes')

I think this would only be useful in conjunction with the -D (or --define) flag to the m4 program. Using -D you can create a definition when the program runs, which is basically equivalent to passing variable definitions into m4.

Consider the following example:

$ m4 -P -D DEBUG=true some_file.m4

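In this case you could use m4_ifdef(`DEBUG', some_text) to only include some_text when you’re compiling in “debug mode”. Note that DEBUG has to be quoted here: left unquoted, a defined DEBUG would be expanded to true before m4_ifdef got to look at it, and the test would be checking for a macro called true instead.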
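To see the two branches side by side, here is a small sketch that writes a demo file and renders it with and without the definition. The file name debug-demo.css.m4, the helper name render_demo and the outline rules are all hypothetical, chosen only for illustration. The third argument to m4_ifdef is the text emitted when the macro is not defined.

```shell
# Write a small demo file (hypothetical name) that branches on the DEBUG macro.
# The quoted 'EOF' stops the shell from touching the m4 backticks.
cat > debug-demo.css.m4 <<'EOF'
body {
  m4_ifdef(`DEBUG', `outline: 1px solid red;', `outline: none;')
}
EOF

# Hypothetical helper: render the demo, passing -D DEBUG=true when asked to.
render_demo() {
    if [ "$1" = "debug" ]; then
        m4 -P -D DEBUG=true debug-demo.css.m4
    else
        m4 -P debug-demo.css.m4
    fi
}
```

Running render_demo emits the outline: none; rule, while render_demo debug emits the red outline instead.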

Doing math

One more cool feature: m4 can do basic math by using the m4_eval() directive. For example:

m4_eval(2 + 6)

… will emit “8”.

If we wanted to do some math on our border-radius declarations, we could do something like this:

pre.formatted-code {
  background-color: GREY;
  BORDER_RADIUS(m4_eval(22 / 2))
}

In fact, we could save ourselves some effort and define a macro for our main border-radius number, and then do math using m4_eval whenever we want to use a smaller or larger radius. Here’s a contrived example:

m4_define(BORDER_RADIUS_AMOUNT, 22)

.normal-radius {
  BORDER_RADIUS(BORDER_RADIUS_AMOUNT)
}

.smaller-radius {
  BORDER_RADIUS(m4_eval(BORDER_RADIUS_AMOUNT / 2))
}

.larger-radius {
  BORDER_RADIUS(m4_eval(BORDER_RADIUS_AMOUNT * 2))
}

I’ve been hacking around in Clojure for a few years now, and while I like the language very much there are a few quirks of the implementation which I still find irritating, namely the glacial startup time and general unsuitability for writing small programs.

So this week I’ve been looking at Common Lisp, in search of another nice lisp dialect to add to my tool-box.

Some general impressions:

  • roswell is pretty nice, making it easy to get sbcl, quicklisp and asdf installed
  • Practical Common Lisp is a good book, very easy to read and gets right to the point without trying to explain what programming is from the ground up
  • SBCL seems fine, I’ve not had any trouble with it yet
  • The ecosystem seems to be plenty diverse, with robust implementations of all the things I’d care about
  • The language can be a bit weird:
      • To a beginner it’s not clear why getf, setq and co are named the way they are
      • Using higher-order functions can feel a bit clunky; using #'foo and (funcall foo x y z) feels much less clean than the equivalent code in Clojure. More than once I’ve stumbled on what to do when returning a lambda from a function, binding it to a var and then passing it along to another function
      • Common Lisp doesn’t seem to have a fundamental sequence abstraction like Clojure, so it’s a lottery trying to figure out which functions will work on whichever data-type you’ve got in hand.
      • Verbosity seems like it’s going to be a problem; function names are usually pretty long and there are no data-literals in the style of Clojure’s {}, [] and #{}
  • The Common Lisp Wiki looks like a nice source of up-to-date information on Common Lisp
  • Common Lisp uses the empty list ((), or nil) to denote boolean false, which I could see causing all sorts of problems when parsing formats like JSON. Actually, now that I think about it, this makes working with any external data basically untenable; the language simply doesn’t have a concept of empty-list, null and false being different things.

I think I’ll keep going with this one. Despite some flaws, Common Lisp does look like a promising language and a good candidate for small, compiled programs and little network services.

Problem: sometimes you want to make some document or resource available for download (ex: a whitepaper, brochure, report), but only after the downloader has provided their email address, read some T&Cs and clicked “Accept”.

Setting this up could be a pain if you’re part of a non-technical organisation. Ideally you’d want a service to handle it for you, which you can link to from your website.

Solution: a web app which allows a user to upload a document or other file, and “clickwrap” it. A visitor can then be directed to the URL of the clickwrap’d file where they will be shown the “agreement” text, prompted to fill in their email and click “accept” before being given the file.

Monetisation:

  • provide custom CSS and branding on premium accounts
  • better, deeper metrics on premium accounts
  • maybe restrict to a two-month free trial on free accounts

2015 was a great year for me. In those twelve months I:

  • Developed BedquiltDB into a viable project
  • Saw NightChamber grow into a small but vibrant community
  • Joined the team at ShareLaTeX
  • Learned a bunch of stuff:
      • Perl
      • ClojureScript
      • Elixir
      • And a bunch of others
  • Most importantly of all, continued to foster wonderful personal relationships

Let’s hope 2016 is just as good.

Inspired by a post on Hacker News: create a new competitor to GitHub, but with more focus on the social aspect of “Social Coding”.

  • Organisations
  • Projects/Repos
  • Reddit-like discussion threads attached to things like issues, branches, commits and pull-requests
  • Easy hyperlinking between discussions, issues, commits, etc
  • Obviate the need for OSS projects to host their discussions/community elsewhere.
  • Also addresses the fact that some projects try to use GitHub issues as if they were forum topics
  • Ideally the implementation should be open-source, with the business providing a hosted solution

Why? Someone raised a point that GitHub, despite its motto of “Social Coding”, has been more focused on chasing Enterprise contracts than on developing a truly social platform for code. The potential for a code hosting platform with real social features is huge.

Today I released v0.4.0 of BedquiltDB. Changes include:

  • New skip, limit and sort options to find queries
  • node-bedquilt now uses ES6 features, and thus requires node >= 4.0.0
  • Improved documentation, including a new BedquiltDB Guide

A lot has happened for me in the last few months.
Some highlights:

New Job

In August I joined the team at sharelatex.com. We’re working on a new product called DataJoy. It’s awesome.

Lots of other stuff

Which I can’t really remember, or don’t feel like talking about online.
