Pieter Edelman
12 September 2017

Elixir is a new programming language that promises a straightforward approach to massive concurrency and fault tolerance, all wrapped in a neat, modern and friendly formalism. Its creator, José Valim, explains its foundations and how it came to be.

José Valim didn’t set out to create yet another programming language, but that’s exactly what he ended up doing in his quest to solve a problem he encountered as a programmer: modernizing concurrency. The result is Elixir, a functional language that makes it easy to write concurrent and even distributed, fault-tolerant software – or so it claims.

Six years after its first release, Elixir is starting to make ripples in the software development community. So when Sioux invited Valim to one of its Hot-or-Not sessions last June, the room was packed with programmers – making it, quite literally, a hot session.

Valim is quick to give credit where credit is due. Elixir is entirely indebted to another programming language and its accompanying ecosystem: Erlang. Created over three decades ago by Ericsson – the name ‘Erlang’ is both a reference to a Danish mathematician and a play on ‘Ericsson language’ – it was designed to build the telco giant’s vastly concurrent telephone switches by combining two paradigms in computer engineering: functional programming and the actor model.

Elixir creator José Valim. Photo: Sioux

Elixir leverages the hard work already done on Erlang and slots right into its ecosystem, explains Valim: ‘Erlang is a programming language, but it’s also a virtual machine and a runtime, just like Java. The reason Elixir exists is precisely because of the Erlang virtual machine. When I discovered Erlang, I felt this was the runtime, the environment where I wanted my future code to be.’

Points of no return

In both Erlang and Elixir, the main issue is concurrency, which is a bit of an elephant in the software room. The imperative approach to programming that has been the industry standard for the last half century is based on the way a processor works: instructions are executed one after the other, step by step.

But working from this assumption on an architecture where different parts concurrently work on the same data, and might have to wait for each other, is very hard. Such code is hard to reason about, and its behaviour can become unpredictable. For example, one of the parallel paths may run slower from one run to the next because it sits on a separate processor core, where it suddenly has to compete with another program. Good luck debugging that!

The issue is growing ever more relevant. For some time now, chip companies have had a hard time leveraging Moore’s law to increase processor clock speeds. The best way to increase processing power, they discovered, is to spread it out over multiple processor cores. The core count has steadily risen ever since, and with it the performance that a single-core software design leaves unused.

Programmers not only have to deal with more and more cores – consider accelerators like GPUs, which use hundreds of cores running in parallel – they’re also encountering the problem in more and more situations. ‘Last year Apple announced a new version of its Apple Watch with two cores. Your wristwatch has two cores,’ says Valim. ‘Today, writing software that uses only one core in your machine should be the exception.’

The problem became tangible for Valim in 2008, when his professional life revolved around the Ruby programming language and the well-known Rails web framework. Not only did he use it for the company he had cofounded; he was also a Rails core developer. ‘At the time there was pressure from the community to get Rails applications running on these machines that had multiple cores. So in 2008 we had a release that was thread-safe, which meant that you could use threads to achieve concurrency without your application blowing up.’

There was just one problem: it wasn’t entirely true. In special circumstances, with high amounts of traffic, applications sometimes became erratic. Debugging them turned out to be a nightmare. ‘That’s when I started to think we need to find better ways, better tools.’

In his search for them, Valim crossed two points of no return: the functional programming paradigm and Erlang. ‘They changed my perspective on a lot of things. After I learned about them, I thought to myself, I can’t go back and write software the way I was doing it before.’

Russian nesting doll

Whether the functional style of programming is better than the imperative way is of course a matter of taste, but most people would agree that its characteristics do play out well when it comes to concurrency. ‘We can talk about functional programming for a long time, but for me there were two very simple, very small things that caught my attention,’ Valim explains.

The first one has to do with state. ‘The essence of objects in an object-oriented language is to encapsulate state. And because objects can contain other objects, which can contain other objects, you get this Russian nesting doll of objects. And if you have two threads trying to change the same object, the same state, you have a concurrency issue. And that’s very hard to find, because the state is hidden. It’s encapsulated away.’

In contrast, one of the basic tenets of functional programming is that all processing routines (the functions) are ‘pure’, which means they take input, produce an output, and that’s it; they aren’t influenced by other factors in the environment and they can’t affect the system state – in fact, there is no global system state. So all functions are effectively isolated from their surroundings.

The second thing that caught Valim’s attention has to do with data immutability, another tenet in functional programming – you can build new data from old data, but never change them in place. ‘If you have a list, and you remove one element from it, you get a new list,’ says Valim. ‘So together these formed the first point of no return. If I reduce the places where my code is mutating and changing things, complex code is going to be simpler to understand and maintain.’
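
Valim’s list example can be tried directly in Elixir. The snippet below is a minimal sketch; the variable names are chosen here purely for illustration.

```elixir
original = [1, 2, 3]

# List.delete/2 does not modify its argument; it returns a new list.
shorter = List.delete(original, 2)

IO.inspect(original) # prints [1, 2, 3]
IO.inspect(shorter)  # prints [1, 3]
```

Any other code still holding on to the original list keeps seeing the same three elements, which is what makes sharing data between concurrent parts of a program safe.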

Of all the functional languages Valim encountered, Erlang stood out, for a simple reason: Ericsson built it for massive concurrency because it needed it to create telephone switches that could handle many calls simultaneously. And there was an intriguing modern use case as well: WhatsApp. ‘At some point they were handling more messages than the whole global SMS system, and they were doing that with a small engineering team, using Erlang,’ Valim says. ‘In a blog post at the time they described how they handled two million connections on a single machine with twenty-four cores, still using only 40 per cent of system resources. And that’s exactly what we want.’

Erlang achieves this feat by using an approach where an application is built from small, self-contained snippets of functionality. Each instance of each snippet is executed by the virtual machine as a completely isolated, lightweight process, easily scaling to millions running at the same time. These processes interact by sending messages to each other, or spawning new processes.
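
In Elixir, spawning an isolated process and exchanging messages with it takes only a few lines. The following is a rough sketch rather than production code; the `:ping` and `:pong` message shapes are made up for this example.

```elixir
parent = self()

# spawn/1 starts a new lightweight process that runs the given function.
child =
  spawn(fn ->
    # The process waits for one message and answers its sender.
    receive do
      {:ping, from} -> send(from, {:pong, self()})
    end
  end)

# Processes share no memory; they only communicate by sending messages.
send(child, {:ping, parent})

reply =
  receive do
    {:pong, ^child} -> :pong
  after
    1_000 -> :timeout
  end

IO.inspect(reply) # prints :pong
```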

This approach is known as the actor model. There are many languages and frameworks that implement it. For example, Akka and Akka.NET are popular frameworks that bring actors to languages running on the Java Virtual Machine and the CLR, respectively.

But for Valim it’s all about the combination of the two approaches: ‘If you run something on top of the JVM or on top of .Net, you miss a lot of the beneficial properties because they were developed for languages that have mutable objects. There’s nothing that’s going to guarantee their isolation.’

Another aspect made possible by the architecture of isolated processes is the way Erlang copes with errors, which is quite different from the approach in mainstream languages – and another reason Valim fell in love with it. Instead of throwing exceptions that in the worst case can propagate through the entire application, or crashing outright, Erlang uses the concept of supervisors: processes that monitor the behaviour of other processes and restart them if needed. And supervisors can have their own supervisors. This mechanism enables the construction of fault-tolerant systems that are able to heal themselves.
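
The supervisor idea can be sketched in a short Elixir script. This is an illustration under assumptions, not a recommended application layout: the worker is a simple counter held in an `Agent`, the name `:counter` and the helper module `RestartDemo` are invented for the example, and the crash is simulated by killing the worker.

```elixir
defmodule RestartDemo do
  # Hypothetical helper: polls until a process other than `old_pid`
  # is registered under `name`, or gives up and returns nil.
  def await_new_pid(name, old_pid, attempts \\ 100)
  def await_new_pid(_name, _old_pid, 0), do: nil

  def await_new_pid(name, old_pid, attempts) do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        await_new_pid(name, old_pid, attempts - 1)
    end
  end
end

# One supervised worker: an Agent holding a counter that starts at 0.
children = [
  %{
    id: :counter,
    start: {Agent, :start_link, [fn -> 0 end, [name: :counter]]}
  }
]

{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)

first_pid = Process.whereis(:counter)
Agent.update(:counter, &(&1 + 5))

# Simulate a crash by killing the worker process.
Process.exit(first_pid, :kill)

# The supervisor notices the exit and starts a fresh worker.
new_pid = RestartDemo.await_new_pid(:counter, first_pid)

IO.inspect(new_pid != first_pid)      # prints true
IO.inspect(Agent.get(:counter, & &1)) # prints 0, the restarted worker's initial state
```

The restarted worker begins from a clean initial state, which is the point: rather than trying to repair a process that has gone wrong, the system replaces it.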

‘There are a couple of other things as well,’ Valim adds. ‘For example, processes in Erlang are pre-empted: each process has some time to run and then is taken out. No process can take over and run forever. In those JVM and .Net languages you don’t have those guarantees. Something else that’s worth considering is garbage collection. In the Erlang VM it’s done per process because they’re all isolated, so there’s no ‘stop the world’ garbage collection.’

Sioux built a colour-sorting robot to demonstrate Elixir’s distributed capabilities. Photo: Sioux

assertThis, assertThat

Still, for all Erlang’s merits, Valim felt the language was missing some features: ‘I loved everything that I saw, but hated the things I didn’t see. To me, it was missing some components. That was the trigger for creating Elixir.’

One thing he missed in particular was metaprogramming: capabilities that allow a language to be extended. ‘Today, the field of computer science is so open, so wide, there’s just no chance you’re going to design a programming language that’s going to work across all those possible domains. So we need to have a language that we can adapt for the different domains we work on.’

Thus at the heart of Elixir lies a powerful macro system that allows programmers to define new syntax with code that’s executed during compilation – in fact, most of the language itself has been created using this macro system. ‘For example, there’s a library called Ecto that allows you to write database queries in Elixir-like code. It’s just an external library; it has no relationship with Elixir and uses only regular Elixir syntax. But during compilation it gets translated into SQL.’
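
To give a flavour of what such a macro looks like, here is a small sketch that adds a made-up `my_unless` construct. The module and macro names are hypothetical, and this is not how Ecto or Elixir’s own constructs are implemented; it only shows the mechanism of receiving code and returning new code at compile time.

```elixir
defmodule Sketch.Macros do
  # A macro receives the *code* of its arguments, not their values,
  # and returns new code built with quote/unquote.
  defmacro my_unless(condition, do: block) do
    quote do
      if unquote(condition), do: nil, else: unquote(block)
    end
  end
end

defmodule Sketch.Usage do
  import Sketch.Macros

  def describe(n) do
    # Expanded into an if/else expression while this module is compiled.
    my_unless n > 0 do
      "not positive"
    end
  end
end

IO.inspect(Sketch.Usage.describe(-1)) # prints "not positive"
IO.inspect(Sketch.Usage.describe(5))  # prints nil
```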

Another example illustrating how this compile-time power can be leveraged is Elixir’s test framework. ‘In a lot of languages you write things like assertEqual, assertDifferent, assertThis, assertThat, but never simply ‘assert 1 + 1 == 2’, because that would give you an error like ‘expected true but got false’. In Elixir, ‘assert’ is a macro that can introspect the code at compile time, see which things you’re trying to compare and extract all the possible information from that.’
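
The idea behind such an assertion macro can be approximated in a few lines. The sketch below is not ExUnit’s actual implementation; `my_assert` and the module names are invented for illustration, and the macro only handles `==` comparisons.

```elixir
defmodule Sketch.Assert do
  # At compile time the macro sees the syntax tree of `left == right`,
  # so it can keep both sides and the source text for the failure message.
  defmacro my_assert({:==, _meta, [lhs_ast, rhs_ast]} = expr) do
    code = Macro.to_string(expr)

    quote do
      lhs = unquote(lhs_ast)
      rhs = unquote(rhs_ast)

      if lhs == rhs do
        :ok
      else
        {:error,
         "assertion failed: " <>
           unquote(code) <> " (left: #{inspect(lhs)}, right: #{inspect(rhs)})"}
      end
    end
  end
end

defmodule Sketch.AssertUsage do
  import Sketch.Assert

  def check(a, b), do: my_assert(a + b == 3)
end

IO.inspect(Sketch.AssertUsage.check(1, 2)) # prints :ok
IO.inspect(Sketch.AssertUsage.check(1, 1))
# prints {:error, "assertion failed: a + b == 3 (left: 2, right: 3)"}
```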

Cryptic error messages

Elixir improves on Erlang in many other ways as well. While searching for better ways to do concurrency, Valim encountered many good ideas in other programming languages that he wanted to incorporate. Given his background, it’s no surprise that Ruby was also a major influence on Elixir. People familiar with Clojure will also see its strong footprint.

Python developers, for their part, will feel right at home with doctests: a way to embed examples of what a function should do directly in the comments documenting that function and subsequently use them as the basis for testing. Doctests are built into the language right alongside other advanced documentation features, making documentation a first-class citizen.
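
A doctest looks roughly like this. The module and function are hypothetical; the `iex>` line in the documentation is the example that the test framework can later execute and compare with the line below it.

```elixir
defmodule Sketch.Temperature do
  @doc """
  Converts degrees Celsius to Fahrenheit.

      iex> Sketch.Temperature.to_fahrenheit(100)
      212.0

  """
  def to_fahrenheit(celsius), do: celsius * 9 / 5 + 32
end

# In a project's test file, ExUnit turns the example above into a test:
#
#     defmodule Sketch.TemperatureTest do
#       use ExUnit.Case, async: true
#       doctest Sketch.Temperature
#     end

IO.inspect(Sketch.Temperature.to_fahrenheit(100)) # prints 212.0
```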

Something similar can be said about the toolchain. ‘This is one of the things we learned from the Go programming language,’ Valim says. ‘There was a lot of discussion about Go at one point, and people would debate what they liked about it and what they didn’t. But the tooling was one thing that people almost universally praised. So it was obvious that we, too, needed to have very good tooling if we wanted to succeed.’

And thus Elixir, like Go (and many other modern languages), comes with a straightforward toolchain that enables you to easily set up new projects, compile the code, run tests, find and install third-party modules, define dependencies, et cetera. An interactive shell to experiment and prototype is also available. And – another important feature borrowed from Go – the compiler tries to provide actual, helpful guidance rather than throwing cryptic error messages.

Despite its varied sources of inspiration, Elixir is fully compatible with Erlang – that was an explicit design goal. So Erlang and Elixir code can be mixed freely in the same project, and every library that has ever been written for Erlang can also be used in Elixir.

Brain stuck

With these capabilities, Valim reckons Elixir is suited for pretty much any kind of programming work – though of course some problems are a better fit than others. For example, Elixir doesn’t really shine when it comes to numerical computation, and its GUI capabilities are a bit lacking at the moment. Code also needs to run inside the virtual machine, so environments like low-level microcontrollers are out of the question.

Elixir’s biggest drawback, however, may be one of its actual strengths: the functional paradigm can be hard to transition to after a lifetime of thinking imperatively. Even Valim admits he still struggles with the concept from time to time: ‘We don’t have while loops, for example; we like to focus on recursion and things like pattern matching. I’ve programmed in imperative languages longer than in Elixir and functional languages, and sometimes my own brain still gets stuck and I have to take a step back and think about the problem in the right way.’
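
As an illustration of that shift in thinking, here is how summing a list might look without a loop. It is a minimal sketch with a hypothetical module name; in everyday code one would more likely reach for the built-in `Enum.sum/1`.

```elixir
defmodule Sketch.Sum do
  # Pattern matching picks the clause: an empty list sums to 0...
  def sum([]), do: 0

  # ...and a non-empty list is its first element plus the sum of the rest.
  def sum([head | tail]), do: head + sum(tail)
end

IO.inspect(Sketch.Sum.sum([1, 2, 3, 4])) # prints 10
```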

The good news, Valim says, is that functional ideas are now being incorporated into all languages. ‘Even in Java they’ve got lambdas in the latest version, and all of this functional vocabulary in terms of collections, and functions like map, reduce and filter. But immutability still isn’t the default. I think that’s what sets us apart.’

And it’s well worth mastering this skill, as the problem of concurrent execution will only grow more pressing. ‘It was this point that really made me fall in love with Erlang,’ Valim says. ‘For the last ten years everybody’s been thinking about concurrency, but what happens when your machine with twenty-four cores is no longer enough? Then you need to start thinking about distribution, with different machines working towards a common goal. Erlang has always been about distribution. For message passing it doesn’t matter whether processes are on the same machine or on different ones. So when that becomes an issue in the future, it’s already handled for us.’