Angelo Hulshout is an experienced independent software craftsman and a member of the Brainport High Tech Software Cluster.

19 October 2022

Angelo Hulshout has the ambition to bring the benefits of production agility to the market and set up a new business around that. He explains how a software engineering practice can help.

What happened back then? How often have you asked this question after discovering a problem caused by something that happened long ago? To get an answer, it would help a lot to have data available from the time the problem occurred. However, as I explained in an earlier article, it’s not easy to collect the right data without drowning in a sea of seemingly unrelated data.

In that article, I examined the issue from two sides. I looked at the available technology to collect data and the fact that this collection should be done ‘wisely.’ I didn’t go into the possible ways to do the latter and avoid getting too much or the wrong data. Here, I’m going to fill in that gap by putting something in between the functional need to gather data and the technologies to do the collecting – some software engineering practices.

Technologist hat

We can collect data on all levels of factory operation: on the machine control level from our PLCs or other controllers, on the process control level from the MES system and on the factory level from the relevant parts of the ERP system. All of these are connected to a network, so the data is relatively easy to access. We can use dedicated interfaces or just read directly from the databases of the systems.

We can analyze the data on the fly or gather it in a central place and optimize it for further analysis. Data scientists (specialists in gathering and analyzing data) call this last activity “data cleaning” or “data optimization.” For cleaning, analyzing and reporting, lots of different software solutions are available. If a commercial or open-source application can’t help us, we can always write our own using one of the many data science development kits.

Still, all these options sidestep the real question: how can we reduce the amount of data we gather to just what we need? The short answer is that we can’t, simply because we don’t know what data we may need at a given point in the future. However, we can increase the chance that we have the right data without just randomly collecting everything.

To achieve this, we have to take a step back and take off our technologist hats for a little while. After all, when operators or managers in a factory wonder “What happened there?” they’re most likely not thinking about what exactly the hardware or software was doing. Instead, their main concern is whether somebody made a mistake, whether the temperature in a silo was too high or maybe a delivery of materials didn’t arrive.

Although these are completely different examples, relating to completely different processes or parts of the factory, they do have something in common. They’re events that occur in the factory. In this case, events of the form “something went wrong.” There are also events of the type “a planned or controlled action was completed,” which occur when a step in a process or procedure has been carried out. It’s these events that can help us filter out the relevant data.

Event sourcing

If you go to the app or website of your bank and open your account overview, you’ll see the current balance and the most recent transactions. Each transaction is an event resulting in a change in your account balance. This delta is positive when money was added and negative when money was withdrawn. With each transaction, there’s an indication of the delta, a timestamp and the identification of the other party involved. Basically, this is all we need to keep track of what’s going on in our bank account.

Event sourcing is a software architecture pattern based on this idea. Actually, it goes back to double-entry accounting, which the 15th-century Italian mathematician Luca Pacioli was the first to describe in print. In such bookkeeping, every transaction is registered in a ledger next to the account balances, so a balance can always be traced back to the entries that produced it.

In event sourcing, the focus is on the transaction side of this approach. Instead of the state of an object, only state changes are stored. For a production order, that could lead to the order being registered, then scheduled, then produced, then stored in the warehouse and finally shipped. The current state can then be calculated by replaying all events since the system was first turned on.
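
To make the replay idea concrete, here is a minimal Python sketch for the production order example. The class name, the status names and the rule that statuses follow each other in a fixed order are assumptions made for illustration; they are not taken from any particular MES or ERP system.

```python
from dataclasses import dataclass
from datetime import datetime

# Allowed status changes, following the production order example in the text.
# The names are illustrative assumptions, not an existing MES vocabulary.
NEXT_STATUS = {
    None: "registered",
    "registered": "scheduled",
    "scheduled": "produced",
    "produced": "stored",
    "stored": "shipped",
}


@dataclass(frozen=True)
class OrderEvent:
    order_id: str
    status: str          # the status the order moved to
    timestamp: datetime


def current_status(events, order_id):
    """Replay all events of one order, oldest first, and return its status."""
    status = None  # before the first event, the order does not exist yet
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.order_id != order_id:
            continue
        if NEXT_STATUS.get(status) != event.status:
            raise ValueError(f"unexpected transition {status!r} -> {event.status!r}")
        status = event.status
    return status


events = [
    OrderEvent("PO-1", "registered", datetime(2022, 10, 3, 8, 0)),
    OrderEvent("PO-1", "scheduled", datetime(2022, 10, 3, 9, 30)),
    OrderEvent("PO-1", "produced", datetime(2022, 10, 4, 14, 15)),
]

print(current_status(events, "PO-1"))
```

Nothing in this sketch stores the status itself. It is derived from the event list every time it is needed, which is the core of the pattern.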

For smart-industry projects, event sourcing seems a very useful approach. It opens up a way to collect data for problem analysis as well as for setting up a process of continuous evaluation and improvement. This is what we try to achieve for our customers at Shinchoku.

We can use the approach for each object, whether it’s a production order, a product recipe or a transport document. We need to do two things to make this work. First of all, identify which events are important for the people running the factory. Second, determine what information should be included with each event to make storing it useful. Some data is needed for problem analysis; for continuous improvement, something more may be needed. The essence is that we stop collecting seemingly random data items at regular time intervals, as still happens a lot in practice, especially at the PLC and MES levels. Instead, we start collecting data in the form of meaningful event traces.
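
As an illustration of these two steps, the sketch below shows one possible shape for an event record and how a trace of such records answers the question of what was going on with an object at a given moment. The field names, event names and context keys are hypothetical; which ones are actually useful is exactly what has to be determined with the people running the factory.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class FactoryEvent:
    object_id: str       # e.g. a production order, recipe or transport document
    event_type: str      # a name that means something to operators and managers
    timestamp: datetime
    context: dict = field(default_factory=dict)  # extra data stored with the event


def trace(events, object_id, until=None):
    """Return the events of one object, oldest first, optionally up to a moment."""
    selected = [
        e for e in events
        if e.object_id == object_id and (until is None or e.timestamp <= until)
    ]
    return sorted(selected, key=lambda e: e.timestamp)


def last_event_at(events, object_id, moment):
    """Return the most recent event of an object at a given moment, if any."""
    history = trace(events, object_id, until=moment)
    return history[-1] if history else None


log = [
    FactoryEvent("PO-7", "order registered", datetime(2022, 10, 3, 8, 0), {"operator": "planner-1"}),
    FactoryEvent("PO-7", "order scheduled", datetime(2022, 10, 3, 9, 30), {"line": "L2"}),
    FactoryEvent("PO-8", "order registered", datetime(2022, 10, 3, 10, 0), {"operator": "planner-2"}),
    FactoryEvent("PO-7", "order produced", datetime(2022, 10, 4, 14, 15), {"quantity": 480}),
]

# What was the situation of order PO-7 on 3 October at noon?
print(last_event_at(log, "PO-7", datetime(2022, 10, 3, 12, 0)))
```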

Figure (current state): Current-state registration in an MES system gives us only the current status and no history of the production order.
Figure (event sourcing): With event data, we can reconstruct the state of the production order at any given point in time.


Although ERP and MES systems are written by software people, not all of them implement event sourcing. In fact, most don’t. Most systems are built in the ‘traditional’ way, around a database showing the current state of every object instead of the full history. Some features come close but don’t completely fulfill our needs. For example, most ERP systems have an audit module that can be enabled for financial functions. Also, MES systems in some branches, like food production, often keep track of how ingredient mixes are produced, because of traceability requirements.

To apply event sourcing in combination with these systems, we’ll have to look at how to interface with them and collect the appropriate event data. This is not impossible, and I expect that at some point, we’ll have standardized interfaces for it. Since MES and ERP systems have a life span of ten years or more, it will take a while before it’s common practice, though.
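
Until such interfaces exist, one conceivable stopgap is to read the current state from an existing system at intervals and turn every observed change into an event. The Python sketch below assumes, purely for illustration, that each read yields a dictionary mapping order IDs to statuses. The function and field names are hypothetical and do not refer to any real MES or ERP interface.

```python
from datetime import datetime


def diff_snapshots(previous, current, observed_at):
    """Compare two {order_id: status} reads and return one event per change.

    Orders that disappear between two reads are ignored in this sketch.
    """
    events = []
    for order_id, status in current.items():
        old_status = previous.get(order_id)  # None if the order is new
        if old_status != status:
            events.append({
                "order_id": order_id,
                "old_status": old_status,
                "new_status": status,
                "observed_at": observed_at,
            })
    return events


first_read = {"PO-1": "scheduled", "PO-2": "registered"}
second_read = {"PO-1": "produced", "PO-2": "registered", "PO-3": "registered"}

for event in diff_snapshots(first_read, second_read, datetime(2022, 10, 4, 15, 0)):
    print(event)
```

This approach has clear limits. A status that changes twice between two reads is seen only once, and the timestamp records when the change was observed, not when it happened. Events emitted by the system itself would not have these problems.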

There’s also the issue of deltas versus the current state on the PLC level. PLCs are used to control machines and gather data from sensors that are integrated with those machines. By nature, sensors give absolute, current values instead of deltas.

This is something we can handle easily. It’s not in line with the ‘rules’ of event sourcing, but if we have the absolute values and related timestamps, in an analysis we can always calculate the deltas ourselves. On top of that, looking at what operators and managers want to know, there’s no need to collect sensor data continuously. Getting operational information like “set temperature reached” or “safety boundary on power exceeded,” accompanied by a timestamp and some other relevant data, may be more useful in a lot of cases.
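
Both points can be sketched in a few lines of Python. The readings, the setpoint and the function names below are made up for illustration. The first function calculates deltas from absolute values during analysis. The second reduces a series of readings to operational events of the kind “set temperature reached.”

```python
from datetime import datetime


def deltas(readings):
    """Turn (timestamp, value) readings into (timestamp, change since previous)."""
    return [(t2, v2 - v1) for (t1, v1), (t2, v2) in zip(readings, readings[1:])]


def threshold_events(readings, setpoint, name="set temperature reached"):
    """Emit an event each time the value rises from below the setpoint to it or above.

    A first reading that is already at or above the setpoint also counts.
    """
    events = []
    below = True
    for timestamp, value in readings:
        if below and value >= setpoint:
            events.append((timestamp, name, value))
        below = value < setpoint
    return events


# Hypothetical silo temperature readings in degrees Celsius, one per minute.
readings = [
    (datetime(2022, 10, 4, 9, 0), 20.0),
    (datetime(2022, 10, 4, 9, 1), 55.0),
    (datetime(2022, 10, 4, 9, 2), 80.5),
    (datetime(2022, 10, 4, 9, 3), 81.0),
    (datetime(2022, 10, 4, 9, 4), 79.0),
    (datetime(2022, 10, 4, 9, 5), 80.0),
]

print(deltas(readings))
print(threshold_events(readings, setpoint=80.0))
```

In this example, six readings shrink to two events. Only these would have to be stored, each with its timestamp and the measured value.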

Finally, recalculating the current state from a lot of past events may become time-consuming. If an object has a long life cycle (machines may remain switched on for weeks or months), it’s something we probably want to avoid. Event sourcing solves this by creating snapshots of the current state at regular intervals. This allows us to take into account only the events occurring after the last snapshot when calculating the current state. In a manufacturing environment with pre-existing MES and ERP systems, this is less of an issue. Since these systems are usually based on the current state, we already have a snapshot at all times. In fact, combining this with event sourcing would be closer to double-entry accounting than plain event sourcing.
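
The snapshot mechanism can be sketched as follows, using the stock level of a silo as a hypothetical example. The class names, the sequence numbers and the quantities are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StockEvent:
    sequence: int    # position in the event log, starting at 1
    delta_kg: int    # positive for a delivery, negative for consumption


@dataclass(frozen=True)
class Snapshot:
    sequence: int    # number of the last event included in the snapshot
    level_kg: int    # stock level right after that event


def replay(events, snapshot=None):
    """Calculate the stock level, starting from a snapshot if one is given."""
    level = snapshot.level_kg if snapshot else 0
    start = snapshot.sequence if snapshot else 0
    for event in events:
        if event.sequence > start:   # skip events already covered by the snapshot
            level += event.delta_kg
    return level


log = [
    StockEvent(1, 500),
    StockEvent(2, -120),
    StockEvent(3, -80),
    StockEvent(4, 300),
    StockEvent(5, -50),
]

# Take a snapshot after the third event and continue from there.
snapshot = Snapshot(sequence=3, level_kg=replay(log[:3]))

print(replay(log))            # full replay from the very first event
print(replay(log, snapshot))  # replay of events 4 and 5 only
```

Both calls give the same stock level. The second one only has to process the events recorded after the snapshot was taken.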

Work in progress

This idea isn’t new, but it’s not widely implemented yet. In many cases, companies working on smart-industry solutions have already achieved a certain level of digitalization, including MES and ERP systems based on the current state. There, data cleaning before analysis is the chosen option because it’s often faster to implement than event sourcing inside or on top of existing systems. In more greenfield environments like the ones we find at SME manufacturers that are only making their first digitalization steps, it could work out very well.

Edited by Nieke Roos