Robert Howe is the CEO of Verum Software Tools, based in Waalre. Rutger van Beusekom and Henk Katerberg work there as consultants.

6 May 2016

The Dezyne toolset can be used to rediscover the ‘lost’ behaviour of complex software systems. Verum’s Robert Howe, Rutger van Beusekom and Henk Katerberg describe how.

An unpleasant but nevertheless unavoidable truth of software development is that conventionally developed software ‘rots’ over time. Rot occurs slowly and insidiously, driven by the very nature of source code itself and the human factors that impinge on developing and maintaining it. It often starts when changes to source code are not reflected in documentation, leading to a loss of readily accessible information. It accelerates when development teams change and knowledge of the – now poorly documented – code is lost. It proliferates when new features are added by new software engineers, based on incomplete documentation and knowledge. It reaches its zenith as the law of diminishing returns kicks in and development progress grinds to a halt. It is at this point that reengineering the software becomes unavoidable.

From a business perspective, reengineering software is a nightmare. It involves spending a lot of time and money just to stand still. It’s also highly risky, simply because the existing legacy software is so poorly understood. And in the worst case, history can repeat itself, with the newly reengineered code base being no more resistant to rot than its predecessor. Both this risk and the work involved in reengineering can be greatly diminished if the essential functionality and behaviour of the software can be recovered from the legacy code base. Future rot can be minimized by converting the legacy code into verifiably complete and correct models.

In an ideal world it would be possible to automatically reverse engineer the existing code base using tools. But then, in an ideal world water could flow uphill and time could be reversed. The simple fact is that the second law of thermodynamics means that creating a more highly ordered system (reengineered software) from a less-ordered system (legacy code) requires work. The trick is to perform that work as effectively as possible.

No guarantee

Dezyne offers a means to reduce the cost and risk of the work involved in reengineering the behaviour of complex software systems. In a process that we have come to call ‘software archaeology’, the toolset can be used to rediscover a system’s ‘lost’ behaviour. The key to this process is the use of Dezyne’s interface models to capture the externally visible behaviour across legacy software interfaces and to separate it into expected behaviour on the one hand and unexpected or erroneous behaviour on the other.

Dezyne recognizes three types of models: interface models, component models and (sub)system models. An interface model is an abstraction that describes the externally visible behaviour of a component, be it a Dezyne component model or a component comprising legacy software. An interface model describes the API provided by the component it represents and also details the protocol – the sequence of allowed events and responses – that is associated with the API. When an interface model is used in conjunction with a Dezyne component model, the verification engine checks that the component completely and correctly adheres to the protocol of every interface that it provides or requires. In this way the structural integrity of entire systems composed from Dezyne components is established.
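To make the notion of a protocol concrete, the sketch below writes one down as a small state machine and checks whether a sequence of events is allowed by it. It is plain Python rather than Dezyne's own modelling language, and the ‘motor’ interface with its states and events is a hypothetical example invented for illustration.

```python
# Hypothetical protocol for an illustrative 'motor' interface (not Dezyne syntax).
# Each entry maps (state, event) to the next state; any pair that is absent
# is not allowed by the protocol.
PROTOCOL = {
    ("Idle", "start"): "Running",
    ("Running", "stop"): "Idle",
    ("Running", "fault"): "Idle",
}


def conforms(trace, initial="Idle"):
    """Return True if every event in the trace is allowed in the state where it occurs."""
    state = initial
    for event in trace:
        key = (state, event)
        if key not in PROTOCOL:
            return False
        state = PROTOCOL[key]
    return True
```

Under these assumptions, `conforms(["start", "stop"])` holds, while `conforms(["stop"])` does not, because `stop` is not allowed in the `Idle` state. Note that this only checks individual traces at runtime; the verification described above establishes adherence for all possible behaviour of a Dezyne component.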

Dezyne interface models are also used to connect Dezyne components to legacy software and vice versa. In this case the interface model represents merely an assumption of how the legacy component behaves, against which the Dezyne component is verified. The verification engine cannot be used to show that the legacy software actually complies with the interface model, meaning the interface model offers no guarantee that the legacy component will actually stick to the protocol that the interface model defines. With a small, simple or highly ordered legacy component, it’s possible to be confident that an interface model captures the exact protocol that the component implements. When reengineering legacy software, however, the component’s behaviour is poorly understood, and it’s therefore highly likely that a Dezyne interface model will represent only an approximation of the component’s visible behaviour.

This is an issue at runtime because Dezyne-generated components assume that all other components they use or are used by, including legacy components, behave correctly. The verification engine guarantees that other Dezyne components meet this requirement. However, errant behaviour by a legacy component can cause an interface protocol violation, thereby causing a Dezyne component to abort.

Catching the nasty behaviour

Since every Dezyne user needs to interface with legacy software components at some point, we have developed a technique to deal with potential errant behaviour at these interfaces. This technique involves creating ‘armoured’ interfaces between legacy and Dezyne components. Such an interface is built from two interface models sandwiched around a ‘protocol observer’ component.

The two interface models are syntactically identical, but semantically different. The outer, legacy-facing, interface model is written to be robust, meaning that it defines ‘weak’ semantics that accept a wide range of behaviour from the legacy component, including potentially errant or erroneous behaviour. The inner interface that faces the Dezyne component is written with strict semantics that accept only known, verifiably complete and correct behaviour.

The protocol observer component in the middle is written to deal with the difference between the two. This component passes intended behaviour from the legacy component inwards through the strict interface to a receiving Dezyne component. It filters out any errant behaviour by the legacy component and handles it appropriately. In the simplest case, it might take passive action by just swallowing the errant behaviour or logging it. But it could equally take affirmative action, perhaps by triggering an exception – whatever is possible and appropriate in the circumstances.
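The idea can be sketched in a few lines of Python. This is an illustration of the pattern only, not Dezyne-generated code: the `ProtocolObserver` class, the `STRICT` table and the event names are all hypothetical. The weak, legacy-facing side is represented by `receive`, which accepts any event in any state; the strict side is the `forward` callback, which only ever sees events the strict protocol allows.

```python
import logging

logger = logging.getLogger("protocol_observer")

# Hypothetical strict protocol: (state, event) -> next state.
STRICT = {
    ("Idle", "started"): "Running",
    ("Running", "stopped"): "Idle",
}


class ProtocolObserver:
    """Accepts any event from the legacy side and forwards only what the strict protocol allows."""

    def __init__(self, forward, on_errant=None):
        self.state = "Idle"
        self.forward = forward  # callback into the strict-side component
        # Default policy is passive: log the errant event and swallow it.
        self.on_errant = on_errant or self._log_and_swallow

    def _log_and_swallow(self, state, event):
        logger.warning("errant event %r in state %r swallowed", event, state)

    def receive(self, event):
        key = (self.state, event)
        if key in STRICT:
            self.state = STRICT[key]
            self.forward(event)
        else:
            # Errant behaviour never reaches the strict side.
            self.on_errant(self.state, event)
```

Passing a different `on_errant` callable, for instance one that raises an exception, turns the passive policy into an affirmative one without touching the filtering logic.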

Over the course of time, armoured interfaces have become a standard pattern for interfacing with legacy code in the Dezyne community. In practice, it turns out they have uses beyond defending Dezyne components from errant legacy code behaviour. Specifically, they can be used offensively to uncover lost legacy behaviour. In this case, we construct an armoured interface between the legacy component in question and a Dezyne component, with the protocol observer component built to trap and log all errant behaviour.

The resulting system is built, run and subjected to a wide range of tests. Should the behaviour of the legacy component at the interface differ from the assumptions in the strict interface model, the protocol observer component will capture and log the difference, providing detailed information about the true, real-world behaviour of the legacy code. Of course, the confidence level of this approach depends largely on the behavioural coverage of the test suite used to exercise the system. If that is in doubt, the armoured interfaces (which generate little overhead) could be left in place in a final system, ready to catch the nasty sort of errant behaviour that only occurs in the field.
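As a rough illustration of what such a log can yield, the Python sketch below replays a recorded event trace against a strict protocol and tallies every (state, event) pair the protocol rejects. The protocol table, the event names and the `find_deviations` helper are hypothetical; a real protocol observer would record these deviations live rather than from a replay.

```python
from collections import Counter

# Hypothetical strict protocol: (state, event) -> next state.
STRICT = {
    ("Idle", "started"): "Running",
    ("Running", "stopped"): "Idle",
}


def find_deviations(trace, protocol, initial="Idle"):
    """Count each (state, event) pair in the trace that the strict protocol rejects."""
    state = initial
    deviations = Counter()
    for event in trace:
        key = (state, event)
        if key in protocol:
            state = protocol[key]
        else:
            # The errant event is filtered out, so the state does not change.
            deviations[key] += 1
    return deviations
```

Each counted pair is a piece of behaviour the legacy component exhibits that the strict model did not anticipate. Whether it is a bug in the legacy code or a gap in the model remains an engineering judgment, but frequently recurring pairs are natural first candidates for investigation.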

Correct models

Verum’s customers have also used armoured interfaces for many years, and they have proved a useful starting point for introducing Dezyne-engineered components into a legacy code base. One approach is to identify a small legacy component that can be isolated relatively easily. This component’s interface with the rest of its system is modelled as an armoured interface and the legacy component is replaced with a verified Dezyne component. The system is then subjected to testing and any errant behaviour at the interface is dealt with by improving the Dezyne interface and component models accordingly.

Once completed for the first component, the process is repeated for the adjoining legacy component(s). In this way the behaviour of entire (sub)systems can be rediscovered and captured in verifiably complete and correct models. These then provide a basis for further reengineering and a way to dramatically slow the onset of software rot.

Edited by Nieke Roos