Nieke Roos
12 December 2019

Having operated under the radar for almost two years, Grai Matter Labs recently stepped into the spotlight, announcing its first products. According to the fabless semiconductor scale-up with Eindhoven roots, Grai One is the world’s first AI chip optimized for ultra-low-latency and low-power processing at the edge. It’s based on the company’s brain-like Neuronflow architecture.

Compared to the human brain, standard CPUs are terribly inefficient. With their classical von Neumann architectures, they’re constantly shuttling data back and forth between the central processing unit and central memory, squandering lots of power in the process. No wonder companies such as Grai Matter Labs (GML) are venturing to create neuromorphic, i.e. brain-like, processors.

GML’s technology is based on 20 years of breakthrough research on the human brain carried out at the Vision Institute of the former Pierre and Marie Curie University in Paris (now part of Sorbonne University). The fabless semiconductor company’s neuromorphic computing paradigm overcomes the limitations of standard CPUs. Like the brain, Grai One, GML’s recent hardware debut, uses a large number of local compute elements called neurons that communicate through impulses called spikes. This offers massively parallel, fully programmable sensor analytics and machine learning at reduced power consumption.
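
To get a feel for the principle (this is a generic illustration, not GML’s Neuronflow implementation), a spiking neuron only produces output – a spike – when its accumulated input crosses a threshold. The sketch below shows a minimal leaky integrate-and-fire neuron in Python; the decay and threshold values are arbitrary.

    # Minimal leaky integrate-and-fire neuron: an illustrative sketch,
    # not GML's Neuronflow implementation. Parameters are arbitrary.
    def lif_neuron(inputs, decay=0.9, threshold=1.0):
        potential = 0.0
        spikes = []
        for x in inputs:
            potential = decay * potential + x   # leak, then integrate the input
            if potential >= threshold:          # fire a spike and reset
                spikes.append(1)
                potential = 0.0
            else:
                spikes.append(0)                # stay silent: no downstream work
        return spikes

    print(lif_neuron([0.2, 0.0, 0.9, 0.1, 0.0, 1.2]))  # -> [0, 0, 1, 0, 0, 1]

As long as a neuron stays below its threshold, it sends nothing, so downstream neurons have nothing to process – the source of the power savings the article describes.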

The fully digital Grai One chip measures 20 mm² in TSMC 28nm technology. Credit: Grai Matter Labs

According to GML, its Grai One is the world’s first AI chip optimized for ultra-low-latency and low-power processing at the edge. It’s targeted at response-critical edge applications in autonomous navigation, human-machine interaction and smart healthcare markets. “Grai One processes edge AI applications orders of magnitude faster than traditional architectures while maintaining a power footprint suitable for battery-powered devices,” explains GML CEO Ingolf Held.

Silicon Hive

GML started out as Brainiac, incubated in 2016 within the iBionext healthcare startup studio in Paris. Among its founders are Vision Institute professor Ryad Benosman, iBionext chairman Bernard Gilly and Atul Sinha – a team combining experience in neuromorphic computing, silicon design and entrepreneurship. In December 2017, the company closed a 15-million-dollar Series A financing round, led by iBionext, and in April of last year, it adopted its current name. Next to its HQ in the French capital, it has offices in Silicon Valley (San Jose) and Eindhoven.


The Dutch connection comes from co-founder Sinha, a prominent figure in the Dutch high-tech industry. After having worked at Philips for 13 years, he was the long-time CEO of spinoff Silicon Hive, which was acquired by Intel in 2011 (link in Dutch). He went on to become one of the founding fathers of Preceyes and Microsure, two medical robotics startups spun out of Eindhoven University of Technology. At present, he serves on the boards of directors of several Dutch high-tech companies, including IoT security specialist Intrinsic ID and healthcare monitoring expert Sensara.

Intel’s decision, at the end of 2017, to terminate the former Silicon Hive team located at the High Tech Campus Eindhoven boosted GML’s Dutch presence. Although the American semiconductor behemoth backtracked on its plan a couple of months later (link in Dutch), much of the ‘harm’ was already done: by then, several of the 115 employees had moved just around the corner to the newly formed office of the AI startup, brought together by their former boss Sinha. Among them were Ingolf Held and Menno Lindwer, who were appointed CEO and VP Engineering, respectively.

Under Held and Lindwer’s (technological) leadership, GML has been quietly building its team and product portfolio. With Paris and Silicon Valley focusing more on machine learning applications and business development, Eindhoven is responsible for architecture exploration, hardware design and AI tools. This culminated in the introduction of the Neuronflow programmable processor technology and the Graiflow software development kit last September and the Grai One chip at the end of October.

Under the (technological) leadership of Menno Lindwer (left) and Ingolf Held (right), GML has been quietly building its team and product portfolio. Credit: Grai Matter Labs

35 mW and 20 µs

Neuronflow draws from neuromorphic and dataflow paradigms to solve core problems for real-world AI applications. The technology is designed for multiple types of computation: digital signal processing, machine learning inference, procedural computation and mixtures of these. One of its breakthroughs is dynamic dataflow processing of real-time data, which drastically reduces application latency.
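
The dataflow idea can be sketched in a few lines of Python: nodes in a graph execute only when fresh data arrives at their inputs, so quiet parts of the graph cost nothing. This is a conceptual sketch under assumed names (Node, run_dataflow), not Neuronflow’s actual programming model.

    # Conceptual dataflow sketch (assumed names, not the Neuronflow API):
    # a node runs only when a new token arrives on one of its inputs.
    from collections import deque

    class Node:
        def __init__(self, name, func, successors=()):
            self.name, self.func, self.successors = name, func, successors

    def run_dataflow(tokens):
        """tokens: iterable of (node, value) pairs injected by a sensor."""
        queue = deque(tokens)
        while queue:
            node, value = queue.popleft()
            result = node.func(value)            # compute only when data arrives
            if result is not None:               # propagate only meaningful output
                for succ in node.successors:
                    queue.append((succ, result))

    sink = Node("sink", lambda v: print("output:", v))
    filt = Node("filter", lambda v: v if v > 0.5 else None, successors=(sink,))
    run_dataflow([(filt, 0.2), (filt, 0.8)])     # only 0.8 reaches the sink

Because execution is driven by arriving data rather than by a fixed frame rate, results can leave the graph as soon as the relevant inputs have been seen, which is where the latency gain comes from.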

The underlying architecture utilizes in-memory compute with a mesh of cores and local neuron/synapse memories, avoiding the memory bottleneck of the traditional von Neumann model. The neuron cores process 8- or 16-bit data, are event-triggered and are connected through a packet-switched network-on-chip. By processing and propagating only the sparse change events, the system has much less work to do and uses much less power.
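
To see why processing only change events saves work, consider the sketch below: it compares two consecutive input frames and emits events only for values that actually changed. The function name and threshold are illustrative, not part of Grai One’s interface.

    # Illustrative sketch of sparse change-event generation (not Grai One's API):
    # only inputs that changed beyond a threshold produce downstream work.
    def change_events(prev_frame, new_frame, threshold=0.05):
        events = []
        for i, (old, new) in enumerate(zip(prev_frame, new_frame)):
            if abs(new - old) > threshold:       # skip unchanged inputs entirely
                events.append((i, new - old))    # (index, delta) event packet
        return events

    prev = [0.10, 0.50, 0.30, 0.80]
    new  = [0.10, 0.90, 0.30, 0.79]
    print(change_events(prev, new))              # -> a single event, at index 1

In a mostly static scene, almost all inputs fall below the threshold, so only a handful of event packets travel across the network-on-chip per frame.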

Based on Neuronflow, the fully digital Grai One chip measures 20 mm² in TSMC 28nm technology and implements a mesh of 196 neuron cores with local memories for a total of 200,000 neurons. It provides a GPIO interface to offload latency-critical AI workloads. At 100 percent neuron core utilization, the chip consumes as little as 35 mW and has a latency as low as 20 µs. For keyword spotting, the latency was benchmarked below 3 µs, while for hand gesture recognition, it was even below 1 µs.
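
As a rough back-of-envelope, assuming purely for illustration that a task takes the quoted 20 µs while the chip sits at its 35 mW full-utilization draw, the energy per task works out to well under a microjoule:

    # Back-of-envelope energy estimate from the published figures
    # (assumes the full 35 mW draw for the entire 20 µs task).
    power_w = 35e-3        # 35 mW
    latency_s = 20e-6      # 20 µs
    energy_j = power_w * latency_s
    print(f"{energy_j * 1e6:.1f} uJ per task")   # -> 0.7 uJ

That order of magnitude is what makes the “power footprint suitable for battery-powered devices” claim plausible.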

Neuronflow and its future silicon implementations are supported by GML’s Graiflow SDK. It handles both conventional program execution and machine learning computation through industry-standard languages and frameworks like TensorFlow, Python and C++. The kit includes a graphical editor, compute and network APIs, a mapper, a simulator, a debugger, a code generator and full runtime support.
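
The article doesn’t detail Graiflow’s own API, but since the kit accepts TensorFlow input, a network destined for the chip could in principle start life as an ordinary small Keras model like the one below – a generic keyword-spotting-style classifier, purely illustrative and not an example of the Graiflow toolchain.

    # A generic small Keras model of the kind a TensorFlow-based flow could
    # ingest; purely illustrative, not the Graiflow toolchain itself.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(49, 10)),            # e.g. MFCC frames of an audio clip
        tf.keras.layers.Conv1D(16, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(12, activation="softmax"),  # 12 keyword classes
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    model.summary()

Mapping such a network onto the chip’s 196 neuron cores is then the job of the SDK’s mapper and code generator.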