René Raaijmakers
10 October

If headlines are any clue, machine learning is taking the semicon industry by storm. Bits&Chips talks to two of ASML’s data scientists about the impact their discipline has on the hardcore physics company. The litho giant is increasing its use of machine learning to support its holistic lithography products.

If you want to see data science walking hand in hand with physics, visit the data science groups at ASML. This workplace is crowded with scientists – most of them physicists – focused on finetuning lithography processes by sifting through mountains of data.

ASML may be known as an optics and mechanics stronghold but lithography actually never was a domain of pure physics alone. Even the first wafer stepper that was built at Philips’ laboratories in the early seventies needed external data to keep itself on the right track. Herman van Heek, the system architect of this machine, consulted the Dutch national meteorology service, KNMI, to check the air pressure on an hourly basis. This way, he prevented his machine from drifting during different weather conditions because of small changes in atmospheric pressure impacting the wavelength of the laser used in the interferometry system that positioned the wafer stage.
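The size of the effect Van Heek was guarding against can be estimated on the back of an envelope. The sketch below is an illustration only: it assumes constant temperature, a refractivity of air of roughly 2.7e-4 at standard pressure that scales linearly with pressure, and a hypothetical 10-centimeter beam path through air. None of these figures come from the original machine.

```python
# Rough estimate of interferometer drift caused by a change in air pressure.
# Assumptions (illustrative, not taken from the Philips stepper):
#   - the refractivity of air (n - 1) is about 2.7e-4 at 1013.25 hPa
#   - refractivity scales linearly with pressure at constant temperature
#   - the measurement beam travels 0.1 m through air

REFRACTIVITY_AT_STD = 2.7e-4   # n - 1 at standard pressure (approximate)
STD_PRESSURE_HPA = 1013.25


def apparent_shift_nm(path_length_m: float, pressure_change_hpa: float) -> float:
    """Apparent position error in nanometers for an uncorrected pressure change."""
    delta_n = REFRACTIVITY_AT_STD * pressure_change_hpa / STD_PRESSURE_HPA
    return path_length_m * delta_n * 1e9


shift = apparent_shift_nm(path_length_m=0.1, pressure_change_hpa=10.0)
print(f"Apparent shift for a 10 hPa change: {shift:.0f} nm")
```

Under these assumptions, a 10-hectopascal swing in the weather moves the apparent stage position by a few hundred nanometers, which suggests why hourly pressure readings were worth the trouble.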

These days, data is of growing importance to printing every detail on a chip the right way. The methods scanners use to finetune themselves with internal and external information can be built with machine learning. This way of working complements physical modeling. No wonder this area of expertise is now on the agenda of ASML’s top engineers.

ASML’s chief technology officer Martin van den Brink devoted several slides to machine learning at the company’s recent Investor Day in Veldhoven. In short, his message was: data translates into value for customers. As an example, Van den Brink told his audience about the improvements in the field of optical proximity correction – OPC makes minute adaptations to the patterns on the masks to influence the exposure in such a way that they result in better structures on the chip. “Our physical modeling capability inside our Brion activity combined with a machine learning system enabled us to substantially drive the model accuracy up,” the CTO said. He added with a smile: “This is a bit of a fight between physical modeling and machine learning but we’ve learned over the last year that by combining it with the most advanced learning algorithms, we can continue to drive OPC accuracy.”

Solving data science puzzles

At international litho conferences like SPIE, machine learning presentations and papers started to pop up only very recently. “This is really different from earlier years,” says Alexander Ypma, manager of one of the data science groups at ASML.

At its own technology conference, the Dutch machine builder added a stepper data analytics track to the program for the first time in 2018. Before an audience of engineers and scientists, CTO Van den Brink clearly underlined the untapped potential of machine learning. “Being part of the technology conference program is a concrete sign that this work is considered very seriously,” states Ypma.

The perception of the importance of machine learning is slowly shifting, adds press officer Sander Hofman. “Our top technologists have only in the last couple of years got on board.”

At this moment, about fifty people are working as data scientists at ASML and this number is growing steadily. This core group is immersed in a community of around 250 people, mostly physicists, involved in solving data science puzzles. “We talk to a lot of researchers and developers,” says Maialen Larrañaga, a data scientist at ASML. “We organize meetups to share our knowledge. People are getting more involved, also those who do not have a background in machine learning.”

ASML is still building its machine learning competence in a structured way. The big challenge is to involve all expertise and experience within the company to keep improving the learning algorithms. In her first years, Larrañaga worked on an exploratory project that looked at how to combine automatic modeling with the knowledge of domain experts and physicists.

The applied data science team has built up an internal network to connect several hotspots with skills and knowhow within ASML. “This work is currently evolving,” clarifies Larrañaga. “We have sites around the world and we have several groups here and we’re trying to connect them, for instance by bringing the community in Veldhoven together several times a year with external speakers.”

Expensive data

Machine learning can be applied to the lithography process because it produces piles of data. The ultraviolet light source of one scanner alone is monitored through some 1,500 parameters that together generate 1.5 terabytes a day. A total EUV system produces 31 terabytes a week. All kinds of sensors in the machine measure every wafer and every die on a wafer.

At other locations in a wafer fab, the metrology instruments Yieldstar and HMI’s e-beam inspection can measure at the wafer level whether everything is going according to requirements. They check for things like overlay errors and defects. The scanner uses this information to apply the proper corrections, and algorithms use it to make predictions.

The general public may have the idea that an abundance of data, computational power and advanced algorithms automatically translates into correct actions or results. This is clearly not the case, asserts Larrañaga. “You cannot simply take raw input data and expect that something good is going to come out.”

The first thing machine learning specialists need is meaningful data. They express their information needs in data points. Around 5,000 are needed for decent results but for excellent results, you need millions of measurements. Optical and e-beam inspection tools can deliver this data but at a price. “That’s why we call it expensive data,” says Larrañaga. “It’s costly because measurements are time-consuming.” Physical measurements should therefore be as informative as possible and planned efficiently in a running chip factory.

Avoiding these delays is where machine learning kicks in. Larrañaga: “In a manufacturing process, we need to be creative. The idea is that we use the data to do predictions of the best settings to process the wafer. There’s lots of information about masks, wafer topology, alignment and corrections that have been applied to previous lots. We can use all these to correct and improve the exposures. If we’re confident enough about what we predict, we don’t have to measure. We can measure more efficiently, which results in a reduction of cycle time.”
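The idea of measuring only when the prediction is not trusted can be sketched in a few lines. The example below is a hypothetical illustration, not ASML’s method: it assumes several models each predict the overlay error of a wafer, uses the spread between them as a stand-in for confidence, and applies an arbitrary threshold (`MAX_SPREAD_NM`).

```python
import statistics

# Hypothetical threshold: skip the measurement when the models agree to
# within this spread (in nanometers). The value is illustrative only.
MAX_SPREAD_NM = 0.5


def decide(ensemble_predictions_nm: list[float]) -> tuple[float, bool]:
    """Return the mean prediction and whether a physical measurement is needed."""
    mean = statistics.fmean(ensemble_predictions_nm)
    spread = statistics.stdev(ensemble_predictions_nm)
    return mean, spread > MAX_SPREAD_NM


# Predicted overlay error for two wafers, from five hypothetical models each.
confident_lot = [2.1, 2.0, 2.2, 2.1, 1.9]
uncertain_lot = [2.1, 0.4, 3.5, 1.2, 2.9]

for name, preds in [("confident", confident_lot), ("uncertain", uncertain_lot)]:
    mean, needs_measurement = decide(preds)
    action = "measure" if needs_measurement else "skip measurement"
    print(f"{name}: predicted {mean:.2f} nm -> {action}")
```

Only the wafer on which the models disagree is sent to metrology; the other one proceeds on the strength of the prediction, which is where the cycle time is saved.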

Process fingerprint

The metrology and computation toolboxes have to keep up with the ever-shrinking features on chips. “Shrink means more complex chips, which requires more advanced models to do corrections,” says Ypma.

ASML is applying loads of tricks to support the gathering of data. One example is the specific markers and targets that are added: small support structures that are printed on the wafer and that are only there for calibration purposes. Ypma: “They’re not part of the product structures. They’re often printed on wafer areas that aren’t used, like the spaces between chips that will be sawn away later, when the wafer is diced.”

After the scanner prints these structures, a comparison between the expected position and the measured spots results in a reference. “We typically see patterns over the entire wafer,” reports Ypma. “You could say that this gives us a fingerprint of the processes in the fab.”

Chemical mechanical polishing (CMP) is an example of a process step that can impact the position of small features. It’s a polishing process that chip manufacturers use to flatten the whole wafer during the fabrication of the copper interconnects. “CMP is a grinding rotating movement,” explains Ypma. In the nanometer world, these kinds of mechanical processes can bend the microscopic chip details like the hairs on a silk carpet. “The resulting error patterns aren’t random. If the CMP steps are suboptimally configured, you may see rotation patterns on the wafer.”
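A rotation pattern of this kind can be pulled out of marker measurements with a simple least-squares fit. The sketch below assumes a basic linear overlay model with translation, magnification and a small rotation; the marker layout and the numbers are invented for illustration, and real fingerprint models are considerably richer.

```python
# Fit translation, rotation and magnification to marker displacements.
# Treating each wafer position (x, y) as the complex number z = x + iy turns
# the model  d = T + (M + iR) * z  into ordinary linear least squares:
#   dx = Tx + M*x - R*y,   dy = Ty + M*y + R*x

def fit_fingerprint(positions, displacements):
    """Return (T, M, R): complex translation, magnification, rotation (rad)."""
    n = len(positions)
    z_mean = sum(positions) / n
    d_mean = sum(displacements) / n
    num = sum((z - z_mean).conjugate() * (d - d_mean)
              for z, d in zip(positions, displacements))
    den = sum(abs(z - z_mean) ** 2 for z in positions)
    c = num / den                      # c = M + iR
    translation = d_mean - c * z_mean
    return translation, c.real, c.imag


# Synthetic example: marker positions in mm, displacements in mm.
markers = [complex(x, y) for x in (-100, 0, 100) for y in (-100, 0, 100)]
true_rotation = 2e-8                   # rad, hypothetical
true_shift = complex(3e-6, -1e-6)      # mm, hypothetical
measured = [true_shift + complex(0, true_rotation) * z for z in markers]

T, M, R = fit_fingerprint(markers, measured)
print(f"translation = {T}, magnification = {M:.2e}, rotation = {R:.2e} rad")
```

On this noise-free synthetic data the fit recovers the rotation and shift that were put in; with real measurements, whatever remains after subtracting the fitted terms is the part of the fingerprint that still needs explaining.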

Spurious relations

To make sense of all the data, machine learning specialists lean on other human beings. Showing these process experts raw data is not an option. The information has to be pre-digested before it can be interpreted. “We need to serve it in such a way that experts can analyze it in a limited amount of time,” Larrañaga points out. “Humans have a short concentration span.”

The idea is to develop informative dashboards. “We really don’t want to overwhelm the experts. For that, we need to design proper tooling and proper visualization. Our goal is to show meaningful information to make sure it’s easy for them to draw conclusions.”

Not all information is interpreted as easily as a nano-scale crop circle from a CMP process. Sometimes there seems to be a causal relationship where in reality, there is none. Especially in those cases, human help is needed. Larrañaga: “We often see spurious relations. For instance, signals that seem to correlate for overlay but actually don’t have any causal connection to it. Then experts can help determine if that data is a proxy for something else.”
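A toy simulation shows how such a spurious relation arises and how knowledge of the underlying cause exposes it. In the sketch below, all data is synthetic and the names are hypothetical: a hidden driver influences both a sensor signal and the overlay error, which do not influence each other. Once the driver is regressed out, the apparent correlation disappears.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def residuals(y, x):
    """Remove the best linear fit of x from y (simple regression)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


rng = random.Random(0)
n = 2000
# Hypothetical hidden driver, e.g. a process condition the expert knows about.
driver = [rng.gauss(0, 1) for _ in range(n)]
# Both the sensor signal and the overlay error depend on the driver,
# but not on each other.
signal = [d + 0.5 * rng.gauss(0, 1) for d in driver]
overlay = [d + 0.5 * rng.gauss(0, 1) for d in driver]

raw = pearson(signal, overlay)
partial = pearson(residuals(signal, driver), residuals(overlay, driver))
print(f"raw correlation: {raw:.2f}, after removing the driver: {partial:.2f}")
```

The catch is that the check only works if someone knows which driver to remove, and that is the kind of input a process expert can provide.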

It’s also necessary to identify and filter out irrelevant information, such as the individual profile of each scanner, since no two machines are 100 percent identical. “Each wafer that’s exposed in a particular scanner carries the fingerprint of this machine,” Larrañaga illustrates. “Many actuators and sensors contribute to that fingerprint. Experts can also help make sense out of this. The idea is to use this information to better characterize machine fingerprints and correct for them.”

Thousand dimensions

In a chip factory, many characteristics can be relevant to explain certain visible error patterns. Machine learning specialists call those complex environments “very high-dimensional spaces”. Ypma: “We need lots and lots of data to describe them. But when a space is very sparsely filled, it will be very difficult to teach a machine to make correlations or classifications in that space. It can only be done when you have enough data.”

To find needles in data haystacks, expert knowledge is of great help. “It’s very common for us to have a space with a thousand dimensions,” Ypma says. “You need a huge amount of data to fill that space. Even if we do have this information, we still need to come up with smart ways to look into the relevant parts of the space. Of the thousand variables, there might only be twenty that are really relevant. Machine learning will help us confine the problem to these twenty dimensions. Domain experts can be of great help to identify only the relevant part.”
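Narrowing many candidate variables down to the few that matter can be illustrated with a deliberately simple technique: ranking each variable by its correlation with the quantity of interest. The sketch below is a scaled-down toy on synthetic data (200 variables with 5 relevant ones, rather than a thousand with twenty), and correlation screening is a stand-in chosen for brevity, not the method used at ASML. It would miss variables that only matter in combination, which is one reason domain experts remain in the loop.

```python
import random

rng = random.Random(1)
N_SAMPLES, N_FEATURES = 500, 200
RELEVANT = {3, 42, 77, 120, 199}       # hypothetical ground truth

# Each row is one observation with N_FEATURES candidate variables.
X = [[rng.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(N_SAMPLES)]
# The target depends on the relevant features only, plus noise.
y = [sum(row[j] for j in RELEVANT) + 0.5 * rng.gauss(0, 1) for row in X]


def abs_correlation(column, target):
    """Absolute Pearson correlation between one feature and the target."""
    n = len(target)
    mc, mt = sum(column) / n, sum(target) / n
    cov = sum((c - mc) * (t - mt) for c, t in zip(column, target))
    vc = sum((c - mc) ** 2 for c in column)
    vt = sum((t - mt) ** 2 for t in target)
    return abs(cov) / (vc * vt) ** 0.5


scores = {j: abs_correlation([row[j] for row in X], y) for j in range(N_FEATURES)}
top = sorted(scores, key=scores.get, reverse=True)[:len(RELEVANT)]
print("selected features:", sorted(top))
```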

Complementing physical modeling

Ypma underlines that data science has a solid foundation, with ways to make empirical models. “With the available knowledge, we’re able to interpret the underlying reasons why machine learning has found a particular model. That’s how we see it at ASML – as an extension of our modeling portfolio. We want to integrally connect it to the physics domain knowledge we have. The combination of both will make a difference.”

Applied to very large amounts of historical data, machine learning can complement physical modeling. Ypma: “We have extracted some of the underlying fingerprints of fabs and wafer lots by using data of many, many wafers in close collaboration with our customers. We characterized sets of wafers in relation to particular customer uses. This resulted in a kind of fingerprint library that we can use to quantify the impact of new wafers on specific processes and even link that to process steps. This is a machine learning exercise that’s very difficult to do with a physical model. It shows machine learning isn’t only about corrections but also about painting the bigger picture.”

Asked about future developments to prove the business case of machine learning in the litho market, Ypma starts out with a general statement. In all parts of the chip factory, data will be playing an increasingly important role, he says, both for applications and for optimizing the performance and availability of ASML’s equipment. “More data, better modeling tooling, learning how to integrate that with domain knowledge,” summarizes Ypma.

But in the end, we touch upon the metrology systems roadmap that ASML publicly talks about. For example, HMI recently upgraded its systems to nine inspection beams and expects upgrades to 5×5 and 11×11 beams – all the way up to 400 beams. Ypma: “The paradigm is shifting towards bringing analytics to the data instead of the other way around. This means that eventually, the majority of data will likely stay at the customer sites.”