1 Introduction

How mechanisms couple across multiscale microstructure, and what consequences that coupling has for the properties and performance of materials, remains an open and challenging problem. Nevertheless, efforts to formulate the structure of materials quantitatively, to understand how structure changes with composition and processing, and to relate structure to useful properties run throughout the history of materials science and engineering (MSE) and define its core tenet; this is also the core idea of materials genome engineering (MGE). Hooke was the first to investigate the world of microstructure, publishing his book Micrographia in 1665 [1]. Two hundred years later, in 1863, Sorby discovered that steel has a microstructure and that its strength is remarkably affected by that microstructure [2]. Considering only the richest aspects of the large and complicated structural hierarchy of materials, i.e., the interlocking of smaller parts that generates larger overarching structures, Smith found in the 1970s that the analytical approaches of his time were insufficient and hence turned to the artist’s approach [3]. Although the recent Materials Genome Initiative (MGI) in the US [4] and the MGE platform in China [5] have achieved substantial progress, Smith’s expectation of a new model that can help to understand the rich and necessary interactions between different levels of structural hierarchy has yet to be met. It is worth noting recent progress in the field of mechanics, where the concept of a structure genome (SG) was proposed by Yu in 2016 and defined as the smallest mathematical building block of a structure, emphasizing that it contains all the constitutive information needed for a structure in the same way that the genome contains all the genetic information for an organism’s growth and development [6].
Although the concept of the structure genome is useful, the structures considered in the finite element models [7, 8] are idealized, tailored for the convenience of mathematical treatment, and by no means represent the realistic and complex microstructure frequently encountered in metallurgy and MSE. Multiscale materials modeling [9] can generate a plethora of useful data, but it still represents the traditional way of solving problems through logical derivation. A data-driven approach empowered by artificial intelligence (AI) and machine learning (ML) is undoubtedly a true enabler and driver for the MGI. In the context of the MGI evolving into version 2.0 and being re-energized and scaled up [10], this work serves as a perspective and, through a focused review of relevant fields, proposes a hierarchical microstructure descriptor of the kind Smith had been expecting since the 1970s. Without a scientifically sound and robust microstructure descriptor, the vision of the MGI cannot be fulfilled.

2 Wavelet, nonlinearity and invariant

Nowadays, microstructural images are converted to digital signals, which allows modern digital image processing tools to be applied. LeCun et al. [11] reported significant progress in 2015 on deep learning computational models for recognizing visual objects in images. However, the mathematical mechanisms behind this success remain mysterious. Mallat, a mathematician well known for his work on the multi-resolution framework of wavelet analysis, revealed what is learned in a deep convolutional neural network [12]. Understanding the psychophysics of vision in a mathematical way [13] can shed light on the key ingredients of the underlying mechanisms, i.e., directional wavelets, nonlinearity, large receptive fields, and some forms of invariance.

Although it is not obvious, the wavelet, a purely mathematical object, is inherently connected with material microstructure because both are multi-scale systems by nature. Discontinuities exist in a wavelet just as interfaces are present in microstructure composed of different chemical phases. This similarity makes wavelet analysis a superior tool for systematically localizing features in microstructure. Unfortunately, Fourier analysis rather than wavelet analysis has been used more widely. Interfaces in microstructure cause serious problems for the infinitely smooth sines and cosines, and the spatial locations of features are lost in Fourier transforms. Microstructure forms through nonlinear physical processes, so nonlinearity must enter any learning framework that aims to understand it. Recognizing a complex system such as material microstructure works in the same way as recognizing a person: it requires a large set of training data, or many encounters, and recognition starts with an overall impression and then proceeds progressively towards finer and finer details. A microstructural image of \(512 \times 512\) \(\text {pixels}^2\) represents a vector in a linear space with a dimension of \(N = 512\times 512 \approx 3\times 10^5\). Filling such a high-dimensional microstructure space would require an astronomically large number of samples, a problem known as “the curse of dimensionality”. In addition, microstructural images obtained from either experimental characterization or computer simulation show huge variability even within the same class of material. Therefore, microstructure recognition and classification should be based on invariants, meaning that a microstructure can still be identified when its features are subjected to small deformations or to symmetry operations such as translation and rotation.
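The need for invariants can be illustrated with a minimal sketch. It assumes a synthetic random array as a stand-in for a micrograph and a circular shift as the translation; neither comes from the data discussed in this work. The raw pixel vector moves a long way in the high-dimensional space under a small translation, whereas a simple invariant such as the image average does not move at all.

```python
import numpy as np

# Synthetic stand-in for a 512 x 512 micrograph (an assumption, not real data).
rng = np.random.default_rng(0)
image = rng.random((512, 512))

# Dimension of the linear space in which the image lives as a vector.
n_dims = image.size

# Translate the image circularly by 5 rows and 3 columns.
shifted = np.roll(image, shift=(5, 3), axis=(0, 1))

# The raw pixel vectors differ strongly, although the "microstructure" is the same.
pixel_distance = np.linalg.norm(image - shifted)

# The global average is unchanged by the translation.
mean_difference = abs(image.mean() - shifted.mean())

print(n_dims, pixel_distance, mean_difference)
```

The average alone is far too coarse to distinguish microstructures, which is why richer invariants are needed.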

3 Microstructure genome

Mallat proposed in 2016 the following way to construct hierarchical invariants under translation using Morlet wavelets [7]. The image x is first convolved with a scaling function \(\phi\) to obtain the zeroth order invariant, which is essentially the average of the image; next, the image is convolved with directional wavelets \(\psi _{j,\theta }\) at scale j and orientation angle \(\theta\). The modulus operator \(\vert \cdot \vert\) is then applied to introduce nonlinearity and contract the microstructure space. To construct an invariant, the result is averaged with the scaling function again. Because averaging removes high-frequency information, successive wavelets in the wavelet system are used to recover it. To construct the next order invariant, the modulus operator and the scaling function must be applied again. The above procedure forms a loop and generates the required hierarchical invariants under translation, i.e., the zeroth order \(S^0\), the first order \(S^1\), the second order \(S^2\) and so on:

$$S^0x(u)=x\star \phi _J(2^Ju)$$
(1)
$$S^1x(j_1,\theta _1,u)=\vert x\star \psi _{j_1,\theta _1} \vert \star \phi _J(2^Ju)$$
(2)
$$S^2x(j_1,j_2,\theta _1,\theta _2,u)=\vert \vert x\star \psi _{j_1,\theta _1} \vert \star \psi _{j_2,\theta _2}\vert \star \phi _J(2^Ju)$$
(3)

where x(u) is the image with u the spatial position index, and \(S^0 x(u)\) is the zeroth order invariant of x(u). The symbol \(\star\) represents convolution and J the spatial scale of the transform.
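Equation (2) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the implementation used for the figures in this work: the Gaussian scaling function, the Morlet parameters `sigma0` and `xi0`, the periodic (FFT-based) convolution, the square image, and all function names are choices made here for illustration only.

```python
import numpy as np


def _grid(size):
    """Integer pixel coordinates centred on the middle of a size x size image."""
    coords = np.arange(size) - size // 2
    return np.meshgrid(coords, coords, indexing="ij")


def gaussian_2d(size, sigma):
    """Unit-sum Gaussian low-pass window, used here as the scaling function phi_J."""
    yy, xx = _grid(size)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()


def morlet_2d(size, j, theta, sigma0=0.8, xi0=3 * np.pi / 4):
    """Zero-mean complex Morlet wavelet at scale 2**j and orientation theta (radians)."""
    sigma, xi = sigma0 * 2**j, xi0 / 2**j
    yy, xx = _grid(size)
    envelope = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    carrier = np.exp(1j * xi * (xx * np.cos(theta) + yy * np.sin(theta)))
    # Subtract a constant so that the wavelet sums to zero (no response to flat regions).
    beta = (envelope * carrier).sum() / envelope.sum()
    return envelope * (carrier - beta) / (2 * np.pi * sigma**2)


def circ_conv(x, kernel):
    """Circular convolution via FFT; the centred kernel is shifted to the origin first."""
    return np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(np.fft.ifftshift(kernel)))


def first_order_descriptor(x, n_scales=3, n_orient=8):
    """Return log10 of the summed S^1 coefficients, shape (n_scales, n_orient)."""
    size = x.shape[0]  # assumes a square image
    phi = gaussian_2d(size, sigma=0.8 * 2**n_scales)
    step = 2**n_scales
    out = np.empty((n_scales, n_orient))
    for j in range(n_scales):
        for k in range(n_orient):
            psi = morlet_2d(size, j, np.pi * k / n_orient)
            u1 = np.abs(circ_conv(x, psi))                # |x * psi_{j,theta}|
            s1 = circ_conv(u1, phi).real[::step, ::step]  # average, then sample at 2^J u
            out[j, k] = np.log10(s1.sum())
    return out
```

With periodic boundaries the summed coefficients are unchanged by circular translation of the image; for real micrographs with non-periodic boundaries the invariance holds only approximately.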

Smith urged metallographers to use quantitative methods wherever appropriate [14], but the methods available at his time were essentially lower order statistical measures, e.g., measuring the grain size by counting the number of grain boundaries intercepted by a traverse of known length. Unfortunately, such methods are still widely adopted today. Figure 1e plots the hierarchical descriptors, constructed using the wavelet zoom and invariants described above, of the three images shown in Fig. 1a–c, which were reported in Smith’s 1967 work on metallic artifacts [14]. Morlet wavelets spanning 6 spatial scales, each with 8 orientations, shown in Fig. 1d, are used to construct the hierarchical descriptors. Each colored line in Fig. 1e connects 48 data points and represents a hierarchical description of one microstructural image. Each data point shows, on a \(\text {log}_{{10}}\) scale, the total sum of the invariant \(S^1\) at a specific scale \(j_1\) and orientation \(\theta _1\) according to Eq. (2). In other words, each data point is the \(\text {log}_{{10}}\) of the total sum of the averaged modulus (or absolute value) of the image (or matrix) resulting from the convolution of the microstructural image with one of the directional wavelets shown in Fig. 1d. The value at each data point therefore measures the degree of structural similarity between the microstructure and one of the 48 wavelets, which play the role of structure genomes. Since wavelets capture changes or interfaces (boundaries) in microstructure, the larger the \(S^1\) values, the greater the amount of interface. The microstructure in Fig. 1a contains a large number of homogeneously distributed fine spheroidized carbides. The details of the interfaces between the carbides and the matrix, where abrupt changes occur, are expected to be captured by the wavelets. Among the three microstructural images in Fig. 1, panel a contains the largest number of interfaces and therefore exhibits the largest \(S^1\) features up to scale 3. Both structures in Fig. 1b, c clearly contain interfaces at multiple scales, some of which are larger than those in Fig. 1a. The flatter blue curve in Fig. 1e indicates that Fig. 1a contains a more homogeneous microstructure, with carbide scales ranging from 0 to 3, and the curve drops dramatically at scales 4 and 5. In contrast, the hierarchical features of the slag inclusions in Fig. 1b, c show the following pronounced characteristics: the maximum of the \(S^1\) features at each scale increases as the scale index j increases, and the \(S^1\) features vary significantly with the orientation index \(\theta\). Furthermore, the maxima of the \(S^1\) features at the larger scale indices from 3 to 5 stabilize at the \(\theta\) indices of 4 and 7 for Fig. 1b, c, respectively. These characteristics are not observed in Fig. 1a, but they match well our visual impression that larger-scale slag inclusions dominate and that obvious orientational features are present in both Fig. 1b, c. It is worth noting that the spatial locations of the features are not presented in Fig. 1 but are in fact available. It is remarkable that a sequence of 48 numbers serves to describe quantitatively the structural hierarchy of an image with around 300,000 pixels. The author believes that this could be something that Smith was expecting in his time. If the \(S^1\) features are not sufficient, higher order invariants, or even invariants under both translation and rotation, can be extracted on the same principle. Figure 2 shows the \(S^2\) features of the seven microstructural images in Figs. 1 and 3. Although the images are not zoomed to the same scale, the spiky spectra, each a sequence of 384 numbers, supply high-frequency information missing from \(S^1\).
The hierarchical invariants themselves form a system and provide a physically matchable descriptor for structural hierarchy. In the same spirit as the coined terms “materials genome” and “structure genome”, a system of hierarchical invariants that serves as an inherent descriptor of the structural hierarchy of microstructure is denoted here as a “microstructure genome”.
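The descriptor lengths quoted in the text (48 and 64 first order invariants, 384 second order invariants) can be reproduced with a short counting sketch. The function names are hypothetical, and the rule that second order invariants are kept only for scale pairs with \(j_1 < j_2\) is an assumption: it is a common convention in scattering transforms and is adopted here because it reproduces the count of 384 for four scales and eight orientations.

```python
from math import comb


def n_first_order(n_scales, n_orient):
    """One S^1 invariant per (scale, orientation) pair."""
    return n_scales * n_orient


def n_second_order(n_scales, n_orient):
    """One S^2 invariant per scale pair with j1 < j2 (assumed) and per orientation pair."""
    return comb(n_scales, 2) * n_orient**2


print(n_first_order(6, 8))   # six scales, eight orientations
print(n_first_order(8, 8))   # eight scales, eight orientations
print(n_second_order(4, 8))  # four scales, eight orientations
```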

Fig. 1

Plots of the total sum of the first order hierarchical invariant \(S^1x\) in logarithmic scale versus its serial number in e of three metallic artefacts with microstructure shown in a–c, respectively, using a Morlet wavelet system with six spatial scales and eight orientations as displayed in d. The corresponding microstructural images (magnification \(\times\)500, \(800 \times 800\) \(\text {pixels}^2\)) of: a spheroidized carbides in a Luristan sword; b slag inclusions in an iron clamp from Persepolis (ca. 530 B.C.); and c slag inclusions in a Luristan dagger are after Smith’s work in 1967 [14] and replotted according to a colormap. In d, the scaling function \(\phi\) is plotted at (row 1, column 1) and the real parts of the wavelet system \(\psi\) are plotted with the spatial scale j increasing from 0 to 5 for row indices running from 1 to 6 and the orientation index increasing from 1 to 8 for column indices running from 2 to 9. The imaginary parts of the wavelet system are plotted in columns 10 to 17 in the same manner. In e, there are eight invariants constructed according to Eq. (2) using the eight directional wavelets at each scale

Fig. 2

Plots of the total sum of the second order hierarchical invariant \(S^2x\) versus its serial number for the seven microstructural images shown in Figs. 1 and 3. A Morlet wavelet system with four spatial scales and eight orientations is used, resulting in 384 invariants of the second order. The first seven curves from top to bottom correspond to the microstructural images in Figs. 1a–c and 3a–d, respectively. The last four curves correspond to \(\theta _1\), \(\theta _2\) multiplied by a factor of 3, and \(j_1\), \(j_2\) multiplied by a factor of 6, respectively, for better illustration. Each curve is filled with a color according to its average value. The symbols \(\theta _1\), \(\theta _2\), \(j_1\), and \(j_2\) are defined in Eq. (3)

Fig. 3

Plots of the total sum of the first order hierarchical invariant \(S^1x\) versus its serial number in e of typical microstructure of copper (\(512 \times 512\) \(\text {pixels}^2\), replotted according to a colormap) from a no reduction in area, to b 23%, c 42%, and d 68% reduction, as copper grains become squashed in the direction of working [16]. A Morlet wavelet system with four spatial scales and eight orientations is used. Note that a quadruple-well structure becomes more pronounced in the curves as deformation continues from b to d

4 Materials genome

As Kalidindi pointed out, it is the microstructure space that should be used to establish quantitative linkages with chemical compositions and processing parameters on the one end and with property and performance indicators on the other [15]. Now that we have an inherent descriptor in the form of the microstructure genome, does it enable our search for materials genomes? The following example provides some hints. Figure 3a–d shows typical microstructures of copper from no reduction in area to 68% reduction, as the copper grains become squashed in the direction of working [16]. Before rolling, the descriptor rises in steps of decreasing size as the scale increases from 0 to 3. As rolling continues, a quadruple-well structure becomes more pronounced in the descriptor curve. If the depth of a well is defined as the difference in \(S^1\) between the vertical, i.e., \(\theta = 1\), and horizontal, i.e., \(\theta = 5\), orientations at each scale, the depth of the well is found to increase at all scales from 0 to 3 as deformation proceeds. On the property side, experimental data suggest that elongation decreases while hardness continues to rise as the copper grains become more deformed and squashed. Figures 4 and 5 provide additional examples of applying the hierarchical descriptor to literature data to connect microstructure with property. Figure 4 shows the 64 hierarchical invariants \(S^1\) of multicomponent (TiZrHfNbTaMo)C high entropy ceramics pressureless sintered at different temperatures, studied by Zhang et al. [17]. The authors reported that the grain size increases from microstructure a to d, based on counting grains with the assistance of a software package. In contrast, the hierarchical invariants in Fig. 4 provide rich information about the grains at different scales.
For example, microstructure b contains almost equiaxed grains at scales from 1 to 4 and fewer grains at scales 6 and 7, while microstructures c and d contain larger grains, since their hierarchical invariants increase at scales 6 and 7. The property data show little difference in hardness among the four ceramics, and therefore no attempt is made here to connect the microstructure descriptor with hardness. Figure 5 plots the 64 hierarchical invariants \(S^1\) of twelve microstructural images of Sn–Ag and Sn–Cu alloys solidified at different cooling rates, studied by Seo et al. [18]. Six are Sn–Ag alloys, labelled “AgNN”, where the two digits “NN” give the row and column numbers of the image in the original reference, and the other six are Sn–Cu alloys, labelled “CuNN”. The row number indicates the cooling rate, i.e., row 1 represents “quenched”, row 2 “air-cooled”, and row 3 “furnace-cooled”, and the column number indicates the composition, with columns 1 and 4 corresponding to 0 and 1.8 wt% Ag for Sn–Ag and to 0.5 and 2.0 wt% Cu for Sn–Cu, respectively. In this case, a correlation is found between the hierarchical descriptor and the hardness data. The hardness of composition 4 is consistently higher than that of composition 1 for both Sn–Ag and Sn–Cu alloys and for all cooling rates. In Fig. 5, the curve of the alloy with the higher hardness lies consistently above that of the alloy with the lower hardness, at least at scales from 0 to 5. For the two “furnace-cooled” Sn–Cu alloys, which have the smallest difference in hardness, the two hierarchical descriptors coincide from scales 0 to 2, but the curve of the slightly harder alloy climbs above that of the softer alloy from scales 3 to 7.
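The well depth defined above for the copper example reduces to a one-line array operation once the \(S^1\) descriptor is arranged as a table of scales by orientations. The sketch below is illustrative only: the function name, the array layout, and the sign convention (vertical minus horizontal) are assumptions, and the toy numbers are not measured data.

```python
import numpy as np


def well_depth(s1, vertical=1, horizontal=5):
    """Difference in S^1 between two orientations at each scale.

    s1 has shape (n_scales, n_orient); orientation indices are 1-based,
    matching theta = 1 (vertical) and theta = 5 (horizontal) in the text.
    """
    s1 = np.asarray(s1, dtype=float)
    return s1[:, vertical - 1] - s1[:, horizontal - 1]


# Toy descriptor with four scales and eight orientations (illustrative values only).
toy = np.arange(32).reshape(4, 8)
depths = well_depth(toy)
print(depths)
```

Tracking this quantity across the images from no reduction to 68% reduction would give one number per scale that can be compared directly with elongation and hardness.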

Fig. 4

Hierarchical descriptor of the microstructure of multicomponent (TiZrHfNbTaMo)C high entropy ceramics constructed using a Morlet wavelet system with eight spatial scales and eight orientations. The ceramics are pressureless sintered at a temperature of a 2200 °C, b 2300 °C, c 2400 °C, and d 2500 °C. Microstructural images and hardness data are taken from Figs. 4 and 6 of reference [17], respectively

Fig. 5

Hierarchical descriptor of cross polarized images of Sn–Ag and Sn–Cu solder alloys with different compositions and cooling rates during solidification. Hierarchical invariants are constructed using a Morlet wavelet system with eight spatial scales and eight orientations. Cross polarized images and hardness data are taken from Figs. 1, 6, and 12 of reference [18], respectively. In the legend, the labels AgNN and CuNN indicate Sn–Ag and Sn–Cu alloys, respectively, and the two digits that follow give the row and column numbers of the image in Figs. 1 and 6. Only half of the images in Figs. 1 and 6 are shown, to demonstrate the hierarchical descriptors clearly and avoid overlapping of data

The above examples demonstrate that a quantitative linkage between the microstructure genome and properties could be established using neural networks once sufficient data have been accumulated. Note that the hierarchical microstructure descriptor overcomes the major difficulty in microstructure recognition, i.e., the curse of dimensionality. After all, a microstructure genome consisting of fewer than 100 hierarchical invariants is much easier to deal with than a matrix of approximately 300,000 numbers. Furthermore, recent progress in understanding the mathematics behind deep learning suggests that the extensive computation involved in deep convolutional neural networks merely finds that the optimized coefficients of the layers of neurons are directional wavelets. Using the hierarchical invariants proposed in this work therefore means substantially less computation in the search for materials genomes. In a data-driven approach, the decision on exactly what constitutes the set of important salient features is not taken in a static manner; instead, it is taken objectively on the basis of the data actually available, and it is continuously refined as more data become available [19]. With the microstructure genome and its quantitative linkages with chemical composition and processing parameters on the one end and property and performance indicators on the other, we are in a much better position to answer the opening question regarding the coupling mechanisms across multiscale microstructure and their consequences for property and performance.

5 Outlook

Zhang pointed out that a dataset can be fitted by different equations with more or less the same goodness of fit owing to the noise contained in experimental data [20]. The author believes that the same principle applies on the road to the search for materials genomes. The bottom line is that the materials genomes we search for should be understandable and interpretable, not treated as a mysterious black box. Although the microstructure genome is discussed in a deterministic way in this work, a probabilistic approach based on Bayesian inference, in which a hierarchical invariant may correspond to a class of microstructure with a certain probability, could also be adopted. The author believes that microstructure genomes targeting anticipated property and performance can first be generated on computers by artificial intelligence and then physically realized with the advancement of MGE in the near future.