Introduction

The purpose of spectrometric methods is to produce knowledge about the object under study by separating the sample matrix into its informative prime components. The spectral components, often manifested as intensity peaks or combinations of peaks, are used to reveal or detect some descriptive feature. The usefulness of a method is determined by its analytical power and by the time and effort invested in sample preparation, measurement and signal processing to obtain the desired output. Despite continuous improvement of technology, no single method or instrument in analytical chemistry provides both extreme analytical power and ease of use. Compromises between analytical performance and overall process time, cost and effort are common.

Field Asymmetric Ion Mobility Spectrometry (FAIMS) and Differential Mobility Spectrometry (DMS) are atmospheric pressure techniques that separate ions with a strong, high-frequency, amplitude-asymmetric electric field [1], based on the non-linear velocity vs electric field characteristics of ion species. Whereas Ion Mobility Spectrometry in its various embodiments [2] has found analytical applications [3, 4] such as explosive detection [5], breath [6, 7], pharmaceutical [8] and food analysis [9] and detection of hazardous chemicals [10], only a few commercial stand-alone DMS based applications exist on the market [11, 12, 13]. The possible explanations for the low popularity of DMS instruments are technical, since realising the required asymmetric electric field with proper characteristics is a complex engineering challenge [14, 15]. Although the spectral analysis time can be in the range of one second in some cases [16], the measurement time is a function of the desired parametric resolution. The output of a DMS measurement is a 2D signal called a dispersion plot. Creating a high resolution DMS dispersion plot can be time consuming, potentially lasting several minutes. This paper introduces one possible strategy to reduce the measurement time in order to enable applications such as real time detection of biomarkers in surgical smoke.

Measurement time in real life DMS

The separation of ions in DMS is based on the nonlinear electric field transport characteristics of the ions or ion clusters [1]. In DMS, ions are transported with a laminar gas flow between two parallel electrodes, which create a time dependent, asymmetric, transverse electric field whose intensity changes in such a way that the time integral of the electric field is zero. The ideal voltage waveform generating the electric field is a rectangular wave with a duty cycle of 0.1–0.3. Therefore, for part of the waveform period, the electric field intensity is distinctly higher than for the rest of the period. The “high field” phase is often called the separation field and the voltage that generates this field is called the separation voltage USV. The alternating electric field causes ions to oscillate between the electrodes, and their net movement in the channel depends on the mobility difference between high and low field. The field dependent mobility is described as K(E/N) = K(0)(1 + α(E/N)) [17], where K(0) refers to the mobility in a low electric field and α(E/N) describes the field dependent component of the mobility. If the net movement between the electrodes is not zero, ions move towards an electrode, lose their charge and neutralise. With a small superimposed DC field, created by a voltage referred to as the compensation voltage UCV, the net movement can be adjusted to zero and ions can be transported with the flow between the electrodes. The DMS can be imagined as a filter which allows only the portion of ions with a specific α(E/N) to pass at selected USV and UCV values. The separation capability, the signal intensity and thus the signal-to-noise ratio depend on the geometry of the filter and the separation field strength. To understand an ion’s behaviour in high field or to detect the existence of some ion species, both USV and UCV are commonly scanned in discrete steps. The ionic currents registered for these values constitute the dispersion plot, such as the one shown in Fig. 1. Dispersion plots are commonly used to visualise the measurement result.
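The filtering principle described above can be illustrated with a minimal numerical sketch. It assumes an ideal rectangular waveform with zero time integral and the mobility law K = K(0)(1 + α). The quadratic form of α, its coefficient a2, the field strength and the value of K(0) are hypothetical illustration values, not parameters of the instrument or analyte used in this work; only the 250 kHz frequency and the duty cycle of about 10% follow the settings reported later in the text.

```python
def alpha(e_field, a2=2.0e-10):
    """Hypothetical alpha term: alpha(E) = a2 * E**2, with E in V/cm."""
    return a2 * e_field ** 2


def net_drift_per_period(e_high, duty, period, k0, e_comp=0.0, alpha_fn=alpha):
    """Net transverse displacement (cm) of an ion over one waveform period.

    The high-field phase (e_high) lasts duty * period. The opposite low-field
    phase is scaled so that the time integral of the field is zero. e_comp is
    a DC compensation field superimposed on both phases.
    """
    e_low = -e_high * duty / (1.0 - duty)  # zero time-integral condition
    phases = ((e_high, duty * period), (e_low, (1.0 - duty) * period))
    displacement = 0.0
    for e_phase, t_phase in phases:
        e_total = e_phase + e_comp
        mobility = k0 * (1.0 + alpha_fn(e_total))  # K = K(0) * (1 + alpha)
        displacement += mobility * e_total * t_phase
    return displacement


def compensation_field(e_high, duty, period, k0, alpha_fn=alpha,
                       lo=-1000.0, hi=0.0, iterations=60):
    """Bisection search for the DC field that brings the net drift to zero."""
    f_lo = net_drift_per_period(e_high, duty, period, k0, lo, alpha_fn)
    f_hi = net_drift_per_period(e_high, duty, period, k0, hi, alpha_fn)
    if f_lo * f_hi > 0:
        raise ValueError("bracket [lo, hi] does not contain a zero crossing")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = net_drift_per_period(e_high, duty, period, k0, mid, alpha_fn)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Illustrative numbers only: 20 kV/cm separation field, 10% duty cycle,
# 250 kHz waveform, K(0) = 2 cm^2/(V s).
E_HIGH, DUTY, PERIOD, K0 = 2.0e4, 0.1, 1.0 / 250e3, 2.0
drift_uncompensated = net_drift_per_period(E_HIGH, DUTY, PERIOD, K0)
e_c = compensation_field(E_HIGH, DUTY, PERIOD, K0)
drift_compensated = net_drift_per_period(E_HIGH, DUTY, PERIOD, K0, e_comp=e_c)
```

With α set to zero the net drift vanishes for any field strength, which is why only ions with a field dependent mobility need a non-zero compensation field to pass the filter.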

Fig. 1

Left: An example of a dispersion plot where horizontal axis represents UCV, vertical axis USV and colouring is based on ionic intensity. Right: Differential mobility spectra extracted from USV values denoted with A, B and C. Horizontal axis is UCV, vertical axis ionic intensity

To measure one ionic current signal for one pair of USV and UCV values of the dispersion plot, the following steps need to be taken:

  • USV is set to adjust the strength of an asymmetrically oscillating electric field

  • UCV is adjusted to create DC field

  • Ions are transported by the flow through the planar channel between the electrodes

  • The number of surviving ions is measured with a static electric field, using amplifiers connected to an electrode pair after the DMS filter

  • Measured signal is processed and stored into the measurement memory

The normal strategy is to scan UCV over the range of potential compensation values while keeping USV constant. After a complete UCV scan, USV is stepped to the next value within the given USV range, and this process is repeated until all desired USV values are scanned. The measurement time of this process can be estimated as

$$ t_{\mathrm{scan}}=n_{\mathrm{SV}}\left(t_{\mathrm{SV}}+n_{\mathrm{CV}}\left(\max\left(t_{\mathrm{flow}},t_{\mathrm{CV}}\right)+t_{\mathrm{sample}}+t_{\mathrm{spt}}\right)\right) $$
(1)

where

  • tscan is the time required for the complete scan

  • nSV is the number of separation voltage USV steps, typically 10–100

  • tSV is the setup time for separation voltage USV, typically 10–500 ms

  • nCV is the number of compensation voltage UCV steps, typically 100–1000

  • tflow is the time required to reach a new steady state with new USV and UCV parameters, typically 1–3 ms depending on the geometry and the flow rate

  • tCV is the setup time for the compensation voltage, typically 0.1–2 ms

  • max(tflow,tCV) is used because the next UCV can be set as soon as the analog-to-digital conversion (ADC) sampling is completed

  • tsample is the time required for the analog-to-digital conversion, including the measurement repetitions needed for digital filtering, typically 0.1–5 ms

  • tspt is the time for signal processing required before next measurement step can be taken, 1–5 ms
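Eq. (1) can be evaluated directly from the typical ranges listed above. The sketch below is only an illustration: the function and variable names are ours, and the two parameter sets simply take the upper and lower ends of the listed ranges rather than the settings of any particular instrument.

```python
def scan_time(n_sv, t_sv, n_cv, t_flow, t_cv, t_sample, t_spt):
    """Eq. (1): total scan time in seconds; all time arguments in seconds."""
    per_point = max(t_flow, t_cv) + t_sample + t_spt
    return n_sv * (t_sv + n_cv * per_point)


# Upper ends of the typical ranges listed above.
worst_case_s = scan_time(n_sv=100, t_sv=0.5, n_cv=1000,
                         t_flow=3e-3, t_cv=2e-3, t_sample=5e-3, t_spt=5e-3)

# Lower ends of the same ranges.
best_case_s = scan_time(n_sv=10, t_sv=0.01, n_cv=100,
                        t_flow=1e-3, t_cv=0.1e-3, t_sample=0.1e-3, t_spt=1e-3)

print(f"worst case: {worst_case_s:.0f} s ({worst_case_s / 60:.1f} min)")
print(f"best case:  {best_case_s:.1f} s")
```

With these inputs the upper ends give about 1350 s (22.5 min) and the lower ends about 2.2 s, so the number of USV,UCV points dominates the result.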

The practical values depend strongly on the required USV and UCV resolution, the desired signal-to-noise ratio, and the hardware configuration and its limitations. Using typical worst case values, tscan can exceed 20 min, which may well be suitable for laboratory measurements but is impractical for real time measurements requiring a response time in the range of seconds. To shorten the measurement time, it is possible to limit the number of UCV steps, but this option is itself limited by the spectral peaks in the obtained spectrum. The peak position is an indication of ion type, and too coarse a selection of UCV values will impair correct detection of the peak position. One option would be to cut down tsample. The sample rate of the ADC can be in the range of 100,000 samples per second, so the majority of tsample is due to the filtering requirements, and limiting the filtering would sacrifice the signal-to-noise ratio. The parameter tflow can be varied, but enough residence time is necessary to ensure stabilisation of the ionic current at the output of the separator after changing UCV. Using a higher flow will reduce tflow and is seen as increased signal intensity, but it will also have a negative impact on peak width and thus on separation capability. It is notable that USV can be set as soon as sampling of the previous USV,UCV pair is completed, and therefore readiness for the next sample acquisition is possible as soon as both tsample and tflow are completed. The high voltage setup time tSV is fixed by the design of the high voltage source and tends to increase with USV step size, because of the time needed to charge or discharge capacitances in the high voltage source. The signal processing time tspt is not a limiting factor for the measurement when digital devices such as modern embedded controllers or signal processors are used. The digital filtering and data transfer to memory can be completed in milliseconds or less, and the measurement cycle time depends more on other tasks the processor needs to execute. The most effective way to limit the total measurement time tscan is to limit the number of USV,UCV pairs in the scan, and thus to find a way to limit nSV or nCV.

Example of measurement time estimation

The measurement results from DMS can be displayed as a heat map, where the x-axis represents the compensation voltage UCV, the y-axis the separation voltage USV, and the intensity of the signal is colour coded. This type of data presentation is visually appealing, informative for the user and a natural outcome of the way parameters are scanned in DMS. An example of this kind of graph, called a dispersion plot, is shown in Fig. 1. For illustration purposes, a low ppb-level heptanone sample was generated using an Owlstone V-OVG gas calibrator and a self-made permeation tube, and mixed with a small amount of water vapour. The concentration of heptanone and the humidity were adjusted experimentally to demonstrate the typical structure of a dispersion plot in which reactant ions (hydrated protons) and two kinds of sample ions (protonated monomer and proton bound dimer) are visible. The measurements were performed and the data collected with an ENVI-Analyzer manufactured by Environics Oy, Finland. The DMS was operated with square wave pulses at a frequency of 250 kHz and a duty cycle of 10.3%. The UCV resolution of the data was set to 200 equally distributed values between −1.89 V and 11.8 V. For USV, the resolution was set to 45 equally distributed values between 240 V and 722 V. The total number of acquisition points in the scan was 9000. Each point in the dispersion plot is an average of samples measured over a 20 ms sampling period. The exact number of samples is not known because the instrument’s program cycle and load time may affect the analog-to-digital conversion interval. The device collected the full 9000 dispersion plot points for positive and negative ions separately. The total time required to obtain scans for both polarities was 394 s, giving an estimate of 21.9 ms for the measurement time of a single USV,UCV point, tUSV,UCV. This time may be short or long compared to other instruments and should be considered here only as an illustrative measure.
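The per-point estimate follows from simple arithmetic on the figures given above, as the short sketch below shows. The variable names are ours. The overhead figure assumes that the 20 ms averaging period applies to every point, which the text suggests but does not state explicitly.

```python
N_UCV = 200           # compensation voltage steps
N_USV = 45            # separation voltage steps
N_POLARITIES = 2      # positive and negative ions scanned separately
TOTAL_TIME_S = 394.0  # measured time for both polarities
SAMPLING_S = 0.020    # averaging period per point

points_per_scan = N_UCV * N_USV
t_point_s = TOTAL_TIME_S / (points_per_scan * N_POLARITIES)

# Time per point not spent on sampling (assumption: 20 ms at every point).
overhead_s = t_point_s - SAMPLING_S

print(f"{points_per_scan} points per scan")
print(f"{t_point_s * 1e3:.1f} ms per point, "
      f"of which about {overhead_s * 1e3:.1f} ms is not sampling")
```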

Anatomy of the dispersion plot

A common reason to scan the ionic current over USV and UCV ranges is to obtain information that characterises an ion’s high field properties, or to use the data for detection purposes. Figure 1 shows a colour mapped representation of the two dimensional scan of a DMS measurement. The ion species, heptanone dimer, monomer and positive reactant ions (RIP), are seen as fingers or branches against the blue background. The background represents the floor of the intensities, which consists mainly of low level noise. A common sense interpretation of the figure indicates that the interesting data lie in those regions of the dispersion plot in which the branches are well separated. Not all data in the dispersion plot are useful.

The right panel of Fig. 1 shows UCV scans extracted from the dispersion plot at fixed USV values. These scans are called differential mobility spectra. With low USV (Fig. 1, right panel, A) ions are not separated. Mid range USV (Fig. 1, right panel, B) reveals three peaks, from left to right: heptanone dimer, heptanone monomer and RIP. Figure 1, right panel, C shows the highest USV. The RIP is no longer visible and only the dimer and monomer peaks are seen. The peak intensities in Fig. 1 A are the highest but the separation is poor, whereas the peaks in C are the lowest but the separation of the two peaks is the best. The intensities measured along the branches of the dispersion plot in Fig. 1 are shown in Fig. 2. The intensity decreases with increasing USV. The loss is caused by the reduction of the effective gap: an increased electric field intensity increases the oscillation length of the ions, so more ions contact the electrodes and lose their charge. This is discussed in detail in [18].

Fig. 2

Change of the peak intensity vs. separation voltages for heptanone dimer, monomer and RIP

Dispersion plot sub sections

Based on the data shown in Figs. 1 and 2, the dispersion plot is divided into the descriptive sections shown in Fig. 3.

Fig. 3

Descriptive area division in dispersion plot

The sections are:

  • A section, where branches caused by ion peaks are visually separated and well above noise level

  • B section, where branches are very close to noise level or have disappeared (see Fig. 1, right panel, C)

  • C section, where no peak exists because no ion will pass the DMS filter with these USV,UCV values

  • D section, where almost no dispersion of ions can occur because the separation field is too low (see Fig. 1, right panel, A)

Intuitively, the most useful measurement area lies in section A of Fig. 3. If the measurement could be limited to section A only, the potential saving in measurement time can be estimated from the surface area of A compared to the full area. With corner points selected arbitrarily on the basis of Fig. 2 as (−1.89, 300), (−1.89, 600), (9, 600) and (2.5, 300), given as pairs of UCV and USV values, the proportion of area A to the full area is 0.42, and the measurement time could therefore be decreased in the same proportion. Although the hardware would support measuring only this kind of section, the measurement time would still be too long for real time applications. Further time reduction could be possible if only a single differential mobility spectrum was used. Assuming that a relevant spectrum could be found in section A, limited by its boundaries, the reduction of data could be dramatic. It can be estimated from Fig. 1, right panel, B, that perhaps only half of the data of a single spectrum is necessary to properly separate the three peaks in this example. Because the selection of USV affects nCV, the next question is how to find the most representative USV with good separation and signal-to-noise ratio.

Entropy

Shannon entropy - the definition and basic properties

Information theory contains tools for evaluating the information content of data. One of the most important and best known parameters is the entropy of the signal, known as Shannon entropy H(x) [19], which can be defined as

$$ H(x)=-\sum_{i=1}^{N} p_i \log_k\left(p_i\right) $$
(2)

where pi is the probability of obtaining the value xi of the variable x and N is the number of possible values of x. In signal processing the probability is usually represented by the signal intensity measured as a function of some independent variable (e.g. frequency, wavelength, measurement number). The logarithm base k is usually equal to 2. For a sequence of signal measurements whose results are distributed between discrete values, the Shannon entropy is zero when all results have the same value. Shannon entropy increases as the uniformity of the distribution increases and reaches its maximum when the signal is uniformly distributed over the possible values. One way to interpret Shannon entropy is to see it as a measure of the quality of the signal distribution. Shannon entropy and its minimum have recently been applied in liquid chromatography - mass spectrometry (LCMS) [20] to select the best mass spectrum. The basis of this application is explained by Chatterjee et al. [21]. In that work the probability was estimated as the ratio of the signal value at each point to the sum of the signal intensities over all points of the spectrum. In our work another approach to calculating the probability was selected. It was assumed that the result of a signal intensity measurement in DMS is a random variable distributed between N intervals of equal width in such a way that

$$ \Delta S=\frac{S_{\max}-S_{\min}}{N} $$
(3)

where Smax − Smin is the range of possible values of the DMS signal and ΔS is the width of a signal interval. Assume that the logarithm base k in (2) is equal to 2, the total number of samples is ns and the number of counts in the ith signal interval is ni. With this approach, the probability needed for calculating the entropy can be estimated as \( {p}_i=\frac{n_i}{n_s} \). Then

$$ H=-\sum_{i=1}^{N}\frac{n_i}{n_s}\log_2\left(\frac{n_i}{n_s}\right) $$
(4)

In the case of a fully concentrated, noiseless distribution, ni = ns for i = j and ni = 0 for i ≠ j. The entropy calculated with Eq. (4) is then equal to 0, which is the minimum possible value of this parameter. In the case of a uniform distribution, the probability of finding the signal in each interval is \( {p}_i=\frac{1}{N} \), and thus the formula for calculating the entropy becomes

$$ H_U=-\sum_{i=1}^{N}\frac{1}{N}\log_2\left(\frac{1}{N}\right)=\log_2(N) $$
(5)
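The two limiting cases can be checked numerically with a small sketch of Eq. (4). The function name and the example counts are ours. Empty intervals are skipped, following the usual convention that 0·log(0) is taken as 0.

```python
import math


def shannon_entropy(counts):
    """Eq. (4): entropy in bits from the counts n_i of each signal interval."""
    n_s = sum(counts)
    if n_s <= 0:
        raise ValueError("at least one sample is required")
    # Empty intervals contribute nothing (0 * log2(0) is taken as 0).
    return sum(-(n / n_s) * math.log2(n / n_s) for n in counts if n > 0)


N = 8
concentrated = [0, 0, 200, 0, 0, 0, 0, 0]  # all samples in one interval
uniform = [25] * N                         # samples spread evenly

h_min = shannon_entropy(concentrated)  # limiting case before Eq. (5)
h_max = shannon_entropy(uniform)       # Eq. (5): log2(N)
h_mixed = shannon_entropy([150, 30, 10, 5, 3, 1, 1, 0])
```

Any other distribution of the same 200 samples over the 8 intervals, such as the mixed example, gives a value between these two limits.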

It can be proved that the value calculated with formula (5) is the maximum value of the entropy. The DMS dispersion plot can be considered as a matrix which contains, for each USV value, an experiment vector of length nCV. Each vector element is the intensity of the signal. In our approach each vector is considered as a set of values of a random variable rather than as a spectrum. For high USV values, i.e. for experimental conditions in which ions do not pass through the DMS filter (section B in Fig. 3), the signal is the baseline with its characteristic noise. This represents the lowest possible entropy. The appearance of peaks in the differential mobility spectrum at lower USV values causes the distribution to spread towards greater uniformity, which is seen as increased entropy. Low USV values result in minimal or no separation, and the signal intensities differ from the noise level for only a few UCV values, making the noise a major contributor to the entropy. Most of the ion related peaks exist between low and high USV, and thus the entropy maximum lies somewhere between the low and the high end of the USV range.

Entropy estimation

The intensity probabilities were obtained with the function hist() from the graphics package of R Studio (version 3.5.1). The entropy values were calculated with Eq. (4) for each USV row of the dispersion plots measured for heptanone. Measurements were performed with duty cycles of 5.25% and 10.25%, and the results are shown in Fig. 4. The entropy maximum for the lower duty cycle was found at the upper limit of the USV range, whereas the entropy maximum for the higher duty cycle lies in the middle of the USV range used. The difference can be explained on the basis of the dispersion plots. The lower duty cycle causes a lower loss of signal but needs a higher field strength to cause dispersion, because the USV pulse is shorter; the entropy therefore keeps increasing until the loss of ions becomes significant. When the duty cycle is increased the dispersion gets stronger, but the loss increases as well, and the maximum is reached at a lower USV than for the lower duty cycle.
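The row-wise procedure can be sketched in Python as follows. This is not the R code used in this work: numpy.histogram stands in for hist(), the number of intervals (16) and the use of the global signal range for Smin and Smax are assumptions, and the dispersion matrix is synthetic, with three made-up Gaussian branches that diverge and fade with USV.

```python
import numpy as np


def row_entropy(row, s_min, s_max, n_bins=16):
    """Eq. (4) for one USV row, using n_bins equal intervals on [s_min, s_max]."""
    counts, _ = np.histogram(row, bins=n_bins, range=(s_min, s_max))
    p = counts[counts > 0] / counts.sum()
    return float((p * np.log2(1.0 / p)).sum())


def entropy_profile(dispersion, n_bins=16):
    """Entropy of every USV row; rows are USV values, columns UCV values."""
    s_min, s_max = float(dispersion.min()), float(dispersion.max())
    return np.array([row_entropy(r, s_min, s_max, n_bins) for r in dispersion])


def synthetic_dispersion(n_sv=45, n_cv=200, seed=0):
    """Made-up dispersion plot: three branches on a noisy baseline."""
    rng = np.random.default_rng(seed)
    u_sv = np.linspace(240.0, 722.0, n_sv)
    u_cv = np.linspace(-1.89, 11.8, n_cv)
    plot = np.empty((n_sv, n_cv))
    for i, sv in enumerate(u_sv):
        x = (sv - u_sv[0]) / (u_sv[-1] - u_sv[0])  # 0 at lowest, 1 at highest USV
        amplitude = np.exp(-3.0 * x)               # signal loss with USV
        centres = (0.2 + 1.0 * x, 0.2 + 4.0 * x, 0.2 + 14.0 * x)
        peaks = sum(amplitude * np.exp(-0.5 * ((u_cv - c) / 0.3) ** 2)
                    for c in centres)
        plot[i] = peaks + rng.normal(0.0, 0.01, n_cv)
    return u_sv, u_cv, plot


u_sv, u_cv, plot = synthetic_dispersion()
entropies = entropy_profile(plot)
best_row = int(np.argmax(entropies))
print(f"maximum entropy {entropies[best_row]:.2f} bits at USV = {u_sv[best_row]:.0f} V")
```

The position of the maximum depends on the number of intervals and on how Smin and Smax are chosen, so these choices should be kept fixed when entropy profiles are compared.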

Fig. 4

Left: Example of entropy as an indicator for selecting the separation voltage, with a duty cycle of 5.25%. A1: Dispersion plot, B1: Entropy vs separation voltage USV, C1: Differential mobility spectrum measured at the separation voltage USV of 582 V, at which the maximum entropy was achieved. Right: The same example with a duty cycle of 10.25%. A2: Dispersion plot, B2: Entropy vs separation voltage USV, C2: Differential mobility spectrum measured at the separation voltage USV of 431 V, at which the maximum entropy was achieved

The decreasing tendency of the USV at maximum entropy is shown in Fig. 5. Altogether 202 experiments were completed with a chemical setup similar to that described before, but the duty cycle was stepped between 5% and 10.25%, with each value repeated 9–11 times. The automated test run took about 24 h, and there was some variation in water vapour concentration as well as in ambient pressure during the test, which may explain the large dispersion of the single experiment results. This somewhat unexpected dispersion of the entropy maximum locations can actually be used to determine the width of the operative USV window when some parametric noise, such as variation in sample concentration, is expected.

Fig. 5

USV at maximum entropy with different duty cycles. The solid triangles show the mean USV at a given duty cycle. The line connecting the triangles is only to clarify the visualisation

Discussion

The original problem presented in this paper is how to reduce the time needed for the measurement. The ENVI-Analyzer instrument used to collect the heptanone data serves here as an example. The device allows script based control of USV and UCV. The range, the number of steps within the range and the set-up delays can be set independently. For a given USV, the frequency and duty cycle can also be defined. The averaging time for a single intensity measurement can be defined as a parameter. The practical scanning speed of the hardware used is limited by the transients related to the USV and UCV step sizes and by the noise level of the intensity measurement. The fluctuations of the intensity signal are related to the characteristic noise of the transimpedance amplifier used to measure the ion current, to electrical noise sources within the instrument and to variations of the gas flow. These instrumental nonidealities define the averaging time and other parameters, which were found experimentally.

In the examples shown in Fig. 4, the ionic peaks are located within a UCV span of about 6 V. With the instrument settings used, this corresponds to about 100 UCV values and a measurement time of about 2.2 s. The reduction from the original 9000 sample points to the proposed 100 points must be understood as a demonstration of the potential for measurement time reduction in a limited type of application rather than as a generic result. DMS instruments are commonly used to scan over the USV,UCV space to gather dispersion data, which is time consuming regardless of the hardware design. Applications aiming to detect or identify some set of chemicals need a large number of experimental trials for library creation. When an instrument is used with a library of spectra or peak positions, the detection or identification speed depends on the speed of the sample measurement (the scan time) and on the algorithm by which the sample data are evaluated against the experimental data. The entropy can be calculated from experimentally collected data sets during the library creation process and used to find candidate sets of USV values which work well in indicative detection based on peak positions. Fine resolution dispersion data can provide a lot of extra information compared to simple peak position detection, and therefore a more advanced measurement parameter set and detection algorithms could follow the indicative step. This two step approach would improve the first stage detection time, especially if the UCV span can also be limited as illustrated in Fig. 3.
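As a rough cross-check of the single-spectrum estimate, the sketch below recomputes the number of UCV points in a 6 V window and the corresponding measurement time from the scan settings reported earlier. It assumes that the 200 UCV values include both end points of the range, which the text does not state, and the names are ours.

```python
import math

T_POINT_S = 394.0 / (2 * 9000)  # per-point time from the full two-polarity scan
UCV_MIN_V, UCV_MAX_V, N_UCV = -1.89, 11.8, 200

# Assumption: the 200 values include both end points of the UCV range.
ucv_step_v = (UCV_MAX_V - UCV_MIN_V) / (N_UCV - 1)


def points_in_span(span_v, step_v):
    """Number of grid points that fit inside a window of width span_v."""
    return math.floor(span_v / step_v) + 1


n_6v = points_in_span(6.0, ucv_step_v)
t_6v_s = n_6v * T_POINT_S
t_100_s = 100 * T_POINT_S

print(f"UCV step {ucv_step_v * 1e3:.1f} mV, {n_6v} points in 6 V, {t_6v_s:.2f} s")
print(f"100 points: {t_100_s:.2f} s, {9000 // 100}x fewer points than the full scan")
```

Under this assumption a 6 V window holds somewhat fewer than 100 points, so the figure of about 2.2 s is on the conservative side.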

Conclusion

Our test data were limited to a single chemical example, which does not predict the location of the entropy maximum for a different analyte or for a complex multichemical matrix. When creating a database for detection applications containing a large number of different conditions, it is likely that the USV of the entropy maximum varies from sample to sample. However, it seems possible that knowledge of the characteristics of the DMS dispersion plot, together with Shannon entropy, can be used to find a section of the DMS dispersion data, and thus the USV and UCV parameters, which provides good separation of ions together with a good signal-to-noise ratio. A narrow dispersion plot, optimally a single USV, results in a short measurement time. Rapid measurement benefits applications which use these data to trigger more exhaustive measurements and analysis algorithms. Using entropy as an indicator can also be envisioned for parameter optimisation in hyphenated systems such as GC-DMS. The results shown here are an early proof of concept, and more experiments are needed to find out the true potential of this concept. These experiments, as well as some theoretical considerations, should also enrich our knowledge of the properties of entropy as a measure of the quality of spectra in DMS. We treat our work as the first step in this direction.