Fundamentals of Clinical Data Science, pp 85–100
Extracting Features from Time Series
Abstract
Clinical data is often collected and processed as time series: a sequence of data indexed by successive time points. Such time series range from signals sampled over short time intervals to represent continuous biophysical waveforms, such as the voltage measurements of the electrocardiogram, to measurements sampled daily, weekly, or yearly, such as patient weight or blood triglyceride levels. When analyzing clinical data or designing biomedical systems for measurements, interventions, or diagnostic aids, it is important to represent the information contained within such time series in a more compact or meaningful form (e.g., after noise filtering), amenable to interpretation by a human or computer. This process is known as feature extraction. This chapter discusses some fundamental techniques for extracting features from time series representing general forms of clinical data.
Keywords
Time series · Feature extraction · Frequency filtering · Fast Fourier Transform (FFT) · Autoregressive (AR) modeling · Wavelets

7.1 Time-Domain Processing
Raw time-series data, sometimes referred to as a signal, is inherently represented in the time domain. Time-domain processing directly exploits the temporal relations between data points and generally provides an intuitive representation of these relationships. Time-domain techniques often aim to identify and detect the temporal morphology of transient or stereotyped information in the time series. When the information of interest repeats over regular or semi-regular intervals, straightforward transformations can be used to convert the time-domain information to the frequency domain, which can isolate oscillatory information for comparison within and across oscillatory frequencies present in the time series. This section discusses some fundamental time-domain techniques and shows how oscillations in the time-domain data lead to frequency-domain representations.
7.1.1 Basic Magnitude Features and Time-Locked Averaging
Peak-picking and integration are two of the most straightforward and basic feature extraction methods. Peak-picking simply determines the minimum or maximum value of the data points in a specific time interval (usually defined relative to a specific labeled event in the data) and uses that value (and possibly its time of occurrence) as the feature(s) for that time segment. Alternatively, the time series can be averaged or integrated over all or part of the time interval to yield the feature(s) for the segment. Some form of averaging or integration is typically preferable to simple peak-picking, especially when the responses to the stimulus are known to vary in latency and/or when there is noise in the time series that can corrupt a simple peak estimate. The same methods can be applied to track transient magnitude peaks in the frequency domain.
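As a concrete illustration, the sketch below (Python/NumPy; the Gaussian transient and noise level are entirely illustrative assumptions, not clinical data) extracts both kinds of feature from a noisy segment:

```python
import numpy as np

def peak_features(segment, dt=1.0):
    # Peak-picking: largest-magnitude sample in the segment and its time of occurrence
    i = int(np.argmax(np.abs(segment)))
    return segment[i], i * dt

def integral_feature(segment, dt=1.0):
    # Integration: area under the segment, more robust to latency jitter and noise
    return segment.sum() * dt

# Synthetic transient: a Gaussian response peaking at t = 0.5 s, plus noise
dt = 0.01
t = np.arange(0, 1, dt)
rng = np.random.default_rng(0)
x = np.exp(-((t - 0.5) ** 2) / 0.01) + 0.05 * rng.standard_normal(t.size)

peak, latency = peak_features(x, dt)
area = integral_feature(x, dt)
```

In practice the segment boundaries would be defined relative to a labeled event in the recording (e.g., a stimulus marker), as described above.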
7.1.2 Template Matching
7.1.3 Weighted Moving Averages: Frequency Filtering
By simply alternating the sign of each weight in the moving average, the opposite effect is observed, as shown in the bottom of Fig. 7.4 and the left portion of Fig. 7.5. In this case, the amplitudes of the lower frequencies are attenuated and the higher frequencies are preserved. This is the most basic form of a high-pass filter, which preserves the amplitude of high-frequency oscillations and attenuates the amplitude of lower-frequency oscillations. Just as with the low-pass filter, as N increases, the range of high-end frequencies that are preserved decreases, because only the oscillation frequencies that are near the oscillation frequency of the weights are preserved.
Based on these two rudimentary filter types, it can be surmised that the weight values in the moving average (i.e., filter weights or coefficients) and the length of the average (i.e., filter length) can be adjusted to preserve and attenuate arbitrary frequency ranges, as well as to produce different output characteristics such as increased attenuation of undesired frequencies. In addition to low-pass and high-pass filters, the two other basic filter designs are the band-pass and band-reject filters. A band-pass filter attenuates a range of low and high frequency oscillations while preserving an intermediate range of frequencies. In contrast, a band-reject filter preserves a range of low and high frequency oscillations while attenuating an intermediate range of frequencies.
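The two rudimentary designs above can be sketched directly with NumPy. The sampling rate and test frequencies are illustrative assumptions; the 240 Hz component is placed near the Nyquist frequency (250 Hz) so the alternating-sign filter preserves it:

```python
import numpy as np

fs = 500.0                                  # assumed sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
low = np.sin(2 * np.pi * 5 * t)             # slow oscillation (5 Hz)
high = np.sin(2 * np.pi * 240 * t)          # fast oscillation, near Nyquist
x = low + high                              # mixture fed to both filters

N = 10
lp = np.ones(N) / N                         # uniform weights: basic low-pass
hp = lp * (-1.0) ** np.arange(N)            # alternating signs: basic high-pass

y_lp = np.convolve(x, lp, mode="same")      # keeps 5 Hz, attenuates 240 Hz
y_hp = np.convolve(x, hp, mode="same")      # keeps 240 Hz, attenuates 5 Hz
```

Lengthening N narrows the band of preserved frequencies in each case, as described above.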
The frequency response characteristic of a given filter, or the magnification/attenuation factor (i.e., gain) produced with respect to input oscillation frequency, is typically visualized in a frequency-domain plot, as shown for the four fundamental filter types in the right portion of Fig. 7.5. Notice that the time variable is removed and the plots merely track the attenuation for each input oscillation frequency as illustrated in the time domain in Fig. 7.4. The region of the frequency response that preserves the oscillations is referred to as the passband and the region that attenuates the oscillations is referred to as the stopband. The region between the passband and stopband is referred to as the transition band. For practical filters, there is some finite slope to the transition band because a perfect threshold between frequencies (i.e., an ideal filter with infinite slope) requires an infinite-length filter. Therefore, by convention, the threshold of the transition band is defined in the frequency response characteristic as the point where the attenuation drops by 3 decibels (the 3 dB point) from the passband. This 3 dB point is referred to as the filter’s cutoff frequency.
Returning to Fig. 7.4, not only does the amplitude between the input and output time series change depending on the oscillation frequency, but the output time series may also be shifted (i.e., delayed) in time. Note that, for moving average filters with symmetric weights about the center of the average, the time shift will be constant for all input frequencies and equal to roughly half the length of the moving average ((N − 1)/2 samples for a length-N filter). This is known as a linear phase response. Thus, for real-time applications, the length of the weighted moving average (i.e., filter length) impacts the delay between the input and output time series. Furthermore, because longer averages preserve/attenuate tighter frequency ranges, there is a tradeoff between the precision of frequency discrimination and the amount of delay introduced for a given filter length.
The weighted moving average filters discussed to this point are more formally referred to as finite impulse response (FIR) filters because they will always produce a finite-length output time series if the input time series is finite in length. A common method to determine FIR filter weights matching a desired frequency response characteristic is the equiripple design, which minimizes the maximum error between the approximated and desired frequency response.
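SciPy exposes an equiripple design routine (`scipy.signal.remez`, implementing the Parks–McClellan algorithm); a minimal sketch follows, where the tap count and band edges are illustrative assumptions:

```python
import numpy as np
from scipy import signal

fs = 1000.0
# 31-tap equiripple low-pass: passband 0-100 Hz, stopband 150-500 Hz
taps = signal.remez(31, [0, 100, 150, 0.5 * fs], [1, 0], fs=fs)

# The realized frequency response, for comparison against the design goals
w, h = signal.freqz(taps, worN=2048, fs=fs)
```

Because the designed weights come out symmetric about their center, this filter also has the linear phase property discussed above.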
7.1.3.1 Weighted Moving Averages with Feedback
By taking an FIR filter structure and including weighted values of past output values (i.e., feedback), a different filter structure is formed, known as an infinite impulse response (IIR) filter. The basic idea is that, due to feedback, the output of the filter may continue infinitely in time even if the input time series is finite in length. The advantage of IIR filters over FIR filters is that they offer superior precision of frequency discrimination using fewer data points in the averaging (i.e., a lower filter order). This also generally equates to shorter delay times. However, IIR filters tend to distort the output time series because, in contrast to symmetric FIR filters, all input frequencies generally do not experience the same time delay. This is known as a nonlinear phase response. Additionally, unlike FIR filters, IIR filters can be unstable if not designed carefully. This occurs when there is a positive feedback loop that progressively increases the amplitude of the output until it approaches infinity, which is highly undesirable and potentially damaging to the system.
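A minimal sketch of the feedback idea, using a single-pole low-pass with an assumed feedback coefficient, makes both the "infinite response" and the instability risk concrete:

```python
import numpy as np

def one_pole_lowpass(x, a):
    # y[n] = (1 - a) * x[n] + a * y[n-1]; the a * y[n-1] term is the feedback
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = (1 - a) * x[n] + (a * y[n - 1] if n > 0 else 0.0)
    return y

impulse = np.zeros(50)
impulse[0] = 1.0
h = one_pole_lowpass(impulse, a=0.9)   # geometric decay: never exactly zero
# With |a| > 1 the same feedback loop is unstable: the output of this
# recursion grows without bound instead of decaying.
```

A single feedback weight already yields a response that decays forever, something no finite-length moving average can produce.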

Butterworth: Provides a flat passband and stopband with the smallest transition-band slope for a given filter order.

Chebyshev I: Provides a rippled passband and flat stopband with greater transition-band slope for a given filter order compared to Butterworth.

Chebyshev II: Provides a flat passband and rippled stopband with greater transition-band slope for a given filter order compared to Butterworth (equivalent to Chebyshev I).

Elliptic: Provides a rippled passband and stopband with the greatest transition-band slope for a given filter order.
A flat passband or stopband means that the oscillations in the band will be preserved or attenuated by a uniform gain factor. A rippled passband or stopband means that oscillations in the band will be preserved or attenuated by a factor that varies with frequency, which is generally undesirable and can be minimized using design constraints. Thus, if frequency discrimination (i.e., sharp transition bands) is paramount for the filter application and ripple can be tolerated in both bands, then an elliptic filter will provide the best result for a given filter order. If the application requires uniform frequency preservation and attenuation in the respective bands, then a Butterworth filter is warranted, at the cost of a less sharp transition band.
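Assuming SciPy is available, the four classic designs above can be generated and compared directly; the order, cutoff, and ripple/attenuation constraints below are illustrative choices:

```python
import numpy as np
from scipy import signal

fs, order, fc = 1000.0, 4, 100.0
designs = {
    "butterworth": signal.butter(order, fc, fs=fs),
    "chebyshev1": signal.cheby1(order, 1, fc, fs=fs),    # 1 dB passband ripple
    "chebyshev2": signal.cheby2(order, 40, fc, fs=fs),   # 40 dB stopband; note fc is the stopband edge here
    "elliptic": signal.ellip(order, 1, 40, fc, fs=fs),   # ripple constraints in both bands
}

gains = {}
for name, (b, a) in designs.items():
    w, h = signal.freqz(b, a, worN=2048, fs=fs)          # same frequency grid w each time
    gains[name] = np.abs(h)
```

Plotting each entry of `gains` against `w` reproduces the qualitative comparison above: at equal order, the elliptic design has the sharpest transition band, and the Butterworth the smoothest bands.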
In summary, the main considerations when selecting/designing a filter are:
Filter Order
Affects the output delay for minimum-latency or real-time applications. Higher orders are required for constraints approaching ideal filters, such as sharp transition bands and strong stopband attenuation. Elliptic IIR designs generally provide the lowest order for given constraints, but other tradeoffs must be considered (e.g., nonlinear phase and ripple).
Linear Phase
Constant phase delay (no phase distortion); achieved by symmetric FIR filters and approximated by some IIR designs, particularly Butterworth.
Filter Precision
The sharpness of the transition band for separating two adjacent oscillations. Elliptic IIR designs generally provide the sharpest transition for a given filter order, but other tradeoffs must be considered (e.g., nonlinear phase and ripple).
Passband/Stopband Smoothness
Degree of amplitude distortion in the passband and stopband. A Butterworth IIR filter provides a smooth passband and stopband. The Chebyshev variants can be used to obtain sharper transition bands when it is critical for only one band (passband or stopband) to be smooth.
Stopband Attenuation
How well the filter will block the undesired oscillations in the stopband. For a fixed filter order, there will be a tradeoff between filter precision and stopband attenuation.
7.2 Frequency-Domain Processing
Thus far, we have shown how weighted moving averages (i.e., FIR and IIR filters) can preserve/attenuate specific oscillatory frequencies present in an input time series. This forms the basis of frequency-domain processing. What has not yet been emphasized is that practical time series are composed of a mixture of many (possibly infinitely many) oscillations at different frequencies. Specifically, a time series can be uniquely represented as a sum of sinusoids, with each sinusoid having a specific oscillation frequency, amplitude, and phase shift (delay factor). The method of determining these frequency, amplitude, and phase values for a given time series is known as the Fourier transform. The Fourier transform converts the time series from the time domain to the frequency domain, similar to what was described in the previous section for the frequency response characteristic of a filter. The significance of converting a time series to the frequency domain is that the specific oscillations present in the time series, and their relative amplitudes and phases, can be more easily identified, particularly compared to a time-domain visualization of a mixture of many different oscillations. By representing and visualizing the time series in the frequency domain, filter response characteristics can be better designed to preserve/attenuate specific oscillations present in it. The filters described previously operate on a time series composed of a mixture of oscillations in such a way that the mixture observed at the output is completely defined by the frequency response characteristic of the filter. In other words, if a time series is a simple sum of a low-frequency oscillation and a high-frequency oscillation, an appropriately designed low-pass filter would preserve only the low-frequency oscillation at the output and sufficiently attenuate the high-frequency oscillation.
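The sum-of-sinusoids representation can be verified numerically. The two components below (amplitudes, frequencies, and phases) are arbitrary assumptions; NumPy's FFT recovers each one exactly because both frequencies complete an integer number of cycles within the 1 s observation:

```python
import numpy as np

fs = 200.0
t = np.arange(0, 1, 1 / fs)                  # 1 s observation, 200 samples
# Mixture of two sinusoids with known amplitude, frequency, and phase shift
x = 1.5 * np.sin(2 * np.pi * 3 * t + 0.4) + 0.5 * np.sin(2 * np.pi * 20 * t - 1.0)

X = np.fft.rfft(x)
amps = 2 * np.abs(X) / len(x)                # scale bin magnitude to sinusoid amplitude
# For a 1 s observation, bin k corresponds to k Hz, so the two components
# appear in bins 3 and 20; their phase shifts are recoverable from np.angle(X).
```

Every other bin is (numerically) zero, illustrating that this particular time series is fully described by just two frequency components.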
7.2.1 Band Power
7.2.2 Spectral Analysis
7.2.2.1 Fast Fourier Transform (FFT)
The Fast Fourier Transform (FFT) is an efficient algorithm for transforming a time series into a frequency-domain representation. The FFT represents the frequency spectrum of a digital signal with a frequency resolution of sampling rate/NFFT, where NFFT, the number of FFT points, is a selectable value that must be greater than or equal to the length of the time series and is typically chosen as a power of two for computational efficiency. Because of its simplicity and effectiveness, the FFT often serves as the baseline method against which other spectral analysis methods are compared.
The FFT takes an N-sample time series and produces N uniformly spaced frequency samples, making it a one-to-one transformation that incurs no loss of information. The maximum frequency of sampling rate/2 in this transformation is called the Nyquist frequency and is the highest frequency that can be reconstructed using the FFT. These frequency-domain samples are often referred to as frequency bins. Each bin of the FFT magnitude spectrum tracks the sinusoidal amplitude of the signal at the corresponding frequency. The FFT produces complex values that can be converted to magnitude and phase. The FFT spectrum of a real signal has symmetry such that only half of the bins are unique, from zero to +sampling rate/2. The bins from zero to −sampling rate/2 are a mirror image of the positive bins about the origin (i.e., zero frequency). Therefore, for an N-sample real signal, there are N/2 unique frequency bins from zero to sampling rate/2. Knowing this fact allows one to apply and interpret the FFT without a firm grasp of the complex mathematics associated with the notion of “negative frequencies.”
Finer frequency sampling can be achieved by appending M zeros to the N-sample signal, producing (M + N)/2 bins from zero to sampling rate/2. This is known as zero padding. Zero padding does not actually increase the spectral resolution, since no additional signal information is included in the computation, but it does provide an interpolated spectrum with different bin frequencies.
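Both effects, bin spacing and zero-padding interpolation, can be seen with NumPy's FFT. The 13 Hz test tone below is an assumed example chosen to fall between the bins of a 64-sample transform:

```python
import numpy as np

fs = 100.0
n = 64
x = np.sin(2 * np.pi * 13.0 * np.arange(n) / fs)   # 13 Hz falls between bins

X = np.fft.rfft(x)                  # bin spacing fs/64 ~= 1.56 Hz
Xp = np.fft.rfft(x, n=1024)         # zero padded: bin spacing fs/1024 ~= 0.098 Hz

peak = np.fft.rfftfreq(n, 1 / fs)[np.argmax(np.abs(X))]
peak_padded = np.fft.rfftfreq(1024, 1 / fs)[np.argmax(np.abs(Xp))]
# The padded spectrum interpolates the same information and locates the tone
# more closely, but the true resolution is still set by the 64 observed samples.
```

The unpadded peak lands on the nearest available bin (12.5 Hz), while the padded peak lies much closer to the true 13 Hz.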
7.2.2.2 Windowing
7.2.2.3 Autoregressive (AR) Modeling
Autoregressive (AR) modeling is an alternative to Fourier-based methods for computing the frequency spectrum of a signal. AR modeling assumes that the signal being modeled was generated by passing white noise through an infinite impulse response (IIR) filter. The specific weights of the IIR filter shape the white-noise input to match the characteristics of the signal being modeled. White noise is essentially random noise that has the unique property of being completely uncorrelated with any delayed version of itself. The specific IIR filter structure for AR modeling uses no delayed input terms and p delayed output terms. This structure allows efficient computation of the IIR filter weights. Because white noise has a completely flat power spectrum (i.e., the same power at all frequencies), the IIR filter weights are set so as to shape the spectrum to match the actual spectrum of the time series being analyzed.
Because the IIR filter weights define the signal’s spectrum, AR modeling can potentially achieve higher spectral resolution for shorter signal blocks than can the FFT. Short signal blocks are often necessary for realtime applications. Additionally, the IIR filter structure accurately models spectra with sharp, distinct peaks, which are common for biological signals such as ECG or EEG. [2] discusses the theory and various approaches for computing the IIR weights (i.e., AR model) from an observed signal.
The primary issue with AR modeling is that the accuracy of the spectral estimate is highly dependent on the selected model order (p). An insufficient model order tends to blur the spectrum, whereas an overly large order may create artificial peaks in the spectrum, as illustrated in the bottom of Fig. 7.9. The complex nature of many time series must be taken into account for accurate spectral estimation, and this often cannot be accomplished reliably with small model orders. It should be noted that the appropriate model order depends on the spectral content of the signal and the sampling rate. For a given signal, the model order should be increased in proportion to an increase in sampling rate. More information about AR modeling can be found in [3, 4].
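As a sketch of one common estimation approach (the Yule-Walker equations, one of the methods covered in [2]), the NumPy-only code below fits an AR(2) model to a simulated process with assumed, known feedback weights and evaluates the implied spectrum:

```python
import numpy as np

def yule_walker(x, p):
    # Estimate AR(p) weights from the signal's autocorrelation (Yule-Walker equations)
    x = x - x.mean()
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(p + 1)]) / len(x)
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])  # Toeplitz matrix
    a = np.linalg.solve(R, r[1:p + 1])        # IIR feedback weights
    sigma2 = r[0] - a @ r[1:p + 1]            # variance of the white-noise input
    return a, sigma2

def ar_spectrum(a, sigma2, freqs, fs):
    # Power spectrum implied by the model: flat white noise shaped by the IIR filter
    w = 2 * np.pi * np.asarray(freqs) / fs
    k = np.arange(1, len(a) + 1)
    H = 1 - np.exp(-1j * np.outer(w, k)) @ a
    return sigma2 / np.abs(H) ** 2

# Simulate a known AR(2) process: white noise through an IIR filter with feedback
rng = np.random.default_rng(1)
e = rng.standard_normal(5000)
x = np.zeros(5000)
for n in range(2, 5000):
    x[n] = 1.0 * x[n - 1] - 0.6 * x[n - 2] + e[n]

a_hat, s2 = yule_walker(x, p=2)   # should recover approximately (1.0, -0.6)
```

Choosing p = 2 here matches the generating process; a much larger p applied to the same data would begin to introduce the spurious spectral peaks described above.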
7.3 Time-Frequency Processing: Wavelets
For the conventional spectral-analysis techniques discussed thus far, the temporal and spectral resolution of the resulting estimates are highly dependent on the selected time series length, model order, and other parameters. This is particularly problematic when the time series contains transient oscillations that are localized in time. For instance, for a given observation length, the amplitude of a particular high-frequency oscillation (with respect to the observation length) can fluctuate significantly over the many cycles it completes within the observation. In contrast, the amplitude of a lower-frequency oscillation will not do so, because fewer of its cycles occur within the observation. For a given observation length, the FFT and AR methods produce only one frequency bin to represent these fluctuations at the respective frequency. By observing this bin in isolation, it is not possible to determine when a transient oscillation at that particular frequency occurs within the observation. Wavelet analysis addresses this problem by producing a time-frequency representation of the signal. However, as predicted by Heisenberg’s uncertainty principle, there is always a time/frequency resolution tradeoff in time series analysis: it is impossible to precisely determine both the instantaneous frequency and the time of occurrence of an event. Longer observation lengths produce spectral estimates with higher frequency resolution but poorer temporal localization, while shorter windows provide better temporal localization at the expense of frequency resolution.
There are a wide variety of mother wavelets, each with specific time-frequency characteristics and mathematical properties. In addition, application-specific mother wavelets can be developed if general mother wavelet characteristics are known or desired. [5, 6] provide the theoretical details of wavelets.
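A minimal NumPy-only sketch of the idea, using a complex Morlet mother wavelet (the signal, the two analysis frequencies, and the six-cycle envelope width are illustrative assumptions), localizes a transient burst in both time and frequency:

```python
import numpy as np

def morlet(f, fs, n_cycles=6.0):
    # Complex sinusoid at frequency f under a Gaussian envelope n_cycles wide
    sd = n_cycles / (2 * np.pi * f)
    t = np.arange(-4 * sd, 4 * sd, 1 / fs)
    return np.exp(2j * np.pi * f * t) * np.exp(-t ** 2 / (2 * sd ** 2))

def wavelet_power(x, fs, freqs):
    # One row per analysis frequency: power of the convolution with each wavelet
    return np.array([np.abs(np.convolve(x, morlet(f, fs), mode="same")) ** 2
                     for f in freqs])

fs = 250.0
t = np.arange(0, 2, 1 / fs)
burst = np.sin(2 * np.pi * 40 * t) * ((t > 0.9) & (t < 1.1))   # transient 40 Hz burst
x = np.sin(2 * np.pi * 5 * t) + burst                          # plus an ongoing 5 Hz rhythm

power = wavelet_power(x, fs, [5.0, 40.0])
# The 40 Hz row of `power` is concentrated around t = 1 s, revealing *when*
# that oscillation occurred -- information a single FFT bin cannot provide.
```

Note the built-in resolution tradeoff: the short 40 Hz wavelet localizes the burst sharply in time, while the longer 5 Hz wavelet has better frequency selectivity but coarser temporal precision.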
7.4 Conclusion
This chapter provided a broad overview of common techniques for extracting meaningful features from time-series data. The reader should now be familiar with the basic concepts of time-domain analysis and the transition to the frequency domain via filtering, Fourier analysis, and wavelet analysis. For a deeper understanding of the topic, dedicated textbooks are recommended (e.g., [7, 8]).
References
1. Priestley MB. Non-linear and non-stationary time series analysis. London: Academic Press; 1988.
2. Hayes MH. Statistical digital signal processing and modeling. New York: John Wiley & Sons; 1996.
3. Hamilton JD. Time series analysis. Princeton: Princeton University Press; 1994.
4. Madsen H. Time series analysis. Hoboken: CRC Press; 2007.
5. Mallat S. A wavelet tour of signal processing. San Diego: Academic Press; 1999.
6. Daubechies I. The wavelet transform, time-frequency localization and signal analysis. IEEE Trans Inf Theory. 1990;36(5):961–1005.
7. Lyons RG. Understanding digital signal processing. 3rd ed. Upper Saddle River: Prentice Hall; 2011.
8. Proakis JG, Manolakis DG. Digital signal processing: principles, algorithms and applications. Upper Saddle River: Pearson Prentice Hall; 2007.
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.