Estimating the DOA of speakers and speech signals is a problem of wide interest in acoustic signal processing. The literature indicates that optimization-based beamforming is advantageous for DOA estimation. A sound understanding of the MA is essential before developing DOAE techniques. Many applications call for large arrays, on the order of more than 100 elements, in auditoriums, whereas small arrays of 2 or 3 elements are recommended for mobile telephones and hearing aids. Microphone array technology is also widely applied in surveillance and speech recognition. Conventional techniques employed for MAs include spatial filters such as optimal beamformers, adaptive beamformers, and frequency-invariant beamformers. Such array methods assume either knowledge of a calibration signal or knowledge of the array model, in addition to localization information, for their design. Accordingly, they typically incorporate some form of localization and tracking alongside the beamforming itself. More recently, techniques based on time-frequency masking and blind signal separation (BSS) have attracted researchers' attention. These methods rely less on localization and the array model and more on the statistical properties of speech signals, including non-stationarity, sparseness, and non-Gaussianity.
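
To make the link between beamforming and DOA estimation concrete, the following is a minimal sketch of a delay-and-sum scan for a two-element array. It is an illustration only, not a method taken from the works discussed here. It assumes a single far-field source, no noise or reverberation, and a sound speed of 343 m/s; the function names `simulate_two_mic` and `estimate_doa_delay_and_sum`, the microphone spacing, and the 1-degree search grid are all hypothetical choices.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def simulate_two_mic(signal, fs, spacing, angle_deg):
    """Simulate a far-field plane wave reaching two microphones.

    The second microphone receives the signal delayed by
    spacing * cos(angle) / c. The (fractional, circular) delay is
    applied in the frequency domain. Illustrative helper only.
    """
    tau = spacing * np.cos(np.deg2rad(angle_deg)) / SPEED_OF_SOUND
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    delayed = np.fft.irfft(spectrum * np.exp(-2j * np.pi * freqs * tau),
                           n=len(signal))
    return np.stack([signal, delayed])


def estimate_doa_delay_and_sum(mics, fs, spacing, grid_deg=None):
    """Scan candidate angles and return the one with maximum output power."""
    if grid_deg is None:
        grid_deg = np.arange(0.0, 181.0, 1.0)  # assumed 1-degree grid
    spectra = np.fft.rfft(mics, axis=1)
    freqs = np.fft.rfftfreq(mics.shape[1], d=1.0 / fs)
    powers = []
    for angle in grid_deg:
        # Hypothesized inter-microphone delay for this look direction.
        tau = spacing * np.cos(np.deg2rad(angle)) / SPEED_OF_SOUND
        # Advance the second channel by tau, then sum the two channels.
        steered = spectra[0] + spectra[1] * np.exp(2j * np.pi * freqs * tau)
        powers.append(np.sum(np.abs(steered) ** 2))
    return float(grid_deg[int(np.argmax(powers))])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.standard_normal(4096)  # broadband stand-in for speech
    mics = simulate_two_mic(source, fs=16000, spacing=0.1, angle_deg=60.0)
    print(estimate_doa_delay_and_sum(mics, fs=16000, spacing=0.1))
```

The scan relies on exactly the array-model and localization knowledge that conventional beamformers assume: the microphone spacing and the propagation speed must be known for the steering delays to be meaningful.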

From a theoretical perspective, spatial diversity is considered the foremost advantage of using multiple microphones, since it is an efficient tool to combat reverberation, noise, and interference. The underlying physical feature that is exploited is the difference in coherence between the target speech signal and the noise field. This also helps explain the difficulty of enhancing highly reverberant speech from the received microphone signals.
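
The coherence difference can be illustrated numerically. The sketch below evaluates the standard model for the spatial coherence of a spherically isotropic (diffuse) noise field between two omnidirectional microphones, sin(2πfd/c)/(2πfd/c), and contrasts it with a single free-field point source, whose coherence magnitude is 1 at all frequencies. The function name `diffuse_field_coherence`, the 10 cm spacing, and the sound speed of 343 m/s are assumptions made for illustration.

```python
import numpy as np


def diffuse_field_coherence(freqs_hz, spacing_m, c=343.0):
    """Coherence of an ideal diffuse noise field between two microphones.

    np.sinc(x) is the normalized sinc, sin(pi*x)/(pi*x), so passing
    x = 2*f*d/c gives sin(2*pi*f*d/c) / (2*pi*f*d/c).
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return np.sinc(2.0 * freqs_hz * spacing_m / c)


if __name__ == "__main__":
    freqs = np.array([0.0, 500.0, 1000.0, 2000.0, 4000.0])
    gamma_noise = diffuse_field_coherence(freqs, spacing_m=0.1)
    gamma_target = np.ones_like(freqs)  # ideal coherent point source
    for f, gn, gt in zip(freqs, gamma_noise, gamma_target):
        print(f"{f:7.0f} Hz  diffuse noise: {gn:+.3f}  target: {gt:.3f}")
```

Under this model the noise coherence falls off with frequency and spacing while the direct-path target stays coherent. Strong reverberation makes the speech itself behave more like a diffuse field, which is one way to see why highly reverberant speech is hard to enhance.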

Traditional techniques such as hands-free operation of MAs, frequency-invariant beamforming, and source localization have been developed for efficient DOAE. Small microphone arrays have numerous applications in hearing aids, close-up microphones, and mobile terminals. Novel small-array designs support the suppression of multiple interferers. Speech distortion and noise artifacts stemming from the processing are largely unavoidable.