Barge-in Implementation Method for Multi-CPU In-Vehicle Speech Recognition System
The objective is to implement a barge-in function for a multi-CPU in-vehicle speech recognition system. Barge-in allows the user to speak while the system is still playing a guidance prompt. The barge-in function requires two audio inputs: the speech signal from the microphone and the original guidance prompt data, which serves as the reference. Two major problems arise in implementing barge-in. The first is sampling frequency synchronization, because the two audio inputs usually have different sampling frequencies. The second is input timing synchronization, because the specification requires the timing gap between the two audio inputs to be within plus or minus 7 ms. We adopted a digital hardware converter for sampling frequency conversion. Hardware processing introduces a much smaller signal input delay than software processing, and digital processing, unlike analog processing, does not degrade signal quality, so barge-in performance is preserved. We store the reference data immediately before the audio input stage of the barge-in module, which removes the variability caused by transmitting the reference data between the two CPUs. For the microphone input, we reduce variability by synchronizing the microphone input request with the guidance prompt play request. The guidance prompt captured by the microphone is passed into the barge-in module together with the reference data, sequentially and in the smallest unit, to maintain real-time processing. The design was implemented and evaluated, and it was validated in terms of the input timing gap between the two audio inputs to the barge-in module and the performance of the barge-in function. This study is specific to multi-CPU in-vehicle speech recognition systems. A prerequisite is that the full reference data must be held in memory to maintain real-time processing.
Our method solves the two major problems through hardware sampling frequency conversion and the proposed input timing synchronization method. It also preserves the real-time processing of speech recognition even though the barge-in function is added as a preprocessing stage. In summary, we proposed a barge-in implementation method for a multi-CPU in-vehicle speech recognition system that resolves the sampling frequency and input timing synchronization of the two audio inputs without introducing latency issues for speech recognition.
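The input timing requirement can be illustrated with a minimal Python sketch. It is not the authors' implementation. It only shows microphone and reference frames being paired one unit at a time, with a check that their start times differ by no more than the 7 ms stated in the specification. The names `Frame`, `make_frames`, `within_tolerance`, and `paired_frames` are hypothetical, as are the 10 ms frame length and 160 samples per frame, and both streams are assumed to have already been converted to a common sampling frequency.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple

MAX_GAP_MS = 7.0  # tolerance from the text: plus or minus 7 ms


@dataclass
class Frame:
    """One smallest processing unit of audio (hypothetical structure)."""
    start_ms: float     # time at which the frame starts
    samples: List[int]  # PCM samples, assumed already at a common rate


def make_frames(offset_ms: float, count: int,
                frame_ms: float = 10.0, frame_len: int = 160) -> List[Frame]:
    """Build `count` silent frames starting at `offset_ms` (illustrative values)."""
    return [Frame(offset_ms + i * frame_ms, [0] * frame_len) for i in range(count)]


def within_tolerance(mic: Frame, ref: Frame,
                     max_gap_ms: float = MAX_GAP_MS) -> bool:
    """True if the timing gap between the two inputs is within the tolerance."""
    return abs(mic.start_ms - ref.start_ms) <= max_gap_ms


def paired_frames(mic_frames: Iterable[Frame], ref_frames: Iterable[Frame],
                  max_gap_ms: float = MAX_GAP_MS) -> Iterator[Tuple[Frame, Frame]]:
    """Yield (microphone, reference) pairs one unit at a time.

    Raises ValueError as soon as a pair exceeds the allowed timing gap,
    so a downstream barge-in module never receives misaligned input.
    """
    for mic, ref in zip(mic_frames, ref_frames):
        if not within_tolerance(mic, ref, max_gap_ms):
            gap = mic.start_ms - ref.start_ms
            raise ValueError(f"timing gap {gap:+.1f} ms exceeds {max_gap_ms} ms")
        yield mic, ref


if __name__ == "__main__":
    reference = make_frames(0.0, 4)   # guidance prompt held in memory
    microphone = make_frames(3.0, 4)  # microphone input, 3 ms behind
    for mic, ref in paired_frames(microphone, reference):
        print(f"ref {ref.start_ms:5.1f} ms  mic {mic.start_ms:5.1f} ms")
```

Pairing through a generator keeps the sketch streaming: each pair is handed on as soon as it is available, which mirrors the sequential, smallest-unit input described above without buffering the whole utterance.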