Speech over Engine Noise
My journey with Meluta started last August, when I first stepped into the small office in Hermia and began to familiarize myself with the AI-SEE project. My task in the project was to filter out the engine noise of the Sandvik Panthera series mining drill, thus enhancing the speech coming from outside the machine. This is how I got the subject for my thesis: Real-time noise filtering with adaptive filters in heavy equipment soundscape. Now, almost a year later, I am finally closing in on the finish line of the research and writing for the project, and my graduation is near.
The demand for this kind of application stems from the comfort of the machine operator and the safety of others. As cars and heavy equipment have developed, the cabin has become isolated from outside voices. This is understandable, since the sound-insulating materials of the vehicle and, last but not least, the earmuffs are designed to make the working conditions safer and more comfortable for the machine operator. However, when all the loud, unwanted noises of the machine are cancelled, beneficial sounds are cancelled as well, for example speech or shouts from people near the machine. By filtering out the unwanted sound, in other words the noise, and bringing the speech to the operator's headphones, the safety of others is improved. Because the system should be turned on for the operator's whole workday, the pleasantness of the filtering result is in focus along with the accuracy of the system.
The Meluta MAX was used as the development platform. MAX is a stand-alone platform for developing different real-time algorithms. It has various sensor connections; in this case a mic array containing four microphones was used. The MAX is packed in a briefcase, which can easily be attached on top of the machine whose engine sound is to be suppressed. One of the microphones was used as a reference microphone for the engine noise.
The filtering problem was tackled with two different adaptive filters, a Wiener filter and a normalized least mean squares (NLMS) filter, and the thesis compared these two solutions to each other. Even though there were only sixteen test listeners, the results were clear: both filters managed to enhance the speech and produce a cleaner soundscape. Surprisingly, the Wiener filter produced better results. As for the reasons why, I have thought about the wind conditions during the tests, the design of the NLMS algorithm and the order in which the algorithms were tested. It may be any one of those or none of them, but either way the NLMS filter could probably reach the same results as the Wiener filter if the algorithm were developed further.
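To give an idea of what an NLMS filter does with a reference microphone, here is a minimal sketch in plain Python. It is not the code from the thesis or from the MAX platform. The function names, the filter order, the step size and the synthetic test signals (white noise standing in for the engine, a sine wave standing in for speech, a made-up three-tap acoustic path) are all assumptions for illustration only.

```python
import math
import random


def nlms_cancel(primary, reference, order=8, mu=0.2, eps=1e-8):
    """Subtract an adaptive estimate of the reference noise from the primary signal.

    order, mu and eps are illustrative defaults, not values from the thesis.
    """
    weights = [0.0] * order
    window = [0.0] * order  # most recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        window = [x] + window[:-1]
        noise_estimate = sum(w * u for w, u in zip(weights, window))
        error = d - noise_estimate  # enhanced sample: speech plus residual noise
        energy = sum(u * u for u in window) + eps
        step = mu * error / energy  # normalizing by the input energy is the "N" in NLMS
        weights = [w + step * u for w, u in zip(weights, window)]
        output.append(error)
    return output


def make_demo_signals(n=4000, seed=0):
    """Build synthetic stand-ins for speech, the reference mic and the primary mic."""
    rng = random.Random(seed)
    reference = [rng.gauss(0.0, 1.0) for _ in range(n)]  # hypothetical engine noise
    path = [0.6, -0.3, 0.1]  # hypothetical path from the engine to the primary mic
    speech = [0.5 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
    primary = []
    for i in range(n):
        noise = sum(path[k] * reference[i - k] for k in range(len(path)) if i - k >= 0)
        primary.append(speech[i] + noise)
    return speech, reference, primary


def mean_square(a, b):
    """Mean squared difference between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    speech, reference, primary = make_demo_signals()
    enhanced = nlms_cancel(primary, reference)
    half = len(speech) // 2  # skip the part where the filter is still adapting
    print("noise power before:", mean_square(primary[half:], speech[half:]))
    print("noise power after: ", mean_square(enhanced[half:], speech[half:]))
```

The filter learns how the engine noise travels from the reference microphone to the primary microphone and subtracts that estimate, so what is left over is mostly the speech. A real-time version would process the audio in blocks and use a numerical library instead of Python lists.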
The limit of the system turned out to be a human speaking from a distance of 20 m, which the listeners were not able to hear even with the enhancement on. This was anticipated, since in real life people cannot hear speech from very far away either. There was also a delay of a few milliseconds due to the processing and the Bluetooth connection of the headphones, but since the listeners could not hear any outside sound other than through the Bluetooth headphones transmitting the soundscape, this did not bother them.
Even though this system was developed for a machine operator sitting in the cabin of the vehicle, it can be adapted to remotely controlled machines as well. In that case the operators can hear the surroundings of the machine even though they are far away from the machine itself, but without the disturbing engine noise. This may improve the safety of remote-controlled machines, since the operators may have several machines to operate at once and cannot watch all of them at the same time. Here the system could benefit from a simple classifier that detects whether a human is speaking near the remote-controlled machine. This is, however, a matter of future development and is only briefly discussed in my thesis.
The system could be developed forever, but the project must come to an end at some point. The developed system met its requirements and in some cases even surpassed them (human speech at a distance of 5 meters, filtered with the Wiener filter). Now it is time to move forward and tackle different problems with different solutions in different projects, but my journey with audio processing has just begun.