  • Assume that my microphone audio is being live streamed to a Zoom app (could be any application in general)

  • I want to run this audio through an executable (a filter) before it is sent to the application. Naively, I picture a piping mechanism like this: raw input | filter | application.

  • I need to maintain a buffer in my filter, assume around 50ms (because the audio preprocessing is dynamic), and only then flush that to the application, like how I/O works in general.

  • Let's say I am coding this in C++ (although Python would be preferred, since a bit of ML is involved in the audio preprocessing).

(In effect, the application would receive my audio with a 50ms delay, but in practice this does not matter to me.)
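To make the buffer size concrete, here is a small sketch of the arithmetic. The format (48 kHz, 16-bit, mono, interleaved PCM) is only an assumption on my part, and `chunk_bytes` is a hypothetical helper name; the real values depend on what the capture device delivers.

```python
def chunk_bytes(duration_ms: int, sample_rate: int = 48_000,
                channels: int = 1, sample_width: int = 2) -> int:
    """Return the number of bytes of interleaved PCM covering duration_ms.

    The defaults (48 kHz, mono, 2 bytes per sample) are assumed values,
    not something the capture device is guaranteed to use.
    """
    frames = sample_rate * duration_ms // 1000  # frames in the window
    return frames * channels * sample_width     # bytes per frame times frames


# 50 ms at the assumed format: 2400 frames, 2 bytes each
print(chunk_bytes(50))  # 4800
```

So a 50ms buffer would hold 4800 bytes under these assumptions, and twice that for stereo.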

How can I achieve this? Since this is quite a broad field, I specifically want to know a way to do the following:

  1. capture raw audio data (from - assume - a headset)
  2. process the raw audio using a C++ executable (preferably Python, though) which holds it in a 50ms buffer, and then
  3. flush the post-processed buffer to a live stream application (like Zoom, etc.)
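For step 2, this is roughly the kind of filter I have in mind: a minimal sketch that reads raw PCM from one byte stream, collects it into fixed-size chunks, processes each chunk, and flushes it to another stream. The names `run_filter` and `process`, and the 4800-byte chunk (50ms of 48 kHz, 16-bit, mono audio), are my own assumptions, and `process` is only a pass-through placeholder for the real preprocessing.

```python
from typing import BinaryIO, Callable

CHUNK_BYTES = 4800  # assumed: 50 ms of 48 kHz, 16-bit, mono PCM


def process(chunk: bytes) -> bytes:
    """Placeholder for the real preprocessing; returns the audio unchanged."""
    return chunk


def run_filter(src: BinaryIO, dst: BinaryIO,
               chunk_bytes: int = CHUNK_BYTES,
               fn: Callable[[bytes], bytes] = process) -> None:
    """Copy src to dst in chunks of chunk_bytes, applying fn to each chunk."""
    buf = bytearray()
    while True:
        # Ask only for what is still missing from the current chunk.
        data = src.read(chunk_bytes - len(buf))
        if not data:  # end of stream
            break
        buf += data
        if len(buf) == chunk_bytes:
            dst.write(fn(bytes(buf)))
            dst.flush()  # hand the chunk on immediately
            buf.clear()
    if buf:
        # Process whatever partial chunk is left at end of stream.
        dst.write(fn(bytes(buf)))
        dst.flush()
```

Calling `run_filter(sys.stdin.buffer, sys.stdout.buffer)` would turn this into the middle stage of a shell pipe. What I do not know is how to do steps 1 and 3 around it, that is, how to feed it from the headset and how to make the application see its output as a microphone.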

I have seen the post on noise removal with PulseAudio, but my desired audio processing is much more general than using a built-in PulseAudio module or profiling noise with sox.
