Voice Activity

v1.0.0 (new)

voice_activity processes microphone audio into speech-state signals for voice workflows. Use it when a patch needs echo cancellation, speech start/end detection, and optional Smart Turn end-of-turn classification before sending audio to STT or an agent loop.

What It Does

The operator loads a pipeline with Silero VAD, optional LiveKit echo cancellation/audio processing, and optional Smart Turn v3. While Active is on, incoming audio chunks are queued, processed, and converted into CHOP-observable state: speaking, speech start, speech end, turn complete, Smart Turn probability, and latency metrics.

Echo Cancellation uses input 2 as reference speaker/TTS audio. Smart Turn checks whether a pause likely means the user has finished speaking, reducing mid-sentence cutoffs in voice interfaces.

Typical Workflow

1. Wire microphone audio to input 1. If Echo Cancellation is enabled, wire speaker or TTS reference audio to input 2.
2. Pulse Install Dependencies once if the required Python packages are missing.
3. Pulse Load Pipeline, or leave Auto Load on Init enabled and wait for Pipeline Ready.
4. Tune Speech Threshold, Min Silence, and Speech Pad for the microphone and room.
5. Enable Smart Turn when semantic end-of-turn detection is useful, then adjust Turn Threshold and Turn Silence.
6. Turn Active on and monitor Is Speaking, Smart Turn Probability, and downstream CHOP flags.
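
The load, tune, and activate steps above can also be scripted. The following is a minimal sketch, assuming `vad` is the operator object (for example `op('voice_activity')`) and that its parameters behave like standard TouchDesigner parameters with `eval()`, `pulse()`, and `val`. The helper names `request_load` and `activate_when_ready` are hypothetical, not part of the operator.

```python
def request_load(vad):
    """Pulse Load Pipeline unless the pipeline already reports ready.

    Returns True if a load was requested, False if nothing was needed.
    """
    if vad.par.Pipelineready.eval():
        return False
    vad.par.Loadpipeline.pulse()
    return True


def activate_when_ready(vad, speech_threshold=0.81, min_silence_ms=508,
                        speech_pad_ms=242):
    """Apply VAD tuning and turn Active on, but only once Pipeline Ready is set.

    Defaults mirror the documented parameter defaults. Returns True if the
    operator was activated, False if the pipeline is not ready yet.
    """
    if not vad.par.Pipelineready.eval():
        return False

    # Tune for the microphone and room before audio starts flowing.
    vad.par.Speechthreshold.val = speech_threshold
    vad.par.Minsilenceduration.val = min_silence_ms
    vad.par.Speechpadding.val = speech_pad_ms

    vad.par.Active.val = True
    return True
```

Call `request_load(vad)` once, then call `activate_when_ready(vad)` on a later frame until it returns True. Loading is not instantaneous, so the second call is kept separate rather than blocking.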

Inputs And Outputs

Input 1: Mono microphone audio CHOP, expected at the processing sample rate.
Input 2: Optional reference audio CHOP for Echo Cancellation.
Output 1: Processed audio CHOP after the enabled audio-processing stages.
Output 2: Status and metrics CHOP.
Output 3: Speaking and turn-complete flag CHOP.

Works Well With

stt: Receives gated/processed microphone audio and speech boundary signals.
tts: Supplies reference audio for echo cancellation in speaker playback setups.
agent: Uses turn-complete signals to decide when to respond.
flow_router: Routes speech start/end and Smart Turn events.
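
One way to hand speech and turn events to downstream operators is a CHOP Execute DAT watching Output 3. The sketch below uses the standard `onOffToOn` and `onOnToOff` callback signatures. The channel names `speaking` and `turn_complete` are assumptions (check the actual channel names on the flag CHOP), and the `events` list is a hypothetical stand-in for real calls into stt, agent, or flow_router.

```python
# CHOP Execute DAT callbacks for the flag CHOP (Output 3).
# Channel names are assumed for illustration; adjust to the real ones.
SPEAKING_CHANNEL = "speaking"            # hypothetical channel name
TURN_COMPLETE_CHANNEL = "turn_complete"  # hypothetical channel name

# Stand-in event log; replace the appends with calls into your own patch.
events = []


def onOffToOn(channel, sampleIndex, val, prev):
    # A flag went from 0 to 1: speech started, or a turn was judged complete.
    if channel.name == SPEAKING_CHANNEL:
        events.append("speech_start")
    elif channel.name == TURN_COMPLETE_CHANNEL:
        events.append("turn_complete")
    return


def onOnToOff(channel, sampleIndex, val, prev):
    # Only the speaking flag produces an event on its falling edge.
    if channel.name == SPEAKING_CHANNEL:
        events.append("speech_end")
    return
```

Reacting to edges rather than polling the flag values keeps an agent from responding more than once to the same turn.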

Gotchas

Active does nothing useful until Load Pipeline has succeeded and Pipeline Ready is on.
Echo Cancellation needs reference audio on input 2. Without that signal, enabling it cannot remove speaker echo.
Smart Turn can add a small amount of end-of-turn latency in exchange for fewer premature cutoffs.
The first Smart Turn load may download and cache ONNX and feature-extractor assets.
The operator replaces older Silero-only VAD workflows; avoid running both on the same microphone path.
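
The first two gotchas can be caught with a small pre-flight check before enabling Active. This is a sketch only: `preflight_warnings` is a hypothetical helper, and whether input 2 is wired is passed in as a plain boolean because how to detect that connection depends on how the operator is built.

```python
def preflight_warnings(vad, has_reference_audio):
    """Return human-readable warnings for common setup mistakes.

    vad: the operator object, e.g. op('voice_activity').
    has_reference_audio: True if speaker/TTS audio is wired to input 2
        (supplied by the caller; this helper does not inspect the wiring).
    """
    warnings = []
    if not vad.par.Pipelineready.eval():
        warnings.append(
            "Pipeline is not ready; pulse Load Pipeline before enabling Active.")
    if vad.par.Enableaec.eval() and not has_reference_audio:
        warnings.append(
            "Echo Cancellation is on but input 2 has no reference audio.")
    return warnings
```

An empty list means neither condition was detected. It does not guarantee that the reference audio is actually the signal being played through the speakers.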

Parameters

Voice Activity

Pipelinestatus (Pipelinestatus) op('voice_activity').par.Pipelinestatus Str

Default: "" (Empty String)

Pipeline Header

Active (Active) op('voice_activity').par.Active Toggle

Default: False

Auto Load on Init (Autoloadoninit) op('voice_activity').par.Autoloadoninit Toggle

Default: True

Load Pipeline (Loadpipeline) op('voice_activity').par.Loadpipeline Pulse

Default: False

Unload Pipeline (Unloadpipeline) op('voice_activity').par.Unloadpipeline Pulse

Default: False

Pipeline Ready (Pipelineready) op('voice_activity').par.Pipelineready Toggle

Default: False

Is Speaking (Isspeaking) op('voice_activity').par.Isspeaking Toggle

Default: False

Install Dependencies (Installdependencies) op('voice_activity').par.Installdependencies Pulse

Default: False

Audio Processing Header

Echo Cancellation (Enableaec) op('voice_activity').par.Enableaec Toggle

Default: True

Noise Suppression (Enablenoisesuppression) op('voice_activity').par.Enablenoisesuppression Toggle

Default: False

Auto Gain Control (Enableautogaincontrol) op('voice_activity').par.Enableautogaincontrol Toggle

Default: False

High-Pass Filter (Enablehighpassfilter) op('voice_activity').par.Enablehighpassfilter Toggle

Default: False

VAD (Silero) Header

Speech Threshold (Speechthreshold) op('voice_activity').par.Speechthreshold Float

Default: 0.81
Range: 0 to 1

Min Silence (ms) (Minsilenceduration) op('voice_activity').par.Minsilenceduration Int

Default: 508
Range: 0 to 2000

Speech Pad (ms) (Speechpadding) op('voice_activity').par.Speechpadding Int

Default: 242
Range: 0 to 500
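
To build intuition for how the three VAD parameters interact, here is a simplified sketch of threshold-plus-silence segmentation over per-chunk speech probabilities. It is not the operator's implementation. The function name `detect_segments`, the 32 ms chunk duration, and the exact padding behaviour are assumptions for illustration.

```python
def detect_segments(probs, chunk_ms=32, threshold=0.81,
                    min_silence_ms=508, pad_ms=242):
    """Turn per-chunk speech probabilities into padded (start_ms, end_ms) segments.

    A segment opens on the first chunk at or above `threshold` and closes
    once `min_silence_ms` of consecutive sub-threshold audio has passed.
    Each segment is widened by `pad_ms` on both sides (start clamped to 0).
    """
    segments = []
    speaking = False
    start_ms = 0
    silence_ms = 0
    last_voiced_end = 0

    for i, p in enumerate(probs):
        t0 = i * chunk_ms
        t1 = t0 + chunk_ms
        if p >= threshold:
            if not speaking:
                speaking = True          # speech start
                start_ms = t0
            silence_ms = 0               # any voiced chunk resets the silence timer
            last_voiced_end = t1
        elif speaking:
            silence_ms += chunk_ms
            if silence_ms >= min_silence_ms:
                # Pause is long enough: speech end.
                segments.append((max(0, start_ms - pad_ms),
                                 last_voiced_end + pad_ms))
                speaking = False

    if speaking:
        # Input ended while still speaking; close the open segment.
        segments.append((max(0, start_ms - pad_ms), last_voiced_end + pad_ms))
    return segments
```

In this model, raising Speech Threshold rejects more background noise but can clip quiet speech, raising Min Silence bridges longer pauses at the cost of a later speech end, and Speech Pad protects word onsets and tails from being trimmed.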

Smart Turn Header

Smart Turn (Enablesmartturn) op('voice_activity').par.Enablesmartturn Toggle

Default: True

Turn Threshold (Smartturnthreshold) op('voice_activity').par.Smartturnthreshold Float

Default: 0.333
Range: 0 to 1

Turn Max Audio (sec) (Smartturnmaxaudio) op('voice_activity').par.Smartturnmaxaudio Float

Default: 8.0
Range: 1 to 30

Turn Probability (Smartturnprob) op('voice_activity').par.Smartturnprob Float

Default: 0.0
Range: 0 to 1

Smart Turn Ready (Smartturnready) op('voice_activity').par.Smartturnready Toggle

Default: False

Turn Silence (ms) (Smartturnsilence) op('voice_activity').par.Smartturnsilence Int

Default: 1000
Range: 100 to 3000
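
The sketch below shows one plausible way Turn Threshold and Turn Silence could combine into a turn-complete decision. It is an assumed interpretation, not confirmed operator behaviour: the classifier result is accepted once the VAD pause has reached Min Silence, and Turn Silence acts as a fallback that ends the turn after a long pause even when the probability stays low. The function name `is_turn_complete` is hypothetical.

```python
def is_turn_complete(silence_ms, turn_probability, turn_threshold=0.333,
                     min_silence_ms=508, turn_silence_ms=1000):
    """Decide whether the current pause ends the user's turn (assumed logic).

    silence_ms: how long the user has been silent so far.
    turn_probability: Smart Turn's estimate that the user has finished (0 to 1).
    """
    if silence_ms < min_silence_ms:
        # Pause is too short for the VAD to report speech end yet.
        return False
    if turn_probability >= turn_threshold:
        # Classifier judges the utterance to be finished.
        return True
    # Assumed fallback: a long enough silence ends the turn regardless.
    return silence_ms >= turn_silence_ms
```

Under this reading, a lower Turn Threshold makes the operator respond sooner but risks more mid-sentence cutoffs, while a longer Turn Silence gives hesitant speakers more time before the fallback fires.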

Changelog

v1.0.0 (2026-05-02)

Updated manifest category to 0.3.0 group taxonomy
Initial voice_activity structure
