Handling Interruptions with LLM

This guide explains how to implement sophisticated interruption handling in your Dasha applications when working with Large Language Models (LLMs). Proper interruption handling ensures natural, human-like conversations by managing when the AI should speak, listen, or wait.

Overview

Interruption handling involves three key components:

  1. Silence Management - Knowing when to stay quiet while users speak
  2. Wait Request Handling - Processing user requests to pause the conversation
  3. Smart Interruption Logic - Determining appropriate responses when users interrupt the AI

Essential Prompt Instructions

Before implementing the technical components, include these instructions in your LLM prompt so the model knows when to stay silent and how to handle wait requests:

## Silence Management

**Critical**: Use the keep_silence function to avoid interrupting users:

- **When users are dictating information**: Phone numbers, addresses, codes, names, or any critical data
- **When users are mid-thought**: Verbal cues like "I think that...", "right?", "I mean..." indicate they may continue
- **When users haven't finished speaking**: Based on conversational context, intonation, or natural speech patterns
- **During natural pauses**: Wait for clear completion signals before responding

**IMPORTANT EXCEPTION**: Do NOT use keep_silence if the user has interrupted you while you were speaking. In interruption scenarios, allow the interruption handling system to manage the flow instead of forcing silence. You can identify interruption scenarios by checking if there is a system message in conversation history stating "user interrupted us".

Always remain completely silent until you're certain the user has finished speaking. Never interrupt with prompts like "please continue" during dictation or mid-speech.

## Wait Request Handling

**Critical**: When users ask you to wait or indicate they need a moment, you MUST follow this two-step process:

### Step 1: Acknowledge the Request

**ALWAYS** verbally acknowledge the user's wait request first with a brief, natural response such as:

- "Of course, take your time!"
- "Sure, no problem!"
- "Absolutely, I'll wait."
- "No rush, I'm here when you're ready."
- "Take all the time you need!"

### Step 2: Call the Function

**AFTER** acknowledging, immediately call the handle_wait_request function:

- **When users ask to wait**: Use handle_wait_request("user asked to wait") for requests like "Can you wait for a second?", "Hold on", "Give me a moment", "One sec"
- **When users indicate they're stepping away**: Use handle_wait_request("user stepping away temporarily") for phrases like "Let me check something", "I need to do something quickly"
- **When users need time**: Use handle_wait_request("user needs time") for any indication they need a pause in the conversation

**Important behaviors**:

- This function automatically disables hello pings until the user speaks again
- The user won't be interrupted with "Hello?" or "Are you there?" messages while they're away
- Hello functionality automatically re-enables when the conversation resumes

**Complete example usage**:

- User: "Can you wait for a second?" → "Of course, take your time!" + handle_wait_request("user asked to wait")
- User: "Hold on, let me check" → "Sure, no problem!" + handle_wait_request("user asked to hold on")
- User: "Give me a moment" → "Absolutely, I'll wait." + handle_wait_request("user requested moment")

**IMPORTANT**: Never call handle_wait_request without first providing a verbal acknowledgment to the user.
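The text above is what you paste into your prompt. As a minimal sketch of where it can live, assuming your project keeps the prompt in a prompts.dsl context variable named prompt (the main loop below references $prompt from prompts.dsl) — the persona wording here is a placeholder, not part of the shipped sample:

// prompts.dsl (sketch; $prompt is later passed to #answerWithGPT in the main loop)
context {
    prompt: string = "You are a friendly voice assistant (placeholder persona). Follow the Silence Management rules: call keep_silence when the user is dictating data or is clearly mid-thought, but never right after the user has interrupted you. Follow the Wait Request Handling rules: first acknowledge a wait request verbally, then call handle_wait_request with a short reason.";
}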

Implementation Guide

Main Application Structure

Your main DSL file should follow this pattern to properly integrate interruption handling:

import "./hello.dsl"; import "./interruptions.dsl"; context { // Your inputs input llm_ApiKey: string? = null; input llm_model: string = "openai/gpt-4.1-mini"; // System variables finished: boolean = false; top_p: number = 1.0; } // ... your code here // main loop node main_loop { do { if ($.interruptions_should_wait_for_user()) { wait *; } set $helloRequested = false; // This is your main function. It is responsible for the majority of the dialogue. In it, you can see a prompt for GPT ($prompt, which can be found and changed in prompts.dsl) and options for this function. // Feel free to modify this part as needed based on your experience. // For more information on this function, visit: https://docs.dasha.ai/en-us/default/dasha-script-language/built-in-functions/#gpt var a = #answerWithGPT($prompt, interruptible:true, gptOptions: { model: $llm_model, apikey: $llm_ApiKey, allow_function_name_in_response: false }, sayOptions: { interruptDelay: 1.0, fillerTexts: [ "um" ], fillerSpeed: 1.0, fillerDelay: 5.0, fillerStartDelay: 10, max_retries: 3 }); #log("Answered with gpt"); if ($finished) { exit; } // This code is used when GPT calls a function. The retry is necessary to obtain a response from GPT with updated information from the called function(s). if ($.interruptions_process_llm_answer(a)) { goto retry; } wait *; } transitions { main_loop: goto main_loop on true; hello_request: goto main_loop on $helloRequested tags: ontick; retry: goto main_loop; } }

Key Implementation Points

  1. Wait Check: The interruptions_should_wait_for_user() function must be called at the beginning of your main loop
  2. Answer Processing: Use interruptions_process_llm_answer(a) after each GPT response to handle function calls and interruptions
  3. Interruptible Setting: Ensure your #answerWithGPT call includes interruptible:true
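Condensed to just these three points (omitting the hello plumbing, the finish check, and most GPT/say options shown in the full example above), the loop structure looks like this:

node main_loop {
    do {
        // (1) Wait check: hand the turn back to the user when a smart interruption was detected
        if ($.interruptions_should_wait_for_user()) {
            wait *;
        }

        // (3) Interruptible setting: let the user cut in while Dasha is speaking
        var a = #answerWithGPT($prompt,
            interruptible: true,
            gptOptions: { model: $llm_model, apikey: $llm_ApiKey },
            sayOptions: { interruptDelay: 1.0 });

        // (2) Answer processing: retry the turn when GPT called a tool other than
        //     keep_silence or handle_wait_request, so it can use the tool's result
        if ($.interruptions_process_llm_answer(a)) {
            goto retry;
        }
        wait *;
    }
    transitions {
        main_loop: goto main_loop on true;
        retry: goto main_loop;
    }
}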

Core Interruption Library

The interruption handling logic is contained in a dedicated library file:

library

context {
    interrupted: boolean = false;
    lastInterruptionTime: number = 0;
    // Threshold for the duration of the user's phrase, used in smart interruptions.
    // For more information, see the checkSmartInterruption comment. The current delay is set to 3 seconds.
    abruptInterruptionDelay: number = 3000;
    // Multiplier for the default VADPauseLength, used in normal conversation turns.
    defVADPauseLength: number = 0.8;
    // Multiplier for the default VADPauseLength, used when Dasha is interrupted.
    defInterruptedVADPauseLength: number = 1;
}

function checkSmartInterruption(): boolean {
    // Used when a person starts speaking but is interrupted due to delays, causing them to stop mid-sentence.
    // In such cases, the response is, "Sorry, please continue."
    // Modification is not recommended.
    // This code checks whether Dasha should stop and listen to the user or continue speaking.
    // If the user's phrase takes less time than abruptInterruptionDelay, it is considered an abrupt
    // interruption, and Dasha will continue speaking.
    // If the user's phrase takes more time than abruptInterruptionDelay, Dasha will stop and wait
    // for the user to continue.
    if ($this.interrupted && (#getCurrentTime() - $this.lastInterruptionTime < $this.abruptInterruptionDelay)) {
        var schema = {
            name: "completeness_analysis",
            strict: true,
            description: "Analysis of sentence completeness",
            schema: {
                @type: "object",
                properties: {
                    is_complete: {
                        @type: "string",
                        enum: ["yes", "no"]
                    }
                },
                additionalProperties: false,
                required: ["is_complete"]
            }
        }.toString();
        var gptOptions = {
            model: "openai/gpt-4.1-mini",
            apikey: $this.llm_ApiKey,
            function_call: "none",
            history_length: 0,
            response_format: schema,
            save_response_in_history: false
        };
        var last_turn = #getFormattedTranscription(options: { allow_incomplete: true, history_length: 2 });
        #log("last_turn: " + last_turn);
        var short = #askGPT(`Instructions: Analyze if the user (human) sentence is complete from a logic perspective based on the last conversation turn. Last turn: ` + last_turn, gptOptions: gptOptions);
        #log("askGPT: " + short.responseText);
        var parsedResult = short.responseText.parseJSON() as { is_complete: string; };
        #log("askGPT smart interruption: " + (parsedResult?.is_complete ?? "null"));
        if (parsedResult?.is_complete?.toLowerCase() == "no") {
            #addHistoryMessage({ source: "system", text: "User interrupted us." });
            return true;
        }
    }
    return false;
}

function handleInterruption(isInterrupted: boolean): unknown {
    // This code is responsible for logging cases of interruptions that occur during the dialogue.
    // It also dynamically increases the VADPauseLength to ensure that in the next turn of the conversation,
    // Dasha listens to the user until they stop talking.
    // We don't recommend modifying it.
    if (isInterrupted) {
        // If an interruption occurs, increase the VAD pause length used to determine when the user has stopped talking.
        #log("Increase VAD pause length to " + $this.defInterruptedVADPauseLength.toString());
        #setVadPauseLength($this.defInterruptedVADPauseLength);
        #trackEvent("Interrupted");
    } else {
        //#log("NOT Interrupted. Restore default VAD pause length of " + $this.defVADPauseLength.toString());
    }
    return null;
}

/**
 * - **Purpose**: Remain completely silent, avoiding any speech or interruption.
 * - **Best Practice**: Use whenever:
 *   1. The user is actively dictating critical information (e.g., phone numbers, addresses, codes, names)
 *      either continuously or in segments.
 *   2. The user is clearly still in the middle of expressing a thought, hasn't finished speaking, or indicates
 *      through context, verbal cues (e.g., "I think that...", "right?", "I mean..."), intonation, or speech
 *      patterns that they may continue.
 *   3. Even if there are no explicit verbal cues, intelligently determine if silence should be maintained
 *      based on conversational context and natural pauses indicating incomplete speech.
 *
 * - **IMPORTANT EXCEPTION**: Do NOT use keep_silence if the user has interrupted us while we were speaking.
 *   In interruption scenarios, allow the interruption handling system to manage the flow instead of forcing
 *   silence. You can identify interruption scenarios by checking if there is a system message in conversation
 *   history stating "user interrupted us".
 *
 * - Always remain silent until you are certain the user has completely finished dictating or speaking.
 * - Never interrupt, prompt, or interject phrases like "please continue" while the user is mid-speech or mid-dictation.
 * - Resume speaking only after a clear pause, explicit indication from the user that their input is finished,
 *   or when they directly invite your response.
 *
 * @param reason Succinct explanation for why silence is required (e.g., "user dictating phone number",
 *               "user still speaking", "user likely hasn't finished thought").
 * @return Explanation string (for agent logging only, not visible to user) detailing why silence was maintained.
 */
function keep_silence(reason: string): string {
    #log("keeping silence because: " + reason);
    return "keeping silence because: " + reason;
}

/**
 * - **Purpose**: Handle user requests to wait (e.g., "Can you wait for a second?", "Hold on", "Give me a moment").
 * - **Best Practice**: Use whenever:
 *   1. The user explicitly asks you to wait or hold on (e.g., "Can you wait for a second?", "Hold on",
 *      "Give me a moment", "One sec").
 *   2. The user indicates they need time to do something before continuing the conversation.
 *   3. The user is stepping away temporarily but plans to return to the conversation.
 *
 * @param reason Succinct explanation for why wait was requested (e.g., "user asked to wait", "user said hold on").
 * @return Explanation string (for agent logging only, not visible to user) detailing why wait was handled.
 */
function handle_wait_request(reason: string): string {
    #log("handling wait request because: " + reason);
    set $this.helloDisabledByWait = true;
    return "handling wait request because: " + reason;
}

function findIndex(array: string[], element: string): number {
    var index = 0;
    for (var arrayElement in array) {
        if (arrayElement == element) {
            return index;
        }
        set index = index + 1;
    }
    return -1;
}

function interruptions_should_wait_for_user(): boolean {
    // Re-enable hello if it was disabled by a wait request.
    if ($this.helloDisabledByWait) {
        set $this.helloDisabledByWait = false;
        #log("Re-enabled hello digression after wait request");
    }
    #setVadPauseLength($this.defVADPauseLength);
    var si = $this.checkSmartInterruption();
    // Turns on smart interruptions. For more information on how it works, refer to interruptions.dsl.
    if ($this.interrupted && !si) {
        #log("User interrupted but we have enough information to respond.");
    } else if ($this.interrupted && si) {
        #log("They interrupted but we don't have enough information to respond. Passing turn to user.");
        // This function is responsible for handling the interruption. It is defined in interruptions.dsl.
        $this.handleInterruption($this.interrupted);
        if (#waitForSpeech(1000)) {
            #log("User continued speaking");
            return true;
        }
    } else if ($this.interrupted) {
        #log("Interrupted, but no phrase said");
    }
    set $this.interrupted = false;
    return false;
}

// Returns true if a retry is required.
function interruptions_process_llm_answer(a: {
    calledFunctionNames: string[];
    functionCalled: boolean;
    saidPhrase: string;
    interrupted: boolean;
}): boolean {
    if (a.functionCalled) {
        #log(a.calledFunctionNames);
        var index1 = $this.findIndex(a.calledFunctionNames, "keep_silence");
        var index2 = $this.findIndex(a.calledFunctionNames, "handle_wait_request");
        if (index1 == -1 && index2 == -1) {
            #log("Called a tool, retry");
            return true;
        }
    }
    if (a.interrupted) {
        #log("Interrupted");
        set $this.interrupted = a.saidPhrase != "";
        set $this.lastInterruptionTime = #getCurrentTime();
    }
    return false;
}

Library Function Breakdown

  • keep_silence(): Instructs the AI to remain completely silent
  • handle_wait_request(): Manages user requests to pause the conversation
  • checkSmartInterruption(): Analyzes whether an interruption requires immediate response
  • handleInterruption(): Adjusts VAD settings when interruptions occur
  • interruptions_should_wait_for_user(): Main control function called in your main loop
  • interruptions_process_llm_answer(): Processes LLM responses and determines retry needs

Optional: Hello/Ping System

To complement interruption handling, you can implement a hello system that pings users during extended silence:

library

context {
    output status: string? = null;
    output serviceStatus: string? = null;
    helloDisabledByWait: boolean = false;
    helloRequested: boolean = false;
}

// These are the settings for pings (e.g., "Hello?", "Are you there?") when there is no response from the user.
// The types of these settings are defined here and it is not recommended to change them.
// You can find their descriptions and set values in the 'configuration' variable below.
type HelloConfiguration = {
    idleTimeLimit: number;
    lastIdleTime: number;
    retriesLimit: number;
    counter: number;
} with {
    preprocessorExecution(): boolean {
        set $this.lastIdleTime = 0;
        set $this.counter = 0;
        return false;
    }
};

// Reaction if nothing meaningful occurs in the dialogue for a long period.
preprocessor digression hello {
    conditions {
        on #getIdleTime() - digression.hello.configuration.lastIdleTime > digression.hello.configuration.idleTimeLimit
            && !$helloDisabledByWait tags: ontick;
    }
    var configuration: HelloConfiguration = {
        idleTimeLimit: 5000, // Maximum amount of silence time before a ping is sent. The higher the number, the longer Dasha will wait before pinging the user. Currently set to 5 seconds.
        lastIdleTime: 0,     // Checks the time since the last ping and determines when to send the next ping if the user hasn't responded to previous ones.
        retriesLimit: 2,     // Number of times Dasha will ping the user before ending the call.
        counter: 0           // Number of pings sent during the current silence period.
    };
    do {
        #log("hello preprocessor");
        set digression.hello.configuration.lastIdleTime = #getIdleTime();
        if (digression.hello.configuration.counter >= digression.hello.configuration.retriesLimit) {
            set $status = "EmptyCall";
            set $serviceStatus = "Done";
            exit;
        }
        set digression.hello.configuration.counter = digression.hello.configuration.counter + 1;
        #addHistoryMessage({
            source: "system",
            text: "User is silent for a long time. Ping them with short, concise, one-sentence message, like \"Hello?\" or \"Are you there?\" or \"Can you hear me?\""
        });
        set $helloRequested = true;
        return;
    }
}

// This section resets the ping counter if the user responds.
// This prevents Dasha from ending the conversation if there are two or more pings separated by the user's responses.
preprocessor digression hello_preprocessor {
    conditions {
        on digression.hello.configuration.preprocessorExecution() priority 50000 tags: ontext;
    }
    do {
        // Never reached, because preprocessorExecution always returns false.
        set digression.hello.configuration.lastIdleTime = 0;
        set digression.hello.configuration.counter = 0;
        return;
    }
}

Configuration Options

Timing Parameters

  • abruptInterruptionDelay: 3000ms threshold for smart interruption analysis
  • defVADPauseLength: 0.8 multiplier for normal VAD pause length
  • defInterruptedVADPauseLength: 1.0 multiplier when interrupted
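All three values are declared in the library context of interruptions.dsl, so tuning them is a one-line edit. A sketch with illustrative (non-default) values:

// interruptions.dsl (excerpt; values chosen for the example, not recommendations)
context {
    abruptInterruptionDelay: number = 2000;     // treat user phrases shorter than 2 s as abrupt interruptions
    defVADPauseLength: number = 0.8;            // normal VAD pause multiplier
    defInterruptedVADPauseLength: number = 1.2; // listen longer after Dasha has been interrupted
}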

Hello System Settings

  • idleTimeLimit: 5000ms before first ping
  • retriesLimit: 2 attempts before ending call
  • interruptDelay: 1.0 s of user speech before an interruption is registered (set in the sayOptions of the #answerWithGPT call, not in hello.dsl)
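The hello values are set in the configuration variable of the hello digression, while interruptDelay comes from the sayOptions of #answerWithGPT in the main loop. Illustrative excerpts (values chosen for the example, not recommendations):

// hello.dsl (excerpt): ping after 8 s of silence, give up after 3 unanswered pings
var configuration: HelloConfiguration = {
    idleTimeLimit: 8000,
    lastIdleTime: 0,
    retriesLimit: 3,
    counter: 0
};

// main.dsl (excerpt): react to an interruption after 0.5 s of user speech instead of 1.0 s
var a = #answerWithGPT($prompt,
    interruptible: true,
    gptOptions: { model: $llm_model, apikey: $llm_ApiKey, allow_function_name_in_response: false },
    sayOptions: { interruptDelay: 0.5, fillerTexts: ["um"], fillerSpeed: 1.0, fillerDelay: 5.0, fillerStartDelay: 10, max_retries: 3 });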

Best Practices

  1. Always include prompt instructions: The LLM needs explicit instructions about when to use silence and wait functions
  2. Test interruption scenarios: Verify behavior when users interrupt during different conversation phases
  3. Adjust timing parameters: Fine-tune delays based on your specific use case and user behavior
  4. Monitor logs: Use the built-in logging to debug interruption handling issues
  5. Handle edge cases: Consider scenarios like network delays, audio quality issues, and rapid user speech patterns

Troubleshooting

Common Issues

  • LLM not using functions: Ensure prompt instructions are included and function names are correctly configured
  • Excessive interruptions: Adjust abruptInterruptionDelay or VAD settings
  • Missing wait acknowledgments: Verify the two-step wait handling process is implemented
  • Hello pings during waits: Check that helloDisabledByWait flag is properly managed

Debug Tools

Use the extensive logging throughout the system to identify issues:

  • Function call logging shows when keep_silence and handle_wait_request are triggered
  • Interruption logging tracks smart interruption analysis
  • VAD logging monitors pause length adjustments
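For orientation, a wait request followed by a smart interruption might produce a log sequence along these lines; the messages come from the #log calls in the code above, but the exact order and formatting depend on your run:

handling wait request because: user asked to wait
Answered with gpt
Re-enabled hello digression after wait request
Answered with gpt
Interrupted
last_turn: ...
askGPT: {"is_complete":"no"}
askGPT smart interruption: no
They interrupted but we don't have enough information to respond. Passing turn to user.
Increase VAD pause length to 1
User continued speaking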