Structured Output with JSON Schema

Overview

The structured output feature lets you constrain and validate LLM responses with JSON Schema. The result is a predictable, typed response that your application logic can consume directly, without parsing free-form text.

Key benefits include:

Type Safety - Enforce response structure with strong typing
Validation - Ensure responses match expected formats and value constraints
Consistency - Get reliable, predictable outputs for programmatic use
Simplicity - Eliminate complex parsing of free-form text

For additional details and implementation guidelines, see the OpenAI Structured Outputs Guide.

How It Works

Define a JSON Schema describing the expected response format
Convert the schema to a string using .toString()
Pass the schema to an LLM function like #askGPT using the response_format option
Parse the structured response using standard JSON parsing
Use the strongly typed result directly in your application logic

Usage

Here's a basic example of using structured output:

// Define the schema with strong typing
var schema = {
  name: "result",
  strict: true,
  description: "A categorization result",
  schema: {
    @type: "object",
    properties: {
      category: {
        @type: "string",
        description: "The category of the input",
        enum: ["Type A", "Type B", "Type C"]
      },
      confidence: {
        @type: "number",
        description: "Confidence score between 0 and 1"
      }
    },
    additionalProperties: false,
    required: ["category", "confidence"]
  }
}.toString();

// Call GPT with the structured output format
var ask = #askGPT($prompt, {
  model: "openai/gpt-4.1-nano",
  function_call: "none",
  history_length: 0,
  save_response_in_history: false,
  response_format: schema
},
promptName: "categorizer");

// Parse and use the result with proper typing
var result = ask.responseText.parseJSON() as { 
  category: string; 
  confidence: number; 
};

if ((result?.confidence ?? 0) > 0.8) {
  #log("High confidence classification: " + (result?.category?.toString()??"-"));
} else {
  #log("Low confidence classification: " + (result?.category?.toString()??"-"));
}

Schema Components

A properly structured schema includes:

Component	Description
`name`	Name identifier for the schema
`strict`	Boolean flag to enforce strict validation
`description`	Human-readable description of the schema purpose
`schema`	The actual JSON Schema definition

The schema definition follows standard JSON Schema syntax with the following common elements:

@type - Specifies the data type (object, string, number, integer, boolean, array)
properties - Defines the object properties (for object types)
required - Lists required properties. For best results, include every defined property
additionalProperties - Controls whether extra properties are allowed. Set it to false to ensure strict validation
Type-specific validators, such as enum
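The two recommendations above (list every property in required, and set additionalProperties to false) can be checked mechanically before a schema is sent. The following is a minimal sketch in TypeScript, outside the DSL. The helper name checkStrictSchema and the ObjectSchema type are hypothetical and not part of the platform, and the sketch uses the plain JSON Schema key type where the DSL examples write @type.

```typescript
// Hypothetical helper for illustration; not part of the platform API.
// Uses the plain JSON Schema key "type" (the DSL examples write it as @type).
interface ObjectSchema {
  type: "object";
  properties: Record<string, unknown>;
  required?: string[];
  additionalProperties?: boolean;
}

// Returns a list of human-readable problems; an empty list means both checks passed.
function checkStrictSchema(schema: ObjectSchema): string[] {
  const problems: string[] = [];
  const required = schema.required ?? [];

  // Every defined property should also appear in the required array.
  for (const key of Object.keys(schema.properties)) {
    if (required.indexOf(key) === -1) {
      problems.push(`property "${key}" is missing from required`);
    }
  }
  // Extra properties should be explicitly disallowed.
  if (schema.additionalProperties !== false) {
    problems.push("additionalProperties should be set to false");
  }
  return problems;
}

// Example schema with "confidence" deliberately left out of required.
const categorySchema: ObjectSchema = {
  type: "object",
  properties: {
    category: { type: "string", enum: ["Type A", "Type B", "Type C"] },
    confidence: { type: "number" },
  },
  required: ["category"],
  additionalProperties: false,
};

console.log(checkStrictSchema(categorySchema));
```

In this example the check reports a single problem, because confidence is defined but not listed in required.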

Advanced Example

Here's a more complex example for a multi-field form:

var schema = {
  name: "formData",
  strict: true,
  description: "Extracted form information",
  schema: {
    @type: "object",
    properties: {
      firstName: {
        @type: "string",
        description: "Customer's first name"
      },
      lastName: {
        @type: "string", 
        description: "Customer's last name"
      },
      reason: {
        @type: "string",
        description: "Reason for contact",
        enum: ["Support", "Sales", "Billing", "Other"]
      },
      urgency: {
        @type: "integer",
        description: "Urgency level from 1-5"
      },
      callback: {
        @type: "boolean",
        description: "Whether customer wants a callback"
      }
    },
    required: ["firstName", "lastName", "reason", "urgency", "callback"],
    additionalProperties: false
  }
}.toString();

var result = #askGPT(#getFormattedTranscription(), {
  model: "openai/gpt-4.1",
  response_format: schema,
  function_call: "none",
  history_length: 0,
  save_response_in_history: false  
});


var formData = result.responseText.parseJSON() as {
  firstName: string;
  lastName: string;
  reason: "Support" | "Sales" | "Billing" | "Other";
  urgency: number;
  callback: boolean;
};
#log(formData);

// Now use the structured data in your application
#sayText("Thank you " + (formData?.firstName??"") + ", we'll handle your " + 
     (formData?.reason?.toString()?.toLowerCase() ?? "") + " request.");
     
if (formData?.callback == true) {
  #sayText("We'll call you back as soon as possible.");
}
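
The schema in this example states the 1-5 urgency range only in the description text, so the model is guided toward it rather than forced into it. If the range matters downstream, a check after parsing may be worthwhile. The following is a minimal sketch in TypeScript, outside the DSL. The ContactForm type mirrors the schema above, while normalizeUrgency and its fallback value of 3 are assumptions made for illustration.

```typescript
// Hypothetical type mirroring the formData schema in the example above.
type ContactForm = {
  firstName: string;
  lastName: string;
  reason: "Support" | "Sales" | "Billing" | "Other";
  urgency: number;
  callback: boolean;
};

// Returns the urgency when it is a whole number from 1 to 5, otherwise the fallback.
// The default fallback of 3 is an arbitrary choice for this sketch.
function normalizeUrgency(form: ContactForm, fallback: number = 3): number {
  const u = form.urgency;
  const isWhole = Math.floor(u) === u;
  return isWhole && u >= 1 && u <= 5 ? u : fallback;
}

const sample: ContactForm = {
  firstName: "Ada",
  lastName: "Lovelace",
  reason: "Billing",
  urgency: 9, // out of range on purpose
  callback: true,
};

console.log(normalizeUrgency(sample));
```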

Best Practices

For optimal results with structured output:

Be Specific - Define schemas with precise types and constraints
Include Descriptions - Add clear descriptions for each field to guide the LLM
Use Enums - Restrict string values to specific options when possible
Set Required Fields - Always include all defined properties in the required array
Disable Additional Properties - Always set additionalProperties: false to prevent unexpected fields
Handle Parsing - Always handle potential parsing failures gracefully
Use Type Assertions - Leverage TypeScript-style type assertions for type safety
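
A type assertion on its own does not verify anything at runtime, so the "Handle Parsing" and "Use Type Assertions" points work best together. The following is a minimal sketch in TypeScript, outside the DSL, of parsing a response defensively. The names Categorization, isCategorization, and parseCategorization are hypothetical, and the shape matches the basic categorization example in the Usage section.

```typescript
// Hypothetical shape matching the basic categorization schema.
type Categorization = { category: string; confidence: number };

// Runtime check that the parsed value has the fields the schema promises.
function isCategorization(value: unknown): value is Categorization {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["category"] === "string" && typeof v["confidence"] === "number";
}

// Returns the typed result, or null when the text is not valid JSON
// or does not have the expected shape.
function parseCategorization(responseText: string): Categorization | null {
  try {
    const parsed: unknown = JSON.parse(responseText);
    return isCategorization(parsed) ? parsed : null;
  } catch {
    return null;
  }
}

const parsedResult = parseCategorization('{"category":"Type A","confidence":0.92}');
if (parsedResult !== null && parsedResult.confidence > 0.8) {
  console.log("High confidence classification: " + parsedResult.category);
} else {
  console.log("No usable classification");
}
```

Returning null rather than throwing keeps the calling code to a single branch, which is one possible design; throwing a descriptive error would be equally reasonable.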

Troubleshooting

If you encounter issues with structured output:

Check Schema Syntax - Ensure your JSON Schema is valid
Verify Conversion - Make sure .toString() is called on the schema object
Review LLM Response - Look at the raw responseText for format issues
Test with Simpler Schemas - Start with basic schemas and add complexity
Try Different Models - Some models handle structured output better than others
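
When reviewing a raw responseText, it can help to list exactly what is wrong with it instead of only noting that parsing failed. The following is a minimal diagnostic sketch in TypeScript, outside the DSL. The helper describeResponseProblems is hypothetical, and it checks only JSON validity and field names, not field types or enum values.

```typescript
// Hypothetical diagnostic helper; not part of the platform API.
// Reports JSON syntax errors, missing required fields, and unexpected fields.
function describeResponseProblems(
  responseText: string,
  requiredKeys: string[],
): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(responseText);
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return ["not valid JSON: " + detail];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["top-level value is not a JSON object"];
  }

  const obj = parsed as Record<string, unknown>;
  const problems: string[] = [];
  // Fields the schema requires but the response lacks.
  for (const key of requiredKeys) {
    if (!(key in obj)) problems.push(`missing required field "${key}"`);
  }
  // Fields the response contains but the schema does not define.
  for (const key of Object.keys(obj)) {
    if (requiredKeys.indexOf(key) === -1) problems.push(`unexpected field "${key}"`);
  }
  return problems;
}

console.log(
  describeResponseProblems('{"category":"Type A","extra":1}', ["category", "confidence"]),
);
```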

Limitations

Be aware of these limitations:

Requires model support for structured output (works best with GPT-4o and newer OpenAI models)
Complex schemas may reduce response quality
Very strict schemas might cause model completion failures
With the strict option enabled, an otherwise usable response may be rejected if it does not match the schema exactly
