Gemini Live, Call Escalation, and Config Portability in feros

A closer look at three major feros updates: native Gemini Live sessions, production-grade call escalation, and full agent config import/export with public schemas.

Tags: gemini-live, telephony, escalation, config, schemas

Since feros went live, we’ve shipped several meaningful updates, and three of them stand out for teams running real voice workflows in production.

1) Native Gemini Live support for real-time voice conversations

feros now supports a native multimodal path powered by Gemini Live.

In this mode, sessions can run through Gemini's bidirectional audio WebSocket instead of the classic STT → LLM → TTS pipeline. In the runtime, this path is modeled explicitly (gemini_live_model) and wired as a dedicated native session flow, with transport and audio handling tailored for low-latency voice.

What this enables:

  • A native audio-to-audio conversation path for realtime interactions
  • Runtime handling for Gemini-specific session lifecycle and reconnect behavior
  • More robust transcript/timeline handling for streaming turns
  • Better stability around hang-up/audio-drain edge cases in native sessions
  • Built-in Gemini voice selection in the agent config UI when native mode is enabled

In short: Gemini Live is not treated as a bolt-on provider toggle; it is integrated as a first-class runtime path.
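To make the "first-class runtime path" idea concrete, here is a minimal sketch of how a runtime might choose between the two session flows. Only the gemini_live_model field name comes from the feros runtime as described above; the AgentRuntimeConfig class, the gemini_voice field, the select_session_path function, and the returned path labels are hypothetical names used for illustration, not the actual feros implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRuntimeConfig:
    """Hypothetical subset of an agent config (only gemini_live_model is named in the post)."""
    gemini_live_model: Optional[str] = None  # set when native mode is enabled
    gemini_voice: Optional[str] = None       # assumed field for the selected Gemini voice


def select_session_path(config: AgentRuntimeConfig) -> str:
    """Return a label for the session flow this agent should start."""
    if config.gemini_live_model:
        # Native path: a single bidirectional audio WebSocket to Gemini.
        return "native_gemini_live"
    # Classic path: separate STT, LLM, and TTS stages.
    return "stt_llm_tts_pipeline"
```

The point of the sketch is that the decision is made once, from config, and each branch can then own its transport, reconnect, and audio-drain handling instead of sharing one pipeline with provider-specific conditionals.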

Figure: Conversation Mode in the feros agent config UI, with Native Multimodal (Gemini Live) selected.

2) Call escalation for telephony agents (Twilio + Telnyx)

feros added explicit call-escalation capabilities for telephony deployments.

At the config/proto level, agents can define escalation_destinations with named targets (E.164 number or SIP URI). At runtime, feros injects an escalate_call tool for telephony sessions when destinations are configured, so the agent can hand off to predefined human endpoints with structured arguments (destination + reason).

What shipped across the stack:

  • Escalation destination support in agent config (name + phone/SIP target)
  • Runtime/tooling integration so escalation is available only when context is valid
  • Telnyx and Twilio transfer path support
  • Validation and guardrails around transfer destinations and call state
  • Follow-up reliability fixes for webhook routing and auto-hangup edge cases
  • Transfer behavior adjustments toward blind transfer where appropriate

The result is a clearer and operationally safer handoff path from AI agent to human support during live calls.
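The escalation flow described above can be sketched roughly as follows: validate each destination as an E.164 number or SIP URI, then expose an escalate_call tool only for telephony sessions that have at least one valid destination. The escalation_destinations and escalate_call names, and the destination + reason arguments, come from the description above; the class, function names, tool-definition shape, and the simplified SIP pattern are assumptions made for illustration and may differ from what feros does internally.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# E.164: "+" followed by up to 15 digits, with a non-zero first digit.
E164_RE = re.compile(r"\+[1-9]\d{1,14}")
# Deliberately simplified SIP/SIPS URI check (user@host only); an assumption.
SIP_RE = re.compile(r"sips?:[^\s@]+@[^\s@]+")


@dataclass(frozen=True)
class EscalationDestination:
    name: str    # label the agent refers to, e.g. "support"
    target: str  # E.164 number or SIP URI


def is_valid_target(target: str) -> bool:
    """True if the target looks like an E.164 number or a SIP URI."""
    return bool(E164_RE.fullmatch(target) or SIP_RE.fullmatch(target))


def build_escalate_call_tool(
    destinations: List[EscalationDestination], is_telephony: bool
) -> Optional[dict]:
    """Return a tool definition, or None when escalation should not be offered."""
    valid = [d for d in destinations if is_valid_target(d.target)]
    if not is_telephony or not valid:
        return None
    return {
        "name": "escalate_call",
        "parameters": {
            "type": "object",
            "properties": {
                # The agent picks a destination by name, never a raw number.
                "destination": {"type": "string", "enum": [d.name for d in valid]},
                "reason": {"type": "string"},
            },
            "required": ["destination", "reason"],
        },
    }
```

Restricting the destination argument to configured names is one way to keep the model from transferring a call to an arbitrary number.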

Figure: Escalation destinations in the telephony config flow (for example, named sales/support handoff targets).

3) Full agent config export/import + schema versioning improvements

feros introduced a full config portability workflow so teams can move agent setups between environments with fewer surprises.

On the API side:

  • GET /api/agents/{id}/export returns a full payload including config and builder metadata (for example, the Mermaid diagram and connection metadata)
  • POST /api/agents/import/validate performs schema + runtime-fulfillability checks
  • POST /api/agents/import supports strict import or mapping-assisted import
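As a rough illustration of how a client might address these three endpoints, the sketch below builds the HTTP requests with Python's standard library without sending them. The endpoint paths come from the list above, and the optional Bearer header reflects the API client auth support noted later in this post. The base URL, the agent ID, the JSON body shape, and the helper names are assumptions; check the public schema artifacts for the real payload format.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def _headers(token: Optional[str], has_body: bool) -> Dict[str, str]:
    """Build request headers; the Bearer token is optional."""
    headers = {"Accept": "application/json"}
    if has_body:
        headers["Content-Type"] = "application/json"
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def export_request(
    base_url: str, agent_id: str, token: Optional[str] = None
) -> urllib.request.Request:
    """GET /api/agents/{id}/export"""
    url = f"{base_url}/api/agents/{agent_id}/export"
    return urllib.request.Request(url, headers=_headers(token, False), method="GET")


def import_request(
    base_url: str,
    payload: Dict[str, Any],
    validate_only: bool = True,
    token: Optional[str] = None,
) -> urllib.request.Request:
    """POST to the validate endpoint by default, or to the import endpoint."""
    path = "/api/agents/import/validate" if validate_only else "/api/agents/import"
    body = json.dumps(payload).encode("utf-8")  # assumed JSON request body
    return urllib.request.Request(
        base_url + path, data=body, headers=_headers(token, True), method="POST"
    )
```

A typical flow would send the export request against one environment, pass the returned payload to the validate endpoint in the target environment, and only call the import endpoint once validation reports no blocking issues. Each request can be sent with urllib.request.urlopen.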

On validation/import behavior:

  • Imported configs are normalized to config_schema_version: v3_graph
  • Validation separates schema issues from fulfillment issues
  • Mapping support is provided for key voice settings (tts_provider, tts_model, voice_id)
  • Blocking issues are enforced before import succeeds
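The validation and import rules above could be modeled along these lines. The three mappable keys and the v3_graph version string are taken from the list above; the Issue type, the "schema"/"fulfillment" labels, the flat config dictionary, and the function names are hypothetical and only meant to show the shape of the logic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Voice settings that the post says can be remapped during import.
VOICE_KEYS = ("tts_provider", "tts_model", "voice_id")


@dataclass(frozen=True)
class Issue:
    kind: str       # "schema" or "fulfillment" (assumed labels)
    message: str
    blocking: bool  # blocking issues must be resolved before import


def split_issues(issues: List[Issue]) -> Tuple[List[Issue], List[Issue]]:
    """Separate schema issues from fulfillment issues."""
    schema = [i for i in issues if i.kind == "schema"]
    fulfillment = [i for i in issues if i.kind == "fulfillment"]
    return schema, fulfillment


def import_config(
    config: dict, issues: List[Issue], mapping: Optional[Dict[str, str]] = None
) -> dict:
    """Return a normalized copy of the config, or raise if anything blocks."""
    blocking = [i.message for i in issues if i.blocking]
    if blocking:
        raise ValueError("import blocked: " + "; ".join(blocking))
    mapping = mapping or {}
    unknown = set(mapping) - set(VOICE_KEYS)
    if unknown:
        raise ValueError(f"unsupported mapping keys: {sorted(unknown)}")
    result = dict(config)
    result.update(mapping)                            # mapping-assisted import
    result["config_schema_version"] = "v3_graph"      # normalize the version
    return result
```

Keeping schema issues and fulfillment issues apart lets a UI tell the user whether the file itself is malformed or whether the target environment simply lacks a provider, model, or voice that the config expects.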

On the web/docs side:

  • Public schema artifacts are now exposed (including agent-config-v3.schema.json), making integrations and tooling more predictable for external consumers

This is a practical step toward reproducible, shareable, and automatable agent configuration workflows.

Figure: The import workflow in Studio, covering config input, validation/resolution, and metadata finalization.

Other changes worth noting

  • Experimental Telnyx feature gating was added to support safer rollout controls for telephony behavior (bb1f5e1).
  • Voice trace timing was fixed to correctly report STT time-to-first-byte (TTFB) and improve span timing reliability (38c7f64).
  • Secret vault refresh handling was hardened to prevent task leaks and better handle token expiry (ea30a2f).
  • Public schema artifacts were added to the web/docs repo to make integrations easier to validate against a concrete schema file (530e4d7).
  • API client auth was expanded with Bearer token support, and middleware route matching was corrected for more reliable authenticated API behavior (7661c6a).
  • The call-log table play button issue was fixed to restore expected playback interaction in the UI (0dd9018).
  • Link navigation from call-log entries was improved for smoother operator workflows (84e2845).

Why these three updates matter together

Taken together, these changes strengthen feros in three core dimensions:

  • Realtime capability: native Gemini Live path for multimodal voice sessions
  • Production call handling: escalation flow that is explicit, validated, and provider-aware
  • Portability: import/export + schema-driven config workflows that travel better across teams/environments

For teams building serious voice systems, this combination improves both runtime behavior and day-2 operational experience.

If you want to follow what we’re building, check out the open-source repo at https://github.com/ferosai/feros and star/watch it for upcoming updates. We’ll keep sharing practical improvements as they ship.