
Gemini Live, Call Escalation, and Config Portability in feros
A closer look at three major feros updates: native Gemini Live sessions, production-grade call escalation, and full agent config import/export with public schemas.
Since feros went live, we’ve shipped several meaningful updates, and three of them stand out for teams running real voice workflows in production.
1) Native Gemini Live support for real-time voice conversations
feros now supports a native multimodal path powered by Gemini Live.
In this mode, sessions can run through Gemini's bidirectional audio WebSocket instead of the classic STT → LLM → TTS pipeline. In the runtime, this path is modeled explicitly (gemini_live_model) and wired as a dedicated native session flow, with transport and audio handling tailored for low-latency voice.
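As a rough illustration of how a runtime might branch between the two paths, here is a minimal sketch. Only the field name gemini_live_model comes from the description above. The AgentVoiceConfig class, its other fields, and the rule "a set model means native mode" are assumptions made for this example, not the actual feros implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentVoiceConfig:
    # Hypothetical config shape: only gemini_live_model is named in the post.
    gemini_live_model: Optional[str] = None
    stt_provider: Optional[str] = None
    tts_provider: Optional[str] = None


def select_session_path(config: AgentVoiceConfig) -> str:
    """Return "native" when a Gemini Live model is set, else "pipeline"."""
    if config.gemini_live_model:
        # Assumed rule: audio in, audio out over one bidirectional WebSocket.
        return "native"
    # Classic chain: STT -> LLM -> TTS.
    return "pipeline"
```

Keeping the decision in one function means the rest of the session setup can depend on a single explicit value rather than on scattered provider checks.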
What this enables:
- A native audio-to-audio conversation path for real-time interactions
- Runtime handling for Gemini-specific session lifecycle and reconnect behavior
- More robust transcript/timeline handling for streaming turns
- Better stability around hang-up/audio-drain edge cases in native sessions
- Built-in Gemini voice selection in the agent config UI when native mode is enabled
In short: Gemini Live is not treated as a bolt-on provider toggle; it is integrated as a first-class runtime path.
Conversation Mode in the feros agent config UI, with Native Multimodal (Gemini Live) selected.
2) Call escalation for telephony agents (Twilio + Telnyx)
feros added explicit call-escalation capabilities for telephony deployments.
At the config/proto level, agents can define escalation_destinations with named targets (E.164 number or SIP URI). At runtime, feros injects an escalate_call tool for telephony sessions when destinations are configured, so the agent can hand off to predefined human endpoints with structured arguments (destination + reason).
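The sketch below shows one way conditional tool injection like this could look. The names escalation_destinations and escalate_call and the destination + reason arguments come from the description above. The class, the helper functions, the deliberately simplified target patterns, and the JSON-schema-style tool shape are assumptions for illustration, not the feros source.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Simplified patterns (assumptions): E.164 is "+" plus up to 15 digits;
# the SIP check only looks for a sip:/sips: scheme followed by non-space text.
E164_RE = re.compile(r"\+[1-9]\d{1,14}")
SIP_RE = re.compile(r"sips?:\S+")


@dataclass(frozen=True)
class EscalationDestination:
    name: str    # label the agent refers to, e.g. "support"
    target: str  # E.164 number or SIP URI


def is_valid_target(target: str) -> bool:
    return bool(E164_RE.fullmatch(target) or SIP_RE.fullmatch(target))


def build_escalate_call_tool(
    destinations: List[EscalationDestination], is_telephony: bool
) -> Optional[dict]:
    """Return a tool definition, or None when escalation should not be offered."""
    valid = [d for d in destinations if is_valid_target(d.target)]
    if not is_telephony or not valid:
        return None
    return {
        "name": "escalate_call",
        "parameters": {
            "type": "object",
            "properties": {
                # The agent picks a destination by name, never a raw number.
                "destination": {"type": "string", "enum": [d.name for d in valid]},
                "reason": {"type": "string"},
            },
            "required": ["destination", "reason"],
        },
    }
```

Restricting the destination argument to configured names keeps the model from inventing a phone number, which is the point of predefined human endpoints.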
What was shipped across the stack:
- Escalation destination support in agent config (name + phone/SIP target)
- Runtime/tooling integration so escalation is available only when context is valid
- Telnyx and Twilio transfer path support
- Validation and guardrails around transfer destinations and call state
- Follow-up reliability fixes for webhook routing and auto-hangup edge cases
- Transfer behavior adjustments toward blind transfer where appropriate
The result is a clearer and more operationally safe handoff path from AI agent to human support during live calls.
Escalation destinations in the telephony config flow (for example, named sales/support handoff targets).
3) Full agent config export/import + schema versioning improvements
feros introduced a full config portability workflow so teams can move agent setups between environments with fewer surprises.
On the API side:
- GET /api/agents/{id}/export returns a full payload including config and builder metadata (for example mermaid diagram + connection metadata)
- POST /api/agents/import/validate performs schema + runtime-fulfillability checks
- POST /api/agents/import supports strict import or mapping-assisted import
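A minimal sketch of building requests for these three endpoints with only the Python standard library follows. The paths are the ones listed above. The base URL, the Bearer header usage, and the assumption that an exported payload is posted back as the JSON body are illustrative guesses; check the actual request and response shapes against the feros API docs.

```python
import json
from urllib.request import Request

BASE_URL = "https://feros.example.com"  # hypothetical host


def export_request(agent_id: str, token: str) -> Request:
    """Build GET /api/agents/{id}/export."""
    return Request(
        f"{BASE_URL}/api/agents/{agent_id}/export",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def _post_json(path: str, payload: dict, token: str) -> Request:
    # Assumed body format: the payload serialized as JSON.
    return Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def validate_request(payload: dict, token: str) -> Request:
    """Build POST /api/agents/import/validate."""
    return _post_json("/api/agents/import/validate", payload, token)


def import_request(payload: dict, token: str) -> Request:
    """Build POST /api/agents/import."""
    return _post_json("/api/agents/import", payload, token)
```

Each Request can be sent with urllib.request.urlopen. A typical flow would export from one environment, run the validate call against the target, and only then call import.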
On validation/import behavior:
- Imported configs are normalized to config_schema_version: v3_graph
- Validation separates schema issues from fulfillment issues
- Mapping support is provided for key voice settings (tts_provider, tts_model, voice_id)
- Blocking issues are enforced before import succeeds
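To make the behavior above concrete, here is a small sketch of a fulfillment check with mapping-assisted resolution. The field names tts_provider, tts_model, and voice_id and the value v3_graph come from the post. The Issue class, the function names, and the shape of the mappings and availability tables are hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

MAPPABLE_FIELDS = ("tts_provider", "tts_model", "voice_id")  # named in the post


@dataclass
class Issue:
    kind: str            # "schema" or "fulfillment"
    field: str
    message: str
    blocking: bool = True


def normalize(config: dict) -> dict:
    """Return a copy stamped with the normalized schema version."""
    return {**config, "config_schema_version": "v3_graph"}


def apply_mappings(config: dict, mappings: Dict[str, Dict[str, str]]) -> dict:
    """Return a copy with mappable voice fields rewritten.

    mappings looks like {"voice_id": {"old-voice": "new-voice"}}.
    """
    out = dict(config)
    for name in MAPPABLE_FIELDS:
        old = out.get(name)
        if old in mappings.get(name, {}):
            out[name] = mappings[name][old]
    return out


def check_fulfillment(config: dict, available: Dict[str, Set[str]]) -> List[Issue]:
    """Flag voice settings the target environment cannot provide."""
    issues = []
    for name in MAPPABLE_FIELDS:
        value = config.get(name)
        if value is not None and value not in available.get(name, set()):
            issues.append(Issue("fulfillment", name, f"{value!r} is not available"))
    return issues


def can_import(issues: List[Issue]) -> bool:
    """Import proceeds only when no blocking issue remains."""
    return not any(issue.blocking for issue in issues)
```

In this sketch a config with an unavailable voice_id fails can_import until a mapping supplies a replacement, which mirrors the difference between strict and mapping-assisted import.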
On the web/docs side:
- Public schema artifacts are now exposed (including agent-config-v3.schema.json), making integrations and tooling more predictable for external consumers
This is a practical step toward reproducible, shareable, and automatable agent configuration workflows.
The import workflow in Studio: config input, validation/resolution, and metadata finalization.
Other changes worth noting
- Experimental Telnyx feature gating was added to support safer rollout controls for telephony behavior (bb1f5e1).
- Voice trace timing was fixed to correctly report STT time-to-first-byte (TTFB) and improve span timing reliability (38c7f64).
- Secret vault refresh handling was hardened to prevent task leaks and better handle token expiry (ea30a2f).
- Public schema artifacts were added to the web/docs repo to make integrations easier to validate against a concrete schema file (530e4d7).
- API client auth was expanded with Bearer token support, and middleware route matching was corrected for more reliable authenticated API behavior (7661c6a).
- The call-log table play button issue was fixed to restore expected playback interaction in the UI (0dd9018).
- Link navigation from call-log entries was improved for smoother operator workflows (84e2845).
Why these three updates matter together
Taken together, these changes strengthen feros in three core dimensions:
- Real-time capability: native Gemini Live path for multimodal voice sessions
- Production call handling: escalation flow that is explicit, validated, and provider-aware
- Portability: import/export + schema-driven config workflows that travel better across teams/environments
For teams building serious voice systems, this combination improves both runtime behavior and day-2 operational experience.
If you want to follow what we’re building, check out the open-source repo at https://github.com/ferosai/feros and star/watch it for upcoming updates. We’ll keep sharing practical improvements as they ship.