Summary
- Built with OpenAI’s Realtime API using WebSockets for low-latency, bi-directional speech.
- Seamlessly connects voice input to a vector database for semantic searching of clinical guidelines.
- Features server-side Voice Activity Detection (VAD), enabling users to hold a fluid, natural dialogue.
- Remembers returning users via Caller ID for personalized greetings, stored email addresses, and preferred speaking speed.
- Includes a tool to send RAG search summaries via email.
Starting from the standalone Clinical Audit Platform Phase 1 project, I built a voice route that leverages the OpenAI Realtime API to provide a seamless, conversational interface for querying complex medical documents directly through the browser.

Upon entering a caller ID (which will be a real user's phone number in production), users establish a low-latency WebSocket connection that enables natural, bi-directional communication. The system employs server-side Voice Activity Detection (VAD) so users can interrupt the AI naturally at any moment, mimicking real human conversation. When a query is posed, the assistant uses a Retrieval-Augmented Generation (RAG) pipeline to semantically search the indexed clinical guidelines (specifically for diabetes and dermatitis) and synthesizes the findings into spoken responses with accurate citations.
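To make the flow concrete, here is a minimal sketch of the session setup, assuming a Node.js relay using the `ws` package (browsers cannot attach custom headers to WebSocket requests); the tool name `search_guidelines` and the stubbed event handling are illustrative, not the project's actual code.

```typescript
import WebSocket from "ws";

// Connect to the Realtime API over WebSocket, typically via a server-side relay.
const ws = new WebSocket(
  "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview",
  {
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "OpenAI-Beta": "realtime=v1",
    },
  }
);

ws.on("open", () => {
  // Server-side VAD lets the model detect speech boundaries and user
  // interruptions; the tool definition exposes the RAG pipeline to the model.
  ws.send(
    JSON.stringify({
      type: "session.update",
      session: {
        turn_detection: { type: "server_vad" },
        input_audio_format: "pcm16",
        output_audio_format: "pcm16",
        tools: [
          {
            type: "function",
            name: "search_guidelines",
            description:
              "Semantically search indexed clinical guidelines (diabetes, dermatitis) and return passages with citations",
            parameters: {
              type: "object",
              properties: { query: { type: "string" } },
              required: ["query"],
            },
          },
        ],
      },
    })
  );
});

ws.on("message", (raw) => {
  const event = JSON.parse(raw.toString());
  // When the model calls the tool, run the vector-database search and return
  // the results so it can synthesize a spoken, cited answer.
  if (event.type === "response.function_call_arguments.done") {
    const { query } = JSON.parse(event.arguments);
    // ... run the RAG search with `query`, then send the output back via a
    // conversation.item.create event followed by response.create.
  }
});
```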

The architecture prioritizes audio fidelity and user experience. It includes a custom audio processing pipeline that optimizes the PCM16-to-Float32 conversion and applies dynamic gain control to prevent distortion.
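A rough sketch of that conversion step follows, assuming audio arrives in 16-bit PCM chunks; the clip-avoidance heuristic shown here is illustrative rather than the exact gain algorithm used.

```typescript
// Convert a chunk of PCM16 audio to Float32 samples in [-1, 1), applying a
// simple per-chunk gain reduction when the audio would exceed the target peak.
function pcm16ToFloat32(buffer: ArrayBuffer, targetPeak = 0.9): Float32Array {
  const pcm = new Int16Array(buffer);
  const out = new Float32Array(pcm.length);

  // Normalize each 16-bit sample and track the chunk's peak amplitude.
  let peak = 0;
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 32768;
    peak = Math.max(peak, Math.abs(out[i]));
  }

  // Dynamic gain: attenuate only when the chunk would clip past the target
  // peak, leaving quiet audio untouched to avoid pumping artifacts.
  if (peak > targetPeak) {
    const gain = targetPeak / peak;
    for (let i = 0; i < out.length; i++) out[i] *= gain;
  }
  return out;
}
```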
The system delivers a personalized experience by recognizing returning users and recalling their name, email, and preferred speaking speed (users can provide this information during the call). The AI can also generate and email a detailed RAG search summary containing key insights and guideline references.
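As an illustration, the personalization lookup and the email tool could look roughly like this; the `CallerProfile` shape, the in-memory store, and the `send_search_summary` schema are assumptions for the sketch, not the project's actual definitions.

```typescript
// Hypothetical profile shape for a returning caller.
interface CallerProfile {
  name: string;
  email: string;
  speakingRate: number; // preferred speaking speed, e.g. 1.0 = normal
}

// Illustrative store keyed by caller ID (a phone number in production);
// a real deployment would back this with a database.
const callers = new Map<string, CallerProfile>();

// Greet returning callers by name; prompt first-time callers for details.
function greet(callerId: string): string {
  const profile = callers.get(callerId);
  return profile
    ? `Welcome back, ${profile.name}!`
    : "Hello! May I have your name and email?";
}

// Tool the model can call to email a RAG search summary with citations.
const sendSummaryTool = {
  type: "function",
  name: "send_search_summary",
  description:
    "Email the caller a summary of the guideline search, including key insights and guideline references",
  parameters: {
    type: "object",
    properties: {
      email: { type: "string" },
      summary: { type: "string" },
      citations: { type: "array", items: { type: "string" } },
    },
    required: ["email", "summary"],
  },
};
```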



