Novel challenges in voice AI, workflow automation, and intelligent systems. Not theoretical research - real problems from production systems serving thousands of users.
These aren't side projects. They're core to what we're building and directly impact the products our customers use.
Building conversational AI agents that handle phone calls naturally. The technical challenges include sub-200ms latency, interruption handling, turn-taking detection, and graceful degradation on poor audio.
General-purpose LLMs are impressive but expensive and slow for high-volume, domain-specific tasks. We build custom models for intent classification, entity extraction, and quality scoring.
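As an illustration of the trade-off, a task like intent classification can often be served by a small supervised model instead of an LLM call. This is a minimal sketch using scikit-learn; the intent labels and training utterances are made up for the example, not our actual models or data:

```python
# Sketch: a small, fast intent classifier as an alternative to a
# general-purpose LLM for high-volume routing. Labels are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

training_utterances = [
    ("I want to check my order status", "order_status"),
    ("Where is my package", "order_status"),
    ("Cancel my subscription please", "cancellation"),
    ("I'd like to stop my plan", "cancellation"),
    ("Can I speak to a person", "escalation"),
    ("Get me a human", "escalation"),
]
texts, labels = zip(*training_utterances)

# TF-IDF + logistic regression: milliseconds per prediction on CPU,
# versus hundreds of milliseconds and per-token cost for an LLM call.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["where's my parcel"]))
```

A model this small trains in milliseconds and serves thousands of requests per second, which is the point: reserve the LLM for tasks that actually need open-ended generation.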
Moving beyond rule-based automation to systems that can suggest workflow improvements, predict bottlenecks, and handle exceptions intelligently.
Specific initiatives within our research areas. Some are close to production, others are early exploration.
Using historical patterns to identify cases likely to breach deadlines before they do. Early warning means early intervention.
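A minimal sketch of the idea, with invented features and hand-picked weights purely for illustration (a real system would fit these from historical closed cases):

```python
# Illustrative deadline-breach early warning: score open cases so that
# high-risk ones surface before the SLA is missed. Weights are made up.
from dataclasses import dataclass

@dataclass
class Case:
    hours_open: float
    sla_hours: float
    touches: int          # agent interactions logged so far
    reassignments: int    # hand-offs between teams

def breach_risk(case: Case) -> float:
    """Return a 0-1 score; higher means more likely to miss the deadline."""
    time_used = min(case.hours_open / case.sla_hours, 1.0)  # SLA budget consumed
    stalled = 1.0 if case.touches == 0 else 0.0             # untouched cases are risky
    churn = min(case.reassignments / 3.0, 1.0)              # hand-offs predict delay
    return min(0.6 * time_used + 0.25 * stalled + 0.15 * churn, 1.0)

cases = [Case(40, 48, 0, 2), Case(5, 48, 3, 0)]
at_risk = [c for c in cases if breach_risk(c) > 0.7]  # flag for early intervention
print(len(at_risk))
```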
Maintaining context across long conversations without losing track of what's been discussed or decided.
Detecting caller frustration or confusion from voice signals and adapting agent behaviour accordingly.
Conference calls with AI agents - handling multiple speakers and turn-taking between humans and AI.
Research that doesn't ship isn't useful. We balance exploration with practical constraints - everything we build needs to work in production.
No vendor lock-in. Clean interfaces let us swap AI providers without changing application code. When a better model emerges, we can test it on production traffic.
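The pattern can be sketched with a small interface that application code depends on. Provider names and methods here are illustrative stand-ins, not our actual integration layer:

```python
# Sketch of the clean-interface idea: application code depends on a
# narrow protocol, never on a vendor SDK directly.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    def complete(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"

class ProviderB:
    def complete(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"

def answer(model: ChatModel, question: str) -> str:
    # Because the caller only sees ChatModel, swapping providers (or
    # A/B testing a new model on live traffic) is a wiring change.
    return model.complete(question)

print(answer(ProviderA(), "hello"))
print(answer(ProviderB(), "hello"))
```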
Voice AI can't wait for complete responses. Audio streams to the transcriber while the caller is still speaking, partial transcripts stream to the LLM before the utterance is finished, and responses stream to TTS as tokens are generated.
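A toy version of that pipeline, with string-based stand-ins where a real system would wrap ASR, LLM, and TTS APIs:

```python
# Each stage consumes a stream and yields output before its input is
# complete - no stage waits for the previous one to finish.
from typing import Iterator

def transcribe(audio_chunks: Iterator[str]) -> Iterator[str]:
    for chunk in audio_chunks:      # partial transcripts, not one final result
        yield chunk

def generate(transcript_parts: Iterator[str]) -> Iterator[str]:
    for part in transcript_parts:   # start generating before the utterance ends
        yield f"ack:{part}"

def synthesise(tokens: Iterator[str]) -> Iterator[bytes]:
    for token in tokens:            # audio goes out as soon as tokens arrive
        yield token.encode()

audio = iter(["hel", "lo ", "there"])
for packet in synthesise(generate(transcribe(audio))):
    print(packet)
```

Because every stage is a generator, the first audio packet leaves the last stage while the first one is still reading input, which is where the latency savings come from.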
When providers fail or audio quality degrades, the system adapts. Fallback to simpler models, ask for clarification, or escalate to humans - never just fail silently.
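One way to sketch that degradation chain; the handler names and failure conditions are illustrative, not our production logic:

```python
# Ordered fallback chain: each handler returns a reply or raises to pass
# control to the next. The final return guarantees no silent failure.
def primary_model(utterance: str) -> str:
    raise TimeoutError("provider unavailable")    # simulate an outage

def simple_model(utterance: str) -> str:
    if len(utterance.split()) > 3:
        raise ValueError("too complex for the fallback model")
    return f"simple reply to: {utterance}"

def escalate_to_human(utterance: str) -> str:
    return "Transferring you to a colleague who can help."

def respond(utterance: str) -> str:
    for handler in (primary_model, simple_model, escalate_to_human):
        try:
            return handler(utterance)
        except Exception:
            continue                 # fall through to the next handler
    return "Sorry, something went wrong. Please call back."
```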
Every conversation generates metrics. Latency, accuracy, user satisfaction, task completion. You can't improve what you don't measure.
Problems we haven't solved yet. If you have ideas or have tackled similar challenges, we'd love to hear from you.
Simple silence detection doesn't work. Falling intonation helps but isn't universal. Linguistic completeness helps but requires understanding. We're combining signals but it's still imperfect.
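The signal-combination idea can be sketched as a weighted score; the weights and threshold below are illustrative, not tuned values from our system:

```python
# Blend acoustic and linguistic evidence into one end-of-turn score.
def end_of_turn_score(silence_ms: float,
                      pitch_falling: bool,
                      utterance_complete: bool) -> float:
    """Return a 0-1 score; above threshold, the agent may speak."""
    silence = min(silence_ms / 700.0, 1.0)         # silence alone is unreliable
    prosody = 1.0 if pitch_falling else 0.0        # helps, but not universal
    syntax = 1.0 if utterance_complete else 0.0    # needs an LM to judge
    return 0.4 * silence + 0.25 * prosody + 0.35 * syntax

SPEAK_THRESHOLD = 0.6
# A mid-sentence pause: long silence, but no falling pitch, clause incomplete.
print(end_of_turn_score(800, False, False) >= SPEAK_THRESHOLD)
# A short pause after a complete, falling-intonation sentence.
print(end_of_turn_score(400, True, True) >= SPEAK_THRESHOLD)
```

The first case stays below the threshold despite the long silence, which is the behaviour a silence-only detector gets wrong.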
Full automation is risky for high-stakes decisions. Full human review doesn't scale. The boundary is context-dependent and we're still learning where to draw it.
Long conversations exceed context windows. Summarisation loses detail. Retrieval adds latency. There's no perfect solution, only trade-offs.
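One common compromise, sketched here with a string-concatenation stand-in for the summarisation step: keep recent turns verbatim and fold older turns into a running summary.

```python
# Rolling memory: recent turns stay verbatim; evicted turns are folded
# into a summary. summarise() stands in for an LLM call - the truncation
# it does is exactly the detail loss described above.
from collections import deque

RECENT_TURNS = 4

def summarise(old_summary: str, turn: str) -> str:
    return (old_summary + " | " + turn)[:200]

class ConversationMemory:
    def __init__(self) -> None:
        self.summary = ""
        self.recent: deque[str] = deque(maxlen=RECENT_TURNS)

    def add(self, turn: str) -> None:
        if len(self.recent) == RECENT_TURNS:
            self.summary = summarise(self.summary, self.recent[0])  # evict oldest
        self.recent.append(turn)

    def prompt_context(self) -> str:
        return f"Summary: {self.summary}\nRecent: {list(self.recent)}"
```

The knobs are visible in the sketch: a bigger verbatim window costs context tokens, a more aggressive summary loses decisions made early in the call.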
Automated metrics catch technical issues but miss naturalness. Human evaluation doesn't scale. We're exploring hybrid approaches.
Detailed writeups of problems we've solved and approaches we've taken.
If these challenges sound interesting, we're hiring. Small team, real problems, code that ships to production.