Engineering

Text Normalisation for Natural AI Speech: Making TTS Sound Human

How to preprocess text for speech synthesis so phone numbers, postcodes, dates, and business data are pronounced correctly.

SwiftCase Engineering
January 19, 2026
7 min read
Contents
  • The Problem with Raw Data
  • UK Phone Numbers
  • UK Postcodes
  • Currency
  • Dates and Times
  • Email Addresses
  • Car Registrations
  • Business Names and Acronyms
  • Stripping Internal Syntax
  • Performance Optimisation
  • Pattern Ordering Matters
  • Testing Edge Cases
  • The Complete Pipeline
  • Regional Considerations
  • Building voice experiences with business data?

Your AI voice agent reads out a customer's phone number: "Your number is zero seven billion, one hundred twenty-three million..."

The customer hangs up.

Text-to-speech engines are remarkably good at converting written language to natural-sounding audio. But they're trained on prose, not data. Phone numbers, postcodes, reference codes, dates, currency amounts: these are the building blocks of business conversations, and TTS engines routinely mangle them.

Text normalisation solves this by preprocessing text before it reaches the TTS engine, converting data formats into speakable phrases the engine can pronounce correctly.

The Problem with Raw Data

Consider what happens when you send these strings directly to a TTS engine:

Input       | TTS Output (Unprocessed)
07123456789 | "Zero seven billion, one hundred twenty-three million..."
SW1A 1AA    | "Swuh-wun-ah one double-a" or similar gibberish
£1,234.56   | "Pound sign one comma two three four point five six"
25/12/2024  | "Twenty-five slash twelve slash two thousand twenty-four"
REF123456   | "Ref one hundred twenty-three thousand..."

None of these are how a human would read the information aloud. A human says "oh seven one two three, four five six, seven eight nine" for the phone number. They say "S W one A, one A A" for the postcode. The TTS engine doesn't know these conventions unless you teach it.

UK Phone Numbers

Phone numbers should be read as individual digits with natural grouping. UK mobile numbers follow a predictable pattern: five digits, then three, then three.

The normalisation approach strips non-digit characters, then groups the digits according to the phone number type. For mobiles starting with 07, the grouping is 5-3-3. For landlines, it's typically 4-3-4. International formats (+44) are converted to national format first.

Crucially, commas are inserted between groups to create natural pauses. "0 7 1 2 3, 4 5 6, 7 8 9" sounds like a phone number. "0 7 1 2 3 4 5 6 7 8 9" sounds like a robot counting.

Input         | Output
07123456789   | 0 7 1 2 3, 4 5 6, 7 8 9
+447123456789 | 0 7 1 2 3, 4 5 6, 7 8 9
0121 234 5678 | 0 1 2 1, 2 3 4, 5 6 7 8
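
The grouping rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the production implementation: the function name `normalise_uk_phone` is hypothetical, and the sketch assumes 11-digit national numbers, using 4-3-4 for everything that is not a mobile.

```python
import re

def normalise_uk_phone(raw: str) -> str:
    """Return a digit-by-digit spoken form of a UK phone number (sketch)."""
    digits = re.sub(r"\D", "", raw)              # strip spaces, dashes, "+"
    if digits.startswith("44") and len(digits) == 12:
        digits = "0" + digits[2:]                # +44 7123... -> 07123...
    if len(digits) != 11 or not digits.startswith("0"):
        return raw                               # not confident: leave unchanged
    # Assumption: mobiles (07...) group 5-3-3, everything else 4-3-4.
    sizes = (5, 3, 3) if digits.startswith("07") else (4, 3, 4)
    groups, pos = [], 0
    for size in sizes:
        groups.append(" ".join(digits[pos:pos + size]))
        pos += size
    return ", ".join(groups)                     # commas create the pauses
```

Returning the input untouched when the digit count is unexpected keeps the function from corrupting strings that only look like phone numbers.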

UK Postcodes

Postcodes combine letters and numbers in specific patterns. The outward code (first part) identifies the postal district; the inward code (last three characters) identifies the specific address group.

The normalisation approach spaces out each character individually, with a comma pause between the outward and inward codes. If the postcode arrives without a space (like "SW1A1AA"), the normaliser splits it before the last three characters.

Input    | Output
SW1A 1AA | S W 1 A, 1 A A
M1 1AA   | M 1, 1 A A
B338TH   | B 3 3, 8 T H

The comma between outward and inward codes creates the same pause a human uses when dictating a postcode.
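
A minimal sketch of the postcode split, assuming a simplified pattern (one or two letters, a digit, an optional extra character, then a digit and two letters for the inward code). The function name `normalise_postcode` is hypothetical, and the regular expression is looser than the full official postcode rules.

```python
import re

# Simplified shape: outward code, then inward code (digit + two letters).
POSTCODE = re.compile(r"([A-Z]{1,2}\d[A-Z\d]?)(\d[A-Z]{2})")

def normalise_postcode(raw: str) -> str:
    """Space out each character, pausing between outward and inward codes."""
    compact = re.sub(r"\s+", "", raw).upper()    # "sw1a 1aa" -> "SW1A1AA"
    match = POSTCODE.fullmatch(compact)
    if not match:
        return raw                               # partial or invalid: unchanged
    outward, inward = match.groups()
    return f"{' '.join(outward)}, {' '.join(inward)}"
```

Because the inward code is always three characters, removing whitespace first and letting the pattern split before the last three characters handles both spaced and unspaced input.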

Currency

Currency requires context-aware formatting. "£1,234.56" should become "1,234 pounds and 56 pence", not "pound sign one comma two three four point five six".

The approach involves:

  1. Identifying the currency symbol and mapping it to spoken names (pound/pounds, pence for GBP; dollar/dollars, cents for USD)
  2. Separating whole and decimal parts
  3. Using singular or plural forms based on the amount
  4. Handling special cases like amounts under £1 ("99 pence" not "0 pounds and 99 pence")
  5. Recognising multiplier suffixes like "k", "m", "bn" for thousands, millions, billions

Input     | Output
£1,234.56 | 1,234 pounds and 56 pence
£0.99     | 99 pence
£1.5m     | 1.5 million pounds
$100      | 100 dollars
€1        | 1 euro
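
The five steps above might look like the following in Python. This is a sketch under stated assumptions: the names `UNITS`, `MULTIPLIERS` and `normalise_currency` are hypothetical, the plural "euros" and the minor unit "cents" for the euro are assumptions not taken from the table, and a single minor unit is not singularised (so £2.01 would yield "1 pence").

```python
import re

# symbol -> (singular, plural, minor unit). Euro plural/minor are assumptions.
UNITS = {"£": ("pound", "pounds", "pence"),
         "$": ("dollar", "dollars", "cents"),
         "€": ("euro", "euros", "cents")}
MULTIPLIERS = {"k": "thousand", "m": "million", "bn": "billion"}
AMOUNT = re.compile(r"([£$€])(\d[\d,]*)(?:\.(\d+))?(bn|k|m)?\b", re.IGNORECASE)

def _speak(match: re.Match) -> str:
    symbol, whole, frac, suffix = match.groups()
    singular, plural, minor = UNITS[symbol]
    if suffix:                                   # e.g. £1.5m
        number = whole + (f".{frac}" if frac else "")
        return f"{number} {MULTIPLIERS[suffix.lower()]} {plural}"
    whole_value = int(whole.replace(",", ""))
    minor_value = int(frac.ljust(2, "0")[:2]) if frac else 0
    major = f"{whole} {singular if whole_value == 1 else plural}"
    if minor_value == 0:
        return major
    if whole_value == 0:                         # under one unit: "99 pence"
        return f"{minor_value} {minor}"
    return f"{major} and {minor_value} {minor}"

def normalise_currency(text: str) -> str:
    """Replace every currency amount in text with a speakable phrase."""
    return AMOUNT.sub(_speak, text)
```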

Dates and Times

Dates in business contexts usually appear as DD/MM/YYYY (UK format) or YYYY-MM-DD (ISO format). Neither reads naturally without transformation.

The normalisation converts numeric dates to spoken form with ordinal day numbers and full month names: "25/12/2024" becomes "the 25th of December 2024". The ordinal suffix (st, nd, rd, th) is determined by the day number.

Times require similar treatment. The approach converts 24-hour format to 12-hour with am/pm, handles special cases like noon and midnight, and formats minutes naturally ("2 oh 5 pm" for 2:05pm rather than "2 zero 5 pm").

Input      | Output
25/12/2024 | the 25th of December 2024
2024-01-15 | the 15th of January 2024
14:30      | 2 30 pm
2:05pm     | 2 oh 5 pm
12:00      | noon
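
As an illustration, the date and time rules could be written like this. The helper names (`ordinal`, `speak_date`, `speak_time`) are hypothetical, and the sketch assumes each function receives an isolated date or time string rather than a full sentence.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def ordinal(day: int) -> str:
    """1 -> 1st, 2 -> 2nd, 3 -> 3rd, 11 -> 11th, 22 -> 22nd."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"

def speak_date(text: str) -> str:
    uk = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", text)
    iso = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", text)
    if uk:
        day, month, year = map(int, uk.groups())
    elif iso:
        year, month, day = map(int, iso.groups())
    else:
        return text
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return text                              # implausible: leave unchanged
    return f"the {ordinal(day)} of {MONTHS[month - 1]} {year}"

def speak_time(text: str) -> str:
    m = re.fullmatch(r"(\d{1,2}):(\d{2})\s*(am|pm)?", text.strip(), re.IGNORECASE)
    if not m:
        return text
    hour, minute = int(m.group(1)), int(m.group(2))
    suffix = (m.group(3) or "").lower()
    if suffix == "pm" and hour < 12:
        hour += 12                               # work in 24-hour internally
    if suffix == "am" and hour == 12:
        hour = 0
    if hour > 23 or minute > 59:
        return text
    if minute == 0 and hour in (0, 12):
        return "midnight" if hour == 0 else "noon"
    period = "am" if hour < 12 else "pm"
    hour12 = hour % 12 or 12
    if minute == 0:
        return f"{hour12} {period}"
    spoken_minute = f"oh {minute}" if minute < 10 else str(minute)
    return f"{hour12} {spoken_minute} {period}"
```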

Email Addresses

Email addresses contain symbols that TTS engines pronounce literally. "@" becomes "at sign" and "." becomes "dot" or "period" depending on the engine.

The normalisation replaces "@" with " at " and "." with " dot ", producing output that sounds natural when spoken.

Input                  | Output
john.smith@example.com | john dot smith at example dot com
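
A minimal sketch of the replacement, assuming a deliberately simple email pattern (not a full RFC-compliant matcher) and a hypothetical function name `normalise_emails`. Matching the whole address first means full stops elsewhere in the sentence are left alone.

```python
import re

# Simplified address pattern: local part, "@", then a dotted domain.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def normalise_emails(text: str) -> str:
    """Speak '@' as 'at' and '.' as 'dot', but only inside email addresses."""
    def speak(match: re.Match) -> str:
        return match.group(0).replace("@", " at ").replace(".", " dot ")
    return EMAIL.sub(speak, text)
```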

Car Registrations

UK vehicle registrations follow several formats depending on age. Modern registrations (post-2001) use the pattern AA00 AAA. The letters and numbers should be read separately with a pause between groups.

The approach normalises the registration to uppercase, then groups characters by type (letters vs digits), inserting comma pauses when switching between letter and digit groups.

Input    | Output
AB12 CDE | A B, 1 2, C D E
A123 BCD | A, 1 2 3, B C D
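
The group-by-character-type idea can be sketched with `itertools.groupby`. The function name `normalise_registration` is hypothetical, and the sketch assumes only the two formats shown in the table (current AA00 AAA and the older prefix style A123 BCD); other historic formats would need their own patterns.

```python
import re
from itertools import groupby

# Current format (AA00AAA) or prefix format (A1AAA to A123AAA).
REGISTRATION = re.compile(r"[A-Z]{2}\d{2}[A-Z]{3}|[A-Z]\d{1,3}[A-Z]{3}")

def normalise_registration(raw: str) -> str:
    """Spell out a registration, pausing whenever letters switch to digits."""
    compact = re.sub(r"\s+", "", raw).upper()
    if not REGISTRATION.fullmatch(compact):
        return raw
    # Consecutive characters of the same type (digit / non-digit) form a run.
    runs = ["".join(run) for _, run in groupby(compact, key=str.isdigit)]
    return ", ".join(" ".join(run) for run in runs)
```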

Business Names and Acronyms

Business data contains abbreviations that should be spelled out rather than pronounced as words. "AX Motors" should be "A X Motors", not "Axe Motors". But common abbreviations like "UK" or "TV" should remain as-is because TTS engines already handle them correctly.

The approach maintains a list of common words and abbreviations that TTS handles well (UK, US, TV, LTD, PLC, CEO, etc.) and leaves these unchanged. Other uppercase sequences of 2-4 letters are spaced out for spelling.

Additional business-specific patterns include:

  • "T/A" becomes "trading as"
  • Ampersands become "and"
  • Letter-number-letter patterns like "B2B" become "B 2 B"

Input           | Output
AX Motors       | A X Motors
B&M Retail      | B and M Retail
ABC Ltd T/A XYZ | A B C Ltd trading as X Y Z
UK Business     | UK Business (unchanged)
B2B Services    | B 2 B Services
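
One way to express these rules, as a sketch only: the allow-list `KEEP` contains just the examples named above (a real list would be longer), and the function name `normalise_business_name` is hypothetical.

```python
import re

# Abbreviations the TTS engine is assumed to pronounce correctly on its own.
KEEP = {"UK", "US", "TV", "LTD", "PLC", "CEO"}

def normalise_business_name(text: str) -> str:
    text = re.sub(r"\bT/A\b", "trading as", text, flags=re.IGNORECASE)
    text = re.sub(r"\s*&\s*", " and ", text)                 # B&M -> B and M
    text = re.sub(r"\b([A-Z])(\d)([A-Z])\b", r"\1 \2 \3", text)  # B2B -> B 2 B

    def spell(match: re.Match) -> str:
        word = match.group(0)
        return word if word in KEEP else " ".join(word)

    # Remaining all-caps runs of 2-4 letters are spelled letter by letter.
    return re.sub(r"\b[A-Z]{2,4}\b", spell, text)
```

Running the specific substitutions (T/A, ampersand, letter-digit-letter) before the generic acronym rule keeps the generic rule from interfering with them.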

Stripping Internal Syntax

Sometimes language models output internal syntax as text instead of using proper function calls. Tool call syntax, JSON fragments, or other machine-readable content should never be read aloud.

The normalisation identifies and removes these patterns, leaving only the human-readable content.

Input                             | Output
Thank you. [action: end] Goodbye. | Thank you. Goodbye.
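
A sketch of the stripping step, assuming the leaked syntax always takes the bracketed `[key: value]` form shown in the example. The real shapes depend on the language model and prompt in use, so the pattern and the function name `strip_internal_syntax` are illustrative assumptions.

```python
import re

# Assumed shape of leaked tool syntax: "[word: anything]".
BRACKETED_ACTION = re.compile(r"\[\s*\w+\s*:[^\]]*\]")

def strip_internal_syntax(text: str) -> str:
    """Remove bracketed action markers and tidy the whitespace left behind."""
    cleaned = BRACKETED_ACTION.sub(" ", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```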

Performance Optimisation

Text normalisation runs on every response before TTS generation. For high-volume systems, avoiding unnecessary processing matters.

The optimisation approach uses a quick pre-check to determine whether text contains any patterns that need normalisation. Simple conversational responses like "How can I help you today?" skip the full normalisation pipeline entirely. Only text containing phone numbers, postcodes, currency symbols, dates, or other data patterns gets processed.
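
A pre-check of this kind could be a single cheap regular expression. In the sketch below, the trigger set (digits, currency symbols, "@", "&", "[", "/" and short all-caps words) is an assumption chosen to cover the patterns in this article, and `normalise_if_needed` is a hypothetical name; `pipeline` stands in for whatever full normaliser is in use.

```python
import re
from typing import Callable

# Anything that could possibly need normalising. Deliberately over-inclusive:
# a false positive only costs time, a false negative skips normalisation.
NEEDS_NORMALISING = re.compile(r"[\d£$€@&\[/]|\b[A-Z]{2,4}\b")

def normalise_if_needed(text: str, pipeline: Callable[[str], str]) -> str:
    """Run the full pipeline only when the quick check finds a trigger."""
    if not NEEDS_NORMALISING.search(text):
        return text                              # plain prose: skip all work
    return pipeline(text)
```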

Pattern Ordering Matters

The order in which you apply transformations affects results. Consider the text "at 14:30 on AB12 CDE".

If you process acronyms before car registrations, the letter groups in "AB12 CDE" might get spaced out before the car registration pattern has a chance to match. If you process car registrations before postcodes, "AB12 CDE" correctly matches as a vehicle registration rather than being misidentified as a malformed postcode.

A robust implementation processes patterns in order of specificity:

  1. Email addresses (before other dot processing affects them)
  2. Car registrations (specific alphanumeric patterns)
  3. Times with "at" prefix (before acronym processing)
  4. Business name patterns (T/A, ampersands, acronyms)
  5. Dates (UK and ISO formats)
  6. Times (12-hour format)
  7. Currency (with and without multipliers)
  8. Postcodes
  9. Phone numbers (most specific patterns first)
  10. Reference numbers

Each pattern is designed to avoid false positives on the others.
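
To make the ordering effect concrete, here is a small sketch with two simplified stand-in steps. The step functions, the list names and `run_pipeline` are hypothetical; a full pipeline would list all ten steps in the order given above.

```python
import re
from typing import Callable

def spell_registrations(text: str) -> str:
    # Simplified: current-format registrations written with a space.
    return re.sub(r"\b([A-Z]{2})(\d{2}) ([A-Z]{3})\b",
                  lambda m: ", ".join(" ".join(g) for g in m.groups()), text)

def spell_acronyms(text: str) -> str:
    return re.sub(r"\b[A-Z]{2,4}\b", lambda m: " ".join(m.group(0)), text)

def run_pipeline(text: str, steps: list[Callable[[str], str]]) -> str:
    """Apply each step in list order; the list order is the priority order."""
    for step in steps:
        text = step(text)
    return text

REGISTRATIONS_FIRST = [spell_registrations, spell_acronyms]
ACRONYMS_FIRST = [spell_acronyms, spell_registrations]

print(run_pipeline("at 14:30 on AB12 CDE", REGISTRATIONS_FIRST))
# at 14:30 on A B, 1 2, C D E
print(run_pipeline("at 14:30 on AB12 CDE", ACRONYMS_FIRST))
# at 14:30 on AB12 C D E   (registration no longer recognised)
```

Keeping the order in one explicit list makes it easy to review and to test each ordering decision in isolation.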

Testing Edge Cases

Real-world data produces surprising edge cases:

  • "Call me at 2" - Is "2" a time or just the number two?
  • "Unit 2A" - Should "2A" be spaced out?
  • "SW1A" without the inward code - Partial postcode or abbreviation?
  • "REF: TBC" - Reference number or "to be confirmed"?

The solution is specificity. "at 2pm" triggers time formatting; "at 2" alone doesn't. Reference patterns require at least 4 alphanumeric characters after the prefix. Postcodes require both outward and inward codes to match.

When in doubt, leave text unchanged. It's better for TTS to mispronounce an edge case than for normalisation to corrupt valid prose.
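
These specificity rules translate naturally into strict patterns plus a small table of test cases. The sketch below is illustrative: the pattern names are hypothetical, and the requirement that a reference contains at least one digit is an extra guard added here (so that a word like "REFERENCE" is not treated as a reference code), not something stated above.

```python
import re

# "at 2pm" is a time; a bare "at 2" is not.
TIME_AFTER_AT = re.compile(r"\bat (\d{1,2})\s?(am|pm)\b", re.IGNORECASE)
# Prefix, then at least 4 alphanumerics, at least one of them a digit (assumption).
REFERENCE = re.compile(r"\bREF:?\s?(?=[A-Z0-9]*\d)([A-Z0-9]{4,})\b")

def looks_like_time(text: str) -> bool:
    return bool(TIME_AFTER_AT.search(text))

def looks_like_reference(text: str) -> bool:
    return bool(REFERENCE.search(text))

# Edge cases kept as data so new surprises can be added as one-line entries.
CASES = [
    (looks_like_time, "Call me at 2", False),
    (looks_like_time, "Call me at 2pm", True),
    (looks_like_reference, "REF: TBC", False),
    (looks_like_reference, "Your reference is REF123456", True),
]
for check, text, expected in CASES:
    assert check(text) is expected, text
```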

The Complete Pipeline

Text normalisation slots into the voice AI pipeline between the language model and TTS:

  1. User speaks → Speech-to-text transcription
  2. LLM processes → Generates response text
  3. Normalisation → Converts data formats to speakable phrases
  4. TTS generates → Converts normalised text to audio
  5. Audio plays → User hears natural pronunciation

The normalisation step is invisible to both the language model and the user. The LLM can output "Your reference is REF123456" naturally, and the user hears "Your reference is REF 1 2 3 4 5 6" without either party knowing about the transformation.
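
For the reference example in particular, a sketch of the transformation might be as follows. The "REF" prefix and the four-digit minimum are assumptions based on the examples in this article, and `normalise_reference` is a hypothetical name.

```python
import re

def normalise_reference(text: str) -> str:
    """Keep the REF prefix as a word and read the digits one at a time."""
    return re.sub(r"\b(REF)(\d{4,})\b",
                  lambda m: f"{m.group(1)} {' '.join(m.group(2))}",
                  text)
```

The function takes the LLM's text and returns the string passed to TTS, so neither side needs to know it ran.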

Regional Considerations

The patterns described here focus on UK conventions: UK phone formats, UK postcodes, UK date ordering (DD/MM/YYYY), pounds sterling. Adapting for other regions requires:

  • Different phone number patterns and groupings
  • Different postal code formats (US ZIP codes, German PLZ, etc.)
  • Different date ordering (MM/DD/YYYY for US)
  • Different currency handling

A production system serving multiple markets needs either locale detection or explicit configuration to apply the correct regional patterns.
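
Explicit configuration could be as simple as a per-locale rules table that each normaliser consults. This is a sketch only: the `LocaleRules` fields, the locale keys and `parse_numeric_date` are hypothetical and cover just date ordering and the currency name.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LocaleRules:
    day_first: bool      # True for DD/MM/YYYY, False for MM/DD/YYYY
    major_unit: str      # spoken plural name of the main currency unit

LOCALES = {
    "en-GB": LocaleRules(day_first=True, major_unit="pounds"),
    "en-US": LocaleRules(day_first=False, major_unit="dollars"),
}

def parse_numeric_date(text: str, locale: str) -> tuple[int, int, int]:
    """Return (day, month, year) from a slash-separated date for the locale."""
    first, second, year = (int(part) for part in text.split("/"))
    rules = LOCALES[locale]
    day, month = (first, second) if rules.day_first else (second, first)
    return day, month, year
```

The same string, "03/04/2024", is the 3rd of April under "en-GB" and March 4th under "en-US", which is why the locale has to be known before normalisation runs.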


Building voice experiences with business data?

SwiftCase integrates voice AI with workflow automation, handling the complexity of natural speech synthesis so your AI agents can read customer data, reference numbers, and business information clearly. Our platform manages the technical details so you can focus on your processes.
