- Speakable schema (SpeakableSpecification) flags specific content blocks for text-to-speech; without it, voice assistants guess — and usually guess wrong.
- Featured snippets feed most smart-speaker answers, so snippet optimisation and speakable schema are the same pipeline, not two separate tasks.
- Conversational, question-first headers (Who, What, Where, How, Why) dramatically increase the odds your content matches natural-language voice queries.
- Local voice searches ('open now near me', 'best dentist in [city]') pull from your Google Business Profile first — structured data on your site is the backup signal.
- Page speed under 2 seconds is a hard prerequisite; voice assistants skip slow pages because they can't pause for a loading spinner.
- Optimising for voice is cheaper than paid search and compounds over time — one well-structured FAQ page can earn dozens of spoken citations.
The Problem With Treating Voice Search as a Checkbox
Most guides on voice search end at "add speakable schema and you're done." That's like buying running shoes and calling yourself a marathoner. The schema markup matters — we'll explain exactly how to implement it — but the bigger opportunity is in how you write, structure, and publish content that voice assistants want to read.
Voice queries are growing faster than text queries in local and informational categories. When someone asks their Google Nest, "What time does [business] close?" or "How do I remove rust from cast iron?", they are not typing a keyword. They are talking to something they expect to understand them. If your site is built only for scannable text and keyword density, you are invisible to that interaction.
This guide covers the full stack: what speakable schema actually does (not just what it is), why featured snippets are the real pipeline for voice answers, how to write content that earns those snippets, and what local businesses specifically need in place before any of the above matters.
What Speakable Schema Actually Does
Speakable schema is a type of structured data (using the Schema.org SpeakableSpecification) that explicitly tells voice assistants: these specific sections of this page are suitable for text-to-speech playback. You add it to your page's JSON-LD block, referencing either CSS selectors or XPath expressions that point to the text you want spoken.
Without it, a voice assistant has to guess which paragraph is the right answer. It might grab a navigation label, a copyright notice, or a sentence fragment. With speakable markup, you're handing the assistant a highlighter and saying, "Read this part."
Google's documentation targets speakable schema primarily at news articles and broadcast content and treats the feature as a beta, but Schema.org defines the speakable property on both Article and WebPage, so the markup itself is valid on any page type. The practical effect for a small business is that your FAQ answer, your service description, or your "about" paragraph is clearly marked as a clean, speakable candidate, although nothing guarantees that an assistant will use it.
Here's a minimal JSON-LD implementation:
```json
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "How to Remove Rust From Cast Iron",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".answer-block", "h2.answer-heading"]
  }
}
```
That's it. Point cssSelector at the elements that contain your clean, spoken answer. Keep each of those elements to 2–3 sentences, which is roughly 20 to 30 seconds of speech. Longer passages risk being truncated or skipped entirely.
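To show where that JSON-LD ends up on the page, here is a minimal Python sketch that builds the same object and wraps it in the script tag JSON-LD is normally embedded in. The function name speakable_script_tag is hypothetical, and the selectors are the example ones from the snippet above; swap in whatever classes your own templates use.

```python
import json


def speakable_script_tag(page_name: str, css_selectors: list[str]) -> str:
    """Return a JSON-LD <script> tag marking the given selectors as speakable.

    Hypothetical helper for illustration; not part of any library.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": page_name,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": css_selectors,
        },
    }
    # Escape "</" so a page name can never close the script tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    print(
        speakable_script_tag(
            "How to Remove Rust From Cast Iron",
            [".answer-block", "h2.answer-heading"],
        )
    )
```

The printed tag can be pasted into the page's head or body; validating the result with a structured data testing tool before publishing is still worthwhile.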
Featured Snippets Are the Voice Answer Pipeline
This is the part most guides miss. The majority of smart-speaker answers are pulled directly from featured snippets — the boxed answer at position zero in Google Search. If you want voice traffic, you need to understand that these are not two separate optimisation tasks. They are one task.
The path looks like this:
- A user asks a natural-language question
- Google (or the underlying search engine powering the assistant) identifies the best answer
- If a featured snippet exists for that query, it is the answer — either displayed on screen or read aloud
- Speakable schema helps Google identify which part of your page to use when it's deciding between multiple snippet candidates
So your workflow is: earn the snippet, then mark it as speakable. Not the other way around.
To earn featured snippets, your content needs to:
- Directly answer a specific question in the first sentence of the relevant paragraph — don't build to the answer, start with it
- Keep the answer block concise: 40–60 words for paragraph snippets, 4–8 items for list snippets
- Use the question as a header (exact or close variant), then answer immediately below it
- Provide supporting depth in the rest of the section — this signals that your page is authoritative, not just terse
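The length targets in that list are easy to check mechanically before publishing. The sketch below assumes a simple whitespace word count, which will differ slightly from how any search engine tokenises text; the function names are hypothetical and the 40–60 and 4–8 ranges are simply the ones stated above.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in an answer block."""
    return len(text.split())


def paragraph_fits_snippet(answer: str, low: int = 40, high: int = 60) -> bool:
    """True when a paragraph answer falls inside the target word range."""
    return low <= word_count(answer) <= high


def list_fits_snippet(items: list[str], low: int = 4, high: int = 8) -> bool:
    """True when a list answer has a snippet-friendly number of items."""
    return low <= len(items) <= high


if __name__ == "__main__":
    draft = "Yes, we offer same-day delivery in the city centre."
    print(word_count(draft), paragraph_fits_snippet(draft))
```

A draft that fails the check is not necessarily a bad answer; treat the result as a prompt to tighten or expand, not as a hard rule.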
How to Write for Ears, Not Eyes
Written content optimised for scanning looks like this: short bullets, bold keywords, minimal full sentences. That is the opposite of what voice assistants need. Voice content needs to sound natural when read aloud by a robotic voice at moderate speed.
Practical rules for voice-friendly writing:
Use full sentences. Bullets don't speak well. "• Available Monday–Saturday" becomes "bullet available Monday dash Saturday" in some assistants. Write "We're available Monday through Saturday, 9 a.m. to 6 p.m."
Avoid jargon and acronyms on first use. An assistant won't pause to explain what "GBP" means. If you write "your Google Business Profile (GBP)", the assistant reads the parenthetical too. Spell it out or drop the acronym.
Front-load your answers. "Yes, we offer same-day delivery in [city]" beats "For customers in [city], we are pleased to offer same-day delivery." The first version works as a voice answer. The second does not.
Use conversational question formats as subheadings. "How long does installation take?" instead of "Installation Timeline." The question matches what a user actually says. The label does not.
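Two of these rules lend themselves to a rough automated check: whether a subheading is phrased as a question, and whether body text contains a parenthetical acronym that would be read aloud. The sketch below is a heuristic only, with hypothetical function names and an assumed list of question openers; it will miss some phrasings and flag some false positives.

```python
import re

# Assumed set of openers; extend it for your own content.
QUESTION_WORDS = ("who", "what", "where", "when", "why", "how",
                  "is", "are", "can", "do", "does")


def is_question_heading(heading: str) -> bool:
    """True when a heading ends with '?' and starts with a question word."""
    text = heading.strip().lower()
    if not text.endswith("?"):
        return False
    return text.split()[0] in QUESTION_WORDS


def headings_to_rewrite(headings: list[str]) -> list[str]:
    """Return the headings that read as labels rather than questions."""
    return [h for h in headings if not is_question_heading(h)]


def find_parenthetical_acronyms(text: str) -> list[str]:
    """Find acronyms in parentheses, e.g. '(GBP)', that would be spoken."""
    return re.findall(r"\(([A-Z]{2,})\)", text)


if __name__ == "__main__":
    print(headings_to_rewrite(["Installation Timeline",
                               "How long does installation take?"]))
    print(find_parenthetical_acronyms("your Google Business Profile (GBP)"))
```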
The Local Voice Search Layer
Voice assistants handle local queries differently from informational ones. When someone says "best plumber near me" or "is [business name] open right now," the assistant is not primarily querying your website — it's querying your Google Business Profile (GBP) and structured local data.
This means your voice search strategy has two tracks:
Track 1: Your GBP (for local queries)
- Hours must be current — including holiday hours and special hours
- Categories must be specific (not just "Restaurant" but "Italian Restaurant" and "Pizza Delivery")
- Q&A section should have pre-populated answers to common questions ("Do you take walk-ins?" "Is there parking?")
- Your NAP (name, address, phone) must be identical across every directory — inconsistent NAP data directly suppresses local pack rankings
Track 2: Your website (for informational and comparison queries)
- LocalBusiness schema with openingHoursSpecification, geo, and areaServed
- FAQ schema on your services and location pages
- Speakable markup on your most concise, accurate answer blocks
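For the first item in Track 2, a LocalBusiness object with those three properties might look like the sketch below. Every value in the usage example (business name, phone number, address, coordinates, hours) is made up, and the helper local_business_jsonld is a hypothetical name; only the Schema.org type and property names are real.

```python
import json


def local_business_jsonld(name, telephone, address, latitude, longitude,
                          area_served, hours):
    """Build a LocalBusiness JSON-LD dict.

    address: dict of PostalAddress fields (streetAddress, addressLocality, ...).
    hours:   list of (days, opens, closes) tuples, times as "HH:MM".
    """
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "telephone": telephone,
        "address": {"@type": "PostalAddress", **address},
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": latitude,
            "longitude": longitude,
        },
        "areaServed": area_served,
        "openingHoursSpecification": [
            {
                "@type": "OpeningHoursSpecification",
                "dayOfWeek": days,
                "opens": opens,
                "closes": closes,
            }
            for days, opens, closes in hours
        ],
    }


if __name__ == "__main__":
    # Placeholder data for illustration only.
    data = local_business_jsonld(
        name="Example Plumbing Co.",
        telephone="+1-555-0100",
        address={"streetAddress": "1 Main Street",
                 "addressLocality": "Springfield"},
        latitude=40.0,
        longitude=-75.0,
        area_served="Springfield",
        hours=[(["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday"], "09:00", "18:00")],
    )
    print(json.dumps(data, indent=2))
```

Keep the hours here in step with your GBP listing; two sources that disagree are worse than one.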
The two tracks feed each other. A strong GBP with lots of reviews and complete data signals authority, which helps your site rank higher in organic results, which increases the chance of earning featured snippets.
Page Speed: The Prerequisite Nobody Mentions
Voice assistants operate on a tighter latency budget than human users. A person will wait 4 seconds for a page to load. A voice assistant won't — if your page is slow, it moves to the next candidate.
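Treat 2 seconds as the ceiling for TTFB (Time to First Byte) on pages you want in voice results, and aim lower where you can: Google's web.dev guidance rates a TTFB of 0.8 seconds or less as good. Run your core content pages through Google PageSpeed Insights and focus on: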
- Eliminating render-blocking JavaScript from above-the-fold content
- Serving images in WebP with explicit width and height attributes
- Using a CDN if your audience is geographically distributed
- Caching HTML responses for static or semi-static pages
Page speed is also a direct ranking signal for mobile, and most voice queries originate on mobile devices. Fixing speed helps your overall SEO, not just voice.
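If you want a quick spot check between full PageSpeed audits, the sketch below approximates TTFB from the standard library. It is a rough measurement under stated assumptions: the timer includes DNS lookup, connection setup, TLS, and any redirects, it reflects only the machine and network you run it from, and both function names are hypothetical. The 2-second default is the budget used in this guide, not a threshold published by any assistant.

```python
import time
import urllib.request


def measure_ttfb_seconds(url: str, timeout: float = 10.0) -> float:
    """Approximate time to first byte for a URL, in seconds.

    urlopen returns once the response headers have arrived, so the elapsed
    time up to that point is used as the TTFB estimate.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout):
        return time.perf_counter() - start


def within_budget(ttfb_seconds: float, budget_seconds: float = 2.0) -> bool:
    """True when a measured TTFB is under the chosen budget."""
    return ttfb_seconds < budget_seconds


if __name__ == "__main__":
    # Replace with one of your own pages; example.com is a placeholder.
    elapsed = measure_ttfb_seconds("https://example.com/")
    print(f"{elapsed:.2f}s", "OK" if within_budget(elapsed) else "too slow")
```

Run it several times and from more than one location before drawing conclusions, since a single sample can be skewed by a cold cache or a slow network hop.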
The Content Gap Most Small Businesses Have
Run this test right now: type your most commonly asked customer question into Google as a full sentence. ("How much does a [your service] cost in [your city]?" or "Does [your business name] do [specific service]?")
Does your website appear? Does it appear with a snippet? If not, you have a content gap that no amount of schema markup can fix.
The solution is a dedicated FAQ or question-and-answer page built specifically around the questions your customers actually ask. Not the questions you wish they asked. The ones that come in via phone, email, and chat.
Structure each answer as:
- Question as H2 or H3 heading
- Direct 1–2 sentence answer immediately below
- 100–200 words of supporting context
- Relevant internal link to the full service or product page
Apply FAQ schema markup to the entire page. Apply speakable schema to the direct answer sentences. This single page, done properly, can capture voice traffic for dozens of queries simultaneously.
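As a sketch of the FAQ markup for such a page, the helper below turns a list of question and answer pairs into a Schema.org FAQPage object. The function name faq_page_jsonld and the sample question are hypothetical; the answer text should match what is visibly on the page.

```python
import json


def faq_page_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage JSON-LD dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


if __name__ == "__main__":
    # Placeholder content for illustration only.
    faq = faq_page_jsonld([
        ("How long does installation take?",
         "Most installations take about half a day."),
    ])
    print(json.dumps(faq, indent=2))
```

Generating the markup from the same source as the visible questions keeps the two from drifting apart when answers are edited later.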
Putting It Together: What to Prioritise
If you're starting from zero, here's the order that gets results fastest:
- Fix your GBP first. It's free, it's fast, and it directly answers most local voice queries without your website being involved at all.
- Run a page speed audit. Remove the technical blockers before adding more content.
- Build or improve your FAQ page. Write in full conversational sentences. Use question headers. Keep answers under 60 words.
- Add FAQ schema to that page using Google's Structured Data Markup Helper.
- Add speakable schema pointing to your direct answer paragraphs.
- Check for featured snippet opportunities in Google Search Console — look for queries where you rank in positions 2–10; those are your highest-probability snippet targets.
- Monitor voice-friendly queries by filtering GSC for question-format queries (who, what, where, when, how, why) and tracking their click-through rates over 60–90 days.
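The last two steps amount to a filter over exported query data. The sketch below assumes you have already loaded a Search Console export into a list of dictionaries with the keys query, position, and impressions; those key names, the function name, and the sample rows are assumptions for illustration, not the export's actual column headers.

```python
QUESTION_STARTERS = ("who", "what", "where", "when", "how", "why")


def is_question_query(query: str) -> bool:
    """True when a query begins with one of the question words."""
    words = query.lower().split()
    return bool(words) and words[0] in QUESTION_STARTERS


def snippet_candidates(rows, min_position=2.0, max_position=10.0):
    """Question-format queries ranking in positions 2-10, busiest first."""
    matches = [
        row for row in rows
        if is_question_query(row["query"])
        and min_position <= row["position"] <= max_position
    ]
    return sorted(matches, key=lambda row: row["impressions"], reverse=True)


if __name__ == "__main__":
    # Made-up rows standing in for an exported report.
    rows = [
        {"query": "how much does boiler repair cost", "position": 4.2, "impressions": 900},
        {"query": "boiler repair", "position": 3.0, "impressions": 5000},
        {"query": "what is a combi boiler", "position": 1.0, "impressions": 700},
        {"query": "why is my boiler leaking", "position": 8.9, "impressions": 1200},
    ]
    for row in snippet_candidates(rows):
        print(row["query"], row["position"], row["impressions"])
```

Re-running the same filter on each 60 to 90 day export gives a consistent list to compare click-through rates against.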
Voice search optimisation isn't a one-day project, but it's also not a complicated one. The businesses that win voice traffic are the ones who write clear answers to real questions, mark those answers up properly, and maintain the local data signals that assistants rely on. That's a discipline, not a trick.
“The businesses that win voice traffic are the ones who write clear answers to real questions — speakable schema just makes sure the assistant finds them first.”
| Area | Unoptimised site | Voice-optimised site |
|---|---|---|
| Content structure | Keyword-dense paragraphs written for scanning, bullet-heavy formatting | Full-sentence, question-and-answer prose with conversational headers and direct opening answers |
| Schema markup | No structured data, or only basic title/description meta tags | FAQ schema, LocalBusiness schema, and SpeakableSpecification pointing to clean answer blocks |
| Local data signals | Inconsistent NAP across directories, incomplete or outdated GBP listing | Fully completed GBP with current hours, categories, Q&A populated, and NAP identical across every directory |
| Page speed | TTFB over 3 seconds, render-blocking JS, uncompressed images | TTFB under 2 seconds, WebP images with explicit dimensions, CDN-served static assets |
| Featured snippet ownership | Ranks on page one but holds no featured snippets; answers buried mid-paragraph | Holds featured snippets for top question-format queries; answers front-loaded in 40–60 word blocks |
| Visibility in voice results | Rarely or never read aloud by Google Assistant, Alexa, or Siri | Consistently surfaced for local and informational voice queries relevant to the business |
How to Optimise Your Site for Voice Search
- 01. Audit and complete your Google Business Profile. Open your GBP dashboard and verify that business hours (including holidays), primary and secondary categories, service areas, and the Q&A section are fully populated. Most local voice queries hit GBP directly before ever reaching your website.
- 02. Run a page speed test on your key pages. Use Google PageSpeed Insights or WebPageTest to check TTFB and Largest Contentful Paint on your homepage, FAQ page, and top service pages. Address render-blocking scripts and uncompressed images before adding any schema markup.
- 03. Build or rewrite your FAQ page with conversational answers. List the 10–20 questions customers most commonly ask, write each as an H2 or H3 heading, then open each answer with 1–2 full sentences that respond directly, immediately below the heading and with no preamble. Aim for 40–60 words per answer block.
- 04. Add FAQ and LocalBusiness schema markup. Use Google's Structured Data Markup Helper or a plugin like Rank Math to add FAQ schema to your question page and LocalBusiness schema (with openingHoursSpecification and geo) to your homepage and contact page. Validate both with Google's Rich Results Test before publishing.
- 05. Implement speakable schema on your cleanest answer blocks. Add a SpeakableSpecification JSON-LD block to your FAQ page, pointing cssSelector at the elements that contain your direct answer sentences. Keep each referenced element to 2–3 sentences maximum.
- 06. Identify featured snippet opportunities in Search Console. Filter Google Search Console queries for question-format keywords (how, what, where, why, when, who) and sort by impressions. Pages ranking in positions 2–10 for high-impression question queries are your best candidates for snippet optimisation. Rewrite those answer sections to be more direct and concise.
- 07. Monitor and iterate every 60 days. Re-check your featured snippet holdings and track question-format click-through rates in GSC on a 60-day cadence. Voice search improvements compound slowly (schema changes can take 4–6 weeks to be recrawled), so give changes time before evaluating them.