Voting systems must provide every voter with the ability to mark, verify, and cast their ballot independently and privately. For voters who are blind or have low vision, or who have difficulty reading, that means relying on an audio interface paired with a tactile controller, rather than the visual touchscreen most voters use. But designing a good audio experience for voting is harder than it sounds. Unlike websites and apps, voting systems are self-contained and cannot use personal assistive technology like screen readers, so the system itself must communicate every piece of information a sighted voter would see.
This report offers practical guidance for voting system designers on how to build and test an audio interface that is clear, consistent, and usable — one that gives voters with disabilities a voting experience equivalent to that of any other voter.
This research was published by the National Institute of Standards and Technology (NIST) as VTS 100-5 and is also available on NIST's website. In addition to the research findings, the report includes best practices for designing the audio interface, such as the following:
Voice text that communicates the current focus and relevant context. The audio must include the text contained in the screen element that has the current focus, and it must also voice relevant information that is available visually and provides context.
Use a consistent syntax for how candidates and voting instructions are voiced. Put the outcome first, then the action to get that outcome. Voters are listening for the outcome they want. Voicing the outcome first lets voters identify the one they want before being told how to achieve it.
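As a rough illustration of outcome-first phrasing, a prompt template might look like the sketch below. The candidate name, button label, and function name are invented for the example, not taken from the report.

```python
def outcome_first(outcome: str, action: str) -> str:
    """Voice the outcome the voter is listening for, then the action
    that achieves it."""
    return f"{outcome}, {action}."

# Outcome first: the voter can recognize the choice they want
# before being told how to achieve it.
good = outcome_first("To vote for Maria Lopez", "press the Select button")
print(good)  # To vote for Maria Lopez, press the Select button.

# Action first (avoid): the voter must hold the action in memory
# before learning whether this is the outcome they want.
bad = "Press the Select button to vote for Maria Lopez."
```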
Frequently, audio will need to convey multiple pieces of information (e.g., status of the focus, possible actions, how to move on). Use a consistent order for these items: focus → condition/status → specific actions → general actions.
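The fixed ordering above can be sketched as a simple prompt assembler. This is a minimal sketch, not the report's implementation; the segment text and the function name are invented for illustration.

```python
def build_prompt(focus, status, specific_actions, general_actions):
    """Assemble audio segments in the recommended fixed order:
    focus -> condition/status -> specific actions -> general actions."""
    segments = [focus, status, *specific_actions, *general_actions]
    # Skip any empty segments so the voiced prompt has no gaps.
    return " ".join(s for s in segments if s)

prompt = build_prompt(
    focus="Contest: Mayor.",
    status="You have selected 0 of 1 candidates.",
    specific_actions=["To hear the candidates, press Down."],
    general_actions=["To return to the ballot summary, press Home."],
)
```

Keeping the order in one place, rather than scattered across screens, is what makes the sequence consistent for the voter.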
Break up audio with pauses to mark transitions between different types of information. Use no more than two types of pauses: one short and one long. Use short pauses to separate pieces of information that are specific to this moment. Use the long pause to indicate that the contest-specific audio is complete and the "routine" audio (additional possible actions) is about to begin.
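Assuming a text-to-speech engine that accepts SSML, the two pause lengths could be expressed with `<break>` elements, as in this sketch. The durations and segment text are illustrative; the report does not prescribe specific values.

```python
# Pause durations are examples only, not values from the report.
SHORT_PAUSE = '<break time="400ms"/>'  # separates items within one stretch of audio
LONG_PAUSE = '<break time="1200ms"/>'  # marks the shift to "routine" instructions

def with_pauses(contest_audio: list[str], routine_audio: list[str]) -> str:
    """Join contest-specific segments with short pauses, then signal the
    transition to routine audio with the single long pause."""
    specific = SHORT_PAUSE.join(contest_audio)
    routine = SHORT_PAUSE.join(routine_audio)
    return f"<speak>{specific}{LONG_PAUSE}{routine}</speak>"

ssml = with_pauses(
    ["Contest: Mayor.", "You have selected 0 of 1 candidates."],
    ["To go to the next contest, press Right."],
)
```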
“Earcons” are short pieces of nonverbal audio that can provide supplemental clues to verbal audio, give feedback on actions, or prompt voter actions. Use no more than 2 or 3 earcons. Use sounds that are commonly used rather than inventing new sounds that may not be familiar to voters.
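A small earcon set might be organized as a simple lookup, as in this sketch. The event names and sound files are invented for illustration; the point is that the set stays small and each sound is a familiar cue.

```python
# A deliberately small earcon set (no more than 2-3 sounds), each a
# commonly used cue rather than a newly invented one. File names are
# hypothetical placeholders.
EARCONS = {
    "selected": "chime_up.wav",    # familiar "success" chime on selection
    "deselected": "chime_down.wav",
    "error": "buzz.wav",           # familiar error buzz
}

def earcon_for(event: str):
    """Return the earcon for an event, or None if the event has no sound."""
    return EARCONS.get(event)
```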
This report was authored by Lynn Baumeister, Whitney Quesenbery, and Sharon Laskowski (NIST) and published by NIST in January 2025.
The best practices are grounded in research conducted in 2020, focused on designing an accessible voting system capable of handling a ballot with a mix of ranked choice voting (RCV) and standard contests. The research prioritized creating an interface that worked across multiple interaction modes — visual with touch, audio with tactile controller, and combinations of both.
Testing was conducted iteratively with voters with disabilities using a semi-interactive prototype. Rather than waiting for audio to be fully programmed, the team used a human “computer voice” to read from an editable script in real time while participants navigated the prototype. This approach allowed the team to observe how voters responded to different phrasings, pause lengths, and information sequences and to make adjustments between testing rounds.
Visit our page on voting systems to find more resources about the usability and accessibility of voting systems.