PDF Accessibility: Making Documents Screen-Reader Friendly

A PDF that looks perfect on screen can be completely invisible to a screen reader. Inaccessible PDFs aren't just a compliance risk — they lock out users who rely on assistive technology, and they're far more common than most teams realize. Here's what actually goes wrong and how to fix it.

What Makes a PDF Inaccessible

Most accessibility problems trace back to one root cause: the PDF has visual content but no semantic structure. When a PDF is created by printing to PDF, exporting poorly from Word, or scanning a physical document, the result is often a flat stream of graphics with no machine-readable meaning.

Untagged content — A tagged PDF includes a logical structure tree that labels elements as headings, paragraphs, lists, tables, and figures. An untagged PDF has no such tree. Depending on the screen reader, the file is announced as empty or is read in whatever order the text happens to be stored in the page stream, which often differs from the visual order.

Incorrect reading order — Even a tagged PDF can have the wrong reading order. In multi-column layouts, sidebars, and complex designs, the order in which elements appear in the tag tree may not match the visual reading flow. A screen reader follows the tag order, not the visual order, so it might read a footer before the first paragraph.

Scanned images without OCR — A scanned document is a raster image. There is no text in the file, only pixels. Without optical character recognition (OCR), a screen reader has nothing to read. The document appears empty.

Missing document language — If the PDF doesn't declare a language, text-to-speech can't choose the correct pronunciation rules and voice profile.
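Two of these failures, missing tags and a missing language, leave traces in the PDF's document catalog, so they can be checked mechanically. The sketch below is a minimal illustration rather than a full checker. It assumes the catalog has already been loaded into a plain dict-like mapping (a library such as pypdf can supply one, though indirect references may need resolving first), and catalog_problems is a hypothetical name. The keys /StructTreeRoot, /MarkInfo, and /Lang are the standard catalog entries for tagging and language.

```python
from typing import Any, List, Mapping


def catalog_problems(catalog: Mapping[str, Any]) -> List[str]:
    """Return findings for a PDF document catalog.

    `catalog` is assumed to be a dict-like view of the catalog dictionary,
    keyed by PDF names such as "/Lang". Hypothetical helper for illustration.
    """
    problems = []

    # A tagged PDF has a structure tree root in its catalog.
    if "/StructTreeRoot" not in catalog:
        problems.append("no structure tree (untagged)")

    # /MarkInfo << /Marked true >> declares that the file follows
    # tagged-PDF conventions.
    mark_info = catalog.get("/MarkInfo") or {}
    if not mark_info.get("/Marked", False):
        problems.append("not marked as tagged")

    # /Lang holds the default natural language, for example "en-US".
    if not str(catalog.get("/Lang", "")).strip():
        problems.append("no document language")

    return problems


# A print-to-PDF style catalog versus a properly exported one (made-up data).
flat = {"/Type": "/Catalog", "/Pages": {}}
tagged = {
    "/Type": "/Catalog",
    "/Pages": {},
    "/StructTreeRoot": {"/Type": "/StructTreeRoot"},
    "/MarkInfo": {"/Marked": True},
    "/Lang": "en-US",
}
print(catalog_problems(flat))
print(catalog_problems(tagged))
```

A clean result here says nothing about reading order or scanned pages. Those need the structure tree and the page content themselves, not just the catalog.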

What Tagged PDFs Actually Are

The PDF/UA standard (ISO 14289) defines how to build accessible PDFs. The core mechanism is a logical structure tree — a hierarchy of tagged elements embedded in the file alongside the visual content.

Tags look conceptually similar to HTML elements: <H1>, <P>, <L> (list), <Table>, <Figure>. Each tag maps to content in the page stream and carries semantic meaning. The structure tree tells assistive technology what the content is, not just how it looks.

A <Figure> tag, for example, can carry an Alt attribute, the alt text that a screen reader announces instead of skipping the image entirely. A <Table> contains <TR> rows whose <TH> header cells and <TD> data cells enable cell-by-cell navigation.

Tagging is what separates a PDF that screen readers can navigate from one they can't.
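To make the idea concrete, the structure tree can be modeled as a small tree of tagged nodes walked depth-first, which is roughly the order in which assistive technology encounters content. This is a simplified model, not a PDF parser: StructElem and announce are hypothetical names, the sample content is made up, and real structure elements point at marked content in the page stream instead of holding text directly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructElem:
    """One node of a toy structure tree (hypothetical model)."""
    tag: str
    text: str = ""
    alt: Optional[str] = None
    children: List["StructElem"] = field(default_factory=list)


def announce(elem: StructElem) -> List[str]:
    """Walk the tree depth-first and list what would be announced."""
    lines = []
    if elem.tag == "Figure":
        # A figure is only meaningful to a screen reader through its alt text.
        lines.append(f"Figure: {elem.alt}" if elem.alt else "Figure: (no alt text)")
    elif elem.text:
        lines.append(f"{elem.tag}: {elem.text}")
    for child in elem.children:
        lines.extend(announce(child))
    return lines


doc = StructElem("Document", children=[
    StructElem("H1", "Quarterly Report"),
    StructElem("P", "Revenue grew in every region."),
    StructElem("Figure", alt="Bar chart of revenue by region"),
    StructElem("L", children=[
        StructElem("LI", "North"),
        StructElem("LI", "South"),
    ]),
])

for line in announce(doc):
    print(line)
```

Reordering the children changes the announced order without changing anything on the page, which is why tag order and visual order can drift apart.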

The Accessibility Checklist

Before distributing a PDF, work through this list:

Document tags — The file must be tagged. In Adobe Acrobat Pro: Tools → Accessibility → Autotag Document is a starting point, but automated tags are often wrong. Manual review is almost always needed.

Reading order — Check it by reading the document with a screen reader, or using Acrobat's Order panel (View → Navigation Panels → Order). Every element should appear in the sequence a reader would encounter it on the page.

Alt text for images — Every meaningful image needs alt text. Decorative images should be tagged as Artifact (background) so screen readers skip them. In Acrobat: right-click a figure tag → Edit Alternate Text.

Document title — Set a meaningful document title in File → Properties → Description. Set "Show Document Title" in the Initial View tab. The title is what screen readers announce when the document opens.

Document language — File → Properties → Advanced → Reading Options → Language. Use a valid BCP 47 language tag (en-US, fr-FR, etc.).

Bookmarks for long documents — Any document with more than a few pages should have bookmarks (View → Navigation Panels → Bookmarks). These let users jump directly to sections without reading sequentially.

Form field labels — Interactive form fields need accessible names. Each field should have a tooltip or visible label properly associated with it. Unlabeled fields show up as "Text Field" or "Check Box" with no context.

Heading hierarchy — Headings should be tagged <H1> through <H6> in logical order. Don't skip levels. A screen reader user navigating by headings relies on this structure to understand document organization.

Color contrast — Text needs a contrast ratio of at least 4.5:1 against the background for normal text (3:1 for large text). This applies to PDF body text just as it does to HTML. WCAG 2.1 Success Criterion 1.4.3 covers this in detail.
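The contrast requirement in the last item is a formula, so it can be checked before a color ever reaches the PDF. The sketch below follows the WCAG 2.1 definitions of relative luminance and contrast ratio. It assumes colors are given as sRGB hex strings, which will not hold for text set in CMYK or spot colors, and the function names are illustrative.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color written as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """Apply the 4.5:1 threshold for normal text, 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


print(round(contrast_ratio("#000000", "#FFFFFF"), 2))
print(passes_aa("#767676", "#FFFFFF"))
print(passes_aa("#777777", "#FFFFFF"))
```

Mid-gray text on white sits right at the boundary: #767676 passes for normal text while the slightly lighter #777777 does not, so eyeballing contrast is unreliable.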

Creating Accessible PDFs from Source

Fixing tags in Acrobat after the fact is tedious. The better approach is to produce accessible PDFs from the source document.

From Microsoft Word — Use proper heading styles (Heading 1, Heading 2) instead of manually bolding text. Add alt text to images (right-click → Edit Alt Text). Export via File → Save As → PDF, and check "Document structure tags for accessibility" in the options. This carries Word's structure directly into the PDF.

From Adobe InDesign — InDesign offers the most control. Use paragraph styles consistently, define export tags for each style (H1, H2, P, etc.), add alt text to placed images, and set reading order in the Articles panel before export. Export as PDF with "Create Tagged PDF" enabled. "Create Acrobat Layers" only matters if the document uses layers and does not affect tagging.

From LaTeX — Use the tagpdf package alongside hyperref for metadata. LaTeX documents are often math-heavy, and full PDF/UA compliance for math content requires additional tooling. In modern LaTeX (requires a current TeX distribution), the PDF/UA conformance flag can be set via \DocumentMetadata{pdfstandard=UA-1}, which must appear before \documentclass.

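For scanned documents, run OCR first. In Acrobat Pro: Tools → Scan & OCR → Recognize Text. For batch processing, Tesseract (open source) can turn page images into searchable PDFs with a text layer, and wrappers such as OCRmyPDF apply it directly to image-only PDFs. OCR only adds text, so the output still needs tagging afterward.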
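For batch OCR, Tesseract's command-line tool takes an image, an output base name, and the pdf output config. The sketch below wraps that call for a folder of page images. It assumes the tesseract binary is on the PATH, that the scans are PNG files, and that English is the right language pack. The function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def tesseract_command(image: Path, lang: str = "eng") -> List[str]:
    """Build the Tesseract call that writes a searchable PDF next to `image`."""
    out_base = image.with_suffix("")  # Tesseract appends ".pdf" itself
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def ocr_folder(folder: Path, lang: str = "eng") -> List[Path]:
    """Run OCR on every PNG in `folder` and return the PDFs produced."""
    outputs = []
    for image in sorted(folder.glob("*.png")):
        # check=True raises CalledProcessError if Tesseract fails on a page.
        subprocess.run(tesseract_command(image, lang), check=True)
        outputs.append(image.with_suffix(".pdf"))
    return outputs


print(tesseract_command(Path("page1.png")))
```

Calling ocr_folder(Path("scans")) would process a directory in file-name order. Set lang to match the document, since recognition quality depends on the language pack.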

Checking Your Work

Two tools are the standard for accessibility verification:

Adobe Acrobat Pro Accessibility Checker — Tools → Accessibility → Accessibility Check. It runs a series of automated tests and produces a report. Automated checks catch the obvious problems (missing tags, missing language), but they miss logical reading order errors and incorrect tag types. Treat a passed automated check as necessary but not sufficient.

PAC (PDF Accessibility Checker) — A free Windows tool, originally released by the Swiss foundation Access for All and now maintained by axes4. The current version (PAC 2024, the successor to PAC 3 and PAC 2021) tests specifically against PDF/UA (ISO 14289) and WCAG, and it produces more detailed results than Acrobat's built-in checker. It also includes a screen reader preview that shows what assistive technology will actually encounter, which is invaluable for catching reading order problems.

For true end-to-end verification, read through the document with NVDA (Windows, free) or VoiceOver (macOS, built-in). No automated tool catches everything that a real screen reader experience reveals.
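Some rules are mechanical enough to script as a pre-flight step before running the full checkers. As a minimal sketch, the function below flags skipped heading levels in a flat list of tag names taken in tree order. How that list is extracted from the PDF is left out, and heading_skips is a hypothetical name.

```python
import re
from typing import List, Tuple


def heading_skips(tags: List[str]) -> List[Tuple[int, str, str]]:
    """Return (index, previous_heading, offending_heading) for skipped levels."""
    skips = []
    prev_level = 0
    prev_tag = "(start)"
    for index, tag in enumerate(tags):
        match = re.fullmatch(r"H([1-6])", tag)
        if not match:
            continue  # not a heading tag
        level = int(match.group(1))
        # Going deeper by more than one level skips a heading level.
        if level > prev_level + 1:
            skips.append((index, prev_tag, tag))
        prev_level, prev_tag = level, tag
    return skips


print(heading_skips(["H1", "P", "H2", "H4", "P", "H2", "H3"]))
```

A check like this cannot tell whether the headings actually describe the content, which is the kind of judgment only a read-through with a screen reader provides.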

PDF/UA vs WCAG 2.1

These standards complement each other. PDF/UA (ISO 14289-1) is the technical specification for accessible PDF files: it defines tag structure requirements, metadata requirements, and PDF-specific rules. WCAG 2.1 is the broader Web Content Accessibility Guidelines standard, written for web content and routinely applied to other digital content, including PDFs. A document can conform to PDF/UA while still failing WCAG (for example, because of insufficient color contrast). Aim for both.
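One of PDF/UA's metadata requirements is that a conforming file declares itself in its XMP metadata through a pdfuaid:part property. The sketch below reads that property from an XMP packet supplied as a string. Extracting the packet from the PDF is assumed to happen elsewhere, the sample packet is made up, and declared_pdfua_part is a hypothetical name. It detects only the claim: a file can declare PDF/UA and still violate it.

```python
import xml.etree.ElementTree as ET
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFUAID_NS = "http://www.aiim.org/pdfua/ns/id/"


def declared_pdfua_part(xmp: str) -> Optional[str]:
    """Return the declared PDF/UA part (for example "1"), or None."""
    root = ET.fromstring(xmp)
    part_name = f"{{{PDFUAID_NS}}}part"
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows the property as an attribute or as a child element.
        value = description.get(part_name)
        if value is None:
            child = description.find(part_name)
            value = child.text if child is not None else None
        if value and value.strip():
            return value.strip()
    return None


SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
      <pdfuaid:part>1</pdfuaid:part>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(declared_pdfua_part(SAMPLE_XMP))
```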

Keeping Documents Accessible Through Your Workflow

Accessibility doesn't survive an uncontrolled PDF workflow. If a tagged document goes through a merge, compress, or split operation, the tags need to survive.

When you use PDF Merge to combine documents, each source PDF should already be tagged before merging. The logical structure trees from each input get combined in the output. Merging an untagged document into a tagged one will leave that section inaccessible.

When you compress a file with PDF Compress, verify that tags survive. Lossless optimization (removing redundant objects, compressing content streams) generally preserves tags. Aggressive optimization that strips structure data breaks accessibility. Always run the accessibility checker after compression if the output will be distributed publicly.
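One way to make this verification routine is to compare the accessibility-related catalog entries before and after each operation. This is a minimal sketch, assuming both document catalogs are available as plain mappings keyed by PDF names. The key list is illustrative rather than exhaustive, and dropped_keys is a hypothetical name.

```python
from typing import Any, List, Mapping

# Catalog entries that carry tags, language, bookmarks, and XMP metadata.
ACCESSIBILITY_KEYS = ("/StructTreeRoot", "/MarkInfo", "/Lang", "/Outlines", "/Metadata")


def dropped_keys(before: Mapping[str, Any], after: Mapping[str, Any]) -> List[str]:
    """List accessibility-related entries present before but missing after."""
    return [key for key in ACCESSIBILITY_KEYS if key in before and key not in after]


# Made-up catalogs: an optimizer that stripped structure data.
original = {"/StructTreeRoot": {}, "/MarkInfo": {"/Marked": True}, "/Lang": "en-US"}
compressed = {"/Lang": "en-US"}
print(dropped_keys(original, compressed))
```

An empty list is only a first signal. The structure tree can survive as an entry while losing elements, so the full accessibility check remains the real test.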

For more on how PDF internals work, including what gets stored in a file beyond what you see on screen, see How PDF Works. For the tradeoffs between formats, see PDF vs DOCX.

The Bottom Line

Accessible PDFs require intentional work, but it's largely front-loaded. Build accessibility into your source documents, export with the right settings, and verify with PAC. Retrofitting accessibility at scale is harder — but Acrobat Pro's autotag + manual correction workflow gets you most of the way there for important documents.

Screen reader users navigate by headings, jump between form fields, and rely on alt text to understand images. Getting those three things right handles the majority of real accessibility failures.