In this age of digital by default it is important that all digital content is accessible. This will include web sites and web pages but also video, audio and documents. This article will investigate the needs, challenges and issues around the creation and consumption of accessible documents.
For this article a document is a collection of words and images that can be printed as a whole. The article does not cover interactive books that require the reader to be able to access them electronically.
These documents will include: letters, memos, minutes, reports, user guides, brochures, pamphlets, transcripts of speeches, magazines, novels, etc. They will be held in one or more digital formats.
There is a potential tension between the requirements of the creator of a document, the distributor and the user:
- The creator of the document will wish to use tools and technologies that they are familiar with.
- The distributor of the document will wish to minimise the number of document formats used for distribution. Multiple versions cost money, cause management issues and increase the risk of different users seeing different content.
- The different users will wish to consume the document in different ways (the word 'consume' is used here rather than 'read' because 'read' implies reading words on paper or a screen, whereas the user may have the document read to them, or turned into braille or sign language, or other formats).
The end user must be considered the most important of these roles; if they cannot consume the document then there is no point in creating or distributing it.
This article looks at the requirements of these different players and reviews the alternative technologies available.
It summarises the pros and cons of various solutions and makes tentative suggestions for an optimum solution. It is hoped that this will help organisations that are going digital by default to decide how to distribute accessible documents; it is also hoped that it will show the weaknesses in current technology so that vendors can improve their products.
The document looks first at the end user, then at the distribution process and then at the creation process. It then looks at the various technologies for creating, distributing and consuming the document and concludes with some tentative best practice.
The end user experience
To understand how these documents must be created, stored and distributed we must first understand how different end-users will consume them.
However the user consumes the document, they need to be able to access more than just the text and the images (or descriptions of the images). They need to be able to:
- understand the structure of the document, including sections and sub-sections, lists, tables, notes, quotations, citations, indexes, etc.
- navigate to relevant parts easily.
- annotate the document.
- copy and extract information.
They will not expect to be able to modify the original document without the express authorisation of the owner.
Types of consumer
People with different disabilities will wish to consume the documents in different ways. The following section outlines the different disabilities and methods of consumption:
- Non-disabled: a person with no relevant disabilities will want to be able to read the document electronically on some type of screen. The document should be laid out so that its structure is visually apparent by the use of different types and sizes of fonts, use of bullets, indentation, and tables. The reader software should enable the user to navigate the document by table of contents, indexes, bookmarks and searches.
This electronic version of the document should be considered the base version: any other version should contain the same information.
Besides the electronic version, non-disabled people may wish to have a printed version of the document. It must be possible to print all or parts of the document so that the printed version is an accurate reflection of the electronic version.
- People who are partially sighted should be able to modify how the document is displayed: size of text, type of font, background-foreground colours, line separation, justification, etc. to enable them to see the content as clearly as possible. The electronic document should interface well with screen-magnifiers.
- People who are blind should be able to access the document using a screen-reader. The screen-reader should convey the structure of the document by announcing headings, lists, tables and other structural elements, and assist the user's navigation by providing functions such as jump to the next heading, to the end of a list, or to the next chapter.
- People who are vision impaired and use braille should be able to access the document and have it presented on a braille display including the structure and the ability to navigate.
It should also be possible to create printed braille from the base document.
- People with dyslexia can improve their reading experience by using suitable background-foreground colour and brightness combinations and left-justified text. Having text read aloud and highlighted at the same time can also improve the experience.
- People with hearing impairments have different capabilities of reading written text. If their reading level is good then the base document should be accessible. There is a great deal of pressure from the deaf community for films and TV to be captioned but there is much less pressure for them to be signed; the main area of signing is for news and current affairs where live captioning is inadequate. Signed versions of a document should probably be limited to general introductions to an organisation and documents specifically aimed at the deaf community.
- People who do not understand written English may need an introductory document which explains what the organisation does and how to get help in understanding the other documents.
- People with learning disabilities may not be able to fully understand the base document. Firstly the base document should be reviewed to see if it can be altered so that it is understandable by people with a wider range of cognitive abilities, without it becoming patronising for the majority of users. If this cannot be done then a version may need to be created that is easier to understand without losing any of the meaning. This format is often known as 'Easy Read'; it concentrates on simple language and the use of images and videos to match the words.
- People who cannot use keyboards and/or pointing devices should be able to access and navigate the base document using assistive technologies such as switches or voice commands.
- People with severe dementia and similar problems cannot understand or make decisions independently. In these cases the document only has to be accessible to their carer. An extreme case is a person in a coma.
Formats required by the consumer
Supporting all these different user types ideally requires the following end-user formats (the requirements for readers of these formats are discussed below):
- Base document, which includes text and images. The format should support:
- Changes to fonts, colours, justification etc
- Easy Read documents are needed where the base document is difficult to understand by some users.
- Sign language: the base document can be converted into sign language by videoing a signer reading the document (there is research into automatic generation of sign language but it is not considered advanced enough to replace human signers). At present there is no easy way to support navigation of signed videos.
- Audio: this can be produced either by using text-to-speech software or by recording a human reading the text. At present there is no easy way to support navigation of audio versions of documents; however, if the voice is synchronised with a text version then navigating the text version will provide navigation of the audio.
Possible Distribution Formats
The question is which format(s) should the content be distributed in? The following are some options with pros and cons.
Word processor format
The documents will often be created using a word processor (Microsoft Office (.docx), Open Office (.odt), Apple iWork (.pages)). If a document is going to be distributed in this form it needs to be in a format that can be read by all systems: this means .doc or possibly .docx. There are two problems with distributing in this format:
- The formatting of these documents by different word processors is not always identical and in a few situations does not work at all. This can be a particular problem with mobile devices that have limited support for these formats.
- The content is not intended to be edited or changed by the recipient but the program used to access it is designed to do just that. The recipient should be able to annotate and comment but not to change the original.
For these reasons it is not really a suitable format for distributing the base document. However it is a very common format for creating base documents and therefore there should be methods for converting them into formats suitable for distribution.
PDF is designed to be a final document format. The common tools used to access it, such as Adobe Reader and Apple Preview, do not support change but do provide annotation functions.
PDF used not to work well with screen readers because the format did not include any document structure information; with the publishing of the PDF/UA standard this is no longer the case.
PDF readers are available on all relevant platforms and are installed on most PCs. PDF is therefore a popular format for distribution of finalised documents.
PDF/UA has not been designed to facilitate conversion to other formats; it is possible but not easy.
PDF documents are designed to ensure the page layout is preserved. This is important if the page layout is critical to the design of the document, or if the layout has a legal significance.
The ePub format is growing in importance and is especially popular on mobile devices.
The format does not define the page layout but just the document structure. This means that the document can be rendered differently to suit the display device and user preferences. It is also suitable for converting into other formats including Braille.
It has the functionality to support screen readers as the document structure is defined as part of the format. The common reader tools that are used to access the content enable users to annotate but not to change the original.
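As a rough illustration of why structure defined in the format matters, the following Python sketch pulls a heading outline out of a small XHTML fragment of the kind an ePub package contains. The sample markup and the names HeadingOutline and extract_outline are invented for this example; real reader software and screen readers do considerably more than this.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1..h6 elements in document order."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently being read, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            # Collapse runs of whitespace so the outline entry is one clean line.
            text = " ".join("".join(self._parts).split())
            self.outline.append((self._level, text))
            self._level = None


def extract_outline(xhtml):
    """Return the heading outline of an XHTML string (hypothetical helper)."""
    parser = HeadingOutline()
    parser.feed(xhtml)
    parser.close()
    return parser.outline


# Invented sample content, standing in for one chapter file of an ePub.
SAMPLE = """<html><body>
<h1>Annual report</h1>
<p>Introduction text.</p>
<h2>Finance</h2>
<p>Figures.</p>
<h2>Staff</h2>
</body></html>"""

if __name__ == "__main__":
    for level, text in extract_outline(SAMPLE):
        print("  " * (level - 1) + text)
```

An outline of this kind is what allows a reader to offer functions such as jumping to the next heading, however the text itself is presented.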
The latest version of the ePub standard (ePub 3) includes functions for synchronising audio with the text.
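The sketch below gives an idea of how such synchronisation can work. It assumes the pairing of text and audio is stored as a SMIL file in which each par element links a fragment of the text to a clip of an audio file; the sample content, file names and function names are invented for illustration, and only simple clock values such as "3.5s" are handled.

```python
import xml.etree.ElementTree as ET

SMIL_NS = {"smil": "http://www.w3.org/ns/SMIL"}

# Invented sample: two sentences of a chapter, each tied to a clip of one audio file.
SAMPLE_OVERLAY = """<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">
  <body>
    <par id="par1">
      <text src="chapter1.xhtml#sentence1"/>
      <audio src="chapter1.mp3" clipBegin="0s" clipEnd="3.5s"/>
    </par>
    <par id="par2">
      <text src="chapter1.xhtml#sentence2"/>
      <audio src="chapter1.mp3" clipBegin="3.5s" clipEnd="7.25s"/>
    </par>
  </body>
</smil>"""


def to_seconds(clock):
    """Convert a simple clock value such as '3.5s' to seconds (other forms not handled)."""
    return float(clock.rstrip("s"))


def load_sync_points(smil_text):
    """Return a list of (text_src, audio_src, begin, end) tuples in document order."""
    root = ET.fromstring(smil_text)
    points = []
    for par in root.iter("{http://www.w3.org/ns/SMIL}par"):
        text = par.find("smil:text", SMIL_NS)
        audio = par.find("smil:audio", SMIL_NS)
        if text is None or audio is None:
            continue  # skip incomplete pairs rather than guess
        points.append((
            text.get("src"),
            audio.get("src"),
            to_seconds(audio.get("clipBegin")),
            to_seconds(audio.get("clipEnd")),
        ))
    return points


def audio_start_for(points, text_src):
    """Find where the audio should start when the user navigates to text_src."""
    for src, audio_src, begin, _end in points:
        if src == text_src:
            return audio_src, begin
    return None


if __name__ == "__main__":
    points = load_sync_points(SAMPLE_OVERLAY)
    print(audio_start_for(points, "chapter1.xhtml#sentence2"))
```

Because each audio clip is tied to a text fragment, navigating the text (for example by heading) also gives a position in the audio, which is the navigation that a plain audio file lacks.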
The present issue is that not everyone has ePub readers installed on their device. Also not everyone has an ePub creator tool.
DAISY is a format that has been developed to support people with vision impairments. It requires a special reader and special development tools. It would appear that the benefits of DAISY are being built into ePub 3 technology. Therefore it is unlikely that DAISY will become a general document distribution format.
MP3 is the common format for audio. The problem is that it does not include any facility for defining structure, for navigation or for annotation.
MP3 versions of the base document may work for short documents or for documents that are designed to be read linearly such as novels. On its own it is not a suitable format for documents such as reports, manuals or magazines.
MP4 (or MOV) is the standard file format for videos. It is the format that will be used for sign language. The problem, as with MP3, is that it does not include any facility for defining structure, for navigation or for annotation.
A suggestion is that a video file is created which includes the signed version of the text, an audio track with the spoken words and a closed captions track with the written text. This way there is one file that can support users with different disabilities.
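As a minimal sketch of the captions part of this suggestion, the Python below turns timed pieces of text into a caption file. It assumes WebVTT as the caption format, which is one possible choice and is not prescribed by this article; the cue timings and function names are invented, and how the caption file is attached to the video depends on the video tooling used.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp, hh:mm:ss.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented timings; in practice these would come from the recording session.
    cues = [
        (0, 4, "In this age of digital by default"),
        (4, 8.5, "it is important that all digital content is accessible."),
    ]
    print(build_webvtt(cues))
```

The same timed text could in principle also provide a rough form of navigation within the video, since each cue records where its words occur.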
Recommended Distribution Formats
Based on the discussion above it would seem that all users can be accommodated by providing two formats: ePub 3 and Video. ePub has been recommended over PDF/UA because it is designed to support conversion and because of its widespread support on mobile devices.
The base document should be distributed using EPUB 3 format. Given a suitable reader (see discussion below) this format can be used by people with most of the disabilities described above; the one major exception is people who are dependent on sign language for communications.
The format can be converted relatively easily into other formats. This means that users who require another format for technological, preferential or legal reasons can convert the document or have it converted for them.
Sign language cannot be adequately created from an EPUB 3 document. The only solution for this requirement is to create a video of a signer reading the document. If this includes the sound track of the document being read then the video provides a single source that supports multiple users.
It is not recommended that a video is made of every document; rather, a decision should be made for each new document as to whether it is beneficial to make the video up-front or whether it should only be created on request.
The three formats (EPUB, PDF and Video) have different reader technologies.
There are many different EPUB readers on the market. They all support the EPUB format but vary in details such as which platforms they run on, the design of the user interface and the options available for the user to change the look and feel. This means that it is not possible for the distributor of the document to recommend and link to a single reader (this compares to PDF readers where, although there are multiple readers on the market, Adobe Reader can be recommended for all users).
This means that the user has to decide which reader is most suitable for them. Some questions that the user will need to consider are:
- Does the document have to be loaded into the reader library before being consumed or can it be opened from a standard file directory?
- Does the reader interface effectively with the assistive technology they use?
- Does the reader provide sufficient customisation options to give the user an optimal experience? Options include font style and size, background-foreground colours, justification and hyphenation.
- Can the user set up themes so that they can use different sets of options for different types of documents?
- Is the customisation interface easy enough to use? Some options should be very easy to change, but ideally there should also be a more sophisticated way of changing them, e.g. a standard set of three background-foreground colour combinations to choose from, with the ability to use CSS to define any other combination.
- Is there an in-built text-to-speech facility?
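To make the idea of themes and CSS-based customisation more concrete, the following Python sketch stores a set of options per document type and renders it as a CSS rule. The theme names, colour values and the function theme_to_css are all invented for illustration; whether and how a particular reader accepts a user stylesheet varies from product to product.

```python
def theme_to_css(theme):
    """Render one reader theme (a plain dict of options) as a CSS rule for the body."""
    return (
        "body {\n"
        f"  color: {theme['foreground']};\n"
        f"  background-color: {theme['background']};\n"
        f"  font-family: {theme['font_family']};\n"
        f"  font-size: {theme['font_size_percent']}%;\n"
        f"  line-height: {theme['line_height']};\n"
        f"  text-align: {theme['text_align']};\n"
        f"  hyphens: {theme['hyphens']};\n"
        "}\n"
    )


# Hypothetical themes: one set of options per type of document.
THEMES = {
    "reports": {
        "foreground": "#1a1a1a",
        "background": "#fdf6e3",
        "font_family": "Verdana, sans-serif",
        "font_size_percent": 150,
        "line_height": 1.8,
        "text_align": "left",
        "hyphens": "none",
    },
    "novels": {
        "foreground": "#f5f5f5",
        "background": "#000000",
        "font_family": "Georgia, serif",
        "font_size_percent": 120,
        "line_height": 1.5,
        "text_align": "left",
        "hyphens": "none",
    },
}

if __name__ == "__main__":
    print(theme_to_css(THEMES["reports"]))
```

A simple interface could offer a few such themes as one-click choices, while more experienced users edit the generated CSS directly.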
There are several PDF readers on the market; not all of them take advantage of the PDF/UA tagging.
Adobe Reader is the leading reader and is available for all major platforms. Not all of the assistive technologies available understand or take advantage of PDF/UA, especially in the mobile environment.
Video players are available on all major platforms. The problems with video players are that they do not provide functions for: defining structure, navigation, searching, annotation, copying or extraction.
Creation and Conversion tools
There are various EPUB creation tools: there are desktop publishing systems that can be used to generate EPUB documents and there are tools that convert from word processors (Microsoft Office or Open Office) to EPUB.
Assuming many of the documents will be written using a word processor this section concentrates on products that convert the source to EPUB.
Calibre is one tool that will convert from .docx and .odt to .epub and the latest version supports more styles and formats than before. The problem is that there is a lack of documentation as to what can be converted and how it is converted. This information is needed as the ideal is to create the document in the word processor and then automatically generate the .epub without any manual intervention.
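A conversion step with no manual intervention might be scripted along the following lines. This is a sketch that assumes Calibre is installed and that its ebook-convert command-line tool is available on the PATH; it uses only the basic form of the command (input file, then output file), and the function names and file paths are invented for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_dir):
    """Build the ebook-convert command for one word processor file.

    The output file takes the name of the source with an .epub extension.
    """
    source = Path(source)
    target = Path(target_dir) / (source.stem + ".epub")
    return ["ebook-convert", str(source), str(target)]


def convert_to_epub(source, target_dir):
    """Run the conversion and return the path of the generated EPUB file."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert (part of Calibre) was not found on PATH")
    command = build_convert_command(source, target_dir)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(command[2])


if __name__ == "__main__":
    # Hypothetical paths; shown as a dry run so nothing is converted by accident.
    print(" ".join(build_convert_command("reports/annual.docx", "published")))
```

Because the results depend on how styles in the source document are mapped, the output of such a script would still need to be checked against the original until the conversion rules are better documented.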
Calibre and other tools can read EPUB and convert it into other formats.
There are several tools for converting .doc, .docx and .odt files into PDF/UA. These include Adobe Acrobat, Microsoft Office and Open Office so the process is well supported by the leading players.
There are products that attempt to convert from PDF to other formats but they tend not to use the PDF/UA tagging so the output often loses much of the structure of the original.
To provide accessible documents to the widest possible set of users an organisation should distribute the documents in accessible EPUB format with some also available as videos with the text read out and signed.
To ensure this is practical there needs to be more research so that recommendations can be made about:
- The best readers for different users.
- The creation of word processor documents that can be automatically converted into EPUB documents.
This recommendation is intended to provide the best long term solution to accessible documents. It should be the solution promoted by the accessibility community. However, the creation and reader technologies for EPUB are at present (January 2014) somewhat immature and lacking a complete set of easily implemented functions. There is a need to persuade the providers of EPUB technology to improve the quality and function of their products.
Therefore, for a distributor of accessible documents who requires an immediately available, low-risk solution, PDF/UA could be the preferred choice.