The Pros and Cons of PDF Files

By Andrew Downie

Paper presented at the Round Table Conference on Print Disability,
Sydney, 2004

Abstract
PDF (Portable Document Format) files were once completely inaccessible to people
using screen readers. Over the past several years, they have undergone several
steps along the evolutionary path towards accessibility. Under the best
circumstances, a PDF file can now be a very effective resource for people using
screen readers.

The presentation will begin by summarising developments which have made PDFs
more broadly accessible. These include improvements to software by such diverse
groups as Adobe, Microsoft and screen reader producers. The potential benefits of
reading PDF files via a screen reader will be discussed. These include reliable page
structure and access to document bookmarks and hypertext links. Limitations when
using current software will also be addressed. These include no access to font style
information and the potential for poorly constructed documents. Importantly, issues
arising from use of older (not very old) screen readers will be considered.

Background
PDF files are created, either directly or indirectly, by a number of Adobe products.
They can also be created by software from other companies, including directly from
the current Apple Macintosh operating system. The "portable" in the name refers to
their portability across a wide range of computer platforms. That is, a PDF file will
retain its layout, regardless of the computer used to display it. As discussed
later, not all PDFs are created equal. They can now be created in a manner
which will make them largely accessible to people using screen readers. In a worst
case, however, they can be created to produce a jumbled mess of characters.

Adobe Acrobat is the major piece of software associated with PDFs. With it, files can
be created, modified, annotated, bookmarked and even turned into a multimedia
extravaganza. This software is not free. PDFs can, however, be read with the free
Adobe Reader.

One issue surrounding PDFs causes considerable confusion in relation to
accessibility. This is the feature in Acrobat which allows paper documents to be
scanned and saved as image-only files. While representing a quick way of getting
paper-based material into an electronic format, an image-only PDF is not accessible
to screen readers. This type of file is not the subject of the following discussion until
specifically addressed in the Image-Only PDF Files section. As will be shown in that
discussion, even image-only PDFs, which were once the archetype of inaccessibility,
can, with the right tools, have a claim to salvation.

From Inaccessible to Largely Accessible
Less than ten years ago, PDF files were completely inaccessible to people using
screen readers. Under the right circumstances (discussed later), they are now
largely accessible and can offer some advantages over other formats. Let's briefly
trace the accessibility history of this file format. In doing so, I encourage those of you
who have had unpleasant PDF experiences to apply as open a mind as you can to
the subject.

The first significant step was when Adobe released a plugin for version 4 of Acrobat
Reader to be used by screen readers. Having mastered a less than mnemonic set of
commands, the user could navigate a document by page, bookmark or hypertext link.
Text attributes were not (and are still not) provided and layouts such as multiple
columns and tables caused real frustration.

With the release of Acrobat Reader Version 5, accessibility was integrated into the
product without the need for a plugin. This was subject to a couple of constraints.
Firstly, the user had to be using an MSAA (Microsoft Active Accessibility) enabled
screen reader. Secondly, the full version of Acrobat Reader had to be used (a smaller
version without accessibility support was also offered). In this release, Adobe adopted
coding developed by Microsoft to make products more accessible to screen readers.

Quite good access was available, provided the PDF file was "tagged". I'll say more
about this process later. Files which are not tagged can still cause major problems.
A properly constructed file, though, could be navigated in very much the same way
as a file in other Windows applications. An access barrier previously not
encountered now became an issue. One of the features of PDFs which is appealing
to some authors and publishers is a raft of security settings. Files protected against
copying were often also protected against reading by people using screen readers.
While facilities were built into Acrobat to overcome this, files prepared with older
versions remained blocked. Further, the default setting in Acrobat V5 was not to
allow access by screen readers. Version 5 also included some useful features for
people not relying on screen readers. Text size could be adjusted over a wide range
and colours could also be altered to meet individual preference.

The current product from Adobe for reading PDFs is Adobe Reader V6. Note the
change of name from Acrobat to Adobe Reader. The rationale for this seems to be
that this product now handles a variety of electronic books and not just normal PDF
files. A significant inclusion is the ability of the Reader to utilise synthetic speech to
read documents aloud. Given the limited options – by page or entire document – this
won't excite people using screen readers. It does, however, have potential for people
who have reading difficulties and who may want to augment text with speech output.

The other major fix is that files protected with older versions of Acrobat can now be
read with current screen readers. The default setting when applying security with
Acrobat V6 is, as it should have been originally, to allow screen reader access.

Work at Adobe has not been in isolation. Screen reader developers have also been
refining their products to provide better access to PDFs. Hypertext links and
bookmarks are now valuable features. Tables can also be read with the same
precision as those on web pages.

Current Limitations
Text attribute information is still not available. Identification of headings and
paragraphs is unreliable. To get a reasonable level of access one must be using an
up-to-date screen reader – certainly one released no more than a year ago and
preferably one of the very latest offerings. And then there's the vexed issue of
whether the PDF file has been properly constructed.

Producing Tagged PDF Files
In some ways accessibility to PDFs parallels access to web pages some years ago.
Over the past several years, improved tools and increased knowledge have allowed
developers of web pages to produce highly accessible sites. While it is still eminently
possible to produce very ugly (in several senses) pages, quite good access can now
generally be expected.

With the release of Acrobat V5, Adobe began producing extensive documentation on
how to create accessible PDF files. A great deal of information is available at
http://access.adobe.com, covering preparation from various sources. As mentioned
earlier, the crucial issue is to tag the file. Depending on how the PDF is produced,
this may be an automatic or manual process. The former is clearly the more
desirable option and, so far as I can tell, the only one for a person using a screen
reader. What follows is a very brief introduction to the relatively new (and widely
unknown) science of accessible PDF creation.

Accessible PDF Files From Word

A very easy way of producing a tagged PDF file is to create it from Microsoft Word
(you must be using Word 9 or later). When Acrobat (the one you pay for) is installed
onto a Windows-based computer with Word already installed, an extra item is added
to the top menu in Word. This allows the setting of Acrobat preferences and the
conversion of Word files to PDF files with ease. However, and it's a fairly big
"however", it's really important to construct the Word file correctly to get a well
constructed PDF file.

The key to this process is to use Word styles. For example, do not simply change
the font size and type for headings. Instead, select a heading style for each heading.
Similarly, if information is to be presented in tabular format, use a table rather than
simply tabbing between items. Space between paragraphs should be determined by
the paragraph style, not by repeated presses of the Enter key. If this all sounds a bit
tedious, be warned that it is becoming important not just for producing PDF files but
for preparation of material to be displayed in the increasingly popular XML format.

Creating Accessible PDF Files in Acrobat

Acrobat V5 and later versions offer facilities for tagging files. In Acrobat V6, this can be done
automatically. My so far limited experimentation suggests varying results, probably
subject to the manner in which the file was created. Acrobat also allows manual
tagging, including addition of alternative text for images. This process requires sight
and some knowledge for best results.

Both Acrobat and Adobe Reader offer an accessibility checker. From within Acrobat,
problems can be rectified. While a file cannot be altered in Adobe Reader, it offers a
feature which can help with poorly constructed files. The "reading order" can be
adjusted for screen reader output and this can sometimes turn garbled text into an
accurate representation of the author's intentions.
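
For readers who want a quick look at a file before opening it, the following is a minimal sketch of a tagging check, not a substitute for the accessibility checker. It assumes that a tagged PDF's document catalog carries a /StructTreeRoot entry and a /MarkInfo dictionary containing "/Marked true", and it simply searches the raw bytes for both. The function names are my own, for illustration only. Newer files may store the catalog in compressed form, so a negative answer here is inconclusive.

```python
import re


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Return True if the raw bytes suggest a tagged PDF.

    Heuristic only: looks for a structure tree reference and a
    "/Marked true" entry without actually parsing the file.
    """
    has_structure_tree = b"/StructTreeRoot" in pdf_bytes
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_structure_tree and is_marked


def file_looks_tagged(path: str) -> bool:
    """Hypothetical convenience wrapper: apply the check to a file on disk."""
    with open(path, "rb") as handle:
        return looks_tagged(handle.read())
```

A file that passes this check may still be poorly tagged; it only shows that some tagging was attempted.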

Image-Only PDF Files
As mentioned earlier, these files are created by scanning paper documents into
Acrobat. No optical character recognition (OCR) takes place, each scanned page
simply appearing in the file as an image of the original. As screen readers cannot
read pictures, this type of presentation is completely inaccessible.

But now to the good news. Subject to the quality of the original paper document,
it is now easy to convert an image-only file into one containing quite readable text.
Some, and only some, of the available options are mentioned here.

Acrobat V6 contains a Paper Capture facility. Having loaded an image-only PDF file
into Acrobat, simply run Paper Capture to produce text.

Commercial OCR software will now also convert image-only files to text. Two which
do this very well are OmniPage and FineReader.

As already cautioned, the quality of print on the original paper document will affect
the OCR process. A smudged, faded document will look smudged and faded when
viewed as an image-only file and there is a high likelihood of poor results when
applying OCR software.
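
Before reaching for OCR, it can help to know whether a file is likely to be image-only in the first place. The sketch below is a rough guess rather than a reliable test. It assumes that scanned pages appear in the file as image objects ("/Subtype /Image") and that a file containing real text will mention at least one font ("/Font"). The function name is hypothetical. Files which store their objects in compressed form can defeat this check, so treat the answer as a hint only.

```python
import re


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Guess whether a PDF contains scanned page images and no text.

    Heuristic only: True when the raw bytes mention at least one image
    object and no font at all.
    """
    has_images = re.search(rb"/Subtype\s*/Image\b", pdf_bytes) is not None
    has_fonts = b"/Font" in pdf_bytes
    return has_images and not has_fonts
```

A file that has already been through Paper Capture or other OCR software will normally contain fonts, so it should not be flagged.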

Conclusion
Accessibility of PDF files has improved markedly over the past decade or so. In the
best circumstances, these files can be a valuable resource to people using screen
readers and other adaptive equipment. Navigation by hypertext links, bookmarks
and specific page references can allow quick and easy access to material without
risk of altering the original file. The improvements are due to efforts on the part of
large companies such as Adobe and Microsoft and to very small ones which produce
screen readers.

On the other hand, partly due to the need for further work to yield more information to
readers using synthetic speech or Braille output and partly because much more
education of people producing PDF files is needed, there are still shortcomings.
Apart from anything else, people using screen readers need a current, or nearly
current, version to get best results.

It is important that those who enjoy using PDFs and those who endure the
experience continue to provide feedback to software developers. It is also at least as
important to increase substantially the level of education regarding the need to
construct PDF files correctly. If a far greater proportion of files are produced in
accordance with Adobe's recommendations, PDF will not be so widely viewed among
people using screen readers as a nasty three-letter word.