30 years of change, 30 years of PDF

The last 30 years have revealed some persistent truths about how people use and think about information and communication.


About the author: As CEO of the PDF Association and as an ISO Project Leader, Duff coordinates industry activities, represents industry stakeholders in a variety of settings and promotes the advancement and adoption of …

Duff Johnson
June 15, 2023

We live in a world where the only constant is accelerating change. The twists and turns in the technology landscape over the last 30 years have drained some of the hype from the early days of the consumer digital era. Today we are confronted with all-new, even more disruptive, possibilities.

Along with the drama of the internet, the web, broadband, smartphones, mobile broadband, social media, and AI, the last thirty years have revealed some persistent truths about how people use and think about information and communication.

From the vantage point of 2023 we are positioned to recognize 1993 as a year of two key developments: the first specification of HTML, the language of the web, and the first specification of PDF, the language of documents. Today, both technologies predominate in their respective use cases. They coexist because they meet deeply related but distinct needs.

Thirty years after it was popularized, the World Wide Web has revolutionized the human experience of information. Meanwhile, PDF has replaced paper documents with digital analogues. Some expected the web to replace digital documents as well, but all indications are that PDF continues to grow. Many websites are – let’s face it – mostly navigation to help visitors find a specific PDF. Maybe that’s why, after HTML, PDF is the second most common format on the internet. A recent survey by Bitkom Research in Germany showed that “PDF…has become indispensable for the vast majority of companies”.

The history of technological and societal change indicates that documents are key to a vast array of human activities. Although communications technology has progressed over millennia from stone tablets to IT systems, documents remain an essential means of information exchange.

Thirty years of digital revolution have simply expanded the definition of “document” to include “PDF”.

What happened in 1993?

Although the internet itself came into existence over the course of the 1980s, by early 1993 end-user access to internet resources principally consisted of text-only discussion groups and proto-web protocols such as Gopher, all accessed over dial-up modems, as the Computer History Museum recounts.

Leveraging HTML, 1993’s Mosaic and its successors spawned a new era of experience in which required resources – text, images, layout, styling, fonts, scripts, etc. – were assembled on request, anywhere, on any connected device, with the viewing software controlling the experience.

Leveraging PDF (together with Adobe’s decision to make the PDF Reference freely available), 1993’s Adobe Acrobat and Reader were soon joined in the marketplace by many other commercial and open source applications. As with HTML, any developer could read or write PDF from the beginning; unlike HTML, the file controlled the experience, not the viewing software.
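To make "any developer could read or write PDF" concrete, here is a minimal sketch in Python that assembles a one-page PDF by hand, using only the basic file structure: a header, a handful of numbered objects, a cross-reference table and a trailer. The function name `build_minimal_pdf` is hypothetical, the sketch assumes ASCII-only text, and it relies on the standard Helvetica font rather than embedding one. It is an illustration under those assumptions, not how Acrobat or any other product actually writes files.

```python
def build_minimal_pdf(text: str) -> bytes:
    """Assemble a one-page PDF: header, objects, xref table, trailer."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # Page content: select font F1 at 24 pt, move to (72, 720), show the text.
    content = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("ascii")

    # Objects 1..5: catalog, page tree, page, font, content stream.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one fixed-width 20-byte entry per object.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    # Trailer names the root object and points back at the xref table.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Writing the result to disk, for example with `open("hello.pdf", "wb").write(build_minimal_pdf("Hello, PDF"))`, should produce a file that ordinary PDF viewers can open. Note that the page size, font, and text position all live in the file itself, which is the sense in which the file, not the viewing software, controls the experience.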

As the forthcoming PDF Association documentary will make clear, an ecosystem emerged to support the billions of users worldwide who need to create, share, view, edit or store digital documents. Today, PDF is an open ISO standard, trusted worldwide for reliable digital document presentation and maintained by members of the PDF Association, a worldwide vendor-neutral technology ecosystem.

Documents, then… and documents now

PDF allows people to use digital documents as they once used paper documents, but with digital, scalable tools. Accordingly, even though you can always print a PDF when necessary, PDF files are a far more capable medium for documents than paper. PDF delivers far beyond paper’s static capabilities, with support for integrated 3D diagrams, movies, object-level metadata, file attachments, MathML, encryption, and much, much more.

Like HTML, PDF facilitates the user’s choice of device and operating system. Unlike HTML, PDF does not assume that remote servers or content are available.

It’s hard to overstate the impact of PDF technology – launched 30 years ago today – on communications and commerce worldwide. So what do people do with documents then… and now?

| Content and uses | 1993 | 2023 |
| --- | --- | --- |
| Text content | Black-and-white text predominates, with limited fonts. | Color, transparency, special effects, emoji, global languages, advanced typography, math … |
| Images | Low resolution, limited colors, slow | High resolution, wide gamut, HDR, advanced compression, rich metadata |
| Viewing | Paper | Viewer software on the user’s choice of device. Responsive layouts. Paper is optional |
| Sharing | Postal and delivery services, device and vendor-dependent electronic documents | Email, cloud services, access from anywhere on any device, common reliable appearance |
| Signing | Wet ink signatures | Digital and electronic signatures with digital certificates, long-term validation, revocation, non-repudiation, tamper detection, etc. Database entries. |
| Assembling | In offices, shuffling pages, staplers, paperclips. For commercial printers, lots of manual handling | On-screen ad hoc (manual) and automated solutions, multi-page imposition (N-up), automated finishing, stapling and binding, mixed media sizes and shapes |
| Classifying | Folders and labels | Metadata, provenance information, content-based and automated classification |
| Securing | Distribution control, locked filing cabinet | Passwords, public-key cryptography, dynamic access controls, tamper detection, digital and timestamp signatures, digital rights management (DRM) |
| Filling forms | Pen (or pencil) and paper | Typing into an HTML or PDF fillable form, validation on data entry, context-sensitive validation, auto-complete, auto-fill |
| Redacting | Grease pen, photocopier | Redaction tools and workflows |
| Searching | Card catalogs, siloed databases | Rapid search engines and AI, supplemental data (e.g. file attachments and MathML representations) |
| Extracting content | OCR, copy-paste | Content recognition, data mining, “big data”, AI |
| Accessibility | Braille, magnifiers, analog audio tape | Assistive Technology (AT), logical structure, semantics, alternative text for images, MathML, navigation aids |
| Comment / review | Pen-on-paper markup, manually reconcile comments, edit original, share again, repeat | Multi-user live commenting, rich markup, video-conferencing, cloud-based exchange |
| Rich media | Not possible / impractical | Sound, movies, animations, rich media, 3D, cross-platform, links to web content… |
| Paper capture | Low resolution scan to bitmap | High resolution scan to rich document, OCR, HCR, natural language recognition |
| Archiving | Boxes of paper and magnetic tape | Cloud storage, databases, instant search and retrieval |

What remains? Integration

In 2023, 30 years after the specification of HTML and PDF, URLs are themselves often used as proxies for documents. Unfortunately, websites (and web pages) come and go; URLs can change or disappear completely, and are beyond the control of end users, as Twitter users know well. After all, URLs are merely references to content. That’s why so many users still resort to screenshots or the Wayback Machine as their go-to means of “documenting” content previously found on web pages.

In 2023, however, everyone should have learned (if they didn’t know it already) that by themselves, pixels preserve only so much, and prove nothing.

PDF can help with that. PDF is all about capturing information for the purpose of ensuring that it persists as a self-contained and reliable object, to serve “documentary” and communication needs.

Today, the evolving industry landscape, especially the introduction of AI, makes provenance an ever more pressing issue. Establishing provenance information for PDFs as a matter of routine is part of the impetus behind initiatives such as C2PA. Although PDF already provides some mechanisms for provenance, the AI-powered future will amplify demand for provenance information that is more transparent and available in a standardized manner. We are confident that the PDF specification and its implementations will continue to adapt to meet users’ needs. PDF will remain as reliable as ever: a necessary capability and thus, a friend.

We cannot say what information technology will look like in 2053, thirty years from now. We can reasonably predict, however, that a persistent and reliable portable document format will be necessary for as long as humans need to communicate information.

If PDF didn’t exist we’d have to invent it.
