30 years of change, 30 years of PDF
The last 30 years have revealed some persistent truths about how people use and think about information and communication.
June 15, 2023
We live in a world where the only constant is accelerating change. The twists and turns in the technology landscape over the last 30 years have drained some of the hype from the early days of the consumer digital era. Today we are confronted with all-new, even more disruptive possibilities.
Along with the drama of the internet, the web, broadband, smartphones, mobile broadband, social media, and AI, the last thirty years have revealed some persistent truths about how people use and think about information and communication.
From the vantage point of 2023 we are positioned to recognize 1993 as a year of two key developments: the first specification of HTML, the language of the web, and the first specification of PDF, the language of documents. Today, both technologies predominate in their respective use cases. They coexist because they meet deeply related but distinct needs.
Thirty years after it was popularized, the World Wide Web has revolutionized the human experience of information. Meanwhile, PDF has replaced paper documents with digital analogues. Some expected the web to replace digital documents as well, but all indications are that PDF continues to grow. Many websites are – let’s face it – mostly navigation to help visitors find a specific PDF. Maybe that’s why, after HTML, PDF is the second most common format on the internet. A recent survey by Bitkom Research in Germany showed that “PDF…has become indispensable for the vast majority of companies”.
The history of technological and societal change indicates that documents are key to a vast array of human activities. Although communications technology has progressed over millennia from stone tablets to IT systems, documents remain an essential means of information exchange.
30 years of digital revolution has simply expanded the definition of “document” to include “PDF”.
What happened in 1993?
Although the internet itself came into existence over the course of the 1980s, by early 1993, end-user access to internet resources principally consisted of text-only discussion groups and proto-web protocols such as Gopher, all accessed over dial-up modems, as the Computer History Museum recounts.
Leveraging HTML, 1993’s Mosaic and its successors spawned a new era of experience in which required resources – text, images, layout, styling, fonts, scripts, etc. – were assembled on request, anywhere, on any connected device, with the viewing software controlling the experience.
Leveraging PDF (together with Adobe’s decision to make the PDF Reference freely available), 1993’s Adobe Acrobat and Reader were soon joined in the marketplace by many other commercial and open source applications. As with HTML, any developer could read or write PDF from the beginning; unlike HTML, the file controlled the experience, not the viewing software.
As the forthcoming PDF Association documentary will make clear, an ecosystem emerged to support the billions of users worldwide who need to create, share, view, edit or store digital documents. Today, PDF is an open ISO standard, trusted worldwide for reliable digital document presentation and maintained by members of the PDF Association, a worldwide vendor-neutral technology ecosystem.
Documents, then… and documents now
PDF allows people to use digital documents as they once used paper documents, but with digital, scalable tools. Accordingly, even though you can always print a PDF when necessary, PDF files are a far more capable medium for documents than paper. PDF delivers far beyond paper’s static capabilities, with support for integrated 3D diagrams, movies, object-level metadata, file attachments, MathML, encryption, and much, much more.
Like HTML, PDF facilitates the user’s choice of device and operating system. Unlike HTML, PDF does not assume that remote servers or content are available.
It’s hard to overstate the impact of PDF technology – launched 30 years ago today – on communications and commerce worldwide. So what did people do with documents then… and what do they do now?
Content and uses | 1993 | 2023
---|---|---
Text content | Black-and-white text predominates, with limited fonts. | Color, transparency, special effects, emoji, global languages, advanced typography, math … |
Images | Low resolution, limited colors, slow | High resolution, wide gamut, HDR, advanced compression, rich metadata |
Viewing | Paper | Viewer software on the user’s choice of device. Responsive layouts. Paper is optional |
Sharing | Postal and delivery services, device and vendor-dependent electronic documents | Email, cloud services, access from anywhere on any device, common reliable appearance |
Signing | Wet ink signatures | Digital and electronic signatures with digital certificates, long-term validation, revocation, non-repudiation, tamper detection, etc. Database entries. |
Assembling | In offices, shuffling pages, staplers, paperclips. For commercial printers, lots of manual handling | On-screen ad hoc (manual) and automated solutions, multi-page imposition (N-up), automated finishing, stapling and binding, mixed media sizes and shapes |
Classifying | Folders and labels | Metadata, provenance information, content-based and automated classification |
Securing | Distribution control, locked filing cabinet | Passwords, public-key cryptography, dynamic access controls, tamper detection, digital and timestamp signatures, digital rights management (DRM) |
Filling forms | Pen (or pencil) and paper | Typing into an HTML or PDF fillable form, validation on data entry, context-sensitive validation, auto-complete, auto-fill |
Redacting | Grease pen, photocopier | Redaction tools and workflows |
Searching | Card catalogs, siloed databases | Rapid search engines and AI, supplemental data (e.g. file attachments and MathML representations) |
Extracting content | OCR, copy-paste | Content recognition, data mining, “big data”, AI |
Accessibility | Braille, magnifiers, analog audio tape | Assistive Technology (AT), logical structure, semantics, alternative text for images, MathML, navigation aids |
Comment / review | Pen-on-paper markup, manually reconcile comments, edit original, share again, repeat | Multi-user live commenting, rich markup, video-conferencing, cloud-based exchange |
Rich media | Not possible / impractical | Sound, movies, animations, rich media, 3D, cross-platform, links to web content… |
Paper capture | Low resolution scan to bitmap | High resolution scan to rich document, OCR, HCR, natural language recognition |
Archiving | Boxes of paper and magnetic tape | Cloud storage, databases, instant search and retrieval |
What remains? Integration
In 2023, 30 years after the specification of HTML and PDF, URLs are themselves often used as proxies for documents. Unfortunately, websites (and web pages) come and go; URLs can change or disappear completely, and are beyond the control of end users, as Twitter users know well. After all, URLs are merely references to content. That’s why so many users still resort to screenshots or the Wayback Machine as their go-to means of “documenting” content previously found on web pages.
In 2023, however, everyone should have learned (if they didn’t know it already) that by themselves, pixels preserve only so much, and prove nothing.
PDF can help with that. PDF is all about capturing information for the purpose of ensuring that it persists as a self-contained and reliable object, to serve “documentary” and communication needs.
Today, the evolving industry landscape, especially the introduction of AI, makes provenance an ever-pressing issue. Establishing provenance information for PDFs as a matter of routine is part of the impetus behind initiatives such as C2PA. Although PDF already provides some mechanisms for provenance, the AI-powered future will amplify demand for provenance information that is more transparent and available in a standardized manner. We are confident that the PDF specification and its implementations will continue to adapt to meet users’ needs. PDF will remain as reliable as ever: a necessary capability and thus a friend.
We cannot say what information technology will look like in 2053, thirty years from now. We can reasonably predict, however, that a persistent and reliable portable document format will be necessary for as long as humans need to communicate information.
If PDF didn’t exist we’d have to invent it.