One of the greatest risks a business can face is the inadvertent disclosure of privileged or confidential information. This can result in huge fines and loss of professional reputation. Permanently removing content from a document to ensure confidentiality is a common practice that is all too frequently attempted with tools intended for other types of annotation.

A decade ago, the term 'redacted' would not have been widely used or understood unless you worked in the legal profession or had been involved in a court case. Google searches for 'redacted' and similar terms have risen sharply since 2005. That trend is likely to continue, given the growing number of reported data breaches caused by improper or poor redaction technique and the growing body of government legislation aimed at protecting people's personal information.

The courts are also mandating the redaction of personal information in court filings. Attorneys now have an obligation to redact many sorts of confidential information before submitting documents to courts, including social security numbers, financial account numbers, names of minors, dates of birth, home addresses and other sensitive information.

Over the past 10 years, we have seen some spectacular redaction fiascos exposing national security secrets, business deals, and everything in between. And no one appears to be immune. Take the recent example of Paul Manafort's lawyers submitting documents to the Court that were improperly redacted, revealing that Manafort had shared “polling data” with the Russian Konstantin Kilimnik and that he had lied to the special counsel, Robert Mueller.

Masking is Not Redacting

Redaction software is widely available, so you have to ask why so many people continue to get redaction wrong. The reason is always the same: masking is not the same as redacting. The Manafort example and most others follow one of these three flawed processes:

  1. A mark-up or annotation tool in Word is used to draw a solid black box over the information that needs to be redacted. The document is then converted to PDF and distributed.
  2. The document is converted to PDF first, and a mark-up or annotation tool is used to place solid black boxes over the text.
  3. The same process as the second is followed, with a twist: when the redaction process is finished, the PDF file is flattened.

True redaction requires that the text be removed or “burned out” from the page. It cannot be copied or exposed because it is no longer there. In each of the above examples, the text is still on the page. The user believes that “what can't be seen, obviously can't be read.” This is simply not true.

Peeling Back the Layers

PDF documents are constructed in layers: for example, text is on one layer, images on another, bookmarks on yet another, and so on. Masking the text or an image is therefore not a foolproof method of redaction, because you are only adding annotations to the annotation layer. The text layer remains untouched. Copying the text from the PDF and pasting it into a new Word document will expose the hidden text. Flattening a PDF is also not the same as burning in a redaction. Flattening simply merges everything onto the text layer, where the text underneath the black boxes is still present and can still be copied.
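
To see why a black box protects nothing, it helps to look at what a masked page can contain. The Python sketch below is a simplified illustration, not a real PDF parser. The page content is a made-up, uncompressed fragment, the name `strings_shown` is hypothetical, and the pattern only handles plain text strings drawn with the PDF `Tj` operator. Real files are usually compressed and need a proper PDF library to read.

```python
import re

# Made-up, uncompressed page content: the text is drawn first, then a
# filled black rectangle is painted over the same area (the "mask").
MASKED_PAGE = b"""
BT /F1 12 Tf 72 700 Td (Account no. 0000-1111) Tj ET
0 0 0 rg
70 695 140 16 re f
"""

# Assumption: only simple literal strings followed by Tj are matched
# (no escaped parentheses, no hex strings, no TJ arrays).
TJ_STRING = re.compile(rb"\(([^()\\]*)\)\s*Tj")


def strings_shown(content: bytes) -> list[str]:
    """Return every string passed to Tj, whether or not a box covers it."""
    return [s.decode("latin-1") for s in TJ_STRING.findall(content)]


print(strings_shown(MASKED_PAGE))  # ['Account no. 0000-1111']
```

The rectangle changes what the eye sees, but the string is still in the file, which is why copy and paste recovers it.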

To add another layer of complexity to the redaction problem, you need to consider the underlying structure of the PDF. Depending on how the PDF was generated (for example, a Word document converted to PDF, or a scanned document output as a PDF), there may be more than one layer of information that needs to be redacted. There is the image layer, which is what you see on the page, and the text layer underneath. The problem is that it isn't always obvious what type of PDF you have.
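
One rough way to tell these kinds of page apart is to check which drawing instructions a page uses. The Python sketch below is a heuristic under stated assumptions rather than a reliable detector. It expects uncompressed page content, the function name `classify_page` is hypothetical, and it treats every `Do` instruction as an image even though `Do` can also paint other reusable objects.

```python
import re

# Text-showing operators: a string followed by Tj, or an array followed by TJ.
TEXT_OP = re.compile(rb"\)\s*Tj|\]\s*TJ")
# XObject painting operator, assumed here to mean "an image is drawn".
IMAGE_OP = re.compile(rb"/\w+\s+Do\b")


def classify_page(content: bytes) -> str:
    """Guess the page type from the operators in its content stream."""
    has_text = bool(TEXT_OP.search(content))
    has_image = bool(IMAGE_OP.search(content))
    if has_image and has_text:
        return "image with text layer"  # e.g. a scan with OCR text behind it
    if has_image:
        return "image only"             # e.g. a plain scan
    if has_text:
        return "text"                   # e.g. converted from Word
    return "empty"


# Made-up fragments for illustration only.
converted = b"BT /F1 12 Tf 72 700 Td (Hello) Tj ET"
scanned = b"q 612 0 0 792 0 0 cm /Im0 Do Q"
scanned_ocr = scanned + b" BT 3 Tr /F1 12 Tf 72 700 Td (Hello) Tj ET"

for page in (converted, scanned, scanned_ocr):
    print(classify_page(page))
```

A page reported as "image with text layer" is the case that catches people out: redacting the picture alone leaves the text behind, and redacting the text alone leaves the picture.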

So how do you ensure that you have the right tool for the job? Basically, you need a PDF application with a native redaction tool. Only a native PDF redaction tool can provide the following safeguards and workflows:

Remove content from the document: A PDF redaction tool removes (burns out) the information from the document once the redactions have been applied. The redaction cannot be undone, and the information cannot be exposed later, because it is no longer in the file.
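
The difference between burning out and masking can be sketched in a few lines of Python. This is a toy illustration under heavy assumptions, not how any particular product works. The function name `burn_out` is hypothetical, the page content is assumed to be uncompressed, and only plain strings drawn with the `Tj` operator are handled. A real redaction tool also has to deal with fonts, encodings, images, and compressed data.

```python
import re

# Assumption: only simple literal strings followed by Tj are handled.
TJ_STRING = re.compile(rb"\(([^()\\]*)\)\s*Tj")


def burn_out(content: bytes, secret: bytes, box: bytes) -> bytes:
    """Empty every Tj string that contains `secret`, then paint a black box.

    `box` is the rectangle as "x y width height" in page units.
    """
    def empty_if_secret(match: re.Match) -> bytes:
        if secret in match.group(1):
            return b"() Tj"        # the characters are removed, not covered
        return match.group(0)      # unrelated text is left as it was

    cleaned = TJ_STRING.sub(empty_if_secret, content)
    return cleaned + b"0 0 0 rg\n" + box + b" re f\n"


# Made-up page content for illustration only.
page = (
    b"BT /F1 12 Tf 72 700 Td (Account no. 0000-1111) Tj "
    b"0 -14 Td (Public notice) Tj ET\n"
)
redacted = burn_out(page, b"0000-1111", b"70 695 140 16")
print(b"0000-1111" in redacted)  # False
```

The black box is still drawn so the reader can see that something was withheld, but there is nothing left underneath it to copy.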

Minimal metadata: When a Word document is converted to PDF, very little of its metadata is carried over, and PDF creators and editors add little document metadata of their own.

Redaction workflows: Your redaction tool should support various workflows, including search and redact, review and redact, and redaction of a whole page or range of pages. Exemption codes allow you to explain why the content is being redacted in the first place. Certain courts and other regulatory authorities require standard exemption codes, but you should also be able to create your own.
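
A search-and-redact workflow with exemption codes can be sketched in Python as follows. Everything here is illustrative: the codes `PII-SSN` and `PII-DOB`, the patterns, and the function names are hypothetical, and the example works on plain page text rather than on a PDF. The patterns are deliberately simple and would miss many real-world formats.

```python
import re
from dataclasses import dataclass


@dataclass
class Redaction:
    page: int    # 1-based page number
    start: int   # character offsets within the page text
    end: int
    code: str    # exemption code explaining why the text was removed


# Hypothetical exemption codes mapped to simple search patterns.
RULES = {
    "PII-SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PII-DOB": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}


def search(pages: list[str]) -> list[Redaction]:
    """Find every match and record where it is and which code applies."""
    hits = []
    for page_no, text in enumerate(pages, start=1):
        for code, pattern in RULES.items():
            for m in pattern.finditer(text):
                hits.append(Redaction(page_no, m.start(), m.end(), code))
    return hits


def redact(pages: list[str], hits: list[Redaction]) -> list[str]:
    """Replace each hit with its exemption code, removing the original text."""
    out = list(pages)
    # Work backwards so earlier offsets on the same page stay valid.
    for h in sorted(hits, key=lambda h: (h.page, h.start), reverse=True):
        text = out[h.page - 1]
        out[h.page - 1] = text[:h.start] + f"[{h.code}]" + text[h.end:]
    return out


pages = ["SSN 123-45-6789, born 01/02/1990", "Nothing sensitive here"]
hits = search(pages)
print(redact(pages, hits)[0])  # SSN [PII-SSN], born [PII-DOB]
```

Keeping the search step separate from the redact step leaves room for a review stage in between, where a person can accept or reject each proposed redaction before anything is removed.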

 

As President and Co-Founder of DocsCorp, Dean has been the driving force behind its global expansion, working closely with key stakeholders to develop new products and ensure that existing applications deliver real benefits to law firms. Dean devotes time to speaking at international conferences and forums. He holds a Computer Science degree from the University of Technology, Sydney.