2020 census form. Credit: Maria Dryfhout/Shutterstock.com.

With the aggressive pace of technological change and the onslaught of news regarding data breaches, cyber-attacks, and technological threats to privacy and security, it is easy to assume these are fundamentally new threats. In fact, the pace of technological change is slower than it feels, and many seemingly new categories of threats have been with us longer than we remember. Nervous System is a monthly blog that approaches issues of data privacy and cybersecurity from the context of history, looking to the past for clues about how to interpret the present and prepare for the future.

This month, the U.S. Census Bureau undertakes the 24th national census, beginning with Census Day on April 1. Today's tools for data collection include the ability to submit information online and the use of sophisticated database technology to derive statistics from raw counts. This has not always been the case, and looking back to when the Census was a manual process provides insight into one of the longest-lasting and most influential components of information technology—the punch card.

In 1890, the Census Bureau faced a problem. The 1880 census had taken fully eight years to tabulate by hand, meaning the numbers had only recently been released before the Bureau needed to begin the next round. The country's population had boomed in the last decade, so the task ahead was taxing and only getting harder.

However, a new age of automation and technology was dawning, and new tools of information science were taming problems like these with spectacular success. Herman Hollerith's Tabulating Machine Company, for example, had used machines to calculate vital statistics automatically for the New York City Board of Health. Perhaps Hollerith's machines could help turn the census around as well.

Hollerith's tabulating machine was roughly the size and shape of a small player piano. The operator fed perforated paper cards into it in bulk stacks. As the machine sorted cards, a metal needle traced across each card's surface. Whenever there was a hole, the needle would penetrate through the gap to a well of mercury on the other side, completing an electrical circuit. Every time the circuit connected, the signal would advance a counting dial by one increment. The operator would watch the calculations on a faceplate of clocklike dials until the final tally was reached. The earliest model could only count—later improvements added other mathematical skills.
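The counting mechanism described above can be sketched in a few lines of Python. This is only an illustration of the idea that a hole closes a circuit and the circuit advances a dial, not a reconstruction of Hollerith's actual wiring. The card representation (a set of punched row and column positions), the `tabulate` function, and the category layout are all assumptions made for the example.

```python
def tabulate(cards, layout):
    """Count, for each category, how many cards have a hole at its position.

    cards  -- iterable of sets of (row, column) positions that are punched
    layout -- hypothetical mapping of category name -> (row, column) position
    """
    # One "dial" per category, all starting at zero.
    dials = {name: 0 for name in layout}
    for card in cards:
        for name, position in layout.items():
            # A hole at this position lets the needle through and closes
            # the circuit, which advances the matching dial by one.
            if position in card:
                dials[name] += 1
    return dials


# Hypothetical layout and three example cards.
layout = {"male": (0, 0), "female": (0, 1), "farmer": (1, 2)}
cards = [
    {(0, 0), (1, 2)},  # male, farmer
    {(0, 1), (1, 2)},  # female, farmer
    {(0, 0)},          # male
]
print(tabulate(cards, layout))  # {'male': 2, 'female': 1, 'farmer': 2}
```

Like the earliest model, this sketch can only count. Anything beyond a tally, such as sums or cross-tabulations, would need additional logic.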

The idea of using punch cards to manage an automated task was not new. When Charles Babbage first began developing the idea of a programmable computer, he and Ada Lovelace imagined encoding data and instructions in perforated paper cards. Punch cards of that type had been around since the 1700s, when they were used to control textile looms in France (one reason why Joseph Marie Jacquard's automated loom is regarded as an ancestor of computers).

Thanks to the efficiency of punch-card automation, the 1890 census was completed in just six years. Automated tabulating machines began appearing in a wide array of industries and businesses, revolutionizing the management of information.

In 1915, Thomas Watson took over from Hollerith as the new head of the company, later renamed International Business Machines (IBM).

The US government remained IBM's primary customer, relying on its technology to manage not just the census but other "Big Data" requirements like the newly passed Social Security Act.

Punch cards remained the primary means of data access for mechanical data sorters and digital computers alike until the mid-1970s, when newer technologies such as magnetic tape came along. Hollerith's decision to match the dimensions of punch cards to the size and shape of US currency at the time (allowing him to cannily repurpose storage boxes made for the Treasury) became a worldwide standard. These dimensions have since been fixed in place as the EIA standard RS-292 media 1 punched card. The ubiquity and lasting influence of punch cards continue to be felt in the fact that computer programmers still limit line lengths to 80 characters, a practice originally established by the 80 columns available on a standard IBM punch card.
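As a small illustration of how that 80-column habit survives in practice, here is a minimal Python sketch that flags lines too long to fit on one card. The function name `overlong_lines` and its `limit` parameter are made up for this example and are not taken from any particular style-checking tool.

```python
def overlong_lines(source, limit=80):
    """Return (line_number, length) for every line longer than `limit`.

    The default of 80 mirrors the 80 columns of a standard IBM punch card.
    """
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


sample = "short line\n" + "x" * 81 + "\n" + "y" * 80
print(overlong_lines(sample))  # [(2, 81)]
```

A line of exactly 80 characters passes, just as it would have filled a card without overflowing it.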

IBM punch cards stayed in use in certain specialized applications like voting booths until as late as 2000. The contested 2000 presidential election results involved a difference in votes counted for George W. Bush and for Al Gore that was less than the total number of uncounted votes. The infamous "hanging chads" of poorly punched cards became a national punchline and destroyed any residual confidence in the reliability of punch cards as a data storage tool.

The fact that Hollerith's cards stood the test of time, with virtually no changes over 110 years of service, is perhaps the most remarkable record of longevity in computer science.

David Kalat is Director, Global Investigations + Strategic Intelligence at Berkeley Research Group. David is a computer forensic investigator and e-discovery project manager. Disclaimer for commentary: The views and opinions expressed in this article are those of the author and do not necessarily reflect the opinions, position, or policy of Berkeley Research Group, LLC or its other employees and affiliates.