An article in the Wall Street Journal ("A data deluge swamps science historians" by Robert Lee Hotz) discusses a problem brought about by the spiralling amount of data generated by computer-assisted research:
In a vault beneath the British Library here, Jeremy Leighton John grapples with a formidable challenge in digital life. Dr. John, the library's first curator of eManuscripts, is working on ways to archive the deluge of computer data swamping scientists so that future generations can authenticate today's discoveries and better understand the people who made them.
His task is only getting harder. Scientists who collaborate via email, Google, YouTube, Flickr and Facebook are leaving fewer paper trails, while the information technologies that do document their accomplishments can be incomprehensible to other researchers and historians trying to read them. Computer-intensive experiments and the software used to analyze their output generate millions of gigabytes of data that are stored or retrieved by electronic systems that quickly become obsolete.
Usually, historians are hard-pressed to find any original source material about those who have shaped our civilization. In the Internet era, scholars of science might have too much. Never have so many people generated so much digital data or been able to lose so much of it so quickly, experts at the San Diego Supercomputer Center say. Computer users world-wide generate enough digital data every 15 minutes to fill the U.S. Library of Congress.
The problem is forcing historians to become scientists, and scientists to become archivists and curators. Digital records, unlike laboratory notebooks, can't be read without the proper hardware, software and passwords. Electronic copies are difficult to verify and are easy to alter or forge. Digital records "can be more direct, more immediate and more candid," Dr. John says. "But how can we demonstrate to people in the future that these are the real thing?"