Data Storage Issues: Part 3
Future Storage Technologies
The amount of data produced by individuals, industries, and governments continues to increase every year, and with it the demand for greater archival storage capacity. Currently, at least 25 exabytes (EB) of data flow through the Internet each month, and more than 5 zettabytes (ZB) of digital data exist worldwide. Most of this data is stored on millions of conventional HDDs. However, this is an expensive and inefficient way to store data. HDDs need a constant supply of electricity, and their moving parts degrade over time and ultimately fail. In addition, the maximum amount of data that can be stored on an HDD platter will be reached at some point, and continued reliance on HDDs for future data storage may become an issue. Alternative storage technologies are already under development, and they may eventually replace the conventional HDD for data storage.
Solid State Drives (SSDs)
Solid-state storage, in one form or another, has been in use since the 1950s, and high-performance SSDs, some of which exceed 1 TB in capacity, are readily available. Compared to a conventional HDD, an SSD is completely different in architecture and functionality. Because they have no moving parts, SSDs have very fast data access times. For instance, a typical random data access time can be 0.1 ms or less, whereas a typical 2.5-inch HDD may take about 10 to 12 ms or more. That is about 100 times faster, which leads directly to dramatic improvements in boot time, application installation, application loading, file copying, Web browsing, and shutdown. SSDs have other advantages as well: their performance is largely unaffected by fragmentation, they are shock resistant, and their power consumption is low. Although they cost more than a comparable HDD, their advantages will lead to their proliferation, and continued demand should eventually bring the cost down.
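The access-time comparison above can be checked with a few lines of arithmetic. This is only a sketch in Python using the typical figures quoted in the paragraph; the workload of 10,000 random reads is a hypothetical number chosen for illustration, not a measurement.

```python
# Typical access times quoted in the text (milliseconds); not measurements.
ssd_access_ms = 0.1
hdd_access_ms_range = (10.0, 12.0)


def speedup(hdd_ms, ssd_ms):
    """How many times faster the SSD is for a single random access."""
    return hdd_ms / ssd_ms


low, high = (speedup(h, ssd_access_ms) for h in hdd_access_ms_range)
print(f"SSD random access is roughly {low:.0f}x to {high:.0f}x faster")

# Hypothetical workload (an assumption): 10,000 independent random reads.
reads = 10_000
ssd_total_s = reads * ssd_access_ms / 1000
hdd_total_s = reads * hdd_access_ms_range[0] / 1000
print(f"SSD: about {ssd_total_s:.0f} s, HDD: about {hdd_total_s:.0f} s")
```

The gap between one second and well over a minute for the same set of reads is why the difference is so noticeable during boot and application loading, which involve many small random accesses.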
Racetrack Memory
Under development by IBM, Racetrack Memory (also referred to as Domain-Wall Memory, or DWM) relies upon a spin-coherent electric current to move data to precise locations along a nanoscopic permalloy (nickel-iron alloy) wire called a Racetrack. The spin of the electrons (a quantum mechanical property) moves the data at hundreds of miles per hour along the “U” shaped nanowires, which are thousands of times finer than a human hair. Data is stored as magnetic regions (magnetic domains) in the Racetracks. Electrical currents are generated by transistors connected to the bottom of the “U” nanowires. The current moves these magnetic domains past magnetic read/write heads, which read the domains or alter them to record patterns of bits. This configuration allows each transistor to store 100 bits or more of data, rather than just one bit as in most other solid state memory. Recent improvements in magnetic detection capabilities are expected to allow very small magnetic domains and therefore greater bit densities. Many combinations of wires and read/write heads are needed to make a practical storage device. Conceivably, such devices could be very small, about the size of a shirt button, yet have the capacity to store terabytes of data.
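The idea of moving the data past a fixed read/write head, rather than moving a head to the data, can be pictured as a shift register. The following Python sketch is a toy model only: the class name, the 100-domain wire length, and the wrap-around behavior of the wire are assumptions made for illustration and do not describe IBM's actual device.

```python
from collections import deque


class RacetrackWire:
    """Toy model of one nanowire: a row of magnetic domains and one fixed head.

    Assumption: the wire is modeled as circular (bits wrap around) to keep
    the sketch short; a real wire would need spare length at each end.
    """

    def __init__(self, n_domains=100, head_position=0):
        self.domains = deque([0] * n_domains)  # one bit per magnetic domain
        self.head = head_position              # the head never moves

    def shift(self, steps=1):
        """Model a current pulse that moves every domain along the wire."""
        self.domains.rotate(steps)

    def write(self, bit):
        """Set the domain currently sitting under the head."""
        self.domains[self.head] = bit

    def read(self):
        """Sense the domain currently sitting under the head."""
        return self.domains[self.head]

    def store(self, bits):
        """Write a sequence of bits, shifting the wire after each one."""
        for bit in bits:
            self.write(bit)
            self.shift(1)

    def load(self, count):
        """Rewind the wire by `count` domains, then read them back in order."""
        self.shift(-count)
        out = []
        for _ in range(count):
            out.append(self.read())
            self.shift(1)
        return out


wire = RacetrackWire()
pattern = [1, 0, 1, 1, 0, 0, 1, 0]
wire.store(pattern)
print(wire.load(len(pattern)) == pattern)
```

A single head serves every domain on the wire, which is the sense in which one transistor can account for 100 bits or more instead of one.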
DNA Encoded Storage
Deoxyribonucleic acid (DNA) is a molecule that encodes the genetic instructions used in the development and function of all living organisms. DNA consists of two biopolymer polynucleotide strands twisted together to form a double helix. Each polynucleotide strand is composed of simpler units called nucleotides. Each nucleotide consists of a nitrogen-containing nucleobase, either adenine (A), guanine (G), cytosine (C), or thymine (T), along with deoxyribose (a monosaccharide sugar) and a phosphate group. Nucleotides are joined to each other by covalent bonds between the phosphate of one nucleotide and the deoxyribose sugar of the next to form a chain. This results in an alternating phosphate-deoxyribose backbone, with hydrogen bonds between the nitrogenous bases holding the two polynucleotide strands together. So, what does this have to do with data storage?
Today it is relatively easy for paleontologists to recover and sequence DNA from long-extinct species. We know that DNA, when kept in a cold, dry, dark place, can stay intact for thousands of years. If data could somehow be stored in DNA, it could provide a long-term solution for data storage. A tremendous amount of data, about 450 EB, could in theory be stored in one gram of DNA. Sound impossible? While decoding and reading DNA is relatively straightforward, encoding and writing data is problematic. However, geneticists from Harvard, Johns Hopkins, and the EMBL-European Bioinformatics Institute (EMBL-EBI) have recently demonstrated DNA’s superlative information storage abilities. The geneticists took text, image, and audio files and converted them into binary code (0s and 1s) and then into trinary, or base-3, code (0s, 1s, and 2s). The trinary code helps prevent errors. Using very sophisticated software, they then rewrote the data as strings of DNA’s chemical bases (As, Gs, Cs, and Ts) and drew up plans for thousands of pieces of DNA, each of which contained part of a text, image, or audio file. These customized designs were sent to a company that manufactures custom DNA. The resulting DNA was then ‘read’ using a DNA sequencer. Using software, the sequenced DNA was reassembled into the original text, image, and audio files with 100% accuracy.
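The binary-to-trinary-to-bases conversion described above can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' software: it assumes a fixed six trinary digits per byte and a rule in which each digit selects one of the three bases that differ from the previous base. Under that assumed rule the same base never appears twice in a row, which is one way a base-3 code can help prevent errors, since long runs of a single base are harder to read reliably. All function names here are hypothetical.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # assumption: fixed width, since 3**6 = 729 covers 256 values


def bytes_to_trits(data):
    """Convert each byte to six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits):
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


def trits_to_dna(trits, start="A"):
    """Each trit picks one of the three bases that differ from the previous base.

    `start` is a virtual previous base for the first position; it is not
    part of the strand.
    """
    previous = start
    strand = []
    for trit in trits:
        choices = [b for b in BASES if b != previous]
        previous = choices[trit]
        strand.append(previous)
    return "".join(strand)


def dna_to_trits(strand, start="A"):
    """Inverse of trits_to_dna."""
    previous = start
    trits = []
    for base in strand:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    return trits


def encode(data):
    return trits_to_dna(bytes_to_trits(data))


def decode(strand):
    return trits_to_bytes(dna_to_trits(strand))


message = b"DNA"
strand = encode(message)
print(strand, decode(strand) == message)
```

A real system would also split the strand into many short, indexed, overlapping pieces so that the file can be reassembled after sequencing; that step is omitted here.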
Although this sounds like an ultimate storage solution, there are some limitations. For instance, it takes about two weeks to sequence DNA, and the equipment is very expensive. This time frame and high cost make DNA storage somewhat impractical, considering that data can be retrieved almost instantly from a low-cost HDD or SSD. Storing data in this manner is also not something that just anyone can do. Geneticists are highly educated, trained, and skilled individuals who work with very complicated and exacting biological processes. DNA is so small that it cannot be seen with the naked eye, and it requires careful handling and sample preparation to avoid contamination. Once data is written into DNA, it cannot be rewritten or changed, and a single piece of information cannot be accessed on its own. Rather, the entire DNA sample would have to be sequenced to find a specific piece of information. Although DNA has potential for archival storage of data, it does not appear practical for everyday use.
This discussion will continue in a future column.
John J. Barbara owns Digital Forensics Consulting, LLC, providing consulting services for companies and laboratories seeking digital forensics accreditation. An ASCLD/LAB inspector since 1993, John has conducted inspections in several forensic disciplines including Digital Evidence. firstname.lastname@example.org