Disk Wiping / A Story with a Cast of Thousands Presented By Gary Smith
Disk Wiping / A Story with a Cast of Thousands
Supercomputers are like cars. Everybody talks about how great the new one is, but you’ve still got to clean up the old one and get rid of it. One of the items in getting rid of a supercomputer is wiping the disk drives. What is wiping a disk drive? Wiping a disk is the non-destructive process of rendering the data on the disk drive irretrievable. What this amounts to is writing some pattern on every track and sector of the disk drive some number of times until you reach your comfort level with the irretrievability.
What is Disk Wiping?
There are also destructive processes for rendering the data on a disk drive irretrievable: degaussing and crushing, for instance. Degaussing is the process of subjecting magnetic media to an intense magnetic field. This turns a disk drive into a paperweight. Crushing also turns a disk drive into a paperweight; it won’t be recognizable as having once been a disk drive.
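The non-destructive overwrite described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool used in this story: the function name `wipe`, the pass names, and the chunk size are all assumptions, and the target is an ordinary file standing in for a disk (a real block device would need its size determined differently).

```python
import os


def wipe(path, passes=("random", "zeros"), chunk_size=1024 * 1024):
    """Overwrite every byte of a regular file once per entry in `passes`.

    Hypothetical helper for illustration: `path` is an ordinary file
    standing in for a disk drive.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as target:
        for kind in passes:
            target.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                # A "random" pass writes random bytes; anything else writes zeros.
                block = os.urandom(n) if kind == "random" else bytes(n)
                target.write(block)
                remaining -= n
            # Push this pass out to storage before starting the next one.
            target.flush()
            os.fsync(target.fileno())
    return size
```

Each entry in `passes` is one full trip over the data, so adding passes until the comfort level is reached is just a matter of lengthening that tuple.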
MPP2, The Old Supercomputer
MPP2, the old supercomputer, had a lot of disk drives in it. There were 980 total nodes in MPP2. Some nodes were fat nodes and some nodes were thin nodes. Fat nodes had more memory and disks than thin nodes. Thin nodes had two 37 GB drives in them; fat nodes had two 37 GB and seven 73 GB disk drives. There were a total of 5,898 local disk drives in MPP2 for a total of 423,216 GB of disk space.
And then there was dtemp. dtemp was the large scratch space shared among all the nodes in MPP2. There were 1280 disk drives in dtemp for a total of 53 TB of disk space.
And then there was the /home file system.
And then there was the test cluster.
What Makes MPP2 Unique
The CPUs in MPP2 were Intel Itaniums. Itanium is the brand name for 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). The processors are marketed for use in enterprise servers and high-performance computing systems. The architecture originated at Hewlett-Packard (HP) and was later developed by HP and Intel together. Itanium's architecture differs dramatically from the x86 architectures (and the x86-64 extensions) used in other Intel processors and is incompatible with both of these architectures. Itanium is the fourth-most deployed microprocessor architecture for enterprise-class systems.
Disk Wiping Programs
Almost all disk wiping programs assume:
- The target machine runs a Microsoft operating system.
- The target machine has a CDROM drive.
- The target machine has a keyboard/monitor/mouse attached to it.
- A human is going to be watching the monitor to see the final results.
- The target CPU is x86 architecture.
On The Other Hand…
The systems making up MPP2:
- Run Linux.
- Have no CDROM drive.
- Have no keyboard/monitor/mouse attached to them.
- Have no human watching the non-existent monitor that isn’t attached to them.
- Have that wonderful Itanium CPU in them.
What’s Locally Available
One tool favored by IM Services is Wipe Disk Pro. It makes all of the usual assumptions about the system it is going to run on: a Microsoft operating system, a CDROM drive, a keyboard/monitor/mouse, a human watching, and an x86 CPU. Next, I checked with my colleagues in CaNS about their program of choice. Their disk wiping program of choice is DBAN (Darik’s Boot and Nuke).
My Criteria
The wiping program has to:
- run on the Itanium architecture.
- not rely on having a keyboard/monitor/mouse.
- not require a CDROM or floppy.
- have reasonable error recovery.
- produce sufficient logging output along the way and at the end.
- be capable of doing multiple passes over a disk.
I also need:
- a mechanism for capturing the progress and final results of each node’s wiping operation.
- a mechanism to test wiping programs.
- a mechanism to test the overall wiping process.
And the whole solution has to:
- meet any DOE requirements on disk wiping.
- not be hideously expensive.
Open Source Disk Wiping Programs
Searching sourceforge.net and freshmeat.net for disk wiping programs, I got over 75 pages of hits. There’s wipe, wipe disk, disk wiper, quick wipe, quickwiper, wipersnapper, secure wipe, maxiwipe, shred, and on and on. Some would build on the Itanium to a greater or lesser degree and some wouldn’t. After testing about 30 different wiping programs on a two-disk Itanium system, I found the two most promising candidates: wipe and shred. Both are based on the secure data destruction work of Dr. Peter Gutmann. I finally decided on shred because I liked the way it logged progress better than wipe.
Getting shred into a Working Environment
The next step was to figure out how to get shred to the systems in a suitable environment and capture the results. MPP2 came from HP with a technique for installing a new operating system distribution over the network. All the program files, libraries, etc. would come from the master server; all the logging information would be written back to the master server. Once loaded, the kernel started up a minimal environment that downloaded the new operating system to a RAM disk, then installed and configured it on the local disks. What I had to do was take that primitive environment and make its purpose wiping disks instead of installing stuff to them. This is where the test cluster came into play and proved invaluable for testing. There is a special console port on the motherboards that connects to a series of terminal servers. The server that acts as the installation host can connect to the console of any of the systems making up the test cluster or MPP2.
How Primitive Is Primitive?
After the usual amount of futzing around that goes with any hardware/software system, I got things to work. My process for wiping the first disk worked (2 random passes followed by a pass of zeros). I changed my process around to do both disks simultaneously, rebooted the system, and watched what happened. The system rebooted, initialized, started the two disk wipes, and, horror of horrors, froze solid. The system was going nowhere fast. The problem was that the primitive kernel used to boot the system is really primitive. My choices were:
1. Redo the SCSI driver and potentially the kernel.
2. Wipe the drives in a serial rather than parallel manner.
3. Go to another environment.
I decided to test how long it would take to wipe a fat node by serially wiping the drives. This took 2h 35m.
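As a rough illustration of the serial approach, here is a minimal Python sketch that builds a GNU shred command for 2 random passes followed by a pass of zeros and runs it on one drive at a time. It is not the script that actually ran on MPP2: the function names and the device paths in the usage note are hypothetical. The shred options used are `-n` (number of random passes), `-z` (final pass of zeros), and `-v` (progress output).

```python
import subprocess


def shred_command(device, random_passes=2, final_zero_pass=True):
    """Build the argument list for one GNU shred invocation."""
    cmd = ["shred", "-v", "-n", str(random_passes)]
    if final_zero_pass:
        cmd.append("-z")
    cmd.append(device)
    return cmd


def wipe_serially(devices, run=subprocess.run):
    """Wipe one device at a time and record each exit status.

    `run` is injectable so the loop can be exercised without touching
    a real disk; by default it is subprocess.run.
    """
    results = {}
    for device in devices:
        completed = run(shred_command(device))
        results[device] = completed.returncode
    return results
```

Calling `wipe_serially(["/dev/sda", "/dev/sdb"])` (hypothetical device names) would only start the second wipe after the first one has finished, which is the point of choice 2: only one wipe is ever in flight.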
How Long Does It Take To Wipe The Test Cluster?
There are 8 nodes in the test cluster. While it’s not at all representative of MPP2, it’s a start. I started the wipe of the test cluster and captured all the proceedings to a file as well as watching it in real time. Messages arrived from the 8 nodes asynchronously, all funneled into one point. After it was all over, I needed to make some sort of sense of what went on. I reached for the Swiss Army Knife of computer programming, Perl. Perl is your friend. The test cluster took 2h 55m to wipe. Not bad.
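The original sorting was done in Perl and the script isn't shown here. As a minimal sketch of the same idea in Python, the function below splits an interleaved message stream into per-node lists. The log format is an assumption made for illustration: each line is taken to start with the node name followed by `": "`, and the function name `split_by_node` is hypothetical.

```python
from collections import defaultdict


def split_by_node(lines, separator=": "):
    """Group interleaved log lines by the node that sent them.

    Assumes (hypothetically) that each line looks like
    "<node name>: <message>". Lines without the separator are kept
    under the key "unknown" so nothing is silently dropped.
    """
    per_node = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        node, found, message = line.partition(separator)
        if not found:
            per_node["unknown"].append(line)
        else:
            per_node[node].append(message)
    return dict(per_node)
```

Within each node's list the messages keep their arrival order, so every node's wipe can be read from start to finish without the other nodes' chatter in between.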
Scaling Up To The Big Time!
The next step was to try this on MPP2. When the next monthly downtime came around for MPP2 in April, I put my dibs in for testing a disk wipe. This time I was wiping 930 of the nodes in MPP2. That’s quite a scale-up from 8. I started the wipe at 17:10 and it finished at 21:18. Several nodes had to be booted by hand, and that’s another serial process. So this is looking pretty good.
And Now for A Word About Standards…
Admiral Grace Hopper, the creator of COBOL, said that the wonderful thing about standards is there are so many to choose from. This is particularly true of disk wiping. The Department of Defense has a very clear, definitive policy on wiping disks. They are famous for the “7 Pass Rule” of disk wiping. The Department of Energy is not so clear-cut. There used to be DOE M205.12, which said that one pass of zeros was OK for unclassified data. But M205.12 is out of date. It was superseded by TMR 10, which is obsolete. So, how many passes do I need to run over these disks to be compliant with the proper directive(s), whichever they may be?
The Powers That Be
At first The Powers That Be (TPTB) said, “Uhh, well, why are you wiping them; they’re going to be destroyed, anyway.” Some time passes and TPTB say, “Uhh, well, just to be sure, you ought to wipe the drives before they’re destroyed.” OK. How many passes should I do? TPTB say, “Uhh, well, one is enough.” Some time passes and TPTB say, “Uhh, well, maybe that’s not enough; better make it 3 passes.” OK. Whatever you want. Some time passes and TPTB say, “Uhh, well, we changed our mind and since the drives are going to be crushed, never mind the wiping.” OK, glad you came to a definitive conclusion. Some time passes and TPTB say, “Uhh, well, you know we thought about it again, and since the drives might end up on eBay somehow, you need to wipe the drives.” OK, how many times? “Uhh, well, we’ll get back to you on that.” Finally, the powers that be decide that 2 passes, one random pass followed by one pass of zeros, is sufficient for the disks.
Dtemp, Home, and the Test Cluster
Dtemp is a parallel, high-performance file system using the Lustre file system technology. Thirty-four nodes in MPP2 serve up the 53 TB of disk space to the rest of MPP2. It turns out that wiping all this space is going to be easier than wiping the individual nodes. These nodes are not running a stripped-down kernel; they’re running a supercomputer version of Red Hat version 4, so they can wipe multiple disks at the same time without freezing up.
The home file system is served up by two HP Alpha systems running Tru64, not your average run-of-the-mill system. The game plan on this one is pretty much the same as for wiping dtemp, only I don’t have to wipe the system disks in this case.
The test cluster is 8 nodes and some mini-dtemp space: small potatoes in comparison.
The Powers That Be, Part Two
As the time to do the disk wiping draws near, TPTB say, “Uhh, well, how do you REALLY know that the disks got zeroed out? Is there a way you can show that the disks got zeroed out?” How about if I wipe one of the disks and you analyze it for being zeroed out? “Uhh, well, can you do some kind of mathematical test on the disks that shows that it’s just zeros?” TPTB say. How about if I checksum the disks, and if the checksum comes out zero, it’s good? “Uhh, well, that sounds good. Oh, and by the way, you need to get the serial number of all the disks that get wiped in case any of them end up on eBay and somebody claims there’s super secret data on them.” Getting the disk serial numbers is no problem with a program called sginfo. But the option that gets the serial number also returns lots of other information about the disk drive. Perl is your friend.
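The talk does not say which checksum was used, so the following is only a sketch of one way the "checksum comes out zero" test could work. It assumes a plain additive checksum (the sum of every byte) kept in an unbounded Python integer, so the total is zero only if every byte read was zero. The function names are hypothetical.

```python
def additive_checksum(path, chunk_size=1024 * 1024):
    """Return the sum of every byte in `path`, read in chunks.

    Python integers do not wrap around, so the result is 0 only when
    every byte is 0. A fixed-width checksum that wraps could not make
    that guarantee.
    """
    total = 0
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            # Iterating over bytes yields integers in the range 0..255.
            total += sum(chunk)
    return total


def is_zeroed(path):
    """True when the additive checksum shows only zero bytes."""
    return additive_checksum(path) == 0
```

A single stray non-zero byte anywhere on the target makes the total positive, so the test fails loudly rather than passing by accident.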
D Day, August 4th, 2008
The game plan for D Day (Disk-wiping Day) is this:
1. Wipe the disks in the first 930 nodes (i.e., those that don’t play a part in the dtemp file system), the dtemp file system, the home file system, and the mini-dtemp on the test cluster.
2. Wipe the disks in the systems that serve up the dtemp file system, and the rest of the test cluster.
3. Take care of any loose ends in the wiping process.
The Job Isn’t Over Until The Paperwork is Done
Now the fun can really begin: the documentation of what happened. After all, this is DOE, remember? It’s at least as important to document what was done as it is to do it. As I said before, Perl is your friend. I massaged the message file from wiping the 930 nodes into individual files for each node, including the check-summing operation and the captured serial numbers. The dtemp file system had two reports: one for the wipe of the dtemp disks and one for the individual disks. The same is true of the test cluster. The home file system only required a report on the disks making up the home file system. All this was packaged up onto a CDROM and distributed to all the interested parties for easy reference.
In Conclusion
You too can wipe over 7000 disk drives in less than 48 hours. You just need lots of parallelism and a great deal of patience. And by the way, Perl is your friend.