<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<font size="-1">This is not the first time I've attempted to back up
everything I've created over the past 20 years.<br>
<br>
Every so often someone I do not know emails me asking for a file,
software, etc. that I may have, based on what they read in a mailing
list archive. Last night someone requested a file from a project
I was working on back in 2002. <br>
<br>
Looking for this file, I am reminded of how much crap I really
have. I have floppies, CDs, DVDs, etc. full of backups, data, and
whatever. Many of these contain dupes. I want to copy all this
stuff to a hard drive and then archive it, but first I need to deal
with these duplicates.<br>
<br>
A year ago I wrote a Perl program to locate the duplicates using
MD5 hashes. I was then going to delete all but one copy of each
dupe. The problem I ran into is that some of these could be part of
software installs, where the dupe has to stay in place. I ended up
wasting time manually deciding which ones to delete and which
ones to keep.<br>
<br>
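For illustration, here is a rough sketch of that kind of duplicate
search. It is written in Python rather than Perl and is not the
original program; the function names 'md5_of_file' and
'find_duplicates' and the chunk size are made up for this example. It
only groups files by MD5 hash and deletes nothing.<br>
<pre>
```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each MD5 digest to the sorted list of files under root that share it."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link and its target are not counted as dupes.
            if os.path.islink(path):
                continue
            by_digest[md5_of_file(path)].append(path)
    # Keep only digests that were seen on more than one file.
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```
</pre>
Calling find_duplicates on the backup root returns one entry per set
of identical files, which can be reviewed before anything is removed.<br>
<br>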
Last night I had an idea that may work. Create a directory at the
root of the backup named 'dupes'. Copy each dupe once into that
directory. Everywhere else, replace the dupe with a symbolic link.
<br>
<br>
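A minimal Python sketch of the 'dupes' directory idea follows, with
several assumptions: the filesystem supports symbolic links, 'groups'
is a mapping from a content hash to the list of paths sharing that
content (however it was produced), and each stored copy is named after
its hash to avoid name collisions. The function name 'consolidate' and
the hash-based naming are choices made for this example only. It
replaces files in place, so try it on a scratch copy first.<br>
<pre>
```python
import os
import shutil


def consolidate(root, groups, store_name="dupes"):
    """Keep one copy of each duplicate group in root/store_name; symlink the rest.

    groups maps a content digest to the list of paths sharing that content.
    """
    store = os.path.join(root, store_name)
    os.makedirs(store, exist_ok=True)
    for digest, paths in groups.items():
        # Name the stored copy after its digest so names cannot collide.
        kept = os.path.join(store, digest)
        if not os.path.exists(kept):
            shutil.copy2(paths[0], kept)
        for path in paths:
            # Relative link targets keep working if the backup tree is moved.
            target = os.path.relpath(kept, os.path.dirname(path))
            os.remove(path)
            os.symlink(target, path)
```
</pre>
Every original location keeps its file name but becomes a link into
'dupes', so software installs still find their files where they expect
them.<br>
<br>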
My biggest issue with backups is management of the data. At 40,
I'll copy something today and six months from now forget why I even
did that. It could be a 1GB SVN software tree that I checked out,
changed for testing, and then decided not to delete. Now I
have 1GB of stuff I feel like I cannot delete. I can't even
remember why I made the copy. Ugh.<br>
<br>
I have so many CDs and DVDs stacked in my office that it would be
nice to get rid of them. I have drives in boxes whose contents I've
forgotten. I'm a tech hoarder.<br>
<br>
Chris<br>
</font>
</body>
</html>