<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: arial,helvetica,sans-serif; font-size: 12pt; color: #000000'>This is an instance where hard links are probably called for--no need to create a dupes directory. Every hard link is an equal name for the same file, and the data only goes away when the last link is deleted.<div><br></div><div>There is a command 'fdupes' that's available in the standard yum/zypper repos (and probably others) that will automatically find duplicate files and, with the --link option, create hard or soft links automatically.<br><br>Scott</div><div><br><hr id="zwchr"><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><b>From: </b>"Chris Fowler" &lt;cfowler@outpostsentinel.com&gt;<br><b>To: </b>"ALE" &lt;ale@ale.org&gt;<br><b>Sent: </b>Wednesday, August 6, 2014 9:09:16 AM<br><b>Subject: </b>[ale] archiving backups<br><br>This is not the first time I've attempted backing up everything I've <br>created over 20 years.<br><br>Every so often someone I do not know emails me for a file, software, etc. <br>that I may have based on what they read on a mailing list archive. Last <br>night someone requested a file from a project I was working on back in <br>2002.<br><br>Looking for this file I am reminded about how much crap I really have. <br>I have floppies, CDs, DVDs, etc. of backups, data, and whatever. Many of <br>these have dupes. I want to copy all this stuff to a hard drive then <br>archive it. I need to deal with these duplicates.<br><br>A year ago I wrote a perl program to locate the duplicates using md5 <br>hashes. I was then going to delete all but one of the dupes. The <br>problem I ran into is that some of these could be installs of software <br>and I needed to keep the dupe. I was then wasting time manually <br>determining which ones to delete and which ones to keep.<br><br>Last night I had an idea that may work. 
Create a directory at the root <br>of the backup named 'dupes'. Copy the dupe once into that directory. <br>Everywhere else, replace the dupe with a symbolic link.<br><br>My biggest issue with backups is management of the data. At 40 I'll <br>copy something today and 6 months from now forget why I even did that. <br>It could be a 1GB SVN software tree I checked out, made some changes for <br>testing and then decided not to delete it. Now I have 1GB of stuff I <br>feel like I cannot delete. I can't even remember why I made the copy. <br>Ugh.<br><br>I have so many CDs and DVDs stacked in my office it would be nice to get <br>rid of them. I have drives in boxes that I've forgotten what is on <br>them. I'm a tech hoarder.<br><br>Chris<br></div><br></div></div></body></html>