<div dir="ltr"><div><br></div>I'm working on a tool to parse through a lot of data for processing. Â Right now it's taking longer than I wish it would so I'm trying to find ways to improve the performance. Â Right now it appears the biggest bottleneck is IO. Â I'm looking at about 2000 directories which contain between 1 and 200 files in tar.gz format on a VM with 4 Gigs of RAM. Â I need to load the data into an array to do some pre-processing cleanup so I am currently chopping the files in each of the directories into an array of groups of 10 files at a time ( seems to be the sweet spot to prevent swap ) and then a straight forward loop of which each iteration executes:<div>
<br></div><div>Â tar xzOf $Loop |</div><div><br></div><div>and then pushes it into my array for processing.</div><div><br></div><div>I have tried:</div><div><br></div><div>Â gzcat $Loop | tar xO |</div><div><br></div><div>
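For context, here's roughly how that loop is wired up, as a simplified Perl sketch (assuming the piped-open approach implied above; the directory handling is trimmed down, and names like $dir, $archive, and @data are just placeholders):

  #!/usr/bin/perl
  use strict;
  use warnings;

  my $dir = shift @ARGV or die "Usage: $0 <directory>\n";
  my @archives = glob("$dir/*.tar.gz");

  # Work through the archives in groups of 10 to stay out of swap.
  while (my @group = splice(@archives, 0, 10)) {
      for my $archive (@group) {
          # Piped open: z = gunzip, O = extract to stdout, f = archive file.
          open(my $tar, '-|', 'tar', 'xzOf', $archive)
              or die "Can't run tar on $archive: $!";
          my @data = <$tar>;   # slurp the contents into the array
          close $tar;
          # ... pre-processing cleanup on @data goes here ...
      }
  }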
I have tried:

  gzcat $Loop | tar xO |

which is actually slower. Yes, I'm at the point of trying to squeeze seconds out of each group. Any thoughts on a method that might be quicker?

Robert
-- 
:wq!
---------------------------------------------------------------------------
Robert L. Harris

DISCLAIMER:
   These are MY OPINIONS          With Dreams To Be A King,
   ALONE. I speak for             First One Should Be A Man
   no-one else.                     - Manowar