I'm working on a tool to parse through a large amount of data for processing. It's taking longer than I'd like, so I'm trying to find ways to improve the performance, and right now the biggest bottleneck appears to be IO. I'm looking at about 2000 directories, each containing between 1 and 200 files in tar.gz format, on a VM with 4 GB of RAM. I need to load the data into an array to do some pre-processing cleanup, so I'm currently chopping the files in each directory into groups of 10 at a time (that seems to be the sweet spot to prevent swapping) and then running a straightforward loop in which each iteration executes:
    tar xzOf $Loop |

and then pushes the output into my array for processing.

I have also tried:

    gzcat $Loop | tar xO |
which is actually slower. Yes, I'm at the point of trying to squeeze seconds out of each group. Any thoughts on a method that might be quicker?
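For reference, here is roughly what the loop looks like. This is only a simplified sketch: the directory path and the preprocess_into_array step are placeholder names for the real layout and pre-processing code, and the actual tool doesn't have to be a shell script.

    #!/bin/bash
    # Simplified sketch of the current approach: for each directory,
    # work through its tar.gz files 10 at a time (to stay out of swap)
    # and stream each archive's contents to the pre-processing step.

    for Dir in /data/*/ ; do                  # ~2000 directories; path is a placeholder
        Files=( "$Dir"*.tar.gz )              # 1-200 archives per directory

        for (( i=0; i<${#Files[@]}; i+=10 )); do
            Batch=( "${Files[@]:i:10}" )      # group of 10 files

            for Loop in "${Batch[@]}"; do
                # Decompress and extract to stdout, then hand the data
                # to the cleanup/array-building step (placeholder name).
                tar xzOf "$Loop" | preprocess_into_array
            done
        done
    done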
Robert

-- 
:wq!
---------------------------------------------------------------------------
Robert L. Harris

DISCLAIMER:
      These are MY OPINIONS               With Dreams To Be A King,
       ALONE.  I speak for                 First One Should Be A Man
       no-one else.                             - Manowar