You could write a Perl script to break it apart for you. The pseudocode would look something like:<div><br></div><div>open the original log file</div><div><br></div><div>read the first line</div><div>while not at end of file</div><div>&nbsp;&nbsp;&nbsp;&nbsp;pattern match for the thing that looks like a date</div><div>&nbsp;&nbsp;&nbsp;&nbsp;open a different output file (probably with the date as part of the name)</div><div>&nbsp;&nbsp;&nbsp;&nbsp;while the line belongs to that date (lines without a timestamp stay with the last date seen)</div><div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;write out the line</div><div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;read the next line</div><div>&nbsp;&nbsp;&nbsp;&nbsp;close the output file</div><div><br></div><div>close the original log file</div><div><br></div><div>Variations would include adding some directory structure around where to place the logs when they're broken apart, or separating by month or year instead of by day.</div>
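The steps above can be sketched concretely. A minimal sketch in Python rather than Perl, assuming the old MySQL general-log heartbeat style where a stamped line starts with YYMMDD HH:MM:SS (adjust the regex if your log stamps lines differently):

```python
import re
from collections import defaultdict

# Assumed timestamp format: the old MySQL general-log heartbeat style,
# e.g. "090322 10:54:01     42 Query   SELECT ...".
STAMP = re.compile(r"^(\d{6})\s+\d{1,2}:\d{2}:\d{2}")

def split_by_day(lines):
    """Group log lines by day: lines with no timestamp of their own
    belong to the most recently seen heartbeat date."""
    days = defaultdict(list)
    current = None
    for line in lines:
        m = STAMP.match(line)
        if m:
            current = m.group(1)        # e.g. "090322"
        if current is not None:         # skip any preamble before the first stamp
            days[current].append(line)
    return days
```

For a log this size you would of course stream the file and write each day's lines straight out to its own file as the date changes, as in the pseudocode, rather than collect everything in memory; the dict here is just to keep the sketch short.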
<div><br></div><div>-Scott</div><div><br><div class="gmail_quote">On Sun, Mar 22, 2009 at 10:54 AM, Kenneth Ratliff <span dir="ltr"><<a href="mailto:lists@noctum.net">lists@noctum.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im">-----BEGIN PGP SIGNED MESSAGE-----<br>
Hash: SHA1<br>
<br>
</div><div class="im">On Mar 22, 2009, at 10:15 AM, Greg Freemyer wrote:<br>
<br>
> If you have the disk space and few hours to let it run, I would just<br>
> "split" that file into big chunks. Maybe a million lines each.<br>
<br>
</div>Well, I could just sed the range of lines I want out in the same time<br>
frame, and keep the result in one log file as well, which is my<br>
preference. I've got about 400 gigs of space left on the disk, so I've<br>
got some room. I mean, I don't really care about the data that goes<br>
before, that should have been vaporized to the ether long before, I<br>
just need to isolate the section of the log I do want so I can parse<br>
it and give an answer to a customer.<br>
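For reference, pulling a line range out of a huge file this way, the equivalent of `sed -n 'M,Np'`, only needs one streaming pass; a sketch (1-based, inclusive line numbers):

```python
from itertools import islice

def line_range(path, first, last):
    """Return lines first..last (1-based, inclusive) of a file,
    streaming it rather than loading the whole thing into memory."""
    with open(path) as f:
        return list(islice(f, first - 1, last))
```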
<div class="im"><br>
> I'd recommend the source and destination of your split command be on<br>
> different physical drives if you can manage it. Even if that means<br>
> connecting up a external usb drive to hold the split files.<br>
<br>
</div>Not a machine I have physical access to, sadly. I'd love to have a<br>
local copy to play with and leave the original intact on the server,<br>
but pulling 114 gigs across a transatlantic link is not really an<br>
option at the moment.<br>
<div class="im"><br>
> If you don't have the disk space, you could try something like:<br>
><br>
> head -2000000 my_log_file | tail -50000 > /tmp/my_chunk_of_interest<br>
><br>
> Or grep has an option to grab lines before and after a line that has<br>
> the pattern in it.<br>
><br>
> Hopefully one of those 3 will work for you.<br>
<br>
</div>mysql's log file is very annoying in that it doesn't lend itself to<br>
easy grepping by line count. It doesn't timestamp every entry; it's<br>
more of a heartbeat thing (like once a second or every couple seconds,<br>
it injects the date and time in front of the process it's currently<br>
running). There's no set number of lines between heartbeats, so one<br>
heartbeat might have a 3 line select query, the next heartbeat might<br>
be processing 20 different queries including a 20 line update.<br>
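Given that heartbeat behaviour, isolating a date range means carrying the last seen timestamp forward onto the unstamped lines. A sketch, again assuming the YYMMDD HH:MM:SS heartbeat prefix described above:

```python
import re

# Heartbeat lines assumed to start "YYMMDD HH:MM:SS" as in the old
# MySQL general log; all other lines inherit the last date seen.
STAMP = re.compile(r"^(\d{6})\s+\d{1,2}:\d{2}:\d{2}")

def lines_between(lines, start, end):
    """Yield the lines whose governing heartbeat date falls in
    [start, end]; YYMMDD strings compare correctly as plain text."""
    current = None
    for line in lines:
        m = STAMP.match(line)
        if m:
            current = m.group(1)
        if current is not None and start <= current <= end:
            yield line
```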
<br>
I do have a script that will step through the log file and parse out<br>
what updates were made to what database and what table at what time,<br>
but it craps out when run against the entire log file, so I'm mostly<br>
just trying to pare the log file down to a size where it'll work with<br>
my other tools :)<br>
<div class="im"><br>
> FYI: I work with large binary data sets all the time, and we use split<br>
> to keep each chunk to 2 GB. Not specifically needed anymore, but if<br>
> you have a read error etc., it is just the one 2 GB chunk you have to<br>
> retrieve from backup. It also affords you the ability to copy the<br>
> data to FAT32 filesystem for portability.<br>
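The fixed-size chunking that `split -b` does can be sketched in the same vein; the chunk size, buffer size, and piece naming here are all illustrative:

```python
import os

def split_file(path, chunk_bytes=2 * 1024**3, bufsize=1 << 20):
    """Copy `path` into numbered pieces of at most chunk_bytes each,
    like `split -b`, reading one buffer at a time."""
    pieces = []
    with open(path, "rb") as src:
        n = 0
        while True:
            piece = f"{path}.{n:03d}"
            written = 0
            with open(piece, "wb") as dst:
                while written < chunk_bytes:
                    buf = src.read(min(bufsize, chunk_bytes - written))
                    if not buf:
                        break
                    dst.write(buf)
                    written += len(buf)
            if written == 0:
                os.unlink(piece)    # source exhausted; drop the empty piece
                break
            pieces.append(piece)
            n += 1
    return pieces
```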
<br>
</div>Normally, we rotate logs nightly and keep about a week's worth, so the<br>
space or individual size comparisons are usually not an issue. In this<br>
case, logrotate busted for mysql sometime back in November and the<br>
beast just kept eating.<br>
<div class="im">-----BEGIN PGP SIGNATURE-----<br>
Version: GnuPG v2.0.9 (Darwin)<br>
<br>
</div>iEYEARECAAYFAknGUTIACgkQXzanDlV0VY53YgCgkJxWJK6AAOZ+c2QTPN/gYLJH<br>
v/YAoPZXNIBckyfhfbMGrAZ6TNEqcIxV<br>
=IOjT<br>
<div><div></div><div class="h5">-----END PGP SIGNATURE-----<br>
<br>
_______________________________________________<br>
Ale mailing list<br>
<a href="mailto:Ale@ale.org">Ale@ale.org</a><br>
<a href="http://mail.ale.org/mailman/listinfo/ale" target="_blank">http://mail.ale.org/mailman/listinfo/ale</a><br>
</div></div></blockquote></div><br></div>