[ale] Any MySQL Experts around?
Thomas Wood
thomaswood at mac.com
Sat Feb 14 22:28:23 EST 2004
Just to add my $0.02 to this argument: I know it's possible to do
something like this because it's done at my office. Our two production
iPlanet 4 web servers running on Sun 280Rs generate nearly 4 gigabytes
of logs per day, which we export the next day into an Oracle database
running on a quad-processor V880 (soon to be a 12-processor V1280, but
that's another story). Not counting the time it takes to move the files
to the staging environment and gunzip and untar them, it takes about 3
hours to import the raw data. Those numbers are pretty rough, though,
and I don't know how much data gets dropped. I can give more precise
figures on Monday after I check some logs, if anybody's interested.
This process runs concurrently with reports and other data imports, so
your mileage may vary. I'll see if I can get some eyeballs on the
scripts that control the import, but they may be wholly useless in this
case since you want to write directly to a database rather than update
after the fact.

Hope that provides some comparison.
wood
On Feb 14, 2004, at 10:11 PM, Jason Etheridge wrote:
>> I'm thinking of having all my server logs (including NT/2K/2K3)
>> record into a MySQL db for better reporting. Does anyone know if
>> MySQL can handle 100,000+ inserts per day? Also, if it can handle
>> this many inserts, how good is the SELECT performance for the
>> automated reporting? Any thoughts, pointers?
>
> Disclaimer: I'm not a real DBA and the tuple calculus I learned in
> school actually tripped me up in the real world :D
>
> The largest table I work with in MySQL has roughly 8 million records
> using full-text indexing. Each row contains a little over 260 bytes
> of data. Every morning I turn indexing off and do a LOAD DATA REPLACE
> touching every record, changing a good bit of them, and adding a
> thousand or so new records. It can take around 3 hours to do that and
> reindex on an old Sun E10K (I forget how many processors; maybe a
> dozen for that domain). Queries are fast after that, but I use READ
> LOCKs and there are no updates during the day. I'm not reporting on
> live data.
>
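For anyone who wants to picture that nightly cycle in concrete MySQL
terms, here's a rough sketch of the statements involved. The table,
column, and file names are invented, and it assumes a MyISAM table with
a primary key (which LOAD DATA ... REPLACE needs in order to decide
what counts as a duplicate):

  -- skip non-unique index maintenance during the bulk load (MyISAM)
  ALTER TABLE log_entries DISABLE KEYS;

  -- replace existing rows and append new ones from a tab-delimited dump
  LOAD DATA INFILE '/data/staging/logdump.txt'
    REPLACE INTO TABLE log_entries
    FIELDS TERMINATED BY '\t';

  -- rebuild the indexes; this is where the slow REPAIR BY SORTING
  -- pass Jason mentions happens
  ALTER TABLE log_entries ENABLE KEYS;

  -- daytime reporting against a stable, read-locked table
  LOCK TABLES log_entries READ;
  SELECT COUNT(*) FROM log_entries WHERE status = 404;
  UNLOCK TABLES;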
> If you don't use full-text indexing, and instead use LIKE statements
> on large fields, searching is going to be pretty slow with that many
> records, especially if you can't lock the tables because you're doing
> updates. If you do use full-text indexing, I'd worry about 100,000+
> inserts a day and concurrent searching. Full-text updates work best
> with indexing disabled, and even then, when you go to re-enable
> indexing, it'll do it via REPAIR BY SORTING, which could take a long
> time if you don't have enough RAM to do it without thrashing the
> disks.
>
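Just to show the difference in query shape (table and column names made
up again): a LIKE with a leading wildcard can't use a normal index at
all and scans every row, while MATCH ... AGAINST goes through the
full-text index.

  -- no full-text index: this touches every one of those rows
  SELECT COUNT(*) FROM log_entries WHERE message LIKE '%timeout%';

  -- with a full-text index (MyISAM only in current MySQL versions):
  ALTER TABLE log_entries ADD FULLTEXT INDEX ft_message (message);

  SELECT COUNT(*) FROM log_entries
  WHERE MATCH(message) AGAINST('timeout');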
> What sort of reporting do you want to do, and what sort of logs are
> they?
> If you just want to store key statistical information rather than
> entire log entries, that would be better. But for storing actual
> logs, I'm not sure relational databases are the best tool. For
> log-based reporting, I currently use a kludge: the system parses
> specific data out of the log entries and loads it into temporary
> tables, covering just the months we're interested in for the report
> currently being run. This only works well for one or two months in
> our case because of the arbitrary timeout value I set. For anything
> really large, like a year's worth of transactions, I just run the
> logs through a Perl script and forget MySQL.
>
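That temporary-table kludge, and the "key statistics instead of whole
log entries" idea above it, both boil down to a summary table. A
minimal sketch, with all names invented:

  -- temporary table holding only the fields a report actually needs
  CREATE TEMPORARY TABLE report_hits (
      hit_date DATE         NOT NULL,
      url      VARCHAR(255) NOT NULL,
      status   SMALLINT     NOT NULL,
      hits     INT          NOT NULL,
      INDEX (hit_date)
  );

  -- load pre-parsed, pre-aggregated rows for just the month in question
  LOAD DATA INFILE '/tmp/feb2004_summary.txt'
    INTO TABLE report_hits
    FIELDS TERMINATED BY '\t';

  -- the report then runs against a few thousand rows, not millions
  SELECT hit_date, status, SUM(hits)
  FROM report_hits
  GROUP BY hit_date, status;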
> I have another table, without full-text indexing, that gets searched
> and updated throughout the day; it holds about a million records, but
> nothing is added except in the mornings, only updated. I'll try to
> find some stats for that if you want.
>
> The best thing to do is get a test machine and play around with some
> sample loads, and make sure to allocate plenty of filehandles for
> MySQL's process space.
>
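On the filehandle point, you can at least see where the server stands
from the mysql client before touching any limits (this works on recent
4.x versions, as far as I know):

  -- how many file descriptors mysqld was allowed at startup
  SHOW VARIABLES LIKE 'open_files_limit';
  -- how many table descriptors it will keep open at once
  SHOW VARIABLES LIKE 'table_cache';
  -- Opened_tables growing steadily means the cache is being thrashed
  SHOW STATUS LIKE 'Open%';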
> -- Jason
>