[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bacula-devel] small dbcheck patch for mysql

In response to Bob Hetzel <beh@xxxxxxxx>:

> Sorry to jump in late here... but if both of
> a) the dbceck takes a long time w/o the index
> b) the index adds quickly, then the dbcheck runs quickly
> I would have to agree with Kern's earlier suggestion that perhaps 
> somebody should look at the SQL code in dbcheck to optimize it.  This 
> also has the added benefit of not adding more db specific code, 
> especially given the course the mysql Inc.
> Aren't there any SQL gurus interested in diving into it?

After spending about 5 minutes looking at the code, I can tell you that
(from an SQL standpoint) dbcheck is enormously inefficient, and will
not get much better without a huge rewrite.  Additionally, doing that
rewrite so that it results in improvements to all SQL platforms is
probably going to require that the code be broken into SQL-dependent
and SQL-independent sections.

For example, the process to delete orphaned path records, first does
a select to get a list of IDs, then goes through a loop to delete each
record individually.  If you wanted to speed this up in PostgreSQL,
you could do the following:
1) Wrap all the DELETE statements in a transaction
2) Make 1 big DELETE statement: DELETE FROM path where id IN ([list
   id #s here])
3) Make less delete statements (much like #2, but instead of doing
   all IDs in the IN, do, say 1000 at a clip)
4) A combination of 1, 2, and 3.
5) Combine the original select with the delete in a subselect.

Any one of these is going to improve performance.  My guess would be
that #5 will be the fastest on PG, but I can't be sure until I actually
test it.

Now, will that work in MySQL?  Does MySQL support subselects yet?  Is
MySQL's transaction support faster than unchained transactions?  What
about SQLLite?

I think the transaction thing is a no-brainer.  If you create a
db_sql_start() and db_sql_commit() function and use them appropriately,
then you can have them do the appropriate SQL or nothing at all as
makes sense for the underlying DB engine.  But those other questions
are complicated and require a lot of investigation into all the
platforms in order to get right.  Most likely they require big
switch statements (or some similar method) to use different approaches
for different SQL platforms.

So, my question would be, is dbcheck's performance important enough to
do that with?  As a utility program that doesn't really get run that
often, it's quite likely a misdirection of resources.  Granted, don't
turn down the patch ... if it helps MySQL without hurting anything else,
go for it.

Bill Moran
Collaborative Fusion Inc.

Phone: 412-422-3463x4023

This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
Bacula-devel mailing list

This mailing list archive is a service of Copilot Consulting.