[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bacula-devel] small dbcheck patch for mysql

>>>>> On Tue, 9 Sep 2008 11:40:02 -0400, Bill Moran said:
> In response to Bob Hetzel <beh@xxxxxxxx>:
> > Sorry to jump in late here... but if both of
> > a) the dbceck takes a long time w/o the index
> > b) the index adds quickly, then the dbcheck runs quickly
> > 
> > I would have to agree with Kern's earlier suggestion that perhaps 
> > somebody should look at the SQL code in dbcheck to optimize it.  This 
> > also has the added benefit of not adding more db specific code, 
> > especially given the course the mysql Inc.
> > 
> > Aren't there any SQL gurus interested in diving into it?
> After spending about 5 minutes looking at the code, I can tell you that
> (from an SQL standpoint) dbcheck is enormously inefficient, and will
> not get much better without a huge rewrite.  Additionally, doing that
> rewrite so that it results in improvements to all SQL platforms is
> probably going to require that the code be broken into SQL-dependent
> and SQL-independent sections.
> For example, the process to delete orphaned path records, first does
> a select to get a list of IDs, then goes through a loop to delete each
> record individually.  If you wanted to speed this up in PostgreSQL,
> you could do the following:
> 1) Wrap all the DELETE statements in a transaction
> 2) Make 1 big DELETE statement: DELETE FROM path where id IN ([list
>    id #s here])
> 3) Make less delete statements (much like #2, but instead of doing
>    all IDs in the IN, do, say 1000 at a clip)
> 4) A combination of 1, 2, and 3.
> 5) Combine the original select with the delete in a subselect.
> Any one of these is going to improve performance.  My guess would be
> that #5 will be the fastest on PG, but I can't be sure until I actually
> test it.

You are right in theory, but I there is no current evidence that the DELETEs
are significantly slow.  For a start, Path.PathId is a primary key and the
suggested new index on File.PathId must have sped up the SELECT, not the
DELETEs.  Secondly, unless the Path table is very much out of date relative to
the File table, you would not expect to find a huge number of orphaned path


This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
Bacula-devel mailing list

This mailing list archive is a service of Copilot Consulting.