
Re: [Bacula-devel] small mysql patch


Hello,

Thanks for the patch.  I see no problem applying it, since it is only a 
comment that makes a suggestion.  However, as Eric has said, it may not be 
such a good idea for users to turn these indexes on since, as he points out, 
they will probably leave them in place.

I like Eric's suggestion that dbcheck automatically (possibly with an option, 
or by requesting permission) create and then delete the indexes that are 
needed for speed.  
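
To make that create-then-delete cycle concrete, here is a minimal sketch, not
dbcheck's actual code.  It uses Python with an in-memory SQLite database purely
as a stand-in for MySQL so that it runs anywhere; the simplified Path and File
tables, the index name idx_tmp_file_pathid, and the function names are all
assumptions made for illustration.

```python
import sqlite3

# Hypothetical name for the temporary index; chosen so that it cannot be
# confused with an index the administrator created on purpose.
TEMP_INDEX = "idx_tmp_file_pathid"


def make_demo_db():
    """Build a tiny in-memory database with one orphaned Path row.

    The two tables are a simplified assumption, not Bacula's real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Path (PathId INTEGER PRIMARY KEY, Path TEXT);
        CREATE TABLE File (FileId INTEGER PRIMARY KEY,
                           JobId INTEGER, PathId INTEGER);
        INSERT INTO Path VALUES (1, '/etc/'), (2, '/home/'), (3, '/orphan/');
        INSERT INTO File VALUES (1, 10, 1), (2, 10, 2), (3, 11, 1);
        """
    )
    return conn


def purge_orphan_paths(conn, confirm=lambda: True):
    """Delete Path rows that no File row references.

    A temporary index on File(PathId) is created for the cleanup and
    dropped afterwards, even if the cleanup fails.  Returns the number of
    deleted rows, or None if the confirm() callback declined.
    """
    if not confirm():  # ask permission first (extra disk space, time)
        return None
    cur = conn.cursor()
    cur.execute(f"CREATE INDEX IF NOT EXISTS {TEMP_INDEX} ON File (PathId)")
    try:
        cur.execute("ANALYZE")  # refresh the query planner's statistics
        cur.execute(
            "DELETE FROM Path WHERE NOT EXISTS "
            "(SELECT 1 FROM File WHERE File.PathId = Path.PathId)"
        )
        deleted = cur.rowcount
        conn.commit()
    finally:
        # Never leave the index behind, so attribute insertion stays fast.
        cur.execute(f"DROP INDEX IF EXISTS {TEMP_INDEX}")
        conn.commit()
    return deleted


if __name__ == "__main__":
    db = make_demo_db()
    print("orphaned Path rows removed:", purge_orphan_paths(db))
```

On MySQL the same steps would be expressed roughly as CREATE INDEX ... ON
File (PathId), ANALYZE TABLE File, the cleanup query, and DROP INDEX ... ON
File.  Dropping the index in a finally clause is the point of the design: the
index exists only for the duration of the cleanup, so users cannot forget it.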

Or perhaps we can speed up dbcheck itself.  The current SQL was written by me 
and is *far* from optimal.  Marc Cousin has rewritten the dbcheck sql in a 
script file in <bacula-source>/examples/database/dbcheck.sql, which he says 
runs *much* faster.  In the short run, I would recommend trying out this 
script.  In the long run, when some developer steps forward, or we finish our 
3.0.0 work, I would like to integrate his ideas into dbcheck.

Please see more notes below ... 


On Sunday 31 August 2008 23:17:05 Yuri Timofeev wrote:
> 2008/8/31 Eric Bollengier <eric@xxxxxxxxxxxxxxxx>:
> > On Sunday 31 August 2008 21:32:16 Yuri Timofeev, you wrote:
> >> Hi
> >>
> >> 2008/8/31 Eric Bollengier <eric@xxxxxxxxxxxxxxxx>:
> >> > Adding indexes will probably speed up dbcheck, but it will slow down
> >> > the attribute insertion process and grow the database.
> >>
> >> I wrote earlier (see the reference to the discussion) that dbcheck
> >> could not complete the job in 3 days!
> >> Once the necessary indexes were created, dbcheck completed all of its
> >> work in 3-5 minutes!
> >> So my point is NOT just about increasing the speed of dbcheck (though
> >> that too); it is that dbcheck is NOT working normally.
> >
> > That is why I suggest you add them in dbcheck.c. If we add this to the
> > default creation script, we can be sure that users will leave them in
> > place forever, and afterwards they will complain about performance
> > problems.
>
> If the problem is only in dbcheck (and there is no problem with Verify
> Jobs), then dbcheck has to be changed.
> Maybe I can do it later; I am a beginner C programmer ;)
>
> So I asked here (bacula-devel):
> "What IDE, editors, plugins, utilities, etc. do you use in developing
> Bacula? In other words, I am interested in your working environment."
> ;)

Yes, and I answered for myself (a dinosaur): you only need a text editor and 
make, and occasionally gdb.

>
> >> > So, we can say that speed of attribute insertion is much more
> >> > important than dbcheck performance.
> >>
> >> MySQL performs INSERT operations with a very high speed, thus slowing
> >> the work should not be noticeable, IMHO.
> >
> > Performance is much higher with few indexes. I'm not sure that MySQL
> > will insert millions of rows at the same speed if you add 5 indexes.
>
> A million files in a single Job? Maybe this is indeed a rare case.

No, it is, unfortunately, not at all rare.  There are even some users with 10 
Million files in a single backup.  This causes a number of problems for 
restore.  We have some ideas on how to fix it but not the time ... yet ....

>
> >> > The dbcheck operation has to be run once per month, maybe twice per
> >> > year.
> >>
> >> I do understand (see comments in src/cats/make_mysql_tables.in:46,
> >> revision 7535) that there are problems with Verify Jobs.
> >
> > You can use them if your verify jobs are too slow. Many users don't use
> > this feature. In bacula, you are free to modify indexes as you want.
>
> I agree with you, but all these questions are, imho, political:
> whether Bacula should function normally "out of the box" (the choice for a
> large number of users / system administrators), or whether Bacula should
> work for hackers (who will explore SQL queries, create / delete indexes,
> etc.).
> For example, I often do not have time to do so.

The problem is balancing making it "work out of the box" and ensuring that it
works.  There are users out there with 100GB databases, and simply enabling 
new indexes that should be rarely used could be disastrous.  dbcheck is not a 
priority item, so we have not spent much time on this problem.  However, see 
Marc Cousin's script.

>
> No problem will occur, imho, if a Job takes 5-10 minutes longer
> and the database is 5-10 GB larger.

Unfortunately the same does not apply to a number of our big users.

>
> >> > If you want to use new indexes with dbcheck, you can modify dbcheck.c
> >> > in this way:
>
> 1. check that MySQL is in use (?)
>
> 2. verify that the indexes exist; if they do not, then:
> >> >  - ask the user if he wants to add them (think about disk space)
> >> >  - add them + run analyze
> >> >  - run the cleanup operation
> >> >  - remove them
> >> >
> >> > I'm not sure that all your indexes are useful. For example, (FileId,
> >> > JobId) is probably used to see if you have orphan JobIds in the File
> >> > table. The Job table is very small compared to the others, so the
> >> > database engine won't use it.
> >> >
> >> > Here, it's not the type of your index that will be important; it's
> >> > much more the size of the File table (hundreds of millions in my case)
> >> > that will change the cost of your index.
> >>
> >> I think the problem (dbcheck) and its solution can be described, for
> >> example, in the "Troubleshooting" section of the documentation.
> >> Then there is no need to change the source code.
> >
> > This solution will not decrease support requests on this subject; too
> > few users read that document. Modifying dbcheck is a better idea. If
> > you can't work on it, perhaps we can add this to the project list.
>
> I do not know.


> On the other hand, it is really not the biggest problem.

Yes, that is the basis on which we are working too :-)

>
> For example, I am more concerned about progress on this item from
> the projects file:
> "Item 8: Implement Copy pools" ;)
> I'll send a separate message on this issue.

Yes, I would like to hear your ideas on Copy pools.

By the way, one of the most difficult aspects of evolving Bacula is to do so 
in the current spirit of the way it works (central control, ...) and to move 
forward with new features a step at a time without breaking the current 
design or making major redesigns or creating incompatibilities with previous 
versions.

Best regards,

Kern

>
> >> >> I believe that the outcome of the discussion
> >> >>
> >> >> "[Bacula-users] Bacula 2.2.8, dbcheck never completes"
> >> >> http://sourceforge.net/mailarchive/forum.php?thread_name=48A984BF.4050206%40umdnj.edu&forum_name=bacula-users
> >> >>
> >> >> calls for a small patch; see the attached file.
> >> >>
> >> >> Since all the fields covered by the indexes are of type INTEGER, the
> >> >> indexes will be small and the speedup of dbcheck will be very
> >> >> substantial.


