
[Bacula-devel] Marking Files for Restore


I have recently started testing Bacula 2.4.3 and am very pleased with 
the software. One thing I have noticed is very, very slow processing 
when marking files using bconsole. My points of comparison are the 
following commands:

restore all

vs

restore (and then)
mark {*|dir}

For the record I am using CentOS 4 on some old Dual P3 750 hardware with 
750MB RAM. It has been more than adequate as a backup server with our 
current software. I am using PostgreSQL 8.1.11 as the catalog server. 
The restore is for a single JobID. There are only 2 jobs in the catalog 
database.

Turning query logging on in PostgreSQL gives me the following results.

restore all
-----------

[This executes once!]
LOG:  duration: 8543.411 ms  statement: SELECT 
Path.Path,Filename.Name,FileIndex,JobId,LStat FROM File,Filename,Path 
WHERE File.JobId=2 AND Filename.FilenameId=File.FilenameId AND 
Path.PathId=File.PathId

Total Time: 8.5 seconds for 124,475 "files"

restore + mark {*|dir}
----------------------

[This query executes once for every directory]
LOG:  duration: 1.147 ms  statement: SELECT PathId FROM Path WHERE 
Path='/usr/src/kernels/2.6.9-55.0.2.plus.c4-i686/include/config/at1700/'

[These two queries execute once for every file]
LOG:  duration: 1.181 ms  statement: SELECT FileId, LStat, MD5 FROM File 
WHERE File.JobId=2 AND File.PathId=10352 AND File.FilenameId=27865
LOG:  duration: 0.751 ms  statement: SELECT FilenameId FROM Filename 
WHERE Name='module.h'

Total Time: 45+ minutes for 79,979 "files" in the /usr directory
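As a rough check on these numbers, the sketch below divides the wall-clock time by the file count and compares it with the logged statement durations. It assumes the two per-file samples shown above (1.181 ms and 0.751 ms) are typical of every file, which the log excerpt does not confirm, and it treats "45+ minutes" as exactly 45, so the per-file figure is a lower bound.

```python
# Back-of-the-envelope check using only the figures quoted above.
files = 79_979                 # "files" marked under /usr
total_seconds = 45 * 60        # "45+ minutes", taken as a lower bound

# Wall-clock cost per file for the whole mark operation.
per_file_ms = total_seconds * 1000 / files

# Assumption: the two logged per-file samples are representative.
logged_ms_per_file = 1.181 + 0.751

# Time the database itself reports spending on the per-file statements.
logged_total_s = files * logged_ms_per_file / 1000

print(f"wall clock per file:    {per_file_ms:.1f} ms")
print(f"logged query time/file: {logged_ms_per_file:.3f} ms")
print(f"logged query time total: {logged_total_s:.0f} s")
```

Under that assumption the logged statement durations add up to only a few minutes of the 45, so most of the time appears to be spent outside statement execution. The log does not show where; per-query round trips and client-side work are plausible candidates, not confirmed ones.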

For the situation of a single JobID it would seem that "mark *" at the 
root directory should use the same query as "restore all", rather than 
looping over every file/directory individually. For a specific 
directory, such as /usr, the following query based on the "restore all" 
query syntax seems to pull out the correct information:

LOG:  duration: 5150.884 ms  statement: SELECT 
Path.Path,Filename.Name,FileIndex,JobId,LStat FROM Path JOIN file on 
file.pathid=path.pathid JOIN filename on 
filename.filenameid=file.filenameid WHERE jobid=2 AND path like '/usr/%';

Total Time: 5.2 seconds for 79,979 "files"
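The difference between the two access patterns can be sketched in a few lines. This is only an illustration, not Bacula's code: it uses SQLite from the Python standard library in place of PostgreSQL, a cut-down hypothetical version of the File, Filename and Path tables, and made-up function names (`build_catalog`, `mark_per_file`, `mark_bulk`).

```python
import sqlite3

def build_catalog(conn):
    """Create a tiny, hypothetical stand-in for the catalog tables."""
    conn.executescript("""
        CREATE TABLE Path (PathId INTEGER PRIMARY KEY, Path TEXT UNIQUE);
        CREATE TABLE Filename (FilenameId INTEGER PRIMARY KEY, Name TEXT UNIQUE);
        CREATE TABLE File (FileId INTEGER PRIMARY KEY, FileIndex INTEGER,
                           JobId INTEGER, PathId INTEGER, FilenameId INTEGER,
                           LStat TEXT);
    """)
    rows = [("/usr/bin/", "ls"), ("/usr/bin/", "cat"),
            ("/usr/lib/", "libc.so"), ("/etc/", "passwd")]
    for index, (path, name) in enumerate(rows, start=1):
        conn.execute("INSERT OR IGNORE INTO Path (Path) VALUES (?)", (path,))
        conn.execute("INSERT OR IGNORE INTO Filename (Name) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat) "
            "SELECT ?, 2, Path.PathId, Filename.FilenameId, 'lstat' "
            "FROM Path, Filename WHERE Path.Path = ? AND Filename.Name = ?",
            (index, path, name))
    return rows

def mark_per_file(conn, job_id, entries):
    """One Path lookup per directory, two lookups per file (as in the log)."""
    queries, path_ids, marked = 0, {}, []
    for path, name in entries:
        if path not in path_ids:
            path_ids[path] = conn.execute(
                "SELECT PathId FROM Path WHERE Path = ?", (path,)).fetchone()[0]
            queries += 1
        filename_id = conn.execute(
            "SELECT FilenameId FROM Filename WHERE Name = ?", (name,)).fetchone()[0]
        found = conn.execute(
            "SELECT FileId, LStat FROM File "
            "WHERE JobId = ? AND PathId = ? AND FilenameId = ?",
            (job_id, path_ids[path], filename_id)).fetchone()
        queries += 2
        if found is not None:
            marked.append((path, name))
    return marked, queries

def mark_bulk(conn, job_id, directory):
    """A single join restricted by a path prefix, like the query above."""
    # Caveat: '%' and '_' inside the directory name act as LIKE wildcards,
    # and SQLite's LIKE ignores ASCII case, unlike PostgreSQL's.
    pattern = directory.rstrip("/") + "/%"
    cursor = conn.execute(
        "SELECT Path.Path, Filename.Name, FileIndex, JobId, LStat "
        "FROM Path JOIN File ON File.PathId = Path.PathId "
        "JOIN Filename ON Filename.FilenameId = File.FilenameId "
        "WHERE JobId = ? AND Path.Path LIKE ?", (job_id, pattern))
    return [(path, name) for path, name, *_ in cursor]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    rows = build_catalog(conn)
    usr = [row for row in rows if row[0].startswith("/usr/")]
    marked, queries = mark_per_file(conn, 2, usr)
    print(queries, "queries per file vs 1 query in bulk")
    print(sorted(marked) == sorted(mark_bulk(conn, 2, "/usr")))
```

Both functions select the same files. The per-file version issues a number of statements that grows with the file count, while the bulk version always issues one.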

I will not pretend to understand all of the complexities of the system 
for multiple JobIDs and file versions, or which queries are or are not 
compatible across PostgreSQL, MySQL and SQLite, but it would seem that 
some optimization may be possible.

Thank you in advance for any insight you can provide.

Tim
