
Re: [Bacula-devel] Feature Request: Reconnect On Failure/Continuation of Failed Backups


A couple of comments about this feature request:

1. Bacula was designed around the concept that TCP/IP is a reliable service; 
therefore, the current Bacula implementation would need a major redesign to 
be able to restart a failed job.  We have been considering the possibilities 
in this regard.

2. This is not currently a project that we are working on or planning, given 
all the other high-priority projects (see below #5 for 

3. If you want this on the project list, we would like to see a bit of support 
from the users -- at the moment, we have had no feedback.  To a very large 
extent, the projects we work on are those that interest the most Bacula users, 
as determined by a voting process.

4. If you are having problems with failed jobs on a LAN, then you have a 
broken LAN or some other problem, and the LAN or other problem should be 
fixed.  Restarting a backup for a failed job over a LAN is not the solution 
to having a broken LAN.

5. We do see a number of failed jobs over the Internet, where the sysadmin has 
little or no control over the quality of the TCP/IP connections. This is 
particularly true for backups of laptops out in the field.  We are currently 
considering better ways of dealing with laptops -- a full solution is much 
more complicated than restarting a job.  This is a problem we intend to 
solve, though the solution may or may not address your request.

6. When backups fail, the data does not disappear. What was saved (written to 
the volume) is there and can be recovered from the Volume -- this is 
documented in the manual.

7. Bacula is *very* conservative in the way it works, and we prefer to 
re-write data from a job that has failed rather than risk problems 
because the restart was not done correctly. In fact, if you think about it, 
what does it mean to restart a job a day later when you have a partial backup 
for which the last block written may or may not be valid?  Whatever is 
implemented in the future, re-writing the data will remain the default 
action, because it is safer.

8. What many users forget is that Bacula is a multitasking system that runs 
multiple simultaneous jobs to a Volume.  If one of 100 jobs fails while 
writing a Volume, you cannot simply append new data to the Volume.  For me 
the concept of a "partial" incremental job (one that failed) is rather 
complex, and I would find it very difficult to know exactly how you go about 
restarting that job 5 hours later.  This doesn't mean that we are not 
thinking about the problem and ways to resolve it (possibly by using some of 
the new Accurate backup code techniques ...).

9. If you want to submit a "real" Feature request, please include all the 
fields with the correct formatting as specified in the Feature Request 
documentation on the web site (but first, please try to get a bit of user 
interest in this).  Given what I mention above, please take into account that 
one of the major problems is to have a viable algorithm for doing what you 
want -- this is something that we don't currently have.
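
To make points 7 and 8 a bit more concrete, here is a toy sketch in Python. It is not Bacula code and does not reflect Bacula's real Volume format; the Block record and the interleave, fail_job and usable_blocks helpers are all hypothetical names invented for illustration. It only shows that when several jobs write to one Volume, a failed job's blocks end up scattered among the others, its last block may be unusable, and the end of the Volume belongs to some other job.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    job_id: int     # which job wrote this block (hypothetical field)
    seq: int        # per-job sequence number
    complete: bool  # False if the write was cut off mid-block


def interleave(job_ids: List[int], blocks_per_job: int) -> List[Block]:
    """Write blocks from several jobs round-robin onto one toy volume."""
    volume = []
    for seq in range(blocks_per_job):
        for job_id in job_ids:
            volume.append(Block(job_id, seq, True))
    return volume


def fail_job(volume: List[Block], job_id: int, after_seq: int) -> List[Block]:
    """Simulate job_id dying: its later blocks never arrive and its
    last written block is marked as possibly invalid."""
    kept = [b for b in volume if b.job_id != job_id or b.seq <= after_seq]
    for b in kept:
        if b.job_id == job_id and b.seq == after_seq:
            b.complete = False
    return kept


def usable_blocks(volume: List[Block], job_id: int) -> List[Block]:
    """Blocks of job_id that can be trusted when reading the volume back."""
    return [b for b in volume if b.job_id == job_id and b.complete]


# Three jobs share one volume; job 2 fails while writing its third block.
volume = fail_job(interleave([1, 2, 3], blocks_per_job=4), job_id=2, after_seq=2)

# Job 2's blocks are scattered between blocks of the other jobs.
positions = [i for i, b in enumerate(volume) if b.job_id == 2]
print("job 2 block positions:", positions)
print("usable job 2 blocks:", len(usable_blocks(volume, 2)))
print("last block on volume belongs to job", volume[-1].job_id)
```

In this toy model, "resuming" job 2 cannot mean picking up where the Volume ends, since the Volume has moved on with other jobs' data. A restart would first need to work out which of job 2's blocks are trustworthy, which is the part for which we do not yet have a viable algorithm.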

Best regards,


On Wednesday 07 May 2008 17:43:44 Michael Short wrote:
> What: When backups fail data should not disappear, it should be used
> as a partial incremental. There should also be a way to re-establish
> broken backups upon failure.
> Why: It's an important feature because precious network bandwidth is
> wasted re-transferring files that were already sent but ignored.
> Notes: Files which are incomplete should not have to be resent. If we
> can combine this feature with rsync in the future, a lot of backup
> disk space can be conserved.
> Sincerely,
