
Re: [Bacula-devel] [Bacula-users] Copy jobs in Bacula version 3.0.0


16.12.2008 19:21, Josh Fisher wrote:
> Arno Lehmann wrote:
>> Hi,
>> 16.12.2008 16:49, Josh Fisher wrote:
>>> Kern Sibbald wrote:
>>>> Hello,
>>>> I've been discussing with Eric how we might handle Copy jobs in our 
>>>> development version.  Currently, Copy jobs are implemented, and they 
>>>> work much like Migration jobs (share 99% of the code).  The 
>>>> difference is that Migration jobs purge the original backup job and 
>>>> keep only the Migrated data.  With a Copy Job, the original backup 
>>>> job remains and there is a second "identical" job that contains the 
>>>> copied data.  The only difference between the original and the Copy 
>>>> job is that they will be in different Pools.
>>>> Now this poses a few problems for doing restores such as:
>>>> 1. It is possible that a simple restore will choose JobIds from both 
>>>> the original and the Copy Job.
>>>> 2. There is no easy mechanism for the user to select whether he/she 
>>>> wants to restore from the original backup or the Copy (or Copies).
>>>> So for the moment, the situation is not really satisfactory (one of 
>>>> the reasons the code is not yet released).
>>>> We have a number of ideas for different ways to solve the above 
>>>> problems, many have already been discussed on the mailing lists, and 
>>>> we will probably implement a number of the ideas put forward, either 
>>>> before or after the release (depending on the time we have and the 
>>>> complexity of the proposal -- e.g. using the Location table and 
>>>> Costs ...).
>>>> A few things seem obvious:
>>> Maybe, but I am still trying to understand. :)  My thought is that a 
>>> Copy job is just that, a copy of a real Backup job, or in other 
>>> words, a "backup of a backup",
>> No.
> Hmmm. Bad choice of words, perhaps. It is a redundant copy, identical to 
> the original except with respect to the volumes used and the pool, yes?

Yes - it's the same set of Bacula data blocks, stored in another pool, 
having the same file information in the catalog. Quite like what Kern 
wrote initially :-)

>>> as for example an offsite backup. So I am inclined to think that to 
>>> "restore" from a Copy job is to restore the Backup job, rather than 
>>> the client machine. In this scenario, nothing at all would change 
>>> about the client restore job. Instead, a restore from a Copy job 
>>> would restore the original Backup job. A client would always be 
>>> restored from an "original" Backup job, though the "original" may 
>>> actually have been restored from a Copy.
>> The idea is to not need to do two restores, but only one.
> I failed to explain what I meant by a "restore" of a Copy job. Since the 
> two are identical except for Catalog entries, it should be possible to 
> "promote" the Copy job to a Backup job. When the client restore job is 
> completed, the "promoted" job could be demoted back to a Copy job.

If I understand you correctly, that means changing the catalog in a 
way that makes the copied job look like a regular backup job, then 
trigger the restore, and after that is done, set the job's data back 
to what it was before.

That's not a good idea - first, you'd need to persistently store the 
original job data, because the catalog must not be broken, should 
Bacula crash while the job is "promoted".

Second, while the job is "promoted", any other jobs running - either 
backups or restores - would see a different set of jobs in the 
catalog, i.e. jobs could behave differently depending on whether they 
run while a certain restore is in progress. That's not a good thing.

Of course, using transactions, you could make the promoted view reset 
itself automatically in case of a crash. The different view problem 
remains, though. You could resolve that by using a private catalog 
connection for each restore - but then you end up re-writing a good 
part of Bacula.
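
To make the transaction idea concrete, here is a minimal sketch, not 
Bacula code. It uses SQLite through Python only as a stand-in for the 
catalog; the table "job" and its columns "kind" and "pool" are 
hypothetical and do not reflect the real catalog schema. It shows a 
"promotion" done inside an uncommitted transaction on a private 
connection: other connections never see it, and it disappears if the 
connection dies before committing.

```python
import os
import sqlite3
import tempfile


def make_catalog(path):
    """Create a toy catalog: one original backup job and one copy (hypothetical schema)."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE job (jobid INTEGER PRIMARY KEY, kind TEXT, pool TEXT)")
    con.executemany(
        "INSERT INTO job VALUES (?, ?, ?)",
        [(1, "backup", "OnSite"), (2, "copy", "OffSite")],
    )
    con.commit()
    con.close()


def kinds(con):
    """Return {jobid: kind} as seen through the given connection."""
    return dict(con.execute("SELECT jobid, kind FROM job ORDER BY jobid").fetchall())


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "catalog.db")
        make_catalog(path)

        restore_con = sqlite3.connect(path)  # private connection of the restore
        other_con = sqlite3.connect(path)    # connection used by any other job

        # "Promote" the copy. sqlite3 opens a transaction implicitly before
        # the UPDATE, and we deliberately never commit it.
        restore_con.execute("UPDATE job SET kind = 'backup' WHERE jobid = 2")
        seen_by_restore = kinds(restore_con)
        seen_by_other = kinds(other_con)

        # Simulate a crash: closing without commit discards the pending change.
        restore_con.close()
        after_crash = kinds(other_con)
        other_con.close()
        return seen_by_restore, seen_by_other, after_crash


if __name__ == "__main__":
    for view in demo():
        print(view)
```

The catch is the one described above: every catalog access made on 
behalf of the restore would have to go through that one private 
connection, which is where the large rewrite comes from.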

> So 
> new functions would be "Promote Copy job to Backup Job" and "Demote 
> Previously Promoted Copy job". The idea is that the client restore code 
> would not need to change at all.

As far as I understand, there are only three major changes: One is to 
some of the catalog queries, contained in a few SQL statements. This 
is easily audited and can be regression-tested without difficulties. 
(Which shows us that it was a good choice to implement Bacula's 
catalog in an SQL DBMS, and that Bacula's API to catalog queries is 
well designed.)
The other change is the user interface, which will change a bit. As 
this change would only affect jobs with copies existing, it would 
never interfere with existing setups (which can, obviously, not 
contain job copies). In my opinion, that makes this change 
backward-compatible. Scripts that interface to Bacula through bconsole 
might need an update, though - but the same is true for direct catalog 
accesses. Both can be a problem for users, of course, but it's not one 
of the problems the Bacula project has to resolve, as any such scripts 
are third-party applications. By the way, I would have to rework some 
scripts of my own, but that's what happens if you don't get your 
add-ons integrated into the main project - I know that's my 
responsibility. Luckily, my scripts refuse to run when they encounter 
an unknown catalog database version :-)
Finally, the code to handle the new user interface has to be written 
and tested.
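
As an illustration of the first of these changes, the catalog queries, 
here is a rough sketch of a job selection that cannot mix originals and 
copies. It is not the actual Bacula SQL: the table "job", the columns 
"kind" and "prior_jobid", and the function restore_candidates() are all 
assumptions made up for this example, and SQLite is used only because 
it needs no setup.

```python
import sqlite3


def build_catalog():
    """Toy catalog: two original backups, each with a copy in another pool."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE job ("
        " jobid INTEGER PRIMARY KEY, name TEXT, kind TEXT,"
        " pool TEXT, prior_jobid INTEGER)"
    )
    con.executemany(
        "INSERT INTO job VALUES (?, ?, ?, ?, ?)",
        [
            (1, "client1-full", "backup", "OnSite", None),
            (2, "client1-incr", "backup", "OnSite", None),
            (3, "client1-full", "copy", "OffSite", 1),  # copy of job 1
            (4, "client1-incr", "copy", "OffSite", 2),  # copy of job 2
        ],
    )
    con.commit()
    return con


def restore_candidates(con, name_prefix):
    """Pick jobs for a simple restore, considering original backups only.

    The only difference from a pre-copy query is the extra
    "kind = 'backup'" condition, which keeps copies out of the result.
    """
    rows = con.execute(
        "SELECT jobid FROM job"
        " WHERE name LIKE ? AND kind = 'backup'"
        " ORDER BY jobid",
        (name_prefix + "%",),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(restore_candidates(build_catalog(), "client1"))
```

A change of this size is the kind that can be audited by reading one 
statement and regression-tested against a small fixture catalog.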

Those are three well-defined areas to work on. The alternative would 
require a much more encompassing rework of Bacula - I don't think that 
keeping the restore code itself unmodified is worth that.

> The user's choice would be whether or 
> not to promote the Copy job before running the restore job. The idea is 
> that the client restore code might not need to be changed at all.

Why is it a bad thing to change code? After all, Bacula is constantly 
evolving, and even the parts where functionality is not changed are 
subject to being refactored again and again.

And then, your suggestion requires that the user decides which jobs to 
restore from before actually starting the restore. It is much more 
user-friendly though to start with the restoration command first, and, 
if applicable, let the user decide which jobs to use when he can 
really see the choices. That's much better than starting a restore, 
then, after being presented with the list of jobs chosen and the media 
required, having to cancel the restore, find job copies, manually 
"promote" them, start the restore again, and afterwards - probably 
manually - having to "demote" the jobs. I guess that is even a much 
heavier change to the user interface than the few new dialog lines 
Kern suggested...
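
To sketch what "let the user decide when he can really see the 
choices" could mean in terms of data, here is a small example that, 
for each job a restore has already selected, looks up the original and 
any copies of it. Again, this is only an assumption-laden 
illustration: the schema, the "prior_jobid" link from a copy to its 
original, and the function restore_choices() are hypothetical, not 
Bacula's implementation.

```python
import sqlite3


def open_demo_catalog():
    """Toy catalog: job 1 has a copy (job 3), job 2 has none."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE job ("
        " jobid INTEGER PRIMARY KEY, kind TEXT, pool TEXT, prior_jobid INTEGER)"
    )
    con.executemany(
        "INSERT INTO job VALUES (?, ?, ?, ?)",
        [
            (1, "backup", "OnSite", None),
            (2, "backup", "OnSite", None),
            (3, "copy", "OffSite", 1),  # copy of job 1
        ],
    )
    con.commit()
    return con


def restore_choices(con, jobids):
    """Map each selected original job to the (jobid, pool) pairs it can be read from.

    The first entry of each list is the original itself; any further
    entries are copies that point back to it through prior_jobid.
    """
    choices = {}
    for jobid in jobids:
        choices[jobid] = con.execute(
            "SELECT jobid, pool FROM job"
            " WHERE jobid = ? OR prior_jobid = ?"
            " ORDER BY jobid",
            (jobid, jobid),
        ).fetchall()
    return choices


if __name__ == "__main__":
    for original, sources in restore_choices(open_demo_catalog(), [1, 2]).items():
        print(original, sources)
```

A user interface would only need to ask a question for jobs whose list 
has more than one entry, so setups without copies would see no change.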


Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück

Bacula-devel mailing list
