
Re: [Bacula-devel] Synthetic Full backup or Consolidation

On Saturday 17 May 2008 09:43:19 Marc Schiffbauer wrote:
> * Kern Sibbald wrote on 17.05.08 at 14:59:
> > Item  3:  Merge multiple backups (Synthetic Backup or Consolidation)
> >   Origin: Marc Cousin and Eric Bollengier
> >   Date:   15 November 2005
> >   Status:
> >
> >   What:   A merged backup is a backup made without connecting to the
> >           Client.  It would be a Merge of existing backups into a
> >           single backup.  In effect, it is like a restore but to the
> >           backup medium.
> >
> >           For instance, say that last Sunday we made a full backup.  Then
> >           all week long, we created incremental backups, in order to do
> >           them fast.  Now comes Sunday again, and we need another full.
> >           The merged backup makes it possible to do instead an
> >           incremental backup (during the night for instance), and then
> >           create a merged backup during the day, by using the full and
> >           incrementals from the week.  The merged backup will be exactly
> >           like a full made Sunday night on the tape, but the production
> >           interruption on the Client will be minimal, as the Client will
> >           only have to send incrementals.
> >
> >           In fact, if it's done correctly, you could merge all the
> >           Incrementals into single Incremental, or all the Incrementals
> >           and the last Differential into a new Differential, or the Full,
> >           last differential and all the Incrementals into a new Full
> >           backup.  And there is no need to involve the Client.
> >
> >   Why:    The benefit is that:
> >           - the Client just does an incremental;
> >           - the merged backup on tape is just like a single full backup,
> >             and can be restored very fast.
> >
> >           This is also a way of reducing the backup data since the old
> >           data can then be pruned (or not) from the catalog, possibly
> >           allowing older volumes to be recycled.
> This sounds like a very useful feature.
> Would there be a way to combine this with some sort of data-deduplication?

No, we won't get data deduplication from the consolidation project.
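
To make the merge described in the quoted item concrete, here is a minimal
sketch in Python. It is not Bacula code. The catalog model (one dict per job
mapping path to a version label, with None marking a file recorded as
deleted) and the name merge_backups are assumptions made only for
illustration.

```python
from typing import Dict, List, Optional

# Hypothetical model: one job is {path: version}; None means the job
# recorded the file as deleted. Neither convention comes from Bacula.
Job = Dict[str, Optional[str]]


def merge_backups(jobs: List[Job]) -> Dict[str, str]:
    """Merge jobs, given oldest first, into one synthetic backup."""
    merged: Dict[str, str] = {}
    for job in jobs:
        for path, version in job.items():
            if version is None:
                # A later job saw the file disappear: drop it.
                merged.pop(path, None)
            else:
                # A later job always overrides an earlier one.
                merged[path] = version
    return merged


full = {"/etc/hosts": "v1", "/home/a.txt": "v1", "/tmp/old": "v1"}
inc_mon = {"/home/a.txt": "v2"}
inc_tue = {"/tmp/old": None, "/home/b.txt": "v1"}

# Full + incrementals -> new Full, without contacting the Client.
synthetic_full = merge_backups([full, inc_mon, inc_tue])
```

Passing only the incrementals would give a single merged Incremental in the
same way. The Client is never involved, because everything is read from
existing backups.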

However, deduplication at the file level (not at a block or byte level) is 
already on the project list.  It has been on the list for a *very* long 
time (since the beginning) but has always been rated very low by the 
users.  In the current list it is Item  7:  Implement Base jobs.

Now that Eric has implemented the Accurate backup project, the Base Jobs 
project will be a rather easy extension of the "accurate" code.  The 
fundamental algorithms are very similar.   I doubt that we will implement the 
Base project in the next version, but it will almost certainly come in the 
following release.
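
As a rough illustration of what file-level deduplication against a base job
could look like, here is a sketch in Python. It is not the Base Jobs design,
which does not exist yet. The function names, the dict-based file list, and
the use of SHA-256 to decide that a file is unchanged are all assumptions.

```python
import hashlib
from typing import Dict, Tuple


def digest(data: bytes) -> str:
    """Checksum used to decide whether two files are identical (assumed)."""
    return hashlib.sha256(data).hexdigest()


def backup_against_base(
    files: Dict[str, bytes], base: Dict[str, str]
) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Split a client's files into data to store and references to the base.

    files: {path: content} on the client (hypothetical model).
    base:  {path: checksum} recorded by an earlier base job.
    """
    to_store: Dict[str, bytes] = {}
    references: Dict[str, str] = {}
    for path, data in files.items():
        checksum = digest(data)
        if base.get(path) == checksum:
            # Same path and same checksum as the base: record metadata only.
            references[path] = checksum
        else:
            # New or changed file: its data has to be written.
            to_store[path] = data
    return to_store, references


base = {"/bin/ls": digest(b"ls-binary")}
files = {"/bin/ls": b"ls-binary", "/etc/hosts": b"127.0.0.1 localhost"}
to_store, references = backup_against_base(files, base)
```

In this example only /etc/hosts would be written, while /bin/ls is recorded
as a reference to the base job. The comparison works on whole files, not on
blocks or bytes.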

I had originally planned to release 3.0.0 in June or July, but will probably 
push it back to September to allow a few more features to go in and to give 
us sufficient time for stabilization and testing.

Best regards,


> Deduplication is "just" writing metadata for a client and not
> putting the data on tape because it has already been written to tape
> by another client.
> Think of "hard-linking" data blocks on tape.
> Furthermore, there could be an option to specify a minimum and a maximum
> number of media that each data block has to be spread across.
> This would bring a real "killer feature", useful for most users, into
> Bacula, one that so far is offered only by commercial enterprise backup
> solutions (AFAIK).
> -Marc
