
Re: [Bacula-devel] Pre-alpha version of Bacula plugins working

>>>>> "Kern" == Kern Sibbald <kern@xxxxxxxxxxx> writes:

Kern> On Tuesday 19 February 2008 23.13:04 John Stoffel wrote:
Kern> Mark today in your calendar.  Bacula just did its first backup
Kern> and restore of a MySQL database using a plugin.  I did it
Kern> using a simplistic "pipe" plugin.
>> Congrats!
Kern> The operation consisted of adding the following line to the
Kern> Include section of the FileSet:
Kern>           1     2                   3                                      4
Kern> Plugin = "bpipe:/@MYSQL/regress.sql:mysqldump -f --opt --databases regress:mysql"
Kern> This plugin line goes in the FileSet section where you have your
Kern> File = xxx lines, and for this plugin is composed of 4 fields
Kern> separated by colons (I've indicated the field numbers above the
Kern> Plugin line for reference only).
>> Ugh... sorry to be negative, but could we spend some time coming up
>> with a nicer syntax please?
Kern> Field 1 specifies a specific plugin name (in this case bpipe).
>> Ok.  But would it make more sense to have a plugin { ... } resource block
>> instead in the FileSet?
Kern> Field 2 specifies the namespace (in this case the pseudo
Kern> path/filename under which the backup will be saved -- this
Kern> will show up in the restore tree selection).
>> Hmm... what are the limits in Bacula's design in terms of namespace
>> support?  Does it strictly follow the unix rooted tree, or does it
>> also support the notion of drive letters as in PCs?  If so, why not
>> just extend the drive letter to be:
>> DB:/MYSQL[345N]/<database>/<table>
>> instead?  Maybe the version of mysql doesn't matter, but might be
>> useful if you try to restore a version 5 mysql dump onto a version 3
>> server to get a warning.
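
As a rough illustration of the path layout suggested above (nothing Bacula implements), here is a minimal Python sketch. It reads `[345N]` as "one of 3, 4, 5 or N", which is an assumption, as is treating N as "version not recorded". The names `DbPath`, `parse_db_path` and `restore_warning` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class DbPath(NamedTuple):
    version: str              # "3", "4", "5", or "N" (assumed: not recorded)
    database: str
    table: Optional[str]      # None when the path names a whole database


# Hypothetical layout: DB:/MYSQL<version>/<database>[/<table>]
_DB_PATH = re.compile(
    r"^DB:/MYSQL(?P<version>[345N])/(?P<database>[^/]+)(?:/(?P<table>[^/]+))?$"
)


def parse_db_path(path: str) -> DbPath:
    """Split a DB:/MYSQL... pseudo path into version, database and table."""
    match = _DB_PATH.match(path)
    if match is None:
        raise ValueError(f"not a DB:/MYSQL path: {path!r}")
    return DbPath(match["version"], match["database"], match["table"])


def restore_warning(path: str, server_major: int) -> Optional[str]:
    """Return a warning when a newer dump is restored onto an older server."""
    dump = parse_db_path(path)
    if dump.version != "N" and int(dump.version) > server_major:
        return (f"dump from MySQL {dump.version} is being restored "
                f"onto a MySQL {server_major} server")
    return None
```

With the version carried in the path, the restore side can compare it against the target server before any data is written.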
Kern> Field 3 is the "reader" shell command used for doing a backup.
Kern> Since this is a pipe plugin (Linux shared object), it does a
Kern> popen() on that command.
>> It would be nice to just abstract this out even more, so that the
>> command is specified elsewhere, and we just feed in the standard
>> connection arguments.  Sorta like the Perl::DBI modules abstract out
>> DB connection stuff.
Kern> Field 4 is the "writer" shell command used for doing the restore.
>> Does this need to be different, or handle remote DBs, etc?
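
To make the reader/writer idea concrete, here is a minimal sketch in Python rather than the plugin's C. It is not the bpipe source: it only mimics the described behaviour of running a shell command and copying its output into the backup stream, then feeding that stream to a second command on restore. The function names are hypothetical, and the demo uses `printf` and `cat` as stand-ins for mysqldump and mysql on a POSIX shell.

```python
import io
import os
import shlex
import subprocess
import tempfile
from typing import BinaryIO


def backup_via_pipe(reader_cmd: str, sink: BinaryIO, chunk: int = 65536) -> int:
    """Run the reader command and copy its stdout into the backup stream."""
    proc = subprocess.Popen(reader_cmd, shell=True, stdout=subprocess.PIPE)
    assert proc.stdout is not None
    total = 0
    for block in iter(lambda: proc.stdout.read(chunk), b""):
        sink.write(block)
        total += len(block)
    proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError(f"reader exited with status {proc.returncode}")
    return total


def restore_via_pipe(writer_cmd: str, source: BinaryIO, chunk: int = 65536) -> None:
    """Run the writer command and feed the saved stream to its stdin."""
    proc = subprocess.Popen(writer_cmd, shell=True, stdin=subprocess.PIPE)
    assert proc.stdin is not None
    for block in iter(lambda: source.read(chunk), b""):
        proc.stdin.write(block)
    proc.stdin.close()
    if proc.wait() != 0:
        raise RuntimeError(f"writer exited with status {proc.returncode}")


# Round trip with stand-in commands instead of mysqldump / mysql.
stream = io.BytesIO()
backup_via_pipe("printf 'dump-data'", stream)
stream.seek(0)
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "restored.sql")
    restore_via_pipe(f"cat > {shlex.quote(target)}", stream)
    with open(target, "rb") as fh:
        restored = fh.read()
```

Nothing in the sketch knows about MySQL, so the reader and writer can be entirely different commands, or the same client pointed at a remote host.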
Kern> I created a MySQL database named regress, populated it, backed
Kern> it up, dropped the database, then I restored the "file"
Kern> /@MYSQL/regress.sql, and the database was restored.  There is
Kern> nothing magical about /@MYSQL/...  It is just something unique
Kern> and distinctive enough that it will not be confused with another
Kern> file on the system.
>> Not sure I like this, esp since /@MYSQL/ is perfectly legal on Unix
>> systems as a filename.
Kern> As I mentioned, this is a rather trivial example of what can be
Kern> done with a simple pipe plugin.  As it stands, bpipe knows
Kern> nothing about MySQL (it is 365 lines of C code), but it could be
Kern> any shared object that can implement a C interface, and I could
Kern> imagine for example a MySQL specific plugin which could back up all
Kern> databases or a list of databases.  Also, Bacula was running with
Kern> an SQLite database -- it certainly would not work very well if
Kern> Bacula were using the MySQL database in question during the
Kern> restore ...
Kern> Obviously, this is a first cut and there remains a lot to be
Kern> done (much clean up, a lot of additional implementation, error
Kern> message implementation, and documentation), but at least it is
Kern> now a full proof of concept.
Kern> By the way, this is an example of what I call a "plugin
Kern> command", where a specific plugin is referenced, and it backs up
Kern> a specific file (or set of files).  I have also planned plugins
Kern> that will be called when particular Options are met (i.e. to
Kern> backup all .gz files, ...).  However, I am putting off
Kern> implementation of those plugins until later.
>> Can we nail down how the plugin interface is used first?  And maybe we
>> need to put all the plugins under their own 'plugin:/...' namespace as
>> well, so that it doesn't get mixed up with regular files and such when
>> browsing.

Kern> Aside from specifying the name of the plugin followed by a
Kern> colon, everything else on the line is up to the plugin designer.
Kern> In this particular case, fields 2, 3, and 4 have defined
Kern> meanings but the user can put anything he/she wants into them.
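
For reference, the split described here can be sketched in a few lines of Python. This is an illustration only, not how Bacula parses the line: it assumes the core strips the plugin name at the first colon and that bpipe then splits the remainder on colons. `BpipeSpec`, `split_plugin_name` and `parse_bpipe` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class BpipeSpec(NamedTuple):
    plugin: str      # field 1: plugin name
    namespace: str   # field 2: pseudo path shown in the restore tree
    reader: str      # field 3: shell command run for the backup
    writer: str      # field 4: shell command run for the restore


def split_plugin_name(line: str) -> Tuple[str, str]:
    """Split off the plugin name; the rest is opaque to everything but the plugin."""
    name, sep, rest = line.partition(":")
    if not sep or not name:
        raise ValueError(f"no plugin name in {line!r}")
    return name, rest


def parse_bpipe(line: str) -> BpipeSpec:
    """Interpret the remainder the way the four-field bpipe layout describes."""
    name, rest = split_plugin_name(line)
    fields = rest.split(":")
    if len(fields) != 3:
        raise ValueError(
            f"expected 4 colon-separated fields, got {len(fields) + 1}"
        )
    return BpipeSpec(name, *fields)


spec = parse_bpipe(
    "bpipe:/@MYSQL/regress.sql:mysqldump -f --opt --databases regress:mysql"
)
```

A naive split like this one rejects any command or path that itself contains a colon, which is one practical weakness of packing everything into a single string.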

Sigh... are you just ignoring my comments?  I'm trying to point out
that the syntax you have proposed for specifying how to integrate
plugins is, IMO, ugly.  

Sure, I understand that it's up to the developer of the plugin to
specify what needs to be used in each field.  It's not really up to the
user, the developer will be telling them what to fill in where.

So what I'm proposing is that instead of:

  "bpipe:/@MYSQL/regress.sql:mysqldump -f --opt --databases regress:mysql"

in the FileSet, that we instead make this a separate JOB completely,
because it shouldn't be mixed in with regular FileSets.  I guess I
didn't make that clear enough.  Then you could have something generic
like below, which doesn't show the bpipe stuff because it's not
needed, it's just the API that the driver writer needs to worry about,
not the end user:

       plugin {
          driver: mysql[345...]    # required
          targets: ...

          namespace: /@MYSQL       # optional, can default to driver default
          backupcmd: ....          # optional, can default to driver default
          restorecmd: ....         # optional, can default to driver default
       }

But you never bothered to answer any of my comments on the Namespace
design, etc.  I know this is boring stuff in some ways, but it's vital
from the end-user's point of view to make it crystal clear where their
data is and how it can be found again for restore.  

Now, to explain the fields I chose here:

     driver:  This is where we define which driver we're setting up,
              and what its base type is.  Can be very flexible, but
              we might want to reserve a namespace here for bacula
              supplied drivers, versus outside drivers other people
              write.

     targets: This is just the names of what to back up with the
              driver.  Format is driver dependent, but we might want
              to specify a general syntax.  For example "ALL" could
              mean that all databases, mailboxes, NDMP mount points,
              etc. on a client should be handled by this driver.

	       Or you can name your own targets specifically.

      namespace:  What bacula shows as the top of the driver's backups
            when browsing.  Again, can be based on the driver name by
            default.

      backupcmd, restorecmd:  Since the driver will already have a
            structure for these, you can leave them alone or override
            them as needed.
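
A minimal Python sketch of how such a resource could resolve its optional fields against per-driver defaults. Everything in it is hypothetical: the `DRIVER_DEFAULTS` table, the `PluginResource` class, and the choice to expand the result back into bpipe-style lines, which is only one possible back end.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical defaults a "mysql" driver might ship with.
DRIVER_DEFAULTS: Dict[str, Dict[str, str]] = {
    "mysql": {
        "namespace": "/@MYSQL",
        "backupcmd": "mysqldump -f --opt --databases {target}",
        "restorecmd": "mysql",
    },
}


@dataclass
class PluginResource:
    driver: str                        # required
    targets: List[str]                 # driver-dependent names to back up
    namespace: Optional[str] = None    # optional, falls back to driver default
    backupcmd: Optional[str] = None    # optional, falls back to driver default
    restorecmd: Optional[str] = None   # optional, falls back to driver default

    def resolved(self) -> Dict[str, str]:
        """Merge user overrides over the driver's defaults."""
        if self.driver not in DRIVER_DEFAULTS:
            raise ValueError(f"unknown driver {self.driver!r}")
        merged = dict(DRIVER_DEFAULTS[self.driver])
        for key in ("namespace", "backupcmd", "restorecmd"):
            value = getattr(self, key)
            if value is not None:
                merged[key] = value
        return merged

    def to_bpipe_lines(self) -> List[str]:
        """Expand each target into the colon-separated form bpipe takes."""
        cfg = self.resolved()
        return [
            f"bpipe:{cfg['namespace']}/{t}.sql:"
            f"{cfg['backupcmd'].format(target=t)}:{cfg['restorecmd']}"
            for t in self.targets
        ]


lines = PluginResource(driver="mysql", targets=["regress"]).to_bpipe_lines()
```

The end user supplies only the driver and targets; the colon-separated string becomes an internal detail that the driver writer deals with.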

Maybe I'm being too eager to over-engineer this interface, but I'm
trying to hide the complexity and to make it *simple* to use.  Heck,
I'd even put all this into a DB table called 'plugins' and use that to
drive backups.  Get away from the endless configuration files beyond
the base ones to let the daemons talk to each other securely.  Store
the rest into the DB where it can be queried and updated more easily
with better syntax and bounds checking.
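
A rough sketch of what such a 'plugins' table might look like, using Python's bundled sqlite3 module purely for convenience. The column names, the constraints and the client name `dbhost-fd` are all made up for illustration and are not taken from the Bacula catalog schema.

```python
import sqlite3

# Hypothetical schema: one row per (client, driver, targets) combination.
SCHEMA = """
CREATE TABLE plugins (
    id         INTEGER PRIMARY KEY,
    client     TEXT NOT NULL,
    driver     TEXT NOT NULL CHECK (length(driver) > 0),
    targets    TEXT NOT NULL DEFAULT 'ALL',
    namespace  TEXT CHECK (namespace IS NULL OR namespace LIKE '/%'),
    backupcmd  TEXT,    -- NULL means "use the driver default"
    restorecmd TEXT,    -- NULL means "use the driver default"
    UNIQUE (client, driver, targets)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Register one plugin backup for a client.
conn.execute(
    "INSERT INTO plugins (client, driver, targets, namespace) VALUES (?, ?, ?, ?)",
    ("dbhost-fd", "mysql", "regress", "/@MYSQL"),
)

# Query what should be backed up on that client.
rows = conn.execute(
    "SELECT driver, targets FROM plugins WHERE client = ?", ("dbhost-fd",)
).fetchall()
```

The CHECK and UNIQUE constraints are where the bounds checking would live: a malformed namespace or a duplicate entry is rejected at insert time instead of surfacing during a backup.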

It's not like we *don't* have lots of room for this stuff.

