
Re: [Bacula-devel] Pre-alpha version of Bacula plugins working

On Thursday 21 February 2008 20.57:08 John Stoffel wrote:
> >>>>> "Kern" == Kern Sibbald <kern@xxxxxxxxxxx> writes:
> Kern> On Tuesday 19 February 2008 23.13:04 John Stoffel wrote:
> Kern> Mark today in your calendar.  Bacula just did its first backup
> Kern> and restore of a MySQL database using a plugin.  I did it
> Kern> using a simplistic "pipe" plugin.
> >> Congrats!
> Kern> The operation consisted of adding the following line to the
> Kern> Include section of the FileSet:
> Kern>             1            2                            3                      4
> Kern> Plugin = "bpipe:/@MYSQL/regress.sql:mysqldump -f --opt --databases regress:mysql"
> Kern> This plugin line goes in the FileSet section where you have your
> Kern> File = xxx lines, and for this plugin is composed of 4 fields
> Kern> separated by colons (I've indicated the field numbers above the
> Kern> Plugin line for reference only).
> >> Ugh... sorry to be negative, but could we spend some time coming up
> >> with a nicer syntax please?
> Kern> Field 1 specifies a specific plugin name (in this case bpipe).
> >> Ok.  But would it make more sense to have a plugin { ... } resource
> >> block instead in the FileSet?
> Kern> Field 2 specifies the namespace (in this case the pseudo
> Kern> path/filename under which the backup will be saved -- this
> Kern> will show up in the restore tree selection).
> >> Hmm... what are the limits in Bacula's design in terms of namespace
> >> support?  Does it strictly follow the unix rooted tree, or does it
> >> also support the notion of drive letters as in PCs?  If so, why not
> >> just extend the drive letter to be:
> >>
> >> DB:/MYSQL[345N]/<database>/<table>
> >>
> >> instead?  Maybe the version of mysql doesn't matter, but might be
> >> useful if you try to restore a version 5 mysql dump onto a version 3
> >> server to get a warning.
> Kern> Field 3 is the "reader" shell command used for doing a backup.
> Kern> Since this is a pipe plugin (Linux shared object), it does a
> Kern> popen() on that command.
> >> It would be nice to just abstract this out even more, so that the
> >> command is specified elsewhere, and we just feed in the standard
> >> connection arguments.  Sorta like the Perl::DBI modules abstract out
> >> DB connection stuff.
> Kern> Field 4 is the "writer" shell command used for doing the restore.
> >> Does this need to be different, or handle remote DBs, etc?
> Kern> I created a MySQL database named regress, populated it, backed
> Kern> it up, dropped the database, then I restored the "file"
> Kern> /@MYSQL/regress.sql, and the database was restored.  There is
> Kern> nothing magical about /@MYSQL/...  It is just something unique
> Kern> and distinctive enough that it will not be confused with another
> Kern> file on the system.
> >> Not sure I like this, esp since /@MYSQL/ is perfectly legal on Unix
> >> systems as a filename.
> Kern> As I mentioned, this is a rather trivial example of what can be
> Kern> done with a simple pipe plugin.  As it stands, bpipe knows
> Kern> nothing about MySQL (it is 365 lines of C code), but it could be
> Kern> any shared object that can implement a C interface, and I could
> Kern> imagine for example a MySQL specific plugin which could back up
> Kern> all databases or a list of databases.  Also, Bacula was running with
> Kern> an SQLite database -- it certainly would not work very well if
> Kern> Bacula were using the MySQL database in question during the
> Kern> restore ...
> Kern> Obviously, this is a first cut and there remains a lot to be
> Kern> done (much clean up, a lot of additional implementation, error
> Kern> message implementation, and documentation), but at least it is
> Kern> now a full proof of concept.
> Kern> By the way, this is an example of what I call a "plugin
> Kern> command", where a specific plugin is referenced, and it backs up
> Kern> a specific file (or set of files).  I have also planned plugins
> Kern> that will be called when particular Options are met (i.e. to
> Kern> back up all .gz files, ...).  However, I am putting off
> Kern> implementation of those plugins until later.
> >> Can we nail down how the plugin interface is used first?  And maybe we
> >> need to put all the plugins under their own 'plugin:/...' namespace as
> >> well, so that it doesn't get mixed up with regular files and such when
> >> browsing.
> Kern> Aside from specifying the name of the plugin followed by a
> Kern> colon, everything else on the line is up to the plugin designer.
> Kern> In this particular case, fields 2, 3, and 4 have defined
> Kern> meanings but the user can put anything he/she wants into them.
> Sigh... are you just ignoring my comments?  I'm trying to point out
> that the syntax you have proposed for specifying how to integrate
> plugins is, IMO, ugly.
> Sure, I understand that it's up to the developer of the plugin to
> specify what needs to be used in each field.  It's not really up to the
> user, the developer will be telling them what to fill in where.
> So what I'm proposing is that instead of:
>   "bpipe:/@MYSQL/regress.sql:mysqldump -f --opt --databases regress:mysql"
> in the FileSet, that we instead make this a separate JOB completely,
> because it shouldn't be mixed in with regular FileSets.  I guess I
> didn't make that clear enough.  Then you could have something generic
> like below, which doesn't show the bpipe stuff because it's not
> needed, it's just the API that the driver writer needs to worry about,
> not the end user:
>        plugin {
> 	  driver: mysql[345...]    # required
> 	  targets: ...
> 	  namespace: /@MYSQL       # optional, can default to driver default
> 	  backupcmd: ....          # optional, can default to driver default
> 	  restorecmd: ....	   # optional, can default to driver default
> 	  }
> But you never bothered to answer any of my comments on the Namespace
> design, etc.  I know this is boring stuff in some ways, but it's vital
> from the end-user's point of view to make it crystal clear where their
> data is and how it can be found again for restore.
> Now, to explain the fields I chose here:
>      driver:  This is where we define which driver we're setting up,
>               and what its base type is.  Can be very flexible, but
>               we might want to reserve a namespace here for bacula
>               supplied drivers, versus outside drivers other people
>               write.
>      targets: These are just the names of what to back up with the
>               driver.  Format is driver dependent, but we might want
>               to specify a general syntax.  For example "ALL" could
>               mean all databases, mailboxes, NDMP mount points, etc
>               on a client should be handled by this driver.
>               Or you can name your own targets specifically.
>      namespace:  What bacula shows as the top of the driver's backups
>               when browsing.  Again, can be based on the driver name by
>               default.
>      backupcmd:
>      restorecmd:  Since the driver will already have a structure for
>               these, you can leave them alone or override them as needed.
> Maybe I'm being too eager to over-engineer this interface, but I'm
> trying to hide the complexity and to make it *simple* to use.  Heck,
> I'd even put all this into a DB table called 'plugins' and use that to
> drive backups.  Get away from the endless configuration files beyond
> the base ones to let the daemons talk to each other securely.  Store
> the rest into the DB where it can be queried and updated more easily
> with better syntax and bounds checking.
> It's not like we *don't* have lots of room for this stuff.
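
As a rough illustration of the plugin { ... } block proposed above, the
optional fields could fall back to per-driver defaults.  The Python sketch
below is only that, a sketch: PluginBlock, DRIVER_DEFAULTS and resolved() are
hypothetical names, nothing like this exists in Bacula, and the default values
are simply borrowed from the bpipe example earlier in the thread.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical per-driver defaults (assumption: not part of Bacula).
# The values are taken from the bpipe example quoted in this thread.
DRIVER_DEFAULTS = {
    "mysql": {
        "namespace": "/@MYSQL",
        "backupcmd": "mysqldump -f --opt --databases",
        "restorecmd": "mysql",
    },
}


@dataclass
class PluginBlock:
    driver: str                       # required
    targets: List[str]                # what the driver should back up
    namespace: Optional[str] = None   # optional, falls back to driver default
    backupcmd: Optional[str] = None   # optional, falls back to driver default
    restorecmd: Optional[str] = None  # optional, falls back to driver default

    def resolved(self) -> "PluginBlock":
        """Return a copy with every unset optional field filled from the driver defaults."""
        defaults = DRIVER_DEFAULTS[self.driver]
        return PluginBlock(
            driver=self.driver,
            targets=list(self.targets),
            namespace=self.namespace or defaults["namespace"],
            backupcmd=self.backupcmd or defaults["backupcmd"],
            restorecmd=self.restorecmd or defaults["restorecmd"],
        )


block = PluginBlock(driver="mysql", targets=["regress"]).resolved()
print(block.namespace, "|", block.backupcmd, "|", block.restorecmd)
```

With this shape the end user names only the driver and the targets, and
overrides a command or the namespace only when the driver default does not fit.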

I don't feel that I am experienced or smart enough to know what fields and 
values are sufficient for all that plugin developers are going to need to 
have specified, so I have left it to them to decide rather than limit them.
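
To make the four-field bpipe line from the quoted example concrete, here is a
minimal sketch of how such a string splits on colons.  It is written in Python
for brevity (bpipe itself is C), and BpipeCommand and parse_bpipe_line are
hypothetical names used for illustration only, not the actual bpipe code.

```python
from typing import NamedTuple


class BpipeCommand(NamedTuple):
    plugin: str     # field 1: plugin name, e.g. "bpipe"
    namespace: str  # field 2: pseudo path/filename shown in the restore tree
    reader: str     # field 3: shell command run for the backup
    writer: str     # field 4: shell command run for the restore


def parse_bpipe_line(line: str) -> BpipeCommand:
    """Split a bpipe-style plugin string into its four colon-separated fields."""
    fields = line.split(":")
    if len(fields) != 4:
        raise ValueError(
            "expected 4 colon-separated fields, got %d" % len(fields)
        )
    return BpipeCommand(*fields)


cmd = parse_bpipe_line(
    "bpipe:/@MYSQL/regress.sql:mysqldump -f --opt --databases regress:mysql"
)
print(cmd.reader)
```

Note that a plain split like this rejects any reader or writer command that
itself contains a colon, which is one practical limit of the colon-separated
syntax.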

I appreciate your comments, but I just don't have any comments on your 
comments other than to say that I have explained what I have implemented, and 
plugins as (partially) implemented were for me something difficult to design 
(I am referring to fitting it into Bacula and the API design, not the fact 
that it accepts a pretty much arbitrary string).  In the time left, I will
have difficulty finishing the API design, the programming, the testing, and
the documentation of the current implementation, which seems to get the job
done, so a major redesign at this point is pretty much impossible, at least
for me.

If someone would like to improve the code and submit a patch that implements a
more elegant solution for plugins, I and possibly others would be very happy.

I am and have been open to a lot of ideas, improvements and changes, but one
concept that has been discussed and rejected by users (and me) a number of
times is the idea of putting configuration data in the catalog.
