Re: [Bacula-devel] A bug in 2.2.8-jobmedia.patch?

On Sunday 30 March 2008 16:42:23 Tom Ivar Helbekkmo wrote:
> Kern Sibbald <kern@xxxxxxxxxxx> writes:
> > I suspect that has something to do with the fact that you are using
> > spooling, and they record the start block early in the game at the
> > beginning of the spooling and the code is not intelligent enough to
> > wait until the very first block is actually written to *tape*.
> That is correct.  In stored/append.c, in do_append_data(), there are two
> calls to write_session_label(), performed at the start and end of
> spooling to a disk file.  These calls, which generate session labels in
> the disk file, marking the start and end of the session, both cause
> information about the current tape position to be recorded in the job's
> dcr structure.
> This has two specific consequences:
> - If several jobs start spooling at the same time, they will all get the
>   current tape position noted as the StartFile/StartBlock for the job.
>   If they end up despooling to the file that was current when they
>   started spooling, this is what will end up in the JOBMEDIA table.  If
>   there is a file change before they despool, the setting of NewFile in
>   the dcr structure will fix this up later, but the "start of session"
>   label is already in the spool file, of course, so it holds the wrong
>   information anyway.
> - If the job is longer than the maximum spool size, it will get its
>   first spool session despooled, and then start spooling again after the
>   first despooling is over.  The last blocks despooled to tape from the
>   first session will not have been recorded, but they will be flushed
>   later, when the next session despools.  However, if another job has
>   been despooling while this one is spooling its second round, the
>   session label written to the spool file at its close will cause the
>   EndFile/EndBlock to be set to wherever the tape is at that time.  When
>   the dangling record is flushed to JOBMEDIA, it gets this wrong
>   information.  Both session labels in the spool file will be wrong,
>   too, of course, because they reflect the state of the tape during
>   spooling, not during despooling.
> I would have to study the code much more closely to work out what's the
> proper fix -- but it seems clear that it should involve creating the
> session labels only when something is actually written to the archive
> device, not during spooling.  I'm tempted to try making do_append_data()
> not create session labels if we're spooling, and add the making of them
> to despool_data() in stored/spool.c.  Sound reasonable?

Nice analysis. I'm impressed.  

Yes, your suggestion sounds reasonable, but I hesitate to move such calls 
around due to possible side effects, so for the moment I prefer a "minimal" 
fix.  Perhaps moving the calls is something to do in the longer term.

I have made a simple patch, which seems to correct the problem.

Index: src/stored/spool.c
--- src/stored/spool.c  (revision 6701)
+++ src/stored/spool.c  (working copy)
@@ -246,6 +246,13 @@
    dcr->despool_wait = false;
    dcr->despooling = true;
+   /*
+    * If we have not switched volumes, update the start location when
+    *   we begin despooling.
+    */
+   if (!dcr->NewVol) {
+      set_start_vol_position(dcr);
+   }

     * This is really quite kludgy and should be fixed some time.
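
To make the intent of the patch easier to follow, here is a small
stand-alone sketch of the idea.  It is not Bacula code: the toy_* names
and both structs are hypothetical stand-ins for the dcr structure,
set_start_vol_position(), and the device position, and the apply_fix
flag exists only so the old and new behaviour can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a tape position. */
struct tape_pos {
   uint32_t file;
   uint32_t block;
};

/* Hypothetical stand-in for the parts of the dcr that matter here. */
struct toy_dcr {
   struct tape_pos start;   /* plays the role of StartFile/StartBlock */
   bool new_vol;            /* plays the role of dcr->NewVol */
};

/* Record the device's current position as the job's start position. */
static void toy_set_start_position(struct toy_dcr *dcr,
                                   const struct tape_pos *dev)
{
   assert(dcr != NULL && dev != NULL);
   dcr->start = *dev;
}

/* Start of spooling: the session label captures the tape position now. */
static void toy_begin_spooling(struct toy_dcr *dcr,
                               const struct tape_pos *dev)
{
   toy_set_start_position(dcr, dev);
}

/*
 * Start of despooling.  With apply_fix set, the start position is
 * refreshed unless the volume has changed, as in the patch above.
 */
static void toy_begin_despooling(struct toy_dcr *dcr,
                                 const struct tape_pos *dev,
                                 bool apply_fix)
{
   if (apply_fix && !dcr->new_vol) {
      toy_set_start_position(dcr, dev);
   }
}

/* Return the start block recorded for a job that spooled while the tape moved. */
static uint32_t toy_demo(bool apply_fix)
{
   struct tape_pos dev = { 3, 100 };
   struct toy_dcr job = { { 0, 0 }, false };

   toy_begin_spooling(&job, &dev);    /* position noted at block 100 */
   dev.block = 5000;                  /* another job despools meanwhile */
   toy_begin_despooling(&job, &dev, apply_fix);
   return job.start.block;
}
```

Without the refresh, the job keeps the position that was current when
it started spooling; with it, the job records where its data actually
begins on tape.  The sketch does not cover the EndFile/EndBlock case
from the second point of the analysis.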

It is committed to the trunk, but I am doing a bit more testing on the 2.2.9 
branch before applying it.  I would appreciate knowing whether it fixes your 
problem.

Best regards,


> -tih

