
Re: [Bacula-devel] Bug 1083 again...maybe


Hello,

I am preparing to release 2.2.10-b3 at the moment, and provided I can get it 
uploaded to SourceForge (not always so easy), I will release it as a BETA 
tonight.

It has two fixes since 2.2.10-b2 that should interest you:

1. It frees Volumes that are no longer in use but occasionally got "stuck".

2. More important, the Slot id used when moving a Volume from one drive to
   another was wrong more often than not (it only sometimes worked).  As a
   result, Bacula "forgot" to unload a Volume from a drive, then tried to
   move it to another drive, which caused an error from the changer script.
   This is now fixed.
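
To make the second failure mode concrete, here is a minimal sketch in Python.
It is a toy model only, not Bacula or changer-script code: the class
`Autochanger`, the helper `swap_into`, and the wording of the error are all
hypothetical names chosen for illustration. It shows why loading a Volume
that still sits in another drive fails unless that drive is unloaded first.

```python
class Autochanger:
    """Toy model of a multi-drive autochanger (hypothetical, not Bacula code)."""

    def __init__(self, num_drives):
        # Maps drive index -> slot number of the loaded Volume, or None if empty.
        self.drives = {d: None for d in range(num_drives)}

    def drive_holding(self, slot):
        """Return the drive that currently holds `slot`, or None."""
        for drive, loaded in self.drives.items():
            if loaded == slot:
                return drive
        return None

    def unload(self, drive):
        """Return the Volume in `drive` to its slot; gives back the slot number."""
        slot = self.drives[drive]
        self.drives[drive] = None
        return slot

    def load(self, slot, drive):
        """Load `slot` into `drive`; fails if the slot is empty or the drive is full."""
        holder = self.drive_holding(slot)
        if holder is not None and holder != drive:
            # Assumed error text, modelled on what a changer might report.
            raise RuntimeError(
                f"Storage Element {slot} Empty (loaded in drive {holder})"
            )
        if self.drives[drive] not in (None, slot):
            raise RuntimeError(f"Drive {drive} Full")
        self.drives[drive] = slot


def swap_into(changer, slot, drive):
    """Load `slot` into `drive`, unloading any drive that is in the way first."""
    if changer.drives[drive] == slot:
        return  # already in place
    if changer.drives[drive] is not None:
        changer.unload(drive)  # free the target drive
    holder = changer.drive_holding(slot)
    if holder is not None:
        changer.unload(holder)  # the step that was effectively being skipped
    changer.load(slot, drive)
```

In this model, calling `load(34, 0)` while slot 34 is still in drive 1 raises
the error, whereas `swap_into(changer, 34, 0)` unloads drive 1 first and then
succeeds.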

2.2.10-b3 cleans up *all* the problems I have seen with two-drive
autochangers.  There may be other problems, but I have not seen them.

The only remaining bug fixes for an official release are some Win32 problems 
with reparse points (mount points ...).

Regards,

Kern

PS: I leave tomorrow morning *early* for Ottawa and will not be back until the 
20th.

On Tuesday 13 May 2008 16:18:12 Josh Fisher wrote:
> Kern,
>
> Below is a condensed version of the log from a sequence of jobs using
> bacula-dir and bacula-sd v. 2.2.10-b2. I believe that the reason job
> 7272 below failed has something to do with bug 1083 (multi-drive disk
> autochanger and volume swapping between drives).
>
> I purposefully reset the job start times to prevent simultaneous jobs
> from starting. I inadvertently missed one and jobs 7273 and 7274 ran in
> parallel. This turned out to be interesting, because when jobs 7273 and
> 7274 were started simultaneously the drives were nearly in the same
> state as when job 7272 started. A usable volume was in drive 1 and an
> unusable volume (or no volume) was in drive 0. For the simultaneous jobs
> it worked. I think this might have been by chance, however, because it
> appears that job 7273 (using drive 0) did not have to swap from drive 1
> because job 7274 (using drive 1) had already unloaded drive 1 to slot 34.
>
> Also, both Dir and SD are running in the same Xen domU with only one
> virtual CPU assigned. Would job 7273 (the one using drive 0) have failed
> in the same way as job 7272 if another CPU had been available and the
> two jobs had "really" started simultaneously?
>
> Anyway, I did not have debug level set in the SD, so I don't have much
> more for you at this time. I am now running SD with debug level 150 and
> will let you know what happens. Should I file another bug report if this
> can be reproduced?
>
> Thanks,
>
> Josh Fisher
>
> ###  sequence of jobs before, during, and after volume swap error ###
>
> 12-May 08:10  drive 0 = slot-32 (pool=inc) - drive 1 = slot-34 (pool=inc)
>   job 7263 started
>     using drive 0
>     finds slot-32 (from pool="inc") in drive 0
>     unloads drive 0 into slot-32
>     loads slot-37 (from pool="cat") into drive 0
>     writes to slot-37
>     exits OK
>
> 12-May 11:10  drive 0 = slot-37 (pool=cat) - drive 1 = slot-34 (pool=inc)
>   job 7264
>     using drive 0
>     finds slot-37 (pool=cat) in drive 0
>     unloads drive 0 into slot-37
>     loads slot-39 (pool=full) into drive 0
>     job fails before writing ("no data available" at append.c:159 - see note below)
>
> 12-May 14:01  drive 0 = slot-39 (pool=full) - drive 1 = slot-34 (pool=inc)
>   job 7266
>     using drive 0
>     job failed (no route to host)
>
> 12-May 19:01  drive 0 = slot-39 (pool=full) - drive 1 = slot-34 (pool=inc)
>   job 7268
>     using drive 1
>     finds slot-34 (pool=inc) in drive 1
>     writes to slot-34
>     exits OK
>
> 12-May 19:02  drive 0 = slot-39 (pool=full) - drive 1 = slot-34 (pool=inc)
>   job 7269
>     using drive 1
>     finds slot-34 (pool=inc) in drive 1
>     writes to slot-34
>     exits OK
>
> 12-May 23:51  drive 0 = slot-39 (pool=full) - drive 1 = slot-34 (pool=inc)
>   job 7272
>     using drive 0
>     finds slot-39 (pool=full) in drive 0
>     unloads drive 0 into slot-39
>     attempts to load slot-34 (pool=inc) into drive 0
>     fails - autochanger script returns (Storage Element 34 Empty (loaded in drive 1))
>
> 13-May 00:31  drive 0 = unloaded  - drive 1 = slot-34 (pool=inc)
>   job 7273 started
>     using drive 0
>     finds drive 0 unloaded
>     loads slot-34 into drive 0
>     writes to slot-34
>     exits OK
>
>   job 7274 started
>     using drive 1
>     finds slot-34 (pool=inc) in drive 1
>     unloads drive 1 into slot-34
>     loads slot-32 (pool=inc) into drive 1
>     writes to slot-32
>     exits OK


