Re: [Bacula-devel] 2.4.1 not releasing tapes
On Saturday 26 July 2008 02:14:54 Shad L. Lords wrote:
> Kern Sibbald wrote:
> > In the output that you show below, there are absolutely no Volumes that
> > are locked to any drive. The only thing that is unusual is that it shows
> > two Volumes that were previously reserved on a single drive -- that should
> > normally be a maximum of one. In this case, aside from a tiny amount of
> > memory use, I doubt that there is any harm. I suspect that it has to do
> > with running restore jobs, and is very likely fixed by the patch that I
> > posted to bug #1126, and will be in 2.4.2 to be released shortly.
> I've been able to narrow down when this happens. It has nothing to do
> with restores at all. In fact I've not issued any restores since
> starting all the daemons. Backups all start about the same time and all
> running from the same pool. I've got other jobs that are running at
> prio 5 and so jobs 1-3 all start at the same time.
> - Job 1 (pri 10) starts and is assigned to Drive 1
> - Job 2 (pri 10) starts and is assigned to Drive 0
> - Job 3 (pri 10) waits because max jobs to client is 2
> - Job 4 (pri 15) waits for job 1-3 to finish
> - MA3001 (almost full) is loaded in Drive 0
> - MA3002 (almost full) is loaded in Drive 1
> - MA3001 fills up before Job 1 finishes
> - MA3001 is ejected from Drive 0
> - MA3003 is put in Drive 0
> - Job 1 ends before MA3002 fills up
> - Job 3 is released and starts writing to Drive 1 (MA3002)
> - Job 2 ends (freeing Drive 0 & MA3003 ??)
> - MA3002 fills up
> - MA3002 is ejected from Drive 1
> - MA3003 is ejected from Drive 0
> - MA3003 is put in Drive 1
> - Job 3 ends (supposedly freeing Drive 1 & MA3003 ??)
> - Job 4 starts as all higher priority jobs have finished
> - Job 4 is assigned Drive 0
> - MA3004 is loaded in Drive 0
> - Job 4 ends
> What you end up with is MA3001 and MA3002 are full, MA3003 is < 50%
> full, MA3004 is < 1% full (catalog in my case). I would have expected
> Job 4 to swap MA3003 back to Drive 0 and write the data there before
> starting a new volume.
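The expectation in the quoted message can be written down as a small selection rule. This is only a sketch of that expectation, not Bacula's actual volume-selection code: the `Volume` class, the `pick_volume` function, and the fill fraction of 0.4 are hypothetical (the report only says MA3003 is under 50% full).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    name: str
    fill: float           # fraction used, 0.0 to 1.0 (illustrative values)
    full: bool = False
    in_use: bool = False  # still reserved by a running job


def pick_volume(volumes: List[Volume]) -> Optional[Volume]:
    """Prefer the fullest appendable, idle volume.

    Returns None when no such volume exists, meaning a new volume is needed.
    """
    candidates = [v for v in volumes if not v.full and not v.in_use]
    return max(candidates, key=lambda v: v.fill, default=None)


# Pool state just before Job 4 starts, as described in the report.
pool = [
    Volume("MA3001", 1.0, full=True),
    Volume("MA3002", 1.0, full=True),
    Volume("MA3003", 0.4),  # "< 50% full"; 0.4 is an assumed value
]

chosen = pick_volume(pool)
print(chosen.name if chosen else "new volume")
```

Under this rule Job 4 would be handed MA3003. If MA3003 were still marked in use when Job 4 asked for a volume, the same rule would fall through to a new volume, which matches what was observed.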
Well, what would probably clarify this is debug level 200 output from
the SD. We could then see exactly what the Director was returning to job 4 as
a Volume, or why the SD decided it could not use MA3003 for job 4.
Despite the fact that it did not do what you expected, it doesn't sound like
it really did anything wrong. It may just be one of those cases that could
use optimization, and there are a lot of them, but my first priority is to
get it working correctly, then hopefully we can optimize the swapping of
drives to be a minimum, and finally in some later version we can dynamically
switch drives and avoid swapping altogether, which is where I really want to
> You also end up with MA3003 showing up in the "In Use Volume status:"
> section of the status storage until you restart the SD.
> > IMO there is nothing to diagnose further. Any other problems you may be
> > having -- e.g. with your catalog job are unlikely to be related.
> I don't know if this is a bug or not. Just wanted to be thorough and
> provide enough information that someone else might be able to duplicate
> it. I'll continue to watch and see if it does the same thing with 2.4.2.
Yes, I think seeing if the problem exists on 2.4.2 is the first step ...
> > On Thursday 24 July 2008 23:59:59 Shad L. Lords wrote:
> >> I've run into a situation where it appears that bacula isn't releasing
> >> the tape in the drive. I've done releases on both drives but it still
> >> shows one of the tapes from each of the last two backups as being in use
> >> by drive2. I'm performing about 20 jobs spread across 18 hosts and
> >> bacula is balancing the jobs between the two drives successfully. The
> >> one odd thing I've noticed is that the catalog job always gets written
> >> to a new tape. The catalog job usually gets assigned drive1 (nst0) and
> >> the tape that ends up being locked to drive2 (MA3004/WA3003) both had
> >> free space and weren't in use at the time.
> >> Here is the relevant part of status storage from bconsole:
> >> Device status:
> >> Autochanger "SL-10K" with devices:
> >> "SL-10K-Drive1" (/dev/nst0)
> >> "SL-10K-Drive2" (/dev/nst1)
> >> Device "SL-10K-Drive1" (/dev/nst0) is not open.
> >> Drive 0 is not loaded.
> >> Device "SL-10K-Drive2" (/dev/nst1) is not open.
> >> Drive 1 is not loaded.
> >> ====
> >> In Use Volume status:
> >> MA3004 on device "SL-10K-Drive2" (/dev/nst1)
> >> Reader=0 writers=0 devres=0 volinuse=0
> >> WA3003 on device "SL-10K-Drive2" (/dev/nst1)
> >> Reader=0 writers=0 devres=0 volinuse=0
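For anyone trying to reproduce this, leftover entries like the two above can be spotted mechanically. The following is a minimal sketch that assumes the "In Use Volume status" text has exactly the layout pasted in this message (other Bacula versions may print it differently); `stale_volumes` and the regular expressions are hypothetical helpers, not part of Bacula.

```python
import re

# Sample text copied from the status storage output quoted above.
STATUS = '''\
In Use Volume status:
MA3004 on device "SL-10K-Drive2" (/dev/nst1)
Reader=0 writers=0 devres=0 volinuse=0
WA3003 on device "SL-10K-Drive2" (/dev/nst1)
Reader=0 writers=0 devres=0 volinuse=0
'''

VOL_RE = re.compile(r'^(\S+) on device "([^"]+)" \((\S+)\)$')
CNT_RE = re.compile(r'^Reader=(\d+) writers=(\d+) devres=(\d+) volinuse=(\d+)$')


def stale_volumes(text):
    """Return (volume, device) pairs whose usage counters are all zero."""
    stale = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = VOL_RE.match(line)
        if m:
            # Remember the volume until its counter line arrives.
            current = (m.group(1), m.group(2))
            continue
        m = CNT_RE.match(line)
        if m and current:
            if all(int(n) == 0 for n in m.groups()):
                stale.append(current)
            current = None
    return stale


print(stale_volumes(STATUS))
```

A volume that is listed with every counter at zero is still on the in-use list but has no reader, writer, or reservation attached, which is the situation reported here.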
Bacula-devel mailing list