[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Bacula-devel] 2.4.4-b1 strange volume/device handling


Hi,

I just updated from 2.2.8 to 2.4.4-b1 (dir/sd/fd debian etch)) so this might no
new 2.4.x issue.

I've problems with the volume/device handling in an multi drive
autochanger environment.

2 verify jobs were started to check the backups from last night. Both
backups are on the same volume (A00033L4).


Run Verify job
JobName:     VerifyVU0EM003-FBR
Level:       VolumeToCatalog
Client:      VU0EA003-fd
FileSet:     VU0EM003-FBR
Pool:        1-Week-Incremental (From User input)
Storage:     Neo4100 (From Pool resource)
Verify Job:  VU0EM003-FBR
Verify List: 
When:        2008-12-17 11:30:16
Priority:    10
OK to run? (yes/mod/no): yes
Job queued. JobId=7878


Run Verify job
JobName:     VerifyVU0EM003-VPU
Level:       VolumeToCatalog
Client:      VU0EA003-fd
FileSet:     VU0EM003-VPU
Pool:        1-Week-Incremental (From User input)
Storage:     Neo4100 (From Pool resource)
Verify Job:  VU0EM003-VPU
Verify List: 
When:        2008-12-17 11:30:51
Priority:    10
OK to run? (yes/mod/no): yes
Job queued. JobId=7879


Neo-4100 ist the autochanger with 3 LTO-4 drives.


Now the first job starts and wants volume A00033L4 in drive ULTRIUM-TD4-D1 (drive #1).


job queued. JobId=7879
17-Dez 11:30 VUMEM004-dir JobId 7878: Verifying against JobId=7854 Job=VU0EM003-FBR.2008-12-16_22.00.00.08
17-Dez 11:30 VUMEM004-dir JobId 7878: Bootstrap records written to /var/lib/bacula//VUMEM004-dir.restore.95.bsr
17-Dez 11:30 VUMEM004-dir JobId 7878: 
17-Dez 11:30 VUMEM004-dir JobId 7878: The job will require the following
   Volume(s)                 Storage(s)                SD Device(s)
===========================================================================
17-Dez 11:30 VUMEM004-dir JobId 7878:    
17-Dez 11:30 VUMEM004-dir JobId 7878:    A00033L4                  Neo4100                   Neo4100                  
17-Dez 11:30 VUMEM004-dir JobId 7878: 
17-Dez 11:30 VUMEM004-dir JobId 7878: Start Verify JobId=7878 Level=VolumeToCatalog Job=VerifyVU0EM003-FBR.2008-12-17_11.30.45.08
17-Dez 11:30 VUMEM004-dir JobId 7878: Using Device "ULTRIUM-TD4-D1"
17-Dez 11:30 VU0EA003-sd JobId 7878: Ready to read from volume "A00033L4" on device "ULTRIUM-TD4-D1" (/dev/ULTRIUM-TD4-D1).
17-Dez 11:30 VU0EA003-sd JobId 7878: Forward spacing Volume "A00033L4" to file:block 36:0.



A few moments later the second job wants volumes A00033L4 in drive ULTRIUM-TD4-D2 (drive #2).



17-Dez 11:31 VUMEM004-dir JobId 7879: Verifying against JobId=7855 Job=VU0EM003-VPU.2008-12-16_22.00.00.09
17-Dez 11:31 VUMEM004-dir JobId 7879: Bootstrap records written to /var/lib/bacula//VUMEM004-dir.restore.96.bsr
17-Dez 11:31 VUMEM004-dir JobId 7879: 
17-Dez 11:31 VUMEM004-dir JobId 7879: The job will require the following
   Volume(s)                 Storage(s)                SD Device(s)
===========================================================================
17-Dez 11:31 VUMEM004-dir JobId 7879:    
17-Dez 11:31 VUMEM004-dir JobId 7879:    A00033L4                  Neo4100                   Neo4100                  
17-Dez 11:31 VUMEM004-dir JobId 7879: 
17-Dez 11:31 VUMEM004-dir JobId 7879: Start Verify JobId=7879 Level=VolumeToCatalog Job=VerifyVU0EM003-VPU.2008-12-17_11.31.01.09
17-Dez 11:31 VUMEM004-dir JobId 7879: Using Device "ULTRIUM-TD4-D2"
17-Dez 11:31 VU0EA003-sd JobId 7879: 3301 Issuing autochanger "loaded? drive 1" command.
17-Dez 11:31 VU0EA003-sd JobId 7879: 3302 Autochanger "loaded? drive 1", result: nothing loaded.



Then bacula realised that the volume needed for the second job is already in drive 1....



17-Dez 11:34 VU0EA003-sd JobId 7879: Warning: Volume "A00033L4" is in use by device "ULTRIUM-TD4-D1" (/dev/ULTRIUM-TD4-D1)
17-Dez 11:34 VU0EA003-sd JobId 7879: 3301 Issuing autochanger "loaded? drive 1" command.
17-Dez 11:34 VU0EA003-sd JobId 7879: 3302 Autochanger "loaded? drive 1", result: nothing loaded.
17-Dez 11:34 VU0EA003-sd JobId 7879: Warning: acquire.c:221 Read open device "ULTRIUM-TD4-D2" (/dev/ULTRIUM-TD4-D2) Volume "A00033L4" failed: ERR=dev.c:432 Unable to open device "ULTRIUM-TD4-D2" (/dev/ULTRIUM-TD4-D2): ERR=Kein Medium gefunden

17-Dez 11:34 VU0EA003-sd JobId 7879: 3301 Issuing autochanger "loaded? drive 1" command.
17-Dez 11:34 VU0EA003-sd JobId 7879: 3302 Autochanger "loaded? drive 1", result: nothing loaded.



...and wants manual intervention.



17-Dez 11:37 VU0EA003-sd JobId 7879: Warning: Volume "A00033L4" is in use by device "ULTRIUM-TD4-D1" (/dev/ULTRIUM-TD4-D1)
17-Dez 11:37 VU0EA003-sd JobId 7879: Please mount Volume "A00033L4" for:
    Job:          VerifyVU0EM003-VPU.2008-12-17_11.31.01.09
    Storage:      "ULTRIUM-TD4-D2" (/dev/ULTRIUM-TD4-D2)
    Pool:         1-Week-Incremental
    Media type:   LTO4



After a while the first verify job finishes without error.



17-Dez 11:53 VU0EA003-sd JobId 7878: Alert: smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
17-Dez 11:53 VU0EA003-sd JobId 7878: Alert: Home page is http://smartmontools.sourceforge.net/
17-Dez 11:53 VU0EA003-sd JobId 7878: Alert: 
17-Dez 11:53 VU0EA003-sd JobId 7878: Alert: TapeAlert: OK
17-Dez 11:53 VU0EA003-sd JobId 7878: Alert: 
17-Dez 11:53 VU0EA003-sd JobId 7878: Alert: Error Counter logging not supported
17-Dez 11:53 VUMEM004-dir JobId 7878: Bacula VUMEM004-dir 2.4.4-b1 (03Dec08): 17-Dez-2008 11:53:52
  Build OS:               x86_64-pc-linux-gnu debian 4.0
  JobId:                   
  Job:                    VerifyVU0EM003-FBR.2008-12-17_11.30.45.08
  FileSet:                VU0EM003-FBR
  Verify Level:           VolumeToCatalog
  Client:                 VU0EA003-fd
  Verify JobId:           7854
  Verify Job:             VU0EM003-FBR
  Start time:             17-Dez-2008 11:30:47
  End time:               17-Dez-2008 11:53:52
  Files Expected:         11
  Files Examined:         11
  Non-fatal FD errors:    0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Verify OK

17-Dez 11:53 VUMEM004-dir JobId 7878: Begin pruning Jobs.
17-Dez 11:53 VUMEM004-dir JobId 7878: No Jobs found to prune.
17-Dez 11:53 VUMEM004-dir JobId 7878: Begin pruning Files.
17-Dez 11:53 VUMEM004-dir JobId 7878: No Files found to prune.
17-Dez 11:53 VUMEM004-dir JobId 7878: End auto prune.




But the second jobs still waits for a manual mount command.



Running Jobs:
 JobId Level   Name                       Status
======================================================================
  7879 VolumeT  VerifyVU0EM003-VPU.2008-12-17_11.31.01.09 is waiting for a mount request


At this point a 'stat stor' for the device looks like this:



VU0EA003-sd Version: 2.4.4-b1 (03 December 2008) x86_64-pc-linux-gnu debian 4.0
Daemon started 17-Dez-08 09:54, 4 Jobs run since started.
 Heap: heap=794,624 smbytes=508,663 max_bytes=732,584 bufs=211 max_bufs=1,185
Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8

Running Jobs:
Reading: Verify Volume to Catalog Restore job VerifyVU0EM003-VPU.2008-12-17_11 JobId=7879 Volume="A00033L4"
    pool="1-Week-Incremental" device="ULTRIUM-TD4-D2" (/dev/ULTRIUM-TD4-D2)
Writing: Verify Volume to Catalog Restore job VerifyVU0EM003-VPU.2008-12-17_11 JobId=7879 Volume="A00033L4"
    pool="1-Week-Incremental" device="ULTRIUM-TD4-D2" (/dev/ULTRIUM-TD4-D2)
    spooling=0 despooling=0 despool_wait=0
    Files=0 Bytes=0 Bytes/sec=0
    FDReadSeqNo=88 in_msg=87 out_msg=12 fd=8
====

Jobs waiting to reserve a drive:
====

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name 
===================================================================
  7616                0         0   Cancel   08-Dez-08 10:11 VerifyVU0EM003-VPU
  7617                0         0   Cancel   08-Dez-08 10:13 VerifyVU0EF005-Absicherung-MPC-Volume1
  7625  Full      1,213    847.5 G  OK       08-Dez-08 13:34 VU0EF005-Absicherung-MPC-Volume2
  7635  Incr          0         0   OK       08-Dez-08 22:02 VU0EM003-FBR
  7628  Full    182,234    951.2 G  OK       08-Dez-08 23:48 VUMEI001
  7663  Incr          0         0   OK       09-Dez-08 22:00 VU0EM003-FBR
  7855  Incr        349    7.276 G  OK       17-Dez-08 10:38 VU0EM003-VPU.2008-12-16_22
  7877                0         0   Error    17-Dez-08 11:18 VerifyVU0EM003-VPU.2008-12-17_11
  7876                0         0   Cancel   17-Dez-08 11:25 VerifyVU0EM003-FBR.2008-12-17_11
  7878                0         0   OK       17-Dez-08 11:53 VerifyVU0EM003-FBR.2008-12-17_11
====

Device status:
Autochanger "Neo4100" with devices:
   "ULTRIUM-TD4-D1" (/dev/ULTRIUM-TD4-D1)
   "ULTRIUM-TD4-D2" (/dev/ULTRIUM-TD4-D2)
   "ULTRIUM-TD4-D3" (/dev/ULTRIUM-TD4-D3)
Device "ULTRIUM-TD4-D1" (/dev/ULTRIUM-TD4-D1) is mounted with:
    Volume:      A00033L4
    Pool:        *unknown*
    Media type:  LTO4
    Slot 64 is loaded in drive 0.
Configured device capabilities:
EOF BSR BSF FSR FSF EOM REM !RACCESS AUTOMOUNT !LABEL !ANONVOLS ALWAYSOPEN
Device state:
OPENED TAPE LABEL !MALLOC !APPEND !READ !EOT !WEOT !EOF !NEXTVOL !SHORT !MOUNTED
num_writers=0 block=0

Device parameters:
Archive name: /dev/ULTRIUM-TD4-D1 Device name: ULTRIUM-TD4-D1
File=171 block=1
Min block=0 Max block=0
    Total Bytes Read=0 Blocks Read=0 Bytes/block=0
    Positioned at File=171 Block=1
Device "ULTRIUM-TD4-D2" (/dev/ULTRIUM-TD4-D2) is not open.
    Device is BLOCKED waiting for mount of volume "A00033L4",
       Pool:        1-Week-Incremental
       Media type:  LTO4
    Drive 1 status unknown.
Configured device capabilities:
EOF BSR BSF FSR FSF EOM REM !RACCESS AUTOMOUNT !LABEL !ANONVOLS ALWAYSOPEN
Device state:
!OPENED TAPE !LABEL !MALLOC !APPEND !READ !EOT !WEOT !EOF !NEXTVOL !SHORT !MOUNTED
num_writers=0 block=2

Device parameters:
Archive name: /dev/ULTRIUM-TD4-D2 Device name: ULTRIUM-TD4-D2
File=0 block=0
Min block=0 Max block=0
Device "ULTRIUM-TD4-D3" (/dev/ULTRIUM-TD4-D3) is not open.
    Drive 2 is not loaded.
Configured device capabilities:
EOF BSR BSF FSR FSF EOM REM !RACCESS AUTOMOUNT !LABEL !ANONVOLS ALWAYSOPEN
Device state:
!OPENED TAPE !LABEL !MALLOC !APPEND !READ !EOT !WEOT !EOF !NEXTVOL !SHORT !MOUNTED
num_writers=0 block=0

Device parameters:
Archive name: /dev/ULTRIUM-TD4-D3 Device name: ULTRIUM-TD4-D3
File=0 block=0
Min block=0 Max block=0
====

Used Volume status:
A00033L4 on device "ULTRIUM-TD4-D1" (/dev/ULTRIUM-TD4-D1)
    Reader=0 writers=0 devres=0 volinuse=0
====

====




So bacula knows that the needed volumes is in drive 1, but it doesn't unload it
and load it in drive 2. Or even better just use drive 1.

It needs a manual mount command to get the second job running.

Is this maybe something special with verify jobs?

Right now I can't think of a configuration error, but I think not many people
are using the verify feature. So maybe this is someting new, that wasn't
discovered in the new 2.4.x code.


Any ideas?

Ralf




Here are parts of my bacula config:

bacula-dir client/job section

Job {
  Name = VerifyVU0EM003-FBR
  Type = Verify
  Level = VolumeToCatalog
  Verify Job = VU0EM003-FBR
  FileSet = VU0EM003-FBR
  Schedule = "Verify VU0EM003-Messdaten"
  Client= VU0EA003-fd
  Messages = "Regular Backup"
  Pool = 2-Month-Full
  Storage = Neo4100
  Priority = 10
}

Job {
  Name = VerifyVU0EM003-VPU
  Type = Verify
  Level = VolumeToCatalog
  Verify Job = VU0EM003-VPU
  FileSet = VU0EM003-VPU
  Schedule = "Verify VU0EM003-Messdaten"
  Client= VU0EA003-fd
  Messages = "Regular Backup"
  Pool = 2-Month-Full
  Storage = Neo4100
  Priority = 10
}


bacula-dir storage section

Storage {
  Name = Neo4100
  Address = xx.60.9.193
  SDPort = 9103
  Password = "xxx"
  Device = Neo4100
  Media Type = LTO4
  Autochanger = yes
  Heartbeat Interval = 5min
  Maximum Concurrent Jobs = 3
}

Storage {
  Name = Neo4100-LTO4-D1
  Address = xx.60.9.193
  SDPort = 9103
  Password = "xxx"
  Device = ULTRIUM-TD4-D1
  Media Type = LTO4
  Maximum Concurrent Jobs = 1
}

Storage {
  Name = Neo4100-LTO4-D2
  Address = xx.60.9.193
  SDPort = 9103
  Password = "xxx"
  Device = ULTRIUM-TD4-D2
  Media Type = LTO4
  Maximum Concurrent Jobs = 1
}

Storage {
  Name = Neo4100-LTO4-D3
  Address = xx.60.9.193
  SDPort = 9103
  Password = "xxx"
  Device = ULTRIUM-TD4-D3
  Media Type = LTO4
  Maximum Concurrent Jobs = 1
}



bacula-sd.conf

Autochanger {
  Name = Neo4100
  Device = ULTRIUM-TD4-D1 
  Device = ULTRIUM-TD4-D2
  Device = ULTRIUM-TD4-D3
  Changer Command = "/etc/bacula/scripts/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/Neo4100
}

Device {
  Name = ULTRIUM-TD4-D1
  Drive Index = 0
  Media Type = LTO4
  Archive Device = /dev/ULTRIUM-TD4-D1
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Maximum Spool Size = 370g
  Maximum Job Spool Size = 150g
  Spool Directory = /bacula-spool/
  Alert Command = "sh -c 'smartctl -d scsi -H -l error %c'"  
}


Device {
  Name = ULTRIUM-TD4-D2
  Drive Index = 1
  Media Type = LTO4
  Archive Device = /dev/ULTRIUM-TD4-D2
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Maximum Spool Size = 370g
  Maximum Job Spool Size = 150g
  Spool Directory = /bacula-spool/
  Alert Command = "sh -c 'smartctl -d scsi -H -l error %c'"  
}

Device {
  Name = ULTRIUM-TD4-D3
  Drive Index = 2
  Media Type = LTO4
  Archive Device = /dev/ULTRIUM-TD4-D3
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Maximum Spool Size = 370g
  Maximum Job Spool Size = 150g
  Spool Directory = /bacula-spool/
  Alert Command = "sh -c 'smartctl -d scsi -H -l error %c'"  
}


------------------------------------------------------------------------------
SF.Net email is Sponsored by MIX09, March 18-20, 2009 in Las Vegas, Nevada.
The future of the web can't happen without you.  Join us at MIX09 to help
pave the way to the Next Web now. Learn more and register at
http://ad.doubleclick.net/clk;208669438;13503038;i?http://2009.visitmix.com/
_______________________________________________
Bacula-devel mailing list
Bacula-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/bacula-devel


This mailing list archive is a service of Copilotco.