
Re: [Bacula-devel] bacula-sd hanging after tape gets full + unload (2.5.19)


On Thu, 2008-12-04 at 14:35 +0200, Pasi Kärkkäinen wrote:
> On Thu, Dec 04, 2008 at 02:13:56PM +0200, Pasi Kärkkäinen wrote:
> 
> > > Could you stop all daemons with a sigsegv to force a backtrace ?
> > > killall -SEGV bacula-sd bacula-dir
> > > 
> > > (you will find two kinds of files, *traceback and *bactrace, in the working directory)
> > > 
> > > After that, if you can put the results on a pastebin, they will give information
> > > about your problem.
> > > 
> > 
> > Ok, problems again.. here are the tracebacks:
> > 
> > http://pasik.reaktio.net/bacula/debug/bacula-sd-traceback.txt
> > http://pasik.reaktio.net/bacula/debug/bacula-dir-traceback.txt
> > 
> > Here's what I did to make bacula-sd hang:
> > 
> > 1. Rebooted the bacula server and the tape library
> > 2. Right after the reboot, made sure mtx and Bacula's mtx-changer work OK.
> > 3. Started bacula
> > 4. Ran a job that copies jobs from disk pool to tape pool
> > 5. Bacula starts a bunch of jobs, but nothing happens.. bacula-sd is stuck.
> > 
> > Any ideas how to debug this further? 
> > 
> > At the moment I'm running Bacula 2.5.20 (svn rev 8083) on CentOS 5.2 x86 32-bit.
> > 
> > I also tried applying 2.4.3-sd-deadlock.patch (from bug #1192) but it didn't
> > seem to help.
> > 
> 
> And here is how I verified that bacula-sd is stuck/hung:
> 
> - Checked what's happening on the SCSI devices with "iostat 1" -> I don't see any disk activity.
> - Nothing happens in bconsole.
> - Checking the status of Storage (tape pool) in bconsole makes bconsole hang as well:
> 
> http://pasik.reaktio.net/bacula/debug/bconsole-sd-hang.txt
> 
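
The bconsole check quoted above could be scripted roughly as follows, so that a hung daemon is reported instead of freezing the terminal. This is only a sketch under assumptions: `probe_hang` is a made-up helper name, it relies on the GNU coreutils `timeout` command being installed (older distributions may not ship it), and the time limit is arbitrary.

```shell
#!/bin/sh
# probe_hang LIMIT CMD [ARGS...]   (hypothetical helper, not part of Bacula)
# Runs CMD with a time limit of LIMIT seconds and prints one word:
#   responsive - CMD exited 0 within the limit
#   hung       - CMD was still running when the limit expired
#   failed     - CMD exited non-zero on its own
probe_hang() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    case $rc in
        0)   echo "responsive" ;;
        124) echo "hung" ;;        # GNU timeout exits 124 when the limit is hit
        *)   echo "failed" ;;
    esac
}
```

Assuming bconsole is configured and accepts commands on standard input, something like `probe_hang 30 sh -c 'echo "status storage" | bconsole'` should print "hung" in the situation described above. The exact status command depends on the local setup.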

Hi,

Did you notice the broken "Terminated Jobs:" list in bconsole-sd-hang.txt?


Here is my output of "status dir" (after upgrading to current svn):


Connecting to Director troll:9101
1000 OK: troll-dir Version: 2.5.22 (01 December 2008)
Enter a period to cancel a command.
*status dir
troll-dir Version: 2.5.22 (01 December 2008) i686-pc-linux-gnu redhat Enterprise release
Daemon started 04-Dec-08 15:18, 0 Jobs run since started.
 Heap: heap=274,432 smbytes=129,912 max_bytes=130,465 bufs=1,260 max_bufs=1,294

...

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name 
====================================================================
  8696  Incr        366    6.058 G  OK       03-Feb-27 18:17
belix.2008-12-04_09
1228072365  � (2          1    5.275 E  Other    29-Sep-21 15:51
008-12-04_09
104123763  8 (5   1,228,379,050    8.299 E  Other    15-Jan-94 19:04
4_09
1228378991         1,802,725,700    3.615 E  Other    28-May-00 01:50 56
1801675074  � (1   841,903,973    3.471 E  Other    28-Sep-97 12:49 
1632923476  D (1   808,268,337    3.328 E  Other    01-Jan-70 01:00 
758657072  i (8   774,975,534    232.8 G  Other    01-Jan-70 01:00 
959471412  1 (8         53         0   Other    01-Jan-70 01:00 
775304494  9 (8          0         0   Other    01-Jan-70 01:00 



> Like that.. nothing appears after "Used Volume status:".
> 
> I've noticed that whenever this crash/hang happens, the status of the tape drive looks like this:
> 
> Device "IBM-LTO3-Drive" (/dev/nst0) is not open.
>     Device is being initialized.
>     Drive 0 status unknown.
> 
> The physical tape drive/library doesn't show any errors on its display, and everything looks OK. 
> I don't see any SCSI errors in dmesg or logs.
> 
> -- Pasi

-- 
Ulrich Leodolter <ulrich.leodolter@xxxxxxxx>
OBVSG


-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Bacula-devel mailing list
Bacula-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/bacula-devel

