
Re: [Bacula-devel] Bacula 2.2.8 segfaulting with multiple autochangers


Hi Kern.

Maybe I was misunderstood. I'm not asking for support with running
bacula. I have a working, healthy system. Rather, I was asking if you
guys were interested in debugging a seg faulting process.

No, I didn't read the kaboom section, so I didn't know that bacula will
force itself to segfault. I generally jump to the conclusion that a
segfault in one version that doesn't happen in the previous, working
version, is a coding error, and want to help in debugging/fixing that. A
crash caused solely by an upgrade performed according to the
documentation is probably not really a "support issue", is it?

I guess if you guys don't want help finding and fixing bugs, I can
respect that.

It will probably take me a couple of weeks to fix or at least fully
identify this problem by myself. Would it be out of line to post patches
here, should I find that there is, in fact, a problem with the code?

On Wed, 2008-02-20 at 23:20 +0100, Kern Sibbald wrote:
> Hello,
> 
> I would suggest that you start by reading www.bacula.org -> Support and then
> asking for help on the support list.  Offhand, I would say that your build
> was perhaps broken, but most likely you just have your bacula-sd.conf file
> incorrectly configured, and we don't deal with those issues on this list.
> You are also talking about modifying the port addresses, so you are
> treading in very dangerous waters.
> 
> Then if the above doesn't help, there is plenty of info in the Kaboom chapter 
> of the manual.
> 
> On Wednesday 20 February 2008 19.29:50 Clint Byrum wrote:
> > Hi everyone. This is my first post to this list.
> >
> > I think I found a bug in Bacula 2.2.8 after I spent a couple of days
> > struggling with two autochangers before regressing to 2.2.7, which
> > solved my problems.
> >
> > Recently I rebuilt my director and main storage daemon box after a RAID
> > array died on the old one. I set up a new box (HP DL380 G4) with CentOS
> > 5, x86_64, and 8G RAM. I also added a second Quantum PX502 autochanger.
> >
> > I downloaded the 2.2.8 source RPM and built it using this command line:
> >
> > rpmbuild --rebuild --define nobuild_gconsole=1 --define build_mysql=1
> > --define build_centos5=1
> >
> > The RPM built fine, and everything seemed to be working fine. I used
> > the binaries to restore the old catalog from tape (some of you may have
> > seen me and my fun with MySQL index building in the IRC channel as
> > MapspaM).
> >
> > Anyway, after getting things running again as they were with the old
> > configurations, I wanted to start using my new, second autochanger,
> > which is on /dev/sg0 and /dev/nst0 (the old one ended up on /dev/sg2
> > and /dev/nst1). Things got ugly at this point.
> >
> > (In case you're wondering, my config files are at the bottom)
> >
> > Any time I would try to run a backup job on the new Storage resource,
> > bacula-sd would disappear.
> >
> > I tried a few more things, including running separate SDs and such.
> > Something about this new storage resource (in the attached configs as
> > LTO3-2) was broken.
> >
> > I finally ran a separate bacula-sd with
> >
> > bacula-sd -f -d99 -c /etc/bacula/bacula-sd2.conf -u bacula -g bacula
> >
> > And ran my backup job against it, which caused this output:
> >
> > 19-Feb 16:33 snap2-sd: ABORTING due to ERROR in dev.c:724
> > dev.c:723 Bad call to rewind. Device "PX502-2-Drive-1" (/dev/nst0) not open
> > Kaboom! bacula-sd, snap2-sd got signal 11 - Segmentation violation.
> > Attempting traceback.
> >
> > The traceback failed due to "Permission Denied".  I'm sorry but I
> > accidentally let the other debug messages scroll off my terminal, though
> > they looked normal. I can re-run the test after my backups for this week
> > finish if the debug output before/after that would be helpful. The
> > problems are very reproducible, so it won't be hard.
> >
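For the re-run, the debug output could be kept in a file so it cannot scroll off the terminal again. What follows is only a rough sketch under stated assumptions: `run_and_log` and the log path are made-up names for illustration and not anything shipped with Bacula; the bacula-sd arguments in the example call are the ones quoted above.

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, echo its combined stdout/stderr, and append it to log_path.

    Returns the exit code of the process.
    """
    with open(log_path, "a", encoding="utf-8", errors="replace") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # debug messages may arrive on either stream
            text=True,
            errors="replace",
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
            log.flush()  # keep the file current even if the daemon dies abruptly
        return proc.wait()


# Example call (hypothetical log path, same daemon arguments as in the report):
# run_and_log(
#     ["bacula-sd", "-f", "-d99", "-c", "/etc/bacula/bacula-sd2.conf",
#      "-u", "bacula", "-g", "bacula"],
#     "/tmp/bacula-sd2-debug.log",
# )
```

The log is flushed after every line, so whatever was printed before the "Kaboom!" message should still be in the file afterwards.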
> > Here are the configs. A few notes:
> >
> > * I removed all the clients and jobs that didn't cause issues from the
> > configs.
> > * The segfault happened whether I was running with a separate bacula-sd
> > or not, so bacula-sd.conf and bacula-sd2.conf both had the problem. It
> > would seem, though, that bacula-sd2.conf has something in it that
> > isolates this issue.
> > * When I was running bacula-sd2.conf, LTO3-2 had the port number changed
> > to 9104 to connect to the second sd, and the PX502-2 and its associated
> > resources were commented out of bacula-sd.conf.
> > * Of course, this could be the result of something in my catalog too,
> > I'm aware of that.
> >
> > I'm hoping I can work with you all to figure out why 2.2.8 segfaults
> > and 2.2.7 does not. I'm at your disposal and quite experienced with
> > C/C++ development, though not Bacula's internals. Please just tell me
> > what other information and procedures would be helpful to you all to
> > debug/reproduce this.
> >
> > ----------------- bacula-dir.conf -----------------
> >
> > Director {                            # define myself
> >   Name = snap-dir
> >   DIRport = 9101                # where we listen for UA connections
> >   QueryFile = "/etc/bacula/query.sql"
> >   WorkingDirectory = "/var/lib/bacula"
> >   PidDirectory = "/var/run"
> >   #Password = "xxxxxxxxxx"         # Console password
> >   Password = "xxxxxxxxxx"
> >   Messages = Daemon
> >   Maximum Concurrent Jobs = 3
> > }
> >
> > Console {
> > 	Name = lavache-mon
> > 	Password = "xxxxxxxxxx"
> > 	CatalogACL = MyCatalog
> > 	CommandACL = status, .status
> > }
> >
> >
> > JobDefs {
> >   Name = "DefaultJob"
> >   Type = Backup
> >   Level = Incremental
> >   FileSet = "Full Set"
> >   Schedule = "WeeklyCycle"
> >   Storage = LTO3
> >   Storage = LTO3-2
> >   Messages = Standard
> >   Full Backup Pool = Monthly
> >   Differential Backup Pool = Weekly
> >   Incremental Backup Pool = Daily
> >   Pool = Default
> >   Priority = 10
> >   SpoolData = yes
> >   Maximum Concurrent Jobs = 2
> > }
> >
> >   #Snap's local backups -- should go after production stuff
> > JobDefs {
> >   Name = "DefaultJobNoSpool"
> >   Type = Backup
> >   Level = Incremental
> >   FileSet = "Full Set"
> >   Schedule = "WeeklyCycle"
> >   Storage = LTO3
> >   Messages = Standard
> >   Full Backup Pool = Monthly
> >   Differential Backup Pool = Weekly
> >   Incremental Backup Pool = Daily
> >   Pool = Default
> >   Priority = 11
> >   SpoolData = no
> >   Maximum Concurrent Jobs = 1
> > }
> > JobDefs {
> >   Name = "DefaultJobHulkExt"
> >   Type = Backup
> >   Level = Full
> >   FileSet = "External Set"
> >   Storage = Hulk-LTO-1
> >   Messages = Standard
> >   Pool = DefaultHulk
> >   Priority = 10
> >   SpoolData = no
> >   Maximum Concurrent Jobs = 1
> > }
> >
> > JobDefs {
> >   Name = "DefaultJobHulk"
> >   Type = Backup
> >   Level = Incremental
> >   FileSet = "Full Set"
> >   Schedule = "WeeklyCycle"
> >   Storage = Hulk-LTO-1
> >   Messages = Standard
> >   Pool = DefaultHulk
> >   Priority = 10
> >   SpoolData = no
> >   Maximum Concurrent Jobs = 1
> > }
> >
> > #
> > # Define the main nightly save backup job
> > #
> > Job {
> >   Name = "snap"
> >   Client = snap-fd
> >   JobDefs = "DefaultJobNoSpool"
> >   Write Bootstrap = "/mnt/remote/pull/snap/bootstraps/snap.bsr"
> > }
> >
> > Job {
> > 	Name = "clapSnapshot"
> > 	Client = clap-fd
> > 	JobDefs = "DefaultJob"
> > 	FileSet = "clapSnapshot"
> >   Maximum Concurrent Jobs = 2
> > 	Write Bootstrap = "/mnt/remote/pull/snap/bootstraps/clap.bsr"
> > 	Storage = LTO3
> >   Schedule = "DailyFullDaytime"
> > }
> >
> >
> > # Backup the catalog database (after the nightly save)
> > Job {
> >   Name = "BackupCatalog"
> >   JobDefs = "DefaultJob"
> >   Level = Full
> >   FileSet="Catalog"
> >   Schedule = "DailyFull"
> >   # This creates an ASCII copy of the catalog
> >   RunBeforeJob = "/etc/bacula/make_tmp_catalog_backup bacula bacula xxxxxxxxxxxxx"
> >   # This deletes the copy of the catalog
> >   #RunAfterJob  = "/etc/bacula/delete_catalog_backup"
> >   Write Bootstrap = "/home/tmp/bacula.bsr"
> >   Client = snap-fd
> >   Priority = 11                   # run after main backup
> > }
> >
> > #
> > # Standard Restore template, to be changed by Console program
> > #  Only one such job is needed for all Jobs/Clients/Storage ...
> > #
> > Job {
> >   Name = "RestoreFiles"
> >   Storage = LTO3
> >   Fileset = "Full Set"
> >   Pool = Default
> >   Client = snap-fd
> >   Type = Restore
> >   Messages = Standard
> >   Where = /back/bacula/restores
> > }
> >
> >
> > # List of files to be backed up
> > FileSet {
> >   Name = "Full Set"
> >   Include {
> >     Options {
> >       signature = MD5
> >     }
> >     File = /
> >     File = /home
> >     File = /home/vpopmail
> >     File = /back
> >     File = /usr
> >     File = /var
> >   }
> >   Exclude {
> >     File = /proc
> >     File = *swap.file*
> >     File = /tmp
> >     File = /sys
> >     File = /home/tmp
> >     File = /var/tmp
> >     File = /var/log/lastlog
> >     File = /back/bacula
> >     File = /.journal
> >     File = /.fsck
> >   }
> > }
> >
> > FileSet {
> >   Name = "External Set"
> >   Include {
> >     Options {
> >       signature = MD5
> >     }
> >   File = /external
> >   }
> > }
> >
> > FileSet {
> >   Name = "NFS Set"
> >   Include {
> >     Options {
> >       signature = MD5
> >     }
> >     File = /
> >     File = /home
> >     File = /home/ftp
> >     File = /home/files
> >     File = /back
> >     File = /usr
> >     File = /var
> >   }
> >   Exclude {
> >     File = /proc
> >     File = /tmp
> >     File = /sys
> >     File = /home/tmp
> >     File = /var/tmp
> >     File = /var/log/lastlog
> >     File = /.journal
> >     File = /.fsck
> >   }
> > }
> >
> > FileSet {
> >   Name = "Xen Host Set"
> >   Include {
> >     Options {
> >       signature = MD5
> >     }
> >     File = /
> >     File = /home
> >     File = /back
> >     File = /usr
> >     File = /var
> >   }
> >   Exclude {
> >     File = /proc
> >     File = /tmp
> >     File = /sys
> >     File = /home/tmp
> >     File = /home/xen/boxes
> >     File = /var/tmp
> >     File = /var/log/lastlog
> >     File = /.journal
> >     File = /.fsck
> >   }
> > }
> >
> > FileSet {
> > 	Name = "clapSnapshot"
> > 	Include {
> > 		File = /mnt/snapshots/latest
> > 	}
> > }
> >
> > FileSet {
> > 	Name = "homeLogs"
> > 	Include {
> > 		File = /home/logs
> > 	}
> > }
> > FileSet {
> > 	Name = "quickbooks"
> > 	Include {
> > 		File = "C:/Quickbooks Data"
> > 	}
> > }
> >
> >
> >
> > #
> > # When to do the backups, full backup on first friday of the month,
> > #  differential (i.e. incremental since full) every other friday,
> > #  and incremental backups other days
> > Schedule {
> >   Name = "WeeklyCycle"
> >   Run = Full 1st friday at 21:05
> >   Run = Differential 2nd-5th friday at 21:05
> >   Run = Incremental sat-thu at 21:05
> > }
> > Schedule {
> >   Name = "AfterBusinessHours"
> >   Run = Full 1st sat at 17:00
> >   Run = Differential 2nd-5th sat at 17:00
> >   Run = Incremental sun-fri at 17:00
> > }
> >
> >
> > # This schedule does the catalog. It starts after the WeeklyCycle
> > Schedule {
> >   Name = "DailyFull"
> >   Run = Full sun-sat at 23:10
> > }
> > Schedule {
> >   Name = "DailyFullDaytime"
> >   Run = Full sat at 14:30
> >   Run = Full tue at 14:30
> >   Run = Full thu at 14:30
> > }
> >
> >
> > # This is the backup of the catalog
> > FileSet {
> >   Name = "Catalog"
> >   Include {
> >     Options {
> >       signature = MD5
> >     }
> >     File = /mnt/remote/pull/snap/catalog
> >   }
> > }
> >
> > # Client (File Services) to backup
> > Client {
> >   Name = snap-fd
> >   Address = snap
> >   FDPort = 9102
> >   Catalog = MyCatalog
> >   Password = "xxxxxxxxxx"
> >   File Retention = 2 months            # 30 days
> >   Job Retention = 60 months            # six months
> >   AutoPrune = yes                     # Prune expired Jobs/Files
> > }
> >
> > Client {
> >   Name = clap-fd
> >   Address = clap
> >   FDPort = 9102
> >   Catalog = MyCatalog
> >   Password = "xxxxxxxxxx"
> >   File Retention = 30 days
> >   Job Retention = 18 months
> >   AutoPrune = yes
> > }
> >
> > Storage {
> >   Name = LTO3
> >   Address = snap
> >   SDPort = 9103
> >   Password = "xxxxxxxxxx"
> >   Device = PX502
> >   Media Type = Ultrium-LTO-3
> >   Autochanger = yes
> >   Maximum Concurrent Jobs = 2
> > }
> >
> > Storage {
> >   Name = LTO3-2
> >   Address = snap
> >   SDPort = 9103
> >   Password = "xxxxxxxxxx"
> >   Device = PX502-2
> >   Media Type = Ultrium-LTO-3
> >   Autochanger = yes
> >   Maximum Concurrent Jobs = 2
> > }
> >
> > Storage {
> >   Name = Hulk-LTO-1
> >   Address = hulk
> >   SDPort = 9103
> >   Password = "xxxxxxxxxx"
> >   Device = LTO-Drive-1
> >   Media Type = Ultrium-LTO-3
> > }
> >
> > # Generic catalog service
> > Catalog {
> >   Name = MyCatalog
> >   dbname = bacula; user = bacula; Password = "xxxxxxxxxx"
> > }
> >
> > # Reasonable message delivery -- send most everything to email address
> > #  and to the console
> > Messages {
> >   Name = Standard
> > #
> > # NOTE! If you send to two email or more email addresses, you will need
> > #  to replace the %r in the from field (-f part) with a single valid
> > #  email address in both the mailcommand and the operatorcommand.
> > #
> >   mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: %t %e of %c %l\" %r"
> >   operatorcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: Intervention needed for %j\" %r"
> >   mail = bacula@xxxxxxxxxx = all, !skipped
> >   operator = bacula@xxxxxxxxxx = mount
> >   console = all, !skipped, !saved
> > #
> > # WARNING! the following will create a file that you must cycle from
> > #          time to time as it will grow indefinitely. However, it will
> > #          also keep all your messages if they scroll off the console.
> > #
> >   append = "/var/lib/bacula/log" = all, !skipped
> > }
> >
> > #
> > # Message delivery for daemon messages (no job).
> > Messages {
> >   Name = Daemon
> >   mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula daemon message\" %r"
> >   mail = bacula@xxxxxxxxxx = all, !skipped
> >   console = all, !skipped, !saved
> >   append = "/var/lib/bacula/log" = all, !skipped
> > }
> >
> > # Default pool definition
> > Pool {
> >   Name = Default
> >   Pool Type = Backup
> >   Recycle = no                       # Bacula can automatically recycle Volumes
> >   AutoPrune = no                     # Prune expired volumes
> >   Volume Retention = 365 days         # one year
> > #  Accept Any Volume = yes             # write on any volume in the pool
> > }
> > Pool {
> >   Name = DefaultHulk
> >   Pool Type = Backup
> >   Recycle = yes                       # Bacula can automatically recycle Volumes
> >   AutoPrune = yes                     # Prune expired volumes
> >   Volume Retention = 365 days         # one year
> > #  Accept Any Volume = yes             # write on any volume in the pool
> > }
> >
> > # Disk backups - mostly incrementals
> > Pool {
> > 	Name = Daily
> > 	Pool Type = Backup
> > 	Recycle = yes
> > 	AutoPrune = yes
> > 	Volume Retention = 8 days
> > #	Accept Any Volume = yes
> > }
> > Pool {
> > 	Name = Weekly
> > 	Pool Type = Backup
> > 	Recycle = yes
> > 	AutoPrune = yes
> > 	Volume Retention = 6 weeks
> > #	Accept Any Volume = yes
> > }
> > Pool {
> > 	Name = Monthly
> > 	Pool Type = Backup
> > 	Recycle = no					# We don't want these recycled, they will be retained for a long time
> > 	AutoPrune = no
> > 	Volume Retention = 5 years # Some are taken offsite, others will be recycled
> > #	Accept Any Volume = yes
> >         Cleaning Prefix = "CLN";
> > }
> >
> > # Tapes for dumps of backup server
> > Pool {
> > 	Name = BackMonthlyArchives
> > 	Pool Type = Backup
> > 	Recycle = no					# We don't want these recycled, they will be retained for a long time
> > 	AutoPrune = no
> > 	Volume Retention = 5 years # Some are taken offsite, others will be recycled
> > #	Accept Any Volume = yes
> > }
> >
> >
> >
> > #
> > # Restricted console used by tray-monitor to get the status of the director
> > #
> > Console {
> >   Name = snap-mon
> >   Password = "xxxxxxxxxx"
> >   CommandACL = status, .status
> > }
> >
> > -------------------- bacula-sd.conf ------------------------
> >
> > Storage {                             # definition of myself
> >   Name = snap-sd
> >   SDPort = 9103                  # Director's port
> >   WorkingDirectory = "/var/lib/bacula"
> >   Pid Directory = "/var/run"
> >   Maximum Concurrent Jobs = 20
> > }
> >
> > #
> > # List Directors who are permitted to contact Storage daemon
> > #
> > Director {
> >   Name = snap-dir
> >   Password = "xxxxxxxxxx"
> > }
> >
> > #
> > # Restricted Director, used by tray-monitor to get the
> > #   status of the storage daemon
> > #
> > Director {
> >   Name = lavache-mon
> >   Password = "xxxxxxxxxx"
> >   Monitor = yes
> > }
> >
> > #
> > # Devices supported by this Storage daemon
> > # To connect, the Director's bacula-dir.conf must have the
> > #  same Name and MediaType.
> > #
> >
> > #
> > # An autochanger device with two drives
> > #
> > Autochanger {
> >   Name = PX502
> >   Device = PX502-Drive-1
> >   #Device = PX502-Drive-2 # uncomment when second drive is installed
> >   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
> >   Changer Device = /dev/sg2
> > }
> >
> > Autochanger {
> >   Name = PX502-2
> >   Device = PX502-2-Drive-1
> >   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
> >   Changer Device = /dev/sg0
> > }
> >
> > Device {
> >   Name = PX502-Drive-1;
> >   Drive Index = 0;
> >   Media Type = Ultrium-LTO-3;
> >   Archive Device = /dev/nst1;
> >   AutomaticMount = yes;
> >   Always Open = yes;
> >   RemovableMedia = yes;
> >   Random Access = no;
> >   AutoChanger = yes;
> >   Changer Device = /dev/sg2;
> >   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'";
> >   Maximum Job Spool Size = 100G;
> >   Maximum Spool Size = 120G;
> >   Spool Directory = /var/spool/bacula
> > }
> >
> > Device {
> >   Name = PX502-2-Drive-1;
> >   Drive Index = 0;
> >   Media Type = Ultrium-LTO-3;
> >   Archive Device = /dev/nst0;
> >   AutomaticMount = yes;
> >   Always Open = yes;
> >   RemovableMedia = yes;
> >   Random Access = no;
> >   AutoChanger = yes;
> >   Changer Device = /dev/sg0;
> >   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'";
> >   Maximum Job Spool Size = 100G;
> >   Maximum Spool Size = 120G;
> >   Spool Directory = /var/spool/bacula
> > }
> >
> > #
> > # Send all messages to the Director,
> > # mount messages also are sent to the email address
> > #
> > Messages {
> >   Name = Standard
> >   director = snap-dir = all
> > }
> >
> > ------------------ bacula-sd2.conf -----------------
> >
> > Storage {                             # definition of myself
> >   Name = snap2-sd
> >   SDPort = 9104                  # Director's port
> >   WorkingDirectory = "/var/lib/bacula"
> >   Pid Directory = "/var/run"
> >   Maximum Concurrent Jobs = 20
> > }
> >
> > #
> > # List Directors who are permitted to contact Storage daemon
> > #
> > Director {
> >   Name = snap-dir
> >   Password = "xxxxxxxxxx"
> > }
> >
> > #
> > # Restricted Director, used by tray-monitor to get the
> > #   status of the storage daemon
> > #
> > Director {
> >   Name = lavache-mon
> >   Password = "xxxxxxxxxx"
> >   Monitor = yes
> > }
> >
> > Autochanger {
> >   Name = PX502-2
> >   Device = PX502-2-Drive-1
> >   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
> >   Changer Device = /dev/sg0
> > }
> >
> > Device {
> >   Name = PX502-2-Drive-1;
> >   Drive Index = 0;
> >   Media Type = Ultrium-LTO-3;
> >   Archive Device = /dev/nst0;
> >   AutomaticMount = yes;
> >   Always Open = yes;
> >   RemovableMedia = yes;
> >   Random Access = no;
> >   AutoChanger = yes;
> >   Changer Device = /dev/sg0;
> >   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'";
> >   Maximum Job Spool Size = 100G;
> >   Maximum Spool Size = 120G;
> >   Spool Directory = /var/spool/bacula
> > }
> >
> >
> > #
> > # Send all messages to the Director,
> > # mount messages also are sent to the email address
> > #
> > Messages {
> >   Name = Standard
> >   director = snap-dir = all
> > }
> 
> 
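The cross-references between the Autochanger and Device resources in the storage daemon configs above can be sanity-checked mechanically. Below is a rough sketch in Python, not part of Bacula. `parse_resources` and `check_sd_config` are hypothetical helper names, and the parser only understands the simple one-directive-per-line layout used in these files. The two checks (every Device named by an Autochanger exists; a Device's Changer Device, when set, matches its Autochanger's) are assumptions about what is worth flagging, not a statement of what bacula-sd itself enforces.

```python
import re
from collections import defaultdict


def parse_resources(text):
    """Parse top-level `Type { key = value }` blocks into (type, directives) pairs.

    Deliberately simplistic: one directive per line, `#` starts a comment
    (a quoted value containing `#` would be cut short), nested blocks are skipped.
    """
    resources = []
    current = None
    depth = 0
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith("{"):
            depth += 1
            if depth == 1:
                current = (line[:-1].strip().lower(), defaultdict(list))
                resources.append(current)
            continue
        if line == "}":
            depth -= 1
            continue
        if depth == 1 and "=" in line:
            key, value = line.split("=", 1)
            # Treat "Changer Device" and "ChangerDevice" as the same keyword.
            key = re.sub(r"\s+", "", key).lower()
            current[1][key].append(value.strip().rstrip(";").strip().strip('"'))
    return resources


def check_sd_config(text):
    """Return a list of human-readable problems found in a bacula-sd.conf text."""
    resources = parse_resources(text)
    devices = {}
    for rtype, directives in resources:
        if rtype == "device" and directives["name"]:
            devices[directives["name"][0]] = directives

    problems = []
    for rtype, directives in resources:
        if rtype != "autochanger":
            continue
        changer = directives["name"][0] if directives["name"] else "<unnamed>"
        changer_dev = directives["changerdevice"][:1]
        for entry in directives["device"]:
            for dev_name in (n.strip() for n in entry.split(",")):
                dev = devices.get(dev_name)
                if dev is None:
                    problems.append(f"{changer}: no Device resource named {dev_name}")
                    continue
                dev_changer = dev["changerdevice"][:1]
                if changer_dev and dev_changer and dev_changer != changer_dev:
                    problems.append(
                        f"{changer}: {dev_name} uses Changer Device {dev_changer[0]}, "
                        f"autochanger uses {changer_dev[0]}"
                    )
    return problems
```

Usage would be along the lines of `check_sd_config(open("/etc/bacula/bacula-sd2.conf").read())`. An empty list only means neither check fired; it says nothing about whether the segfault is config-related.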
-- 
Clint Byrum <clint@xxxxxxxxxx>

