
Re: [Bacula-devel] Bacula 2.2.8 segfaulting with multiple autochangers


Hello,

I would suggest that you start by reading www.bacula.org -> Support and then 
asking for help on the support list.  Offhand, I would say that your build 
was perhaps broken, but most likely you just have your bacula-sd.conf file 
incorrectly configured, and we don't deal with those issues on this list. 
Since you are also talking about modifying the port addresses, you are 
treading in very dangerous waters.

Then if the above doesn't help, there is plenty of info in the Kaboom chapter 
of the manual.
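
If you do re-run the failing job, it helps to keep the full debug output 
rather than letting it scroll off the terminal.  A minimal sketch, assuming a 
POSIX shell; run_logged is a hypothetical helper name and the log path is only 
an example, neither comes from Bacula itself:

```shell
#!/bin/sh
# Sketch: run a command in the foreground and keep a copy of everything it
# prints (stdout and stderr) in a log file, so nothing is lost to scrollback.
# run_logged is a hypothetical helper name, not part of Bacula.
run_logged() {
    logfile=$1
    shift
    # 2>&1 merges stderr into stdout; tee shows it live and writes the file.
    "$@" 2>&1 | tee "$logfile"
}

# Intended use with the debug run quoted below (the log path is an assumption):
#   run_logged /tmp/bacula-sd2-debug.log \
#       bacula-sd -f -d99 -c /etc/bacula/bacula-sd2.conf -u bacula -g bacula
```

Note that the exit status of the pipeline is that of tee, not of the wrapped 
command, so check the log itself for the ABORTING/Kaboom lines.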

On Wednesday 20 February 2008 19.29:50 Clint Byrum wrote:
> Hi everyone. This is my first post to this list.
>
> I think I found a bug in Bacula 2.2.8 after I spent a couple of days
> struggling with two autochangers before regressing to 2.2.7, which
> solved my problems.
>
> Recently I rebuilt my director and main storage daemon box after a RAID
> array died on the old one. I set up a new box (HP DL380 G4) with CentOS
> 5, x86_64, and 8G RAM. I also added a second Quantum PX502 autochanger.
>
> I downloaded the 2.2.8 source RPM and built it using this commandline:
>
> rpmbuild --rebuild --define nobuild_gconsole=1 --define build_mysql=1
> --define build_centos5=1
>
> The RPM built fine, and everything seemed to be working fine. I used
> the binaries to restore the old catalog from tape (some of you may have
> seen me and my fun with MySQL index building in the IRC channel as
> MapspaM).
>
> Anyway, after getting things running again as they were with the old
> configurations, I wanted to start using my new, second autochanger,
> which is on /dev/sg0 and /dev/nst0 (the old one ended up on /dev/sg2
> and /dev/nst1). Things got ugly at this point.
>
> (In case you're wondering, my config files are at the bottom)
>
> Any time I would try to run a backup job on the new Storage resource,
> bacula-sd would disappear.
>
> I tried a few more things including running separate SDs and such.
> Something about this new storage resource (In the attached configs as
> LTO3-2) was broken.
>
> I finally ran a separate bacula-sd with
>
> bacula-sd -f -d99 -c /etc/bacula/bacula-sd2.conf -u bacula -g bacula
>
> And ran my backup job against it, which caused this output:
>
> 19-Feb 16:33 snap2-sd: ABORTING due to ERROR in dev.c:724
> dev.c:723 Bad call to rewind. Device "PX502-2-Drive-1" (/dev/nst0) not open
> Kaboom! bacula-sd, snap2-sd got signal 11 - Segmentation violation.
> Attempting traceback.
>
> The traceback failed due to "Permission Denied".  I'm sorry but I
> accidentally let the other debug messages scroll off my terminal, though
> they looked normal. I can re-run the test after my backups for this week
> finish if the debug output before/after that would be helpful. The
> problems are very reproducible, so it won't be hard.
>
> Here are the configs. A few notes:
>
> * I removed all the clients and jobs that didn't cause issues from the
> configs.
> * The segfault happened whether I was running with a separate bacula-sd
> or not, so bacula-sd.conf, and bacula-sd2.conf both had the problem. It
> would seem though, that bacula-sd2.conf has something in it that
> isolates this issue.
> * When I was running bacula-sd2.conf, LTO3-2 had the port number changed
> to 9104 to connect to the second sd, and the PX502-2 and its associated
> resources were commented out of bacula-sd.conf.
> * Of course, this could be the result of something in my catalog too,
> I'm aware of that.
>
> I'm hoping I can work with you all to figure out why 2.2.8 segfaults,
> and 2.2.7 does not. I'm at your disposal and quite experienced with C/C++
> development, though not bacula's internals. Please just tell me what
> other information and procedures would be helpful to you all to
> debug/reproduce this.
>
> ----------------- bacula-dir.conf -----------------
>
> Director {                            # define myself
>   Name = snap-dir
>   DIRport = 9101                # where we listen for UA connections
>   QueryFile = "/etc/bacula/query.sql"
>   WorkingDirectory = "/var/lib/bacula"
>   PidDirectory = "/var/run"
>   #Password = "xxxxxxxxxx"         # Console password
>   Password = "xxxxxxxxxx"
>   Messages = Daemon
>   Maximum Concurrent Jobs = 3
> }
>
> Console {
> 	Name = lavache-mon
> 	Password = "xxxxxxxxxx"
> 	CatalogACL = MyCatalog
> 	CommandACL = status, .status
> }
>
>
> JobDefs {
>   Name = "DefaultJob"
>   Type = Backup
>   Level = Incremental
>   FileSet = "Full Set"
>   Schedule = "WeeklyCycle"
>   Storage = LTO3
>   Storage = LTO3-2
>   Messages = Standard
>   Full Backup Pool = Monthly
>   Differential Backup Pool = Weekly
>   Incremental Backup Pool = Daily
>   Pool = Default
>   Priority = 10
>   SpoolData = yes
>   Maximum Concurrent Jobs = 2
> }
>
>   #Snap's local backups -- should go after production stuff
> JobDefs {
>   Name = "DefaultJobNoSpool"
>   Type = Backup
>   Level = Incremental
>   FileSet = "Full Set"
>   Schedule = "WeeklyCycle"
>   Storage = LTO3
>   Messages = Standard
>   Full Backup Pool = Monthly
>   Differential Backup Pool = Weekly
>   Incremental Backup Pool = Daily
>   Pool = Default
>   Priority = 11
>   SpoolData = no
>   Maximum Concurrent Jobs = 1
> }
> JobDefs {
>   Name = "DefaultJobHulkExt"
>   Type = Backup
>   Level = Full
>   FileSet = "External Set"
>   Storage = Hulk-LTO-1
>   Messages = Standard
>   Pool = DefaultHulk
>   Priority = 10
>   SpoolData = no
>   Maximum Concurrent Jobs = 1
> }
>
> JobDefs {
>   Name = "DefaultJobHulk"
>   Type = Backup
>   Level = Incremental
>   FileSet = "Full Set"
>   Schedule = "WeeklyCycle"
>   Storage = Hulk-LTO-1
>   Messages = Standard
>   Pool = DefaultHulk
>   Priority = 10
>   SpoolData = no
>   Maximum Concurrent Jobs = 1
> }
>
> #
> # Define the main nightly save backup job
> #
> Job {
>   Name = "snap"
>   Client = snap-fd
>   JobDefs = "DefaultJobNoSpool"
>   Write Bootstrap = "/mnt/remote/pull/snap/bootstraps/snap.bsr"
> }
>
> Job {
> 	Name = "clapSnapshot"
> 	Client = clap-fd
> 	JobDefs = "DefaultJob"
> 	FileSet = "clapSnapshot"
>   Maximum Concurrent Jobs = 2
> 	Write Bootstrap = "/mnt/remote/pull/snap/bootstraps/clap.bsr"
> 	Storage = LTO3
>   Schedule = "DailyFullDaytime"
> }
>
>
> # Backup the catalog database (after the nightly save)
> Job {
>   Name = "BackupCatalog"
>   JobDefs = "DefaultJob"
>   Level = Full
>   FileSet="Catalog"
>   Schedule = "DailyFull"
>   # This creates an ASCII copy of the catalog
>   RunBeforeJob = "/etc/bacula/make_tmp_catalog_backup bacula bacula xxxxxxxxxxxxx"
>   # This deletes the copy of the catalog
>   #RunAfterJob  = "/etc/bacula/delete_catalog_backup"
>   Write Bootstrap = "/home/tmp/bacula.bsr"
>   Client = snap-fd
>   Priority = 11                   # run after main backup
> }
>
> #
> # Standard Restore template, to be changed by Console program
> #  Only one such job is needed for all Jobs/Clients/Storage ...
> #
> Job {
>   Name = "RestoreFiles"
>   Storage = LTO3
>   Fileset = "Full Set"
>   Pool = Default
>   Client = snap-fd
>   Type = Restore
>   Messages = Standard
>   Where = /back/bacula/restores
> }
>
>
> # List of files to be backed up
> FileSet {
>   Name = "Full Set"
>   Include {
>     Options {
>       signature = MD5
>     }
>     File = /
>     File = /home
>     File = /home/vpopmail
>     File = /back
>     File = /usr
>     File = /var
>   }
>   Exclude {
>     File = /proc
>     File = *swap.file*
>     File = /tmp
>     File = /sys
>     File = /home/tmp
>     File = /var/tmp
>     File = /var/log/lastlog
>     File = /back/bacula
>     File = /.journal
>     File = /.fsck
>   }
> }
>
> FileSet {
>   Name = "External Set"
>   Include {
>     Options {
>       signature = MD5
>     }
>   File = /external
>   }
> }
>
> FileSet {
>   Name = "NFS Set"
>   Include {
>     Options {
>       signature = MD5
>     }
>     File = /
>     File = /home
>     File = /home/ftp
>     File = /home/files
>     File = /back
>     File = /usr
>     File = /var
>   }
>   Exclude {
>     File = /proc
>     File = /tmp
>     File = /sys
>     File = /home/tmp
>     File = /var/tmp
>     File = /var/log/lastlog
>     File = /.journal
>     File = /.fsck
>   }
> }
>
> FileSet {
>   Name = "Xen Host Set"
>   Include {
>     Options {
>       signature = MD5
>     }
>     File = /
>     File = /home
>     File = /back
>     File = /usr
>     File = /var
>   }
>   Exclude {
>     File = /proc
>     File = /tmp
>     File = /sys
>     File = /home/tmp
>     File = /home/xen/boxes
>     File = /var/tmp
>     File = /var/log/lastlog
>     File = /.journal
>     File = /.fsck
>   }
> }
>
> FileSet {
> 	Name = "clapSnapshot"
> 	Include {
> 		File = /mnt/snapshots/latest
> 	}
> }
>
> FileSet {
> 	Name = "homeLogs"
> 	Include {
> 		File = /home/logs
> 	}
> }
> FileSet {
> 	Name = "quickbooks"
> 	Include {
> 		File = "C:/Quickbooks Data"
> 	}
> }
>
>
>
> #
> # When to do the backups, full backup on first sunday of the month,
> #  differential (i.e. incremental since full) every other sunday,
> #  and incremental backups other days
> Schedule {
>   Name = "WeeklyCycle"
>   Run = Full 1st friday at 21:05
>   Run = Differential 2nd-5th friday at 21:05
>   Run = Incremental sat-thu at 21:05
> }
> Schedule {
>   Name = "AfterBusinessHours"
>   Run = Full 1st sat at 17:00
>   Run = Differential 2nd-5th sat at 17:00
>   Run = Incremental sun-fri at 17:00
> }
>
>
> # This schedule does the catalog. It starts after the WeeklyCycle
> Schedule {
>   Name = "DailyFull"
>   Run = Full sun-sat at 23:10
> }
> Schedule {
>   Name = "DailyFullDaytime"
>   Run = Full sat at 14:30
>   Run = Full tue at 14:30
>   Run = Full thu at 14:30
> }
>
>
> # This is the backup of the catalog
> FileSet {
>   Name = "Catalog"
>   Include {
>     Options {
>       signature = MD5
>     }
>     File = /mnt/remote/pull/snap/catalog
>   }
> }
>
> # Client (File Services) to backup
> Client {
>   Name = snap-fd
>   Address = snap
>   FDPort = 9102
>   Catalog = MyCatalog
>   Password = "xxxxxxxxxx"
>   File Retention = 2 months            # 30 days
>   Job Retention = 60 months            # six months
>   AutoPrune = yes                     # Prune expired Jobs/Files
> }
>
> Client {
>   Name = clap-fd
>   Address = clap
>   FDPort = 9102
>   Catalog = MyCatalog
>   Password = "xxxxxxxxxx"
>   File Retention = 30 days
>   Job Retention = 18 months
>   AutoPrune = yes
> }
>
> Storage {
>   Name = LTO3
>   Address = snap
>   SDPort = 9103
>   Password = "xxxxxxxxxx"
>   Device = PX502
>   Media Type = Ultrium-LTO-3
>   Autochanger = yes
>   Maximum Concurrent Jobs = 2
> }
>
> Storage {
>   Name = LTO3-2
>   Address = snap
>   SDPort = 9103
>   Password = "xxxxxxxxxx"
>   Device = PX502-2
>   Media Type = Ultrium-LTO-3
>   Autochanger = yes
>   Maximum Concurrent Jobs = 2
> }
>
> Storage {
>   Name = Hulk-LTO-1
>   Address = hulk
>   SDPort = 9103
>   Password = "xxxxxxxxxx"
>   Device = LTO-Drive-1
>   Media Type = Ultrium-LTO-3
> }
>
> # Generic catalog service
> Catalog {
>   Name = MyCatalog
>   dbname = bacula; user = bacula; Password = "xxxxxxxxxx"
> }
>
> # Reasonable message delivery -- send most everything to email address
> #  and to the console
> Messages {
>   Name = Standard
> #
> # NOTE! If you send to two email or more email addresses, you will need
> #  to replace the %r in the from field (-f part) with a single valid
> #  email address in both the mailcommand and the operatorcommand.
> #
>   mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: %t %e of %c %l\" %r"
>   operatorcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: Intervention needed for %j\" %r"
>   mail = bacula@xxxxxxxxxx = all, !skipped
>   operator = bacula@xxxxxxxxxx = mount
>   console = all, !skipped, !saved
> #
> # WARNING! the following will create a file that you must cycle from
> #          time to time as it will grow indefinitely. However, it will
> #          also keep all your messages if they scroll off the console.
> #
>   append = "/var/lib/bacula/log" = all, !skipped
> }
>
> #
> # Message delivery for daemon messages (no job).
> Messages {
>   Name = Daemon
>   mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula daemon message\" %r"
>   mail = bacula@xxxxxxxxxx = all, !skipped
>   console = all, !skipped, !saved
>   append = "/var/lib/bacula/log" = all, !skipped
> }
>
> # Default pool definition
> Pool {
>   Name = Default
>   Pool Type = Backup
>   Recycle = no                       # Bacula can automatically recycle Volumes
>   AutoPrune = no                     # Prune expired volumes
>   Volume Retention = 365 days         # one year
> #  Accept Any Volume = yes             # write on any volume in the pool
> }
> Pool {
>   Name = DefaultHulk
>   Pool Type = Backup
>   Recycle = yes                       # Bacula can automatically recycle Volumes
>   AutoPrune = yes                     # Prune expired volumes
>   Volume Retention = 365 days         # one year
> #  Accept Any Volume = yes             # write on any volume in the pool
> }
>
> # Disk backups - mostly incrementals
> Pool {
> 	Name = Daily
> 	Pool Type = Backup
> 	Recycle = yes
> 	AutoPrune = yes
> 	Volume Retention = 8 days
> #	Accept Any Volume = yes
> }
> Pool {
> 	Name = Weekly
> 	Pool Type = Backup
> 	Recycle = yes
> 	AutoPrune = yes
> 	Volume Retention = 6 weeks
> #	Accept Any Volume = yes
> }
> Pool {
> 	Name = Monthly
> 	Pool Type = Backup
> 	Recycle = no					# We don't want these recycled, they will be retained for a long time
> 	AutoPrune = no
> 	Volume Retention = 5 years # Some are taken offsite, others will be recycled
> #	Accept Any Volume = yes
>         Cleaning Prefix = "CLN";
> }
>
> # Tapes for dumps of backup server
> Pool {
> 	Name = BackMonthlyArchives
> 	Pool Type = Backup
> 	Recycle = no					# We don't want these recycled, they will be retained for a long time
> 	AutoPrune = no
> 	Volume Retention = 5 years # Some are taken offsite, others will be recycled
> #	Accept Any Volume = yes
> }
>
>
>
> #
> # Restricted console used by tray-monitor to get the status of the director
> #
> Console {
>   Name = snap-mon
>   Password = "xxxxxxxxxx"
>   CommandACL = status, .status
> }
>
> -------------------- bacula-sd.conf ------------------------
>
> Storage {                             # definition of myself
>   Name = snap-sd
>   SDPort = 9103                  # Director's port
>   WorkingDirectory = "/var/lib/bacula"
>   Pid Directory = "/var/run"
>   Maximum Concurrent Jobs = 20
> }
>
> #
> # List Directors who are permitted to contact Storage daemon
> #
> Director {
>   Name = snap-dir
>   Password = "xxxxxxxxxx"
> }
>
> #
> # Restricted Director, used by tray-monitor to get the
> #   status of the storage daemon
> #
> Director {
>   Name = lavache-mon
>   Password = "xxxxxxxxxx"
>   Monitor = yes
> }
>
> #
> # Devices supported by this Storage daemon
> # To connect, the Director's bacula-dir.conf must have the
> #  same Name and MediaType.
> #
>
> #
> # An autochanger device with two drives
> #
> Autochanger {
>   Name = PX502
>   Device = PX502-Drive-1
>   #Device = PX502-Drive-2 # uncomment when second drive is installed
>   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
>   Changer Device = /dev/sg2
> }
>
> Autochanger {
>   Name = PX502-2
>   Device = PX502-2-Drive-1
>   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
>   Changer Device = /dev/sg0
> }
>
> Device {
>   Name = PX502-Drive-1;
>   Drive Index = 0;
>   Media Type = Ultrium-LTO-3;
>   Archive Device = /dev/nst1;
>   AutomaticMount = yes;
>   Always Open = yes;
>   RemovableMedia = yes;
>   Random Access = no;
>   AutoChanger = yes;
>   Changer Device = /dev/sg2;
>   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'";
>   Maximum Job Spool Size = 100G;
>   Maximum Spool Size = 120G;
>   Spool Directory = /var/spool/bacula
> }
>
> Device {
>   Name = PX502-2-Drive-1;
>   Drive Index = 0;
>   Media Type = Ultrium-LTO-3;
>   Archive Device = /dev/nst0;
>   AutomaticMount = yes;
>   Always Open = yes;
>   RemovableMedia = yes;
>   Random Access = no;
>   AutoChanger = yes;
>   Changer Device = /dev/sg0;
>   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'";
>   Maximum Job Spool Size = 100G;
>   Maximum Spool Size = 120G;
>   Spool Directory = /var/spool/bacula
> }
>
> #
> # Send all messages to the Director,
> # mount messages also are sent to the email address
> #
> Messages {
>   Name = Standard
>   director = snap-dir = all
> }
>
> ------------------ bacula-sd2.conf -----------------
>
> Storage {                             # definition of myself
>   Name = snap2-sd
>   SDPort = 9104                  # Director's port
>   WorkingDirectory = "/var/lib/bacula"
>   Pid Directory = "/var/run"
>   Maximum Concurrent Jobs = 20
> }
>
> #
> # List Directors who are permitted to contact Storage daemon
> #
> Director {
>   Name = snap-dir
>   Password = "xxxxxxxxxx"
> }
>
> #
> # Restricted Director, used by tray-monitor to get the
> #   status of the storage daemon
> #
> Director {
>   Name = lavache-mon
>   Password = "xxxxxxxxxx"
>   Monitor = yes
> }
>
> Autochanger {
>   Name = PX502-2
>   Device = PX502-2-Drive-1
>   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
>   Changer Device = /dev/sg0
> }
>
> Device {
>   Name = PX502-2-Drive-1;
>   Drive Index = 0;
>   Media Type = Ultrium-LTO-3;
>   Archive Device = /dev/nst0;
>   AutomaticMount = yes;
>   Always Open = yes;
>   RemovableMedia = yes;
>   Random Access = no;
>   AutoChanger = yes;
>   Changer Device = /dev/sg0;
>   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'";
>   Maximum Job Spool Size = 100G;
>   Maximum Spool Size = 120G;
>   Spool Directory = /var/spool/bacula
> }
>
>
> #
> # Send all messages to the Director,
> # mount messages also are sent to the email address
> #
> Messages {
>   Name = Standard
>   director = snap-dir = all
> }


