
Re: [Bacula-devel] sql query with PathId=0 during verify job


On Tuesday 25 March 2008 09:09:19 Ralf Gross wrote:
> Kern Sibbald wrote:
> > On Monday 24 March 2008 23:28:07 Ralf Gross wrote:
> > > > The workaround is either to turn off heartbeat in your FD for your
> > > > Verify jobs (not possible on a job by job basis) or set it longer
> > > > than the time it takes to run the verify.
> >
> > By the way, when using your localhost loopback device, you should never
> > need to enable heartbeat.  It is needed only when you have a router
> > between two machines and that router does not correctly implement the
> > Internet keepalive standard.
>
> The heartbeat option is part of my template file for all fd configs.
> But it's obviously not needed here.
>
> [...]
>
> > > > The longterm solution is that Bacula should not use the heartbeat
> > > > code during a Verify.
> > >
> > > I'm really glad that you found the reason, as you can see in bug
> > > report #1061 it was slowly driving me crazy ;)
> >
> > Sorry it was driving you crazy, but often some of the most difficult
> > problems are uncovered and resolved when people are upset with it and
> > determined to find a resolution, which was your case here :-)
>
> The symptom with the bsock error was misleading. Looking at the Bacula
> logs, a message about a missing or different file wasn't always
> present when a bsock error occurred.
>
> e.g.:
>
> 15-Mär 12:32 VUMEM004-sd JobId 1580: Forward spacing Volume "vu0em003-inc-0120" to file:block 0:226.
> 15-Mär 12:40 VUMEM004-dir JobId 1580: Fatal error: bsock.c:415 Packet size too big from "Client: VUMEM004-fd:10.60.1.231:9102. Terminating connection.
>
> 22-Mär 10:31 VUMEM004-sd JobId 1663: Forward spacing Volume "itd-diff-0133" to file:block 0:227.
> 22-Mär 10:35 VUMEM004-dir JobId 1663: Fatal error: bsock.c:415 Packet size too big from "Client: VUMEM004-fd:10.60.1.231:9102. Terminating connection.
>
> in contrast to this message:
>
> 22-Mär 21:11 VUMEM004-sd JobId 1674: Forward spacing Volume "itd-diff-0133" to file:block 0:227.
> 22-Mär 21:14 VUMEM004-dir JobId 1674: New file: /var/www_etas/howto/fit????/pics/.xvpics/email3.jpg
> 22-Mär 21:14 VUMEM004-dir JobId 1674: Fatal error: bsock.c:415 Packet size too big from "Client: VUMEM004-fd:10.60.1.231:9102. Terminating connection.

Yes, it can be very confusing if you are not as familiar with the code as I
am. Once the packets start getting corrupted, you can get all kinds of
problems: simple job failures as in your Verify, "Packet size too big"
errors, and unfortunately, in a few cases, crashes. Bacula tries to protect
itself from bad data, but it doesn't always work out. In your case, I still
don't understand why the mutex error occurred, but I am 95% sure that its
root cause is the bad data arriving from the FD.
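
To illustrate why corrupted data can surface as a "Packet size too big"
error far from the real cause, here is a minimal Python sketch. It is not
Bacula's code: it assumes a simple framing in which every packet is preceded
by a 4-byte big-endian length, and the names frame, read_packets,
PacketError and the MAX_PACKET_SIZE limit are hypothetical. A single stray
byte shifts the framing, so the reader interprets payload bytes as the next
length.

```python
import struct

# Assumed sanity limit for this sketch only.
MAX_PACKET_SIZE = 1_000_000


class PacketError(Exception):
    """Raised when the stream cannot be parsed as valid packets."""


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length as a 4-byte big-endian signed int."""
    return struct.pack("!i", len(payload)) + payload


def read_packets(stream: bytes) -> list:
    """Split a byte stream into payloads, rejecting implausible lengths."""
    packets = []
    offset = 0
    while offset < len(stream):
        if len(stream) - offset < 4:
            raise PacketError("truncated length header")
        (size,) = struct.unpack_from("!i", stream, offset)
        offset += 4
        if size < 0 or size > MAX_PACKET_SIZE:
            # The reader protects itself: an absurd length means the
            # stream is no longer aligned on packet boundaries.
            raise PacketError(f"packet size too big: {size}")
        if len(stream) - offset < size:
            raise PacketError("truncated payload")
        packets.append(stream[offset:offset + size])
        offset += size
    return packets


if __name__ == "__main__":
    good = frame(b"hello") + frame(b"world")
    print(read_packets(good))

    # Inject one unexpected byte after the first length header.
    bad = good[:4] + b"X" + good[4:]
    try:
        read_packets(bad)
    except PacketError as err:
        print("error:", err)
```

In this sketch the first packet is still accepted, with a wrong payload, and
the error is only reported on the following header. That matches the general
pattern described above: the failure shows up some distance after the point
where the bad data entered the stream.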

I've got a fix that I am testing for the problem, but it is definitely too 
large to go into 2.2.9, so I will probably release a patch for it next week.

Regards,

Kern
