
Re: [Bacula-devel] sql query with PathId=0 during verify job


On Monday 24 March 2008 23:28:07 Ralf Gross wrote:
> Kern Sibbald wrote:
> > I just took my dog out for his late night run.  The nice thing about that
> > is that it gave me a chance to think about your problem given your new
> > information, and I am now about 95% sure I know what is going wrong.
> >
> > The FFFFFFFA you are seeing is a negative integer, as I previously
> > mentioned. In fact, it is a -6, which is exactly the code that Bacula
> > uses to signal a heartbeat.
> >
> > So, I imagine that my hypothesis #3 is kicking in (a Bacula bug). You
> > most likely have heartbeat turned on between the DIR and the FD, and you
> > probably have it set to a low interval.  Unfortunately, the heartbeat
> > during a Verify is very likely to create exactly this problem.
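
As a quick check of the arithmetic in the paragraph quoted above, here is a
minimal Python sketch that reads the hex value FFFFFFFA as a signed 32-bit
integer. The helper name as_signed_32 and the constant HEARTBEAT_CODE are
hypothetical and chosen for illustration only; the big-endian byte order is an
assumption about how the hex dump was printed, and none of this is taken from
the Bacula source.

```python
import struct


def as_signed_32(raw_hex: str) -> int:
    """Interpret an 8-digit hex string as a big-endian signed 32-bit integer."""
    return struct.unpack(">i", bytes.fromhex(raw_hex))[0]


# Assumption: -6 is the heartbeat signal code, as stated in the message above.
HEARTBEAT_CODE = -6

value = as_signed_32("FFFFFFFA")
print(value)                    # two's complement: 0xFFFFFFFA - 2**32
print(value == HEARTBEAT_CODE)  # compares against the code named in the thread
```

Read this way, a length field of FFFFFFFA is not a byte count at all but a
negative signal value, which is consistent with the hypothesis that a
heartbeat is being misread during the Verify.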
>
> fd: Heartbeat Interval = 300
> dir: Heartbeat Interval = 5min
> sd: Heartbeat Interval = 5min
>
> The funny thing is that the problem does not always happen in the
> same time frame, or even near the heartbeat interval of 5min.

Depending on what is going on, the first heartbeat can occur anytime during the 
first 5 minute interval, so that is not too surprising.

>
> > The workaround is either to turn off heartbeat in your FD for your Verify
> > jobs (not possible on a job by job basis) or set it longer than the time
> > it takes to run the verify.

By the way, when using your localhost loopback device, you should never need 
to enable heartbeat.  It is needed only when you have a router between two 
machines and that router does not correctly implement the Internet keepalive 
standard.

>
> There are some very long running verify jobs (TBs of data), setting
> the heartbeat to an interval of xx hours wouldn't make sense. But I'll
> try to disable the fd's heartbeat completely. The fd I use for verify
> jobs is running on the same system the dir is running. So it shouldn't
> be a problem.

Yes, I agree, there should be no problems during backup.

>
> > The longterm solution is that Bacula should not use the heartbeat code
> > during a Verify.
>
> I'm really glad that you found the reason, as you can see in bug
> report #1061 it was slowly driving me crazy ;)

Sorry it was driving you crazy, but often some of the most difficult problems 
are uncovered and resolved when people are upset with them and determined to 
find a resolution, which was your case here :-)

Regards,

Kern

>
> Ralf
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/bacula-devel



