
Re: [Bacula-devel] sql query with PathId=0 during verify job


On Monday 24 March 2008 22:24:54 Frank Sweetser wrote:
> Kern Sibbald wrote:
> > On Monday 24 March 2008 20:17:05 Ralf Gross wrote:
> >> Hi,
> >>
> >> while trying to debug the bsock errors that I get during some verify
> >> jobs, I found something interesting in the fd's debug file (-d100).
> >
> > This is a good idea.
> >
> >> A verify job of the same jobid (diff backup from sunday) first
> >> failed yesterday but was successful in a second attempt.
> >>
> >> Part of the successful verify job from sunday:
> >>
> >> sql_get.c:73-0 db_get_file_att_record fname=/server/cvsroot/iprep/ants-rt/src/components/vehicle/rectification/ip_highlevel/src/pelCont.cc,v
> >> sql_get.c:127-0 Get_file_record JobId=1683 FilenameId=539133 PathId=750055
> >> sql_get.c:129-0 Query=SELECT FileId, LStat, MD5 FROM File WHERE File.JobId=1683 AND File.PathId=750055 AND File.FilenameId=539133
> >> sql_get.c:133-0 get_file_record num_rows=1
> >> getmsg.c:110-0 bget_dirmsg 42: 89449 3 p3STS8yKrYA5KQzv5yelcg
> >> *MD5-89449*
> >>
> >>
> >> Now for debug reasons I reran this verify job.
> >>
> >> This is the last file that was checked before the bsock error occurred
> >> (packet size too big); it's the same FilenameId as above:
> >>
> >> sql_get.c:73-0 db_get_file_att_record fname=/server/cvsroot/iprep/ANTSRT/SRC/Components/UTA2/ImageRectificationOpenGl/ip/ip_hi<FF><FF><FF><FA>ghlevel/src/pelCont.cc,v
> >> sql_get.c:127-0 Get_file_record JobId=1683 FilenameId=539133 PathId=0
> >> sql_get.c:129-0 Query=SELECT FileId, LStat, MD5 FROM File WHERE File.JobId=1683 AND File.PathId=0 AND File.FilenameId=539133
> >> sql_get.c:133-0 get_file_record num_rows=0
> >> verify.c:580-0 File not in catalog: /server/cvsroot/iprep/ANTSRT/SRC/Components/UTA2/ImageRectificationOpenGl/ip/ip_hi<FF><FF><FF><FA>ghlevel/src/pelCont.cc,v
> >>
> >> I don't know where the <FF><FF><FF><FA> characters are coming from.
> >
> > Well, I would say that the most likely causes of those characters are:
> >
> > 1. You have a bad network card or the loopback interface is screwed up.
> > 2. You are not running Bacula with UTF-8 turned on and you have German
> > accents in some filenames.
> > 3. There is some strange bug in Bacula that is causing the filename
> > record to get clobbered.
> >
> > To me the most likely is #1 because it looks to me like a negative
> > integer (4 characters) has been inserted in the middle of the line.
>
> That kind of data corruption is highly unlikely - TCP has pretty good
> checks in it against that kind of problem.

Yes, I agree 100%, but it is (was) a possibility ...
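
For reference, the guess quoted above that the four stray bytes look like a
negative integer is easy to check. The sketch below is only an illustration
in Python, not Bacula code. It assumes the bytes are exactly FF FF FF FA as
printed in the debug file, and it tries both byte orders because the log
alone does not say which one applies.

```python
import struct

# The four stray bytes seen in the middle of the corrupted filename,
# taken as printed in the debug log (an assumption about the raw data).
stray = bytes([0xFF, 0xFF, 0xFF, 0xFA])

# Interpret them as a single signed 32-bit integer in both byte orders.
as_big_endian = struct.unpack(">i", stray)[0]     # network byte order
as_little_endian = struct.unpack("<i", stray)[0]  # e.g. x86 host order

print("big-endian:   ", as_big_endian)
print("little-endian:", as_little_endian)
```

Read in network byte order the bytes are the value -6, and in either byte
order the result is negative, which fits the "negative integer inserted in
the middle of the line" reading.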

>
> That said, it should be relatively straightforward to verify by examining a
> tcpdump of the problem, and looking to see if the corrupted strings appear
> corrupted in the network stream as well.

I should have suggested tcpdump. That would have been the ideal tool for 
seeing the problem.

What surprises me is that I always believed that write() operations were 
atomic.  However, in this case, two threads are writing to the same socket at 
the same time, and the data of the two write()s is intermixed -- a bit ugly.
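
One common way to keep two writers from intermixing data on a shared stream
socket is to hold a lock for the whole message, not just for one write call.
The sketch below illustrates that idea in Python. It is not Bacula's code or
its actual fix: the helper names (send_message, recv_message, demo) and the
4-byte length-prefix framing are assumptions made up for the example.

```python
import socket
import struct
import threading


def send_message(sock, lock, payload):
    """Send one length-prefixed message while holding the shared lock.

    The lock covers both the header and the payload, so another thread
    cannot slip its own bytes in between them.
    """
    header = struct.pack(">i", len(payload))
    with lock:
        sock.sendall(header)
        sock.sendall(payload)


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_message(sock):
    """Read one length-prefixed message."""
    (length,) = struct.unpack(">i", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo(messages_per_thread=200):
    """Two threads write to one socket; the reader checks the framing."""
    left, right = socket.socketpair()
    lock = threading.Lock()
    payloads = [b"A" * 64, b"B" * 64]

    def writer(payload):
        for _ in range(messages_per_thread):
            send_message(left, lock, payload)

    threads = [threading.Thread(target=writer, args=(p,)) for p in payloads]
    for t in threads:
        t.start()
    # Read in the main thread while the writers run, so they never stall
    # on a full socket buffer.
    received = [recv_message(right) for _ in range(2 * messages_per_thread)]
    for t in threads:
        t.join()
    left.close()
    right.close()
    return received


if __name__ == "__main__":
    msgs = demo()
    print(all(m in (b"A" * 64, b"B" * 64) for m in msgs))
```

With the lock in place every message arrives whole. Without it, a header
from one thread can land between another thread's header and payload, and
the reader then sees a bogus length, much like the "packet size too big"
error reported earlier in this thread.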



_______________________________________________
Bacula-devel mailing list
Bacula-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/bacula-devel