
Re: [Bacula-devel] sql query with PathId=0 during verify job


On Monday 24 March 2008 20:17:05 Ralf Gross wrote:
> Hi,
>
> while trying to debug the bsock errors that I get during some verify
> jobs, I found something interesting in the fd's debug file (-d100).

This is a good idea.

>
> A verify job of the same jobid (diff backup from sunday) first
> failed yesterday but was successful in a second attempt.
>
> Part of the successful verify job from sunday:
>
> sql_get.c:73-0 db_get_file_att_record fname=/server/cvsroot/iprep/ants-rt/src/components/vehicle/rectification/ip_highlevel/src/pelCont.cc,v
> sql_get.c:127-0 Get_file_record JobId=1683 FilenameId=539133 PathId=750055
> sql_get.c:129-0 Query=SELECT FileId, LStat, MD5 FROM File WHERE File.JobId=1683 AND File.PathId=750055 AND File.FilenameId=539133
> sql_get.c:133-0 get_file_record num_rows=1
> getmsg.c:110-0 bget_dirmsg 42: 89449 3 p3STS8yKrYA5KQzv5yelcg *MD5-89449*
>
>
> Now for debug reasons I reran this verify job.
>
> This is the last file that was checked before the bsock error occurred
> (packet size too big); it's the same FilenameId as above:
>
> sql_get.c:73-0 db_get_file_att_record fname=/server/cvsroot/iprep/ANTSRT/SRC/Components/UTA2/ImageRectificationOpenGl/ip/ip_hi<FF><FF><FF><FA>ghlevel/src/pelCont.cc,v
> sql_get.c:127-0 Get_file_record JobId=1683 FilenameId=539133 PathId=0
> sql_get.c:129-0 Query=SELECT FileId, LStat, MD5 FROM File WHERE File.JobId=1683 AND File.PathId=0 AND File.FilenameId=539133
> sql_get.c:133-0 get_file_record num_rows=0
> verify.c:580-0 File not in catalog: /server/cvsroot/iprep/ANTSRT/SRC/Components/UTA2/ImageRectificationOpenGl/ip/ip_hi<FF><FF><FF><FA>ghlevel/src/pelCont.cc,v
>
> I don't know where the <FF><FF><FF><FA> characters are coming from.

Well, I would say that the most likely causes of those characters are:

1. You have a bad network card or the loopback interface is screwed up.
2. You are not running Bacula with UTF-8 turned on and you have German accents 
in some filenames.
3. There is some strange bug in Bacula that is causing the filename record to 
get clobbered.

To me the most likely is #1, because it looks like a negative integer (4 
bytes) has been inserted in the middle of the line.
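
As a quick sanity check of the "negative integer" reading, here is a minimal 
sketch that interprets the four stray bytes FF FF FF FA as a signed 32-bit 
value.  The helper name be32_to_signed is made up for illustration, and 
reading the bytes in network (big-endian) byte order is an assumption; the 
debug output only shows the raw bytes.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Bacula: interpret 4 bytes as a
 * big-endian (network order) two's-complement 32-bit integer.
 * The byte order is an assumption made for this illustration. */
int32_t be32_to_signed(const unsigned char b[4])
{
   uint32_t u = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
   int64_t v = (int64_t)u;

   /* Values with the top bit set represent negative numbers. */
   if (v >= INT64_C(0x80000000)) {
      v -= INT64_C(0x100000000);
   }
   return (int32_t)v;
}
```

Under that assumption, the bytes FF FF FF FA decode to -6, which is 
consistent with a small negative integer having landed inside the filename.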


>
> Interestingly, the successful verify job ran a different SQL
> query, with File.PathId=750055 instead of File.PathId=0.
> This is the only query with File.PathId=0 in the debug file.

Well, the PathId=0 is to be expected if the path is messed up, as in the above 
case.  That simply means that the path was not found.

However, in looking over the code, I can see that it is not really optimal: it 
really should avoid submitting the SQL with the 0, because it is clear that 
the SQL will fail, so why bother submitting it in the first place?

>
> Is the File.PathId in the failed job 0 because of the extra characters
> in the path to the file? Where are these characters coming from?
> Bad memory?  But wouldn't I then get similar errors during a backup
> job that runs 10 hours? I haven't had a backup error for weeks, but
> many verify errors (100% bsock errors).

If hypothesis #1 is correct, I would worry about my backups too, but as you 
point out, it doesn't quite make sense that only verify fails.

>
>
> All daemons that are involved in this verify job are running on the
> same server. All traffic is going over the lo interface. postgres 8.1,
> bacula 2.2.8.
>
> I've no problems in backing up TB's of data, but verify jobs are
> giving me a hard time at the moment.

I've attached a patch for 2.2.8 that will prevent Bacula from submitting an 
SQL command when the PathId (or FilenameId) is zero.  I recommend that you 
apply it and see if it fixes your problem.  I give it a probability of about 
1% that it will solve your problem (i.e. I don't see why it should help 
unless the PathId=0 makes PostgreSQL go off the deep end).

The real problem lies in why you are getting the FFFFFFFA in the middle of 
your data record.

Apply the patch with:

  cd <bacula-source>
  patch -p0 <2.2.8-sql.patch
  ./configure <your options>
  make
  ...
  make install

Another interesting thing you might want to try is to get debug output from 
both the DIR and the FD.  When you find the FFFFFFFA garbage in the middle of 
your DIR record, it would be interesting to see what the FD actually sent.  
If you set the FD debug level to 20, the FD should dump the record just after 
it sends it.
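
If you capture the raw record on both sides, a small scan like the following 
sketch could help pin down where the stray bytes start, so the FD and DIR 
copies can be compared at the same offset.  The function find_bytes is a 
hypothetical helper written for this message, not something in the Bacula 
source, and how you get the record bytes into a buffer is left open.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Bacula: return the offset of the
 * first occurrence of pat (pat_len bytes) inside buf (len bytes),
 * or -1 if the pattern does not occur. */
long find_bytes(const unsigned char *buf, size_t len,
                const unsigned char *pat, size_t pat_len)
{
   if (pat_len == 0 || pat_len > len) {
      return -1;
   }
   for (size_t i = 0; i + pat_len <= len; i++) {
      if (memcmp(buf + i, pat, pat_len) == 0) {
         return (long)i;
      }
   }
   return -1;
}
```

Searching both captures for the four bytes FF FF FF FA would show whether 
the FD already sent them or whether they only appear on the DIR side.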

Best regards,

Kern

>
>
> Ralf
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/bacula-devel


Index: src/cats/sql_get.c
===================================================================
--- src/cats/sql_get.c	(revision 6671)
+++ src/cats/sql_get.c	(working copy)
@@ -69,18 +69,16 @@
  */
 int db_get_file_attributes_record(JCR *jcr, B_DB *mdb, char *fname, JOB_DBR *jr, FILE_DBR *fdbr)
 {
-   int stat;
+   int stat = 0;
    Dmsg1(100, "db_get_file_att_record fname=%s \n", fname);
 
    db_lock(mdb);
    split_path_and_file(jcr, mdb, fname);
 
-   fdbr->FilenameId = db_get_filename_record(jcr, mdb);
-
-   fdbr->PathId = db_get_path_record(jcr, mdb);
-
-   stat = db_get_file_record(jcr, mdb, jr, fdbr);
-
+   if ((fdbr->FilenameId=db_get_filename_record(jcr, mdb)) != 0 &&
+       (fdbr->PathId=db_get_path_record(jcr, mdb)) != 0) {
+      stat = db_get_file_record(jcr, mdb, jr, fdbr);
+   }
    db_unlock(mdb);
 
    return stat;

