
Re: [Bacula-devel] MaximumBlockSize Problem and Question


On Friday 07 November 2008 16:38:51 Ulrich Leodolter wrote:
> On Fri, 2008-11-07 at 14:47 +0100, Eric Bollengier wrote:
> > > On Friday 07 November 2008 14:36:46, Kern Sibbald wrote:
> > > On Friday 07 November 2008 13:19:45 Ralf Gross wrote:
> > > > Kern Sibbald schrieb:
> > > > > On Thursday 06 November 2008 22:47:41 Ralf Gross wrote:
> > > > > > Alex Chekholko schrieb:
> > > > > > > On Wed, 5 Nov 2008 16:12:51 +0100
> > > > > > >
> > > > > > > Kern Sibbald <kern@xxxxxxxxxxx> wrote:
> > > > > > > > For writing to tape (provided it is LTO-n) I strongly
> > > > > > > > recommend a block size that does not exceed 256K.
> > > > > > >
> > > > > > > Hi Kern,
> > > > > > >
> > > > > > > Why do you say that?  Is this thread relevant?:
> > > > > > > http://www.mail-archive.com/bacula-devel@xxxxxxxxxxxxxxxxxxxxx/msg01246.html
> > > > > > >
> > > > > > > Also, I would like to corroborate the OP's experiences; I had
> > > > > > > an almost identical thread about small block size and slow
> > > > > > > write speed:
> > > > > > > http://www.nabble.com/LTO-4-performance--td17407840.html
> > > > > > >
> > > > > > > In fact, I was unable to get higher block sizes working at all
> > > > > > > with btape:
> > > > > > > http://www.adsm.org/lists/html/Bacula-users/2008-05/msg00504.html
> > > > > > >
> > > > > > > So I am still stuck at ~22 MB/s writing to LTO-4 with the
> > > > > > > default block size.
> > > > > >
> > > > > > I don't think that the block size is the problem. I did some
> > > > > > tests but couldn't get higher throughput with larger block
> > > > > > sizes. I get 75-85 MB/s with the default block size and no
> > > > > > additional tuning.
> > > > >
> > > > > That is probably correct, but most likely only because you have a
> > > > > bottleneck elsewhere -- probably in one of the points I mentioned. 
> > > > > The speed is always capped by the slowest component. Once you
> > > > > remove the other bottlenecks on your system, the blocksize will
> > > > > very likely become the bottleneck and then you can measure the
> > > > > difference.
> > > >
> > > > I didn't want to complain, just to show the original poster that
> > > > his 22 MB/s is likely not a block-size issue.
> > > >
> > > > That being said, I started a thread on the users list a while ago
> > > > where I asked what throughput people are getting when writing to
> > > > tape.
> > > > Nobody involved in this thread got higher numbers than 80-85 MB/s for
> > > > a single job.
> > >
> > > That is probably reasonable for one job, but if you are writing to an
> > > LTO-2, 3, or 4 drive, we know that with multiple simultaneous jobs it
> > > is possible to get write speeds of 150 MB/sec.
> > >
> > > Kern
> >
> > That is with a good hardware compression ratio, but IMHO it's a very
> > good test, better than using random data and getting only 80 MB/sec. If
> > you are able to write at 150 MB/s, I'm pretty sure that you will be able
> > to write at 80 MB/s... Even if your source file is made with
> > dd if=/dev/zero, your hard disk, your network, and your SCSI/SAS (or
> > whatever) controller have to handle it like real data.
> >
> > Bye
>
> Hi
>
> I think we are talking about different things.
>
> I have a performance problem with only one particular job:
>
> Copy Disk to Tape
>
> My Disk Backup jobs run concurrently and are spread over
> size-limited Disk Volumes.
>
> Bacula runs the Copy Disk to Tape jobs in sequential order
> ( Selection Type = PoolUncopiedJobs )
>
>
> foreach job in PoolUncopiedJobs
> {
>    Read Job Data from Disk Volumes
>    Optionally Spool Data
>    Write Job Data to Tape
> }
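
For reference, a Copy job of this kind is defined in bacula-dir.conf roughly
as in the sketch below. The resource and pool names are invented; only the
Selection Type directive is taken from the setup described above.

   Job {
     Name = "CopyDiskToTape"
     Type = Copy
     Level = Full
     Client = troll-fd            # required by the parser, unused by Copy
     FileSet = "Full Set"         # likewise just a placeholder
     Messages = Standard
     Pool = Disk-Pool             # source pool holding the disk volumes
     Selection Type = PoolUncopiedJobs
   }

   Pool {
     Name = Disk-Pool
     Pool Type = Backup
     Storage = FileStorage
     Next Pool = Tape-Pool        # the copies are written here
   }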
>
>
> Better performance inside a single storage daemon is only
> possible if the read from disk and the (spooling) write to tape
> run buffered and in parallel.
> (multithreading, which AFAIK is not used in this situation)

Yes, the above is more or less correct.  That is the way it works today;
some day we will have multiple buffering ...
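
In outline, that multiple buffering would look something like the sketch
below: one thread reads from the disk volume while a second writes to tape,
decoupled by a small ring of buffers. All names are invented; this is an
illustration of the technique, not Bacula source code.

   /* Overlapped disk-read / tape-write via a ring of buffers (sketch). */
   #include <pthread.h>
   #include <stdio.h>

   #define NBUF  4
   #define BUFSZ (256 * 1024)            /* one tape-block-sized buffer */

   static char   ring[NBUF][BUFSZ];
   static size_t nbytes[NBUF];           /* valid bytes per slot */
   static int    head, tail, count;      /* ring state, guarded by lock */
   static int    reader_done;
   static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
   static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
   static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

   static void *disk_reader(void *arg)   /* producer: disk volume -> ring */
   {
      FILE *in = (FILE *)arg;
      for (;;) {
         pthread_mutex_lock(&lock);
         while (count == NBUF)           /* wait for a free slot */
            pthread_cond_wait(&not_full, &lock);
         pthread_mutex_unlock(&lock);

         /* This read overlaps with the writer working on other slots. */
         size_t n = fread(ring[head], 1, BUFSZ, in);

         pthread_mutex_lock(&lock);
         if (n > 0) {
            nbytes[head] = n;
            head = (head + 1) % NBUF;
            count++;
         }
         if (n < BUFSZ)
            reader_done = 1;             /* EOF or error: let writer drain */
         pthread_cond_signal(&not_empty);
         pthread_mutex_unlock(&lock);
         if (n < BUFSZ)
            return NULL;
      }
   }

   static void *tape_writer(void *arg)   /* consumer: ring -> tape device */
   {
      FILE *out = (FILE *)arg;
      for (;;) {
         pthread_mutex_lock(&lock);
         while (count == 0 && !reader_done)
            pthread_cond_wait(&not_empty, &lock);
         if (count == 0 && reader_done) {
            pthread_mutex_unlock(&lock); /* ring drained, reader finished */
            return NULL;
         }
         int slot = tail;
         pthread_mutex_unlock(&lock);

         fwrite(ring[slot], 1, nbytes[slot], out);

         pthread_mutex_lock(&lock);
         tail = (tail + 1) % NBUF;
         count--;
         pthread_cond_signal(&not_full);
         pthread_mutex_unlock(&lock);
      }
   }

   int main(void)
   {
      /* Stand-ins for the real SD devices, just to make this runnable. */
      FILE *in  = fopen("/tmp/disk-volume", "rb");
      FILE *out = fopen("/tmp/tape-device", "wb");
      if (!in || !out)
         return 1;
      pthread_t r, w;
      pthread_create(&r, NULL, disk_reader, in);
      pthread_create(&w, NULL, tape_writer, out);
      pthread_join(r, NULL);
      pthread_join(w, NULL);
      fclose(in);
      fclose(out);
      return 0;
   }

With the ring in place, the tape never waits on a disk seek and the disk
never waits on a tape flush, as long as neither side is persistently slower
than the other.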

>
>
> But there is another issue:
>
> Bacula reads much more data from disk (up to 10 times more) than
> it writes to tape (verified with the sysstat tools)
>
> It looks like Bacula first does an lseek() to position itself
> and then always reads to the end of the disk volume.
> I see "End of Volume" messages for each disk volume that is used.

Yes, it is possible that it is reading to the end of the volume. The
original code did that.  I added code a long time ago that keeps track of
when the bsr is consumed and then stops.  However, perhaps it doesn't work,
or doesn't work in some situations.  With a few restores, the extra reading
is not very noticeable or important, but for Copy or Migration jobs it could
be a real performance problem.
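
The intended behavior is easy to show with a toy model: track which block
ranges the bsr still wants, and break out of the read loop as soon as all of
them have been delivered instead of scanning on to the End of Volume. The
types and names below are invented for illustration; the real logic lives in
src/stored/match_bsr.c.

   #include <stdbool.h>
   #include <stdint.h>
   #include <stdio.h>

   struct bsr_range {                 /* one block range wanted by the bsr */
      uint32_t start, end;            /* inclusive */
      bool     done;
   };

   static bool bsr_is_consumed(const struct bsr_range *r, int n)
   {
      for (int i = 0; i < n; i++)
         if (!r[i].done)
            return false;
      return true;
   }

   int main(void)
   {
      const uint32_t volume_blocks = 1000000;     /* whole disk volume */
      struct bsr_range bsr[] = {                  /* one job's records */
         { 100, 250, false },
         { 900, 940, false },
      };
      const int n = (int)(sizeof(bsr) / sizeof(bsr[0]));
      uint32_t blocks_read = 0;

      for (uint32_t blk = 0; blk < volume_blocks; blk++) {
         blocks_read++;                           /* simulated block read */
         for (int i = 0; i < n; i++)
            if (blk == bsr[i].end)                /* last record of range */
               bsr[i].done = true;
         if (bsr_is_consumed(bsr, n)) {           /* the early exit */
            printf("stopped after %u of %u blocks\n",
                   (unsigned)blocks_read, (unsigned)volume_blocks);
            return 0;
         }
      }
      printf("read to End of Volume: %u blocks\n", (unsigned)blocks_read);
      return 0;
   }

Here the early exit stops after 941 blocks instead of one million, which is
the same order of savings the poster's 10x read amplification suggests.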

>
> 07-Nov 11:10 troll-sd JobId 5572: Forward spacing Volume "Backup-0358" to file:block 0:1963856407.
> 07-Nov 11:11 troll-sd JobId 5572: End of Volume at file 0 on device "FileStorage" (/disk0/bacula/files), Volume "Backup-0358"
> 07-Nov 11:11 troll-sd JobId 5572: Ready to read from volume "Backup-0359" on device "FileStorage" (/disk0/bacula/files).
> 07-Nov 11:11 troll-sd JobId 5572: Forward spacing Volume "Backup-0359" to file:block 0:387281.
> 07-Nov 11:12 troll-sd JobId 5572: End of Volume at file 0 on device "FileStorage" (/disk0/bacula/files), Volume "Backup-0359"
> 07-Nov 11:12 troll-sd JobId 5572: Ready to read from volume "Backup-0360" on device "FileStorage" (/disk0/bacula/files).
> 07-Nov 11:12 troll-sd JobId 5572: Forward spacing Volume "Backup-0360" to file:block 0:580817.
> 07-Nov 11:13 troll-sd JobId 5572: End of Volume at file 0 on device "FileStorage" (/disk0/bacula/files), Volume "Backup-0360"
> 07-Nov 11:13 troll-sd JobId 5572: Ready to read from volume "Backup-0361" on device "FileStorage" (/disk0/bacula/files).
> 07-Nov 11:13 troll-sd JobId 5572: Forward spacing Volume "Backup-0361" to file:block 0:258257.
>
> Remember, my Disk Volumes hold data from up to 15 backup jobs.
>
> Other people reported the same issue for Migration jobs.
>
> I will try to verify and track this down at home
> over the weekend.  debug = 999 should help :-)

Well, the problem is very likely in src/stored/match_bsr.c, which is
currently set to debug level 500, so 999 will turn its tracing on.  However,
you will also get a lot of other output, particularly from read_record.c.
Depending on what you see, you might want to set the dbglevel variable at
the top of match_bsr.c to something like 50 and then use debug=50, simply to
reduce the output a bit.
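
The mechanism being suggested is ordinary level-gated tracing: a debug
message prints only when its level is less than or equal to the daemon's -d
setting, so lowering the per-file dbglevel from 500 to 50 lets a modest
debug=50 run show just the match_bsr.c messages. The macro below is a
self-contained stand-in for illustration, not Bacula's real Dmsg
implementation.

   #include <stdio.h>
   #include <stdlib.h>

   static int debug_level = 0;        /* global level, what -dNN sets     */
   static int dbglevel    = 500;      /* per-file gate, as in match_bsr.c */

   /* Print only if the message's level is within the daemon's -d level. */
   #define Dmsg1(lvl, fmt, a1) \
      do { if ((lvl) <= debug_level) fprintf(stderr, fmt, a1); } while (0)

   int main(int argc, char *argv[])
   {
      debug_level = (argc > 1) ? atoi(argv[1]) : 50;      /* as if -d50   */
      Dmsg1(dbglevel, "bsr match at block %u\n", 1234u);  /* 500>50: off  */
      dbglevel = 50;                 /* the suggested one-line source edit */
      Dmsg1(dbglevel, "bsr match at block %u\n", 1234u);  /* now prints   */
      return 0;
   }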

Regards,

Kern

