Re: [Bacula-devel] Correct usage of posix_fadvise in SD?
You have some interesting ideas.
One thing to note is that you cannot do an fadvise() until the file is open,
which you surely know, unless there is another form of fadvise that I am not
aware of.
If you want to do comparisons of fadvise() and no fadvise() and apply the
results to Bacula, you will need to take into consideration that once the
Bacula FD opens a file for doing a backup, it does an fadvise(WILLNEED), and
when it closes that file, it does an fdatasync() on the file and does an
fadvise(DONTNEED). This could potentially flush the cache out from under
another job that is simultaneously using the file, but I considered that to
be of minimal impact compared to the benefit of reading the file in fadvise()
mode, then flushing it to disk after the backup and telling the OS that we no
longer need it.
So hopefully the current code already has at least a simplistic fadvise
optimization during backup.
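For anyone who wants to reproduce that sequence in a standalone test, here is a rough sketch of the open/WILLNEED/read/fdatasync/DONTNEED/close order described above. It is not the actual Bacula FD code: the function name backup_read_file and the 64 KB buffer are made up for illustration, and the fadvise calls are guarded only on the POSIX_FADV_* macros.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Bacula code): read one file using the advice
 * sequence described above. Returns the number of bytes read, or -1. */
long backup_read_file(const char *path)
{
    char buf[65536];
    long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

#if defined(POSIX_FADV_WILLNEED)
    /* offset 0 with len 0 covers the whole file; the call is only a
     * hint, so its return value is deliberately ignored */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
#endif

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;            /* a real backup would send buf onward here */

    (void)fdatasync(fd);       /* flush any dirty pages before dropping them */
#if defined(POSIX_FADV_DONTNEED)
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
#endif
    close(fd);
    return n < 0 ? -1 : total;
}
```

Timing this function against the same loop with the two posix_fadvise() calls removed (and the cache dropped between runs) would give the kind of comparison discussed in this thread.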
On Monday 29 September 2008 16:52:21 Marc Cousin wrote:
> On Monday 29 September 2008 15:44:09, Kern Sibbald wrote:
> > On Monday 29 September 2008 14:13:16 Brice Figureau wrote:
> > > Hi,
> > >
> > > I was looking at the 2.4.2 SD spooling code lately (this was part of
> > > understanding why despooling performance was not that good on my
> > > hardware), when I noticed the following usage of posix_fadvise while
> > > sequentially reading the spool file (despool_data):
> > >
> > > #if defined(HAVE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
> > > posix_fadvise(rdcr->spool_fd, 0, 0, POSIX_FADV_WILLNEED);
> > > #endif
> > >
> > > I don't understand why we're telling the kernel to page cache the spool
> > > file we're reading since we won't reuse those data.
> > > Moreover, there is no "DONTNEED" call after despool_data to let the
> > > kernel know it can trash what we read.
> > >
> > > I thought of something along the lines of this in despool_data:
> > > #if defined(HAVE_POSIX_FADVISE) && defined(POSIX_FADV_SEQUENTIAL)
> > > posix_fadvise(rdcr->spool_fd, 0, 0, POSIX_FADV_SEQUENTIAL);
> > > #endif
> > > #if defined(HAVE_POSIX_FADVISE) && defined(POSIX_FADV_NOREUSE)
> > > posix_fadvise(rdcr->spool_fd, 0, 0, POSIX_FADV_NOREUSE);
> > > #endif
> > >
> > > And a few POSIX_FADV_DONTNEED after each block read with the correct
> > > offset and length to tell the pagecache we don't need this part
> > > anymore.
> > >
> > > Does it make sense?
> > > Or did I miss something?
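To make Brice's proposal concrete for testing, the idea could be sketched roughly as below: advise SEQUENTIAL (and NOREUSE where defined) once, then issue DONTNEED for each block with its own offset and length after it has been read. The function despool_sequential is a hypothetical stand-in, not the real despool_data, and it only counts bytes instead of writing them to a volume.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the despool loop: read fd sequentially from its
 * current position in block_size chunks, telling the kernel after each chunk
 * that those pages are no longer needed. Returns bytes read, or -1. */
long despool_sequential(int fd, char *buf, size_t block_size)
{
    off_t offset;
    long total = 0;
    ssize_t n;

    assert(fd >= 0 && buf != NULL && block_size > 0);

    offset = lseek(fd, 0, SEEK_CUR);   /* where the reads will start */
    if (offset < 0)
        return -1;

#if defined(POSIX_FADV_SEQUENTIAL)
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
#endif
#if defined(POSIX_FADV_NOREUSE)
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);
#endif

    while ((n = read(fd, buf, block_size)) > 0) {
        /* a real despool would write buf to the volume here */
#if defined(POSIX_FADV_DONTNEED)
        /* drop only the range just consumed: [offset, offset + n) */
        (void)posix_fadvise(fd, offset, n, POSIX_FADV_DONTNEED);
#endif
        offset += n;
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

Whether this actually beats the existing WILLNEED call is exactly what a measurement would have to show.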
> > Why don't you run some tests measuring the performance of your proposed
> > changes versus what is there now. That would give a much more definitive
> > answer than I can ...
> > Regards,
> > Kern
> First, a note about what Brice said: I don't think Bacula itself would gain
> anything from using NOREUSE. It would mostly benefit the other programs
> running concurrently with it, by keeping the file daemon from trashing the
> filesystem cache.
> This said ...
> I've also been working on posix_fadvise these days ...
> I wanted to find out if it was possible to speed up backups of really small
> files.
> Bacula, like most backup software, is very good at reading big files,
> because of the read-ahead features mentioned throughout this thread:
> you start a big file, the OS reads ahead, so Bacula reads the file
> efficiently, since nearly everything it needs is already cached.
> For small files, that is not the case:
> the file daemon spends all its time opening a file, reading it, closing it,
> going to the next ... Of course, the OS can do no readahead, and the
> performance is extremely poor.
> Here's what I've been trying to model :
> Before doing the real backup, you tell the OS what you're going to read in
> the next few seconds. To do this, you need to open the files you're going
> to read in advance, posix_fadvise() them (POSIX_FADV_WILLNEED), and have a
> bunch of them done before the real work from Bacula comes. I've already
> discussed it a bit with Eric: it might not be very hard to do this using a
> FIFO of opened files in the file daemon.
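A FIFO of opened files along those lines might look something like the sketch below. Nothing here exists in Bacula: the struct, the function names and the depth of 4 are all invented for illustration. The producer opens a file, hints WILLNEED and queues the descriptor; the consumer later pops the oldest descriptor, reads it and closes it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define PREFETCH_DEPTH 4          /* hypothetical queue depth */

/* Hypothetical ring buffer of descriptors that were already advised */
struct prefetch_fifo {
    int fds[PREFETCH_DEPTH];
    int head;                     /* index of the oldest queued fd */
    int count;                    /* number of queued fds */
};

void prefetch_init(struct prefetch_fifo *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Open path, hint WILLNEED and queue the fd.
 * Returns 0 on success, -1 if the queue is full or open() fails. */
int prefetch_push(struct prefetch_fifo *q, const char *path)
{
    int fd;

    assert(q != NULL && path != NULL);
    if (q->count == PREFETCH_DEPTH)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
#if defined(POSIX_FADV_WILLNEED)
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
#endif
    q->fds[(q->head + q->count) % PREFETCH_DEPTH] = fd;
    q->count++;
    return 0;
}

/* Return the oldest queued fd (the caller reads and closes it),
 * or -1 if the queue is empty. */
int prefetch_pop(struct prefetch_fifo *q)
{
    int fd;

    assert(q != NULL);
    if (q->count == 0)
        return -1;
    fd = q->fds[q->head];
    q->head = (q->head + 1) % PREFETCH_DEPTH;
    q->count--;
    return fd;
}
```

The fixed depth bounds how far ahead of the reader the prefetching runs, which also limits the number of descriptors held open at once.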
> For now, here's how I've done it (proof of concept, I didn't code anything
> in Bacula as I didn't know where to start :) )
> I've written a C program that behaves exactly like the Unix find command,
> but preloads the files before printing them to stdout.
> Then I've compared my program (fadvise) against find by running
> these:
> ./fadvise | cpio -o --file=/tmp/test
> find dir/ | cpio -o --file=/tmp/test2
> Of course, before each run, I've reset the linux cache (
> echo 1 > /proc/sys/vm/drop_caches), in order to measure the same thing on
> both runs :)
> The purpose is to use the pipe as a cheap (but not that smart) replacement
> of the real fifo for my tests.
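Marc's test program is not included in the thread, so the following is only a guess at its general shape: walk a tree with nftw(), hint WILLNEED on every regular file, and print each path so that a consumer such as cpio can read the list from a pipe. The names preload_walk and preload_visit are invented, and closing each descriptor right after the hint is an assumption of this sketch, since the thread does not say how the original handled descriptors.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <fcntl.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static FILE *preload_out;      /* stream that receives the path list */
static long preload_count;     /* regular files hinted so far */

/* nftw() callback: hint WILLNEED for regular files, print every path */
static int preload_visit(const char *path, const struct stat *sb,
                         int type, struct FTW *ftw)
{
    (void)sb;
    (void)ftw;
    if (type == FTW_F) {
        int fd = open(path, O_RDONLY);
        if (fd >= 0) {
#if defined(POSIX_FADV_WILLNEED)
            (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
#endif
            close(fd);         /* assumption: the hint is enough on its own */
            preload_count++;
        }
    }
    fprintf(preload_out, "%s\n", path);
    return 0;                  /* 0 tells nftw() to keep walking */
}

/* Walk root like find, preloading regular files.
 * Returns the number of files hinted, or -1 if the walk fails. */
long preload_walk(const char *root, FILE *out)
{
    assert(root != NULL && out != NULL);
    preload_out = out;
    preload_count = 0;
    if (nftw(root, preload_visit, 16, FTW_PHYS) != 0)
        return -1;
    return preload_count;
}
```

A small main() that calls preload_walk("dir/", stdout) would give something usable on the left side of the pipe in the commands above.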
> time ./fadvise | cpio -o --file=/tmp/test2
> real 16m25.809s
> user 0m8.977s
> sys 1m19.657s
> time find dir/ | cpio -o --file=/tmp/test2
> real 25m6.516s
> user 0m10.621s
> sys 2m36.558s
> The results are for 2,000,000 files, dispatched into 1,000 directories.
> Each file is 200 bytes (some sort of worst case scenario). It's on my work
> PC, so nothing fancy, only 2 SATA drives (one for the test directory, the
> other one contains /tmp).
> Of course, if it were to be implemented, some safeguards should be added
> to the real code: there is no need to preload big files (where should the
> threshold be put?), there is no need to preload thousands of files (it may
> be counterproductive if some files are evicted from the cache before the
> file daemon comes to read them), there is no need to preload a file we
> already know hasn't changed ...
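Those three safeguards could be folded into one small predicate, sketched below. The limits (64 KB, 256 queued files) are arbitrary placeholders since the thread leaves the threshold question open, and comparing mtime against the time of the last backup is just one assumed way of deciding that a file has not changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <time.h>

#define PRELOAD_MAX_SIZE   (64 * 1024)  /* hypothetical size threshold */
#define PRELOAD_MAX_QUEUED 256          /* hypothetical cap on pending files */

/* Hypothetical policy check: should this file be preloaded before the
 * file daemon gets to it? */
bool should_preload(off_t size, time_t mtime, time_t last_backup, int queued)
{
    assert(size >= 0 && queued >= 0);

    if (size > PRELOAD_MAX_SIZE)
        return false;   /* big file: ordinary readahead already works well */
    if (queued >= PRELOAD_MAX_QUEUED)
        return false;   /* too far ahead: pages may be evicted before use */
    if (mtime <= last_backup)
        return false;   /* assumed unchanged, so it will not be read at all */
    return true;
}
```

For example, should_preload(200, mtime, last_backup, 10) is true for a 200-byte file modified after the last backup, and false once 256 files are already queued.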
> Anyhow, I just wanted to start a discussion about this, because I feel
> Bacula could do better on this use case, and this way of doing things may be
> a solution. Of course, we won't get the throughput of big files, if only
> because the files won't be contiguous on disk, but at least the OS will be
> able to do a bit of reordering for all these reads.