Re: [Bacula-devel] Correct usage of posix_fadvise in SD?
On Monday 29 September 2008 15:44:09, Kern Sibbald wrote:
> On Monday 29 September 2008 14:13:16 Brice Figureau wrote:
> > Hi,
> > I was looking at the 2.4.2 SD spooling code lately (as part of
> > understanding why despooling performance was not that good on my
> > hardware), when I noticed the following usage of posix_fadvise while
> > sequentially reading the spool file (despool_data):
> > #if defined(HAVE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
> > posix_fadvise(rdcr->spool_fd, 0, 0, POSIX_FADV_WILLNEED);
> > #endif
> > I don't understand why we're telling the kernel to page cache the spool
> > file we're reading since we won't reuse those data.
> > Moreover, there is no "DONTNEED" call after despool_data to let the
> > kernel know it can trash what we read.
> > I was thinking of something along these lines in despool_data:
> > #if defined(HAVE_POSIX_FADVISE) && defined(POSIX_FADV_SEQUENTIAL)
> > posix_fadvise(rdcr->spool_fd, 0, 0, POSIX_FADV_SEQUENTIAL);
> > #endif
> > #if defined(HAVE_POSIX_FADVISE) && defined(POSIX_FADV_NOREUSE)
> > posix_fadvise(rdcr->spool_fd, 0, 0, POSIX_FADV_NOREUSE);
> > #endif
> > And a few POSIX_FADV_DONTNEED after each block read with the correct
> > offset and length to tell the pagecache we don't need this part anymore.
> > Does it make sense?
> > Or did I miss something?
> Why don't you run some tests measuring the performance of your proposed
> changes versus what is there now? That would give a much more definitive
> answer than I can ...
First, one remark about what Brice said: I don't think Bacula itself would
gain anything from using NOREUSE. The benefit would go to the other programs
running concurrently with it, since it would keep the file daemon from
trashing the filesystem cache.
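As a point of reference, the pattern Brice proposes (a SEQUENTIAL hint up
front, then DONTNEED on each block once it has been consumed) might be
sketched as below. This is not Bacula code: read_and_drop, its 64 KB buffer
and the use of pread are assumptions made for illustration, and the actual
despool write is left as a comment.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Bacula): read fd from offset 0 to EOF
 * in blocks of block_size bytes, telling the kernel after each block that
 * the pages just read will not be needed again.
 * Returns the total number of bytes read, or -1 on a read error.
 */
static long long read_and_drop(int fd, size_t block_size)
{
    static char buf[65536];
    off_t offset = 0;
    ssize_t n;

    assert(block_size > 0 && block_size <= sizeof(buf));

#if defined(POSIX_FADV_SEQUENTIAL)
    /* Advice only, so a failure here is deliberately ignored. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
#endif

    while ((n = pread(fd, buf, block_size, offset)) > 0) {
        /* A real despooler would write buf[0..n) to the device here. */
#if defined(POSIX_FADV_DONTNEED)
        /* Release exactly the range that was just consumed. */
        (void)posix_fadvise(fd, offset, (off_t)n, POSIX_FADV_DONTNEED);
#endif
        offset += n;
    }
    return n < 0 ? -1 : (long long)offset;
}
```

Whether this actually helps despooling is exactly what Kern's suggested
measurements would have to show; the sketch only makes the call sequence
concrete.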
This said ...
I've also been working on posix_fadvise these days ...
I wanted to find out if it was possible to speed up backups of really small
files.
Bacula, like most backup software, is very good at reading big files, because
of the read-ahead features mentioned throughout this thread: you start a
big file, the OS reads ahead, and Bacula reads the file efficiently, as nearly
everything it needs is already cached.
For small files, that's not the case:
the file daemon spends all its time opening a file, reading it, closing it,
and going to the next one. The OS has nothing to read ahead, so
performance is extremely poor.
Here's what I've been trying to model:
Before doing the real backup, you tell the OS what you're going to read in the
next few seconds. To do this, you open the files you're going to read in
advance, call posix_fadvise on them (POSIX_FADV_WILLNEED), and have a bunch of
them done before Bacula's real work reaches them. I've already discussed it a
bit with Eric: it might not be very hard to do this using a FIFO of opened
files in the file daemon.
For now, here's how I've done it (proof of concept; I didn't code anything in
Bacula as I didn't know where to start :) ).
I've written a C program that behaves exactly like the Unix find command, but
preloads the files before printing them to stdout.
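The program itself isn't attached, so the following is only a guess at its
general shape: a tree walk that issues POSIX_FADV_WILLNEED on each regular
file and then prints the path, as find would. The names preload_tree,
preload_and_print and hinted are made up for this sketch, and closing the
descriptor right after the hint (rather than keeping a FIFO of open files)
is an assumption.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Number of regular files for which a WILLNEED hint was issued. */
static unsigned long hinted;

/* nftw callback: hint regular files, then print the path like find does. */
static int preload_and_print(const char *path, const struct stat *sb,
                             int type, struct FTW *ftw)
{
    (void)ftw;
    /* Only regular files: opening a FIFO or device here could block. */
    if (type == FTW_F && S_ISREG(sb->st_mode)) {
        int fd = open(path, O_RDONLY);
        if (fd >= 0) {
            /* len == 0 means "to end of file"; the hint is advisory. */
            (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
            hinted++;
            close(fd);
        }
    }
    puts(path);
    return 0;   /* keep walking */
}

/* Walk root without following symlinks; returns 0 on success, -1 on error. */
static int preload_tree(const char *root)
{
    return nftw(root, preload_and_print, 64, FTW_PHYS);
}
```

A main() that passes its first argument to preload_tree would give a
filter usable in front of cpio, in the same position as find.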
Then I compared my program (fadvise) against find by running:
./fadvise | cpio -o --file=/tmp/test
find dir/ | cpio -o --file=/tmp/test2
Of course, before each run, I reset the Linux cache
(echo 1 > /proc/sys/vm/drop_caches), so that both runs measure the same
thing :)
The pipe serves as a cheap (but not that smart) replacement for the real
FIFO in my tests.
time ./fadvise | cpio -o --file=/tmp/test2
time find dir/ | cpio -o --file=/tmp/test2
The results are for 2,000,000 files, spread across 1,000 directories. Each
file is 200 bytes (a kind of worst-case scenario). It's on my work PC, so
nothing fancy, only 2 SATA drives (one for the test directory, the other one
Of course, if this were implemented, the real code would need some
safeguards: there is no need to preload big files (where should the
threshold be?), no need to preload thousands of files at once (it may be
counterproductive if some files are evicted from the cache before the file
daemon gets to read them), and no need to preload a file we already know
hasn't changed ...
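To make those three safeguards concrete, here is one way the check could be
written. Everything in it is hypothetical: the function should_preload, both
limits, and the mtime-only test for "hasn't changed" are placeholders, since
the right threshold is precisely the open question.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <time.h>

/* Placeholder limits, chosen only for illustration. */
#define PRELOAD_MAX_FILE_SIZE  (64 * 1024)  /* bytes; larger files are skipped */
#define PRELOAD_MAX_IN_FLIGHT  256          /* hinted files not yet read */

/*
 * Decide whether a file is worth hinting ahead of the real read.
 *   size        - file size in bytes
 *   mtime       - file modification time
 *   last_backup - time of the previous backup (0 for a full backup)
 *   in_flight   - files already hinted but not yet read by the backup
 */
static bool should_preload(off_t size, time_t mtime, time_t last_backup,
                           unsigned in_flight)
{
    if (size > PRELOAD_MAX_FILE_SIZE)
        return false;   /* ordinary kernel read-ahead already covers big files */
    if (in_flight >= PRELOAD_MAX_IN_FLIGHT)
        return false;   /* hinting more risks eviction before the read */
    if (mtime < last_backup)
        return false;   /* simplification: treated as unchanged, so not read */
    return true;
}
```

With these placeholder values, a 200-byte file modified since the last
backup is preloaded as long as fewer than 256 hints are outstanding.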
Anyhow, I just wanted to start a discussion about this, because I feel Bacula
could do better on this use case, and this approach may be a solution. Of
course, we won't get the throughput of big files, if only because the files
won't be contiguous on disk, but at least the OS will be able to do a bit of
reordering for all these reads.