
Re: [Bacula-devel] Alternative DB structures-Proposal

> On Wednesday 24 September 2008 23:40:03 David Boyes wrote:
> > > ALTER TABLE file  add column size bigint default 0;
> >
> > This seems to assume that files/objects are byte streams. This is
> > not always true on all the platforms that Bacula supports.
> I am not sure I understand why this assumes that files/objects are
> streams.  I am not arguing, but I find your statement interesting ...
> One point is that no platform is obligated to provide a size (only
> st_mtime is required), so it may not always be available -- perhaps
> that is what you (David) mean above?

Partially, but I've been working on USS on z/OS and OpenVMS Bacula
clients, where the filesystems are block oriented rather than byte
oriented. One *can* obtain a precise dataset size in bytes, but the cost
is reading the entire file to determine where the file data actually
ends, which is very expensive on terabyte- or petabyte-scale datasets.
It also doesn't really take sparse files or structured files (like VMS
indexed datasets or VSAM data spaces) into account very well, so if this
proposal is added to the "standard" Bacula database structure, you will
encounter problems when you deal with these platforms (or anything more
complicated than a simple sequential file). 

Example: I can tell you in a few nanoseconds that a file on VMS occupies
4M in 4K blocks, but telling you that it occupies exactly 22,239,394
bytes would take on the order of several seconds. For my client work I
propose to store the number the OS returns from stat as units of
allocation, and to include a scaling factor. (BTW, this concept is built
into the virtual storage manager I talked about a few weeks ago.)
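To make the "units of allocation plus scaling factor" idea concrete, here is a small sketch in Python (purely illustrative; the function names are mine, not Bacula's). On POSIX systems stat() already exposes exactly this split: st_blocks counts 512-byte allocation units straight from filesystem metadata, which is cheap, whereas an exact byte count on a block-oriented filesystem may require reading the whole dataset.

```python
import os

def allocation_of(path):
    """Return (units, unit_size) as a block-oriented client might store them.

    POSIX defines st_blocks as a count of 512-byte allocation units; it is
    cheap metadata, unlike scanning a terabyte dataset to find where the
    data actually ends. (Hypothetical sketch of the proposed scheme.)
    """
    st = os.stat(path)
    return st.st_blocks, 512

def approximate_bytes(units, unit_size):
    # Consumers that only need a rough size multiply the two stored values.
    return units * unit_size
```

A script that naively read a single "size" column would misreport such records; with the pair stored explicitly, it at least knows which units it is looking at.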

I'm not objecting to the idea per se, but all the world is not Unix or
Windows, and you'll get some very strange results if you run scripts
against my database without understanding what units you're dealing with.
If the idea is modified to be "number of units" plus a "unit size"
factor of bytes per unit (i.e., is the unit 1 byte, a 4K block, a VSAM
cluster size of 32K, etc.), I'd be more inclined to go for it.
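As a sketch of what the amended schema might look like (using SQLite in memory for the demo; the column names are illustrative, not Bacula's actual schema): a Unix client stores bytes directly with unit_size = 1, while a VMS client stores its block count with unit_size = 4096, without ever reading the whole file.

```python
import sqlite3

# Hypothetical sketch: store a unit count plus bytes-per-unit instead of a
# single byte count. Column and table names are illustrative only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE file (fileid INTEGER PRIMARY KEY, name TEXT)")
con.execute("ALTER TABLE file ADD COLUMN units INTEGER DEFAULT 0")
con.execute("ALTER TABLE file ADD COLUMN unit_size INTEGER DEFAULT 1")  # 1 = byte stream

# A byte-oriented client records exact bytes; a block-oriented client
# records cheap allocation units (the 5,430 blocks here are made up).
con.execute("INSERT INTO file (name, units, unit_size) "
            "VALUES ('unix.dat', 22239394, 1)")
con.execute("INSERT INTO file (name, units, unit_size) "
            "VALUES ('vms.dat', 5430, 4096)")

# Reporting scripts multiply the two columns for an approximate byte size.
for name, approx in con.execute("SELECT name, units * unit_size FROM file"):
    print(name, approx)
```

The key point is that every consumer of the table is forced to apply the scaling factor, so byte-stream and block-oriented platforms coexist in one schema.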

So, while the original idea works if the base assumption is
non-structured byte oriented filesystems, it's not easily expanded when
more complex filesystem constructs are in play. Thus my objection. 

Bacula-devel mailing list
