
Re: [Bacula-devel] Accurate file project hash tables

Kern Sibbald wrote:
> On Tuesday 25 March 2008 21:41:09 Jesper Krogh wrote:
>> Kern Sibbald wrote:
>>> Hello Eric,
>>> Background (somewhat simplified):
>>> The Accurate delete/restore project (item #1) is now implemented (thanks
>>> to Eric).  For it to work, the client machine must maintain a temporary
>>> table supplied by the Director for each job. This table contains all the
>>> files that are in the current backup from the Director's point of view at
>>> the beginning of the job.  The table allows the FD to determine what
>>> files have been added and/or deleted to/from the system that are not yet
>>> saved (or deleted).
>>> As currently implemented this table is a hash table using the hash class
>>> that I wrote 3 or 4 years ago for this particular project. It is fast and
>>> efficient.
>>> Problem:
>>> Currently the hash table is entirely kept in memory, which means the
>>> client uses a rather huge amount of memory, which for systems with 10
>>> million files will probably exceed even the largest amounts of RAM
>>> currently used.  We would like to have some solution that allows the
>>> number of files to grow to 20 million or more and still use less than 1GB
>>> of RAM.
>> I won't claim that my applications are programmed in a sane manner, but
>> my largest fileset currently contains 27 million files. I would like to
>> disable this feature if it kills my server.
> You won't need to disable it; just don't use the option if you don't want it.
>> Doesn't the Bacula Director's restore browser suffer from the exact same
>> problem?
> Yes, which is why I lean toward simply writing the data to a file - simple and 
> it will solve the problem for both cases without the enormous pains it always 
> takes when implementing pre-existing libraries with their portability 
> problems, license problems, and huge amounts of time spent correcting bugs in 
> unknown code.  Virtually all pre-existing code that we have pulled into
> Bacula, with the exception of some system library code, has created one
> or more of those problems.

My question (without having any clue about the actual code): wouldn't it
be most efficient in both cases to retrieve it "as needed" directly from
the Director's database instead of having to pull everything before

Is the DB-structure unfit for that?
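
To make the "as-needed" idea concrete, here is a minimal sketch in Python. It is not Bacula code: the table backed_up_file, its columns, and the functions build_catalog, classify and deleted are all hypothetical names made up for illustration, and an in-memory SQLite database stands in for the Director's catalog. It only shows the shape of the approach: one indexed lookup per file instead of loading the whole file list into RAM first.

```python
import sqlite3


def build_catalog(entries):
    """Create a stand-in catalog (hypothetical schema, not Bacula's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE backed_up_file ("
        " path TEXT PRIMARY KEY,"      # indexed, so each lookup is cheap
        " mtime INTEGER,"
        " seen INTEGER DEFAULT 0)"     # set when the file is found on disk
    )
    conn.executemany(
        "INSERT INTO backed_up_file (path, mtime) VALUES (?, ?)", entries
    )
    return conn


def classify(conn, path, mtime):
    """Look up one file on demand and report its state."""
    row = conn.execute(
        "SELECT mtime FROM backed_up_file WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        return "new"
    # Remember that this catalog entry still exists on the client.
    conn.execute("UPDATE backed_up_file SET seen = 1 WHERE path = ?", (path,))
    return "unchanged" if row[0] == mtime else "changed"


def deleted(conn):
    """Catalog entries never seen during the scan count as deleted."""
    rows = conn.execute(
        "SELECT path FROM backed_up_file WHERE seen = 0 ORDER BY path"
    )
    return [r[0] for r in rows]
```

Note that per-file lookups alone do not find deleted files, which is why the sketch marks entries as seen and queries the unmarked ones at the end. In a real setup each lookup would presumably also cost a round trip between client and Director, so it trades memory for latency.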
