
Re: [Bacula-devel] Fwd: [Bacula-users] bacula-dir virtual memory limit during restore

Hello John,

First, a workaround for your restore problem: Bacula doesn't need the File 
table information to restore a backup; you can give a bootstrap file to your 
restore command. If you are using the WriteBootstrap option and still have 
the file, this is very simple; otherwise the documentation explains how to 
generate a basic bootstrap from the job log.

restore bootstrap=/path/to/file
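For reference, a minimal bootstrap file looks something like this (the volume 
name and session numbers below are invented; take the real values from the job 
report lines in your log or from your WriteBootstrap file):

```
Volume="FullVolume-0017"
VolSessionId=1
VolSessionTime=1212469351
FileIndex=1-6500000
```

With a bootstrap, the restore reads the volumes directly and never has to 
build the 6.5-million-entry file tree from the catalog.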

Now, for smartalloc: if the configure option doesn't work, you can turn off 
SMARTALLOC in src/config.h after the configure step. This works very well. 
Change #define SMARTALLOC 1 to #define SMARTALLOC 0, or comment out the 
lines (there are two).
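That edit can be scripted; this is only a sketch, and it operates on a scratch 
copy here so nothing real is touched (point CFG at the actual src/config.h in 
your build tree):

```shell
# Hedged sketch: disable SMARTALLOC after running ./configure.
CFG=config.h.demo
printf '#define SMARTALLOC 1\n' > "$CFG"   # stand-in for the real file
# Comment the define out (plain POSIX sed pipeline, so it also works on Solaris 9)
sed 's|^#define SMARTALLOC 1$|/* #define SMARTALLOC 1 */|' "$CFG" > "$CFG.new" \
  && mv "$CFG.new" "$CFG"
grep SMARTALLOC "$CFG"   # should now show the commented-out line
```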

But for 6,000,000 files, I suspect your backup server is simply too small... A 
fully 64-bit system with 4 GB (RAM + swap) is a minimum.
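As a rough sanity check (the ~300 bytes per file-tree entry is an assumed 
figure, not something I have measured), the director's in-memory file tree 
alone lands in the gigabyte range at this scale:

```shell
# Back-of-envelope: assumed ~300 bytes per in-memory file entry
awk 'BEGIN { printf "%.1f GB\n", 6500000 * 300 / (1024*1024*1024) }'
# prints: 1.8 GB
```

That is consistent with the ~2 GB heap you saw before the director froze.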


On Wednesday 04 June 2008 01:02:31 John Kloss wrote:
> Hello,
> It was suggested that I forward this email to the bacula developers
> list.
> In the time since I sent this message originally I have been able
> to compile bacula without the smartalloc routines.  This required
> some hacking of the code because, despite the many #ifdefs guarding
> the smartalloc definitions, much of the code still depends on the
> routines.  Regardless, the code did compile but does not run
> (segmentation fault).
> I now suspect the issue is not in the smartalloc routines but in the
> mem_pool functions.  However, the configure script includes the
> directive '--enable-smartalloc' which it lists as disabled by
> default.  The directive should read enabled by default.  Considering
> the difficulty of compiling without smartalloc, I would suggest
> removing the '--enable-smartalloc' directive altogether, because as
> it currently reads it suggests that we have a choice in smartalloc's
> enablement/disablement -- which we do, but only if we're willing to
> hack the code and then track down the reasons for the memory fault.
> I have tried various restore commands, all to poor effect:
> restore client=<client> current select all done (result-- hang)
> restore client=<client> jobid=<jobid> select all done (result-- hang)
> restore client=<client> jobid=<jobid> all done (result -- hang)
> My next step is to try, or rather, I am trying at this moment
> bconsole
> @output /var/tmp/files.txt
> *query
> *12 (List Files for a selected JobId)
> *<jobid>
> . . . wait for 6.5 million files to be dumped to files.txt
> *@output
> *quit
> massage the data so that it represents a single list of 6.5 million
> files with no extraneous spaces or '|' separators as emitted by the
> above command
> bconsole
> restore
> 7 (enter a list of files to restore)
> Enter full filename: <files.txt
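The massage step above can be sketched with awk; the sample input below is a 
stand-in for the real @output dump (in practice files.txt already holds the 
bconsole output, so skip the cat block -- and the actual column layout may 
differ, so adjust the filter to match):

```shell
# Demo input imitating psql-style table decoration around the file list
cat > files.txt <<'EOF'
+----------------------+
| filename             |
+----------------------+
| /data/projects/a.txt |
| /data/projects/b.txt |
+----------------------+
EOF
# Keep only the path column, trimmed; drop borders and the header row
awk -F'|' 'NF >= 3 { f = $2; gsub(/^ +| +$/, "", f);
                     if (f != "" && f != "filename") print f }' files.txt > filelist.txt
cat filelist.txt
```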
> I have no idea whether the above will work because it's taking
> forever and a day to pull the information necessary from catalog to
> perform the restore.  As I mention below, I don't have forever and a
> day to wait-- I have 36 hours and a coffee break.
> I'm willing to wade through code if necessary but I'd really
> appreciate it if someone could at least show me where the shallow
> water is.  As stated, I don't have that much time, or rather, I
> haven't been given much time to recover the 2.5 terabytes of data
> before I am asked to leave.  If the latter occurs then I suppose I
> will have quite a bit of time on my hands.
> Thank you.
> 	John Kloss.
> Begin forwarded message:
> > From: John Kloss <John.Kloss@xxxxxxxx>
> > Date: June 3, 2008 10:52:34 AM EDT
> > To: baculausers <bacula-users@xxxxxxxxxxxxxxxxxxxxx>
> > Subject: [Bacula-users] bacula-dir virtual memory limit during restore
> >
> > Hello,
> >
> > I am currently running bacula-2.2.8 compiled as a 32bit binary.
> > I am using postgresql-8.3.1 as the catalog, also compiled as a 32bit
> > binary.
> > I am running bacula on solaris 9 in 64bit mode, which of
> > course means I can run both 64bit and 32bit binaries.
> >
> > ulimit -v shows unlimited.  I know that's a lie and that soft limit
> > is 2GB.  I know that I can change that to 3.5GB or so for a 32bit
> > process.  I have done so and then started bacula-dir.
> > ulimit -d shows unlimited.  I know that's also a lie and that the
> > default limit is 2GB.  I know that I can change that to 3.5GB or so for a
> > 32bit process.  I have done so (along with ulimit -v) and then
> > started bacula-dir.
> >
> > I am trying to restore 2.5 terabytes of data composed of 6.5 million
> > files.
> >
> > My process is
> >
> > Run bconsole
> > Choose restore
> > Choose the most recent backup for a client
> > Wait for the directory structure to be generated in memory (10
> > minutes tops-- postgres temp files are on a ram disk which makes life
> > fast)
> > Choose 'mark *'
> > Watch bacula churn away for a couple of minutes and then report 6.5
> > million files marked.
> > Type done.
> > See that the prompt never returns.  The restore never happens.
> > Actually, I don't have time to wait forever, so I waited for 36
> > hours instead and saw that nothing had changed.  No prompt.  No
> > restore.
> >
> > Bacula-dir consumes up to 2GB of memory and then freezes.  Running
> > pmap on the bacula-dir shows that, along with kernel space usage and
> > library space and whatever, the heap usage is 2GB.  ulimit was set to
> > give bacula-dir 3.5GB.  This gift of memory is apparently ignored.
> >
> > Thinking this was a 32bit limit I switched to a 64bit compile of
> > bacula and tried the same thing.  This time bacula-dir took 2.5GB of
> > heap and no more even though I gifted it 10GB (I have 16GB of memory
> > and it's pretty much for postgres and bacula).  I still never see the
> > prompt return after a 'mark *; done'.
> >
> > Running 64bit versions of bacula on solaris completely hoses any
> > dates such as last written for media.  I think this is a solaris
> > sprintf thing and has nothing to do with bacula.  Regardless, I don't
> > want to run a 64bit version of bacula on solaris and, given the above
> > limitations, it wouldn't help me anyway.
> >
> > Running truss -u *:: on the bacula-dir process shows it continually
> > spinning in mutex locks and unlocks around memory allocation and
> > frees.
> >
> > Previous versions of bacula (1.36) were able to restore 5.5 terabytes
> > of data composed of 9 million files via the above method.  Same
> > machine, less memory, 32bit binaries, old version of postgres (8.0).
> > The new version as I have compiled and configured cannot.
> >
> > How does one recover 2.5 terabytes and 6.5 million files using the
> > latest version of bacula?  What am I doing wrong?  Is there any way
> > to change smartalloc so that it will use 3.5GB of memory (nothing
> > popped out at me when I scanned the include files)?
> >
> > I should note that a couple of weeks ago I had a complete system
> > failure of my SAN and lost 25 terabytes of data.  Bacula 1.36
> > restored all of it.  I got to keep my job.  Thank you bacula.
> >
> > 	John Kloss. <John.Kloss@xxxxxxxx>
> > 	IT Manager, Systems Manager
> > 	Institute of Genetic Medicine
> > 	Johns Hopkins Medical Institution
> >
