Re: [Bacula-devel] Fwd: [Bacula-users] bacula-dir virtual memory limit during restore
Although I'm just one of the dormant list readers following Bacula's
progress, I was wondering how many ways you have attempted to restore
the data. (This of course does not solve the problem itself; I'm only
thinking of possible workarounds.)
Is it possible to select separate folders to restore? I don't know the
structure of the data you are trying to restore, but perhaps you could
restore it partially and, by doing this a couple of times, recover the
full data set?
The next thing that sprang to mind was this: did you try a different
version of the 2.2 branch? I might get slapped for suggesting this,
but perhaps an older version, or maybe one of the latest betas, can
pull it off?
The last thing that comes to mind is looking for the fault in the
underlying OS, or perhaps the whole system: did you try running the
restore on another (possibly faster) system with a different OS? I'm
assuming the problem is in the Director rather than the Storage
Daemon, so running a clone of the Director on a different system could
improve the situation (you could access the database from the remote
system and, by simply shutting down the current Director, move
operations to the rescue system).
I know the last two options are a lot of work (and with number two it
might be a good idea to back up the database first, just in case), but
it sounds like you are rather desperate, so it might just work.
Hopefully someone on here can just point at something and tell you to
change a small thing to get it working, but just in case, here are my
2 cents.
John Kloss wrote:
> It was suggested that I forward this email to the bacula developers.
> Since I originally sent this message, I have been able
> to compile bacula without the smartalloc routines. This required
> some hacking of the code because, despite the many #ifdef's blocking
> smartalloc definitions, much of the code is still dependent on the
> routines. Regardless, the code did compile but does not run
> (segmentation fault).
> I now suspect the issue is not in the smartalloc routines but in the
> mem_pool functions. However, the configure script includes the
> directive '--enable-smartalloc', which it lists as disabled by
> default. The directive should read enabled by default. Considering
> the difficulty of compiling without smartalloc, I would suggest
> removing the '--enable-smartalloc' directive altogether, because as
> it currently reads it suggests that we have a choice in smartalloc's
> enablement/disablement, which we do -- if we're willing to hack the
> code and then track down the reasons for the memory fault.
> I have tried various restore commands, all to poor effect:
> restore client=<client> current select all done (result-- hang)
> restore client=<client> jobid=<jobid> select all done (result-- hang)
> restore client=<client> jobid=<jobid> all done (result -- hang)
> My next step is to try, or rather, what I am trying at this moment:
> @output /var/tmp/files.txt
> *12 (List Files for a selected JobId)
> . . . wait for 6.5 million files to be dumped to files.txt
> massage the data so that it represents a single list of 6.5 million
> files with no extraneous spaces or '|' separators as given by the
> above command
> 7 (enter a list of files to restore)
> Enter full filename: <files.txt
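For the "massage the data" step, here is a minimal sketch of the
clean-up, assuming `list files` prints the usual psql-style table
(`+---+` border rows, a `| filename |` header, and `| /path |` data
rows); the sample file names are hypothetical and the exact table
layout may differ between Bacula versions, so inspect your dump first:

```shell
#!/bin/sh
# Strip the table decoration from a bconsole "list files" dump,
# leaving one bare path per line. The sample written to
# /tmp/files.txt stands in for the real /var/tmp/files.txt.
cat > /tmp/files.txt <<'EOF'
+------------------+
| filename         |
+------------------+
| /etc/passwd      |
| /etc/hosts       |
+------------------+
EOF
sed -e '/^+[-+]*+$/d' \
    -e '/^| *[Ff]ilename *|$/d' \
    -e 's/^| \{0,1\}//' \
    -e 's/ *|$//' \
    /tmp/files.txt > /tmp/filelist.txt
cat /tmp/filelist.txt
```

Note that the last expression also trims trailing spaces, so file
names that genuinely end in whitespace would be mangled.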
> I have no idea whether the above will work because it's taking
> forever and a day to pull the necessary information from the catalog to
> perform the restore. As I mention below, I don't have forever and a
> day to wait-- I have 36 hours and a coffee break.
> I'm willing to wade through code if necessary but I'd really
> appreciate it if someone could at least show me where the shallow
> water is. As stated, I don't have that much time, or rather, I
> haven't been given much time to recover the 2.5 terabytes of data
> before I am asked to leave. If the latter occurs then I suppose I
> will have quite a bit of time on my hands.
> Thank you.
> John Kloss.
> Begin forwarded message:
>> From: John Kloss <John.Kloss@xxxxxxxx>
>> Date: June 3, 2008 10:52:34 AM EDT
>> To: baculausers <bacula-users@xxxxxxxxxxxxxxxxxxxxx>
>> Subject: [Bacula-users] bacula-dir virtual memory limit during restore
>> I am currently running bacula-2.2.8 compiled as a 32bit binary.
>> I am using postgresql-8.3.1 as the catalog, also compiled as a 32bit
>> binary.
>> I am running bacula on solaris 9 running in 64bit mode which of
>> course means I can run both 64bit and 32bit binaries.
>> ulimit -v shows unlimited. I know that's a lie and that soft limit
>> is 2GB. I know that I can change that to 3.5GB or so for a 32bit
>> process. I have done so and then started bacula-dir.
>> ulimit -d shows unlimited. I know that's also a lie and that the
>> default limit is 2GB. I know that I can change that to 3.5GB or so for a
>> 32bit process. I have done so (along with ulimit -v) and then
>> started bacula-dir.
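The ulimit sequence described above can be sketched as a small
start-up wrapper; the bacula-dir path is a placeholder, and the
conversion of 3.5GB into the 1KB blocks that `ulimit -v`/`-d` expect
is my assumption:

```shell
#!/bin/sh
# Raise the soft limits on virtual memory (-v) and the data segment
# (-d) to roughly 3.5GB before starting the Director. ulimit counts
# in 1KB blocks, so 3.5GB = 3.5 * 1024 * 1024 = 3670016.
ulimit -v 3670016
ulimit -d 3670016
echo "vmem=$(ulimit -v)KB data=$(ulimit -d)KB"
# The path below is hypothetical; adjust for your install:
# /usr/local/sbin/bacula-dir -c /usr/local/etc/bacula-dir.conf
```

The limits only apply to processes started from this shell, which is
why the Director must be launched from the same script.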
>> I am trying to restore 2.5 terabytes of data composed of 6.5 million
>> files. My process is:
>> Run bconsole
>> Choose restore
>> Choose the most recent restore for a client
>> Wait for the directory structure to be generated in memory (10
>> minutes tops -- postgres temp files are on a ram disk, which makes
>> life easier)
>> Choose 'mark *'
>> Watch bacula churn away for a couple of minutes and then report 6.5
>> million files marked.
>> Type done.
>> See that the prompt never returns. The restore never happens.
>> Actually, I don't have time to wait forever, so I waited for 36
>> hours instead and saw that nothing had changed. No prompt. No
>> restore.
>> Bacula-dir consumes up to 2GB of memory and then freezes. Running
>> pmap on the bacula-dir shows that, along with kernel space usage and
>> library space and whatever, the heap usage is 2GB. ulimit was set to
>> give bacula-dir 3.5GB. This gift of memory is apparently ignored.
>> Thinking this was a 32bit limit I switched to a 64bit compile of
>> bacula and tried the same thing. This time bacula-dir took 2.5GB of
>> heap and no more even though I gifted it 10GB (I have 16GB of memory
>> and it's pretty much for postgres and bacula). I still never see the
>> prompt return after a 'mark *; done'.
>> Running 64bit versions of bacula on solaris completely hoses any
>> dates such as last written for media. I think this is a solaris
>> sprintf thing and has nothing to do with bacula. Regardless, I don't
>> want to run a 64bit version of bacula on solaris and, given the above
>> limitations, it wouldn't help me anyway.
>> Running truss -u *:: on the bacula-dir process shows it continually
>> spinning in mutex locks and unlocks around memory allocation and
>> deallocation.
>> Previous versions of bacula (1.36) were able to restore 5.5 terabytes
>> of data composed of 9 million files via the above method. Same
>> machine, less memory, 32bit binaries, old version of postgres (8.0).
>> The new version as I have compiled and configured cannot.
>> How does one recover 2.5 terabytes and 6.5 million files using the
>> latest version of bacula? What am I doing wrong? Is there any way to
>> change smartalloc so that it will use 3.5GB of memory (nothing
>> popped out at me when I scanned the include files)?
>> I should note that a couple of weeks ago I had a complete system
>> failure of my SAN and lost 25 terabytes of data. Bacula 1.36
>> restored all of it. I got to keep my job. Thank you bacula.
>> John Kloss. <John.Kloss@xxxxxxxx>
>> IT Manager, Systems Manager
>> Institute of Genetic Medicine
>> Johns Hopkins Medical Institution
This mailing list archive is a service of Copilotco.