Author Topic: Crash of parallel ricc2 with "too small mbfblk in ccbasblk" error message  (Read 11049 times)

evgeniy

  • Sr. Member
  • ****
  • Posts: 110
  • Karma: +0/-0
Hello,

My ricc2 parallel job crashes with the error message:

"too small mbfblk in ccbasblk"

I would very much appreciate it if someone could give me a hint
on how to get around this. Thanks!

Evgeniy

Arnim

  • Developers
  • Sr. Member
  • *
  • Posts: 253
  • Karma: +0/-0
Hi,

Could you give a bit more input?
How is $maxcor set? How large is your system and
how many nodes are you using?

Best,
Arnim


evgeniy

  • Sr. Member
  • ****
  • Posts: 110
  • Karma: +0/-0
Hi Arnim,

Thanks for your response.


Quote from: Arnim
Could you give a bit more input?
How is $maxcor set? How large is your system and
how many nodes are you using?


Sorry, is $maxcor an option of the control file? It is not present in
my control file. Could you explain what it specifies?

Regarding the system, it is very small, just a water molecule
(it's a test calculation). The dscf run finished fine, but ricc2 crashes.
I am using 4 nodes with 4 processors each. I have run this job
as a serial calculation, and it went OK. The ulimits also seem to be fine;
the stack size is set to "unlimited".

The job crashes at the very beginning, just when the optimization of
the ground-state cluster amplitudes begins, i.e. even the MP2 energy is
not calculated. The end of the output looks like this:

ri-cc2 ended abnormally
 lbfmax,mbfblk:            1            0

 ========================
  internal module stack:
 ------------------------
    ricc2
    cc_solve_t0
    ccvecfun
    cc_jgterm
 ========================

 too small mbfblk in ccbasblk
 ri-cc2 ended abnormally
 ri-cc2 ended abnormally
 ri-cc2 ended abnormally
 ri-cc2 ended abnormally
MX:opt052:Remote endpoint is closed, peer=00:60:dd:47:b6:cc (opt055:0)
MPI Application rank 0 exited before MPI_Finalize() with status 13


Evgeniy

Arnim

  • Developers
  • Sr. Member
  • *
  • Posts: 253
  • Karma: +0/-0
Hi Evgeniy,

Water is probably too small;
maybe you should try something bigger.

$maxcor specifies the amount of memory
that can be used. It should be written to the control file
if you prepare the input with define.
You can check the chapter Prerequisites in the ricc2 part
of the manual. The recommended value is 66-75% of
the available (physical) core memory.
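
For example, on a node with 4 GB of physical memory running a single
ricc2 process, a line like the following in control would be roughly in
the recommended range (the value is in MB; 66-75% of 4096 MB is about
2700-3070 MB, so the exact number here is only an illustration):

 $maxcor 3000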

Best,
Arnim

evgeniy

  • Sr. Member
  • ****
  • Posts: 110
  • Karma: +0/-0
Hi Arnim,

Thanks for the reply!

Here is some additional information regarding
the problem I ran into. Namely: I start the job on 16
"processors", or to be more precise, on 16 cores. These
16 cores are spread over 4 nodes, each node being a 2x Dual-Core
AMD Opteron 2.8 GHz machine. The job crashes if I ask for all 16
cores (on 4 nodes), but it runs well if I specify only 1 processor
(core) on each node. Sometimes it also goes fine if I ask for
2 processors on each node, but sometimes it does not. From
this it appears that it crashes whenever it starts on 2 cores of
the same processor. Have you heard of such ricc2 behavior
before?
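
(To illustrate what I mean by "1 processor (core) on each node": I assume
this corresponds to a machines/hostfile for the MPI launcher with one
entry per node, something like the sketch below. The hostnames are only
placeholders, not my actual node names.)

 node1
 node2
 node3
 node4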

In my case $maxcor is set to 1000 MB. The physical memory is
4 GB.
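
(A side note on the numbers: if $maxcor is counted per parallel process,
which I am not certain about, then 4 processes per node at 1000 MB each
would already add up to the full 4 GB of physical memory of a node.)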

Yes, I will try a bigger system.

Evgeniy

Arnim

  • Developers
  • Sr. Member
  • *
  • Posts: 253
  • Karma: +0/-0
Hi,

ricc2 will crash if the system is so small that
one or more of the parallel processes has nothing to do.
For a simple H2O molecule, 16 processes are just too many.

Bye,
Arnim

evgeniy

  • Sr. Member
  • ****
  • Posts: 110
  • Karma: +0/-0
Hi,

Yes, for bigger systems everything works.
Many thanks for your help!

Best,
Evgeniy