Author Topic: Inappropriate ioctl for device error with Turbomole 6.1 parallel GA

Christopher Rowley

I'm getting the error message "Inappropriate ioctl for device" when I try to use the new parallel GA ridft code. I get the same message on all the different processor types on our cluster, both AMD Opteron and Intel Xeon. We're running Fedora Core 6 with kernel 2.6.25. I can reproduce the problem when I run ridft directly with mpiexec. I also compiled GA from scratch, and that build passes all the diagnostic tests. Has anyone else seen this error?

Thanks,
Chris

Last System Error Message from Task 1:: Inappropriate ioctl for device
Last System Error Message from Task 2:: Inappropriate ioctl for device
Last System Error Message from Task 3:: Inappropriate ioctl for device
MPI Application rank 3 exited before MPI_Finalize() with status 11
0:Terminate signal was sent, status=: 15
Last System Error Message from Task 0:: Inappropriate ioctl for device
forrtl: error (78): process killed (SIGTERM)
 pri_clustinfo: node                      0 range:                     0
                     3
                     3
                     3
 nodeid_=                     2 newfile=control.2
 nodeid_=                     1 newfile=control.1
 nodeid_=                     0 newfile=control
 nodeid_=                     3 newfile=control.3
 rdgrad ended abnormally
SEVERE ERROR from node:   0  CONTRL dead = actual step
 ABORTING
0:0:GA Aborting:: 1
Last System Error Message from Task 0:: Inappropriate ioctl for device
MPI Application rank 0 exited before MPI_Finalize() with status 1
3:Terminate signal was sent, status=: 15
1:Terminate signal was sent, status=: 15
2:Terminate signal was sent, status=: 15
Last System Error Message from Task 1:: Inappropriate ioctl for device
Last System Error Message from Task 3:: Inappropriate ioctl for device
Last System Error Message from Task 2:: Inappropriate ioctl for device
forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)
Mon Oct 26 10:06:21 EDT 2009

uwe

Re: Inappropriate ioctl for device error with Turbomole 6.1 parallel GA
« Reply #1 on: October 27, 2009, 05:34:11 PM »
Hi Chris,

well, here we go, first post for the new parallel version :-)

The first and often quite difficult step is to find out what originally caused the abort. Since all processes will print error messages sooner or later, it is not easy to tell which line came first.
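One way to narrow this down is to pull the abnormal-exit lines out of the job output and look at their status codes. The following is only a sketch: the function names are made up for illustration, the pattern is taken from the "exited before MPI_Finalize()" lines in the output above, and reading a status as a signal number is an assumption that will not hold for every MPI launcher. Because the output of several processes is interleaved, the order in the file is a hint about what happened first, not proof.

```python
import re
import signal

# Pattern copied from the MPI launcher lines in the job output above.
EXIT_LINE = re.compile(r"rank (\d+) exited before MPI_Finalize\(\) with status (\d+)")


def exit_statuses(log_text):
    """Return (line_number, rank, status) for each abnormal-exit line, in file order."""
    found = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = EXIT_LINE.search(line)
        if match:
            found.append((number, int(match.group(1)), int(match.group(2))))
    return found


def signal_name(status):
    """Return the name of the signal with this number, or None if there is none.

    Assumption: the status may be a signal number. A small status such as 1
    can just as well be an ordinary exit code.
    """
    try:
        return signal.Signals(status).name
    except ValueError:
        return None


if __name__ == "__main__":
    sample = (
        "MPI Application rank 3 exited before MPI_Finalize() with status 11\n"
        "0:Terminate signal was sent, status=: 15\n"
        "MPI Application rank 0 exited before MPI_Finalize() with status 1\n"
    )
    for number, rank, status in exit_statuses(sample):
        print(f"line {number}: rank {rank}, status {status}, signal? {signal_name(status)}")
```

The rank whose status differs from the SIGTERM (15) that the launcher sends to the remaining processes is usually the most interesting one to look at first.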

In your case, I assume that the reason is a memory problem:

Quote
MPI Application rank 3 exited before MPI_Finalize() with status 11

Signal 11 is a segmentation fault, so please check:

  • stack size limit
  • shared memory limits, e.g. /proc/sys/kernel/shmall and /proc/sys/kernel/shmmax
  • shared memory that is still allocated: try ipcs and, if there are shared memory segments not attached to any process, remove them with ipcrm -m <id>
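The three checks above can be collected in a small script to run on each node. This is a sketch under some assumptions: the helper names are invented here, the `ipcs -m` parser expects the usual Linux (util-linux) column order of key, shmid, owner, perms, bytes, nattch, and the script only reports values without changing anything. Note that the kernel counts shmall in pages while shmmax is in bytes, so the script converts shmall before printing.

```python
import os
import resource
import subprocess


def read_kernel_int(path):
    """Return the integer stored in a /proc-style file, or None if unreadable."""
    try:
        with open(path) as handle:
            return int(handle.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def shmall_bytes(pages, page_size):
    """shmall is counted in pages; convert it to bytes for comparison with shmmax."""
    return pages * page_size


def unattached_segments(ipcs_output):
    """Return the shmids whose attach count (nattch) is 0 in `ipcs -m` output.

    Assumes the column order: key shmid owner perms bytes nattch [status].
    """
    shmids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("0x") and fields[5] == "0":
            shmids.append(int(fields[1]))
    return shmids


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("stack soft limit (bytes):",
          "unlimited" if soft == resource.RLIM_INFINITY else soft)

    shmmax = read_kernel_int("/proc/sys/kernel/shmmax")
    shmall = read_kernel_int("/proc/sys/kernel/shmall")
    print("shmmax (bytes):", shmmax)
    if shmall is not None:
        page_size = os.sysconf("SC_PAGE_SIZE")
        print("shmall (bytes):", shmall_bytes(shmall, page_size))

    try:
        listing = subprocess.run(["ipcs", "-m"], capture_output=True,
                                 text=True, check=True).stdout
        print("segments with no attached process:", unattached_segments(listing))
    except (OSError, subprocess.CalledProcessError):
        print("could not run ipcs -m")
```

Any shmid the script lists can then be inspected by hand and, if it really is left over from a crashed job, removed with ipcrm -m <id>.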

Then, if all that looks good, I would suggest that you:

  • run on 2 CPUs only
  • and set $ricore to 0

to test whether ridft runs at all for your input.
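If you prefer to script the $ricore change instead of editing the control file by hand, something like the following could work. It is a sketch only: `set_ricore` is a hypothetical helper, and it assumes that the keyword sits on a single line starting with `$ricore` and that the file is terminated by a `$end` line. Keep a backup copy of the control file before trying it.

```python
def set_ricore(control_text, value=0):
    """Return control-file text with the $ricore line set to `value`.

    Assumptions: the keyword is on one line beginning with "$ricore", and
    the file is closed by a "$end" line. If no $ricore line exists, a new
    one is inserted just before $end (or appended if $end is missing).
    """
    lines = control_text.splitlines()
    new_line = f"$ricore {value}"

    # Replace an existing $ricore line if there is one.
    for index, line in enumerate(lines):
        if line.split()[:1] == ["$ricore"]:
            lines[index] = new_line
            return "\n".join(lines) + "\n"

    # Otherwise insert the keyword before the closing $end line.
    for index, line in enumerate(lines):
        if line.strip() == "$end":
            lines.insert(index, new_line)
            return "\n".join(lines) + "\n"

    lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = "$title\n$ricore 500\n$end\n"
    print(set_ricore(example), end="")
```

The function works on text rather than on the file itself, so you can check the result before writing it back.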

The shared memory is used for several arrays (density, Fock, orbitals, ...) and it is limited to a certain size to avoid excessive memory usage (and swapping). $ricore does not speed up the calculation that much, especially if you are using $marij (which should be switched on by default in all cases).

Regards,

Uwe

evgeniy

Re: Inappropriate ioctl for device error with Turbomole 6.1 parallel GA
« Reply #2 on: December 16, 2009, 09:42:20 AM »
Dear Uwe,

I am wondering about the proper settings for

/proc/sys/kernel/shmall

and

/proc/sys/kernel/shmmax

In my case shmall is 2 MB and shmmax is 32 MB, which is definitely too small.
What numbers would you recommend? Thanks!

Best regards,
Evgeniy