TURBOMOLE Users Forum
TURBOMOLE Modules => Ridft, Rdgrad, Dscf, Grad => Topic started by: golden on August 16, 2020, 06:11:36 AM
-
Dear All,
I am trying to run ridft on a molecule before starting a jobex run.
So in the first step, I am running ridft using:
ridft > ridft.out
I am using an HPC cluster with the SLURM queuing system. When I try to run the above, it gives me this error:
TURBOMOLE/mpirun_scripts/IMPI/intel64/bin/mpirun: line 103: 30944 Segmentation fault mpiexec.hydra "$@" 0<&0
During the run I also use
ulimit -a > mylimits.out
which gives me:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1031305
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) 52428800
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I have run previous calculations with ridft without any problem, so I am wondering why it is giving me this error now.
Help is appreciated.
Best,
Neranjan
-
I also want to add that the control file for the calculation is:
$title
Au25C4
$symmetry c1
$redundant file=coord
$user-defined bonds file=coord
$coord file=coord
$optimize
internal on
redundant on
cartesian off
global off
basis off
$atoms
au 1-13,140-151 \
basis =au def2-TZVP \
ecp =au def2-ecp \
jbas =au def2-TZVP
s 14-22,152-160 \
basis =s def2-TZVP \
jbas =s def2-TZVP
c 23,26,29,32,36,39,42,45,49,52,55,58,62,65,68,71,75,78,81,84,88,91,94,97,101 \
104,107,110,114,117,120,123,127,130,133,136,161,164,167,170,174,177,180,183 \
187,190,193,196,200,203,206,209,213,216,219,222,226,229,232,235,239,242,245 \
248,252,255,258,261,265,268,271,274 \
basis =c def2-TZVP \
jbas =c def2-TZVP
h 24-25,27-28,30-31,33-35,37-38,40-41,43-44,46-48,50-51,53-54,56-57,59-61, \
63-64,66-67,69-70,72-74,76-77,79-80,82-83,85-87,89-90,92-93,95-96,98-100, \
102-103,105-106,108-109,111-113,115-116,118-119,121-122,124-126,128-129, \
131-132,134-135,137-139,162-163,165-166,168-169,171-173,175-176,178-179, \
181-182,184-186,188-189,191-192,194-195,197-199,201-202,204-205,207-208, \
210-212,214-215,217-218,220-221,223-225,227-228,230-231,233-234,236-238, \
240-241,243-244,246-247,249-251,253-254,256-257,259-260,262-264,266-267, \
269-270,272-273,275-277 \
basis =h def2-TZVP \
jbas =h def2-TZVP
$basis file=basis
$ecp file=basis
$scfmo file=mos
$closed shells
a 1-679 ( 2 )
$scfiterlimit 500
$thize 0.10000000E-04
$thime 5
$scfdamp start= 1.000 step= 0.050 min= 0.100
$scfdump
$scfintunit
unit=30 size=0 file=twoint
$scfdiis
$maxcor 500 MiB per_core
$scforbitalshift automatic=.1
$drvopt
cartesian on
basis off
global off
hessian on
dipole on
nuclear polarizability
$interconversion off
qconv=1.d-7
maxiter=25
$coordinateupdate
dqmax=0.3
interpolate on
statistics 5
$forceupdate
ahlrichs numgeo=0 mingeo=3 maxgeo=4 modus=<g|dq> dynamic fail=0.3
threig=0.005 reseig=0.005 thrbig=3.0 scale=1.00 damping=0.0
$forceinit on
diag=default
$energy file=energy
$grad file=gradient
$forceapprox file=forceapprox
$dft
functional b-p
gridsize m3
$scfconv 6
$ricore 500
$rij
$jbas file=auxbasis
$rundimensions
natoms=277
nbf(CAO)=5470
nbf(AO)=4870
$last step define
$end
In the SLURM script I request 50 GB of memory, and for the TURBOMOLE run I have set:
export PARA_ARCH=SMP
export PARNODES=4
export TURBODIR=/home/software/TURBOMOLE/TURBOMOLE
export PATH=$TURBODIR/scripts:$PATH
export PATH=$TURBODIR/bin/`sysname`:$PATH
As I am running on a single node, in parallel on 4 cores, that is why I am using the SMP version.
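For completeness, here is a minimal sketch of how these pieces fit together in the batch script (the #SBATCH directives are only an illustration of the resource request described above, not my exact settings):
#!/bin/bash
#SBATCH --nodes=1               # single node
#SBATCH --ntasks-per-node=4     # 4 cores for the SMP run
#SBATCH --mem=50G               # the 50 GB mentioned above

# TURBOMOLE environment, as listed above
export PARA_ARCH=SMP
export PARNODES=4
export TURBODIR=/home/software/TURBOMOLE/TURBOMOLE
export PATH=$TURBODIR/scripts:$PATH
export PATH=$TURBODIR/bin/`sysname`:$PATH

ulimit -a > mylimits.out        # record the limits during the run
ridft > ridft.out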
-
Hello,
the error (segmentation fault in mpiexec.hydra) stems from Intel MPI. Most likely Intel MPI did not succeed in finding a network interface, or it found an incompatible version of, e.g., the InfiniBand or Omni-Path drivers. You could try to set
export FI_PROVIDER=sockets
in your script. If that does not work, try 'tcp' instead of 'sockets'. Note that on a single node the communication goes through shared memory rather than the network card, so it does not really matter which provider you set; nevertheless, it should help Intel MPI to avoid this error.
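Concretely, the export only has to appear in your job script before ridft is started, roughly like this:
export FI_PROVIDER=sockets    # tell Intel MPI (libfabric) which provider to use
# if 'sockets' does not help, try instead:
# export FI_PROVIDER=tcp
ridft > ridft.out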
But if you use the SMP version anyway, MPI can be avoided entirely by using either the fork-based version or the OpenMP version of ridft. Note that since TURBOMOLE 7.5 the MPI version is no longer the default for the SMP case, so this error should not come up at all any more. See the documentation for how to switch to the other parallel implementations, or contact the support team for help.
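A rough sketch of what such a run could look like (please check the documentation of your TURBOMOLE version for the exact variable names; the TM_PAR_FORK setting below is only meant as an illustration of selecting the fork-based binaries):
# shared-memory parallel run on a single node without Intel MPI
export PARA_ARCH=SMP
export PARNODES=4
# select the fork-based SMP binaries instead of the MPI-based ones
export TM_PAR_FORK=on
ridft > ridft.out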
Regards, Uwe
-
Hi Uwe,
Thank you very much. I used
export FI_PROVIDER=sockets
and it seems to be working.
The only thing I cannot understand is that I have run jobex optimizations before without setting the above, and they seemed to run fine. The only difference is that my previous runs used the same cluster size but with smaller ligand groups.
Anyway, this is running now, so I appreciate the help given, and thank you very much.
Best Regards,
Neranjan