Author Topic: RIDFT problem  (Read 3519 times)

golden

  • Full Member
  • ***
  • Posts: 34
  • Karma: +0/-0
RIDFT problem
« on: August 16, 2020, 06:11:36 AM »
Dear All,

I am trying to run ridft on a molecule before trying to run a jobex optimization.

So in the first step, I am running ridft using:
Code: [Select]
ridft > ridft.out
I am using an HPC cluster with the SLURM queuing system. When I try to run the above, it gives me this error:

Quote
TURBOMOLE/mpirun_scripts/IMPI/intel64/bin/mpirun: line 103: 30944 Segmentation fault      mpiexec.hydra "$@" 0<&0


During the run I also execute:
Code: [Select]
ulimit -a > mylimits.out
which gives me:
Quote
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1031305
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) 52428800
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


I have run previous calculations using ridft without any problems, so I am wondering why it is giving me this error now.

Help is appreciated.

Best,
Neranjan

golden

  • Full Member
  • ***
  • Posts: 34
  • Karma: +0/-0
Re: RIDFT problem
« Reply #1 on: August 16, 2020, 06:24:41 AM »
I also want to add that the control file for the calculation is:

Code: [Select]
$title
Au25C4
$symmetry c1
$redundant    file=coord
$user-defined bonds    file=coord
$coord    file=coord
$optimize
 internal   on
 redundant  on
 cartesian  off
 global     off
 basis      off
$atoms
au 1-13,140-151                                                                \
   basis =au def2-TZVP                                                         \
   ecp   =au def2-ecp                                                          \
   jbas  =au def2-TZVP
s  14-22,152-160                                                               \
   basis =s def2-TZVP                                                          \
   jbas  =s def2-TZVP
c  23,26,29,32,36,39,42,45,49,52,55,58,62,65,68,71,75,78,81,84,88,91,94,97,101 \
   104,107,110,114,117,120,123,127,130,133,136,161,164,167,170,174,177,180,183 \
   187,190,193,196,200,203,206,209,213,216,219,222,226,229,232,235,239,242,245 \
   248,252,255,258,261,265,268,271,274                                         \
   basis =c def2-TZVP                                                          \
   jbas  =c def2-TZVP
h  24-25,27-28,30-31,33-35,37-38,40-41,43-44,46-48,50-51,53-54,56-57,59-61,    \
   63-64,66-67,69-70,72-74,76-77,79-80,82-83,85-87,89-90,92-93,95-96,98-100,   \
   102-103,105-106,108-109,111-113,115-116,118-119,121-122,124-126,128-129,    \
   131-132,134-135,137-139,162-163,165-166,168-169,171-173,175-176,178-179,    \
   181-182,184-186,188-189,191-192,194-195,197-199,201-202,204-205,207-208,    \
   210-212,214-215,217-218,220-221,223-225,227-228,230-231,233-234,236-238,    \
   240-241,243-244,246-247,249-251,253-254,256-257,259-260,262-264,266-267,    \
   269-270,272-273,275-277                                                     \
   basis =h def2-TZVP                                                          \
   jbas  =h def2-TZVP
$basis    file=basis
$ecp    file=basis
$scfmo   file=mos
$closed shells
 a       1-679                                  ( 2 )
$scfiterlimit      500
$thize     0.10000000E-04
$thime        5
$scfdamp   start=  1.000  step=  0.050  min=  0.100
$scfdump
$scfintunit
 unit=30       size=0        file=twoint
$scfdiis
$maxcor    500 MiB  per_core
$scforbitalshift  automatic=.1
$drvopt
   cartesian  on
   basis      off
   global     off
   hessian    on
   dipole     on
   nuclear polarizability
$interconversion  off
   qconv=1.d-7
   maxiter=25
$coordinateupdate
   dqmax=0.3
   interpolate  on
   statistics    5
$forceupdate
   ahlrichs numgeo=0  mingeo=3 maxgeo=4 modus=<g|dq> dynamic fail=0.3
   threig=0.005  reseig=0.005  thrbig=3.0  scale=1.00  damping=0.0
$forceinit on
   diag=default
$energy    file=energy
$grad    file=gradient
$forceapprox    file=forceapprox
$dft
   functional b-p
   gridsize   m3
$scfconv        6
$ricore      500
$rij
$jbas    file=auxbasis
$rundimensions
   natoms=277
   nbf(CAO)=5470
   nbf(AO)=4870
$last step     define
$end

In the SLURM script I request 50 GB of memory, and for the Turbomole run I set:

Code: [Select]
export PARA_ARCH=SMP
export PARNODES=4
export TURBODIR=/home/software/TURBOMOLE/TURBOMOLE
export PATH=$TURBODIR/scripts:$PATH
export PATH=$TURBODIR/bin/`sysname`:$PATH


As I am running in parallel on 4 cores of a single node, I am using the SMP version.
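For completeness, here is a minimal sketch of how these settings fit together in my SLURM batch script (the job name is illustrative; the node, core, and memory numbers are the ones described above):

Code: [Select]
#!/bin/bash
#SBATCH --job-name=ridft_Au25C4   # illustrative name
#SBATCH --nodes=1                 # single node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4         # 4 cores for the SMP run
#SBATCH --mem=50G                 # the 50 GB memory request

# Turbomole SMP environment, same exports as above
export PARA_ARCH=SMP
export PARNODES=4
export TURBODIR=/home/software/TURBOMOLE/TURBOMOLE
export PATH=$TURBODIR/scripts:$PATH
export PATH=$TURBODIR/bin/`sysname`:$PATH

ridft > ridft.out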



uwe

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 558
  • Karma: +0/-0
Re: RIDFT problem
« Reply #2 on: August 17, 2020, 05:39:37 PM »
Hello,

the error (Segmentation fault mpiexec.hydra) stems from Intel MPI. Most likely Intel MPI did not succeed in finding a network interface, or it found an incompatible version of, e.g., the Infiniband or OmniPath drivers. You could try to set
export FI_PROVIDER=sockets
in your script. If that does not work, try 'tcp' instead of 'sockets'. Note that on a single node the communication goes through shared memory rather than the network card, so it does not really matter which provider you set; nevertheless it should help Intel MPI to avoid this error.
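In the job script this could look like the following sketch (FI_PROVIDER is the standard libfabric selection variable that Intel MPI reads; the commented line shows the 'tcp' alternative):

Code: [Select]
# set before ridft is launched; try tcp if sockets does not help
export FI_PROVIDER=sockets
#export FI_PROVIDER=tcp

ridft > ridft.out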

But if you use the SMP version anyway, MPI can be avoided entirely by using either the Fork version or the OpenMP version of ridft, as in the sketch below. Note that since Turbomole 7.5 the MPI version is no longer the default for the SMP case, so this error should not come up at all any more. See the documentation for how to switch to the other parallel implementations, or contact the support team for help.
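As a rough sketch, assuming your installation provides these switches (OMP_NUM_THREADS is the standard OpenMP thread-count variable, while TM_PAR_FORK is described in recent Turbomole manuals; please verify both against the documentation of your installed version):

Code: [Select]
# OpenMP version of ridft: set the thread count instead of relying on MPI
export PARA_ARCH=SMP
export PARNODES=4
export OMP_NUM_THREADS=4

# Fork version, if your Turbomole version supports it:
#export TM_PAR_FORK=on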

Regards, Uwe


golden

  • Full Member
  • ***
  • Posts: 34
  • Karma: +0/-0
Re: RIDFT problem
« Reply #3 on: August 17, 2020, 07:13:44 PM »
Hi Uwe,

Thank you very much. I used
Code: [Select]
export FI_PROVIDER=sockets
and it seems to be working.


The only thing I cannot understand is this:
I have run jobex optimizations before without using the variable above, and they seemed to run fine.
The only difference is that my previous runs used the same cluster size but with smaller ligand groups.
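In case it is useful to anyone hitting the same problem, one way to check which fabric provider Intel MPI actually selects is its debug output (I_MPI_DEBUG is a standard Intel MPI variable; the exact output format may vary between versions):

Code: [Select]
# print fabric/provider selection at MPI startup
export I_MPI_DEBUG=5
ridft > ridft.out   # the chosen libfabric provider is reported in the output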

Anyway, this is running now. I appreciate the help; thank you very much.

Best Regards,
Neranjan