TURBOMOLE Users Forum
TURBOMOLE Modules => Define => Topic started by: prasanta13 on January 31, 2022, 01:50:41 PM
-
Hi there,
I am trying to measure and tabulate the time needed for a single-point energy calculation on a molecule with different density functionals. The DFT methods I use are B2-PLYP, B3LYP (with/without -D2 and -D3), M06-2X and PBE (with/without -D2, -D3 (and ABC) and -D4), all with def2-QZVPPD.
However, the results are inconsistent: B3LYP takes significantly longer (cpu-time = ~1 day 4 hours) than B3LYP-D3 (cpu-time = ~22 hours), and I have no idea why.
I also ran the same job three times in a row, using the same control, coord etc. in different directories with the same method and basis.
The runtimes (all cpu-time) were: 1. 63490 mins, 2. 85263 mins, 3. 87830 mins.
This is very confusing. Why is there such a large difference in time?
I have not changed anything in bash (parallel settings), nor have I used different commands to call ridft. Every time I used nohup ridft > ridft.out | tail -f ridft.out
There is no such difference in the energies.
What is the problem? Can anyone help?
Thanks in advance and best regards.
-
Hi,
did you check the number of SCF cycles?
Also, do you run the job on a local scratch disk? And are there maybe other jobs running on the machine?
Cheers,
Arnim
-
Actually, the same number of SCF iterations was performed in each run.
The computer was the same, with nothing but TURBOMOLE running on it.
I didn't set up any scratch directory explicitly, neither local nor remote.
Thanks Arnim.
-
Hello,
did you use the parallel SMP version? If so, the only valid output for timing is the wall time, which is the 'real' time the job needs. The CPU time is accumulated over all running threads and depends on a lot of factors; in my experience the numbers are very often meaningless...
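To make the distinction concrete, here is a minimal Python sketch (not part of TURBOMOLE) that runs a command and reports both numbers. The helper name time_command and the stand-in workload are hypothetical; the sketch assumes a Unix-like system, because it relies on the standard resource module.

```python
import resource
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd and return (wall_seconds, cpu_seconds) for the child process."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time is user + system time, summed over every thread of the child.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


if __name__ == "__main__":
    # Stand-in workload (an assumption); replace it with the real program call.
    wall, cpu = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print(f"wall: {wall:.2f} s, cpu: {cpu:.2f} s, cpu/wall: {cpu / wall:.1f}")
```

For a multi-threaded program the cpu value can exceed the wall value many times over, which is why only the wall time says how long you actually had to wait.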
-
So I did the meaningless thing by taking the cpu-time for a parallel run... :(
Thanks for the help Uwe...
Cheers
-
Hi,
can you share your wall time numbers here so that we can see whether they are more consistent?
Cheers, Uwe
-
Sure, I am sharing both wall time and CPU time for a benzene dimer single-point calculation with B3LYP/def2-QZVPPD. I have run five such instances; all are given below.
1. total cpu-time : 20 hours 54 minutes and 32 seconds
total wall-time : 23 minutes and 42 seconds
2. total cpu-time : 1 days 7 hours 27 minutes and 4 seconds
total wall-time : 31 minutes and 35 seconds
3. total cpu-time : 22 hours 50 minutes and 16 seconds
total wall-time : 25 minutes and 9 seconds
4. total cpu-time : 21 hours 10 minutes and 33 seconds
total wall-time : 24 minutes and 16 seconds
5. total cpu-time : 21 hours 4 minutes and 11 seconds
total wall-time : 23 minutes and 52 seconds
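The five timings above are easier to compare once converted to seconds. The following Python sketch is only an illustration; the names to_seconds and RUNS are hypothetical, and the strings are copied from the list above.

```python
import re

UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def to_seconds(text):
    """Convert e.g. '1 days 7 hours 27 minutes and 4 seconds' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(day|hour|minute|second)s?", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


# (cpu-time, wall-time) pairs copied from the five runs listed above.
RUNS = [
    ("20 hours 54 minutes and 32 seconds", "23 minutes and 42 seconds"),
    ("1 days 7 hours 27 minutes and 4 seconds", "31 minutes and 35 seconds"),
    ("22 hours 50 minutes and 16 seconds", "25 minutes and 9 seconds"),
    ("21 hours 10 minutes and 33 seconds", "24 minutes and 16 seconds"),
    ("21 hours 4 minutes and 11 seconds", "23 minutes and 52 seconds"),
]

walls = [to_seconds(wall) for _, wall in RUNS]
for number, (cpu_text, wall_text) in enumerate(RUNS, start=1):
    cpu, wall = to_seconds(cpu_text), to_seconds(wall_text)
    # cpu/wall is roughly the average number of busy threads during the run.
    print(f"run {number}: wall {wall:5d} s, cpu {cpu:6d} s, cpu/wall {cpu / wall:.1f}")
print(f"slowest/fastest wall time: {max(walls) / min(walls):.2f}")
```

With these numbers the cpu/wall ratio stays between roughly 52 and 60, consistent with the 64 OpenMP threads reported in ridft.out, while the slowest run needs about 33% more wall time than the fastest.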
The control file and ridft.out file for the first instance are also provided for your convenience.
control:
$title
$symmetry c1
$user-defined bonds file=coord
$coord file=coord
$optimize
internal off
redundant off
cartesian on
global off
basis off
$atoms
c 1,3,5,7,9,11,13,15,17,19,21,23 \
basis =c def2-QZVPPD \
jbas =c universal
h 2,4,6,8,10,12,14,16,18,20,22,24 \
basis =h def2-QZVPPD \
jbas =h universal
$basis file=basis
$scfmo file=mos
$closed shells
a 1-42 ( 2 )
$scfiterlimit 30
$thize 0.10000000E-04
$thime 5
$scfdamp start=0.300 step=0.050 min=0.100
$scfdump
$scfintunit
unit=30 size=0 file=twoint
$scfdiis
$maxcor 500 MiB per_core
$scforbitalshift automatic=.1
$drvopt
cartesian on
basis off
global off
hessian on
dipole on
nuclear polarizability
$interconversion off
qconv=1.d-7
maxiter=25
$coordinateupdate
dqmax=0.3
interpolate on
statistics 5
$forceupdate
ahlrichs numgeo=0 mingeo=3 maxgeo=4 modus=<g|dq> dynamic fail=0.3
threig=0.005 reseig=0.005 thrbig=3.0 scale=1.00 damping=0.0
$forceinit on
diag=default
$energy file=energy
$grad file=gradient
$forceapprox file=forceapprox
$dft
functional b3-lyp
gridsize m5
$scfconv 7
$ricore 500
$rij
$jbas file=auxbasis
$rundimensions
natoms=24
nbf(CAO)=1404
nbf(AO)=1152
$last step ridft
$orbital_max_rnorm 0.37825343447967E-02
$last SCF energy change = -464.39460
$subenergy Etot E1 Ej Ex Ec En
-464.3945967316 -1873.134408000 835.7340698417 -52.77879696369 -3.277199147521 629.0617375378
$charge from ridft
0.000 (not to be modified here)
$dipole from ridft
x 0.00001818366549 y 0.00000086972861 z -0.00020512736756 a.u.
| dipole | = 0.0005234349 debye
$end
The ridft.out is,
OpenMP run-time library returned nthreads = 64
ridft (mozart) : TURBOMOLE rev. V7.5.0 compiled 17 Jun 2020 at 09:15:30
Copyright (C) 2020 TURBOMOLE GmbH, Karlsruhe
2022-01-31 16:03:58.238
r i d f t
DFT program with RI approximation
for coulomb part
References:
TURBOMOLE:
R. Ahlrichs, M. Baer, M. Haeser, H. Horn, and
C. Koelmel
Electronic structure calculations on workstation
computers: the program system TURBOMOLE
Chem. Phys. Lett. 162: 165 (1989)
Density Functional:
O. Treutler and R. Ahlrichs
Efficient Molecular Numerical Integration Schemes
J. Chem. Phys. 102: 346 (1995)
Parallel Version:
Performance of parallel TURBOMOLE for Density
Functional Calculations
M. v. Arnim and R. Ahlrichs
J. Comp. Chem. 19: 1746 (1998)
RI-J Method:
Auxiliary Basis Sets to approximate Coulomb
Potentials
Chem. Phys. Lett. 240: 283 (1995)
K. Eichkorn, O. Treutler, H. Oehm, M. Haeser
and R. Ahlrichs
Chem. Phys. Lett. 242: 652 (1995)
Auxiliary Basis Sets for Main Row Atoms and their
Use to approximate Coulomb Potentials
K. Eichkorn, F. Weigend, O. Treutler and
R. Ahlrichs
Theo. Chem. Acc. 97: 119 (1997)
Accurate Coulomb-fitting basis sets for H to Rn
F. Weigend
Phys. Chem. Chem. Phys. 8: 1057 (2006)
Multipole accelerated RI-J (MARI-J):
Fast evaluation of the Coulomb potential for
electron densities using multipole accelerated
resolution of identity approximation
M. Sierka, A. Hogekamp and R. Ahlrichs
J. Chem. Phys. 118: 9136 (2003)
RI-JK Method:
A fully direct RI-HF algorithm: Implementation,
optimised auxiliary basis sets, demonstration of
accuracy and efficiency
F. Weigend
Phys. Chem. Chem. Phys. 4: 4285 (2002)
Two-component HF and DFT with spin-orbit coupling:
Self-consistent treatment of spin-orbit
interactions with efficient Hartree-Fock and
density functional methods
M. K. Armbruster, F. Weigend, C. van Wüllen and
W. Klopper
Phys. Chem. Chem. Phys. 10: 1748 (2008)
Two-component difference density and DIIS algorithm
Efficient two-component self-consistent field
procedures and gradients: implementation in
TURBOMOLE and application to Au20-
A. Baldes, F. Weigend
Mol. Phys. 111: 2617 (2013)
Relativistic all-electron 2c calculations
An efficient implementation of two-component
relativistic exact-decoupling methods for large
molecules
D. Peng, N. Middendorf, F. Weigend, M. Reiher
J. Chem. Phys. 138: 184105 (2013)
Finite nucleus model and SNSO approximation
Efficient implementation of one- and two-
component analytical energy gradients in exact
two-component theory
Y. J. Franzke, N. Middendorf, F. Weigend
J. Chem. Phys. 148: 104110 (2018)
Grids for all-electron relativistic methods
Error-consistent segmented contracted all-
electron relativistic basis sets of double-
and triple-zeta quality for NMR shielding
constants
Y. J. Franzke, R. Tress, T. M. Pazdera,
F. Weigend
Phys. Chem. Chem. Phys. 21: 166658 (2019)
Seminumerical exchange algorithms
Seminumerical calculation of the Hartree-Fock
exchange matirx: Application to two-component
procedures and efficient evaluation of local
hybrid functionsl
P. Plessow, F. Weigend,
J. Comput. Chem. 33: 810 (2012)
Improved seminumerical algorithms
C. Holzer, in preparation (2020)
OpenMP Shared-Memory Parallelization: 64 CPUs.
By: Christof Holzer and Yannick J. Franzke
+--------------------------------------------------+
| general information about current run |
+--------------------------------------------------+
Becke-3-Parameter hybrid functional: B3-LYP
exchange: 0.8*LDA + 0.72*B88 + 0.2*HF
correlation: 0.19*LDA(VWN) + 0.81*LYP
A Hybrid-DFT calculation using the RI-J approximation will be carried out.
Allocatable memory for RI due to $ricore (MB): 500
+--------------------------------------------------+
| Atomic coordinate, charge and isotop information |
+--------------------------------------------------+
atomic coordinates atom charge isotop
1.34670449 2.11837486 0.11440550 c 6.000 0
2.56594884 3.75375040 0.24138827 h 1.000 0
2.37772093 -0.30094248 0.23476732 c 6.000 0
4.39352684 -0.54254014 0.46627676 h 1.000 0
0.80669567 -2.40850633 0.08059763 c 6.000 0
1.60710747 -4.28671650 0.17905146 h 1.000 0
-1.79444206 -2.09773595 -0.18956522 c 6.000 0
-3.01308458 -3.73461605 -0.30937364 h 1.000 0
-2.82613387 0.32323871 -0.30527773 c 6.000 0
-4.84484732 0.56544602 -0.51722393 h 1.000 0
-1.25445236 2.43140268 -0.15760587 c 6.000 0
-2.05394464 4.31046568 -0.25111840 h 1.000 0
3.68399881 2.09781576 6.80458924 c 6.000 0
4.90257948 3.73466581 6.92506063 h 1.000 0
4.71570069 -0.32320894 6.92084468 c 6.000 0
6.73431915 -0.56537177 7.13390038 h 1.000 0
3.14411094 -2.43138759 6.77221691 c 6.000 0
3.94361950 -4.31044137 6.86604874 h 1.000 0
0.54308895 -2.11838275 6.49874189 c 6.000 0
-0.67597827 -3.75377110 6.37093474 h 1.000 0
-0.48786979 0.30090652 6.37786400 c 6.000 0
-2.50350465 0.54246872 6.14508988 h 1.000 0
1.08301598 2.40848171 6.53295560 c 6.000 0
0.28266995 4.28669002 6.43399267 h 1.000 0
center of nuclear mass : 0.94484663 0.00000451 3.30704123
center of nuclear charge: 0.94484812 0.00000437 3.30703847
+--------------------------------------------------+
| basis set information |
+--------------------------------------------------+
we will work with the 1s 3p 5d 7f 9g ... basis set
...i.e. with spherical basis functions...
type atoms prim cont basis
---------------------------------------------------------------------------
c 12 83 63 def2-QZVPPD [8s4p4d2f1g|16s8p4d2f1g]
h 12 36 33 def2-QZVPPD [4s4p2d1f|7s4p2d1f]
---------------------------------------------------------------------------
total: 24 1428 1152
---------------------------------------------------------------------------
total number of primitive shells : 45
total number of contracted shells : 360
total number of cartesian basis functions : 1404
total number of SCF-basis functions : 1152
integral neglect threshold : 0.24E-11
integral storage threshold THIZE : 0.10E-04
integral storage threshold THIME : 5
RI-J AUXILIARY BASIS SET information:
we will work with the 1s 3p 5d 7f 9g ... basis set
...i.e. with spherical basis functions...
type atoms prim cont basis
---------------------------------------------------------------------------
c 12 70 49 universal [6s4p3d1f1g|12s5p4d2f1g]
h 12 16 11 universal [3s1p1d|5s2p1d]
---------------------------------------------------------------------------
total: 24 1032 720
---------------------------------------------------------------------------
total number of primitive shells : 32
total number of contracted shells : 240
total number of cartesian basis functions : 876
total number of SCF-basis functions : 720
symmetry group of the molecule : c1
the group has the following generators :
c1(z)
1 symmetry operations found
there are 1 real representations : a
maximum number of shells which are related by symmetry : 1
------------------
density functional
------------------
Becke-3-Parameter hybrid functional: B3-LYP
exchange: 0.8*LDA + 0.72*B88 + 0.2*HF
correlation: 0.19*LDA(VWN) + 0.81*LYP
iterations will be done with small grid
spherical integration : Lebedev's spherical grid
spherical gridsize : 5
i.e. gridpoints : 590
value for diffuse not defined
radial integration : Chebyshev 2nd kind (scaling 3)
radial gridsize : 8
integration cells : 24
partition function : becke
partition sharpness : 3
biggest AO integral is expected to be 5.262544080
------------------------
nuclear repulsion energy : 629.061737538
------------------------
-----------------
-S,T+V- integrals
-----------------
1e-integrals will be neglected if expon. factor < 0.238031E-12
Difference densities algorithm switched on.
The maximal number of linear combinations of
difference densities is 20 .
DIIS switched on: error vector is FDS-SDF
Max. Iterations for DIIS is : 4
DIIS matrix (see manual)
Scaling factor of diagonals : 1.200
threshold for scaling factor : 0.000
scf convergence criterion : increment of total energy < .1000000D-06
and increment of one-electron energy < .1000000D-03
MOs are in ASCII format !
mo occupation :
irrep mo's occupied
a 1152 42
number of basis functions : 1152
number of occupied orbitals : 42
reading orbital data $scfmo from file mos
orbital characterization : expanded
virtual MOs provided and orthogonalized by Cholesky decomposition
automatic virtual orbital shift switched on
shift if e(lumo)-e(homo) < 0.10000000
------------------------
RI-J - INFORMATION
------------------------
Contributions to RI integral batches:
neglected integral batches: 13039
direct contribution: 38593
memory contribution: 13348
Memory core needed for (P|Q) and Cholesky 4 MByte
Memory core minimum needed except of (P|Q) 1 MByte
Total minimum memory core needed (sum) 5 MByte
****************************************
Memory allocated for RI-J 368 MByte
****************************************
DSCF restart information will be dumped onto file mos
Starting SCF iterations
Overall gridpoints after grid construction = 114189
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
1 -462.76998066830 -1852.4017515 760.57003328 0.000D+00 0.237D-11
Exc = -54.3418580931 Coul = 827.790865307
exK = -12.8789739376
N = 83.999861595
current damping = 0.300
max. resid. norm for Fia-block= 4.972D-01 for orbital 14a
max. resid. fock norm = 4.248D+01 for orbital 934a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
2 -464.25952120760 -1874.8112773 781.49001858 0.145D+03 0.237D-11
Exc = -56.0234961703 Eck = 837.513514755
N = 83.999927448
current damping = 0.250
Norm of current diis error: 2.9008
max. resid. norm for Fia-block= 7.004D-02 for orbital 13a
max. resid. fock norm = 1.533D-01 for orbital 70a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
3 -464.36341842212 -1869.5371526 776.11199662 0.606D+02 0.175D-11
Exc = -55.8661987655 Eck = 831.978195388
N = 83.999936630
current damping = 0.200
Norm of current diis error: 1.4969
max. resid. norm for Fia-block= 2.778D-02 for orbital 23a
max. resid. fock norm = 3.609D-02 for orbital 23a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
4 -464.39301936418 -1873.5705724 780.11581549 0.977D+01 0.167D-11
Exc = -56.0566764334 Eck = 836.172491922
N = 83.999944236
current damping = 0.250
Norm of current diis error: 0.30891
max. resid. norm for Fia-block= 7.064D-03 for orbital 40a
max. resid. fock norm = 1.235D-02 for orbital 1147a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
5 -464.39411208778 -1873.1112574 779.65540773 0.215D+01 0.123D-11
Exc = -56.0578033446 Eck = 835.713211075
N = 83.999948759
current damping = 0.300
Norm of current diis error: 0.16916
max. resid. norm for Fia-block= 3.307D-03 for orbital 41a
max. resid. fock norm = 8.877D-03 for orbital 1147a
mo-orthogonalization: Cholesky decomposition
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
6 -464.39455645028 -1873.1122018 779.65590777 0.583D+00 0.109D-11
Exc = -56.0556637591 Eck = 835.711571525
N = 83.999948631
current damping = 0.350
Norm of current diis error: 0.47262E-01
max. resid. norm for Fia-block= 8.593D-04 for orbital 40a
max. resid. fock norm = 3.590D-03 for orbital 216a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
7 -464.39459159106 -1873.1172024 779.66087328 0.329D+00 0.105D-11
Exc = -56.0549241657 Eck = 835.715797441
N = 83.999948861
current damping = 0.200
Norm of current diis error: 0.10878E-01
max. resid. norm for Fia-block= 2.587D-04 for orbital 27a
max. resid. fock norm = 2.269D-03 for orbital 216a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
8 -464.39459280386 -1873.1485946 779.69226424 0.195D+00 0.993D-12
Exc = -56.0567618643 Eck = 835.749026106
N = 83.999949027
current damping = 0.100
Norm of current diis error: 0.69219E-02
max. resid. norm for Fia-block= 1.173D-04 for orbital 39a
max. resid. fock norm = 8.811D-04 for orbital 292a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
9 -464.39459349489 -1873.1320139 779.67568287 0.208D+00 0.958D-12
Exc = -56.0559238699 Eck = 835.731606740
N = 83.999949035
current damping = 0.150
Norm of current diis error: 0.14810E-02
max. resid. norm for Fia-block= 2.941D-05 for orbital 41a
max. resid. fock norm = 1.438D-03 for orbital 292a
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
10 -464.39459351718 -1873.1352195 779.67888846 0.201D+00 0.936D-12
Exc = -56.0560075273 Eck = 835.734895988
N = 83.999949033
current damping = 0.200
Norm of current diis error: 0.85515E-03
max. resid. norm for Fia-block= 1.339D-05 for orbital 41a
max. resid. fock norm = 1.276D-03 for orbital 216a
mo-orthogonalization: Cholesky decomposition
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
11 -464.39459352402 -1873.1345997 779.67826862 0.207D+00 0.887D-12
Exc = -56.0560016912 Eck = 835.734270315
N = 83.999949040
current damping = 0.250
Norm of current diis error: 0.26779E-03
max. resid. norm for Fia-block= 5.748D-06 for orbital 41a
max. resid. fock norm = 9.798D-04 for orbital 216a
ENERGY CONVERGED !
Overall gridpoints after grid construction = 368523
ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
12 -464.39459673160 -1873.1344080 779.67807373 0.109D+00 0.834D-12
Exc = -56.0559961112 Eck = 835.734069842
N = 83.999995742
current damping = 0.100
Norm of current diis error: 0.10678E-03
max. resid. norm for Fia-block= 7.888D-06 for orbital 33a
max. resid. fock norm = 3.783D-03 for orbital 216a
End of SCF iterations
convergence criteria satisfied after 12 iterations
------------------------------------------
| total energy = -464.39459673160 |
------------------------------------------
: kinetic energy = 462.23698088144 :
: potential energy = -926.63157761304 :
: virial theorem = 1.99535391698 :
: wavefunction norm = 1.00000000000 :
..........................................
<geterg> : there is no data group $energy
<skperg> : $end is missing
orbitals $scfmo will be written to file mos
irrep 38a 39a 40a 41a 42a
eigenvalues H -0.34204 -0.26332 -0.25315 -0.25003 -0.24163
eV -9.3074 -7.1653 -6.8886 -6.8037 -6.5751
occupation 2.0000 2.0000 2.0000 2.0000 2.0000
irrep 43a 44a 45a 46a 47a
eigenvalues H -0.01554 -0.01074 -0.01016 -0.00550 -0.00027
eV -0.4228 -0.2923 -0.2765 -0.1497 -0.0072
==============================================================================
electrostatic moments
==============================================================================
reference point for electrostatic moments: 0.00000 0.00000 0.00000
nuc elec -> total
------------------------------------------------------------------------------
charge
------------------------------------------------------------------------------
84.000000 -84.000000 0.000000
------------------------------------------------------------------------------
dipole moment
------------------------------------------------------------------------------
x 79.367242 -79.367224 0.000018
y 0.000367 -0.000366 0.000001
z 277.791231 -277.791436 -0.000205
| dipole moment | = 0.0002 a.u. = 0.0005 debye
------------------------------------------------------------------------------
quadrupole moment
------------------------------------------------------------------------------
xx 566.780538 -615.718905 -48.938367
yy 380.765157 -429.315839 -48.550682
zz 1861.767389 -1921.985465 -60.218076
xy -0.891919 0.901208 0.009289
xz 630.117393 -629.442455 0.674939
yz -4.987954 4.939007 -0.048947
1/3 trace= -52.569041
anisotropy= 11.538162
==============================================================================
HOMO-LUMO Separation
HOMO : -0.24162879 H = -6.57506 eV
LUMO : -0.01553732 H = -0.42279 eV
HOMO-LUMO gap: 0.22609147 H = +6.15227 eV
==============================================================================
------------------------------------------------------------------------
total cpu-time : 20 hours 54 minutes and 32 seconds
total wall-time : 23 minutes and 42 seconds
------------------------------------------------------------------------
**** ridft : all done ****
2022-01-31 16:27:40.589
ridft ended normally
Hope this helps.
-
Hi,
the difference in wall time looks more reasonable, but still too large to be explained by noise only. As you wrote that this was the very same job on the very same machine, something must have changed during the different runs.
First thing that comes to my mind is CPU temperature. If the PC is idle and cools down, it will run faster in the first couple of minutes, but then throttles CPU frequency when it heats up. Modern CPUs do change the clock speed quite often which makes it hard to run benchmarks.
I am more concerned about your total wall time. I tried the same job on 48 cores and it was done in 10 minutes using Turbomole 7.5 (the version you also used). I wonder how many cores you have on your machine.
Please try to run lscpu and check the output:
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 96
On-line CPU(s) list: 0-95
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 2
NUMA node(s): 2
To get the number of physical cores, multiply Socket(s) by Core(s) per socket; in the example here this gives 48. CPU(s) is reported as 96 only because Hyper-Threading is active, so each physical core appears as two logical ones (Thread(s) per core: 2).
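The same arithmetic can be scripted. This is a small Python sketch under the assumption that lscpu prints English "Key: value" lines as in the example; parse_lscpu and core_counts are hypothetical helper names.

```python
def parse_lscpu(text):
    """Turn 'Key: value' lines of lscpu output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def core_counts(text):
    """Return (physical_cores, logical_cpus) computed from lscpu output."""
    fields = parse_lscpu(text)
    physical = int(fields["Socket(s)"]) * int(fields["Core(s) per socket"])
    logical = physical * int(fields["Thread(s) per core"])
    return physical, logical


# Relevant lines of the lscpu example shown above.
EXAMPLE = """\
CPU(s): 96
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 2
"""

print(core_counts(EXAMPLE))  # (48, 96)
```

To use live output instead of the pasted text, the string can be obtained with subprocess.run(["lscpu"], capture_output=True, text=True).stdout. If the number of threads a job starts is larger than the physical core count, the extra threads share cores through Hyper-Threading.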
Next question: what do you want to achieve with the timings? Compare different CPU types? Or find the fastest way to do this kind of calculation?
If you are looking for the fastest setup, please note that a) newer versions of Turbomole might be more efficient and b) using different methods, such as semi-numerical treatment of Hartree-Fock exchange for hybrid functionals like B3-LYP, can have a large impact (especially if the basis set is large or even 'huge', as def2-QZVPPD is). See e.g. https://arxiv.org/abs/1610.07779
I took your input and ran four jobs, a default RI-DFT calculation and one with semi-numerical exchange activated ($senex keyword), using either Turbomole version 7.5 or the latest (March 2022) 7.6:
Exchange | Version | Energy (Hartree) | Time         |
default  | 7.5     | -464.3945967     | 10 min 1 sec |
default  | 7.6     | -464.3945967     | 7 min 28 sec |
senex    | 7.5     | -464.3945679     | 1 min 15 sec |
senex    | 7.6     | -464.3945711     | 50 sec       |
For 'production runs' the absolute total energy is not really important, and the error for relative energies is (much) smaller. As you can see, quite some work has been done on the seminumerical exchange algorithm (see https://aip.scitation.org/doi/10.1063/5.0022755 and https://www.turbomole.org/turbomole/release-notes-turbomole-7-6/).
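To put the table into numbers, here is a short Python sketch that computes the speedup and the shift in total energy of senex relative to the default exchange for each version. The names RESULTS and compare are hypothetical, and the Hartree to kcal/mol factor is the usual rounded conversion constant, not something taken from the thread.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor, rounded

# (exchange, version): (energy in Hartree, time in seconds), from the table above.
RESULTS = {
    ("default", "7.5"): (-464.3945967, 10 * 60 + 1),
    ("default", "7.6"): (-464.3945967, 7 * 60 + 28),
    ("senex", "7.5"): (-464.3945679, 1 * 60 + 15),
    ("senex", "7.6"): (-464.3945711, 50),
}


def compare(version):
    """Return (speedup, energy shift in kcal/mol) of senex vs. default."""
    e_ref, t_ref = RESULTS[("default", version)]
    e_snx, t_snx = RESULTS[("senex", version)]
    return t_ref / t_snx, (e_snx - e_ref) * HARTREE_TO_KCAL_PER_MOL


for version in ("7.5", "7.6"):
    speedup, shift = compare(version)
    print(f"{version}: senex is {speedup:.1f}x faster, "
          f"total energy shifted by {shift:.3f} kcal/mol")
```

For these four jobs the sketch gives a speedup of about 8 to 9 and a shift in the total energy of roughly 0.02 kcal/mol.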
-
The reason I needed those numbers was to compare the time required for PBE (with/without dispersion correction), B3LYP and other XC functionals. This is the lscpu output of the machine on which I ran these jobs.
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
Stepping: 7
CPU MHz: 2300.000
CPU max MHz: 3900.0000
CPU min MHz: 1000.0000
BogoMIPS: 4600.00
L1d cache: 1 MiB
L1i cache: 1 MiB
L2 cache: 32 MiB
L3 cache: 44 MiB
NUMA node0 CPU(s): 0-15,32-47
NUMA node1 CPU(s): 16-31,48-63
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; TSX disabled
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear flush_l1d arch_capabilities