TURBOMOLE Modules => ccsdf12 => Topic started by: martijn on May 26, 2020, 06:52:04 PM

Title: (T) bit of CCSD(T) calculation taking forever
Post by: martijn on May 26, 2020, 06:52:04 PM

I have been trying to run a CCSD(T)/def2-TZVPP calculation using Turbomole 7.41 on a coordination complex containing one Ni atom, eight heavy atoms and four hydrogens (412 basis functions). The CCSD bit of the calculation runs fine, taking 7 hours on 8 not-so-new cores, but now, 4 days later, I have seen no progress on the (T) bit. The last thing in the output file of the still-running calculation is the end of the CCSD bit:

     *                                                                    *
     *   RHF  energy                             :  -3250.7809131050      *
     *   correlation energy                      :     -2.9299415667      *
     *                                                                    *
     *   Final CCSD energy                       :  -3253.7108546717      *
     *                                                                    *
     *   D1 diagnostic                           :      0.3059            *
     *                                                                    *

and none of the temporary files in the directory have changed since this was written to the output, even though according to top the ccsdf12_omp process is happily beavering away.
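
One way to double-check that impression is to list the files written recently and to look at the individual threads of the process. This is only a sketch: it assumes a `find` with `-mmin` and `-maxdepth` (GNU or BSD) and a Linux `top`/`pgrep`, and the helper name `recent_files` is made up for illustration.

```shell
# List regular files in a directory (non-recursive) that were modified
# within the last N minutes. -mmin and -maxdepth are GNU/BSD extensions.
recent_files() {
    dir="$1"
    minutes="$2"
    find "$dir" -maxdepth 1 -type f -mmin "-$minutes"
}

# Example: was anything in the calculation directory touched in the last hour?
# recent_files . 60
#
# Per-thread CPU usage of the running process (Linux procps top):
# top -H -p "$(pgrep -n ccsdf12_omp)"
```

An empty listing combined with a single busy thread in `top -H` would fit a process that is spinning or waiting rather than doing useful work.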

Is the calculation of the perturbative triples really orders of magnitude more expensive than the CCSD part, or has something likely gone wrong?


Title: Re: (T) bit of CCSD(T) calculation taking forever
Post by: uwe on May 28, 2020, 09:39:50 AM

The triples part should in general take approximately as long as the doubles part (i.e. the time from the start until the final CCSD energy is printed).
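
This rule of thumb fits the formal operation counts: the dominant CCSD term scales as o^2 v^4 per iteration, while the non-iterative (T) correction scales as o^3 v^4 (o = correlated occupied orbitals, v = virtual orbitals). Ignoring all prefactors, the (T)/CCSD cost ratio is therefore roughly o divided by the number of CCSD iterations. A back-of-the-envelope sketch follows; the input values are hypothetical, since the thread gives neither the number of correlated occupied orbitals nor the iteration count.

```shell
# Crude (T)/CCSD cost ratio from formal scaling only:
#   CCSD: n_iter * o^2 * v^4     (T): o^3 * v^4   =>  ratio = o / n_iter
# Prefactors, integral handling and parallel efficiency are all ignored.
triples_to_ccsd_ratio() {
    n_occ="$1"    # correlated occupied orbitals (hypothetical input)
    n_iter="$2"   # number of CCSD iterations (hypothetical input)
    awk -v o="$n_occ" -v n="$n_iter" 'BEGIN { printf "%.1f\n", o / n }'
}

# Example with made-up values: 30 occupied orbitals, 15 iterations
triples_to_ccsd_ratio 30 15
```

With numbers of that size the two parts come out within a small factor of each other, so a (T) step that shows no progress after days is not explained by the scaling alone.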

The (T) part makes heavy use of linear algebra routines, which are provided by Intel's multi-threaded Math Kernel Library (MKL). There have been cases where ccsdf12 tried to spawn a new thread and then waited an extremely long time for it to start; this has only been observed on CPUs with AVX2 or AVX512.

You have probably run into this problem, but there are workarounds for it. Telling MKL not to use AVX2 usually helps, but the calculation will then run about a factor of two slower (compared with CPUs where the problem does not occur). So I'd be careful about using this approach in general.

Could you please report this to Turbomole support? It is also important to make sure that the next Turbomole release does not run into this specific error.

Title: Re: (T) bit of CCSD(T) calculation taking forever
Post by: martijn on May 29, 2020, 11:36:57 AM

Thanks Uwe! Would that involve setting: export MKL_ENABLE_INSTRUCTIONS=SSE4_2 ? If so, I'll have a look.

In the meantime, I've successfully finished the calculation with version 7.01 (same as with the Raman script issue a while back), suggesting indeed that this is a problem with some sort of code optimisation in later versions rather than an inherent bug.

I'll raise the issue with TM support once I've figured out how to do that nowadays, now that COSMOlogic has become part of Dassault.

Title: Re: (T) bit of CCSD(T) calculation taking forever
Post by: uwe on May 29, 2020, 11:51:25 AM

Yes, MKL_ENABLE_INSTRUCTIONS is one option. Another option is to change the memory allowed for the intermediate steps ($maxcor). This changes both the number of MKL calls and the size of the individual steps, so it could help too... but that's a bit of a lottery.
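
A minimal sketch of both knobs, for illustration only. The helper names `run_with_mkl_isa` and `show_maxcor` are made up. `AVX` and `SSE4_2` (with an underscore) are the value spellings listed in Intel's MKL documentation, but it is worth verifying them against the MKL version bundled with your Turbomole. The wrapper sets the variable for a single run instead of exporting it for the whole session.

```shell
# Run one command with MKL capped at a given instruction-set level,
# leaving the environment of the calling shell untouched.
run_with_mkl_isa() {
    isa="$1"
    shift
    env MKL_ENABLE_INSTRUCTIONS="$isa" "$@"
}

# Print the current $maxcor line of a control file, if there is one.
show_maxcor() {
    grep '^\$maxcor' "$1"
}

# Example usage (assumes a prepared Turbomole job in the current directory):
# show_maxcor control
# run_with_mkl_isa AVX ccsdf12 > ccsdf12.out
```

Capping at `AVX` rather than `SSE4_2` still avoids AVX2 while giving up less vector width; whether it also avoids the hang is something to test, not a given.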

The TM support team is still there and not lost, so don't hesitate to get in touch with them.