Vienna Ab-initio Simulation Package
Introduction
VASP is a complex package for performing ab-initio quantum-mechanical molecular dynamics (MD) simulations using pseudopotentials or the projector-augmented wave method and a plane wave basis set. The approach implemented in VASP is based on the (finite-temperature) local-density approximation with the free energy as variational quantity and an exact evaluation of the instantaneous electronic ground state at each MD time step. VASP uses efficient matrix diagonalisation schemes and an efficient Pulay/Broyden charge density mixing. These techniques avoid all problems possibly occurring in the original Car-Parrinello method, which is based on the simultaneous integration of electronic and ionic equations of motion. The interaction between ions and electrons is described by ultra-soft Vanderbilt pseudopotentials (US-PP) or by the projector-augmented wave (PAW) method. US-PP (and the PAW method) allow for a considerable reduction of the number of plane-waves per atom for transition metals and first row elements. Forces and the full stress tensor can be calculated with VASP and used to relax atoms into their instantaneous ground-state.
The VASP guide is written for experienced users, although even beginners might find it useful to read. The book is mainly a reference guide and explains most files and control flags implemented in the code. The book also tries to give an impression of how VASP works. However, a more complete description of the underlying algorithms can be found elsewhere. The guide continues to grow as new features are added to the code, so the version you hold in your hands may be outdated. Users might therefore find it useful to check the online version of the VASP guide from time to time to learn about new features added to the code.
Here is a short summary of some highlights of the VASP code:
VASP uses the PAW method or ultra-soft pseudopotentials. Therefore the size of the basis-set can be kept very small even for transition metals and first row elements like C and O. Generally not more than 100 plane waves (PW) per atom are required to describe bulk materials; in most cases even 50 PW per atom will be sufficient for a reliable description.
In any plane wave program, the execution time scales like
VASP uses a rather traditional and old-fashioned self-consistency cycle to calculate the electronic ground-state. The combination of this scheme with efficient numerical methods leads to an efficient, robust and fast scheme for evaluating the self-consistent solution of the Kohn-Sham functional. The implemented iterative matrix diagonalisation schemes (RMM-DIIS, and blocked Davidson) are probably among the fastest schemes currently available.
VASP includes a full featured symmetry code which determines the symmetry of arbitrary configurations automatically.
The symmetry code is also used to set up the Monkhorst Pack special points, allowing an efficient calculation of bulk materials and symmetric clusters. The integration of the band-structure energy over the Brillouin zone is performed with smearing or tetrahedron methods. For the tetrahedron method, Blöchl's corrections, which remove the quadratic error of the linear tetrahedron method, can be used, resulting in a fast convergence speed with respect to the number of special points.
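The Monkhorst Pack construction mentioned above can be illustrated with a minimal Python sketch. It uses the standard Monkhorst-Pack formula u_r = (2r - q - 1)/(2q), r = 1..q, along each reciprocal lattice direction. The function name `monkhorst_pack` is hypothetical, and the sketch returns the full, unreduced grid: it does not perform the symmetry reduction that the VASP symmetry code applies, so it is an illustration rather than a description of the VASP implementation.

```python
from fractions import Fraction
from itertools import product


def monkhorst_pack(q1, q2, q3):
    """Return the unreduced Monkhorst-Pack grid in fractional reciprocal coordinates.

    Hypothetical helper for illustration only; no symmetry reduction is applied.
    """
    def axis(q):
        # u_r = (2r - q - 1) / (2q) for r = 1..q
        return [Fraction(2 * r - q - 1, 2 * q) for r in range(1, q + 1)]

    return list(product(axis(q1), axis(q2), axis(q3)))


grid = monkhorst_pack(2, 2, 2)
print(len(grid))                      # number of k-points before symmetry reduction
print([float(c) for c in grid[0]])    # first k-point of the grid
```

Exact fractions are used so that grid points such as 1/3 are not affected by rounding; a production code would additionally reduce the grid to the irreducible wedge and attach weights.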
VASP runs equally well on super-scalar processors, vector computers and parallel computers. Presently support for the
(for a performance profile of these machines have a look at the Section 3.8). In addition, makefiles for the following platforms are supplied. Since we do not have access to most of these machines, support for these platforms is usually not available (the value in brackets indicates whether it is likely that VASP runs without problems: ++ no problems, excellent performance; + usually no problems; 0 presently unknown; - unlikely):
The following platforms are not well suited for the execution of VASP.
SUN
For these platforms makefiles are distributed, but we cannot offer help if the compilation fails or if the executable crashes during execution. Please do not order VASP if this is the only platform available to you.
CONTENTS
4 Parallelization of VASP.4
5.1 INCAR file
5.2 STOPCAR file
5.3 stdout, and OSZICAR-file
5.4 POTCAR file
5.5 KPOINTS file
5.5.1 Entering all k-points explicitly
5.5.2 Strings of k-points for bandstructure calculations
5.5.3 Automatic k-mesh generation
5.5.4 hexagonal lattices
5.6 IBZKPT file
5.7 POSCAR file
5.8 CONTCAR file
5.9 EXHCAR file
5.10 CHGCAR file
5.11 CHG file
5.12 WAVECAR file
5.13 TMPCAR file
5.14 EIGENVALUE file
5.15 DOSCAR file
5.16 PROCAR file
5.17 PCDAT file
5.18 XDATCAR file
5.19 LOCPOT file
5.20 ELFCAR file
5.21 PROOUT file
5.22 makeparam utility
5.23 Memory requirements
6.21.1 IBRION=-1
6.21.2 IBRION=0
6.21.3 IBRION=1
6.21.4 IBRION=2
6.21.5 IBRION=3
6.21.6 IBRION=5 and IBRION=6
6.21.7 IBRION=7 and IBRION=8
6.21.8 IBRION some general comments (ISIF, POTIM)
6.22 POTIM-tag
6.23 ISIF-tag
6.24 PSTRESS-tag
6.25 IWAVPR-tag
6.26 ISYM-tag and SYMPREC-tag
6.27 LCORR-tag
6.28 TEBEG, TEEND-tag
6.29 SMASS-tag
6.30 NPACO and APACO-tag
6.31 POMASS, ZVAL
6.32 RWIGS
6.33 LORBIT
6.34 NELECT
6.35 NUPDOWN
6.36 EMIN, EMAX, NEDOS tag
6.37 ISMEAR, SIGMA, FERWE, FERDO tag
6.38 LREAL-tag (and ROPT-tag)
6.39 GGA-tag
6.40 VOSKOWN-tag
6.41 DIPOL-tag (VASP.3.2 only)
6.42 ALGO-tag
6.43 IALGO, and LDIAG-tag
6.44 NSIM-tag
6.45 Mixing-tags
6.46 WEIMIN, EBREAK, DEPER-tags
6.47 TIME-tag
6.48 LWAVE, LCHARG
6.49 LVTOT-tag, and core level shifts
6.50 LELF
6.51 Parallelisation: NPAR switch, and LPLANE switch
6.52 LASYNC
6.53 LscaLAPACK, LscaLU
6.54 Elastic band method
6.55 PAW control tags
6.56 Monopole, Dipole and Quadrupole corrections
6.57 Dipole corrections for defects in solids
6.58 Band decomposed chargedensity (parameters)
6.59 Berry phase calculations
6.59.1 LBERRY, IGPAR, NPPSTR, DIPOL tags
6.59.2 An example: The fluorine displacement dipole (Born effective charge) in NaF
6.60 Non-collinear calculations and spin orbit coupling
6.60.1 LNONCOLLINEAR tag
6.60.2 LSORBIT tag
6.61 Constraining the direction of magnetic moments
6.62 On site Coulomb interaction: L(S)DA+U
6.63 HF type calculations
6.63.1 Introduction: HF functional
6.63.2 LHFCALC
6.63.3 Amount of exact/DFT exchange and correlation: AEXX, AGGAX, AGGAC and ALDAC
6.63.4 ENCUTFOCK: FFT grid in the HF related routines
6.63.5 HFLMAX
6.63.6 HFSCREEN and LTHOMAS
6.63.7 NKRED, NKREDX, NKREDY, NKREDZ and EVENONLY, ODDONLY
6.63.8 Typical HF type calculations
6.64 Optical properties and density functional perturbation theory (PT)
6.64.1 LOPTICS: frequency dependent dielectric matrix
6.64.2 CSHIFT: complex shift in Kramers-Kronig transformation
6.64.3 LNABLA: transversal gauge
6.64.4 LEPSILON: static dielectric matrix, ion-clamped piezoelectric tensor and the Born effective charges using density functional perturbation theory
6.64.5 LRPA: local field effects on the Hartree level (RPA)
6.64.6 Vibrational frequencies, relaxed-ion static dielectric tensor and relaxed-ion piezoelectric tensor
6.65 Frequency dependent GW calculations
6.65.1 ALGO for response functions and GW calculations
6.65.2 NOMEGA, NOMEGAR number of frequency points
6.65.3 LSPECTRAL: use the spectral method
6.65.4 OMEGAMAX, OMEGATL and CSHIFT
6.65.5 ENCUTGW energy cutoff for response function
6.65.6 ODDONLYGW and EVENONLYGW: reducing the k-grid for the response functions
6.65.7 LSELFENERGY: the frequency dependent self energy
6.65.8 LWAVE: selfconsistent GW
6.65.9 Recipe for G0W0 calculations
6.65.10 Recipe for selfconsistent GW calculations
6.65.11 Recipe for partially selfconsistent GW0 calculations
6.65.12 Using the GW routines for the determination of frequency dependent dielectric matrix
6.66 Not enough memory, what to do
7 Theoretical Background
9 Examples
9.1 Simple bulk calculations
9.2 Bulk calculations with internal parameters
9.3 Accurate DOS and Band-structure calculations
9.4 Atoms
9.5 Determining the groundstate energy of atoms
9.6 Dimers
9.7 Molecular Dynamics
9.8 Simulated annealing
9.9 Relaxation
9.10 Surface calculations
9.10.1 1. Step: Bulk calculation
9.10.2 2. Step: FFT-meshes and k-points for surface calculation
9.10.3 3. Step: Number of bulk and vacuum layers
9.11 Lattice dynamics, via the force constant approach
13 Example PSCTR files
13.1 Potassium pseudopotential
13.2 Vanadium pseudopotential
13.3 Palladium pseudopotential
13.4 Carbon pseudopotential
13.5 Hydrogen pseudopotential
15 FAQ
The precision of the forces was increased from 4 to 7-8 significant digits. In the previous version, 1st order finite differences were used to calculate certain components of the forces. In the new version, central differences are used in all places.
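The gain in precision from central differences is a general numerical fact and can be checked with a short Python sketch. The example below is not taken from the VASP force routines; it differentiates sin(x), whose exact derivative is known, with a first-order (forward) and a central difference using the same step size. The function names and the test function are illustrative assumptions.

```python
import math


def forward_difference(f, x, h):
    """First-order finite difference; truncation error is proportional to h."""
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h):
    """Central finite difference; truncation error is proportional to h**2."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x, h = 1.0, 1e-3
exact = math.cos(x)  # analytic derivative of sin at x

err_forward = abs(forward_difference(math.sin, x, h) - exact)
err_central = abs(central_difference(math.sin, x, h) - exact)

print(f"forward difference error: {err_forward:.2e}")
print(f"central difference error: {err_central:.2e}")
```

For the same step size the central difference needs one extra function evaluation but gains several significant digits, which is consistent with the improvement described above.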
A new flag was introduced to control the behaviour of the mixer. It is called MAXMIX, and specifies the maximum number of iterations stored in the mixer. If MAXMIX is set to a positive value (for example 40), the mixer is not reset when the ions are moved. This can reduce the number of electronic steps during molecular dynamics (MD) and ionic relaxations. Please read section 6.45 to find more information. In addition, in VASP.4.4 an improved charge density prediction (based on a quadratic extrapolation of the bond charge) was implemented by Dario Alfè, which also reduces the number of iterations during MD simulations.
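The role of MAXMIX as a bound on the stored mixer history can be pictured with a small Python sketch. This is a toy model, not the VASP mixer: the class `MixerHistory` and its methods are hypothetical, the stored items are plain numbers rather than charge density residuals, and the treatment of non-positive values (history of length |MAXMIX| that is cleared when the ions move) is an assumption made only for this illustration.

```python
from collections import deque


class MixerHistory:
    """Toy model of a bounded mixing history (hypothetical, not the VASP mixer)."""

    def __init__(self, maxmix):
        # Assumption for this sketch: a positive value keeps the history
        # across ionic steps, any other value resets it when ions move.
        self.keep_across_ionic_steps = maxmix > 0
        # deque drops the oldest entry automatically once maxlen is reached.
        self.entries = deque(maxlen=abs(maxmix))

    def add(self, residual):
        self.entries.append(residual)

    def ions_moved(self):
        if not self.keep_across_ionic_steps:
            self.entries.clear()


history = MixerHistory(maxmix=3)
for step in range(5):
    history.add(step)
history.ions_moved()
print(list(history.entries))  # history survives the ionic step
```

Keeping the history means the mixer starts each new ionic step with information gathered in earlier steps, which is the mechanism by which the number of electronic steps can drop.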
The RMM-DIIS algorithm has been rewritten to run in a blocked mode (i.e. several bands are optimised at the same time). This allows the use of matrix-matrix operations instead of matrix-vector operations for the evaluation of the non-local projection operators in real space, and might speed up calculations on some machines (see section 6.44). The start up phase of the RMM-DIIS algorithm was also rewritten. In the new version, eight non self-consistent steps are performed, and in each step each band is optimised using a steepest descent algorithm. The new version is significantly more reliable.
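The difference between treating bands one at a time and in a block can be sketched in plain Python. The example is schematic: the projector matrix and the band data are made-up numbers, and the helper names are hypothetical. It only shows that applying a projector to several bands at once is a single matrix-matrix product that gives the same numbers as one matrix-vector product per band; on real hardware the blocked form is typically handed to an optimised matrix-matrix routine.

```python
def mat_vec(a, v):
    """Apply matrix a (list of rows) to vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def mat_mat(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Hypothetical data: 2 projectors on 3 grid points, and 2 bands on the same grid.
projector = [[1.0, 2.0, 0.0],
             [0.0, 1.0, -1.0]]
bands = [[1.0, 0.0, 2.0],
         [0.5, 1.0, 0.0]]

# Band-by-band: one matrix-vector product per band.
one_by_one = [mat_vec(projector, band) for band in bands]

# Blocked: store the bands as columns and do one matrix-matrix product.
band_block = [list(col) for col in zip(*bands)]        # grid points x bands
blocked = mat_mat(projector, band_block)               # projectors x bands
blocked_per_band = [list(col) for col in zip(*blocked)]

print(one_by_one)
print(blocked_per_band)
```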
A new real space projection scheme was implemented in VASP. It can reduce the computational requirements of the real space projection scheme by 10-30 %. This new scheme is selected by specifying LREAL=Auto or LREAL=A in the INCAR file. Per default, the scheme also searches automatically for an optimised real space cutoff and makes the specification of the flag ROPT unnecessary (see section 6.38). (If LREAL=A is used, the ROPT line must be removed from the INCAR file to activate the automatic search).
VASP.4.4 is also the first version which supports the PAW method. However, data sets will not be released before the end of the year 2000 (except for selected long time VASP users, coauthor-ship in the first PAW paper is required). VASP.4.4 also contains all files required for the parallel execution, and is hence the first official parallel version of VASP.
In VASP.4.4, the spring constant is redefined for the nudged elastic band method. In the old version, the spring constant had to be halved when the number of images was doubled. Now it should remain constant when the number of images is changed. The default value for the spring constant is now SPRING=-5, which is a sensible choice in most cases.
VASP.4.4 now allows damped molecular dynamics to be performed by setting the SMASS tag in the INCAR file. Although this feature was documented before (see also Sec. 6.21), it was not working properly in previous releases. For reasons of consistency, the time step (POTIM) has been redefined for damped molecular dynamics. The POTIM parameter in the INCAR files should be changed to
POTIM = old POTIM / 2
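The POTIM conversion above can be applied to an existing INCAR file with a short Python sketch. The helper `halve_potim` is hypothetical, and it assumes that POTIM is written as a simple `POTIM = value` line with a plain decimal number; the SMASS value in the example is a placeholder, not a recommendation.

```python
import re


def halve_potim(incar_text):
    """Return incar_text with the POTIM value divided by two.

    Hypothetical helper; assumes a line of the form 'POTIM = <decimal number>'.
    """
    def repl(match):
        new_value = float(match.group(2)) / 2
        return f"{match.group(1)}{new_value:g}"

    # (?m): match at the start of each line, (?i): ignore case of the tag name.
    return re.sub(r"(?mi)^(\s*POTIM\s*=\s*)([0-9.]+)", repl, incar_text)


old_incar = "SMASS = 0.4\nPOTIM = 0.5\n"
print(halve_potim(old_incar))
```

Lines other than the POTIM line are passed through unchanged, so the function can be run over a complete INCAR file.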
Selective dynamics are now correctly supported even during MD (the temperature is for instance correctly evaluated), and one can now freeze a selected number of ions during the MD.
1.2
Basic support of the calculation of optical properties is now supplied in VASP (the operator $\langle \phi_i | \nabla | \phi_f \rangle$ is calculated). Since no documentation is available presently, please check the optics.F subroutine. The required post processing files are available from Jürgen Furthmüller upon request.
The MAGMOM line is now inspected, and symmetry operations which are not compatible with MAGMOM are removed (see Sec. 6.12).
The average electrostatic potential in the region of the core can now be evaluated. This makes it possible to estimate core level shifts in the initial state approximation (see Sec. 6.49).
Polarisation calculations using the Berry phase approach were added to VASP by Martijn Marsman. Presently no documentation is available (see Sec. 6.59).
Please also check the README file of the latest VASP release to learn about bug fixes and other changes.
1.3 VASP 4.5
The major code improvement is the inclusion of spinors in the VASP code. It is now possible to treat noncollinear magnetic structures and spin-orbit coupling on a fully self-consistent basis (see section 6.60).
An automatic way to calculate force constants and vibrational frequencies using finite differences has been implemented (see Sec. 6.21).
For copyright reasons, VASP.4.5 does not support IALGO=8 (M. Teter, Corning and M. Payne hold a US Patent on this algorithm). As a faster and equally reliable substitute for IALGO=8, a Davidson like algorithm has been implemented (IALGO=38). In addition, it is now possible to select the algorithm using ALGO=Normal, Fast or Very Fast (see Sec. 6.42 for details).
VASP.4.5 also treats the unbalanced lattice vectors differently than VASP.4.4. In VASP.4.4, the charge density at unbalanced lattice vectors was set to zero. But, in combination with US-PPs and PAW potentials, this has significant disadvantages for an accurate description of wavefunctions in the vacuum (STM images). Therefore, the charge at unbalanced lattice vectors is not zeroed in VASP.4.5. To force a behaviour compatible with VASP.4.4, the flag LCOMPAT=.TRUE. can be set in the INCAR file (in VASP.4.4 this flag was used to obtain compatibility with VASP.3.2; please do not set this flag if you use VASP.4.4, except if you need compatibility with VASP.3.2).
Additionally, a subtle mistake in the real space projection scheme (LREAL=.T., LREAL=O, and LREAL=A) was removed in VASP.4.5.4 and newer releases. The real space projectors are zero beyond a certain radial cutoff $r_l$ (line "Optimized for a Real-space Cutoff X.XX Angstroem" in the OUTCAR file). Versions before VASP.4.5.4, however, incorrectly extrapolate the real space projection operators beyond this cutoff up to $r_l/100 \cdot 101$. As a result the precision of VASP was slightly reduced when using real space projectors. VASP.4.5.4 and newer releases have removed this error. Usually the energy differs only by 1 meV per atom, but in some cases the error can be up to a few meV per atom. Again, compatibility with VASP.4.4 can be forced by simply setting LCOMPAT=.TRUE. For the few users who already used VASP.4.5.3, it is possible to obtain compatibility with that version by setting only LREAL_COMPAT=.TRUE. (presently the default is in fact in any case LREAL_COMPAT=.TRUE.)
Another change concerns the WAVECAR file. To make it smaller, VASP.4.5 writes the WAVECAR file in single precision. VASP.4.5 is still able to read WAVECAR files generated by VASP.4.4, but VASP.4.4 is not able to read files generated by VASP.4.5. If this behaviour is not wanted, the pre-compiler flag WAVECAR_double can be specified in the makefiles (Sec. 3.5.14).
Finally, the MPI communication layer and the parallel fast Fourier transformation (FFT) routines have been rewritten to perform optimally on workstation clusters connected by Fast or Gigabit Ethernet. Usually you can expect a performance improvement of 10-20% with VASP.4.5. Additionally, on one processor, the parallel version of VASP.4.5 is now as fast as the serial version.
1.4 VASP 4.6
Presently VASP.4.6 is nearly identical to VASP.4.5. The most important difference is that LREAL_COMPAT now defaults to LREAL_COMPAT=.FALSE. (see above). L(S)DA+U now also works correctly for f-elements. In addition, LDA+U is now supported (i.e. no exchange splitting in the LDA part). VASP.4.6 also reports the orbital moment. Furthermore, several tiny bugs in the spin-orbit coupling have been removed in this version.
VASP.4.6 also generates a new output file with the name vasprun.xml. This output will be used in combination with the new VASP utility p4v (python for vasp). More will be announced later.
1.5 VASP 5
VASP.5 is currently not distributed; the expected release date is not before 2007.
VASP.5 is a significant update from vasp.4.X. Internally it has been rewritten to separate the data representation from the algorithms. With the new version it will be possible to implement different basis sets (for instance finite elements) without a major code rewrite.
VASP.5 supports or will support a large number of additional features (for internal tests vasp.5.1.21 is presently available):
Optical properties, in particular the real and imaginary part of the frequency dependent dielectric function, are supported.
Linear response with respect to an external field and with respect to the ionic positions is supported (beta stage) (Sec. 6.64.4).
Most second order response functions, such as internal strain tensor, piezoelectric tensor, Born effective charges, interatomic force constants (alpha stage).
Exact exchange and hybrid functionals (PBE0) are supported both in the Gamma-point only version and for the full k-point version. The k-point sampling can be performed in the IRZ, because routines for the symmetrisation of the wavefunctions are now implemented. The estimated computing requirements will however increase dramatically; expect something like two orders of magnitude (beta stage).
Screened exchange (beta stage) and model GW in the COHSEX (alpha stage) will be supported.
Exact exchange in the framework of the optimized effective potential method will be supported (alpha stage, but unlikely to be ever released publicly: first very excited, later not so exciting at all).
Full frequency dependent GW at the speed of the plasmon pole model: fully parallel, very fast (beta stage; Si, 128 bands, 6x6x6 k-points takes 500-1000 seconds on a dual Opteron). Very affordable, indeed (Sec. 6.65).
Bethe-Salpeter, TD-DFT and TD-HF for excitons (alpha stage).
Relaxed core PAW is supported (beta stage).
A new direct optimization scheme for the KS functional is supported, which performs (almost) as fast and robustly as the charge density mixing schemes (beta stage).
A new matrix diagonalisation routine will be implemented, which will scale essentially like N^2 (concept exists, but work not yet in progress).
VASP AN INTRODUCTION
History of VASP
CASTEP/CETEP code, but branched from this root at a very early stage. At the time the VASP development was started, the name CASTEP was not yet established. The CASTEP version upon which VASP is based only supported local pseudopotentials and a Car-Parrinello type steepest descent algorithm.
July 1989: Jürgen Hafner brought the code to Vienna after a half-year stay in Cambridge.
Sep. 1991: work on the VASP code was started. At this time, in fact, the CASTEP code had already been developed further, but VASP development was based on the old 1989 CASTEP version.
Oct. 1992: ultra-soft pseudopotentials were included in the code, the self-consistency loop was introduced to treat
Jan. 1993: J. Furthmüller joined the group. He wrote the first version of the Pulay/Broyden charge density mixer and contributed, among other things, the symmetry code, the INCAR-reader and a fast 3d-FFT.
Feb. 1995: J. Furthmüller left Vienna. In the meantime, VASP had got its final name and had become a stable and versatile tool for ab initio calculations.
Sep. 1996: conversion to Fortran 90 (VASP.4.1). The MPI (message passing) parallelisation of the code was started at this time. J.M. Holender, who initially worked on the parallelisation, unfortunately copied the communication kernels from CETEP to VASP. This was the second time developments originating from CASTEP were included in VASP, which subsequently caused quite some understandable anger and uproar.
Most of the work on the parallelisation was done in Keele, Staffordshire, UK by Georg Kresse. MPI parallelisation was finished around January 1997. Around July 1998, the communication kernel was completely rewritten in order to remove any CETEP remainders. Unfortunately, this meant giving up special support for T3D/T3E shmem communication. Since that time VASP has no longer been particularly efficient on the T3D/T3E.
July 1997-Dec. 1999: the projector augmented wave (PAW) method was implemented.
In addition, the following people have contributed to the code: The tetrahedron integration method was copied from an LMTO program (original author unknown, but it might be Jepsen or Blöchl). The communication kernels were initially developed by Peter Lockey at Daresbury (CETEP), but they have been subsequently modified completely. The kernel for the parallel FFT was initially written by D. White and M. Payne, but it has been rewritten from scratch around July 1998. Several parts of VASP were co-developed by A. Eichler and other members of the group in Vienna. David Hobbs worked on the noncollinear version. Martijn Marsman has written the routines for calculating the polarisation using the Berry phase approach, spin spirals and Wannier functions. He also rewrote the LDA+U routines initially written by O. Bengone, and extended the spin-orbit coupling to f electrons. Robin Hirschl implemented the Meta-GGA, and is currently working on the Hartree-Fock support (together with Martijn Marsman and Adrian Rohrbach).
2.2
VASP.4.X is a Fortran 90 program. This allows for dynamic memory allocation and a single executable which can be used for any type of calculation.
Generally the source code and the pseudopotentials should reside in the following directories:
VASP/src/vasp.4.lib
VASP/src/vasp.4.X
VASP/pot/..
VASP/pot_GGA/..
VASP/potpaw/..
VASP/potpaw_GGA/..
The directory vasp.4.lib contains source code which rarely changes, and this directory usually does not require reinstallation upon updates. However, significant changes in vasp.4.lib might be required when adapting the code to new platforms. The directory vasp.4.X contains the main Fortran 90 code. The directories pot/ and pot_GGA/ (and possibly potpaw/ and potpaw_GGA/) hold the (ultrasoft) pseudopotentials and the projector augmented wave potentials, respectively. LDA versions are supplied in the directories pot and potpaw, whereas GGA versions (Perdew, Wang 1991) are distributed in the directories pot_GGA and potpaw_GGA. The source files and the pseudopotentials are available on a file server (see Section 3.2).
Most calculations will be done in a work directory, and before starting a calculation, several files must be created in this directory. The most important input files are:
INCAR POTCAR POSCAR KPOINTS
diamond
Copy all files from the tutor/diamond directory to a work directory, and proceed step by step:
1. The following four files are the central input files, and must exist in the work directory before VASP can be executed. Please check each of these files using an editor.
INCAR file
The INCAR file is the central input file of VASP. It determines 'what to do and how to do it'. It is a tagged, free-format ASCII file: each line consists of a tag (i.e. a string), the equals sign '=' and one or several values. Defaults are supplied for most parameters. Please check the INCAR file supplied in the tutorial. It is longer than it must be. A default for the energy cutoff is, for instance, given in the POTCAR file, and is therefore usually not required in the INCAR file. For this simple example, however, the energy cutoff is supplied in the INCAR file (and it is probably wise to do this in most cases).
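The tagged format described above (tag, '=', one or several values) can be illustrated with a small reader. This is only a sketch for inspecting an INCAR file outside VASP, not the INCAR-reader of VASP itself. The function name parse_incar is hypothetical, and treating '#' as a second comment marker next to '!' is an assumption.

```python
def parse_incar(text):
    """Parse 'TAG = value ! comment' lines into a dict: tag -> list of value strings."""
    tags = {}
    for raw in text.splitlines():
        # Drop everything after a comment marker ('!' as in the tutorial; '#' is assumed).
        line = raw.split("!", 1)[0].split("#", 1)[0].strip()
        if "=" not in line:
            continue  # blank line or a line without a tag
        tag, value = line.split("=", 1)
        tags[tag.strip().upper()] = value.split()
    return tags


example = """
NSW    = 10   ! allow 10 steps
ISIF   = 2    ! relax ions only
IBRION = 2    ! use CG algorithm
POTIM  = 0.1
"""
settings = parse_incar(example)
print(settings)
```

Values are kept as strings on purpose, since the type of each value depends on the tag.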
POSCAR
The POSCAR file contains the positions of the ions. For the diamond example, the POSCAR file contains the following lines:
cubic diamond        comment line
 3.7                 universal scaling factor
 0.5 0.5 0.0         first Bravais lattice vector
 0.0 0.5 0.5         second Bravais lattice vector
 0.5 0.0 0.5         third Bravais lattice vector
 2                   number of atoms per species
direct               direct or cart (only first letter is significant)
 0.0 0.0 0.0         positions
 0.25 0.25 0.25
The positions can be given in direct (fractional) or Cartesian coordinates. In the second case, positions will be scaled by the universal scaling factor supplied in the second line. The lattice vectors are always scaled by the universal scaling factor.
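The scaling rules for the POSCAR file can be checked with a short sketch that reads the diamond example, scales the lattice vectors and converts the positions to Cartesian coordinates. The helper names read_poscar and cell_volume are hypothetical, and the reader assumes exactly the layout shown in the example (no extra lines); it is not a general POSCAR parser.

```python
def read_poscar(text):
    """Return (scaled lattice vectors, Cartesian positions) for the layout shown above."""
    lines = text.strip().splitlines()
    scale = float(lines[1])
    # Lattice vectors are always multiplied by the universal scaling factor.
    lattice = [[scale * float(x) for x in lines[i].split()[:3]] for i in (2, 3, 4)]
    counts = [int(n) for n in lines[5].split()]
    direct = lines[6].strip()[0].lower() == "d"  # only the first letter is significant
    positions = []
    for row in lines[7:7 + sum(counts)]:
        p = [float(x) for x in row.split()[:3]]
        if direct:
            # Fractional coordinates: linear combination of the scaled lattice vectors.
            p = [sum(p[i] * lattice[i][k] for i in range(3)) for k in range(3)]
        else:
            # Cartesian coordinates are scaled by the universal scaling factor.
            p = [scale * x for x in p]
        positions.append(p)
    return lattice, positions


def cell_volume(lattice):
    """Volume of the cell spanned by three lattice vectors (triple product)."""
    a, b, c = lattice
    cross = [b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0]]
    return abs(sum(a[i] * cross[i] for i in range(3)))


poscar = """cubic diamond
   3.7
 0.5 0.5 0.0
 0.0 0.5 0.5
 0.5 0.0 0.5
   2
direct
 0.0 0.0 0.0
 0.25 0.25 0.25
"""
lattice, positions = read_poscar(poscar)
print(cell_volume(lattice), positions[1])
```

For the diamond example the second ion ends up at a quarter of the cube diagonal, i.e. at 0.25 * 3.7 = 0.925 along each Cartesian axis.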
KPOINTS
The KPOINTS file determines the k-point settings:
4x4x4                Comment
 0                   0 = automatic generation of k-points
Monkhorst            M use Monkhorst Pack
 4 4 4               grid 4x4x4
 0 0 0               shift (usually 0 0 0)
The first line is a comment. If the second line equals zero, k-points are generated automatically using the Monkhorst-Pack technique (first character in third line equals M). With the supplied KPOINTS file a 4x4x4 Monkhorst-Pack grid is used for the calculation.
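To get a feeling for what a 4x4x4 Monkhorst-Pack grid looks like, the sketch below generates the unshifted grid in fractional coordinates of the reciprocal lattice, using the standard Monkhorst-Pack formula u_r = (2r - n - 1)/(2n) with r = 1..n. The function name monkhorst_pack is hypothetical. The sketch produces the full grid only; the reduction of the k-points by symmetry, and any details of how VASP generates them internally, are not covered.

```python
from itertools import product


def monkhorst_pack(n1, n2, n3):
    """Full (unreduced, unshifted) Monkhorst-Pack grid in fractional coordinates."""
    def axis(n):
        # u_r = (2r - n - 1) / (2n) for r = 1..n
        return [(2 * r - n - 1) / (2 * n) for r in range(1, n + 1)]
    return list(product(axis(n1), axis(n2), axis(n3)))


grid = monkhorst_pack(4, 4, 4)
print(len(grid))                       # number of points in the full grid
print(sorted({k[0] for k in grid}))    # distinct coordinates along the first axis
```

For an even number of subdivisions, as here, the grid does not contain the Gamma-point.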
POTCAR
The POTCAR file contains the pseudopotentials (for more than one species simply concatenate the POTCAR files using the UNIX command cat). The POTCAR file also contains information about the atoms (i.e. their mass, their valence, the energy of the atomic reference configuration for which the pseudopotential was created etc.).
Again this command will work properly only if the vasp executable is located somewhere in the search path. The search path is usually supplied in the PATH variable of your UNIX shell. For more details, the user is referred to a UNIX manual.
After starting VASP, you will get an output similar to
 VASP.4.4.3 10Jun99
 POSCAR found : 1 types and 2 ions
 LDA part: xc-table for CA standard interpolation
 file io ok, starting setup
 WARNING: wrap around errors must be expected
 entering main loop
       N       E             dE         d eps    ncg   rms       rms(c)
CG : 1 0.1209934E+02 0.120E+02 -0.175E+03 165 0.475E+02
CG : 2 -0.1644093E+02 -0.285E+02 -0.661E+01 181 0.741E+01
CG : 3 -0.2047323E+02 -0.403E+01 -0.192E+00 173 0.992E+00 0.416E+00
CG : 4 -0.2002923E+02 0.444E+00 -0.915E-01 175 0.854E+00 0.601E-01
CG : 5 -0.2002815E+02 0.107E-02 -0.268E-03 178 0.475E-01 0.955E-02
CG : 6 -0.2002815E+02 0.116E-05 -0.307E-05 119 0.728E-02
1 F= -.20028156E+02 E0= -.20028156E+02 d E =0.000000E+00
 writing wavefunctions
VASP uses a self-consistency cycle with a Pulay mixer and an iterative matrix diagonalisation scheme to calculate the Kohn-Sham (KS) ground-state. Each line corresponds to one electronic step, and in each step the wavefunctions are iteratively improved a little bit, and the charge density is refined once. A copy of stdout (that's what you see on the screen) is also written to the file OSZICAR.
The columns have the following meaning: Column N is a counter for the electronic iteration step, E is the current free energy, dE the change of the free energy between two steps, and d eps the change of the band-structure energy. The column ncg indicates how often the Hamilton operator is applied to the wavefunctions. The column rms gives the initial norm of the residual vector (R = (H - εS)|φ⟩) summed over all occupied bands, and is an indication of how well the wavefunctions are converged. Finally, the column rms(c) indicates the difference between the input and output charge density. During the first five steps, the density and the potentials are not updated to pre-converge the wavefunctions (therefore rms(c) is not shown). After the first five iterations, the update of the charge density starts. For the diamond example, only three updates are required to obtain a sufficiently accurate ground-state. The final line shows the free electronic energy F after convergence has been reached.
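The column layout described above is regular enough to be read by a script, for instance to plot the convergence. The following sketch parses the electronic-step lines and the final "F=" line of the diamond example. The names parse_oszicar, STEP and FINAL are hypothetical, and the patterns are derived only from the sample output shown in this section, so other VASP versions or algorithms may need adjustments.

```python
import re

# "<label> : <step> <E> <dE> <d eps> <ncg> <rms> [<rms(c)>]"
STEP = re.compile(r"^\s*(\w+)\s*:\s*(\d+)\s+(.*)$")
# Summary line after convergence, e.g. "1 F= -.20028156E+02 E0= ..."
FINAL = re.compile(r"F=\s*(\S+)")


def parse_oszicar(text):
    """Return (list of electronic steps, list of final free energies F)."""
    steps, free_energies = [], []
    for line in text.splitlines():
        m = STEP.match(line)
        if m:
            vals = m.group(3).split()
            steps.append({
                "N": int(m.group(2)),
                "E": float(vals[0]),
                "dE": float(vals[1]),
                "d_eps": float(vals[2]),
                "ncg": int(vals[3]),
                "rms": float(vals[4]),
                # rms(c) is only printed when the charge density is updated
                "rms_c": float(vals[5]) if len(vals) > 5 else None,
            })
            continue
        m = FINAL.search(line)
        if m:
            free_energies.append(float(m.group(1)))
    return steps, free_energies


sample = """
CG : 1 0.1209934E+02 0.120E+02 -0.175E+03 165 0.475E+02
CG : 2 -0.1644093E+02 -0.285E+02 -0.661E+01 181 0.741E+01
CG : 3 -0.2047323E+02 -0.403E+01 -0.192E+00 173 0.992E+00 0.416E+00
CG : 4 -0.2002923E+02 0.444E+00 -0.915E-01 175 0.854E+00 0.601E-01
CG : 5 -0.2002815E+02 0.107E-02 -0.268E-03 178 0.475E-01 0.955E-02
CG : 6 -0.2002815E+02 0.116E-05 -0.307E-05 119 0.728E-02
1 F= -.20028156E+02 E0= -.20028156E+02 d E =0.000000E+00
"""
steps, free_energies = parse_oszicar(sample)
print(len(steps), free_energies)
```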
More information (for instance the forces and the stress tensor) can be found in the OUTCAR file. Please check this file in order to get an impression of which information can be found in the OUTCAR file.
Another important file is the WAVECAR file, which stores the final wavefunctions. To speed up calculations, VASP usually tries to read this file upon startup. At the end of calculations, the file is written (or, if it exists, overwritten).
3. To calculate the equilibrium lattice constant, try typing ./run. The shell script run is a simple shell script which runs vasp for different lattice parameters. You can check the contents of this script with an editor.
4. Determine the equilibrium volume (for instance using a quadratic fit of the energy). The equilibrium lattice constant should be close to 3.526.
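The quadratic fit mentioned in step 4 can be sketched in a few lines. The code fits E(a) = c0 + c1 (a - a0) + c2 (a - a0)^2 by least squares and returns the position of the minimum. The function name parabola_minimum is hypothetical, and the energies below are made-up numbers generated from an exact parabola purely to exercise the code; they are not VASP results. In practice the lattice constants and energies would come from the runs started by the run script.

```python
def parabola_minimum(xs, ys):
    """Least-squares fit of a parabola through (xs, ys); returns the x of its minimum."""
    x0 = sum(xs) / len(xs)            # shift the abscissa for better conditioning
    ds = [x - x0 for x in xs]
    # Normal equations for c0 + c1*d + c2*d**2
    s = [sum(d ** k for d in ds) for k in range(5)]
    t = [sum(y * d ** k for d, y in zip(ds, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    c1 = m[1][3] / m[1][1]
    c2 = m[2][3] / m[2][2]
    return x0 - c1 / (2.0 * c2)


# Hypothetical data: an exact parabola with its minimum at 3.53.
lattice_constants = [3.40, 3.45, 3.50, 3.55, 3.60, 3.65, 3.70]
energies = [-20.0 + 3.0 * (a - 3.53) ** 2 for a in lattice_constants]
a_min = parabola_minimum(lattice_constants, energies)
print(a_min)
```

A parabola is only a reasonable model close to the minimum, so the sampled lattice constants should bracket the equilibrium value tightly.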
5. Now set the equilibrium lattice constant in the POSCAR file and move the ion located at 0.25 0.25 0.25 to 0.24 0.24 0.24, and relax it back to the equilibrium position using VASP. You have to add the lines
NSW    = 10 ! allow 10 steps
ISIF   = 2  ! relax ions only
IBRION = 2  ! use CG algorithm
to the INCAR file. (At this point you might find it helpful to read Section 6.21.)
In order to find the minimum, VASP performs a line minimisation of the energy along the direction of the forces (see Sec. 6.21). The line minimisation requires VASP to take a small trial step in the direction of the force; then the total energy is re-evaluated. From the energy change and the initial and final forces, VASP calculates the position of the minimum. For carbon, the automatically chosen trial step is much too large, and VASP can run more efficiently if the parameter POTIM is set in the INCAR file:
POTIM = 0.1 !
Do that and start once again from a more excited structure (i.e. 0.20,0.20,0.20).
At the end of any job, VASP writes the final positions to the file CONTCAR. This file has the same format as the POSCAR file, and it is possible to continue a run by copying CONTCAR to POSCAR and running VASP again.
6. As a final exercise, change the lattice constant in the POSCAR file to 3.40, and change ISIF in the INCAR file to
ISIF  = 3   ! relax ions + volume
POTIM = 0.1 ! you need to specify POTIM as well
and start once again. If ISIF is set to 3, VASP relaxes the ionic positions and the cell volume.
Do not forget to check the OUTCAR file from time to time.
in the INCAR file. Start from the CONTCAR file of the last calculation (i.e. copy CONTCAR to POSCAR).
9. The Pulay error is independent of the structure, so it can be evaluated once and for all using first a large basis-set and then a small one. Start at the equilibrium structure, with a high cutoff (ENCUT=550). The stress tensor should be zero. Then use the default cutoff. The stress is now -43 kBar. This yields an estimate of the possible errors caused by the basis set incompleteness. (You might correct the relaxation by setting
PSTRESS = -43 ! Pulay stress = -43 kB
VASP is not public-domain or share-ware, and will be distributed only after a license contract has been signed. Presently the license fee for academic users is 3000 USD. Enquiries must be sent to Jürgen Hafner (Juergen.Hafner@univie.ac.at). The enquiry should contain a short description of the short term research aims (less than half a page).
3.2 Installation of VASP
To install VASP, basic UNIX knowledge is required. The user should be acquainted with the tar, gzip, and ideally with the make command of the UNIX environment.
VASP requires that the BLAS package is installed on the computer. This package can be retrieved from many public domain servers, for instance http://math-atlas.sourceforge.net, but if possible one should use an optimised BLAS package from the machine supplier (see Section 3.7).
To install VASP, create a directory for VASP to reside in. We recommend using the directory
~/VASP/src
server     cms.mpi.univie.ac.at
login      vasp
password   is sent by email after the license contract has been signed
directory  src
The *.gz (gzip) files are generally smaller, but gzip is not installed on all machines.
At the same location (cms.mpi.univie.ac.at), pseudopotentials for all s-, p- and d-elements can be found in the files (pot/potcar.date.tar and pot_GGA/potcar.date.tar). The tar file pot/potcar.date.tar contains ultrasoft pseudopotentials for the local density approximation (LDA). This file should be untarred in a separate directory, e.g. using the commands
cd ~/VASP
mkdir pot
cd pot
tar -xvf directory_of_downloaded_file/potcar.date.tar
About 80 directories, all containing a file POTCAR.Z, are generated. The elements for which the potential file was generated can be recognised by the name of the directory (e.g. Al, Si, Fe, etc.). For more detail, we refer to Section 10. The pot_GGA/potcar.date.tar file contains the pseudopotentials for gradient corrected (Perdew Wang 91) calculations and should be untarred in a different directory, e.g. using the commands
cd ~/VASP
mkdir pot_GGA
cd pot_GGA
tar -xvf directory_of_downloaded_file/potcar.date.tar
Potential files for the projector-augmented wave (PAW) method are located in a separate account on the same ftp server:
server       cms.mpi.univie.ac.at
login        paw
password     is sent by email after the paw-license contract has been signed
directories  potpaw and potpaw_GGA
To untar these files, a similar procedure as described above should be used.
Documentation on VASP (for instance this file) might be found in the doc/ directory.
After the files vasp.4.X.X.tar.gz and vasp.4.lib.tar.gz have been retrieved from the file server, the installation proceeds along the following lines: First, uncompress the *.Z or *.gz files using uncompress or gunzip. Then untar the vasp.*.tar files using e.g.:
gunzip vasp.4.X.X.tar.gz (or uncompress vasp.4.X.X.tar.Z)
tar -xvf vasp.4.X.X.tar
gunzip vasp.4.lib.tar.gz (or uncompress vasp.4.lib.tar.Z)
tar -xvf vasp.4.lib.tar
Go to the vasp.4.lib directory, and copy the appropriate makefile.machine to Makefile:
cd vasp.4.lib
cp makefile.machine Makefile
makefile.dec        makefile.linux_ifc_P4   makefile.rs6000  makefile.t3d
makefile.hp         makefile.linux_ifc_ath  makefile.sgi     makefile.t3e
makefile.linux_abs  makefile.linux_pg       makefile.sp2     makefile.vpp
The value in brackets indicates whether it is likely that VASP will compile and execute without problems: ++ no problems; + usually no problems; 0 presently unknown; - unlikely. Type
make
The compilation should finish without errors, although warnings are possible. Go to the vasp.4.x directory. Copy the appropriate makefile.machine to Makefile. Now check the first 10-20 lines in the Makefile for additional hints. It is absolutely required to follow these guidelines, since the executable might not work properly otherwise. If the Makefile suggests that certain routines must be compiled with a lower optimisation, you can usually do this by inserting lines at the end of the makefile. For instance
radial.o : radial.F
	$(CPP)
	$(F77) $(FFLAGS) -O1 $(INCS) -c $*$(SUFFIX)
Finally, type
make
again. It should be possible to finish again without errors (although numerous warnings are possible). If problems are encountered during the compilation, please first make sure that you have followed exactly the guidelines in the Makefile. If you have done so, generate a bug report by typing the following commands (bash or ksh):
make clean
make >bugreport 2>&1
Send us the files Makefile, bugreport, the exact operating system version, and the exact compiler version (see Sec. 3.6).
Presently, we can solve problems only for the following platforms, since we do not have access to other operating systems:
makefile.dec
makefile.linux_pg
makefile.linux_alpha
makefile.rs6000
makefile.linux_ifc_P4
makefile.sp2
makefile.linux_ifc_ath
Bug reports for the Sun platform are rather useless. We know that vasp fails to work reliably on Sun machines, but this is related to an utterly bad Fortran 90 compiler. Any suggestions on how to solve this problem are appreciated.
Mind: The VASP makefiles assume that optimised BLAS packages are installed on the machine. The following BLAS libraries are linked in if the standard makefiles are used:
libessl.a
libcxml.a
libblas.a
libveclib.a
libsci.a
libmkl_p4
Usually these packages are specified in the line starting with
BLAS=
If you do not have access to these optimized BLAS libraries, you can download the ATLAS based BLAS from http://math-atlas.sourceforge.net. In this case (and for most linux makefiles), the BLAS line in the Makefile must be customized manually. Additional BLAS related hints are discussed in Section 3.7 and in some of the makefiles.
Next step: Create a work directory, copy the bench*.tar.gz files to this directory and untar the benchmark.tar file.
gunzip <benchmark.tar.gz | tar -xvf -
Then type
directory_where_VASP_resides/vasp
One should get the following results printed to the screen (VASP.4.5 and newer versions):
 VASP.4.4.4 24.Feb 2000
 POSCAR found : 1 types and 8 ions
 WARNING: mass on POTCAR and INCAR are incompatible
 typ 1 Mass 63.5500000000000 63.5460000000000
 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|      VASP found  21 degrees of freedom                                      |
|      the temperature will equal 2*E(kin)/ (degrees of freedom)              |
|      this differs from previous releases, where T was 2*E(kin)/(3 NIONS).   |
|      The new definition is more consistent                                  |
|                                                                             |
 -----------------------------------------------------------------------------
 file io ok, starting setup
 WARNING: wrap around errors must be expected
 prediction of wavefunctions initialized
 entering main loop
       N       E             dE         d eps    ncg   rms       rms(c)
CG : 1 -0.88871893E+04 -0.88872E+04 -0.15902E+04 96 0.914E+02
CG : 2 -0.90140943E+04 -0.12691E+03 -0.93377E+02 126 0.142E+02
CG : 3 -0.90288324E+04 -0.14738E+02 -0.49449E+01 112 0.293E+01 0.175E+01
CG : 4 -0.90228639E+04 0.59686E+01 -0.28031E+01 100 0.264E+01 0.373E+00
CG : 5 -0.90228253E+04 0.38602E-01 -0.64323E-01 100 0.337E+00 0.141E+00
CG : 6 -0.90227973E+04 0.28000E-01 -0.90047E-02 99 0.131E+00 0.643E-01
CG : 7 -0.90227865E+04 0.10730E-01 -0.31225E-02 98 0.677E-01 0.180E-01
CG : 8 -0.90227861E+04 0.43257E-03 -0.13932E-03 98 0.169E-01 0.800E-02
CG : 9 -0.90227859E+04 0.23479E-03 -0.47878E-04 62 0.814E-02 0.362E-02
CG : 10 -0.90227858E+04 0.41776E-04 -0.10154E-04 51 0.514E-02
1 T= 2080. E= -.90209042E+04 F= -.90227859E+04 E0= -.90220337E+04
EK= 0.18817E+01 SP= 0.00E+00 SK= 0.57E-05
 bond charge predicted
       N       E             dE         d eps    ncg   rms       rms(c)
CG : 1 -0.90226970E+04 -0.90227E+04 -0.32511E+00 96 0.935E+00
CG : 2 -0.90226997E+04 -0.27335E-02 -0.26667E-02 109 0.957E-01
CG : 3 -0.90226998E+04 -0.23857E-04 -0.23704E-04 57 0.741E-02 0.455E-01
CG : 4 -0.90226994E+04 0.34907E-03 -0.15696E-03 97 0.150E-01 0.121E-01
CG : 5 -0.90226992E+04 0.22898E-03 -0.54745E-04 75 0.915E-02 0.327E-02
CG : 6 -0.90226992E+04 0.13733E-04 -0.50646E-05 49 0.395E-02
2 T= 1984. E= -.90209039E+04 F= -.90226992E+04 E0= -.90219455E+04
EK= 0.17948E+01 SP= 0.42E-03 SK= 0.37E-04
The benchmark requires 50 MBytes, and takes between 4 and 60 minutes. It is best if the machine is idle, but generally results are also useful if this is not the case. Typical values for LOOP+ are indicated in Section 3.8. The output produced by this run can be found in the OSZICAR.ref file (version VASP.4.4.3) in the tar file.
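The temperature definition quoted in the warning box of the benchmark output can be checked against the printed values of T and EK. The sketch below evaluates T = 2 E(kin) / (k_B * degrees of freedom). Two things are assumptions here: the Boltzmann constant is inserted to convert eV to Kelvin (the warning box omits it), and the 21 degrees of freedom are interpreted as 3*8 coordinates minus 3 translations. The function name temperature is hypothetical.

```python
K_B = 8.617333e-5  # Boltzmann constant in eV/K


def temperature(e_kin, degrees_of_freedom):
    """T = 2*E(kin) / (k_B * degrees of freedom), with E(kin) in eV and T in K."""
    return 2.0 * e_kin / (K_B * degrees_of_freedom)


n_ions = 8
dof = 3 * n_ions - 3   # assumption: this is how the 21 reported by VASP arises

t_first = temperature(1.8817, dof)    # EK= 0.18817E+01 in the first ionic step
t_second = temperature(1.7948, dof)   # EK= 0.17948E+01 in the second ionic step
t_old = temperature(1.8817, 3 * n_ions)  # older definition with 3 NIONS

print(round(t_first), round(t_second), round(t_old))
```

With these assumptions the first two values reproduce the T= 2080. and T= 1984. entries of the benchmark output, while the older definition with 3 NIONS would give a noticeably lower temperature for the same kinetic energy.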
3.3
There are two directories in which VASP resides. vasp.4.lib holds files which change rarely, but might require considerable changes for supporting new machines. vasp.4.x contains the VASP code, and changes with every update. There are also several utility and maintenance programs that can be found in the vasp.4.x directory, for instance the
> makeparam
utility. These files are not automatically created and must be compiled by hand, for instance by typing
> make makeparam
Mind: Make sure that you have removed or renamed the old vasp.4.X directory. Unpacking the latest version into an existing vasp.4.x directory will usually cause problems during compilation. Then proceed as described above.
3.5 Pre-compiler flags overview, parallel version and Gamma point only version
To support different machines and different versions, VASP relies heavily on the C pre-compiler (cpp). The cpp is used to create *.f files from the *.F files. Several flags can be passed to the cpp to generate different versions of the *.f files. The following flags are currently supported:
single_BLAS
vector
essl
NGXhalf
NGZhalf
wNGXhalf
wNGZhalf
NOZTRMM
REAL_to_DBLE
VASP.4 only:
debug
noSTOPCAR
F90_T3D
scaLAPACK
T3D_SMA
MY_TINY
USE_ERF
CACHE_SIZE
MPI
MPI_CHAIN
pro_loop
use_collective
MPI_BLOCK
WAVECAR_double
These flags are usually defined in the makefile in the cpp line with
-Dflag
Most of these flags are set properly in the platform dependent makefiles, and therefore most users do not need to modify them. To generate the parallel version, however, modification of the makefiles is required. Most makefiles have a section starting with
#-----------------------------------------------------------------------
# MPI VERSION
#-----------------------------------------------------------------------
If the comment sign '#' is removed from the following lines, the parallel version of vasp is generated. Please mind that if you want to compile the parallel version, you should either start from scratch (by unpacking VASP from the tar file) or type
> touch *.F
> make vasp
Finally, there are two flags that are of importance for all users. If wNGXhalf is set in the makefile, a version of VASP is compiled that works at the Gamma-point only. This version is 30-50% faster than the standard version. For the compilation of a parallel Gamma-point only version, the flag wNGZhalf must be set instead of wNGXhalf. Again it must be stressed that if one of these flags is set in the makefile, all Fortran files must be recompiled. This can be done by unpacking the tar file or typing
touch *.F
make vasp
In the following section all pre-compiler flags are briefly described.
3.5.1 single_BLAS
This flag is required if the code is compiled for a single precision machine. In this case, the single precision versions of the BLAS/LAPACK calls are used. Use this flag only on CRAY vector computers.
3.5.2 vector
This flag should be set if a vector machine is used. In this case, certain constructions which are not vectorisable are avoided, resulting in a code which is usually faster on vector machines.
3.5.3 essl
Use this flag only if you are linking with ESSL before linking with LAPACK. ESSL uses a different calling sequence for DSYGV than LAPACK. (At the moment the makefile for the RS 6000 links LAPACK before ESSL, so this flag is not required.)
3.5.4 NOZTRMM
If LAPACK is not well optimised, the call to ZTRMM should be avoided and replaced by ZGEMM. This is done by specifying NOZTRMM in the makefile.
3.5.5 REAL_to_DBLE
This flag results in a change of all REAL(X) calls to DBLE(X) calls, and is only required on SGI machines. On SGI machines the REAL call is not automatically augmented to the DBLE call if the auto-double compiler flag (-r8) is used. This flag is no longer required in VASP.4.
3.5.6 NGXhalf, NGZhalf
For charge densities and potentials, half the storage can be saved if one of these flags is used, since
$$A_{\mathbf{r}} = A_{\mathbf{r}}^{*} \quad \text{and} \quad A_{\mathbf{q}} = A_{-\mathbf{q}}^{*}. \qquad (3.1)$$
To use a real to complex FFT you must specify -DNGXhalf for the serial version and -DNGZhalf for the parallel version. If -DNGXhalf is specified for the serial version the real to complex FFT is simulated by a complex to complex FFT.
Mind: If this flag is changed in the makefile, recompile all *.F files. This can be done by typing
touch *.F
make vasp
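The storage saving described in this subsection rests on a general property of the Fourier transform: if a quantity is real in real space, its Fourier coefficients at q and -q are complex conjugates of each other, so only about half of them are independent. The sketch below verifies this numerically for a one-dimensional discrete Fourier transform. It is a generic illustration in Python with made-up data, not the FFT used in VASP; the helper name dft is hypothetical.

```python
import cmath


def dft(values):
    """Plain O(n^2) discrete Fourier transform of a sequence."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * q * r / n) for r, v in enumerate(values))
            for q in range(n)]


# Made-up real-space values standing in for a real charge density on 8 grid points.
real_density = [1.0, 2.5, 0.3, -1.2, 0.7, 4.0, 2.2, -0.5]
coeffs = dft(real_density)
n = len(coeffs)

# A_q must equal conj(A_{-q}); on a periodic grid the index -q wraps around to n - q.
max_deviation = max(abs(coeffs[q] - coeffs[(-q) % n].conjugate()) for q in range(n))

# Only the coefficients q = 0 .. n/2 need to be stored; the rest follow by conjugation.
independent = n // 2 + 1
print(max_deviation, independent)
```

In three dimensions the same argument halves the grid along one direction only, which is consistent with the flags referring to a single axis (x or z).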
3.5.7 wNGXhalf, wNGZhalf
At the Gamma-point, half the storage for the wavefunctions can be saved if one of these flags is used, because
$$C_{\mathbf{r}} = C_{\mathbf{r}}^{*} \quad \text{and} \quad C_{\mathbf{q}} = C_{-\mathbf{q}}^{*}. \qquad (3.2)$$
To use a real to complex FFT you must specify -DwNGXhalf for the serial version and -DwNGZhalf for the parallel version. If -DwNGXhalf is specified for the serial version the real to complex FFT is simulated by a complex to complex FFT.
Mind: If this flag is changed in the makefile, recompile all *.F files. This can be done using
touch *.F
make vasp
It is a good idea to compile the Gamma-point only version in a separate directory (for instance vasp_gamma). Copy all files from vasp to vasp_gamma, copy makefile.machine to makefile, and edit the makefile. Add the wNGXhalf (or wNGZhalf) flag to the cpp line.
CPP
Usually the Gamma-point only version is 2 times faster than the conventional version.
3.5.8 debug
Defining debug gives more information during a run. The additional information is written to stderr and might help to figure out where the program crashes. Mind that the use of a debugger is usually much faster for finding errors, but on some parallel machines, debuggers are not fully supported.
3.5.9 noSTOPCAR
Specifying this flag prevents the STOPCAR file from being read at each electronic iteration. This step is too expensive on very fast machines with slow IO-subsystems (like T3D, T3E or Fujitsu VPP). Mind that LSTOP = .TRUE. is still supported (i.e. it is possible to break after electronic minimisation).
3.5.10 F90_T3D
Compile for the T3D. This has only minor effects; for instance some compiler directives like
!DIR$ IVDEP
are changed to
!DIR$
The first directive is required on Cray vector machines for correct vectorisation, but it gives a warning on the T3D. In addition the STOPCAR file will not be read on the T3D in each iteration (see previous subsection), because re-reading the STOPCAR file is too expensive (0.5-1 sec) on a T3D. The F90_T3D flag must also be specified if the scaLAPACK flag is used on the T3D, since the T3D requires that some arrays are allocated in a special way (shmem-allocation).
3.5.11 MY_TINY
In VASP, the symmetry is determined from the POSCAR file. In VASP.4.4, the accuracy to which the positions must be correctly specified in the POSCAR can be customised only at compile time using the variable MY_TINY. By default MY_TINY is 10^-6, implying that the positions must be correct to within around 7 digits. If positions are not entered with the required accuracy, VASP will be unable to determine the symmetry group of the basis.
3.5.12 avoidalloc
If -Davoidalloc is set in the makefile, ALLOCATE and DEALLOCATE sequences are avoided in some performance sensitive areas. Notably under LINUX, ALLOCATE and DEALLOCATE are slow, and hence avoiding them improves the performance of some routines by roughly 10%.
3.5.13 pro_loop
If -Dpro_loop is set in the makefile, some DGEMV and DGEMM calls are replaced by DO loops. This improves the performance of the nonlocal projector functions on the SGI. Other machines do not benefit.
3.5.14 WAVECAR_double
VASP.4.5 only.
If -DWAVECAR_double is set in the makefile, the WAVECAR files are written with double precision accuracy, in a manner fully compatible with VASP.4.4. The default in VASP.4.5 is single precision.
3.5.15 MPI
If this flag is set, the parallel version is generated. It is necessary to recompile all files (touch *.F). The parallelisation requires that MPI is installed on the machine, and the path of the libraries must be specified in the makefile.
There is one minor technical problem: MPI requires an include file mpif.h, which is sometimes not F90 free format conformable (CRAY is one exception). Therefore the include file mpif.h must be copied to the directory VASP.4, converted to f90 style and named mpif.h. This can be done using the following lines:
> cp ...mpi.../include/mpif.h mpif.h
> ./convert mpif.h
The convert utility converts a F77 Fortran file to a F90 free format file and is supplied in the VASP.4 directory. (On most Cray T3E this is for instance not required, and mpif.h can be found in one of the default include paths.)
3.5.16 MPI_CHAIN
Using this flag a version is compiled which supports the nudged elastic band method. The mpif.h file must be created in the
same way as explained above. Most files will be compiled in the same way as in the serial version (for instance no parallel
FFT support is required). In this case each image must run on one and only one node, and the tag IMAGES must be set to the
number of nodes:
IMAGES = number of nodes
This version is as fast as the serial version (and thus usually faster than the full MPI version), and can run very efficiently on
clusters of workstations.
VASP.4.4 and VASP.4.5 currently do not support this flag properly.
3.5.17 use_collective
In VASP.4.5, the MPI version of VASP avoids collective communication, since the collective routines are very inefficiently
implemented in the public domain MPI packages, such as LAM or MPICH. On the SGI Origins and on the T3E, on the other hand, the
collective MPI routines are highly optimised. Hence use_collective should be specified on these platforms, and whenever
the collective MPI routines were optimised for the architecture.
3.5.18 MPI_BLOCK
Presently VASP breaks up immediate MPI send (MPI_isend) and MPI receive (MPI_irecv) calls using large data blocks into
smaller ones. We found that large blocks cause a dramatic bandwidth reduction on LINUX clusters linked by a 100 Mbit
and/or Gbit Ethernet (all kernels, all MPI versions including 2.6.X Linux kernels, lam.7.1.1). MPI_BLOCK determines the
block size. If use_collective is used, MPI_BLOCK is used only for the fast global sum routine (search for M_sumf_d in
mpi.F).
3.5.19 T3D_SMA
Although VASP.4 was initially optimised for the T3D (and T3E), the support for shmem communication is now only very
rudimentary, and might not even work. To make use of the efficient T3D (T3E) shmem communication scheme, specify
T3D_SMA in the makefile. This might speed up communication by up to a factor of 2. But mind that this can also cause
problems on the T3E if VASP is used with data-streams:
export SCACHE_D_STREAMS=1
The default makefile on the T3E therefore does not use the optimised communication routines, because performance
improvements due to data-streams are usually more important than optimised communication (it is thus safe to switch on data
streaming on the T3E by typing export SCACHE_D_STREAMS=1).
3.5.20 scaLAPACK
If specified, VASP will use scaLAPACK instead of LAPACK for the LU decomposition (timing ORTHCH) and diagonalisation
(timing SUBROT) of the sub space matrix (Nbands x Nbands). These operations are very fast in the serial version (2%) but
become a bottleneck on massively parallel machines for systems with many electrons. If scaLAPACK is installed on a massively
parallel machine use this switch (T3E, SGI, IBM SPX). scaLAPACK can be used on the T3E starting from programming
environment 3.0.1.0 (3.0.0.0 does for instance not offer the required routines). On the T3D (but not T3E) the additional
switch
-DT3D_SCA
must be specified, at least for the scaLAPACK version we have tested (the T3D scaLAPACK is not compatible with the standard
scaLAPACK routines).
On slow networks and PC clusters (100 Mbit Ethernet and even 1 Gbit Ethernet), it is not recommended to use
scaLAPACK. Performance improvements are small or scaLAPACK is even slower than LAPACK. If you still want to give it a
try, please download the required source files from www.netlib.org/SCALAPACK. Compilation is fairly straightforward, but
requires familiarity with MPI, Fortran, C and UNIX makefiles (always make sure that the underlying BLACS routines are
working correctly!).
ScaLAPACK can be switched off during runtime by specifying
LSCALAPACK = .FALSE.
in the INCAR file. Use this as a fallback when you encounter problems with scaLAPACK. Furthermore, in some cases, the
LU decomposition (timing ORTHCH) based on scaLAPACK is slower than the serial LU decomposition. Hence it is also
possible to switch off the parallel LU decomposition by specifying
LSCALU = .FALSE.
in the INCAR file (the subspace rotation is still done with scaLAPACK in this case).
3.5.21 CRAY_MPP
We encountered several problems with the MPI version of VASP.4.X on the CRAY J90. First, MPI_double_precision
(MPI_double_complex) must be changed to MPI_real (MPI_complex). Second, the reading of the INCAR file must be serialised (i.e. only one node
can do the reading at a time). Defining CRAY_MPP in the makefile fixes these problems. But
we are not yet sure whether this flag is required on all CRAY MPP machines or not. Any information on that would be
appreciated.
3.6
Compilation of VASP.4.X is not always straightforward, because f90 compilers are in general not very reliable yet. Mind that
the include file mpif.h must be supplied in f90 style for the compilation of the parallel version (see Section 3.5.15). Here is a
list of compilers and platforms and the kind of problems we have detected; in some cases more information can be found in
the relevant makefiles:
CRAY C90/J90
No problems, but compilation (especially of main.F) takes a long time. If there are time-limits the f90 compiler might be
killed during compilation. In that case a corrupt .o file remains, and must be removed by hand. If the last file compiled
was for instance nonl.F, the user must logout, login again and type
rm nonl.o
All compiler versions starting from 3.2.5.0 work correctly (including xlf90 4.X.X). Compiler version 3.2.0.0 will not
compile the parallel version correctly, but the serial version should be fine. One user reported that the version 3.2.3.0
compiles the parallel version correctly if the option -qddim is used.
On some systems the file mpif.h is located in the default include search path. Copying the mpif.h file to the local
directory and converting it to f90 style does not work (because the system wide mpif.h file is always included). One
solution is to rename the mpif.h file to mpif90.h. If the new mpi routines (parallel_new.tar) are used only the line
INCLUDE "mpif.h"
must be changed to
INCLUDE "mpif90.h"
On some SGI's the option -64 must be changed to -n32 in the makefiles of VASP.4.X and VASP.lib (O2 for instance).
Power Fortran 90, 7.2 on irix 6.2 works correctly. Older versions tend to crash when compiling main.F, in particular
compiler versions Fortran 90, 6.3 and 7.1 will not work.
(use versions | grep f90 to find out the current compiler version)
DEC
The compiler versions DIGITAL Fortran 90 V5.0-492 and V5.2 compile VASP.4.X correctly. Older compiler releases
and release V5.1 do not compile VASP, and require a compiler fix or upgrade.
T3D
No problems, but compilation (especially of main.F) takes a long time. If there are time-limits the f90 compiler might be
killed during compilation. In that case a corrupt .o file remains, and must be removed by hand. If the last file compiled
was for instance nonl.F, the user must logout, login again and type
rm nonl.o
before typing make again. Do not forget to load all required modules before starting compilation. This is usually
done in the profile; on the U.K. T3D the following modules must be initialised:
if [ -f /opt/modules/modules/init/ksh ]; then
# Initialize modules
. /opt/modules/modules/init/ksh
module load modules PrgEnv
fi
VASP supports only the newest alpha scaLAPACK release on the T3D (on the T3E PrgEnv 3.0.1.0 must be installed),
and VASP will not work correctly with the scaLAPACK version supplied in the libsci.a (libsci.a contains only a
downscaled scaLAPACK version, supporting very limited functionality). If you do not have access to this alpha release you
must switch off scaLAPACK (see Sec. 3.5.20).
T3E
The compiler versions 3.0.1.0 (and newer) should compile the code correctly and without difficulties.
It might be necessary to change the makefiles slightly: On the IDRIS-T3E the cpp (C-preprocessor) was located in the
directory /usr/lib/make/; it might be necessary to change this location (line CPP in the makefiles) on other T3E
machines.
For best performance one should also allow for hardware data streaming on the T3E. This can be done using
export SCACHE_D_STREAMS=1
before running the code. The performance improvements can be up to 30%. But we have to point out that the code
crashed from time to time if the switch T3D_SMA is specified in the makefile. Therefore in the default makefile,
T3D_SMA is currently not specified (and the optimised T3D/T3E communication routines are not used). If the
communication performance is very important, T3D_SMA can be specified in the makefile, but then it might be required to
switch data streaming off explicitly by typing:
export SCACHE_D_STREAMS=0
LINUX
Reportedly the NAG compiler NAGWare f90 compiler Version 2.2(260) can compile the code. We do not have access
to this version, so we cannot help if problems are experienced with NAG compilers under LINUX. Please also
check the makefiles before attempting the compilation.
At present we support the Portland Group F90/HPF (PGI). Tests for the Absoft f90 compiler have shown that the code
generated by the PGI compiler is 10-30% faster. The makefiles for the PGI f90 compiler have the extension linux_pg.
Releases 1.7 and 3.0.1 have been tested to date; the resulting code has the same speed for both releases. For more details
please check the makefile.
3.7
For good performance, VASP requires highly optimised BLAS routines. This package can be retrieved from many public
domain servers, for instance ftp.netlib.org. Most machine suppliers also offer optimised BLAS packages. BLAS routines are
for instance part of the following libraries:
libessl     (on IBM)
libcxml     (on DEC ALPHA)
libblas     (available from SGI)
libmkl      (available from INTEL)
libgoto     (P4/Athlon http://www.cs.utexas.edu/users/kgoto/signup_first.html)
These packages reach peak performance on most machines (up to 6 Gflops). Whenever possible one should obtain these
routines from the manufacturer of the machine. As an alternative, one can install the public domain versions but this might
slow down VASP by a factor of 1.5 to 2 for very large systems.
If possible, an optimised LAPACK should also be installed, although this is less important for good performance. All
required LAPACK routines are also available in the file vasp.lib/lapack_double.f. If optimised LAPACK routines are not
available, it is often possible to improve performance slightly by specifying -DNOZTRMM (see section 3.5.4) in the makefile.
The effect of this flag can be determined using a large test system (for instance bench.Hg.tar) and running with IALGO=-1 specified in the
INCAR file. The only timing influenced is ORTHCH.
Of considerable importance is in addition the performance of the FFT routines. VASP is supplied with routines written
and optimised by J. Furthmüller (it is a version of Schwarztrauber's multiple sequence FFT, supporting radices 2,3,4,5 and
7). On most machines these routines outperform the manufacturer supplied routines (for instance CRAY C90, SGI, DEC). It
is possible to optimise these routines by supplying an additional flag to the pre-compiler
-DCACHE_SIZE=XXXXX
-DCACHE_SIZE=32768
-DCACHE_SIZE=8000
-DCACHE_SIZE=8000
-DCACHE_SIZE=16000
CACHE_SIZE=0 has a special meaning. It performs the FFT's in x and y direction plane by plane, increasing the
cache consistency on some machines. So it is worthwhile trying this setting as well. After changing CACHE_SIZE in the makefile
fft3dfurth must be touched
touch fft3dfurth.F
and vasp recompiled. On vector computers CACHE_SIZE should be set to 0. It is also worthwhile increasing the optimisation
level for these routines (but in our tests we have never found a significant performance improvement).
There are a few other routines which might benefit from higher optimisation: most important are nonl.F and nonlr.F. Tests
for these routines can be done with bench.Hg.tar and IALGO=-1. For LREAL=.TRUE. the timings for RPRO and RACC
(nonlr.F) are affected, whereas for LREAL=.FALSE. the timings for VNLACC and PROJ (nonl.F) are affected. In particular,
one can try to set -Davoidalloc in the makefile (see Sec. 3.5.12). In this case ALLOCATE and DEALLOCATE sequences
are avoided in some performance sensitive areas. Notably under LINUX, ALLOCATE and DEALLOCATE are slow, and hence
avoiding them improves the performance of nonlr.F by roughly 10% (presently this option is selected on all Linux platforms).
This will execute the tests Lincom-TPP, matrix-vec and fft in this order (serial version only). Note that the present
algorithms make the matrix-vector part less important than the synthetic mix of Lincom-TPP, matrix-vec and fft. In
addition for the bench.Hg benchmark, the performance of the matrix-matrix part plays a more significant role than in the
synthetic benchmark.
Currently, all high performance machines run VASP fairly well. The cheapest option (best value at lowest price) is
presently AMD Athlon-64 based and Intel P4 PC's. For compilation we recommend the ifc compiler. Which processor (clock
speed) to buy depends a little bit on the budget and the available space. If you need a high packing density, dual Opteron
machines are a good option. IBM Power 4 based machines and Intel Itanium (SGI Altix, HP-UX) remain competitive, but at a
somewhat steeper price than PC's.
                     IBM RS6000  IBM RS6000  IBM RS6000  IBM RS6000  IBM RS6000  IBM SP3
                     590         3CT         595++       595++       397         High Node
lincom-TPP (Mflops)  245         237         389         389         580         1220
matrix-vec (Mflops)  110         73/128      110         110         300         300/400
Lincom-TPP           40.6 s      42.7 s      25.0 s      21.4 s      17.8 s      8.4 s
matrix-vec           32.3 s      40.4 s      32.3 s      19.4 s      15.3 s      12.1 s
fft                  31.4 s      35.0 s      24.0 s      17.3 s      14.4 s      5.1 s
TOTAL                103 s       117 s       81.3 s      58.3 s      47.5 s      26.8 s
RATING               1           0.9         1.3         1.8         2.2         3.8
bench.Hg             1663        1920        1380        1000        809         356

                     IBM RS6000  IBM SP4     ITANIUM 2   ITANIUM 2   Altix 350
                     590                     1300        1300        1600
                                             HP-UX       LINUX       SUSE SLES9
lincom-TPP (Mflops)  245         3100        5000        4300        5932
matrix-vec (Mflops)  110         600/800     1200/2300   1200/1500   1378/2021
Lincom-TPP           40.6 s      3.2 s       2.0 s       2.3 s       1.7 s
matrix-vec           32.3 s      6.0 s       2.3 s       2.6 s       3.1 s
fft                  31.4 s      2.8 s       1.7 s       2.1 s       1.1 s
TOTAL                103 s       12.0 s      6.0 s       7.2 s       5.9 s
RATING               1           8.5         16.3        14.8        17.5
bench.Hg             1663        181/50      127         135         81
bench.PdO                        4000/1129   2758        2900        1733

                     SGI         SGI         SUN         DEC-SX      DEC-LX
                     Power C.    Origin      USparc 366  ev5/530     ev5/530
lincom-TPP (Mflops)  300         430         290         439         650
matrix-vec (Mflops)  38          100/150     42/65       74/108      67/100
Lincom-TPP           32.0 s      22.0 s      19.7 s      21.8 s      14.3 s
matrix-vec           90.2 s      31.0 s      59 s        40.3 s      48.8 s
fft                  41.0 s      17.0 s      24 s        26.1 s      17.8 s
TOTAL                163 s       70 s        111 s       90 s        81 s
RATING               0.64        1.47        0.9         1.12        1.3
bench.Hg             2200/653    1200/330    1660        1424        1140

                     DS20        DS20^2      DS20e^2     UP2000      UP2000^2    UP 1000
                     ev6/500     ev6/500     ev6/666     ev6/666     ev6/666     ev6/600
lincom-TPP (Mflops)  800         1000        1200        1100        1100        800
matrix-vec (Mflops)  135/200     135/200     135/200     170/260                 140/200
Lincom-TPP           12.0 s      10.6 s      8.4 s       9.3 s       9.0 s       11.4 s
matrix-vec           19.8 s      20.8 s      17.6 s      17.9 s      17.1 s      30.0 s
fft                  9.8 s       8.6 s       6.7 s       8.5 s       7.7 s       10.9 s
TOTAL                41.4 s      40.0 s      33.7 s      35.7 s      34 s        52 s
RATING               2.4         2.6         3.1         2.8         3.0         2.0
bench.Hg             546         536         385         465         453         786
bench.Hg1            584         564         395         516         485
bench.PdO                        10792       8151

                     CRAY T3D+   CRAY T3E+   CRAY T3E+   CRAY        CRAY        VPP
                     ev4         ev5         1200        C90         J90         500
lincom-TPP (Mflops)  96          400         579         800         188         1500
matrix-vec (Mflops)  28/42       101         101         459         50          600
lincom-tpp           99.5 s      25 s        16.5 s      12.0 s      53 s        7.1 s
matrix-vec           110.0 s     33 s        33 s        8.3 s       74 s        5.0 s
fft                  174.0 s     42 s        34 s        6.9 s       43 s        5.4 s
TOTAL                400 s       100 s       100 s       27.2 s      170 s       17.5 s
RATING               0.25        1.0         1.2         4.1         0.6         6.5
bench.Hg                         639+        420+                                220
LINUX based PC's     Xeon GX     Xeon GX     PIII BX     PIII BX     PIII
                     450         550/512     450         500         700
lincom-TPP (Mflops)  268         378         303         324         500
matrix-vec (Mflops)  70/100      90/120      80/105      90/118      90/118
Lincom-TPP           36 s        27.3 s      34.0 s      32.9 s      29.6 s
matrix-vec           44 s        37.1 s      43.2 s      41.9 s      30.0 s
fft                  27 s        22.4 s      26.6 s      24.6 s      25.1 s
TOTAL                107 s       87 s        104 s       100 s       84 s
RATING               1           1.18        1.0         0.9         0.9
bench.Hg                         1631        2000        1866        1789

LINUX based PC's     Athlon      Athlon      Athlon      Athlon^x    Athlon^x    Athlon^x
                     550         TB 800      TB 850      TB 850      TB 900      1200
lincom-TPP (Mflops)  700         770         800         850         890         1100
matrix-vec (Mflops)  100/142     115/190     115/190     130/210     120/200     200/300
Lincom-TPP           16.8 s      12.8 s      12.3 s      11.6 s      11.3 s      8.6 s
matrix-vec           30.6 s      26.3 s      25.8 s      22.6 s      24.6 s      18.7 s
fft                  19.5 s      18.7 s      18.0 s      17.3 s      14.0 s      10.9 s
TOTAL                67 s        57.8 s      56 s        51.5 s      50 s        38.3 s
RATING               1.5         1.8         1.8         2.0         2.1         2.5
bench.Hg             1350 s      1131 s      1124 s      1045 s      959 s       818 s
LINUX based PC's     Athlon^i    Athlon^i    Opteron^j   Opteron^k   Opteron^k   Opteron^p
                     1400^b      XP/1900^b   244         246         250         246
                     SDRAM       DDR         32 bit      32 bit      32 bit      64 bit
lincom-TPP (Mflops)  1200        2200        2900        3300        3800        3300
matrix-vec (Mflops)  200/300     230/370     650/850     700/950     750/1050    700/950
Lincom-TPP           5.9 s       4.9 s       3.5 s       3.1 s       2.7 s       3.2 s
matrix-vec           17.3 s      13.1 s      5.4 s       4.3 s       4.2 s       3.9 s
fft                  9.8 s       7.3 s       3.3 s       3.0 s       2.6 s       2.6 s
TOTAL                39.3 s      25.3 s      12.2 s      10.4 s      9.5 s       9.8 s
RATING
bench.Hg             644         455         248         203         177         211
bench.PdO                        8412        4840        4256        3506        4172

LINUX based PC's     Ath-64^k
                     3700+
                     DDRAM
lincom-TPP (Mflops)  3400
matrix-vec (Mflops)  700/1050
Lincom-TPP           2.9 s
matrix-vec           4.3 s
fft                  2.6 s
TOTAL                9.8 s
RATING
bench.Hg             173
bench.PdO            3550
LINUX based PC's     P4^i        XEON^i      XEON^j      XEON^j      P4 nrthw^k
                     1700        2400        2800        2800        3200
                     RAMBUS      RAMBUS      RAMBUS      DDR         FSB 800
lincom-TPP (Mflops)  2000        3030        4100        4200        4700
matrix-vec (Mflops)  422/555     600/750     566/880     650/950     890/1300
Lincom-TPP           5.5 s       3.5 s       2.6 s       2.5 s       2.3 s
matrix-vec           7.6 s       5.3 s       5.6 s       5.0 s       3.9 s
fft                  7.5 s       4.9 s       3.1 s       2.9 s       2.6 s
TOTAL                20.6 s      13.7 s      11.3 s      10.5 s      8.8 s
RATING               5           7.5         9.4         10          11.7
bench.Hg             384         298         226/94      208/85      175
bench.PdO            7600        6335        4790/1801   4542/1787   3784

LINUX based PC's     P4 pres^k   P4 pres^j   P4 pres^k   P4 940s^k   P4 940s^l   P4 nrthw^j
                     3200        3400        3400        2x3200      2x3200      3400
                     FSB800/DDR1 FSB800/DDR2 FSB800/DDR2 FSB800/DDR2 FSB800/DDR2 FSB 800
lincom-TPP (Mflops)  5200        5200        5200        5500        5500        5400
matrix-vec (Mflops)  1000/1300   1000/1300   1000/1300   1100/1400   1100/1400   1200/1500
Lincom-TPP                       2.0 s       2.0 s       1.9 s       1.9 s       2.0 s
matrix-vec                       3.1 s       3.1 s       2.8 s       2.8 s       3.8 s
fft                              2.0 s       2.0 s       1.8 s       1.7 s       2.4 s
TOTAL                7.1 s       7.1 s       7.1 s       6.5 s       6.5 s       8.2 s
RATING               14.5        14.5        14.5        16.5        16.5        12.5
bench.Hg             148/47      144         129         129         111         165
bench.PdO            3224/939    2850        2580                    2270        3250
+ VASP.4.4, hardware data streaming enabled; bench.Hg is running on 4 nodes, all other data per node
++ system equipped with 2 (first) or 4 (second) memory boards.
export MALLOC_MMAP_MAX_=0
export MALLOC_TRIM_THRESHOLD_=-1
improve the performance by 10-20%!! NOTE: sometimes, the tables show very different timings for similar machines with similar
clock rates. This is often related to an upgrade of the compiler or of the motherboard.
3.10
The table below shows the scaling of the VASP.4 code on the T3D. The system is l-Fe with a cell containing 64 atoms; the Gamma
point only was used, the number of plane waves is 12500, and the number of included bands is 384.
cpu's           4        8       16       32       64      128
NPAR            2        4        4        8        8       16
POTLOK:     11.72     5.96     2.98     1.64     0.84     0.44
SETDIJ:      4.52     2.11     1.17     0.61     0.36     0.24
EDDIAG:     73.51    35.45    19.04    10.75     5.84     3.63
RMM-DIIS:  206.09   102.80    52.32    28.43    13.87     6.93
ORTHCH:     22.39     8.67     4.52     2.4      1.53     0.99
DOS:         0.00     0.00     0.00     0.00     0.00     0.00
LOOP:      319.07   155.42    80.26    44.04    22.53    12.39
t/t_opt:   100 %  99 %  90 %  90 %  80 %
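The scaling behaviour can be re-derived from the LOOP row of the table. The following is a minimal sketch, assuming that speedup is measured relative to the 4-CPU run and that efficiency is the speedup divided by the increase in CPU count; the function name `scaling` is illustrative only.

```python
cpus = [4, 8, 16, 32, 64, 128]
loop = [319.07, 155.42, 80.26, 44.04, 22.53, 12.39]  # LOOP timings from the table


def scaling(cpus, times):
    """Speedup and parallel efficiency relative to the smallest run."""
    n0, t0 = cpus[0], times[0]
    rows = []
    for n, t in zip(cpus, times):
        speedup = t0 / t                 # how much faster than the reference
        efficiency = speedup / (n / n0)  # fraction of the ideal speedup
        rows.append((n, speedup, efficiency))
    return rows


for n, s, e in scaling(cpus, loop):
    print(f"{n:4d} CPUs  speedup {s:6.2f}  efficiency {e:6.1%}")
```

Under these assumptions the efficiency stays near 100 % up to 16 CPUs and drops to roughly 80 % at 128 CPUs.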
PARALLELIZATION OF VASP.4
[Figure: speedup and efficiency plotted against the number of CPUs; execution time and speedup plotted against the number of CPUs and against 1/(number of CPUs).]
Parallelization of VASP.4
Fortran 90 and VASP
VASP was widely rewritten to use the power and flexibility of Fortran 90. In passing one must note that performance was
not a high priority during the restructuring (although performance of VASP.4.x is usually better than that of VASP.3.2). The main
aim was to improve the maintainability of the code. Subroutine calls in VASP.3.2 used to have calling sequences of several
lines:
      CALL EDDIAG(IFLAG,NBANDS,NKPTS,NPLWV,MPLWV,NRPLWV,
     &      NINDPW,NPLWKP,WTKPT,SV,CPTWFP,NTYP,NITYP,
     &      NBLK,CBLOCK,A,B,ANORM,BNORM,CELEN,NGPTAR,
     &      LOVERL,LREAL,CPROJ,CDIJ,
     &      CQIJ,IRMAX,NLI,NLIMAX,QPROJ,CQFAK,RPROJ,CRREXP,CREXP,
     &      DATAKE,CPRTMP,CWORK3,CWORK4,CWORK5,
     &      FERWE,NIOND,NIONS,LMDIM,LMMAX,
     &      NPLINI,CHAM,COVL,CWORK2,R,DWORK1,NWRK1,CPROTM,NWORK1,mcpu)
This was an outcome of not using any COMMON blocks in VASP.3.2. Due to the introduction of derived types (or structures)
the same CALL now consists of only 2 lines:
CALL EDDIAG(GRID,LATT_CUR,NONLR_S,NONL_S,WUP,WDES, &
LMDIM,CDIJ,CQIJ, IFLAG,INFO%LOVERL,INFO%LREAL,NBLK,SV)
This adds considerably to the readability and structuring of the code. It is now much easier to introduce and support new
features in VASP. We estimate that the introduction of F90 reduced the time required for the parallelization of VASP from
approximately 4 to 2 months.
In VASP.3.2 work arrays were allocated statically and several EQUIVALENCE statements existed to save memory. The
introduction of new subroutines requiring work arrays was always extremely tedious. In VASP.4.x all work space is allocated
on the fly using ALLOCATE and DEALLOCATE. This results in a smaller code, and makes the program significantly safer.
Finally VASP.4.x uses MODULES wherever possible. Therefore dummy parameters are checked at compilation time,
making further code development easier and safer.
4.2
VASP still has a quite flat hierarchy, i.e. the modularity of the code is not extremely high. But increasing the modularity would
have required too much code restructuring and manpower which was not available (the current code size is approximately
50 000 lines, making a complete rewrite almost impossible).
Each structure in VASP.4 is defined in an include file:
base.inc
broyden.inc
constant.inc
poscar.inc
pseudo.inc
setexm.inc
mpimy.inc
If one wants to understand VASP one should start with an examination of these files.
4.3 Parallelization of VASP.4.x
Once F90 had been introduced it was much easier to do the parallelization of VASP. One structure at the heart of VASP is for
instance the grid structure (which is required to describe 3-dimensional grids). Here is a slightly simplified version of the
structure found in the mgrid.inc file:
      TYPE grid_3d
!only GRID
         INTEGER NGX,NGY,NGZ             ! number of grid points in x,y,z
         INTEGER NPLWV                   ! total number of grid points
         INTEGER MPLWV                   ! allocation in complex words
         TYPE(layout)   :: RC            ! reciprocal space layout
         TYPE(layout)   :: IN            ! intermediate layout
         TYPE(layout)   :: RL            ! real space layout
! mapping for parallel version
         TYPE(grid_map) :: RC_IN         ! recip -> intermediate comm.
         TYPE(grid_map) :: IN_RL         ! intermediate -> real space comm.
         TYPE(communic), POINTER :: COMM ! opaque communicator
      END TYPE grid_3d
NGX, NGY, NGZ describe the number of grid points in x, y and z direction, and NPLWV the total number of points (i.e.
NGX*NGY*NGZ). Most quantities (like charge densities) are defined on these 3-dimensional grids. In the sequential version
NGX, NGY and NGZ were sufficient to perform a three dimensional FFT of quantities defined on these grids. In the parallel
version the distribution of data among the processors must also be known. This is achieved with the structures RL and RC,
which describe how data are distributed among processors in real and reciprocal space. In VASP data are distributed column
wise on the nodes. In reciprocal space the fast index is the first (or x) index and columns can be indexed by a pair (y,z).
In real space the fast index is the z index, and columns are indexed by the pair (x,y). In addition the FFT-routine (which performs
lots of communication) stores all required setup data in two mapping-structures called RC_IN and IN_RL.
The big advantage of using structures instead of common blocks is that it is trivial to have more than one grid. For
instance, VASP uses a coarse grid for the representation of the ultra soft wavefunctions and a second much finer grid for
the representation of the augmentation charges. Therefore two grids are defined in VASP: one is called GRID (used for
the wavefunctions) and the other one is called GRIDC (used for the augmentation charges). Actually a third grid exists which
has in real space a similar distribution as GRID and in reciprocal space a similar distribution as GRIDC. This third grid
(GRID_SOFT) is used to put the soft pseudo charge density onto the finer grid GRIDC.
VASP currently offers parallelization over bands and parallelization over plane wave coefficients. To get a high efficiency
it is strongly recommended to use both at the same time. The only algorithm which works with the distribution over bands is
the RMM-DIIS matrix diagonalizer (IALGO=48). The conjugate gradient band-by-band method (IALGO=8) is only supported
for parallelization over plane wave coefficients.
Parallelization over bands and plane wave coefficients at the same time reduces the communication overhead significantly.
To reach this aim a 2 dimensional cartesian communication topology is used in VASP:
node-id's
 0   1   2   3      bands 1,5,9,...
 4   5   6   7      bands 2,6,10,...
 8   9  10  11      etc.
12  13  14  15
Bands are distributed among a group of nodes in a round robin fashion; separate communication universes are set
up for the communication within one band (in-band communication COMM_INB), and for inter-band communication
(COMM_INTER). Communication within one in-band communication group (for instance 0-1-2-3) does not interfere with
communication done within another group (i.e. 4-5-6-7). This can be achieved easily with MPI, but we have also implemented
the required communication routines with T3D shmem communication.
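The node layout and the round robin band assignment described above can be sketched as follows. This is an illustration only, not the VASP implementation: the function names and the parameter `n_groups` (the number of in-band groups) are assumptions, and the 16-node, 4-group example simply mirrors the diagram.

```python
def in_band_groups(n_nodes, n_groups):
    """Split node ids 0..n_nodes-1 into n_groups rows of consecutive ids.
    Each row plays the role of one in-band communication group."""
    if n_nodes % n_groups:
        raise ValueError("n_nodes must be a multiple of n_groups")
    width = n_nodes // n_groups
    return [list(range(g * width, (g + 1) * width)) for g in range(n_groups)]


def bands_of_group(group, n_groups, n_bands):
    """1-based band indices dealt round robin to the given group."""
    return list(range(group + 1, n_bands + 1, n_groups))


groups = in_band_groups(16, 4)
for g, nodes in enumerate(groups):
    print(nodes, "bands", bands_of_group(g, 4, 12))
```

For 16 nodes in 4 groups, the first group is nodes 0-1-2-3 holding bands 1, 5, 9, and the second is nodes 4-5-6-7 holding bands 2, 6, 10, as in the diagram.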
Overall we have found a very good load balancing and an extremely good scaling in the band-by-band RMM-DIIS
algorithm. For the re-orthogonalization and subspace rotation, which is required from time to time, the wavefunctions
are redistributed from the distribution over bands to a distribution over plane wave coefficients. The communication in this part is by the way
very small in comparison with the communication required in the FFT's. Nevertheless subspace rotation on massively parallel
computers is currently still problematic, mainly because the diagonalization of the NBANDS x NBANDS subspace-matrix is
extremely slow.
There are some points which should be noted: Parallelization over plane waves means that the non-local projection
operators must be stored on each in-band-processor group (i.e. nodes 0-1-2-3 must store all real space projection operators). This
means relatively high costs in terms of memory, and therefore parallelization over bands should not be done too excessively.
Having for instance 64 nodes, we found that it is best to generate an 8 by 8 cartesian communicator. Mind also that the hard
augmentation charges are always distributed over ALL nodes, even if parallelization over bands is selected. This was possible
using the previously mentioned third grid GRID_SOFT, i.e. this third helper grid allows one to decouple the representation of
the augmentation and ultra soft part.
4.4
Files in the parallel version and serial version are fully compatible, and can be exchanged freely. Notably it is possible to
restart from an existing WAVECAR and/or CHGCAR file even if the number of nodes in the parallel version has changed.
But also mind that the WAVECAR file is a binary file, and therefore it can be transferred only between machines with a
similar binary floating point format (for instance IEEE standard format).
4.5
In most respects VASP.4.X should behave like VASP.3.2. However in VASP.4.4, IALGO=48 was redesigned to work more
reliably in problematic cases. Therefore the iteration history might not be directly comparable. VASP.4.X also subtracts the
atomic energies in each iteration, VASP.3.2 does not. Once again this means that the energies written in each electronic step
are not comparable.
The parallel version (i.e. if VASP is compiled with the MPI flag) has some further restrictions, some of which might be
removed in the future:
Here is a list of features not supported by VASP.4.4 running on a parallel machine:
VASP.4.4 (VASP.4.5 does not possess this restriction): The most severe restriction is that it is not possible to change
the cutoff or the cell size/shape on restart from an existing WAVECAR file. This means that if the cell size/shape and/or
the cutoff has been changed the WAVECAR should be removed before starting the next calculation (actually VASP will
realize if the cutoff or the cell shape have been changed and will proceed automatically as if the WAVECAR file does
not exist). The reason for this restriction is that the re-padding (i.e. the redistribution of the plane wave coefficients
on changing the cutoff sphere) would require a sophisticated redistribution of data and the required communication
routines are not implemented at present.
As a matter of fact, it is of course possible to restart with an existing WAVECAR file even if the number of nodes has
changed. The only point that requires attention is that changing the NPAR parameter might also affect the number of
bands (NBANDS). WAVECAR files can only be read if the number of bands is strictly the same on the file and for the
present run. In some cases, it might be required to set the number of bands explicitly in the INCAR file by specifying
the NBANDS parameter.
Symmetry is fully supported by the parallel version, BUT we have used a brute force method to implement it. The
charge density is first merged from all nodes, then symmetrized locally and finally the result is redistributed onto the
nodes. This means that the symmetrization of the charge density will be very slow; this can have serious impact on the
total performance.
In VASP.4.4.3 (and newer versions) this problem can be reduced by specifying ISYM=2 instead of ISYM=1. In this case
only the soft charge density and the augmentation occupancies are symmetrized, which results in precisely the same
result as ISYM=1 but requires less memory. ISYM=2 is the default for the PAW method.
Partial local DOS is only supported with parallelization over plane wave coefficients but not with parallelization over
bands. The reason is that some files (like PROCAR) have a rather complicated band-by-band layout, and it would be
complicated to mimic this layout with a data distribution over bands.
in **
in
out
in **
in **
out
in **
out
in (should not be used in VASP.3.2 and VASP.4.x)
in/out
out
in/out
in/out
out
out
out
out
out
out
out
out
out
A short description of these files will be given in the next section. Important input files required for all calculations are
marked with stars in the list; please check description and contents of these files first.
then VASP stops at the next ionic step. On the other hand, if the STOPCAR file contains the line
LABORT = .TRUE.
VASP stops at the next electronic step, i.e. WAVECAR and CHGCAR might contain non-converged results. If possible use
the first option.
to concat three POTCAR files. The first file will correspond to the first species on the POSCAR and INCAR file and so on.
Starting from version VASP 3.2, the POTCAR file also contains information about the atoms (i.e. their mass, their valence,
the energy of the reference configuration for which the pseudopotential was created etc.). With these new POTCAR files it is
not necessary to specify valence and mass in the INCAR file. If tags for the mass and valence exist in the INCAR file they
are checked against the parameters found on the POTCAR file and error messages are printed.
Mind: Be very careful with the concatenation of the POTCAR files, it is a frequent error to give the wrong ordering in the
POTCAR file!
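Because the ordering is easy to get wrong, the concatenation can be scripted so that the species order is stated once. This is a minimal sketch under stated assumptions: the function name `concat_potcars` and the directory layout `<potcar_dir>/<species>/POTCAR` are hypothetical, and the species list must be given in the same order as in the POSCAR file.

```python
from pathlib import Path


def concat_potcars(species, potcar_dir, out_path):
    """Join per-species POTCAR files into one file, in the order given.

    Assumes a hypothetical layout potcar_dir/<species>/POTCAR; the list
    'species' must follow the species order used in the POSCAR file.
    """
    parts = []
    for name in species:
        src = Path(potcar_dir) / name / "POTCAR"
        if not src.is_file():
            raise FileNotFoundError(src)
        parts.append(src.read_bytes())
    Path(out_path).write_bytes(b"".join(parts))
    return out_path


# Example call (paths are placeholders):
# concat_potcars(["Al", "As"], "/path/to/potcars", "POTCAR")
```

Raising an error for a missing file avoids silently writing a POTCAR with fewer species than the POSCAR expects.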
The new POTCAR files also contain a default energy cutoff (ENMAX and ENMIN line), therefore it is no longer
necessary to specify ENCUT in the INCAR file. Of course the value in the INCAR file overwrites the default in the POTCAR
file. For POTCAR files with more than one species the maximum cutoffs (ENMAX or ENMIN) are used for the calculation
(see Sec. 6.10). For more information about the supplied pseudopotentials please refer to section 10.
In this format an explicit listing of all coordinates is supplied, together with the connection tables for the tetrahedra if one
wants to use the tetrahedron integration methods (the latter part can be omitted for finite temperature smearing methods, see
section 7.4). The most general format is:
Example file
4
Cartesian
0.0 0.0 0.0 1.
0.0 0.0 0.5 1.
0.0 0.5 0.5 2.
0.5 0.5 0.5 4.
Tetrahedra
1 0.183333333333333
6
1 2 3 4
The first line is treated as a comment line. In the second line you must provide the number of k-points, and in the third line you
have to specify whether the coordinates are given in cartesian or reciprocal coordinates. Only the first character of the third
line is significant. The only key characters recognized by VASP are 'C', 'c', 'K' or 'k' for switching to cartesian coordinates;
any other character will switch to reciprocal coordinates. Nevertheless, write 'reciprocal' to switch to reciprocal coordinates to
make clear what you want to use. Next, the three coordinates and the (symmetry degeneration) weight for each k-point
follow (one line for each k-point). The sum of all weights need not be one; VASP will renormalize them internally, so only the
relative ratios of all weights have to be correct. In the reciprocal mode the k-points are given by
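\[ \vec{k} = x_1 \vec{b}_1 + x_2 \vec{b}_2 + x_3 \vec{b}_3 \tag{5.1} \]
where \vec{b}_{1 \dots 3} are the three reciprocal basis vectors, and x_{1 \dots 3} are the supplied values. In the cartesian input format the k-points
are given by
\[ \vec{k} = \frac{2\pi}{a} (x_1, x_2, x_3) \tag{5.2} \]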
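The following example illustrates how to specify the k-points. The unit cell of the fcc lattice is spanned by the following
basis vectors:
\[ A = \begin{pmatrix} 0 & a/2 & a/2 \\ a/2 & 0 & a/2 \\ a/2 & a/2 & 0 \end{pmatrix} \]
The reciprocal lattice is defined as:
\[ 2\pi B = \frac{2\pi}{a} \begin{pmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{pmatrix} \]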
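The relation between the reciprocal and the cartesian input format can be checked numerically for this fcc example. The sketch below assumes the rows of B given above (in units of 2π/a) and uses the X and W coordinates that appear in the line-mode examples of this section; the function name is chosen for illustration only.

```python
# Rows of B for the fcc lattice, in units of 2*pi/a.
B = [(-1, 1, 1), (1, -1, 1), (1, 1, -1)]


def reciprocal_to_cartesian(x):
    """k = x1*b1 + x2*b2 + x3*b3, returned in units of 2*pi/a."""
    return tuple(sum(x[i] * B[i][j] for i in range(3)) for j in range(3))


k_X = reciprocal_to_cartesian((0.5, 0.5, 0.0))    # X point
k_W = reciprocal_to_cartesian((0.5, 0.75, 0.25))  # W point
```

With these inputs the X point maps to (0, 0, 1) and the W point to (0.5, 0, 1) in cartesian coordinates.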
The following input is required in order to specify the high symmetry k-points.
Point
If the tetrahedron method is not used, the KPOINTS file may end after the list of coordinates. The tetrahedron method
requires an additional connection list for the tetrahedra: In this case, the next line must start with 'T' or 't', signaling that this
connection list is supplied. On the next line after this 'control line' one must enter the number of tetrahedra and the volume
weight for a single tetrahedron (all tetrahedra must have the same volume). The volume weight is simply the ratio between the
tetrahedron volume and the volume of the (total) Brillouin zone. Then a list with the (symmetry degeneration) weight and the
four corner points of each tetrahedron follows (four integers which represent the indices of the points in the k-point list given
above, 1 corresponds to the first entry in the list). Warning: In contrast to the weighting factors for each k-point, you must
provide the correct 'volume weight' and (symmetry degeneration) weight for each tetrahedron; no internal renormalization
will be done by VASP!
This method is normally used if one has only a small number of k-points or if one wants to select some specific k-points
which do not form a regular mesh (e.g. for calculating the band structure along some special lines within the Brillouin zone,
section 9.3). Tetrahedron connection tables will rarely be given 'by hand'. Nevertheless, this method for providing all k-point
coordinates and weights (and possibly the connection lists) is also important if the mesh contains a very large number of k-points:
VASP (or an external tool called 'k-points') can calculate regular k-meshes automatically (see next section), generating
an output file IBZKPT which has a valid KPOINTS format. For very large meshes it takes a lot of CPU time to generate the
mesh. Therefore, if you want to use the same k-mesh very frequently, do the automatic generation only once and copy the file
IBZKPT to the file KPOINTS. In subsequent runs, VASP can avoid a new generation by reading the explicit list given in this
file.
5.5.2
To generate strings of k-points connecting specific points of the Brillouin zone, the third line of the KPOINTS file must
start with an L for line-mode:
k-points along high symmetry lines
10 ! 10 intersections
Line-mode
cart
0 0 0 ! gamma
0 0 1 ! X
0 0 1 ! X
0.5 0 1 ! W
0.5 0 1 ! W
0 0 0 ! gamma
VASP will generate 10 k-points between the first and the second supplied point, 10 k-points between the third and the fourth,
and another 10 points between the final two points. The coordinates of the k-points can be supplied in cartesian (4th line starts
with c or k) or in reciprocal coordinates (4th line starts with r):
k-points along high symmetry lines
10 ! 10 intersections
Line-mode
rec
0 0 0 ! gamma
0.5 0.5 0 ! X
0.5 0.5 0 ! X
0.5 0.75 0.25 ! W
0.5 0.75 0.25 ! W
0 0 0 ! gamma
This particular mode is useful for the calculation of band structures. When band structures are calculated, it is required
to perform a fully self-consistent calculation with a full k-point grid (see below) first, and to perform a non-self-consistent
calculation next (ICHARG=11, see Sec. 6.14, 9.3).
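The kind of point list that line-mode produces can be illustrated with a small sketch. It assumes that the points of each segment are equally spaced and include both end points; the text above only states the number of points per segment, so the exact spacing used by VASP is an assumption here, and the function name is hypothetical.

```python
def segment_points(start, end, n):
    """Return n equally spaced points from start to end (end points included, n >= 2)."""
    return [tuple(s + (e - s) * i / (n - 1) for s, e in zip(start, end))
            for i in range(n)]


# Segments of the reciprocal-coordinate example: gamma-X, X-W, W-gamma.
segments = [((0, 0, 0), (0.5, 0.5, 0)),
            ((0.5, 0.5, 0), (0.5, 0.75, 0.25)),
            ((0.5, 0.75, 0.25), (0, 0, 0))]
path = [k for a, b in segments for k in segment_points(a, b, 10)]
```

Three segments with 10 points each give 30 k-points in total, with the shared end points appearing twice.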
5.5.3
The second method generates k-meshes automatically, and requires only the input of subdivisions of the Brillouin zone in each
direction and the origin ('shift') for the k-mesh. There are three possible input formats. The simplest one is only supported by
VASP.4.5 and newer versions:
Automatic mesh
0              ! number of k-points = 0 -> automatic generation scheme
Auto           ! fully automatic
10             ! length (l)
As before, the first line is treated as a comment. On the second line a number smaller than or equal to 0 must be specified. In the
previous section, this value supplied the number of k-points; a zero in this line activates the automatic generation scheme.
The fully automatic scheme, selected by the first character in the third line ('a'), generates Γ-centered Monkhorst-Pack grids,
where the numbers of subdivisions along each reciprocal lattice vector are given by
\[ N_1 = \max(1,\; l\,|\vec{b}_1| + 0.5) \]
\[ N_2 = \max(1,\; l\,|\vec{b}_2| + 0.5) \]
\[ N_3 = \max(1,\; l\,|\vec{b}_3| + 0.5). \]
\vec{b}_i are the reciprocal lattice vectors, and |\vec{b}_i| is their norm. VASP generates an equally spaced k-point grid with the
coordinates:
\[ \vec{k} = \vec{b}_1 \frac{n_1}{N_1} + \vec{b}_2 \frac{n_2}{N_2} + \vec{b}_3 \frac{n_3}{N_3}, \qquad n_1 = 0, \dots, N_1 - 1, \quad n_2 = 0, \dots, N_2 - 1, \quad n_3 = 0, \dots, N_3 - 1 \]
Symmetry is used to map equivalent k-points to each other, which can reduce the total number of k-points significantly. Useful
values for the length vary between 10 (large gap insulators) and 100 (d-metals).
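The subdivision formula can be tried out with a short sketch. Two points are assumptions made for the example and are not stated in the text: the reciprocal lattice vectors are taken without the factor 2π, and the real-valued expression is truncated to an integer. The 4 Å cubic cell is a made-up test lattice.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reciprocal_vectors(a1, a2, a3):
    """Reciprocal lattice vectors without the factor 2*pi (assumed convention)."""
    vol = dot(a1, cross(a2, a3))
    return [tuple(c / vol for c in cross(a2, a3)),
            tuple(c / vol for c in cross(a3, a1)),
            tuple(c / vol for c in cross(a1, a2))]


def subdivisions(lattice, length):
    """N_i = max(1, int(l * |b_i| + 0.5)); integer truncation is assumed."""
    return [max(1, int(length * math.sqrt(dot(b, b)) + 0.5))
            for b in reciprocal_vectors(*lattice)]


cubic = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]  # hypothetical cell, in Angstrom
n_small = subdivisions(cubic, 10)
n_large = subdivisions(cubic, 100)
```

For this cell |b_i| = 0.25, so a length of 10 gives three subdivisions per direction and a length of 100 gives 25.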
A slightly enhanced version allows supplying the numbers for the subdivisions N1, N2 and N3 manually:
Automatic mesh
0              ! number of k-points = 0 -> automatic generation scheme
Gamma          ! generate a Gamma centered grid
4 4 4          ! subdivisions N_1, N_2 and N_3 along recipr. l. vectors
0. 0. 0.       ! optional shift of the mesh (s_1, s_2, s_3)
In this case, the third line (again, only the first character is significant) might start with 'G' or 'g' for generating meshes
with their origin at the Γ point (as above) or 'M' or 'm', which selects the original Monkhorst-Pack scheme. In the latter
case k-point grids with even (mod(N_i, 2) = 0) subdivisions are shifted off Γ:
\[ \vec{k} = \vec{b}_1 \frac{n_1 + 1/2}{N_1} + \vec{b}_2 \frac{n_2 + 1/2}{N_2} + \vec{b}_3 \frac{n_3 + 1/2}{N_3} \]
The fifth line is optional, and supplies an additional shift of the k-mesh (compared to the origin used in the Gamma centered
or Monkhorst-Pack case). Usually the shift is zero, since the two most important cases are covered by the flags 'M' or 'm',
'G' or 'g'. The shift must be given in multiples of the length of the reciprocal lattice vectors, and the generated grids are then
('G' case):
\[ \vec{k} = \vec{b}_1 \frac{n_1 + s_1}{N_1} + \vec{b}_2 \frac{n_2 + s_2}{N_2} + \vec{b}_3 \frac{n_3 + s_3}{N_3}. \]
The selection 'M' without shift is obviously equivalent to 'G' with a shift of 0.5 0.5 0.5 (for even subdivisions), and vice versa.
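The equivalence of the two settings can be made concrete by listing the mesh points in fractional (reciprocal) coordinates. This is a sketch under the assumption that the 'M' scheme with even subdivisions is the 'G' formula with s_i = 1/2, as stated in the text; the function names are chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def mesh(divisions, shift=(0, 0, 0)):
    """Fractional coordinates (n_i + s_i) / N_i of a regular k-mesh."""
    ranges = [range(n) for n in divisions]
    return [tuple((Fraction(n_i) + Fraction(s_i)) / big_n
                  for n_i, s_i, big_n in zip(idx, shift, divisions))
            for idx in product(*ranges)]


def monkhorst_pack(divisions):
    """'M' setting: directions with even subdivisions are shifted by half a step."""
    shift = tuple(Fraction(1, 2) if n % 2 == 0 else 0 for n in divisions)
    return mesh(divisions, shift)


gamma_mesh = mesh((4, 4, 4))
gamma_shifted = mesh((4, 4, 4), (Fraction(1, 2),) * 3)
mp = monkhorst_pack((4, 4, 4))
```

The unshifted 4x4x4 mesh contains the Γ point, whereas the 'M' mesh starts at (1/8, 1/8, 1/8) and coincides with the shifted 'G' mesh.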
If the third line does not start with 'M', 'm', 'G' or 'g', an alternative input mode is selected. This mode is mainly for
experts, and should not be used by casual VASP users. Here one can provide directly the generating basis vectors for the
k-point mesh (in cartesian or reciprocal coordinates). The input file has the following format:
Automatic generation
0
Cartesian
0.25 0.00 0.00
0.00 0.25 0.00
0.00 0.00 0.25
0.00 0.00 0.00
The entry in the third line switches between cartesian and reciprocal coordinates (as in the explicit input format described first:
key characters 'C', 'c', 'K' or 'k' switch to cartesian coordinates). On the fourth, fifth and sixth line the generating basis
vectors must be given, and the seventh line supplies the shift (if one likes to shift the k-mesh off Γ; the default is to take the origin
at Γ; the shift is given in multiples of the generating basis vectors; only (0,0,0) and (1/2,1/2,1/2) and arbitrary combinations
are usually useful). This method can always be replaced by an appropriate Monkhorst-Pack setting. For instance, for an fcc
lattice the setting
cart
0.25 0 0
0 0.25 0
0 0 0.25
0.5 0.5 0.5
is equivalent to
Monkhorst-pack
4 4 4
0 0 0
This input scheme is especially interesting to build meshes which are based on the conventional cell (for instance sc
for fcc and bcc), or the primitive cell if a large supercell is used. In the example above the k-point mesh is based on the sc-cell.
(For the second input file the tetrahedron method cannot be used because the shift breaks the symmetry (see below), whereas the
first input file can be used together with the tetrahedron method.) For more hints please read section 8.6.
Mind: The division scheme (or the generating basis of the k-mesh) must lead to a k-mesh which belongs to the same class
of Bravais lattice as the reciprocal unit cell (Brillouin zone). Any symmetry-breaking set-up of the mesh cannot be handled
by VASP. Hence such set-ups are not allowed; if you break this rule an error message will be displayed. Furthermore, the
symmetrisation of the k-mesh can lead to meshes which cannot be divided into tetrahedra (at least not by the tetrahedron
division scheme implemented in VASP) if one uses meshes which do not have their origin at Γ (for certain lower symmetric
types of Bravais lattices or certain non-symmetry-conserving shifts). Therefore only very special shifts are allowed. If a shift
is selected which cannot be handled, you get an error message. For reasons of safety it might be a good choice to use only
meshes with their origin at Γ (switch 'G' or 'g' on third line or odd divisions) if the tetrahedron method is used.
5.5.4 Hexagonal lattices
We strongly recommend to use only Gamma centered grids for hexagonal lattices. Many tests we have performed indicate
that the energy converges significantly faster with Gamma centered grids than with standard Monkhorst-Pack grids. Grids
generated with the 'M' setting in the third line, in fact, do not have full hexagonal symmetry.
IBZKPT may be copied to the file KPOINTS to save time, if one KPOINTS set is used several times.
or
Cubic BN
3.57
0.0 0.5 0.5
0.5 0.0 0.5
0.5 0.5 0.0
1 1
Direct
0.00 0.00 0.00
0.25 0.25 0.25
The first line is treated as a comment line (you should write down the 'name' of the system). The second line provides a
universal scaling factor ('lattice constant'), which is used to scale all lattice vectors and all atomic coordinates. (If this value
is negative it is interpreted as the total volume of the cell.) On the following three lines the three lattice vectors defining the
unit cell of the system are given (first line corresponding to the first lattice vector, second to the second, and third to the third).
The sixth line supplies the number of atoms per atomic species (one number for each atomic species). The ordering must be
consistent with the POTCAR and the INCAR file. The seventh line switches to 'Selective dynamics' (only the first character
is relevant and must be 'S' or 's'). This mode allows providing extra flags for each atom, signaling whether the respective
coordinate(s) of this atom will be allowed to change during the ionic relaxation. This setting is useful if only certain 'shells'
around a defect or 'layers' near a surface should relax. Mind: The 'Selective dynamics' input tag is optional: The seventh line
supplies the switch between cartesian and direct lattice if the 'Selective dynamics' tag is omitted.
The seventh line (or eighth line if 'selective dynamics' is switched on) specifies whether the atomic positions are provided
in cartesian coordinates or in direct coordinates (i.e. fractional coordinates). As in the file KPOINTS, only the first
character on the line is significant and the only key characters recognized by VASP are 'C', 'c', 'K' or 'k' for switching to
the cartesian mode. The next lines give the three coordinates for each atom. In the direct mode the positions are given by
\[ \vec{R} = x_1 \vec{a}_1 + x_2 \vec{a}_2 + x_3 \vec{a}_3 \tag{5.3} \]
where \vec{a}_{1 \dots 3} are the three basis vectors, and x_{1 \dots 3} are the supplied values. In the cartesian mode the positions are only scaled
by the factor s on the second line of the POSCAR file:
\[ \vec{R} = s \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \tag{5.4} \]
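For the cubic BN example, the conversion from direct to cartesian coordinates can be sketched as follows. The sketch assumes that the basis vectors in (5.3) are the lattice vectors from the file multiplied by the scaling factor on the second line, as described above; the function name is hypothetical.

```python
def direct_to_cartesian(x, lattice, scale):
    """R = s * (x1*a1 + x2*a2 + x3*a3), with unscaled lattice vectors as rows."""
    return tuple(scale * sum(x[i] * lattice[i][j] for i in range(3))
                 for j in range(3))


# Lattice vectors and scaling factor of the cubic BN example.
lattice = [(0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
r_first = direct_to_cartesian((0.00, 0.00, 0.00), lattice, 3.57)
r_second = direct_to_cartesian((0.25, 0.25, 0.25), lattice, 3.57)
```

The second atom ends up at a quarter of the cubic cell diagonal, i.e. at 3.57/4 in each cartesian component.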
The ordering of these lines must be correct and consistent with the number of atoms per species on the sixth line. If you are
not sure whether you have a correct input, please check the OUTCAR file, which contains both the final components of the
vector \vec{R} and the positions in direct (fractional) coordinates. If selective dynamics is switched on, each coordinate triplet
is followed by three additional logical flags determining whether to allow changes of the coordinates or not (in our example
the first coordinate of atom 1 and all coordinates of atom 2 are fixed). If the line 'Selective dynamics' is removed from the file
POSCAR these flags will be ignored (and internally set to .T.).
Mind: The flags refer to the positions of the ions in direct coordinates, no matter whether the positions are entered in cartesian
or direct coordinates. Therefore, in the example given above the first ion is allowed to move in the direction of the first and
second direct lattice vector.
If no initial velocities are provided, the file may end here. For molecular dynamics the velocities are initialised randomly
according to a Maxwell-Boltzmann distribution at the initial temperature TEBEG (see section 6.28).
Entering velocities by hand is rarely done, except for the case IBRION=0 and SMASS=-2 (see section 6.29). In this case
the initial velocities are kept constant, allowing one to calculate the energy for a set of different linearly dependent positions (for
instance frozen phonons, section 9.11, or dimers with varying bond length, section 9.6). As previously, the first line supplies a
switch between cartesian coordinates and direct coordinates. On the next lines the initial velocities are provided. They are
given in units of (Å/fs, no multiplication with the scaling factor in this case) or (direct lattice vector/timestep).
Mind: For IBRION=0 and SMASS=-2 the actual steps taken are POTIM*read velocities. To avoid ambiguities, set POTIM
to 1. In this case the velocities are simply interpreted as vectors along which the ions are moved. For the cartesian switch,
the vector is given in cartesian coordinates (Å, no multiplication with the scaling factor in this case); for the direct switch the
vector is given in direct coordinates.
The predictor-corrector coordinates are only provided to continue a molecular dynamics run from a CONTCAR file of a
previous run; they cannot be entered by hand.
setexch
setexch is distributed with the package, but it must be created separately, by typing
> make setexch
To get a good accuracy in the interpolation, the table is split into two regions, a low density region (0 ... maximal small
electron density RHO(1) ?) and a high density region (... maximal electron density RHO(2) ?). This allows an accurate
interpolation for atoms and molecules. As a crude guideline RHO(2) should not exceed 200; for transition metals this value
was sufficient, and we generally recommend this setting for all materials. For 'simple' elements of the main group a value
around 10 is sufficient. The correlation type selected should be the same as used for the pseudopotential generation (usually
Ceperley-Alder as parameterized by Perdew and Zunger with relativistic corrections, i.e. switch '1').
Starting from version 3.2 VASP generates the EXHCAR file internally; in this case the parameters (given in the example
session above) are used to create the table, only the first parameter is adapted to the pseudopotential.
= .FALSE.
in the INCAR file (see section 6.48). In VASP, the density is written using the following commands in Fortran:
WRITE(IU,FORM) (((C(NX,NY,NZ),NX=1,NGXC),NY=1,NGYC),NZ=1,NGZC)
The x index is the fastest index, and the z index the slowest index. The file can be read format-free, because at least in new
versions it is guaranteed that spaces separate each number. Please do not forget to divide by the volume before visualizing
the file!
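The index ordering of the WRITE statement can be mirrored when the numbers are read back. The following sketch assumes that the grid values have already been parsed into a flat list in file order; the function name, the tiny 2x3x4 grid and the volume are made up for the example.

```python
def reshape_density(values, ngx, ngy, ngz, volume):
    """Return rho[iz][iy][ix] from a flat list in which the x index runs fastest.

    Each value is divided by the cell volume, as the text asks before visualizing.
    """
    if len(values) != ngx * ngy * ngz:
        raise ValueError("number of values does not match the grid dimensions")
    return [[[values[ix + ngx * (iy + ngy * iz)] / volume
              for ix in range(ngx)]
             for iy in range(ngy)]
            for iz in range(ngz)]


flat = [float(i) for i in range(2 * 3 * 4)]  # stand-in data in file order
rho = reshape_density(flat, 2, 3, 4, volume=2.0)
```

Moving one step in x advances by one entry in the flat list, one step in y by NGX entries, and one step in z by NGX*NGY entries.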
For spin-polarized calculations, two sets of data can be found in the CHGCAR file. The first set contains the total charge
density (spin up plus spin down), the second one the magnetization density (spin up minus spin down). For non-collinear
calculations the CHGCAR file contains the total charge density and the magnetisation density in the x, y and z direction in
this order.
For dynamic simulation (IBRION=0), the charge density on the file is the predicted charge density for the next step: i.e. it
is compatible with CONTCAR, but incompatible with the last positions in the OUTCAR file. This allows the CHGCAR and
the CONTCAR file to be used consistently for a molecular dynamics continuation job. For static calculations and relaxations
(IBRION=-1,1,2) the written charge density is the self-consistent charge density for the last step and might be used e.g. for
accurate band-structure calculations (see section 9.3).
Mind: Since the charge density written to the file CHGCAR is not the self-consistent charge density for the positions on the
CONTCAR file, do not perform a band-structure calculation (ICHARG=11) directly after a dynamic simulation (IBRION=0)
(see section 9.3).
where NSTEP starts from 1. To save disk space fewer digits are written to the file CHG than to CHGCAR. The file can be used
to provide data for visualization programs, for instance IBM data explorer. (For the IBM data explorer, a tool exists to convert
the CHG file to a valid data explorer file.) It is possible to prevent the CHG file from being written by setting
LCHARG = .FALSE.
in the INCAR file (see section 6.48). The data arrangement of the CHG file is similar to that of the CHGCAR file (see section
5.10), with the exception of the PAW one-centre occupancies, which are missing on the CHG file.
number of bands
'initial' cut-off energy
'initial' basis vectors defining the supercell
('initial') eigenvalues
('initial') Fermi-weights
('initial') wavefunctions
Usually WAVECAR provides excellent starting wavefunctions for a continuation job. For dynamic simulation (IBRION=0)
the wavefunctions in the file are usually those predicted for the next step: i.e. the file is compatible with CONTCAR. The
WAVECAR, CHGCAR and the CONTCAR file can be used consistently for a molecular dynamics continuation job. For
static calculations and relaxations (IBRION=-1,1,2) the written wavefunctions are the solution of the KS-equations for the
last step. It is possible to prevent the WAVECAR from being written by setting
LWAVE = .FALSE.
DOS
integrated DOS
The density of states (DOS) \bar{n} is actually determined as the difference of the integrated DOS between two pins, i.e.
\[ \bar{n}(\epsilon_i) = (N(\epsilon_i) - N(\epsilon_{i-1})) / \Delta\epsilon, \]
where \Delta\epsilon is the distance between two pins (energy difference between two grid points in the DOSCAR file), and N(\epsilon_i) is the
integrated DOS
\[ N(\epsilon_i) = \int_{-\infty}^{\epsilon_i} n(\epsilon)\, d\epsilon. \]
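The finite-difference definition is easy to reproduce from a column of integrated DOS values. The numbers below are invented test data on an energy grid with spacing 0.5, and the function name is chosen for illustration.

```python
def dos_from_integrated(n_int, delta_e):
    """n(e_i) = (N(e_i) - N(e_{i-1})) / delta_e for i >= 1."""
    return [(n_int[i] - n_int[i - 1]) / delta_e for i in range(1, len(n_int))]


n_int = [0.0, 0.5, 2.0, 4.0]  # made-up integrated DOS on an equally spaced grid
delta_e = 0.5
dos = dos_from_integrated(n_int, delta_e)
total = sum(d * delta_e for d in dos)
```

Summing the resulting DOS times the grid spacing returns the last integrated value, so the number of electrons in the covered range is reproduced.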
This method conserves the total number of electrons exactly. For spin-polarized calculations each line holds five data
energy
If RWIGS or LORBIT (Wigner Seitz radii, see section 6.32-6.33) is set in the INCAR file, an lm- and site-projected DOS is
calculated and also written to the file DOSCAR. One set of data is written for each ion; each set of data holds NDOS lines
with the following data
energy s-DOS p-DOS d-DOS
or
energy s-DOS(up) s-DOS(down) p-DOS(up) p-DOS(dwn) d-DOS(up) d-DOS(dwn)
for the non-spin-polarized and spin-polarized case respectively. As before, the written densities are understood as the difference
of the integrated DOS between two pins.
For non-collinear calculations, the total DOS has the following format:
energy DOS(total) integrated-DOS(total)
Information on the individual spin components is available only for the site-projected density of states, which has the format:
energy s-DOS(total) s-DOS(mx) s-DOS(my) s-DOS(mz) p-DOS(total) p-DOS(mx) ...
In this case, the (site-projected) total density of states (total) and the (site-projected) energy-resolved magnetization density in
the x (mx), y (my) and z (mz) direction are available.
In all cases, the units of the l- and site-projected DOS are states/atom/energy.
The site-projected DOS is not evaluated in the parallel version for the following cases:
vasp.4.5, NPAR≠1
vasp.4.6, NPAR≠1, LORBIT=0-5
In vasp.4.6 the site-projected DOS can be evaluated for LORBIT=10-12, even if NPAR is not equal to 1 (contrary to previous
releases).
Mind: For relaxations, the DOSCAR is usually useless. If you want to get an accurate DOS for the final configuration,
first copy CONTCAR to POSCAR and continue with one static (ISTART=1; NSW=0) calculation.
File PCDAT contains the pair correlation function. For dynamic simulations (IBRION>=0) an averaged pair correlation is
written to the file (see sections 6.20, 6.30).
= .TRUE.
must exist on the INCAR file (see section 6.48). In the present version (VASP.4.4.3), only the electrostatic part of the potential
is written (exchange correlation is not added). This is desirable for the evaluation of the work function, because the
electrostatic potential converges more rapidly to the vacuum level than the total potential. However, if the exchange correlation
potential is to be included, change one line in main.F:
! comment out the following line to add exchange correlation
!
INFO%LEXCHG=-1
CALL POTLOK(...)
Mind: Older versions might have had a different behavior when they were retrieved from the server. Please always check
yourself whether main.F is working in the way you expect (simply search for LEXCHG=-1 in main.F). If the line LEXCHG=-1
is commented out, the exchange correlation is added. It is recommended to avoid wrap around errors when evaluating
LOCPOT. This can be done by specifying PREC=High in the INCAR file.
The data arrangement on the LOCPOT file is similar to that of the CHGCAR file (see section 5.10).
Format:
1st line: PROOUT
2nd line: number of kpoints, bands and ions
3rd line: twice the number of types followed by the number of ions for each type
4th line: the Fermi weights for each kpoint (inner loop) and band (outer loop)
line 5 - ...: real and imaginary part of the projection P_Nlmnk for every lm-quantum number (inner loop), band, ion per type,
kpoint and ion-type (outer loop)
below: augmentation part
and finally: the corresponding augmentation part of the projections for every lm-quantum number (inner loop), ion per type,
ion-type, band and kpoint (outer loop)
This information makes it possible to construct e.g. partial DOSs projected onto bonding and anti-bonding molecular orbitals
or the so-called coop (crystal overlap population function).
5.22
makeparam utility
5.22
makeparam utility
The makeparam utility allows checking the required memory amount. The program is compiled (serial version only) by typing
make makeparam
Memory requirements
The memory requirements of VASP can easily exceed your computer facilities. In this case the first step is to estimate where
the excessive memory requirements derive from. There are two possibilities:
Storage of wave functions: All bands for all k-points must be kept in memory at the same time. The memory require-
The factor 16 arises from the fact that all quantities are COMPLEX*16.
Work arrays for the representation of the charge density, local potentials, structure factor and large work arrays: A total
for
0
2
1
this Run:
job : 0-new 1-cont 2-samecut
charge: 1-file 2-atom 10-const
electr: 0-lowe 1-rand
10.0
NELMIN= 0; NELMDL= 3
# of ELM steps m
POMASS = 102.91
ZVAL = 11.0
DOS related values:
SIGMA = 0.4; ISMEAR = 1 broad. in eV, -4-tet -1-fermi 0-gaus
Here is a short overview of all parameters currently supported. Parameters which are used frequently are emphasized.
NBLK
SYSTEM
NWRITE
ISTART
ICHARG
ISPIN
MAGMOM
INIWAV
ENCUT
PREC
PREC
IWAVPR
ISYM
SYMPREC
LCORR
POTIM
TEBEG, TEEND
SMASS
NPACO and APACO
POMASS
ZVAL
RWIGS
NELECT
NUPDOWN
EMIN, EMAX
ISMEAR
SIGMA
ALGO
IALGO
LREAL
ROPT
GGA
VOSKOWN
DIPOL
AMIX, BMIX
WEIMIN, EBREAK, DEPER
TIME
LWAVE,LCHARG and LVTOT
LELF
LORBIT
NPAR
LSCALAPACK
LSCALU
LASYNC
=
=
=
=
=
=
=
0
2
1
40
2
-5
10E-4
#
#
#
#
#
#
#
=
=
=
=
=
1
0
40
2
0
#
#
#
#
#
You can set ICHARG=1 by hand if an old CHGCAR file exists. If the charge sloshing is significant this will save a few steps,
compared to the default setting. To continue relaxation from a previous run copy the CONTCAR file to POSCAR.
6.2.3 Recommended minimum setup
Although the previous calculations can be performed using an empty INCAR file, it is recommended to always specify a few
parameters manually:
PREC = Normal
ENCUT = 300
LREAL = .FALSE. or Auto
ISMEAR = 0 or 1 or -5
These four parameters should be present in all calculations. They completely control the technical accuracy of the
calculations, in particular the basis sets (ENCUT), and whether the real space projection scheme is used or not. Total energies of two
calculations should only be compared and subtracted if the first three parameters are set identically in both calculations.
Ideally the parameter ISMEAR should also be identical throughout all calculations (but this might be difficult in some cases).
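Whether two runs satisfy this rule can be checked mechanically before energies are subtracted. The sketch below uses a deliberately small INCAR reader written for this example (it is not part of VASP): it assumes 'TAG = value' statements, ';' as a separator between statements on one line, and '#' or '!' as comment markers, and it compares values as plain strings.

```python
def parse_incar(text):
    """Collect 'TAG = value' pairs; '#' and '!' start comments, ';' separates statements."""
    tags = {}
    for line in text.splitlines():
        for marker in "#!":
            line = line.split(marker, 1)[0]
        for statement in line.split(";"):
            if "=" in statement:
                key, value = statement.split("=", 1)
                tags[key.strip().upper()] = value.strip()
    return tags


def energies_comparable(incar_a, incar_b, keys=("PREC", "ENCUT", "LREAL")):
    """True if both inputs set all listed tags and set them to the same value."""
    a, b = parse_incar(incar_a), parse_incar(incar_b)
    return all(k in a and a.get(k) == b.get(k) for k in keys)


run1 = "PREC = Normal\nENCUT = 300\nLREAL = .FALSE.\nISMEAR = 0\n"
run2 = "PREC = Normal\nENCUT = 400\nLREAL = .FALSE.\nISMEAR = 0\n"
same = energies_comparable(run1, run1)
different = energies_comparable(run1, run2)
```

Because the comparison is textual, '300' and '300.0' would count as different; a stricter tool would normalize numeric values first.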
6.2.4 Efficient relaxation from an unreasonable starting guess
If you want to do an efficient relaxation from a configuration that is not close to the minimum, set the following values in the
INCAR file (for brevity the recommended setup is lacking, see Sec. 6.2.3):
NELMIN = 5
EDIFF = 1E-2
EDIFFG = -0.3
NSW = 10
IBRION = 2
This way only low accuracy will be required in the first few steps, but since a minimum of 5 electronic steps is done the
accuracy of the calculated electronic groundstate will gradually improve. If you are a slightly advanced user you can also use
the damped MD algorithm, which is usually more efficient than the CG one:
IBRION = 3 ; SMASS = 0.4 # damped MD
POTIM = 0.4 # time step needs to be chosen with care
6.2.5
Close to a local minimum the variable-metric (RMM-DIIS) algorithm is most efficient. INCAR file (for brevity the
recommended setup is lacking, see Sec. 6.2.3):
NELMIN = 8
EDIFF = 1E-5
EDIFFG = -0.01
NSW = 20
MAXMIX = 80
IBRION = 1
NFREE = 10
Now very accurate forces are required (EDIFF is small). In addition a minimum of eight electronic steps is done between each
ionic update, so that the electronic groundstate is always calculated with very high accuracy. NELMIN=8 is only required
for systems with extreme charge sloshing which are very hard to converge electronically. For most systems values between
NELMIN=4 and NELMIN=6 are sufficient.
6.2.6 Molecular dynamics
Please see section 9.7.
6.2.7 Making the calculations faster
Use the following lines in the INCAR file to improve the efficiency of VASP for large systems:
ALGO = Fast
LREAL = A
NSIM = 4
NGX, NGY, NGZ control the number of grid points in the FFT mesh in the direction of the three lattice vectors. X corresponds
to the first, Y to the second and Z to the third lattice vector (X, Y and Z are not connected with cartesian coordinates, don't be
fooled by the historical naming conventions).
NGXF, NGYF, NGZF control the number of grid points for a second, finer FFT mesh. On this mesh the localized augmentation
charges are represented, if ultrasoft (US) Vanderbilt potentials or the PAW method are used. In addition, local potentials
(exchange-correlation, Hartree potential and ionic potentials) are also calculated on this second finer FFT mesh if (and only
if) US-pseudopotentials are used.
Mind: There is no need to set NGXF to a value larger than NGX if you do not use US-pseudopotentials or the PAW method.
In this case, neither the charge density nor the local potentials are set on the fine mesh. The only result is a considerable waste
of storage. In this case set NGXF, NGYF, NGZF simply to 1.
In VASP.4.X all parameters are determined during runtime; either defaults are used (see Sec. 6.10 or 5.22) or NGX etc. are
read from the INCAR file (see Sec. 6.3).
6.4
NBANDS-tag
NBANDS determines the actual number of bands in the calculation. To get additional information on how to set NBANDS please
also read section 8.1.
6.5
NBLK-tag
This determines the blocking factor in many BLAS level 3 routines.
In some cases, VASP has to perform a unitary transformation of the current wave functions. This is done using a work array
CBLOCK and the following FORTRAN code:
DO 100 IBLOCK=0,NPL-1,NBLK
ILEN=MIN(NBLK,NPL-IBLOCK)
DO 200 N1=1,N
DO 200 M=1,ILEN
CBLOCK(M,N1)=C(M+IBLOCK,N1)
C(M+IBLOCK,N1)=0
200 CONTINUE
C
ZGEMM is the matrix-matrix multiplication command of the BLAS package. The task performed by this call is indicated
by the comment line written above the ZGEMM call. Generally NBLK=16 or 32 is large enough for super-scalar machines. A
large value might be necessary on vector machines for optimal performance (NBLK=128).
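The role of the blocking factor can be illustrated with a pure Python analogue of the loop structure. The ZGEMM call itself is not shown in the listing, so the matrix product used here (each block of rows of C is multiplied by a transformation matrix U) is an assumption about what the call computes; real numbers stand in for the complex coefficients.

```python
def transform_blocked(c, u, nblk):
    """Replace c (npl x n) in place by c @ u (u is n x n), nblk rows at a time."""
    npl, n = len(c), len(u)
    for iblock in range(0, npl, nblk):
        ilen = min(nblk, npl - iblock)
        # Copy the current block of rows into the work array and clear it in c.
        cblock = [c[iblock + m][:] for m in range(ilen)]
        for m in range(ilen):
            c[iblock + m] = [0.0] * n
        # Small matrix product; in the Fortran code a BLAS level 3 call does this step.
        for m in range(ilen):
            for n2 in range(n):
                c[iblock + m][n2] = sum(cblock[m][n1] * u[n1][n2] for n1 in range(n))
    return c


c = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]  # exchanges the two columns
result = transform_blocked([row[:] for row in c], swap, nblk=2)
```

The work array only ever holds nblk rows, which is why a small blocking factor keeps the extra memory low while still allowing matrix-matrix operations.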
6.6
SYSTEM-tag
SYSTEM = string
NWRITE-tag
NWRITE = 0 | 1 | 2 | 3 | 4
Default:
This flag determines how much will be written to the file OUTCAR ('verbosity flag').
0
f
f
f+l
f+l
f
f
i
i
e
e
i
i
e
e
e
e
i
i
f+l
f+l
i
i
i
i
i
i
i
i
X
i
i
i
i
X
NWRITE
ontributions to ele
troni
energy
total energy
f+l
f
i
e
X
For long MD-runs use NWRITE=0 or NWRITE=1. For short runs use NWRITE=2. NWRITE=3 might give information if something
goes wrong. NWRITE=4 is for debugging only.
ENCUT-tag
6.8
with
\[ E_{\mathrm{cut}} = \frac{\hbar^2}{2m} G_{\mathrm{cut}}^2 \]
The number of plane waves differs for each k-point, leading to a superior behaviour for e.g. energy-volume calculations. If
the volume is increased the total number of plane waves changes fairly smoothly. The criterion |G| < G_cut (i.e. same basis set
for each k-point) would lead to a very rough energy-volume curve and, generally, slower energy convergence.
Starting from version VASP 3.2 the POTCAR files contain a default ENMAX (and ENMIN) line, therefore it is in principle
not necessary to specify ENCUT in the INCAR file. For calculations with more than one species, the maximum cutoff (ENMAX
or ENMIN) value is used for the calculation (see below, Sec. 6.10). For consistency reasons we still recommend to specify the
cutoff manually in the INCAR file and keep it constant throughout a set of calculations.
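The relation between the cutoff energy and the cutoff wave vector can be evaluated directly. The sketch assumes energies in eV, wave vectors in inverse Ångström and the approximate value 3.80998 eV Å² for ħ²/2m of the electron; it also assumes that plane waves are selected by |G+k| < G_cut, which the text implies by contrasting it with the criterion |G| < G_cut. The function names are chosen for the example.

```python
import math

HBAR2_OVER_2M = 3.80998  # eV * Angstrom^2, approximate value of hbar^2 / (2 m_e)


def g_cut(encut_ev):
    """Cutoff wave vector (1/Angstrom) for a cutoff energy given in eV."""
    return math.sqrt(encut_ev / HBAR2_OVER_2M)


def in_basis(g_plus_k, encut_ev):
    """True if the kinetic energy hbar^2 |G+k|^2 / 2m is below the cutoff."""
    return HBAR2_OVER_2M * sum(c * c for c in g_plus_k) < encut_ev


g300 = g_cut(300.0)
inside = in_basis((8.0, 0.0, 0.0), 300.0)
outside = in_basis((9.0, 0.0, 0.0), 300.0)
```

For a cutoff of 300 eV the cutoff wave vector is a little below 9 inverse Ångström, so the first test vector is kept and the second is discarded.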
ENAUG-tag
6.9
PREC-tag
Default:
The settings Normal and A
urate are only available in VASP.4.5 and newer versions. The setting Single is only available
in VASP.5.1.
Changing the PREC parameter inuen
es the default for four sets of parameters (ENCUT; NGX, NGY, NGZ; NGXF, NGYF, NGZF
and ROPT), and it is also possible to obtain the same
hara
teristi
s by
hanging the
orresponding parameters in the INCAR
le (VASP.4.X) dire
tly.
The PREC flag determines the energy cutoff ENCUT if (and only if) no value is given for ENCUT in the INCAR file. For PREC=Low, ENCUT is set to the maximal ENMIN value found in the POTCAR files. For PREC=Medium and PREC=Accurate, ENCUT is set to the maximal ENMAX value found in the POTCAR file (see 5.4). Finally, for PREC=High, ENCUT is set to the maximal ENMAX value in the POTCAR file plus 30%. PREC=High guarantees that the absolute energies are converged to a few meV, and it ensures that the stress tensor is converged within a few kBar. In general, an increased energy cutoff is only required for accurate evaluation of quantities related to the stress tensor (e.g. elastic properties).
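The selection rule just described can be summarised in a few lines. This is a minimal sketch: the function name and its arguments are hypothetical, and only the PREC settings named in the paragraph above are covered.

```python
def default_encut(prec, enmax, enmin):
    """Default ENCUT (eV) used when the INCAR file gives no ENCUT.

    enmax / enmin: ENMAX and ENMIN values of all POTCAR entries.
    """
    key = prec.strip().lower()
    if key == "low":
        return max(enmin)
    if key in ("medium", "accurate"):
        return max(enmax)
    if key == "high":
        # maximal ENMAX plus 30 %
        return 1.3 * max(enmax)
    raise ValueError("PREC setting not covered by this sketch: " + prec)
```

For two species with ENMAX values of 400 eV and 280 eV, the sketch returns 400 eV for PREC=Accurate and about 520 eV for PREC=High.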
The following table summarizes how PREC determines other flags in the INCAR file:
PREC       ENCUT            NGx         NGxF         ROPT
Normal     max(ENMAX)       3/2 G_cut   2 NGx        -5E-4
Single     max(ENMAX)       3/2 G_cut   NGx          -5E-4
Accurate   max(ENMAX)       2 G_cut     2 NGx        -2.5E-4
Low        max(ENMIN)       3/2 G_cut   3 G_aug      -1E-2
Med        max(ENMAX)       3/2 G_cut   4 G_aug      -2E-3
High       max(ENMAX)*1.3   2 G_cut     16/3 G_aug   -4E-4

with $\frac{\hbar^2}{2m_e} |G_{\mathrm{cut}}|^2 = \mathrm{ENCUT}$ and $\frac{\hbar^2}{2m_e} |G_{\mathrm{aug}}|^2 = \mathrm{ENAUG}$.
PREC       ROPT
Low        ROPT=-1E-2
Med        ROPT=-2E-3
Normal     ROPT=-5E-4
Accurate   ROPT=-2.5E-4
High       ROPT=-4E-4
This behaviour can be overwritten by specifying the option ROPT in the INCAR file. For mixed atomic species we, in fact, strongly recommend using LREAL=A (see section 6.38).
We recommend using PREC=Normal for calculations in VASP.4.5 (default in VASP.5.X) and PREC=Medium for VASP.4.4.
PREC=Accurate avoids wrap around errors and uses an augmentation grid that is exactly twice as large as the coarse grid for the representation of the pseudo wavefunctions. PREC=Accurate increases the memory requirements somewhat, but it should be used if very accurate forces (phonons and second derivatives) are required. The accuracy of forces can be further improved by specifying ADDGRID = .TRUE. (see Sec. 6.55).
New manual entry for PREC=High:
The use of PREC=High is no longer recommended (it exists only for compatibility reasons). For an accurate stress tensor the energy cutoff should be increased manually, and if very accurate forces are required in addition, PREC=Accurate can be used in combination with an increased energy cutoff. Note that we now recommend always specifying the energy cutoff manually in the INCAR file, to avoid incompatibilities between calculations (see Sec. 6.2.3).
Old manual entry for PREC=High:
PREC=High should be used if properties like the stress tensor are evaluated. If PREC=High calculations are too expensive, ENMAX can also be increased manually in the INCAR file, since this is usually sufficient to obtain a reliable stress tensor.
6.11
ISPIN-tag
ISPIN = 1 or 2
default: ISPIN = 1
For ISPIN=1 non spin polarized calculations are performed, whereas for ISPIN=2 spin polarized calculations are performed.
6.12
MAGMOM-tag
Default
NIONS*1
Specifies the initial magnetic moment for each atom, if and only if ICHARG is equal to 2, or if the CHGCAR file contains no magnetisation density (ICHARG=1). If one is searching for a spin polarised (magnetic or antiferromagnetic) solution, it is usually safest to start from larger local magnetic moments, and in some cases the default values might not be sufficiently big. A safe default is usually the experimental magnetic moment multiplied by 1.2 or 1.5. It is important to emphasize that the MAGMOM tag is used only if the CHGCAR file holds no information on the magnetisation density, and if the initial charge density is not calculated from the wavefunctions supplied in the WAVECAR file. This means that the MAGMOM tag is useful for two kinds of calculations:
Calculations starting from scratch with no WAVECAR and CHGCAR file.
Calculations starting from a non magnetic WAVECAR and CHGCAR file (ICHARG = 1). Often such calculations converge more reliably to the desired magnetic configuration than calculations of the first kind. Hence, if you have problems converging to a desired magnetic solution, try to calculate the non magnetic groundstate first, and continue from the generated WAVECAR and CHGCAR file. For the continuation job, you need to set
ISPIN=2
ICHARG=1
.00000 .00000
1.00000 .00000
.00000 1.00000
.00000
.50000
.00000
.50000
With the MAGMOM line specified above, VASP should converge to the proper groundstate. In this example, the total net magnetisation is in fact zero, but it is possible to determine the local magnetic moments by using the RWIGS or LORBIT tags (see sections 6.33 and 6.32).
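The rule of thumb for initial moments (experimental moment times 1.2 to 1.5) is easy to script when preparing an INCAR file. The helper below is a hypothetical illustration; the function name, the two-decimal formatting and the example moments are assumptions, not part of VASP.

```python
def magmom_line(experimental_moments, factor=1.2):
    """Build a MAGMOM line with one scaled initial moment per ion.

    experimental_moments: one value per ion, in the order of the POSCAR file.
    factor: scaling of the experimental moment (1.2 to 1.5 per the text).
    """
    scaled = [m * factor for m in experimental_moments]
    return "MAGMOM = " + " ".join("{:.2f}".format(v) for v in scaled)
```

For example, `magmom_line([2.0, 0.0], factor=1.5)` gives the line `MAGMOM = 3.00 0.00`.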
6.13
ISTART-tag
ISTART = 0 | 1 | 2
Default:
if WAVECAR exists   1
else                0
1 'restart with constant energy cut-off': Continuation job, read wave functions from the file WAVECAR (usage is restricted in the parallel version, see section 4.5).
The set of plane waves will be redefined and re-padded according to the new cell size/shape (POSCAR) and the new plane wave cut-off (INCAR). These values might differ from the old values, which are stored in the file WAVECAR. If the file WAVECAR is missing or if the file WAVECAR contains an inappropriate number of bands and/or k-points, the flag ISTART will be set to 0 (see above). In this case VASP starts from scratch and initializes the wave functions according to the flag INIWAV.
The usage of ISTART=1 is recommended if the size/shape of the supercell (see section 7.6) or the cut-off energy changed with respect to the last run and if one wishes to redefine the set of plane waves according to a new setting. ISTART=1 is the usual setting for convergence tests with respect to the cut-off energy and for all jobs where the volume/cell-shape varies (e.g. to calculate binding energy curves looping over a set of volumes).
Mind: main.f can be recompiled with new settings for NGX,NGY,NGZ,NPLWV ... between different runs; the program will correctly repad and reorganize the 'storage layout' for the wavefunction arrays etc. In addition it is also possible to change the k-point mesh if the number of k-points remains constant. This might be of importance if a loop over a set of k-points (band-structure calculations) is performed.
2 'restart with constant basis set': Continuation job, read wave functions from the file WAVECAR.
The set of plane waves will not be changed even if the cut-off energy or the cell size/shape given in the files INCAR and POSCAR are different from the values stored in the file WAVECAR. If the file WAVECAR is missing or if the file WAVECAR contains an inappropriate number of bands and/or k-points, the flag ISTART will be set to 0 (see above). In this case VASP starts from scratch and initializes the wave functions according to the flag INIWAV. If the cell shape has not changed then ISTART=1 and ISTART=2 lead to the same result.
ISTART=2 is usually used if one wishes to restart with the same basis set used in the previous run.
Mind: Due to Pulay stresses (section 7.6) there is a difference between evaluating the equilibrium volume with a constant basis set and a constant energy cut-off unless absolute convergence with respect to the basis set is achieved!
If you are looking for the equilibrium volume, calculations with a constant energy cut-off are preferable to calculations with a constant basis set; therefore always restart with ISTART=1 except if you really know what you are looking for (see section 7.6).
There is only one exception to this general rule: All volume/cell shape relaxation algorithms implemented in VASP work with a constant basis set, so continuing such jobs requires setting ISTART=2 to get a 'consistent restart' with respect to the previous runs (see section 7.6)!
3 'full restart including wave function and charge prediction'
Same as ISTART=2 but in addition a valid file TMPCAR must exist, containing the positions and wave functions at time steps t(N-1) and t(N-2), which are needed for the wavefunction and charge prediction scheme (used for MD-runs).
ISTART=3 is generally not recommended unless an operating system imposes serious restrictions on the CPU time per job: If you continue with ISTART=1 or 2, a relatively large number of electronic iterations might be necessary to converge the wave functions in the second and third MD-steps. ISTART=3 therefore saves time and is important if an MD-run is split into very small pieces (NSW<10). Nevertheless, we have found that it is safer to restart the wavefunction prediction after 100 to 200 steps. If NSW>30, ISTART=1 or 2 is strongly recommended.
Mind: If ISTART=3, a non-existing WAVECAR or TMPCAR file or any inconsistency of input data will immediately stop execution.
6.14
ICHARG-tag
ICHARG = 0 | 1 | 2
Default:
if ISTART=0   2
else          0
1 Read the charge density from the file CHGCAR, and extrapolate from the old positions (in CHGCAR) to the new positions using a linear combination of atomic charge densities. In the PAW method, there is however one important point to keep in mind. For the on-site densities (that is the densities within the PAW sphere) only l-decomposed charge densities up to LMAXMIX are written. Upon restart the energies might therefore differ slightly from the fully converged energies. The discrepancies can be large for the L(S)DA+U method. In this case, one might need to increase LMAXMIX to 4 (d-elements) or even 6 (f-elements) (see Section 6.55).
2 Take superposition of atomic charge densities
4 VASP.5.1: read potential from the file POT. The local potential in the file POT is written by the optimized effective potential methods (OEP), if the flag LVTOT=.TRUE. is supplied in the INCAR file.
+10 non-selfconsistent calculation
Adding ten to the value of ICHARG (e.g. using 11, 12 or the less convenient value 10) means that the charge density will be kept constant during the whole electronic minimization.
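The "+10" convention can be read as a two-part encoding of the ICHARG value. The sketch below is illustrative only; the function name and the returned tuple are assumptions for this example, not VASP internals.

```python
def decode_icharg(icharg):
    """Split ICHARG into (base mode, charge density kept constant?).

    Values of 10 and above denote the base mode plus ten, i.e. a
    non-selfconsistent run with a fixed charge density.
    """
    fixed = icharg >= 10
    base = icharg - 10 if fixed else icharg
    return base, fixed
```

With this reading, ICHARG=11 decodes to base mode 1 (charge density read from CHGCAR) with the density held fixed, and ICHARG=12 to base mode 2 (superposition of atomic charge densities) with the density held fixed.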
There are several reasons to use this flag:
ICHARG=11: To obtain the eigenvalues (for band structure plots) or the DOS for a given charge density read from CHGCAR. The selfconsistent CHGCAR file must be determined beforehand by a fully selfconsistent calculation with a k-point grid spanning the entire Brillouin zone (see 9.3).
ICHARG=12: Non-selfconsistent calculations for a superposition of atomic charge densities. This is in the spirit of the non-selfconsistent Harris-Foulkes functional. The stress and the forces calculated by VASP are correct, and it is absolutely possible to perform an ab-initio MD for the non-selfconsistent Harris-Foulkes functional (see section 7.3).
6.15
INIWAV-tag
INIWAV = 0 | 1
Default :
This flag is only used for start jobs (ISTART=0) and has no meaning otherwise. It specifies how to set up the initial wave functions:
0 Take 'jellium wave functions'; this simply means: fill the wavefunction arrays with plane waves of lowest kinetic energy = lowest eigenvectors for a constant potential ('jellium').
1 Fill the wavefunction arrays with random numbers. Use whenever possible.
Mind: This is definitely the safest, foolproof switch, and unless you really know that another initialization works as well, use this switch.
6.16
NELM = integer; NELMIN = integer; NELMDL = integer
Default
NELM   = 60
NELMIN = 2
NELMDL = -5
NELMDL = -12
NELMDL = 0
NELM gives the maximum number of electronic SC (selfconsistency) steps which may be performed. Normally, there is no need to change the default value: if the self-consistency loop does not converge within 40 steps, it will probably not converge at all. In this case you should reconsider the tags IALGO, LDIAG, and the mixing-parameters.
NELMIN gives the minimum number of electronic SC steps. Generally you do not need to change this setting. In some cases (for instance MD's, or ionic relaxation) you might set NELMIN to a larger value (4 to 8) (see sections 9.9, 9.7).
NELMDL gives the number of non-selfconsistent steps at the beginning. If one initializes the wave functions randomly, the initial wave functions are far from anything reasonable. The resulting charge density is also 'nonsense'. Therefore it makes sense to keep the initial Hamiltonian, which corresponds to the superposition of atomic charge densities, fixed during the first few steps.
Choosing a 'delay' for starting the charge density update becomes essential in all cases where the SC-convergence is very bad (e.g. surfaces or molecules/clusters/chains). Without setting a delay VASP will probably not converge, or at least the convergence speed is slowed down.
NELMDL might be positive or negative. A positive number means that a delay is applied after each ionic movement (in general not a convenient option). A negative value results in a delay only for the start-configuration.
6.17
EDIFF-tag
EDIFF
Default : $10^{-4}$
Specifies the global break condition for the electronic SC-loop. The relaxation of the electronic degrees of freedom will be stopped if the total (free) energy change and the band structure energy change ('change of eigenvalues') between two steps are both smaller than EDIFF. For EDIFF=0, NELM electronic SC-steps will always be performed.
Mind: In most cases the convergence speed is exponential. So if you want the total energy significant to 4 figures, set EDIFF to $10^{-4}$. There is no real reason to use a much smaller number.
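The break condition for the electronic loop can be sketched as a simple test on two energy changes. The function name and the list-of-pairs input are assumptions made for illustration; the sketch ignores NELMIN and NELMDL.

```python
def scf_steps(changes, ediff, nelm):
    """Return the number of electronic steps performed.

    changes: per-step pairs (total energy change, band structure energy change).
    The loop stops once both changes are smaller than EDIFF in magnitude,
    or after NELM steps.
    """
    performed = 0
    for d_total, d_band in changes[:nelm]:
        performed += 1
        if abs(d_total) < ediff and abs(d_band) < ediff:
            break
    return performed
```

Because no magnitude is smaller than zero, EDIFF=0 never triggers the break, so all NELM steps are performed, in line with the statement above.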
6.18
EDIFFG-tag
Default :
EDIFF*10
EDIFFG defines the break condition for the ionic relaxation loop. If the change in the total (free) energy between two ionic steps is smaller than EDIFFG, the relaxation will be stopped. If EDIFFG is negative it has a different meaning: in this case the relaxation will stop if all forces are smaller than |EDIFFG|. This is usually a more convenient setting. EDIFFG might be 0; in this case the ionic relaxation is stopped after NSW steps. EDIFFG does not apply to MD simulations.
6.19
NSW-tag
Default :
NBLOCK = integer; KBLOCK = integer
Default
NBLOCK =
KBLOCK = NSW
After NBLOCK ionic steps the pair correlation function and the DOS are calculated and the ionic configuration will be written to the XDATCAR file. In addition NBLOCK controls how often the kinetic energy is scaled if SMASS=-1 (see section 6.29).
Mind: The CPU costs for these tasks are quite small, so use NBLOCK=1.
After KBLOCK*NBLOCK main loops the averaged pair correlation function and DOS are written to the files PCDAT and DOSCAR.
6.21
IBRION-tag, NFREE-tag
IBRION = -1 | 0 | 1 | 2 | 3 | 5 | 6 | 7 | 8
Default
for NSW=0 or NSW=1   -1
else                 0
IBRION determines how the ions are updated and moved. For IBRION=0, a molecular dynamics run is performed, whereas all other algorithms are destined for relaxations into a local energy minimum. For difficult relaxation problems it is recommended to use the conjugate gradient algorithm (IBRION=2), which presently possesses the most reliable backup routines. Damped molecular dynamics (IBRION=3) is often useful when starting from very bad initial guesses. Close to the local minimum the RMM-DIIS (IBRION=1) is usually the best choice. IBRION=5 and IBRION=6 use finite differences to determine the second derivatives (Hessian matrix and phonon frequencies), whereas IBRION=7 and IBRION=8 use density functional perturbation theory to calculate the derivatives.
6.21.1
IBRION=-1
No update; ions are not moved, but NSW outer loops are performed. In each outer loop the electronic degrees of freedom are re-optimized (for NSW>0 this obviously does not make much sense, except for test purposes). If no ionic update is required use NSW=0 instead.
6.21.2
IBRION=0
Standard ab-initio molecular dynamics. A Verlet algorithm (or fourth order predictor corrector if VASP was linked with stepprecor.o) is used to integrate Newton's equations of motion. POTIM supplies the timestep in femtoseconds. The parameter SMASS allows additional control (see Sec. 6.29).
Mind: At the moment only constant volume MD's are possible.
6.21.3
IBRION=1
For IBRION=1, a quasi-Newton (variable metric) algorithm is used to relax the ions into their instantaneous groundstate. The forces and the stress tensor are used to determine the search directions for finding the equilibrium positions (the total energy is not taken into account). This algorithm is very fast and efficient close to local minima, but fails badly if the initial positions are a bad guess (use IBRION=2 in that case). Since the algorithm builds up an approximation of the Hessian matrix it requires very accurate forces, otherwise it will fail to converge. An efficient way to achieve this is to set NELMIN to a value between 4 and 8 (for simple bulk materials 4 is usually adequate, whereas 8 might be required for complex surfaces where the charge density converges very slowly). This forces a minimum of 4 to 8 electronic steps between each ionic step, and guarantees that the forces are well converged at each step.
The implemented algorithm is called RMM-DIIS [26]. It implicitly calculates an approximation of the inverse Hessian matrix by taking into account information from previous iterations. On startup, the initial Hessian matrix is diagonal and equal to POTIM. Information from old steps (which can lead to linear dependencies) is automatically removed from the iteration history, if required. The number of vectors kept in the iteration history (which corresponds to the rank of the Hessian matrix) must not exceed the degrees of freedom. Naively the number of degrees of freedom is 3*(NIONS-1). But symmetry arguments or constraints can reduce this number significantly. There are two algorithms built in to remove information from the iteration history. i) If NFREE is set in the INCAR file, only up to NFREE ionic steps are kept in the iteration history (the rank of the approximate Hessian matrix is not larger than NFREE). ii) If NFREE is not specified, the criterion whether information is removed from the iteration history is based on the eigenvalue spectrum of the inverse Hessian matrix: if one eigenvalue of the inverse Hessian matrix is larger than 8, information from previous steps is discarded. For complex problems NFREE can usually be set to a rather large value (i.e. 10-20); however, systems of low dimensionality require a careful setting of NFREE (or preferably an exact counting of the number of degrees of freedom). Increasing NFREE beyond 20 rarely improves convergence. If NFREE is set too large, the RMM-DIIS algorithm might diverge.
The choice of a reasonable POTIM is also important and can speed up calculations significantly. We recommend finding an optimal POTIM using IBRION=2 or performing a few test calculations (see below).
6.21.4
IBRION=2
A conjugate-gradient algorithm (a simple discussion of this algorithm can be found for instance in [28]) is used to relax the ions into their instantaneous groundstate. In the first step ions (and cell shape) are changed along the direction of the
steepest descent (i.e. the direction of the calculated forces and stress tensor). The conjugate gradient method requires a line minimization, which is performed in several steps: i) first a trial step into the search direction (scaled gradients) is done, with the length of the trial step controlled by the POTIM parameter (section 6.22). Then the energy and the forces are recalculated. ii) The approximate minimum of the total energy is calculated from a cubic (or quadratic) interpolation taking into account the change of the total energy and the change of the forces (3 pieces of information), then a corrector step to the approximate minimum is performed. iii) After the corrector step the forces and energy are recalculated and it is checked whether the forces contain a significant component parallel to the previous search direction. If this is the case, the line minimization is improved by further corrector steps using a variant of Brent's algorithm [28].
To summarize: In the first ionic step the forces are calculated for the initial configuration read from POSCAR, the second step is a trial (or predictor) step, the third step is a corrector step. If the line minimization is sufficiently accurate in this step, the next trial step is performed.
NSTEP:
1
initial positions
trial step
trial step
...
6.21.5
IBRION=3
If a damping factor $\mu$ is supplied in the INCAR file by means of the SMASS tag, a damped second order equation of motion is used for the update of the ionic degrees of freedom:
$$\ddot{\vec{x}} = -2\alpha\vec{F} - \mu\dot{\vec{x}},$$
$$\vec{v}_{N+1/2} = \left((1-\mu/2)\,\vec{v}_{N-1/2} - 2\alpha\vec{F}_N\right)/(1+\mu/2), \qquad \vec{x}_{N+1} = \vec{x}_N + \vec{v}_{N+1/2}.$$
$\mu = 2$ is equivalent to a simple steepest descent algorithm (of course without line optimization). $\mu = 2$ corresponds to maximal damping, $\mu = 0$ corresponds to no damping. The optimal damping factor depends on the Hessian matrix (matrix of the second derivatives of the energy with respect to the atomic positions). A reasonable first guess for $\mu$ is usually 0.4. Mind that our implementation is particularly user-friendly, since changing $\mu$ does not require re-adjusting the time step (POTIM). To choose an optimal time step and damping factor, we recommend the following two step procedure: First fix $\mu$ (for instance to 1) and adjust POTIM. POTIM should be chosen as large as possible without [...]. Second, adjust $\mu$ and keep POTIM fixed. If POTIM and SMASS are chosen correctly, the damped molecular dynamics mode usually outperforms the conjugate gradient method by a factor of two.
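The damped update described above can be tried on a one-dimensional toy problem. The discretised form used here, v(N+1/2) = ((1 - mu/2) v(N-1/2) - 2 alpha g(N)) / (1 + mu/2) and x(N+1) = x(N) + v(N+1/2), is an assumed reading of the equation of motion, with g taken as the energy gradient so that the step goes downhill. The function name and the harmonic test potential are illustrative assumptions, not the VASP implementation.

```python
def damped_md(gradient, x0, alpha, mu, steps):
    """Relax a 1D coordinate with a damped second order update.

    gradient: callable returning dE/dx at position x.
    alpha:    step scaling (the role played by POTIM in the text).
    mu:       damping factor (the role played by SMASS in the text).
    """
    x, v = x0, 0.0
    for _ in range(steps):
        v = ((1.0 - mu / 2.0) * v - 2.0 * alpha * gradient(x)) / (1.0 + mu / 2.0)
        x = x + v
    return x
```

For mu = 2 the old velocity drops out and every step reduces to x - alpha * gradient(x), which is the steepest descent limit mentioned in the text; smaller mu keeps part of the previous velocity.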
If SMASS is not set in the INCAR file (respectively SMASS<0), a velocity quench algorithm is used. In this case ions are updated using the following algorithm: [...] Here $\vec{F}$ are the current forces, and $\alpha$ corresponds to POTIM. This equation implies that, if the forces are antiparallel to the velocities, the velocities are quenched to zero. Otherwise the velocities are made parallel to the present forces, and they are increased by an amount that is proportional to the forces.
Mind: For IBRION=3, a time step must be supplied by the POTIM parameter. Too large time steps will result in divergence, too small ones will slow down the convergence. The stable time step is usually twice the smallest line [...]
6.21.6
IBRION=5 is only supported starting from VASP.4.5. IBRION=6 is only supported starting from VASP.5.1. Both flags allow one to determine the Hessian matrix (matrix of the second derivatives of the energy with respect to the atomic positions) and the vibrational frequencies of a system. Only zone centered ($\Gamma$-point) frequencies are calculated automatically and printed after [...]. Finite differences are used, i.e. each ion is displaced along each Cartesian coordinate, and from the forces the Hessian matrix is determined. The two modes differ in the way symmetry is considered. For IBRION=5, all atoms are displaced in all three Cartesian directions, resulting in a significant computational effort even for
moderately sized high symmetry systems. For IBRION=6, however, only symmetry inequivalent displacements are considered, and the remainder of the Hessian matrix is filled using symmetry considerations.
Selective dynamics are presently only supported for IBRION=5; in this case, only those components of the Hessian matrix are calculated for which the selective dynamics tags are set to .TRUE. Contrary to the conventional behavior, the selective dynamics tags now refer to the Cartesian components of the Hessian matrix. For the following POSCAR file, for instance,
Cubic BN
3.57
0.0 0.5 0.5
0.5 0.0 0.5
0.5 0.5 0.0
1 1
selective
Direct
0.00 0.00 0.00 F F F
0.25 0.25 0.25 T F F
atom 2 is displaced in the x-direction only, and only the x component of the second atom of the Hessian matrix is calculated.
Three parameters influence the determination of the Hessian matrix. The parameter NFREE determines how many displacements are used for each direction and ion, and POTIM determines the step size. The step size is defaulted to 0.015 Å, if too large values are supplied in the input file. Experience shows that this is a very reasonable compromise. NFREE=2 uses central differences, i.e. each ion is displaced in each direction by a small positive and negative displacement
...
For NFREE=1, only a single displacement is applied (it is strongly recommended to avoid NFREE=1).
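The central-difference scheme (one positive and one negative displacement per direction, as for NFREE=2) can be sketched for a generic force routine. The function name, the list-based interface and the toy quadratic energy mentioned afterwards are assumptions for illustration; the default step mirrors the 0.015 Å quoted above.

```python
def hessian_central(forces, x, step=0.015):
    """Finite-difference Hessian from forces.

    forces: callable mapping a list of coordinates to a list of forces.
    Uses H[i][j] = -(F_i(x + h e_j) - F_i(x - h e_j)) / (2 h),
    since the force is minus the energy gradient.
    """
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus, minus = list(x), list(x)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = forces(plus), forces(minus)
        for i in range(n):
            hess[i][j] = -(f_plus[i] - f_minus[i]) / (2.0 * step)
    return hess
```

For the toy energy E = x0^2 + x0*x1 + 2*x1^2 the forces are linear in the coordinates, so the sketch recovers the exact Hessian [[2, 1], [1, 4]] up to rounding.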
Finally, IBRION=6 and ISIF≥3 allows one to calculate the elastic constants. The elastic tensor is determined by performing six finite distortions of the lattice and deriving the elastic constants from the strain-stress relationship [4]. The elastic tensor is calculated both for rigid ions and allowing for relaxation of the ions. The elastic moduli for rigid ions are written after the line
SYMMETRIZED ELASTIC MODULI (kBar)
The ionic contributions are determined by inverting the ionic Hessian matrix and multiplying with the internal strain tensor [5], and the corresponding contributions are written after the lines:
ELASTIC MODULI CONTR FROM IONIC RELAXATION (kBar)
The final elastic moduli, including both the contributions for distortions with rigid ions and the contributions from the ionic relaxations, are summarized at the very end.
TOTAL ELASTIC MODULI (kBar)
There are a few caveats to this approach: most notably the plane wave cutoff needs to be sufficiently large to converge the stress tensor. This is usually only achieved if the default cutoff is increased by roughly 30 %, but it is strongly recommended to increase the cutoff systematically (e.g. in steps of 15 %), until full convergence is achieved.
Mind: In some older versions, NSW (number of ionic steps) must be set to 1 in the INCAR file, since NSW=0 resets the IBRION tag to -1 regardless of the value supplied in the INCAR file.
A final problem concerns the symmetry treatment in VASP.4.6. VASP determines the symmetry for the displaced configurations correctly, but unfortunately VASP does not change the set of k-points automatically (often the lower symmetry of configurations with displaced ions would require one to use more k-points). Hence, for accurate calculations, the symmetry must be switched off, or a k-point set which has not been reduced using symmetry considerations must be applied. VASP.5.1 changes the k-point set on the fly and the previous restriction does not apply.
6.21.7
IBRION=7 and IBRION=8 are only supported starting from VASP.5.1. They determine the Hessian matrix (matrix of second derivatives) using density functional perturbation theory. As for IBRION=5, IBRION=7 does not apply symmetry, whereas IBRION=8 uses symmetry to reduce the number of displacements. The output is similar to the previous case, although, with the exception of the ionic relaxation contributions to the elastic moduli, elastic moduli are presently not determined. Born effective charges and piezoelectric constants can be calculated by specifying LEPSILON=.TRUE. (see also Sec. 6.64.6).
6.21.8
For IBRION=1,2 and 3, the flag ISIF (see section 6.23) determines whether the ions and/or the cell shape is changed. No update of the cell shape is supported for molecular dynamics (IBRION=0).
Within all relaxation algorithms (IBRION=1,2 and 3) the parameter POTIM should be supplied in the INCAR file. For IBRION>0, the forces are scaled internally before calling the minimization routine. Therefore for relaxations, POTIM has no physical meaning and serves only as a scaling factor. For many systems, the optimal POTIM is around 0.5. Because the Quasi-Newton algorithm and the damped algorithms are sensitive to the choice of this parameter, use IBRION=2 if you are not sure how large the optimal POTIM is.
In this case, the OUTCAR file and stdout will contain a line indicating a reliable POTIM. For IBRION=2, the following lines will be written to stdout after each corrector step (usually each odd step):
trial: gam= .00000 g(F)=
(trialstep = .82)
The quantity gam is the conjugation parameter to the previous step; g(F) and g(S) are the norm of the force and the norm of the stress tensor, respectively. The quantity ort is an indicator of whether this search direction is orthogonal to the last search direction (for an optimal step this quantity should be much smaller than g(F) + g(S)). The quantity trialstep is the size of the current trial step. This value is the average step size leading to a line minimization in the previous ionic step. An optimal POTIM can be determined by multiplying the current POTIM with the quantity trialstep.
At the end of a trial step, the following lines are written to stdout:
trial-energy change: -1.153185 1.order -1.133 -1.527 -.739
step: 1.7275(harm= 2.0557) dis= .12277
next Energy= -1341.57 (dE= -.142E+01)
The quantity trial-energy change is the change of the energy in the trial step. The first value after 1.order is the expected energy change calculated from the forces ((F(start) + F(trial))/2 × change of positions). The second and third values correspond to F(start) × change of positions and F(trial) × change of positions. The first value in the second line is the size of the step leading to a line minimization along the current search direction. It is calculated from a third order interpolation formula using data from the start and trial step (forces and energy change). harm is the optimal step using a second order (or harmonic) interpolation. Only information on the forces is used for the harmonic interpolation. Close to the minimum both values should be similar. dis is the maximum distance moved by the ions in fractional (direct) coordinates. next Energy gives an indication of how large the next energy should be (i.e. the energy at the minimum of the line minimization); dE is the estimated energy change.
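A harmonic (second order) step estimate that uses only force information, as described for harm, amounts to finding where a linearly interpolated force vanishes. The sketch below shows that idea together with the POTIM rescaling rule quoted earlier; the function names are hypothetical and the code is not claimed to reproduce the numbers VASP prints.

```python
def harmonic_step(f_start, f_trial, trial_step):
    """Step length at which the linearly interpolated force vanishes.

    f_start, f_trial: force components along the search direction at the
    start point and after the trial step (same sign convention for both).
    """
    return trial_step * f_start / (f_start - f_trial)

def suggested_potim(potim, trialstep):
    """Rule quoted in the text: multiply the current POTIM by trialstep."""
    return potim * trialstep
```

For an exactly quadratic energy along the line the estimate lands on the true minimum; away from the harmonic regime it differs from the third order estimate, which is why the two printed values only agree close to the minimum.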
The OUTCAR file will contain the following lines at the end of each trial step:
trial-energy change: -1.148928 1.order -1.126 -1.518 -.735
(g-gl).g = .152E+01 g.g = .152E+01 gl.gl = .000E+00
g(Force) = .152E+01 g(Stress)= .000E+00 ortho = .000E+00
gamma = .00000
opt step = 1.72745 (harmonic = 2.05575) max dist = .12277085
next E = -1341.577507 (d E = 1.42496)
The line trial-energy change was already discussed. g(Force) corresponds to g(F), g(Stress) to g(S), ortho to ort, gamma to gam. The values after gamma correspond to the second line (step: ...) previously described.
6.22
POTIM-tag
Default
if IBRION=0 (MD)               no default, user must supply this value
if IBRION=1,2,3 (relaxation)   0.5
For IBRION=1,2 or 3, POTIM serves as a scaling constant for the forces.
POTIM supplies the time step for an ab-initio molecular dynamics run (IBRION=0), and must be entered by the user for all MD simulations.
In addition POTIM serves as a scaling constant in all minimization algorithms (quasi-Newton, conjugate gradient, and damped molecular dynamics). The Quasi-Newton algorithm especially is sensitive to the choice of this parameter (see section IBRION 6.21).
6.23
ISIF-tag
ISIF = 0 | 1 | 2 | 3 | 4 | 5 | 6
Default
if IBRION=0 (MD)   0
else               2
ISIF controls whether the stress tensor is calculated. The calculation of the stress tensor is relatively time-consuming, and is therefore switched off by default for ab initio MD's. Forces are always calculated.
In addition ISIF determines which degrees of freedom (ions, cell volume, cell shape) are allowed to change.
The following table shows the meaning of ISIF. At the moment cell changes are only supported for relaxations and not for molecular dynamics simulations.
ISIF   calculate   calculate       relax   change       change
       force       stress tensor   ions    cell shape   cell volume
0      yes         no              yes     no           no
1      yes         trace only      yes     no           no
2      yes         yes             yes     no           no
3      yes         yes             yes     yes          yes
4      yes         yes             yes     yes          no
5      yes         yes             no      yes          no
6      yes         yes             no      yes          yes
7      yes         yes             no      no           yes
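The ISIF table lends itself to a lookup structure when generating or checking input files. The dictionary below transcribes the table; the names `ISIF_TABLE` and `relaxed_degrees_of_freedom` are hypothetical helpers for this example.

```python
# ISIF -> (stress tensor, relax ions, change cell shape, change cell volume)
ISIF_TABLE = {
    0: ("no", True, False, False),
    1: ("trace only", True, False, False),
    2: ("yes", True, False, False),
    3: ("yes", True, True, True),
    4: ("yes", True, True, False),
    5: ("yes", False, True, False),
    6: ("yes", False, True, True),
    7: ("yes", False, False, True),
}

def relaxed_degrees_of_freedom(isif):
    """List the degrees of freedom that are allowed to change for this ISIF."""
    _, ions, shape, volume = ISIF_TABLE[isif]
    names = ("ions", "cell shape", "cell volume")
    return [name for name, on in zip(names, (ions, shape, volume)) if on]
```

For instance, `relaxed_degrees_of_freedom(3)` lists all three degrees of freedom, while `relaxed_degrees_of_freedom(7)` lists only the cell volume.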
Trace only means that only the total pressure, i.e. the line
external pressure = ... kB
is correct. The individual components of the stress tensor are not reliable in that case. This switch must be used with caution.
Mind: Before you perform relaxations in which the volume or the cell shape is allowed to change you must read and understand section 7.6. In general volume changes should be done only with a slightly increased energy cutoff (i.e. ENCUT=1.3 * default value, or PREC=High in VASP.4.4).
6.24
PSTRESS-tag
If the PSTRESS tag is specified VASP will add this stress to the stress tensor, and an energy
$$E = V \cdot \mathrm{PSTRESS}$$
to the energy. This allows the user to converge to a specified external pressure. Before using this flag please read section 7.6.
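The size of the added energy term E = V * PSTRESS is easy to estimate by hand. The sketch assumes that PSTRESS is given in kB (kbar) and the volume in cubic Angstrom, and uses the conversion 1 eV/Å^3 = 1602.176634 kbar; the function name is hypothetical.

```python
# 1 eV per cubic Angstrom expressed in kbar.
KBAR_PER_EV_PER_A3 = 1602.176634

def pstress_energy_ev(volume_a3, pstress_kbar):
    """Energy term V * PSTRESS in eV (volume in A^3, pressure in kbar)."""
    return volume_a3 * pstress_kbar / KBAR_PER_EV_PER_A3
```

Under these assumptions a cell of 100 Å^3 at 100 kbar picks up roughly 6.2 eV, which gives a feeling for how strongly the term enters the total energy.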
6.25
IWAVPR-tag
IWAVPR = 0 | 1 | 2 | 3
Default
if IBRION=0 (MD)             2
if IBRION=1,2 (relaxation)   1
else (static calculation)    0
IWAVPR determines how wave functions and/or charge density are extrapolated from one ionic configuration to the next configuration. Usually the file TMPCAR is used to store old wavefunctions, which are required for the prediction. If IWAVPR is larger than 10, the prediction is done without an external file TMPCAR (i.e. all required arrays are stored in main memory,
this option works from version VASP.4.1). If IWAVPR is set to 10, the reader will set it to the following default values:
if IBRION=0 (MD)             12
if IBRION=1,2 (relaxation)   11
0 No extrapolation; usually less preferable if you want to do an ab initio MD or a relaxation of the ions into the instantaneous groundstate.
1,11 Simple extrapolation of the charge density using atomic charge densities is done (eq. (9.8) in thesis G. Kresse). This switch is convenient for all kinds of geometry optimizations (ionic relaxation and volume/cell shape with conjugate gradient or Quasi-Newton methods, i.e. IBRION=1,2).
2,12 A second order extrapolation for the wave functions and the charge density is done (equation 9.9 in thesis G. Kresse). A must for ab-initio MD-runs.
3,13 In this case a second order extrapolation for the wave functions, and a simple extrapolation of the charge density using atomic charge densities is done. This is some kind of mixture between IWAVPR=1 and 2, but it is definitely not better than IWAVPR=2.
Mind: We don't encourage this setting at all.
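The default rules for IWAVPR described above can be summarized in a few lines. This is only a sketch of the rules as stated in this section, not VASP's actual reader code; all function names are hypothetical, and the behaviour for IWAVPR=10 combined with an IBRION value other than 0, 1 or 2 is not specified in the text, so the value is simply left unchanged here.

```python
def default_iwavpr(ibrion: int) -> int:
    """Default IWAVPR as tabulated above: 2 for MD, 1 for relaxation, else 0."""
    if ibrion == 0:
        return 2
    if ibrion in (1, 2):
        return 1
    return 0


def resolve_iwavpr(iwavpr: int, ibrion: int) -> int:
    """Replace IWAVPR=10 by the in-memory default for the given IBRION."""
    if iwavpr != 10:
        return iwavpr
    if ibrion == 0:
        return 12
    if ibrion in (1, 2):
        return 11
    return iwavpr  # assumption: no rule is given in the text for this case


def keeps_arrays_in_memory(iwavpr: int) -> bool:
    """IWAVPR larger than 10 means no external TMPCAR file is used."""
    return iwavpr > 10
```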
6.26
ISYM = 0 | 1 | 2
Default
Switch symmetry on (1 or 2) or off (0). For ISYM=2 a more efficient, memory conserving symmetrisation of the charge density is used. This reduces memory requirements, in particular for the parallel version. ISYM=2 is the default if PAW data sets are used. ISYM=1 is the default if VASP runs with US-PP's.
The program automatically determines the point group symmetry and the space group according to the POSCAR file and the line MAGMOM in the INCAR file. The SYMPREC-tag (VASP.4.4.4 and newer versions only) determines how accurate the positions in the POSCAR file must be. The default is $10^{-5}$, which is usually sufficiently large even if the POSCAR file has been generated with a single precision program. Increasing the SYMPREC tag means that the positions in the POSCAR file can be less accurate. During the symmetry analysis, VASP determines
- the Bravais lattice type of the supercell,
- the point group symmetry and the space group of the supercell with basis (static and dynamic), and prints the names,
- the type of the generating elementary (primitive) cell if the supercell is a non-primitive cell,
- all 'trivial non-trivial' translations (= trivial translations of the generating elementary cell within the supercell),
- the symmetry-irreducible set of k-points if automatic k-mesh generation was used, and additionally the symmetry-irreducible set of tetrahedra if the tetrahedron method was chosen together with the automatic k-mesh generation, and of course also the corresponding weights ('symmetry degeneracy'),
(VASP.4.4.4 and newer releases only; if you use an older version please also see section 6.12).
Why is a symmetrisation necessary: Within LDA the symmetry of the supercell and the charge density are always the same. This symmetry is broken, because a symmetry-irreducible set of k-points is used for the calculation. To restore the correct charge density and the correct forces it is necessary to symmetrise these quantities.
It must be stressed that VASP does not determine the symmetry elements of the primitive cell. If the supercell has a lower symmetry than the primitive cell, only the lower symmetry of the supercell is used in the calculation. In this case one should not expect that forces that should be zero according to symmetry will be precisely zero in actual calculations. The symmetry of the primitive cell is in fact broken in several places in VASP:
local potential:
In reciprocal space, the potential V(G) should be zero if G is not a reciprocal lattice vector of the primitive cell. For PREC=Med, this is not guaranteed due to aliasing or wrap around, and the charge density (and therefore the Hartree potential) might violate this point. But even for PREC=High, small errors are introduced, because the exchange correlation potential $V_{xc}$ is calculated in real space.
k-points:
In most cases, the automatic k-point grid does not have the symmetry of the primitive cell.
6.27
LCORR-tag
Default
.TRUE.
Based on the ideas of the Harris-Foulkes functional (see section 7.3), it is possible to derive a correction to the forces for calculations that are not fully self-consistent; we call these corrections Harris corrections. For LCORR=.T. these corrections are calculated and included in the stress tensor and the forces. The contributions are explicitly written to the file OUTCAR and help to show how well forces and stress are converged. For surfaces the correction term might be relatively large, and testing has shown that the corrected forces converge much faster to the exact forces than uncorrected forces.
6.28
TEBEG, TEEND-tag
Default:
TEBEG = 0
TEEND = TEBEG
TEBEG and TEEND control the temperature during an ab-initio molecular dynamics run (see next section). If no initial velocities are supplied in the POSCAR file, the velocities are set randomly according to a Maxwell-Boltzmann distribution at the initial temperature TEBEG. Velocities are only used for molecular dynamics (IBRION=0).
Mind that VASP defines the temperature as
$$ T = \frac{1}{3 k_B N_{\mathrm{ions}}} \sum_n M_n |\vec{v}_n|^2 . \tag{6.1} $$
But, because the center of mass is conserved, there are only $3(N_{\mathrm{ions}} - 1)$ degrees of freedom (the sum of all velocities is zero if a random initialization is chosen). This means that the real simulation temperature is
$$ T = \mathrm{TEBEG} \times \frac{N_{\mathrm{ions}}}{N_{\mathrm{ions}} - 1} . \tag{6.2} $$
Also the temperature written by VASP (see e.g. the OUTCAR file) is incorrect and has to be corrected accordingly. Usually the effect is rather small and subtle, but one should correct the error if very precise results are required. This means that a lower temperature should be specified according to
$$ \mathrm{TEBEG} = T_{\mathrm{soll}} \times \frac{N_{\mathrm{ions}} - 1}{N_{\mathrm{ions}}} , \tag{6.3} $$
where $T_{\mathrm{soll}}$ is the desired simulation temperature.
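The correction in equations (6.2) and (6.3) is easy to apply by hand, but a small helper avoids mistakes when preparing many runs. The following is a minimal sketch; the function names are hypothetical and not part of VASP.

```python
def real_temperature(tebeg: float, n_ions: int) -> float:
    """Eq. (6.2): real simulation temperature for a given TEBEG."""
    if n_ions < 2:
        raise ValueError("at least two ions are needed")
    return tebeg * n_ions / (n_ions - 1)


def tebeg_for_target(t_target: float, n_ions: int) -> float:
    """Eq. (6.3): TEBEG to specify so that the real temperature is t_target."""
    if n_ions < 2:
        raise ValueError("at least two ions are needed")
    return t_target * (n_ions - 1) / n_ions


if __name__ == "__main__":
    # For 64 ions and a desired temperature of 1000 K:
    print(tebeg_for_target(1000.0, 64))
```

For 64 ions the correction is about 1.6 %, which confirms that the effect is small but not always negligible.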
6.29
SMASS-tag
SMASS = -3 | -2 | -1 | 0 | Nose-mass
where NSTEP is the current step (starting from 1). This allows a continuous increase or decrease of the kinetic energy. In the intermediate period a micro-canonical ensemble is simulated.
>=0 For SMASS>=0 a canonical ensemble is simulated using the algorithm of Nose. The Nose mass controls the frequency of the temperature oscillations during the simulation (see [1, 2, 3]). For SMASS=0, a Nose-mass corresponding to a period of 40 time steps will be chosen. The Nose-mass should be set so that the induced temperature fluctuations show approximately the same frequencies as the typical 'phonon'-frequencies for the specific system. For liquids something like 'phonon'-frequencies might be obtained from the spectrum of the velocity auto-correlation function. If the ionic frequencies differ by an order of magnitude from the frequencies of the induced temperature fluctuations, Nose thermostat and ionic movement might decouple, leading to a non-canonical ensemble. The frequency of the approximate temperature fluctuations induced by the Nose-thermostat is written to the OUTCAR file.
6.30
NPACO, APACO
Default
NPACO = 256
APACO = 16
VASP evaluates the pair-correlation (PC) function each NBLOCK steps and writes the PC-function after NBLOCK*KBLOCK steps to the file PCDAT.
6.31
POMASS, ZVAL
Default
POMASS = values read from POTCAR
ZVAL = values read from POTCAR
These two lines determine the valency and the atomic mass of each atomic species, and should usually be omitted since the values are read from the POTCAR file. If incompatibilities exist, VASP will stop.
6.32
RWIGS
The Wigner-Seitz radius is optional. It must be supplied for each species in the POSCAR file, i.e.
RWIGS = 1.0 1.5
for a system with 2 species (types of atoms). If the RWIGS values are supplied, the spd- and site projected wave function character of each band is evaluated, and the local partial DOS is calculated (see sections 5.16 and 5.15). For a mono-atomic system RWIGS can be defined unambiguously. The sum of the volumes of the spheres around each atom should be the same as the total volume of the cell (assuming that you do not have a vacuum region within your cell). This is in the spirit of atomic sphere calculations. VASP writes a line
Volume of Typ 1: 98.5 %
to the OUTCAR file. You should use a RWIGS value which yields a volume of approximately 100%.
For a binary system there is no unambiguous way to define RWIGS and several choices are possible. In all cases, the sum of the volumes of the spheres should be close to the total volume of the cell (i.e. the sum of the values given by VASP should be around 100%).
One possible choice is to set RWIGS so that the overlap between the spheres is minimized.
However, in most cases it is simpler to choose the radius of each sphere so that it is close to the covalent radius as tabulated in most periodic tables. This simple criterion can be used in most cases, and it relies at least on some physical intuition.
Please keep in mind that results are qualitative, i.e. there is no unambiguous way to determine the location of an electron. With the current implementation, it is for instance hardly possible to determine charge transfer. What can be derived from the partial DOS is the typical character of a peak in a DOS. Quantitative results can be obtained only by careful comparison with a reference system (e.g. bulk versus surface).
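The volume criterion for RWIGS can be checked before running VASP. The sketch below computes the percentage of the cell volume covered by the spheres of each species, and the radius that gives exactly 100 % for a mono-atomic system. It assumes radii in Angstrom and the cell volume in cubic Angstrom; the function names are hypothetical, and the numbers printed by VASP itself may be defined slightly differently.

```python
import math


def sphere_volume_percent(rwigs, counts, cell_volume):
    """Percentage of the cell volume covered by the spheres of each species.

    rwigs: one radius per species; counts: number of atoms per species.
    """
    return [
        100.0 * n * (4.0 / 3.0) * math.pi * r ** 3 / cell_volume
        for r, n in zip(rwigs, counts)
    ]


def rwigs_for_full_volume(cell_volume, n_atoms):
    """Radius for which n_atoms equal spheres fill exactly the cell volume."""
    return (3.0 * cell_volume / (4.0 * math.pi * n_atoms)) ** (1.0 / 3.0)


if __name__ == "__main__":
    radius = rwigs_for_full_volume(64.0, 4)
    print(radius, sum(sphere_volume_percent([radius], [4], 64.0)))
```

For a binary system the same function can be used to verify that the percentages of both species add up to roughly 100 %.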
6.33
LORBIT
Available from VASP version 3.2 onwards. In VASP.3.2 LORBIT can be either .TRUE. or .FALSE. In VASP.4.X LORBIT can also take integer values:
logical: .FALSE. | .TRUE.
integer    files written
0          DOSCAR and PROCAR file
1          DOSCAR and extended PROCAR file
2          DOSCAR and PROOUT file
10         DOSCAR and PROCAR file
11         DOSCAR and PROCAR file with phase factors
12         not supported
VASP.4.6 behaviour:
integer    files written
0          DOSCAR and PROCAR file
1          DOSCAR and lm decomposed PROCAR file
2          DOSCAR and lm decomposed PROCAR file + phase factors
5          PROOUT file
10         DOSCAR and PROCAR file
11         DOSCAR and lm decomposed PROCAR file
12         DOSCAR and lm decomposed PROCAR file + phase factors
for the PAW method (LORBIT=10,11,12, see below). If the LORBIT flag is not equal to zero, the site and l-projected density of states is also calculated.
The PROOUT file (LORBIT=2) contains the projection of the wavefunctions onto spherical harmonics centered at the position of the ions ($P^{N}_{lmn\mathbf{k}} \equiv \langle Y^{N}_{lm} | \phi_{n\mathbf{k}} \rangle$) and the corresponding augmentation part. This information can be used to construct e.g. the partial DOS projected onto molecular orbitals or the so-called coop (crystal overlap population function). Mind that in VASP.4.5 (and later releases) two PROOUT files are generated, one for spin up (PROOUT.1) and one for spin down (PROOUT.2). For a non spin polarised calculation only PROOUT.1 is generated.
If the projector augmented wave method is used, LORBIT can also be set to 10, 11 or 12. This alternative setting selects a quick method for the determination of the spd- and site projected wave function character and does not require the specification of a Wigner-Seitz radius in the INCAR file (the RWIGS line is neglected in this case). The method works only for PAW POTCAR files and not for ultrasoft or norm-conserving pseudopotentials.
The parallel version has some restrictions: The site projected DOS is not evaluated in the parallel version in the following cases:
VASP.4.5, NPAR≠1                   no site projected DOS
VASP.4.6, NPAR≠1, LORBIT=0-5       no site projected DOS
6.34
NELECT
NELECT
Usually you should not set this line; the number of electrons is determined automatically.
If the number of electrons is not compatible with the number derived from the valence and the number of atoms, a homogeneous background-charge is assumed.
If the number of ions specified in the POSCAR file is 0 and NELECT=n, then the energy of a homogeneous LDA-electron gas is calculated.
6.35
NUPDOWN
NUPDOWN = difference between number of electrons in up and down spin component
Allows calculations for a specific spin multiplet, i.e. the difference of the number of electrons in the up and down spin component will be kept fixed to the specified value. A word of caution is required: If NUPDOWN is set in the INCAR file, the initial moment for the charge density should be the same. Otherwise convergence can slow down. When starting from atomic charge densities (ICHARG=2), VASP will try to do this automatically by setting MAGMOM to NUPDOWN/NIONS. The user can of course overwrite this default by specifying a different MAGMOM (which should still result in the correct total moment). If one starts from the wavefunctions, the initial moment will always be correct, because VASP will push the required number of electrons from the down to the up component. If starting from a charge density supplied in the CHGCAR file (ICHARG=1), the initial moment is usually incorrect!
If no value is set (or NUPDOWN=-1) a full relaxation will be performed. This is also the default.
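The automatic choice MAGMOM = NUPDOWN/NIONS, and the consistency check a user-supplied MAGMOM should pass, can be sketched as follows. The function names and the tolerance are hypothetical; this only illustrates the rule stated above and is not VASP code.

```python
def default_magmom(nupdown: float, nions: int) -> list:
    """Initial moment per ion as described above: NUPDOWN / NIONS for every ion."""
    if nions < 1:
        raise ValueError("NIONS must be positive")
    return [nupdown / nions] * nions


def magmom_matches_nupdown(magmom, nupdown: float, tol: float = 1e-6) -> bool:
    """Check that a user-supplied MAGMOM still yields the required total moment."""
    return abs(sum(magmom) - nupdown) <= tol


if __name__ == "__main__":
    print(default_magmom(2.0, 4))
    print(magmom_matches_nupdown([2.0, 0.0, 0.0, 0.0], 2.0))
```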
6.36
defaults: EMIN and EMAX default to sensible values given by the minimum and maximum band energies. NEDOS defaults to NEDOS=300.
The first two tags determine the energy-range in eV for which the DOS is calculated. VASP evaluates the DOS each NBLOCK steps and writes the DOS after NBLOCK*KBLOCK steps to the file DOSCAR. If you are not sure where the region of interest lies, set EMIN to a value larger than EMAX.
6.37
ISMEAR = -5 | -4 | -3 | -2 | 0 | N
SIGMA = width of the smearing in eV
Default
ISMEAR = 1
SIGMA = 0.2
ISMEAR determines how the partial occupancies $f_{n\mathbf{k}}$ are set for each wavefunction. For the finite temperature LDA, SIGMA determines the width of the smearing in eV.
ISMEAR:
-1 Fermi-smearing
0 Gaussian smearing
1..N method of Methfessel-Paxton order N.
Mind: For the Methfessel-Paxton scheme the partial occupancies can be negative.
-2 partial occupancies are read in from WAVECAR (or INCAR), and kept fixed throughout the run. There should be a tag
FERWE = f1 f2 f3 ....
in the INCAR file supplying the partial occupancies for all bands and k-points. The band-index runs fastest. The partial occupancies must be between 0 and 1 (for spin-polarized and non-spin-polarized calculations).
Mind: Partial occupancies are also written to the OUTCAR file, but in this case they are multiplied by 2, i.e. they are between 0 and 2.
-3 perform a loop over smearing-parameters supplied in the INCAR file. In this case a tag
SMEARINGS= ismear1 sigma1 ismear2 sigma2 ...
must be present in the INCAR file, supplying different smearing parameters. IBRION is set to -1 and NSW to the number of supplied values. The first loop is done using the tetrahedron method with Blöchl corrections.
-4 tetrahedron method without Blöchl corrections
-5 tetrahedron method with Blöchl corrections
For the calculation of the total energy in bulk materials we recommend the tetrahedron method with Blöchl corrections (ISMEAR=-5). This method also gives a good account of the electronic density of states (DOS). The only drawback is that the method is not variational with respect to the partial occupancies. Therefore the calculated forces and the stress tensor can be wrong by up to 5 to 10 % for metals. For the calculation of phonon frequencies based on forces we recommend the method of Methfessel-Paxton (ISMEAR>0). For semiconductors and insulators the forces are correct, because partial occupancies do not vary and are zero or one.
The method of Methfessel-Paxton (MP) also results in a very accurate description of the total energy; nevertheless the width of the smearing (SIGMA) must be chosen carefully (see also 7.4). Too large smearing-parameters might result in a wrong total energy, small smearing parameters require a large k-point mesh. SIGMA should be as large as possible while keeping the difference between the free energy and the total energy (i.e. the term 'entropy T*S') in the OUTCAR file negligible (1 meV/atom). In most cases N = 1 and N = 2 lead to very similar results. The method of MP is also the method of choice for large supercells, since the tetrahedron method is not applicable if less than three k-points are used.
Mind: Avoid using ISMEAR>0 for semiconductors and insulators, since this often leads to incorrect results (the occupancies of some states might be larger or smaller than 1). For insulators use ISMEAR=0 or ISMEAR=-5.
The Gaussian smearing (GS) method also leads to reasonable results in most cases. Within this method it is necessary to extrapolate from finite SIGMA results to SIGMA=0 results. You can find an extra line in the OUTCAR file 'energy(SIGMA -> 0)' giving the extrapolated results. Large SIGMA values lead to a similar error as the MP scheme, but in contrast to the MP scheme one cannot determine how large the error due to the smearing is without systematically reducing SIGMA. Therefore the method of MP is more convenient than the GS method. In addition, in the GS method forces and the stress tensor are consistent with the free energy and not the energy for SIGMA -> 0. Overall the Methfessel-Paxton method is easier to use for metallic systems.
For further considerations on the choice of the smearing method see sections 7.4, 8.6. To summarize, use the following guidelines:
- For semiconductors or insulators use the tetrahedron method (ISMEAR=-5); if the cell is too large (or if you use only a single or two k-points) use ISMEAR=0 in combination with a small SIGMA=0.05.
- For relaxations in metals always use ISMEAR=1 or ISMEAR=2 and an appropriate SIGMA value (the entropy term should be less than 1 meV per atom). Mind: Avoid using ISMEAR>0 for semiconductors and insulators, since it might cause problems.
- For metals a sensible value is usually SIGMA=0.2 (which is the default).
- For the calculation of the DOS and very accurate total energy calculations (no relaxation in metals) use the tetrahedron method (ISMEAR=-5).
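The guidelines above can be condensed into a small decision helper. This is only a sketch of the rules of thumb given in this section; the function name and its arguments are hypothetical, and the returned SIGMA for metals is merely the default value, which still has to be checked against the entropy term in the OUTCAR file.

```python
def suggest_smearing(metal: bool, relaxing: bool = False, n_kpoints=None) -> dict:
    """Suggest ISMEAR/SIGMA following the guidelines of this section.

    n_kpoints: number of k-points, if known. The tetrahedron method is not
    applicable with fewer than three k-points.
    """
    few_kpoints = n_kpoints is not None and n_kpoints < 3
    if not metal:
        if few_kpoints:
            # Large cell, one or two k-points: Gaussian smearing, small width.
            return {"ISMEAR": 0, "SIGMA": 0.05}
        return {"ISMEAR": -5}
    if relaxing or few_kpoints:
        # Forces are needed (or tetrahedra are unavailable): Methfessel-Paxton.
        return {"ISMEAR": 1, "SIGMA": 0.2}
    # DOS or accurate static total energy in a metal: tetrahedron method.
    return {"ISMEAR": -5}


if __name__ == "__main__":
    print(suggest_smearing(metal=True, relaxing=True))
    print(suggest_smearing(metal=False, n_kpoints=1))
```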
6.38
.FALSE.
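Determines whether the projection operators are evaluated in real-space or in reciprocal space: The non-local part of the pseudopotential requires the evaluation of an expression $\sum_{ij} D_{ij} |\beta_j\rangle\langle\beta_i|\phi_{n\mathbf{k}}\rangle$. The projected wavefunction character is defined as:
$$
\begin{aligned}
C_{in\mathbf{k}} = \langle\beta_i|\phi_{n\mathbf{k}}\rangle
 &= \frac{\Omega}{N_{\mathrm{FFT}}} \sum_{\mathbf{r}} \langle\beta_i|\mathbf{r}\rangle\langle\mathbf{r}|\phi_{n\mathbf{k}}\rangle
  = \frac{\Omega}{N_{\mathrm{FFT}}} \sum_{\mathbf{r}} \beta_i(\mathbf{r})\,\phi_{n\mathbf{k}}(\mathbf{r}) \\
 &= \sum_{\mathbf{G}} \langle\beta_i|\mathbf{k}+\mathbf{G}\rangle\langle\mathbf{k}+\mathbf{G}|\phi_{n\mathbf{k}}\rangle
  = \sum_{\mathbf{G}} \beta_i(\mathbf{k}+\mathbf{G})\, C_{\mathbf{G}n\mathbf{k}}
\end{aligned}
$$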
This expression can be evaluated in reciprocal or real space: In reciprocal space (second line) the number of operations scales with the size of the basis set (i.e. number of plane-waves). In real space (first line) the projection-operators are confined to spheres around each atom. Therefore the number of operations necessary to evaluate one $C_{in\mathbf{k}}$ does not increase with the system size (usually the number of grid points within the cut-off-sphere is between 500 and 2000). One of the major obstacles of the method working in real space is that the projection operators must be optimized, i.e. all high frequency components must be removed from the projection operators. If this is not done, 'aliasing' can happen (i.e. the high frequency components of the projection operators are aliased to low frequency components and a random noise is introduced).
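That the real-space and the reciprocal-space evaluation give the same number can be illustrated with a one-dimensional toy model. The sketch below is not VASP code: it uses a plain discrete Fourier transform on a small periodic grid, sets the cell volume to 1, and all names are hypothetical. On such a grid both sums agree to rounding error because every function is exactly representable by the plane waves of the grid; the aliasing problem described above arises only when a projector contains components beyond the plane-wave cutoff.

```python
import cmath
import math


def dft_coefficients(values):
    """Plane-wave coefficients F(G) = (1/N) * sum_r f(r) * exp(-2*pi*i*G*r/N)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * g * r / n) for r, v in enumerate(values)) / n
        for g in range(n)
    ]


def overlap_real_space(beta, phi):
    """(1/N) * sum_r conj(beta(r)) * phi(r), i.e. the 'first line' with volume 1."""
    n = len(beta)
    return sum(complex(b).conjugate() * p for b, p in zip(beta, phi)) / n


def overlap_reciprocal_space(beta, phi):
    """sum_G conj(beta(G)) * phi(G), i.e. the 'second line'."""
    beta_g = dft_coefficients(beta)
    phi_g = dft_coefficients(phi)
    return sum(b.conjugate() * p for b, p in zip(beta_g, phi_g))


if __name__ == "__main__":
    n = 8
    # A localized 'projector' and a wave-like 'wavefunction' on 8 grid points.
    beta = [math.exp(-((r - 4) ** 2)) for r in range(n)]
    phi = [cmath.exp(2j * math.pi * r / n) + 0.5 for r in range(n)]
    print(overlap_real_space(beta, phi))
    print(overlap_reciprocal_space(beta, phi))
```

In the real-space sum only the grid points where the projector is non-zero contribute, which is the reason why the cost per projection does not grow with the system size.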
Currently VASP supports three different schemes to remove the high frequency components from the projectors. LREAL=.TRUE. is the simplest one. If LREAL=.TRUE. is selected, the real space projectors which have been generated by the pseudopotential generation code are used. This requires no user interference. For LREAL=On the real space projectors are optimized by VASP using an algorithm proposed by King-Smith et al. [47]. For LREAL=Auto a new scheme [48] is used which is considerably better (resulting in more localized projector functions) than the King-Smith et al. method. To fine tune the optimization procedure the flag ROPT can be used if LREAL=Auto or LREAL=On is used.
We recommend using the real-space projection scheme for systems containing more than 20 atoms. We also recommend using only LREAL=Auto (for version VASP.4.4 and newer releases) and LREAL=On (for all other versions). Version 4.4 also supports the old mode LREAL=O to allow calculations that are fully compatible with VASP.4.3 (and VASP.3.2). The best performance is generally achieved with LREAL=Auto, but if performance is not that important you can also use LREAL=.TRUE., which generally requires less user interference. You can skip the rest of the paragraph if you use only LREAL=.TRUE.
For LREAL=O and LREAL=A the projection operators are optimized by VASP on the fly (i.e. on startup). Several flags influence the optimization:
- ENCUT (i.e. the energy cutoff): components beyond the energy cutoff are 'removed' from the projection operators.
- The PREC tag specifies how precise the real space projectors should be, and sets the variable ROPT accordingly to the following values:
For LREAL=On
PREC= Low
PREC= Med
PREC= High
For LREAL=Auto
PREC= Low     accuracy $10^{-2}$ (ROPT=0.01)
PREC= Med     accuracy $2 \times 10^{-3}$ (ROPT=0.002)
PREC= High    accuracy $2 \times 10^{-4}$ (ROPT=2E-4)
will set the number of real space points within the cutoff sphere for the first species to approximately 700, and that for the second species to 1500. In VASP.4.4 the precision of the operators can alternatively be specified by writing, e.g.,
ROPT = 1E-3 1E-3
In that case the real space operators will be optimized for an accuracy of approximately 1 meV/atom ($10^{-3}$). The precision mode works both for LREAL=On and LREAL=Auto (but to maintain compatibility with older VASP versions it is only selected if LREAL=Auto is specified in the INCAR file). The precision mode is generally switched on if the value for ROPT is smaller than 0.1. The precision mode and the conventional mode can be intermixed, i.e. it is possible to specify
ROPT = 0.7 1E-3
In that case the number of real space points within the cutoff sphere for the first species will be approximately 700, whereas the real space projector functions for the second species are optimized for an accuracy of approximately 1 meV.
We recommend using the precision mode with a target accuracy of around $10^{-3}$ eV/atom if your version supports this.
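The distinction between the two ROPT modes can be sketched as a small parser for an INCAR line. The function name is hypothetical. The threshold of 0.1 is taken from the text; the conversion "number of points is roughly 1000 * ROPT" is only inferred from the examples above (0.7 corresponds to about 700 points) and should be treated as an assumption.

```python
def classify_ropt(line: str):
    """Classify each ROPT entry of an INCAR line such as 'ROPT = 0.7 1E-3'.

    Returns one tuple per species:
      ("precision", accuracy in eV/atom)   if the value is smaller than 0.1
      ("points", approximate grid points)  otherwise (assumed 1000 * ROPT)
    """
    _, _, rhs = line.partition("=")
    result = []
    for token in rhs.split():
        value = float(token)
        if value < 0.1:
            result.append(("precision", value))
        else:
            result.append(("points", round(value * 1000)))
    return result


if __name__ == "__main__":
    print(classify_ropt("ROPT = 0.7 1E-3"))
```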
If you use the mode in which the number of grid points in the real space projection sphere is specified, you have to select ROPT carefully, especially if a hard species is mixed with a soft species. In that case the following lines in the OUTCAR file must be checked (here is the output for LREAL=On, but the one for LREAL=Auto is quite similar):
Optimization of the real space projectors
maximal supplied Q-value = 12.85
optimization between [QCUT,QGAM] = [ 4.75, 9.51] = [ 6.33, 25.32] Ry
Optimized for a Real-space Cutoff 2.30 Angstroem
l   X(QCUT)   X(cont)   X(zero)   W(q)/X(q)   e(spline)
0   9.518     9.484     18.582    .11E-03     .16E-06
0   -2.149    -2.145    3.059     .17E-03     .25E-06
1   8.957     8.942     9.950     .14E-03     .34E-06
1   1.870     1.870     1.837     .95E-03     .51E-06
2   3.874     3.866     4.764     .15E-03     .68E-07
The meaning of QCUT and QGAM is explained in Sec. 11.5.6. The most important information is given in the column W(q)/X(q) (respectively the column W(low)/X(q) for LREAL=Auto). The values in these columns must be as small as possible. If these values are too large, increase the ROPT tag from the default value. As a rule of thumb the maximum allowed value in this column is $10^{-3}$ for PREC=Med. (For PREC=Low errors might be around $10^{-2}$ and for PREC=High errors should be smaller than $10^{-4}$.) If W(q)/X(q) is larger than $10^{-2}$ the errors introduced by the real space projection can be substantial. In this case ROPT must be specified in the INCAR file to avoid incorrect results. If the new precision mode is used in VASP.4.4 (ROPT<0.1) the code automatically selects the real-space cutoff so that the required precision is reached.
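The rule of thumb for the W(q)/X(q) column can be checked automatically once the values have been copied from the OUTCAR file. The following sketch only encodes the thresholds quoted above; the names are hypothetical and reading the column from the OUTCAR file is left to the user.

```python
# Maximum tolerated W(q)/X(q) per PREC setting, as quoted in the text above.
MAX_W_OVER_X = {"Low": 1e-2, "Med": 1e-3, "High": 1e-4}


def projectors_accurate_enough(w_over_x, prec: str = "Med") -> bool:
    """Return True if every W(q)/X(q) value is within the rule of thumb."""
    return max(abs(v) for v in w_over_x) <= MAX_W_OVER_X[prec]


def projection_error_substantial(w_over_x) -> bool:
    """Values above 1e-2 indicate that ROPT must be set explicitly."""
    return max(abs(v) for v in w_over_x) > 1e-2


if __name__ == "__main__":
    column = [0.11e-3, 0.17e-3, 0.14e-3, 0.95e-3, 0.15e-3]
    print(projectors_accurate_enough(column, "Med"))
    print(projectors_accurate_enough(column, "High"))
```

With the example values of the output shown above, the projectors satisfy the PREC=Med criterion but not the PREC=High one.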
A few comments for non-experts and experts: Real space optimization (LREAL=.TRUE., LREAL=On or LREAL=Auto) always results in a small (not necessarily negligible) error (the error is usually a constant energy shift for each atom). If you are interested in energy differences of a few meV, use only calculations with the same setup (i.e. same ENCUT, PREC, LREAL and ROPT setting) for all calculations. For example, if you want to calculate surface energies, recalculate the bulk groundstate energy with exactly the same setting you are going to use for the surface. Another possibility is to relax the surface with real space projection, and to do one final total energy calculation with LREAL=.FALSE. to get exact energies. Anyway, for PREC=Med, the errors introduced by the real space projection are usually of the same order of magnitude as those introduced by the wrap around errors. For PREC=High errors are usually less than 1 meV. PREC=Low should be used only for high speed MD's, if computer resources are really a problem.
A few notes for experts: There are three parameters for the real space optimization (see Sec. 11.5.6): first, the energy cutoff (equivalent to QCUT in Sec. 11.5.6); then a value which specifies from which energy cutoff the projection operator should be zero (equivalent to QGAM in Sec. 11.5.6); and finally the maximal radial extent of the real space projection operator (equivalent to RMAX in Sec. 11.5.6). The first parameter QCUT is fixed by the energy cutoff, the second one is set to QGAM=2*QCUT for PREC=Low and PREC=Med, and to QGAM=3*QCUT for PREC=High. Finally the maximal radial extent of the projector functions is determined by ROPT (respectively by PREC if ROPT is not specified in the INCAR file).
6.39
GGA-tag
Default
This tag was added to perform GGA calculations with pseudopotentials generated with conventional LDA reference configurations. The tag is named GGA. Possible options are
GGA = PW | PB | LM | 91 | PE | RP
PB    Perdew-Becke
PW    Perdew-Wang 86
LM    Langreth-Mehl-Hu
91    Perdew-Wang 91
PE    Perdew-Burke-Ernzerhof (VASP.4.5)
RP    revised Perdew-Burke-Ernzerhof (VASP.4.5)
VOSKOWN-tag
Default
0
Usually VASP uses the standard interpolation for the correlation part of the exchange correlation functional. If VOSKOWN is set to 1 the interpolation formula according to Vosko, Wilk and Nusair [49] is used. This usually enhances the magnetic moments and the magnetic energies. Because the Vosko-Wilk-Nusair interpolation is the interpolation usually applied in the context of gradient corrected functionals, it is desirable to use this interpolation whenever the PW91 functional is applied.
6.41
Mind: the calculation of the dipole requires a definition of the center of the cell, and results might differ for different positions. You should use this option only for surfaces and isolated molecules. In this case use the center of mass for the position (for surfaces only the component normal to the surface is meaningful).
The main problem is that the definition of the dipole 'destroys' the translational symmetry, i.e. the dipole is defined as
$$ \int (\mathbf{r} - \mathbf{R}_{\mathrm{center}})\, \rho_{\mathrm{ions+valence}}(\mathbf{r})\, d^3r . \tag{6.4} $$
Now this makes only sense if $\rho_{\mathrm{ions+valence}}$ drops to zero at some distance from $\mathbf{R}_{\mathrm{center}}$. If this is not the case, then the values are extremely sensitive with respect to changes in $\mathbf{R}_{\mathrm{center}}$.
6.42
ALGO-tag
Default
IALGO = 8 or 38 for VASP.4.5
LDIAG = .TRUE.
Please mind that the VASP.4.5 default is IALGO=38 (a Davidson block iteration scheme). IALGO=8 is not supported for copyright reasons in VASP.4.5, but IALGO=38 is roughly 2 times faster for large systems than IALGO=8 and at least as stable. You can also select the algorithm by setting ALGO = Normal | Fast | Very Fast in the INCAR file (see Sec. 6.42).
IALGO selects the main algorithm, and LDIAG determines whether a subspace diagonalization is performed or not. We strongly urge the users to set the algorithms via ALGO. Algorithms other than those available via ALGO are subject to instabilities.
Generally the first digit of IALGO specifies the main algorithm, the second digit controls the actual settings within the algorithm. For instance 4X will always call the same routine for the electronic minimization; the second digit X controls the details of the electronic minimization (preconditioning etc.).
Mind: All implemented algorithms will result in the same answer, i.e. they will correctly calculate the KS groundstate, if they converge. This is guaranteed because all minimization routines use the same set of subroutines to calculate the residual (correction) vector $(H - \epsilon S)|\phi\rangle$ for the current wavefunctions $\phi$, and they are considered to be converged if this correction vector becomes smaller than some specified threshold. The only difference between the algorithms is the way this correction vector is added to the trial wavefunction, and therefore the performance of the routines might be quite different.
The most extensive tests have been done for IALGO=38 (IALGO=8 before VASP.4.5). If random vectors (INIWAV=1) are used for the initialization of the wavefunctions, this algorithm always gives the correct KS groundstate. Therefore, if you have problems with
-1 Performance test.
VASP does not perform an actual calculation; only some important parts of the program will be executed and the timing for each part is printed out at the end.
5-8 Conjugate gradient algorithm (section 7.1.5)
Optimize each band iteratively using a conjugate gradient algorithm. A subspace-diagonalization is performed before the conjugate gradient algorithm. The conjugate gradient algorithm is used to optimize the eigenvalue of each band.
Sub-switches:
5 steepest descent
6 conjugated gradient
7 preconditioned steepest descent
8 preconditioned conjugated gradient
IALGO=8 is always fastest; IALGO=5-7 are only implemented for test purposes.
Please mind that IALGO=8 is not supported by VASP.4.5, since M. Teter, Corning and M. Payne hold a patent on this algorithm.
38 (ALGO=N) Kosugi algorithm (special Davidson block iteration scheme) (see section 7.1.6)
This algorithm is the default in VASP.4.6 and VASP.5. It optimizes a subset of NSIM bands simultaneously (Sec. 6.44). The optimized bands are kept orthogonal to all other bands. If problems are encountered with the algorithm, try to decrease NSIM. Such problems are encountered if linear dependencies develop in the search space, and by reducing NSIM the rank of the search space is decreased.
44-48 (ALGO=F) Residual minimization method direct inversion in the iterative subspace (RMM-DIIS, see sections 7.1.4 and 7.1.7)
The RMM-DIIS algorithm reduces the number of orthonormalization steps ($O(N^3)$) considerably and is therefore much faster than IALGO=8 and IALGO=38, at least for large systems and for workstations with a small memory band width. For optimal performance, we recommend using this switch together with LREAL=Auto (Section 6.38). The algorithm works in a blocked mode in which several bands are optimized at the same time. This can improve the performance even further on systems with a low memory band width (see 6.44, default is presently NSIM=4).
The following sub-switches exist:
44 steepest descent eigenvalue minimization
46 residuum-minimization + preconditioning
48 preconditioned residuum-minimization (ALGO=F)
IALGO=48 is usually most reliable (IALGO=44 and 46 are mainly for test purposes).
For IALGO=4X, a subspace-diagonalization is performed before the residual vector minimization, and a Gram-Schmidt orthogonalization is employed after the RMM-DIIS step. In the RMM-DIIS step, each band is optimized individually (without the orthogonality constraint); a maximum of NDAV iterative steps are performed for each band. The default for NDAV is NDAV=4, and we recommend leaving this value unchanged.
Please mind that the RMM-DIIS algorithm can fail in rare cases, whereas IALGO=38 did not fail for any system tested to date. Therefore, if you have problems with IALGO=48, try first to switch to IALGO=38.
However, in some cases the performance gains due to IALGO=48 are so significant that IALGO=38 might not be a feasible option. In the following we try to explain what to do if IALGO=48 does not work reliably:
In general two major problems can be encountered when using IALGO=48. First, the optimization of unoccupied bands might fail for molecular dynamics and relaxations. This is because our implementation of the RMM-DIIS algorithm treats unoccupied bands more sloppily than occupied bands (see section 6.46) during MD's. The problem can be solved rather easily by specifying WEIMIN=0 in the INCAR file. In that case all bands are treated accurately.
The other major problem, which also occurs for static calculations, is the initialization of the wavefunctions. Because the RMM-DIIS algorithm tends to find eigenvectors which are close to the initial set of trial vectors, there is no guarantee to converge to the correct ground state! This situation is usually very easy to recognize; whenever one eigenvector is missing in the final solution, the convergence becomes slow at the end (mind that it is possible that one state with a small fractional occupancy above the Fermi-level is missing). If you suspect that this is the case, switch to ICHARG=12 (i.e. no update of charge and Hamiltonian) and try to calculate the wavefunctions with high accuracy ($10^{-6}$). If the convergence is fairly slow or gets stuck at some precision, the RMM-DIIS algorithm has problems with the initial set of wavefunctions (as a rule of thumb not more than 12 electronic iterations should be required to determine the wavefunction for the default precision for ICHARG=12). The first thing to do in that case is to increase the number of bands (NBANDS) in the INCAR file. This is usually the simplest and most efficient fix, but it does not work in all cases. This solution is also undesirable for MD's and long relaxations because it increases the computational demand somewhat. A simple alternative which worked in all tested cases is to use IALGO=38 (Davidson) for a few non-selfconsistent iterations and then to switch to the RMM-DIIS algorithm. This setup is automatically selected when ALGO=Fast is specified in the INCAR file (IALGO must not be specified in the INCAR file in this case).
The final option is somewhat complicated and requires an understanding of how the initialization algorithm of the
RMM-DIIS algorithm works: after the random initialization of the wavefunctions, the initial wavefunctions for the
RMM-DIIS algorithm are determined during a non self-consistent steepest descent phase (the number of steepest descent
sweeps is given by NELMDL, default is NELMDL=-12 for RMM-DIIS, section 6.16). During this initial phase, in each
sweep one steepest descent step per wavefunction is performed between each subspace rotation. This automatic
simple steepest descent approach during the delay is faced with a rather ill-conditioned minimization problem and can
fail to produce reasonable trial wavefunctions for the RMM-DIIS algorithm. In this case the quantity in the column
rms will not decrease during the initial phase (12 steps), and you must improve the conditioning of the problem by
setting the ENINI parameter in the INCAR file. ENINI controls the cutoff during the initial (steepest descent) phase
for IALGO=48. Default for ENINI is ENINI=ENCUT. If convergence problems are observed, start with a slightly smaller
ENINI; reduce ENINI in steps of 20 %, till the norm of the residual vector (column rms) decreases continuously.
53-58 Treat total free energy as variational quantity and minimize the functional completely self-consistently.
This algorithm is based on an idea first proposed in Refs. [29, 30, 31]. The algorithm has been carefully optimized and
should be selected for Hartree-Fock type calculations. The present version is rather stable and robust even for metallic
systems. Important sub-switches:
53 damped MD with damping term automatically determined by the given time-step (ALGO=D)
54 damped MD (velocity quench or quickmin)
58 preconditioned conjugated gradient (ALGO=A)
Furthermore LDIAG determines whether the subspace rotation matrix (rotation matrix in the space spanned by the
occupied and unoccupied orbitals) is optimized. The current default is LDIAG=.TRUE., selecting the algorithm presented in
Ref. [32]. This allows for efficient groundstate calculations of metals and small gap semiconductors. LDIAG=.FALSE.
selects Loewdin perturbation theory for the subspace rotation matrix [14], which is much faster but generally
significantly less stable for metallic and small gap systems.
The preconditioned conjugate gradient (IALGO = 58, ALGO = A) algorithm is recommended for insulators. The best
stability is usually obtained if the number of bands equals half the number of electrons (non spin polarized case). In
this case, the algorithm is fairly robust and foolproof and might even outperform the mixing algorithm.
For small gap systems and for metals, it is however usually required (metals) or desirable (semiconductors) to use a
larger value for NBANDS. In this case, we recommend using the damped MD algorithm (IALGO = 53, ALGO = Damped)
instead of the conjugate gradient one.
The stability of the all bands simultaneously algorithms depends strongly on the setting of TIME. For the conjugate
gradient case, TIME controls the step size in the trial step, which is required in order to perform a line minimization
of the energy along the gradient (or conjugated gradient, see section 6.21 for details). Too small steps make the line
minimization less accurate, whereas too large steps can cause instabilities. The step size is usually automatically scaled
by the actual step size minimizing the total energy along the gradient (values can range from 1.0 for insulators to 0.01
for metals with a large density of states at the Fermi-level).
For the damped MD algorithm (IALGO = 53, ALGO = Damped), a sensible TIME step is even more important. In this
case TIME is not automatically adjusted, and the user is entirely responsible for choosing an appropriate value. Too small
time-steps slow the convergence significantly, whereas too large values will always lead to divergence. It is sensible to
optimize this value, in particular if many different configurations are considered for a particular system. It is
recommended to start with a small step size TIME, and to increase TIME by a factor 1.2 until the calculations diverge. The
largest stable step TIME should then be used for all calculations.
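The search for the largest stable TIME described above can be sketched as a small loop. This is only an illustration under stated assumptions: `run_is_stable` is a hypothetical user-supplied callback (for example one that launches a short test calculation with the given TIME and reports whether it converged), and the starting value and upper bound are arbitrary choices, not VASP defaults.

```python
from typing import Callable, Optional


def largest_stable_time(run_is_stable: Callable[[float], bool],
                        start: float = 0.05,
                        factor: float = 1.2,
                        upper: float = 5.0) -> Optional[float]:
    """Increase TIME by `factor` until a run diverges; return the last stable value.

    Returns None if even the starting value is unstable.
    `run_is_stable` is a hypothetical callback, not part of VASP.
    """
    time = start
    best = None
    while time <= upper and run_is_stable(time):
        best = time          # remember the last value that still converged
        time *= factor       # try a 20 % larger step next
    return best


# Toy stand-in for real calculations: pretend every TIME below 0.4 is stable.
print(largest_stable_time(lambda t: t < 0.4))
```

In practice each call to the callback is a full test run, so the loop is cheap to write but expensive to execute; the value it returns would then be reused for all configurations of the same system.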
The final algorithm IALGO = 54 also uses a damped molecular dynamics algorithm and quenches the velocities to zero
if they are antiparallel to the present forces (quick-min). It is usually not as efficient as IALGO=53, but it is also less
sensitive to the TIME parameter (for details please also read section 6.21).
Note: it is very important to set the TIME tag for these algorithms (see section 6.47).
ALL REMAINING ALGORITHMS ARE ONLY FOR EXPERTS AND THOSE THAT CAN NOT KEEP
THEIR FINGERS FROM GAMBLING
3 wavefunctions are kept fixed, perform only recalculation of band structure energy (mainly testing)
4 wavefunctions are kept fixed, perform only subspace rotation (mainly testing)
15-18 Conjugate gradient algorithm
Subspace-diagonalization after iterative refinement of the eigenvectors using the conjugate gradient algorithm. This
switch is for compatibility reasons only and should not be used any longer. Generally IALGO=5-8 is preferable, but was
not implemented previous to VAMP 1.1.
Sub-switches as above.
NSIM - tag
If NSIM is specified in VASP.4.4 and newer versions, the RMM-DIIS algorithm (IALGO=48) works in a blocked mode. In
this case, NSIM bands are optimized at the same time. This allows the use of matrix-matrix operations instead of
matrix-vector operations for the evaluation of the non local projection operators in real space, and might speed up
calculations on some machines. There should be no difference in the total energy and the convergence behavior between
NSIM=1 and NSIM>1; only the performance should improve.
6.45
Mixing-tags
IMIX     = type of mixing
AMIX     = linear mixing parameter
AMIN     = minimal mixing parameter
BMIX     = cutoff wave vector for Kerker mixing scheme
AMIX_MAG = linear mixing parameter for magnetization
BMIX_MAG = cutoff wave vector for Kerker mixing scheme for mag.
WC       = weight factor for each step in Broyden mixing scheme
INIMIX   = type of initial mixing in Broyden mixing scheme
MIXPRE   = type of preconditioning in Broyden mixing scheme
MAXMIX   = maximum number steps stored in Broyden mixer
Default (please rely on these defaults)
            US-PP    PAW
IMIX   =    4        4
AMIX   =    0.8      0.4
BMIX   =    1.0      1.0
WC     =    1000.    1000.
INIMIX =    1        1
MIXPRE =    1        1
MAXMIX =    -45      -45
MAXMIX is only available in VASP.4.4 and newer versions, and it is strongly recommended to use this option for molecular
dynamics and relaxations.
With the default setting, a Pulay mixer [26] with an initial approximation for the charge dielectric function according to
Kerker, Ref. [41],
    AMIX \min\left( \frac{G^2}{G^2 + BMIX^2}, AMIN \right)    (6.5)
is used. This is a very safe setting resulting in good convergence for most systems. In VASP.4.X for magnetic systems, the
initial setup for the mixing parameters for the magnetization density can be supplied separately in the INCAR file. The
defaults for AMIX, BMIX, AMIX_MAG and BMIX_MAG are different from non magnetic calculations:
              US-PP    PAW
AMIX     =    0.4      0.4
AMIN     =    0.1      0.1
BMIX     =    1.0      1.0
AMIX_MAG =    1.6      1.6
BMIX_MAG =    1.0      1.0
The above setting is equivalent to an (initial) spin enhancement factor of 4, which is usually a reasonable approximation.
There are only a few other parameter combinations which can be tried if convergence turns out to be very slow. In
particular, for slabs, magnetic systems and insulating systems (e.g. molecules and clusters), an initial linear mixing can
result in faster convergence than the Kerker model function. One can therefore try to use the following setting
AMIX     = 0.2
BMIX     =
AMIX_MAG = 0.8
BMIX_MAG =
In VASP.4.x the eigenvalue spectrum of the charge dielectric matrix is calculated and written to the OUTCAR file at each
electronic step. This allows a rather easy optimization of the mixing parameters, if required. Search in the OUTCAR file for
0) an optimal setting for A (AMIX) can be found easily by setting A_opt = A_current \Gamma_mean.
A or q0 (i.e. AMIX or BMIX) can be optimized, but we recommend changing only BMIX and keeping
AMIX fixed (you must decrease BMIX if the mean eigenvalue is larger than one, and increase BMIX if the mean eigenvalue is
One important option which might help to reduce the number of iterations for MDs and ionic relaxations is the option
MAXMIX, which is only available in VASP.4.4. MAXMIX specifies the maximum number of vectors stored in the Broyden/Pulay
mixer; in other words, it corresponds to the maximal rank of the approximation of the charge dielectric function built up
by the mixer.
If MAXMIX is positive, the charge density mixer is only reset if the storage capabilities are exceeded.
The reset is done smoothly by removing the five oldest vectors from the iteration history. Therefore, if MAXMIX is positive,
the approximation for the charge dielectric function which was obtained in previous ionic steps is reused in the
current ionic step, and this in turn can reduce the number of electronic steps during relaxations and MDs. Especially for
relaxations which start from a good ionic starting guess and for systems with a strong charge sloshing behavior the
speedup can be significant.
We found that for a 12 Å long box containing 16 Fe atoms the number of electronic iterations decreased from 8 to 2-3 when
MAXMIX was set to 40. For a carbon surface the number of iterations decreased from 7 to 3. At the same time the energy
stability increased significantly. But be careful: this option increases the memory requirements for the mixer
considerably, and thus the option is not recommended for systems where charge sloshing is negligible anyway (like bulk
simple metals).
MAXMIX is usually around three times the number of electronic steps required in the first iteration. Too large a value of
MAXMIX might cause the code to crash (because linear dependencies between input vectors might develop).
Please go to the next section if you are not interested in a more detailed discussion of the flags that influence the mixer.
0 no mixing (\rho_{mixed} = \rho_{out})
BMIX
G2
G2 + BMIX2
BMIX=0.0001,
(rout (G)
rin (G))
(6.6)
BMIX=0
might ause
r in (G) = 2 AMIX
G2
G2 + BMIX2
(rout (G)
AMIN
in the INCAR file. A simple velocity Verlet algorithm is used to integrate this
equation, and the discretized equation reads (the index N now refers to the electronic iteration,
the charge):
rN +1
~
2 = ((1
=2)~r N
1=2 + 2
FN )
~
=(1 +
=2)
F (G) = AMIX
G2
G2 + BMIX2
rN +1 = ~rN +1 +~r N +1
For BMIX
(rout (G)
rin (G))
0, no model for the dielectric matrix is used. It is easy to see that for = 2 a simple straight mixing
is obtained. Therefore
parameters for the
Then the eigenvalues of the charge dielectric matrix as given in the OUTCAR file must be inspected. Search for the
last occurrence of
AMIX
AMIN=
4 Broyden's 2nd method [24, 25], or Pulay's mixing method [26] (depending on the choice of WC)
AMIN is usually 0.4. AMIX depends very much on the system; for metals this parameter usually has to
AMIX=0.02.
The parameters WC, INIMIX and MIXPRE are meaningful only for the Broyden scheme:
WC determines the weight factors for each iteration
>
WC (resulting in Pulay's mixing method), up to now Pulay's scheme was always superior to
= 0 switch to Broyden's 2nd method, i.e. set the weight for the last step equal to 1000 and all other weights equal to 0.
<
Witer
= 0:01
jWCj jjrout
=
weights for the first steps and increasing weights for the last steps (not recommended; this was only implemented
during the test period).
INIMIX
determines the functional form of the initial mixing matrix (i.e.
G0
matrix might influence the convergence speed for complex situations (especially surfaces and magnetic systems), nevertheless
INIMIX must not be changed from the default setting: anything which can be done with INIMIX can also be done with AMIX
and BMIX, and changing AMIX and BMIX is definitely preferable.
Anyway, possible choices for INIMIX are:
AMIX
P(G) = 1 +
BMIX
BMIX from INCAR, for G > 0 the weights for the metric are given by
(6.7)
G2
only on the total charge density (i.e. up+down component) and not on the magnetization charge
density (i.e. up-down component). Up to now we have found that introduction of a metric always improves the convergence
speed. The best choice is therefore
6.46
These tags allow fine tuning of the iterative matrix diagonalization and should not be changed. They are optimized for a
large variety of systems, and changing one of the parameters usually decreases performance or can even screw up the
iterative matrix diagonalization totally.
WEIMIN = maximum weight for a band to be considered empty
EBREAK = absolute stopping criterion for optimization of eigenvalue
DEPER  = relative stopping criterion for optimization of eigenvalue
Defaults
WEIMIN = 0.001    for dynamic calculation IBRION >= 0
       = 0        for static calculation IBRION = -1
EBREAK = EDIFF/NBANDS/4
DEPER  = 0.3
In general, these tags control when the optimization of a single band is stopped within the iterative matrix diagonalization
schemes:
Within all implemented iterative schemes a distinction between empty and occupied bands is made to speed up
calculations. Unoccupied bands are optimized only twice, whereas occupied bands are optimized up to four times till another
break criterion is met. Eigenvalue/eigenvector pairs for which the partial occupancies are smaller than WEIMIN are treated
as unoccupied states (and are thus only optimized twice).
EBREAK determines whether a band is fully converged or not. Optimization of an eigenvalue/eigenvector pair is stopped
if the change in the eigenenergy is smaller than EBREAK.
DEPER is a relative break criterion. The optimization of a band is stopped after the energy change becomes smaller than
DEPER multiplied with the energy change in the first iterative optimization step. The maximum number of optimization steps
is always 4.
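The interplay of the three break criteria can be illustrated with a short sketch. It is not the VASP implementation: the function name `steps_taken` is hypothetical, the list of energy changes is supplied by hand, and the default `ebreak` of 1e-6 is an arbitrary stand-in for EDIFF/NBANDS/4.

```python
from typing import Sequence


def steps_taken(energy_changes: Sequence[float], occupancy: float,
                weimin: float = 0.001, ebreak: float = 1e-6,
                deper: float = 0.3) -> int:
    """Count the optimization steps one band receives under the rules in the text.

    energy_changes[i] is the eigenenergy change observed in step i+1.
    """
    # Bands with occupancy below WEIMIN count as unoccupied: at most 2 steps.
    max_steps = 2 if occupancy < weimin else 4
    changes = [abs(de) for de in energy_changes[:max_steps]]
    for step, de in enumerate(changes, start=1):
        if de < ebreak:               # absolute criterion (EBREAK)
            return step
        if de < deper * changes[0]:   # relative criterion (DEPER)
            return step
    return len(changes)


changes = [1e-2, 5e-3, 1e-3, 1e-4]
print(steps_taken(changes, occupancy=1.0))  # occupied band
print(steps_taken(changes, occupancy=0.0))  # empty band
```

With WEIMIN=0 no band has an occupancy below the threshold, so every band is allowed up to four steps, which is the effect described for the IALGO=48 troubleshooting above.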
6.47
TIME-tag
Controls the trial time step for IALGO=5X, for the initial (steepest descent) phase of IALGO=4X.
6.48
LWAVE,LCHARG
Default
LVTOT = .FALSE.
This tag determines whether the total local potential (file LOCPOT) is written. Starting from version VASP 4.4.4, VASP
also calculates the average electrostatic potential at each ion. This is done by placing a test charge with the norm 1 at
each ion and calculating
    V_n = \int V(r) \rho_{test}(|r - R_n|) d^3r
The spatial extent of the test charge is determined by ENAUG (see Sec. 6.9), so that calculations can be compared only if
ENAUG is kept fixed. The change of the core level shift \Delta c between two models can be calculated by the simple formula
    \Delta c = V_n^1 - \epsilon_{Fermi}^1 - (V_n^2 - \epsilon_{Fermi}^2),
where V_n^1 and V_n^2 are the electrostatic potentials at the core of an ion for the first and second calculations,
respectively, and \epsilon_{Fermi}^1 and \epsilon_{Fermi}^2 are the Fermi levels in these calculations. Clearly, the core
level shift is the same for all core electrons in this simple approximation. In addition, screening effects are not taken
into account.
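As a worked example of the core level shift formula, the arithmetic can be written as a one-line function. The function name and the numbers below are made up for illustration; in a real comparison the potentials and Fermi levels would be taken from the two calculations.

```python
def core_level_shift(v1: float, e_fermi1: float,
                     v2: float, e_fermi2: float) -> float:
    """Core level shift between two calculations (all quantities in eV).

    v1, v2: average electrostatic potential at the ion core in run 1 and run 2.
    e_fermi1, e_fermi2: Fermi levels of the two runs.
    """
    return (v1 - e_fermi1) - (v2 - e_fermi2)


# Hypothetical values, for illustration only.
shift = core_level_shift(v1=-80.0, e_fermi1=5.0, v2=-81.0, e_fermi2=4.5)
print(f"core level shift: {shift:.3f} eV")
```

Referencing each potential to its own Fermi level is what makes the two runs comparable; as noted in the text, both runs must also use the same ENAUG.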
6.50
LELF
VASP currently offers parallelization (and data distribution) over bands and parallelization (and data distribution) over
plane wave coefficients (see also Section 4). To get a high efficiency on massively parallel systems it is strongly
recommended to use both at the same time. The only algorithm which works with the over band distribution is the RMM-DIIS
iterative matrix diagonalization (IALGO=48). The conjugate gradient band-by-band method (IALGO=8) is only supported for
parallelization over plane wave coefficients.
NPAR tells VASP to switch on parallelization (and data distribution) over bands. NPAR=1 implies distribution over plane
wave coefficients only (IALGO=8 and IALGO=48 both work). All nodes will work on each band. We suggest using this
setting only when running on a small number of nodes.
In VASP.4.5, the default for NPAR is equal to the (total number of nodes). For NPAR=(total number of nodes), each band
will be treated by only one node. This can improve the performance for platforms with a small communication bandwidth,
however it also increases the memory requirements considerably, because the non local projector functions must be stored
in that case on each node. In addition a lot of communication is required to orthogonalize the bands. If NPAR is neither
1, nor equal to the number of nodes, the number of nodes working on one band is given by
total number of nodes / NPAR.
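The relation between NPAR and the number of nodes working on one band can be checked with a few lines of Python. The helper name is hypothetical, and the requirement that NPAR divides the node count evenly is an assumption of this sketch rather than something stated in the text.

```python
def nodes_per_band(total_nodes: int, npar: int) -> int:
    """Number of nodes that share the work on a single band.

    Assumes (for this sketch only) that NPAR divides the node count evenly.
    """
    if npar < 1 or total_nodes < 1 or total_nodes % npar != 0:
        raise ValueError("NPAR must be a positive divisor of the node count")
    return total_nodes // npar


for npar in (1, 4, 16):
    print(f"16 nodes, NPAR={npar}: {nodes_per_band(16, npar)} node(s) per band")
```

The two limiting cases match the description above: NPAR=1 puts all nodes on each band, and NPAR equal to the node count gives one node per band.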
The second switch which influences the data distribution is LPLANE. If LPLANE is set to .TRUE. in the INCAR file,
the data distribution in real space is done plane wise. Any combination of NPAR and LPLANE can be used. Generally,
LPLANE=.TRUE. reduces the communication bandwidth during the FFTs, but at the same time it unfortunately worsens
the load balancing on massively parallel machines. LPLANE=.TRUE. should only be used if NGZ is at least 3*(number of
nodes)/NPAR, and optimal load balancing is achieved if NGZ=n*NPAR, where n is an arbitrary integer. If LPLANE=.TRUE.
and if the real space projector functions (LREAL=.TRUE. or ON or AUTO) are used, it might be necessary to check the lines
following
real space projector functions
total allocation :
max/ min on nodes :
The max/min values should not differ too much, otherwise the load balancing might worsen as well.
The optimum setting of NPAR and LPLANE depends very much on the type of machine you are running on. Here are a few
guidelines:
SGI Power Challenge:
Usually one is running on a relatively small number of nodes, so that load balancing is no problem. Also the
communication bandwidth is reasonably good on SGI Power Challenge machines. Best performance is often achieved with
LPLANE = .TRUE.
NPAR = 1
NSIM = 1
Increasing NPAR usually worsens performance. For NPAR=1 we have in fact observed a superlinear scaling w.r.t. the
number of nodes in many cases. This is due to the fact that the cache on the SGI Power Challenge machines is relatively
large (4 Mbytes); if the number of nodes is increased, the real space projectors (or reciprocal projectors) can be kept in
the cache, and therefore cache misses decrease significantly.
SGI Origin: The SGI Origin behaves quite differently from the SGI Power Challenge, mainly because the memory
bandwidth is a factor of three better than on the SGI Power Challenge. The following setting seems to be optimal when
running on 4-16 nodes:
LPLANE = .TRUE.
NPAR = 4
NSIM = 4
Contrary to the SGI Power Challenge, superlinear scaling could not be observed, obviously because data locality and
cache reuse are only of minor importance on the Origin 2000.
LINUX cluster linked by 100 Mbit Ethernet: On a LINUX cluster linked by a relatively slow network, LPLANE must
be set to .TRUE., and the NPAR flag should be equal to the number of nodes:
LPLANE = .TRUE.
NPAR   = number of nodes.
LSCALU = .FALSE.
NSIM   = 4
Mind that you need at least a 100 Mbit full duplex network, with a fast switch offering at least 2 Gbit switch capacity.
T3D, T3E: On many T3D, T3E platforms one is forced to use a huge number of nodes. In that case load balancing
problems and problems with the communication bandwidth are likely to be experienced. In addition the cache is fairly
small on T3E and T3D machines, so that it is impossible to keep the real space projectors in the cache with any setting.
Therefore, we recommend setting NPAR on these machines to sqrt(number of nodes) (explicit timing can be helpful to find
the optimum value). The use of LPLANE = .TRUE. is only recommended if the number of nodes is significantly smaller
than NGX, NGY and NGZ.
In summary the following setting is recommended
LPLANE = .FALSE.
NPAR = sqrt(number of nodes)
NSIM = 1
6.52
LASYNC
If
LASYNC = .TRUE.
is set in the INCAR file, VASP will try to overlap communication with calculations. This switch is only supported in
VASP.4.5 and newer releases; its use is however not recommended, since LASYNC = .TRUE. has not been tested carefully.
Overlapping communication and calculations might improve performance a little bit, but it is also possible that the
performance drops significantly. Please try yourself, and send a brief report to Georg.Kresse@univie.ac.at.
6.53
LscaLAPACK, LscaLU
scaLAPACK will not be used by VASP.4.X. This switch is required on the T3D/T3E if VASP was compiled with scaLAPACK
and several images are run at the same time by setting IMAGES=X in the INCAR file (see next section). If scaLAPACK
is not switched off in the nudged elastic band mode on the T3D/T3E, VASP will crash.
In some cases, the LU decomposition (timing ORTHCH) based on scaLAPACK is slower than the serial LU decomposition.
Hence it is also possible to switch off the parallel LU decomposition by specifying
LSCALU = .FALSE.
in the INCAR file (the subspace rotation is still done with scaLAPACK in this case).
MIND: in the Gamma point only T3D version, the parallel subspace diagonalisation (LscaLAPACK = True) is performed
with a Jacobi algorithm instead of scaLAPACK. This routine was written by Ian Bush. The Jacobi routine is faster than
scaLAPACK.
6.54
If the elastic band method is used on the T3D, scaLAPACK has to be switched off (see 6.53).
VASP.4.X supports the elastic band method to calculate energy barriers. The INCAR, KPOINTS, and POTCAR files
must be located in the directory in which VASP is started. In addition, a set of subdirectories (numbered 00,01,02...)
must be created, and each subdirectory must contain one POSCAR file. The tag
IMAGES= number of images
(specified in the INCAR file) forces VASP to run the elastic band method. The number of nodes must be divisible by the
number of images (the NPAR switch can still be used as described above). VASP divides the nodes in groups, and each group
then works on one image. The first group of nodes reads the POSCAR file from the directory 01, the second group from 02
etc. In the elastic band method, the endpoints are kept fixed, and the positions of the end points must be supplied in the
files 00/POSCAR and XX/POSCAR, where XX is
XX=number of images+1.
All output (OUTCAR, WAVECAR, CHGCAR etc.) is written to the subdirectories. Since no nodes are executing for the
positions supplied in the directories 00 and XX, no output files will be created in these subdirectories. The usual stdout
of the images 02,03,...,number of images is redirected to the files 02/stdout, 03/stdout etc. (only image 01 writes to the
usual stdout). In addition to the IMAGES tag, a spring constant can be supplied in the SPRING tag. The default is
SPRING=-5
For SPRING=0, each image is only allowed to move into the direction perpendicular to the current hyper-tangent, which
is calculated as the normal vector between two neighboring images. This algorithm keeps the distance between the images
constant to first order. It is therefore possible to start with a dense image spacing around the saddle point to obtain a
finer resolution around this point.
The nudged elastic band method [55, 56] is applied when SPRING is set to a negative value, e.g.
SPRING=-5
This is also the recommended setting. Compared to the previous case, additional tangential springs are introduced to keep
the images equidistant during the relaxation (remember the constraint is only conserved to first order otherwise). Do not
use too large values, because this can slow down convergence. The default value usually works quite reliably.
One problem of the nudged elastic band method is that the constraint (i.e. movements only in the hyper-plane
perpendicular to the current tangent) is non linear. Therefore, the CG algorithm usually fails to converge, and we
recommend using the RMM-DIIS algorithm (IBRION=1) or the quick-min algorithm (IBRION=3). Additionally, the non-linear
constraint (equidistant images) tends to be violated significantly during the first few steps (it is only enforced to
first order). If this problem is encountered, a very low dimensionality parameter (IBRION=1, NFREE=2) should be applied in
the first few steps, or a steepest descent minimization without line optimization (IBRION=3, SMASS=2) should be used to
pre-converge the images.
If all degrees of freedom are allowed to relax (isolated molecules, no surface, etc.), make sure that the sum of all
positions is the same for each cell. In other words,
    \sum_{i=1}^{N_{ions}} \vec{R}_i^{\alpha}    (6.8)
must be equal for all images. Otherwise fake forces are introduced, and the images drift against each other (this will not
introduce problems during the VASP calculations, but it is awkward to visualize the final results). Often an initial
linearly interpolated starting guess is appropriate; this can be done with a small script called
interpolatePOS
found in vamp/scripts/. The script also removes, as an option, the center of mass motion.
Finally, we strongly recommend keeping the number of images to an absolute minimum. The fewer images are used, the
faster the convergence to the groundstate is. Often, it is advisable to start with a single image between the two
endpoints, and to increase the number of images once this first run has converged.
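A linearly interpolated starting guess and the check on the summed positions can be sketched in a few lines. This is not the interpolatePOS script: the function names are hypothetical, structures are plain lists of Cartesian coordinates, and periodic wrapping of atoms across cell boundaries is ignored.

```python
from typing import List

Structure = List[List[float]]  # one [x, y, z] entry per ion


def interpolate_images(start: Structure, end: Structure,
                       n_images: int) -> List[Structure]:
    """Return the two endpoints plus n_images linearly interpolated structures.

    Index k of the result corresponds to subdirectory 00, 01, ..., n_images+1.
    """
    images = []
    for k in range(n_images + 2):
        t = k / (n_images + 1)
        images.append([[a + t * (b - a) for a, b in zip(ra, rb)]
                       for ra, rb in zip(start, end)])
    return images


def position_sum(structure: Structure) -> List[float]:
    """Sum of all ionic positions, the quantity in equation (6.8)."""
    return [sum(atom[d] for atom in structure) for d in range(3)]


start = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
end = [[0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
for k, image in enumerate(interpolate_images(start, end, n_images=1)):
    print(f"{k:02d}", image, position_sum(image))
```

If both endpoints have the same summed position, every linearly interpolated image inherits it, so it is enough to align the two endpoints before interpolating.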
6.55
In principle, the PAW method can be used in the same manner as the US-PP method. Only special PAW POTCAR files are
required. In principle, no additional user intervention is required either. However, there are a few flags that control
the behavior of the PAW implementation. The first one is LMAXPAW:
LMAXPAW = l
This flag controls the maximum l quantum number for the evaluation of the on-site terms on the radial support grids in the
PAW method. The default for LMAXPAW is 2 l_max, where l_max is the maximum angular quantum number of the partial waves.
Useful settings for LMAXPAW are for instance:
LMAXPAW = 0
In this case, only spherical terms are evaluated on the radial grid. This does not mean that aspherical terms are totally
neglected, because the compensation charges are always expanded up to 2 l_max on the plane wave grid.
Finally, LMAXPAW=-1 has a special meaning. For LMAXPAW=-1, no on-site correction terms are evaluated on the radial
support grid, which effectively means that the behavior of US-PPs is recovered with PAW input datasets. Usually this
allows very efficient and fast calculations, and this switch might be of interest for relaxations and molecular dynamics
runs. Energies should be evaluated with the default setting for LMAXPAW.
An additional flag controls up to which l quantum number the on-site PAW charge densities are passed through the charge
density mixer:
LMAXMIX = l
The default is LMAXMIX=2. Higher l quantum numbers are usually not handled by the mixer, i.e. a straight mixing is applied
for them (the PAW on-site charge density for higher l quantum numbers is reset precisely to the value corresponding to the
present wavefunctions). Usually, it is not required to increase LMAXMIX, but the following two cases are exceptions:
L(S)DA+U calculations require in many cases an increase of LMAXMIX to 4 (or 6 for f-elements) in order to obtain fast
The CHGCAR file also contains only information up to LMAXMIX for the on-site PAW occupancy matrices. When the
CHGCAR file is read and kept fixed in the course of the calculations (ICHARG=11), the results will necessarily be
not identical to a self-consistent run. The deviations can be (or actually are) large for L(S)DA+U calculations. For
the calculation of band structures within the L(S)DA+U approach it is strictly required to increase LMAXMIX to 4 (d
The second switch that is useful in the context of the PAW method (and US-PP) is ADDGRID. The default for ADDGRID
is .FALSE. If
ADDGRID = .TRUE.
is written in the INCAR file, an additional (third) support grid is used for the evaluation of the augmentation charges.
This third grid contains 8 times more points than the fine grid NGXF, NGYF, NGZF. Whenever terms involving augmentation
charges are evaluated, this third grid is used. For instance: the augmentation charge is evaluated first in real space on
this fine grid, FFT-transformed to reciprocal space and then added to the total charge density on the grid NGXF, NGYF,
NGZF. The additional grid helps to reduce the noise in the forces significantly. In many cases, it even makes it possible
to perform calculations in which NGXF=NGX etc. This can be achieved by setting
ENAUG = 1 ; ADDGRID = .TRUE.
in the INCAR file. The flag can also be used for US-PPs, in particular to reduce the noise in the forces.
6.56
For charged cells or for calculations of molecules and surfaces with a large dipole moment, the energy converges very
slowly with respect to the size L of the supercell. Using methods discussed in Ref. [51, 52], VASP is able to correct for
the leading errors, but one should stress that in many details we have taken a more general approach than the one outlined
in Ref. [51].
The following flags control the behavior of VASP.
NELECT determines the total number of electrons in the system (see Sec. 6.34). For charged systems this value has to
be supplied by hand and a neutralizing background charge is assumed by VASP. For these systems the energy converges
very slowly with respect to the size of the supercell. The required first order energy correction is given by
    e^2 q^2 \alpha / (L \epsilon)
where q is the net charge of the system, \alpha the Madelung constant of a point charge q placed in a homogeneous
background charge -q, and \epsilon the dielectric constant of the system. For atoms or molecules surrounded by vacuum,
\epsilon takes the vacuum value \epsilon = 1. In that case VASP.4.X can correct for the leading error if the IDIPOL tag is
set (see below).
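To get a feeling for the size of the first order correction and its 1/L decay, the expression can be evaluated directly. This sketch follows the formula exactly as written in the text; the function name is hypothetical, the constant 14.3996 eV Å for e^2 (in the convention e^2/(4 pi eps_0)) is an assumption about units, and the Madelung constant must be supplied by the user in whatever convention matches the formula.

```python
E2_EV_ANGSTROM = 14.3996  # e^2/(4*pi*eps0) in eV*Angstrom (assumed unit convention)


def monopole_correction(q: float, alpha: float, length: float,
                        epsilon: float = 1.0) -> float:
    """First order energy correction e^2 q^2 alpha / (L epsilon), in eV.

    q: net charge in units of e; alpha: Madelung constant (user supplied);
    length: supercell size L in Angstrom; epsilon: dielectric constant.
    """
    return E2_EV_ANGSTROM * q ** 2 * alpha / (length * epsilon)


# Hypothetical inputs: the correction halves when L doubles.
for length in (10.0, 20.0, 40.0):
    print(length, monopole_correction(q=1.0, alpha=2.0, length=length))
```

Because the correction scales with q^2 it has the same sign for excess and missing electrons, and dividing by a bulk dielectric constant reduces it accordingly for defects embedded in a solid.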
IDIPOL tag
If set in the INCAR file, monopole/dipole and quadrupole corrections will be calculated. There are four possible settings
for IDIPOL
IDIPOL = 1-4
For 1 to 3, the dipole moment will be calculated only in the direction of the first, second or third lattice vector. The
corrections for the total energy are calculated as the energy difference between a monopole/dipole and quadrupole
in the current supercell and the same dipole placed in a supercell with the corresponding lattice vector approaching
infinity. This flag should be used for slab calculations.
For IDIPOL=4 the full dipole moment in all directions will be calculated, and the corrections to the total energy
are calculated as the energy difference between a monopole/dipole/quadrupole in the current supercell and the same
monopole/dipole/quadrupole placed in a vacuum. Use this flag for calculations on isolated molecules.
DIPOL tag
DIPOL = center of cell (in direct, fractional coordinates)
This tag determines, as in VASP.3.2, the center of the net charge distribution. The dipole is defined as
    \int (r - R_{center}) \rho_{ions+valence}(r) d^3r    (6.9)
where R_center is the position as defined by the DIPOL tag. If the flag is not set, VASP determines the points where the
charge density averaged over one plane drops to a minimum and deduces the center of the charge distribution by adding half
of the lattice vector perpendicular to the plane where the charge density has a minimum (this is a rather reliable
approach for orthorhombic cells).
LDIPOL tag
This tag switches on the potential correction mode. Due to the periodic boundary conditions, not only does the total energy converge slowly with respect to the size of the supercell, but the potential and the forces are also wrong. This effect can be counterbalanced by setting LDIPOL=.TRUE. in the INCAR file. In that case a linear and (in the case of a charged system) a quadratic electrostatic potential is added to the local potential, correcting the errors introduced by the periodic boundary conditions. This is in the spirit of Ref. [52] (but more general, and the total energy has been correctly implemented). The biggest advantage of this mode is that the leading errors in the forces are corrected, and that the work function can be evaluated for asymmetric slabs. The disadvantage is that the convergence to the electronic ground state might slow down considerably (i.e. more electronic iterations might be required to obtain the required precision). It is recommended to use this mode only after pre-converging the wavefunctions without the LDIPOL flag, and the center of charge should be set by hand (DIPOL = center of mass). The user must also make sure that the cell is sufficiently large to determine the dipole moment with good accuracy. If the cell is too small, it is usually very difficult to tell whether charge is located on the left or right side of the slab, causing very slow convergence (convergence often improves with the size of the supercell).
For the current implementation there are several restrictions; please be careful:
Charged systems: Quadrupole corrections are only correct for cubic supercells (this means that the calculated $1/L^3$ corrections are wrong for charged supercells if the supercell is not cubic). In addition, we have found empirically that for charged systems with excess electrons (NELECT > NELECT_neutral) more reliable results can be obtained if the energy after correction of the linear error ($1/L$) is plotted against $1/L^3$ to extrapolate results manually for $L \to \infty$. This is due to the uncertainties in extracting the quadrupole moment of systems with excess electrons.
Potential corrections are only possible for orthorhombic cells (at least the direction in which the potential is corrected
6.57
Similar to the case of charged atoms and molecules in a large cubic box, charged defects in semiconductors also pose the problem of potentially slow convergence of the results with respect to the supercell size, due to spurious electrostatic interaction between defects in neighboring supercells. Generally, the errors are less dramatic than for charged atoms or molecules, since the charged defect is embedded in a dielectric medium (bulk) and all spurious interactions between neighboring cells are scaled down by the bulk dielectric constant ε. Hence, the total error might remain small (order of 0.1 eV) and one need not worry too much about spurious electrostatic interactions between neighboring cells. However, there exist three critical cases where one should definitely start to worry (and to apply dipole corrections):
- semiconductors containing first-row elements, since they possess rather small lattice constants and hence the distance between two neighboring defects is smaller than in most other semiconductor materials (though one should note that the smaller lattice constant alone need not increase the errors dramatically, since the leading scaling is $1/L$; only the contributions scaling as $1/L^3$ may become dangerous for small cells),
- semiconductors with a rather small dielectric constant ε, and
- high charge states like 3+, 4+, 3- or 4-, since the spurious interactions scale (approximately) with the square of the total cell charge; e.g., for a 4+ state the error is about 16 times larger than for a 1+ state!
The worst case one can think of is that all three conditions mentioned above are fulfilled simultaneously. In this case the corrections can amount to the order of several eV (instead of the otherwise typical order of a few 0.1 eV)!
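The scaling argument above can be put into numbers with a very small sketch. The function `relative_image_error` is a hypothetical helper, not part of VASP; it only encodes the two proportionalities stated in the text (error proportional to the square of the cell charge and inversely proportional to ε) and drops every prefactor, so only ratios of its results are meaningful.

```python
def relative_image_error(charge, epsilon):
    """Relative size of the spurious image interaction.

    Encodes only the scaling stated in the text: proportional to
    charge**2 and to 1/epsilon. Prefactors (cell size, Madelung
    constant) are deliberately omitted.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return charge ** 2 / epsilon


# A 4+ state compared with a 1+ state in the same host material.
ratio_charge = relative_image_error(4, 10.0) / relative_image_error(1, 10.0)

# The same charge state in hosts with epsilon = 5 and epsilon = 10
# (both values are arbitrary illustrations).
ratio_eps = relative_image_error(2, 5.0) / relative_image_error(2, 10.0)
```

Here `ratio_charge` reproduces the factor of 16 quoted above, and `ratio_eps` shows that halving the dielectric constant doubles the error.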
In principle it is possible to apply the same procedure as in the case of charged atoms and molecules in vacuum. However, with the current implementation one has to take care of the following points, and the following restrictions apply:
Unfortunately a full correction is only possible for cubic cells; the only contribution which can always be corrected, for any arbitrary cell shape, is the monopole-monopole interaction. However, for intermediate cell sizes the quadrupole-monopole interaction is not always negligible (it can reach the order of minus 30-40% of the monopole-monopole term!). Therefore, whenever possible the use of cubic cells is recommended. Otherwise one should try to use cells that are as large as possible (the dipole-dipole and monopole-quadrupole interactions scale like $1/L^3$ and therefore, for larger cells, a monopole-monopole correction alone becomes more and more reliable).
The corrections are only reasonable if the defect-induced perturbation of the charge density is strictly localized around the defect, i.e., if only the occupation of localized defect states is changed. Whenever (partially) wrong bands (e.g. delocalized conduction band or valence band states instead of defect states) are occupied, the calculated corrections become meaningless (the correction formulas are not valid for overlapping charges)! Therefore one should first calculate the difference between the charge densities of the charged defect cell and the ideal unperturbed bulk cell and check the localization of this difference charge (in between the defects the difference must vanish within the numerical error bars for the charge densities)!
Don't forget to scale down all results by the bulk dielectric constant ε! As yet there is no possibility to enter any dielectric constant; all corrections are calculated and printed for ε = 1. Therefore, the corrected total energies printed after the final electronic iteration are meaningless! Hence, you should first calculate the energies without any corrections and later add the corrections by hand using the output printed in OUTCAR (you must search for a line DIPCOR: dipole corrections for dipole and the following lines; there you find the dipole moment, the quadrupole moment and the energy corrections). One should note that, strictly, one has to take the dielectric constant calculated by first-principles methods. Since VASP does not yet allow a simple calculation of dielectric constants, however, you have to use the experimental value (or values taken from other calculations). This empiricism introduces slight uncertainties in your energy corrections. However, one can expect that the uncertainty should rarely exceed 5-10%, since dielectric constants taken from experiment and those obtained from first-principles calculations usually agree very well (often within the order of 1-3%).
The dipole-dipole plus quadrupole-monopole corrections printed in OUTCAR are meaningless in their original form! We have to calculate a correction for the defect-induced multipoles, but since we have also included the surrounding bulk, a quadrupole moment associated with the corresponding charge (extending over the whole cell!) is also included in the printed quadrupole moment (and in the corresponding energy corrections). Since in systems with cubic symmetry dipoles are forbidden by symmetry, a dipole moment can only be defect induced (and only if the cubic symmetry is broken by atomic relaxations).
In order to obtain the correct energy correction (usually quadrupole-monopole interaction only), one has to proceed as follows. Calculate the quadrupole moment for an ideal bulk cell (neutral!) by setting IDIPOL=4 and DIPOL=same position as in the defect cell (search for the line containing Tr[quadrupol ... in the file OUTCAR). The corresponding quadrupole moment has to be subtracted from the quadrupole moment printed for the charged defect cell. The difference corresponds to the defect-induced part of the quadrupole moment. If no dipole-dipole interaction is present, you can now simply scale down the energy printed on the line dipol+quadrupol energy correction ... of the file OUTCAR by the ratio defect-induced quadrupole/total cell quadrupole, since this interaction is proportional to the quadrupole moment. After this scaling you should end up with reasonable numbers (usually smaller than the monopole-monopole correction printed on the line containing energy correction for charged system ... in the file OUTCAR). Now add the corrected value for the quadrupole-monopole interaction to the calculated monopole-monopole interaction energy (and finally scale the sum with 1/ε).
The whole procedure is even more complicated if a dipole moment also occurs, since then only the quadrupole-monopole term has to be corrected but the dipole-dipole term is already correct! But you can easily help yourself: simply take a cell of the same dimension and calculate a free ion (it does not matter which one!) of the same charge state (if this causes trouble try the opposite state, e.g. 4+ instead of 4-, but then don't forget to take the opposite sign for the printed monopole-quadrupole energy, since this energy is proportional to the cell charge!). The calculation will provide a quadrupole moment and a certain quadrupole-monopole interaction energy. Since this energy is proportional to the quadrupole moment (times total cell charge), dividing the energy by the quadrupole moment gives the proportionality constant by which one has to multiply a quadrupole moment in order to obtain the corresponding monopole-quadrupole interaction for the given cell size. Multiplying this constant by the quadrupole moment of the defect cell, you can now calculate the quadrupole-monopole contribution alone, and hence the dipole-dipole contribution is then known too. The dipole-dipole contribution is kept, and the defect-induced quadrupole-monopole contribution has to be added to it (just multiply the proportionality constant by the defect-induced quadrupole moment). Then you finally end up with the correct values for all interactions (which have to be summed again and rescaled with 1/ε). It is currently a clumsy procedure, but it works satisfactorily.
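The simpler of the two procedures above (no dipole moment present) amounts to a few lines of arithmetic. The sketch below is only a bookkeeping aid under stated assumptions: the function name and argument names are hypothetical, the numbers must be read by hand from the OUTCAR lines named in the text, and the case with an additional dipole moment is not covered.

```python
def corrected_defect_energy(e_monopole, e_quad_printed,
                            quad_defect_cell, quad_bulk_cell, epsilon):
    """Hand correction for a charged defect cell without dipole moment.

    e_monopole       : monopole-monopole correction printed for epsilon = 1 (eV)
    e_quad_printed   : printed quadrupole-monopole correction for epsilon = 1 (eV)
    quad_defect_cell : quadrupole moment printed for the charged defect cell
    quad_bulk_cell   : quadrupole moment of the neutral ideal bulk cell
    epsilon          : bulk dielectric constant (experimental or calculated)
    """
    if quad_defect_cell == 0.0:
        raise ValueError("defect-cell quadrupole moment must be non-zero")
    # Defect-induced part of the quadrupole moment.
    quad_induced = quad_defect_cell - quad_bulk_cell
    # The interaction is proportional to the quadrupole moment, so the
    # printed energy is scaled by induced / total.
    e_quad = e_quad_printed * quad_induced / quad_defect_cell
    # Sum both contributions and scale by 1/epsilon.
    return (e_monopole + e_quad) / epsilon


# Purely illustrative numbers, not taken from any calculation.
example = corrected_defect_energy(1.0, -0.8, 10.0, 5.0, 10.0)
```

With these made-up inputs the printed quadrupole term is halved (half of the quadrupole moment belongs to the bulk), and the sum 0.6 eV is reduced to 0.06 eV by the dielectric screening.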
Any potential correction (LDIPOL=.TRUE.) is currently impossible! Hence you can only use LDIPOL=.FALSE.! The reasons are: first, the downscaling with ε is missing, and second, the correction is not calculated from the defect-induced multipoles but from the total monopoles of the defect cell, containing at least a meaningless quadrupole contribution (one would have to subtract the quadrupole moment of the ideal cell before calculating any correction potential, but this is not yet implemented in routine dipol.F!). However, one has to expect that the potential corrections do not change the results dramatically.
Besides charged defects there is another critical type of defect which may cause serious trouble (and for which one should also apply dipole corrections): neutral defects or defect complexes of low symmetry. For such defects a dipole moment may occur, leading to considerable dipole-dipole interactions. Though they fall off like $1/L^3$, they might not be negligible (even for somewhat larger cells!) if the induced dipole moment is rather large. The worst case that can happen is a defect complex with two (or more) rather distant defects (separated by distances of the order of nearest-neighbor bond lengths or larger) with a strong charge transfer between the defects forming the complex (e.g., one defect might possess the charge state 2+ and the other one the charge state 2-). This can easily happen for defect complexes representing acceptor-donor pairs. The most critical cases are again semiconductors with rather small lattice constants or rather small dielectric constants, or any defect complex causing strong charge transfers. Again the same restrictions and comments hold as stated above for charged cells: you may currently only use cubic cells and LDIPOL=.FALSE., and you have to rescale the correction printed in OUTCAR by the bulk dielectric constant ε (i.e., the printed energies are again meaningless and have to be corrected by hand). There is only one point which might help: since in cubic cells any dipole moment can only be defect-induced, no additional corrections are necessary (in contrast to the monopole-quadrupole energies of charged cells).
However, the other bad news is: for such defect complexes it may sometimes be hard to find the correct center of mass (input DIPOL=... in INCAR!) for the defect-induced charge perturbation (it is usually easier for single point defects, since usually DIPOL=position of the point defect is the correct choice). This introduces some uncertainties, and one might try different values for DIPOL (the one giving the minimum correction should be the correct one). But also note: DIPOL is internally aligned to the position of the closest FFT-grid point in real space. Hence, the position DIPOL is only determined within distances corresponding to the FFT-grid spacing (controlled by NG*F). This might also play a certain role if, for charged single point defects, the position of the defect is not chosen to be (0,0,0)! In this case DIPOL might correspond to a position lying slightly off the position of the defect, which may also introduce inaccuracies in the calculation of the electrostatic interactions (i.e., apparent dipole moments may occur which would be zero if the correct position DIPOL had been chosen). In this case you should, whenever possible, try to adjust your FFT grid in such a way that the position of the defect matches some FFT-grid point in real space exactly, or otherwise never use any (point) defect position other than (0,0,0).
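To check in advance whether a chosen defect position coincides with a grid point, one can mimic the alignment described above. The sketch below is an assumption about the rounding, not the actual VASP routine: `snap_to_fft_grid` simply rounds each fractional coordinate to the nearest multiple of 1/N, where the grid dimensions stand for the values of NGXF, NGYF and NGZF.

```python
def snap_to_fft_grid(frac, grid):
    """Fractional coordinates of the grid point closest to `frac`.

    frac : (x, y, z) in direct (fractional) coordinates
    grid : (nx, ny, nz) number of FFT grid points per direction
           (stands for NGXF, NGYF, NGZF; nearest-point rounding is assumed)
    """
    return tuple((round(x * n) % n) / n for x, n in zip(frac, grid))


def lies_on_grid(frac, grid, tol=1e-10):
    """True if `frac` coincides with an FFT grid point within `tol`."""
    snapped = snap_to_fft_grid(frac, grid)
    for x, s in zip(frac, snapped):
        d = abs((x % 1.0) - s)
        # Account for periodic wrap-around (0.0 and 1.0 are the same point).
        if min(d, 1.0 - d) > tol:
            return False
    return True


on_grid = lies_on_grid((0.25, 0.25, 0.25), (40, 40, 40))
off_grid = lies_on_grid((0.25, 0.25, 0.25), (42, 42, 42))
snapped = snap_to_fft_grid((0.26, 0.5, 0.99), (10, 10, 10))
```

In this illustration a defect at (0.25, 0.25, 0.25) sits exactly on a 40-point grid but not on a 42-point grid, which is the situation the text recommends avoiding.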
A final note has to be made: besides the electrostatic interactions there also exist spurious elastic interactions between neighboring cells which (according to a simple elastic dipole lattice model) should scale like $1/L^3$ (leading order). Therefore, the corrected values may still show a certain variation with respect to the supercell size. One can check the relaxation energies (elastic energies) separately by also calculating (and correcting) unrelaxed cells (defect plus remaining atoms in their ideal bulk positions). If the k-point sampling is sufficient to obtain well-converged results (with respect to the BZ integration), one might even try to extrapolate the elastic interaction energies empirically by plotting the relaxation energies versus $1/L^3$ (hopefully a linear function; if not, try to plot it against $1/L^5$ and look whether it matches a linear function) and taking the value for $1/L \to 0$ (i.e. the axis offset). However, usually the remaining errors due to spurious elastic interactions can be expected to be small (rarely larger than about 0.1 eV), and the extrapolation towards $L \to \infty$ may also be rather unreliable if the results are not perfectly converged with respect to the k-point sampling (though one should note that this may then hold for the electrostatic corrections too!).
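The extrapolation described above is an ordinary straight-line fit of the energy against $1/L^p$ with p = 3 (or 5), followed by reading off the intercept. A minimal sketch in plain Python is given below; the function name and the synthetic data are illustrative assumptions, not VASP output.

```python
def extrapolate_to_infinite_cell(lengths, energies, power=3):
    """Least-squares line through (1/L**power, E); returns (intercept, slope).

    The intercept is the estimate for L -> infinity (the axis offset).
    """
    if len(lengths) != len(energies) or len(lengths) < 2:
        raise ValueError("need at least two (L, E) pairs of equal length")
    xs = [1.0 / length ** power for length in lengths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(energies) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, energies))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic relaxation energies that follow E = -0.5 + 40 / L**3 exactly.
cell_lengths = [8.0, 10.0, 12.0]
relax_energies = [-0.5 + 40.0 / length ** 3 for length in cell_lengths]
e_inf, slope = extrapolate_to_infinite_cell(cell_lengths, relax_energies)
```

If the points do not fall on a line for `power=3`, the same call with `power=5` corresponds to the alternative plot mentioned in the text.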
6.58
VASP.4.4 can calculate the partial (band decomposed) charge density according to parameters specified in the file INCAR.
Mind that the partial charge density can be calculated only if a preconverged WAVECAR file exists. VASP enters the evaluation routine very quickly and stops immediately after evaluating the partial charge density. This implementation was chosen to allow a fast (almost interactive) recalculation of the charge density for particular bands and k-points.
The following parameters control the behavior of VASP.
LPARD: Evaluate partial (band and/or k-point) decomposed charge density. We want to stress again that the wavefunctions read from WAVECAR must be converged in a separate prior run. If only LPARD is set (and none of the tags discussed below), the total charge density is evaluated from the wavefunctions and written to CHGCAR.
There are several ways to specify for which bands the charge density is evaluated. In general the input lines with IBAND, EINT and NBMOD control this aspect of the routine:
IBAND: Calculate the partial charge density for all bands specified in the array IBAND. If IBAND is specified in the INCAR file and NBMOD is not given, NBMOD is set automatically to the size of the array. If IBAND is, for instance,
IBAND= 20 21 22 23
the partial charge density is calculated for bands 20 to 23.
NBMOD:
 0   Take all bands to calculate the charge density; even unoccupied bands are taken into account.
-1   Calculate the total charge density as usual. This is the default value if nothing else is given.
-2   Calculate the partial charge density for electrons with their eigenvalues in the range specified by EINT.
-3   The same as before, but the energy range is given relative to the Fermi energy.
KPUSE: KPUSE specifies which k-points are used in the evaluation of the partial charge density. KPUSE is an array of integer values.
KPUSE= 1 2 3 4
means that the charge density is evaluated and summed for the first four k-points. Be careful: VASP changes the k-point weights if KPUSE is specified.
LSEPB: Specifies whether the charge density is calculated for every band separately and written to a file PARCHG.nb.? (TRUE), or whether the charge density is merged for all selected bands and written to the file PARCHG.ALLB.? or PARCHG. Default is FALSE.
LSEPK: Specifies whether the charge density of every k-point is written to the files PARCHG.?.nk (TRUE) or whether it is merged (FALSE) into a single file. If the merged file is written, then the weight of each k-point is determined from the KPOINTS file; otherwise k-point weights of one are chosen.
6.59 Berry phase calculations
Evaluation of the usual Berry phase expression for the electronic polarization of an insulating ground-state system [57], as modified for the application of USPPs and PAW datasets [58], was implemented in the VASP code by Martijn Marsman. We would greatly appreciate it if you would include a statement acknowledging Martijn Marsman in any publication based on this part of VASP.
6.59.1 LBERRY, IGPAR, NPPSTR, DIPOL tags
Setting LBERRY= .TRUE. in the INCAR file switches on the evaluation of the usual Berry phase expression for the electronic polarization of an insulating ground-state system, as modified for the application of USPPs and PAW datasets (see Refs. [57], [58] and [59]). In addition, the following keywords must be specified in order to generate the mesh of k-points:
IGPAR = 1|2|3
This tag specifies the so-called parallel or $\mathbf{G}_\parallel$ direction in the integration over the reciprocal space unit cell.
NPPSTR
This tag specifies the number of k-points on the strings $\mathbf{k} = \mathbf{k}_\perp + j\,\mathbf{G}_\parallel$.
DIPOL
This tag specifies the origin with respect to which the ionic contribution to the dipole moment in the cell is calculated. When comparing changes in this contribution due to the displacement of an ion, this center should be chosen in such a way that the ions in the distorted and the undistorted structure remain on the same side of DIPOL (in terms of a minimum image convention).
6.59.2 An example: The fluorine displacement dipole (Born effective charge) in NaF
First we determine the electronic polarization of the undistorted NaF (which, since it is cubic, should be zero).
Calculation 1
We begin by calculating the self-consistent Kohn-Sham potential of the undistorted structure, using a symmetry-reduced (4×4×4) Monkhorst-Pack sampling of the Brillouin zone.
KPOINTS file:
4x4x4
0
Monkhorst
4 4 4
0 0 0
POSCAR file:
NaF
4.5102
0.0 0.5 0.5
0.5 0.0 0.5
0.5 0.5 0.0
1 1
Direct
0.0000000000000000 0.0000000000000000 0.0000000000000000
0.5000000000000000 0.5000000000000000 0.5000000000000000
Calculation 2
To calculate the electronic contribution to the polarization along the G1 direction, add the following lines to the INCAR file:
LBERRY = .TRUE.
IGPAR = 2
NPPSTR = 6
DIPOL = 0.25 0.25 0.25
Setting LBERRY=.TRUE. automatically sets ICHARG=11, since we mean to use the charge density obtained in Calculation 1. The reason for this is that the number of k-points used to evaluate the Berry phase expression can be quite large, large enough for it to be computationally advantageous to use the charge density obtained with the smaller grid used in the previous calculation.
The OUTCAR will now contain something similar to the following lines (grep on <R>):
Expectation value term: <R>ev
<R>x = ( -0.00001, 0.00000 )
<R>y = ( 0.00000, 0.00000 )
<R>z = ( 0.00001, 0.00000 )
Berry-Phase term: <R>bp
<R> = ( 0.00000, 0.00000, 0.00000 ) electrons Angst
ionic term: <R>ion
<R> = ( 20.29590, 20.29590, 20.29590 ) electrons Angst
The output of the Berry phase calculation using IGPAR=1 should now be something like this:
Expectation value term: <R>ev
<R>x = ( 0.00000, 0.00000 )
<R>y = ( 0.00000, 0.00000 )
<R>z = ( 0.00116, 0.00000 )
Berry-Phase term: <R>bp
<R> = ( 0.00000, 0.17982, 0.17982 ) electrons Angst
ionic term: <R>ion
<R> = ( 20.29590, 20.29590, 19.98019 ) electrons Angst
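Add the $\langle R\rangle_{\mathrm{bp}}$ terms obtained in Calculations 2-4. Let's call this $\langle R\rangle_{\mathrm{bp}}^{\mathrm{undist}}$.
The electronic polarization of the undistorted structure is then given by:
$$\langle R\rangle_{\mathrm{el}}^{\mathrm{undist}} = \langle R\rangle_{\mathrm{ev}}^{\mathrm{undist}} + \langle R\rangle_{\mathrm{bp}}^{\mathrm{undist}}$$
Repeat the above three steps for the results obtained using the distorted structure (Calculations 6-8), to evaluate $\langle R\rangle_{\mathrm{ev}}^{\mathrm{dist}}$, $\langle R\rangle_{\mathrm{bp}}^{\mathrm{dist}}$, and $\langle R\rangle_{\mathrm{el}}^{\mathrm{dist}}$.
The change in the electronic contribution to the polarization due to the F-sublattice displacement, $\Delta\langle R\rangle_{\mathrm{el}}$, is then simply found as $\langle R\rangle_{\mathrm{el}}^{\mathrm{dist}} - \langle R\rangle_{\mathrm{el}}^{\mathrm{undist}}$.
To calculate the total change in polarization, $\Delta\langle R\rangle$, one should take account of the ionic contribution to this change. This can be simply calculated from the $\langle R\rangle_{\mathrm{ion}}$ as written in, for instance, Calculations 2 and 6. $\Delta\langle R\rangle$ is then given by $\Delta\langle R\rangle_{\mathrm{ion}} + \Delta\langle R\rangle_{\mathrm{el}}$. In this example we find $\Delta\langle R\rangle = 0.04393$ electrons Å. Considering we moved the F-sublattice by 0.045102 Å, this calculation yields a Born effective charge for fluorine in NaF of $Z^* = 0.9740$.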
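The last step of the example is a single division: the Born effective charge is the change in polarization per unit displacement. The short sketch below redoes this arithmetic with the two numbers quoted in the text; the function name is an illustrative assumption and the sign convention is left exactly as in the quoted values.

```python
def born_effective_charge(delta_r, displacement):
    """Change in <R> (electrons * Angstrom) divided by the displacement (Angstrom)."""
    if displacement == 0.0:
        raise ValueError("displacement must be non-zero")
    return delta_r / displacement


delta_r_total = 0.04393      # total change in <R>, as quoted in the text
f_displacement = 0.045102    # displacement of the F sublattice, as quoted

z_star = born_effective_charge(delta_r_total, f_displacement)
print(f"Z* = {z_star:.4f}")
```

The displacement used in the example is numerically 1% of the lattice constant 4.5102 given in the POSCAR file.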
N.B. (I) One should take care of the fact that the calculated Berry phase term, $\langle R\rangle_{\mathrm{bp}}$ along $\mathbf{G}_\parallel$, is in principle obtained modulo a certain period, determined by the lattice vector $\mathbf{R}$ for which $\mathbf{R}\cdot\mathbf{G}_\parallel = 2\pi$, the spin multiplicity of the wave functions, the volume of the unit cell, the number of k-points in the perpendicular grid, and some aspects of the symmetry of the system. More information on this particular aspect of the Berry phase calculations can be found in Refs. [57] and [59].
N.B. (II) In the case of spin-polarized calculations (ISPIN=2) the Berry phase of the wavefunctions is evaluated separately for each spin direction. This means a grep on <R> will yield two sets of <R>ev and <R>bp terms, which have to be added to one another to obtain the total electronic polarization of the system.
6.60
Spinors were included in the VASP code by Georg Kresse. The code required for the treatment of non-collinear magnetic structures was written by David Hobbs, and spin-orbit coupling was implemented by Olivier Lebacq and Georg Kresse. Spinors are only supported by VASP.4.5 and later.
6.60.1 LNONCOLLINEAR tag
Supported only by VASP.4.5 and on. THIS FEATURE IS IN LATE BETA STAGE (BUGS ARE POSSIBLE).
Setting LNONCOLLINEAR= .TRUE. in the INCAR file allows fully non-collinear magnetic structure calculations to be performed. VASP is capable of reading WAVECAR and CHGCAR files from previous non-magnetic or collinear calculations; it is however not possible to rotate the magnetic field locally on selected atoms. Hence, in practice, we recommend performing non-collinear calculations in two steps:
First, calculate the non-magnetic ground state and generate a WAVECAR and CHGCAR file.
Second, read the WAVECAR and CHGCAR file, and supply initial magnetic moments by means of the MAGMOM tag (compare Sec. 6.12). For a non-collinear setup, three values must be supplied for each ion in the MAGMOM line. The three entries correspond to the initial local magnetic moment for each ion in the x, y and z direction, respectively. The line
MAGMOM = 1 0 0 0 1 0
initialises the magnetic moment on the first atom in the x direction, and on the second atom in the y direction. Mind that the MAGMOM line supplies initial magnetic moments only if ICHARG is set to 2, or if the CHGCAR file contains only charge but no magnetisation density.
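For cells with many ions it is easy to miscount the entries of a non-collinear MAGMOM line. The helper below is a convenience sketch only (the name `magmom_line` is hypothetical and nothing of the kind ships with VASP); it checks that exactly three components are given per ion and writes them in the order described above.

```python
def magmom_line(moments):
    """Build a non-collinear MAGMOM line from per-ion (mx, my, mz) triples."""
    values = []
    for index, moment in enumerate(moments, start=1):
        if len(moment) != 3:
            raise ValueError(f"ion {index}: expected 3 components, got {len(moment)}")
        # x, y and z components of the initial local moment of this ion.
        values.extend(f"{component:g}" for component in moment)
    return "MAGMOM = " + " ".join(values)


# First atom along x, second atom along y, as in the example in the text.
line = magmom_line([(1, 0, 0), (0, 1, 0)])
print(line)
```

The output is the same line as in the example above and can be pasted into the INCAR file.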
6.60.2 LSORBIT tag
Supported only by VASP.4.5 and on. THIS FEATURE IS IN LATE BETA STAGE (BUGS ARE POSSIBLE).
LSORBIT = .TRUE. switches on spin-orbit coupling and automatically sets LNONCOLLINEAR= .TRUE.. This option works only for PAW potentials and is not supported by ultrasoft pseudopotentials. If spin-orbit coupling is not included, the energy does not depend on the direction of the magnetic moment, i.e. rotating all magnetic moments by the same angle results, in principle, in exactly the same energy. Hence there is no need to define the spin quantization axis as long as spin-orbit coupling is not included. Spin-orbit coupling, however, couples the spin to the crystal structure. Spin-orbit coupling is switched on by selecting
LSORBIT = .TRUE.
SAXIS = s_x s_y s_z (quantisation axis for spin)
where the default is SAXIS = (0+, 0, 1) (the notation 0+ implies an infinitesimally small positive number in the x direction). All magnetic moments are now given with respect to the axis $(s_x, s_y, s_z)$, where we have adopted the convention that all magnetic moments and spinor-like quantities written or read by VASP are given with respect to this axis. This includes the MAGMOM line in the INCAR file, the total and local magnetizations in the OUTCAR and PROCAR files, the spinor-like orbitals in the WAVECAR file, and the magnetization density in the CHGCAR file. With respect to the cartesian lattice vectors the components of the magnetization are (internally) given by
$$m_x = \cos(\beta)\cos(\alpha)\, m_x^{\mathrm{axis}} - \sin(\alpha)\, m_y^{\mathrm{axis}} + \sin(\beta)\cos(\alpha)\, m_z^{\mathrm{axis}}$$
$$m_y = \cos(\beta)\sin(\alpha)\, m_x^{\mathrm{axis}} + \cos(\alpha)\, m_y^{\mathrm{axis}} + \sin(\beta)\sin(\alpha)\, m_z^{\mathrm{axis}}$$
$$m_z = -\sin(\beta)\, m_x^{\mathrm{axis}} + \cos(\beta)\, m_z^{\mathrm{axis}}$$
where $m^{\mathrm{axis}}$ is the externally visible magnetic moment. Here, α is the angle between the SAXIS vector $(s_x, s_y, s_z)$ and the cartesian vector x, and β is the angle between the vector SAXIS and the cartesian vector z:
$$\alpha = \operatorname{atan}\frac{s_y}{s_x}, \qquad \beta = \operatorname{atan}\frac{\sqrt{s_x^2 + s_y^2}}{s_z}$$
It is easy to see that for the default $(s_x, s_y, s_z) = (0+, 0, 1)$ both angles are zero, i.e. β = 0 and α = 0. In this case, the internal representation is simply equivalent to the external representation:
$$m_x = m_x^{\mathrm{axis}}, \qquad m_y = m_y^{\mathrm{axis}}, \qquad m_z = m_z^{\mathrm{axis}}$$
If the external moment has only a component along the axis, $m_x^{\mathrm{axis}} = m_y^{\mathrm{axis}} = 0$, the internal components become
$$m_x = \sin(\beta)\cos(\alpha)\, m_z^{\mathrm{axis}} = m_z^{\mathrm{axis}}\, \frac{s_x}{\sqrt{s_x^2 + s_y^2 + s_z^2}} \qquad (6.10)$$
$$m_y = \sin(\beta)\sin(\alpha)\, m_z^{\mathrm{axis}} = m_z^{\mathrm{axis}}\, \frac{s_y}{\sqrt{s_x^2 + s_y^2 + s_z^2}} \qquad (6.11)$$
$$m_z = \cos(\beta)\, m_z^{\mathrm{axis}} = m_z^{\mathrm{axis}}\, \frac{s_z}{\sqrt{s_x^2 + s_y^2 + s_z^2}} \qquad (6.12)$$
Hence the magnetic moment is now parallel to the vector SAXIS. Thus there are two ways to rotate the spins in an arbitrary direction: either by changing the initial magnetic moments MAGMOM or by changing SAXIS.
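The transformation from the externally visible moment to the internal cartesian components can be written down in a few lines. The sketch below is not taken from the VASP source; the function name is hypothetical, and it uses `atan2` instead of a plain arctangent (an assumption made here so that $s_x = 0$ or $s_z = 0$ does not lead to a division by zero).

```python
import math


def axis_to_cartesian(m_axis, saxis):
    """Rotate a moment given with respect to SAXIS into cartesian components.

    m_axis : (mx, my, mz) externally visible moment (relative to SAXIS)
    saxis  : (sx, sy, sz) quantisation axis
    """
    sx, sy, sz = saxis
    alpha = math.atan2(sy, sx)                  # angle between SAXIS and x
    beta = math.atan2(math.hypot(sx, sy), sz)   # angle between SAXIS and z
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    mx_a, my_a, mz_a = m_axis
    mx = cb * ca * mx_a - sa * my_a + sb * ca * mz_a
    my = cb * sa * mx_a + ca * my_a + sb * sa * mz_a
    mz = -sb * mx_a + cb * mz_a
    return (mx, my, mz)


# Default axis: internal and external representation coincide.
identity = axis_to_cartesian((1.0, 2.0, 3.0), (0.0, 0.0, 1.0))

# Moment of magnitude 2 along SAXIS = (1, 1, 1): parallel to (1, 1, 1).
along_axis = axis_to_cartesian((0.0, 0.0, 2.0), (1.0, 1.0, 1.0))
```

The second call illustrates Eqs. (6.10) to (6.12): a moment given purely along the z component of the external representation ends up parallel to SAXIS, with each cartesian component equal to 2/√3.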
To initialise calculations with the magnetic moment parallel to a chosen vector (x, y, z), it is therefore possible to specify either (assuming a single atom in the cell)
MAGMOM = x y z
SAXIS = 0 0 1
or
MAGMOM = 0 0 total_magnetic_moment ! local magnetic moment parallel to SAXIS
SAXIS = x y z ! quantisation axis parallel to vector (x,y,z)
Both setups should in principle yield exactly the same energy, but for implementation reasons the second method is usually more precise. The second method also allows a preexisting WAVECAR file (from a collinear or non-collinear run) to be read, and the calculation to be continued with a different spin orientation. When a non-collinear WAVECAR file is read, the spin is assumed to be parallel to SAXIS (hence VASP will initially report a magnetic moment in the z direction only).
The recommended procedure for the calculation of magnetic anisotropies is therefore:
Start with a collinear calculation and calculate a WAVECAR and CHGCAR file.
Then add the following tags to the INCAR file:
LSORBIT = .TRUE.
ICHARG = 11 ! non selfconsistent run, read CHGCAR
SAXIS = x y z ! direction of the magnetic field
NBANDS = 2 * number of bands of collinear run
VASP reads in the WAVECAR and CHGCAR files, aligns the spin quantization axis parallel to SAXIS, which implies that the magnetic field is now parallel to SAXIS, and performs a non-self-consistent calculation. By comparing the energies for different orientations the magnetic anisotropy can be determined. Please mind that a completely self-consistent calculation (ICHARG=1) is in principle also possible with VASP, but this would allow the spinor wavefunctions to rotate from their initial orientation parallel to SAXIS until the correct ground state is obtained, i.e. until the magnetic moment is parallel to the easy axis. In practice this rotation will be slow, however, since reorientation of the spin gains little energy. Therefore, if the convergence criterion is not too tight, sensible results might be obtained even for fully self-consistent calculations (in the few cases we have tried, this worked beautifully).
Be very careful with symmetry. We recommend switching off symmetry (ISYM=0) altogether when spin-orbit coupling is selected. Often the k-point set changes from one spin orientation to the other, worsening the transferability of the results (also, the WAVECAR file cannot be reread properly if the number of k-points changes). Additionally, VASP.4.6 (and all older versions) had a bug in the symmetrisation of magnetic fields (fixed only in VASP.4.6.23).
Generally be extremely careful when using spin-orbit coupling: energy differences are tiny, k-point convergence is tedious and slow, and the computer time you require might be infinite. Additionally, this feature, although long implemented in VASP, is still in a late beta stage, as you might deduce from the frequent updates. There is no promise that your results will be useful! Here is a small summary from the README file:
20.11.2003: The present GGA routine breaks the symmetry slightly for non-orthorhombic cells. A spherical cutoff is now imposed on the gradients and all intermediate results in reciprocal space. This changes the GGA results slightly (usually by 0.1 meV per atom), but is important for magnetic anisotropies.
05.12.2003: continue... Now VASP.4.6 defaults to the old behavior GGA_COMPAT = .TRUE.; the new behavior can be obtained by setting GGA_COMPAT = .FALSE. in the INCAR file. VASP.5.0 defaults to GGA_COMPAT = .FALSE..
12.08.2003: MAJOR BUG FIX in symmetry.F and paw.F: for non-collinear calculations the symmetry routines did not work properly
If you have read the previous lines, you will realize that it is recommended to set GGA_COMPAT = .FALSE. for non-collinear calculations in VASP.4.6, since this improves the numerical precision of GGA calculations.
6.61
Supported only by VASP.4.6 and on. THIS FEATURE IS IN LATE BETA STAGE (BUGS ARE POSSIBLE).
VASP offers the possibility to add a penalty contribution to the total energy expression (and consequently a penalty functional to the Hamiltonian) which drives the local moment (integral of the magnetization in a site-centered sphere) into a direction specified by the user. This feature is controlled using the following tags:
I_CONSTRAINED_M = 1
LAMBDA = r
where r is a (real) number which specifies the weight with which the penalty terms enter into the total energy expression and the Hamiltonian.
M_CONSTR = a b ...
The desired direction(s) of the integrated local moment(s) with respect to cartesian coordinates (3 coordinates must be specified for each ion). The norm of this vector is meaningless since it will be normalized by VASP anyway. Setting M_CONSTR = 0 0 0 for an ion is equivalent to imposing no constraints.
In addition one must set the RWIGS tag to specify the radius of integration around the atomic sites which determines the local moments.
When one uses the constrained moment approach, additional information pertaining to the effect of the constraints is written into the OSZICAR file.
[OSZICAR excerpt: E_p and lambda, followed by the per-ion columns MW_int and M_int]
E_p is the contribution to the total energy arising from the penalty functional. Under M_int VASP lists the integrated magnetic moment at each atomic site. The column labeled MW_int shows the result of the integration of the magnetization density which has been smoothed towards the boundary of the sphere. It is actually this integrated moment which enters the penalty terms (the smoothing makes the procedure more stable). One should look at the latter numbers to check whether enough of the magnetization density around each atomic site is contained within the integration sphere, and increase RWIGS accordingly. What exactly constitutes enough in this context is hard to say. It is best to set RWIGS in such a manner that the integration spheres do not overlap and are otherwise as large as possible.
At the end of the run the OSZICAR file contains some extra information:
DAV: 35 -0.905322335169E+01 0.58398E-04 -0.60872E-04 60 0.734E-02
1 F= -.90532234E+01 E0= -.90355617E+01 d E =-.529849E-01 mag= -0.0005 2.1161 5.1088
Under lambda*MW_perp the constraining magnetic field at each atomic site is listed. It shows which magnetic field is added to the LSDA Hamiltonian to stabilize the magnetic configuration.
As is probably clear from the above, applying constraints by means of a penalty functional contributes to the total energy. This contribution, however, decreases with increasing LAMBDA and can in principle be made vanishingly small. Increasing LAMBDA stepwise, from one run to another (slowly, so the solution remains stable), one thus approaches the LSDA total energy for a given magnetic configuration (compare the lines below with the preceding output from the OSZICAR; the effect of increasing LAMBDA from 10 to 50).
E_p = 0.22591E-03 lambda = 0.500E+02
ion        MW_int                 M_int
1   0.000 0.002 1.545     0.001 -0.005 2.654
2   0.000 1.086 1.087     0.001 1.871 1.862
DAV: 33 -0.907152551238E+01 0.48186E-04 -0.33125E-04 60 0.163E-01
1 F= -.90715255E+01 E0= -.90541505E+01 d E =-.521251E-01 mag= 0.0042 2.0902 5.0659
6.62
Supported only by VASP.4.6 and on. THIS FEATURE IS IN LATE BETA STAGE (BUGS ARE POSSIBLE).
The L(S)DA often fails to describe systems with localized (strongly correlated) d and f electrons (this manifests itself primarily in the form of unrealistic one-electron energies). In some cases this can be remedied by introducing a strong intra-atomic interaction in a (screened) Hartree-Fock like manner, as an on site replacement of the L(S)DA. This approach is commonly known as the L(S)DA+U method.
VASP allows one to choose between two different approaches to L(S)DA+U:
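The rotationally invariant version introduced by Liechtenstein et al. [60], which is of the form
\[ E_{\rm HF} = \frac{1}{2} \sum_{\{\gamma\}} \left( U_{\gamma_1\gamma_3\gamma_2\gamma_4} - U_{\gamma_1\gamma_3\gamma_4\gamma_2} \right) \hat{n}_{\gamma_1\gamma_2} \hat{n}_{\gamma_3\gamma_4} \]
with the on site occupancies
\[ \hat{n}_{\gamma_1\gamma_2} = \langle \Psi^{s_2} | m_2 \rangle \langle m_1 | \Psi^{s_1} \rangle \]
and the (unscreened) on site electron-electron interaction
\[ U_{\gamma_1\gamma_3\gamma_2\gamma_4} = \langle m_1 m_3 | \frac{1}{|\mathbf{r}-\mathbf{r}'|} | m_2 m_4 \rangle \, \delta_{s_1 s_2} \delta_{s_3 s_4} \]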
The unscreened e-e interaction U_{\gamma_1\gamma_3\gamma_2\gamma_4} can be written in terms of Slater's integrals F^0, F^2, F^4, and F^6 (f-electrons). Using values for the Slater integrals calculated from atomic wave functions, however, would lead to a large overestimation of the true e-e interaction, since in solids the Coulomb interaction is screened (especially F^0).
In practice these integrals are therefore often treated as parameters, i.e., adjusted to reach agreement with experiment in some sense: equilibrium volume, magnetic moment, band gap, structure. They are normally specified in terms of the effective on site Coulomb and exchange parameters, U and J. (U and J are sometimes extracted from constrained-LSDA calculations.)
These translate into values for the Slater integrals in the following way (as implemented in VASP at the moment):
p-electrons: F^0 = U, F^2 = 5J
d-electrons: F^0 = U, F^2 = \frac{14}{1+0.625} J, and F^4 = 0.625 F^2
f-electrons: F^0 = U, F^2 = \frac{6435}{286 + 195 \cdot 0.668 + 250 \cdot 0.494} J, F^4 = 0.668 F^2, and F^6 = 0.494 F^2
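The mapping from U and J to the Slater integrals can be written down in a few lines. The following is only an illustrative sketch of the relations listed above; the function names `slater_integrals` and `exchange_from_slater` are made up for this example and are not part of VASP.

```python
def slater_integrals(l, U, J):
    """Return (F0, F2, ...) for angular momentum l from U and J.

    Uses the fixed ratios quoted in the text: F4/F2 = 0.625 (d electrons),
    F4/F2 = 0.668 and F6/F2 = 0.494 (f electrons).
    """
    if l == 1:  # p electrons
        return (U, 5.0 * J)
    if l == 2:  # d electrons
        f2 = 14.0 / (1.0 + 0.625) * J
        return (U, f2, 0.625 * f2)
    if l == 3:  # f electrons
        f2 = 6435.0 / (286.0 + 195.0 * 0.668 + 250.0 * 0.494) * J
        return (U, f2, 0.668 * f2, 0.494 * f2)
    raise ValueError("only l = 1, 2, 3 are covered by the relations above")


def exchange_from_slater(l, F):
    """Invert the relations above: recover J from the Slater integrals."""
    if l == 1:
        return F[1] / 5.0
    if l == 2:
        return (F[1] + F[2]) / 14.0
    if l == 3:
        return (286.0 * F[1] + 195.0 * F[2] + 250.0 * F[3]) / 6435.0
    raise ValueError("only l = 1, 2, 3 are covered by the relations above")


# Example with hypothetical parameters U = 4 eV, J = 1 eV for a d shell.
F0, F2, F4 = slater_integrals(2, 4.0, 1.0)
```

Feeding the result back into `exchange_from_slater` returns the original J, which is a quick consistency check on the prefactors.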
The essence of the L(S)DA+U method consists of the assumption that one may now write the total energy as:
\[ E_{\rm tot}(n,\hat{n}) = E_{\rm DFT}(n) + E_{\rm HF}(\hat{n}) - E_{\rm dc}(\hat{n}) \]
where the Hartree-Fock like interaction replaces the L(S)DA on site due to the fact that one subtracts a double counting energy (E_dc) which supposedly equals the on site L(S)DA contribution to the total energy.
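Currently VASP allows for the choice between two different definitions for the double counting energy:
LSDA+U: \[ E_{\rm dc}(\hat{n}) = \frac{U}{2} \hat{n}_{\rm tot}(\hat{n}_{\rm tot}-1) - \frac{J}{2} \sum_\sigma \hat{n}^\sigma_{\rm tot}(\hat{n}^\sigma_{\rm tot}-1) \]
LDA+U: \[ E_{\rm dc}(\hat{n}) = \frac{U}{2} \hat{n}_{\rm tot}(\hat{n}_{\rm tot}-1) - \frac{J}{4} \hat{n}_{\rm tot}(\hat{n}_{\rm tot}-2) \]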
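The simplified (rotationally invariant) approach to the LSDA+U, due to Dudarev et al. [61], is of the following form:
\[ E_{\rm LSDA+U} = E_{\rm LSDA} + \frac{(U-J)}{2} \sum_\sigma \left[ \left( \sum_{m_1} n^\sigma_{m_1,m_1} \right) - \left( \sum_{m_1,m_2} \hat{n}^\sigma_{m_1,m_2} \hat{n}^\sigma_{m_2,m_1} \right) \right] \]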
This can be understood as adding a penalty functional to the LSDA total energy expression that forces the on site occupancy matrix in the direction of idempotency, i.e., \hat{n}^\sigma = \hat{n}^\sigma \hat{n}^\sigma. (Real matrices are only idempotent when their eigenvalues are either 1 or 0, which for an occupancy matrix translates to either fully occupied or fully unoccupied levels.)
Note: in Dudarev's approach the parameters U and J do not enter separately; only the difference (U - J) is meaningful.
The L(S)DA+U in VASP is switched on by means of the following tags:
LDAU = .TRUE.
LDAUTYPE = 1|2|4
LDAUL = L ..
LDAUU = U ..
LDAUJ = J ..
NB: LDAUL, LDAUU, and LDAUJ must be specified for all atomic species!
It is important to be aware of the fact that when using the L(S)DA+U, in general the total energy will depend on the parameters U and J. It is therefore not meaningful to compare the total energies resulting from calculations with different U and/or J [c.q. (U - J) in case of Dudarev's approach].
Note on band structure calculation: The CHGCAR file also contains only information up to LMAXMIX for the on-site PAW occupancy matrices. When the CHGCAR file is read and kept fixed in the course of the calculations (ICHARG=11), the results will necessarily not be identical to a selfconsistent run. The deviations can be (or actually are) large for L(S)DA+U calculations. For the calculation of band structures within the L(S)DA+U approach, it is hence strictly required to increase LMAXMIX to 4 (d elements) and 6 (f elements) (see Sec. 6.55).
6.63 HF type calculations
Available only in VASP.5.X. This version is presently not distributed. Documentation under construction and for internal use only!
6.63.1
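The Fock exchange energy can be written as
\[ E_x = -\frac{e^2}{2} \sum_{\mathbf{k}n,\mathbf{q}m} f_{\mathbf{k}n} f_{\mathbf{q}m} \int\!\!\int d^3r\, d^3r'\, \frac{\phi^*_{\mathbf{k}n}(\mathbf{r}) \phi^*_{\mathbf{q}m}(\mathbf{r}') \phi_{\mathbf{k}n}(\mathbf{r}') \phi_{\mathbf{q}m}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} \tag{6.13} \]
with {\phi_{\mathbf{k}n}(\mathbf{r})} being the set of one-electron Bloch states of the system, and {f_{\mathbf{k}n}} the corresponding set of (possibly fractional) occupational numbers. The sums over k and q run over all k-points chosen to sample the Brillouin zone (BZ), whereas the sums over m and n run over all bands at these k-points.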
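The corresponding non-local Fock potential is given by
\[ V_x(\mathbf{r},\mathbf{r}') = -\frac{e^2}{2} \sum_{\mathbf{q}m} f_{\mathbf{q}m} \frac{\phi^*_{\mathbf{q}m}(\mathbf{r}') \phi_{\mathbf{q}m}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} = -\frac{e^2}{2} \sum_{\mathbf{q}m} f_{\mathbf{q}m} e^{-i\mathbf{q}\cdot\mathbf{r}'} \frac{u^*_{\mathbf{q}m}(\mathbf{r}') u_{\mathbf{q}m}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|} e^{i\mathbf{q}\cdot\mathbf{r}} \tag{6.14} \]
where u_{\mathbf{q}m}(\mathbf{r}) is the cell periodic part of the Bloch state, \phi_{\mathbf{q}m}(\mathbf{r}), at k-point q, with band index m.
Using the decomposition of the Bloch states, \phi_{\mathbf{q}m}, in plane waves,
\[ \phi_{\mathbf{q}m}(\mathbf{r}) = \frac{1}{\sqrt{\Omega}} \sum_{\mathbf{G}} C_{m\mathbf{q}}(\mathbf{G}) e^{i(\mathbf{q}+\mathbf{G})\cdot\mathbf{r}} \tag{6.15} \]
Eq. (6.14) can be written as
\[ V_x(\mathbf{r},\mathbf{r}') = \sum_{\mathbf{k}} \sum_{\mathbf{G}\mathbf{G}'} e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} V_{\mathbf{k}}(\mathbf{G},\mathbf{G}') e^{-i(\mathbf{k}+\mathbf{G}')\cdot\mathbf{r}'} \tag{6.16} \]
with
\[ V_{\mathbf{k}}(\mathbf{G},\mathbf{G}') = \langle \mathbf{k}+\mathbf{G} | V_x | \mathbf{k}+\mathbf{G}' \rangle = -\frac{4\pi e^2}{\Omega} \sum_{m\mathbf{q}} f_{\mathbf{q}m} \sum_{\mathbf{G}''} \frac{C^*_{m\mathbf{q}}(\mathbf{G}'-\mathbf{G}'') C_{m\mathbf{q}}(\mathbf{G}-\mathbf{G}'')}{|\mathbf{k}-\mathbf{q}+\mathbf{G}''|^2} \tag{6.17} \]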
LHFCALC
Default: .FALSE.
The flag specifies whether HF type calculations are performed. At the moment, it is recommended to select an all bands simultaneous algorithm, i.e. ALGO=Damped (IALGO=53) or ALGO=All (IALGO=58) in the INCAR file (see Sec. 6.42 and 6.43). The blocked Davidson algorithm ALGO=Normal is, with certain caveats, also supported, whereas calculations for the other algorithms (ALGO=Fast) are not properly supported (note: no warning is printed). The blocked Davidson algorithm ALGO=Normal is generally rather slow, and in many cases the Pulay mixer will be unable to determine the proper groundstate. We hence recommend to select the blocked Davidson algorithm only in combination with straight mixing or a Kerker like mixing. Such combinations have been successfully applied for small and medium sized systems.
AEXX  = real number (fraction of exact exchange)
ALDAC = real number (fraction of LDA correlation energy)
AGGAX = real number (fraction of gradient correction to exchange)
AGGAC = real number (fraction of gradient correction to correlation)
Default: AEXX = 0.25 for LHFCALC = .TRUE., and AEXX = 0.0 for LHFCALC = .FALSE.;
AGGAX = 1.0-AEXX,
AGGAC = 1.0,
ALDAC = 1.0.
Specifies the amount of exact exchange and various other exchange and correlation settings. The sum of the fraction of the exact exchange and LDA exchange is always 1.0, and it is not possible to set the amount of LDA exchange independently.
Examples: if AEXX=0.25, 1/4 of the exact exchange is used, and 3/4 of the LDA exchange is added. For AEXX=0.5, half of the exact exchange is used, and one half of the LDA exchange is added.
The amount of GGA exchange and the correlation contributions can be set independently, however (some popular hybrid functionals, for instance, use only 0.8 of the gradient contribution to the exchange). The GGA flags AGGAX and AGGAC are only used if GGA is already selected (for LDA type calculations no gradient correction will be added, regardless of the values used for AGGAX and AGGAC).
Note: The defaults are chosen such that the hybrid PBE0 functional is selected for PBE pseudopotentials (the PBE0 functional contains 25 % of the exact exchange, 75 % of the PBE exchange, and 100 % of the PBE correlation energy). The resulting expression for the exchange-correlation energy then takes the following simple form:
\[ E_{xc}^{\rm PBE0} = \frac{1}{4} E_x + \frac{3}{4} E_x^{\rm PBE} + E_c^{\rm PBE} \tag{6.18} \]
Other sensible values are of course AEXX = 1.0 (full Hartree-Fock type calculations). In this case, one might want to set ALDAC=0.0 and AGGAC=0.0, in order to avoid the addition of correlation energy.
A comprehensive evaluation of the performance of the PBE0 functional, as compared to PBE, can be found in Ref. [64].
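The default rules for the mixing flags and the PBE0 mixing of Eq. (6.18) can be sketched as follows. This is a hedged illustration only: `hybrid_defaults` and `pbe0_like_xc` are hypothetical helper names, and the energies passed in are made-up numbers rather than VASP output.

```python
def hybrid_defaults(lhfcalc, aexx=None):
    """Reproduce the default rules quoted above for the mixing flags."""
    if aexx is None:
        aexx = 0.25 if lhfcalc else 0.0
    return {"AEXX": aexx, "AGGAX": 1.0 - aexx, "AGGAC": 1.0, "ALDAC": 1.0}


def pbe0_like_xc(e_x_exact, e_x_pbe, e_c_pbe, aexx=0.25):
    """Mix exact and PBE exchange as in Eq. (6.18); correlation is kept in full."""
    return aexx * e_x_exact + (1.0 - aexx) * e_x_pbe + e_c_pbe


flags = hybrid_defaults(lhfcalc=True)
# Hypothetical energies in eV, chosen only to illustrate the arithmetic.
e_xc = pbe0_like_xc(e_x_exact=-10.0, e_x_pbe=-9.0, e_c_pbe=-1.0)
```

With the default AEXX = 0.25 the exchange part is one quarter exact exchange plus three quarters PBE exchange, and the full PBE correlation is added on top.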
6.63.4
ENCUTFOCK = real number (energy cutoff determining the FFT grids in the HF related routines)
default: none
The ENCUTFOCK parameter controls the FFT grid for the HF routines. The only sensible value for ENCUTFOCK is ENCUTFOCK=0. This implies that the smallest possible FFT grid, which just encloses the cutoff sphere corresponding to the plane wave cutoff, is used. This accelerates the calculations by roughly a factor two to three, but causes slight changes in the total energies and a small noise in the calculated forces. The FFT grid used internally in the Hartree-Fock routines is written to the OUTCAR file. Simply search for lines starting with
FFT grid for exact exchange (Hartree Fock)
In many cases, a sensible approach is to determine the electronic and ionic groundstate using ENCUTFOCK = 0, and to make one final total energy calculation without the flag ENCUTFOCK.
6.63.5 HFLMAX
HFLMAX = integer
default: HFLMAX=4
Maximum angular quantum number l for the augmentation of charge densities in Hartree-Fock type routines. This flag determines the treatment on the plane wave grid only (pseudo wave functions). To compensate resulting errors, the contributions from the one-center terms are evaluated for the pseudo wave functions also only up to l = HFLMAX, whereas the one-center terms for the exact all-electron wave functions are evaluated up to the maximum required l (twice the angular quantum number of the partial wave with the highest l). The default is 4, and it might be required to increase this parameter if the system contains f-electrons. Since this increases the computational load considerably (factor 2), it is recommended to perform tests whether the results are already reasonably converged using the default HFLMAX=4.
6.63.6 HFSCREEN and LTHOMAS
HFSCREEN = real
In combination with PBE potentials, attributing a value to HFSCREEN will switch from the PBE0 functional (in case LHFCALC=.TRUE.) to the closely related HSE03 or HSE06 functional [65, 66, 67].
The HSE03 and HSE06 functionals replace the slowly decaying long-ranged part of the Fock exchange by the corresponding density functional counterpart. The resulting expression for the exchange-correlation energy is given by:
\[ E_{xc}^{\rm HSE} = \frac{1}{4} E_x^{\rm SR}(\mu) + \frac{3}{4} E_x^{\rm PBE,SR}(\mu) + E_x^{\rm PBE,LR}(\mu) + E_c^{\rm PBE} \tag{6.19} \]
As can be seen above, the separation of the electron-electron interaction into a short- and long-ranged part, labeled SR and LR respectively, is realized only in the exchange interactions. Electronic correlation is represented by the corresponding part of the PBE density functional.
The decomposition of the Coulomb kernel is obtained using the following construction (\mu \equiv HFSCREEN):
\[ \frac{1}{r} = S_\mu(r) + L_\mu(r) = \frac{{\rm erfc}(\mu r)}{r} + \frac{{\rm erf}(\mu r)}{r} \tag{6.20} \]
where r = |\mathbf{r}-\mathbf{r}'|, and \mu is the parameter that defines the range-separation, and is related to a characteristic distance, (2/\mu), at which the short-range interactions become negligible.
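The decomposition in Eq. (6.20) is easy to check numerically with the error functions from the Python standard library. The sketch below is illustrative only; the helper name `coulomb_split` is an assumption of this example, and distances are taken in Angstrom with \mu in inverse Angstrom.

```python
import math


def coulomb_split(r, mu):
    """Return the short- and long-range parts of 1/r, as in Eq. (6.20)."""
    short_range = math.erfc(mu * r) / r
    long_range = math.erf(mu * r) / r
    return short_range, long_range


mu = 0.2  # plays the role of HFSCREEN
for r in (0.5, 2.0, 2.0 / mu, 4.0 / mu):
    sr, lr = coulomb_split(r, mu)
    # Both parts always add up to the bare Coulomb kernel.
    assert abs(sr + lr - 1.0 / r) < 1e-12

# Fraction of the bare kernel that is still short-ranged at r = 2/mu.
sr_at_cutoff, _ = coulomb_split(2.0 / mu, mu)
fraction = sr_at_cutoff * (2.0 / mu)  # equals erfc(2), roughly 0.005
```

At the characteristic distance 2/\mu the short-range part has dropped to erfc(2), i.e. less than half a percent of the bare Coulomb interaction, which is the sense in which it "becomes negligible".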
Note: It has been shown [65] that the optimum \mu, controlling the range separation, is approximately 0.2-0.3 Å^{-1}. To conform with the HSE06 functional you need to select HFSCREEN=0.2 [65, 66, 67].
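Using the decomposed Coulomb kernel and Eq. (6.13), one straightforwardly obtains:
\[ E_x^{\rm SR}(\mu) = -\frac{e^2}{2} \sum_{\mathbf{k}n,\mathbf{q}m} f_{\mathbf{k}n} f_{\mathbf{q}m} \int\!\!\int d^3r\, d^3r'\, \frac{{\rm erfc}(\mu|\mathbf{r}-\mathbf{r}'|)}{|\mathbf{r}-\mathbf{r}'|} \phi^*_{\mathbf{k}n}(\mathbf{r}) \phi^*_{\mathbf{q}m}(\mathbf{r}') \phi_{\mathbf{k}n}(\mathbf{r}') \phi_{\mathbf{q}m}(\mathbf{r}) \tag{6.21} \]
The corresponding short-range Fock potential in reciprocal space reads
\[ V^{\rm SR}_{\mathbf{k}}(\mathbf{G},\mathbf{G}') = -\frac{4\pi e^2}{\Omega} \sum_{m\mathbf{q}} f_{\mathbf{q}m} \sum_{\mathbf{G}''} \frac{C^*_{m\mathbf{q}}(\mathbf{G}'-\mathbf{G}'') C_{m\mathbf{q}}(\mathbf{G}-\mathbf{G}'')}{|\mathbf{k}-\mathbf{q}+\mathbf{G}''|^2} \left( 1 - e^{-|\mathbf{k}-\mathbf{q}+\mathbf{G}''|^2/4\mu^2} \right) \tag{6.22} \]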
Clearly, the only difference to the reciprocal space representation of the complete (undecomposed) Fock exchange potential, given by Eq. (6.17), is the second factor in the summand in Eq. (6.22), representing the complementary error function in reciprocal space.
The short-ranged PBE exchange energy and potential, and their long-ranged counterparts, are arrived at using the same decomposition [Eq. (6.20)], in accordance with Heyd et al. [65]. It is easily seen from Eq. (6.20) that the long-range term becomes zero for \mu = 0, and the short-range contribution then equals the full Coulomb operator, whereas for \mu \to \infty it is the other way around. Consequently, the two limiting cases of the HSE03/HSE06 functional [see Eq. (6.19)] are a true PBE0 functional for \mu = 0, and a pure PBE calculation for \mu \to \infty.
LTHOMAS
If the flag LTHOMAS is set, a similar decomposition of the exchange functional into a long range and a short range part is used. This time it is more convenient to write the decomposition in reciprocal space:
\[ \frac{4\pi e^2}{|\mathbf{G}|^2} = S_k(|\mathbf{G}|) + L_k(|\mathbf{G}|) = \frac{4\pi e^2}{|\mathbf{G}|^2 + k_{\rm TF}^2} + \left( \frac{4\pi e^2}{|\mathbf{G}|^2} - \frac{4\pi e^2}{|\mathbf{G}|^2 + k_{\rm TF}^2} \right) \tag{6.23} \]
where k_TF is the Thomas-Fermi screening wave vector (the inverse of the screening length). Here, HFSCREEN is used to specify this parameter k_TF. VASP calculates this density dependent parameter and writes it to the OUTCAR file in the line:
Thomas-Fermi vector in A 2.00000
Note however that the parameter depends on the electrons counted as valence electrons: For the determination of the value written to the OUTCAR file, VASP simply counts all electrons in the POTCAR file as valence electrons, whereas literature suggests that semi-core states and d-states should not be included in the determination of the Thomas-Fermi screening length (HFSCREEN can be manually set to any value). Details can be found in literature [69, 70, 71]. An important detail concerns the implementation of the density functional part in the screened exchange case. Literature suggests that a global enhancement factor z (see Equ. (3.15) in Ref. [71]) should be used, whereas VASP implements a local density dependent enhancement factor z = k_TF/k, where k is the Fermi wave vector corresponding to the local density (and not the average density as suggested in Ref. [71]). This is in the spirit of the local density approximation.
Note: A comprehensive study of the performance of the HSE03/HSE06 functional as compared to the PBE and PBE0 functionals can be found in Ref. [68].
6.63.7 NKRED, NKREDX, NKREDY, NKREDZ and EVENONLY, ODDONLY
NKRED = integer
NKREDX= integer
NKREDY= integer
NKREDZ= integer
EVENONLY = logical
ODDONLY = logical
Under certain circumstances it is possible to evaluate the HF kernel (see Eq. 6.13) on a sub grid of q-points, without much loss of accuracy. Whether this is possible depends on the range of the exchange interactions in the compound of choice. This can be understood along the following lines:
Consider the description of a certain bulk system, using a supercell made up of N primitive cells, in such a way that {\mathbf{A}'_i}, the lattice vectors of the supercell, are given by \mathbf{A}'_i = n_i \mathbf{A}_i (i = 1, 2, 3), where {\mathbf{A}_i} are the lattice vectors of the primitive cell.
Let R_max = 2/\mu be the distance for which
\[ \frac{{\rm erfc}(\mu|\mathbf{r}-\mathbf{r}'|)}{|\mathbf{r}-\mathbf{r}'|} \approx 0, \quad {\rm for}\ |\mathbf{r}-\mathbf{r}'| > R_{\rm max} \tag{6.24} \]
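When the nearest neighbour distance between the periodically repeated images of the supercell R_NN > 2R_max (i.e. R_NN > 4/\mu), the short-ranged Fock potential, V_x^{SR}[\mu], can be represented exactly, sampling the BZ at the \Gamma-point only, i.e.,
\[ V_x^{\rm SR}[\mu](\mathbf{r},\mathbf{r}') = -\frac{e^2}{2} \sum_m f_{\Gamma m} u^*_{\Gamma m}(\mathbf{r}') u_{\Gamma m}(\mathbf{r}) \frac{{\rm erfc}(\mu|\mathbf{r}-\mathbf{r}'|)}{|\mathbf{r}-\mathbf{r}'|} \tag{6.25} \]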
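This is equivalent to a representation of the bulk system using the primitive cell and a n_1 × n_2 × n_3 sampling of the BZ,
\[ V_x^{\rm SR}[\mu](\mathbf{r},\mathbf{r}') = -\frac{e^2}{2} \sum_{\mathbf{q}m} f_{\mathbf{q}m} e^{-i\mathbf{q}\cdot\mathbf{r}'} u^*_{\mathbf{q}m}(\mathbf{r}') u_{\mathbf{q}m}(\mathbf{r}) e^{i\mathbf{q}\cdot\mathbf{r}} \frac{{\rm erfc}(\mu|\mathbf{r}-\mathbf{r}'|)}{|\mathbf{r}-\mathbf{r}'|} \tag{6.26} \]
where the set of q vectors is given by
\[ \{\mathbf{q}\} = \{ i\mathbf{G}'_1 + j\mathbf{G}'_2 + k\mathbf{G}'_3 \}, \tag{6.27} \]
for i = 1,..,n_1, j = 1,..,n_2, and k = 1,..,n_3, with \mathbf{G}'_{1,2,3} being the reciprocal lattice vectors of the supercell.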
In light of the above it is clear that the number of q-points needed to represent the short-ranged Fock potential decreases with decreasing R_max (i.e., with increasing \mu). Furthermore, one should realize that the maximal range of the exchange interactions is not only limited by the erfc(\mu|\mathbf{r}-\mathbf{r}'|)/|\mathbf{r}-\mathbf{r}'| kernel, but depends on the extent of the spatial overlap of the wavefunctions as well [this can easily be shown for the Fock exchange energy when one adopts a Wannier representation of the wavefunctions in Eqs. (6.13) or (6.21)]; R_max, as defined in Eq. (6.24), therefore provides an upper limit for the range of the exchange interactions, consistent with maximal spatial overlap of the wavefunctions.
It is thus well conceivable that the situation arises where the short-ranged Fock potential may be represented on a considerably coarser mesh of points in the BZ than the other contributions to the Hamiltonian. To take advantage of this situation one may, for instance, restrict the sum over q in Eq. (6.22) to a subset, {\mathbf{q}_k}, of the full (N_1 × N_2 × N_3) k-point set, {\mathbf{k}}, for which the following holds
\[ \mathbf{q}_k = \mathbf{b}_1 \frac{n_1 C_1}{N_1} + \mathbf{b}_2 \frac{n_2 C_2}{N_2} + \mathbf{b}_3 \frac{n_3 C_3}{N_3}, \quad (n_i = 0,..,N_i-1) \tag{6.28} \]
where \mathbf{b}_{1,2,3} are the reciprocal lattice vectors of the primitive cell, and C_i is the integer grid reduction factor along reciprocal lattice direction \mathbf{b}_i. This leads to a reduction in the computational workload to:
\[ \frac{1}{C_1 C_2 C_3} \tag{6.29} \]
The integer grid reduction factors are either set separately through C_1 = NKREDX, C_2 = NKREDY, and C_3 = NKREDZ, or simultaneously through C_1 = C_2 = C_3 = NKRED. The flag EVENONLY chooses a subset of k-points with C_1 = C_2 = C_3 = 1, and n_1 + n_2 + n_3 even. It reduces the computational work load for HF type calculations by a factor two, but is only sensible for high symmetry cases (such as sc, fcc or bcc cells).
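The effect of the grid reduction factors in Eq. (6.28) can be illustrated by enumerating the q-points that survive. The following sketch is not taken from VASP; `reduced_q_grid` is a hypothetical helper, coordinates are given in units of the reciprocal lattice vectors, and points that differ by a reciprocal lattice vector are treated as equivalent (an assumption of this example).

```python
from fractions import Fraction
from itertools import product


def reduced_q_grid(N, C):
    """Distinct q-points of Eq. (6.28) for a grid N = (N1, N2, N3)
    and reduction factors C = (C1, C2, C3)."""
    for Ni, Ci in zip(N, C):
        if Ni % Ci:
            raise ValueError("each N_i must be divisible by C_i")
    points = set()
    for n in product(*(range(Ni) for Ni in N)):
        # Fold n_i * C_i / N_i back into [0, 1).
        points.add(tuple(Fraction((ni * Ci) % Ni, Ni)
                         for ni, Ci, Ni in zip(n, C, N)))
    return sorted(points)


full = reduced_q_grid((4, 4, 4), (1, 1, 1))
coarse = reduced_q_grid((4, 4, 4), (2, 2, 2))  # corresponds to NKRED = 2
workload_ratio = Fraction(len(coarse), len(full))
```

For a 4 × 4 × 4 grid and NKRED = 2 the ratio of retained q-points is 1/8, in line with Eq. (6.29). The divisibility check mirrors the requirement that the number of k-points along each direction be a multiple of the reduction factor.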
Note: From the occurrence of the range-separation parameter \mu in the above, one should not get the impression that the grid reduction can only be used/useful in conjunction with the HSE03/HSE06 functional (see Sec. 6.63.6). It can be applied in the PBE0 and pure HF cases as well, although from the above it might be clear that the HSE03 in general will allow for a larger reduction of the grid than the aforementioned functionals (see Ref. [68]).
6.63.8
It is strongly recommended to perform standard DFT calculations first, and to start HF type calculations from a preconverged WAVECAR file.
A typical INCAR file for a HF or hybrid HF/DFT calculation for an insulator or semiconductor has the following input lines:
ISTART = 1
LHFCALC = .TRUE. ; HFSCREEN = 0.2
NBANDS = number of occupied bands
ALGO = All ; TIME = 0.4
ENCUTFOCK = 0 ! omit flag for high quality calculations
NKRED = 2 ! omit flag for high quality calculations
For metals and small gap semiconductors it is recommended to use:
ISTART = 1
LHFCALC = .TRUE. ; HFSCREEN = 0.2
ALGO = Damped ; TIME = 0.4
ENCUTFOCK = 0 ! omit flag for high quality calculations
NKRED = 2 ! omit flag for high quality calculations
These input files select the HSE06 functional, which tends to yield very similar thermochemistry as the PBE0 functional, but converges more rapidly with respect to the number of k-points [68]. We thus recommend to apply and use this functional instead of the more demanding PBE0 functional. The NKRED flag is applicable if and only if the number of k-points is divisible by 2 (see Sec. 6.63.7). ENCUTFOCK selects a small FFT grid for the fast Fourier transforms (see Sec. 6.63.4). For high accuracy NKRED and in particular ENCUTFOCK might be omitted, but we recommend to do this only after preconverging the wavefunctions and atomic positions with these flags specified.
Mind that the parameter TIME defaults to 0.4, and for the present algorithm this hardly ever needs to be changed. If divergence is observed, simply decrease TIME until the damped or conjugate gradient algorithm becomes stable (see Sec. 6.43 and 6.47).
6.64
Available only in VASP.5.X. This version is presently not distributed. Documentation under construction and for internal use only!
6.64.1
Default: .FALSE.
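If LOPTICS = .TRUE., VASP calculates the frequency dependent dielectric matrix after the electronic ground state has been determined. The imaginary part is determined by a summation over empty states using the equation:
\[ \epsilon^{(2)}_{\alpha\beta}(\omega) = \frac{4\pi^2 e^2}{\Omega} \lim_{q\to 0} \frac{1}{q^2} \sum_{c,v,\mathbf{k}} 2 w_{\mathbf{k}}\, \delta(\epsilon_{c\mathbf{k}} - \epsilon_{v\mathbf{k}} - \omega) \times \langle u_{c\mathbf{k}+\mathbf{e}_\alpha q} | u_{v\mathbf{k}} \rangle \langle u_{c\mathbf{k}+\mathbf{e}_\beta q} | u_{v\mathbf{k}} \rangle^* \tag{6.30} \]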
where the indices c and v refer to conduction and valence band states respectively, and u_{c\mathbf{k}} is the cell periodic part of the wavefunctions at the k-point k. The real part of the dielectric tensor \epsilon^{(1)} is obtained by the usual Kramers-Kronig transformation
\[ \epsilon^{(1)}_{\alpha\beta}(\omega) = 1 + \frac{2}{\pi} P \int_0^\infty \frac{\epsilon^{(2)}_{\alpha\beta}(\omega')\,\omega'}{\omega'^2 - \omega^2 + i\eta}\, d\omega', \tag{6.31} \]
where P denotes the principal value. The method is explained in detail in Ref. [63] (Equ. (15), (29) and (30) in Ref. [63]). The complex shift \eta is determined by the parameter CSHIFT (Sec. 6.64.2).
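A minimal numerical sketch of the Kramers-Kronig transformation in Eq. (6.31) is given below. It is not the VASP implementation: the model spectrum is a single Lorentz oscillator with made-up parameters, the integral is done with a plain trapezoidal rule on a uniform grid, and taking the real part of the shifted integrand is an assumption of this example. The function names are hypothetical.

```python
import math


def lorentz_eps2(w, w0=5.0, wp=3.0, gamma=0.5):
    """Imaginary part of a model Lorentz-oscillator dielectric function."""
    return wp * wp * gamma * w / ((w0 * w0 - w * w) ** 2 + (gamma * w) ** 2)


def kramers_kronig_real(omega, grid, eps2, eta):
    """Real part from Eq. (6.31) on a uniform frequency grid.

    eta plays the role of the complex shift (CSHIFT in the text).
    """
    step = grid[1] - grid[0]
    last = len(grid) - 1
    total = 0.0
    for i, (w, e2) in enumerate(zip(grid, eps2)):
        weight = 0.5 if i in (0, last) else 1.0  # trapezoidal end points
        total += weight * (e2 * w / complex(w * w - omega * omega, eta)).real
    return 1.0 + (2.0 / math.pi) * step * total


grid = [0.005 * i for i in range(20001)]  # frequencies from 0 to 100
eps2 = [lorentz_eps2(w) for w in grid]
eps1_static = kramers_kronig_real(0.0, grid, eps2, eta=0.01)
# For this model the exact static value is 1 + wp**2 / w0**2 = 1.36.
```

The number of grid points here plays the same role as NEDOS in the text: a coarse grid or a large shift smears the spectrum and degrades the transformed real part.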
Note that local field effects, i.e. changes of the cell periodic part of the potential, are neglected in this approximation. These can be evaluated using either the implemented density functional perturbation theory (see Sec. 6.64.4) or the GW routines (see Sec. 6.65). Furthermore the method selected using LOPTICS = .TRUE. requires an appreciable number of empty conduction band states. Reasonable results are usually only obtained if the parameter NBANDS is roughly doubled or tripled in the INCAR file with respect to the VASP default. Furthermore it is emphasized that the routine works properly even for HF and screened exchange type calculations and hybrid functionals. In this case, finite differences are used to determine the derivatives of the Hamiltonian with respect to k.
Note that the number of frequency grid points is determined by the parameter NEDOS (see Sec. 6.36). In many cases it is desirable to increase this parameter significantly from its default value. Values around 2000 are strongly recommended.
6.64.2 CSHIFT
Default: .FALSE.
Usually VASP uses the longitudinal expression for the frequency dependent dielectric matrix as described in the preceding section (see Sec. 6.64.1). It is however possible to switch to the computationally somewhat simpler transversal expressions by selecting LNABLA = .TRUE. (in this case Equ. (17) and (20) in Ref. [63] apply). In this simplification the imaginary part of the macroscopic dielectric function \epsilon^{(2)} is given by
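\[ \epsilon^{(2)}_{\alpha\beta}(\omega) = \frac{4\pi^2 e^2 \hbar^4}{\Omega \omega^2 m_e^2} \lim_{q\to 0} \sum_{c,v,\mathbf{k}} 2 w_{\mathbf{k}}\, \delta(\epsilon_{c\mathbf{k}} - \epsilon_{v\mathbf{k}} - \omega) \langle u_{c\mathbf{k}} | i\nabla_\alpha - k_\alpha | u_{v\mathbf{k}} \rangle \langle u_{c\mathbf{k}} | i\nabla_\beta - k_\beta | u_{v\mathbf{k}} \rangle^*. \tag{6.32} \]
Except for the purpose of testing, there is however hardly ever a reason to use the transversal expression, since it is less accurate [63].
6.64.4 LEPSILON: static dielectric matrix, ion-clamped piezoelectric tensor and the Born effective charges using density functional perturbation theory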
Default: .FALSE.
Determines the static ion-clamped dielectric matrix using density functional perturbation theory. The dielectric matrix is calculated with and without local field effects. Usually local field effects are determined on the Hartree level, i.e. including changes of the Hartree potential. To include microscopic changes of the exchange correlation potential the tag LRPA = .FALSE. must be set (see Sec. 6.64.5). The method is explained in detail in Ref. [63], and follows closely the original work of Baroni and Resta [62]. A summation over empty conduction band states is not required, as opposed to the method selected by setting LOPTICS=.TRUE. (see Sec. 6.64.1). Instead, the usual expressions in perturbation theory
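\[ | \nabla_{\mathbf{k}} \tilde{u}_{n\mathbf{k}} \rangle = \sum_{n' \neq n} \frac{ | \tilde{u}_{n'\mathbf{k}} \rangle \langle \tilde{u}_{n'\mathbf{k}} | \frac{\partial (H(\mathbf{k}) - \epsilon_{n\mathbf{k}} S(\mathbf{k}))}{\partial \mathbf{k}} | \tilde{u}_{n\mathbf{k}} \rangle }{ \epsilon_{n\mathbf{k}} - \epsilon_{n'\mathbf{k}} } \tag{6.33} \]
are rewritten as linear Sternheimer equations:
\[ (H(\mathbf{k}) - \epsilon_{n\mathbf{k}} S(\mathbf{k})) | \nabla_{\mathbf{k}} \tilde{u}_{n\mathbf{k}} \rangle = - \frac{\partial (H(\mathbf{k}) - \epsilon_{n\mathbf{k}} S(\mathbf{k}))}{\partial \mathbf{k}} | \tilde{u}_{n\mathbf{k}} \rangle \]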
The solution of this equation involves similar iterative techniques as the conventional selfconsistency cycles. Hence, for each element of the dielectric matrix several lines will be written to the stdout and OSZICAR. These possess a similar structure as for conventional selfconsistent or non-selfconsistent calculations (a residual minimization scheme is used to solve the linear equation, other schemes such as Davidson do not apply to a linear equation):
       N        E              dE            d eps        ncg     rms        rms(c)
RMM:   1   -0.14800E+01   -0.85101E-01   -0.72835E+00     220   0.907E+00   0.146E+00
RMM:   2   -0.14248E+01    0.55195E-01   -0.27994E-01     221   0.449E+00   0.719E-01
RMM:   3   -0.13949E+01    0.29864E-01   -0.10673E-01     240   0.322E+00   0.131E-01
RMM:   4   -0.13949E+01    0.13883E-04   -0.31511E-03     242   0.600E-01   0.336E-02
RMM:   5   -0.13949E+01    0.28357E-04   -0.25757E-04     228   0.177E-01   0.126E-02
It is important to note that exact values for the dielectric matrix are obtained even if only valence band states are calculated. Hence this method does not require an increase of the NBANDS parameter. The final values for the static dielectric matrix can be found in the OUTCAR file after the lines
MACROSCOPIC STATIC DIELECTRIC TENSOR (excluding local field effects)
and
MACROSCOPIC STATIC DIELECTRIC TENSOR (including local field effects in DFT)
The values found after MACROSCOPIC STATIC DIELECTRIC TENSOR (excluding local field effects) should match exactly the zero frequency values (\omega \to 0) determined by the method selected using LOPTICS=.TRUE. (see Sec. 6.64.1). This offers a convenient way to determine how many empty bands are required for LOPTICS=.TRUE.. Simply execute VASP using LEPSILON = .TRUE. in order to determine the exact values for the dielectric constants. Next, switch to LOPTICS=.TRUE. and increase the number of conduction bands until the same values are obtained as using density functional perturbation theory.
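The comparison described above can be automated with a small script. The sketch below is only an illustration: it assumes that the tensor is printed as three rows of three numbers somewhere after the header line (possibly separated by a line of dashes), which should be verified against an actual OUTCAR file. The function names `read_dielectric_tensor` and `tensors_agree` and the sample text are made up for this example.

```python
def read_dielectric_tensor(text, label):
    """Return the 3x3 tensor following the header that contains `label`.

    Assumed layout: header line, optional separator lines, then three
    rows with three floating point numbers each. Returns None if no
    matching header with a complete tensor is found.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if "MACROSCOPIC STATIC DIELECTRIC TENSOR" in line and label in line:
            rows = []
            for candidate in lines[i + 1:]:
                try:
                    row = [float(p) for p in candidate.split()]
                except ValueError:
                    continue  # separator or other non-numeric line
                if len(row) == 3:
                    rows.append(row)
                if len(rows) == 3:
                    return rows
    return None


def tensors_agree(a, b, tol):
    """True if all nine components differ by less than tol."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


# Made-up sample text in the assumed layout, not real VASP output.
sample = """
 MACROSCOPIC STATIC DIELECTRIC TENSOR (excluding local field effects)
 ------------------------------------------------------
    13.10     0.00     0.00
     0.00    13.10     0.00
     0.00     0.00    13.10
 ------------------------------------------------------
"""
reference = read_dielectric_tensor(sample, "excluding local field effects")
```

In practice one would read the reference tensor from the LEPSILON run and compare it with the zero-frequency values of successive LOPTICS runs with increasing NBANDS until `tensors_agree` returns true for the chosen tolerance.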
Note that the routine also parses and uses the value supplied in the LNABLA tag (see Sec. 6.64.3). Furthermore, the routine calculates the Born effective charge tensor (dynamical charges) and the electronic contribution to the piezoelectric tensor, and prints them after
BORN EFFECTIVE CHARGES (in e, cummulative output)
and
PIEZOELECTRIC TENSOR for field in x, y, z (C/m2)
if LRPA=.FALSE. is set (the calculated tensors are not sensible in the random phase approximation, LRPA=.TRUE.).
Pros compared to LOPTICS=.TRUE. (see Sec. 6.64.1):
It is not sensible to select LOPTICS=.TRUE. and LEPSILON=.TRUE. in a single run (most likely it does work, however). Density functional perturbation theory (LEPSILON=.TRUE.) does not require an increase of NBANDS and is, in fact, much slower if NBANDS is increased, whereas the summation over empty conduction band states requires a large number of such states.
Default: .TRUE.
Usually local field effects are included on the Hartree level only. This means that cell periodic microscopic changes of the local potential related to the change of the Hartree potential are included. If LRPA = .FALSE., however, changes of the Hartree potential and the exchange correlation potential are included. This usually increases the dielectric constants. The final values for the dielectric matrix can be found in the OUTCAR file after the lines:
MACROSCOPIC STATIC DIELECTRIC TENSOR (including local field effects in RPA (Hartree))
For LRPA=.FALSE. the dielectric matrix is written after the lines:
MACROSCOPIC STATIC DIELECTRIC TENSOR (including local field effects in DFT)
The dielectric constants without local field effects are always determined (regardless of LRPA). The piezoelectric tensors and the Born effective charges as well as the ionic contributions to the dielectric tensor are only calculated for LRPA=.FALSE.
6.64.6 Vibrational frequencies, relaxed-ion static dielectric tensor and relaxed-ion piezoelectric tensor
Setting IBRION=8 or IBRION=7 selects the calculation of the interatomic force constants using density functional perturbation theory. For IBRION=8, symmetry is taken into account, whereas IBRION=7 neglects symmetry considerations and is thus usually significantly more expensive. If IBRION=7 (or IBRION=8) and LEPSILON=.TRUE. is selected, the relaxed-ion static dielectric tensor, or low frequency dielectric tensor, and the relaxed-ion piezoelectric tensors are determined [72]. All values are collected and printed at the end of the OUTCAR file (see also Sec. 6.21.7). Specifically the ionic contribution to the piezoelectric tensor is printed after
PIEZOELECTRIC TENSOR IONIC CONTR for field in x, y, z (C/m2)
and the ionic contributions to the dielectric tensor are printed after:
MACROSCOPIC STATIC DIELECTRIC TENSOR IONIC CONTRIBUTION
Available only in VASP.5.X. This version is presently not distributed. Documentation under construction and for internal use only!
6.65.1 ALGO for response functions and GW calculations
ALGO
Typically NOMEGA should be chosen around 50-100 (for the parallel version, NOMEGA should be divisible by the number of compute nodes to obtain maximum efficiency). For quick and memory conserving calculations, it is sufficient to set NOMEGA to values around NOMEGA = 20-30, but then you must expect errors of the order of 20-50 meV for the gap, and 100-200 meV for the bottom of the conduction band. We furthermore recommend not to increase NOMEGA beyond 100 for a k-point sampling of 4 × 4 × 4 points/atom: the joint DOS and the self-energy tend to possess spurious fine structure related to the finite k-point grid. This fine structure is smoothed when smaller values for NOMEGA are used, or if more k-points are used. For 6 × 6 × 6 k-points/atom NOMEGA can usually be increased to 200-300 without noticing problems associated with this kind of noise. Note that the spectral method (LSPECTRAL, see Sec. 6.65.3) scales very favourably with respect to the number of frequency points, hence NOMEGA = 30 is usually only slightly faster than NOMEGA = 100.
6.65.3
OMEGAMAX = real number (maximum frequency for dense part of frequency grid)
OMEGATL = real number (maximum frequency for coarse part of frequency grid)
CSHIFT = complex shift
The ENCUTGW controls how many G vectors are included in the response function \chi^0_{\mathbf{q}}(\mathbf{G},\mathbf{G}',\omega).
Tests have shown that choosing ENCUTGW = ENCUT yields essentially exact results. In principle, however, the response function contains contributions up to twice the plane wave cutoff G_cut (see Sec. 7.2). Since the diagonal of the dielectric matrix converges rapidly to one, such a large cutoff is never actually required (the present release has only been tested for ENCUTGW ≤ ENCUT, and might crash if ENCUTGW > ENCUT). Furthermore, in most cases, it is even possible to set ENCUTGW to a value between 150 and 200 eV, and even 100 eV usually gives QP shifts that are accurate to within a few hundredths of an eV (0.01-0.02 eV). This can help to speed up the calculations significantly and reduces the memory requirements substantially.
The flag ENCUTFOCK (Sec. 6.63.4) determines the FFT grid in all Hartree-Fock and GW routines. It therefore influences the behavior and performance of the GW routines (see Sec. 6.63.4). But because FFTs do not dominate the computational work load for GW calculations, savings are small if ENCUTFOCK is set. On the other hand, setting ENCUTFOCK=0 hardly influences the QP shifts, so it does not do any harm to set ENCUTFOCK=0 routinely in GW calculations.
6.65.6 ODDONLYGW and EVENONLYGW: reducing the k-grid for the response functions
EVENONLYGW = logical
ODDONLYGW = logical
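ODDONLYGW allows to avoid the inclusion of the \Gamma-point in the evaluation of response functions. The independent particle polarizability \chi^0_{\mathbf{q}}(\mathbf{G},\mathbf{G}',\omega) is given by:
\[ \chi^0_{\mathbf{q}}(\mathbf{G},\mathbf{G}',\omega) = \frac{1}{\Omega} \sum_{nn'\mathbf{k}} 2 w_{\mathbf{k}} (f_{n'\mathbf{k}+\mathbf{q}} - f_{n\mathbf{k}}) \frac{ \langle \psi_{n'\mathbf{k}+\mathbf{q}} | e^{i(\mathbf{q}+\mathbf{G})\cdot\mathbf{r}} | \psi_{n\mathbf{k}} \rangle \langle \psi_{n\mathbf{k}} | e^{-i(\mathbf{q}+\mathbf{G}')\cdot\mathbf{r}'} | \psi_{n'\mathbf{k}+\mathbf{q}} \rangle }{ \epsilon_{n'\mathbf{k}+\mathbf{q}} - \epsilon_{n\mathbf{k}} - \omega - i\eta } \tag{6.35} \]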
If the \Gamma-point is included in the summation over k, convergence is very slow for some materials (e.g. GaAs).
To deal with this problem the flag ODDONLYGW has been included. In the automatic mode, the k-grid is given by (see Sec. 5.5.3):
\[ \mathbf{k} = \mathbf{b}_1 \frac{n_1}{N_1} + \mathbf{b}_2 \frac{n_2}{N_2} + \mathbf{b}_3 \frac{n_3}{N_3}, \quad n_1 = 0,...,N_1-1, \quad n_2 = 0,...,N_2-1, \quad n_3 = 0,...,N_3-1. \]
If the three integers n_i sum to an odd value, the k-point is included in the previous summation in the GW routine (ODDONLYGW=.TRUE.). Note that other routines (linear optical properties) presently do not recognize this flag. EVENONLYGW=.TRUE. is only of limited use and restricts the summation to k-points with n_1 + n_2 + n_3 being even (\Gamma-point and from there on every second k-point included).
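The parity rule is simple enough to illustrate directly. The sketch below selects grid points by the parity of n_1 + n_2 + n_3; it is a toy enumeration of integer grid indices, not VASP code, and `parity_subset` is a hypothetical name.

```python
from itertools import product


def parity_subset(N, parity):
    """Grid indices (n1, n2, n3) with (n1 + n2 + n3) % 2 == parity.

    parity = 1 mimics ODDONLYGW, parity = 0 mimics EVENONLYGW.
    """
    return [n for n in product(*(range(Ni) for Ni in N))
            if sum(n) % 2 == parity]


odd_points = parity_subset((4, 4, 4), 1)
even_points = parity_subset((4, 4, 4), 0)

gamma = (0, 0, 0)
gamma_excluded = gamma not in odd_points  # the Gamma-point has an even index sum
```

For an even grid such as 4 × 4 × 4 both subsets contain half of the points, and the \Gamma-point always falls into the even subset, which is why the odd subset avoids it.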
Accelerations are also possible by evaluating the response function itself at a restricted number of q-points. Note that the GW loop involves a sum over k, and a second one over q (the index in the response function). To some extent both can be varied independently: the former by using ODDONLYGW, and the latter using the HF related flags NKRED, NKREDX, NKREDY, NKREDZ and EVENONLY, ODDONLY. As explained in Sec. 6.63.7 the index q can be restricted to the values
\[ \mathbf{q} = \mathbf{b}_1 \frac{n_1 C_1}{N_1} + \mathbf{b}_2 \frac{n_2 C_2}{N_2} + \mathbf{b}_3 \frac{n_3 C_3}{N_3}, \quad (n_i = 0,..,N_i-1) \tag{6.36} \]
The integer grid reduction factors are either set separately through C_1 = NKREDX, C_2 = NKREDY, and C_3 = NKREDZ, or simultaneously through C_1 = C_2 = C_3 = NKRED.
6.65.7
default: LSELFENERGY=.FALSE.
If LSELFENERGY=.FALSE., QP shifts are evaluated. This is the default behavior.
If LSELFENERGY=.TRUE. the frequency dependent self-energy \langle \phi_{n\mathbf{k}} | \Sigma(\omega) | \phi_{n\mathbf{k}} \rangle is evaluated. Evaluation of QP shifts is bypassed in this case.
6.65.8
If LWAVE=.TRUE. is set explicitly in the INCAR file, the WAVECAR file is updated after the GW calculations, and the updated QP-energies are written to the file. This allows to perform selfconsistent GW instead of G0W0 calculations. Note that only the energies are updated, whereas wavefunctions are kept constant on the DFT level.
6.65.9
GW calculations always require the calculation of a standard DFT WAVECAR file in an initial step, using for instance the following INCAR file:
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05 ! small sigma is required to avoid partial occupancies
LOPTICS = .TRUE.
Note that a significant number of empty bands is required for GW calculations. Furthermore, note that the flag LOPTICS=.TRUE. is required in order to write the file WAVEDER, which contains the derivative of the wavefunctions with respect to k. The actual GW calculations are performed in a second step, using an INCAR file such as (it is convenient to add a single line):
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05
LOPTICS = .TRUE.
ALGO = GW0 ; NOMEGA = 50
The head and wings of the dielectric matrix are constructed using k.p perturbation theory (this requires that the file WAVEDER exists). In the present release the interaction between the core and the valence electrons is always treated on the Hartree-Fock level.
For hybrid functionals, the two step procedure will accordingly involve the following INCAR files. In the first step, converged HSE03 wavefunctions are determined (usually HSE03 calculations should be preceded by standard DFT calculations; we have not documented this step here, see Sec. 6.63.8):
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05
ALGO = Damped ; TIME = 0.5
AEXX = 0.25 ; HFSCREEN = 0.3
LOPTICS = .TRUE.
In the GW step, the head and the wings of the response matrix are correctly determined from the WAVEDER file.
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05
ALGO = GW0 ; NOMEGA = 50
Convergence with respect to the number of empty bands NBANDS and with respect to the number of frequencies NOMEGA must be checked carefully.
6.65.10
Presently only selfconsistent GW calculations within a QP picture are supported, in the sense that the eigenvalues (and possibly eigenfunctions) are updated, but satellite peaks (shake ups and shake downs) cannot be accounted for in the selfconsistency cycle. Selfconsistent GW calculations can be performed either by simply repeatedly calling VASP using:
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05
ALGO = GW # or ALGO = scGW
LWAVE = .TRUE.
Results are identical for ALGO = GW0 and ALGO = GW. For scGW0 or scGW, non-diagonal terms in the Hamiltonian are also accounted for, and the linearized QP equation is diagonalized. Alternatively, specify an electronic iteration counter using NELM:
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05
ALGO = GW # or ALGO = scGW
NELM = 4
LWAVE = .TRUE. ! depends on whether you want to have final updated
! eigenvalues on WAVECAR
In this case the QP energies are updated 4 times (starting from the DFT eigenvalues) in both G and W.
6.65.11
In most cases, the best results (i.e. closest to experiment) are obtained by iterating only G, but keeping W fixed to the initial DFT W0. This can be achieved in two ways. If the spectral method is not selected, the QP shifts are iterated automatically four times, and you will find four sets of QP shifts in the OUTCAR file. The first one corresponds to the G0W0 case, the final one to the GW0 results. The INCAR file is simply:
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05
ALGO = GW0 ; NOMEGA = 50
For technical reasons, it is not possible to iterate G in that manner if LSPECTRAL=.TRUE. is set in the INCAR file. In this case, an iteration number must be supplied in the INCAR file using the NELM tag. Usually three to four iterations are sufficient for accurate QP shifts.
System = Si
NBANDS = 96
ISMEAR = 0 ; SIGMA = 0.05
ALGO = GW0 ; NOMEGA = 50
NELM = 4
If the spectral method is not used, the specification of NELM is not sensible. If non-diagonal components of the self-energy should be included, use ALGO = scGW0.
6.65.12
Using the GW routines for the determination of the frequency dependent dielectric matrix
The GW routine also determines the frequency dependent dielectric matrix without local field effects and with local field effects in the random phase approximation (RPA, LRPA=.TRUE.), or the DFT approximation (LRPA=.FALSE., see Sec. 6.64.5). The calculated microscopic frequency dependent dielectric function must match exactly the one determined using the optical routine (LOPTICS=.TRUE., see Sec. 6.64.1), and, in the static limit, the one from the density functional perturbation routines (LEPSILON=.TRUE., see Sec. 6.64.4). In fact, it is guaranteed that the results are identical to those determined using a summation over conduction band states (Sec. 6.64.1). Differences for LSPECTRAL=.FALSE. must be negligible, and can be solely related to a different complex shift CSHIFT (defaults for CSHIFT are different in both routines). Setting CSHIFT manually in the INCAR file will remedy this issue. If differences persist, it might be required to increase NEDOS (in this case the LOPTICS routine was suffering from an inaccurate frequency sampling, and the GW routine was most likely performing perfectly well). For LSPECTRAL=.TRUE. differences can arise, because (i) the GW routine uses fewer frequency points and different frequency grids than the optics routine or again (ii) from a different complex shift. Increasing NOMEGA should remove all discrepancies. Finally, the GW routine is the only routine capable of including local field effects for the frequency dependent dielectric function.
The imaginary and real parts of the frequency dependent dielectric function are always determined by the GW routine. They can be conveniently grepped from the OUTCAR file using the command (note two blanks between the two words)
grep " dielectric  constant" OUTCAR
The first value is the frequency (in eV) and the other two are the real and imaginary part of the trace of the dielectric matrix. Note that two sets can be found in the OUTCAR file. The first one corresponds to the head of the microscopic dielectric matrix (and therefore does not include local field effects), whereas the second one is the inverse of the dielectric matrix with local field effects included in the random phase approximation or density functional approximation (depending on LRPA).
If full GW calculations are not required, it is possible to greatly accelerate the calculations by calculating the response functions only at the $\Gamma$-point. This can be achieved by setting (see Sec. 6.65.6):
THEORETICAL BACKGROUND
The calculation of the QP shifts can be bypassed by setting ALGO = CHI (see Sec. 6.65.1). Furthermore, if only the static response function is required, the number of frequency points should be set to NOMEGA=1 and LSPECTRAL=.FALSE.
6.66
First of all, the memory requirements of the serial version can be estimated using the makeparam utility (see Sec. 5.23). At present, there is however no way to estimate the memory requirements of the parallel version.
In fact, it might be difficult to run huge jobs on thin T3E or SP2 nodes. Most tables (pseudopotentials etc.) and the executable must be held on all nodes (10-20 Mbytes). In addition one complex array of the size $N_{bands} \times N_{bands}$ is allocated on each node; during dynamic simulation even up to three such arrays are allocated. Upon reading and writing the charge density, a complex array that can hold all data points of the charge density is allocated (8*NGXF*NGYF*NGZF). Finally, three such arrays are allocated (and deallocated) during the charge density symmetrisation (the charge density symmetrisation usually takes the largest amount of memory). All other data are distributed among all nodes.
The following things can be tried to reduce the memory requirements on each node.
- Possibly the executable becomes smaller if the options -G1 (T3E) and -g are removed from the lines OFLAG and DEBUG in the makefile.
- Switch off symmetrisation (ISYM=0). Symmetrisation is done locally on each node, requiring three huge arrays. VASP.4.4.2 (and newer versions) have a switch to run a more memory conserving symmetrization. This can be selected by specifying ISYM=2. Results might however differ somewhat from ISYM=1 (usually only 1/100th of a meV). Also avoid writing or reading the file CHGCAR (LCHARG=F).
- Use NPAR=1.
It should be mentioned that VASP relies heavily on dynamic memory allocation (ALLOCATE and DEALLOCATE). As far as we know there is no memory leakage (ALLOCATE without DEALLOCATE); unfortunately, however, it is impossible to be entirely sure that no leakage exists. Some users have observed that the code grows during dynamic simulations on the T3E. This is most likely due to problematic dynamic memory management in the f90 runtime system and not due to a programming error in VASP. Unfortunately the dynamic memory subsystems of most f90 compilers are still rather inefficient. As a result it might happen that the memory becomes more and more fragmented during the run, so that large pieces of memory cannot be allocated. We can only hope for improvements in the dynamic memory management (for instance the introduction of garbage collectors).
7 Theoretical Background
The following sections contain some background information on the algorithms used in VASP. They do not contain a complete reference to all the things implemented in VASP but try to give hints on the most important topics. You should really understand at least the ideas touched on here, but it might still be possible to get good results without understanding all of it.
For a basic outline of pseudopotential plane wave programs we refer to [6, 7]. Ultrasoft pseudopotentials are explained in [8, 9, 10, 18]. An excellent introduction to PP plane wave codes, albeit in German, can be found in the thesis of J. Furthmüller [11]. The best explanation of the algorithms found in VASP can be found in Ref. [13, 14]; these two papers give much more information than can be found in the following sections. And last but not least, you might want to read the thesis of G. Kresse [12] (in German too); it contains a general discussion of PP including ultrasoft PP, and a discussion of the KS-functional and algorithms to calculate the KS-groundstate.
7.1
The following section discusses the minimization algorithms implemented in VASP. We generally have one outer loop in which the charge density is optimized, and one inner loop in which the wavefunctions are optimized. Have at least a look at the flowchart.
Most of the algorithms implemented in VASP use an iterative matrix-diagonalization scheme: the algorithms used are based on the conjugate gradient scheme [20, 21], block Davidson scheme [22, 23], or a residual minimization scheme, direct inversion in the iterative subspace (RMM-DIIS) [19, 26]. For the mixing of the charge density an efficient Broyden/Pulay mixing scheme [24, 25, 26] is used. Fig. 3 shows a typical flow-chart of VASP. Input charge density and wavefunctions are independent quantities (at start-up these quantities are set according to INIWAV and ICHARG). Within each selfconsistency loop the charge density is used to set up the Hamiltonian, then the wavefunctions are optimized iteratively so that they get closer to the exact wavefunctions of this Hamiltonian. From the optimized wavefunctions a new charge density is calculated, which is then mixed with the old input charge density. A brief flowchart is given in Fig. 3.
The conjugate gradient and the residual minimization scheme do not recalculate the exact Kohn-Sham eigenfunctions but an arbitrary linear combination of the NBANDS lowest eigenfunctions. Therefore it is in addition necessary to diagonalize the Hamiltonian in the subspace spanned by the trial wavefunctions, and to transform the wavefunctions accordingly (i.e. perform a unitary transformation of the wavefunctions, so that the Hamiltonian is diagonal in the subspace spanned by the transformed wavefunctions). This step is usually called sub-space diagonalization (although a more appropriate name would be, using the Rayleigh-Ritz variational scheme in a sub-space spanned by the wavefunctions):
$$\langle \phi_j | H | \phi_i \rangle = \bar{H}_{ij}$$
$$\bar{H}_{ij} U_{jk} = \epsilon_k U_{ik}$$
$$\phi_j \leftarrow U_{jk} \phi_k$$
The sub-space diagonalization can be performed before or after the conjugate gradient or residual minimization scheme. Tests we have done indicate that the first choice is preferable during selfconsistent calculations.
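The sub-space diagonalization step can be sketched with NumPy on a toy problem. This is an illustration under stated assumptions, not VASP's implementation: the function name `subspace_rotation` is hypothetical, the trial vectors are assumed orthonormal (overlap matrix equal to the identity), and the "Hamiltonian" is a random symmetric matrix.

```python
import numpy as np

def subspace_rotation(H, Phi):
    """Rayleigh-Ritz step for orthonormal trial vectors stored as columns of Phi."""
    H_sub = Phi.conj().T @ H @ Phi      # matrix elements <phi_i|H|phi_j>
    eps, U = np.linalg.eigh(H_sub)      # H_sub U = U diag(eps)
    return eps, Phi @ U                 # rotated vectors: phi_k <- sum_j phi_j U_jk

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
H = (A + A.T) / 2                       # toy symmetric "Hamiltonian" (assumption)
Phi, _ = np.linalg.qr(rng.standard_normal((8, 3)))  # three orthonormal trial vectors

eps, Phi_rot = subspace_rotation(H, Phi)
H_rot = Phi_rot.T @ H @ Phi_rot         # Hamiltonian in the rotated subspace
print(np.allclose(H_rot, np.diag(eps)))
```

After the rotation the Hamiltonian is diagonal within the subspace, while the span of the trial vectors is unchanged; the rotated vectors are in general still not exact eigenvectors of the full matrix.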
In general all iterative algorithms work in a very similar way: the core quantity is the residual vector
$$|R_n\rangle = (H - E)\,|\phi_n\rangle \quad \text{with} \quad E = \frac{\langle \phi_n | H | \phi_n \rangle}{\langle \phi_n | \phi_n \rangle} \qquad (7.1)$$
This residual vector is added to the wavefunction $\phi_n$; the algorithms differ in the way this is exactly done.
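As a small numerical illustration of eq. (7.1), the following sketch evaluates the Rayleigh quotient and the residual vector for a toy diagonal matrix. The function name `residual` and the 3x3 example are assumptions made for this illustration only.

```python
import numpy as np

def residual(H, phi):
    """Return (E, R) with E the Rayleigh quotient and R = (H - E) phi, cf. eq. (7.1)."""
    E = np.vdot(phi, H @ phi).real / np.vdot(phi, phi).real
    return E, H @ phi - E * phi

H = np.diag([1.0, 2.0, 4.0])                                  # toy Hamiltonian
E_exact, R_exact = residual(H, np.array([0.0, 1.0, 0.0]))    # exact eigenvector
E_trial, R_trial = residual(H, np.array([1.0, 1.0, 0.0]))    # mixture of two eigenvectors
print(E_exact, np.linalg.norm(R_exact))
print(E_trial, R_trial)
```

The residual vanishes for an exact eigenvector, and for a trial vector it is orthogonal to the vector itself, which is why it can serve as a correction direction.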
7.1.1 Preconditioning
The idea is to find a matrix which, multiplied with the residual vector, gives the exact error in the wavefunction. Formally this matrix (the Green's function) can be written down and is given by
$$\frac{1}{H - \epsilon_n},$$
where $\epsilon_n$ is the exact eigenvalue for the band of interest. Actually the evaluation of this matrix is not possible. Recognizing that the kinetic energy dominates the Hamiltonian for large G-vectors (i.e. $H_{G,G'} \to \delta_{G,G'} \frac{\hbar^2}{2m} G^2$), it is a good idea to approximate the matrix by a diagonal function which converges to $\frac{2m}{\hbar^2 G^2}$ for large G vectors, and possesses a constant value for small G vectors. We actually use the preconditioning function proposed by Teter et al. [20]:
$$\langle G | K | G' \rangle = \delta_{GG'} \frac{27 + 18x + 12x^2 + 8x^3}{27 + 18x + 12x^2 + 8x^3 + 16x^4} \quad \text{and} \quad x = \frac{\hbar^2}{2m} \frac{G^2}{1.5\, E^{kin}(R)},$$
with $E^{kin}(R)$ being the kinetic energy of the residual vector. The preconditioned residual vector is then simply
$$|p_n\rangle = K\,|R_n\rangle.$$
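The preconditioning function above is easy to evaluate directly. The following is a minimal sketch; the function name `teter_preconditioner` and its argument names are hypothetical, and both arguments are assumed to be energies in the same units.

```python
def teter_preconditioner(g2_kinetic, e_kin_residual):
    """Diagonal element <G|K|G> of the preconditioning function quoted above.

    g2_kinetic     : kinetic energy hbar^2 G^2 / 2m of the plane wave
    e_kin_residual : kinetic energy E_kin(R) of the residual vector
    """
    x = g2_kinetic / (1.5 * e_kin_residual)
    numerator = 27.0 + 18.0 * x + 12.0 * x**2 + 8.0 * x**3
    return numerator / (numerator + 16.0 * x**4)

for kinetic in (0.0, 1.5, 15.0, 150.0):
    print(kinetic, teter_preconditioner(kinetic, 1.0))
```

In this form the function equals 1 for small G and decays like 1/(2x) for large G, i.e. proportionally to the inverse kinetic energy, which is the limiting behaviour described in the text.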
7.1.2
The preconditioned residual vector is calculated for each band, resulting in a $2 N_{bands}$ basis-set
$$b_{i,\,i=1,\dots,2N_{bands}} = \{\phi_n / p_n \;|\; n = 1, \dots, N_{bands}\}.$$
Within this subspace the NBANDS lowest eigenfunctions are calculated by solving the eigenvalue problem
$$\langle b_i | H - \epsilon_j S | b_j \rangle = 0.$$
The NBANDS lowest eigenfunctions are used in the next step.
[Figure 3: typical flow-chart of VASP. Recoverable labels: trial charge; set up Hamiltonian; subspace diagonalization ($\phi_n \Leftarrow U_{nn'} \phi_{n'}$); $E = \sum_n \epsilon_n f_n$; convergence test $\Delta E < E_{break}$ with branch "no".]
(7.2)
Then the linear combination of this 'search direction' $g_n$ and the current wavefunction $\phi_n$ is calculated which minimizes the expectation value of the Hamiltonian. This requires solving the $2 \times 2$ eigenvalue problem
$$\langle b_i | H - \epsilon S | b_j \rangle = 0,$$
$$\langle b_i | H - \epsilon S | b_j \rangle = 0, \qquad \{b_i\} = \{\phi_n / g_n^1 / g_n^2 / g_n^3 / \dots\}.$$
The lowest eigenvector of the eigenvalue problem is used to calculate a new (possibly preconditioned) search vector $g_n^N$.
7.1.5 Conjugate gradient optimization
Instead of the previous iteration scheme, which is just some kind of Quasi-Newton scheme, it is also possible to optimize the expectation value of the Hamiltonian using a successive number of conjugate gradient steps. The first step is equal to the steepest descent step in section 7.1.3. In all following steps the preconditioned gradient $g_n^N$ is conjugated to the previous search direction. The resulting conjugate gradient algorithm is almost as efficient as the algorithm given in section 7.1.4. For further reading see [20, 21, 28].
7.1.6 Implemented Davidson-block iteration scheme
- Select a subset of all bands from $\{\phi_n \,|\, n = 1, \dots, N_{bands}\} \Rightarrow \{\phi^1_k \,|\, k = 1, \dots, n_1\}$.
- Optimize this subset by adding the orthogonalized preconditioned residual vectors to the presently considered subspace:
$$\left\{ \phi^1_k \,/\, g^1_k = \Big(1 - \sum_{n=1}^{N_{bands}} |\phi_n\rangle\langle\phi_n| S\Big) K (H - \epsilon_{app} S)\, \phi^1_k \;\Big|\; k = 1, \dots, n_1 \right\}$$
- Apply Rayleigh-Ritz optimization in the space spanned by these vectors (sub-space rotation in a $2 n_1$ dimensional space) to determine the $n_1$ lowest vectors $\{\phi^2_k \,|\, k = 1, \dots, n_1\}$.
- Add additional preconditioned residuals calculated from the yet optimized bands:
$$\left\{ \phi^1_k \,/\, g^1_k \,/\, g^2_k = \Big(1 - \sum_{n=1}^{N_{bands}} |\phi_n\rangle\langle\phi_n| S\Big) K (H - \epsilon_{app} S)\, \phi^2_k \;\Big|\; k = 1, \dots, n_1 \right\}$$
7.1.7
The schemes 7.1.3-7.1.5 try to optimize the expectation value of the Hamiltonian for each wavefunction using an increasing trial basis-set. Instead of minimizing the expectation value it is also possible to minimize the norm of the residual vector. This leads to a similar iteration scheme as described in section 7.1.4, but a different eigenvalue problem has to be solved (see Ref. [19, 26]).
There is a significant difference between optimizing the eigenvalue and the norm of the residual vector. The norm of the residual vector is given by
$$\langle \phi_n | (H - \epsilon)^{\dagger} (H - \epsilon) | \phi_n \rangle$$
and possesses a quadratic unrestricted minimum at each eigenfunction $\phi_n$. If you have a good starting guess for the eigenfunction it is possible to use this algorithm without the knowledge of other wavefunctions, and therefore without the explicit orthogonalization of the preconditioned residual vector (eq. 7.2). In this case, after a sweep over all bands, a Gram-Schmidt orthogonalization is necessary to obtain a new orthogonal trial-basis set. Without the explicit orthogonalization to the current set of trial wavefunctions all other algorithms tend to converge to the lowest band, no matter from which band they start.
7.2
In this section we will discuss wrap around errors. Wrap around errors arise if the FFT meshes are not sufficiently large. It can be shown that no errors exist if the FFT meshes contain all G vectors up to $2G_{cut}$.
It can be shown that the charge density contains components up to $2G_{cut}$, where $G_{cut}$ is the length of the 'longest' plane wave in the basis set:
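The wavefunction is defined as
$$|\phi_{nk}\rangle = \sum_G C_{Gnk}\, |k + G\rangle,$$
in real space it is given by
$$\langle r | \phi_{nk} \rangle = \frac{1}{\Omega^{1/2}} \sum_G e^{i(k+G)\cdot r}\, C_{Gnk}.$$
Using fast Fourier transformations one can define
$$C_{rnk} = \sum_G C_{Gnk}\, e^{iG\cdot r}. \qquad (7.3)$$
The charge density is given by
$$\rho^{ps}_r = \sum_k w_k \sum_n f_{nk}\, \phi_{nk}(r)\, \phi^*_{nk}(r), \qquad (7.4)$$
and on the reciprocal mesh it can be written as
$$\rho^{ps}_G = \frac{1}{N_{FFT}} \sum_r \rho^{ps}_r\, e^{-iG\cdot r}. \qquad (7.5)$$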
Inserting $\rho^{ps}$ from equation (7.4) and $C_{rnk}$ from (7.3) it is very easy to show that $\rho^{ps}_r$ contains Fourier-components up to $2G_{cut}$.
Generally it can be shown that the product $f_r = f^1_r f^2_r$ (a convolution in reciprocal space) of two 'functions' $f^1_r$ with Fourier-components up to $G_1$ and $f^2_r$ with Fourier-components up to $G_2$ contains Fourier-components up to $G_1 + G_2$.
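This statement about Fourier components can be checked numerically in one dimension. The sketch below is an illustration only: the helper `max_frequency`, the mesh size of 64 points and the two cosine sums are assumptions chosen so that the mesh is large enough to hold all components without wrap-around.

```python
import numpy as np

def max_frequency(values, tol=1e-9):
    """Largest |G| (integer index) with a non-negligible Fourier coefficient."""
    coeff = np.fft.fft(values) / len(values)
    freqs = np.fft.fftfreq(len(values), d=1.0 / len(values))  # integer frequencies
    return int(np.max(np.abs(freqs[np.abs(coeff) > tol])))

n_mesh = 64
x = 2.0 * np.pi * np.arange(n_mesh) / n_mesh
f1 = sum(np.cos(g * x) for g in range(4))   # components up to G1 = 3
f2 = sum(np.cos(g * x) for g in range(6))   # components up to G2 = 5
print(max_frequency(f1), max_frequency(f2), max_frequency(f1 * f2))  # prints: 3 5 8
```

If the same product were sampled on a mesh that cannot represent the component at $G_1 + G_2$, that component would be folded back onto a smaller G, which is the wrap around error discussed in this section.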
The property of the convolution comes once again into play when the action of the Hamiltonian onto a wavefunction is calculated. The action of the local potential is given by
$$a_r = V_r C_{rnk}.$$
Only the components $a_G$ with $|G| < G_{cut}$ are taken into account (see section 7.1: $a_G$ is added to the wavefunction during the iterative refinement of the wavefunctions $C_{Gnk}$, and $C_{Gnk}$ contains only components up to $G_{cut}$). From the previous theorem we see that $a_r$ contains components up to $3G_{cut}$ ($V_r$ contains components up to $2G_{cut}$). If the FFT-mesh contains all components up to $2G_{cut}$ the resulting wrap-around error is once again 0. This can be easily seen in Fig. 4.
Figure 4: The small sphere contains all plane waves included in the basis set $G < G_{cut}$. The charge density contains components up to $2G_{cut}$ (second sphere), and the acceleration $a$ components up to $3G_{cut}$, which are reflected in (third sphere) because of the finite size of the FFT-mesh. Nevertheless the components $a_G$ with $|G| < G_{cut}$ are correct, i.e. the small sphere does not intersect with the third large sphere.
7.3
Recently there was an increased interest in the so called Harris-Foulkes (HF) functional. This functional is non selfconsistent: The potential is constructed for some 'input' charge density, then the band-structure term is calculated for this fixed non selfconsistent potential. Double counting corrections are calculated from the input charge density: the functional can be written as
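$$E_{HF}[\rho_{in}, \rho] = \text{band-structure for } (V^{H}_{in} + V^{xc}_{in}) + \mathrm{Tr}\big[(-V^{H}_{in}/2 - V^{xc}_{in})\, \rho_{in}\big] + E^{xc}[\rho_{in} + \rho_c].$$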
It is interesting that the functional gives a good description of the binding-energies, equilibrium lattice constants, and bulk-modulus even for covalently bonded systems like Ge. In a test calculation we have found that the pair-correlation function of l-Sb calculated with the HF-functional and the full Kohn-Sham functional differs only slightly. Nevertheless, we must point out that the computational gain in comparison to a selfconsistent calculation is in many cases very small (for Sb less than 20 %). The main reason to use the HF functional is therefore to assess and establish the accuracy of the HF-functional, a topic which is currently widely discussed within the community of solid state physicists. To our knowledge VASP is one of the few pseudopotential codes which can assess the validity of the HF-functional at a very basic level, i.e. without any additional restrictions like local basis-sets etc.
Within VASP the band-structure energy is exactly evaluated using the same plane-wave basis-set and the same accuracy which is used for the selfconsistent calculation. The forces and the stress tensor are correct, insofar as they are an exact derivative of the Harris-Foulkes functional. During a MD or an ionic relaxation the charge density is correctly updated at each ionic step.
7.4
In this section we discuss partial occupancies. A must for all readers.
First there is the question why to use partial occupancies at all. The answer is: partial occupancies help to decrease the number of k-points necessary to calculate an accurate band-structure energy. This answer might be strange at first sight. What we want to calculate is the integral over the filled parts of the bands
$$\sum_n \frac{1}{\Omega_{BZ}} \int_{\Omega_{BZ}} \epsilon_{nk}\, \Theta(\epsilon_{nk} - \mu)\, dk,$$
where $\Theta(x)$ is the Dirac step function. Due to our finite computer resources this integral has to be evaluated using a discrete set of k-points [37]:
$$\frac{1}{\Omega_{BZ}} \int_{\Omega_{BZ}} \rightarrow \sum_k w_k. \qquad (7.6)$$
Keeping the step function, we get the sum
$$\sum_k w_k\, \epsilon_{nk}\, \Theta(\epsilon_{nk} - \mu),$$
which converges exceedingly slowly with the number of k-points included. This slow convergence speed arises only from the fact that the occupancies jump from 1 to 0 at the Fermi-level. If a band is completely filled the integral can be calculated accurately using a low number of k-points (this is the case for semiconductors and insulators).
For metals the trick is now to replace the step function $\Theta(\epsilon_{nk} - \mu)$ by a (smooth) function $f(\{\epsilon_{nk}\})$, resulting in a much faster convergence speed without destroying the accuracy of the sum. Several methods have been proposed to solve this dazzling problem.
7.4.1
Within the linear tetrahedron method, the term $\epsilon_{nk}$ is interpolated linearly between two k-points. Blöchl [35] has recently revised the tetrahedron method to give effective weights $f(\{\epsilon_{nk}\})$ for each band and k-point. In addition Blöchl was able to derive a correction formula which removes the quadratic error inherent in the linear tetrahedron method (linear tetrahedron method with Blöchl corrections). The linear tetrahedron method is more or less fool proof and requires minimal intervention by the user.
The main drawback is that Blöchl's method is not variational with respect to the partial occupancies if the correction terms are included; therefore the calculated forces might be wrong by a few percent. If accurate forces are required we recommend a finite temperature method.
Table 2: Typical convenient settings for sigma for different metals: Aluminium possesses an extremely simple DOS, Lithium and Tellurium are also simple nearly free electron metals, therefore sigma might be large. For Copper sigma is restricted by the fact that the d-band lies approximately 0.5 eV beneath the Fermi-level. Rhodium and Vanadium possess a fairly complex structure in the DOS at the Fermi-level, sigma must be small.
Metal               Sigma (eV)
Aluminium           1.0
Lithium             0.4
Tellurium           0.8
Copper, Palladium   0.4
Vanadium            0.2
Rhodium             0.2
Potassium           0.3
7.4.2
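In this case the step function is simply replaced by a smooth function, for example the Fermi-Dirac function [33]
$$f\left(\frac{\epsilon - \mu}{\sigma}\right) = \frac{1}{\exp\left(\frac{\epsilon - \mu}{\sigma}\right) + 1}.$$
The Gauss-like function
$$f\left(\frac{\epsilon - \mu}{\sigma}\right) = \frac{1}{2}\left(1 - \mathrm{erf}\left[\frac{\epsilon - \mu}{\sigma}\right]\right) \qquad (7.7)$$
is one used quite frequently in the context of solid state calculations. Nevertheless, it turns out that the total energy is no longer variational (or minimal) in this case. It is necessary to replace the total energy by some generalized free energy
$$F = E - \sum_{nk} w_k\, \sigma S(f_{nk}).$$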
The calculated forces are now the derivatives of this free energy F (see section 7.5). In conjunction with Fermi-Dirac statistics the free energy might be interpreted as the free energy of the electrons at some finite temperature $\sigma = k_B T$, but the physical significance remains unclear in the case of Gaussian smearing. Despite this problem, it is possible to obtain an accurate extrapolation for $\sigma \to 0$ from results at finite $\sigma$ using the formula
$$E(\sigma \to 0) = E_0 = \frac{1}{2}(F + E).$$
In this way we get a 'physical' quantity from a finite temperature calculation, and the Gaussian smearing method serves as a mathematical tool to obtain faster convergence with respect to the number of k-points. For Al this method converges even faster than the linear tetrahedron method with Blöchl corrections.
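The two smearing functions and the extrapolation formula of this section can be written down in a few lines. This is a sketch for illustration only: the function names are hypothetical, all energies are assumed to be in eV, and the numbers in the usage lines are made-up inputs, not VASP results.

```python
import math

def fermi_dirac(eps, mu, sigma):
    """Fermi-Dirac occupation for the energy eps, Fermi level mu and width sigma."""
    return 1.0 / (math.exp((eps - mu) / sigma) + 1.0)

def gaussian_smearing(eps, mu, sigma):
    """Gauss-like occupation 0.5 * (1 - erf[(eps - mu) / sigma]), cf. eq. (7.7)."""
    return 0.5 * (1.0 - math.erf((eps - mu) / sigma))

def extrapolate_sigma_zero(free_energy, energy):
    """E(sigma -> 0) = (F + E) / 2 for Gaussian smearing."""
    return 0.5 * (free_energy + energy)

for eps in (-0.3, 0.0, 0.3):                    # energies relative to mu = 0
    print(eps, fermi_dirac(eps, 0.0, 0.1), gaussian_smearing(eps, 0.0, 0.1))
print(extrapolate_sigma_zero(-10.2, -10.0))     # illustrative F and E only
```

Both functions give an occupation of one half exactly at the Fermi level and approach 1 and 0 well below and above it; the Gauss-like function does so much faster for the same sigma.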
7.4.3
The method described in the last section has two shortcomings:
- The forces calculated by VASP are a derivative of the free electronic energy F (see section 7.5). Therefore the forces cannot be used to obtain the equilibrium groundstate, which corresponds to an energy-minimum of $E(\sigma \to 0)$. Nonetheless the error in the forces is generally small and acceptable.
- The parameter $\sigma$ must be chosen with great care. If $\sigma$ is too large the energy $E(\sigma \to 0)$ will converge to the wrong value even for an infinite k-point mesh; if $\sigma$ is too small the convergence speed with the number of k-points will deteriorate. An optimal choice for $\sigma$ for several cases is given in table 2. The only way to get a good $\sigma$ is by performing several calculations with different k-point meshes and different parameters for $\sigma$.
These problems can be solved by adopting a slightly different functional form for $f(\{\epsilon_{nk}\})$. This is possible by expanding the step function in a complete orthonormal set of functions (method of Methfessel and Paxton [36]). The Gaussian function is only the first approximation (N=0) to the step function, further successive approximations (N=1,2,...) are easily obtained. In similarity to the Gaussian method, the energy has to be replaced by a generalized free energy functional
$$F = E - \sum_{nk} w_k\, \sigma S(f_{nk}).$$
In contrast to the Gaussian method the entropy term $\sum_{nk} w_k \sigma S(f_{nk})$ will be very small for reasonable values of $\sigma$ (for instance for the values given in table 2). The term $\sum_{nk} w_k \sigma S(f_{nk})$ is a simple error estimation for the difference between the free energy F and the 'physical' energy $E(\sigma \to 0)$. $\sigma$ can be increased until this error estimation gets too large.
7.5 Forces
Within the finite temperature LDA, forces are defined as the derivative of the generalized free energy. This quantity can be evaluated easily. The functional F depends on the wavefunctions $\phi$, the partial occupancies $f$, and the positions of the ions R. In this section we will briefly discuss the variational properties of the free energy and we will explain why we calculate the forces as a derivative of the free energy. The formulas given are very symbolic and we do not take into account any constraints on the occupation numbers or the wavefunctions. We denote the whole set of wavefunctions as $\phi$ and the set of partial occupancies as $f$.
The electronic groundstate is determined by the variational property of the free energy, i.e.
$$0 = \delta F(\phi, f, R)$$
for arbitrary variations of $\phi$ and $f$. We can rewrite the right hand side of this equation as
$$\frac{\partial F}{\partial \phi}\, \delta\phi + \frac{\partial F}{\partial f}\, \delta f.$$
For arbitrary variations this quantity is zero only if $\frac{\partial F}{\partial \phi} = 0$ and $\frac{\partial F}{\partial f} = 0$, leading to a system of equations which determines $\phi$ and $f$ at the electronic groundstate. We define the forces as derivatives of the free energy with respect to the ionic positions, i.e.
$$\text{force} = \frac{dF(\phi, f, R)}{dR} = \frac{\partial F}{\partial \phi} \frac{\partial \phi}{\partial R} + \frac{\partial F}{\partial f} \frac{\partial f}{\partial R} + \frac{\partial F}{\partial R}.$$
At the groundstate the first two terms are zero and we can write
$$\text{force} = \frac{dF(\phi, f, R)}{dR} = \frac{\partial F}{\partial R},$$
i.e. we can keep $\phi$ and $f$ fixed at their respective groundstate values and we have to calculate the partial derivative of the free energy with respect to the ionic positions only. This is a relatively easy task.
Previously we have mentioned that the only physical quantity is the energy for $\sigma \to 0$. It is in principle possible to evaluate the derivatives of $E(\sigma \to 0)$ with respect to the ionic coordinates, but this is not easy and requires additional computer time.
7.6
If you are doing energy-volume calculations or cell shape and volume relaxations you must understand the Pulay stress, and related problems.
The Pulay stress arises from the fact that the plane wave basis set is not complete with respect to changes of the volume. Thus, unless absolute convergence with respect to the basis set has been achieved, the diagonal components of the stress tensor are incorrect. This error is often called Pulay stress. The error is almost isotropic (i.e. the same for each diagonal component), and for a finite basis set it tends to decrease the volume compared to fully converged calculations (or calculations with a constant energy cutoff).
The Pulay stress and related problems affect the behavior of VASP and any plane wave code in several ways: First, it evidently affects the stress tensor calculated by VASP, i.e. the diagonal components of the stress tensor are incorrect, unless the energy cutoff is very large (ENMAX = 1.3 × default is usually a safe setting to obtain a reliable stress tensor). In addition it should be noted that all volume/cell shape relaxation algorithms implemented in VASP work with a constant basis set. In that way all energy changes are strictly consistent with the calculated stress tensor, and this in turn results in an underestimation of the equilibrium volume unless a large plane wave cutoff is used. Keeping the basis set constant during relaxations also has a somewhat strange effect on the basis set. Initially all G-vectors within a sphere are included in the basis. If the cell shape relaxation starts, the direct and reciprocal lattice vectors change. This means that although the number of reciprocal G-vectors in the basis is kept fixed, the length of the G-vectors changes, changing indirectly the energy cutoff. Or to be more precise, the shape of the cutoff region becomes an ellipsoid. Restarting VASP after a volume relaxation causes VASP to adopt a new spherical cutoff sphere and thus the energy changes discontinuously (see section 6.13).
One thing which is important to understand is that problems due to the Pulay stress can often be neglected if only volume conserving relaxations are performed. This is because the Pulay stress is usually almost uniform and it therefore changes the diagonal elements of the stress tensor only by a certain constant amount (see below). In addition many calculations have shown that Pulay stress related problems can also be reduced by performing calculations at different volumes using the same energy cutoff for each calculation (this is what VASP does by default, see section 6.13), and fitting the final energies to an equation of state. This of course implies that the number of basis vectors is different at each volume. But calculations with many plane wave codes have shown that such calculations give very reliable results for the lattice constant and the bulk modulus and other elastic properties even at relatively small energy cutoffs. Constant energy cutoff calculations are less prone to errors caused by the basis set incompleteness than constant basis set calculations. But it should be kept in mind that volume changes and cell shape changes must be rather large in order to obtain reliable results from this method, because in the limit of very small distortions the energy changes obtained with this method are equivalent to those obtained from the stress tensor and are therefore affected by the Pulay stress. Only volume changes of the order of 5-10 % guarantee that the errors introduced by the basis set incompleteness are averaged out.
7.6.1
The Pulay stress shows only a weak dependency on volume and the ionic configuration. It is mainly determined by the composition. The simplest way to estimate the Pulay stress is to relax the structure with a large basis-set (1.3 × default cutoff is usually sufficient, or PREC=High in VASP.4.4). Then re-run VASP for the final relaxed positions and cell parameters with the default cutoff or the desired cutoff. Look for the line 'external pressure' in the OUTCAR file:
external pressure =     -100.29567 kB
The corresponding (negative) pressure gives a good estimation of the Pulay stress.
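Reading this value from the output can be automated with a few lines of Python. The sketch below is not part of VASP: the helper `external_pressures` and the regular expression are assumptions based only on the line format quoted above, and the sample text is that same line.

```python
import re

# Pattern for lines of the form "external pressure =   -100.29567 kB" (format assumed
# from the example line quoted in the text).
PRESSURE_RE = re.compile(r"external pressure\s*=\s*(-?\d+(?:\.\d+)?)\s*kB")

def external_pressures(outcar_text):
    """Return all 'external pressure' values (in kB) found in the given text."""
    return [float(match.group(1)) for match in PRESSURE_RE.finditer(outcar_text)]

sample = "  external pressure =     -100.29567 kB\n"
print(external_pressures(sample)[-1])   # last value found in the text
```

For a real run one would pass the contents of the OUTCAR file instead of `sample` and take the last entry, which belongs to the final ionic step.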
7.6.2
The general message is: whenever possible avoid volume relaxation with the default energy cutoff. Either increase the basis set by setting ENCUT manually in the INCAR file, or use method two suggested below, which avoids doing volume relaxations at all. If volume relaxations are the only possible and feasible option please use the following step by step procedure (which keeps errors to a minimum):
1. Relax from the starting structure (ISMEAR should be 0 or 1, see section 6.37).
2. Start a second relaxation from the previous CONTCAR file (re-relaxation).
3. As a final step perform one more energy calculation with the tetrahedron method switched on (i.e. ISMEAR=-5), to get very accurate and unambiguous energies (no relaxation for the final run). The final calculation should be done with PREC=High, to get very accurate energies.
A few things should be remarked here: Never take the energy obtained at the end of a relaxation run if you allow for cell shape relaxations (the final basis set might not be isotropic). Instead perform one additional static run at the end. The relaxation will give a structure which is correct to first order; the final error in the energy of step 3 is of second order (with respect to the structural errors). If you take the energy directly from the relaxation run, errors are usually significantly larger. Another important point is that the most accurate results for the relaxation will be obtained if the starting cell parameters are very close to the final cell parameters. If different runs yield different results, then the run which started from the configuration closest to the relaxed structure is the most reliable one.
We strongly recommend to do any volume (and to a lesser extent cell shape) relaxation with an increased basis set. ENCUT = 1.3 times the default cutoff is reasonably accurate in most cases. PREC=High does also increase the energy cutoff by a factor 1.25. At an increased cutoff the Pulay stress corrections are usually not required.
However, if the default cutoff is used for the relaxation, the PSTRESS line should be set in the INCAR file: Evaluate the Pulay stress along the guidelines given in the previous section and add an input-line to the INCAR file which reads (usually a negative number):

PSTRESS = Pulay stress
From now on all STRESS output of VASP is corrected by simply subtracting PSTRESS. In addition, all volume relaxations will take PSTRESS into account (see sec. 6.24). Again this technique (PSTRESS line in the INCAR file) is not really recommended. However one is often saved by the fact that first order structural errors will only cause a second order error in the energy (at least if the procedure outlined above is used).
7.6.3
It is possible to avoid volume relaxation in many cases: The method we have used quite often in the past is to relax the structure (cell shape and internal parameters) for a set of fixed volumes (ISIF=4). The final equilibrium volume and the groundstate energy can be obtained by a fit to an equation of state. The reason why this method is better than volume relaxation is that the Pulay stress is almost isotropic, and thus adds only a constant value to the diagonal elements of the stress tensor. Therefore, the relaxation for a fixed volume will give an almost correct structure.
The outline for such a calculation is almost the same as in the previous section. But in this case, one has to do the calculations for a set of fixed volumes. At first sight this seems to be much more expensive than method number one (outlined in the previous section). But in many cases the additional costs are only small, because the internal parameters do not change very much from volume to volume.
1. Select one volume and relax from starting structure keeping the volume fixed (ISIF=4 see sec. 6.23; ISMEAR=0 or 1, see section 6.37).
2. Start a second relaxation from previous CONTCAR file (if the initial cell shape was reasonable this step can be skipped, if the cell shape is kept fixed, you never have to run VASP twice).
3. As a final step perform one more energy calculation with the tetrahedron method switched on (ISMEAR=-5), to get very accurate unambiguous energies (no relaxation for the final run).
The method has also other advantages, for instance the bulk modulus is readily available. We have found in the past that this method can be used safely with the default cutoff (see also section 9.2).
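Before doing a proper equation-of-state fit, a quick estimate of the equilibrium volume can be obtained from three (volume, energy) points. The sketch below is not an equation of state: it only returns the vertex of the parabola through three points, and the function name parabola_min is hypothetical.

```shell
# parabola_min V1 E1 V2 E2 V3 E3
# Print the volume at the vertex of the parabola through three (V, E) points.
# This is a crude stand-in for an equation-of-state fit, useful as a first guess.
parabola_min() {
    awk -v a="$1" -v fa="$2" -v b="$3" -v fb="$4" -v c="$5" -v fc="$6" 'BEGIN {
        num = (b - a)^2 * (fb - fc) - (b - c)^2 * (fb - fa)
        den = (b - a)   * (fb - fc) - (b - c)   * (fb - fa)
        if (den == 0) { print "degenerate"; exit 1 }   # points on a straight line
        printf "%.4f\n", b - 0.5 * num / den
    }'
}

# Usage: parabola_min 9 -4 10.5 -4.75 12 -1
```

For points that bracket the minimum closely the estimate is reasonable; far from the minimum the energy-volume curve is not parabolic and the result should only be used to choose the next set of volumes.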
7.6.4
This is a very common question from people who start to do calculations with plane wave codes. There are two reasons why the energy vs. volume plot looks jagged:
1. Basis set incompleteness. The basis set is discrete and incomplete, and when the volume changes, additional plane waves are added. That causes small discontinuous changes in the energy.
Solutions:
- use a larger plane wave cutoff.
- use more k-points: a plane wave is included in the basis set at a given k-point if |G + k| < G_cut. That means, at each k-point a different basis set is used, and additional plane waves are added at each k-point at different volumes. In turn, the energy vs. volume curve becomes smoother.
2. However the most probable reason for the jagged E(V) curve is another one: For PREC=High the FFT grids are chosen so that $H|\phi\rangle$ is exactly evaluated. For PREC=Med the FFT grids are set to 3/4 of the value that is in principle required for an exact evaluation of $H|\phi\rangle$. This introduces small errors, because when the volume changes the FFT grids do change discontinuously. In other words, at each volume a different FFT-grid is used, causing the energy to jump discontinuously.
Solutions:
- Set your FFT grids manually. Choose the grid that is used by default for the largest volume.
- Use PREC=High. In the new version (starting from VASP.4.4.3) this also increases the plane wave cutoff by 30 %. If this is undesirable, the plane wave cutoff can be fixed manually by specifying ENMAX=... in the INCAR file.
8
In the last two sections all input parameters were explained, nevertheless it is not easy to set all parameters correctly. In this section we will try to concentrate on those parameters which are most important.
8.1
One should choose NBANDS so that a considerable number of empty bands is included in the calculation. As a minimum we require one empty band. VASP will give a warning if this is not the case.
NBANDS is also important from a technical point of view: In iterative matrix-diagonalization schemes eigenvectors close to the top of the calculated number of vectors converge much slower than the lowest eigenvectors. This might result in a significant performance loss if not enough empty bands are included in the calculation. Therefore we recommend to set NBANDS to NELECT/2 + NIONS/2, this is also the default setting of the makeparam utility and of VASP.4.X. This setting is safe in most cases. In some cases, it is also possible to decrease the number of additional bands to NIONS/4 for large systems without performance loss, but on the other hand transition metals do require a much larger number of empty bands (up to 2*NIONS).
To check this parameter perform several calculations for a fixed potential (ICHARG=12) with an increasing number of bands (e.g. starting from NELECT/2 + NIONS/2). An accuracy of $10^{-6}$ should be obtained in 10-15 iterations. Mind that the RMM-DIIS scheme (IALGO=48) is more sensitive to the number of bands than the default CG algorithm (IALGO=8).
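The band counts mentioned above are simple integer arithmetic and can be tabulated before starting the test runs. This is a sketch only: the function names are hypothetical, NELECT is assumed to be an integer, and half-integer results are rounded up (the rounding is an assumption, not something stated above).

```shell
# nbands_default NELECT NIONS
# NELECT/2 + NIONS/2, rounded up to the next integer.
nbands_default() {
    echo $(( ($1 + $2 + 1) / 2 ))
}

# nbands_transition NELECT NIONS
# Occupied bands (NELECT/2, rounded up) plus 2*NIONS empty bands,
# the upper end of the range quoted for transition metals.
nbands_transition() {
    echo $(( ($1 + 1) / 2 + 2 * $2 ))
}

# Usage: nbands_default 40 4      (40 electrons, 4 ions)
```

Starting the convergence test from nbands_default and increasing towards nbands_transition covers the range of settings discussed in this section.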
8.2
Before going into further details, we want to distinguish between high quality quantitative (PREC should be high) and qualitative calculations (PREC can be medium or even low).
A high quality calculation is necessary if very small energy-differences (<10 meV) between two competing phases, which can not be described with the same supercell, have to be calculated. The term same supercell corresponds here to cells containing the same number of atoms and no dramatic changes in the cell-geometry (i.e. lattice vectors should be almost the same for both cells). For the calculation of energy-differences between two competing bulk-phases it is in many cases impossible to find a supercell which meets this criterion. If one wants to calculate small energy-differences it is necessary to converge with respect to all parameters (k-points, FFT-meshes, and sometimes energy cut-off). In most cases these three parameters are independent, so that convergence can be checked independently.
For surfaces, things are quite complicated. The calculation of the surface energy is clearly a high quality quantitative calculation. In this case you have to subtract from the energy of the slab the energy of the bulk phase. Both energies must be calculated with high accuracy. If the slab contains 20 atoms, an error of 5 meV per bulk atom will result in an error of 100 meV per surface atom. The situation is not as bad if one is interested in the adsorption energy of molecules. In this case accurate results (with errors of a few meV) can be obtained with PREC=med, if the reference energy of the slab and the reference energy of the adsorbate are calculated in the same supercell as the one used to describe adsorbate and slab together.
Ab initio molecular dynamics clearly do not fall into the high quality category because the cell shape and the number of atoms remain constant during the calculation, and most ab initio MD's can be done with PREC=Low. We will give some exceptions to this general rule when the influence of the k-point mesh is discussed.
8.3
- Errors due to k-points sampling. This will be discussed in section 8.6. Mind that the errors due to the k-points mesh are not transferable i.e. a 9x9x9 k-points grid leads to a completely different error for fcc, bcc and sc. It is therefore absolutely essential to be very careful with respect to the k-points sampling.
- Errors due to the cut-off ENCUT. This error is highly transferable, i.e. the default cutoff ENCUT (read from the POTCAR file) is in most cases safe, and one can expect that energy differences will be accurate within a few meV (see section 8.4). An exception is the stress tensor which converges notoriously slowly with respect to the size of the plane wave basis set (see section 7.6).
- Wrap around errors (see section algo-wrap). These errors are due to an insufficient FFT mesh and they are not as well behaved as the errors due to the energy cutoff (see section 8.4). But once again, if one uses the default cutoff (read from the POTCAR file) the wrap around errors are usually very small (a few meV per atom) even if the FFT mesh is not sufficient. The reason is that the default cutoffs in VASP are rather large, and therefore the charge density and the potentials contain only small components in the region where the wrap around error occurs.
- Errors due to the real space projection. Real space projection always introduces additional (small) errors. These errors are also quite well behaved i.e. if one uses the same real space projection operators all the time, the errors are almost constant. Anyway, one should try to avoid the evaluation of energy differences between calculations with LREAL=.FALSE. and LREAL=On/.TRUE. (see section 6.38). Mind that for LREAL=On (the recommended setting) the real space operators are optimized by VASP according to ENCUT and PREC and ROPT i.e. one gets different real space projection operators if ENCUT or PREC is changed (see section 6.38).
In conclusion, to minimize errors one should use the same setting for ENCUT, ENAUG, PREC, LREAL and ROPT throughout all calculations, and these flags should be specified explicitly in the INCAR file. In addition it is also preferable to use the same supercell for all calculations whenever possible.
8.4
In general, the energy-cut-off must be chosen according to the pseudopotential. All POTCAR files contain a default energy cutoff. Use this energy cut-off but please also perform some bulk calculations with different energy cut-offs to find out whether the recommended setting is correct. The cut-off which is specified in the POTCAR file will usually result in an error in the cohesive energy which is less than 10 meV.
You should be aware of the difference between absolute and relative convergence. The absolute convergence with respect to the energy cut-off ENCUT is the convergence speed of the total energy, whereas relative convergence is the convergence speed of energy differences between different phases (e.g. energy of fcc minus energy of bcc structure). Energy differences converge much faster than the total energy. This is especially true if both situations are rather similar (e.g. hcp - fcc). In this case the error due to the finite cut-off is 'transferable' from one situation to the other situation. If two configurations differ strongly from each other (different distribution of s, p and d electrons, different hybridization) absolute convergence gets more and more critical.
There are some rules of thumb which you should check whenever making a calculation: For bulk materials the number of plane waves per atom should be between 50-100. A smaller basis set might result in serious errors. A larger basis set is rarely necessary, and is a hint for a badly optimized pseudopotential. If a large vacuum is included the number of plane waves will be larger (i.e. if 50% of your supercell is vacuum, the number of plane waves increases by a factor of 2).
More problematic than ENCUT is the choice of the FFT-mesh, because this error is not easily transferable from one situation to the next. For an exact calculation the FFT-mesh must contain all wave vectors up to $2G_{\rm cut}$ if $E_{\rm cut} = \frac{\hbar^2}{2m} G_{\rm cut}^2$, $E_{\rm cut}$ being the used energy-cut-off. Increasing the FFT-mesh from this value does not change the results, except for a possibly very small change due to the changed exchange-correlation potential. The reasons for this behavior are explained in section 7.2.
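To get a feeling for the numbers, the relation between the cut-off energy and the cut-off wave vector can be evaluated directly. The sketch below rests on assumptions that are not part of this guide: the value 3.809982 eV*Angstrom^2 for hbar^2/(2m), a single lattice direction of length a with reciprocal spacing 2*pi/a, and hypothetical function names. VASP itself chooses FFT-friendly grid sizes, so the mesh it reports will generally differ.

```shell
# gcut ENCUT
# Cut-off wave vector G_cut in 1/Angstrom for a cut-off energy ENCUT in eV,
# from E_cut = hbar^2/(2m) * G_cut^2 with hbar^2/(2m) = 3.809982 eV*Angstrom^2.
gcut() {
    awk -v e="$1" 'BEGIN { printf "%.3f\n", sqrt(e / 3.809982) }'
}

# fft_min_points ENCUT LENGTH
# Smallest number of grid points along a direction of length LENGTH (Angstrom)
# that holds all wave vectors up to 2*G_cut, i.e. indices -nmax..nmax.
fft_min_points() {
    awk -v e="$1" -v a="$2" 'BEGIN {
        pi = atan2(0, -1)
        g = sqrt(e / 3.809982)
        nmax = int(g * a / pi)        # largest n with n * 2*pi/a <= 2*G_cut
        print 2 * nmax + 1
    }'
}

# Usage: gcut 250 ; fft_min_points 250 4.0
```

Comparing this lower bound with the values that VASP writes to the OUTCAR file shows how much head-room the chosen mesh actually has.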
Nevertheless it is not always possible and necessary to use such a large FFT-mesh. In general only 'high quality' calculations (as defined in the previous section) require a mesh which avoids all wrap around errors. For most calculations and in particular for the supplied pseudopotentials with the default cutoff it is sufficient to set NGX, NGY and NGZ to 3/4 of the required values (set PREC=Medium or PREC=Low in the INCAR file before running the makeparam utility or VASP.4.X). The values which strictly avoid any wrap-around errors are also written to the OUTCAR file: Just search for the string 'wrap'. As a rule of thumb the 3/4 setting will result in FFT-meshes which contain approximately 8x8x8=512 FFT-points per atom (assuming that there is no vacuum).
One hint that the FFT mesh is sufficient is given by the lines

 soft charge-density along one line
         0        1        2        3        4        5        6        7        8
 x  32.0000   -.7711   1.9743    .0141    .3397   -.0569   -.0162   -.0006    .0000
 y  32.0000   6.7863    .0205    .2353    .1237   -.1729   -.0269   -.0006    .0000
 z  32.0000   -.7057   -.7680   -.0557    .1610   -.2262   -.0042   -.0069    .0000

also written to the file OUTCAR (search for the string 'along'). These lines contain the charge density in reciprocal space at the positions

$G = 2\pi m_x g^{(x)}, \quad G = 2\pi m_y g^{(y)}, \quad G = 2\pi m_z g^{(z)}$
The last number will always be 0 (it is set explicitly by VASP), but as a rule of thumb the previous value divided by the total number of electrons should be smaller than $10^{-4}$. To be more precise: Because of the wrap-around errors, certain parts of the charge density are wrapped to the other side of the grid, and the size of the wrapped charge density divided by the number of electrons should be less than $10^{-3}$ to $10^{-4}$.
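The rule of thumb can be applied mechanically to the x, y and z rows of the 'along one line' output. In this sketch the function name wrap_check is hypothetical, and it is assumed that the first number of each row (column 0) is the total number of electrons, as in the rows printed in this section (32.0000).

```shell
# wrap_check [THRESHOLD]
# Read rows such as "x 32.0000 -.7711 ... -.0006 .0000" from standard input and
# compare |second-to-last value| / (column 0) with THRESHOLD (default 1e-4).
wrap_check() {
    awk -v tol="${1:-1e-4}" 'NF >= 4 && $2 + 0 != 0 {
        nel = $2 + 0                  # assumed: column 0 holds the number of electrons
        v = $(NF - 1) + 0
        if (v < 0) v = -v
        print $1, (v / nel < tol + 0 ? "ok" : "too large")
    }'
}

# Usage: echo "z 32.0000 -.7057 -.7680 -.0557 .1610 -.2262 -.0042 -.0069 .0000" | wrap_check
```

With the strict threshold the z row of the sample output fails the test (0.0069/32 is about 2e-4), while it passes the looser $10^{-3}$ bound.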
Another important hint that the wrap around errors are too large is given by the forces. If there is a considerable drift in the forces, increase the FFT-mesh. Search for the string 'total drift' in the OUTCAR file, it is located beneath the line TOTAL-FORCE:

 total drift:     -.00273    -.01048     .03856

The drift should definitely not exceed the magnitude of the forces, in general it should be smaller than the size of the forces you are interested in (usually 0.1 eV/Å).
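The drift test can be automated in the same spirit. This is a sketch with a hypothetical function name; it assumes the three drift components are the last three fields of the 'total drift' line, as in the example above, and it uses 0.1 eV/Å as the default limit.

```shell
# drift_check FILE [LIMIT]
# Print the largest absolute component of the last 'total drift' line in FILE
# and whether it is below LIMIT (default 0.1, in eV/Angstrom).
drift_check() {
    awk -v lim="${2:-0.1}" '/total drift/ {
            m = 0
            for (i = NF - 2; i <= NF; i++) {
                v = $i + 0
                if (v < 0) v = -v
                if (v > m) m = v
            }
            found = 1
        }
        END {
            if (!found) { print "no drift line"; exit 1 }
            printf "%.5f %s\n", m, (m < lim + 0 ? "ok" : "too large")
        }' "$1"
}

# Usage: drift_check OUTCAR 0.05
```

The limit should be chosen according to the forces one is actually interested in, so a tighter value is appropriate for well relaxed structures.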
For the representation of the augmentation charges a second more accurate FFT-mesh is used. Generally the time spent for the calculation on this mesh is relatively small, therefore there is no need to worry too much about the size of the mesh, and relying on the defaults of the makeparam utility is in most cases safe. In some rare cases like Cu, Fe_pv with extremely 'hard' augmentation charges, it might be necessary to increase NGXF in comparison to the default setting. This can be done either by hand (setting NGXF in the param.inc file) or by giving a value for ENAUG in the INCAR file (see section 6.9).
As for the soft part of the charge density the total charge density (which is the sum of augmentation charges and soft part) is also written to the file OUTCAR:

 total charge-density along one line
         0        1        2        3        4        5        6        7        8
 x  32.0000   -.7711   1.9743    .0141    .3397   -.0569   -.0162   -.0006    .0000
 y  32.0000   6.7863    .0205    .2353    .1237   -.1729   -.0269   -.0006    .0000
 z  32.0000   -.7057   -.7680   -.0557    .1610   -.2262   -.0042   -.0069    .0000
The same criterion which holds for the soft part should hold for the total charge density. If the second mesh is too small the forces might also be wrong (leading to a 'total drift' in the forces).
Mind: The second mesh is only used in conjunction with US-pseudopotentials. For normconserving pseudopotentials neither the charge density nor the local potentials are set on the fine mesh. In this case set NG(X,Y,Z)F to NGX,Y,Z or simply to 1. Both settings result in the same storage allocation.
Mind: If very hard non-linear/partial core corrections are included the convergence of the exchange-correlation potential with respect to the FFT grid might cause problems. All supplied pseudopotentials have been tested in this respect and are safe.
8.5
In most cases one can safely use the default values for ENCUT and ENAUG, which are read from the POTCAR file. But there are some cases where this can result in small, easily avoidable inaccuracies.
For instance, if you are interested in the energy difference between bulk phases with different compositions (i.e. Co - CoSi - Si). In this case the default ENCUT will be different for the calculations of pure Co and pure Si, but it is preferable to use the same cutoff for all calculations. In this case determine the maximal ENCUT and ENAUG from the POTCAR files and use this value for all calculations.
Another example is the calculation of adsorption energies of molecules on surfaces. To minimize (for instance) nontransferable wrap errors one should calculate the energy of an isolated molecule, of the surface only, and of the adsorbate/surface complex in the same supercell, using the same cutoff. This usually requires to fix ENCUT and ENAUG by hand in the INCAR file. If one also wants to use real space optimization (LREAL=On), it is recommended to use LREAL=On for all three calculations as well (the ROPT flag should also be similar for all calculations, section 6.38).
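Finding the maximal cutoff over several POTCAR files is easy to script. The sketch below uses a hypothetical function name and assumes that the relevant lines have the form "ENMAX = 250.925; ..." or "EAUG = 350.000", i.e. the tag is the first field, followed by "=" and the value with an optional trailing semicolon; check this against your own POTCAR files.

```shell
# potcar_max TAG FILE...
# Print the largest value of TAG (e.g. ENMAX or EAUG) found in the given files.
potcar_max() {
    tag=$1
    shift
    awk -v tag="$tag" '$1 == tag && $2 == "=" {
            v = $3
            sub(/;$/, "", v)          # drop a trailing semicolon, if present
            if (v + 0 > max) max = v + 0
        }
        END { print max + 0 }' "$@"
}

# Usage (file names are examples only):
#   potcar_max ENMAX POTCAR.Co POTCAR.Si
#   potcar_max EAUG  POTCAR.Co POTCAR.Si
```

The two numbers obtained this way are the values to enter as ENCUT and ENAUG in every INCAR file of the series.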
8.6
The number of k-points necessary for a calculation depends critically on the necessary precision and on the fact whether the system is metallic. Metallic systems require an order of magnitude more k-points than semiconducting and insulating systems. The number of k-points also depends on the smearing method in use; not all methods converge with similar speed. In addition the error is not transferable at all i.e. a 9x9x9 mesh leads to a completely different error for fcc, bcc and sc. Therefore absolute convergence with respect to the number of k-points is necessary. The only exception are commensurable supercells. If it is possible to use the same supercell for two calculations it is definitely a good idea to use the same k-point set for both calculations.
k-point mesh and smearing are closely connected. We repeat here the guidelines for ISMEAR already given in section 6.37:
- For semiconductors or insulators always use the tetrahedron method (ISMEAR=-5), if the cell is too large to use the tetrahedron method use ISMEAR=0.
- For relaxations in metals always use ISMEAR=1 and an appropriate SIGMA value (so that the entropy term is less than 1 meV per atom). Mind: Avoid to use ISMEAR>0 for semiconductors and insulators, it might result in problems.
- For the DOS and very accurate total energy calculations (no relaxation in metals) use the tetrahedron method (ISMEAR=-5).
Once again, if possible we recommend the tetrahedron method with Blöchl corrections (ISMEAR=-5), this method is fool proof and does not require any empirical parameters like the other methods. Especially for bulk materials we were able to get highly accurate results using this method.
Even with this scheme the number of k-points remains relatively large. For insulators 100 k-points/per atom in the full Brillouin zone are generally sufficient to reduce the energy error to less than 10 meV. Metals require approximately 1000 k-points/per atom for the same accuracy. For problematic cases (transition metals with a steep DOS at the Fermi-level) it might be necessary to increase the number of k-points up to 5000/per atom, which usually reduces the error to less than 1 meV per atom.
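These targets translate into a first guess for the mesh size. The sketch below makes two assumptions that go beyond the text: "k-points per atom" is read as (number of k-points in the full Brillouin zone) times (number of atoms in the cell), and the cell is roughly cubic so that the same number of divisions n is used in all three directions. The function name is hypothetical.

```shell
# kmesh_divisions NATOMS TARGET
# Smallest n for which an n x n x n mesh satisfies n^3 * NATOMS >= TARGET.
kmesh_divisions() {
    n=1
    while [ $(( n * n * n * $1 )) -lt "$2" ]; do
        n=$(( n + 1 ))
    done
    echo "$n"
}

# Usage: kmesh_divisions 1 1000     (one-atom metallic cell)
#        kmesh_divisions 4 100      (four-atom insulating cell)
```

The result is only a starting point for the convergence tests described in this section; for elongated cells the divisions should be scaled with the lengths of the reciprocal lattice vectors.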
Mind: The number of k-points in the irreducible part of the Brillouin zone (IRBZ) might be much smaller. For fcc/bcc and sc a 11x11x11 mesh containing 1331 k-points is reduced to 56 k-points in the IRBZ. This is a relatively modest value compared with the values used in conjunction with LMTO packages using the linear tetrahedron method.
Not in all cases it is possible to use the tetrahedron method, for instance if the number of k-points falls beneath 3, or if accurate forces are required. In this case use the method of Methfessel-Paxton with N=1 for metals and N=0 for semiconductors. SIGMA should be as large as possible, but the difference between the free energy and the total energy (i.e. the term 'entropy T*S' in the OUTCAR file) must be small (i.e. < 1-2 meV/per atom). In this case the free energy and the energy one is really interested in, $E(\sigma \to 0)$, are almost the same. The forces are also consistent with $E(\sigma \to 0)$.
Mind: A good check whether the entropy term causes any problems is to compare the entropy term for different situations. The entropy must be the same for all situations. One has a problem if the entropy is 100 meV per atom at the surface but 10 meV per atom for the bulk.
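The entropy term per atom can be read off the OUTCAR file with a few lines of shell. In this sketch the function name is hypothetical, and it is assumed that the value (in eV) is the last field on the 'entropy T*S' line; the number of atoms has to be supplied by the user.

```shell
# entropy_per_atom FILE NATOMS
# Print the last 'entropy T*S' value in FILE, converted to meV per atom.
entropy_per_atom() {
    awk -v n="$2" '/entropy T\*S/ { s = $NF; found = 1 }
        END {
            if (!found || n <= 0) exit 1
            printf "%.3f\n", s * 1000 / n     # eV -> meV, then per atom
        }' "$1"
}

# Usage: entropy_per_atom OUTCAR 20
```

Running it on the bulk and on the slab calculation gives the two numbers that should be compared according to the check described above.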
Comparing different k-points meshes:
It is necessary to be careful comparing different k-point meshes. Not always does the number of k-points in the IRBZ increase continuously with the mesh-size. This is for instance the case for fcc, where even grids centered not at the Γ-point (e.g. Monkhorst Pack 8x8x8 → 60) result in a larger number of k-points than odd divisions (e.g. 9x9x9 → 35). In fact the difference can be traced back to whether or not the Γ-point is included in the resulting k-point mesh. Meshes centered at Γ (option 'G' in KPOINTS file or odd divisions, see Sec. 5.5.3) behave differently than meshes without Γ (option 'M' in the KPOINTS file and even divisions). The precision of the mesh is usually directly proportional to the number of k-points in the IRBZ, but not to the number of divisions. Some ambiguities can be avoided if even meshes (not centered at Γ) are not compared with odd meshes (meshes centered at Γ).
Some other considerations:
It is recommended to use even meshes (e.g. 8x8x8) for up to n = 8. From there on odd meshes are more efficient (e.g. 11x11x11). However we have already stressed that the number of divisions is often totally unrelated to the total number of k-points and to the precision of the grid. Therefore a 8x8x8 grid might be more accurate than a 9x9x9 grid. For fcc a 8x8x8 grid is approximately as precise as a 8x8x8 mesh. Finally, for hexagonal cells the mesh should be shifted so that the Γ point is always included i.e. a KPOINTS file

automatic mesh
0
Gamma
8 8 6
0. 0. 0.

is much more efficient than a KPOINTS file with Gamma replaced by Monkhorst (see also Ref. 5.5.3).
9 Examples
9.1
Bulk calculations are the easiest calculations which can be performed using VASP.
param.inc
INCAR
POSCAR
POTCAR
KPOINTS
A minimal INCAR file is strongly encouraged: the smaller the INCAR file the smaller the number of possible errors. In general the INCAR file might look like:

SYSTEM = Pd: fcc
Electronic minimisation
ENCUT = 200.00 eV ! energy cut-off for the calculation (optional)
ENAUG = 350.00 eV ! energy cut-off for the augmentation charges
DOS related values
ISMEAR = -5;
For bulk calculations without internal degrees of freedom we recommend the tetrahedron method with Blöchl corrections. The method converges rapidly with the number of k-points and requires only minimal interference of the user. It is a good practice to specify the energy cutoffs (ENCUT and ENAUG) manually in the INCAR file, but please always check the POTCAR file (grep ENMAX POTCAR and grep EAUG POTCAR, the maximal ENMAX corresponds to ENCUT, and the maximal EAUG to ENAUG).
VASP.3.2 only:
If your cell contains only one atomic species the param.inc file will be similar to (use the makeparam utility to create this file, before running makeparam be sure that your POSCAR file corresponds to the most expanded volume):
The NGX,Y,Z setting given above will be sufficient even for relatively accurate calculations, the augmentation part (NGXF...) will be also sufficient in most cases. With this file it is possible to use reciprocal and the real space projectors (for reasons of efficiency only reciprocal projectors should be used for such a small cell).
Monkhorst Pack
0
Monkhorst Pack
11 11 11
0 0 0
The number of k-points and therefore the mesh-size depends on the necessary precision. In most cases, a 11x11x11 mesh (leading to a mesh containing approximately 60 points) is sufficient to converge the energy to within 10 meV (see also section 8.6), and might be used as some kind of default for bulk calculations. If the system is semiconducting, you can often reduce the grid to 4x4x4 (also read section 8.6). For very accurate calculations (energy differences of about 1 meV), it might be necessary to increase the number of k-points continuously, and to check when the band-structure energy is converged (for most transition metals a mesh of 15x15x15 is sufficient).
A typical task performed for bulk materials is the calculation of the equilibrium volume. Unless absolute convergence with respect to the basis set is achieved, volume relaxations using the stress tensor are not recommended and calculations with a constant energy cut-off (CEC) are considered to be preferable to calculations with a constant basis set (CBS) (see section 7.6). Due to the same reason you should not try to obtain the equilibrium volume from calculations which differ in the lattice constant by a few hundredths of an Angstrom. These calculations tend to be CBS calculations and not CEC calculations (for a very small change in the lattice constant the basis set will remain unchanged). It is preferable to fit the energy over a certain energy range to an equation of state. A simple loop over different bulk parameters might be done using a UNIX shell script:
rm WAVECAR
for i in 3.7 3.8 3.9 4.0 4.1
do
cat >POSCAR <<!
fcc:
$i
0.5 0.5 0.0
0.0 0.5 0.5
0.5 0.0 0.5
1
cartesian
0 0 0
!
echo "a= $i" ; vasp
E=`tail -1 OSZICAR` ; echo $i $E >>SUMMARY.fcc
done
cat SUMMARY.fcc
After a run the file SUMMARY.fcc contains the energy for different lattice parameters. The total energy can be fitted to some equation of state to obtain the equilibrium volume, the bulk-modulus and so on. Using the script and the parameter files given above a simple energy-volume calculation is possible.
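Before fitting, it is often useful to see which lattice parameter gave the lowest energy. The sketch below uses a hypothetical function name and assumes that each line of the summary file consists of the lattice parameter followed by the last line of OSZICAR, so that the free energy is the field after "F=".

```shell
# lowest_energy FILE
# Print the lattice parameter and energy of the line with the lowest "F=" value.
lowest_energy() {
    awk '{
            have = 0
            for (i = 2; i < NF; i++)
                if ($i == "F=") { e = $(i + 1) + 0; have = 1; break }
            if (have && (!seen || e < emin)) { emin = e; amin = $1; seen = 1 }
        }
        END { if (seen) printf "%s %.6f\n", amin, emin }' "$1"
}

# Usage: lowest_energy SUMMARY.fcc
```

If the lowest energy sits at the first or last lattice parameter of the loop, the scanned range does not bracket the minimum and should be extended before any fit is attempted.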
Exercise 1: Perform a simple calculation using the INCAR file given above. Read the OUTCAR-file carefully. Somewhere in the OUTCAR file a set of parameters is written beginning with the line

SYSTEM = Pd: fcc

These lines give a complete parameter setting for the job and might be cut from the OUTCAR file and used as a new INCAR file. Go through the lines and figure out what each parameter means. Using the INCAR and the batch file given above, what is the default setting of ISTART for the first and for all following runs? Is this a convenient setting (constant energy cut-off or constant basis set)?
Exercise 2: Increase the number of KPOINTS till the total energy is converged to 10 meV. Start with a 5x5x5 k-points mesh. Is the equilibrium volume still correct for the 5x5x5 k-points mesh? Repeat the calculation for a different smearing (ISMEAR=1). Which choice is reasonable for SIGMA?
Exercise 3: Calculate the equilibrium lattice constant for different bulk phases (e.g. fcc, sc, bcc) and for different cut-offs ENCUT. The energy differences between different bulk phases (e.g. dE = E_fcc - E_bcc) will converge rapidly with the cut-off.
Exercise 4: Calculate the Pulay stress for a specific energy cut-off. Then relax the configuration by setting the Pulay stress explicitly (see section 7.6). Such a calculation requires to set the following parameters in the INCAR file:

NSW =
ISIF =
IBRION =
POTIM =

to the INCAR file and use a UNIX batch file to calculate the equilibrium cell shape for different volumes. The batch file might look similar to
rm WAVECAR
for i in 3.7 3.8 3.9 4.0 4.1
do
cat >POSCAR <<!
C: hcp
$i
 1.00000 0.00000000000000 0.00000
-0.50000 0.86602540378444 0.00000
 0.00000 0.00000000000000 1.63299
2
direct
0.00000000000000 0.00000000000000 0.000000
0.33333333333333 0.66666666666667 0.500000
!
echo "a= $i" ; vamp
E=`tail -1 OSZICAR` ; echo $i $E >>SUMMARY.hcp
done
cat SUMMARY.hcp
Exercise 5: If you want to relax the volume as well, always use a large cutoff. Usually 1.3 times the default cutoff is sufficient. Recreate the param.inc file with the makeparam utility program. Check the ISIF parameter and set it correctly. Start a relaxation allowing all degrees of freedom to relax simultaneously.
9.3
Calculating a DOS can be done in two ways: The simple one is to perform a static (NSW=0, IBRION=-1) selfconsistent calculation and to take the DOSCAR file from this calculation. The DOSCAR file can be visualized with
> drawdos
a simple FORTRAN program, which requires erlgraph routines. Mind that VASP can calculate partial DOS. Partial DOS are very powerful for the analysis of the electronic DOS (see section 6.32).
The simple approach discussed above is not applicable in all cases: A high quality DOS requires usually very fine k-meshes. You should think at least in orders of 16x16x16 meshes for small cells and even for large cells you might need something like 6x6x6- or 8x8x8-meshes. For larger cells it is often only possible to do calculations for one or two k-points (due to restrictions in central memory). This problem also occurs for band-structure calculations. In this case one is interested in the band-structure along certain lines in the BZ and for each line a division into approximately 10 k-points is required to get a dense packing of data points allowing visualization routines a smooth and realistic interpolation between these data points.
The usual way to do DOS or band-structure calculations in this case is the following: the charge density and the effective potential converge rapidly with increasing number of k-points. So, as a first step one generates a high quality charge density using a few k-points in a static selfconsistent run. The next step is to perform a non-selfconsistent calculation using the CHGCAR file from this selfconsistent run (i.e. ICHARG is set to 11, see section 6.14). Mind, this is the only way to calculate the band structure, because for band-structure calculations the supplied k-points form usually no regular three-dimensional grid and therefore a selfconsistent calculation gives pure nonsense!
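For the non-selfconsistent band-structure run the k-points along each line have to be listed explicitly. The following sketch (hypothetical function name) generates equally spaced points between two end points and appends a weight of 1 to each; the header lines of the KPOINTS file and the choice of end points are left to the user.

```shell
# kpath X1 Y1 Z1 X2 Y2 Z2 N
# Print N+1 equally spaced k-points from (X1,Y1,Z1) to (X2,Y2,Z2), each with weight 1.
kpath() {
    awk -v x1="$1" -v y1="$2" -v z1="$3" \
        -v x2="$4" -v y2="$5" -v z2="$6" -v n="$7" 'BEGIN {
        if (n < 1) exit 1
        for (i = 0; i <= n; i++) {
            t = i / n
            printf "%.6f %.6f %.6f 1\n", x1 + (x2 - x1) * t, y1 + (y2 - y1) * t, z1 + (z2 - z1) * t
        }
    }'
}

# Usage: kpath 0 0 0 0.5 0 0.5 10      (11 points, end points included)
```

When several lines are concatenated the shared end points appear twice, which is harmless for plotting but should be kept in mind when counting k-points.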
For ICHARG=11, all k-points can be treated independently, there is no coupling between them, because the charge density and the potential are kept fixed. Therefore there is also no need to treat the k-points within one single run simultaneously. Just split the job into runs including only one single k-point and merge the results for the individual k-points into one single data file. For people being not so familiar with the output formats of the various files this procedure could produce some headache. Therefore we provide some tools for doing this (a Bourne-shell script for UNIX systems and a set of FORTRAN programs) and in the following a short description how to use these utilities is given:
The first step is to provide a KPOINTS file in the "entering all k-point coordinates explicitly" format. If you want to calculate a DOS this file must also contain connection lists for tetrahedra (the tetrahedron method is probably the most useful approach to calculate a DOS because it is parameter-free). To generate such a file you can use the utility
> kpoints
or
> vamp .
Both programs read the POSCAR and KPOINTS file and generate a file IBZKPT which can be copied to KPOINTS. Having set up POSCAR and INCAR correctly use a shell-script called
> rundos.
The rundos script is also useful for band-structures: the last step, which is the calculation of the DOS, fails in this case, but when you have reached this point all required actions have been performed correctly and all necessary files have been created. For band-structure calculations the utility
> toband
can help to create a set of k-points along certain directions of the IRBZ.
The script rundos first calls a utility called splitk, which splits the original KPOINTS file into many KPOINTS files, each with a single k-point. Then a loop over these k-point files is done and the EIGENVAL and (if projection was switched on) PROCAR files are saved. The individual EIGENVAL and PROCAR files are then merged together by tools called mergeeig and mergepro. After this the original KPOINTS file etc. is restored and all temporary EIGENVAL, PROCAR and KPOINTS files are erased. To get the DOS, finally a utility called
> getdos
is called, generating a DOSCAR file according to the data found in PROCAR or EIGENVAL. (This tool can always be used if valid EIGENVAL, KPOINTS, INCAR and PROCAR files exist.)
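The splitting step can be illustrated with a short Python sketch. It is not the splitk utility itself; it only assumes the explicit KPOINTS layout (comment line, number of k-points, coordinate mode, then one "x y z weight" line per k-point) and ignores any tetrahedron connection lists. The function name split_kpoints and the KPOINTS.1, KPOINTS.2, ... labels are hypothetical.

```python
def split_kpoints(text):
    """Return one single-k-point KPOINTS body per k-point of an explicit list."""
    lines = text.strip().splitlines()
    comment = lines[0]
    n_kpoints = int(lines[1].split()[0])
    mode = lines[2]                      # e.g. "Reciprocal" or "Cartesian"
    bodies = []
    for entry in lines[3:3 + n_kpoints]:
        x, y, z = entry.split()[:3]
        # A run with a single k-point uses weight 1; the original weights
        # are only needed again when the results are merged.
        bodies.append(f"{comment}\n1\n{mode}\n{x} {y} {z} 1\n")
    return bodies


example = """explicit k-points
2
Reciprocal
0.0 0.0 0.0 1
0.5 0.0 0.0 3
"""

for index, body in enumerate(split_kpoints(example), start=1):
    print(f"--- KPOINTS.{index} ---")
    print(body, end="")
```

Each returned body would be written to KPOINTS before one run, and the resulting EIGENVAL (and PROCAR) saved under a per-k-point name for the later merge.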
The obtained data can be visualized with the FORTRAN programs
> drawband
> drawdos .
There are also some MATHEMATICA utilities to draw band-structure data (though they are not yet very user-friendly, because many things have to be customized by hand for each individual case). For drawing band structures of localized surface states there exists a tool called sbands, which finds bands with a certain degree of localization at some atom(s) and generates an output file SBAND which can be used directly as input for the MATHEMATICA tool sband.m. Furthermore there exists a tool called bbands, which tries to find minima and maxima of the eigenvalues for all k-points with distinct x-/y-coordinates but different z-coordinates. It creates a file BBAND which can be used as input for the MATHEMATICA tool bband.m, which draws allowed energy regions for the bulk band structure (by shading allowed ranges).
9.4 Atoms
INCAR
POSCAR
POTCAR
KPOINTS
Before using a pseudopotential intensively, it is not only required to test it for various bulk phases; the pseudopotential should also reproduce exactly the eigenvalues and the total energy of the free atom for which it was created. If the energy cutoff and the cell size are sufficient, the agreement between the atomic reference calculation (EATOM in the POTCAR file) and a calculation using VASP is normally better than 1 meV (but errors can be 10 meV for some transition metals). In most cases, calculations for an atom are relatively fast and unproblematic. For the calculation the Γ point should be used, i.e. the KPOINTS file should have the following contents:
Monkhorst Pack
0
Monkhorst Pack
1 1 1
0 0 0
A simple cubic cell is usually recommended; the size of the cell depends on the element in question. Some values for reliable results are compiled in Tab. 3. These cells are also large enough to perform calculations on dimers, explained in the next section.

Table 3: Typical convenient settings for the cell size for the calculation of atoms and dimers (roughly 4-5 times the dimer length):

                                 cell size
Lithium                          13 Å
Aluminium                        12 Å
Potassium                        14 Å
Copper, Rhodium, Palladium ...   10 Å
Nitrogen                          7 Å
C                                 8 Å

The POSCAR file is similar to:
atom
1
10.00000   .00000   .00000
  .00000 10.00000   .00000
  .00000   .00000 10.00000
1
cart
0 0 0
The INCAR file contains the smearing settings:
ISMEAR = 0 ; SIGMA=0.1
The only difference to the bulk calculation is that Gaussian smearing should be used. If the atomic orbitals are almost degenerate, you might have to set SIGMA to a smaller value (but be careful: very small values might degrade convergence significantly). For initial tests, SIGMA=0.1 eV is usually a good starting point.
Mind: Extract the correct value for the energy. It is not F = E + σS, which contains a meaningless entropy term related to accidental orbital degeneracy, but the energy without entropy in the OUTCAR file.
In some rare cases, the real LDA/GGA groundstate might differ from the configuration for which the pseudopotential was generated (most transition metals, see Sec. 10), since the occupancies have been set manually during the pseudopotential generation. For Pd, for instance, an s1 d9 configuration was chosen as the reference configuration, which is not the LDA/GGA groundstate of the atom. In this case, it is necessary to set the occupancies for VASP manually in order to obtain the same energy as the one found in the POTCAR file. This can be done by including the following lines in the INCAR file:
LDIAG = .FALSE.     ! keep ordering of eigenstates fixed
ISMEAR = -2         ! keep occupancies fixed
FERWE = 5*0.9 0.5   ! set the occupancies manually
(5*0.9 is interpreted as 0.9 0.9 0.9 0.9 0.9). To determine the ordering of the eigenvalues it might be necessary to perform a calculation with ICHARG=12 (i.e. fixed atomic charge density). After a successful atomic calculation, compare the eigenvalues with those obtained by the pseudopotential generation program. Also check the total energy; the differences should be smaller than 20 meV.
Here is another example: if the energy of an atom with a particular configuration has to be calculated, e.g. spin-polarized Fe with a valence configuration of 3d6.2 4s1.8, the calculation has to be performed in two steps. First a non-selfconsistent calculation with the following INCAR must be performed:
ISPIN = 2
ICHARG = 12
MAGMOM = 4   ! magnetization in Fe is 4
This first step is required in order to determine a set of initial wavefunctions and the orbital ordering. In the OUTCAR file one finds the following level ordering:
k-point 1 :   0.0000 0.0000 0.0000
band No. band energies
1   -5.0963
2   -5.0963
3   -5.0954
4   -5.0954
5   -5.0954
6   -4.6929
7   -0.7528
8   -0.7528
Spin component 2
k-point 1 :   0.0000 0.0000 0.0000
band No. band energies
1   -3.6296
2   -2.2968
3   -2.2968
4   -2.2889
5   -2.2889
6   -2.2889
7   -0.1247
8   -0.1247
In the spin-up component, the 5 d states have lower energy than the s state, whereas in the down component, the s state has lower energy than the d states. This ordering is important for supplying the occupancies in the lines FERWE and FERDO in the INCAR file in the second calculation. For a spherical atom, the final calculation is performed with the following INCAR file:
ISTART = 1    ! read in the WAVECAR file
ISPIN = 2
MAGMOM = 4
AMIX = 0.2 ; BMIX = 0.0001   ! recommended mixing for magnetic systems
LDIAG = .FALSE.
ISMEAR = -2
FERWE = 5*1 1*1 3*0
FERDO = 0.8 5*0.24 3*0
The determination of the spin-polarised, broken-symmetry groundstate of atoms is discussed in the next section (9.5).
Mind: The size of the cell can be reduced if one special point is used instead of the Γ point, i.e. if the KPOINTS file has the following contents:
Monkhorst Pack
0
Monkhorst Pack
2 2 2
0 0 0
The reasons for this behavior are as follows. Due to the finite size of the cell a band dispersion exists, i.e. the atomic eigenvalues split and form a band with finite width. To first order the center of the band lies exactly at the position of the atomic eigenvalues. At the Γ point, however, the eigenvalues at the bottom of the band are obtained. If the special point (0.25,0.25,0.25) 2π/a is used instead of the Γ point, the energy of the center of the band is obtained. Nevertheless we recommend this setting only for experts: in most cases the degeneracy of the p- and d-orbitals is removed and only the mean value of the eigenvalues remains physically significant. In these cases it is also necessary to increase SIGMA or to set the partial occupancies by hand!
9.5
The POTCAR file contains information on the energy of the atom in the reference configuration (i.e. the configuration for which the PP was generated). Cohesive energies calculated by VASP are with respect to this configuration. The reference calculation, however, did not allow for spin-polarisation or broken-symmetry solutions. To include these effects properly, it is required to calculate the lowest energy magnetic groundstate using VASP.
Unfortunately convergence to the symmetry-broken spin-polarized groundstate can be relatively slow in VASP. The following INCAR file worked reasonably well for most elements:
ISYM = 0        ! no symmetry
ISPIN = 2       ! allow for spin polarisation
VOSKOWN = 1     ! this is important, in particular for GGA
ISMEAR = 0      ! Gaussian smearing, otherwise negative occupancies
SIGMA = 0.1     ! intermed. smearing width
AMIX = 0.2      ! mixing set manually
BMIX = 0.0001
NELM = 20
ICHARG = 1
Execute VASP twice, consecutively, with this input file to get converged energies.
9.6 Dimers
Especially for critical cases, dimers are excellent test systems. If a pseudopotential has passed dimer and bulk calculations, you can be quite confident that the pseudopotential possesses excellent transferability. For the simulation of the dimer, one can use the Γ point and displace the second atom along the diagonal direction. Generally bond length and vibrational frequency have to be calculated. It is highly recommended to perform these calculations using the constant velocity molecular dynamics mode (i.e. IBRION=0, SMASS=-2). This mode speeds up the calculation because the wave functions are extrapolated and predicted using information from previous steps. Your INCAR file must contain some additional lines to perform the constant velocity MD:
ionic relaxation
NSW   = 10
SMASS = -2
POTIM = 1
To avoid complications use POTIM=1 for all constant velocity MDs. In addition to the positions the POSCAR file must also contain velocities:
dimer
1
10.00000   .00000   .00000
  .00000 10.00000   .00000
  .00000   .00000 10.00000
2
cart
0 0 0
1.47802 1.47802 1.47802
cart
0 0 0
-.02309 -.02309 -.02309
exists.
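The numbers in this POSCAR file can be checked with a short Python sketch. It assumes that in constant velocity mode each ionic step moves an atom by velocity times POTIM, and that the first step is evaluated at the starting positions; under these assumptions the ten steps scan the bond length from about 2.56 Å downwards in steps of about 0.04 Å along the cell diagonal. The variable and function names are illustrative.

```python
import math

start = 1.47802       # x = y = z coordinate of the second atom (from the POSCAR)
velocity = -0.02309   # x = y = z velocity component of the second atom
potim = 1.0           # time step POTIM from the INCAR
nsw = 10              # number of ionic steps NSW from the INCAR


def bond_length(step):
    """Distance between the atom at the origin and the moving atom at a given step."""
    coordinate = start + step * velocity * potim
    return math.sqrt(3.0) * coordinate


for step in range(nsw):
    print(step, round(bond_length(step), 3))
```

The energies of the individual steps can then be fitted as a function of these bond lengths to obtain the equilibrium distance and the vibrational frequency.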
Mind: In some rare cases like C2, the calculation of the dimer turns out to be problematic. For C2 the LUMO (lowest unoccupied molecular orbital) and the HOMO (highest occupied molecular orbital) cross at a certain distance, and are actually degenerate if the total energy is used as variational quantity (i.e. σ → 0). Within the finite temperature LDA these difficulties are avoided, but interpreting the results is not easy because of the finite entropy (for C2 see Ref. [50]).
9.7
param.in
INCAR
POSCAR
POTCAR
KPOINTS
Use the makeparam utility to create the param.in file. For molecular dynamics PREC=Low is definitely sufficient. The INCAR file might be similar to
SYSTEM = Se
ENCUT = 150 eV
IALGO = 48
LREAL = A
NELMIN = 4
BMIX = 2.0
MAXMIX = 50
Ionic Relaxation
ISYM = 0
NSW = 100
NBLOCK = 1 ; KBLOCK
SMASS = 2.0
POTIM = 3.00
TEBEG = 573
PC-function
APACO = 10.0
Use IALGO=48 (RMM-DIIS for electrons) for large molecular dynamics runs. You should also evaluate the projection operators in real space (LREAL=A), and require at least 4 electronic iterations per ionic step (NELMIN = 4). For surfaces you might need to increase this value to NELMIN = 8.
The parameters BMIX and MAXMIX require special consideration: it is usually desirable to use optimal mixing parameters for molecular dynamics simulations. This can be done by performing a few static calculations with varying AMIX and BMIX parameters and determining the ones leading to the fastest convergence. However, in the latest version of VASP the dielectric function is reused when the ions are updated (an optimal AMIX and BMIX is no longer that important). The dielectric function is reused after ionic updates if MAXMIX is set. MAXMIX should be about three times as large as the number of iterations required to converge the electronic wavefunctions in the first iteration.
After executing VASP once, it is only necessary to copy CONTCAR to POSCAR and to restart VASP. Usually a shell script is used for this task. An example shell script can be found on the vamp account in the file vamp/scripts/iter.
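The restart cycle can also be sketched in Python. This is not the iter script mentioned above; the executable name "vasp", the function name run_segments and the injectable runner argument are assumptions for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def run_segments(workdir, n_segments, command=("vasp",), runner=subprocess.run):
    """Run the given command n_segments times, copying CONTCAR to POSCAR after each run.

    The command name is an assumption; replace it with however VASP is
    started on your system (e.g. through a queueing system or mpirun).
    """
    workdir = Path(workdir)
    for _ in range(n_segments):
        runner(list(command), cwd=workdir, check=True)
        # The final positions (and velocities) become the next starting point.
        shutil.copyfile(workdir / "CONTCAR", workdir / "POSCAR")
```

A call such as run_segments(".", 10) would chain ten runs in the current directory; it is advisable to save OUTCAR and similar files between segments if they are needed later.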
9.8 Simulated annealing
Simulated annealing runs can be very helpful for an automatic determination of favourable structural models. A few points should be kept in mind.
Usually a simulated annealing run is more efficient if all masses are equal, since then the energy dissipates more quickly between different vibrational modes. This can be done by editing the lines POMASS in the POTCAR file. The partition function remains unaffected by a change of the ionic masses.
The timestep can be chosen larger than usual, in particular if the masses have been changed.
The temperature should be decreased only slowly. This can be done by decreasing the temperature (TEBEG) in the
9.9 Relaxation
9.10 Surface calculations
Surface calculations are definitely very subtle, and you should be rather careful if you want to do such calculations. Before starting, read section 8 with great care and understand the basic outlines of that section. In the following chapters we will explain the typical steps involved in a surface calculation. Even if you follow all these steps, difficulties might come up. So whenever you get physically meaningless results, first think about your possible mistakes (see section 8): are your FFT meshes sufficient, have you used enough k-points, is your calculation converged correctly, are your positions correct, and in general are the parameters in the INCAR, POSCAR, KPOINTS and param.in files chosen correctly? Also mind that an error in an early step of the calculation might result in serious errors for all successive calculations. For instance an error of 1% in the lattice constant might result in an error of up to 3% in the calculation of the surface relaxation. So it is a good idea to spend more time on the first few steps (bulk calculation, determining the necessary size of the FFT grids, k-points etc.).
9.10.1
As a first step perform a bulk calculation, use the tetrahedron method and increase the number of k-points till your calculation is converged to the required accuracy. What is the required accuracy? Sorry, there is no general answer to this question; if you want to calculate surface energies within 10 meV you should probably increase the k-mesh till your energy is converged to 1 meV. Mind that a slab used to model the bulk usually contains around 20-100 atoms. This means that you need a very accurate bulk energy to get reliable surface energies. Once again, be as careful as possible. If you generate the param.in file automatically, choose PREC=High for the bulk calculation.
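Such a convergence test is easy to automate once the energies of the individual runs are collected. The Python sketch below uses the change between consecutive meshes as a simple proxy for convergence; the function name first_converged and the energies are made up for illustration and do not come from a real calculation.

```python
def first_converged(runs, tol=1.0e-3):
    """Return the label of the first mesh whose energy differs from the
    previous (coarser) mesh by less than tol (in eV), or None.

    runs: list of (label, energy per atom in eV), ordered from coarse to fine.
    """
    for (_, e_prev), (label, e_cur) in zip(runs, runs[1:]):
        if abs(e_cur - e_prev) < tol:
            return label
    return None


# Hypothetical energies per atom, only to illustrate the check.
runs = [
    ("4x4x4", -3.7012),
    ("6x6x6", -3.7340),
    ("8x8x8", -3.7391),
    ("10x10x10", -3.7396),
]

print(first_converged(runs))
```

With a tolerance of 1 meV the sketch reports the 10x10x10 mesh, because only the last step changes the energy by less than 0.001 eV.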
As a second step switch from the tetrahedron method to a finite temperature method. There are two possible choices at this moment:
You can use the Gaussian method (ISMEAR=0) and a small SIGMA value (SIGMA=0.1). This method was previously
From now on you should neither change ISMEAR nor SIGMA nor ENCUT. Repeat the bulk calculations with an increasing set of k-points. The convergence speed with respect to the number of k-points should be almost the same as with the tetrahedron method.
Choose a reasonable set of k-points and the energy cutoff you want to use for the surface calculation and calculate the equilibrium lattice constant. Avoid wrap around errors (PREC=High for the makeparam utility). The lattice constant you obtain must be used as the lattice constant in the surface calculations. The free energy is the reference value for all further calculations. Also calculate the entropy (i.e. the term
entropy T*S
in the OUTCAR file) or write down the total energy and the physical energy (σ → 0).
9.10.2
The first step involves finding a reasonable FFT mesh. If you want to avoid wrap around errors altogether, choose the values recommended by VASP (or the makeparam utility for PREC=High). As a first test use a supercell containing approximately 5 layers of bulk and 5 layers of vacuum. Use a reasonable, not too large k-point set (see below). The values for the FFT mesh which strictly avoid any wrap around errors are also written to the OUTCAR file:
WARNING: wrap around error must be expected
set NGX to 22
These meshes will result in long computational times, but you must afford at least one exact calculation. If you want to reduce the meshes, try to use the 3/4 rule (makeparam PREC=Med) and compare the results with the exact converged results.
As a next step find a reasonable k-point mesh. First hints are already given by the bulk calculation. For a surface calculation you will have one long lattice vector and two short lattice vectors. For the long direction one division for the k-point mesh is sufficient, because, due to the vacuum, the band dispersion is zero in this direction. For the short directions the convergence speed with respect to the number of divisions will be approximately the same as for the bulk. Increase the number of k-points till you get a sufficiently converged free energy. Once again, avoid large wrap around errors.
Possible cross checks:
The entropy per atom should be the same as in the bulk calculation. If this is not the case, decrease SIGMA and repeat all calculations.
The total drift in the forces must be small. If this is not the case, your FFT mesh is not sufficient and must be increased accordingly.
Also check the convergence speed of the forces with respect to the k-point mesh and the size of the FFT mesh.
9.10.3
From now on keep the number of k-points and the wrap around errors fixed (i.e. try to always use the same ratio between the value which avoids all wrap around errors and the actual FFT mesh). Test how many bulk and vacuum layers are necessary to get a reasonable surface energy and a reasonably converged force on the first (and possibly second) slab layer.
Mind: Do not change more than one parameter from one calculation to the next calculation. It is almost impossible to compare two calculations which differ in the number of k-points and in the size of the supercell. Be very careful about the FFT meshes: if you increase the size of the supercell without increasing the size of the FFT mesh, the results do not improve. Actually results get even worse in this case, because the wrap around error increases.
9.11
We have a small program to calculate phonon dispersion relations for bulk materials with arbitrary symmetry, but presently we do not plan to release this program, since it is difficult to use and rather complicated. If you want to perform phonon calculations we hence recommend using the following package developed by Krzysztof Parlinski:
Krzysztof Parlinski
E-mail: b8parlin@cyf-kr.edu.pl
Fax: +48-12-637-3073
http://wolf.ifj.edu.pl/phonon
The program runs under Windows and offers a nice graphical user interface. Presently the program is not free.
10
Table 4: Correction to the energy of the atom for the US-PP. Add this value to the energies determined by VASP.

      exp.       GS            d(E) GGA   d(E) LDA
3d
Sc    3d 4s2     3d 4s2        1.78       1.73
Ti    3d2 4s2    3d3 4s        2.24       1.99
V     3d3 4s2    3d4 4s        3.77       3.38
Cr    3d5 4s     3d5 4s        5.87       5.30
Mn    3d5 4s2    3d5 4s2       5.62       5.02
Fe    3d6 4s2    3d6.2 4s1.8   3.15       2.82
Co    3d7 4s2    3d7.7 4s1.3   1.43       1.28
Ni    3d8 4s2    3d9 4s        0.55       0.49
Cu    3d10 4s1   3d10 4s1      0.22       0.18
4d
Y     4d 5s2     4d 5s2        1.91       1.90
Zr    4d2 5s2    4d3 5s        1.91       1.66
Nb    4d4 5s     4d4 5s        3.08       2.70
Mo    4d5 5s     4d5 5s        4.61       4.09
Tc    4d5 5s2    4d5 5s2       3.06       2.73
Ru    4d7 5s     4d7 5s        1.96       1.74
Rh    4d8 5s     4d8 5s        1.06       0.94
Pd    4d10       4d10          1.51       1.46
5d
Hf    5d2 6s2    5d2 6s2       3.05       2.98
Ta    5d3 6s2    5d3 6s2       3.24       3.10
W     5d4 6s2    5d5 6s        4.53       4.00
Re    5d5 6s2    5d5 6s2       4.42       4.07
Os    5d6 6s2    5d6 6s2       2.53       2.33
Ir    5d9        5d8 6s1       0.87       0.92
Pt    5d9 6s     5d9 6s        0.48       0.41
It was a difficult and time-consuming task to generate these PPs. The reasoning for their generation is, however, obvious. PP generation was, and still is, a tricky, cumbersome, error-prone and time-consuming task, and only few groups can afford to generate new PPs for every problem at hand. But if a large user community applies the same set of pseudopotentials to widely different problems, ill-behaved PPs are easily spotted and can be replaced by improved potentials.
This philosophy has certainly paid off. The PPs supplied with VASP are among the best pseudopotentials presently available, but the pseudopotential method has been superseded by better electronic structure methods, such as the PAW method. Hence, the development of the pseudopotentials distributed has come to an end, and we strongly recommend using the PAW datasets now supplied in the VASP-PAW package (see Sec. 10.2).
All PPs supplied with VASP are of the ultra-soft type (with few exceptions), and for most elements only one LDA and one GGA PP is supplied. All pseudopotentials are supplied with default cutoffs (lines ENMAX and ENMIN in the POTCAR files), and information on how the PP was generated. This should make it easier to determine which version was used, and user mistakes are easier to correct. The POTCAR files also contain information on the energy of the atom in the reference configuration (i.e. the configuration for which the PP was generated). Cohesive energies calculated by VASP are with respect to this configuration. Mind that the cohesive energies written out by VASP require a correction for the spin-polarization energies of the atoms.
For the transition metals an additional problem exists: the cohesive energies written out by VASP are with respect to a virtual non-spin-polarized pseudo-atom having one s electron and Nvalence-1 d electrons. This is usually not the experimental ground state configuration.
The table below gives the required energy corrections (d(E)) for transition metals: i.e. it contains the difference between the virtual non-spin-polarized pseudo-atom and a spin-polarized groundstate (GS) atom calculated with VASP. The calculations have been done consistently with VASP, using the procedure described in Sec. 9.5.
Mind that LDA/GGA is not able to predict the correct groundstate (line exp.) for all transition metals. This is not a failure of VASP but related to deficiencies of the LDA/GGA approximation. Only configuration interaction (CI) calculations are presently able to predict the groundstate of all transition metals correctly.
The POTCAR file also contains information about the approximate error according to the RRKJ (Rappe, Rabe, Kaxiras and Joannopoulos) kinetic energy criterion. This approximate error is taken into account when cohesive energies are calculated, and this is the reason why cohesive energies do not decrease strictly with the energy cutoff. If you do not like this feature remove the lines after
Error from kinetic energy argument (eV)
in the POTCAR file. We want to point out that the RRKJ kinetic energy criterion is usually very accurate and corrects for more than 90% of the error in the cohesive energy, but it works only if there is no considerable charge transfer from one state to another state (s → d or s → p).
10.1
For H three POTCAR files exist. The H/POTCAR and H_200eV/POTCAR files actually contain the same PP. The only difference is that H_200eV has a lower default energy cutoff of 200 eV (the default cutoff for H is 340 eV). Up to now we have not found any difference between calculations using 200 and 340 eV; we therefore recommend using only H_200eV (differences for the H2 dimer are for instance less than 1%). If H is used together with hard elements like carbon, VASP will anyway adopt the higher default cutoff of C. The third potential H_soft (generated by J. Furthmueller) should be used in conjunction with soft elements like Si, Ge, Te etc. As one can see from the data base file, the H2 dimer length and vibrational frequencies are still quite reasonable.
For the first row elements two PPs exist; we recommend the standard version, which gives very high accuracy. The second set (B_s, C_s, O_s, N_s, F_s) is significantly softer and should be used only after careful testing. We have found that the second set is safe if a hard species is mixed with a softer one (that is for instance the case in Si-C, Si-O2, or even Ti-O2).
For Ga, In, Sn and Pb one should describe the 3d or 4d states as valence; the corresponding PPs can be found on the server in the directories
Ga_d, In_d, Sn_d, Pb_d
If one puts the 3d or 4d states in the core, the results depend strongly on the position of the d-reference energy. The d-reference energy for the conventional Ga, In, Sn and Pb PPs (with d in the core) has been adjusted so that the equilibrium volume is within 1 percent of the equilibrium volume for the Ga_d, In_d and Sn_d PPs. This is clearly an ad hoc fix, but results in reasonably accurate pseudopotentials. Mind that PPs including d are currently missing for Ge, and for very accurate calculations such a PP might be required.
The following PPs are currently available with p semi-core states:
Li_pv
Na_pv  Mg_pv
K_pv   Ca_pv  Sc_pv  Ti_pv  V_pv   Fe_pv
Rb_pv  Sr_pv  Y_pv   Zr_pv  Nb_pv  Mo_pv
Cs_pv  Ba_pv  Ta_pv  W_pv
For a few elements harder NC-PPs exist, which can be used in calculations under pressure, for ionic systems, or for oxides:
Na_h Mg_h Al_h Si_h
10.2
PAW potentials for all elements in the periodic table are available. With the exception of the 1st row elements, all PAW potentials were generated to work reliably and accurately at an energy cutoff of roughly 250 eV (as usual the default energy cutoff is read by VASP from the POTCAR file). If you use any of the supplied PAW potentials you should include a reference to the following articles:
P.E. Blöchl, Phys. Rev. B 50, 17953 (1994).
G. Kresse and D. Joubert, From ultrasoft pseudopotentials to the projector augmented wave method, Phys. Rev. B 59, 1758 (1999).
The distributed PAW potentials have been generated by G. Kresse following the recipes discussed in the second reference. Generally the PAW potentials are more accurate than the ultra-soft pseudopotentials. There are two reasons for this: first, the radial cutoffs (core radii) are smaller than the radii used for the US pseudopotentials, and second, the PAW potentials reconstruct the exact valence wave function with all nodes in the core region. Since the core radii of the PAW potentials are smaller, the required energy cutoffs and basis sets are also somewhat larger. If such a high precision is not required, the older US-PP can be used. In practice, however, the increase in the basis set size will be small anyway, since the energy cutoffs have not changed appreciably for C, N and O, so that calculations for models which include any of these elements are not more expensive with PAW than with US-PP.
For some elements several PAW versions exist. The standard version has generally no extension. An extension _h implies that the potential is harder than the standard potential and hence requires a larger energy cutoff. The extension _s means that the potential is softer than the standard version. The extensions _pv and _sv imply that the p and s semi-core states are treated as valence states (i.e. for V_pv the 3p states are treated as valence states, and for V_sv the 3s and 3p states are treated as valence states). PAW files with an extension _d treat the d semi-core states as valence states (for Ga_d the 3d states are treated as valence states).
In the following sections, the PAW potentials are discussed in somewhat more detail.
10.2.1
For Li (and Be), a standard potential and a potential which treats the 1s shell as valence states are available (Li_sv, Be_sv). For many applications one should use the _sv potentials, since their transferability is much improved compared to the standard potentials.
For the other first row elements three pseudopotential versions exist. For most purposes the standard versions should be used. They work for cutoffs between 325 and 400 eV, where 370-400 eV are required to accurately predict vibrational properties, but binding geometries and energy differences are well reproduced with 325 eV. The typical bond length errors for first row dimers (N2, CO, O2) are about 1% (compared to more accurate DFT calculations, not experiment). The hard pseudopotentials _h give results that are essentially identical to the best DFT calculations presently available (FLAPW, or Gaussian with huge basis sets). The soft potentials are optimised to work around 250-280 eV. They yield a very reliable description for most oxides, such as VxOy, TiO2, CeO2, but fail to describe some structural details in zeolites (i.e. cell parameters and volume).
10.2.2
For most alkali and alkaline-earth elements the semi-core s and p states should be treated as valence states. For lighter elements (Na-Ca) it is usually sufficient to treat the 2p and 3p states, respectively, as valence states (_pv), whereas for Rb-Sr the 4s, 4p and 5s, 5p states, respectively, must be treated as valence states (_sv). Hence the standard potentials are
Na_pv           Mg or Mg_pv
K_pv or K_sv    Ca_pv or Ca_sv
Rb_sv           Sr_sv
Cs_sv           Ba_sv
For K, results should not be sensitive to whether K_pv or K_sv is used. Likewise, for Mg the standard potential will be sufficient in most cases.
10.2.3 d-elements
The same holds for the d elements: the semi-core p states and possibly the semi-core s states should be treated as valence states. In most cases, however, reliable results can be obtained even if the semi-core states are kept frozen. As a rule of thumb the p states should be treated as valence states if their eigenenergy ε lies above -2.5 Ry. If this is used as the criterion for whether the semi-core p states are kept frozen, we obtain the following set of standard potentials:
1      2      3      4   5   6   7   8   9   10
Sc_sv  Ti     V      Cr  Mn  Fe  Co  Ni  Cu  Zn
Y_sv   Zr_sv  Nb_pv  Mo  Tc  Ru  Rh  Pd  Ag  Cd
       Hf_pv  Ta     W   Re  Os  Ir  Pt  Au  Hg
For Ta-Os, presently only potentials which include the 5p as valence states are available (Ta_pv - Os_pv).
11
f-elements
For f elements, potentials which treat the f orbitals as valence orbitals are available for La, Ce, Ac, Th, Pa, U, Np and Pu. For all elements one standard version and one softer potential (_s) is available. Whereas the semi-core p states are always treated as valence states, the semi-core s states are treated as valence states only in the standard potentials. For most applications (oxides, sulfides), the standard version should be used. For calculations on inter-metallic compounds the soft versions are, however, sufficiently accurate.
In addition, special GGA potentials are supplied for Ce-Lu, in which the f electrons are kept frozen in the core (standard model for the treatment of localised f electrons). The number of f electrons in the core equals the total number of valence electrons minus the formal valency. For instance: according to the periodic table Sm has a total of 8 valence electrons (6 f electrons and 2 s electrons). In most compounds Sm, however, adopts a valency of 3, hence 5 f electrons are placed in the core when the pseudopotential is generated (the corresponding potential can be found in the directory Sm_3). The formal valency n is indicated by _n, where n is either 3 or 2. Ce_3 is for instance a Ce potential for trivalent Ce (for tetravalent Ce the standard potential should be used).
prepares the pseudopotentials for VAMP and creates the POTCAR file, which can be used by VAMP. Several files are used by both programs:
PSCTR
V_RHFIN
V_RHFOUT
V_TABIN
V_TABOUT
PSEUDO
WAVE_FUNCTION
DDE
POTCAR
The central input file for both programs is PSCTR. It contains all information for the calculation of the pseudopotential. The input file V_RHFIN, on the other hand, describes the atomic reference configuration and controls the all-electron (AE) part of the pseudopotential generation program. The pseudopotential generation program rhfsps creates the files PSEUDO and WAVE_FUNCTION, which are read and interpreted by the fourpot3 program. The final output file is the POTCAR file, which can be read by VAMP.
Mind: All programs discussed in this section use a.u.; energies are always in Rydberg. This is an important difference to VAMP (which uses eV and Å).
CA
.002000 106.42000 125.
0
.5  -1761.5171   2.0000
.5   -257.9015   2.0000
1.5  -231.7505   6.0000
.5    -46.6977   2.0000
1.5   -38.0485   6.0000
2.5   -24.1966  10.0000
.5     -6.4877   2.0000
1.5    -3.9976   6.0000
.5      -.3403   1.0000
2.5     -.5091   9.0000
.5      -.1000    .0000
The first line is a comment, which should contain the name of the element and the reference configuration for the valence electrons. The second line
   11   46.    .00   2000   106.42000   125.
    J    Z    XION      N          AM      H
gives the most important information about the atom. J is the number of orbitals, Z the ordering number. XION can be used to supply a degree of ionization, but normally this value is zero. N is the number of grid points (usually we use 2000), and AM the atomic mass, which is used to calculate the innermost point for the logarithmic grid. H determines the spacing between the grid points. The grid points are given by

$$ r = r_{\mathrm{small}} \, e^{\mathrm{number}/H} \qquad (11.1) $$
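Reading Eq. (11.1) as r = r_small exp(number/H), the grid can be tabulated with a few lines of Python. The value of r_small and the choice to start counting at 1 are assumptions made only for this sketch; in the program the innermost point is derived from the atomic mass AM.

```python
import math


def radial_grid(r_small, n_points, h):
    """Logarithmic radial grid r_i = r_small * exp(i / h), i = 1 .. n_points."""
    return [r_small * math.exp(i / h) for i in range(1, n_points + 1)]


# r_small is a hypothetical value in a.u.; N = 2000 and H = 125 as in the text.
grid = radial_grid(r_small=1.0e-5, n_points=2000, h=125.0)

print(len(grid))
print(grid[1] / grid[0])   # constant ratio exp(1/H) between neighbouring points
print(grid[-1])            # outermost point, r_small * exp(N/H)
```

A larger H therefore gives a finer grid: neighbouring points differ by the factor exp(1/H), and with N = 2000 and H = 125 the grid spans a factor of exp(16) in radius.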
We normally use H=125. DELRVR is the break condition for the selfconsistency loop and PHI the linear mixing parameter for the charge density. NC1 determines the maximum number of selfconsistency loops. If a V_TABIN file exists GREEN should be FALSE (F); if no V_TABIN exists set GREEN to T; in this case an appropriate start potential will be calculated.
The parameter CH determines the type of the exchange correlation; the following settings are possible:
      Slater-XC
HL    Hedin Lundquist (1971)
CA    Ceperley and Alder parameterized by J. Perdew and Zunger
WI    Wigner interpolation
PB    Perdew-Becke
PW    Perdew-Wang 86
LM    Langreth-Mehl-Hu
91    Perdew-Wang 91
Among these, the last four are gradient corrected functionals. The parameter QCOR determines the number of core electrons (i.e. non-valence electrons). The next line in the V_RHFIN file supplies less important information. The first parameter is the SLATER parameter used only in conjunction with the Slater-XC. The next parameter is no longer used, and the last one can be used to set up the so-called Latter correction to the exchange correlation potential. Latter corrections must not be applied if pseudopotentials are calculated. The remaining J lines give information about each atomic orbital. The code is scalar relativistic, but the input file is compatible with a relativistic input format. The first value in each line is the main quantum number, the second one the l-quantum number, and the third one the j-quantum number (j = l ± 1/2). The j-quantum number is not used in the program. The next value gives the energy of the atomic orbital; the last number is the occupancy of the orbital. The supplied energy is uncritical and only used as a start value for the calculation of the atomic orbitals. As a starting guess you might insert values obtained from an atom lying close to the atom of interest.
The program rhfsps writes two les V RHFOUT and V TABOUT. The V RHFOUT le is
ompatible to V RHFIN and
an be
opied to V RHFIN, if V TABOUT is
opied to V TABIN. In this
ase rhfsps will start from the fully
onverged
AE-potential supplied in V TABIN. This saves time, and generally we re
ommend this setting.
11.2 PSCTR
The PSCTR file controls the pseudopotential generation program (rhfsps) and the calculation of the US pseudopotentials (fourpot3). A simple PSCTR file might have the following contents:
TITEL  = Pd: NC=2.0 US=2.7, real-space 200eV, opt
LULTRA =    T       use ultrasoft PP ?
RWIGS  =  2.600     Wigner-Seitz radius

ICORE  =    0       local potential
RMAX   =  3.000     core radius for proj-oper
QCUT   =  4.000; QGAM =  8.000   optimization parameters

Description
  l     E     TYP  RCUT    TYP  RCUT
  0   .000     15  2.100    15  2.100
  2   .000      7  2.000    23  2.700
  2  -.600      7  2.000    23  2.700
  1   .000      7  2.700     7  2.700
The PSEUDO and POTCAR files generated by rhfsps and fourpot3 contain a default energy cutoff, which might be used for the calculations with VASP. The default cutoff guarantees reliable calculations, with errors in the eigenvalues smaller than 1 mRy (i.e. 13 meV; for s elements the error is usually much smaller). This is sufficient as long as the stress tensor is not important, because Pulay contributions are usually not negligible for this cutoff (increase the cutoff by a factor of 1.5 if Pulay contributions should be avoided).
The default energy cutoff works only for US-PP constructed with the RRKJ scheme. The default cutoff is proportional to the square of the highest expansion coefficient q_high used in the RRKJ scheme [18, 43]:
$$ \mathrm{ENMAX} \propto q_{\mathrm{high}}^{2} \times 13.6058 \qquad (11.2) $$
(q_high is in a.u., whereas ENMAX is in eV, hence the conversion factor 13.6058). There is also a line ENMIN in the POTCAR and PSEUDO files. ENMIN corresponds to the minimal energy required for a reasonably accurate calculation (for instance, ENMIN is sufficient for molecular dynamics). ENMIN is calculated according to
$$ \mathrm{ENMIN} \propto q_{\mathrm{high}}^{2} \times 13.6058 \qquad (11.3) $$
(q_high is in a.u., whereas ENMIN is in eV, hence the conversion factor 13.6058).
11.4
11.4.1 TITEL-tag
TITEL = string
11.4.2 NWRITE-tag
NWRITE = 0|1|2      verbosity
Default :
11.4.3 LULTRA-tag
LULTRA = F|T
Default : .FALSE.
Determines whether US pseudopotentials are created. The calculation of the US pseudopotentials is not done within rhfsps but within fourpot3. Actually, for LULTRA=T, simply two sets of pseudo wave functions per l-quantum number are calculated. The first set is used by fourpot3 to set up the augmentation part, and the second pseudo wave function is used for the actual pseudopotential description.
11.4.4 RPACOR-tag
Default :
If RPACOR is supplied and non-zero, partial core corrections are calculated. The partial core correction can improve the transferability of pseudopotentials significantly if core and valence electrons overlap [46]. If RPACOR is a positive non-zero value, the core charge density is truncated at RPACOR and the corresponding truncated charge density is used for the unscreening procedure. If RPACOR is negative, rhfsps searches for the point where the core charge density is -RPACOR times larger than the valence charge density. At this radius the core charge density is truncated.
11.4.5 IUNSCR-tag
IUNSCR = 0|1|2
Default : 1   if RPACOR ≠ 0
          0   else
Determines how the unscreening is done, and is used in conjunction with RPACOR (section 11.4.4). Usually the user need not set this flag by hand; it is safer to use RPACOR. If RPACOR is supplied, IUNSCR will be set to 1, corresponding to a non-linear unscreening [46]. If RPACOR is not supplied or zero, IUNSCR will be set to 0, corresponding to a linear unscreening and no partial core correction. IUNSCR = 2 uses Lindhart's approach for the core-valence exchange correlation; this approach is only interesting in conjunction with pseudopotential perturbation theory and must not be used with VAMP.
11.4.6 RCUT-tag
RCUT = R_cut      default cutoff
Determines the cutoff radius for a pseudopotential if nothing is supplied in the Description section of the PSCTR file (section 11.4.11). This line is not required, and the Description section of the PSCTR file should be used instead.
11.4.7 RCORE-tag
Default :
for TM and RRKJ
else
Determines the core radius for the pseudopotential generation. At the core radius the logarithmic derivatives of the AE wave functions and the pseudo wave functions are matched. For some schemes (TM and RRKJ) this core radius can be similar to the cutoff radius R_cut supplied in the Description section of the PSCTR file (section 11.4.11). For these schemes the pseudo wave function is strictly the same as the AE wave function for r > R_cut. This is not the case for the BHS, VAN and XNC schemes. Here RCORE must be supplied by the user and should be 1.5 times as large as the maximum cutoff radius R_cut.
11.4.8 RWIGS-tag
Default : RCORE
Determines a radius where some quantities are checked for their accuracy. Usually RWIGS is set to the Wigner-Seitz radius or to half the distance between nearest neighbors. This value is passed to VAMP and used as the Wigner-Seitz radius for the calculation of the partial spd wave function characters and the local partial DOS (section 5.16).
11.4.9
xl
HOCHN
parameters used for the XNC (extended norm-conserving) pseudopotentials, see equ. (11) in [40]:
f 3 (x) = (1
mpxn )100
(11.4)
11.4.10
These parameters control the RRKJ scheme and its variants [43, 18]. We have found that Bessel functions are a natural basis set to expand the pseudo wave functions, but generally the optimization proposed by RRKJ does not improve the convergence speed significantly [18].
Optimization can be switched off if NMAX1 and NMAX2 are set to 0. In all other cases NMAX1 and NMAX2 give the number of Bessel functions used in the optimization; NMAX1 is used for the first set of parameters in the Description section of the PSCTR file (section 11.4.11; usually the NC part) and NMAX2 is used for the second set of parameters (usually the non-norm-conserving part). LCONT controls whether the third derivatives of the pseudo wave functions are continuous at the cutoff radius. This results in a continuous first derivative of the pseudopotential at the cutoff radius. QRYD is the allowed energy error in the optimization [12, 18].
NMAX1=0 and NMAX2=0 always gives the best pseudopotentials. Anything else is only for absolute experts.
in the PSCTR file. It contains information on how the pseudopotentials for each quantum number l are calculated. For each quantum number l more than one line, each corresponding to a different reference energy, can be supplied. The ordering need not be the same as in the V_RHFIN file, but for each valence orbital in the V_RHFIN file at least one corresponding line in the PSCTR file should exist. For conventional pseudopotentials (tag LULTRA=F, section 11.4.3) each line consists of one data set containing the following information:
0    .000    15     2.100
L    EREF    ITYPE  RCUT
For ultrasoft pseudopotentials (tag LULTRA=T, section 11.4.3) each line must contain two data sets:
2    .000    7       2.000   23      2.700
L    EREF    ITYPE1  RCUT1   ITYPE2  RCUT2
The first data set controls the calculation of the norm-conserving wave functions used for the augmentation part; the second one controls the possibly non-norm-conserving part [18]. If LULTRA=T and a specific l-pseudopotential should be norm-conserving (for instance, we usually create a norm-conserving s pseudopotential and an ultrasoft d pseudopotential for the transition metals), both data sets must be strictly identical, for instance:
0    .000    15  2.100    15  2.100
In this case the augmentation charge is simply zero for the s pseudopotential and a norm-conserving s PP is generated.
The first number in each line of the Description section is the l-quantum number; the second number gives the reference energy. If the reference energy is zero, the pseudopotential is created for a bound state (i.e. the reference energy is equal to the corresponding eigenenergy of the valence wave function). If EREF is non-zero, the pseudo wave function (and pseudopotential) for a non-bound state is calculated [45]. ITYPE controls the type of the pseudopotential. The following values are possible to calculate norm-conserving pseudo wave functions:
 1    BHS
 2    TM
 3    VAN
 6    XNC
 7    RRKJ wave function possibly with node
15    RRKJ wave function strictly no node
For the BHS, VAN and XNC schemes the energy derivative of x_l(E) is fitted at the reference energy and no norm-conservation constraint is applied (for the non-relativistic case a one-to-one relationship between the logarithmic derivative and the norm-conservation constraint exists; this relation does not hold exactly for the scalar relativistic case). If the norm-conservation constraint should be used instead, add 16 to these values. The RRKJ scheme without optimization (i.e. NMAX1=0, NMAX2=0; section 11.4.10) might result in wave functions with a node close to R = 0; this can be avoided by setting ITYPE to 15. Nevertheless, nodes do not matter if factorized KB pseudopotentials are generated.
Non-norm-conserving pseudo wave functions can be calculated by adding 8 to the values given above, i.e.:
 9    BHS
10    TM
11    VAN
14    XNC
15    RRKJ wave function possibly with node
23    RRKJ wave function strictly no node
Extensive testing has been done only for ITYPE=15 and 23.
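To make the layout of a Description line for LULTRA=T concrete, the following Python sketch splits one line into its two data sets and labels the schemes using the two ITYPE tables above. It is a reading aid only, not part of rhfsps or fourpot3; the function name `parse_description_line` and the returned dictionary keys are hypothetical. Following the rule stated above, a line whose two data sets are identical is treated as a norm-conserving channel.

```python
# Scheme labels copied from the two ITYPE tables above.
NC_SCHEMES = {
    1: "BHS", 2: "TM", 3: "VAN", 6: "XNC",
    7: "RRKJ wave function possibly with node",
    15: "RRKJ wave function strictly no node",
}
# Non-norm-conserving variants: the same schemes with 8 added to ITYPE.
US_SCHEMES = {itype + 8: name for itype, name in NC_SCHEMES.items()}

def parse_description_line(line):
    """Split one LULTRA=T Description line: L EREF ITYPE1 RCUT1 ITYPE2 RCUT2."""
    l, eref, itype1, rcut1, itype2, rcut2 = line.split()
    first = (int(itype1), float(rcut1))
    second = (int(itype2), float(rcut2))
    # Identical data sets mean zero augmentation charge (norm-conserving PP).
    norm_conserving = first == second
    table2 = NC_SCHEMES if norm_conserving else US_SCHEMES
    return {
        "l": int(l),
        "eref": float(eref),
        "scheme1": NC_SCHEMES.get(first[0], "unknown"),
        "rcut1": first[1],
        "scheme2": table2.get(second[0], "unknown"),
        "rcut2": second[1],
        "norm_conserving": norm_conserving,
    }

d_channel = parse_description_line("2 .000 7 2.000 23 2.700")
s_channel = parse_description_line("0 .000 15 2.100 15 2.100")
```

Note that the ITYPE value alone is ambiguous (15 appears in both tables), which is why the sketch decides by the position of the data set and by whether both data sets agree.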
11.5
As a default action, fourpot3 tries to read the FOURCTR file to set up the control parameters for the run. We do not recommend the use of FOURCTR; instead it is better to supply the parameters in the PSCTR file. If no FOURCTR file exists, the fourpot3 program reads certain tagged lines from the PSCTR file.
11.5.1
The ICORE line, respectively the RCORE line, determines the local component of the pseudopotential; one of these lines must be supplied in the PSCTR file. If ICORE is supplied, the local pseudopotential is set to the first pseudopotential with the l-quantum number equal to ICORE found in the PSEUDO file. Alternatively, if the RCLOC flag is found in the PSCTR file, then the exact AE potential is truncated at RCLOC and set to
$$ C \sin(Ar)/r \qquad (11.5) $$
for r < RCLOC. C and A are determined so that the potential is continuous at the cutoff RCLOC. This potential is used as the local potential.
11.5.2 MD, NFFT
NFFT sets the number of points for the FFT of the local potential and the charge densities. The default is 32768 and must not be changed except for testing the accuracy.
MD supplies the number of points for a Gauss integration used in certain parts of the code. The default is 64 and must not be changed except for testing the accuracy. The next smaller possible value is 48.
11.5.3 NQL, DELQL
These tags determine the grid for the local potential in reciprocal space. If you want to avoid incompatibilities with VAMP, NQL must be 1000; this is also the default value. The default for DELQL is 0.05; the actual spacing is 2/RCORE DELQL 1/a.u.
11.5.4 NQNL, DELQNL
These tags determine the grid for the non-local potential in reciprocal space. If you want to avoid incompatibilities with VAMP, NQNL must be 100; this is also the default value. The default for DELQNL is 0.1; the actual spacing is 2/RCORE DELQNL 1/a.u. NQNNL is only used in conjunction with perturbation theory.
11.5.5 RWIGS, NE, EFROM, ETO
These tags determine the energy range, the radius and the number of energies for which the logarithmic derivatives x_l(E) are calculated (see also section 11.8).
Defaults:
RWIGS = RCORE
NE    = 100
EFROM = -2
ETO   = 2
11.5.6 RMAX, RDEP, QCUT, QGAM
These tags control the real space optimization of the pseudopotentials [47] and the extent of the non-local projection operators. If no real space optimization is selected, QCUT must be zero. The default values are:
RMAX = RCORE
RDEP = RCORE
QCUT = -1        automatic real space optimization, default cutoff
QGAM = 2*QCUT
If real space optimization should be done, QCUT must be set to the energy cutoff which will be used in VAMP. Here, however, QCUT has to be supplied in 1/a.u. (i.e. as an inverse length) and can be calculated from the cutoff energy using the formula (E_cut in eV, hence the conversion factor 13.6058):
$$ E_{\mathrm{cut}} = \mathrm{QCUT}^{2} \times 13.6058 $$
If all wrap-around errors are avoided in VAMP, QGAM can be 3*QCUT, but if the 3/4 rule is used for setting up the FFT meshes (see section 8.4) QGAM must be 2*QCUT. To get accurate real space projection operators, RMAX has to be somewhat larger than RCORE; usually 1.25*RCORE is sufficient. During the optimization the projection operators are changed between QCUT and QGAM, and the new projection operators are written to the file POTCAR in real and reciprocal space. This means that slightly different results might be obtained if the real space optimization has been done, even if the projections are evaluated in VAMP in reciprocal space (LREAL=.F.). If the unmodified reciprocal projection operator should be written to POTCAR, set QCUT to a negative value.
Finally, there is a default optimization built into fourpot3, which can be selected by QCUT=-1. In this case the pseudopotential is optimized for the 3/4 FFT meshes, QCUT is set according to the default cutoff ENMAX and RMAX is set to RCORE*1.3.
For further reading we refer to [47].
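The conversion between a cutoff energy in eV and QCUT in 1/a.u. is easy to get wrong, so here is a small Python sketch of the arithmetic described in this section. It assumes the relation E_cut = QCUT² × 13.6058 (E_cut in eV), which reproduces the example files in this guide (QCUT = 2.1 corresponds to about 60 eV). The function names and the `three_quarter_rule` switch are hypothetical helpers, not fourpot3 input.

```python
import math

RY_IN_EV = 13.6058  # conversion factor used throughout this section

def qcut_from_energy(e_cut_ev):
    """QCUT in 1/a.u. for a plane wave cutoff given in eV."""
    return math.sqrt(e_cut_ev / RY_IN_EV)

def real_space_settings(e_cut_ev, rcore, three_quarter_rule=True):
    """Suggested QCUT, QGAM and RMAX following the rules of thumb above."""
    qcut = qcut_from_energy(e_cut_ev)
    # QGAM = 2*QCUT for the 3/4 rule, 3*QCUT if wrap-around errors are avoided.
    qgam = (2.0 if three_quarter_rule else 3.0) * qcut
    # RMAX somewhat larger than RCORE; 1.25*RCORE is usually sufficient.
    rmax = 1.25 * rcore
    return {"QCUT": qcut, "QGAM": qgam, "RMAX": rmax}

settings = real_space_settings(e_cut_ev=200.0, rcore=2.7)
```

For a 200 eV cutoff this gives QCUT of roughly 3.83 1/a.u., so the QCUT = 4.000 used in the simple Pd example is slightly on the safe side.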
.0 J=
Scheme: RRKJ
additional minimization of kinetic energy
infinit interval
cutoffradius RCUT=2.12
coreradius RCORE=2.70 testradius RCHECK=2.61
outmost min RMIN=1.15 outmost max RMAX =2.56 turningpt RTURN =1.19
number of nodes = 0
2.step Energyerror:-.00000022
<T> [0,RCHECK]    = .21212519
<T(Q)> [0,RCHECK] = .21212588
NORM= .35843725
        10mRy  5mRy  2mRy  1mRY 0.5mRy 0.2mRy 0.1mRy
T(Q)     3.29  3.29  9.10  9.10  17.83  17.83  29.46
<T> [0,RMAX]      = .21677927
<T> [0,Infinity]  = .27147372
<T(Q)>            = .27147349
NORM= .99999696
        10mRy  5mRy  2mRy  1mRY 0.5mRy 0.2mRy 0.1mRy
T(Q)     5.40  6.32  7.30  8.35  14.21  16.68  18.25
Energy of next bound state
AE-frozen-potential : -.00041
PS:
The first line states the quantum numbers and the reference energy. The next lines give information about the pseudization scheme and about the AE wave function. Important are the lines following
<T> [0,Infinity] .27147372
These lines give the necessary energy cutoff to obtain a certain degree of convergence (for instance 14.2 Ry to converge the energy of a single s-dominated electron to 0.5 mRy). Do not take these values too seriously; they are calculated from the kinetic energy spectrum of the pseudo wave function and have to be verified with VAMP (see [18]).
The lines after
Energy of next bound state
show the energy of the next bound state assuming a frozen core. Following the lines
error of pseudopotential for different energies
the error of the pseudopotential at different energies around the reference energy is printed. For ultrasoft or factorized KB potentials these lines are not very important (and actually incorrect), so use them only to judge the accuracy of norm-conserving PP. Even in this case, plotting the logarithmic derivative is more convenient.
<w|V|w>        <w|V V|v>      Strength
-.10385E+01    .51783E+01     -4.986187
-.10736E+01    .50350E+01     -4.689804
 .20284E+00    .71058E-01       .350314
NQNL = 100   RMAX = 3.0000   DELQNL= .0950   file ( 3.80x 9.50)
These lines give some information on the factorization of the PP and on the strength of the non-local projection operators. The values given in the column Strength should not be too large (especially large positive values might result in ghost states).
Next the matrices Q_ij, B_ij and D_ij as defined in Vanderbilt's paper are written out (see [18, 12, 8]). The matrix Q_ij should be very similar to the values following
Depletion charge
shows the effect of unscreening the non-local part of the PP.
Section
Optimization of the real space projectors
gives very important information on the optimized real space projectors. First QCUT and QGAM are written out and converted to eV. Check these values once again. Next, the results for the optimization of each projector are written out.
 l   X(QCUT)   X(cont)   W(q)/X(q)   e(spline)
 2    41.262    41.208    .32E-03     .15E-05
 2    -6.651    -6.643    .49E-03     .18E-05
 1      .768      .769    .73E-03     .12E-05
X(QCUT) is the value of the projection operator at QCUT, X(cont) the new value after the optimization (should be equal to X(QCUT)), and X(QGAM) is the value of the optimized projection operator at QGAM (should be close to 0). W(q)/X(q) is the approximate error of the real space optimized projection operator. This value should be smaller than 10^-3, otherwise serious errors have to be expected.
Next, information about the FFT of the local potential, the unscreening charge density (i.e. the atomic charge density) and the partial core charge density is printed. Very important are the lines
estimated error in ...
...
Core Pot L = .00000 -.34032
NE + 1 data pairs follow. The first control line (Core Pot ...) contains the l-quantum number and the reference energy in Rydberg. The first value of each data pair following the control line supplies the energy, the second value the logarithmic derivative of the AE wave function. Information about the non-separable pseudopotential follows after the line
Potential L = .00000 -.34032
Information about the factorized Kleinman-Bylander or ultrasoft pseudopotential is printed after the lines
KBPotential L = 4.00000 .00000
A program for plotting the data points exists. This program is called
drawdde
and requires erlgraph. No support for erlgraph will be given by our institute, so do not ask for support. If you want to make plots, you can copy the program and change it to use your own plot routines.
   T      use ultrasoft PP ?
   2.0    use i.e 2/3 of the radial cutoff

 TYP   RCUT    TYP   RCUT
   7   2.800    23   2.900
   7   2.800    23   2.900
  15   2.900    23   2.900
  15   2.900    23   2.900
  15   2.900    23   2.900
  15   2.900    23   2.900
This PP is much better than, for example, a standard BHS pseudopotential, and the convergence speed is also reasonable. To improve efficiency it is possible to increase the radial cutoffs for the US part in our example up to 3.2 a.u., and that of the norm-conserving part to 3.0 a.u., without loss of accuracy.
Second, it is not always necessary to include two projectors per l-quantum number; for instance, there is no need to make the s and p part ultrasoft for the transition metals, and first row elements do not require an accurate description of the d electrons. Examples are given below.
Potassium pseudopotential
Reference configuration: s1 p0 d0
TITEL  = K : NC R=4.6
RPACOR = -1.000     partial core radius
RWIGS  =  4.800     wigner-seitz radius

ICORE  =    0       local potential
RMAX   =  5.500     core radius for proj-oper
RDEP   =  4.000     core radius for depl-charge
QCUT   =  2.100; QGAM =  4.200   optimization parameters

Description
  l     E     TYP  RCUT    TYP  RCUT
  0   .000      7  4.600
  1  -.100      7  4.600
  2   .150     15  3.000
Very simple PP; an accurate norm-conserving description for d was included, but is not really necessary for K. The local potential is the s PP. Cutoffs for other similar metals might be obtained by scaling the used cutoffs with the Wigner-Seitz radius. The partial core is important and changes the dimer length by 2%. The PP is optimized for a simulation of liquid K with a cutoff of 60 eV. Very accurate calculations would require 80 eV.
13.2 Vanadium pseudopotential
Reference configuration: s2 p0 d3
s1 p4 can be used as well and does not change the results.
TITEL  = V : US
LULTRA =    T       use ultrasoft PP ?
RPACOR =  1.400     partial core radius

ICORE  =    0       local potential
RWIGS  =  2.800     Wigner
DELQL  =   .020     grid for local potential
RMAX   =  3.200     core radius for proj-oper
QCUT   =  3.500; QGAM =  7.000   optimization parameters

Description
  l     E     TYP  RCUT    TYP  RCUT
  0   .000      7  2.200     7  2.200
  1  -.100      7  2.600     7  2.600
  2   .000      7  2.000    23  2.600
  2  -.300      7  2.000    23  2.600
The Wigner-Seitz radius is approximately 2.8 a.u.; cutoffs for other transition metals might be obtained by scaling the cutoffs by the covalent radii, which can be found in any periodic table. The s and p PP are norm-conserving, and the s PP is local. The d PP is ultrasoft with 2 reference energies. Partial core corrections are selected, and are important for the transition elements at the beginning of the row. The cutoff for the s PP was made as small as possible without creating a node in the s wave function (it is also possible to set ITYPE to 15 and set R_cut = 2.6 for the s part, but the differences are negligible). A node in the s PP must be avoided, because the s PP is the local potential (ICORE=0). The pseudopotential is real space optimized for a cutoff of 160 eV for a simulation of liquid V. Very accurate calculations would require approximately 200 eV.
13.3 Palladium pseudopotential
Reference configuration: s1 p0 d9
TITEL  = Pd : US
LULTRA =            use ultrasoft PP ?

ICORE  =    0       local potential
RWIGS  =  2.900     wigner-seitz radius
RMAX   =  3.000     core radius for proj-oper
QCUT   =  4.000; QGAM =  8.000   optimization parameters

Description
  l     E     TYP  RCUT    TYP  RCUT
  0   .000     15  2.100    15  2.100
  1  -.100      7  2.700     7  2.700
  2   .000      7  2.000    23  2.700
  2  -.600      7  2.000    23  2.700
The Wigner-Seitz radius is approximately 2.9 a.u., i.e. slightly larger than in the previous example; therefore the cutoffs were increased slightly. Partial core corrections are not necessary for palladium, because it is located at the end of the row. Once again the s cutoff was made as small as possible without getting a node. The pseudopotential is real space optimized for a cutoff of 200 eV for a simulation of H on a Pd surface.
13.4 Carbon pseudopotential
Reference configuration: s2 p2 d0
TITEL  = C:
LULTRA =            use ultrasoft PP ?
ICORE  =    2       local potential
RWIGS  =  1.640     wigner-seitz radius

Description
  l     E     TYP  RCUT    TYP  RCUT
  0   .000      7  1.300    23  1.900
  0  -.700      7  1.300    23  1.900
  1   .000      7  1.200    23  1.900
  1  -.700      7  1.200    23  1.900
  2  -.300      7  1.900    23  1.900
RWIGS is the Wigner-Seitz radius if empty spheres of the same size are included for diamond. This radius gives good projection operators for the partial local DOS. The NC d PP is the local potential; s and p are US. This is a very soft PP requiring only a 270 eV cutoff, and it works well for bulk phases and surfaces. The accuracy can be improved by making the cutoffs smaller. We have also used 1.4 instead of 1.9, but the results are only marginally affected.
13.5 Hydrogen pseudopotential
Reference configuration: s1 p0
TITEL  = H:
LULTRA =    T       use ultrasoft PP ?
RCORE  =  0.65      local potential
RWIGS  =  1.000     wigner-seitz radius

Description
  l     E     TYP  RCUT    TYP  RCUT
  0   .000      7  0.800    23  1.250
  0  -.700      7  0.800    23  1.250
  1  -.250      7  0.800    23  1.250
s and p are US; the local potential is the truncated AE potential. Only one projector is sufficient for the unimportant p PP.
Comment: Some people consider H pseudopotentials nonsense. Nevertheless, this PP gives an excellent description of the bond length for the H2 dimer and for H on C surfaces, and it requires only 200 eV.
at the beginning of all subroutines. All real and complex variables must be defined as REAL(q) and COMPLEX(q) (NEVER: REAL or COMPLEX). The use of IMPLICIT NONE is strongly recommended, but currently not used in all subroutines. If you do not use IMPLICIT NONE, you must use
IMPLICIT REAL(q) (A-H,O-Z)
to guarantee that all real variables have the correct type. The IMPLICIT statement must be the first statement after the USE statements (some compilers allow IMPLICIT statements somewhere else, but not all F90 compilers do so). For instance:
SUBROUTINE RHOATO(LFOUR,LPAR,GRIDC,T_INFO,B,P,CSTRF,CHTOT,CHDER)
USE prec
USE mgrid
USE pseudo
USE constant
IMPLICIT REAL(q) (A-B,D-H,O-Z)
TYPE (type_info) T_INFO
TYPE (potcar)    P(T_INFO%NTYP)
TYPE (grid_3d)   GRIDC
COMPLEX(q) CHTOT(GRIDC%RC%NP),CHDER(GRIDC%RC%NP)
COMPLEX(q) CSTRF(GRIDC%MPLWV,T_INFO%NTYP)
REAL(q)    B(3,3)
LOGICAL LFOUR,LPAR
Work arrays SHOULD be allocated on the fly with ALLOCATE and DEALLOCATE. DO NOT USE DYNAMIC F90 arrays (except for small performance-insensitive arrays). The dynamic arrays are allocated from the stack, and this can degrade performance by up to 20%. In addition, it might happen that one runs out of stack memory if large arrays are allocated from the stack; unpredictable crashes are possible (at least on IBM workstations). ALLOCATE and DEALLOCATE use the heap and not the stack and are therefore often safer.
All files must conform to the F90 free format. A small utility called convert can be found in the package to convert F77 style programs to F90 free format.
All subroutines should be placed in a MODULE, so that dummy parameters can be checked during compilation.
Input/Output (IO) should be done with extreme care, to allow later parallelisation. The following rules must be obeyed:
Six classes of information can be distinguished:
debugging messages
general results
notifications (important results)
warnings (strange behaviour, continuation possible)
errors (user error, file cannot be opened etc.)
internal errors (absolute chaos, internal inconsistency)
Debugging code and messages might remain within the subroutines, simply bracketed by
#ifdef debug
#endif
IO%IU6 is >=0. (in the parallel version most nodes will have IO%IU6 set to -1).
Notifications and warnings should be written to unit IO%IU0, and before writing it must be checked whether IO%IU0 is >=0 (in the parallel version most nodes will have IO%IU0 set to -1). Unit * must not be used for notifications and warnings.
If the program comes to a point where continuation is impossible (errors or internal errors), the program should STOP and write why continuation is impossible. If the program logic allows one to determine that all nodes will come to the same STOP, then preferably only one node should report to unit IO%IU0. If this is not possible, and whenever in doubt, all nodes should write an error status to the unit *.
Defensive programming should be used whenever possible (i.e. input parameters checked against each other). If a subroutine finds an internal inconsistency, errors might be reported to unit * (internal error).
15 FAQ
Question: I cannot compile the parallel version of VASP under LINUX.
Mind that VASP will generally not link correctly to MPI versions compiled with g77/f77, since g77/f77 appends two underscores to external symbols that already contain one underscore (i.e. MPI_SEND becomes mpi_send__). The Portland Group compiler, however, appends one underscore. Although the pgf90 compiler has an option to work around this problem, we have so far failed to link against MPI libraries generated for g77/f77. Hence you must compile MPI (mpich and/or lam) yourself. This is really easy and simple if the machine has been set up properly (have a look at our makefiles). If the compilation of mpich and/or lam fails, VASP will almost certainly not work in parallel on your machine, and we strongly urge you to reinstall LINUX.
Question: Why is the cohesive energy much larger than reported in other papers?
For metallic systems, k-point convergence is usually a critical issue. There are a few general hints which might be helpful:
For hexagonal cells, Gamma-centered k-point grids converge much faster than other grids. In fact, most meshes that do not include the Γ point break the symmetry of the hexagonal lattice! Even with increasing grid densities wrong results might be obtained.
Up to divisions of 8 (i.e. 8x8x1 for a surface), even Monkhorst-Pack grids, which do not contain the Gamma point, perform better than odd Monkhorst-Pack grids (this does not apply to hexagonal cells, see above). In other words, one obtains better converged results with even grids.
For adsorbates on surfaces, it is sometimes feasible to use only the k-points of the high-symmetry Brillouin zone, even if the adsorbate breaks the symmetry. These k-point grids can be generated by running VASP with a POSCAR from which all adatoms have been removed. The resulting IBZKPT file can be copied to KPOINTS. For convenience, the following k-point grids can be used for hexagonal cells:
Gamma centered 2x2
Automatically generated mesh
2
Reciprocal lattice
0.00000000000000 0.00000000000000 0.00000000000000 1
0.50000000000000 0.00000000000000 0.00000000000000 3
Gamma centered 3x3
Automatically generated mesh
3
Reciprocal lattice
0.00000000000000 0.00000000000000 0.00000000000000 1
0.33333333333333 0.00000000000000 0.00000000000000 6
0.33333333333333 0.33333333333333 0.00000000000000 2
Gamma centered 4x4
Automatically generated mesh
4
Reciprocal lattice
0.00000000000000 0.00000000000000 0.00000000000000 1
0.25000000000000 0.00000000000000 0.00000000000000 6
0.50000000000000 0.00000000000000 0.00000000000000 3
0.25000000000000 0.25000000000000 0.00000000000000 6
Gamma centered 5x5
Automatically generated mesh
5
Reciprocal lattice
0.00000000000000 0.00000000000000 0.00000000000000 1
0.20000000000000 0.00000000000000 0.00000000000000 6
0.40000000000000 0.00000000000000 0.00000000000000 6
0.20000000000000 0.20000000000000 0.00000000000000 6
0.40000000000000 0.20000000000000 0.00000000000000 6
Gamma centered 6x6
Automatically generated mesh
7
Reciprocal lattice
0.00000000000000 0.00000000000000 0.00000000000000 1
0.16666666666667 0.00000000000000 0.00000000000000 6
0.33333333333333 0.00000000000000 0.00000000000000 6
0.50000000000000 0.00000000000000 0.00000000000000 3
0.16666666666667 0.16666666666667 0.00000000000000 6
0.33333333333333 0.16666666666667 0.00000000000000 12
0.33333333333333 0.33333333333333 0.00000000000000 2
Monkhorst Pack: 4x4x1
3
Reciprocal lattice
0.12500000000000 0.12500000000000 0.00000000000000 4
0.37500000000000 0.12500000000000 0.00000000000000 8
0.37500000000000 0.37500000000000 0.00000000000000 4
Monkhorst Pack: 6x6x1
6
Reciprocal lattice
0.08333333333333 0.08333333333333 0.00000000000000 4
0.25000000000000 0.08333333333333 0.00000000000000 8
0.41666666666667 0.08333333333333 0.00000000000000 8
0.25000000000000 0.25000000000000 0.00000000000000 4
0.41666666666667 0.25000000000000 0.00000000000000 8
0.41666666666667 0.41666666666667 0.00000000000000 4
Monkhorst Pack: 8x8x1
10
Reciprocal lattice
0.06250000000000 0.06250000000000 0.00000000000000 4
0.18750000000000 0.06250000000000 0.00000000000000 8
0.31250000000000 0.06250000000000 0.00000000000000 8
0.43750000000000 0.06250000000000 0.00000000000000 8
0.18750000000000 0.18750000000000 0.00000000000000 4
0.31250000000000 0.18750000000000 0.00000000000000 8
0.43750000000000 0.18750000000000 0.00000000000000 8
0.31250000000000 0.31250000000000 0.00000000000000 4
0.43750000000000 0.31250000000000 0.00000000000000 8
0.43750000000000 0.43750000000000 0.00000000000000 4
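When k-point lists like these are copied by hand, a quick sanity check is to sum the weights: in the grids listed here the weights are plain multiplicities, so for an N1 x N2 x N3 mesh they should add up to N1*N2*N3. The Python sketch below performs that check; the helper name `weights_match_mesh` is hypothetical and the check is only a plausibility test, not something VASP requires.

```python
def weights_match_mesh(rows, divisions):
    """True if the k-point weights sum to the number of points of the full mesh.

    rows      -- list of (kx, ky, kz, weight) tuples
    divisions -- (n1, n2, n3) subdivisions of the mesh
    """
    total_weight = sum(row[3] for row in rows)
    n1, n2, n3 = divisions
    return total_weight == n1 * n2 * n3

# Monkhorst Pack 4x4x1 grid as listed in this section.
mp_4x4x1 = [
    (0.125, 0.125, 0.0, 4),
    (0.375, 0.125, 0.0, 8),
    (0.375, 0.375, 0.0, 4),
]
ok = weights_match_mesh(mp_4x4x1, (4, 4, 1))
```

A list with a missing or mistyped row fails the check, which catches the most common copy errors.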
In general, convergence depends on the eigenvalue spectrum of the Hessian matrix (second derivative of the energy with respect to positions). Roughly speaking, the number of steps equals
$$ N = \sqrt{\varepsilon_{\max}/\varepsilon_{\min}} $$
if a conjugate gradient or Quasi-Newton algorithm is chosen. If a good structural start guess exists, the best convergence can be obtained with IBRION=1 and NFREE (number of degrees of freedom) set to a reasonable value. If the initial start guess is bad, it is sometimes required to use the safer conjugate gradient algorithm.
A very important point concerns the required accuracy of the electronic degrees of freedom. If the eigenvalue spectrum of the Hessian matrix is narrow, EDIFF can be rather large (EDIFF=1E-3). However, if the eigenvalue spectrum is broad, EDIFF must be set to a smaller value (EDIFF=1E-5), since otherwise the slowly varying degrees of freedom cannot be accurately determined in the Hessian matrix. If no convergence is observed for IBRION=1, try to decrease EDIFF.
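As a rough illustration of the step estimate, assuming it is the square root of the ratio between the largest and smallest Hessian eigenvalue (the usual condition-number estimate for conjugate gradient methods), the following Python sketch evaluates it for a made-up spectrum. The eigenvalues are hypothetical numbers, not VASP output.

```python
import math

def estimated_steps(hessian_eigenvalues):
    """Rough number of ionic steps: sqrt(eps_max / eps_min)."""
    eps_max = max(hessian_eigenvalues)
    eps_min = min(hessian_eigenvalues)
    return math.sqrt(eps_max / eps_min)

# Hypothetical spectrum: stiff bond stretching modes next to one soft mode.
narrow = estimated_steps([4.0, 6.0, 9.0])
broad = estimated_steps([0.09, 4.0, 9.0])
```

The broad spectrum needs about ten steps against fewer than two for the narrow one, which is also the situation in which a tighter EDIFF is needed.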
Question: I see unphysi
al oszillations and negative values for the
hargedensity in the va
uum. Is VASP not able to
>0, when
onsidering the wavefun
tions in the va
uum. ISMEAR > 0
an
ause negative
o
upan
ies
lose to the Fermi-level, and sin
e states at the Fermi-level de
ay slowest in the va
uum, the
harge
density in the va
uum might be negativ (energies are not effe
ted by this, sin
e the wavefun
tions in the va
uum
do not
ontribute signi
antly to the energy).
The charge density of self-consistent calculations might have negative values in the vacuum, since the mixer is very insensitive to the charge density in the vacuum. It is better to set LPARD=.TRUE. and call VASP a second time. The generated CHGCAR file contains the charge density calculated directly from the wavefunctions.
In VASP, pseudo charge density components from unbalanced lattice vectors are set to zero: although the charge density is initially calculated in real space and therefore positive definite, it is then modified in reciprocal space and Fourier transformed back to real space. The final charge density has small oscillations in the vacuum. To avoid this problem, use FFT grids that avoid wrap-around errors (PREC=Accurate). The problem can also be reduced by increasing the energy cutoff.
Ultrasoft pseudopotentials require a second support grid. In VASP.4.4.4 and older versions, charge density components from unbalanced lattice vectors are also zeroed on the second support grid, causing additional small oscillations in the vacuum. This problem is removed in VASP.4.5 and in VASP.4.4.5. In VASP.4.4.5 the flag -DVASP45 must be specified in the CPP line of the makefile before compiling the VASP code. Total energies might, however, change by a fraction of a meV.
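The effect of zeroing a single reciprocal-space component on a positive density can be illustrated in one dimension. The sketch below is only a toy model: it assumes that "unbalanced" refers to the Nyquist component of an even-sized grid (the one coefficient without a -G partner), uses a naive discrete Fourier transform instead of an FFT, and all names and numbers are hypothetical.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (O(N^2), fine for a small demo)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * j / n)
                for j, v in enumerate(values)) for k in range(n)]


def inverse_dft(coeffs):
    """Naive inverse transform; returns the real part of each grid value."""
    n = len(coeffs)
    return [(sum(c * cmath.exp(2j * math.pi * k * j / n)
                 for k, c in enumerate(coeffs)) / n).real for j in range(n)]


def zero_unbalanced_component(density):
    """Zero the Nyquist coefficient (index N/2) and transform back."""
    n = len(density)
    if n % 2:
        raise ValueError("demo expects an even number of grid points")
    coeffs = dft(density)
    coeffs[n // 2] = 0.0
    return inverse_dft(coeffs)


# Narrow, strictly positive "atom" at grid point 4 of a periodic 16-point
# cell; the far side of the cell plays the role of the vacuum.
n_grid = 16
density = [math.exp(-2.0 * min(abs(j - 4), n_grid - abs(j - 4)) ** 2)
           for j in range(n_grid)]
filtered = zero_unbalanced_component(density)

print("minimum before:", min(density))
print("minimum after: ", min(filtered))
```

The total charge is unchanged because the G=0 coefficient is untouched, but the back-transformed density alternates in sign wherever the original density is close to zero. A finer grid or a smoother density makes the zeroed coefficient, and hence the oscillation, smaller.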
Question: I am running molecular dynamics and observe a large drift in the total energy, which should be conserved.
Three reasons can hamper the energy conservation in VASP. i) First, the electronic convergence might not be sufficiently tight. It is often necessary to decrease the tolerance to $10^{-6}$ or $10^{-7}$ to obtain excellent energy conservation. Alternatively, NELMIN can be set to values around 6.
ii) The second reason is an insufficiently accurate real space projection. This usually causes a slightly spiky and discontinuous total energy. If you observe such a behavior, you have to improve ROPT, or set LREAL=.FALSE.
iii) Finally, consider reducing the time step.
Figure 5 illustrates the behavior for a small liquid metallic system (Ti). Please note that reducing ROPT from -0.002 to -0.0005 (LREAL=.A.) had the same effect as using LREAL=.F.
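To judge whether a run suffers from drift or from a spiky energy, it helps to quantify both. The following sketch assumes that the conserved total energy has already been collected into a Python list, one value per MD step; how it is extracted from the output is left open, and the function names are hypothetical.

```python
def linear_drift(energies):
    """Least-squares slope of the total energy versus step index."""
    n = len(energies)
    if n < 2:
        raise ValueError("need at least two energies")
    mean_step = (n - 1) / 2.0
    mean_energy = sum(energies) / n
    num = sum((i - mean_step) * (e - mean_energy)
              for i, e in enumerate(energies))
    den = sum((i - mean_step) ** 2 for i in range(n))
    return num / den


def largest_jump(energies):
    """Largest step-to-step change; large values hint at a spiky energy."""
    return max(abs(b - a) for a, b in zip(energies, energies[1:]))


# Hypothetical trajectory: 200 steps with a steady drift of 1e-4 per step.
sample = [-58.20 + 1e-4 * step for step in range(200)]
print("drift per step:", linear_drift(sample))
print("largest jump:  ", largest_jump(sample))
```

A sizeable slope points towards reasons i) or iii), whereas isolated large jumps on an otherwise flat curve are more typical of reason ii).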
Question: I am running VASP on an SGI Origin, and the simple benchmark (benchmark.tar.gz) fails with
Answer: VASP extrapolates the wave functions between molecular dynamics time steps. To store the wave functions of the previous time steps, either a temporary scratch file (TMPCAR) is used (IWAVPR=1-9) or large work arrays are allocated (IWAVPR=11-19). On the SGI, the version that uses a temporary scratch file does not compile correctly, and hence the user has to set IWAVPR to 10.
Figure 5: Energy conservation for a liquid metallic system for various settings. Curves: default; EDIFF=1E-7; EDIFF=1E-7, t/2; EDIFF=1E-7, LREAL=.F., t/2; EDIFF=1E-7, LREAL=.F. (vertical axis ticks from -58.22 to -58.12, horizontal axis ticks from 50 to 200).
Question: The parallel performance of VASP is not as good as expected!
What do you mean by "performance was not as expected"? As a matter of fact, you can never obtain the same scaling on a P3/P4/Athlon XP based workstation cluster as on the T3D. The T3D was a very slow machine (by today's standards) equipped with an extraordinarily fast network (that is what made the price of the T3D). A Gigabit network has roughly the same overall performance as the T3D (Gigabit has longer latency, larger node-to-node bandwidth, but smaller total aggregated bandwidth), but the P4 CPU is about 10 times faster than one T3D node. Additionally, VASP was carefully hot-spot optimized on the T3D.
Altogether, VASP will run reasonably efficiently on up to 8-16 P4/Athlon XP type nodes (until k-point parallelization is implemented)!
Question: Why is the VASP performance so bad on a dual processor machine?
It is a bad idea to run VASP on dual processor P3/P4/Athlon machines, since two CPUs with small cache have to share the small memory bandwidth (P4 RD-RAM RIMM based machines are an exception). If you run two serial VASP jobs on such a machine, the performance already drops by 20 %. In a parallel run, the two CPUs additionally have to share one Gigabit card, which makes things even worse (the argument that these two CPUs can exchange data faster is irrelevant, since most of the data exchange is not between the two local CPUs).
Question: We are using the LINUX kernel X.X.X and LAM/MPICH X.X.X but VASP fails to run.
First, it must be emphasized that we do NOT SUPPORT VASP on parallel machines (in particular LINUX clusters). This is clearly spelled out in the manual. One reason for this policy is that LINUX systems are too heterogeneous to foresee all possible problems. Most problems are in fact not VASP related but related to very simple basic mistakes made by the system administrator, or to complicated inconsistencies between the LINUX kernel and the LAM/MPICH installation, or between the compilers and the installed MPICH/LAM version. Such problems cannot be solved by us!
But there is no reason to be put off quickly: things have certainly improved a lot in the last few years, and parallel computing is still an area where one kernel/LAM/MPICH upgrade can make a huge difference (for the better or, unfortunately, for the worse).
Some common failures occurring during the installation of MPICH/LAM should be highlighted:
The compilation of MPICH/LAM fails:
Certainly not a problem we can solve for you. Please contact the MPICH/LAM developers.
VASP fails to link properly:
Make sure that MPICH/LAM was compiled with the same compiler as used for VASP. Try to adhere strictly to the guidelines in our vasp.4.X makefiles.
In particular, it is not possible to link with g77/f77 compiled MPICH/LAM routines, since g77/f77 appends two underscores to MPI_XXXX calls, whereas ifc and pgf90 append only one. Also make sure that the f90 linker uses the proper libraries. This can usually be achieved by using mpif90 or mpif77 as linkers instead of f90. But one needs to make sure that the proper mpif77 front-end is called (try to include the option -v (verbose) when calling mpif77). This can be a particular problem on some LINUX installations (SUSE) that install an mpif90 and mpif77 command. Type "which mpif90" or "which mpif77" to determine which front-end you are using.
LAM requires a daemon to run. It is essential to use a VASP executable and LAM daemon compiled using the same LAM distribution! The problem is related to the one already discussed in the previous section.
The use of scaLAPACK is NOT encouraged, since it is a tricky and difficult task to compile scaLAPACK properly. Furthermore, makefiles for scaLAPACK are not distributed with either scaLAPACK, LAM/MPICH or vasp. One reason for this is that the makefiles depend to some extent on the LAM/MPICH version, on the location of the libraries, on the precise LINUX distribution etc. Additionally, on most clusters the performance gains due to scaLAPACK are very modest for VASP, since VASP relies mostly on its own iterative matrix diagonalisation routines. Therefore, you can safely compile VASP without scaLAPACK if the scaLAPACK support fails to work.
If you have done everything correctly, and VASP still fails to execute... well, then, you will need to stick to the serial version, or seek professional support from a company distributing or maintaining parallel LINUX clusters.
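The underscore mismatch described above under "VASP fails to link properly" can be spotted before linking by comparing the symbol names in the MPI library with those referenced by the VASP object files. The sketch below only parses text in the usual `nm` layout (address, type letter, name); the function names and the two sample listings are hypothetical, and real listings may use different case or prefixes depending on the platform.

```python
def trailing_underscores(symbol):
    """Count the underscores at the end of a symbol name."""
    return len(symbol) - len(symbol.rstrip("_"))


def mpi_symbol_suffixes(nm_text):
    """Map each mpi_* symbol in nm-style text to its trailing underscore count."""
    suffixes = {}
    for line in nm_text.splitlines():
        fields = line.split()
        if fields and fields[-1].lower().startswith("mpi_"):
            suffixes[fields[-1]] = trailing_underscores(fields[-1])
    return suffixes


def conventions_agree(library_nm_text, object_nm_text):
    """True if both listings use one and the same underscore convention."""
    lib = set(mpi_symbol_suffixes(library_nm_text).values())
    obj = set(mpi_symbol_suffixes(object_nm_text).values())
    return len(lib) == 1 and lib == obj


# Hypothetical listings: a library with two trailing underscores and an
# object file that references the same routines with only one.
library_listing = ("0000000000000000 T mpi_init__\n"
                   "0000000000000040 T mpi_comm_rank__\n")
object_listing = ("                 U mpi_init_\n"
                  "                 U mpi_comm_rank_\n")

print(mpi_symbol_suffixes(library_listing))
print(conventions_agree(library_listing, object_listing))
```

If the two listings disagree, the linker will report undefined MPI references, and the library has to be rebuilt with the compiler used for VASP.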
Question: I adsorb an ionic species, e.g. O, on an insulating surface. To select a specific charge state, I have increased the number of electrons by one compared to the neutral system. Now, I have no clue how to evaluate the total energy properly (i.e. are there convergence corrections?).
Actually, you MUST NOT set the number of electrons manually for a slab calculation. I.e., when you calculate the slab-O system, you are not allowed to select a specific charge state for the oxygen ion by increasing the number of electrons manually. Specific charge state calculations make sense only in 3D systems and for cluster calculations.
If you conduct the calculations properly, i.e. if your slab is large enough and the lateral dimension (x,y) of your surface is large enough, the energy should converge to the proper value, i.e. the O should acquire the correct charge state automatically.
Reason: If you set the number of electrons in the INCAR file for a slab calculation, you end up with a charged slab. The electrostatic energy of such a slab is, however, only conditionally convergent and, worse, in practice even infinite (BASIC, BASIC ELECTROSTATICS). Therefore, no method whatsoever exists to correct the error in the electrostatic energy. E.g. the energy diverges when the vacuum width is increased. You can try to validate this by simply increasing the vacuum width in VASP for a charged slab. You will find that the energy increases or decreases linearly with the vacuum width.
Well, there is maybe one method that can surmount the aforementioned problem. You can charge the slab and systematically increase the distance between the O- species (by increasing the lateral dimensions of your supercell) at a fixed vacuum width, and finally extrapolate the energies towards infinite lateral distances. The energy should converge towards the correct value as 1/d, where d is the distance between the adsorbed species. This might yield a converged value. The point is that, as I mentioned above, the electrostatic energy is only conditionally convergent for the case of a charged slab/system, and results depend on how you evaluate the limit towards infinity. However, to the best of my knowledge, this has not been done or attempted so far (and therefore we cannot assist you on that issue).
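The extrapolation suggested above can be sketched as a least-squares fit of the energies against 1/d. This is only an illustration of the fitting step under the assumption that the leading behavior really is E(d) = E_inf + a/d, as stated in the text; the function name and the sample numbers are hypothetical, and the text itself notes that the procedure is untested.

```python
def extrapolate_inverse_distance(distances, energies):
    """Fit E(d) = E_inf + a / d by least squares; return (E_inf, a)."""
    if len(distances) != len(energies) or len(distances) < 2:
        raise ValueError("need at least two (distance, energy) pairs")
    x = [1.0 / d for d in distances]
    n = len(x)
    mean_x = sum(x) / n
    mean_e = sum(energies) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxe = sum((xi - mean_x) * (ei - mean_e) for xi, ei in zip(x, energies))
    slope = sxe / sxx
    # The intercept at 1/d = 0 is the estimate for infinite separation.
    return mean_e - slope * mean_x, slope


# Hypothetical energies for four lateral supercell sizes at fixed vacuum width.
lateral_distances = [5.0, 10.0, 20.0, 40.0]
total_energies = [-100.0 + 2.0 / d for d in lateral_distances]
print(extrapolate_inverse_distance(lateral_distances, total_energies))
```

If the residuals of such a fit are large, the assumed 1/d form does not describe the data and the extrapolated value should not be trusted.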
References
[1] S. Nosé, J. Chem. Phys. 81, 511 (1984).
[2] S. Nosé, Prog. Theor. Phys. Suppl. 103, 1 (1991).
[3] D.M. Bylander and L. Kleinman, Phys. Rev. B 46, 13756 (1992).
[4] Y. Le Page and P. Saxe, Phys. Rev. B 65, 104104 (2002).
[5] X. Wu, D. Vanderbilt, and D.R. Hamann, Phys. Rev. B 72, 035105 (2005).
[6] J. Ihm, A. Zunger, and L. Cohen, J. Phys. C: 12, 4409 (1979).
[7] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, Rev. Mod. Phys. 64, 1045 (1992).
[8] D. Vanderbilt, Phys. Rev. B 41, 7892 (1990).
[9] K. Laasonen, A. Pasquarello, R. Car, C. Lee, and D. Vanderbilt, Phys. Rev. B 47, 10142 (1993).
[10] A. Pasquarello, K. Laasonen, R. Car, C. Lee, and D. Vanderbilt, Phys. Rev. Lett. 69, 1982 (1992).
[11] J. Furthmüller, Thesis, Universität Stuttgart (1991).
[12] G. Kresse, Thesis, Technische Universität Wien (1993).
[13] G. Kresse and J. Furthmüller, Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set, Comput. Mat. Sci. 6, 15-50 (1996).
[14] G. Kresse and J. Furthmüller, Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set, Phys. Rev. B 54, 11169 (1996).
[63] M. Gajdoš, K. Hummer, G. Kresse, J. Furthmüller, and F. Bechstedt, Phys. Rev. B in print.
[64] J. Paier, R. Hirschl, M. Marsman, and G. Kresse, J. Chem. Phys. 122, 234102 (2005).
[65] J. Heyd, G.E. Scuseria, and M. Ernzerhof, J. Chem. Phys. 118, 8207 (2003).
[66] J. Heyd and G.E. Scuseria, J. Chem. Phys. 121, 1187 (2004).
[67] J. Heyd, G.E. Scuseria, and M. Ernzerhof, J. Chem. Phys. 124, 219906 (2006).
[68] J. Paier, M. Marsman, K. Hummer, G. Kresse, I.C. Gerber, and J.G. Ángyán, J. Chem. Phys. 124, 154709 (2006).
[69] D.M. Bylander and L. Kleinman, Phys. Rev. B 41, 7868 (1990).
[70] S. Picozzi, A. Continenza, R. Asahi, W. Mannstadt, A.J. Freeman, W. Wolf, E. Wimmer, and C.B. Geller, Phys. Rev. B 61, 4677 (2000).
[71] A. Seidl, A. Görling, P. Vogl, J.A. Majewski, and M. Levy, Phys. Rev. B 53, 3764 (1996).
[72] X. Wu, D. Vanderbilt, and D.R. Hamann, Phys. Rev. B 72, 035105 (2005).