Mklman PDF

Intel® Math Kernel Library
Reference Manual
August 2008
Disclaimer and Legal Information
Document Number: 630813-029US

World Wide Web: http://developer.intel.com
Version Version Information Date
-001 Original Issue. 11/94
-002 Added functions crotg, zrotg. Documented versions of functions ?her2k, ?symm, ?syrk, and 5/95
?syr2k not previously described. Pagination revised.
-003 Changed the title; former title: “Intel BLAS Library for the Pentium® Processor Reference 1/96
Manual.”
Added functions ?rotm, ?rotmg and updated Appendix C.
-004 Documents Intel® Math Kernel library (Intel® MKL) release 2.0 with the parallelism capability. 11/96
Information on parallelism has been added in Chapter 1 and in section “BLAS Level 3 Routines”
in Chapter 2.
-005 Two-dimensional FFTs have been added. C interface has been added to both one- and 8/97
two-dimensional FFTs.
-006 Documents Intel Math Kernel Library release 2.1. Sparse BLAS section has been added in 1/98
Chapter 2.
-007 Documents Intel Math Kernel Library release 3.0. Descriptions of LAPACK routines (Chapters 1/99
4 and 5) and CBLAS interface (Appendix C) have been added. Quick Reference has been
excluded from the manual; MKL 3.0 Quick Reference is now available in HTML format.
-008 Documents Intel Math Kernel Library release 3.2. Description of FFT routines have been 6/99
revised. In Chapters 4 and 5 NAG names for LAPACK routines have been excluded.
-009 New LAPACK routines for eigenvalue problems have been added in chapter 5. 11/99
-010 Documents Intel Math Kernel Library release 4.0. Chapter 6 describing the VML functions 06/00
has been added.
-011 Documents Intel Math Kernel Library release 5.1. LAPACK section has been extended to 04/01
include the full list of computational and driver routines.
-6001 Documents Intel Math Kernel Library release 6.0 beta. New DFT interface and Vector Statistical 07/02
Library functions have been added.
-6002 Documents Intel Math Kernel Library 6.0 beta update. DFT functions description has been 12/02
updated. CBLAS interface description was extended.
-6003 Documents Intel Math Kernel Library 6.0 gold. DFT functions have been updated. Auxiliary 03/03
LAPACK routines’ descriptions were added to the manual.
-6004 Documents Intel Math Kernel Library release 6.1. 07/03
-6005 Documents Intel Math Kernel Library release 7.0 beta. Includes ScaLAPACK and sparse solver 11/03
descriptions.
-017 Documents Intel MKL and Intel® Cluster MKL release 7.0 gold. Auxiliary ScaLAPACK and 04/04
alternative sparse solver interface were added.
3
-018 Documents Intel MKL and Intel® Cluster MKL release 8.0 beta. Sparse BLAS and DFTI 03/05
sections were extended. New functionality was added: Sparse BLAS, Cluster DFTI,
iterative sparse solver, multiple-precision arithmetic, interval linear solver, and
convolution/correlation. Fortran95 interface to LAPACK functions was added.
-019 Documents Intel MKL and Intel® Cluster MKL release 8.0 gold. Fortran95 interface to 08/05
BLAS and Sparse BLAS functions has been added.
-020 Documents Intel MKL and Intel Cluster MKL release 8.0.2. PARDISO functionality 03/06
description has been extended with indefinite symmetric matrices pivoting.
-021 Documents Intel MKL and Intel Cluster MKL release 8.1 gold. Chapter 13 on Trigonometric 03/06
Transform functions has been added. Information on specific features of Fortran-95
implementation for LAPACK routines has been reflected in a new Appendix E and the
relevant subsection of Chapter 3.
-022 Documents Intel MKL and Intel Cluster MKL release 9.0 beta. Statistical Functions, 05/06
Fourier Transform Functions (Cluster DFT) descriptions have been updated. Description
of new VML functions, RCI Sparse Solvers, and Poisson solver have been added. Chapter
13 has been renamed to PDE Support. Code examples have been expanded.
-023 Documents Intel MKL and Intel Cluster MKL release 9.0 gold. Complex Interval Solvers 09/06
have been added. Description of the old deprecated FFT functions have been removed.
-024 Documents Intel MKL and Intel Cluster MKL release 9.1 beta. Optimization Solvers 01/07
Routines and Support Functions chapters have been added. Chapters on Sparse Solvers
and Partial Differential Equations Support have been extended. LAPACK chapters have
been partially updated to reflect LAPACK 3.1 version.
-025 Documents Intel MKL and Intel Cluster MKL release 9.1 gold. BLACS chapter has been 03/07
added. Chapters on BLAS and FFT have been extended. LAPACK chapters have been
additionally updated to reflect LAPACK 3.1 version. Description of BSR format has been
added to Appendix A. New FFT examples have been added to Appendix C.
-026 Documents Intel MKL and Intel Cluster MKL release 10.0 beta. BLACS, BLAS, Sparse 07/07
Solver, VML, and FFT chapters have been extended. Description of new Threading
Control functions has been added to Support Functions chapter.
-027 Documents Intel MKL release 10.0. BLAS, Sparse Solver, VML, and PDES chapters have 09/07
been updated.
-028 Documents Intel MKL release 10.1 beta. Chapter on PBLAS routines has been added. 04/08
BLAS, LAPACK, Sparse Solvers, FFT, VML, and Statistics Functions chapters have been
updated. Interval Linear Solvers chapter has been removed. Appendix G on FFTW to
Intel(R) MKL wrappers has been added. Information on header files have been added
to descriptions of all functions.
4
-029 Documents Intel MKL release 10.1. Sparse BLAS, Sparse Solvers, FFT chapters, and 08/08
code examples have been updated. Fortran 95 interface has been changed for relevant
BLAS and LAPACK functions.
5
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE,
EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED
BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS,
INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY,
RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO
FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT
OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED
FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE
PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must
not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel
reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities
arising from future changes to them. The information here is subject to change without notice. Do not finalize a
design with this information.
The products described in this document may contain design defects or errors known as errata which may cause
the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your
product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature,
may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each
processor family, not across different processor families. See http://www.intel.com/products/processor_number
for details.
BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Atom, Centrino Inside, Centrino logo, Core Inside,
FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, IntelDX2, IntelDX4, IntelSX2, Intel Atom, Intel
Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge,
Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale,
Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark,
The Journey Inside, Viiv Inside, vPro Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation
in the U.S. and other countries.
* Other names and brands may be claimed as the property of others.
Copyright© 1994-2008, Intel Corporation. All rights reserved.
Portions© Copyright 2001 Hewlett-Packard Development Company, L.P.
Chapters 4 and 5 include derivative work portions that have been copyrighted:
© 1991, 1992, and 1998 by The Numerical Algorithms Group, Ltd.
7
Contents
Version Information..................................................................... 3
Legal Information........................................................................ 7
Chapter 1: Overview
About This Software..........................................................................51
Technical Support.....................................................................52
BLAS Routines..........................................................................52
Sparse BLAS Routines...............................................................53
LAPACK Routines......................................................................53
ScaLAPACK Routines.................................................................54
PBLAS Routines........................................................................54
Sparse Solver Routines..............................................................54
VML Functions..........................................................................55
Statistical Functions..................................................................55
Fourier Transform Functions.......................................................55
Partial Differential Equations Support...........................................56
Optimization Solver Routines......................................................56
Support Functions.....................................................................56
BLACS Routines........................................................................57
GMP Arithmetic Functions..........................................................57
Performance Enhancements.......................................................57
Parallelism...............................................................................58
Platforms Supported..................................................................58
About This Manual.............................................................................58
9
Audience for This Manual.....................................................59

Manual Organization...........................................................59
Notational Conventions........................................................61
Chapter 2: BLAS and Sparse BLAS Routines

BLAS Routines............................................................................63
Routine Naming Conventions................................................63
Fortran 95 Interface Conventions..........................................65
Matrix Storage Schemes......................................................66
BLAS Level 1 Routines.........................................................67
?asum......................................................................68
?axpy.......................................................................69
?copy.......................................................................71
?dot.........................................................................73
?sdot........................................................................74
?dotc........................................................................76
?dotu........................................................................77
?nrm2......................................................................79
?rot..........................................................................80
?rotg........................................................................82
?rotm.......................................................................83
?rotmg.....................................................................86
?scal.........................................................................88
?swap.......................................................................90
i?amax.....................................................................91
i?amin......................................................................93
dcabs1......................................................................94
BLAS Level 2 Routines.........................................................95
?gbmv......................................................................96
?gemv....................................................................100
?ger........................................................................103
?gerc......................................................................105
10
Contents
?geru......................................................................107
?hbmv....................................................................109
?hemv....................................................................112
?her........................................................................115
?her2......................................................................117
?hpmv....................................................................120
?hpr.......................................................................122
?hpr2......................................................................124
?sbmv.....................................................................127
?spmv.....................................................................130
?spr........................................................................133
?spr2......................................................................135
?symv.....................................................................137
?syr........................................................................140
?syr2......................................................................142
?tbmv.....................................................................144
?tbsv......................................................................148
?tpmv.....................................................................151
?tpsv......................................................................154
?trmv......................................................................156
?trsv.......................................................................159
BLAS Level 3 Routines.......................................................161
?gemm...................................................................162
?hemm...................................................................166
?herk......................................................................169
?her2k....................................................................172
?symm....................................................................176
?syrk......................................................................179
?syr2k....................................................................183
?trmm....................................................................187
?trsm......................................................................190
Sparse BLAS Level 1 Routines.....................................................193
11
Vector Arguments.............................................................193
Naming Conventions.........................................................194
Routines and Data Types....................................................194
BLAS Level 1 Routines That Can Work With Sparse Vectors.....195
?axpyi.............................................................................195
?doti...............................................................................197
?dotci..............................................................................199
?dotui.............................................................................200
?gthr...............................................................................202
?gthrz.............................................................................203
?roti................................................................................205
?sctr...............................................................................206
Sparse BLAS Level 2 and Level 3 Routines....................................208
Naming Conventions in Sparse BLAS Level 2 and Level 3.......208
Sparse Matrix Storage Formats...........................................209
Routines and Supported Operations.....................................210
Interface Consideration.....................................................212
Sparse BLAS Level 2 and Level 3 Routines............................219
mkl_?csrgemv..........................................................224
mkl_?bsrgemv.........................................................227
mkl_?coogemv.........................................................231
mkl_?diagemv.........................................................235
mkl_?csrsymv..........................................................239
mkl_?bsrsymv..........................................................243
mkl_?coosymv.........................................................247
mkl_?diasymv..........................................................251
mkl_?csrtrsv............................................................255
mkl_?bsrtrsv............................................................259
mkl_?cootrsv...........................................................263
mkl_?diatrsv............................................................267
mkl_cspblas_?csrgemv..............................................271
mkl_cspblas_?bsrgemv.............................................275
12
Contents
mkl_cspblas_?coogemv.............................................279
mkl_cspblas_?csrsymv..............................................283
mkl_cspblas_?bsrsymv..............................................287
mkl_cspblas_?coosymv.............................................291
mkl_cspblas_?csrtrsv................................................295
mkl_cspblas_?bsrtrsv................................................299
mkl_cspblas_?cootrsv...............................................303
mkl_?csrmv.............................................................308
mkl_?bsrmv.............................................................314
mkl_?cscmv.............................................................320
mkl_?coomv............................................................326
mkl_?csrsv..............................................................330
mkl_?bsrsv..............................................................336
mkl_?cscsv..............................................................342
mkl_?coosv.............................................................348
mkl_?csrmm............................................................353
mkl_?bsrmm............................................................359
mkl_?cscmm............................................................365
mkl_?coomm...........................................................371
mkl_?csrsm.............................................................377
mkl_?cscsm.............................................................383
mkl_?coosm............................................................389
mkl_?bsrsm.............................................................394
mkl_?diamv.............................................................400
mkl_?skymv............................................................405
mkl_?diasv..............................................................409
mkl_?skysv.............................................................413
mkl_?diamm............................................................418
mkl_?skymm...........................................................424
mkl_?diasm.............................................................430
mkl_?skysm............................................................435
mkl_ddnscsr............................................................439
13
mkl_dcsrcoo............................................................442
mkl_dcsrbsr.............................................................445
mkl_?csradd............................................................449
mkl_?csrmultcsr.......................................................455
mkl_?csrmultd.........................................................461
BLAS-like Extensions.................................................................466
mkl_?imatcopy.................................................................467
mkl_?omatcopy................................................................471
mkl_?omatcopy2..............................................................476
mkl_?omatadd.................................................................480
?gemm3m.......................................................................485
Chapter 3: LAPACK Routines: Linear Equations

Routine Naming Conventions......................................................492
Fortran 95 Interface Conventions................................................493
Intel® MKL Fortran 95 Interfaces for LAPACK Routines vs. Netlib
Implementation............................................................495
Matrix Storage Schemes............................................................496
Mathematical Notation...............................................................497
Error Analysis...........................................................................497
Computational Routines.............................................................498
Routines for Matrix Factorization.........................................501
?getrf.....................................................................501
?gbtrf.....................................................................504
?gttrf......................................................................507
?potrf.....................................................................509
?pptrf.....................................................................511
?pbtrf.....................................................................514
?pttrf......................................................................516
?sytrf......................................................................518
?hetrf.....................................................................521
?sptrf......................................................................525
14
Contents
?hptrf.....................................................................528
Routines for Solving Systems of Linear Equations..................530
?getrs.....................................................................531
?gbtrs.....................................................................534
?gttrs......................................................................537
?potrs.....................................................................540
?pptrs.....................................................................542
?pbtrs.....................................................................545
?pttrs......................................................................548
?sytrs.....................................................................550
?hetrs.....................................................................553
?sptrs.....................................................................555
?hptrs.....................................................................558
?trtrs......................................................................561
?tptrs......................................................................564
?tbtrs......................................................................567
Routines for Estimating the Condition Number......................569
?gecon....................................................................570
?gbcon....................................................................572
?gtcon....................................................................575
?pocon....................................................................578
?ppcon....................................................................580
?pbcon....................................................................582
?ptcon....................................................................585
?sycon....................................................................587
?hecon....................................................................589
?spcon....................................................................591
?hpcon....................................................................594
?trcon.....................................................................596
?tpcon....................................................................598
?tbcon....................................................................601
Refining the Solution and Estimating Its Error.......................603
15
?gerfs.....................................................................604
?gbrfs.....................................................................607
?gtrfs......................................................................610
?porfs.....................................................................613
?pprfs.....................................................................616
?pbrfs.....................................................................619
?ptrfs......................................................................622
?syrfs.....................................................................625
?herfs.....................................................................628
?sprfs.....................................................................631
?hprfs.....................................................................634
?trrfs......................................................................637
?tprfs......................................................................640
?tbrfs......................................................................643
Routines for Matrix Inversion..............................................646
?getri......................................................................646
?potri......................................................................649
?pptri......................................................................651
?sytri......................................................................653
?hetri......................................................................655
?sptri......................................................................657
?hptri......................................................................659
?trtri.......................................................................661
?tptri......................................................................663
Routines for Matrix Equilibration.........................................665
?geequ....................................................................665
?gbequ....................................................................667
?poequ....................................................................670
?ppequ....................................................................672
?pbequ....................................................................675
Driver Routines.........................................................................677
?gesv..............................................................................679
16
Contents
?gesvx............................................................................683
?gbsv..............................................................................690
?gbsvx............................................................................692
?gtsv..............................................................................699
?gtsvx.............................................................................701
?posv..............................................................................706
?posvx............................................................................708
?ppsv..............................................................................714
?ppsvx............................................................................716
?pbsv..............................................................................722
?pbsvx............................................................................724
?ptsv..............................................................................730
?ptsvx.............................................................................732
?sysv..............................................................................736
?sysvx.............................................................................739
?hesv..............................................................................744
?hesvx............................................................................747
?spsv..............................................................................752
?spsvx............................................................................755
?hpsv..............................................................................759
?hpsvx............................................................................762
Chapter 4: LAPACK Routines: Least Squares and Eigenvalue

Problems
Routine Naming Conventions......................................................768
Matrix Storage Schemes............................................................770
Mathematical Notation...............................................................770
Computational Routines.............................................................771
Orthogonal Factorizations..................................................771
?geqrf.....................................................................773
?geqpf....................................................................777
?geqp3....................................................................780
17
?orgqr.....................................................................783
?ormqr....................................................................786
?ungqr....................................................................789
?unmqr...................................................................792
?gelqf.....................................................................795
?orglq.....................................................................799
?ormlq....................................................................802
?unglq....................................................................805
?unmlq...................................................................807
?geqlf.....................................................................810
?orgql.....................................................................813
?ungql....................................................................816
?ormql....................................................................818
?unmql...................................................................821
?gerqf.....................................................................824
?orgrq.....................................................................827
?ungrq....................................................................830
?ormrq....................................................................832
?unmrq...................................................................835
?tzrzf......................................................................838
?ormrz....................................................................841
?unmrz...................................................................844
?ggqrf.....................................................................847
?ggrqf.....................................................................851
Singular Value Decomposition.............................................855
?gebrd....................................................................858
?gbbrd....................................................................862
?orgbr.....................................................................865
?ormbr....................................................................869
?ungbr....................................................................873
?unmbr...................................................................876
?bdsqr....................................................................880
18
Contents
?bdsdc....................................................................885
Symmetric Eigenvalue Problems.........................................888
?sytrd.....................................................................893
?syrdb....................................................................896
?herdb....................................................................899
?orgtr.....................................................................902
?ormtr....................................................................904
?hetrd.....................................................................907
?ungtr.....................................................................911
?unmtr....................................................................913
?sptrd.....................................................................916
?opgtr.....................................................................918
?opmtr....................................................................920
?hptrd.....................................................................922
?upgtr.....................................................................924
?upmtr....................................................................926
?sbtrd.....................................................................929
?hbtrd.....................................................................931
?sterf......................................................................934
?steqr.....................................................................936
?stemr....................................................................940
?stedc.....................................................................945
?stegr.....................................................................950
?pteqr.....................................................................956
?stebz.....................................................................960
?stein.....................................................................964
?disna.....................................................................967
Generalized Symmetric-Definite Eigenvalue Problems............969
?sygst.....................................................................970
?hegst....................................................................972
?spgst.....................................................................975
?hpgst....................................................................977
19
?sbgst.....................................................................979
?hbgst....................................................................982
?pbstf.....................................................................985
Nonsymmetric Eigenvalue Problems....................................987
?gehrd....................................................................992
?orghr.....................................................................994
?ormhr....................................................................997
?unghr..................................................................1001
?unmhr.................................................................1004
?gebal...................................................................1008
?gebak..................................................................1011
?hseqr...................................................................1014
?hsein...................................................................1020
?trevc...................................................................1026
?trsna...................................................................1031
?trexc...................................................................1037
?trsen...................................................................1040
?trsyl....................................................................1046
Generalized Nonsymmetric Eigenvalue Problems.................1048
?gghrd..................................................................1050
?ggbal...................................................................1054
?ggbak..................................................................1057
?hgeqz..................................................................1059
?tgevc...................................................................1067
?tgexc...................................................................1072
?tgsen...................................................................1076
?tgsyl....................................................................1083
?tgsna...................................................................1089
Generalized Singular Value Decomposition..........................1094
?ggsvp..................................................................1095
?tgsja...................................................................1099
Driver Routines.......................................................................1107
20
Contents
Linear Least Squares (LLS) Problems.................................1107

?gels.....................................................................1108
?gelsy...................................................................1112
?gelss...................................................................1117
?gelsd...................................................................1121
Generalized LLS Problems................................................1126
?gglse...................................................................1126
?ggglm..................................................................1130
Symmetric Eigenproblems................................................1133
?syev....................................................................1134
?heev....................................................................1138
?syevd..................................................................1141
?heevd..................................................................1145
?syevx..................................................................1149
?heevx..................................................................1154
?syevr...................................................................1159
?heevr..................................................................1165
?spev....................................................................1172
?hpev....................................................................1174
?spevd..................................................................1177
?hpevd..................................................................1181
?spevx..................................................................1185
?hpevx..................................................................1190
?sbev....................................................................1194
?hbev....................................................................1197
?sbevd..................................................................1199
?hbevd..................................................................1204
?sbevx..................................................................1208
?hbevx..................................................................1213
?stev.....................................................................1218
?stevd...................................................................1220
?stevx...................................................................1224
21
?stevr...................................................................1228
Nonsymmetric Eigenproblems...........................................1234
?gees....................................................................1235
?geesx..................................................................1240
?geev....................................................................1247
?geevx..................................................................1252
Singular Value Decomposition...........................................1259
?gesvd..................................................................1259
?gesdd..................................................................1265
?ggsvd..................................................................1270
Generalized Symmetric Definite Eigenproblems...................1276
?sygv....................................................................1277
?hegv....................................................................1281
?sygvd..................................................................1284
?hegvd..................................................................1288
?sygvx..................................................................1293
?hegvx..................................................................1299
?spgv....................................................................1305
?hpgv....................................................................1308
?spgvd..................................................................1311
?hpgvd..................................................................1315
?spgvx..................................................................1320
?hpgvx..................................................................1325
?sbgv....................................................................1330
?hbgv....................................................................1333
?sbgvd..................................................................1336
?hbgvd..................................................................1340
?sbgvx..................................................................1345
?hbgvx..................................................................1350
Generalized Nonsymmetric Eigenproblems..........................1355
?gges....................................................................1356
?ggesx..................................................................1363
22
Contents
?ggev....................................................................1372
?ggevx..................................................................1378
Chapter 5: LAPACK Auxiliary and Utility Routines

Auxiliary Routines....................................................................1387
?lacgv...........................................................................1402
?lacrm...........................................................................1403
?lacrt............................................................................1404
?laesy...........................................................................1405
?rot...............................................................................1407
?spmv...........................................................................1408
?spr..............................................................................1410
?symv...........................................................................1412
?syr..............................................................................1414
i?max1..........................................................................1416
?sum1...........................................................................1417
?gbtf2...........................................................................1418
?gebd2..........................................................................1419
?gehd2..........................................................................1422
?gelq2...........................................................................1424
?geql2...........................................................................1426
?geqr2...........................................................................1428
?gerq2...........................................................................1429
?gesc2...........................................................................1431
?getc2...........................................................................1433
?getf2...........................................................................1434
?gtts2............................................................................1436
?isnan...........................................................................1437
?laisnan.........................................................................1438
?labrd............................................................................1439
?lacn2...........................................................................1442
?lacon...........................................................................1444
23
?lacpy...........................................................................1445
?ladiv............................................................................1447
?lae2.............................................................................1448
?laebz...........................................................................1449
?laed0...........................................................................1454
?laed1...........................................................................1457
?laed2...........................................................................1459
?laed3...........................................................................1462
?laed4...........................................................................1465
?laed5...........................................................................1466
?laed6...........................................................................1467
?laed7...........................................................................1469
?laed8...........................................................................1473
?laed9...........................................................................1477
?laeda...........................................................................1479
?laein............................................................................1481
?laev2...........................................................................1484
?laexc...........................................................................1486
?lag2.............................................................................1488
?lags2...........................................................................1490
?lagtf............................................................................1492
?lagtm...........................................................................1494
?lagts............................................................................1496
?lagv2...........................................................................1498
?lahqr............................................................................1501
?lahrd............................................................................1504
?lahr2............................................................................1507
?laic1............................................................................1510
?laln2............................................................................1513
?lals0............................................................................1516
?lalsa............................................................................1520
?lalsd............................................................................1524
24
Contents
?lamrg...........................................................................1527
?laneg...........................................................................1528
?langb...........................................................................1529
?lange...........................................................................1531
?langt............................................................................1532
?lanhs...........................................................................1534
?lansb...........................................................................1535
?lanhb...........................................................................1537
?lansp...........................................................................1539
?lanhp...........................................................................1541
?lanst/?lanht..................................................................1542
?lansy...........................................................................1544
?lanhe...........................................................................1545
?lantb............................................................................1547
?lantp............................................................................1549
?lantr............................................................................1551
?lanv2...........................................................................1553
?lapll.............................................................................1554
?lapmt...........................................................................1556
?lapy2...........................................................................1557
?lapy3...........................................................................1558
?laqgb...........................................................................1558
?laqge...........................................................................1561
?laqhb...........................................................................1563
?laqp2...........................................................................1565
?laqps...........................................................................1567
?laqr0............................................................................1569
?laqr1............................................................................1573
?laqr2............................................................................1575
?laqr3............................................................................1579
?laqr4............................................................................1583
?laqr5............................................................................1587
25
?laqsb...........................................................................1591
?laqsp...........................................................................1593
?laqsy...........................................................................1595
?laqtr............................................................................1597
?lar1v............................................................................1600
?lar2v............................................................................1603
?larf..............................................................................1605
?larfb............................................................................1607
?larfg............................................................................1609
?larft.............................................................................1611
?larfx............................................................................1614
?largv............................................................................1616
?larnv............................................................................1618
?larra............................................................................1619
?larrb............................................................................1621
?larrc............................................................................1623
?larrd............................................................................1625
?larre............................................................................1629
?larrf.............................................................................1633
?larrj.............................................................................1636
?larrk............................................................................1638
?larrr.............................................................................1640
?larrv............................................................................1641
?lartg............................................................................1646
?lartv............................................................................1648
?laruv............................................................................1649
?larz..............................................................................1650
?larzb............................................................................1652
?larzt............................................................................1655
?las2.............................................................................1658
?lascl.............................................................................1659
?lasd0...........................................................................1661
26
Contents
?lasd1...........................................................................1663
?lasd2...........................................................................1666
?lasd3...........................................................................1670
?lasd4...........................................................................1674
?lasd5...........................................................................1676
?lasd6...........................................................................1677
?lasd7...........................................................................1682
?lasd8...........................................................................1686
?lasd9...........................................................................1689
?lasda...........................................................................1691
?lasdq...........................................................................1695
?lasdt............................................................................1698
?laset............................................................................1699
?lasq1...........................................................................1700
?lasq2...........................................................................1702
?lasq3...........................................................................1703
?lasq4...........................................................................1705
?lasq5...........................................................................1706
?lasq6...........................................................................1708
?lasr..............................................................................1709
?lasrt............................................................................1713
?lassq............................................................................1714
?lasv2...........................................................................1716
?laswp...........................................................................1717
?lasy2...........................................................................1719
?lasyf............................................................................1721
?lahef............................................................................1724
?latbs............................................................................1726
?latdf............................................................................1729
?latps............................................................................1731
?latrd............................................................................1734
?latrs............................................................................1737
27
?latrz............................................................................1742
?lauu2...........................................................................1744
?lauum..........................................................................1746
?lazq3...........................................................................1747
?lazq4...........................................................................1750
?org2l/?ung2l.................................................................1752
?org2r/?ung2r................................................................1754
?orgl2/?ungl2.................................................................1756
?orgr2/?ungr2................................................................1757
?orm2l/?unm2l...............................................................1759
?orm2r/?unm2r..............................................................1762
?orml2/?unml2...............................................................1764
?ormr2/?unmr2..............................................................1767
?ormr3/?unmr3..............................................................1769
?pbtf2...........................................................................1772
?potf2...........................................................................1774
?ptts2............................................................................1775
?rscl..............................................................................1777
?sygs2/?hegs2................................................................1778
?sytd2/?hetd2................................................................1780
?sytf2............................................................................1783
?hetf2...........................................................................1785
?tgex2...........................................................................1787
?tgsy2...........................................................................1790
?trti2.............................................................................1794
clag2z...........................................................................1795
dlag2s...........................................................................1796
slag2d...........................................................................1798
zlag2c...........................................................................1799
Utility Functions and Routines...................................................1800
ilaver.............................................................................1801
ilaenv............................................................................1802
28
Contents
iparmq..........................................................................1805
ieeeck...........................................................................1807
lsamen..........................................................................1808
?labad...........................................................................1809
?lamch..........................................................................1810
?lamc1..........................................................................1811
?lamc2..........................................................................1812
?lamc3..........................................................................1813
?lamc4..........................................................................1814
?lamc5..........................................................................1814
second/dsecnd................................................................1815
Chapter 6: ScaLAPACK Routines

Overview...............................................................................1817
Routine Naming Conventions....................................................1818
Computational Routines...........................................................1820
Linear Equations.............................................................1820
Routines for Matrix Factorization.......................................1821
p?getrf..................................................................1822
p?gbtrf..................................................................1824
p?dbtrf..................................................................1827
p?potrf..................................................................1829
p?pbtrf..................................................................1831
p?pttrf...................................................................1834
p?dttrf...................................................................1836
Routines for Solving Systems of Linear Equations................1839
p?getrs..................................................................1839
p?gbtrs.................................................................1841
p?potrs..................................................................1844
p?pbtrs.................................................................1846
p?pttrs..................................................................1849
p?dttrs..................................................................1852
29
p?dbtrs.................................................................1855
p?trtrs...................................................................1858
Routines for Estimating the Condition Number....................1860
p?gecon................................................................1861
p?pocon................................................................1864
p?trcon.................................................................1867
Refining the Solution and Estimating Its Error.....................1870
p?gerfs..................................................................1871
p?porfs..................................................................1875
p?trrfs...................................................................1879
Routines for Matrix Inversion............................................1883
p?getri..................................................................1883
p?potri..................................................................1886
p?trtri...................................................................1888
Routines for Matrix Equilibration........................................1889
p?geequ................................................................1890
p?poequ................................................................1892
Orthogonal Factorizations.................................................1895
p?geqrf.................................................................1896
p?geqpf.................................................................1898
p?orgqr.................................................................1901
p?ungqr.................................................................1904
p?ormqr................................................................1906
p?unmqr................................................................1910
p?gelqf..................................................................1913
p?orglq..................................................................1916
p?unglq.................................................................1918
p?ormlq.................................................................1920
p?unmlq................................................................1924
p?geqlf..................................................................1927
p?orgql..................................................................1930
p?ungql.................................................................1932
30
Contents
p?ormql.................................................................1935
p?unmql................................................................1938
p?gerqf.................................................................1942
p?orgrq.................................................................1945
p?ungrq.................................................................1947
p?ormrq................................................................1949
p?unmrq................................................................1953
p?tzrzf..................................................................1956
p?ormrz................................................................1959
p?unmrz................................................................1963
p?ggqrf.................................................................1967
p?ggrqf.................................................................1972
Symmetric Eigenproblems................................................1977
p?sytrd..................................................................1978
p?ormtr.................................................................1982
p?hetrd.................................................................1986
p?unmtr................................................................1990
p?stebz.................................................................1994
p?stein..................................................................1998
Nonsymmetric Eigenvalue Problems...................................2003
p?gehrd.................................................................2004
p?ormhr................................................................2008
p?unmhr................................................................2011
p?lahqr..................................................................2015
Singular Value Decomposition...........................................2018
p?gebrd.................................................................2018
p?ormbr................................................................2023
p?unmbr................................................................2028
Generalized Symmetric-Definite Eigen Problems..................2033
p?sygst.................................................................2034
p?hegst.................................................................2036
Driver Routines.......................................................................2039
31
p?gesv..........................................................................2039
p?gesvx.........................................................................2042
p?gbsv..........................................................................2049
p?dbsv..........................................................................2052
p?dtsv...........................................................................2055
p?posv..........................................................................2058
p?posvx.........................................................................2060
p?pbsv..........................................................................2067
p?ptsv...........................................................................2070
p?gels...........................................................................2073
p?syev...........................................................................2078
p?syevx.........................................................................2081
p?heevx.........................................................................2089
p?gesvd.........................................................................2098
p?sygvx.........................................................................2103
p?hegvx.........................................................................2113
Chapter 7: ScaLAPACK Auxiliary and Utility Routines

Auxiliary Routines....................................................................2123
p?lacgv..........................................................................2130
p?max1.........................................................................2131
?combamax1..................................................................2132
p?sum1.........................................................................2133
p?dbtrsv........................................................................2134
p?dttrsv.........................................................................2138
p?gebd2........................................................................2142
p?gehd2........................................................................2146
p?gelq2.........................................................................2149
p?geql2.........................................................................2152
p?geqr2.........................................................................2155
p?gerq2.........................................................................2158
p?getf2..........................................................................2161
32
Contents
p?labrd..........................................................................2163
p?lacon..........................................................................2168
p?laconsb......................................................................2170
p?lacp2..........................................................................2171
p?lacp3..........................................................................2173
p?lacpy..........................................................................2175
p?laevswp......................................................................2177
p?lahrd..........................................................................2179
p?laiect..........................................................................2182
p?lange.........................................................................2184
p?lanhs..........................................................................2186
p?lansy, p?lanhe.............................................................2188
p?lantr..........................................................................2191
p?lapiv..........................................................................2193
p?laqge.........................................................................2197
p?laqsy..........................................................................2199
p?lared1d......................................................................2202
p?lared2d......................................................................2204
p?larf............................................................................2205
p?larfb...........................................................................2209
p?larfc...........................................................................2213
p?larfg...........................................................................2216
p?larft...........................................................................2218
p?larz............................................................................2222
p?larzb..........................................................................2226
p?larzc..........................................................................2230
p?larzt...........................................................................2234
p?lascl...........................................................................2238
p?laset..........................................................................2240
p?lasmsub.....................................................................2242
p?lassq..........................................................................2243
p?laswp.........................................................................2246
33
p?latra...........................................................................2248
p?latrd..........................................................................2249
p?latrs...........................................................................2253
p?latrz...........................................................................2256
p?lauu2.........................................................................2259
p?lauum........................................................................2261
p?lawil...........................................................................2262
p?org2l/p?ung2l..............................................................2263
p?org2r/p?ung2r.............................................................2266
p?orgl2/p?ungl2..............................................................2269
p?orgr2/p?ungr2.............................................................2272
p?orm2l/p?unm2l............................................................2275
p?orm2r/p?unm2r...........................................................2279
p?orml2/p?unml2............................................................2284
p?ormr2/p?unmr2...........................................................2288
p?pbtrsv........................................................................2293
p?pttrsv.........................................................................2298
p?potf2..........................................................................2302
p?rscl............................................................................2304
p?sygs2/p?hegs2............................................................2305
p?sytd2/p?hetd2.............................................................2308
p?trti2...........................................................................2312
?lamsh..........................................................................2314
?laref............................................................................2316
?lasorte.........................................................................2318
?lasrt2...........................................................................2319
?stein2..........................................................................2321
?dbtf2...........................................................................2323
?dbtrf............................................................................2325
?dttrf.............................................................................2327
?dttrsv..........................................................................2329
?pttrsv..........................................................................2330
34
Contents
?steqr2..........................................................................2332
Utility Functions and Routines...................................................2334
p?labad.........................................................................2335
p?lachkieee....................................................................2336
p?lamch.........................................................................2337
p?lasnbt........................................................................2338
pxerbla..........................................................................2339
Chapter 8: Sparse Solver Routines

PARDISO* - Parallel Direct Sparse Solver Interface......................2341
pardiso..........................................................................2342
Direct Sparse Solver (DSS) Interface Routines............................2362
DSS Interface Description................................................2365
dss_create.....................................................................2366
dss_define_structure.......................................................2368
dss_reorder....................................................................2369
dss_factor_real, dss_factor_complex.................................2370
dss_solve_real, dss_solve_complex...................................2372
dss_delete.....................................................................2373
dss_statistics..................................................................2374
mkl_cvt_to_null_terminated_str........................................2378
Implementation Details....................................................2378
Iterative Sparse Solvers based on Reverse Communication Interface
(RCI ISS)...........................................................................2380
CG Interface Description..................................................2385
FGMRES Interface Description...........................................2391
dcg_init.........................................................................2400
dcg_check......................................................................2401
dcg...............................................................................2403
dcg_get.........................................................................2405
dcgmrhs_init..................................................................2406
dcgmrhs_check...............................................................2407
35
dcgmrhs........................................................................2409
dcgmrhs_get..................................................................2411
dfgmres_init...................................................................2412
dfgmres_check...............................................................2414
dfgmres.........................................................................2415
dfgmres_get...................................................................2418
Preconditioners based on Incomplete LU Factorization Technique...2420
ILU0 and ILUT Preconditioners Interface Description............2423
dcsrilu0.........................................................................2425
dcsrilut..........................................................................2429
Calling Sparse Solver and Preconditioner Routines from C/C++.....2434
Chapter 9: Vector Mathematical Functions

Data Types and Accuracy Modes................................................2437
Function Naming Conventions...................................................2438
Functions Interface.........................................................2439
VML Mathematical Functions.....................................2439
Pack Functions.......................................................2439
Unpack Functions...................................................2440
Service Functions....................................................2440
Input Parameters....................................................2441
Output Parameters..................................................2441
Vector Indexing Methods..........................................................2442
Error Diagnostics.....................................................................2442
VML Mathematical Functions.....................................................2443
Arithmetic Functions........................................................2445
v?Add...................................................................2445
v?Sub...................................................................2447
v?Sqr....................................................................2449
v?Mul....................................................................2451
v?MulByConj..........................................................2453
36
Contents
v?Conj..................................................................2455
v?Abs....................................................................2456
Power and Root Functions................................................2458
v?Inv....................................................................2458
v?Div....................................................................2460
v?Sqrt...................................................................2462
v?InvSqrt..............................................................2464
v?Cbrt...................................................................2466
v?InvCbrt..............................................................2467
v?Pow2o3..............................................................2469
v?Pow3o2..............................................................2470
v?Pow...................................................................2472
v?Powx..................................................................2474
v?Hypot.................................................................2478
Exponential and Logarithmic Functions...............................2480
v?Exp....................................................................2480
v?Expm1...............................................................2482
v?Ln.....................................................................2484
v?Log10................................................................2486
v?Log1p................................................................2488
Trigonometric Functions...................................................2489
v?Cos....................................................................2489
v?Sin....................................................................2492
v?SinCos...............................................................2494
v?CIS....................................................................2496
v?Tan....................................................................2497
v?Acos..................................................................2500
v?Asin...................................................................2502
v?Atan..................................................................2504
v?Atan2.................................................................2506
Hyperbolic Functions.......................................................2508
v?Cosh..................................................................2508
37
v?Sinh...................................................................2510
v?Tanh..................................................................2513
v?Acosh.................................................................2515
v?Asinh.................................................................2517
v?Atanh.................................................................2520
Special Functions............................................................2522
v?Erf.....................................................................2522
v?Erfc...................................................................2526
v?CdfNorm............................................................2528
v?ErfInv................................................................2530
v?ErfcInv...............................................................2535
v?CdfNormInv........................................................2537
Rounding Functions.........................................................2539
v?Floor..................................................................2539
v?Ceil....................................................................2541
v?Trunc.................................................................2542
v?Round................................................................2544
v?NearbyInt...........................................................2545
v?Rint...................................................................2547
v?Modf..................................................................2548
VML Pack/Unpack Functions......................................................2550
v?Pack...........................................................................2550
v?Unpack.......................................................................2553
VML Service Functions.............................................................2556
SetMode........................................................................2557
GetMode........................................................................2560
SetErrStatus...................................................................2561
GetErrStatus..................................................................2563
ClearErrStatus................................................................2563
SetErrorCallBack.............................................................2564
GetErrorCallBack.............................................................2567
ClearErrorCallBack..........................................................2568
38
Contents
Chapter 10: Statistical Functions

Random Number Generators.....................................................2569
Conventions...................................................................2570
Mathematical Notation.............................................2571
Naming Conventions...............................................2572
Basic Generators.............................................................2577
BRNG Parameter Definition......................................2579
Random Streams....................................................2580
Data Types............................................................2581
Error Reporting...............................................................2581
VSL Usage Model............................................................2583
Service Routines.............................................................2585
NewStream............................................................2587
NewStreamEx........................................................2588
iNewAbstractStream................................................2590
dNewAbstractStream...............................................2593
sNewAbstractStream...............................................2597
DeleteStream.........................................................2600
CopyStream...........................................................2601
CopyStreamState...................................................2602
SaveStreamF.........................................................2604
LoadStreamF..........................................................2606
LeapfrogStream......................................................2607
SkipAheadStream...................................................2612
GetStreamStateBrng...............................................2616
GetNumRegBrngs...................................................2617
Distribution Generators....................................................2617
Continuous Distributions..........................................2620
Discrete Distributions..............................................2669
Advanced Service Routines...............................................2692
Data types.............................................................2693
39
RegisterBrng..........................................................2695
GetBrngProperties..................................................2696
Formats for User-Designed Generators......................2697
Convolution and Correlation......................................................2702
Overview.......................................................................2702
Naming Conventions........................................................2703
Data Types.....................................................................2704
Parameters....................................................................2705
Task Status and Error Reporting........................................2708
Task Constructors...........................................................2710
NewTask................................................................2712
NewTask1D............................................................2715
NewTaskX..............................................................2718
NewTaskX1D..........................................................2724
Task Editors...................................................................2728
SetMode................................................................2730
SetInternalPrecision................................................2732
SetStart................................................................2734
SetDecimation........................................................2736
Task Execution Routines...................................................2737
Exec.....................................................................2739
Exec1D..................................................................2744
ExecX....................................................................2750
ExecX1D................................................................2755
Task Destructors.............................................................2761
DeleteTask.............................................................2761
Task Copy......................................................................2762
CopyTask...............................................................2763
Usage Examples.............................................................2765
Mathematical Notation and Definitions...............................2769
Data Allocation...............................................................2770
40
Contents
Chapter 11: Fourier Transform Functions

FFT Functions.........................................................................2776
Computing FFT...............................................................2777
FFT Interface..................................................................2777
Status Checking Functions................................................2779
ErrorClass..............................................................2779
ErrorMessage.........................................................2781
Descriptor Manipulation Functions.....................................2783
CreateDescriptor....................................................2783
CommitDescriptor...................................................2785
CopyDescriptor.......................................................2787
FreeDescriptor........................................................2788
FFT Computation Functions..............................................2790
ComputeForward....................................................2790
ComputeBackward..................................................2792
Descriptor Configuration Functions....................................2794
SetValue................................................................2794
GetValue...............................................................2797
Configuration Settings.....................................................2800
Precision of transform.............................................2805
Forward domain of transform...................................2805
Transform dimension and lengths..............................2806
Number of transforms.............................................2806
Scale....................................................................2807
Placement of result.................................................2807
Storage schemes....................................................2807
Packed formats.......................................................2819
Number of user threads...........................................2824
Input and output distances......................................2825
Strides..................................................................2826
Ordering................................................................2828
41
Transposition..........................................................2829
Cluster FFT Functions...............................................................2829
Computing Cluster FFT....................................................2830
Distributing Data among Processes....................................2831
Cluster FFT Interface.......................................................2834
Descriptor Manipulation Functions.....................................2835
CreateDescriptorDM................................................2836
CommitDescriptorDM..............................................2838
FreeDescriptorDM...................................................2840
FFT Computation Functions..............................................2841
ComputeForwardDM................................................2841
ComputeBackwardDM..............................................2845
Descriptor Configuration Functions....................................2848
SetValueDM...........................................................2848
GetValueDM...........................................................2852
Error Codes....................................................................2856
Chapter 12: PBLAS Routines

Overview...............................................................................2857
Routine Naming Conventions....................................................2858
PBLAS Level 1 Routines............................................................2860
p?amax.........................................................................2861
p?asum.........................................................................2862
p?axpy..........................................................................2864
p?copy..........................................................................2866
p?dot............................................................................2867
p?dotc...........................................................................2869
p?dotu...........................................................................2870
p?nrm2..........................................................................2872
p?scal............................................................................2873
p?swap..........................................................................2874
42
Contents
p?gemv.........................................................................2877
p?ger............................................................................2880
p?gerc...........................................................................2882
p?geru...........................................................................2884
p?hemv.........................................................................2886
p?her............................................................................2889
p?her2...........................................................................2891
p?symv..........................................................................2893
p?syr.............................................................................2896
p?syr2...........................................................................2898
p?trmv..........................................................................2901
p?trsv............................................................................2903
p?gemm........................................................................2907
p?hemm........................................................................2910
p?hemm........................................................................2912
p?herk...........................................................................2915
p?her2k.........................................................................2918
p?symm........................................................................2921
p?syrk...........................................................................2924
p?syr2k.........................................................................2927
p?tran...........................................................................2930
p?tranu..........................................................................2932
p?tranc..........................................................................2933
p?trmm.........................................................................2935
p?trsm..........................................................................2938
Chapter 13: Partial Differential Equations Support

Trigonometric Transform Routines..............................................2941
Transforms Implemented.................................................2942
Sequence of Invoking TT Routines.....................................2944
Interface Description.......................................................2947
43
TT Routines....................................................................2948
?_init_trig_transform..............................................2948
?_commit_trig_transform.........................................2949
?_forward_trig_transform........................................2952
?_backward_trig_transform......................................2954
free_trig_transform.................................................2957
Common Parameters.......................................................2958
Poisson Library Routines ..........................................................2966
Poisson Library Implemented............................................2967
Sequence of Invoking PL Routines.....................................2975
Interface Description.......................................................2978
PL Routines for the Cartesian Solver..................................2979
?_init_Helmholtz_2D/?_init_Helmholtz_3D.................2979
?_commit_Helmholtz_2D/?_commit_Helmholtz_3D.....2982
?_Helmholtz_2D/?_Helmholtz_3D.............................2988
free_Helmholtz_2D/free_Helmholtz_3D.....................2994
PL Routines for the Spherical Solver..................................2995
?_init_sph_p/?_init_sph_np.....................................2995
?_commit_sph_p/?_commit_sph_np..........................2998
?_sph_p/?_sph_np..................................................3000
free_sph_p/free_sph_np..........................................3003
Common Parameters.......................................................3004
Calling PDE Support Routines from Fortran 90.............................3027
Chapter 14: Optimization Solver Routines

Organization and Implementation..............................................3029
Nonlinear Least Squares Problem without Constraints..................3031
dtrnlsp_init....................................................................3032
dtrnlsp_solve..................................................................3033
dtrnlsp_get....................................................................3035
44
Contents
dtrnlsp_delete................................................................3037
Examples of dtrnlsp Usage...............................................3038
Nonlinear Least Squares Problem with Linear (Bound) Constraints..3051
dtrnlspbc_init.................................................................3052
dtrnlspbc_solve..............................................................3054
dtrnlspbc_get.................................................................3055
dtrnlspbc_delete.............................................................3057
Examples of dtrnlspbc Usage............................................3058
Jacobi Matrix Calculation Routines.............................................3073
djacobi_init....................................................................3074
djacobi_solve.................................................................3075
djacobi_delete................................................................3076
djacobi..........................................................................3076
Examples of djacobi_solve Usage......................................3078
Examples of djacobi Usage...............................................3085
Chapter 15: Support Functions

Version Information Functions...................................................3091
MKLGetVersion...............................................................3091
MKLGetVersionString.......................................................3094
Threading Control Functions.....................................................3096
mkl_set_num_threads.....................................................3097
mkl_domain_set_num_threads.........................................3098
mkl_set_dynamic............................................................3099
mkl_get_max_threads.....................................................3100
mkl_domain_get_max_threads.........................................3100
mkl_get_dynamic...........................................................3101
Error Handling Functions..........................................................3102
xerbla............................................................................3102
pxerbla..........................................................................3103
Equality Test Functions.............................................................3104
lsame............................................................................3104
45
lsamen..........................................................................3105
Timing Functions.....................................................................3106
second/dsecnd................................................................3106
getcpuclocks..................................................................3106
getcpufrequency.............................................................3107
setcpufrequency.............................................................3108
Memory Functions...................................................................3108
MKL_FreeBuffers.............................................................3108
MKL_MemStat................................................................3111
MKL_malloc....................................................................3112
MKL_free.......................................................................3113
Examples of MKL_malloc(), MKL_free(), MKL_MemStat()
Usage........................................................................3114
Progress Function....................................................................3118
mkl_progress.................................................................3118
Chapter 16: BLACS Routines

Matrix Shapes.........................................................................3121
BLACS Combine Operations......................................................3122
?gamx2d.......................................................................3124
?gamn2d.......................................................................3126
?gsum2d........................................................................3128
BLACS Point To Point Communication.........................................3129
?gesd2d.........................................................................3133
?trsd2d..........................................................................3134
?gerv2d.........................................................................3135
?trrv2d..........................................................................3136
BLACS Broadcast Routines........................................................3137
?gebs2d.........................................................................3138
?trbs2d..........................................................................3139
?gebr2d.........................................................................3140
?trbr2d..........................................................................3141
46
Contents
BLACS Support Routines..........................................................3142

Initialization Routines......................................................3142
blacs_pinfo............................................................3143
blacs_setup...........................................................3144
blacs_get...............................................................3145
blacs_set...............................................................3146
blacs_gridinit.........................................................3148
blacs_gridmap........................................................3149
Destruction Routines.......................................................3151
blacs_freebuff........................................................3151
blacs_gridexit.........................................................3152
blacs_abort............................................................3152
blacs_exit..............................................................3153
Informational Routines....................................................3154
blacs_gridinfo.........................................................3154
blacs_pnum...........................................................3155
blacs_pcoord..........................................................3155
Miscellaneous Routines....................................................3156
blacs_barrier..........................................................3156
Examples of BLACS Routines Usage...........................................3158
Appendix A: Linear Solvers Basics

Sparse Linear Systems.............................................................3177
Matrix Fundamentals.......................................................3178
Direct Method.................................................................3179
Sparse Matrix Storage Formats.........................................3184
Appendix B: Routine and Function Arguments

Vector Arguments in BLAS........................................................3199
Vector Arguments in VML.........................................................3200
Matrix Arguments....................................................................3201
47
Appendix C: Code Examples

BLAS Code Examples...............................................................3207
PARDISO Code Examples.........................................................3211
Examples for Sparse Symmetric Linear Systems..................3211
Examples for Sparse Unsymmetric Linear Systems..............3218
Direct Sparse Solver Code Examples..........................................3225
Iterative Sparse Solver Code Examples......................................3233
Fourier Transform Functions Code Examples................................3264
FFT Code Examples.........................................................3264
Examples of Using Multi-Threading for FFT
Computation......................................................3273
Examples for Cluster FFT Functions...................................3277
Auxiliary Data Transformations.........................................3278
PDE Support Code Examples.....................................................3280
Trigonometric Transforms Interface Code Examples.............3280
Poisson Library Code Examples.........................................3289
Preconditioner Code Examples..................................................3307
Code Examples for Sparse Matrix Converters..............................3343
Appendix D: CBLAS Interface to the BLAS

CBLAS Arguments...................................................................3355
Level 1 CBLAS........................................................................3356
Level 2 CBLAS........................................................................3359
Level 3 CBLAS........................................................................3365
Sparse CBLAS.........................................................................3368
Appendix E: Specific Features of Fortran 95 Interfaces for

LAPACK Routines
Interfaces Identical to Netlib.....................................................3371
Interfaces with Replaced Argument Names.................................3373
48
Contents
Modified Netlib Interfaces.........................................................3375

Interfaces Absent From Netlib...................................................3376
Interfaces of New Functionality.................................................3379
Appendix F: Optimization Solvers Basics

Nonlinear Least Squares Problem..............................................3381
Trust-Region Algorithm............................................................3382
Appendix G: FFTW to Intel® Math Kernel Library Wrappers

Notational Conventions ...........................................................3385
FFTW2.x to Intel® Math Kernel Library Wrappers .........................3385
Wrappers Reference........................................................3385
Complex Fast Fourier Transforms..............................3386
Real Fast Fourier Transforms....................................3387
Wisdom Wrappers...................................................3387
Memory Allocation..................................................3387
Parallel Mode..................................................................3388
Calling Wrappers from Fortran..........................................3388
Installation.....................................................................3389
Creating the Wrapper Library....................................3389
Application Assembling ...........................................3391
Running Examples .................................................3391
MPI FFTW .....................................................................3391
MPI FFTW Wrappers Reference.................................3391
Creating MPI FFTW Wrapper Library..........................3393
Application Assembling with MPI FFTW Wrapper Library
........................................................................3395
FFTW3.x to Intel® Math Kernel Library Wrappers..........................3396
Wrappers Reference........................................................3397
Wrappers for Using Plans.........................................3397
49
Basic Interface Wrappers.........................................3397

Advanced Interface Wrappers...................................3400
Guru Interface Wrappers.........................................3401
Wisdom Wrappers...................................................3402
Memory Allocation..................................................3402
Parallel Mode..................................................................3403
Calling Wrappers from Fortran..........................................3403
Installation.....................................................................3405
Creating the Wrapper Library....................................3405
Application Assembling............................................3406
Bibliography
Glossary
Index
50
Overview 1
The Intel® Math Kernel Library (Intel® MKL) provides Fortran routines and functions that perform a wide
variety of operations on vectors and matrices including sparse matrices and interval matrices. The library
also includes fast Fourier transform (FFT) functions, as well as vector mathematical and vector statistical
functions with Fortran and C interfaces.
The versions of Intel MKL intended for Windows* and Linux* operating systems also include ScaLAPACK
software and Cluster FFT software for solving respective computational problems on distributed-memory
parallel computers.
The Intel MKL enhances performance of the application programs that use it because the library has been
optimized for latest generations of Intel® processors.
This chapter introduces the Intel Math Kernel Library and provides information about the organization of
this manual.
About This Software

The Intel® Math Kernel Library includes the following groups of routines:
• Basic Linear Algebra Subprograms (BLAS):

– vector operations
– matrix-vector operations
– matrix-matrix operations
• Sparse BLAS Level 1, 2, and 3 (basic operations on sparse vectors and matrices)
• LAPACK routines for solving systems of linear equations
• LAPACK routines for solving least squares problems, eigenvalue and singular value problems,
and Sylvester's equations
• Auxiliary and utility LAPACK routines
• ScaLAPACK computational, driver, and auxiliary routines (only in Intel MKL for Linux* and
Windows* operating systems)
• PBLAS routines for distributed vector, matrix-vector, and matrix-matrix operation
• Direct and Iterative Sparse Solver routines
• Vector Mathematical Library (VML) functions for computing core mathematical functions on vector
arguments (with Fortran and C interfaces)
51
1 Intel® Math Kernel Library
• Vector Statistical Library (VSL) functions for generating vectors of pseudorandom numbers
with different types of statistical distributions and for performing convolution and correlation
computations
• General Fast Fourier Transform (FFT) Functions, providing fast computation of Discrete
Fourier Transform via the FFT algorithms and having Fortran and C interfaces
• Cluster FFT functions (only in Intel MKL for Linux* and Windows* operating systems)
• Tools for solving partial differential equations - trigonometric transform routines and Poisson
solver
• Optimization Solver routines for solving nonlinear least squares problems through the
Trust-Region (TR) algorithms and computing Jacobi matrix by central differences
• Basic Linear Algebra Communication Subprograms (BLACS) that are used to support a linear
algebra oriented message passing interface
• GMP arithmetic functions
For specific issues on using the library, please refer to the Intel® MKL Release Notes.
Technical Support
Intel MKL provides a product web site that offers timely and comprehensive product information,
including product features, white papers, and technical articles. For the latest information,
check: http://developer.intel.com/software/products/
Intel also provides a support web site that contains a rich repository of self help information,
including getting started tips, known product issues, product errata, license information, user
forums, and more (visit http://support.intel.com/support/performancetools/libraries/mkl).
Registering your product entitles you to one year of technical support and product updates
through Intel® Premier Support. Intel Premier Support is an interactive issue management and
communication web site providing these services:
• Submit issues and review their status.

• Download product updates anytime of the day.
To register your product, contact Intel, or seek product support, please visit
http://developer.intel.com/software/products/support.
BLAS Routines
The BLAS routines and functions are divided into the following groups according to the operations
they perform:
52
Overview 1
• BLAS Level 1 Routines perform operations of both addition and reduction on vectors of data.
Typical operations include scaling and dot products.
• BLAS Level 2 Routines perform matrix-vector operations, such as matrix-vector multiplication,
rank-1 and rank-2 matrix updates, and solution of triangular systems.
• BLAS Level 3 Routines perform matrix-matrix operations, such as matrix-matrix multiplication,
rank-k update, and solution of triangular systems.
Starting from release 8.0, Intel® MKL also supports Fortran 95 interface to the BLAS routines.
Starting from release 10.1, a number of BLAS-like Extensions are added to enable the user to
perform certain data manipulation, including matrix in-place and out-of-place transposition
operations combined with simple matrix arithmetic operations.
Sparse BLAS Routines

The Sparse BLAS Level 1 Routines and Functions and Sparse BLAS Level 2 Routines and Level
3 routines and functions operate on sparse vectors and matrices. These routines perform vector
operations similar to the BLAS Level 1, 2, and 3 routines. The Sparse BLAS routines take
advantage of vector and matrix sparsity: they allow you to store only non-zero elements of
vectors and matrices. Intel MKL also supports Fortran 95 interface to Sparse BLAS routines.
LAPACK Routines
The Intel® Math Kernel Library fully supports LAPACK 3.1 set of computational, driver, auxiliary
and utility routines.
The original versions of LAPACK from which that part of Intel MKL was derived can be obtained
from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai,
C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling,
A. McKenney, and D. Sorensen.
The LAPACK routines can be divided into the following groups according to the operations they
perform:
• Routines for solving systems of linear equations, factoring and inverting matrices, and
estimating condition numbers (see Chapter 3).
• Routines for solving least squares problems, eigenvalue and singular value problems, and
Sylvester's equations (see Chapter 4).
• Auxiliary and utility routines used to perform certain subtasks, common low-level computation
or related tasks (see Chapter 5).
53
Starting from release 8.0, Intel MKL also supports Fortran 95 interface to LAPACK computational
and driver routines. This interface provides an opportunity for simplified calls of LAPACK routines
with fewer required arguments.
ScaLAPACK Routines
The ScaLAPACK package (included only with the Intel® MKL versions for Linux* and Windows*
operating systems, see Chapter 6 and Chapter 7) runs on distributed-memory architectures
and includes routines for solving systems of linear equations, solving linear least squares
problems, eigenvalue and singular value problems, as well as performing a number of related
computational tasks.
The original versions of ScaLAPACK from which that part of Intel MKL was derived can be
obtained from http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are L.
Blackford, J. Choi, A.Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling,
G. Henry, A. Petitet, K.Stanley, D. Walker, and R. Whaley.
The Intel MKL version of ScaLAPACK is optimized for Intel® processors and uses MPICH version
of MPI as well as Intel MPI.
PBLAS Routines
The PBLAS routines perform operations with distributed vectors and matrices.
• PBLAS Level 1 Routines perform operations of both addition and reduction on vectors of
data. Typical operations include scaling and dot products.
• PBLAS Level 2 Routines perform distributed matrix-vector operations, such as matrix-vector
multiplication, rank-1 and rank-2 matrix updates, and solution of triangular systems.
• PBLAS Level 3 Routines perform distributed matrix-matrix operations, such as matrix-matrix
multiplication, rank-k update, and solution of triangular systems.
Intel MKL provides the PBLAS routines with interface similar to the interface used in the Netlib
PBLAS (part of the ScaLAPACK package, see
http://www.netlib.org/scalapack/html/pblas_qref.html).
Sparse Solver Routines

Direct sparse solver routines in Intel MKL (see Chapter 8) solve symmetric and
symmetrically-structured sparse matrices with real or complex coefficients. For symmetric
matrices, these Intel MKL subroutines can solve both positive-definite and indefinite systems.
Intel MKL includes the PARDISO* sparse solver interface as well as an alternative set of user
callable direct sparse solver routines.
54
Overview 1
If you use the sparse solver PARDISO* from Intel MKL, please cite:
O.Schenk and K.Gärtner. Solving unsymmetric sparse systems of linear equations with PARDISO.
J. of Future Generation Computer Systems, 20(3):475-487, 2004.
Intel MKL provides also an iterative sparse solver (see Chapter 8) that uses Sparse BLAS level
2 and 3 routines and works with different sparse data formats.
VML Functions
The Vector Mathematical Library (VML) functions (see Chapter 9) include a set of highly optimized
implementations of certain computationally expensive core mathematical functions (power,
trigonometric, exponential, hyperbolic, etc.) that operate on vectors of real and complex
numbers.
Application programs that might significantly improve performance with VML include nonlinear
programming software, integrals computation, and many others. VML provides interfaces both
for Fortran and C languages.
Statistical Functions
The Vector Statistical Library (VSL) contains two sets of functions (see Chapter 10):
• The first set includes a collection of pseudo- and quasi-random number generator subroutines
implementing basic continuous and discrete distributions. To provide best performance, the
VSL subroutines use calls to highly optimized Basic Random Number Generators (BRNGs)
and a library of vector mathematical functions.
• The second set includes a collection of routines that implement a wide variety of convolution
and correlation operations.
Fourier Transform Functions

The Intel® MKL multidimensional Fast Fourier Transform (FFT) functions with mixed radix support
(see Chapter 11) provide uniformity of discrete Fourier transform computation and combine
functionality with ease of use. Both Fortran and C interface specification are given. There is
also a cluster version of FFT functions, which runs on distributed-memory architectures and is
provided only in Intel MKL versions for the Linux* and Windows* operating systems.
The FFT functions provide fast computation via the FFT algorithms for arbitrary lengths. See
the Intel MKL User's Guide for the specific radices supported.
55
Partial Differential Equations Support

Intel® MKL provides tools for solving Partial Differential Equations (PDE) (see Chapter 13).
These tools are Trigonometric Transform interface routines and Poisson Library.
The Trigonometric Transform routines may be helpful to users who implement their own solvers
similar to the solver that the Poisson Library provides. The users can improve performance of
their solvers by using fast sine, cosine, and staggered cosine transforms implemented in the
Trigonometric Transform interface.
The Poisson Library is designed for fast solving of simple Helmholtz, Poisson, and Laplace
problems. The Trigonometric Transform interface, which underlies the solver, is based on the
Intel MKL FFT interface (refer to Chapter 11), optimized for Intel® processors.
Optimization Solver Routines

Intel® MKL provides Optimization Solver routines (see Chapter 14) that can be used to solve
nonlinear least squares problems with or without linear (bound) constraints through the
Trust-Region (TR) algorithms and compute Jacobi matrix by central differences.
See also Appendix A Optimization Solvers Basics for description of the basic notions of
optimization solvers, nonlinear least square problems, and the TR algorithm.
Support Functions
The Intel® MKL support functions (see Chapter 15) are used to support the operation of the
Intel MKL software and provide basic information on the library and library operation, such as
the current library version, timing, setting and measuring of CPU frequency, error handling,
and memory allocation.
Starting from release 10.0, the Intel MKL support functions provide additional threading control.
Starting from release 10.1, Intel MKL selectively supports a Progress Routine feature to track
progress of a lengthy computation and/or interrupt the computation using a callback function
mechanism. The user application can define a function called mkl_progress that is regularly
called from the Intel MKL routine supporting the progress routine feature. See the Progress
Routines section in Chapter 15 for reference. Refer to a specific LAPACK or DSS/PARDISO
function description to see whether the function supports this feature or not.
56
Overview 1
BLACS Routines
The Intel® Math Kernel Library implements routines from the BLACS (Basic Linear Algebra
Communication Subprograms) package (see Chapter 16) that are used to support a linear
algebra oriented message passing interface that may be implemented efficiently and uniformly
across a large range of distributed memory platforms.
The original versions of BLACS from which that part of Intel MKL was derived can be obtained
from http://www.netlib.org/blacs/index.html. The authors of BLACS are Jack Dongarra and R.
Clint Whaley.
GMP Arithmetic Functions

Intel® MKL implementation of GMP arithmetic functions includes arbitrary precision arithmetic
operations on integer numbers. The interfaces of such functions fully match the GNU Multiple
Precision (GMP) Arithmetic Library.
See http://www.intel.com/software/products/mkl/docs/gnump/WebHelp/index.htm for the
description of the GMP functions.
Performance Enhancements
The Intel® Math Kernel Library has been optimized by exploiting both processor and system
features and capabilities. Special care has been given to those routines that most profit from
cache-management techniques. These especially include matrix-matrix operation routines such
as dgemm().
In addition, code optimization techniques have been applied to minimize dependencies of
scheduling integer and floating-point units on the results within the processor.
The major optimization techniques used throughout the library include:
• Loop unrolling to minimize loop management costs

• Blocking of data to improve data reuse opportunities
• Copying to reduce chances of data eviction from cache
• Data prefetching to help hide memory latency
• Multiple simultaneous operations (for example, dot products in dgemm) to eliminate stalls
due to arithmetic unit pipelines
• Use of hardware features such as the SIMD arithmetic units, where appropriate
These are techniques from which the arithmetic code benefits the most.
57
Parallelism
In addition to the performance enhancements discussed above, Intel® MKL offers performance
gains through parallelism provided by the symmetric multiprocessing performance (SMP)
feature. You can obtain improvements from SMP in the following ways:
• One way is based on user-managed threads in the program and further distribution of the
operations over the threads based on data decomposition, domain decomposition, control
decomposition, or some other parallelizing technique. Each thread can use any of the Intel
MKL functions (except for the ?lacon, ?lasq3, ?lasq4 LAPACK deprecated routines) because
the library has been designed to be thread-safe.
• Another method is to use the FFT and BLAS level 3 routines. They have been parallelized
and require no alterations of your application to gain the performance enhancements of
multiprocessing. Performance using multiple processors on the level 3 BLAS shows excellent
scaling. Since the threads are called and managed within the library, the application does
not need to be recompiled thread-safe (see also Fortran 95 Interface Conventions in Chapter
2 ).
• Yet another method is to use tuned LAPACK routines. Currently these include the single-
and double precision flavors of routines for QR factorization of general matrices, triangular
factorization of general and symmetric positive-definite matrices, solving systems of equations
with such matrices, as well as solving symmetric eigenvalue problems.
For instructions on setting the number of available processors for the BLAS level 3 and LAPACK
routines, see Intel® MKL User's Guide.
Platforms Supported
The Intel® Math Kernel Library includes Fortran routines and functions optimized for Intel®
processor-based computers running operating systems that support multiprocessing. In addition
to the Fortran interface, Intel MKL includes a C-language interface for the Discrete Fourier
transform functions, as well as for the Vector Mathematical Library and Vector Statistical Library
functions. For hardware and software requirements to use Intel MKL, see Intel® MKL Release
Notes.
About This Manual

This manual describes the routines and functions of Intel® MKL. Each reference section describes
a routine group typically consisting of routines used with four basic data types: single-precision
real, double-precision real, single-precision complex, and double-precision complex.
58
Overview 1
Each routine group is introduced by its name, a short description of its purpose, and the calling
sequence, or syntax, for each type of data with which each routine of the group is used. The
following sections are also included:
Description Describes the operation performed by routines of the group based on
one or more equations. The data types of the arguments are defined
in general terms for the group.
Input Parameters Defines the data type for each parameter on entry, for example:
a REAL for saxpy
DOUBLE PRECISION for daxpy
Output Parameters Lists resultant parameters on exit.
The following terms are used in the manual in reference to various operating systems:
Windows* OS This term refers to information that is valid on all supported Microsoft
Windows* operating systems.
Linux* OS This term refers to information that is valid on all supported Linux*
operating systems.
Mac OS* X This term refers to information that is valid on Intel®-based systems
running the Mac OS* X operating system.
Audience for This Manual

The manual addresses programmers proficient in computational mathematics and assumes a
working knowledge of the principles and vocabulary of linear algebra, mathematical statistics,
and Fourier transforms.
Manual Organization
The manual contains the following chapters and appendixes:
Chapter 1 Overview. Introduces the Intel Math Kernel Library software, provides
information on manual organization, and explains notational
conventions.
Chapter 2 BLAS and Sparse BLAS Routines. Provides descriptions of BLAS and
Sparse BLAS functions and routines.
Chapter 3 LAPACK Routines: Linear Equations. Provides descriptions of LAPACK
routines for solving systems of linear equations and performing a
number of related computational tasks: triangular factorization, matrix
inversion, estimating the condition number of matrices.
59
Chapter 4 LAPACK Routines: Least Squares and Eigenvalue Problems. Provides

descriptions of LAPACK routines for solving least squares problems,
standard and generalized eigenvalue problems, singular value problems,
and Sylvester's equations.
Chapter 5 LAPACK Auxiliary and Utility Routines. Describes auxiliary and utility
LAPACK routines that perform certain subtasks or common low-level
computation.
Chapter 6 ScaLAPACK Routines. Describes ScaLAPACK computational and driver
routines (software included only with Intel MKL for Linux* and Windows*
operating systems).
Chapter 7 ScaLAPACK Auxiliary and Utility Routines. Describes ScaLAPACK auxiliary
routines (software included only with Intel MKL for Linux* and Windows*
operating systems).
Chapter 8 Sparse Solver Routines. Describes direct sparse solver routines that
solve symmetric and symmetrically-structured sparse matrices. Also
describes the iterative sparse solver routines.
Chapter 9 Vector Mathematical Functions. Provides descriptions of VML functions
for computing elementary mathematical functions on vector arguments.
Chapter 10 Statistical Functions. Provides descriptions of VSL functions for
generating vectors of pseudorandom numbers and for performing
convolution and correlation operations.
Chapter 11 Fourier Transform Functions. Describes multidimensional functions for
computing fast Fourier transform and cluster FFT functions (software
included only with Intel MKL for the Linux* and Windows* operating
systems).
Chapter 12 PBLAS Routines. Provides descriptions of PBLAS functions.
Chapter 13 Partial Differential Equations Support. Describes Trigonometric
Transform interface routines and Poisson Library implemented for
solving Partial Differential Equations (PDE).
Chapter 14 Optimization Solver Routines. Describes routines for solving optimization
problems. These routines solve nonlinear least squares problems
through the Trust-Region (TR) algorithms and compute Jacobi matrix
by central differences.
Chapter 15 Support Functions. Describes functions that are used to support the
operation of Intel MKL software, such as status information functions,
timing and error handling functions.
Chapter 16 BLACS Routines. Describes Basic Linear Algebra Communication
Subprograms that are used to support a linear algebra oriented message
passing interface.
60
Overview 1
Appendix A Linear Solvers Basics. Briefly describes the basic definitions and
approaches used in linear algebra for solving systems of linear
equations. Describes sparse data storage formats, as well as basic
concepts of interval arithmetic.
Appendix B Routine and Function Arguments. Describes the major arguments of
the BLAS routines and VML functions: vector and matrix arguments.
Appendix C Code Examples. Provides code examples of calling various Intel MKL
functions and routines (BLAS, PARDISO, Direct and Iterative Sparse
Solver, FFT, Cluster FFT, Intreval Linear Solvers, Trigonometric
Transforms, and Poisson Library).
Appendix D CBLAS Interface to the BLAS. Provides the C interface to the BLAS.
Appendix E Specific Features of Fortran 95 Interfaces for LAPACK Routines. Provides
the features of Intel MKL Fortran 95 interfaces for LAPACK routines in
comparison with Netlib implementation.
Appendix F Optimization Solvers Basics. Briefly describes the basic notions of
optimization solvers, nonlinear least square problem, and Trust Region
algorithm.
Appendix G FFTW to Intel® Math Kernel Library Wrappers. Describes two collections
of FFTW2MKL wrappers for programs that use FFTW. The wrappers
allow developers to gain performance with the Intel MKL Fourier
transforms without changing the program source code.
The manual also includes a Bibliography, Glossary and an Index.
Notational Conventions
This manual uses the following notational conventions:
• Routine name shorthand (for example, ?ungqr instead of cungqr/zungqr).

• Font conventions used for distinction between the text and the code.
Routine Name Shorthand

For shorthand, names with a question mark “?” in them represent groups of routines with
similar functionality. The question mark is used to indicate any or all possible varieties of a
function; for example:
?swap Refers to all four data types of the vector-vector ?swap routine: sswap,
dswap, cswap, and zswap.
61
Font Conventions
The following font conventions are used:
UPPERCASE COURIER Data type used in the description of input and output parameters for
Fortran interface. For example, CHARACTER*1.
lowercase courier Code examples:
a(k+i,j) = matrix(i,j)
and data types for C interface, for example, const float*
lowercase courier Function names for C interface, for example, vmlSetMode
mixed with UpperCase
courier
lowercase courier Variables in arguments and parameters description. For example, incx.
italic
* Used as a multiplication symbol in code examples and equations and
where required by the Fortran syntax.
62
BLAS and Sparse BLAS Routines 2
This chapter describes the Intel® Math Kernel Library implementation of the BLAS and Sparse BLAS routines,
and BLAS-like extensions. The routine descriptions are arranged in several sections according to the BLAS
level of operation:
• BLAS Level 1 Routines (vector-vector operations)
• BLAS Level 2 Routines (matrix-vector operations)
• BLAS Level 3 Routines (matrix-matrix operations)
• Sparse BLAS Level 1 Routines (vector-vector operations).
• Sparse BLAS Level 2 and Level 3 Routines (matrix-vector and matrix-matrix operations)
• BLAS-like Extensions
Each section presents the routine and function group descriptions in alphabetical order by routine or
function group name; for example, the ?asum group, the ?axpy group. The question mark in the group
name corresponds to different character codes indicating the data type (s, d, c, and z or their combination);
see Routine Naming Conventions.
When BLAS or Sparse BLAS routines encounter an error, they call the error reporting routine xerbla.
In BLAS Level 1 groups i?amax and i?amin, an “i” is placed before the data-type indicator and corresponds
to the index of an element in the vector. These groups are placed in the end of the BLAS Level 1 section.
BLAS Routines
Routine Naming Conventions

BLAS routine names have the following structure:
<character> <name> <mod> ( )
The <character> field indicates the data type:
s real, single precision
c complex, single precision
d real, double precision
z complex, double precision
Some routines and functions can have combined character codes, such as sc or dz.
63
For example, the function scasum uses a complex input array and returns a real value.
The <name> field, in BLAS level 1, indicates the operation type. For example, the BLAS level 1
routines ?dot, ?rot, ?swap compute a vector dot product, vector rotation, and vector swap,
respectively.
In BLAS level 2 and 3, <name> reflects the matrix argument type:
ge general matrix
gb general band matrix
sy symmetric matrix
sp symmetric matrix (packed storage)
sb symmetric band matrix
he Hermitian matrix
hp Hermitian matrix (packed storage)
hb Hermitian band matrix
tr triangular matrix
tp triangular matrix (packed storage)
tb triangular band matrix.
The <mod> field, if present, provides additional details of the operation. BLAS level 1 names
can have the following characters in the <mod> field:
c conjugated vector
u unconjugated vector
g Givens rotation.
BLAS level 2 names can have the following characters in the <mod> field:
mv matrix-vector product
sv solving a system of linear equations with matrix-vector operations
r rank-1 update of a matrix
r2 rank-2 update of a matrix.
BLAS level 3 names can have the following characters in the <mod> field:
mm matrix-matrix product
sm solving a system of linear equations with matrix-matrix operations
rk rank-k update of a matrix
r2k rank-2k update of a matrix.
The examples below illustrate how to interpret BLAS routine names:
64
ddot <d> <dot>: double-precision real vector-vector dot product
cdotc <c> <dot> <c>: complex vector-vector dot product, conjugated
scasum <sc> <asum>: sum of magnitudes of vector elements, single precision
real output and single precision complex input
cdotu <c> <dot> : vector-vector dot product, unconjugated, complex
sgemv <s> <ge> <mv>: matrix-vector product, general matrix, single precision
ztrmm <z> <tr> <mm>: matrix-matrix product, triangular matrix,
double-precision complex.
Sparse BLAS level 1 naming conventions are similar to those of BLAS level 1. For more
information, see “Naming Conventions”.
Fortran 95 Interface Conventions

Fortran 95 interface to BLAS and Sparse BLAS Level 1 routines is implemented through wrappers
that call respective FORTRAN 77 routines. This interface uses such features of Fortran 95 as
assumed-shape arrays and optional arguments to provide simplified calls to BLAS and Sparse
BLAS Level 1 routines with fewer parameters.
NOTE. For BLAS, Intel MKL offers two types of Fortran 95 interfaces:
• using mkl_blas.fi only through include ‘mkl_blas_subroutine.fi’ statement.
Such interfaces allow you to make use of the original LAPACK routines with all their
arguments
• using mkl_blas.f90 that includes improved interfaces described below. This file is
used to generate the module files mkl95_blas.mod and mkl95_precision.mod. See
also section “Fortran 95 interfaces and wrappers to LAPACK and BLAS” of Intel® MKL
User's Guide for details. The module files are used to process the FORTRAN use clauses
referencing the LAPACK interface: use mkl95_blas and use mkl95_precision.
The main conventions used in Fortran 95 interface are as follows:
• The names of parameters used in Fortran 95 interface are typically the same as those used
for the respective generic (FORTRAN 77) interface. In rare cases formal argument names
may be different. Note that these changes of the formal parameter names have no impact
on program semantics and follow the conventions of names unification.
• Some input parameters such as array dimensions are not required in Fortran 95 and are
skipped from the calling sequence. Array dimensions are reconstructed from the user data
that must exactly follow the required array shape.
65
• A parameter can be skipped if its value is completely defined by the presence or absence
of another parameter in the calling sequence, and the restored value is the only meaningful
value for the skipped parameter.
• Parameters specifying the increment values incx and incy are skipped. In most cases their
values are equal to 1. In Fortran 95 an increment with different value can be directly
established in the corresponding parameter.
• Some generic parameters are declared as optional in Fortran 95 interface and may or may
not be present in the calling sequence. A parameter can be declared optional if it satisfies
one of the following conditions:
1. It can take only a few possible values. The default value of such parameter typically is
the first value in the list; all exceptions to this rule are explicitly stated in the routine
description.
2. It has a natural default value.
Optional parameters are given in square brackets in Fortran 95 call syntax.
The particular rules used for reconstructing the values of omitted optional parameters are
specific for each routine and are detailed in the respective “Fortran 95 Notes” subsection at the
end of routine specification section. If this subsection is omitted, the Fortran 95 interface for
the given routine does not differ from the corresponding FORTRAN 77 interface.
Note that this interface is not implemented in the current version of Sparse BLAS Level 2 and
Level 3 routines. Fortran 95 interface for each of these routines is given in the “Interfaces -
Fortran 95” subsection at the end of the respective routine specification section.
Matrix Storage Schemes

Matrix arguments of BLAS routines can use the following storage schemes:
• Full storage: a matrix A is stored in a two-dimensional array a, with the matrix element aij
stored in the array element a(i,j).
• Packed storage scheme allows you to store symmetric, Hermitian, or triangular matrices
more compactly: the upper or lower triangle of the matrix is packed by columns in a
one-dimensional array.
• Band storage: a band matrix is stored compactly in a two-dimensional array: columns of
the matrix are stored in the corresponding columns of the array, and diagonals of the matrix
are stored in rows of the array.
For more information on matrix storage schemes, see “Matrix Arguments” in Appendix B.
66
BLAS Level 1 Routines

BLAS Level 1 includes routines and functions, which perform vector-vector operations. Table
2-1 lists the BLAS Level 1 routine and function groups and the data types associated with
them.
Table 2-1 BLAS Level 1 Routine Groups and Their Data Types
Routine or Data Types Description
Function Group
?asum s, d, sc, dz Sum of vector magnitudes (functions)
?axpy s, d, c, z Scalar-vector product (routines)
?copy s, d, c, z Copy vector (routines)
?dot s, d Dot product (functions)
?sdot sd, d Dot product with extended precision (functions)
?dotc c, z Dot product conjugated (functions)
?dotu c, z Dot product unconjugated (functions)
?nrm2 s, d, sc, dz Vector 2-norm (Euclidean norm) (functions)
?rot s, d, cs, zd Plane rotation of points (routines)
?rotg s, d, c, z Givens rotation of points (routines)
?rotm s, d Modified plane rotation of points
?rotmg s, d Givens modified plane rotation of points
?scal s, d, c, z, cs, zd Vector-scalar product (routines)
?swap s, d, c, z Vector-vector swap (routines)
i?amax s, d, c, z Index of the maximum absolute value element of

a vector (functions)
i?amin s, d, c, z Index of the minimum absolute value element of

a vector (functions)
67
Routine or Data Types Description

Function Group
dcabs1 d Absolute value of a double precision complex

number
?asum
Computes the sum of magnitudes of the vector
elements.
Syntax
FORTRAN 77:
res = sasum(n, x, incx)
res = scasum(n, x, incx)
res = dasum(n, x, incx)
res = dzasum(n, x, incx)
Fortran 95:
res = asum(x)
Description
This routine is declared in mkl_blas.fi for FORTRAN 77 interface, in mkl_blas.f90 for Fortran
95 interface, and in mkl_blas.h for C interface.
The ?asum routine computes the sum of the magnitudes of elements of a real vector, or the
sum of magnitudes of the real and imaginary parts of elements of a complex vector:
res = |Re x(1)| + |Im x(1)| + |Re x(2)| + |Im x(2)|+ ... + |Re x(n)| + |Im x(n)|,
where x is a vector with a number of elements that equals n.
Input Parameters
n INTEGER. Specifies the number of elements in vector x.
x REAL for sasum
DOUBLE PRECISION for dasum
COMPLEX for scasum
DOUBLE COMPLEX for dzasum
68
Array, DIMENSION at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for indexing vector x.
Output Parameters
res REAL for sasum
DOUBLE PRECISION for dasum
REAL for scasum
DOUBLE PRECISION for dzasum
Contains the sum of magnitudes of real and imaginary parts
of all elements of the vector.
Fortran 95 Interface Notes

Routines in Fortran 95 interface have fewer arguments in the calling sequence than their
FORTRAN 77 counterparts. For general conventions applied to skip redundant or reconstructible
arguments, see Fortran 95 Interface Conventions.
Specific details for the routine asum interface are the following:
x Holds the array of size n.
?axpy
Computes a vector-scalar product and adds the
result to a vector.
Syntax
FORTRAN 77:
call saxpy(n, a, x, incx, y, incy)
call daxpy(n, a, x, incx, y, incy)
call caxpy(n, a, x, incx, y, incy)
call zaxpy(n, a, x, incx, y, incy)
Fortran 95:
call axpy(x, y [,a])
69
Description
The ?axpy routines perform a vector-vector operation defined as
y := a*x + y
where:
a is a scalar
x and y are vectors each with a number of elements that equals n.
Input Parameters
n INTEGER. Specifies the number of elements in vectors x and
y.
a REAL for saxpy
COMPLEX for caxpy
DOUBLE COMPLEX for zaxpy
Specifies the scalar a.
x REAL for saxpy
COMPLEX for caxpy
incx INTEGER. Specifies the increment for the elements of x.
y REAL for saxpy
COMPLEX for caxpy
Array, DIMENSION at least (1 + (n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.
Output Parameters
y Contains the updated vector y.
70
Specific details for the routine axpy interface are the following:
x Holds the array of size n.
y Holds the array of size n.
a The default value is 1.
?copy
Copies vector to another vector.
Syntax
FORTRAN 77:
call scopy(n, x, incx, y, incy)
call dcopy(n, x, incx, y, incy)
call ccopy(n, x, incx, y, incy)
call zcopy(n, x, incx, y, incy)
Fortran 95:
call copy(x, y)
Description
The ?copy routines perform a vector-vector operation defined as
y = x,
where x and y are vectors.
71
Input Parameters
y.
x REAL for scopy
DOUBLE PRECISION for dcopy
COMPLEX for ccopy
DOUBLE COMPLEX for zcopy
y REAL for scopy
DOUBLE PRECISION for dcopy
COMPLEX for ccopy
DOUBLE COMPLEX for zcopy
Output Parameters
y Contains a copy of the vector x if n is positive. Otherwise,
parameters are unaltered.

Specific details for the routine copy interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.
72
?dot
Computes a vector-vector dot product.
Syntax
FORTRAN 77:
res = sdot(n, x, incx, y, incy)
res = ddot(n, x, incx, y, incy)
Fortran 95:
res = dot(x, y)
Description
The ?dot routines perform a vector-vector reduction operation defined as
res = ∑(x*y),
where x and y are vectors.
Input Parameters
y.
x REAL for sdot
DOUBLE PRECISION for ddot
Array, DIMENSION at least (1+(n-1)*abs(incx)).
y REAL for sdot
Array, DIMENSION at least (1+(n-1)*abs(incy)).
Output Parameters
res REAL for sdot
73

Contains the result of the dot product of x and y, if n is
positive. Otherwise, res contains 0.

Specific details for the routine dot interface are the following:
?sdot
Computes a vector-vector dot product with
extended precision.
Syntax
FORTRAN 77:
res = sdsdot(n, sb, sx, incx, sy, incy)
res = dsdot(n, sx, incx, sy, incy)
Fortran 95:
res = sdot(sx, sy)
res = sdot(sx, sy, sb)
Description
The ?sdot routines compute the inner product of two vectors with extended precision. Both
routines use extended precision accumulation of the intermediate results, but the sdsdot
routine outputs the final result in single precision, whereas the dsdot routine outputs the double
precision result. The function sdsdot also adds scalar value sb to the inner product.
74
Input Parameters
n INTEGER. Specifies the number of elements in the input
vectors sx and sy.
sb REAL. Single precision scalar to be added to inner product
(for the function sdsdot only).
sx, sy REAL.
Arrays, DIMENSION at least (1+(n -1)*abs(incx)) and
(1+(n-1)*abs(incy)), respectively. Contain the input
single precision vectors.
incx INTEGER. Specifies the increment for the elements of sx.
incy INTEGER. Specifies the increment for the elements of sy.
Output Parameters
res REAL for sdsdot
DOUBLE PRECISION for dsdot
Contains the result of the dot product of sx and sy (with sb
added for sdsdot), if n is positive. Otherwise, res contains
sb for sdsdot and 0 for dsdot.

Specific details for the routine sdot interface are the following:
sx Holds the vector with the number of elements n.
sy Holds the vector with the number of elements n.
NOTE. Note that scalar parameter sb is declared as a required parameter in Fortran 95

interface for the function sdot to distinguish between function flavors that output final
result in different precision.
75
?dotc
Computes a dot product of a conjugated vector
with another vector.
Syntax
FORTRAN 77:
res = cdotc(n, x, incx, y, incy)
res = zdotc(n, x, incx, y, incy)
Fortran 95:
res = dotc(x, y)
Description
The ?dotc routines perform a vector-vector operation defined as
res = ∑(conjg(x)*y)
where x and y are n-element vectors.
Input Parameters
y.
x COMPLEX for cdotc
DOUBLE COMPLEX for zdotc
Array, DIMENSION at least (1 + (n -1)*abs(incx)).
y COMPLEX for cdotc
Array, DIMENSION at least (1 + (n -1)*abs(incy)).
76
Output Parameters
res COMPLEX for cdotc
Contains the result of the dot product of the conjugated x
and unconjugated y, if n is positive. Otherwise, res contains
0.

Specific details for the routine dotc interface are the following:
?dotu
Computes a vector-vector dot product.
Syntax
FORTRAN 77:
res = cdotu(n, x, incx, y, incy)
res = zdotu(n, x, incx, y, incy)
Fortran 95:
res = dotu(x, y)
Description
The ?dotu routines perform a vector-vector reduction operation defined as
res = ∑(x*y)
where x and y are n-element complex vectors.
77
Input Parameters
y.
x COMPLEX for cdotu
DOUBLE COMPLEX for zdotu
y COMPLEX for cdotu
Output Parameters
res COMPLEX for cdotu
Contains the result of the dot product of x and y, if n is

Specific details for the routine dotu interface are the following:
78
?nrm2
Computes the Euclidean norm of a vector.
Syntax
FORTRAN 77:
res = snrm2(n, x, incx)
res = dnrm2(n, x, incx)
res = scnrm2(n, x, incx)
res = dznrm2(n, x, incx)
Fortran 95:
res = nrm2(x)
Description
The ?nrm2 routines perform a vector reduction operation defined as
res = ||x||,
where:
x is a vector,
res is a value containing the Euclidean norm of the elements of x.
Input Parameters
x REAL for snrm2
DOUBLE PRECISION for dnrm2
COMPLEX for scnrm2
DOUBLE COMPLEX for dznrm2
Array, DIMENSION at least (1 + (n -1)*abs (incx)).
79
Output Parameters
res REAL for snrm2
DOUBLE PRECISION for dnrm2
REAL for scnrm2
DOUBLE PRECISION for dznrm2
Contains the Euclidean norm of the vector x.

Specific details for the routine nrm2 interface are the following:
?rot
Performs rotation of points in the plane.
Syntax
FORTRAN 77:
call srot(n, x, incx, y, incy, c, s)
call drot(n, x, incx, y, incy, c, s)
call csrot(n, x, incx, y, incy, c, s)
call zdrot(n, x, incx, y, incy, c, s)
Fortran 95:
call rot(x, y, c, s)
Description
80
Given two complex vectors x and y, each vector element of these vectors is replaced as follows:
x(i) = c*x(i) + s*y(i)
y(i) = c*y(i) - s*x(i)
Input Parameters
y.
x REAL for srot
DOUBLE PRECISION for drot
COMPLEX for csrot
DOUBLE COMPLEX for zdrot
y REAL for srot
COMPLEX for csrot
DOUBLE COMPLEX for zdrot
c REAL for srot
REAL for csrot
DOUBLE PRECISION for zdrot
A scalar.
s REAL for srot
REAL for csrot
DOUBLE PRECISION for zdrot
A scalar.
Output Parameters
x Each element is replaced by c*x + s*y.
y Each element is replaced by c*y - s*x.
81

Specific details for the routine rot interface are the following:
?rotg
Computes the parameters for a Givens rotation.
Syntax
FORTRAN 77:
call srotg(a, b, c, s)
call drotg(a, b, c, s)
call crotg(a, b, c, s)
call zrotg(a, b, c, s)
Fortran 95:
call rotg(a, b, c, s)
Description
Given the Cartesian coordinates (a, b) of a point p, these routines return the parameters a,
b, c, and s associated with the Givens rotation that zeros the y-coordinate of the point.
See a more accurate LAPACK version ?lartg.
Input Parameters
a REAL for srotg
DOUBLE PRECISION for drotg
COMPLEX for crotg
82
DOUBLE COMPLEX for zrotg
Provides the x-coordinate of the point p.
b REAL for srotg
COMPLEX for crotg
Provides the y-coordinate of the point p.
Output Parameters
a Contains the parameter r associated with the Givens
rotation.
b Contains the parameter z associated with the Givens
rotation.
c REAL for srotg
REAL for crotg
DOUBLE PRECISION for zrotg
Contains the parameter c associated with the Givens
rotation.
s REAL for srotg
COMPLEX for crotg
Contains the parameter s associated with the Givens
rotation.
?rotm
Performs rotation of points in the modified plane.
Syntax
FORTRAN 77:
call srotm(n, x, incx, y, incy, param)
call drotm(n, x, incx, y, incy, param)
83
Fortran 95:
call rotm(x, y, param)
Description
Given two complex vectors x and y, each vector element of these vectors is replaced as follows:
x(i) = H*x(i) + H*y(i)
y(i) = H*y(i) - H*x(i)
where H is a modified Givens transformation matrix whose values are stored in the param(2)
through param(5) array. See discussion on the param argument.
Input Parameters
y.
x REAL for srotm
DOUBLE PRECISION for drotm
y REAL for srotm
param REAL for srotm
Array, DIMENSION 5.
The elements of the param array are:
param(1) contains a switch, flag. param(2-5) contain h11,
h21, h12, and h22, respectively, the components of the array
H.
Depending on the values of flag, the components of H are
set as follows:
84
In the last three cases, the matrix entries of 1., -1., and 0.
are assumed based on the value of flag and are not
required to be set in the param vector.
Output Parameters
x Each element is replaced by h11*x + h12*y.
y Each element is replaced by h21*x + h22*y.

Specific details for the routine rotm interface are the following:
85
?rotmg
Computes the parameters for a modified Givens
rotation.
Syntax
FORTRAN 77:
call srotmg(d1, d2, x1, y1, param)
call drotmg(d1, d2, x1, y1, param)
Fortran 95:
call rotmg(d1, d2, x1, y1, param)
Description
Given Cartesian coordinates (x1, y1) of an input vector, these routines compute the components
of a modified Givens transformation matrix H that zeros the y-component of the resulting vector:
Input Parameters
d1 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the scaling factor for the x-coordinate of the input
vector.
d2 REAL for srotmg
Provides the scaling factor for the y-coordinate of the input
vector.
x1 REAL for srotmg
86
Provides the x-coordinate of the input vector.
y1 REAL for srotmg
Provides the y-coordinate of the input vector.
Output Parameters
d1 REAL for srotmg
Provides the first diagonal element of the updated matrix.
d2 REAL for srotmg
Provides the second diagonal element of the updated matrix.
x1 REAL for srotmg
Provides the x-coordinate of the rotated vector before
scaling.
param REAL for srotmg
Array, DIMENSION 5.
The elements of the param array are:
param(1) contains a switch, flag. param(2-5) contain h11,
h21, h12, and h22, respectively, the components of the array
H.
Depending on the values of flag, the components of H are
set as follows:
87
In the last three cases, the matrix entries of 1., -1., and 0.
are assumed based on the value of flag and are not
required to be set in the param vector.
?scal
Computes the product of a vector by a scalar.
Syntax
FORTRAN 77:
call sscal(n, a, x, incx)
call dscal(n, a, x, incx)
call cscal(n, a, x, incx)
call zscal(n, a, x, incx)
call csscal(n, a, x, incx)
call zdscal(n, a, x, incx)
Fortran 95:
call scal(x, a)
Description
The ?scal routines perform a vector operation defined as
x = a*x
88
where:
a is a scalar, x is an n-element vector.
Input Parameters
a REAL for sscal and csscal
DOUBLE PRECISION for dscal and zdscal
COMPLEX for cscal
DOUBLE COMPLEX for zscal
x REAL for sscal
DOUBLE PRECISION for dscal
COMPLEX for cscal and csscal
DOUBLE COMPLEX for zscal and zdscal
Output Parameters
x Updated vector x.

Specific details for the routine scal interface are the following:
89
?swap
Swaps a vector with another vector.
Syntax
FORTRAN 77:
call sswap(n, x, incx, y, incy)
call dswap(n, x, incx, y, incy)
call cswap(n, x, incx, y, incy)
call zswap(n, x, incx, y, incy)
Fortran 95:
call swap(x, y)
Description
Given two vectors x and y, the ?swap routines return vectors y and x swapped, each replacing
the other.
Input Parameters
y.
x REAL for sswap
DOUBLE PRECISION for dswap
COMPLEX for cswap
DOUBLE COMPLEX for zswap
y REAL for sswap
DOUBLE PRECISION for dswap
COMPLEX for cswap
DOUBLE COMPLEX for zswap
90
Output Parameters
x Contains the resultant vector x, that is, the input vector y.
y Contains the resultant vector y, that is, the input vector x.

Specific details for the routine swap interface are the following:
i?amax
Finds the index of the element with maximum
absolute value.
Syntax
FORTRAN 77:
index = isamax(n, x, incx)
index = idamax(n, x, incx)
index = icamax(n, x, incx)
index = izamax(n, x, incx)
Fortran 95:
index = iamax(x)
Description
91
Given a vector x, the i?amax routines return the position of the vector element x(i) that has
the largest absolute value for real flavors, or the largest sum |Re(x(i))|+|Im(x(i))| for
complex flavors.
If n is not positive, 0 is returned.
If more than one vector element is found with the same largest absolute value, the index of
the first one encountered is returned.
Input Parameters
x REAL for isamax
DOUBLE PRECISION for idamax
COMPLEX for icamax
DOUBLE COMPLEX for izamax
Output Parameters
index INTEGER. Contains the position of vector element x that has
the largest absolute value.

Specific details for the routine iamax interface are the following:
92
i?amin
Finds the index of the element with the smallest
absolute value.
Syntax
FORTRAN 77:
index = isamin(n, x, incx)
index = idamin(n, x, incx)
index = icamin(n, x, incx)
index = izamin(n, x, incx)
Fortran 95:
index = iamin(x)
Description
Given a vector x, the i?amin routines return the position of the vector element x(i) that has
the smallest absolute value for real dlavors, or the smallest sum |Re(x(i))|+|Im(x(i))| for
complex flavors.
If n is not positive, 0 is returned.
If more than one vector element is found with the same smallest absolute value, the index of
the first one encountered is returned.
Input Parameters
n INTEGER. On entry, n specifies the number of elements in
vector x.
x REAL for isamin
DOUBLE PRECISION for idamin
COMPLEX for icamin
DOUBLE COMPLEX for izamin
93
Output Parameters
index INTEGER. Contains the position of vector element x that has
the smallest absolute value.

Specific details for the routine iamin interface are the following:
dcabs1
Computes absolute value of double complex
number.
Syntax
FORTRAN 77:
res = dcabs1(z)
Fortran 95:
res = dcabs1(z)
Description
The dcabs1 is an auxiliary routine for a few BLAS Level 1 routines. This routine performs an
operation defined as
res=|Re(z)|+|Im(z)|,
where z is a scalar and res is a value containing the absolute value of a double complex number
z.
Input Parameters
z DOUBLE COMPLEX scalar.
94
Output Parameters
res DOUBLE PRECISION.
Contains the absolute value of a double complex number z.

This section describes BLAS Level 2 routines, which perform matrix-vector operations. Table
2-2 lists the BLAS Level 2 routine groups and the data types associated with them.
Routine Groups Data Types Description
?gbmv s, d, c, z Matrix-vector product using a general band matrix
?gemv s, d, c, z Matrix-vector product using a general matrix
?ger s, d Rank-1 update of a general matrix
?gerc c, z Rank-1 update of a conjugated general matrix
?geru c, z Rank-1 update of a general matrix, unconjugated
?hbmv c, z Matrix-vector product using a Hermitian band

matrix
?hemv c, z Matrix-vector product using a Hermitian matrix
?her c, z Rank-1 update of a Hermitian matrix
?her2 c, z Rank-2 update of a Hermitian matrix
?hpmv c, z Matrix-vector product using a Hermitian packed

matrix
?hpr c, z Rank-1 update of a Hermitian packed matrix
?hpr2 c, z Rank-2 update of a Hermitian packed matrix
?sbmv s, d Matrix-vector product using symmetric band matrix
?spmv s, d Matrix-vector product using a symmetric packed

matrix
95
Routine Groups Data Types Description
?spr s, d Rank-1 update of a symmetric packed matrix
?spr2 s, d Rank-2 update of a symmetric packed matrix
?symv s, d Matrix-vector product using a symmetric matrix
?syr s, d Rank-1 update of a symmetric matrix
?syr2 s, d Rank-2 update of a symmetric matrix
?tbmv s, d, c, z Matrix-vector product using a triangular band

matrix
?tbsv s, d, c, z Solution of a linear system of equations with a

triangular band matrix
?tpmv s, d, c, z Matrix-vector product using a triangular packed

matrix
?tpsv s, d, c, z Solution of a linear system of equations with a

triangular packed matrix
?trmv s, d, c, z Matrix-vector product using a triangular matrix
?trsv s, d, c, z Solution of a linear system of equations with a

triangular matrix
?gbmv
Computes a matrix-vector product using a general
band matrix
Syntax
FORTRAN 77:
call sgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy )
call dgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy )
call cgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy )
call zgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy )
96
Fortran 95:
call gbmv(a, x, y [,kl] [,m] [,alpha] [,beta] [,trans])
Description
The ?gbmv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*A'*x + beta*y,
or
y := alpha *conjg(A')*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-n band matrix, with kl sub-diagonals and ku super-diagonals.
Input Parameters
trans CHARACTER*1. Specifies the operation:
If trans= 'N' or 'n', then y := alpha*A*x + beta*y
If trans= 'T' or 't', then y := alpha*A'*x + beta*y
If trans= 'C' or 'c', then y := alpha *conjg(A')*x
+ beta*y
m INTEGER. Specifies the number of rows of the matrix A.
The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix A.
The value of n must be at least zero.
kl INTEGER. Specifies the number of sub-diagonals of the
matrix A.
The value of kl must satisfy 0 ≤ kl.
ku INTEGER. Specifies the number of super-diagonals of the
matrix A.
97
The value of ku must satisfy 0 ≤ ku.

alpha REAL for sgbmv
DOUBLE PRECISION for dgbmv
COMPLEX for cgbmv
DOUBLE COMPLEX for zgbmv
Specifies the scalar alpha.
a REAL for sgbmv
COMPLEX for cgbmv
Array, DIMENSION (lda, n).
Before entry, the leading (kl + ku + 1) by n part of the
array a must contain the matrix of coefficients. This matrix
must be supplied column-by-column, with the leading
diagonal of the matrix in row (ku + 1) of the array, the
first super-diagonal starting at position 2 in row ku, the first
sub-diagonal starting at position 1 in row (ku + 2), and
so on. Elements in the array a that do not correspond to
elements in the band matrix (such as the top left ku by ku
triangle) are not referenced.
The following program segment transfers a band matrix
from conventional full matrix storage to band storage:
do 20, j = 1, n
k = ku + 1 - j
do 10, i = max(1, j-ku), min(m, j+kl)
a(k+i, j) = matrix(i,j)
10 continue
20 continue
lda INTEGER. Specifies the first dimension of a as declared in

the calling (sub)program. The value of lda must be at least
(kl + ku + 1).
x REAL for sgbmv
COMPLEX for cgbmv
98
Array, DIMENSION at least (1 + (n - 1)*abs(incx))
when trans = 'N' or 'n', and at least (1 + (m -
1)*abs(incx)) otherwise. Before entry, the array x must
contain the vector x.
incx must not be zero.
beta REAL for sgbmv
COMPLEX for cgbmv
Specifies the scalar beta. When beta is equal to zero, then
y need not be set on input.
y REAL for sgbmv
COMPLEX for cgbmv
Array, DIMENSION at least (1 +(m - 1)*abs(incy)) when
trans = 'N' or 'n' and at least (1 +(n -
1)*abs(incy)) otherwise. Before entry, the incremented
array y must contain the vector y.
The value of incy must not be zero.
Output Parameters
y Updated vector y.

Specific details for the routine gbmv interface are the following:
a Holds the array a of size (kl+ku+1, n). Contains a banded matrix
m*nwith kl lower diagonal and ku upper diagonal.
x Holds the vector with the number of elements rx, where rx = n
if trans = 'N',rx = m otherwise.
99
y Holds the vector with the number of elements ry, where ry = m

if trans = 'N',ry = n otherwise.
trans Must be 'N', 'C', or 'T'.
The default value is 'N'.
kl If omitted, assumed kl = ku, that is, the number of lower
diagonals equals the number of the upper diagonals.
ku Restored as ku = lda-kl-1, where lda is the first dimension of
matrix A.
m If omitted, assumed m = n, that is, a square matrix.
alpha The default value is 1.
beta The default value is 1.
?gemv
Computes a matrix-vector product using a general
matrix
Syntax
FORTRAN 77:
call sgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call dgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call cgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call zgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
Fortran 95:
call gemv(a, x, y [,alpha][,beta] [,trans])
Description
The ?gemv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
100
or
or
y := alpha*conjg(A')*x + beta*y,
where:
A is an m-by-n matrix.
Input Parameters
if trans= 'N' or 'n', then y := alpha*A*x + beta*y;
if trans= 'T' or 't', then y := alpha*A'*x + beta*y;
if trans= 'C' or 'c', then y := alpha *conjg(A')*x
+ beta*y.
m INTEGER. Specifies the number of rows of the matrix A. The
value of m must be at least zero.
alpha REAL for sgemv
DOUBLE PRECISION for dgemv
COMPLEX for cgemv
DOUBLE COMPLEX for zgemv
a REAL for sgemv
COMPLEX for cgemv
Array, DIMENSION (lda, n). Before entry, the leading
m-by-n part of the array a must contain the matrix of
coefficients.
max(1, m).
101
x REAL for sgemv

COMPLEX for cgemv
Array, DIMENSION at least (1+(n-1)*abs(incx)) when
trans = 'N' or 'n' and at least (1+(m - 1)*abs(incx))
otherwise. Before entry, the incremented array x must
The value of incx must not be zero.
beta REAL for sgemv
COMPLEX for cgemv
Specifies the scalar beta. When beta is set to zero, then y
need not be set on input.
y REAL for sgemv
COMPLEX for cgemv
Array, DIMENSION at least (1 +(m - 1)*abs(incy)) when
trans = 'N' or 'n' and at least (1 +(n -
1)*abs(incy)) otherwise. Before entry with non-zero beta,
the incremented array y must contain the vector y.
Output Parameters
y Updated vector y.

Specific details for the routine gemv interface are the following:
a Holds the matrix A of size (m,n).
102
x Holds the vector with the number of elements rx where rx = n
if trans = 'N', rx = m otherwise.
y Holds the vector with the number of elements ry where ry = m
if trans = 'N', ry = n otherwise.
?ger
Performs a rank-1 update of a general matrix.
Syntax
FORTRAN 77:
call sger(m, n, alpha, x, incx, y, incy, a, lda)
call dger(m, n, alpha, x, incx, y, incy, a, lda)
Fortran 95:
call ger(a, x, y [,alpha])
Description
The ?ger routines perform a matrix-vector operation defined as
A := alpha*x*y'+ A,
where:
alpha is a scalar,
x is an m-element vector,
y is an n-element vector,
A is an m-by-n general matrix.
103
Input Parameters
alpha REAL for sger
DOUBLE PRECISION for dger
x REAL for sger
Array, DIMENSION at least (1 + (m - 1)*abs(incx)).
Before entry, the incremented array x must contain the
m-element vector x.
y REAL for sger
Array, DIMENSION at least (1 + (n - 1)*abs(incy)).
Before entry, the incremented array y must contain the
n-element vector y.
a REAL for sger
Before entry, the leading m-by-n part of the array a must
contain the matrix of coefficients.
max(1, m).
Output Parameters
a Overwritten by the updated matrix.
104
Specific details for the routine ger interface are the following:
x Holds the vector with the number of elements m.
?gerc
Performs a rank-1 update (conjugated) of a general
matrix.
Syntax
FORTRAN 77:
call cgerc(m, n, alpha, x, incx, y, incy, a, lda)
call zgerc(m, n, alpha, x, incx, y, incy, a, lda)
Fortran 95:
call gerc(a, x, y [,alpha])
Description
The ?gerc routines perform a matrix-vector operation defined as
A := alpha*x*conjg(y') + A,
where:
alpha is a scalar,
105
Input Parameters
alpha COMPLEX for cgerc
DOUBLE COMPLEX for zgerc
x COMPLEX for cgerc
m-element vector x.
y COMPLEX for cgerc
n-element vector y.
a COMPLEX for cgerc
max(1, m).
Output Parameters
106
Specific details for the routine gerc interface are the following:
?geru
Performs a rank-1 update (unconjugated) of a
general matrix.
Syntax
FORTRAN 77:
call cgeru(m, n, alpha, x, incx, y, incy, a, lda)
call zgeru(m, n, alpha, x, incx, y, incy, a, lda)
Fortran 95:
call geru(a, x, y [,alpha])
Description
The ?geru routines perform a matrix-vector operation defined as
A := alpha*x*y ' + A,
where:
alpha is a scalar,
107
Input Parameters
alpha COMPLEX for cgeru
DOUBLE COMPLEX for zgeru
x COMPLEX for cgeru
m-element vector x.
y COMPLEX for cgeru
n-element vector y.
a COMPLEX for cgeru
max(1, m).
Output Parameters
108
Specific details for the routine geru interface are the following:
?hbmv
Computes a matrix-vector product using a
Hermitian band matrix.
Syntax
FORTRAN 77:
call chbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call zhbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
Fortran 95:
call hbmv(a, x, y [,uplo][,alpha] [,beta])
Description
The ?hbmv routines perform a matrix-vector operation defined as
where:
x and y are n-element vectors,
A is an n-by-n Hermitian band matrix, with k super-diagonals.
109
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower
triangular part of the Hermitian band matrix A is used:
If uplo = 'U' or 'u', then the upper triangular part of the
matrix A is used.
If uplo = 'L' or 'l', then the low triangular part of the
matrix A is used.
n INTEGER. Specifies the order of the matrix A. The value of
n must be at least zero.
k INTEGER. Specifies the number of super-diagonals of the
matrix A.
The value of k must satisfy 0 ≤ k.
alpha COMPLEX for chbmv
DOUBLE COMPLEX for zhbmv
a COMPLEX for chbmv
Before entry with uplo = 'U' or 'u', the leading (k + 1)
by n part of the array a must contain the upper triangular
band part of the Hermitian matrix. The matrix must be
supplied column-by-column, with the leading diagonal of
the matrix in row (k + 1) of the array, the first
super-diagonal starting at position 2 in row k, and so on.
The top left k by k triangle of the array a is not referenced.
The following program segment transfers the upper
triangular part of a Hermitian band matrix from conventional
full matrix storage to band storage:
do 20, j = 1, n
m = k + 1 - j
do 10, i = max(1, j - k), j
a(m + i, j) = matrix(i, j)
10 continue
20 continue
110
Before entry with uplo = 'L' or 'l', the leading (k + 1)
by n part of the array a must contain the lower triangular
band part of the Hermitian matrix, supplied
column-by-column, with the leading diagonal of the matrix
in row 1 of the array, the first sub-diagonal starting at
position 1 in row 2, and so on. The bottom right k by k
triangle of the array a is not referenced.
The following program segment transfers the lower
triangular part of a Hermitian band matrix from conventional
do 20, j = 1, n
m = 1 - j
do 10, i = j, min( n, j + k )
a( m + i, j ) = matrix( i, j )
10 continue
20 continue
The imaginary parts of the diagonal elements need not be
set and are assumed to be zero.
(k + 1).
x COMPLEX for chbmv
Array, DIMENSION at least (1 + (n - 1)*abs(incx)).
vector x.
beta COMPLEX for chbmv
Specifies the scalar beta.
y COMPLEX for chbmv
111

vector y.
Output Parameters
y Overwritten by the updated vector y.

Specific details for the routine hbmv interface are the following:
a Holds the array a of size (k+1,n).
uplo Must be 'U' or 'L'. The default value is 'U'.
?hemv
Hermitian matrix.
Syntax
FORTRAN 77:
call chemv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call zhemv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
Fortran 95:
call hemv(a, x, y [,uplo][,alpha] [,beta])
112
Description
The ?hemv routines perform a matrix-vector operation defined as
where:
A is an n-by-n Hermitian matrix.
Input Parameters
triangular part of the array a is used.
If uplo = 'U' or 'u', then the upper triangular of the array
a is used.
If uplo = 'L' or 'l', then the low triangular of the array
a is used.
alpha COMPLEX for chemv
DOUBLE COMPLEX for zhemv
a COMPLEX for chemv
Before entry with uplo = 'U' or 'u', the leading n-by-n
upper triangular part of the array a must contain the upper
triangular part of the Hermitian matrix and the strictly lower
triangular part of a is not referenced. Before entry with uplo
= 'L' or 'l', the leading n-by-n lower triangular part of
the array a must contain the lower triangular part of the
Hermitian matrix and the strictly upper triangular part of a
is not referenced.
113

max(1, n).
x COMPLEX for chemv
n-element vector x.
beta COMPLEX for chemv
Specifies the scalar beta. When beta is supplied as zero
then y need not be set on input.
y COMPLEX for chemv
n-element vector y.
Output Parameters

Specific details for the routine hemv interface are the following:
a Holds the matrix A of size (n,n).
114
?her
Performs a rank-1 update of a Hermitian matrix.
Syntax
FORTRAN 77:
call cher(uplo, n, alpha, x, incx, a, lda)
call zher(uplo, n, alpha, x, incx, a, lda)
Fortran 95:
call her(a, x [,uplo] [, alpha])
Description
The ?her routines perform a matrix-vector operation defined as
A := alpha*x*conjg(x') + A,
where:
alpha is a real scalar,
x is an n-element vector,
Input Parameters
a is used.
a is used.
115

alpha REAL for cher
DOUBLE PRECISION for zher
x COMPLEX for cher
DOUBLE COMPLEX for zher
Array, dimension at least (1 + (n - 1)*abs(incx)).
n-element vector x.
a COMPLEX for cher
DOUBLE COMPLEX for zher
triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n
lower triangular part of the array a must contain the lower
triangular part of the Hermitian matrix and the strictly upper
max(1, n).
Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the
array a is overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the
array a is overwritten by the lower triangular part of the
updated matrix.
116
The imaginary parts of the diagonal elements are set to
zero.

Specific details for the routine her interface are the following:
?her2
Performs a rank-2 update of a Hermitian matrix.
Syntax
FORTRAN 77:
call cher2(uplo, n, alpha, x, incx, y, incy, a, lda)
call zher2(uplo, n, alpha, x, incx, y, incy, a, lda)
Fortran 95:
call her2(a, x, y [,uplo][,alpha])
Description
The ?her2 routines perform a matrix-vector operation defined as
A := alpha *x*conjg(y') + conjg(alpha)*y *conjg(x') + A,
where:
alpha is a scalar,
117
Input Parameters
a is used.
a is used.
alpha COMPLEX for cher2
DOUBLE COMPLEX for zher2
x COMPLEX for cher2
n-element vector x.
y COMPLEX for cher2
n-element vector y.
a COMPLEX for cher2
118
max(1, n).
Output Parameters
updated matrix.
updated matrix.
zero.

Specific details for the routine her2 interface are the following:
119
?hpmv
Hermitian packed matrix.
Syntax
FORTRAN 77:
call chpmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call zhpmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
Fortran 95:
call hpmv(ap, x, y [,uplo][,alpha] [,beta])
Description
The ?hpmv routines perform a matrix-vector operation defined as
where:
A is an n-by-n Hermitian matrix, supplied in packed form.
Input Parameters
triangular part of the matrix A is supplied in the packed
array ap.
matrix A is supplied in the packed array ap .
alpha COMPLEX for chpmv
120
DOUBLE COMPLEX for zhpmv
ap COMPLEX for chpmv
Array, DIMENSION at least ((n*(n + 1))/2). Before entry
with uplo = 'U' or 'u', the array ap must contain the
upper triangular part of the Hermitian matrix packed
sequentially, column-by-column, so that ap(1) contains
a(1, 1), ap(2) and ap(3) contain a(1, 2) and a(2, 2)
respectively, and so on. Before entry with uplo = 'L' or
'l', the array ap must contain the lower triangular part of
the Hermitian matrix packed sequentially,
column-by-column, so that ap(1) contains a(1, 1), ap(2)
and ap(3) contain a(2, 1) and a(3, 1) respectively, and
so on.
x COMPLEX for chpmv
DOUBLE PRECISION COMPLEX for zhpmv
Array, DIMENSION at least (1 +(n - 1)*abs(incx)).
n-element vector x.
beta COMPLEX for chpmv
When beta is equal to zero then y need not be set on input.
y COMPLEX for chpmv
n-element vector y.
121
Output Parameters

Specific details for the routine hpmv interface are the following:
ap Holds the array ap of size (n*(n+1)/2).
?hpr
Performs a rank-1 update of a Hermitian packed
matrix.
Syntax
FORTRAN 77:
call chpr(uplo, n, alpha, x, incx, ap)
call zhpr(uplo, n, alpha, x, incx, ap)
Fortran 95:
call hpr(ap, x [,uplo] [, alpha])
Description
The ?hpr routines perform a matrix-vector operation defined as
A := alpha*x*conjg(x') + A,
122
where:
Input Parameters
array ap.
If uplo = 'U' or 'u', the upper triangular part of the
If uplo = 'L' or 'l', the low triangular part of the matrix
A is supplied in the packed array ap .
alpha REAL for chpr
DOUBLE PRECISION for zhpr
x COMPLEX for chpr
DOUBLE COMPLEX for zhpr
n-element vector x.
ap COMPLEX for chpr
DOUBLE COMPLEX for zhpr
a(1, 1), ap(2) and ap(3) contain a(1, 2) and a(2, 2)
respectively, and so on.
123
Before entry with uplo = 'L' or 'l', the array ap must

contain the lower triangular part of the Hermitian matrix
packed sequentially, column-by-column, so that ap(1)
contains a(1, 1), ap(2) and ap(3) contain a(2, 1) and
a(3, 1) respectively, and so on.
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper
triangular part of the updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular
part of the updated matrix.
zero.

Specific details for the routine hpr interface are the following:
?hpr2
Performs a rank-2 update of a Hermitian packed
matrix.
Syntax
FORTRAN 77:
call chpr2(uplo, n, alpha, x, incx, y, incy, ap)
call zhpr2(uplo, n, alpha, x, incx, y, incy, ap)
124
Fortran 95:
call hpr2(ap, x, y [,uplo][,alpha])
Description
The ?hpr2 routines perform a matrix-vector operation defined as
A := alpha*x*conjg(y') + conjg(alpha)*y*conjg(x') + A,
where:
alpha is a scalar,
Input Parameters
array ap.
alpha COMPLEX for chpr2
DOUBLE COMPLEX for zhpr2
x COMPLEX for chpr2
Array, dimension at least (1 +(n - 1)*abs(incx)). Before
entry, the incremented array x must contain the n-element
vector x.
125
y COMPLEX for chpr2

Array, DIMENSION at least (1 +(n - 1)*abs(incy)).
n-element vector y.
ap COMPLEX for chpr2
a(1,1), ap(2) and ap(3) contain a(1,2) and a(2,2)
contain the lower triangular part of the Hermitian matrix
contains a(1,1), ap(2) and ap(3) contain a(2,1) and
a(3,1) respectively, and so on.
Output Parameters
The imaginary parts of the diagonal elements need are set
to zero.

Specific details for the routine hpr2 interface are the following:
126
?sbmv
symmetric band matrix.
Syntax
FORTRAN 77:
call ssbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call dsbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
Fortran 95:
call sbmv(a, x, y [,uplo][,alpha] [,beta])
Description
The ?sbmv routines perform a matrix-vector operation defined as
where:
A is an n-by-n symmetric band matrix, with k super-diagonals.
Input Parameters
triangular part of the band matrix A is used:
if uplo = 'U' or 'u' - upper triangular part;
127
if uplo = 'L' or 'l' - low triangular part.

k INTEGER. Specifies the number of super-diagonals of the
matrix A.
alpha REAL for ssbmv
DOUBLE PRECISION for dsbmv
a REAL for ssbmv
Array, DIMENSION (lda, n). Before entry with uplo =
'U' or 'u', the leading (k + 1) by n part of the array a
must contain the upper triangular band part of the
symmetric matrix, supplied column-by-column, with the
leading diagonal of the matrix in row (k + 1) of the array,
the first super-diagonal starting at position 2 in row k, and
so on. The top left k by k triangle of the array a is not
referenced.
The following program segment transfers the upper
triangular part of a symmetric band matrix from conventional
do 20, j = 1, n
m = k + 1 - j
do 10, i = max( 1, j - k ), j
10 continue
20 continue
band part of the symmetric matrix, supplied
128
The following program segment transfers the lower
triangular part of a symmetric band matrix from conventional
do 20, j = 1, n
m = 1 - j
do 10, i = j, min( n, j + k )
10 continue
20 continue

(k + 1).
x REAL for ssbmv
vector x.
beta REAL for ssbmv
y REAL for ssbmv
vector y.
Output Parameters
129

Specific details for the routine sbmv interface are the following:
?spmv
symmetric packed matrix.
Syntax
FORTRAN 77:
call sspmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call dspmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
Fortran 95:
call spmv(ap, x, y [,uplo][,alpha] [,beta])
Description
The ?spmv routines perform a matrix-vector operation defined as
where:
130
A is an n-by-n symmetric matrix, supplied in packed form.
Input Parameters
array ap.
n INTEGER. Specifies the order of the matrix a. The value of
alpha REAL for sspmv
DOUBLE PRECISION for dspmv
ap REAL for sspmv
Array, DIMENSION at least ((n*(n + 1))/2).
Before entry with uplo = 'U' or 'u', the array ap must
contain the upper triangular part of the symmetric matrix
a(2, 2) respectively, and so on. Before entry with uplo
= 'L' or 'l', the array ap must contain the lower triangular
part of the symmetric matrix packed sequentially,
column-by-column, so that ap(1) contains a(1,1), ap(2)
and ap(3) contain a(2,1) and a(3,1) respectively, and
so on.
x REAL for sspmv
n-element vector x.
131
beta REAL for sspmv

When beta is supplied as zero, then y need not be set on
input.
y REAL for sspmv
n-element vector y.
Output Parameters

Specific details for the routine spmv interface are the following:
132
?spr
Performs a rank-1 update of a symmetric packed
matrix.
Syntax
FORTRAN 77:
call sspr(uplo, n, alpha, x, incx, ap)
call dspr(uplo, n, alpha, x, incx, ap)
Fortran 95:
call spr(ap, x [,uplo] [, alpha])
Description
The ?spr routines perform a matrix-vector operation defined as
a:= alpha*x*x'+ A,
where:
Input Parameters
array ap.
alpha REAL for sspr
133
DOUBLE PRECISION for dspr

x REAL for sspr
n-element vector x.
ap REAL for sspr
upper triangular part of the symmetric matrix packed
contain the lower triangular part of the symmetric matrix
Output Parameters

Specific details for the routine spr interface are the following:
134
?spr2
Performs a rank-2 update of a symmetric packed
matrix.
Syntax
FORTRAN 77:
call sspr2(uplo, n, alpha, x, incx, y, incy, ap)
call dspr2(uplo, n, alpha, x, incx, y, incy, ap)
Fortran 95:
call spr2(ap, x, y [,uplo][,alpha])
Description
The ?spr2 routines perform a matrix-vector operation defined as
A:= alpha*x*y'+ alpha*y*x' + A,
where:
alpha is a scalar,
Input Parameters
array ap.
135

alpha REAL for sspr2
DOUBLE PRECISION for dspr2
x REAL for sspr2
n-element vector x.
y REAL for sspr2
n-element vector y.
incy INTEGER. Specifies the increment for the elements of y. The
value of incy must not be zero.
ap REAL for sspr2
contains a(1,1), ap(2) and ap(3) contain a (2,1) and
Output Parameters
136

Specific details for the routine spr2 interface are the following:
?symv
Computes a matrix-vector product for a symmetric
matrix.
Syntax
FORTRAN 77:
call ssymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call dsymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
Fortran 95:
call symv(a, x, y [,uplo][,alpha] [,beta])
Description
The ?symv routines perform a matrix-vector operation defined as
where:
137
A is an n-by-n symmetric matrix.
Input Parameters
array a is used.
array a is used.
alpha REAL for ssymv
DOUBLE PRECISION for dsymv
a REAL for ssymv
triangular part of the symmetric matrix A and the strictly
lower triangular part of a is not referenced. Before entry
with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array a must contain the lower triangular part
of the symmetric matrix A and the strictly upper triangular
part of a is not referenced.
max(1, n).
x REAL for ssymv
n-element vector x.
138
beta REAL for ssymv
When beta is supplied as zero, then y need not be set on
input.
y REAL for ssymv
n-element vector y.
Output Parameters

Specific details for the routine symv interface are the following:
139
?syr
Performs a rank-1 update of a symmetric matrix.
Syntax
FORTRAN 77:
call ssyr(uplo, n, alpha, x, incx, a, lda)
call dsyr(uplo, n, alpha, x, incx, a, lda)
Fortran 95:
call syr(a, x [,uplo] [, alpha])
Description
The ?syr routines perform a matrix-vector operation defined as
A := alpha*x*x' + A ,
where:
Input Parameters
array a is used.
array a is used.
alpha REAL for ssyr
DOUBLE PRECISION for dsyr
140
x REAL for ssyr
Array, DIMENSION at least (1 + (n-1)*abs(incx)). Before
vector x.
a REAL for ssyr
lower triangular part of a is not referenced.
upper triangular part of a is not referenced.
max(1, n).
Output Parameters
updated matrix.
updated matrix.

Specific details for the routine syr interface are the following:
141

?syr2
Performs a rank-2 update of symmetric matrix.
Syntax
FORTRAN 77:
call ssyr2(uplo, n, alpha, x, incx, y, incy, a, lda)
call dsyr2(uplo, n, alpha, x, incx, y, incy, a, lda)
Fortran 95:
call syr2(a, x, y [,uplo][,alpha])
Description
The ?syr2 routines perform a matrix-vector operation defined as
A := alpha*x*y'+ alpha*y*x' + A,
where:
alpha is a scalar,
Input Parameters
array a is used.
142
array a is used.
alpha REAL for ssyr2
DOUBLE PRECISION for dsyr2
x REAL for ssyr2
n-element vector x.
y REAL for ssyr2
n-element vector y.
a REAL for ssyr2
triangular part of the symmetric matrix and the strictly lower
triangular part of the symmetric matrix and the strictly upper
max(1, n).
143
Output Parameters
updated matrix.
updated matrix.

Specific details for the routine syr2 interface are the following:
x Holds the vector x of length n.
y Holds the vector y of length n.
?tbmv
triangular band matrix.
Syntax
FORTRAN 77:
call stbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call dtbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call ctbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call ztbmv(uplo, trans, diag, n, k, a, lda, x, incx)
Fortran 95:
call tbmv(a, x [,uplo] [, trans] [,diag])
144
Description
The ?tbmv routines perform one of the matrix-vector operations defined as
x := A*x, or x := A'*x, or x := conjg(A')*x,
where:
A is an n-by-n unit, or non-unit, upper or lower triangular band matrix, with (k +1) diagonals.
Input Parameters
uplo CHARACTER*1. Specifies whether the matrix A is an upper
or lower triangular matrix:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is low triangular.
if trans = 'N' or 'n', then x := A*x;
if trans = 'T' or 't', then x := A'*x;
if trans = 'C' or 'c', then x := conjg(A')*x.
diag CHARACTER*1. Specifies whether the matrix A is unit
triangular:
if diag = 'U' or 'u' then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
k INTEGER. On entry with uplo = 'U' or 'u', k specifies the
number of super-diagonals of the matrix A. On entry with
uplo = 'L' or 'l', k specifies the number of sub-diagonals
of the matrix a.
a REAL for stbmv
DOUBLE PRECISION for dtbmv
COMPLEX for ctbmv
DOUBLE COMPLEX for ztbmv
145

band part of the matrix of coefficients, supplied
in row (k + 1) of the array, the first super-diagonal starting
at position 2 in row k, and so on. The top left k by k triangle
of the array a is not referenced. The following program
segment transfers an upper triangular band matrix from
conventional full matrix storage to band storage:
do 20, j = 1, n
m = k + 1 - j
do 10, i = max(1, j - k), j
a(m + i, j) = matrix(i, j)
10 continue
20 continue
in row1 of the array, the first sub-diagonal starting at
triangle of the array a is not referenced. The following
program segment transfers a lower triangular band matrix
from conventional full matrix storage to band storage:
do 20, j = 1, n
m = 1 - j
do 10, i = j, min(n, j + k)
a(m + i, j) = matrix (i, j)
10 continue
20 continue
Note that when diag = 'U' or 'u', the elements of the
array a corresponding to the diagonal elements of the matrix
are not referenced, but are assumed to be unity.
146
(k + 1).
x REAL for stbmv
DOUBLE PRECISION for dtbmv
COMPLEX for ctbmv
DOUBLE COMPLEX for ztbmv
n-element vector x.
Output Parameters
x Overwritten with the transformed vector x.

Specific details for the routine tbmv interface are the following:
diag Must be 'N' or 'U'. The default value is 'N'.
147
?tbsv
Solves a system of linear equations whose
coefficients are in a triangular band matrix.
Syntax
FORTRAN 77:
call stbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call dtbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call ctbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call ztbsv(uplo, trans, diag, n, k, a, lda, x, incx)
Fortran 95:
call tbsv(a, x [,uplo] [, trans] [,diag])
Description
The ?tbsv routines solve one of the following systems of equations:
A*x = b, or A'*x = b, or conjg(A')*x = b,
where:
b and x are n-element vectors,
A is an n-by-n unit, or non-unit, upper or lower triangular band matrix, with (k + 1) diagonals.
The routine does not test for singularity or near-singularity.

Such tests must be performed before calling this routine.
Input Parameters
uplo CHARACTER*1. Specifies whether the matrix A is an upper
or lower triangular matrix:
if uplo = 'U' or 'u' the matrix is upper triangular;
if uplo = 'L' or 'l', the matrix is low triangular.
trans CHARACTER*1. Specifies the system of equations:
148
if trans = 'N' or 'n', then A*x = b;
if trans = 'T' or 't', then A'*x = b;
if trans = 'C' or 'c', then conjg(A')*x = b.
triangular:
k INTEGER. On entry with uplo = 'U' or 'u', k specifies the
number of super-diagonals of the matrix A. On entry with
uplo = 'L' or 'l', k specifies the number of sub-diagonals
of the matrix A.
a REAL for stbsv
DOUBLE PRECISION for dtbsv
COMPLEX for ctbsv
DOUBLE COMPLEX for ztbsv
in row (k + 1) of the array, the first super-diagonal starting
at position 2 in row k, and so on. The top left k by k triangle
of the array a is not referenced.
The following program segment transfers an upper triangular
band matrix from conventional full matrix storage to band
storage:
do 20, j = 1, n
m = k + 1 - j
do 10, i = max(1, j - k), jl
10 continue
20 continue
149

The following program segment transfers a lower triangular
band matrix from conventional full matrix storage to band
storage:
do 20, j = 1, n
m = 1 - j
do 10, i = j, min(n, j + k)
10 continue
20 continue
When diag = 'U' or 'u', the elements of the array a
corresponding to the diagonal elements of the matrix are
not referenced, but are assumed to be unity.
(k + 1).
x REAL for stbsv
DOUBLE PRECISION for dtbsv
COMPLEX for ctbsv
DOUBLE COMPLEX for ztbsv
n-element right-hand side vector b.
Output Parameters
x Overwritten with the solution vector x.
150
Specific details for the routine tbsv interface are the following:
?tpmv
triangular packed matrix.
Syntax
FORTRAN 77:
call stpmv(uplo, trans, diag, n, ap, x, incx)
call dtpmv(uplo, trans, diag, n, ap, x, incx)
call ctpmv(uplo, trans, diag, n, ap, x, incx)
call ztpmv(uplo, trans, diag, n, ap, x, incx)
Fortran 95:
call tpmv(ap, x [,uplo] [, trans] [,diag])
Description
The ?tpmv routines perform one of the matrix-vector operations defined as
where:
151
A is an n-by-n unit, or non-unit, upper or lower triangular matrix, supplied in packed form.
Input Parameters
uplo CHARACTER*1. Specifies whether the matrix A is upper or
lower triangular:
triangular:
ap REAL for stpmv
DOUBLE PRECISION for dtpmv
COMPLEX for ctpmv
DOUBLE COMPLEX for ztpmv
upper triangular matrix packed sequentially,
column-by-column, so that ap(1) contains a(1,1), ap(2)
and ap(3) contain a(1,2) and a(2,2) respectively, and
so on. Before entry with uplo = 'L' or 'l', the array ap
must contain the lower triangular matrix packed
respectively, and so on. When diag = 'U' or 'u', the
diagonal elements of a are not referenced, but are assumed
to be unity.
x REAL for stpmv
DOUBLE PRECISION for dtpmv
152
COMPLEX for ctpmv
DOUBLE COMPLEX for ztpmv
n-element vector x.
Output Parameters

Specific details for the routine tpmv interface are the following:
153
?tpsv
coefficients are in a triangular packed matrix.
Syntax
FORTRAN 77:
call stpsv(uplo, trans, diag, n, ap, x, incx)
call dtpsv(uplo, trans, diag, n, ap, x, incx)
call ctpsv(uplo, trans, diag, n, ap, x, incx)
call ztpsv(uplo, trans, diag, n, ap, x, incx)
Fortran 95:
call tpsv(ap, x [,uplo] [, trans] [,diag])
Description
The ?tpsv routines solve one of the following systems of equations
where:
A is an n-by-n unit, or non-unit, upper or lower triangular matrix, supplied in packed form.
This routine does not test for singularity or near-singularity.

Input Parameters
lower triangular:
trans CHARACTER*1. Specifies the system of equations:
154
if trans = 'C' or 'c', then conjg(A')*x = b.
triangular:
ap REAL for stpsv
DOUBLE PRECISION for dtpsv
COMPLEX for ctpsv
DOUBLE COMPLEX for ztpsv
upper triangular matrix packed sequentially,
column-by-column, so that ap(1) contains a(1, +1), ap(2)
so on.
contain the lower triangular matrix packed sequentially,
column-by-column, so that ap(1) contains a(1, +1), ap(2)
and ap(3) contain a(2, +1) and a(3, +1) respectively,
and so on.
When diag = 'U' or 'u', the diagonal elements of a are
not referenced, but are assumed to be unity.
x REAL for stpsv
DOUBLE PRECISION for dtpsv
COMPLEX for ctpsv
DOUBLE COMPLEX for ztpsv
155
Output Parameters

Specific details for the routine tpsv interface are the following:
?trmv
triangular matrix.
Syntax
FORTRAN 77:
call strmv(uplo, trans, diag, n, a, lda, x, incx)
call dtrmv(uplo, trans, diag, n, a, lda, x, incx)
call ctrmv(uplo, trans, diag, n, a, lda, x, incx)
call ztrmv(uplo, trans, diag, n, a, lda, x, incx)
Fortran 95:
call trmv(a, x [,uplo] [, trans] [,diag])
Description
156
The ?trmv routines perform one of the following matrix-vector operations defined as
where:
A is an n-by-n unit, or non-unit, upper or lower triangular matrix.
Input Parameters
lower triangular:
triangular:
a REAL for strmv
DOUBLE PRECISION for dtrmv
COMPLEX for ctrmv
DOUBLE COMPLEX for ztrmv
Array, DIMENSION (lda,n). Before entry with uplo = 'U'
or 'u', the leading n-by-n upper triangular part of the array
a must contain the upper triangular matrix and the strictly
part of the array a must contain the lower triangular matrix
and the strictly upper triangular part of a is not referenced.
not referenced either, but are assumed to be unity.
157

max(1, n).
x REAL for strmv
DOUBLE PRECISION for dtrmv
COMPLEX for ctrmv
DOUBLE COMPLEX for ztrmv
n-element vector x.
Output Parameters

Specific details for the routine trmv interface are the following:
158
?trsv
coefficients are in a triangular matrix.
Syntax
FORTRAN 77:
call strsv(uplo, trans, diag, n, a, lda, x, incx)
call dtrsv(uplo, trans, diag, n, a, lda, x, incx)
call ctrsv(uplo, trans, diag, n, a, lda, x, incx)
call ztrsv(uplo, trans, diag, n, a, lda, x, incx)
Fortran 95:
call trsv(a, x [,uplo] [, trans] [,diag])
Description
The ?trsv routines solve one of the systems of equations:
where:
A is an n-by-n unit, or non-unit, upper or lower triangular matrix.
The routine does not test for singularity or near-singularity.

Input Parameters
lower triangular:
trans CHARACTER*1. Specifies the systems of equations:
159

if trans = 'C' or 'c', then oconjg(A')*x = b.
triangular:
a REAL for strsv
DOUBLE PRECISION for dtrsv
COMPLEX for ctrsv
DOUBLE COMPLEX for ztrsv
Array, DIMENSION (lda,n). Before entry with uplo = 'U'
a must contain the upper triangular matrix and the strictly
part of the array a must contain the lower triangular matrix
max(1, n).
x REAL for strsv
DOUBLE PRECISION for dtrsv
COMPLEX for ctrsv
DOUBLE COMPLEX for ztrsv
160
Output Parameters

Specific details for the routine trsv interface are the following:
a Holds the matrix a of size (n,n).

BLAS Level 3 routines perform matrix-matrix operations. Table 2-3 lists the BLAS Level 3
routine groups and the data types associated with them.
Routine Group Data Types Description
?gemm s, d, c, z Matrix-matrix product of general matrices
?hemm c, z Matrix-matrix product of Hermitian matrices
?herk c, z Rank-k update of Hermitian matrices
?her2k c, z Rank-2k update of Hermitian matrices
?symm s, d, c, z Matrix-matrix product of symmetric matrices
?syrk s, d, c, z Rank-k update of symmetric matrices
?syr2k s, d, c, z Rank-2k update of symmetric matrices
?trmm s, d, c, z Matrix-matrix product of triangular matrices
161
Routine Group Data Types Description
?trsm s, d, c, z Linear matrix-matrix solution for triangular matrices
Symmetric Multiprocessing Version of Intel® MKL

Many applications spend considerable time for executing BLAS level 3 routines. This time can
be scaled by the number of processors available on the system through using the symmetric
multiprocessing(SMP) feature built into the Intel MKL Library. The performance enhancements
based on the parallel use of the processors are available without any programming effort on
your part.
To enhance performance, the library uses the following methods:
• The operation of BLAS level 3 matrix-matrix functions permits to restructure the code in a
way which increases the localization of data reference, enhances cache memory use, and
reduces the dependency on the memory bus.
• Once the code has been effectively blocked as described above, one of the matrices is
distributed across the processors to be multiplied by the second matrix. Such distribution
ensures effective cache management which reduces the dependency on the memory bus
performance and brings good scaling results.
?gemm
Computes a scalar-matrix-matrix product and adds
the result to a scalar-matrix product.
Syntax
FORTRAN 77:
call sgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call cgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
Fortran 95:
call gemm(a, b, c [,transa][,transb] [,alpha][,beta])
162
Description
The ?gemm routines perform a matrix-matrix operation with general matrices. The operation is
defined as
C := alpha*op(A)*op(B) + beta*C,
where:
op(x) is one of op(x) = x, or op(x) = x', or op(x) = conjg(x'),
A, B and C are matrices:
op(A) is an m-by-k matrix,

op(B) is a k-by-n matrix,
C is an m-by-n matrix.
See also ?gemm3m, BLAS-like extension routines, that use matrix multiplication for similar
matrix-matrix operations.
Input Parameters
transa CHARACTER*1. Specifies the form of op(A) used in the
matrix multiplication:
if transa = 'N' or 'n', then op(A) = A;
if transa = 'T' or 't', then op(A) = A';
if transa = 'C' or 'c', then op(A) = conjg(A').
transb CHARACTER*1. Specifies the form of op(B) used in the
if transb = 'N' or 'n', then op(B) = B;
if transb = 'T' or 't', then op(B) = B';
if transb = 'C' or 'c', then op(B) = conjg(B').
m INTEGER. Specifies the number of rows of the matrix op(A)
and of the matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix
op(B) and the number of columns of the matrix C.
163
k INTEGER. Specifies the number of columns of the matrix

op(A) and the number of rows of the matrix op(B).
The value of k must be at least zero.
alpha REAL for sgemm
DOUBLE PRECISION for dgemm
COMPLEX for cgemm
DOUBLE COMPLEX for zgemm
a REAL for sgemm
COMPLEX for cgemm
Array, DIMENSION (lda, ka), where ka is k when transa=
'N' or 'n', and is m otherwise. Before entry with transa=
'N' or 'n', the leading m-by-k part of the array a must
contain the matrix A, otherwise the leading k-by-m part of
the array a must contain the matrix A.
the calling (sub)program. When transa= 'N' or 'n', then
lda must be at least max(1, m), otherwise lda must be at
least max(1, k).
b REAL for sgemm
COMPLEX for cgemm
Array, DIMENSION (ldb, kb), where kb is n when transb
= 'N' or 'n', and is k otherwise. Before entry with transb
= 'N' or 'n', the leading k-by-n part of the array b must
contain the matrix B, otherwise the leading n-by-k part of
the array b must contain the matrix B.
ldb INTEGER. Specifies the first dimension of b as declared in
the calling (sub)program. When transb = 'N' or 'n', then
ldb must be at least max(1, k), otherwise ldb must be at
least max(1, n).
beta REAL for sgemm
COMPLEX for cgemm
164
When beta is equal to zero, then c need not be set on input.
c REAL for sgemm
COMPLEX for cgemm
Array, DIMENSION (ldc, n).
Before entry, the leading m-by-n part of the array c must
contain the matrix C, except when beta is equal to zero, in
which case c need not be set on entry.
ldc INTEGER. Specifies the first dimension of c as declared in
the calling (sub)program. The value of ldc must be at least
max(1, m).
Output Parameters
c Overwritten by the m-by-n matrix (alpha*op(A)*op(B) +
beta*C).

Specific details for the routine gemm interface are the following:
a Holds the matrix A of size (ma,ka) where
ka = k if transa= 'N',
ka = m otherwise,
ma = m if transa= 'N',
ma = k otherwise.
b Holds the matrix B of size (mb,kb) where
kb = n if transb = 'N',
kb = k otherwise,
mb = k if transb = 'N',
mb = n otherwise.
c Holds the matrix C of size (m,n).
165
transa Must be 'N', 'C', or 'T'.

transb Must be 'N', 'C', or 'T'.
?hemm
Computes a scalar-matrix-matrix product (either
one of the matrices is Hermitian) and adds the
result to scalar-matrix product.
Syntax
FORTRAN 77:
call chemm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call zhemm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
Fortran 95:
call hemm(a, b, c [,side][,uplo] [,alpha][,beta])
Description
The ?hemm routines perform a matrix-matrix operation using Hermitian matrices. The operation
is defined as
C := alpha*A*B + beta*C
or
C := alpha*B*A + beta*C,
where:
A is an Hermitian matrix,
B and C are m-by-n matrices.
166
Input Parameters
side CHARACTER*1. Specifies whether the Hermitian matrix A
appears on the left or right in the operation as follows:
if side = 'L' or 'l', then C := alpha*A*B + beta*C;
if side = 'R' or 'r', then C := alpha*B*A + beta*C.
triangular part of the Hermitian matrix A is used:
Hermitian matrix A is used.
Hermitian matrix A is used.
m INTEGER. Specifies the number of rows of the matrix C.
n INTEGER. Specifies the number of columns of the matrix C.
alpha COMPLEX for chemm
DOUBLE COMPLEX for zhemm
a COMPLEX for chemm
Array, DIMENSION (lda,ka), where ka is m when side =
'L' or 'l' and is n otherwise. Before entry with side =
'L' or 'l', the m-by-m part of the array a must contain the
Hermitian matrix, such that when uplo = 'U' or 'u', the
leading m-by-m upper triangular part of the array a must
contain the upper triangular part of the Hermitian matrix
and the strictly lower triangular part of a is not referenced,
and when uplo = 'L' or 'l', the leading m-by-m lower
triangular part of the array a must contain the lower
triangular part of the Hermitian matrix, and the strictly upper
Before entry with side = 'R' or 'r', the n-by-n part of
the array a must contain the Hermitian matrix, such that
when uplo = 'U' or 'u', the leading n-by-n upper
triangular part of the array a must contain the upper
167
triangular part of a is not referenced, and when uplo =

'L' or 'l', the leading n-by-n lower triangular part of the
array a must contain the lower triangular part of the
Hermitian matrix, and the strictly upper triangular part of
a is not referenced. The imaginary parts of the diagonal
elements need not be set, they are assumed to be zero.
the calling (sub) program. When side = 'L' or 'l' then
least max(1,n).
b COMPLEX for chemm
Array, DIMENSION (ldb,n).
Before entry, the leading m-by-n part of the array b must
contain the matrix B.
the calling (sub)program. The value of ldb must be at least
max(1, m).
beta COMPLEX for chemm
When beta is supplied as zero, then c need not be set on
input.
c COMPLEX for chemm
Array, DIMENSION (c, n). Before entry, the leading m-by-n
part of the array c must contain the matrix C, except when
beta is zero, in which case c need not be set on entry.
max(1, m).
Output Parameters
c Overwritten by the m-by-n updated matrix.
168
Specific details for the routine hemm interface are the following:
a Holds the matrix A of size (k,k) where
k = m if side = 'L',
k = n otherwise.
b Holds the matrix B of size (m,n).
side Must be 'L' or 'R'. The default value is 'L'.
?herk
Performs a rank-k update of a Hermitian matrix.
Syntax
FORTRAN 77:
call cherk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call zherk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
Fortran 95:
call herk(a, c [,uplo] [, trans] [,alpha][,beta])
Description
The ?herk routines perform a matrix-matrix operation using Hermitian matrices. The operation
is defined as
C := alpha*A*conjg(A') + beta*C,
169
or
C := alpha*conjg(A')*A + beta*C,
where:
alpha and beta are real scalars,
C is an n-by-n Hermitian matrix,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
Input Parameters
triangular part of the array c is used.
array c is used.
array c is used.
if trans = 'N' or 'n', then C:=
alpha*A*conjg(A')+beta*C;
if trans = 'C' or 'c', then C:=
alpha*conjg(A')*A+beta*C.
n INTEGER. Specifies the order of the matrix C. The value of
k INTEGER. With trans = 'N' or 'n', k specifies the number
of columns of the matrix A, and with trans = 'C' or 'c',
k specifies the number of rows of the matrix A.
alpha REAL for cherk
DOUBLE PRECISION for zherk
a COMPLEX for cherk
DOUBLE COMPLEX for zherk
Array, DIMENSION (lda, ka), where ka is k when trans
= 'N' or 'n', and is n otherwise. Before entry with trans
= 'N' or 'n', the leading n-by-k part of the array a must
contain the matrix a, otherwise the leading k-by-n part of
170
the calling (sub)program. When trans = 'N' or 'n', then
lda must be at least max(1, n), otherwise lda must be at
least max(1, k).
beta REAL for cherk
DOUBLE PRECISION for zherk
c COMPLEX for cherk
DOUBLE COMPLEX for zherk
Array, DIMENSION (ldc,n).
upper triangular part of the array c must contain the upper
triangular part of c is not referenced.
lower triangular part of the array c must contain the lower
set, they are assumed to be zero.
max(1, n).
Output Parameters
c With uplo = 'U' or 'u', the upper triangular part of the
array c is overwritten by the upper triangular part of the
updated matrix.
array c is overwritten by the lower triangular part of the
updated matrix.
zero.
171

Specific details for the routine herk interface are the following:
ka = n otherwise,
ma = n if transa= 'N',
ma = k otherwise.
c Holds the matrix C of size (n,n).
trans Must be 'N' or 'C'. The default value is 'N'.
?her2k
Performs a rank-2k update of a Hermitian matrix.
Syntax
FORTRAN 77:
call cher2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zher2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
Fortran 95:
call her2k(a, b, c [,uplo][,trans] [,alpha][,beta])
Description
The ?her2k routines perform a rank-2k matrix-matrix operation using Hermitian matrices. The
operation is defined as
C := alpha*A*conjg(B') + conjg(alpha)*B*conjg(A') + beta*C,
172
or
C := alpha *conjg(B')*A + conjg(alpha) *conjg(A')*B + beta*C,
where:
alpha is a scalar and beta is a real scalar,
C is an n-by-n Hermitian matrix,
A and B are n-by-k matrices in the first case and k-by-n matrices in the second case.
Input Parameters
c is used.
c is used.
if trans = 'N' or 'n', then C:=alpha*A*conjg(B') +
alpha*B*conjg(A') + beta*C;
if trans = 'C' or 'c', then C:=alpha*conjg(A')*B +
alpha*conjg(B')*A + beta*C.
k INTEGER. With trans = 'N' or 'n', k specifies the number
of columns of the matrix A, and with trans = 'C' or 'c',
k specifies the number of rows of the matrix A.
The value of k must be at least equal to zero.
alpha COMPLEX for cher2k
DOUBLE COMPLEX for zher2k
a COMPLEX for cher2k
Array, DIMENSION (lda, ka), where ka is k when trans
= 'N' or 'n', the leading n-by-k part of the array a must
contain the matrix A, otherwise the leading k-by-n part of
173

least max(1, k).
beta REAL for cher2k
DOUBLE PRECISION for zher2k
b COMPLEX for cher2k
Array, DIMENSION (ldb, kb), where kb is k when trans
= 'N' or 'n', the leading n-by-k part of the array b must
contain the matrix B, otherwise the leading k-by-n part of
ldb must be at least max(1, n), otherwise ldb must be at
least max(1, k).
c COMPLEX for cher2k
upper triangular part of the array c must contain the upper
set, they are assumed to be zero.
max(1, n).
174
Output Parameters
updated matrix.
updated matrix.
zero.

Specific details for the routine her2k interface are the following:
ka = k if trans = 'N',
ka = n otherwise,
ma = n if trans = 'N',
ma = k otherwise.
kb = k if trans = 'N',
kb = n otherwise,
mb = n if trans = 'N',
mb = k otherwise.
175
?symm
Performs a scalar-matrix-matrix product(one matrix
operand is symmetric) and adds the result to a
scalar-matrix product.
Syntax
FORTRAN 77:
call ssymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call dsymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call csymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call zsymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
Fortran 95:
call symm(a, b, c [,side][,uplo] [,alpha][,beta])
Description
The ?symm routines perform a matrix-matrix operation using symmetric matrices. The operation
is defined as
C := alpha*A*B + beta*C,
or
C := alpha*B*A + beta*C,
where:
A is a symmetric matrix,
B and C are m-by-n matrices.
Input Parameters
side CHARACTER*1. Specifies whether the symmetric matrix A
appears on the left or right in the operation:
if side = 'L' or 'l', then C := alpha*A*B + beta*C;
176
if side = 'R' or 'r', then C := alpha*B*A + beta*C.
triangular part of the symmetric matrix A is used:
if uplo = 'U' or 'u', then the upper triangular part is
used;
if uplo = 'L' or 'l', then the lower triangular part is used.
m INTEGER. Specifies the number of rows of the matrix C.
n INTEGER. Specifies the number of columns of the matrix C.
alpha REAL for ssymm
DOUBLE PRECISION for dsymm
COMPLEX for csymm
DOUBLE COMPLEX for zsymm
a REAL for ssymm
COMPLEX for csymm
Array, DIMENSION (lda, ka), where ka is m when side =
'L' or 'l' and is n otherwise.
Before entry with side = 'L' or 'l', the m-by-m part of
the array a must contain the symmetric matrix, such that
when uplo = 'U' or 'u', the leading m-by-m upper
'L' or 'l', the leading m-by-m lower triangular part of the
symmetric matrix and the strictly upper triangular part of
a is not referenced.
Before entry with side = 'R' or 'r', the n-by-n part of
the array a must contain the symmetric matrix, such that
when uplo = 'U' or 'u', the leading n-by-n upper
177
'L' or 'l', the leading n-by-n lower triangular part of the

symmetric matrix and the strictly upper triangular part of
a is not referenced.
the calling (sub)program. When side = 'L' or 'l' then
least max(1, n).
b REAL for ssymm
COMPLEX for csymm
Array, DIMENSION (ldb,n). Before entry, the leading m-by-n
part of the array b must contain the matrix B.
max(1, m).
beta REAL for ssymm
COMPLEX for csymm
When beta is set to zero, then c need not be set on input.
c REAL for ssymm
COMPLEX for csymm
Array, DIMENSION (ldc,n). Before entry, the leading m-by-n
part of the array c must contain the matrix C, except when
beta is zero, in which case c need not be set on entry.
max(1, m).
Output Parameters
c Overwritten by the m-by-n updated matrix.
178
Specific details for the routine symm interface are the following:
k = n otherwise.
?syrk
Performs a rank-n update of a symmetric matrix.
Syntax
FORTRAN 77:
call ssyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call dsyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call csyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call zsyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
Fortran 95:
call syrk(a, c [,uplo] [, trans] [,alpha][,beta])
Description
179
The ?syrk routines perform a matrix-matrix operation using symmetric matrices. The operation
is defined as
C := alpha*A*A' + beta*C,
or
C := alpha*A'*A + beta*C,
where:
C is an n-by-n symmetric matrix,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
Input Parameters
array c is used.
array c is used.
if trans = 'N' or 'n', then C := alpha*A*A' + beta*C;
if trans = 'T' or 't', then C := alpha*A'*A + beta*C;
if trans = 'C' or 'c', then C := alpha*A'*A + beta*C.
k INTEGER. On entry with trans = 'N' or 'n', k specifies
the number of columns of the matrix a, and on entry with
trans = 'T' or 't' or 'C' or 'c', k specifies the number
of rows of the matrix a.
alpha REAL for ssyrk
DOUBLE PRECISION for dsyrk
COMPLEX for csyrk
DOUBLE COMPLEX for zsyrk
a REAL for ssyrk
180
COMPLEX for csyrk
Array, DIMENSION (lda,ka), where ka is k when trans =
'N' or 'n', and is n otherwise. Before entry with trans =
'N' or 'n', the leading n-by-k part of the array a must
lda must be at least max(1,n), otherwise lda must be at
least max(1, k).
beta REAL for ssyrk
COMPLEX for csyrk
c REAL for ssyrk
COMPLEX for csyrk
Array, DIMENSION (ldc,n). Before entry with uplo = 'U'
c must contain the upper triangular part of the symmetric
matrix and the strictly lower triangular part of c is not
referenced.
max(1, n).
181
Output Parameters
updated matrix.
updated matrix.

Specific details for the routine syrk interface are the following:
ka = n otherwise,
ma = n if transa= 'N',
ma = k otherwise.
182
?syr2k
Performs a rank-2k update of a symmetric matrix.
Syntax
FORTRAN 77:
call ssyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dsyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call csyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zsyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
Fortran 95:
call syr2k(a, b, c [,uplo][,trans] [,alpha][,beta])
Description
The ?syr2k routines perform a rank-2k matrix-matrix operation using symmetric matrices.
The operation is defined as
C := alpha*A*B' + alpha*B*A' + beta*C,
or
C := alpha*a'*B + alpha*B'*A + beta*C,
where:
C is an n-by-n symmetric matrix,
A and B are n-by-k matrices in the first case, and k-by-n matrices in the second case.
Input Parameters
array c is used.
183

array c is used.
if trans = 'N' or 'n', then C :=
alpha*A*B'+alpha*B*A'+beta*C;
if trans = 'T' or 't', then C := alpha*A'*B
+alpha*B'*A +beta*C;
if trans = 'C' or 'c', then C := alpha*A'*B
+alpha*B'*A +beta*C.
k INTEGER. On entry with trans = 'N' or 'n', k specifies
the number of columns of the matrices A and B, and on
entry with trans = 'T' or 't' or 'C' or 'c', k specifies
the number of rows of the matrices A and B. The value of k
must be at least zero.
alpha REAL for ssyr2k
DOUBLE PRECISION for dsyr2k
COMPLEX for csyr2k
DOUBLE COMPLEX for zsyr2k
a REAL for ssyr2k
COMPLEX for csyr2k
Array, DIMENSION (lda,ka), where ka is k when trans =
'N' or 'n', and is n otherwise. Before entry with trans =
'N' or 'n', the leading n-by-k part of the array a must
least max(1, k).
b REAL for ssyr2k
184
COMPLEX for csyr2k
Array, DIMENSION (ldb, kb) where kb is k when trans =
'N' or 'n' and is 'n' otherwise. Before entry with trans
= 'N' or 'n', the leading n-by-k part of the array b must
contain the matrix B, otherwise the leading k-by-n part of
ldb INTEGER. Specifies the first dimension of a as declared in
ldb must be at least max(1, n), otherwise ldb must be at
least max(1, k).
beta REAL for ssyr2k
COMPLEX for csyr2k
c REAL for ssyr2k
COMPLEX for csyr2k
Array, DIMENSION (ldc,n). Before entry with uplo = 'U'
c must contain the upper triangular part of the symmetric
matrix and the strictly lower triangular part of c is not
referenced.
max(1, n).
Output Parameters
updated matrix.
185

updated matrix.

Specific details for the routine syr2k interface are the following:
ka = k if trans = 'N',
ka = n otherwise,
ma = n if trans = 'N',
ma = k otherwise.
kb = k if trans = 'N',
kb = n otherwise,
mb = n if trans = 'N',
mb = k otherwise.
186
?trmm
Computes a scalar-matrix-matrix product (one
matrix operand is triangular).
Syntax
FORTRAN 77:
call strmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call dtrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ctrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ztrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
Fortran 95:
call trmm(a, b [,side] [, uplo] [,transa][,diag] [,alpha])
Description
The ?trmm routines perform a matrix-matrix operation using triangular matrices. The operation
is defined as
B := alpha*op(A)*B
or
B := alpha*B*op(A)
where:
alpha is a scalar,
B is an m-by-n matrix,
A is a unit, or non-unit, upper or lower triangular matrix
op(A) is one of op(A) = A, or op(A) = A', or op(A) = conjg(A').
Input Parameters
side CHARACTER*1. Specifies whether op(A) appears on the left
or right of B in the operation:
187
if side = 'L' or 'l', then B := alpha*op(A)*B;

if side = 'R' or 'r', then B := alpha*B*op(A).
lower triangular:
triangular:
m INTEGER. Specifies the number of rows of B. The value of
m must be at least zero.
n INTEGER. Specifies the number of columns of B. The value
of n must be at least zero.
alpha REAL for strmm
DOUBLE PRECISION for dtrmm
COMPLEX for ctrmm
DOUBLE COMPLEX for ztrmm
When alpha is zero, then a is not referenced and b need
not be set before entry.
a REAL for strmm
COMPLEX for ctrmm
Array, DIMENSION (lda,k), where k is m when side = 'L'
or 'l' and is n when side = 'R' or 'r'. Before entry with
uplo = 'U' or 'u', the leading k by k upper triangular
part of the array a must contain the upper triangular matrix
and the strictly lower triangular part of a is not referenced.
188
Before entry with uplo = 'L' or 'l', the leading k by k
triangular matrix and the strictly upper triangular part of a
is not referenced.
the calling (sub)program. When side = 'L' or 'l', then
lda must be at least max(1, m), when side = 'R' or 'r',
then lda must be at least max(1, n).
b REAL for strmm
COMPLEX for ctrmm
Array, DIMENSION (ldb,n).
Before entry, the leading m-by-n part of the array b must
max(1, m).
Output Parameters
b Overwritten by the transformed matrix.

Specific details for the routine trmm interface are the following:
k = n otherwise.
189

?trsm
Solves a matrix equation (one matrix operand is
triangular).
Syntax
FORTRAN 77:
call strsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call dtrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ctrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ztrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
Fortran 95:
call trsm(a, b [,side] [, uplo] [,transa][,diag] [,alpha])
Description
The ?trsm routines solve one of the following matrix equations:
op(A)*X = alpha*B,
or
X*op(A) = alpha*B,
where:
alpha is a scalar,
X and B are m-by-n matrices,
A is a unit, or non-unit, upper or lower triangular matrix
op(A) is one of op(A) = A, or op(A) = A', or op(A) = conjg(A').
190
The matrix B is overwritten by the solution matrix X.
Input Parameters
side CHARACTER*1. Specifies whether op(A) appears on the left
or right of X in the equation:
if side = 'L' or 'l', then op(A)*X = alpha*B;
if side = 'R' or 'r', then X*op(A) = alpha*B.
lower triangular:
triangular:
m INTEGER. Specifies the number of rows of B. The value of
m must be at least zero.
n INTEGER. Specifies the number of columns of B. The value
of n must be at least zero.
alpha REAL for strsm
DOUBLE PRECISION for dtrsm
COMPLEX for ctrsm
DOUBLE COMPLEX for ztrsm
When alpha is zero, then a is not referenced and b need
not be set before entry.
a REAL for strsm
COMPLEX for ctrsm
191
Array, DIMENSION (lda, k), where k is m when side =

'L' or 'l' and is n when side = 'R' or 'r'. Before entry
with uplo = 'U' or 'u', the leading k by k upper triangular
part of the array a must contain the upper triangular matrix
and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading k by k
triangular matrix and the strictly upper triangular part of a
is not referenced.
the calling (sub)program. When side = 'L' or 'l', then
lda must be at least max(1, m), when side = 'R' or 'r',
then lda must be at least max(1, n).
b REAL for strsm
COMPLEX for ctrsm
Array, DIMENSION (ldb,n). Before entry, the leading m-by-n
part of the array b must contain the right-hand side matrix
B.
max(1, +m).
Output Parameters
b Overwritten by the solution matrix X.

Specific details for the routine trsm interface are the following:
a Holds the matrix A of size (k,k) where k = m if side = 'L', k =
n otherwise.
192
Sparse BLAS Level 1 Routines

This section describes Sparse BLAS Level 1, an extension of BLAS Level 1 included in the Intel®
Math Kernel Library beginning with the Intel MKL release 2.1. Sparse BLAS Level 1 is a group
of routines and functions that perform a number of common vector operations on sparse vectors
stored in compressed form.
Sparse vectors are those in which the majority of elements are zeros. Sparse BLAS routines
and functions are specially implemented to take advantage of vector sparsity. This allows you
to achieve large savings in computer time and memory. If nz is the number of non-zero vector
elements, the computer time taken by Sparse BLAS operations will be O(nz).
Vector Arguments
Compressed sparse vectors. Let a be a vector stored in an array, and assume that the only
non-zero elements of a are the following:
a(k1), a (k2), a (k3) . . . a(knz),
where nz is the total number of non-zero elements in a.
In Sparse BLAS, this vector can be represented in compressed form by two FORTRAN arrays,
x (values) and indx (indices). Each array has nz elements:
x(1)=a(k1), x(2)=a(k2), . . . x(nz)= a(knz),
indx(1)=k1, indx(2)=k2, . . . indx(nz)= knz.

Thus, a sparse vector is fully determined by the triple (nz, x, indx). If you pass a negative or
zero value of nz to Sparse BLAS, the subroutines do not modify any arrays or variables.
193
Full-storage vectors. Sparse BLAS routines can also use a vector argument fully stored in a
single FORTRAN array (a full-storage vector). If y is a full-storage vector, its elements must
be stored contiguously: the first element in y(1), the second in y(2), and so on. This
corresponds to an increment incy = 1 in BLAS Level 1. No increment value for full-storage
vectors is passed as an argument to Sparse BLAS routines or functions.
Naming Conventions
Similar to BLAS, the names of Sparse BLAS subprograms have prefixes that determine the data
type involved: s and d for single- and double-precision real; c and z for single- and
double-precision complex respectively.
If a Sparse BLAS routine is an extension of a “dense” one, the subprogram name is formed by
appending the suffix i (standing for indexed) to the name of the corresponding “dense”
subprogram. For example, the Sparse BLAS routine saxpyi corresponds to the BLAS routine
saxpy, and the Sparse BLAS function cdotci corresponds to the BLAS function cdotc.
Routines and Data Types

Routines and data types supported in the Intel MKL implementation of Sparse BLAS are listed
in Table 2-4 .
Table 2-4 Sparse BLAS Routines and Their Data Types
Routine/Function Data Types Description
?axpyi s, d, c, z Scalar-vector product plus vector (routines)
?doti s, d Dot product (functions)
?dotci c, z Complex dot product conjugated (functions)
?dotui c, z Complex dot product unconjugated (functions)
?gthr s, d, c, z Gathering a full-storage sparse vector into

compressed form nz, x, indx (routines)
?gthrz s, d, c, z Gathering a full-storage sparse vector into

compressed form and assigning zeros to gathered
elements in the full-storage vector (routines)
?roti s, d Givens rotation (routines)
194
Routine/Function Data Types Description
?sctr s, d, c, z Scattering a vector from compressed form to

full-storage form (routines)
BLAS Level 1 Routines That Can Work With Sparse Vectors

The following BLAS Level 1 routines will give correct results when you pass to them a
compressed-form array x(with the increment incx=1):
?asum sum of absolute values of vector elements
?copy copying a vector
?nrm2 Euclidean norm of a vector
?scal scaling a vector
i?amax index of the element with the largest absolute value for real flavors, or
the largest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
i?amin index of the element with the smallest absolute value for real flavors,
or the smallest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
The result i returned by i?amax and i?amin should be interpreted as index in the
compressed-form array, so that the largest (smallest) value is x(i); the corresponding index
in full-storage array is indx(i).
You can also call ?roti to compute the parameters of Givens rotation and then pass these
parameters to the Sparse BLAS routines ?roti.
?axpyi
Adds a scalar multiple of compressed sparse vector
to a full-storage vector.
Syntax
FORTRAN 77:
call saxpyi(nz, a, x, indx, y)
call daxpyi(nz, a, x, indx, y)
call caxpyi(nz, a, x, indx, y)
call zaxpyi(nz, a, x, indx, y)
195
Fortran 95:
call axpyi(x, indx, y [, a])
Description
The ?axpyi routines perform a vector-vector operation defined as
y := a*x + y
where:
a is a scalar,
x is a sparse vector stored in compressed form,
y is a vector in full storage form.
The ?axpyi routines reference or modify only the elements of y whose indices are listed in the
array indx.
The values in indx must be distinct.
Input Parameters
nz INTEGER. The number of elements in x and indx.
a REAL for saxpyi
DOUBLE PRECISION for daxpyi
COMPLEX for caxpyi
DOUBLE COMPLEX for zaxpyi
x REAL for saxpyi
COMPLEX for caxpyi
Array, DIMENSION at least nz.
indx INTEGER. Specifies the indices for the elements of x.
y REAL for saxpyi
COMPLEX for caxpyi
196
Array, DIMENSION at least max(indx(i)).
Output Parameters
y Contains the updated vector y.

Specific details for the routine axpyi interface are the following:
x Holds the vector with the number of elements nz.
indx Holds the vector with the number of elements nz.
y Holds the vector with the number of elements nz.
a The default value is 1.
?doti
Computes the dot product of a compressed sparse
real vector by a full-storage real vector.
Syntax
FORTRAN 77:
res = sdoti(nz, x, indx, y )
res = ddoti(nz, x, indx, y )
Fortran 95:
res = doti(x, indx, y)
Description
The ?doti routines return the dot product of x and y defined as
res = x(1)*y(indx(1)) + x(2)*y(indx(2)) +...+ x(nz)*y(indx(nz))
197
where the triple (nz, x, indx) defines a sparse real vector stored in compressed form, and y is
a real vector in full storage form. The functions reference only the elements of y whose indices
are listed in the array indx. The values in indx must be distinct.
Input Parameters
nz INTEGER. The number of elements in x and indx .
x REAL for sdoti
DOUBLE PRECISION for ddoti
y REAL for sdoti
Output Parameters
res REAL for sdoti
Contains the dot product of x and y, if nz is positive.
Otherwise, res contains 0.

Specific details for the routine doti interface are the following:
198
?dotci
Computes the conjugated dot product of a
compressed sparse complex vector with a
full-storage complex vector.
Syntax
FORTRAN 77:
res = cdotci(nz, x, indx, y )
res = zdotci(nz, x, indx, y )
Fortran 95:
res = dotci(x, indx, y)
Description
The ?dotci routines return the dot product of x and y defined as
conjg(x(1))*y(indx(1)) + ... + conjg(x(nz))*y(indx(nz))
where the triple (nz, x, indx) defines a sparse complex vector stored in compressed form, and
y is a real vector in full storage form. The functions reference only the elements of y whose
indices are listed in the array indx. The values in indx must be distinct.
Input Parameters
x COMPLEX for cdotci
DOUBLE COMPLEX for zdotci
y COMPLEX for cdotci
199
Output Parameters
res COMPLEX for cdotci
Contains the conjugated dot product of x and y, if nz is

Specific details for the routine dotci interface are the following:
x Holds the vector with the number of elements (nz).
indx Holds the vector with the number of elements (nz).
y Holds the vector with the number of elements (nz).
?dotui
Computes the dot product of a compressed sparse
complex vector by a full-storage complex vector.
Syntax
FORTRAN 77:
res = cdotui(nz, x, indx, y )
res = zdotui(nz, x, indx, y )
Fortran 95:
res = dotui(x, indx, y)
Description
The ?dotui routines return the dot product of x and y defined as
res = x(1)*y(indx(1)) + x(2)*y(indx(2)) +...+ x(nz)*y(indx(nz))
200
where the triple (nz, x, indx) defines a sparse complex vector stored in compressed form, and
y is a real vector in full storage form. The functions reference only the elements of y whose
indices are listed in the array indx. The values in indx must be distinct.
Input Parameters
x COMPLEX for cdotui
DOUBLE COMPLEX for zdotui
y COMPLEX for cdotui
Output Parameters
res COMPLEX for cdotui
Contains the dot product of x and y, if nz is positive.
Otherwise, res contains 0.

Specific details for the routine dotui interface are the following:
201
?gthr
Gathers a full-storage sparse vector's elements
into compressed form.
Syntax
FORTRAN 77:
call sgthr(nz, y, x, indx )
call dgthr(nz, y, x, indx )
call cgthr(nz, y, x, indx )
call zgthr(nz, y, x, indx )
Fortran 95:
res = gthr(x, indx, y)
Description
The ?gthr routines gather the specified elements of a full-storage sparse vector y into
compressed form(nz, x, indx). The routines reference only the elements of y whose indices
are listed in the array indx:
x(i) = y(indx(i)), for i=1,2,... +nz.
Input Parameters
nz INTEGER. The number of elements of y to be gathered.
indx INTEGER. Specifies indices of elements to be gathered.
y REAL for sgthr
DOUBLE PRECISION for dgthr
COMPLEX for cgthr
DOUBLE COMPLEX for zgthr
202
Output Parameters
x REAL for sgthr
DOUBLE PRECISION for dgthr
COMPLEX for cgthr
DOUBLE COMPLEX for zgthr
Contains the vector converted to the compressed form.

Specific details for the routine gthr interface are the following:
?gthrz
Gathers a sparse vector's elements into
compressed form, replacing them by zeros.
Syntax
FORTRAN 77:
call sgthrz(nz, y, x, indx )
call dgthrz(nz, y, x, indx )
call cgthrz(nz, y, x, indx )
call zgthrz(nz, y, x, indx )
Fortran 95:
res = gthrz(x, indx, y)
203
Description
The ?gthrz routines gather the elements with indices specified by the array indx from a
full-storage vector y into compressed form (nz, x, indx) and overwrite the gathered elements
of y by zeros. Other elements of y are not referenced or modified (see also ?gthr).
Input Parameters
nz INTEGER. The number of elements of y to be gathered.
indx INTEGER. Specifies indices of elements to be gathered.
y REAL for sgthrz
DOUBLE PRECISION for dgthrz
COMPLEX for cgthrz
DOUBLE COMPLEX for zgthrz
Output Parameters
x REAL for sgthrz
DOUBLE PRECISION for d gthrz
COMPLEX for cgthrz
DOUBLE COMPLEX for zgthrz
Contains the vector converted to the compressed form.
y The updated vector y.

Specific details for the routine gthrz interface are the following:
204
?roti
Applies Givens rotation to sparse vectors one of
which is in compressed form.
Syntax
FORTRAN 77:
call sroti(nz, x, indx, y, c, s)
call droti(nz, x, indx, y, c, s)
Fortran 95:
call roti(x, indx, y, c, s)
Description
The ?roti routines apply the Givens rotation to elements of two real vectors, x (in compressed
form nz, x, indx) and y (in full storage form):
x(i) = c*x(i) + s*y(indx(i))
y(indx(i)) = c*y(indx(i))- s*x(i)
The routines reference only the elements of y whose indices are listed in the array indx. The
values in indx must be distinct.
Input Parameters
nz INTEGER. The number of elements in x and indx.
x REAL for sroti
DOUBLE PRECISION for droti
y REAL for sroti
DOUBLE PRECISION for droti
c A scalar: REAL for sroti
205
DOUBLE PRECISION for droti.

s A scalar: REAL for sroti
DOUBLE PRECISION for droti.
Output Parameters
x and y The updated arrays.

Specific details for the routine roti interface are the following:
?sctr
Converts compressed sparse vectors into full
storage form.
Syntax
FORTRAN 77:
call ssctr(nz, x, indx, y )
call dsctr(nz, x, indx, y )
call csctr(nz, x, indx, y )
call zsctr(nz, x, indx, y )
Fortran 95:
call sctr(x, indx, y)
Description
206
The ?sctr routines scatter the elements of the compressed sparse vector (nz, x, indx) to a
full-storage vector y. The routines modify only the elements of y whose indices are listed in
the array indx:
y(indx(i) = x(i), for i=1,2,... +nz.
Input Parameters
nz INTEGER. The number of elements of x to be scattered.
indx INTEGER. Specifies indices of elements to be scattered.
x REAL for ssctr
DOUBLE PRECISION for dsctr
COMPLEX for csctr
DOUBLE COMPLEX for zsctr
Contains the vector to be converted to full-storage form.
Output Parameters
y REAL for ssctr
DOUBLE PRECISION for dsctr
COMPLEX for csctr
DOUBLE COMPLEX for zsctr
Contains the vector y with updated elements.

Specific details for the routine sctr interface are the following:
207
Sparse BLAS Level 2 and Level 3 Routines

This section describes Sparse BLAS Level 2 and Level 3 routines included in the Intel® Math
Kernel Library (Intel® MKL) . Sparse BLAS Level 2 is a group of routines and functions that
perform operations between a sparse matrix and dense vectors. Sparse BLAS Level 3 is a group
of routines and functions that perform operations between a sparse matrix and dense matrices.
The terms and concepts required to understand the use of the Intel MKL Sparse BLAS Level 2
and Level 3 routines are discussed in the Linear Solvers Basics appendix.
This Sparse BLAS routines can be useful to implement iterative methods for solving large sparse
systems of equations or eigenvalue problems. For example, these routines can be considered
as building blocks for “Iterative Sparse Solvers based on Reverse Communication Interface
(RCI ISS)” described in the Chapter 8 of the manual.
Intel MKL provides Sparse BLAS Level 2 and Level 3 routines with typical (or conventional)
interface similar to the interface used in the NIST* Sparse BLAS library [Rem05].
Some software packages and libraries (the PARDISO* Solver used in Intel MKL, Sparskit 2
[Saad94], the Compaq* Extended Math Library (CXML)[CXML01]) use different (early) variation
of the compressed sparse row (CSR) format and support only Level 2 operations with simplified
interfaces. Intel MKL provides an additional set of Sparse BLAS Level 2 routines with similar
simplified interfaces. Each of these routines operates only on a matrix of the fixed type.
The routines described in this section support both one-based indexing and zero-based indexing
of the input data (see details in the section One-based and Zero-based Indexing).
Naming Conventions in Sparse BLAS Level 2 and Level 3

Each Sparse BLAS Level 2 and Level 3 routine has a six- or eight-character base name preceded
by the prefix mkl_ or mkl_cspblas_ .
The routines with typical (conventional) interface have six-character base names in accordance
with the template:
mkl_<character > <data> <operation>( )
The routines with simplified interfaces have eight-character base names in accordance with the
templates:
mkl_<character > <data> <mtype> <operation>( )
for routines with one-based indexing; and
mkl_cspblas_<character> <data> <mtype> <operation>( )
for routines with zero-based indexing.
The <character> field indicates the data type:
208
The <data> field indicates the sparse matrix storage format (see section Sparse Matrix Storage
Formats):
coo coordinate format
csr compressed sparse row format and its variations
csc compressed sparse column format and its variations
dia diagonal format
sky skyline storage format
bsr block sparse row format and its variations
The <operation> field indicates the type of operation:

mv matrix-vector product (Level 2)
mm matrix-matrix product (Level 3)
sv solving a single triangular system (Level 2)
sm solving triangular systems with multiple right-hand sides (Level 3)
The field <mtype> indicates the matrix type:

ge sparse representation of a general matrix
sy sparse representation of the upper or lower triangle of a symmetric
matrix
tr sparse representation of a triangular matrix
Sparse Matrix Storage Formats

The current version of Intel MKL Sparse BLAS Level 2 and Level 3 routines support the following
point entry [Duff86] storage formats for sparse matrices:
• compressed sparse row format (CSR) and its variations;

• compressed sparse column format (CSC);
• coordinate format;
• diagonal format;
• skyline storage format;
209
and one block entry storage format:
• block sparse row format (BSR) and its variations.

For more information see “Sparse Matrix Storage Formats” in the Appendix A ".
Intel MKL provides auxiliary routines - matrix converters - that convert sparse matrix from one
storage format to another.
Routines and Supported Operations

This section describes operations supported by the Intel Sparse BLAS Level 2 and Level 3
routines. The following notations are used here:
A is a sparse matrix;
B and C are dense matrices;
D is a diagonal scaling matrix;
x and y are dense vectors;
alpha and beta are scalars;
op(A) is one of the possible operations:

op(A) = A;
op(A) = A' - transpose of A;
op(A) = conj(A') - conjugated transpose of A.
inv(op(A)) denotes the inverse of op(A).

Intel MKL Sparse BLAS Level 2 and Level 3 routines support the following operations:
• computing the vector product between a sparse matrix and a dense vector:
y := alpha*op(A)*x + beta*y
• solving a single triangular system:

y := alpha*inv(op(A))*x
• computing a product between sparse matrix and dense matrix:

C := alpha*op(A)*B + beta*C
• solving a sparse triangular system with multiple right-hand sides:

C := alpha*inv(op(A))*B
210
Intel MKL provides an additional set of Sparse BLAS Level 2 routines with simplified interfaces.
Each of these routines operates on a matrix of the fixed type. The following operations are
supported:
• computing the vector product between a sparse matrix and a dense vector (for general and
symmetric matrices):
y := op(A)*x
• solving a single triangular system (for triangular matrices):

y := inv(op(A))*x
Matrix type is indicated by the field <mtype> in the routine name (see section Naming
Conventions in Sparse BLAS Level 2 and Level 3).
NOTE. The routines with simplified interfaces support only four sparse matrix storage
formats, specifically:
CSR format in the 3-array variation accepted in the direct sparse solvers and in the
CXML;
diagonal format accepted in the CXML;
coordinate format;
BSR format in the 3-array variation.
Note that routines with both typical (conventional) and simplified interfaces use the same
computational kernels that work with certain internal data structures.
CAUTION. Intel Sparse BLAS Level 2 and Level 3 routines do not support in-place
operations.
Complete list of all routines is given in the Table 2-9 .
211
Interface Consideration
One-Based and Zero-Based Indexing

The Intel MKL Sparse BLAS Leve2 and Level 3 routines support one-based and zero-based
indexing of data arrays.
Routines with typical interfaces support zero-based indexing for the following sparse data
storage formats: CSR, CSC, BSR, and COO. Routines with simplified interfaces support zero
based indexing for the following sparse data storage formats: CSR, BSR, and COO. See the
complete list of Sparse BLAS Level 2 and Level 3 Routines .
The one-based indexing uses the convention of starting array indices at 1. The zero-based
indexing uses the convention of starting array indices at 0. For example, indices of the 5-element
array x can be presented in case of one-based indexing as follows:
Element index: 1 2 3 4 5
Element value: 1.0 5.0 7.0 8.0 9.0
and in case of zero-based indexing as follows:
Element index: 0 1 2 3 4
Element value: 1.0 5.0 7.0 8.0 9.0
The detailed descriptions of the one-based and zero-based variants of the sparse data storage
formats are given in the “Sparse Matrix Storage Formats” in Appendix A
Most parameters of the routines are identical for both one-based and zero-based indexing, but
some of them have certain differences. The following table lists all these differences.
Parameter One-based Indexing Zero-based Indexing

val
Array containing non-zero elements of Array containing non-zero elements of
the matrix A, its length is pntre(m) - the matrix A, its length is pntre(m—1)
pntrb(1). - pntrb(0).
pntrb
Array of length m. This array contains Array of length m. This array contains
row indices, such that pntrb(I) - row indices, such that pntrb(I) -
pntrb(1)+1 is the first index of row I pntrb(0) is the first index of row I in
in the arrays val and indx the arrays val and indx.
212
Parameter One-based Indexing Zero-based Indexing
pntre
Array of length m. This array contains Array of length m. This array contains
row indices, such that pntre(I) - row indices, such that pntre(I) -
pntrb(1) is the last index of row I in pntrb(0)-1 is the last index of row I
the arrays val and indx. in the arrays val and indx.
ia
Array of length m + 1, containing Array of length m+1, containing indices
indices of elements in the array a, such of elements in the array a, such that
that ia(I) is the index in the array a ia(I) is the index in the array a of the
of the first non-zero element from the first non-zero element from the row I.
row I. The value of the last element The value of the last element ia(m) is
ia(m + 1) is equal to the number of equal to the number of non-zeros.
non-zeros plus one.
ldb
Specifies the first dimension of b as Specifies the second dimension of b as
declared in the calling (sub)program. declared in the calling (sub)program.
ldc
Specifies the first dimension of c as Specifies the second dimension of c as
declared in the calling (sub)program. declared in the calling (sub)program.
Difference Between Fortran and C Interfaces

Intel MKL provides both Fortran and C interfaces to all Sparse BLAS Leve2 and Level 3 routines.
Parameter descriptions are common for both interfaces with the exception of data types that
refer to the FORTRAN 77 standard types. Correspondence between data types specific to the
Fortran and C interfaces are given below:
Fortran C
REAL*4 float
REAL*8 double
INTEGER*4 int
INTEGER*8 long long int
CHARACTER char
For routines with C interfaces all parameters (including scalars) must be passed by references.
213
Another difference is how two-dimensional arrays are represented. In Fortran the column-major
order is used, and in C - row-major order. This changes the meaning of the parameters ldb
and ldc (see the table above).
Differences Between Intel MKL and NIST* Interfaces

The Intel MKL Sparse BLAS Level 3 routines have the following conventional interfaces:
mkl_xyyymm(transa, m, n, k, alpha, matdescra, arg(A), b, ldb, beta, c, ldc),
for matrix-matrix product;
mkl_xyyysm(transa, m, n, alpha, matdescra, arg(A), b, ldb, c, ldc), for triangular
solvers with multiple right-hand sides.
Here x denotes data type, and yyy - sparse matrix data structure (storage format).
The analogous NIST* Sparse BLAS (NSB) library routines have the following interfaces:
xyyymm(transa, m, n, k, alpha, descra, arg(A), b, ldb, beta, c, ldc, work,
lwork), for matrix-matrix product;
xyyysm(transa, m, n, unitd, dv, alpha, descra, arg(A), b, ldb, beta, c, ldc,
work, lwork), for triangular solvers with multiple right-hand sides.
Some similar arguments are used in both libraries. The argument transa indicates what
operation is performed and is slightly different in the NSB library (see Table 2-5 ). The arguments
m and k are the number of rows and column in the matrix A, respectively, n is the number of
columns in the matrix C. The arguments alpha and beta are scalar alpha and beta respectively
(beta is not used in the Intel MKL triangular solvers.) The arguments b and c are rectangular
arrays with the first dimension ldb and ldc, respectively. arg(A) denotes the list of arguments
that describe the sparse representation of A.
Table 2-5 Parameter transa
MKL interface NSB interface Operation
data type CHARACTER*1 INTEGER
value N or n 0 op(A) = A
T or t 1 op(A) = A'
C or c 2 op(A) = A'
214
Parameter matdescra
The parameter matdescra describes the relevant characteristic of the matrix A. This manual
describes matdescra as an array of six elements in line with the NIST* implementation. However,
only the first four elements of the array are used in the current versions of the Intel MKL Sparse
BLAS routines. Elements matdescra (5) and matdescra(6) are reserved for future use. Note
that whether matdescra is described in the user's program as an array of length 6 or 4 is of
no importance because the array is declared as a pointer in the Intel MKL routines. To learn
more about declaration of the matdescra array see Sparse BLAS examples located in the
following subdirectory of the Intel MKL installation directory: examples/spblas/. The table
below list elements of the parameter matdescra, their values and meanings. The parameter
matdescra corresponds to the argument descra from NSB library.
Table 2-6 Possible Values of the Parameter matdescra (descra)
MKL interface NSB Matrix characteristics
interface
one-based zero-based
indexing indexing
data type CHARACTER Char INTEGER
1st element matdescra(1) matdescra(0) descra(1) matrix structure
value G G 0 general
S S 1 symmetric (A = A')
H H 2 Hermitian (A=conjg(A'))
T T 3 triangular
A A 4 skew(anti)-symmetric (A=-A')
D D 5 diagonal
2nd matdescra(2) matdescra(1) descra(2) upper/lower triangular

element indicator
value L L 1 lower
U U 2 upper
215
MKL interface NSB Matrix characteristics

interface
3rd matdescra(3) matdescra(2) descra(3) main diagonal type

element
value N N 0 non-unit
U U 1 unit
type of indexing
4th matdescra(4) matdescra(3)
element
one-based indexing
value F
zero-based indexing
C
In some cases possible element values of the parameter matdescra depend on the values of
other elements. The Table 2-7 "Possible Combinations of Element Values of the Parameter
matdescra" lists all possible combinations of element values for both multiplication routines
and triangular solvers.
Table 2-7 Possible Combinations of Element Values of the Parameter matdescra
Routines matdescra(1) matdescra(2) matdescra(3) matdescra(4)
Multiplication G ignored ignored F (default) or C
Routines
S or H L (default) N (default) F (default) or C
S or H L (default) U F (default) or C
S or H U N (default) F (default) or C
S or H U U F (default) or C
A L (default) ignored F (default) or C
A U ignored F (default) or C
Multiplication T L U F (default) or C
Routines and
Triangular Solvers
T L N F (default) or C
T U U F (default) or C
T U N F (default) or C
D ignored N (default) F (default) or C
216
Routines matdescra(1) matdescra(2) matdescra(3) matdescra(4)
D ignored U F (default) or C
For a matrix in the skyline format with main diagonal declared to be unit, diagonal elements
must be stored in the sparse representation even if they are zero. In all other formats diagonal
elements can be stored (if needed) in the sparse representation if they are not zero.
Operations with Partial Matrices

One of the distinctive feature of the Intel MKL Sparse BLAS routines is a possibility to perform
operations only on partial matrices composed of certain parts (triangles and main diagonal) of
the input sparse matrix. It can be done by setting properly first three elements of the parameter
matdescra.
An arbitrary sparse matrix A can be decomposed as

A = L + D + U
where L is the strict lower triangle of A, U is the strict upper triangle of A, D is the main diagonal.
Table 2-8 shows correspondence between the output matrices and values of the parameter
matdescra for the sparse matrix A for multiplication routines.
Table 2-8 Output Matrices for Multiplication Routines
matdescra(1) matdescra(2) matdescra(3) Output Matrix
G ignored ignored alpha*op(A)*x + beta*y

alpha*op(A)*B + beta*C
S or H L N alpha*op(L+D+L')*x + beta*y
alpha*op(L+D+L')*B + beta*C
S or H L U alpha*op(L+I+L')*x + beta*y
alpha*op(L+I+L')*B + beta*C
S or H U N alpha*op(U'+D+U)*x + beta*y
alpha*op(U'+D+U)*B + beta*C
S or H U U alpha*op(U'+I+U)*x + beta*y
alpha*op(U'+I+U)*B + beta*C
217
T L U alpha*op(L+I)*x + beta*y
alpha*op(L+I)*B + beta*C
T L N alpha*op(L+D)*x + beta*y
alpha*op(L+D)*B + beta*C
T U U alpha*op(U+I)*x + beta*y
alpha*op(U+I)*B + beta*C
T U N alpha*op(U+D)*x + beta*y
alpha*op(U+D)*B + beta*C
A L ignored alpha*op(L-L')*x + beta*y

alpha*op(L-L')*B + beta*C
A U ignored alpha*op(U-U')*x + beta*y

alpha*op(U-U')*B + beta*C
D ignored N alpha*D*x + beta*y

alpha*D*B + beta*C
D ignored U alpha*x + beta*y

alpha*B + beta*C
Table 2-9 shows correspondence between the output matrices and values of the parameter
matdescra for the sparse matrix A for triangular solvers.
Table 2-9 Output Matrices for Triangular Solvers
T L N alpha*inv(op(L+L))*x
alpha*inv(op(L+L))*B
T L U alpha*inv(op(L+L))*x
alpha*inv(op(L+L))*B
T U N alpha*inv(op(U+U))*x
218
alpha*inv(op(U+U))*B
T U U alpha*inv(op(U+U))*x
alpha*inv(op(U+U))*B
D ignored N alpha*inv(D)*x
alpha*inv(D)*B
D ignored U alpha*x
alpha*B
Sparse BLAS Level 2 and Level 3 Routines.

Table 2-9 lists the sparse BLAS Level 2 and Level 3 routines described in more detail later in
this section.
Table 2-9 Sparse BLAS Level 2 and Level 3 Routines
Routine/Function Description
Simplified interface, one-based indexing
mkl_?csrgemv Computes matrix - vector product of a sparse general

matrix in the CSR format (3-array variation)
mkl_?bsrgemv Computes matrix - vector product of a sparse general

matrix in the BSR format (3-array variation).
mkl_?coogemv Computes matrix - vector product of a sparse general

matrix in the coordinate format.
mkl_?diagemv Computes matrix - vector product of a sparse general

matrix in the diagonal format.
mkl_?csrsymv Computes matrix - vector product of a sparse

symmetrical matrix in the CSR format (3-array
variation)
219
mkl_?bsrsymv Computes matrix - vector product of a sparse

symmetrical matrix in the BSR format (3-array
variation).
mkl_?coosymv Computes matrix - vector product of a sparse

symmetrical matrix in the coordinate format.
mkl_?diasymv Computes matrix - vector product of a sparse

symmetrical matrix in the diagonal format.
mkl_?csrtrsv Triangular solvers with simplified interface for a sparse

matrix in the CSR format (3-array variation).
mkl_?bsrtrsv Triangular solver with simplified interface for a sparse

matrix in the BSR format (3-array variation).
mkl_?cootrsv Triangular solvers with simplified interface for a sparse

mkl_?diatrsv Triangular solvers with simplified interface for a sparse

Simplified interface, zero-based indexing
mkl_cspblas_?csrgemv Computes matrix - vector product of a sparse general

matrix in the CSR format (3-array variation) with
zero-based indexing.
mkl_cspblas_?bsrgemv Computes matrix - vector product of a sparse general

matrix in the BSR format (3-array variation)with
mkl_cspblas_?coogemv Computes matrix - vector product of a sparse general

matrix in the coordinate format with zero-based
indexing.
mkl_cspblas_?csrsymv Computes matrix - vector product of a sparse

symmetrical matrix in the CSR format (3-array
variation) with zero-based indexing
220
mkl_cspblas_?bsrsymv Computes matrix - vector product of a sparse

symmetrical matrix in the BSR format (3-array
variation) with zero-based indexing.
mkl_cspblas_?coosymv Computes matrix - vector product of a sparse

symmetrical matrix in the coordinate format with
mkl_cspblas_?csrtrsv Triangular solvers with simplified interface for a sparse

matrix in the CSR format (3-array variation) with
mkl_cspblas_?bsrtrsv Triangular solver with simplified interface for a sparse

matrix in the BSR format (3-array variation) with
mkl_cspblas_?cootrsv Triangular solver with simplified interface for a sparse

matrix in the coordinate format with zero-based
indexing.
Typical (conventional) interface, one-based and zero-based indexing
mkl_?csrmv Computes matrix - vector product of a sparse matrix

in the CSR format.
mkl_?bsrmv Computes matrix - vector product of a sparse matrix

in the BSR format.
mkl_?cscmv Computes matrix - vector product for a sparse matrix

in the CSC format.
mkl_?coomv Computes matrix - vector product for a sparse matrix

in the coordinate format.
mkl_?csrsv Solves a system of linear equations for a sparse matrix

in the CSR format.
mkl_?bsrsv Solves a system of linear equations for a sparse matrix

in the BSR format.
221
mkl_?cscsv Solves a system of linear equations for a sparse matrix

in the CSC format.
mkl_?coosv Solves a system of linear equations for a sparse matrix

mkl_?csrmm Computes matrix - matrix product of a sparse matrix

in the CSR format
mkl_?bsrmm Computes matrix - matrix product of a sparse matrix

in the BSR format.
mkl_?cscmm Computes matrix - matrix product of a sparse matrix

in the CSC format
mkl_?coomm Computes matrix - matrix product of a sparse matrix

mkl_?csrsm Solves a system of linear matrix equations for a sparse

matrix in the CSR format.
mkl_?bsrsm Solves a system of linear matrix equations for a sparse

matrix in the BSR format.
mkl_?cscsm Solves a system of linear matrix equations for a sparse

matrix in the CSC format.
mkl_?coosm Solves a system of linear matrix equations for a sparse

Typical (conventional) interface, one-based indexing
mkl_?diamv Computes matrix - vector product of a sparse matrix

in the diagonal format.
mkl_?skymv Computes matrix - vector product for a sparse matrix

in the skyline storage format.
mkl_?diasv Solves a system of linear equations for a sparse matrix

222
mkl_?skysv Solves a system of linear equations for a sparse matrix

in the skyline format.
mkl_?diamm Computes matrix - matrix product of a sparse matrix

mkl_?skymm Computes matrix - matrix product of a sparse matrix

in the skyline storage format.
mkl_?diasm Solves a system of linear matrix equations for a sparse

mkl_?skysm Solves a system of linear matrix equations for a sparse

matrix in the skyline storage format.
Auxiliary routines
Matrix converters
mkl_ddnscsr Stores a sparse matrix in the CSR format (3-array

variation).
mkl_dcsrcoo Converts a sparse matrix in the CSR format (3-array

variation) to the coordinate format.
mkl_dcsrbsr Converts a sparse matrix in the CSR format to the BSR

format (3-array variations).
Operations on sparse matrices
mkl_?csradd Computes the sum of two sparse matrices stored in

the CSR format (3-array variation).
mkl_?csrmultcsr Computes the sum of two sparse matrices stored in

the CSR format (3-array variation).
mkl_?csrmultd Computes product of two sparse matrices stored in the

CSR format (3-array variation). The result is stored in
the dense matrix.
223
mkl_?csrgemv
Computes matrix - vector product of a sparse
general matrix stored in the CSR format (3-array
variation) with one-based indexing.
Syntax
Fortran:
call mkl_scsrgemv(transa, m, a, ia, ja, x, y)
call mkl_dcsrgemv(transa, m, a, ia, ja, x, y)
call mkl_ccsrgemv(transa, m, a, ia, ja, x, y)
call mkl_zcsrgemv(transa, m, a, ia, ja, x, y)
C:
mkl_scsrgemv(&transa, &m, a, ia, ja, x, y);
mkl_dcsrgemv(&transa, &m, a, ia, ja, x, y);
mkl_ccsrgemv(&transa, &m, a, ia, ja, x, y);
mkl_zcsrgemv(&transa, &m, a, ia, ja, x, y);
Description
This routine is declared in mkl_spblas.fi for FORTRAN 77 interface and in mkl_spblas.h
for C interface.
The mkl_?csrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := A'*x,
where:
A is an m-by-m sparse square matrix in the CSR format (3-array variation), A' is the transpose
of A.
NOTE. This routine supports only one-based indexing of the input arrays.
224
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data
types that refer here to the FORTRAN 77 standard types. Data types specific to the different
interfaces are described in the section “Interfaces” below.
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then as y := A*x
If transa = 'T' or 't' or 'C' or 'c', then y := A'*x,
m INTEGER. Number of rows of the matrix A.
a REAL for mkl_scsrgemv.
DOUBLE PRECISION for mkl_dcsrgemv.
COMPLEX for mkl_ccsrgemv.
DOUBLE COMPLEX for mkl_zcsrgemv.
Array containing non-zero elements of the matrix A. Its
length is equal to the number of non-zero elements in the
matrix A. Refer to values array description in Sparse Matrix
Storage Formats for more details.
ia INTEGER. Array of length m + 1, containing indices of
elements in the array a, such that ia(i) is the index in the
array a of the first non-zero element from the row i. The
value of the last element ia(m + 1) is equal to the number
of non-zeros plus one. Refer to rowIndex array description
in Sparse Matrix Storage Formats for more details.
ja INTEGER. Array containing the column indices for each
non-zero element of the matrix A.
Its length is equal to the length of the array a. Refer to
columns array description in Sparse Matrix Storage Formats
for more details.
x REAL for mkl_scsrgemv.
Array, DIMENSION is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_scsrgemv.
225

Array, DIMENSION at least m.
On exit, the array y must contain the vector y.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dcsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
DOUBLE PRECISION a(*), x(*), y(*)
CHARACTER*1 transa
INTEGER m
COMPLEX a(*), x(*), y(*)
CHARACTER*1 transa
INTEGER m
DOUBLE COMPLEX a(*), x(*), y(*)
226
C:
void mkl_scsrgemv(char *transa, int *m, float *a,
int *ia, int *ja, float *x, float *y);
void mkl_dcsrgemv(char *transa, int *m, double *a,
int *ia, int *ja, double *x, double *y);
void mkl_dcsrgemv(char *transa, int *m, MKL_Complex8 *a,
int *ia, int *ja, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_dcsrgemv(char *transa, int *m, MKL_Complex16 *a,
mkl_?bsrgemv
general matrix stored in the BSR format (3-array
Syntax
Fortran:
call mkl_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
C:
mkl_sbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
mkl_dbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
mkl_cbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
mkl_zbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
Description
for C interface.
227
The mkl_?bsrgemv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an m-by-m block sparse square matrix in the BSR format (3-array variation), A' is the
transpose of A.
Input Parameters
If transa = 'N' or 'n', then the matrix-vector product is
computed as y := A*x
If transa = 'T' or 't' or 'C' or 'c', then the
matrix-vector product is computed as y := A'*x,
m INTEGER. Number of block rows of the matrix A.
lb INTEGER. Size of the block in the matrix A.
a REAL for mkl_sbsrgemv.
DOUBLE PRECISION for mkl_dbsrgemv.
COMPLEX for mkl_cbsrgemv.
DOUBLE COMPLEX for mkl_zbsrgemv.
Array containing elements of non-zero blocks of the matrix
A. Its length is equal to the number of non-zero blocks in
the matrix A multiplied by lb*lb. Refer to values array
description in BSR Format for more details.
ia INTEGER. Array of length (m + 1), containing indices of
block in the array a, such that ia(i) is the index in the
228
of non-zero blocks plus one. Refer to rowIndex array
non-zero block in the matrix A.
Its length is equal to the number of non-zero blocks of the
matrix A. Refer to columns array description in BSR Format
for more details.
x REAL for mkl_sbsrgemv.
Array, DIMENSION (m*lb).
Output Parameters
y REAL for mkl_sbsrgemv.
Array, DIMENSION at least (m*lb).
229
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
SUBROUTINE mkl_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
SUBROUTINE mkl_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
230
C:
void mkl_dbsrgemv(char *transa, int *m, int *lb, double *a,
void mkl_sbsrgemv(char *transa, int *m, int *lb, float *a,
void mkl_cbsrgemv(char *transa, int *m, int *lb, MKL_Complex8 *a,
void mkl_zbsrgemv(char *transa, int *m, int *lb, MKL_Complex16 *a,
mkl_?coogemv
Computes matrix-vector product of a sparse
general matrix stored in the coordinate format with
one-based indexing.
Syntax
Fortran:
call mkl_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
C:
mkl_scoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
mkl_dcoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
mkl_ccoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
mkl_zcoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
Description
for C interface.
231
The mkl_?coogemv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an m-by-m sparse square matrix in the coordinate format, A' is the transpose of A.
Input Parameters
val REAL for mkl_scoogemv.
DOUBLE PRECISION for mkl_dcoogemv.
COMPLEX for mkl_ccoogemv.
DOUBLE COMPLEX for mkl_zcoogemv.
Array of length nnz, contains non-zero elements of the
matrix A in the arbitrary order.
Refer to values array description in Coordinate Format for
more details.
rowind INTEGER. Array of length nnz, contains the row indices for
each non-zero element of the matrix A.
Refer to rows array description in Coordinate Format for
more details.
232
colind INTEGER. Array of length nnz, contains the column indices
for each non-zero element of the matrix A. Refer to columns
array description in Coordinate Format for more details.
nnz INTEGER. Specifies the number of non-zero element of the
matrix A.
Refer to nnz description in Coordinate Format for more
details.
x REAL for mkl_scoogemv.
One entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_scoogemv.
233
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
DOUBLE PRECISION val(*), x(*), y(*)
SUBROUTINE mkl_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
COMPLEX val(*), x(*), y(*)
SUBROUTINE mkl_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
DOUBLE COMPLEX val(*), x(*), y(*)
234
C:
void mkl_scoogemv(char *transa, int *m, float *val, int *rowind,
int *colind, int *nnz, float *x, float *y);
void mkl_dcoogemv(char *transa, int *m, double *val, int *rowind,
int *colind, int *nnz, double *x, double *y);
void mkl_ccoogemv(char *transa, int *m, MKL_Complex8 *val, int *rowind,
int *colind, int *nnz, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zcoogemv(char *transa, int *m, MKL_Complex16 *val, int *rowind,
mkl_?diagemv
general matrix stored in the diagonal format with
one-based indexing.
Syntax
Fortran:
call mkl_sdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
C:
mkl_sdiagemv(&transa, &m, val, &lval, idiag, &ndiag, x, y);
mkl_ddiagemv(&transa, &m, val, &lval, idiag, &ndiag, x, y);
mkl_cdiagemv(&transa, &m, val, &lval, idiag, &ndiag, x, y);
mkl_zdiagemv(&transa, &m, val, &lval, idiag, &ndiag, x, y);
Description
for C interface.
235
The mkl_?diagemv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an m-by-m sparse square matrix in the diagonal storage format, A' is the transpose of A.
Input Parameters
If transa = 'N' or 'n', then y := A*x
If transa = 'T' or 't' or 'C' or 'c', then y := A'*x,
val REAL for mkl_sdiagemv.
DOUBLE PRECISION for mkl_ddiagemv.
DOUBLE COMPLEX for mkl_zdiagemv.
Two-dimensional array of size lval by ndiag, contains
non-zero diagonals of the matrix A. Refer to values array
description in Diagonal Storage Scheme for more details.
lval INTEGER. Leading dimension of val, lval≥m. Refer to lval
idiag INTEGER. Array of length ndiag, contains the distances
between main diagonal and each non-zero diagonals in the
matrix A.
Refer to distance array description in Diagonal Storage
Scheme for more details.
236
ndiag INTEGER. Specifies the number of non-zero diagonals of the
matrix A.
x REAL for mkl_sdiagemv.
Output Parameters
y REAL for mkl_sdiagemv.
237
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL val(lval,*), x(*), y(*)
SUBROUTINE mkl_ddiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), x(*), y(*)
SUBROUTINE mkl_cdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX val(lval,*), x(*), y(*)
SUBROUTINE mkl_zdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), x(*), y(*)
238
C:
void mkl_sdiagemv(char *transa, int *m, float *val, int *lval,
int *idiag, int *ndiag, float *x, float *y);
void mkl_ddiagemv(char *transa, int *m, double *val, int *lval,
int *idiag, int *ndiag, double *x, double *y);
void mkl_cdiagemv(char *transa, int *m, MKL_Complex8 *val, int *lval,
int *idiag, int *ndiag, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zdiagemv(char *transa, int *m, MKL_Complex16 *val, int *lval,
mkl_?csrsymv
symmetrical matrix stored in the CSR format
(3-array variation) with one-based indexing.
Syntax
Fortran:
call mkl_scsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_dcsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_ccsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_zcsrsymv(uplo, m, a, ia, ja, x, y)
C:
mkl_scsrsymv(&uplo, &m, a, ia, ja, x, y);
mkl_dcsrsymv(&uplo, &m, a, ia, ja, x, y);
mkl_ccsrsymv(&uplo, &m, a, ia, ja, x, y);
mkl_zcsrsymv(&uplo, &m, a, ia, ja, x, y);
Description
for C interface.
239
The mkl_?csrsymv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the CSR format (3-array
variation), A' is the transpose of A.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or low triangle

of the matrix A is used.
If uplo = 'U' or 'u', then the upper triangle of the matrix
A is used.
If uplo = 'L' or 'l', then the low triangle of the matrix
A is used.
a REAL for mkl_scsrsymv.
DOUBLE PRECISION for mkl_dcsrsymv.
COMPLEX for mkl_ccsrsymv.
DOUBLE COMPLEX for mkl_zcsrsymv.
240
for more details.
x REAL for mkl_scsrsymv.
Output Parameters
y REAL for mkl_scsrsymv.
241
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
SUBROUTINE mkl_ccsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
SUBROUTINE mkl_zcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
242
C:
void mkl_scsrsymv(char *uplo, int *m, float *a,
void mkl_dcsrsymv(char *uplo, int *m, double *a,
void mkl_ccsrsymv(char *uplo, int *m, MKL_Complex8 *a,
void mkl_zcsrsymv(char *uplo, int *m, MKL_Complex16 *a,
mkl_?bsrsymv
symmetrical matrix stored in the BSR format
(3-array variation) with one-based indexing.
Syntax
Fortran:
call mkl_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
C:
mkl_sbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
mkl_dbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
mkl_cbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
mkl_zbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
Description
for C interface.
243
The mkl_?bsrsymv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the BSR format (3-array
variation), A' is the transpose of A.
Input Parameters

of the matrix A is considered.
A is used.
A is used.
a REAL for mkl_sbsrsymv.
DOUBLE PRECISION for mkl_dbsrsymv.
COMPLEX for mkl_cbsrsymv.
244
for more details.
x REAL for mkl_sbsrsymv.
Output Parameters
y REAL for mkl_sbsrsymv.
245
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
SUBROUTINE mkl_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
SUBROUTINE mkl_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
246
C:
void mkl_sbsrsymv(char *uplo, int *m, int *lb,
float *a, int *ia, int *ja, float *x, float *y);
void mkl_dbsrsymv(char *uplo, int *m, int *lb,
double *a, int *ia, int *ja, double *x, double *y);
void mkl_cbsrsymv(char *uplo, int *m, int *lb,
MKL_Complex8 *a, int *ia, int *ja, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zbsrsymv(char *uplo, int *m, int *lb,
mkl_?coosymv
symmetrical matrix stored in the coordinate format
with one-based indexing.
Syntax
Fortran:
call mkl_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
C:
mkl_scoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
mkl_dcoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
mkl_ccoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
mkl_zcoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
Description
for C interface.
247
The mkl_?coosymv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the coordinate format, A'
is the transpose of A.
Input Parameters
A is used.
A is used.
val REAL for mkl_scoosymv.
DOUBLE PRECISION for mkl_dcoosymv.
COMPLEX for mkl_ccoosymv.
DOUBLE COMPLEX for mkl_zcoosymv.
more details.
248
more details.
matrix A.
details.
x REAL for mkl_scoosymv.
Output Parameters
y REAL for mkl_scoosymv.
249
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
SUBROUTINE mkl_cdcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
SUBROUTINE mkl_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
250
C:
void mkl_scoosymv(char *uplo, int *m, float *val, int *rowind,
void mkl_dcoosymv(char *uplo, int *m, double *val, int *rowind,
void mkl_ccoosymv(char *uplo, int *m, MKL_Complex8 *val, int *rowind,
void mkl_zcoosymv(char *uplo, int *m, MKL_Complex16 *val, int *rowind,
mkl_?diasymv
symmetrical matrix stored in the diagonal format
Syntax
Fortran:
call mkl_sdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
C:
mkl_sdiasymv(&uplo, &m, val, &lval, idiag, &ndiag, x, y);
mkl_ddiasymv(&uplo, &m, val, &lval, idiag, &ndiag, x, y);
mkl_cdiasymv(&uplo, &m, val, &lval, idiag, &ndiag, x, y);
mkl_zdiasymv(&uplo, &m, val, &lval, idiag, &ndiag, x, y);
Description
for C interface.
251
The mkl_?diasymv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an upper or lower triangle of the symmetrical sparse matrix, A' is the transpose of A.
Input Parameters
A is used.
A is used.
val REAL for mkl_sdiasymv.
DOUBLE PRECISION for mkl_ddiasymv.
COMPLEX for mkl_cdiasymv.
DOUBLE COMPLEX for mkl_zdiasymv.
lval INTEGER. Leading dimension of val, lval ≥m. Refer to lval
matrix A.
252
matrix A.
x REAL for mkl_sdiasymv.
Output Parameters
y REAL for mkl_sdiasymv.
253
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER idiag(*)
SUBROUTINE mkl_ddiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER idiag(*)
SUBROUTINE mkl_cdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER idiag(*)
SUBROUTINE mkl_zdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER idiag(*)
254
C:
void mkl_sdiasymv(char *uplo, int *m, float *val, int *lval,
int *idiag, int *ndiag, float *x, float *y);
void mkl_ddiasymv(char *uplo, int *m, double *val, int *lval,
int *idiag, int *ndiag, double *x, double *y);
void mkl_cdiasymv(char *uplo, int *m, MKL_Complex8 *val, int *lval,
void mkl_zdiasymv(char *uplo, int *m, MKL_Complex16 *val, int *lval,
mkl_?csrtrsv
Triangular solvers with simplified interface for a
sparse matrix in the CSR format (3-array variation)
Syntax
Fortran:
call mkl_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
C:
mkl_scsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
mkl_dcsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
mkl_ccsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
mkl_zcsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
Description
for C interface.
255
The mkl_?csrtrsv routine solves a system of linear equations with matrix-vector operations
for a sparse matrix stored in the CSR format (3 array variation):
A*y = x
or
A'*y = x,
where:
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, A' is the
transpose of A.
Input Parameters
A is used.
A is used.
transa CHARACTER*1. Specifies the system of linear equations.
If transa = 'N' or 'n', then A*y = x
If transa = 'T' or 't' or 'C' or 'c', then A'*y = x,
diag CHARACTER*1. Specifies whether A is unit triangular.
If diag = 'U' or 'u', then A is aunit triangular.
If diag = 'N' or 'n', then A is not unit triangular.
a REAL for mkl_scsrtrmv.
DOUBLE PRECISION for mkl_dcsrtrmv.
COMPLEX for mkl_ccsrtrmv.
DOUBLE COMPLEX for mkl_zcsrtrmv.
256
for more details.
NOTE. Column indices must be sorted in increasing

order for each row.
x REAL for mkl_scsrtrmv.

Output Parameters
y REAL for mkl_scsrtrmv.
Contains the vector y.
257
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
INTEGER m
SUBROUTINE mkl_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
INTEGER m
SUBROUTINE mkl_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
INTEGER m
258
C:
void mkl_scsrtrsv(char *uplo, char *transa, char *diag, int *m,
void mkl_dcsrtrsv(char *uplo, char *transa, char *diag, int *m,
void mkl_ccsrtrsv(char *uplo, char *transa, char *diag, int *m,
void mkl_zcsrtrsv(char *uplo, char *transa, char *diag, int *m,
mkl_?bsrtrsv
Triangular solver with simplified interface for a
sparse matrix stored in the BSR format (3-array
Syntax
Fortran:
call mkl_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
C:
mkl_sbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
mkl_dbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
mkl_cbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
mkl_zbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
Description
for C interface.
259
The mkl_?bsrtrsv routine solves a system of linear equations with matrix-vector operations
for a sparse matrix stored in the BSR format (3-array variation) :
y := A*x
or
y := A'*x,
where:
transpose of A.
Input Parameters
uplo CHARACTER*1. Specifies the upper or low triangle of the
matrix A is used.
A is used.
A is used.
matrix-vector product is computed as y := A'*x.
diag CHARACTER*1. Specifies whether A is a unit triangular
matrix.
If diag = 'U' or 'u', then A is a unit triangular.
If diag = 'N' or 'n', then A is not a unit triangular.
260
a REAL for mkl_sbsrtrsv.
DOUBLE PRECISION for mkl_dbsrtrsv.
COMPLEX for mkl_cbsrtrsv.
DOUBLE COMPLEX for mkl_zbsrtrsv.
ja INTEGER.
Array containing the column indices for each non-zero block
in the matrix A.
for more details.
x REAL for mkl_sbsrtrsv.
Output Parameters
y REAL for mkl_sbsrtrsv.
261
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
SUBROUTINE mkl_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
SUBROUTINE mkl_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
262
C:
void mkl_sbsrtrsv(char *uplo, char *transa, char *diag, int *m,
int *lb, float *a, int *ia, int *ja, float *x, float *y);
void mkl_dbsrtrsv(char *uplo, char *transa, char *diag, int *m,
int *lb, double *a, int *ia, int *ja, double *x, double *y);
void mkl_cbsrtrsv(char *uplo, char *transa, char *diag, int *m,
int *lb, MKL_Complex8 *a, int *ia, int *ja, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zbsrtrsv(char *uplo, char *transa, char *diag, int *m,
mkl_?cootrsv
sparse matrix in the coordinate format with
one-based indexing.
Syntax
Fortran:
call mkl_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
C:
mkl_scootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz, x, y);
mkl_dcootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz, x, y);
mkl_ccootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz, x, y);
mkl_zcootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz, x, y);
Description
for C interface.
263
The mkl_?cootrsv routine solves a system of linear equations with matrix-vector operations
for a sparse matrix stored in the coordinate format:
A*y = x
or
A'*y = x,
where:
transpose of A.
Input Parameters
A is used.
A is used.
If diag = 'U' or 'u', then A is unit triangular.
val REAL for mkl_scootrsv.
DOUBLE PRECISION for mkl_dcootrsv.
COMPLEX for mkl_ccootrsv.
DOUBLE COMPLEX for mkl_zcootrsv.
264
more details.
more details.
matrix A.
details.
x REAL for mkl_scootrsv.
Output Parameters
y REAL for mkl_scootrsv.
265
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
SUBROUTINE mkl_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
SUBROUTINE mkl_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
266
C:
void mkl_scootrsv(char *uplo, char *transa, char *diag, int *m,
float *val, int *rowind, int *colind, int *nnz, float *x, double *y);
void mkl_dcootrsv(char *uplo, char *transa, char *diag, int *m,
double *val, int *rowind, int *colind, int *nnz, double *x, double *y);
void mkl_ccootrsv(char *uplo, char *transa, char *diag, int *m,
MKL_Complex8 *val, int *rowind, int *colind, int *nnz, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zcootrsv(char *uplo, char *transa, char *diag, int *m,
MKL_Complex16 *val, int *rowind, int *colind, int *nnz, MKL_Complex16 *x, MKL_Complex16
*y);
mkl_?diatrsv
sparse matrix in the diagonal format with
one-based indexing.
Syntax
Fortran:
call mkl_sdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
C:
mkl_sdiatrsv(&uplo, &transa, &diag, &m, val, &lval, idiag, &ndiag, x, y);
mkl_ddiatrsv(&uplo, &transa, &diag, &m, val, &lval, idiag, &ndiag, x, y);
mkl_cdiatrsv(&uplo, &transa, &diag, &m, val, &lval, idiag, &ndiag, x, y);
mkl_zdiatrsv(&uplo, &transa, &diag, &m, val, &lval, idiag, &ndiag, x, y);
267
Description
for C interface.
The mkl_?diatrsv routine solves a system of linear equations with matrix-vector operations
for a sparse matrix stored in the diagonal format:
A*y = x
or
A'*y = x,
where:
transpose of A.
Input Parameters
A is used.
A is used.
268
val REAL for mkl_sdiatrsv.
DOUBLE PRECISION for mkl_ddiatrsv.
COMPLEX for mkl_cdiatrsv.
DOUBLE COMPLEX for mkl_zdiatrsv.
matrix A.
NOTE. All elements of this array must be sorted in

increasing order.

matrix A.
x REAL for mkl_sdiatrsv.
Output Parameters
y REAL for mkl_sdiatrsv.
269
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
INTEGER indiag(*)
SUBROUTINE mkl_ddiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
INTEGER indiag(*)
SUBROUTINE mkl_cdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
INTEGER indiag(*)
SUBROUTINE mkl_zdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
INTEGER indiag(*)
270
C:
void mkl_sdiatrsv(char *uplo, char *transa, char *diag, int *m, float
*val, int *lval, int *idiag, int *ndiag, float *x, float *y);
void mkl_ddiatrsv(char *uplo, char *transa, char *diag, int *m, double
*val, int *lval, int *idiag, int *ndiag, double *x, double *y);
void mkl_cdiatrsv(char *uplo, char *transa, char *diag, int *m, MKL_Complex8
*val, int *lval, int *idiag, int *ndiag, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zdiatrsv(char *uplo, char *transa, char *diag, int *m, MKL_Complex16
*val, int *lval, int *idiag, int *ndiag, MKL_Complex16 *x, MKL_Complex16 *y);
mkl_cspblas_?csrgemv
general matrix stored in the CSR format (3-array
Syntax
Fortran:
call mkl_cspblas_scsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrgemv(transa, m, a, ia, ja, x, y)
C:
mkl_cspblas_scsrgemv(&transa, &m, a, ia, ja, x, y);
mkl_cspblas_dcsrgemv(&transa, &m, a, ia, ja, x, y);
mkl_cspblas_ccsrgemv(&transa, &m, a, ia, ja, x, y);
mkl_cspblas_zcsrgemv(&transa, &m, a, ia, ja, x, y);
271
Description
for C interface.
The mkl_cspblas_?csrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := A'*x,
where:
A is an m-by-m sparse square matrix in the CSR format (3-array variation) with zero-based
indexing, A' is the transpose of A.
NOTE. This routine supports only zero-based indexing of the input arrays.
Input Parameters
a REAL for mkl_cspblas_scsrgemv.
DOUBLE PRECISION for mkl_cspblas_dcsrgemv.
COMPLEX for mkl_cspblas_ccsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zcsrgemv.
272
elements in the array a, such that ia(I) is the index in the
array a of the first non-zero element from the row I. The
value of the last element ia(m) is equal to the number of
non-zeros. Refer to rowIndex array description in Sparse
Matrix Storage Formats for more details.
for more details.
x REAL for mkl_cspblas_scsrgemv.
One entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_scsrgemv.
273
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
SUBROUTINE mkl_cspblas_ccsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
SUBROUTINE mkl_cspblas_zcsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
274
C:
void mkl_cspblas_scsrgemv(char *transa, int *m, float *a,
void mkl_cspblas_dcsrgemv(char *transa, int *m, double *a,
void mkl_cspblas_ccsrgemv(char *transa, int *m, MKL_Complex8 *a,
void mkl_cspblas_zcsrgemv(char *transa, int *m, MKL_Complex16 *a,
mkl_cspblas_?bsrgemv
general matrix stored in the BSR format (3-array
Syntax
Fortran:
call mkl_cspblas_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
C:
mkl_cspblas_sbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_dbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_cbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_zbsrgemv(&transa, &m, &lb, a, ia, ja, x, y);
Description
for C interface.
275
The mkl_cspblas_?bsrgemv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an m-by-m block sparse square matrix in the BSR format (3-array variation) with zero-based
indexing, A' is the transpose of A.
Input Parameters
a REAL for mkl_cspblas_sbsrgemv.
DOUBLE PRECISION for mkl_cspblas_dbsrgemv.
COMPLEX for mkl_cspblas_cbsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zbsrgemv.
276
of non-zero blocks. Refer to rowIndex array description in
BSR Format for more details.
for more details.
x REAL for mkl_cspblas_sbsrgemv.
Output Parameters
y REAL for mkl_cspblas_sbsrgemv.
277
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
SUBROUTINE mkl_cspblas_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
SUBROUTINE mkl_cspblas_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
278
C:
void mkl_cspblas_sbsrgemv(char *transa, int *m, int *lb,
void mkl_cspblas_dbsrgemv(char *transa, int *m, int *lb,
void mkl_cspblas_cbsrgemv(char *transa, int *m, int *lb,
void mkl_cspblas_zbsrgemv(char *transa, int *m, int *lb,
mkl_cspblas_?coogemv
general matrix stored in the coordinate format with
Syntax
Fortran:
call mkl_cspblas_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
C:
mkl_cspblas_scoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
mkl_cspblas_dcoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
mkl_cspblas_ccoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
mkl_cspblas_zcoogemv(&transa, &m, val, rowind, colind, &nnz, x, y);
Description
for C interface.
279
The mkl_cspblas_dcoogemv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an m-by-m sparse square matrix in the coordinate format with zero-based indexing, A' is
the transpose of A.
Input Parameters
val REAL for mkl_cspblas_scoogemv.
DOUBLE PRECISION for mkl_cspblas_dcoogemv.
COMPLEX for mkl_cspblas_ccoogemv.
DOUBLE COMPLEX for mkl_cspblas_zcoogemv.
more details.
more details.
280
matrix A.
details.
x REAL for mkl_cspblas_scoogemv.
Output Parameters
y REAL for mkl_cspblas_scoogemv.
281
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
SUBROUTINE mkl_cspblas_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
SUBROUTINE mkl_cspblas_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
282
C:
void mkl_cspblas_scoogemv(char *transa, int *m, float *val, int *rowind,
void mkl_cspblas_dcoogemv(char *transa, int *m, double *val, int *rowind,
void mkl_cspblas_ccoogemv(char *transa, int *m, MKL_Complex8 *val, int *rowind,
void mkl_cspblas_zcoogemv(char *transa, int *m, MKL_Complex16 *val, int *rowind,
mkl_cspblas_?csrsymv
symmetrical matrix stored in the CSR format
(3-array variation) with zero-based indexing.
Syntax
Fortran:
call mkl_cspblas_scsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrsymv(uplo, m, a, ia, ja, x, y)
C:
mkl_cspblas_scsrsymv(&uplo, &m, a, ia, ja, x, y);
mkl_cspblas_dcsrsymv(&uplo, &m, a, ia, ja, x, y);
mkl_cspblas_ccsrsymv(&uplo, &m, a, ia, ja, x, y);
mkl_cspblas_zcsrsymv(&uplo, &m, a, ia, ja, x, y);
Description
for C interface.
283
The mkl_cspblas_?csrsymv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the CSR format (3-array
variation) with zero-based indexing, A' is the transpose of A.
Input Parameters

A is used.
A is used.
a REAL for mkl_cspblas_scsrsymv.
DOUBLE PRECISION for mkl_cspblas_dcsrsymv.
COMPLEX for mkl_cspblas_ccsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zcsrsymv.
284
of non-zeros. Refer to rowIndex array description in Sparse
for more details.
x REAL for mkl_cspblas_scsrsymv.
Output Parameters
y REAL for mkl_cspblas_scsrsymv.
285
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
SUBROUTINE mkl_cspblas_ccsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
SUBROUTINE mkl_cspblas_zcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
286
C:
void mkl_cspblas_scsrsymv(char *uplo, int *m, float *a,
void mkl_cspblas_dcsrsymv(char *uplo, int *m, double *a,
void mkl_cspblas_ccsrsymv(char *uplo, int *m, MKL_Complex8 *a,
void mkl_cspblas_zcsrsymv(char *uplo, int *m, MKL_Complex16 *a,
mkl_cspblas_?bsrsymv
symmetrical matrix stored in the BSR format
(3-arrays variation) with zero-based indexing.
Syntax
Fortran:
call mkl_cspblas_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
C:
mkl_cspblas_sbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_dbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_cbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_zbsrsymv(&uplo, &m, &lb, a, ia, ja, x, y);
Description
for C interface.
287
The mkl_cspblas_?bsrsymv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the BSR format (3-array
variation) with zero-based indexing, A' is the transpose of A.
Input Parameters

A is used.
A is used.
a REAL for mkl_cspblas_sbsrsymv.
DOUBLE PRECISION for mkl_cspblas_dbsrsymv.
COMPLEX for mkl_cspblas_cbsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zbsrsymv.
288
for more details.
x REAL for mkl_cspblas_sbsrsymv.
Output Parameters
y REAL for mkl_cspblas_sbsrsymv.
289
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
SUBROUTINE mkl_cspblas_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
SUBROUTINE mkl_cspblas_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
290
C:
void mkl_cspblas_sbsrsymv(char *uplo, int *m, int *lb,
void mkl_cspblas_dbsrsymv(char *uplo, int *m, int *lb,
void mkl_cspblas_cbsrsymv(char *uplo, int *m, int *lb,
void mkl_cspblas_zbsrsymv(char *uplo, int *m, int *lb,
mkl_cspblas_?coosymv
symmetrical matrix stored in the coordinate format
with zero-based indexing .
Syntax
Fortran:
call mkl_cspblas_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
C:
mkl_cspblas_scoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
mkl_cspblas_dcoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
mkl_cspblas_ccoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
mkl_cspblas_zcoosymv(&uplo, &m, val, rowind, colind, &nnz, x, y);
Description
for C interface.
291
The mkl_cspblas_?coosymv routine performs a matrix-vector operation defined as

y := A*x
or
y := A'*x,
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the coordinate format with
zero-based indexing, A' is the transpose of A.
Input Parameters
A is used.
A is used.
val REAL for mkl_cspblas_scoosymv.
DOUBLE PRECISION for mkl_cspblas_dcoosymv.
COMPLEX for mkl_cspblas_ccoosymv.
DOUBLE COMPLEX for mkl_cspblas_zcoosymv.
more details.
292
more details.
matrix A.
details.
x REAL for mkl_cspblas_scoosymv.
Output Parameters
y REAL for mkl_cspblas_scoosymv.
293
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
SUBROUTINE mkl_cspblas_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
SUBROUTINE mkl_cspblas_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
294
C:
void mkl_cspblas_scoosymv(char *uplo, int *m, float *val, int *rowind,
void mkl_cspblas_dcoosymv(char *uplo, int *m, double *val, int *rowind,
void mkl_cspblas_ccoosymv(char *uplo, int *m, MKL_Complex8 *val, int *rowind,
void mkl_cspblas_zcoosymv(char *uplo, int *m, MKL_Complex16 *val, int *rowind,
mkl_cspblas_?csrtrsv
sparse matrix in the CSR format (3-array variation)
with zero-based indexing.
Syntax
Fortran:
call mkl_cspblas_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
C:
mkl_cspblas_scsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
mkl_cspblas_dcsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
mkl_cspblas_ccsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
mkl_cspblas_zcsrtrsv(&uplo, &transa, &diag, &m, a, ia, ja, x, y);
Description
for C interface.
295
The mkl_cspblas_?csrtrsv routine solves a system of linear equations with matrix-vector

operations for a sparse matrix stored in the CSR format (3-array variation) with zero-based
indexing:
A*y = x
or
A'*y = x,
where:
transpose of A.
Input Parameters
A is used.
A is used.
diag CHARACTER*1. Specifies whether matrix A is unit triangular.
a REAL for mkl_cspblas_scsrtrsv.
DOUBLE PRECISION for mkl_cspblas_dcsrtrsv.
COMPLEX for mkl_cspblas_ccsrtrsv.
296
DOUBLE COMPLEX for mkl_cspblas_zcsrtrsv.
ia INTEGER. Array of length m+1, containing indices of elements
in the array a, such that ia(i) is the index in the array a
of the first non-zero element from the row i. The value of
the last element ia(m) is equal to the number of non-zeros.
Refer to rowIndex array description in Sparse Matrix Storage
Formats for more details.
for more details.

order for each row.
x REAL for mkl_cspblas_scsrtrsv.

Output Parameters
y REAL for mkl_cspblas_scsrtrsv.
297
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
INTEGER m
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
INTEGER m
SUBROUTINE mkl_cspblas_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
INTEGER m
SUBROUTINE mkl_cspblas_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
INTEGER m
298
C:
void mkl_cspblas_scsrtrsv(char *uplo, char *transa, char *diag, int *m,
void mkl_cspblas_dcsrtrsv(char *uplo, char *transa, char *diag, int *m,
void mkl_cspblas_ccsrtrsv(char *uplo, char *transa, char *diag, int *m,
void mkl_cspblas_zcsrtrsv(char *uplo, char *transa, char *diag, int *m,
mkl_cspblas_?bsrtrsv
Triangular solver with simplified interface for a
sparse matrix stored in the BSR format (3-array
Syntax
Fortran:
call mkl_cspblas_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
C:
mkl_cspblas_sbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_dbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_cbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
mkl_cspblas_zbsrtrsv(&uplo, &transa, &diag, &m, &lb, a, ia, ja, x, y);
Description
for C interface.
299
The mkl_cspblas_?bsrtrsv routine solves a system of linear equations with matrix-vector

operations for a sparse matrix stored in the BSR format (3-array variation) with zero-based
indexing:
y := A*x
or
y := A'*x,
where:
transpose of A.
Input Parameters
uplo CHARACTER*1. Specifies the upper or low triangle of the
matrix A is used.
A is used.
A is used.
matrix-vector product is computed as y := A'*x.
diag CHARACTER*1. Specifies whether matrix A is unit triangular
or not.
If diag = 'U' or 'u', A is unit triangular.
If diag = 'N' or 'n', A is not unit triangular.
300
a REAL for mkl_cspblas_sbsrtrmv.
DOUBLE PRECISION for mkl_cspblas_dbsrtrmv.
COMPLEX for mkl_cspblas_cbsrtrmv.
DOUBLE COMPLEX for mkl_cspblas_zbsrtrmv.
block in the array a, such that ia(I) is the index in the
of non-zero blocks. Refer to rowIndex array description in
BSR Format for more details.
for more details.
x REAL for mkl_cspblas_sbsrtrmv.
Output Parameters
y REAL for mkl_cspblas_sbsrtrmv.
301
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
SUBROUTINE mkl_cspblas_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
SUBROUTINE mkl_cspblas_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
302
C:
void mkl_cspblas_sbsrtrsv(char *uplo, char *transa, char *diag, int *m,
int *lb, float *a, int *ia, int *ja, float *x, float *y);
void mkl_cspblas_dbsrtrsv(char *uplo, char *transa, char *diag, int *m,
int *lb, double *a, int *ia, int *ja, double *x, double *y);
void mkl_cspblas_cbsrtrsv(char *uplo, char *transa, char *diag, int *m,
void mkl_cspblas_zbsrtrsv(char *uplo, char *transa, char *diag, int *m,
mkl_cspblas_?cootrsv
sparse matrix in the coordinate format with
zero-based indexing .
Syntax
Fortran:
call mkl_cspblas_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz,
x, y)
call mkl_cspblas_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz,
x, y)
call mkl_cspblas_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz,
x, y)
call mkl_cspblas_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz,
x, y)
303
C:
mkl_cspblas_scootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz,
x, y);
mkl_cspblas_dcootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz,
x, y);
mkl_cspblas_ccootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz,
x, y);
mkl_cspblas_zcootrsv(&uplo, &transa, &diag, &m, val, rowind, colind, &nnz,
x, y);
Description
for C interface.
The mkl_cspblas_?cootrsv routine solves a system of linear equations with matrix-vector
operations for a sparse matrix stored in the coordinate format with zero-based indexing:
A*y = x
or
A'*y = x,
where:
transpose of A.
Input Parameters
304
A is used.
A is used.
val REAL for mkl_cspblas_scootrsv.
DOUBLE PRECISION for mkl_cspblas_dcootrsv.
COMPLEX for mkl_cspblas_ccootrsv.
DOUBLE COMPLEX for mkl_cspblas_zcootrsv.
more details.
more details.
matrix A.
details.
x REAL for mkl_cspblas_scootrsv.
305
Output Parameters
y REAL for mkl_cspblas_scootrsv.
306
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
SUBROUTINE mkl_cspblas_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
SUBROUTINE mkl_cspblas_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
307
C:
void mkl_cspblas_scootrsv(char *uplo, char *transa, char *diag, int *m,
float *val, int *rowind, int *colind, int *nnz, float *x, float *y);
void mkl_cspblas_dcootrsv(char *uplo, char *transa, char *diag, int *m,
double *val, int *rowind, int *colind, int *nnz, double *x, double *y);
void mkl_cspblas_ccootrsv(char *uplo, char *transa, char *diag, int *m,
MKL_Complex8 *val, int *rowind, int *colind, int *nnz, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_cspblas_zcootrsv(char *uplo, char *transa, char *diag, int *m,
*y);
mkl_?csrmv
matrix stored in the CSR format.
Syntax
Fortran:
call mkl_scsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
call mkl_dcsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
call mkl_ccsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
call mkl_zcsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
308
C:
mkl_scsrmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
mkl_dcsrmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
mkl_ccsrmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
mkl_zcsrmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
Description
for C interface.
The mkl_?csrmv routine performs a matrix-vector operation defined as
or
where:
A is an m-by-k sparse matrix in the CSR format, A' is the transpose of A.
NOTE. This routine supports a CSR format both with one-based indexing and zero-based
indexing.
Input Parameters
If transa = 'N' or 'n', then y := alpha*A*x + beta*y
309
If transa = 'T' or 't' or 'C' or 'c', then y :=

alpha*A'*x + beta*y,
k INTEGER. Number of columns of the matrix A.
alpha REAL for mkl_scsrmv.
DOUBLE PRECISION for mkl_dcsrmv.
COMPLEX for mkl_ccsrmv.
DOUBLE COMPLEX for mkl_zcsrmv.
matdescra CHARACTER. Array of six elements, specifies properties of
the matrix used for operation. Only first four array elements
are used, their possible values are given in the Table 2-6 .
val REAL for mkl_scsrmv.
Array containing non-zero elements of the matrix A.
For one-based indexing its length is pntre(m) - pntrb(1).
For zero-based indexing its length is pntre(m-1) -
pntrb(0).
Refer to values array description in CSR Format for more
details.
indx INTEGER. Array containing the column indices for each
non-zero element of the matrix A.Its length is equal to length
of the val array.
Refer to columns array description in CSR Format for more
details.
pntrb INTEGER. Array of length m.
For one-based indexing this array contains row indices, such
that pntrb(i) - pntrb(1)+1 is the first index of row i in
the arrays val and indx.
For zero-based indexing this array contains row indices,
such that pntrb(i) - pntrb(0) is the first index of row
i in the arrays val and indx. Refer to pointerb array
description in CSR Format for more details.
pntre INTEGER. Array of length m.
310
that pntre(i) - pntrb(1) is the last index of row i in
such that pntre(i) - pntrb(0)-1 is the last index of row
i in the arrays val and indx.Refer to pointerE array
x REAL for mkl_scsrmv.
Array, DIMENSION at least k if transa = 'N' or 'n' and
at least m otherwise. On entry, the array x must contain the
vector x.
beta REAL for mkl_scsrmv.
y REAL for mkl_scsrmv.
Array, DIMENSION at least m if transa = 'N' or 'n' and
at least k otherwise. On entry, the array y must contain the
vector y.
Output Parameters
311
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmv(transa, m, k, alpha, matdescra, val, indx,
pntrb, pntre, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcsrmv(transa, m, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k
DOUBLE PRECISION alpha, beta
SUBROUTINE mkl_ccsrmv(transa, m, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k
COMPLEX alpha, beta
SUBROUTINE mkl_zcsrmv(transa, m, k, alpha, matdescra, val, indx,
312
CHARACTER*1 transa
INTEGER m, k
REAL*8 alpha, beta
REAL*8 val(*), x(*), y(*)
C:
void mkl_scsrmv(char *transa, int *m, int *k, float *alpha, char *matdescra,
float *val, int *indx, int *pntrb, int *pntre, float *x, float *beta, float *y);
void mkl_dcsrmv(char *transa, int *m, int *k, double *alpha, char *matdescra,
double *val, int *indx, int *pntrb, int *pntre, double *x, double *beta, double *y);
void mkl_ccsrmv(char *transa, int *m, int *k, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *indx, int *pntrb, int *pntre, MKL_Complex8 *x, MKL_Complex8 *beta,
double *y);
void mkl_zcsrmv(char *transa, int *m, int *k, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *indx, int *pntrb, int *pntre, MKL_Complex16 *x, MKL_Complex16
*beta, MKL_Complex16 *y);
313
mkl_?bsrmv
matrix stored in the BSR format.
Syntax
Fortran:
call mkl_sbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre,
x, beta, y)
call mkl_dbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre,
x, beta, y)
call mkl_cbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre,
x, beta, y)
call mkl_zbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre,
x, beta, y)
C:
mkl_sbsrmv(&transa, &m, &k, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
x, &beta, y);
mkl_dbsrmv(&transa, &m, &k, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
x, &beta, y);
mkl_cbsrmv(&transa, &m, &k, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
x, &beta, y);
mkl_zbsrmv(&transa, &m, &k, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
x, &beta, y);
Description
for C interface.
The mkl_?bsrmv routine performs a matrix-vector operation defined as
or
314
where:
A is an m-by-k block sparse matrix in the BSR format, A' is the transpose of A.
NOTE. This routine supports a BSR format both with one-based indexing and zero-based
indexing.
Input Parameters
computed as y := alpha*A*x + beta*y
matrix-vector product is computed as y := alpha*A'*x
+ beta*y,
k INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrmv.
DOUBLE PRECISION for mkl_dbsrmv.
COMPLEX for mkl_cbsrmv.
DOUBLE COMPLEX for mkl_zbsrmv.
val REAL for mkl_sbsrmv.
315

the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more
details.
non-zero block in the matrix A. Its length is equal to the
number of non-zero blocks in the matrix A.
Refer to columns array description in BSR Format for more
details.
For one-based indexing: this array contains row indices,
such that pntrb(i) - pntrb(1)+1 is the first index of
block row i in the array indx.
For zero-based indexing: this array contains row indices,
such that pntrb(i) - pntrb(0) is the first index of block
row i in the array indx
Refer to pointerB array description in BSR Format for more
details.
that pntre(i) - pntrb(1) is the last index of block row
i in the array indx.
such that pntre(i) - pntrb(0)-1 is the last index of
Refer to pointerE array description in BSR Format for more
details.
x REAL for mkl_sbsrmv.
Array, DIMENSION at least (k*lb) if transa = 'N' or 'n',
and at least (m*lb) otherwise. On entry, the array x must
beta REAL for mkl_sbsrmv.
316
y REAL for mkl_sbsrmv.
Array, DIMENSION at least (m*lb) if transa = 'N' or 'n',
and at least (k*lb) otherwise. On entry, the array y must
contain the vector y.
Output Parameters
317
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k, lb
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k, lb
SUBROUTINE mkl_cbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k, lb
COMPLEX alpha, beta
SUBROUTINE mkl_zbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,
318
CHARACTER*1 transa
INTEGER m, k, lb
DOUBLE COMPLEX alpha, beta
C:
void mkl_sbsrmv(char *transa, int *m, int *k, int *lb,
float *alpha, char *matdescra, float *val, int *indx,
int *pntrb, int *pntre, float *x, float *beta, float *y);
void mkl_dbsrmv(char *transa, int *m, int *k, int *lb,
double *alpha, char *matdescra, double *val, int *indx,
int *pntrb, int *pntre, double *x, double *beta, double *y);
void mkl_cbsrmv(char *transa, int *m, int *k, int *lb,
MKL_Complex8 *alpha, char *matdescra, MKL_Complex8 *val, int *indx,
int *pntrb, int *pntre, MKL_Complex8 *x, MKL_Complex8 *beta, MKL_Complex8 *y);
void mkl_zbsrmv(char *transa, int *m, int *k, int *lb,
int *pntrb, int *pntre, MKL_Complex16 *x, MKL_Complex16 *beta, MKL_Complex16 *y);
319
mkl_?cscmv
Computes matrix-vector product for a sparse
Syntax
Fortran:
call mkl_scscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
call mkl_dcscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
call mkl_ccscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
call mkl_zcscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x,
beta, y)
C:
mkl_scscmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
mkl_dcscmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
mkl_ccscmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
mkl_zcscmv(&transa, &m, &k, &alpha, matdescra, val, indx, pntrb, pntre, x,
&beta, y);
Description
for C interface.
The mkl_?cscmv routine performs a matrix-vector operation defined as
or
320
where:
A is an m-by-k sparse matrix in compressed sparse column (CSC) format, A' is the transpose
of A.
NOTE. This routine supports CSC format both with one-based indexing and zero-based
indexing.
Input Parameters
alpha REAL for mkl_scscmv.
DOUBLE PRECISION for mkl_dcscmv.
COMPLEX for mkl_ccscmv.
DOUBLE COMPLEX for mkl_zcscmv.
val REAL for mkl_scscmv.
For one-based indexing its length is pntre(k) - pntrb(1).
321

pntrb(0).
Refer to values array description in CSC Format for more
details.
indx INTEGER. Array containing the row indices for each non-zero
element of the matrix A. Its length is equal to length of the
val array.
Refer to rows array description in CSC Format for more
details.
pntrb INTEGER. Array of length k.
For one-based indexing this array contains column indices,
column i in the arrays val and indx.
For zero-based indexing this array contains column indices,
such that pntrb(i) - pntrb(0) is the first index of column
i in the arrays val and indx.
Refer to pointerb array description in CSC Format for more
details.
pntre INTEGER. Array of length k.
such that pntre(i) - pntrb(1) is the last index of column
Refer to pointerE array description in CSC Format for more
details.
x REAL for mkl_scscmv.
vector x.
beta REAL for mkl_scscmv.
322
y REAL for mkl_scscmv.
vector y.
Output Parameters
323
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscmv(transa, m, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k, ldb, ldc
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcscmv(transa, m, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
SUBROUTINE mkl_ccscmv(transa, m, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zcscmv(transa, m, k, alpha, matdescra, val, indx,
324
CHARACTER*1 transa
C:
void mkl_scscmv(char *transa, int *m, int *k, float *alpha,
char *matdescra, float *val, int *indx, int *pntrb,
int *pntre, float *x, float *beta, float *y);
void mkl_dcscmv(char *transa, int *m, int *k, double *alpha,
char *matdescra, double *val, int *indx, int *pntrb,
int *pntre, double *x, double *beta, double *y);
void mkl_ccscmv(char *transa, int *m, int *k, MKL_Complex8 *alpha,
char *matdescra, MKL_Complex8 *val, int *indx, int *pntrb,
int *pntre, MKL_Complex8 *x, MKL_Complex8 *beta, MKL_Complex8 *y);
void mkl_zcscmv(char *transa, int *m, int *k, MKL_Complex16 *alpha,
int *pntre, MKL_Complex16 *x, MKL_Complex16 *beta, MKL_Complex16 *y);
325
mkl_?coomv
Computes matrix - vector product for a sparse
Syntax
Fortran:
call mkl_scoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
beta, y)
call mkl_dcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
beta, y)
call mkl_ccoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
beta, y)
call mkl_zcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
beta, y)
C:
mkl_scoomv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
&beta, y);
mkl_dcoomv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
&beta, y);
mkl_ccoomv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
&beta, y);
mkl_zcoomv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
&beta, y);
Description
for C interface.
The mkl_?coomv routine performs a matrix-vector operation defined as
or
326
where:
A is an m-by-k sparse matrix in compressed coordinate format, A' is the transpose of A.
NOTE. This routine supports a coordinate format both with one-based indexing and
Input Parameters
alpha REAL for mkl_scoomv.
DOUBLE PRECISION for mkl_dcoomv.
COMPLEX for mkl_ccoomv.
DOUBLE COMPLEX for mkl_zcoomv.
val REAL for mkl_scoomv.
more details.
327
more details.
matrix A.
details.
x REAL for mkl_scoomv.
vector x.
beta REAL for mkl_scoomv.
y REAL for mkl_scoomv.
vector y.
Output Parameters
328
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta,
y)
CHARACTER*1 transa
INTEGER m, k, nnz
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta,
y)
CHARACTER*1 transa
INTEGER m, k, nnz
SUBROUTINE mkl_ccoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta,
y)
CHARACTER*1 transa
INTEGER m, k, nnz
COMPLEX alpha, beta
SUBROUTINE mkl_zcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta,
y)
CHARACTER*1 transa
329
INTEGER m, k, nnz
C:
void mkl_scoomv(char *transa, int *m, int *k, float *alpha, char *matdescra,
float *val, int *rowind, int *colind, int *nnz, float *x, float *beta, float *y);
void mkl_dcoomv(char *transa, int *m, int *k, double *alpha, char *matdescra,
double *val, int *rowind, int *colind, int *nnz, double *x, double *beta, double *y);
void mkl_ccoomv(char *transa, int *m, int *k, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *rowind, int *colind, int *nnz, MKL_Complex8 *x, MKL_Complex8 *beta,
MKL_Complex8 *y);
void mkl_zcoomv(char *transa, int *m, int *k, MKL_Complex16 *alpha, char *matdescra,
*beta, MKL_Complex16 *y);
mkl_?csrsv
Solves a system of linear equations for a sparse
matrix in the CSR format.
Syntax
Fortran:
call mkl_scsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_dcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_ccsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_zcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
330
C:
mkl_scsrsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
mkl_dcsrsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
mkl_ccsrsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
mkl_zcsrsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
Description
for C interface.
The mkl_?csrsv routine solves a system of linear equations with matrix-vector operations for
a sparse matrix in the CSR format:
y := alpha*inv(A)*x
or
y := alpha*inv(A')*x,
where:
alpha is scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or
non-unit main diagonal, A' is the transpose of A.
indexing.
Input Parameters
If transa = 'N' or 'n', then y := alpha*inv(A)*x
alpha*inv(A')*x,
m INTEGER. Number of columns of the matrix A.
alpha REAL for mkl_scsrsv.
DOUBLE PRECISION for mkl_dcsrsv.
331
COMPLEX for mkl_ccsrsv.

DOUBLE COMPLEX for mkl_zcsrsv.
val REAL for mkl_scsrsv.
pntrb(0).
details.
non-zero element of the matrix A. Its length is equal to
length of the val array.
details.

order for each row.

332
x REAL for mkl_scsrsv.
On entry, the array x must contain the vector x. The
elements are accessed with unit increment.
y REAL for mkl_scsrsv.
On entry, the array y must contain the vector y. The
Output Parameters
y Contains solution vector x.
333
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
DOUBLE PRECISION alpha
DOUBLE PRECISION val(*)
DOUBLE PRECISION x(*), y(*)
SUBROUTINE mkl_ccsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
334
CHARACTER*1 transa
INTEGER m
DOUBLE COMPLEX alpha
DOUBLE COMPLEX val(*)
DOUBLE COMPLEX x(*), y(*)
C:
void mkl_scsrsv(char *transa, int *m, float *alpha, char *matdescra,
float *val, int *indx, int *pntrb, int *pntre, float *x, float *y);
void mkl_dcsrsv(char *transa, int *m, double *alpha, char *matdescra,
double *val, int *indx, int *pntrb, int *pntre, double *x, double *y);
void mkl_ccsrsv(char *transa, int *m, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *indx, int *pntrb, int *pntre, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zcsrsv(char *transa, int *m, MKL_Complex16 *alpha, char *matdescra,
335
mkl_?bsrsv
matrix in the BSR format.
Syntax
Fortran:
call mkl_sbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x,
y)
call mkl_dbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x,
y)
call mkl_cbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x,
y)
call mkl_zbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x,
y)
C:
mkl_sbsrsv(&transa, &m, &lb, &alpha, matdescra, val, indx, pntrb, pntre, x,
y);
mkl_dbsrsv(&transa, &m, &lb, &alpha, matdescra, val, indx, pntrb, pntre, x,
y);
mkl_cbsrsv(&transa, &m, &lb, &alpha, matdescra, val, indx, pntrb, pntre, x,
y);
mkl_zbsrsv(&transa, &m, &lb, &alpha, matdescra, val, indx, pntrb, pntre, x,
y);
Description
for C interface.
The mkl_?bsrsv routine solves a system of linear equations with matrix-vector operations for
a sparse matrix in the BSR format:
y := alpha*inv(A)*x
or
y := alpha*inv(A')* x,
336
where:
indexing.
Input Parameters
alpha*inv(A')* x,
m INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrsv.
DOUBLE PRECISION for mkl_dbsrsv.
COMPLEX for mkl_cbsrsv.
DOUBLE COMPLEX for mkl_zbsrsv.
val REAL for mkl_sbsrsv.
the matrix A multiplied by lb*lb.
Refer to the values array description in BSR Format for
more details.
337

number of non-zero blocks in the matrix A.
Refer to the columns array description in BSR Format for
more details.
row i in the array indx
details.
details.
x REAL for mkl_sbsrsv.
y REAL for mkl_sbsrsv.
338
Output Parameters
339
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m, lb
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m, lb
SUBROUTINE mkl_cbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m, lb
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
340
CHARACTER*1 transa
INTEGER m, lb
C:
void mkl_sbsrsv(char *transa, int *m, int *lb, float *alpha, char *matdescra,
void mkl_dbsrsv(char *transa, int *m, int *lb, double *alpha, char *matdescra,
void mkl_cbsrsv(char *transa, int *m, int *lb, MKL_Complex8 *alpha, char *matdescra,
void mkl_zbsrsv(char *transa, int *m, int *lb, MKL_Complex16 *alpha, char *matdescra,
341
mkl_?cscsv
Syntax
Fortran:
call mkl_scscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x,
y)call mkl_dcscsv
call mkl_dcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x,
y)call mkl_dcscsv
call mkl_ccscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x,
y)call mkl_dcscsv
call mkl_zcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x,
y)call mkl_dcscsv
C:
mkl_scscsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
mkl_dcscsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
mkl_ccscsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
mkl_zcscsv(&transa, &m, &alpha, matdescra, val, indx, pntrb, pntre, x, y);
Description
for C interface.
The mkl_?cscsv routine solves a system of linear equations with matrix-vector operations for
a sparse matrix in the CSC format:
y := alpha*inv(A)*x
or
where:
342
NOTE. This routine supports a CSC format both with one-based indexing and zero-based
indexing.
Input Parameters
If transa= 'T' or 't' or 'C' or 'c', then y :=
alpha*inv(A')* x,
alpha REAL for mkl_scscsv.
DOUBLE PRECISION for mkl_dcscsv.
COMPLEX for mkl_ccscsv.
DOUBLE COMPLEX for mkl_zcscsv.
val REAL for mkl_scscsv.
pntrb(0).
details.
val array.
Refer to columns array description in CSC Format for more
details.
343
NOTE. Row indices must be sorted in increasing

order for each column.

details.
details.
x REAL for mkl_scscsv.
y REAL for mkl_scscsv.
344
Output Parameters
y Contains the solution vector x.
345
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
SUBROUTINE mkl_ccscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
346
CHARACTER*1 transa
INTEGER m
C:
void mkl_scscsv(char *transa, int *m, float *alpha, char *matdescra,
void mkl_dcscsv(char *transa, int *m, double *alpha, char *matdescra,
void mkl_ccscsv(char *transa, int *m, MKL_Complex8 *alpha, char *matdescra,
void mkl_zcscsv(char *transa, int *m, MKL_Complex16 *alpha, char *matdescra,
347
mkl_?coosv
Syntax
Fortran:
call mkl_scoosv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
y)
call mkl_dcoosv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
y)
call mkl_ccoosv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
y)
call mkl_zcoosv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x,
y)
C:
mkl_scoosv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
y);
mkl_dcoosv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
y);
mkl_ccoosv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
y);
mkl_zcoosv(&transa, &m, &k, &alpha, matdescra, val, rowind, colind, &nnz, x,
y);
Description
for C interface.
The mkl_?coosv routine solves a system of linear equations with matrix-vector operations for
a sparse matrix in the coordinate format:
y := alpha*inv(A)*x
or
348
where:
Input Parameters
alpha*inv(A')* x,
alpha REAL for mkl_scoosv.
DOUBLE PRECISION for mkl_dcoosv.
COMPLEX for mkl_ccoosv.
DOUBLE COMPLEX for mkl_zcoosv.
val REAL for mkl_scoosv.
more details.
349

more details.
matrix A.
details.
x REAL for mkl_scoosv.
y REAL for mkl_scoosv.
Output Parameters
350
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
SUBROUTINE mkl_ccoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
351
CHARACTER*1 transa
INTEGER m, nnz
C:
void mkl_scoosv(char *transa, int *m, float *alpha, char *matdescra,
float *val, int *rowind, int *colind, int *nnz,
float *x, float *y);
void mkl_dcoosv(char *transa, int *m, double *alpha, char *matdescra,
double *val, int *rowind, int *colind, int *nnz,
double *x, double *y);
void mkl_ccoosv(char *transa, int *m, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *rowind, int *colind, int *nnz,
MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zcoosv(char *transa, int *m, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *rowind, int *colind, int *nnz,
MKL_Complex16 *x, MKL_Complex16 *y);
352
mkl_?csrmm
Computes matrix - matrix product of a sparse
matrix stored in the CSR format.
Syntax
Fortran:
call mkl_scsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
b, ldb, beta, c, ldc)
call mkl_dcsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
call mkl_ccsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
call mkl_zcsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
C:
mkl_scsrmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
b, &ldb, &beta, c, &ldc);
mkl_dcsrmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
mkl_ccsrmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
mkl_zcsrmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
Description
for C interface.
The mkl_?csrmm routine performs a matrix-matrix operation defined as
or
C := alpha*A'*B + beta*C,
353
where:
B and C are dense matrices, A is an m-by-k sparse matrix in compressed sparse row (CSR)
format, A' is the transpose of A.
indexing.
Input Parameters
If transa = 'N' or 'n', then C := alpha*A*B + beta*C
If transa = 'T' or 't' or 'C' or 'c', then C :=
alpha*A'*B + beta*C,
n INTEGER. Number of columns of the matrix C.
alpha REAL for mkl_scsrmm.
DOUBLE PRECISION for mkl_dcsrmm.
COMPLEX for mkl_ccsrmm.
DOUBLE COMPLEX for mkl_zcsrmm.
val REAL for mkl_scsrmm.
354
For zero-based indexing its length is pntre(—1) -
pntrb(0).
details.
details.
that pntrb(I) - pntrb(1)+1 is the first index of row I in
such that pntrb(I) - pntrb(0) is the first index of row
I in the arrays val and indx.
Refer to pointerb array description in CSR Format for more
details.
that pntre(I) - pntrb(1) is the last index of row I in
such that pntre(I) - pntrb(0)-1 is the last index of row
Refer to pointerE array description in CSR Format for more
details.
b REAL for mkl_scsrmm.
Array, DIMENSION (ldb, n) for one-based indexing, and
(m, ldb) for zero-based indexing.
On entry with transa= 'N' or 'n', the leading k-by-n part
of the array b must contain the matrix B, otherwise the
leading m-by-n part of the array b must contain the matrix
B.
355
ldb INTEGER. Specifies the first dimension of b for one-based

indexing, and the second dimension of b for zero-based
indexing, as declared in the calling (sub)program.
beta REAL for mkl_scsrmm.
c REAL for mkl_scsrmm.
Array, DIMENSION (ldc, n) for one-based indexing, and (m,
ldc) for zero-based indexing.
On entry, the leading m-by-n part of the array c must contain
the matrix C, otherwise the leading k-by-n part of the array
c must contain the matrix C.
ldc INTEGER. Specifies the first dimension of c for one-based
indexing, and the second dimension of c for zero-based
Output Parameters
c Overwritten by the matrix (alpha*A*B + beta* C) or
(alpha*A'*B + beta*C).
356
Interfaces
FORTRAN 77:
SUBROUTINE mkl_dcsrmm(transa, m, n, k, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, ldb, ldc
REAL alpha, beta
REAL val(*), b(ldb,*), c(ldc,*)
CHARACTER*1 transa
DOUBLE PRECISION val(*), b(ldb,*), c(ldc,*)
CHARACTER*1 transa
COMPLEX alpha, beta
COMPLEX val(*), b(ldb,*), c(ldc,*)
357

CHARACTER*1 transa
DOUBLE COMPLEX val(*), b(ldb,*), c(ldc,*)
C:
void mkl_scsrmm(char *transa, int *m, int *n, int *k, float *alpha,
char *matdescra, float *val, int *indx, int *pntrb, int *pntre,
float *b, int *ldb, float *beta, float *c, int *ldc,);
void mkl_dcsrmm(char *transa, int *m, int *n, int *k, double *alpha,
char *matdescra, double *val, int *indx, int *pntrb, int *pntre,
double *b, int *ldb, double *beta, double *c, int *ldc,);
void mkl_ccsrmm(char *transa, int *m, int *n, int *k, double *alpha,
void mkl_zcsrmm(char *transa, int *m, int *n, int *k, double *alpha,
358
mkl_?bsrmm
Computes matrix - matrix product of a sparse
matrix stored in the BSR format.
Syntax
Fortran:
call mkl_sbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb,
pntre, b, ldb, beta, c, ldc)
call mkl_dbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb,
call mkl_cbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb,
call mkl_zbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb,
C:
mkl_sbsrmm(&transa, &m, &n, &k, &lb, &alpha, matdescra, val, indx, pntrb,
pntre, b, &ldb, &beta, c, &ldc);
mkl_dbsrmm(&transa, &m, &n, &k, &lb, &alpha, matdescra, val, indx, pntrb,
mkl_cbsrmm(&transa, &m, &n, &k, &lb, &alpha, matdescra, val, indx, pntrb,
mkl_zbsrmm(&transa, &m, &n, &k, &lb, &alpha, matdescra, val, indx, pntrb,
Description
for C interface.
The mkl_?bsrmm routine performs a matrix-matrix operation defined as
or
359
where:
B and C are dense matrices, A is an m-by-k sparse matrix in block sparse row (BSR) format, A'
is the transpose of A.
indexing.
Input Parameters
If transa = 'N' or 'n', then the matrix-matrix product
is computed as C := alpha*A*B + beta*C
matrix-vector product is computed as C := alpha*A'*B
+ beta*C,
k INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrmm.
DOUBLE PRECISION for mkl_dbsrmm.
COMPLEX for mkl_cbsrmm.
DOUBLE COMPLEX for mkl_zbsrmm.
val REAL for mkl_sbsrmm.
360
the matrix A multiplied by lb*lb. Refer to the values array
number of non-zero blocks in the matrix A. Refer to the
columns array description in BSR Format for more details.
such that pntrb(I) - pntrb(1)+1 is the first index of
block row I in the array indx.
such that pntrb(I) - pntrb(0) is the first index of block
row I in the array indx.
details.
that pntre(I) - pntrb(1) is the last index of block row
I in the array indx.
such that pntre(I) - pntrb(0)-1 is the last index of
block row I in the array indx.
details.
b REAL for mkl_sbsrmm.
Array, DIMENSION (ldb, n) for one-based indexing,
DIMENSION (m, ldb) for zero-based indexing.
On entry with transa= 'N' or 'n', the leading n-by-k block
part of the array b must contain the matrix B, otherwise the
leading m-by-n block part of the array b must contain the
matrix B.
361
ldb INTEGER. Specifies the first dimension (in blocks) of b as

declared in the calling (sub)program.
beta REAL for mkl_sbsrmm.
c REAL for mkl_sbsrmm.
Array, DIMENSION (ldc, n) for one-based indexing,
DIMENSION (k, ldc) for zero-based indexing.
On entry, the leading m-by-n block part of the array c must
contain the matrix C, otherwise the leading n-by-k block
part of the array c must contain the matrix C.
ldc INTEGER. Specifies the first dimension (in blocks) of c as
Output Parameters
c Overwritten by the matrix (alpha*A*B + beta*C) or
362
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrmm(transa, m, n, k, lb, alpha, matdescra, val,
indx, pntrb, pntre, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, ld, ldb, ldc
REAL alpha, beta
SUBROUTINE mkl_dbsrmm(transa, m, n, k, lb, alpha, matdescra, val,
CHARACTER*1 transa
SUBROUTINE mkl_cbsrmm(transa, m, n, k, lb, alpha, matdescra, val,
CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zbsrmm(transa, m, n, k, lb, alpha, matdescra, val,
363

CHARACTER*1 transa
C:
void mkl_sbsrmm(char *transa, int *m, int *n, int *k, int *lb,
float *alpha, char *matdescra, float *val, int *indx, int *pntrb,
int *pntre, float *b, int *ldb, float *beta, float *c, int *ldc,);
void mkl_dbsrmm(char *transa, int *m, int *n, int *k, int *lb,
double *alpha, char *matdescra, double *val, int *indx, int *pntrb,
int *pntre, double *b, int *ldb, double *beta, double *c, int *ldc,);
void mkl_cbsrmm(char *transa, int *m, int *n, int *k, int *lb,
MKL_Complex8 *alpha, char *matdescra, MKL_Complex8 *val, int *indx, int *pntrb,
int *pntre, MKL_Complex8 *b, int *ldb, MKL_Complex8 *beta, MKL_Complex8 *c, int *ldc,);
void mkl_zbsrmm(char *transa, int *m, int *n, int *k, int *lb,
MKL_Complex16 *alpha, char *matdescra, MKL_Complex16 *val, int *indx, int *pntrb,
int *pntre, MKL_Complex16 *b, int *ldb, MKL_Complex16 *beta, MKL_Complex16 *c, int *ldc,);
364
mkl_?cscmm
Computes matrix-matrix product of a sparse matrix
stored in the CSC format.
Syntax
Fortran:
call mkl_scscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
call mkl_dcscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
call mkl_ccscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
call mkl_zcscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre,
C:
mkl_scscmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
mkl_dcscmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
mkl_ccscmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
mkl_zcscmm(&transa, &m, &n, &k, &alpha, matdescra, val, indx, pntrb, pntre,
Description
for C interface.
The mkl_?cscmm routine performs a matrix-matrix operation defined as
or
365
where:
B and C are dense matrices, A is an m-by-k sparse matrix in compressed sparse column (CSC)
format, A' is the transpose of A.
NOTE. This routine supports CSC format both with one-based indexing and zero-based
indexing.
Input Parameters
If transa = 'N' or 'n', then C := alpha*A* B + beta*C
alpha REAL for mkl_scscmm.
DOUBLE PRECISION for mkl_dcscmm.
COMPLEX for mkl_ccscmm.
DOUBLE COMPLEX for mkl_zcscmm.
val REAL for mkl_scscmm.
366
pntrb(0).
details.
element of the matrix A.Its length is equal to length of the
val array.
details.
pntrb INTEGER. Array of length k.
details.
pntre INTEGER. Array of length k.
details.
b REAL for mkl_scscmm.
Array, DIMENSION(ldb, n) for one-based indexing, and
On entry with transa = 'N' or 'n', the leading k-by-n
B.
367

beta REAL*8. Specifies the scalar beta.
c REAL for mkl_scscmm.
Output Parameters
c Overwritten by the matrix (alpha*A*B + beta* C) or
368
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscmm(transa, m, n, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER indx(*), pntrb(k), pntre(k)
REAL alpha, beta
SUBROUTINE mkl_dcscmm(transa, m, n, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
SUBROUTINE mkl_ccscmm(transa, m, n, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zcscmm(transa, m, n, k, alpha, matdescra, val, indx,
369

CHARACTER*1 transa
C:
void mkl_scscmm(char *transa, int *m, int *n, int *k,
float *alpha, char *matdescra, float *val, int *indx,
int *pntrb, int *pntre, float *b, int *ldb,
float *beta, float *c, int *ldc);
void mkl_dcscmm(char *transa, int *m, int *n, int *k,
double *alpha, char *matdescra, double *val, int *indx,
int *pntrb, int *pntre, double *b, int *ldb,
double *beta, double *c, int *ldc);
void mkl_ccscmm(char *transa, int *m, int *n, int *k,
int *pntrb, int *pntre, MKL_Complex8 *b, int *ldb,
MKL_Complex8 *beta, MKL_Complex8 *c, int *ldc);
void mkl_zcscmm(char *transa, int *m, int *n, int *k,
int *pntrb, int *pntre, MKL_Complex16 *b, int *ldb,
370
mkl_?coomm
stored in the coordinate format.
Syntax
Fortran:
call mkl_scoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz,
call mkl_dcoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz,
call mkl_ccoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz,
call mkl_zcoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz,
C:
mkl_scoomm(&transa, &m, &n, &k, &alpha, matdescra, val, rowind, colind, &nnz,
mkl_dcoomm(&transa, &m, &n, &k, &alpha, matdescra, val, rowind, colind, &nnz,
mkl_ccoomm(&transa, &m, &n, &k, &alpha, matdescra, val, rowind, colind, &nnz,
mkl_zcoomm(&transa, &m, &n, &k, &alpha, matdescra, val, rowind, colind, &nnz,
Description
for C interface.
The mkl_?coomm routine performs a matrix-matrix operation defined as
or
371
where:
B and C are dense matrices, A is an m-by-k sparse matrix in the coordinate format, A' is the
transpose of A.
Input Parameters
If transa = 'N' or 'n', then C := alpha*A*B + beta*C
alpha REAL for mkl_scoomm.
DOUBLE PRECISION for mkl_dcoomm.
COMPLEX for mkl_ccoomm.
DOUBLE COMPLEX for mkl_zcoomm.
val REAL for mkl_scoomm.
372
more details.
more details.
matrix A.
details.
b REAL for mkl_scoomm.
Array, DIMENSION (ldb, n) for one-based indexing, and
B.
beta REAL for mkl_scoomm.
c REAL for mkl_scoomm.
373

Output Parameters
374
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoomm(transa, m, n, k, alpha, matdescra, val,
rowind, colind, nnz, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, ldb, ldc, nnz
REAL alpha, beta
SUBROUTINE mkl_dcoomm(transa, m, n, k, alpha, matdescra, val,
CHARACTER*1 transa
SUBROUTINE mkl_ccoomm(transa, m, n, k, alpha, matdescra, val,
CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zcoomm(transa, m, n, k, alpha, matdescra, val,
375

CHARACTER*1 transa
C:
void mkl_scoomm(char *transa, int *m, int *n, int *k, float *alpha,
char *matdescra, float *val, int *rowind, int *colind, int *nnz,
float *b, int *ldb, float *beta, float *c, int *ldc);
void mkl_dcoomm(char *transa, int *m, int *n, int *k, double *alpha,
char *matdescra, double *val, int *rowind, int *colind, int *nnz,
double *b, int *ldb, double *beta, double *c, int *ldc);
void mkl_ccoomm(char *transa, int *m, int *n, int *k, MKL_Complex8 *alpha,
char *matdescra, MKL_Complex8 *val, int *rowind, int *colind, int *nnz,
MKL_Complex8 *b, int *ldb, MKL_Complex8 *beta, MKL_Complex8 *c, int *ldc);
void mkl_zcoomm(char *transa, int *m, int *n, int *k, MKL_Complex16 *alpha,
char *matdescra, MKL_Complex16 *val, int *rowind, int *colind, int *nnz,
376
mkl_?csrsm
Solves a system of linear matrix equations for a
sparse matrix in the CSR format.
Syntax
Fortran:
call mkl_scsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
call mkl_dcsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
call mkl_ccsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
call mkl_zcsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
C:
mkl_scsrsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
mkl_dcsrsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
mkl_ccsrsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
mkl_zcsrsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
Description
for C interface.
The mkl_?csrsm routine solves a system of linear equations with matrix-matrix operations for
a sparse matrix in the CSR format:
C := alpha*inv(A)*B
or
C := alpha*inv(A')*B,
377
where:
alpha is scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with
unit or non-unit main diagonal, A' is the transpose of A.
indexing.
Input Parameters
If transa = 'N' or 'n', then C := alpha*inv(A)*B
alpha*inv(A')*B,
alpha REAL for mkl_scsrsm.
DOUBLE PRECISION for mkl_dcsrsm.
COMPLEX for mkl_ccsrsm.
DOUBLE COMPLEX for mkl_zcsrsm.
val REAL for mkl_scsrsm.
pntrb(0).
details.
378
details.

order for each row.

b REAL for mkl_scsrsm.
Array, DIMENSION (ldb, n)for one-based indexing, and
On entry the leading m-by-n part of the array b must contain
the matrix B.
379

Output Parameters
c REAL*8.
The leading m-by-n part of the array c contains the output
matrix C.
380
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsm(transa, m, n, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER m, n, ldb, ldc
REAL alpha
SUBROUTINE mkl_dcsrsm(transa, m, n, alpha, matdescra, val, indx,
CHARACTER*1 transa
SUBROUTINE mkl_ccsrsm(transa, m, n, alpha, matdescra, val, indx,
CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zcsrsm(transa, m, n, alpha, matdescra, val, indx,
381

CHARACTER*1 transa
C:
void mkl_scsrsm(char *transa, int *m, int *n, float *alpha,
int *pntre, float *b, int *ldb, float *c, int *ldc);
void mkl_dcsrsm(char *transa, int *m, int *n, double *alpha,
int *pntre, double *b, int *ldb, double *c, int *ldc);
void mkl_ccsrsm(char *transa, int *m, int *n, MKL_Complex8 *alpha,
int *pntre, MKL_Complex8 *b, int *ldb, MKL_Complex8 *c, int *ldc);
void mkl_zcsrsm(char *transa, int *m, int *n, MKL_Complex16 *alpha,
382
mkl_?cscsm
sparse matrix in the CSC format.
Syntax
Fortran:
call mkl_scscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
call mkl_dcscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
call mkl_ccscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
call mkl_zcscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, c, ldc)
C:
mkl_scscsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
mkl_dcscsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
mkl_ccscsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
mkl_zcscsm(&transa, &m, &n, &alpha, matdescra, val, indx, pntrb, pntre, b,
&ldb, c, &ldc);
Description
for C interface.
The mkl_?cscsm routine solves a system of linear equations with matrix-matrix operations for
a sparse matrix in the CSC format:
C := alpha*inv(A)*B
or
383
where:
NOTE. This routine supports a CSC format both with one-based indexing and zero-based
indexing.
Input Parameters
transa CHARACTER*1. Specifies the system of equations.
If transa = 'N' or 'n', then C := alpha*inv(A)*B
alpha*inv(A')*B,
alpha REAL for mkl_scscsm.
DOUBLE PRECISION for mkl_dcscsm.
COMPLEX for mkl_ccscsm.
DOUBLE COMPLEX for mkl_zcscsm.
val REAL for mkl_scscsm.
pntrb(0).
details.
384
val array.
details.
NOTE. Row indices must be sorted in increasing

order for each column.

such that pntrb(I) - pntrb(1)+1 is the first index of
column I in the arrays val and indx.
such that pntrb(I) - pntrb(0) is the first index of column
details.
such that pntre(I) - pntrb(1) is the last index of column
such that pntre(I) - pntrb(1)-1 is the last index of
column I in the arrays val and indx.
details.
b REAL for mkl_scscsm.
Array, DIMENSION (ldb, n) for one-based indexing, and (m,
ldb) for zero-based indexing.
the matrix B.
385

Output Parameters
c REAL for mkl_scscsm.
matrix C.
386
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscsm(transa, m, n, alpha, matdescra, val, indx,
CHARACTER*1 transa
REAL alpha
SUBROUTINE mkl_dcscsm(transa, m, n, alpha, matdescra, val, indx,
CHARACTER*1 transa
SUBROUTINE mkl_ccscsm(transa, m, n, alpha, matdescra, val, indx,
CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zcscsm(transa, m, n, alpha, matdescra, val, indx,
387

CHARACTER*1 transa
C:
void mkl_scscsm(char *transa, int *m, int *n, float *alpha,
int *pntre, float *b, int *ldb, float *c, int *ldc);
void mkl_dcscsm(char *transa, int *m, int *n, double *alpha,
int *pntre, double *b, int *ldb, double *c, int *ldc);
void mkl_ccscsm(char *transa, int *m, int *n, MKL_Complex8 *alpha,
void mkl_zcscsm(char *transa, int *m, int *n, MKL_Complex16 *alpha,
388
mkl_?coosm
sparse matrix in the coordinate format.
Syntax
Fortran:
call mkl_scoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b,
ldb, c, ldc)
call mkl_dcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b,
ldb, c, ldc)
call mkl_ccoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b,
ldb, c, ldc)
call mkl_zcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b,
ldb, c, ldc)
C:
mkl_scoosm(&transa, &m, &n, &alpha, matdescra, val, rowind, colind, &nnz, b,
&ldb, c, &ldc);
mkl_dcoosm(&transa, &m, &n, &alpha, matdescra, val, rowind, colind, &nnz, b,
&ldb, c, &ldc);
mkl_ccoosm(&transa, &m, &n, &alpha, matdescra, val, rowind, colind, &nnz, b,
&ldb, c, &ldc);
mkl_zcoosm(&transa, &m, &n, &alpha, matdescra, val, rowind, colind, &nnz, b,
&ldb, c, &ldc);
Description
for C interface.
The mkl_?coosm routine solves a system of linear equations with matrix-matrix operations for
a sparse matrix in the coordinate format:
C := alpha*inv(A)*B
or
389
where:
Input Parameters
is computed as C := alpha*inv(A)*B
matrix-vector product is computed as C :=
alpha*inv(A')*B,
alpha REAL for mkl_scoosm.
DOUBLE PRECISION for mkl_dcoosm.
COMPLEX for mkl_ccoosm.
DOUBLE COMPLEX for mkl_zcoosm.
val REAL for mkl_scoosm.
more details.
390
more details.
matrix A.
details.
b REAL for mkl_scoosm.
Array, DIMENSION (ldb, n) for one-based indexing, and (m,
ldb) for zero-based indexing.
Before entry the leading m-by-n part of the array b must
Output Parameters
c REAL for mkl_scoosm.
matrix C.
391
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
CHARACTER*1 transa
INTEGER m, n, ldb, ldc, nnz
REAL alpha
SUBROUTINE mkl_dcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
CHARACTER*1 transa
SUBROUTINE mkl_ccoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
CHARACTER*1 transa
392
C:
void mkl_scoosm(char *transa, int *m, int *n, float *alpha, char *matdescra,
float *val, int *rowind, int *colind, int *nnz, float *b, int *ldb, float *c, int *ldc);
void mkl_dcoosm(char *transa, int *m, int *n, double *alpha, char *matdescra,
double *val, int *rowind, int *colind, int *nnz, double *b, int *ldb, double *c, int *ldc);
void mkl_ccoosm(char *transa, int *m, int *n, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *rowind, int *colind, int *nnz, MKL_Complex8 *b, int *ldb,
MKL_Complex8 *c, int *ldc);
void mkl_zcoosm(char *transa, int *m, int *n, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *rowind, int *colind, int *nnz, MKL_Complex16 *b, int *ldb,
393
mkl_?bsrsm
sparse matrix in the BSR format.
Syntax
Fortran:
call mkl_scsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre,
b, ldb, c, ldc)
call mkl_dcsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre,
b, ldb, c, ldc)
call mkl_ccsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre,
b, ldb, c, ldc)
call mkl_zcsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre,
b, ldb, c, ldc)
C:
mkl_scsrsm(&transa, &m, &n, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
b, &ldb, c, &ldc);
mkl_dcsrsm(&transa, &m, &n, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
b, &ldb, c, &ldc);
mkl_ccsrsm(&transa, &m, &n, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
b, &ldb, c, &ldc);
mkl_zcsrsm(&transa, &m, &n, &lb, &alpha, matdescra, val, indx, pntrb, pntre,
b, &ldb, c, &ldc);
Description
for C interface.
The mkl_?bsrsm routine solves a system of linear equations with matrix-matrix operations for
a sparse matrix in the BSR format:
C := alpha*inv(A)*B
or
394
where:
indexing.
Input Parameters
is computed as C := alpha*inv(A)*B.
matrix-vector product is computed as C :=
alpha*inv(A')*B.
m INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrsm.
DOUBLE PRECISION for mkl_dbsrsm.
COMPLEX for mkl_cbsrsm.
DOUBLE COMPLEX for mkl_zbsrsm.
val REAL for mkl_sbsrsm.
395

A. Its length is equal to the ABAB number ABAB of non-zero
blocks in the matrix A multiplied by lb*lb. Refer to the
values array description in BSR Format for more details.
number of non-zero blocks in the matrix A. Refer to the
columns array description in BSR Format for more details.
row i in the array indx.
details.
details.
b REAL for mkl_sbsrsm.
Array, DIMENSION (ldb, n) for one-based indexing,
DIMENSION (m, ldb) for zero-based indexing.
the matrix B.
ldb INTEGER. Specifies the first dimension (in blocks) of b as
396
ldc INTEGER. Specifies the first dimension (in blocks) of c as
Output Parameters
c REAL for mkl_sbsrsm.
Array, DIMENSION (ldc, n) for one-based indexing,
DIMENSION (m, ldc) for zero-based indexing.
matrix C.
397
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
CHARACTER*1 transa
INTEGER m, n, lb, ldb, ldc
REAL alpha
SUBROUTINE mkl_dbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
CHARACTER*1 transa
SUBROUTINE mkl_cbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
CHARACTER*1 transa
398
C:
void mkl_sbsrsm(char *transa, int *m, int *n, int *lb, float *alpha, char *matdescra,
float *val, int *indx, int *pntrb, int *pntre, float *b, int *ldb, float *c, int *ldc);
void mkl_dbsrsm(char *transa, int *m, int *n, int *lb, double *alpha, char *matdescra,
double *val, int *indx, int *pntrb, int *pntre, double *b, int *ldb, double *c, int *ldc);
void mkl_cbsrsm(char *transa, int *m, int *n, int *lb, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *indx, int *pntrb, int *pntre, MKL_Complex8 *b, int *ldb,
void mkl_zbsrsm(char *transa, int *m, int *n, int *lb, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *indx, int *pntrb, int *pntre, MKL_Complex16 *b, int *ldb,
399
mkl_?diamv
matrix in the diagonal format with one-based
indexing.
Syntax
Fortran:
call mkl_sdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x,
beta, y)
call mkl_ddiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x,
beta, y)
call mkl_cdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x,
beta, y)
call mkl_zdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x,
beta, y)
C:
mkl_sdiamv(&transa, &m, &k, &alpha, matdescra, val, &lval, idiag, &ndiag, x,
&beta, y);
mkl_ddiamv(&transa, &m, &k, &alpha, matdescra, val, &lval, idiag, &ndiag, x,
&beta, y);
mkl_cdiamv(&transa, &m, &k, &alpha, matdescra, val, &lval, idiag, &ndiag, x,
&beta, y);
mkl_zdiamv(&transa, &m, &k, &alpha, matdescra, val, &lval, idiag, &ndiag, x,
&beta, y);
Description
for C interface.
The mkl_?diamv routine performs a matrix-vector operation defined as
or
400
where:
A is an m-by-k sparse matrix stored in the diagonal format, A' is the transpose of A.
Input Parameters
If transa = 'N' or 'n', then y := alpha*A*x + beta*y,
alpha*A'*x + beta*y.
alpha REAL for mkl_sdiamv.
DOUBLE PRECISION for mkl_ddiamv.
COMPLEX for mkl_cdiamv.
DOUBLE COMPLEX for mkl_zdiamv.
val REAL for mkl_sdiamv.
401
lval INTEGER. Leading dimension of val, lval ≥m. Refer to

lval description in Diagonal Storage Scheme for more
details.
matrix A.
matrix A.
x REAL for mkl_sdiamv.
Array, DIMENSION at least k if transa = 'N' or 'n', and
vector x.
beta REAL for mkl_sdiamv.
y REAL for mkl_sdiamv.
Array, DIMENSION at least m if transa = 'N' or 'n', and
vector y.
Output Parameters
402
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiamv(transa, m, k, alpha, matdescra, val, lval, idiag,
ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER m, k, lval, ndiag
INTEGER idiag(*)
REAL alpha, beta
SUBROUTINE mkl_ddiamv(transa, m, k, alpha, matdescra, val, lval, idiag,
ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER idiag(*)
SUBROUTINE mkl_cdiamv(transa, m, k, alpha, matdescra, val, lval, idiag,
ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX alpha, beta
SUBROUTINE mkl_zdiamv(transa, m, k, alpha, matdescra, val, lval, idiag,
403
ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER idiag(*)
C:
void mkl_sdiamv(char *transa, int *m, int *k, float *alpha,
char *matdescra, float *val, int *lval, int *idiag,
int *ndiag, float *x, float *beta, float *y);
void mkl_ddiamv(char *transa, int *m, int *k, double *alpha,
char *matdescra, double *val, int *lval, int *idiag,
int *ndiag, double *x, double *beta, double *y);
void mkl_cdiamv(char *transa, int *m, int *k, MKL_Complex8 *alpha,
char *matdescra, MKL_Complex8 *val, int *lval, int *idiag,
int *ndiag, MKL_Complex8 *x, MKL_Complex8 *beta, MKL_Complex8 *y);
void mkl_zdiamv(char *transa, int *m, int *k, MKL_Complex16 *alpha,
char *matdescra, MKL_Complex16 *val, int *lval, int *idiag,
int *ndiag, MKL_Complex16 *x, MKL_Complex16 *beta, MKL_Complex16 *y);
404
mkl_?skymv
matrix in the skyline storage format with one-based
indexing.
Syntax
Fortran:
call mkl_sskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_dskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_cskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_zskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
C:
mkl_sskymv(&transa, &m, &k, &alpha, matdescra, val, pntr, x, &beta, y);
mkl_dskymv(&transa, &m, &k, &alpha, matdescra, val, pntr, x, &beta, y);
mkl_cskymv(&transa, &m, &k, &alpha, matdescra, val, pntr, x, &beta, y);
mkl_zskymv(&transa, &m, &k, &alpha, matdescra, val, pntr, x, &beta, y);
Description
for C interface.
The mkl_?skymv routine performs a matrix-vector operation defined as
or
where:
A is an m-by-k sparse matrix stored using the skyline storage scheme, A' is the transpose of
A.
405
Input Parameters
alpha REAL for mkl_sskymv.
DOUBLE PRECISION for mkl_dskymv.
COMPLEX for mkl_cskymv.
DOUBLE COMPLEX for mkl_zskymv.
NOTE. General matrices (matdescra (1)='G') is

not supported.
val REAL for mkl_sskymv.

Array containing the set of elements of the matrix A in the
skyline profile form.
If matdescrsa(2)= 'L', then val contains elements from
the low triangle of the matrix A.
If matdescrsa(2)= 'U', then val contains elements from
the upper triangle of the matrix A.
406
Refer to values array description in Skyline Storage Scheme
for more details.
pntr INTEGER. Array of length (m+m) for lower triangle, and
(k+k) for upper triangle.
It contains the indices specifying in the val the positions of
the first element in each row (column) of the matrix A. Refer
to pointers array description in Skyline Storage Scheme
for more details.
x REAL for mkl_sskymv.
vector x.
beta REAL for mkl_sskymv.
y REAL for mkl_sskymv.
vector y.
Output Parameters
407
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
INTEGER m, k
INTEGER pntr(*)
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
INTEGER m, k
INTEGER pntr(*)
SUBROUTINE mkl_cdskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
INTEGER m, k
INTEGER pntr(*)
COMPLEX alpha, beta
SUBROUTINE mkl_zskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
INTEGER m, k
408
INTEGER pntr(*)
C:
void mkl_sskymv (char *transa, int *m, int *k, float *alpha, char *matdescra,
float *val, int *pntr, float *x, float *beta, float *y);
void mkl_dskymv (char *transa, int *m, int *k, double *alpha, char *matdescra,
double *val, int *pntr, double *x, double *beta, double *y);
void mkl_cskymv (char *transa, int *m, int *k, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *pntr, MKL_Complex8 *x, MKL_Complex8 *beta, MKL_Complex8 *y);
void mkl_zskymv (char *transa, int *m, int *k, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *pntr, MKL_Complex16 *x, MKL_Complex16 *beta, MKL_Complex16 *y);
mkl_?diasv
matrix in the diagonal format with one-based
indexing.
Syntax
Fortran:
call mkl_sdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_ddiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_cdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_zdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
C:
mkl_sdiasv(&transa, &m, &alpha, matdescra, val, &lval, idiag, &ndiag, x, y);
mkl_ddiasv(&transa, &m, &alpha, matdescra, val, &lval, idiag, &ndiag, x, y);
mkl_cdiasv(&transa, &m, &alpha, matdescra, val, &lval, idiag, &ndiag, x, y);
mkl_zdiasv(&transa, &m, &alpha, matdescra, val, &lval, idiag, &ndiag, x, y);
409
Description
for C interface.
The mkl_?diasv routine solves a system of linear equations with matrix-vector operations for
a sparse matrix stored in the diagonal format:
y := alpha*inv(A)*x
or
where:
Input Parameters
alpha*inv(A')*x,
alpha REAL for mkl_sdiasv.
DOUBLE PRECISION for mkl_ddiasv.
COMPLEX for mkl_cdiasv.
DOUBLE COMPLEX for mkl_zdiasv.
val REAL for mkl_sdiasv.
410
matrix A.

increasing order.

matrix A.
x REAL for mkl_sdiasv.
y REAL for mkl_sdiasv.
Output Parameters
411
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER indiag(*)
REAL alpha
SUBROUTINE mkl_ddiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER indiag(*)
SUBROUTINE mkl_cdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER indiag(*)
COMPLEX alpha
SUBROUTINE mkl_zdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
412
INTEGER indiag(*)
C:
void mkl_sdiasv(char *transa, int *m, float *alpha, char *matdescra,
float *val, int *lval, int *idiag, int *ndiag, float *x, float *y);
void mkl_ddiasv(char *transa, int *m, double *alpha, char *matdescra,
double *val, int *lval, int *idiag, int *ndiag, double *x, double *y);
void mkl_cdiasv(char *transa, int *m, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *lval, int *idiag, int *ndiag, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zdiasv(char *transa, int *m, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *lval, int *idiag, int *ndiag, MKL_Complex16 *x, MKL_Complex16 *y);
mkl_?skysv
matrix in the skyline format with one-based
indexing.
Syntax
Fortran:
call mkl_sskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_dskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_cskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_zskysv(transa, m, alpha, matdescra, val, pntr, x, y)
C:
mkl_sskysv(&transa, &m, &alpha, matdescra, val, pntr, x, y);
mkl_dskysv(&transa, &m, &alpha, matdescra, val, pntr, x, y);
mkl_cskysv(&transa, &m, &alpha, matdescra, val, pntr, x, y);
mkl_zskysv(&transa, &m, &alpha, matdescra, val, pntr, x, y);
413
Description
for C interface.
The mkl_?skysv routine solves a system of linear equations with matrix-vector operations for
a sparse matrix in the skyline storage format:
y := alpha*inv(A)*x
or
where:
Input Parameters
alpha*inv(A')* x,
alpha REAL for mkl_sskysv.
DOUBLE PRECISION for mkl_dskysv.
COMPLEX for mkl_cskysv.
DOUBLE COMPLEX for mkl_zskysv.
414
not supported.
val REAL for mkl_sskysv.

for more details.
for more details.
x REAL for mkl_sskysv.
y REAL for mkl_sskysv.
415
Output Parameters
416
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER pntr(*)
REAL alpha
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER pntr(*)
SUBROUTINE mkl_cskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER pntr(*)
COMPLEX alpha
SUBROUTINE mkl_zskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
INTEGER m
417
INTEGER pntr(*)
C:
void mkl_sskysv(char *transa, int *m, float *alpha, char *matdescra,
float *val, int *pntr, float *x, float *y);
void mkl_dskysv(char *transa, int *m, double *alpha, char *matdescra,
double *val, int *pntr, double *x, double *y);
void mkl_cskysv(char *transa, int *m, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *pntr, MKL_Complex8 *x, MKL_Complex8 *y);
void mkl_zskysv(char *transa, int *m, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *pntr, MKL_Complex16 *x, MKL_Complex16 *y);
mkl_?diamm
stored in the diagonal format with one-based
indexing.
Syntax
Fortran:
call mkl_sdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag,
call mkl_ddiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag,
call mkl_cdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag,
call mkl_zdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag,
418
C:
mkl_sdiamm(&transa, &m, &n, &k, &alpha, matdescra, val, &lval, idiag, &ndiag,
mkl_ddiamm(&transa, &m, &n, &k, &alpha, matdescra, val, &lval, idiag, &ndiag,
mkl_cdiamm(&transa, &m, &n, &k, &alpha, matdescra, val, &lval, idiag, &ndiag,
mkl_zdiamm(&transa, &m, &n, &k, &alpha, matdescra, val, &lval, idiag, &ndiag,
Description
for C interface.
The mkl_?diamm routine performs a matrix-matrix operation defined as
or
where:
B and C are dense matrices, A is an m-by-k sparse matrix in the diagonal format, A' is the
transpose of A.
Input Parameters
If transa = 'N' or 'n', then C := alpha*A*B + beta*C,
alpha*A'*B + beta*C.
419

alpha REAL for mkl_sdiamm.
DOUBLE PRECISION for mkl_ddiamm.
COMPLEX for mkl_cdiamm.
DOUBLE COMPLEX for mkl_zdiamm.
val REAL for mkl_sdiamm.
matrix A.
matrix A.
b REAL for mkl_sdiamm.
Array, DIMENSION (ldb, n).
B.
420
the calling (sub)program.
beta REAL for mkl_sdiamm.
c REAL for mkl_sdiamm.
Output Parameters
421
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiamm(transa, m, n, k, alpha, matdescra, val, lval,
idiag, ndiag, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, ldb, ldc, lval, ndiag
INTEGER idiag(*)
REAL alpha, beta
REAL val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_ddiamm(transa, m, n, k, alpha, matdescra, val, lval,
CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_cdiamm(transa, m, n, k, alpha, matdescra, val, lval,
CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX alpha, beta
COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zdiamm(transa, m, n, k, alpha, matdescra, val, lval,
422
CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
C:
void mkl_sdiamm(char *transa, int *m, int *n, int *k, float *alpha,
char *matdescra, float *val, int *lval, int *idiag, int *ndiag,
float *b, int *ldb, float *beta, float *c, int *ldc);
void mkl_ddiamm(char *transa, int *m, int *n, int *k, double *alpha,
char *matdescra, double *val, int *lval, int *idiag, int *ndiag,
double *b, int *ldb, double *beta, double *c, int *ldc);
void mkl_cdiamm(char *transa, int *m, int *n, int *k, MKL_Complex8 *alpha,
char *matdescra, MKL_Complex8 *val, int *lval, int *idiag, int *ndiag,
void mkl_zdiamm(char *transa, int *m, int *n, int *k, MKL_Complex16 *alpha,
423
mkl_?skymm
stored using the skyline storage scheme with
one-based indexing.
Syntax
Fortran:
call mkl_sskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta,
c, ldc)
call mkl_dskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta,
c, ldc)
call mkl_cskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta,
c, ldc)
call mkl_zskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta,
c, ldc)
C:
mkl_sskymm(&transa, &m, &n, &k, &alpha, matdescra, val, pntr, b, &ldb, &beta,
c, &ldc);
mkl_dskymm(&transa, &m, &n, &k, &alpha, matdescra, val, pntr, b, &ldb, &beta,
c, &ldc);
mkl_cskymm(&transa, &m, &n, &k, &alpha, matdescra, val, pntr, b, &ldb, &beta,
c, &ldc);
mkl_zskymm(&transa, &m, &n, &k, &alpha, matdescra, val, pntr, b, &ldb, &beta,
c, &ldc);
Description
for C interface.
The mkl_?skymm routine performs a matrix-matrix operation defined as
or
424
where:
B and C are dense matrices, A is an m-by-k sparse matrix in the skyline storage format, A' is
the transpose of A.
Input Parameters
If transa = 'N' or 'n', then C := alpha*A*B + beta*C,
alpha REAL for mkl_sskymm.
DOUBLE PRECISION for mkl_dskymm.
COMPLEX for mkl_cskymm.
DOUBLE COMPLEX for mkl_zskymm.

not supported.
val REAL for mkl_sskymm.

425

for more details.
for more details.
b REAL for mkl_sskymm.
B.
beta REAL for mkl_sskymm.
c REAL for mkl_sskymm.
426
Output Parameters
427
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,
ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
REAL alpha, beta
SUBROUTINE mkl_dskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,
ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
SUBROUTINE mkl_cskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,
ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
COMPLEX alpha, beta
SUBROUTINE mkl_zskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,
428
ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
C:
void mkl_sskymm(char *transa, int *m, int *n, int *k, float *alpha,
char *matdescra, float *val, int *pntr, float *b, int *ldb,
float *beta, float *c, int *ldc);
void mkl_dskymm(char *transa, int *m, int *n, int *k, double *alpha,
char *matdescra, double *val, int *pntr, double *b, int *ldb,
double *beta, double *c, int *ldc);
void mkl_cskymm(char *transa, int *m, int *n, int *k, MKL_Complex8 *alpha,
char *matdescra, MKL_Complex8 *val, int *pntr, MKL_Complex8 *b, int *ldb,
void mkl_zskymm(char *transa, int *m, int *n, int *k, MKL_Complex16 *alpha,
char *matdescra, MKL_Complex16 *val, int *pntr, MKL_Complex16 *b, int *ldb,
429
mkl_?diasm
sparse matrix in the diagonal format with
one-based indexing.
Syntax
Fortran:
call mkl_sdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b,
ldb, c, ldc)
call mkl_ddiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b,
ldb, c, ldc)
call mkl_cdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b,
ldb, c, ldc)
call mkl_zdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b,
ldb, c, ldc)
C:
mkl_ddiasm(&transa, &m, &n, &alpha, matdescra, val, &lval, idiag, &ndiag, b,
&ldb, c, &ldc);
&ldb, c, &ldc);
&ldb, c, &ldc);
&ldb, c, &ldc);
Description
for C interface.
The mkl_?diasm routine solves a system of linear equations with matrix-matrix operations for
a sparse matrix in the diagonal format:
C := alpha*inv(A)*B
430
or
where:
Input Parameters
If transa = 'N' or 'n', then C := alpha*inv(A)*B,
alpha*inv(A')*B.
alpha REAL for mkl_sdiasm.
DOUBLE PRECISION for mkl_ddiasm.
COMPLEX for mkl_cdiasm.
DOUBLE COMPLEX for mkl_zdiasm.
val REAL for mkl_sdiasm.
431

matrix A.

increasing order.

matrix A.
b REAL for mkl_sdiasm.
the matrix B.
Output Parameters
c REAL for mkl_sdiasm.
The leading m-by-n part of the array c contains the matrix
C.
432
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasm(transa, m, n, alpha, matdescra, val, lval, idiag,
ndiag, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER m, n, ldb, ldc, lval, ndiag
INTEGER idiag(*)
REAL alpha
REAL val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_ddiasm(transa, m, n, alpha, matdescra, val, lval, idiag,
CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_cdiasm(transa, m, n, alpha, matdescra, val, lval, idiag,
CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX alpha
COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zdiasm(transa, m, n, alpha, matdescra, val, lval, idiag,
433

CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
C:
void mkl_sdiasm(char *transa, int *m, int *n, float *alpha,
char *matdescra, float *val, int *lval, int *idiag, int *ndiag,
float *b, int *ldb, float *c, int *ldc);
void mkl_ddiasm(char *transa, int *m, int *n, double *alpha,
char *matdescra, double *val, int *lval, int *idiag, int *ndiag,
double *b, int *ldb, double *c, int *ldc);
void mkl_cdiasm(char *transa, int *m, int *n, MKL_Complex8 *alpha,
MKL_Complex8 *b, int *ldb, MKL_Complex8 *c, int *ldc);
void mkl_zdiasm(char *transa, int *m, int *n, MKL_Complex16 *alpha,
MKL_Complex16 *b, int *ldb, MKL_Complex16 *c, int *ldc);
434
mkl_?skysm
sparse matrix stored using the skyline storage
scheme with one-based indexing.
Syntax
Fortran:
call mkl_sskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_dskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_cskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_zskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
C:
mkl_sskysm(&transa, &m, &n, &alpha, matdescra, val, pntr, b, &ldb, c, &ldc);
mkl_dskysm(&transa, &m, &n, &alpha, matdescra, val, pntr, b, &ldb, c, &ldc);
mkl_cskysm(&transa, &m, &n, &alpha, matdescra, val, pntr, b, &ldb, c, &ldc);
mkl_zskysm(&transa, &m, &n, &alpha, matdescra, val, pntr, b, &ldb, c, &ldc);
Description
for C interface.
The mkl_?skysm routine solves a system of linear equations with matrix-matrix operations for
a sparse matrix in the skyline storage format:
C := alpha*inv(A)*B
or
where:
435
Input Parameters
If transa = 'N' or 'n', then C := alpha*inv(A)*B,
alpha*inv(A')*B,
alpha REAL for mkl_sskysm.
DOUBLE PRECISION for mkl_dskysm.
COMPLEX for mkl_cskysm.
DOUBLE COMPLEX for mkl_zskysm.

not supported.
val REAL for mkl_sskysm.

for more details.
436
pntr INTEGER. Array of length (m+m). It contains the indices
specifying in the val the positions of the first non-zero
element of each i-row (column) of the matrix A such that
pointers(i)- pointers(1)+1. Refer to pointers array
description in Skyline Storage Scheme for more details.
b REAL for mkl_sskysm.
the matrix B.
Output Parameters
c REAL for mkl_sskysm.
The leading m-by-n part of the array c contains the matrix
C.
437
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)

CHARACTER*1 transa
INTEGER pntr(*)
REAL alpha
SUBROUTINE mkl_dskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
SUBROUTINE mkl_cskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
COMPLEX alpha
SUBROUTINE mkl_zskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
CHARACTER*1 transa
438
INTEGER pntr(*)
C:
void mkl_sskysm(char *transa, int *m, int *n, float *alpha, char *matdescra,
float *val, int *pntr, float *b, int *ldb, float *c, int *ldc);
void mkl_dskysm(char *transa, int *m, int *n, double *alpha, char *matdescra,
double *val, int *pntr, double *b, int *ldb, double *c, int *ldc);
void mkl_cskysm(char *transa, int *m, int *n, MKL_Complex8 *alpha, char *matdescra,
MKL_Complex8 *val, int *pntr, MKL_Complex8 *b, int *ldb, MKL_Complex8 *c, int *ldc);
void mkl_zskysm(char *transa, int *m, int *n, MKL_Complex16 *alpha, char *matdescra,
MKL_Complex16 *val, int *pntr, MKL_Complex16 *b, int *ldb, MKL_Complex16 *c, int *ldc);
mkl_ddnscsr
Convert a sparse matrix to the CSR format and
vice versa.
Syntax
Fortran:
call mkl_ddnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
C:
mkl_ddnscsr(job, &m, &n, adns, &lda, acsr, ja, ia, &info);
Description
for C interface.
This routine converts an sparse matrix stored as a rectangular m-by-n matrix A to the
compressed sparse row (CSR) format (3-array variation) and vice versa.
439
Input Parameters
job INTEGER
Array, contains the following conversion parameters:
job(1)
If job(1)=0, the rectangular matrix A is converted to the
CSR format;
if job(1)=1, the rectangular matrix A is restored from the
CSR format.
job(2)
If job(2)=0, zero-based indexing for the rectangular matrix
A is used;
if job(2)=1, one-based indexing for the rectangular matrix
A is used.
job(3)
If job(3)=0, zero-based indexing for the matrix in CSR
format is used;
if job(3)=1, one-based indexing for the matrix in CSR
format is used.
job(4)
If job(4)=0, adns is a lower triangular part of matrix A;
If job(4)=1, adns is an upper triangular part of matrix A;
If job(4)=2, adns is a whole matrix A.
job(5)
job(5)=nzmax - maximum number of the non-zero
elements allowed if job(1)=0.
job(6) - job indicator for conversion to CSR format.
If job(6)=0, only array ia is generated for the output
storage.
If job(6)>0, arrays acsr, ia, ja are generated for the
output storage.
n INTEGER. Number of columns of the matrix A.
adns (input/output)DOUBLE PRECISION.
440
lda (input/output)INTEGER. Specifies the first dimension of adns
as declared in the calling (sub)program, must be at least
max(1, m).
acsr (input/output)DOUBLE PRECISION.
ja (input/output)INTEGER. Array containing the column indices
for each non-zero element of the matrix A.
Its length is equal to the length of the array acsr. Refer to
for more details.
ia (input/output)INTEGER. Array of length m + 1, containing
indices of elements in the array acsr, such that ia(I) is
the index in the array acsr of the first non-zero element
from the row I. The value of the last element ia(m + 1)
is equal to the number of non-zeros plus one. Refer to
rowIndex array description in Sparse Matrix Storage Formats
for more details.
Output Parameters
info INTEGER. Integer info indicator only for restoring the matrix
A from the CSR format.
If info=0, the execution is successful.
If info=i, the routine is interrupted processing the i-th
row because there is no space in the arrays adns and ja
according to the value nzmax.
441
Interfaces
FORTRAN 77:
SUBROUTINE mkl_ddnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
INTEGER job(8)
INTEGER m, n, lda, info
INTEGER ja(*), ia(m+1)
DOUBLE PRECISION adns(*), acsr(*)
C:
void mkl_ddnscsr(int *job, int *m, int *n, double *adns,
int *lda, double *acsr, double ja, int *ia, int *info);
mkl_dcsrcoo
Converts a sparse matrix in CSR format to the
coordinate format.
Syntax
Fortran:
call mkl_dcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
C:
mkl_dcsrcoo(job, &n, acsr, ja, ia, &nnz, acoo, rowind, colind, &info);
Description
for C interface.
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format
(3-array variation) to coordinate format and vice versa.
442
Input Parameters
job INTEGER
job(1)
If job(1)=0, the matrix in the CSR format is converted to
the coordinate format;
if job(1)=1, the matrix in the coordinate format is
converted to the CSR format.
job(2)
format is used;
format is used.
job(3)
If job(3)=0, zero-based indexing for the matrix in
coordinate format is used;
if job(3)=1, one-based indexing for the matrix in coordinate
format is used.
job(5)
job(5)=nzmax - maximum number of the non-zero
elements allowed if job(1)=0.
job(5)=nnz - sets number of the non-zero elements of the
matrix A if job(1)=0.
job(6) - job indicator.
For conversion to the coordinate format:
If job(6)=1, only array rowind is filled in for the output
storage.
If job(6)=2, arrays rowind, colind are filled in for the
output storage.
If job(6)=3, all arrays rowind, colind, acoo are filled in
for the output storage.
For conversion to the CSR format:
If job(6)=0, all arrays acsr, ja, ia are filled in for the
output storage.
If job(6)=1, only array ia is filled in for the output storage.
443
If job(6)=2, then it is assumed that the routine already

has been called with the job(6)=1, and the user allocated
the required space for storing the output arrays acsr and
ja.
m INTEGER. Dimension of the matrix A.
acsr (input/output) DOUBLE PRECISION.
ja (input/output) INTEGER. Array containing the column indices
for more details.
ia (input/output) INTEGER. Array of length m + 1, containing
for more details.
acoo (input/output) DOUBLE PRECISION.
rowind (input/output)INTEGER. Array of length nnz, contains the
row indices for each non-zero element of the matrix A.
more details.
colind (input/output) INTEGER. Array of length nnz, contains the
column indices for each non-zero element of the matrix A.
Refer to columns array description in Coordinate Format for
more details.
444
Output Parameters
matrix A.
details.
info INTEGER. Integer info indicator only for converting the
matrix A from the CSR format.
If info=1, the routine is interrupted because there is no
space in the arrays acoo, rowind, colind according to the
value nzmax.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_dcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
INTEGER n, nnz, info
INTEGER ja(*), ia(m+1), rowind(*), colind(*)
DOUBLE PRECISION acsr(*), acoo(*)
C:
void mkl_dcsrcoo(int *job, int *n, double *acsr, int *ja,
int *ia, int *nnz, double *acoo, int *rowind, int *colind, int *info);
mkl_dcsrbsr
Converts a sparse matrix in CSR format to BSR
format.
Syntax
Fortran:
call mkl_dcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
445
C:
mkl_dcsrbsr(job, &m, &mblk, &ldabsr, acsr, ja, ia, absr, jab , iab , &info);
Description
for C interface.
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format
(3-array variation) to the block sparse row (BSR) format and vice versa.
Input Parameters
job INTEGER
job(1)
If job(1)=0, the matrix in the CSR format is converted to
the BSR format;
if job(1)=1, the matrix in the BSR format is converted to
the CSR format.
job(2)
format is used;
format is used.
job(3)
If job(3)=0, zero-based indexing for the matrix in the BSR
format is used;
if job(3)=1, one-based indexing for the matrix in the BSR
format is used.
job(6) - job indicator.
For conversion to the BSR format:
If job(6)=0, only arrays jab, iab are generated for the
output storage.
If job(6)>0, all output arrays absr, jab, and iab are filled
in for the output storage.
446
If job(6)=-1, iab(1) returns the number of non-zero
blocks.
For conversion to the CSR format:
If job(6)=0, only arrays ja, ia are generated for the output
storage.
m INTEGER. Actual row dimension of the matrix A for convert
to the BSR format; block row dimension of the matrix A for
convert to the CSR format.
mblk INTEGER. Size of the block in the matrix A.
ldabsr INTEGER. Leading dimension of the array absr as declared
in the calling program. ldabsr must be greater than or
equal to mblk*mblk.
acsr (input/output) DOUBLE PRECISION.
ja (input/output) INTEGER. Array containing the column indices
for more details.
ia (input/output) INTEGER. Array of length m + 1, containing
for more details.
absr (input/output) DOUBLE PRECISION.
the matrix A multiplied by mblk*mblk. Refer to values array
jab (input/output) INTEGER. Array of length (m + 1), containing
indices of block in the array absr, such that ja(i) is the
index in the array absr of the first non-zero element from
447
the i-th row . The value of the last element jab(m + 1) is

equal to the number of non-zero blocks plus one. Refer to
rowIndex array description in BSR Format for more details.
iab (input/output) INTEGER. Array containing the column indices
for each non-zero block of the matrix A.
for more details.
Output Parameters
info INTEGER. Integer info indicator only for converting the
matrix A from the CSR format.
If info=1, it means that mblk is equal to 0.
If info=2, it means that ldabsr is less than mblk*mblk
and there is no space for all blocks.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_dcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
INTEGER m, mblk, ldabsr, info
INTEGER ja(*), ia(m+1), jab(*), iab(*)
DOUBLE PRECISION acsr(*), absr(ldabsr,*)
C:
void mkl_dcsrbsr(int *job, int *m, int *mblk, int *ldabsr, double *acsr, int *ja,
int *ia, double *absr, int *jab, int *iab, int *info);
448
mkl_?csradd
Computes the sum of two matrices stored in the
CSR format (3-array variation) with one-based
indexing.
Syntax
Fortran:
call mkl_scsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c,
jc, ic, nzmax, info)
call mkl_dcsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c,
call mkl_ccsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c,
call mkl_zcsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c,
C:
mkl_scsradd(&trans, &request, &sort, &m, &n, a, ja, ia, &beta, b, jb, ib, c,
jc, ic, &nzmax, &info);
mkl_dcsradd(&trans, &request, &sort, &m, &n, a, ja, ia, &beta, b, jb, ib ,
c, jc, ic, &nzmax, &info);
mkl_ccsradd(&trans, &request, &sort, &m, &n, a, ja, ia, &beta, b, jb, ib, c,
mkl_zcsradd(&trans, &request, &sort, &m, &n, a, ja, ia, &beta, b, jb, ib, c,
Description
for C interface.
The mkl_?csradd routine performs a matrix-matrix operation defined as
C := A+beta*op(B)
where:
A, B, C are the sparse matrices in the CSR format (3-array variation).
449
op(B) is one of op(B) = B, or op(B) = B', or op(A) = conjg(B')

beta is a scalar.
The routine works correctly if and only if the column indices in sparse matrix representations
of matrices A and B are arranged in the increasing order for each row. If not, use the parameter
sort (see below) to reorder column indices and the corresponding elements of the input
matrices.
Input Parameters
trans CHARACTER*1. Specifies the operation.
If trans = 'N' or 'n', then C := A+beta*B
If trans = 'T' or 't' or 'C' or 'c', then C :=
A+beta*B'.
request INTEGER.
If request=0, the routine performs addition, the memory
for the output arrays ic, jc, c must be allocated
beforehand.
If request=1, the routine computes only values of the array
ic of length m + 1, the memory for this array must be
allocated beforehand. On exit the value ic(m+1) - 1 is the
actual number of the elements in the arrays c and jc.
If request=2, the routine has been called previously with
the parameter request=1, the output arrays jc and c are
allocated in the calling program and they are of the length
(m+1)-1 at least.
sort INTEGER. Specifies the type of reordering. If this parameter
is not set (default), the routine does not perform reordering.
If sort=1, the routine arranges the column indices ja for
each row in the increasing order and reorders the
corresponding values of the matrix A in the array a.
450
If sort=2, the routine arranges the column indices jb for
corresponding values of the matrix B in the array b.
If sort=3, the routine performs reordering for both input
matrices A and B.
a REAL for mkl_scsradd.
DOUBLE PRECISION for mkl_dcsradd.
COMPLEX for mkl_ccsradd.
DOUBLE COMPLEX for mkl_zcsradd.
non-zero element of the matrix A. For each row the column
indices must be arranged in the increasing order.
The length of this array is equal to the length of the array
a. Refer to columns array description in Sparse Matrix
elements in the array a, such that ia(I) is the index in the
of non-zero elements of the matrix B plus one. Refer to
for more details.
beta REAL for mkl_scsradd.
b REAL for mkl_scsradd.
451
Array containing non-zero elements of the matrix B. Its

matrix B. Refer to values array description in Sparse Matrix
jb INTEGER. Array containing the column indices for each
non-zero element of the matrix B. For each row the column
b. Refer to columns array description in Sparse Matrix
ib INTEGER. Array of length m + 1 when trans = 'N' or 'n',
or n + 1 otherwise.
This array contains indices of elements in the array b, such
that ib(I) is the index in the array b of the first non-zero
element from the row I. The value of the last element ib(m
+ 1) or ib(n + 1) is equal to the number of non-zero
elements of the matrix B plus one. Refer to rowIndex array
description in Sparse Matrix Storage Formats for more
details.
nzmax INTEGER. The length of the arrays c and jc.
This parameter is used only if request=0. The routine stops
calculation if the number of elements in the result matrix C
exceeds the specified value of nzmax.
Output Parameters
c REAL for mkl_scsradd.
Array containing non-zero elements of the result matrix C.
Its length is equal to the number of non-zero elements in
the matrix C. Refer to values array description in Sparse
jc INTEGER. Array containing the column indices for each
non-zero element of the matrix C.
452
c. Refer to columns array description in Sparse Matrix
ic INTEGER. Array of length m + 1, containing indices of
elements in the array c, such that ic(I) is the index in the
array c of the first non-zero element from the row I. The
value of the last element ic(m + 1) is equal to the number
of non-zero elements of the matrix C plus one. Refer to
for more details.
info INTEGER.
If info=I>0, the routine stops calculation in the I-th row
of the matrix C because number of elements in C exceeds
nzmax.
If info=-1, the routine calculates only the size of the arrays
c and jc and returns this value plus 1 as the last element
of the array ic.
453
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER trans
INTEGER request, sort, m, n, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
REAL a(*), b(*), c(*), beta
SUBROUTINE mkl_dcsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER trans
DOUBLE PRECISION a(*), b(*), c(*), beta
SUBROUTINE mkl_ccsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER trans
COMPLEX a(*), b(*), c(*), beta
SUBROUTINE mkl_zcsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER trans
DOUBLE COMPLEX a(*), b(*), c(*), beta
454
C:
void mkl_scsradd(char *trans, int *request, int *sort, int *m, int *n, float *a, int *ja,
int *ia, float *beta, float *b, int *jb, int *ib, float *c,
int *jc, int *ic, int *nzmax, int *info);
void mkl_dcsradd(char *trans, int *request, int *sort, int *m, int *n,
double *a, int *ja, int *ia, double *beta, double *b, int *jb, int *ib, double *c,
void mkl_ccsradd(char *trans, int *request, int *sort, int *m, int *n,
MKL_Complex8 *a, int *ja, int *ia, MKL_Complex8 *beta, MKL_Complex8 *b,
int *jb, int *ib, MKL_Complex8 *c, int *jc, int *ic, int *nzmax, int *info);
void mkl_zcsradd(char *trans, int *request, int *sort, int *m, int *n,
MKL_Complex16 *a, int *ja, int *ia, MKL_Complex16 *beta, MKL_Complex16 *b,
int *jb, int *ib, MKL_Complex16 *c, int *jc, int *ic, int *nzmax, int *info);
mkl_?csrmultcsr
Computes product of two sparse matrices stored
in the CSR format (3-array variation) with
one-based indexing.
Syntax
Fortran:
call mkl_scsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c,
call mkl_dcsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib ,
c, jc, ic, nzmax, info)
call mkl_ccsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib,
c, jc, ic, nzmax, info)
call mkl_zcsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c,
455
C:
mkl_scsrmultcsr(&trans, &request, &sort, &m, &n, &k, a, ja, ia, b, jb, ib,
mkl_dcsrmultcsr(&trans, &request, &sort, &m, &n, &k, a, ja, ia, b, jb, ib,
mkl_ccsrmultcsr(&trans, &request, &sort, &m, &n, &k, a, ja, ia, b, jb, ib,
mkl_zcsrmultcsr(&trans, &request, &sort, &m, &n, &k, a, ja, ia, b, jb, ib,
Description
for C interface.
The mkl_?csrmultcsr routine performs a matrix-matrix operation defined as
C := op(A)*B
where:
A, B, C are the sparse matrices in the CSR format (3-array variation);
op(A) is one of op(A) = A, or op(A) =A', or op(A) = conjg(A') .

matrices.
Input Parameters
If trans = 'N' or 'n', then C := A*B
If trans = 'T' or 't' or 'C' or 'c', then C := A'*B.
456
request INTEGER.
If request=0, the routine performs multiplication, the
memory for the output arrays ic, jc, c must be allocated
beforehand.
If request=1, the routine computes only values of the array
ic of length m + 1, the memory for this array must be
allocated beforehand. On exit the value ic(m+1) - 1 is the
actual number of the elements in the arrays c and jc.
If request=2, the routine has been called previously with
the parameter request=1, the output arrays jc and c are
allocated in the calling program and they are of the length
(m+1)-1 at least.
sort INTEGER. Specifies the type of reordering. If this parameter
is not set (default), the routine does not perform reordering.
If sort=1, the routine arranges the column indices ja for
corresponding values of the matrix A in the array a.
If sort=2, the routine arranges the column indices jb for
corresponding values of the matrix B in the array b.
If sort=3, the routine performs reordering for both input
matrices A and B.
k INTEGER. Number of columns of the matrix B.
a REAL for mkl_scsrmultcsr.
DOUBLE PRECISION for mkl_dcsrmultcsr.
COMPLEX for mkl_ccsrmultcsr.
DOUBLE COMPLEX for mkl_zcsrmultcsr.
457

ia INTEGER. Array of length m + 1 when trans = 'N' or 'n',
or n + 1 otherwise.
This array contains indices of elements in the array a, such
that ia(I) is the index in the array a of the first non-zero
element from the row I. The value of the last element ia(m
+ 1) or ia(n + 1) is equal to the number of non-zero
elements of the matrix A plus one. Refer to rowIndex array
details.
b REAL for mkl_scsrmultcsr.
ib INTEGER. Array of length m + 1.
+ 1) is equal to the number of non-zero elements of the
matrix B plus one. Refer to rowIndex array description in
Sparse Matrix Storage Formats for more details.
nzmax INTEGER. The length of the arrays c and jc.
This parameter is used only if request=0. The routine stops
calculation if the number of elements in the result matrix C
exceeds the specified value of nzmax.
458
Output Parameters
c REAL for mkl_scsrmultcsr.
Its length is equal to the number of non-zero elements in
the matrix C. Refer to values array description in Sparse
jc INTEGER. Array containing the column indices for each
non-zero element of the matrix C.
c. Refer to columns array description in Sparse Matrix
ic INTEGER. Array of length m + 1 when trans = 'N' or 'n',
or n + 1 otherwise.
This array contains indices of elements in the array c, such
that ic(I) is the index in the array c of the first non-zero
element from the row I. The value of the last element ic(m
+ 1) or ic(n + 1) is equal to the number of non-zero
elements of the matrix C plus one. Refer to rowIndex array
details.
info INTEGER.
If info=I>0, the routine stops calculation in the I-th row
of the matrix C because number of elements in C exceeds
nzmax.
If info=-1, the routine calculates only the size of the arrays
c and jc and returns this value plus 1 as the last element
of the array ic.
459
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER*1 trans
INTEGER request, sort, m, n, k, nzmax, info
REAL a(*), b(*), c(*)
SUBROUTINE mkl_dcsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER*1 trans
DOUBLE PRECISION a(*), b(*), c(*)
SUBROUTINE mkl_ccsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER*1 trans
COMPLEX a(*), b(*), c(*)
SUBROUTINE mkl_zcsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc,
ic, nzmax, info)
CHARACTER*1 trans
DOUBLE COMPLEX a(*), b(*), c(*)
460
C:
void mkl_scsrmultcsr(char *trans, int *request, int *sort, int *m, int *n, int *k,
float *a, int *ja, int *ia, float *b, int *jb, int *ib, float *c,
void mkl_dcsrmultcsr(char *trans, int *request, int *sort, int *m, int *n, int *k,
double *a, int *ja, int *ia, double *b, int *jb, int *ib, double *c,
void mkl_ccsrmultcsr(char *trans, int *request, int *sort, int *m, int *n, int *k,
MKL_Complex8 *a, int *ja, int *ia, MKL_Complex8 *b, int *jb, int *ib,
MKL_Complex8 *c, int *jc, int *ic, int *nzmax, int *info);
void mkl_zcsrmultcsr(char *trans, int *request, int *sort, int *m, int *n, int *k,
MKL_Complex16 *c, int *jc, int *ic, int *nzmax, int *info);
mkl_?csrmultd
Computes product of two sparse matrices stored
in the CSR format (3-array variation) with
one-based indexing. The result is stored in the
dense matrix.
Syntax
Fortran:
call mkl_scsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_dcsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_ccsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_zcsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
C:
mkl_scsrmultd(&trans, &m, &n, &k, a, ja, ia, b, jb, ib, c, &ldc);
mkl_dcsrmultd(&trans, &m, &n, &k, a, ja, ia, b, jb, ib, c, &ldc);
mkl_ccsrmultd(&trans, &m, &n, &k, a, ja, ia, b, jb, ib, c, &ldc);
mkl_zcsrmultd(&trans, &m, &n, &k, a, ja, ia, b, jb, ib, c, &ldc);
461
Description
for C interface.
The mkl_?csrmultd routine performs a matrix-matrix operation defined as
C := op(A)*B
where:
A, B are the sparse matrices in the CSR format (3-array variation), C is dense matrix;
op(A) is one of op(A) = A, or op(A) =A', or op(A) = conjg(A') .

matrices.
Input Parameters
If trans = 'N' or 'n', then C := A*B
If trans = 'T' or 't' or 'C' or 'c', then C := A'*B.
k INTEGER. Number of columns of the matrix B.
a REAL for mkl_scsrmultd.
DOUBLE PRECISION for mkl_dcsrmultd.
COMPLEX for mkl_ccsrmultd.
DOUBLE COMPLEX for mkl_zcsrmultd.
462
ia INTEGER. Array of length m + 1 when trans = 'N' or 'n',
or n + 1 otherwise.
This array contains indices of elements in the array a, such
that ia(I) is the index in the array a of the first non-zero
element from the row I. The value of the last element ia(m
+ 1) or ia(n + 1) is equal to the number of non-zero
elements of the matrix A plus one. Refer to rowIndex array
details.
b REAL for mkl_scsrmultd.
ib INTEGER. Array of length m + 1.
463
+ 1) is equal to the number of non-zero elements of the

matrix B plus one. Refer to rowIndex array description in
Sparse Matrix Storage Formats for more details.
Output Parameters
c REAL for mkl_scsrmultd.
ldc INTEGER. Specifies the first dimension of the dense matrix

C as declared in the calling (sub)program. Must be at least
max(m, 1) when trans = 'N' or 'n', or max(1, n)
otherwise.
464
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
CHARACTER*1 trans
INTEGER m, n, k, ldc
INTEGER ja(*), jb(*), ia(*), ib(*)
REAL a(*), b(*), c(ldc, *)
SUBROUTINE mkl_dcsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
CHARACTER*1 trans
DOUBLE PRECISION a(*), b(*), c(ldc, *)
SUBROUTINE mkl_ccsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
CHARACTER*1 trans
COMPLEX a(*), b(*), c(ldc, *)
SUBROUTINE mkl_zcsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
CHARACTER*1 trans
DOUBLE COMPLEX a(*), b(*), c(ldc, *)
465
C:
void mkl_scsrmultd(char *trans, int *m, int *n, int *k,
float *a, int *ja, int *ia, float *b, int *jb, int *ib, float *c, int *ldc);
void mkl_dcsrmultd(char *trans, int *m, int *n, int *k,
double *a, int *ja, int *ia, double *b, int *jb, int *ib, double *c, int *ldc);
void mkl_ccsrmultd(char *trans, int *m, int *n, int *k,
void mkl_zcsrmultd(char *trans, int *m, int *n, int *k,
BLAS-like Extensions
Intel MKL provides C and Fortran routines to perform certain data manipulation, including matrix
in-place and out-of-place transposition operations combined with simple matrix arithmetic
operations. Transposition operations are Copy As Is, Conjugate transpose, Transpose, and
Conjugate. Each routine adds the possibility of scaling during the transposition operation by
giving some alpha and/or beta parameters. Each routine supports both row-major orderings
and column-major orderings.
There is also ?gemm3m, a routine that computes a matrix-matrix product of general complex
matrices using matrix multiplication.
Table 2-10 lists these routines.
The <?> symbol in the routine short names is a precision prefix that indicates the data type:
s REAL for Fortran interface, or float for C interface
d DOUBLE PRECISION for Fortran interface, or double for C interface.
c COMPLEX for Fortran interface, or MKL_Complex8 for C interface.
z DOUBLE COMPLEX for Fortran interface, or MKL_Complex16 for C
interface.
Table 2-10. BLAS-like Extensions
Routine Data Types Description
mkl_?imatcopy s, d, c, z Performs scaling and in-place transposition/copying

of matrices.
mkl_?omatcopy s, d, c, z Performs scaling and out-of-place

transposition/copying of matrices.
466
mkl_?omatcopy2 s, d, c, z Performs two-strided scaling and out-of-place

mkl_?omatadd s, d, c, z Performs scaling and sum of two matrices including

their out-of-place transposition/copying.
?gemm3m c, z Computes a scalar-matrix-matrix product using

matrix multiplications and adds the result to a
mkl_?imatcopy
Performs scaling and in-place transposition/copying
of matrices.
Syntax
Fortran:
call mkl_simatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda)
call mkl_dimatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda)
call mkl_cimatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda)
call mkl_zimatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda)
C:
mkl_simatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda);
mkl_dimatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda);
mkl_cimatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda);
mkl_zimatcopy(ordering, trans, rows, cols, alpha, a, src_lda, dst_lda);
Description
This routine is declared in mkl_trans.fi for FORTRAN 77 interface and in mkl_trans.h for
C interface. Always reference mkl_trans.h for C interface correct resolution.
The mkl_?imatcopy routine performs scaling and in-place transposition/copying of matrices.
A transposition operation can be a normal matrix copy, a transposition, a conjugate transposition,
or just a conjugation. The operation is defined as follows:
467
A := alpha*op(A).
The routine parameter descriptions are common for all implemented interfaces with the exception
of data types that refer here to the FORTRAN 77 standard types. Data types specific to the
different interfaces are described in the section “Interfaces” below.
Note that different arrays should not overlap.
Input Parameters
ordering CHARACTER*1. Ordering of the matrix storage.
If ordering = 'R' or 'r', the ordering is row-major.
If ordering = 'C' or 'c', the ordering is column-major.
trans CHARACTER*1. Parameter that specifies the operation type.
If trans = 'N' or 'n', op(A)=A and the matrix A is
assumed unchanged on input.
If trans = 'T' or 't', it is assumed that A should be
transposed.
If trans = 'C' or 'c', it is assumed that A should be
conjugate transposed.
If trans = 'R' or 'r', it is assumed that A should be only
conjugated.
If the data is real, then trans = 'R' is the same as trans
= 'N', and trans = 'C' is the same as trans = 'T'.
rows INTEGER. The number of matrix rows.
cols INTEGER. The number of matrix columns.
a REAL for mkl_simatcopy.
DOUBLE PRECISION for mkl_dimatcopy.
COMPLEX for mkl_cimatcopy.
DOUBLE COMPLEX for mkl_zimatcopy.
Array, DIMENSION a(scr_lda,*).
alpha REAL for mkl_simatcopy.
This parameter scales the input matrix by alpha.
468
src_lda INTEGER. Distance between the first elements in adjacent
columns (in the case of the column-major order) or rows
(in the case of the row-major order) in the source matrix;
measured in the number of elements.
This parameter must be at least max(1,rows) if ordering
= 'C' or 'c', and max(1,cols) otherwise.
dst_lda INTEGER. Distance between the first elements in adjacent
(in the case of the row-major order) in the destination
matrix; measured in the number of elements.
To determine the minimum value of dst_lda on output,
consider the following guideline:
If ordering = 'C' or 'c', then
• If trans = 'T' or 't' or 'C' or 'c', this parameter

must be at least max(1,rows)
• If trans = 'N' or 'n' or 'R' or 'r', this parameter
must be at least max(1,cols)
If ordering = 'R' or 'r', then

Output Parameters
a REAL for mkl_simatcopy.
Contains the matrix A.
469
Interfaces
FORTRAN 77:
SUBROUTINE mkl_simatcopy ( ordering, trans, rows, cols, alpha, a, src_lda, dst_lda )
CHARACTER*1 ordering, trans
INTEGER rows, cols, src_ld, dst_ld
REAL a(*), alpha*
SUBROUTINE mkl_dimatcopy ( ordering, trans, rows, cols, alpha, a, src_lda, dst_lda )
DOUBLE PRECISION a(*), alpha*
SUBROUTINE mkl_cimatcopy ( ordering, trans, rows, cols, alpha, a, src_lda, dst_lda )
COMPLEX a(*), alpha*
SUBROUTINE mkl_zimatcopy ( ordering, trans, rows, cols, alpha, a, src_lda, dst_lda )
DOUBLE COMPLEX a(*), alpha*
C:
void mkl_simatcopy(char ordering, char trans, size_t rows, size_t cols, float *alpha, float
*a, size_t src_lda, size_t dst_lda);
void mkl_dimatcopy(char ordering, char trans, size_t rows, size_t cols, double *alpha, float
*a, size_t src_lda, size_t dst_lda);
void mkl_cimatcopy(char ordering, char trans, size_t rows, size_t cols, MKL_Complex8 *alpha,
MKL_Complex8 *a, size_t src_lda, size_t dst_lda);
void mkl_zimatcopy(char ordering, char trans, size_t rows, size_t cols, MKL_Complex16 *alpha,
MKL_Complex16 *a, size_t src_lda, size_t dst_lda);
470
mkl_?omatcopy
Performs scaling and out-place
Syntax
Fortran:
call mkl_somatcopy(ordering, trans, rows, cols, alpha, src, src_ld, dst,
dst_ld)
call mkl_domatcopy(ordering, trans, rows, cols, alpha, src, src_ld, dst,
dst_ld)
call mkl_comatcopy(ordering, trans, rows, cols, alpha, src, src_ld, dst,
dst_ld)
call mkl_zomatcopy(ordering, trans, rows, cols, alpha, src, src_ld, dst,
dst_ld)
C:
mkl_somatcopy(ordering, trans, rows, cols, alpha, SRC, src_stride, DST,
dst_stride);
mkl_domatcopy(ordering, trans, rows, cols, alpha, SRC, src_stride, DST,
dst_stride);
mkl_comatcopy(ordering, trans, rows, cols, alpha, SRC, src_stride, DST,
dst_stride);
mkl_zomatcopy(ordering, trans, rows, cols, alpha, SRC, src_stride, DST,
dst_stride);
Description
The mkl_?omatcopy routine performs scaling and out-of-place transposition/copying of matrices.
A transposition operation can be a normal matrix copy, a transposition, a conjugate transposition,
or just a conjugation. The operation is defined as follows:
B := alpha*op(A)
471
of data types that mostly refer here to the FORTRAN 77 standard types. Data types specific to
the different interfaces are described in the section “Interfaces” below.
Input Parameters
transposed.
conjugated.
alpha REAL for mkl_somatcopy.
DOUBLE PRECISION for mkl_domatcopy.
COMPLEX for mkl_comatcopy.
DOUBLE COMPLEX for mkl_zomatcopy.
src REAL for mkl_somatcopy.
Array, DIMENSION src(scr_ld,*).
472
src_ld INTEGER. (Fortran interface). Distance between the first
elements in adjacent columns (in the case of the
column-major order) or rows (in the case of the row-major
order) in the source matrix; measured in the number of
elements.
src_stride INTEGER. (C interface). Distance between the first elements
in adjacent columns (in the case of the column-major order)
or rows (in the case of the row-major order) in the source
matrix; measured in the number of elements.
dst REAL for mkl_somatcopy.
Array, DIMENSION dst(dst_ld,*).
dst_ld INTEGER. (Fortran interface). Distance between the first
elements in adjacent columns (in the case of the
column-major order) or rows (in the case of the row-major
order) in the destination matrix; measured in the number
of elements.


473
dst_stride INTEGER. (C interface). Distance between the first elements

in adjacent columns (in the case of the column-major order)
or rows (in the case of the row-major order) in the
destination matrix; measured in the number of elements.


Output Parameters
dst REAL for mkl_somatcopy.
Contains the destination matrix.
474
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatcopy ( ordering, trans, rows, cols, alpha, src, src_ld, dst, dst_ld )
REAL alpha, dst(dst_ld,*), src(src_ld,*)
SUBROUTINE mkl_domatcopy ( ordering, trans, rows, cols, alpha, src, src_ld, dst, dst_ld )
INTEGER rows, cols, src_lda, dst_lda
DOUBLE PRECISION alpha, dst(dst_ld,*), src(src_ld,*)
SUBROUTINE mkl_comatcopy ( ordering, trans, rows, cols, alpha, src, src_ld, dst, dst_ld )
COMPLEX alpha, dst(dst_ld,*), src(src_ld,*)
SUBROUTINE mkl_zomatcopy ( ordering, trans, rows, cols, alpha, src, src_ld, dst, dst_ld )
DOUBLE COMPLEX alpha, dst(dst_ld,*), src(src_ld,*)
C:
void mkl_somatcopy(char ordering, char trans, size_t rows, size_t cols, float *alpha, float
*SRC, size_t src_stride, float *DST, size_t dst_stride);
void mkl_domatcopy(char ordering, char trans, size_t rows, size_t cols, float *alpha, double
*SRC, size_t src_stride, double *DST, size_t dst_stride);
void mkl_comatcopy(char ordering, char trans, size_t rows, size_t cols, MKL_Complex8 *alpha,
MKL_Complex8 *SRC, size_t src_stride, MKL_Complex8 *DST, size_t dst_stride);
void mkl_zomatcopy(char ordering, char trans, size_t rows, size_t cols, MKL_Complex16 *alpha,
MKL_Complex16 *SRC, size_t src_stride, MKL_Complex16 *DST, size_t dst_stride);
475
mkl_?omatcopy2
Performs two-strided scaling and out-of-place
Syntax
Fortran:
call mkl_somatcopy2(ordering, trans, rows, cols, alpha, src, src_row, src_col,
dst, dst_row, dst_col)
call mkl_domatcopy2(ordering, trans, rows, cols, alpha, src, src_row, src_col,
call mkl_comatcopy2(ordering, trans, rows, cols, alpha, src, src_row, src_col,
call mkl_zomatcopy2(ordering, trans, rows, cols, alpha, src, src_row, src_col,
C:
mkl_somatcopy2(ordering, trans, rows, cols, alpha, SRC, src_row, src_col,
DST, dst_row, dst_col);
mkl_domatcopy2(ordering, trans, rows, cols, alpha, SRC, src_row, src_col,
mkl_comatcopy2(ordering, trans, rows, cols, alpha, SRC, src_row, src_col,
mkl_zomatcopy2(ordering, trans, rows, cols, alpha, SRC, src_row, src_col,
Description
The mkl_?omatcopy2 routine performs two-strided scaling and out-of-place transposition/copying
of matrices. A transposition operation can be a normal matrix copy, a transposition, a conjugate
transposition, or just a conjugation. The operation is defined as follows:
B := alpha*op(A)
476
Normally, matrices in the BLAS or LAPACK are specified by a single stride index. For instance,
in the column-major order, A(2,1) is stored in memory one element away from A(1,1), but
A(1,2) is a leading dimension away. The leading dimension in this case is the single stride. If
a matrix has two strides, then both A(2,1) and A(1,2) may be an arbitrary distance from
A(1,1).
Input Parameters
transposed.
conjugated.
alpha REAL for mkl_somatcopy2.
DOUBLE PRECISION for mkl_domatcopy2.
COMPLEX for mkl_comatcopy2.
DOUBLE COMPLEX for mkl_zomatcopy2.
src REAL for mkl_somatcopy2.
477
Array, DIMENSION src(*).

src_row INTEGER. Distance between the first elements in adjacent
rows in the source matrix; measured in the number of
elements.
This parameter must be at least max(1,rows).
src_col INTEGER. Distance between the first elements in adjacent
columns in the source matrix; measured in the number of
elements.
This parameter must be at least max(1,cols).
dst REAL for mkl_somatcopy2.
Array, DIMENSION dst(*).
dst_row INTEGER. Distance between the first elements in adjacent
rows in the destination matrix; measured in the number of
elements.
To determine the minimum value of dst_row on output,

dst_col INTEGER. Distance between the first elements in adjacent

columns in the destination matrix; measured in the number
of elements.

478
Output Parameters
dst REAL for mkl_somatcopy2.
Contains the destination matrix.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatcopy2 ( ordering, trans, rows, cols, alpha, src, src_row, src_col, dst,
dst_row, dst_col )
INTEGER rows, cols, src_row, src_col, dst_row, dst_col
REAL alpha, dst(*), src(*)
SUBROUTINE mkl_domatcopy2 ( ordering, trans, rows, cols, alpha, src, src_row, src_col, dst,
dst_row, dst_col )
DOUBLE PRECISION alpha, dst(*), src(*)
SUBROUTINE mkl_comatcopy2 ( ordering, trans, rows, cols, alpha, src, src_row, src_col, dst,
dst_row, dst_col )
COMPLEX alpha, dst(*), src(*)
SUBROUTINE mkl_zomatcopy2 ( ordering, trans, rows, cols, alpha, src, src_row, src_col, dst,
dst_row, dst_col )
DOUBLE COMPLEX alpha, dst(*), src(*)
479
C:
void mkl_somatcopy2(char ordering, char trans, size_t rows, size_t cols, float *alpha, float
*SRC, size_t src_row, size_t src_col, float *DST, size_t dst_row, size_t dst_col);
void mkl_domatcopy2(char ordering, char trans, size_t rows, size_t cols, float *alpha,
double *SRC, size_t src_row, size_t src_col, double *DST, size_t dst_row, size_t dst_col);
void mkl_comatcopy2(char ordering, char trans, size_t rows, size_t cols, MKL_Complex8 *alpha,
MKL_Complex8 *SRC, size_t src_row, size_t src_col, MKL_Complex8 *DST, size_t dst_row,
size_t dst_col);
void mkl_zomatcopy2(char ordering, char trans, size_t rows, size_t cols, MKL_Complex16
*alpha, MKL_Complex16 *SRC, size_t src_row, size_t src_col, MKL_Complex16 *DST, size_t
dst_row, size_t dst_col);
mkl_?omatadd
Performs scaling and sum of two matrices including
their out-of-place transposition/copying.
Syntax
Fortran:
call mkl_somatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b,
ldb, c, ldc)
call mkl_domatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b,
ldb, c, ldc)
call mkl_comatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b,
ldb, c, ldc)
call mkl_zomatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b,
ldb, c, ldc)
480
C:
mkl_somatadd(ordering, transa, transb, m, n, alpha, A, lda, beta, B, ldb, C,
ldc);
mkl_domatadd(ordering, transa, transb, m, n, alpha, A, lda, beta, B, ldb, C,
ldc);
mkl_comatadd(ordering, transa, transb, m, n, alpha, A, lda, beta, B, ldb, C,
ldc);
mkl_zomatadd(ordering, transa, transb, m, n, alpha, A, lda, beta, B, ldb, C,
ldc);
Description
The mkl_?omatadd routine scaling and sum of two matrices including their out-of-place
transposition/copying. A transposition operation can be a normal matrix copy, a transposition,
a conjugate transposition, or just a conjugation. The following out-of-place memory movement
is done:
C := alpha*op(A) + beta*op(B)
op(A) is either transpose, conjugate-transpose, or leave alone depending on transa. If no
transposition of the source matrices is required, m is the number of rows and n is the number
of columns in the source matrices A and B. In this case, the output matrix C is m-by-n.
Input Parameters
transa CHARACTER*1. Parameter that specifies the operation type
on matrix A.
If transa = 'N' or 'n', op(A)=A and the matrix A is
481
If transa = 'T' or 't', it is assumed that A should be

transposed.
If transa = 'C' or 'c', it is assumed that A should be
If transa = 'R' or 'r', it is assumed that A should be only
conjugated.
If the data is real, then transa = 'R' is the same as transa
= 'N', and transa = 'C' is the same as transa = 'T'.
transb CHARACTER*1. Parameter that specifies the operation type
on matrix B.
If transb = 'N' or 'n', op(B)=B and the matrix B is
If transb = 'T' or 't', it is assumed that B should be
transposed.
If transb = 'C' or 'c', it is assumed that B should be
If transb = 'R' or 'r', it is assumed that B should be only
conjugated.
If the data is real, then transb = 'R' is the same as transb
= 'N', and transb = 'C' is the same as transb = 'T'.
m INTEGER. The number of matrix rows.
n INTEGER. The number of matrix columns.
alpha REAL for mkl_somatadd.
DOUBLE PRECISION for mkl_domatadd.
COMPLEX for mkl_comatadd.
DOUBLE COMPLEX for mkl_zomatadd.
a REAL for mkl_somatadd.
Array, DIMENSION a(lda,*).
lda INTEGER. Distance between the first elements in adjacent
(in the case of the row-major order) in the source matrix
A; measured in the number of elements.
482
beta REAL for mkl_somatadd.
This parameter scales the input matrix by beta.
b REAL for mkl_somatadd.
Array, DIMENSION b(ldb,*).
ldb INTEGER. Distance between the first elements in adjacent
(in the case of the row-major order) in the source matrix
B; measured in the number of elements.
Output Parameters
c REAL for mkl_somatadd.
Array, DIMENSION c(ldc,*).
ldc INTEGER. Distance between the first elements in adjacent
(in the case of the row-major order) in the destination
matrix C; measured in the number of elements.
To determine the minimum value of ldc, consider the
following guideline:
• If transa or transb = 'T' or 't' or 'C' or 'c', this

parameter must be at least max(1,rows)
• If transa or transb = 'N' or 'n' or 'R' or 'r', this
parameter must be at least max(1,cols)
483
• If transa or transb = 'T' or 't' or 'C' or 'c', this

parameter must be at least max(1,cols)
• If transa or transb = 'N' or 'n' or 'R' or 'r', this
parameter must be at least max(1,rows)
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c,
ldc )
CHARACTER*1 ordering, transa, transb
INTEGER m, n, lda, ldb, ldc
REAL alpha, beta
REAL a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_domatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c,
ldc )
DOUBLE PRECISION a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_comatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c,
ldc )
COMPLEX alpha, beta
COMPLEX a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zomatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c,
ldc )
DOUBLE COMPLEX a(lda,*), b(ldb,*), c(ldc,*)
484
C:
void mkl_somatadd(char ordering, char transa, char transb, size_t m, size_t n, float *alpha,
float *A, size_t lda, float *beta, float *B, size_t ldb, float *C, size_t ldc);
void mkl_domatadd(char ordering, char transa, char transb, size_t m, size_t n, double *alpha,
double *A, size_t lda, double *beta, float *B, size_t ldb, double *C, size_t ldc);
void mkl_comatadd(char ordering, char transa, char transb, size_t m, size_t n,
MKL_Complex8 *alpha, MKL_Complex8 *A, size_t lda, float *beta, float *B, size_t ldb,
MKL_Complex8 *C, size_t ldc);
void mkl_zomatadd(char ordering, char transa, char transb, size_t m, size_t n,
MKL_Complex16 *alpha, MKL_Complex16 *A, size_t lda, float *beta, float *B, size_t ldb,
MKL_Complex16 *C, size_t ldc);
?gemm3m
Computes a scalar-matrix-matrix product using
matrix multiplications and adds the result to a
Syntax
FORTRAN 77:
call cgemm3m(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemm3m(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
Fortran 95:
call gemm3m(a, b, c [,transa][,transb] [,alpha][,beta])
Description
The ?gemm3m routines perform a matrix-matrix operation with general complex matrices. These
routines are similar to the ?gemm routines, but they use matrix multiplications(see Application
Notes below).
The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
485
op(x) is one of op(x) = x, or op(x) = x', or op(x) = conjg(x'),

A, B and C are matrices:
op(A) is an m-by-k matrix,

op(B) is a k-by-n matrix,
C is an m-by-n matrix.
Input Parameters
transb CHARACTER*1. Specifies the form of op(B) used in the
if transb = 'N' or 'n', then op(B) = B;
if transb = 'T' or 't', then op(B) = B';
if transb = 'C' or 'c', then op(B) = conjg(B').
m INTEGER. Specifies the number of rows of the matrix op(A)
and of the matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix
op(B) and the number of columns of the matrix C.
k INTEGER. Specifies the number of columns of the matrix
op(A) and the number of rows of the matrix op(B).
alpha COMPLEX for cgemm3m
DOUBLE COMPLEX for zgemm3m
a COMPLEX for cgemm3m
486
Array, DIMENSION (lda, ka), where ka is k when transa=
'N' or 'n', and is m otherwise. Before entry with transa=
'N' or 'n', the leading m-by-k part of the array a must
contain the matrix A, otherwise the leading k-by-m part of
the calling (sub)program. When transa= 'N' or 'n', then
least max(1, k).
b COMPLEX for cgemm3m
Array, DIMENSION (ldb, kb), where kb is n when transb
= 'N' or 'n', and is k otherwise. Before entry with transb
= 'N' or 'n', the leading k-by-n part of the array b must
contain the matrix B, otherwise the leading n-by-k part of
the calling (sub)program. When transb = 'N' or 'n', then
ldb must be at least max(1, k), otherwise ldb must be at
least max(1, n).
beta COMPLEX for cgemm3m
When beta is equal to zero, then c need not be set on input.
c COMPLEX for cgemm3m
Before entry, the leading m-by-n part of the array c must
contain the matrix C, except when beta is equal to zero, in
which case c need not be set on entry.
max(1, m).
487
Output Parameters
c Overwritten by the m-by-n matrix (alpha*op(A)*op(B) +
beta*C).

Specific details for the routine gemm3m interface are the following:
ka = m otherwise,
ma = m if transa= 'N',
ma = k otherwise.
kb = n if transb = 'N',
kb = k otherwise,
mb = k if transb = 'N',
mb = n otherwise.
transb Must be 'N', 'C', or 'T'.
Application Notes
These routines perform the complex multiplication by forming the real and imaginary parts of
the input matrices. It allows to use three real matrix multiplications and five real matrix additions,
instead of the conventional four real matrix multiplications and two real matrix additions. The
use of three real matrix multiplications only gives a 25% reduction of time in matrix operations.
This can result in significant savings in computing time for large matrices.
If the errors in the floating point calculations satisfy the following conditions:
488
fl(x op y)=(x op y)(1+δ),|δ|≤u, op=×,/, fl(x±y)=x(1+α)±y(1+β), |α|,|β|≤u
then for n-by-n matrix Ĉ=fl(C1+iC2)= fl((A1+iA2)(B1+iB2))=Ĉ1+iĈ2 the following
estimations are correct
║Ĉ1-C2║≤ 2(n+1)u║A║∞║B║∞+O(u2),
║Ĉ2-C1║≤ 4(n+4)u║A║∞║B║∞+O(u2),
where ║A║∞=max(║A1║∞,║A2║∞), and ║B║∞=max(║B1║∞,║B2║∞).
and hence the matrix multiplications are stable.
489
LAPACK Routines: Linear
Equations 3
This chapter describes the Intel® Math Kernel Library implementation of routines from the LAPACK package
that are used for solving systems of linear equations and performing a number of related computational
tasks. The library includes LAPACK routines for both real and complex data. Routines are supported for
systems of equations with the following types of matrices:
• general
• banded
• symmetric or Hermitian positive-definite (both full and packed storage)
• symmetric or Hermitian positive-definite banded
• symmetric or Hermitian indefinite (both full and packed storage)
• symmetric or Hermitian indefinite banded
• triangular (both full and packed storage)
• triangular banded
• tridiagonal.
For each of the above matrix types, the library includes routines for performing the following computations:
– factoring the matrix (except for triangular matrices)
– equilibrating the matrix
– solving a system of linear equations
– estimating the condition number of a matrix
– refining the solution of linear equations and computing its error bounds
– inverting the matrix.
To solve a particular problem, you can call two or more computational routines or call a corresponding
driver routine that combines several tasks in one call, such as ?gesv for factoring and solving. For example,
to solve a system of linear equations with a general matrix, call ?getrf (LU factorization) and then ?getrs
(computing the solution). Then, call ?gerfs to refine the solution and get the error bounds. Alternatively,
use the driver routine ?gesvx that performs all these tasks in one call.
WARNING. LAPACK routines expect that input matrices do not contain INF or NaN values.
When input data is inappropriate for LAPACK, problems may arise, including possible hangs.
491
Starting from release 8.0, Intel MKL along with FORTRAN 77 interface to LAPACK computational and
driver routines also supports Fortran 95 interface which uses simplified routine calls with shorter
argument lists. The syntax section of the routine description gives the calling sequence for Fortran 95
interface immediately after FORTRAN 77 calls.

To call each routine introduced in this chapter from the FORTRAN 77 program, you can use
the LAPACK name.
LAPACK names are listed in Table 3-1 and Table 3-2 , and have the structure ?yyzzz or
?yyzz, which is described below.
The initial symbol ? indicates the data type:
The second and third letters yy indicate the matrix type and storage scheme:
ge general
gb general band
gt general tridiagonal
po symmetric or Hermitian positive-definite
pp symmetric or Hermitian positive-definite (packed storage)
pb symmetric or Hermitian positive-definite band
pt symmetric or Hermitian positive-definite tridiagonal
sy symmetric indefinite
sp symmetric indefinite (packed storage)
he Hermitian indefinite
hp Hermitian indefinite (packed storage)
tr triangular
tp triangular (packed storage)
tb triangular band
For computational routines, the last three letters zzz indicate the computation performed:
trf form a triangular matrix factorization
492
LAPACK Routines: Linear Equations 3
trs solve the linear system with a factored matrix
con estimate the matrix condition number
rfs refine the solution and compute error bounds
tri compute the inverse matrix using the factorization
equ equilibrate a matrix.
For example, the sgetrf routine performs the triangular factorization of general real matrices
in single precision; the corresponding routine for complex matrices is cgetrf.
For driver routines, the names can end with -sv (meaning a simple driver), or with -svx
(meaning an expert driver).
Names of the LAPACK computational and driver routines for Fortran 95 interface in Intel MKL
are the same as FORTRAN 77 names but without the first letter that indicates the data type.
For example, the name of the routine that performs triangular factorization of general real
matrices in Fortran 95 interface is getrf. Different data types are handled through defining a
specific internal parameter that refers to a module block with named constants for single and
double precision.
Fortran 95 Interface Conventions

Fortran 95 interface to LAPACK is implemented through wrappers that call respective
FORTRAN 77 routines. This interface uses such features of Fortran 95 as assumed-shape arrays
and optional arguments to provide simplified calls to LAPACK routines with fewer arguments.
NOTE. For LAPACK, Intel MKL offers two types of Fortran 95 interfaces:
• using mkl_lapack.fi only through include ‘mkl_lapack.fi’ statement. Such
interfaces allow you to make use of the original LAPACK routines with all their
arguments
• using mkl_lapack.f90 that includes improved interfaces described below. This file
is used to generate the module files mkl95_lapack.mod and mkl95_precision.mod.
See also section “Fortran 95 interfaces and wrappers to LAPACK and BLAS” of Intel®
MKL User's Guide for details. The module files are used to process the FORTRAN use
clauses referencing the LAPACK interface: use mkl95_lapack and use
mkl95_precision.
The main conventions for Fortran 95 interface are as follows:
493
• The names of arguments used in Fortran 95 call are typically the same as for the respective
generic (FORTRAN 77) interface. In rare cases formal argument names may be different.
For instance, select instead of selctg.
• Input arguments such as array dimensions are not required in Fortran 95 and are skipped
from the calling sequence. Array dimensions are reconstructed from the user data that must
exactly follow the required array shape.
Another type of generic arguments that are skipped in Fortran 95 interface are arguments
that represent workspace arrays (such as work, rwork, and so on). The only exception are
cases when workspace arrays return significant information on output.
An argument can also be skipped if its value is completely defined by the presence or absence
of another argument in the calling sequence, and the restored value is the only meaningful
value for the skipped argument.
• Some generic arguments are declared as optional in Fortran 95 interface and may or may
not be present in the calling sequence. An argument can be declared optional if it satisfies
one of the following conditions:
– If the argument value is completely defined by the presence or absence of another
argument in the calling sequence, it can be declared as optional. The difference from the
skipped argument in this case is that the optional argument can have some meaningful
values that are distinct from the value reconstructed by default. For example, if some
argument (like jobz) can take only two values and one of these values directly implies
the use of another argument, then the value of jobz can be uniquely reconstructed from
the actual presence or absence of this second argument, and jobz can be omitted.
– If an input argument can take only a few possible values, it can be declared as optional.
The default value of such argument is typically set as the first value in the list and all
exceptions to this rule are explicitly stated in the routine description.
– If an input argument has a natural default value, it can be declared as optional. The
default value of such optional argument is set to its natural default value.
• Argument info is declared as optional in Fortran 95 interface. If it is present in the calling

sequence, the value assigned to info is interpreted as follows:
– If this value is more than -1000, its meaning is the same as in FORTRAN 77 routine.
– If this value is equal to -1000, it means that there is not enough work memory.
– If this value is equal to -1001, incompatible arguments are present in the calling sequence.
• Optional arguments are given in square brackets in Fortran 95 call syntax.

“Fortran 95 Notes” subsection at the end of the topic describing each routine details concrete
rules for reconstructing the values of omitted optional parameters.
494
Intel® MKL Fortran 95 Interfaces for LAPACK Routines vs. Netlib Implementation
The following list presents general digressions of the Intel MKL LAPACK95 implementation from
the Netlib analog:
• The Intel MKL Fortran 95 interfaces are provided for pure procedures.
• Names of interfaces do not contain the LA_ prefix.
• An optional array argument always has the target attribute.
• Functionality of the Intel MKL LAPACK95 wrapper is close to the FORTRAN 77 original
implementation in the getrf, gbtrf, and potrf interfaces.
• If jobz argument value specifies presence or absence of z argument, then z is always
declared as optional and jobz is restored depending on whether z is present or not. It is
not always so in the Netlib version (see “Modified Netlib Interfaces” in Appendix E).
• To avoid double error checking, processing of the info argument is limited to checking of
the allocated memory and disarranging of optional arguments.
• If an argument that is present in the list of arguments completely defines another argument,
the latter is always declared as optional.
You can transform an application that uses the Netlib LAPACK interfaces to ensure its work with
the Intel MKL interfaces providing that:
a. The application is correct, that is, unambiguous, compiler-independent, and contains no

errors.
b. Each routine name denotes only one specific routine. If any routine name in the application
coincides with a name of the original Netlib routine (for example, after removing the LA_
prefix) but denotes a routine different from the Netlib original routine, this name should be
modified through context name replacement.
You should transform your application in the following cases (see Appendix E for specific
differences of individual interfaces):
• When using the Netlib routines that differ from the Intel MKL routines only by the LA_ prefix
or in the array attribute target. The only transformation required in this case is context
name replacement. See “Interfaces Identical to Netlib” in Appendix E for details.
• When using Netlib routines that differ from the Intel MKL routines by the LA_ prefix, the
target array attribute, and the names of formal arguments. In the case of positional passing
of arguments, no additional transformation except context name replacement is required.
In the case of the keywords passing of arguments, in addition to the context name
replacement the names of mismatching keywords should also be modified. See “Interfaces
with Replaced Argument Names” in Appendix E for details.
495
• When using the Netlib routines that differ from the respective Intel MKL routines by the LA_
prefix, the target array attribute, sequence of the arguments, arguments missing in Intel
MKL but present in Netlib and, vice versa, present in Intel MKL but missing in Netlib. Remove
the differences in the sequence and range of the arguments in process of all the
transformations when you use the Netlib routines specified by this bullet and the preceding
bullet. See “Modified Netlib Interfaces” in Appendix E for details.
• When using the getrf, gbtrf, and potrf interfaces, that is, new functionality implemented
in Intel MKL but unavailable in the Netlib source. To override the differences, build the
desired functionality explicitly with the Intel MKL means or create a new subroutine with
the new functionality, using specific MKL interfaces corresponding to LAPACK 77 routines.
You can call the LAPACK 77 routines directly but using the new Intel MKL interfaces is
preferable. See “Interfaces Absent From Netlib” and “Interfaces of New Functionality”in
Appendix E for details.
Note that if the transformed application calls getrf, gbtrf or potrf without controlling
arguments rcond and norm, just context name replacement is enough in modifying the calls
into the Intel MKL interfaces, as described in the first bullet above. The Netlib functionality
is preserved in such cases.
• When using the Netlib auxiliary routines. In this case, call a corresponding subroutine directly,
using the MKL LAPACK 77 interfaces.
Transform your application as follows:
1. Make sure conditions a. and b. are met.

2. Select Netlib LAPACK 95 calls. For each call do the following:
• Select the type of digression and do the required transformations.
• Revise results to eliminate unneeded code or data, which may appear after several
identical calls.
3. Make sure the transformations are correct and complete.

LAPACK routines use the following matrix storage schemes:
496
• Band storage: an m-by-n band matrix with kl sub-diagonals and ku superdiagonals is stored
compactly in a two-dimensional array ab with kl+ku+1 rows and n columns. Columns of the
matrix are stored in the corresponding columns of the array, and diagonals of the matrix
are stored in rows of the array.
In Chapters 4 and 5 , arrays that hold matrices in packed storage have names ending in p;
arrays with matrices in band storage have names ending in b.
For more information on matrix storage schemes, see “Matrix Arguments” in Appendix B.
Mathematical Notation
Descriptions of LAPACK routines use the following notation:
Ax = b A system of linear equations with an n-by-n matrix A = {aij},
a right-hand side vector b = {bi}, and an unknown vector
x = {xi}.
AX = B A set of systems with a common matrix A and multiple
right-hand sides. The columns of B are individual right-hand
sides, and the columns of X are the corresponding solutions.
|x| the vector with elements |xi| (absolute values of xi).
|A| the matrix with elements |aij| (absolute values of aij).
||x||∞ = maxi|xi| The infinity-norm of the vector x.
||A||∞ = maxiΣj|aij| The infinity-norm of the matrix A.
||A||1 = maxjΣi|aij| The one-norm of the matrix A. ||A||1 = ||AT||∞ = ||AH||∞
κ(A) = ||A|| ||A-1|| The condition number of the matrix A.
Error Analysis
In practice, most computations are performed with rounding errors. Besides, you often need
to solve a system Ax = b, where the data (the elements of A and b) are not known exactly.
Therefore, it is important to understand how the data errors and rounding errors can affect the
solution x.
Data perturbations. If x is the exact solution of Ax = b, and x + δx is the exact solution of

a perturbed problem (A + δA)x = (b + δb), then
497
where
In other words, relative errors in A or b may be amplified in the solution vector x by a factor
κ(A) = ||A|| ||A-1|| called the condition number of A.
Rounding errors have the same effect as relative perturbations c(n)ε in the original data.
Here ε is the machine precision, and c(n) is a modest function of the matrix order n. The
corresponding solution error is
||δx||/||x||≤ c(n)κ(A)ε. (The value of c(n) is seldom greater than 10n.)
Thus, if your matrix A is ill-conditioned (that is, its condition number κ(A) is very large), then
the error in the solution x is also large; you may even encounter a complete loss of precision.
LAPACK provides routines that allow you to estimate κ(A) (see Routines for Estimating the
Condition Number) and also give you a more precise estimate for the actual solution error (see
Refining the Solution and Estimating Its Error).
Computational Routines
Table 3-1 lists the LAPACK computational routines (FORTRAN 77 and Fortran 95 interfaces)
for factorizing, equilibrating, and inverting real matrices, estimating their condition numbers,
solving systems of equations with real matrices, refining the solution, and estimating its error.
Table 3-2 lists similar routines for complex matrices. Respective routine names in Fortran 95
interface are without the first symbol (see Routine Naming Conventions).
Table 3-1 Computational Routines for Systems of Equations with Real Matrices
Matrix type, Factorize Equilibrate Solve Condition Estimate Invert matrix
storage scheme matrix matrix system number error
general ?getrf ?geequ ?getrs ?gecon ?gerfs ?getri
general band ?gbtrf ?gbequ ?gbtrs ?gbcon ?gbrfs
498
general ?gttrf ?gttrs ?gtcon ?gtrfs

tridiagonal
symmetric ?potrf ?poequ ?potrs ?pocon ?porfs ?potri

positive-definite
symmetric ?pptrf ?ppequ ?pptrs ?ppcon ?pprfs ?pptri

positive-definite,
packed storage
symmetric ?pbtrf ?pbequ ?pbtrs ?pbcon ?pbrfs

positive-definite,
band
symmetric ?pttrf ?pttrs ?ptcon ?ptrfs

positive-definite,
tridiagonal
symmetric ?sytrf ?sytrs ?sycon ?syrfs ?sytri

indefinite
symmetric ?sptrf ?sptrs ?spcon ?sprfs ?sptri

indefinite,
packed storage
triangular ?trtrs ?trcon ?trrfs ?trtri
triangular, ?tptrs ?tpcon ?tprfs ?tptri

packed storage
triangular band ?tbtrs ?tbcon ?tbrfs
In the table above, ? denotes s (single precision) or d (double precision) for FORTRAN 77
interface.
Table 3-2 Computational Routines for Systems of Equations with Complex Matrices
general ?getrf ?geequ ?getrs ?gecon ?gerfs ?getri
general band ?gbtrf ?gbequ ?gbtrs ?gbcon ?gbrfs
499

general ?gttrf ?gttrs ?gtcon ?gtrfs

tridiagonal
Hermitian ?potrf ?poequ ?potrs ?pocon ?porfs ?potri

positive-definite
Hermitian ?pptrf ?ppequ ?pptrs ?ppcon ?pprfs ?pptri

positive-definite,
packed storage
Hermitian ?pbtrf ?pbequ ?pbtrs ?pbcon ?pbrfs

positive-definite,
band
Hermitian ?pttrf ?pttrs ?ptcon ?ptrfs

positive-definite,
tridiagonal
Hermitian ?hetrf ?hetrs ?hecon ?herfs ?hetri

indefinite
symmetric ?sytrf ?sytrs ?sycon ?syrfs ?sytri

indefinite
Hermitian ?hptrf ?hptrs ?hpcon ?hprfs ?hptri

indefinite,
packed storage
symmetric ?sptrf ?sptrs ?spcon ?sprfs ?sptri

indefinite,
packed storage
triangular ?trtrs ?trcon ?trrfs ?trtri
triangular, ?tptrs ?tpcon ?tprfs ?tptri

packed storage
triangular band ?tbtrs ?tbcon ?tbrfs
In the table above, ? stands for c (single precision complex) or z (double precision complex)
for FORTRAN 77 interface.
500
Routines for Matrix Factorization

This section describes the LAPACK routines for matrix factorization. The following factorizations
are supported:
• LU factorization
• Cholesky factorization of real symmetric positive-definite matrices
• Cholesky factorization of Hermitian positive-definite matrices
• Bunch-Kaufman factorization of real and complex symmetric matrices
• Bunch-Kaufman factorization of Hermitian matrices.
You can compute:
• the LU factorization using full and band storage of matrices

• the Cholesky factorization using full, packed, and band storage
• the Bunch-Kaufman factorization using full and packed storage.
?getrf
Computes the LU factorization of a general m-by-n
matrix.
Syntax
FORTRAN 77:
call sgetrf( m, n, a, lda, ipiv, info )
call dgetrf( m, n, a, lda, ipiv, info )
call cgetrf( m, n, a, lda, ipiv, info )
call zgetrf( m, n, a, lda, ipiv, info )
Fortran 95:
call getrf( a [,ipiv] [,info] )
Description
This routine is declared in mkl_lapack.fi for FORTRAN 77 interface, in mkl_lapack.f90 for
Fortran 95 interface, and in mkl_lapack.h for C interface.
501
The routine computes the LU factorization of a general m-by-n matrix A as

A = P*L*U,
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower
trapezoidal if m > n) and U is upper triangular (upper trapezoidal if m < n). The routine uses
partial pivoting, with row interchanges.
NOTE. This routine supports the Progress Routine feature. See Progress Function section
for details.
Input Parameters
m INTEGER. The number of rows in the matrix A (m ≥ 0).
n INTEGER. The number of columns in A; n ≥ 0.
a REAL for sgetrf
DOUBLE PRECISION for dgetrf
COMPLEX for cgetrf
DOUBLE COMPLEX for zgetrf.
Array, DIMENSION (lda,*). Contains the matrix A. The second
dimension of a must be at least max(1, n).
lda INTEGER. The first dimension of array a.
Output Parameters
a Overwritten by L and U. The unit diagonal elements of L are not stored.
ipiv INTEGER.
Array, DIMENSION at least max(1,min(m, n)). The pivot indices; for
1 ≤ i ≤ min(m, n) , row i was interchanged with row ipiv(i).
info INTEGER. If info=0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, uii is 0. The factorization has been completed, but U is
exactly singular. Division by 0 will occur if you use the factor U for
solving a system of linear equations.
502
Specific details for the routine getrf interface are as follows:
ipiv Holds the vector of length min(m,n).
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(min(m,n))ε P|L||U|
c(n) is a modest linear function of n, and ε is the machine precision.

The approximate number of floating-point operations for real flavors is
3
(2/3)n If m = n,
2
(1/3)n (3m-n) If m > n,
2
(1/3)m (3n-m) If m < n.
The number of operations for complex flavors is four times greater.

After calling this routine with m = n, you can call the following:
?getrs to solve A*x = B or ATX = B or AHX = B
?gecon to estimate the condition number of A
?getri to compute the inverse of A.
503
?gbtrf
band matrix.
Syntax
FORTRAN 77:
call sgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call dgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call cgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call zgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
Fortran 95:
call gbtrf( ab [,kl] [,m] [,ipiv] [,info] )
Description
The routine forms the LU factorization of a general m-by-n band matrix A with kl non-zero
subdiagonals and ku non-zero superdiagonals, that is,
A = P*L*U,
where P is a permutation matrix; L is lower triangular with unit diagonal elements and at most
kl non-zero elements in each column; U is an upper triangular band matrix with kl + ku
superdiagonals. The routine uses partial pivoting, with row interchanges (which creates the
additional kl superdiagonals in U).
for details.
Input Parameters
m INTEGER. The number of rows in matrix A; m ≥ 0.
n INTEGER. The number of columns in matrix A; n ≥ 0.
504
kl INTEGER. The number of subdiagonals within the band of A; kl ≥ 0.
ku INTEGER. The number of superdiagonals within the band of A; ku ≥ 0.
ab REAL for sgbtrf
DOUBLE PRECISION for dgbtrf
COMPLEX for cgbtrf
DOUBLE COMPLEX for zgbtrf.
Array, DIMENSION (ldab,*). The array ab contains the matrix A in
band storage, in rows kl + 1 to 2*kl + ku + 1; rows 1 to kl of the
array need not be set. The j-th column of A is stored in the j-th column
of the array ab as follows:
ab(kl + ku + 1 + i - j, j) = a(i,j) for max(1,j-ku) ≤ i ≤
min(m,j+kl).
ldab INTEGER. The first dimension of the array ab. (ldab ≥ 2*kl + ku + 1)
Output Parameters
ab Overwritten by L and U. U is stored as an upper triangular band matrix
with kl + ku superdiagonals in rows 1 to kl + ku + 1, and the
multipliers used during the factorization are stored in rows kl + ku
+ 2 to 2*kl + ku + 1. See Application Notes below for further details.
ipiv INTEGER.
Array, DIMENSION at least max(1,min(m, n)). The pivot indices; for
1 ≤ i ≤ min(m, n) , row i was interchanged with row ipiv(i). .
info INTEGER. If info = 0, the execution is successful.
exactly singular. Division by 0 will occur if you use the factor U for

Specific details for the routine gbtrf interface are as follows:
ab Holds the array A of size (2*kl+ku+1,n).
505
ipiv Holds the vector of length min(m,n).

kl If omitted, assumed kl = ku.
ku Restored as ku = lda-2*kl-1.
m If omitted, assumed m = n.
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(kl+ku+1) ε P|L||U|
c(k) is a modest linear function of k, and ε is the machine precision.

The total number of floating-point operations for real flavors varies between approximately
2n(ku+1)kl and 2n(kl+ku+1)kl. The number of operations for complex flavors is four times
greater. All these estimates assume that kl and ku are much less than min(m,n).
The band storage scheme is illustrated by the following example, when m = n = 6, kl = 2, ku

= 1:
Array elements marked * are not used by the routine; elements marked + need not be set on
entry, but are required by the routine to store elements ofU because of fill-in resulting from
the row interchanges.
After calling this routine with m = n, you can call the following routines:
gbtrs to solve A*X = B or AT*X = B or AH*X = B
gbcon to estimate the condition number of A.
506
?gttrf
Computes the LU factorization of a tridiagonal
matrix.
Syntax
FORTRAN 77:
call sgttrf( n, dl, d, du, du2, ipiv, info )
call dgttrf( n, dl, d, du, du2, ipiv, info )
call cgttrf( n, dl, d, du, du2, ipiv, info )
call zgttrf( n, dl, d, du, du2, ipiv, info )
Fortran 95:
call gttrf( dl, d, du, du2 [, ipiv] [,info] )
Description
The routine computes the LU factorization of a real or complex tridiagonal matrix A in the form
A = P*L*U,
where P is a permutation matrix; L is lower bidiagonal with unit diagonal elements; and U is
an upper triangular matrix with nonzeroes in only the main diagonal and first two superdiagonals.
The routine uses elimination with partial pivoting and row interchanges.
Input Parameters
n INTEGER. The order of the matrix A; n ≥ 0.
dl, d, du REAL for sgttrf
DOUBLE PRECISION for dgttrf
COMPLEX for cgttrf
DOUBLE COMPLEX for zgttrf.
Arrays containing elements of A.
The array dl of dimension (n - 1) contains the subdiagonal elements
of A.
The array d of dimension n contains the diagonal elements of A.
507
The array du of dimension (n - 1) contains the superdiagonal elements

of A.
Output Parameters
dl Overwritten by the (n -1) multipliers that define the matrix L from
the LU factorization of A.
d Overwritten by the n diagonal elements of the upper triangular matrix
U from the LU factorization of A.
du Overwritten by the (n-1) elements of the first superdiagonal of U.
du2 REAL for sgttrf
DOUBLE PRECISION for dgttrf
COMPLEX for cgttrf
DOUBLE COMPLEX for zgttrf.
Array, dimension (n -2). On exit, du2 contains (n-2) elements of the
second superdiagonal of U.
ipiv INTEGER.
Array, dimension (n). The pivot indices: for 1 ≤ i ≤ n, row i was
interchanged with row ipiv(i). ipiv(i) is always i or i+1; ipiv(i)
= i indicates a row interchange was not required.
exactly singular. Division by zero will occur if you use the factor U for

Specific details for the routine gttrf interface are as follows:
dl Holds the vector of length (n-1).
d Holds the vector of length n.
du Holds the vector of length (n-1).
du2 Holds the vector of length (n-2).
508
ipiv Holds the vector of length n.
Application Notes
?gbtrs to solve A*X = B or AT*X = B or AH*X = B
?gbcon to estimate the condition number of A.
?potrf
Computes the Cholesky factorization of a
symmetric (Hermitian) positive-definite matrix.
Syntax
FORTRAN 77:
call spotrf( uplo, n, a, lda, info )
call dpotrf( uplo, n, a, lda, info )
call cpotrf( uplo, n, a, lda, info )
call zpotrf( uplo, n, a, lda, info )
Fortran 95:
call potrf( a [, uplo] [,info] )
Description
The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex
data, Hermitian positive-definite matrix A:
A = UT*U for real data, A = UH*U for complex data if uplo='U'

A = L*LT for real data, A = L*LH for complex data if uplo='L'
where L is a lower triangular matrix and U is upper triangular.
for details.
509
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored and
how A is factored:
If uplo = 'U', the array a stores the upper triangular part of the matrix
H
A, and A is factored as U *U.
If uplo = 'L', the array a stores the lower triangular part of the matrix
H
A; A is factored as L*L .
n INTEGER. The order of matrix A; n ≥ 0.
a REAL for spotrf
DOUBLE PRECISION for dpotrf
COMPLEX for cpotrf
DOUBLE COMPLEX for zpotrf.
Array, DIMENSION (lda,*). The array a contains either the upper or
the lower triangular part of the matrix A (see uplo). The second
lda INTEGER. The first dimension of a.
Output Parameters
a The upper or lower triangular part of a is overwritten by the Cholesky
factor U or L, as specified by uplo.
If info = i, the leading minor of order i (and therefore the matrix A
itself) is not positive-definite, and the factorization could not be
completed. This may indicate an error in forming the matrix A.

Specific details for the routine potrf interface are as follows:
a Holds the matrix A of size (n, n).
510
Application Notes
If uplo = 'U', the computed factor U is the exact factor of a perturbed matrix A + E, where

A similar estimate holds for uplo = 'L'.
The total number of floating-point operations is approximately (1/3)n3 for real flavors or
(4/3)n3 for complex flavors.
After calling this routine, you can call the following routines:
?potrs to solve A*X = B
?pocon to estimate the condition number of A
?potri to compute the inverse of A.
?pptrf
symmetric (Hermitian) positive-definite matrix
using packed storage.
Syntax
FORTRAN 77:
call spptrf( uplo, n, ap, info )
call dpptrf( uplo, n, ap, info )
call cpptrf( uplo, n, ap, info )
call zpptrf( uplo, n, ap, info )
Fortran 95:
call pptrf( ap [, uplo] [,info] )
511
Description
data, Hermitian positive-definite packed matrix A:

for details.
Input Parameters
Indicates whether the upper or lower triangular part of A is packed in
the array ap, and how A is factored:
If uplo = 'U', the array ap stores the upper triangular part of the
H
matrix A, and A is factored as U *U.
If uplo = 'L', the array ap stores the lower triangular part of the matrix
H
A; A is factored as L*L .
ap REAL for spptrf
DOUBLE PRECISION for dpptrf
COMPLEX for cpptrf
DOUBLE COMPLEX for zpptrf.
Array, DIMENSION at least max(1, n(n+1)/2). The array ap contains
either the upper or the lower triangular part of the matrix A (as specified
by uplo) in packed storage (see Matrix Storage Schemes).
Output Parameters
ap The upper or lower triangular part of A in packed storage is overwritten
by the Cholesky factor U or L, as specified by uplo.
512

Specific details for the routine pptrf interface are as follows:
ap Holds the array A of size (n*(n+1)/2).
Application Notes

The total number of floating-point operations is approximately (1/3)n3 for real flavors and
?pptrs to solve A*X = B
?ppcon to estimate the condition number of A
?pptri to compute the inverse of A.
513
?pbtrf
symmetric (Hermitian) positive-definite band
matrix.
Syntax
FORTRAN 77:
call spbtrf( uplo, n, kd, ab, ldab, info )
call dpbtrf( uplo, n, kd, ab, ldab, info )
call cpbtrf( uplo, n, kd, ab, ldab, info )
call zpbtrf( uplo, n, kd, ab, ldab, info )
Fortran 95:
call pbtrf( ab [, uplo] [,info] )
Description
data, Hermitian positive-definite band matrix A:

for details.
Input Parameters
Indicates whether the upper or lower triangular part of A is stored in
the array ab, and how A is factored:
514
If uplo = 'U', the upper triangle of A is stored.
If uplo = 'L', the lower triangle of A is stored.
kd INTEGER. The number of superdiagonals or subdiagonals in the matrix
A; kd ≥ 0.
ab REAL for spbtrf
DOUBLE PRECISION for dpbtrf
COMPLEX for cpbtrf
DOUBLE COMPLEX for zpbtrf.
Array, DIMENSION (,*). The array ab contains either the upper or the
lower triangular part of the matrix A (as specified by uplo) in band
storage (see Matrix Storage Schemes). The second dimension of ab
must be at least max(1, n).
ldab INTEGER. The first dimension of the array ab. (ldab ≥ kd + 1)
Output Parameters
ab The upper or lower triangular part of A (in band storage) is overwritten
by the Cholesky factor U or L, as specified by uplo.

Specific details for the routine pbtrf interface are as follows:
ab Holds the array A of size (kd+1,n).
Application Notes
515

The total number of floating-point operations for real flavors is approximately n(kd+1)2. The
number of operations for complex flavors is 4 times greater. All these estimates assume that
kd is much less than n.
?pbtrs to solve A*X = B
?pbcon to estimate the condition number of A.
?pttrf
Computes the factorization of a symmetric
(Hermitian) positive-definite tridiagonal matrix.
Syntax
FORTRAN 77:
call spttrf( n, d, e, info )
call dpttrf( n, d, e, info )
call cpttrf( n, d, e, info )
call zpttrf( n, d, e, info )
Fortran 95:
call pttrf( d, e [,info] )
Description
The routine forms the factorization of a symmetric positive-definite or, for complex data,
Hermitian positive-definite tridiagonal matrix A:
516
A = L*D*L',
where D is diagonal and L is unit lower bidiagonal. The factorization may also be regarded as
having the form A = U'*D*U, where D is unit upper bidiagonal.
Input Parameters
d REAL for spttrf, cpttrf
DOUBLE PRECISION for dpttrf, zpttrf.
Array, dimension (n). Contains the diagonal elements of A.
e REAL for spttrf
DOUBLE PRECISION for dpttrf
COMPLEX for cpttrf
DOUBLE COMPLEX for zpttrf. Array, dimension (n -1). Contains the
subdiagonal elements of A.
Output Parameters
d Overwritten by the n diagonal elements of the diagonal matrix D from
the L*D*L' factorization of A.
e Overwritten by the (n - 1) off-diagonal elements of the unit bidiagonal
factor L or U from the factorization of A.
itself) is not positive-definite; if i < n, the factorization could not be
completed, while if i = n, the factorization was completed, but d(n)
≤ 0.

Specific details for the routine pttrf interface are as follows:
e Holds the vector of length (n-1).
517
?sytrf
Computes the Bunch-Kaufman factorization of a
symmetric matrix.
Syntax
FORTRAN 77:
call ssytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call dsytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call csytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call zsytrf( uplo, n, a, lda, ipiv, work, lwork, info )
Fortran 95:
call sytrf( a [, uplo] [,ipiv] [, info] )
Description
The routine computes the factorization of a real/complex symmetric matrix A using the
Bunch-Kaufman diagonal pivoting method. The form of the factorization is:
T T
if uplo='U', A = P*U*D*U *P
T T
if uplo='L', A = P*L*D*L *P ,
where A is the input matrix, P is a permutation matrix, U and L are upper and lower triangular
matrices with unit diagonal, and D is a symmetric block-diagonal matrix with 1-by-1 and 2-by-2
diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks
of D.
NOTE. This routine supports the Progress Routine feature. See Progress Routine section
for details.
Input Parameters
518
how A is factored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A, and A is factored as P*U*D*UT*PT.
A, and A is factored as P*L*D*LT*PT.
a REAL for ssytrf
DOUBLE PRECISION for dsytrf
COMPLEX for csytrf
DOUBLE COMPLEX for zsytrf.
Array, DIMENSION (lda,*). The array a contains either the upper or
the lower triangular part of the matrix A (see uplo). The second
lda INTEGER. The first dimension of a; at least max(1, n).
work Same type as a. A workspace array, dimension at least max(1,lwork).
lwork INTEGER. The size of the work array (lwork ≥ n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
a The upper or lower triangular part of a is overwritten by details of the
block-diagonal matrix D and the multipliers used to obtain the factor U
(or L).
work(1) If info=0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent runs.
ipiv INTEGER.
Array, DIMENSION at least max(1, n). Contains details of the
interchanges and the block structure of D. If ipiv(i) = k >0, then
dii is a 1-by-1 block, and the i-th row and column of A was
interchanged with the k-th row and column.
519
If uplo = 'U' and ipiv(i) =ipiv(i-1) = -m < 0, then D has a 2-by-2

block in rows/columns i and i-1, and (i-1)-th row and column of A
was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) =ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and (i+1)-th row and column of A
If info = i, dii is 0. The factorization has been completed, but D is
exactly singular. Division by 0 will occur if you use D for solving a system
of linear equations.

Specific details for the routine sytrf interface are as follows:
a holds the matrix A of size (n, n)
ipiv holds the vector of length n
uplo must be 'U' or 'L'. The default value is 'U'.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a
machine-dependent value (typically, 16 to 64) required for optimum performance of the blocked
algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first
run or set lwork = -1.
If you choose the first option and set any of admissible lwork sizes, which is no less than the
minimal value described, the routine completes the task, though probably not so fast as with
a recommended workspace, and provides the recommended workspace in the first element of
the corresponding array work on exit. Use this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended
workspace in the first element of the corresponding array (work). This operation is called a
workspace query.
520
Note that if you set lwork to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The
remaining elements of U and L are stored in the corresponding columns of the array a, but
additional row interchanges are required to recover U or L explicitly (which is seldom necessary).
If ipiv(i) = i for all i =1...n, then all off-diagonal elements of U (L) are stored explicitly in
the corresponding elements of the array a.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A +
E, where
|E| ≤ c(n)ε P|U||D||UT|PT
c(n) is a modest linear function of n, and ε is the machine precision. A similar estimate holds
for the computed L and D when uplo = 'L'.
?sytrs to solve A*X = B
?sycon to estimate the condition number of A
?sytri to compute the inverse of A.
?hetrf
complex Hermitian matrix.
Syntax
FORTRAN 77:
call chetrf( uplo, n, a, lda, ipiv, work, lwork, info )
call zhetrf( uplo, n, a, lda, ipiv, work, lwork, info )
Fortran 95:
call hetrf( a [, uplo] [,ipiv] [,info] )
521
Description
The routine computes the factorization of a complex Hermitian matrix A using the Bunch-Kaufman
diagonal pivoting method:
if uplo='U', A = P*U*D*UH*PT
if uplo='L', A = P*L*D*LH*PT,
matrices with unit diagonal, and D is a Hermitian block-diagonal matrix with 1-by-1 and 2-by-2
of D.
NOTE. This routine supports the Progress Routine feature. See Progress Routine section
for details.
Input Parameters
how A is factored:
matrix A, and A is factored as P*U*D*UH*PT.
A, and A is factored as P*L*D*LH*PT.
a, work COMPLEX for chetrf
DOUBLE COMPLEX for zhetrf.
Arrays, DIMENSION a(lda,*), work(*).
The array a contains the upper or the lower triangular part of the matrix
A (see uplo). The second dimension of a must be at least max(1, n).
work(*) is a workspace array of dimension at least max(1, lwork).
522
issued by xerbla.
Output Parameters
a The upper or lower triangular part of a is overwritten by details of the
block-diagonal matrix D and the multipliers used to obtain the factor U
(or L).
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
ipiv INTEGER.
interchanges and the block structure of D. If ipiv(i) = k > 0, then
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of
A was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a
2-by-2 block in rows/columns i and i+1, and the (i+1)-th row and
column of A was interchanged with the m-th row and column.

Specific details for the routine hetrf interface are as follows:
a holds the matrix A of size (n, n)
523
ipiv holds the vector of length n

uplo must be 'U' or 'L'. The default value is 'U'.
Application Notes
This routine is suitable for Hermitian matrices that are not known to be positive-definite. If A
is in fact positive-definite, the routine does not perform interchanges, and no 2-by-2 diagonal
blocks occur in D.

algorithm.
workspace query.
workspace.
If ipiv(i) = i for all i =1...n, then all off-diagonal elements of U (L) are stored explicitly in
the corresponding elements of the array a.
E, where

A similar estimate holds for the computed L and D when uplo = 'L'.
524
The total number of floating-point operations is approximately (4/3)n3.
?hetrs to solve A*X = B
?hecon to estimate the condition number of A
?hetri to compute the inverse of A.
?sptrf
symmetric matrix using packed storage.
Syntax
FORTRAN 77:
call ssptrf( uplo, n, ap, ipiv, info )
call dsptrf( uplo, n, ap, ipiv, info )
call csptrf( uplo, n, ap, ipiv, info )
call zsptrf( uplo, n, ap, ipiv, info )
Fortran 95:
call sptrf( ap [,uplo] [,ipiv] [,info] )
Description
The routine computes the factorization of a real/complex symmetric matrix A stored in the
packed format using the Bunch-Kaufman diagonal pivoting method. The form of the factorization
is:
if uplo='U', A = P*U*D*UT*PT
if uplo='L', A = P*L*D*LT*PT,
where P is a permutation matrix, U and L are upper and lower triangular matrices with unit
diagonal, and D is a symmetric block-diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks.
U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of D.
525
for details.
Input Parameters
the array ap and how A is factored:
matrix A, and A is factored as P*U*D*UT*PT.
If uplo = 'L', the array ap stores the lower triangular part of the
matrix A, and A is factored as P*L*D*LT*PT.
ap REAL for ssptrf
DOUBLE PRECISION for dsptrf
COMPLEX for csptrf
DOUBLE COMPLEX for zsptrf.
the upper or the lower triangular part of the matrix A (as specified by
uplo) in packed storage (see Matrix Storage Schemes).
Output Parameters
ap The upper or lower triangle of A (as specified by uplo) is overwritten
by details of the block-diagonal matrix D and the multipliers used to
obtain the factor U (or L).
ipiv INTEGER.
block in rows/columns i and i+1, and the (i+1)-th row and column of
526

Specific details for the routine sptrf interface are as follows:
Application Notes
remaining elements of U and L overwrite elements of the corresponding columns of the matrix
A, but additional row interchanges are required to recover U or L explicitly (which is seldom
necessary).
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly
in packed form.
E, where
c(n) is a modest linear function of n, and ε is the machine precision. A similar estimate holds
for the computed L and D when uplo = 'L'.
?sptrs to solve A*X = B
?spcon to estimate the condition number of A
527
?sptri to compute the inverse of A.
?hptrf
complex Hermitian matrix using packed storage.
Syntax
FORTRAN 77:
call chptrf( uplo, n, ap, ipiv, info )
call zhptrf( uplo, n, ap, ipiv, info )
Fortran 95:
call hptrf( ap [,uplo] [,ipiv] [,info] )
Description
The routine computes the factorization of a complex Hermitian packed matrix A using the
Bunch-Kaufman diagonal pivoting method:
matrices with unit diagonal, and D is a Hermitian block-diagonal matrix with 1-by-1 and 2-by-2
of D.
for details.
Input Parameters
528
Indicates whether the upper or lower triangular part of A is packed and
how A is factored:
matrix A, and A is factored as P*U*D*UH*PT.
matrix A, and A is factored as P*L*D*LH*PT.
ap COMPLEX for chptrf
DOUBLE COMPLEX for zhptrf.
Output Parameters
ap The upper or lower triangle of A (as specified by uplo) is overwritten
by details of the block-diagonal matrix D and the multipliers used to
obtain the factor U (or L).
ipiv INTEGER.
block in rows/columns i and i+1, and the (i+1)-th row and column of
529

Specific details for the routine hptrf interface are as follows:
Application Notes
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly
in the corresponding elements of the array a.
E, where

A similar estimate holds for the computed L and D when uplo = 'L'.
3
The total number of floating-point operations is approximately (4/3)n .
?hptrs to solve A*X = B
?hpcon to estimate the condition number of A
?hptri to compute the inverse of A.
Routines for Solving Systems of Linear Equations

This section describes the LAPACK routines for solving systems of linear equations. Before
calling most of these routines, you need to factorize the matrix of your system of equations
(see Routines for Matrix Factorization in this chapter). However, the factorization is not necessary
if your system of equations has a triangular matrix.
530
?getrs
Solves a system of linear equations with an
LU-factored square matrix, with multiple right-hand
sides.
Syntax
FORTRAN 77:
call sgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call dgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call cgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call zgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
Fortran 95:
call getrs( a, ipiv, b [, trans] [,info] )
Description
The routine solves for X the following systems of linear equations:
A*X = B if trans='N',
AT*X = B if trans='T',
AH*X = B if trans='C' (for complex matrices only).
Before calling this routine, you must call ?getrf to compute the LU factorization of A.
Input Parameters
trans CHARACTER*1. Must be 'N' or 'T' or 'C'.
Indicates the form of the equations:
If trans = 'N', then A*X = B is solved for X.
If trans = 'T', then AT*X = B is solved for X.
If trans = 'C', then AH*X = B is solved for X.
n INTEGER. The order of A; the number of rows in B(n ≥ 0).
nrhs INTEGER. The number of right-hand sides; nrhs ≥ 0.
531
a, b REAL for sgetrs

DOUBLE PRECISION for dgetrs
COMPLEX for cgetrs
DOUBLE COMPLEX for zgetrs.
Arrays: a(lda,*), b(ldb,*).
The array a contains LU factorization of matrix A resulting from the call
of ?getrf .
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
The second dimension of a must be at least max(1,n), the second
dimension of b at least max(1,nrhs).
lda INTEGER. The first dimension of a; lda ≥ max(1, n).
ldb INTEGER. The first dimension of b; ldb ≥ max(1, n).
ipiv INTEGER.
Array, DIMENSION at least max(1, n). The ipiv array, as returned by
?getrf.
Output Parameters

Specific details for the routine getrs interface are as follows:
b Holds the matrix B of size (n, nrhs).
trans Must be 'N', 'C', or 'T'. The default value is 'N'.
532
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system
of equations (A + E)x = b, where
|E| ≤ c(n)ε P|L||U|

If x0 is the true solution, the computed solution x satisfies this error bound:
where cond(A,x)= || |A-1||A| |x| ||∞ / ||x||∞ ≤ ||A-1||∞ ||A||∞ = κ∞(A).
Note that cond(A,x) can be much smaller than κ∞(A); the condition number of AT and AH
might or might not be equal to κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 2n2
for real flavors and 8n2 for complex flavors.
To estimate the condition number κ∞(A), call ?gecon.
To refine the solution and estimate the error, call ?gerfs.
533
?gbtrs
Solves a system of linear equations with an
LU-factored band matrix, with multiple right-hand
sides.
Syntax
FORTRAN 77:
call sgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call dgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call cgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call zgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
Fortran 95:
call gbtrs( ab, b, ipiv, [, kl] [, trans] [, info] )
Description
The routine solves for X the following systems of linear equations:
H
A *X = B if trans='C' (for complex matrices only).
Here A is an LU-factored general band matrix of order n with kl non-zero subdiagonals and ku
nonzero superdiagonals. Before calling this routine, call ?gbtrf to compute the LU factorization
of A.
Input Parameters
n INTEGER. The order of A; the number of rows in B; n ≥ 0.
534
ab, b REAL for sgbtrs
DOUBLE PRECISION for dgbtrs
COMPLEX for cgbtrs
DOUBLE COMPLEX for zgbtrs.
Arrays: ab(ldab,*), b(ldb,*).
The array ab contains the matrix A in band storage (see Matrix Storage
Schemes).
The second dimension of ab must be at least max(1, n), and the
second dimension of b at least max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ 2*kl + ku +1.
ldb INTEGER. The leading dimension of b; ldb ≥ max(1, n).
ipiv INTEGER. Array, DIMENSION at least max(1, n). The ipiv array, as
returned by ?gbtrf.
Output Parameters

Specific details for the routine gbtrs interface are as follows:
ipiv Holds the vector of length min(m, n).
ku Restored as lda-2*kl-1.
535
Application Notes
|E| ≤ c(kl + ku + 1)ε P|L||U|

-1 -1
where cond(A,x)= || |A ||A| |x| ||∞ / ||x||∞ ≤ ||A ||∞ ||A||∞ = κ∞(A).
The approximate number of floating-point operations for one right-hand side vector is 2n(ku
+ 2kl) for real flavors. The number of operations for complex flavors is 4 times greater. All
these estimates assume that kl and ku are much less than min(m,n).
To estimate the condition number κ∞(A), call ?gbcon.
To refine the solution and estimate the error, call ?gbrfs.
536
?gttrs
Solves a system of linear equations with a
tridiagonal matrix using the LU factorization
computed by ?gttrf.
Syntax
FORTRAN 77:
call sgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call dgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call cgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call zgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
Fortran 95:
call gttrs( dl, d, du, du2, b, ipiv [, trans] [,info] )
Description
The routine solves for X the following systems of linear equations with multiple right hand sides:
H
Before calling this routine, you must call ?gttrf to compute the LU factorization of A.
Input Parameters
T
If trans = 'T', then A *X = B is solved for X.
H
If trans = 'C', then A *X = B is solved for X.
n INTEGER. The order of A; n ≥ 0.
537
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns in B; nrhs ≥ 0.
dl,d,du,du2,b REAL for sgttrs
DOUBLE PRECISION for dgttrs
COMPLEX for cgttrs
DOUBLE COMPLEX for zgttrs.
Arrays: dl(n -1), d(n), du(n -1), du2(n -2), b(ldb,nrhs).
The array dl contains the (n - 1) multipliers that define the matrix
L from the LU factorization of A.
The array d contains the n diagonal elements of the upper triangular
matrix U from the LU factorization of A.
The array du contains the (n - 1) elements of the first superdiagonal
of U.
The array du2 contains the (n - 2) elements of the second superdiagonal
of U.
ipiv INTEGER. Array, DIMENSION (n). The ipiv array, as returned by ?gt-
trf.
Output Parameters

Specific details for the routine gttrs interface are as follows:
538
Application Notes
|E| ≤ c(n)ε P|L||U|

-1 -1
The approximate number of floating-point operations for one right-hand side vector b is 7n
(including n divisions) for real flavors and 34n (including 2n divisions) for complex flavors.
To estimate the condition number κ∞(A), call ?gtcon.
To refine the solution and estimate the error, call ?gtrfs.
539
?potrs
Cholesky-factored symmetric (Hermitian)
positive-definite matrix.
Syntax
FORTRAN 77:
call spotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call dpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call cpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call zpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
Fortran 95:
call potrs( a, b [,uplo] [, info] )
Description
The routine solves for X the system of linear equations A*X = B with a symmetric
positive-definite or, for complex data, Hermitian positive-definite matrix A, given the Cholesky
factorization of A:
A = UT*U for real data, If uplo = 'U'

A = UH*U for complex
data
A = L*LT for real data, If uplo = 'L',
A = L*LH for complex
data
where L is a lower triangular matrix and U is upper triangular. The system is solved with multiple
right-hand sides stored in the columns of the matrix B.
Before calling this routine, you must call ?potrf to compute the Cholesky factorization of A.
540
Input Parameters
Indicates how the input matrix A has been factored:
nrhs INTEGER. The number of right-hand sides (nrhs ≥ 0).
a, b REAL for spotrs
DOUBLE PRECISION for dpotrs
COMPLEX for cpotrs
DOUBLE COMPLEX for zpotrs.
The array a contains the factor U or L (see uplo).
Output Parameters

Specific details for the routine potrs interface are as follows:
541
Application Notes
If uplo = 'U', the computed solution for each right-hand side b is the exact solution of a
perturbed system of equations (A + E)x = b, where
|E| ≤ c(n)ε |UH||U|

A similar estimate holds for uplo = 'L'. If x0 is the true solution, the computed solution x
satisfies this error bound:
-1 -1
Note that cond(A,x) can be much smaller than κ∞ (A). The approximate number of floating-point
operations for one right-hand side vector b is 2n2 for real flavors and 8n2 for complex flavors.
To estimate the condition number κ∞(A), call ?pocon.
To refine the solution and estimate the error, call ?porfs.
?pptrs
Solves a system of linear equations with a packed
Syntax
FORTRAN 77:
call spptrs( uplo, n, nrhs, ap, b, ldb, info )
call dpptrs( uplo, n, nrhs, ap, b, ldb, info )
call cpptrs( uplo, n, nrhs, ap, b, ldb, info )
call zpptrs( uplo, n, nrhs, ap, b, ldb, info )
542
Fortran 95:
call pptrs( ap, b [,uplo] [,info] )
Description
The routine solves for X the system of linear equations A*X = B with a packed symmetric
positive-definite or, for complex data, Hermitian positive-definite matrix A, given the Cholesky
factorization of A:
A = UT*U for real data, If uplo = 'U'

data
A = L*LT for real data, If uplo = 'L',
data
Before calling this routine, you must call ?pptrf to compute the Cholesky factorization of A.
Input Parameters
nrhs INTEGER. The number of right-hand sides (nrhs ≥ 0).
ap, b REAL for spptrs
DOUBLE PRECISION for dpptrs
COMPLEX for cpptrs
DOUBLE COMPLEX for zpptrs.
Arrays: ap(*), b(ldb,*)
The dimension of ap must be at least max(1,n(n+1)/2).
543
The array ap contains the factor U or L, as specified by uplo, in packed

storage (see Matrix Storage Schemes).
sides for the systems of equations. The second dimension of b must
be at least max(1, nrhs).
Output Parameters

Specific details for the routine pptrs interface are as follows:
Application Notes
If uplo = 'U', the computed solution for each right-hand side b is the exact solution of a
perturbed system of equations (A + E)x = b, where
|E| ≤ c(n)ε |UH||U|

544
-1 -1
Note that cond(A,x) can be much smaller than κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 2n2
for real flavors and 8n2 for complex flavors.
To estimate the condition number κ∞(A), call ?ppcon.
To refine the solution and estimate the error, call ?pprfs.
?pbtrs
positive-definite band matrix.
Syntax
FORTRAN 77:
call spbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call dpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call cpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call zpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
Fortran 95:
call pbtrs( ab, b [,uplo] [,info] )
Description
545
The routine solves for real data a system of linear equations A*X = B with a symmetric
positive-definite or, for complex data, Hermitian positive-definite band matrix A, given the
Cholesky factorization of A:
A = UT*U for real data, If uplo='U'

data
A = L*LT for real data, If uplo='L',
data
Before calling this routine, you must call ?pbtrf to compute the Cholesky factorization of A in
the band storage form.
Input Parameters
If uplo = 'U', the upper triangular factor is stored in ab.
If uplo = 'L', the lower triangular factor is stored in ab.
A; kd ≥ 0.
ab, b REAL for spbtrs
DOUBLE PRECISION for dpbtrs
COMPLEX for cpbtrs
DOUBLE COMPLEX for zpbtrs.
The array ab contains the Cholesky factor, as returned by the
factorization routine, in band storage form.
The second dimension of ab must be at least max(1, n), and the
second dimension of b at least max(1,nrhs).
546
ldab INTEGER. The first dimension of the array ab; ldab ≥ kd +1.
Output Parameters

Specific details for the routine pbtrs interface are as follows:
Application Notes
|E| ≤ c(kd + 1)ε P|UH||U| or |E| ≤ c(kd + 1)ε P|LH||L|

-1 -1
547
The approximate number of floating-point operations for one right-hand side vector is 4n*kd
for real flavors and 16n*kd for complex flavors.
To estimate the condition number κ∞(A), call ?pbcon.
To refine the solution and estimate the error, call ?pbrfs.
?pttrs
symmetric (Hermitian) positive-definite tridiagonal
matrix using the factorization computed by ?pttrf.
Syntax
FORTRAN 77:
call spttrs( n, nrhs, d, e, b, ldb, info )
call dpttrs( n, nrhs, d, e, b, ldb, info )
call cpttrs( uplo, n, nrhs, d, e, b, ldb, info )
call zpttrs( uplo, n, nrhs, d, e, b, ldb, info )
Fortran 95:
call pttrs( d, e, b [,info] )
call pttrs( d, e, b [,uplo] [,info] )
Description
The routine solves for X a system of linear equations A*X = B with a symmetric (Hermitian)
positive-definite tridiagonal matrix A. Before calling this routine, call ?pttrf to compute the
L*D*L' for real data and the L*D*L' or U'*D*U factorization of A for complex data.
Input Parameters
uplo CHARACTER*1. Used for cpttrs/zpttrs only. Must be 'U' or 'L'.
Specifies whether the superdiagonal or the subdiagonal of the tridiagonal
matrix A is stored and how A is factored:
548
If uplo = 'U', the array e stores the superdiagonal of A, and A is
factored as U'*D*U.
If uplo = 'L', the array e stores the subdiagonal of A, and A is factored
as L*D*L'.
n INTEGER. The order of A; n ≥ 0.
columns of the matrix B; nrhs ≥ 0.
d REAL for spttrs, cpttrs
DOUBLE PRECISION for dpttrs, zpttrs.
Array, dimension (n). Contains the diagonal elements of the diagonal
matrix D from the factorization computed by ?pttrf.
e, b REAL for spttrs
DOUBLE PRECISION for dpttrs
COMPLEX for cpttrs
DOUBLE COMPLEX for zpttrs.
Arrays: e(n -1), b(ldb, nrhs).
The array e contains the (n - 1) off-diagonal elements of the unit
bidiagonal factor U or L from the factorization computed by ?pttrf
(see uplo).
Output Parameters

Specific details for the routine pttrs interface are as follows:
549

uplo Used in complex flavors only. Must be 'U' or 'L'. The default value is
'U'.
?sytrs
Solves a system of linear equations with a UDU-
or LDL-factored symmetric matrix.
Syntax
FORTRAN 77:
call ssytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call dsytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call csytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zsytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
Fortran 95:
call sytrs( a, b, ipiv [,uplo] [,info] )
Description
The routine solves for X the system of linear equations A*X = B with a symmetric matrix A,
given the Bunch-Kaufman factorization of A:
if uplo='U', A = P*U*D*UT*PT
if uplo='L', A = P*L*D*LT*PT,
diagonal, and D is a symmetric block-diagonal matrix. The system is solved with multiple
right-hand sides stored in the columns of the matrix B. You must supply to this routine the
factor U (or L) and the array ipiv returned by the factorization routine ?sytrf.
550
Input Parameters
If uplo = 'U', the array a stores the upper triangular factor U of the
factorization A = P*U*D*UT*PT.
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = P*L*D*LT*PT.
returned by ?sytrf.
a, b REAL for ssytrs
DOUBLE PRECISION for dsytrs
COMPLEX for csytrs
DOUBLE COMPLEX for zsytrs.
sides for the system of equations.
The second dimension of a must be at least max(1,n), and the second
Output Parameters

Specific details for the routine sytrs interface are as follows:
551

Application Notes
|E| ≤ c(n)ε P|U||D||UT|PT or |E| ≤ c(n)ε P|L||D||UT|PT

-1 -1
The total number of floating-point operations for one right-hand side vector is approximately
2n2 for real flavors or 8n2 for complex flavors.
To estimate the condition number κ∞(A), call ?sycon.
To refine the solution and estimate the error, call ?syrfs.
552
?hetrs
or LDL-factored Hermitian matrix.
Syntax
FORTRAN 77:
call chetrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zhetrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
Fortran 95:
call hetrs( a, b, ipiv [, uplo] [,info] )
Description
The routine solves for X the system of linear equations A*X = B with a Hermitian matrix A,
if uplo = 'U' A = P*U*D*UH*PT
if uplo = 'L' A = P*L*D*LH*PT,
diagonal, and D is a symmetric block-diagonal matrix. The system is solved with multiple
right-hand sides stored in the columns of the matrix B. You must supply to this routine the
factor U (or L) and the array ipiv returned by the factorization routine ?hetrf.
Input Parameters
factorization A = P*U*D*UH*PT.
factorization A = P*L*D*LH*PT.
553
ipiv INTEGER.
Array, DIMENSION at least max(1, n).
The ipiv array, as returned by ?hetrf.
a, b COMPLEX for chetrs
DOUBLE COMPLEX for zhetrs.
sides for the system of equations.
Output Parameters

Specific details for the routine hetrs interface are as follows:
Application Notes
|E| ≤ c(n)ε P|U||D||UH|PT or |E| ≤ c(n)ε P|L||D||LH|PT
554
-1 -1
8n2.
To estimate the condition number κ∞(A), call ?hecon.
To refine the solution and estimate the error, call ?herfs.
?sptrs
or LDL-factored symmetric matrix using packed
storage.
Syntax
FORTRAN 77:
call ssptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call dsptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call csptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zsptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
Fortran 95:
call sptrs( ap, b, ipiv [, uplo] [,info] )
555
Description
The routine solves for X the system of linear equations A*X = B with a symmetric matrix A,
if uplo='U', A = PUDUTPT
if uplo='L', A = PLDLTPT,
where P is a permutation matrix, U and L are upper and lower packed triangular matrices with
unit diagonal, and D is a symmetric block-diagonal matrix. The system is solved with multiple
right-hand sides stored in the columns of the matrix B. You must supply the factor U (or L) and
the array ipiv returned by the factorization routine ?sptrf.
Input Parameters
If uplo = 'U', the array ap stores the packed factor U of the
factorization A = P*U*D*UT*PT. If uplo = 'L', the array ap stores the
packed factor L of the factorization A = P*L*D*LT*PT.
ipiv INTEGER.
?sptrf.
ap, b REAL for ssptrs
DOUBLE PRECISION for dsptrs
COMPLEX for csptrs
DOUBLE COMPLEX for zsptrs.
Arrays: ap(*), b(ldb,*).
The dimension of ap must be at least max(1,n(n+1)/2). The array ap
contains the factor U or L, as specified by uplo, in packed storage (see
Matrix Storage Schemes).
sides for the system of equations. The second dimension of b must be
at least max(1, nrhs).
556
Output Parameters

Specific details for the routine sptrs interface are as follows:
Application Notes
|E| ≤ c(n)ε P|U||D||UT|PT or |E| ≤ c(n)ε P|L||D||LT|PT

-1 -1
557
2n2 for real flavors or 8n2 for complex flavors.
To estimate the condition number κ∞(A), call ?spcon.
To refine the solution and estimate the error, call ?sprfs.
?hptrs
or LDL-factored Hermitian matrix using packed
storage.
Syntax
FORTRAN 77:
call chptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zhptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
Fortran 95:
call hptrs( ap, b, ipiv [,uplo] [,info] )
Description
The routine solves for X the system of linear equations A*X = B with a Hermitian matrix A,
where P is a permutation matrix, U and L are upper and lower packed triangular matrices with
unit diagonal, and D is a symmetric block-diagonal matrix. The system is solved with multiple
You must supply to this routine the arrays ap (containing U or L)and ipiv in the form returned
by the factorization routine ?hptrf.
558
Input Parameters
If uplo = 'U', the array ap stores the packed factor U of the
factorization A = P*U*D*UH*PT. If uplo = 'L', the array ap stores the
packed factor L of the factorization A = P*L*D*LH*PT.
returned by ?hptrf.
ap, b COMPLEX for chptrs
DOUBLE COMPLEX for zhptrs.
Output Parameters

Specific details for the routine hptrs interface are as follows:
559

Application Notes
|E| ≤ c(n)ε P|U||D||UH|PT or |E| ≤ c(n)ε P|L||D||LH|PT

-1 -1
8n2 for complex flavors.
To estimate the condition number κ∞(A), call ?hpcon.
To refine the solution and estimate the error, call ?hprfs.
560
?trtrs
triangular matrix, with multiple right-hand sides.
Syntax
FORTRAN 77:
call strtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call dtrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call ctrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call ztrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
Fortran 95:
call trtrs( a, b [,uplo] [, trans] [,diag] [,info] )
Description
The routine solves for X the following systems of linear equations with a triangular matrix A,
with multiple right-hand sides stored in B:
H
Input Parameters
Indicates whether A is upper or lower triangular:
If uplo = 'U', then A is upper triangular.
If uplo = 'L', then A is lower triangular.
T
H
561
diag CHARACTER*1. Must be 'N' or 'U'.

If diag = 'N', then A is not a unit triangular matrix.
If diag = 'U', then A is unit triangular: diagonal elements of A are
assumed to be 1 and not referenced in the array a.
a, b REAL for strtrs
DOUBLE PRECISION for dtrtrs
COMPLEX for ctrtrs
DOUBLE COMPLEX for ztrtrs.
The array a contains the matrix A.
Output Parameters

Specific details for the routine trtrs interface are as follows:
a Stands for argument ap in FORTRAN 77 interface. Holds the matrix A
of size (n*(n+1)/2).
562
Application Notes
|E| ≤ c(n)ε |A|
c(n) is a modest linear function of n, and ε is the machine precision. If x0 is the true solution,
the computed solution x satisfies this error bound:
-1 -1
2
The approximate number of floating-point operations for one right-hand side vector b is n for
real flavors and 4n2 for complex flavors.
To estimate the condition number κ∞(A), call ?trcon.
To estimate the error in the solution, call ?trrfs.
563
?tptrs
Solves a system of linear equations with a packed
Syntax
FORTRAN 77:
call stptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call dtptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call ctptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call ztptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
Fortran 95:
call tptrs( ap, b [,uplo] [, trans] [,diag] [,info] )
Description
The routine solves for X the following systems of linear equations with a packed triangular
matrix A, with multiple right-hand sides stored in B:
H
Input Parameters
T
H
564
If diag = 'U', then A is unit triangular: diagonal elements are assumed
to be 1 and not referenced in the array ap.
ap, b REAL for stptrs
DOUBLE PRECISION for dtptrs
COMPLEX for ctptrs
DOUBLE COMPLEX for ztptrs.
contains the matrix A in packed storage (see Matrix Storage Schemes).
Output Parameters

Specific details for the routine tptrs interface are as follows:
565
Application Notes
|E| ≤ c(n)ε |A|

-1 -1
2
The approximate number of floating-point operations for one right-hand side vector b is n for
real flavors and 4n2 for complex flavors.
To estimate the condition number κ∞(A), call ?tpcon.
To estimate the error in the solution, call ?tprfs.
566
?tbtrs
Solves a system of linear equations with a band
Syntax
FORTRAN 77:
call stbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call dtbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call ctbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call ztbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
Fortran 95:
call tbtrs( ab, b [,uplo] [, trans] [,diag] [,info] )
Description
The routine solves for X the following systems of linear equations with a band triangular matrix
A, with multiple right-hand sides stored in B:
H
Input Parameters
T
H
567

to be 1 and not referenced in the array ab.
A; kd ≥ 0.
ab, b REAL for stbtrs
DOUBLE PRECISION for dtbtrs
COMPLEX for ctbtrs
DOUBLE COMPLEX for ztbtrs.
The array ab contains the matrix A in band storage form.
The second dimension of ab must be at least max(1, n), the second
ldab INTEGER. The first dimension of ab; ldab ≥ kd + 1.
Output Parameters

Specific details for the routine tbtrs interface are as follows:
ab Holds the array A of size (kd+1,n)
568
Application Notes
|E|≤ c(n)ε|A|
c(n) is a modest linear function of n, and ε is the machine precision. If x0 is the true solution,
the computed solution x satisfies this error bound:
-1 -1
The approximate number of floating-point operations for one right-hand side vector b is 2n*kd
for real flavors and 8n*kd for complex flavors.
To estimate the condition number κ∞(A), call ?tbcon.
To estimate the error in the solution, call ?tbrfs.
Routines for Estimating the Condition Number

This section describes the LAPACK routines for estimating the condition number of a matrix.
The condition number is used for analyzing the errors in the solution of a system of linear
equations (see Error Analysis). Since the condition number may be arbitrarily large when the
matrix is nearly singular, the routines actually compute the reciprocal condition number.
569
?gecon
Estimates the reciprocal of the condition number
of a general matrix in the 1-norm or the
infinity-norm.
Syntax
FORTRAN 77:
call sgecon( norm, n, a, lda, anorm, rcond, work, iwork, info )
call dgecon( norm, n, a, lda, anorm, rcond, work, iwork, info )
call cgecon( norm, n, a, lda, anorm, rcond, work, rwork, info )
call zgecon( norm, n, a, lda, anorm, rcond, work, rwork, info )
Fortran 95:
call gecon( a, anorm, rcond [,norm] [,info] )
Description
The routine estimates the reciprocal of the condition number of a general matrix A in the 1-norm
or infinity-norm:
-1
κ 1(A) =||A||1||A ||1 = κ ∞(A )= κ ∞(A )
T H
-1
κ ∞(A) =||A||∞||A ||∞ = κ 1(A )= κ 1(A ).
T H
Before calling this routine:
• compute anorm (either ||A|| =max Σ |a | or ||A|| = max Σ |a |)

1 j i ij ∞ i j ij
• call ?getrf to compute the LU factorization of A.
Input Parameters
norm CHARACTER*1. Must be '1' or 'O' or 'I'.
If norm = '1' or 'O', then the routine estimates the condition number
of matrix A in 1-norm.
570
If norm = 'I', then the routine estimates the condition number of
matrix A in infinity-norm.
a, work REAL for sgecon
DOUBLE PRECISION for dgecon
COMPLEX for cgecon
DOUBLE COMPLEX for zgecon. Arrays: a(lda,*), work(*).
The array a contains the LU-factored matrix A, as returned by ?getrf.
The second dimension of a must be at least max(1,n). The array work
is a workspace for the routine.
The dimension of work must be at least max(1, 4*n) for real flavors
and max(1, 2*n) for complex flavors.
anorm REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors. The norm of the
original matrix A (see Description).
iwork INTEGER. Workspace array, DIMENSION at least max(1, n).
rwork REAL for cgecon
DOUBLE PRECISION for zgecon.
Workspace array, DIMENSION at least max(1, 2*n).
Output Parameters
rcond REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal of the condition number. The routine sets
rcond = 0 if the estimate underflows; in this case the matrix is singular
(to working precision). However, anytime rcond is small compared to
1.0, for the working precision, the matrix may be poorly conditioned
or even singular.
571

Specific details for the routine gecon interface are as follows:
norm Must be '1', 'O', or 'I'. The default value is '1'.
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in
practice is nearly always less than 10r. A call to this routine involves solving a number of
systems of linear equations A*x = b or AH*x = b; the number is usually 4 or 5 and never
more than 11. Each solution requires approximately 2*n2 floating-point operations for real
flavors and 8*n2 for complex flavors.
?gbcon
of a band matrix in the 1-norm or the infinity-norm.
Syntax
FORTRAN 77:
call sgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, iwork, info
)
call dgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, iwork, info
)
call cgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, rwork, info
)
call zgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, rwork, info
)
Fortran 95:
call gbcon( ab, ipiv, anorm, rcond [,kl] [,norm] [,info] )
572
Description
The routine estimates the reciprocal of the condition number of a general band matrix A in the
1-norm or infinity-norm:
-1
κ1(A) =||A||1||A ||1 = κ∞(A ) = κ∞(A )
T H
-1
κ∞(A) =||A||∞||A ||∞ = κ1(A ) = κ1(A ).
T H
• compute anorm (either ||A||1 =maxj Σi |aij| or ||A||∞ = maxi Σj |aij|)

• call ?gbtrf to compute the LU factorization of A.
Input Parameters
ldab INTEGER. The first dimension of the array ab. (ldab ≥ 2*kl + ku +1).
returned by ?gbtrf.
ab, work REAL for sgbcon
DOUBLE PRECISION for dgbcon
COMPLEX for cgbcon
DOUBLE COMPLEX for zgbcon.
Arrays: ab(ldab,*), work(*).
The array ab contains the factored band matrix A, as returned by ?gb-
trf.
573
The second dimension of ab must be at least max(1,n). The array work

is a workspace for the routine.
The norm of the original matrix A (see Description).
rwork REAL for cgbcon
DOUBLE PRECISION for zgbcon.
Workspace array, DIMENSION at least max(1, 2*n).
Output Parameters
rcond =0 if the estimate underflows; in this case the matrix is singular
or even singular.

Specific details for the routine gbcon interface are as follows:
574
Application Notes
systems of linear equations A*x = b or AH*x = b; the number is usually 4 or 5 and never
more than 11. Each solution requires approximately 2n(ku + 2kl) floating-point operations
for real flavors and 8n(ku + 2kl) for complex flavors.
?gtcon
of a tridiagonal matrix using the factorization
computed by ?gttrf.
Syntax
FORTRAN 77:
call sgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, iwork, info
)
call dgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, iwork, info
)
call cgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, info )
call zgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, info )
Fortran 95:
call gtcon( dl, d, du, du2, ipiv, anorm, rcond [,norm] [,info] )
Description
The routine estimates the reciprocal of the condition number of a real or complex tridiagonal
matrix A in the 1-norm or infinity-norm:
-1
κ1(A) = ||A||1||A ||1
-1
κ∞(A) = ||A||∞||A ||∞
An estimate is obtained for ||A-1||, and the reciprocal of the condition number is computed
as rcond = 1 / (||A|| ||A-1||).
575
• compute anorm (either ||A|| = max Σ |a | or ||A|| = max Σ |a |)

1 j i ij ∞ i j ij
• call ?gttrf to compute the LU factorization of A.
Input Parameters
dl,d,du,du2 REAL for sgtcon
DOUBLE PRECISION for dgtcon
COMPLEX for cgtcon
DOUBLE COMPLEX for zgtcon.
Arrays: dl(n -1), d(n), du(n -1), du2(n -2).
The array dl contains the (n - 1) multipliers that define the matrix
L from the LU factorization of A as computed by ?gttrf.
The array d contains the n diagonal elements of the upper triangular
matrix U from the LU factorization of A.
The array du contains the (n - 1) elements of the first superdiagonal
of U.
The array du2 contains the (n - 2) elements of the second superdiagonal
of U.
ipiv INTEGER.
Array, DIMENSION (n). The array of pivot indices, as returned by ?gt-
trf.
work REAL for sgtcon
DOUBLE PRECISION for dgtcon
COMPLEX for cgtcon
DOUBLE COMPLEX for zgtcon.
Workspace array, DIMENSION (2*n).
576
iwork INTEGER. Workspace array, DIMENSION (n). Used for real flavors only.
Output Parameters
rcond=0 if the estimate underflows; in this case the matrix is singular
or even singular.

Specific details for the routine gtcon interface are as follows:
Application Notes
systems of linear equations A*x = b; the number is usually 4 or 5 and never more than 11.
Each solution requires approximately 2n2 floating-point operations for real flavors and 8n2 for
complex flavors.
577
?pocon
of a symmetric (Hermitian) positive-definite matrix.
Syntax
FORTRAN 77:
call spocon( uplo, n, a, lda, anorm, rcond, work, iwork, info )
call dpocon( uplo, n, a, lda, anorm, rcond, work, iwork, info )
call cpocon( uplo, n, a, lda, anorm, rcond, work, rwork, info )
call zpocon( uplo, n, a, lda, anorm, rcond, work, rwork, info )
Fortran 95:
call pocon( a, anorm, rcond [,uplo] [,info] )
Description
The routine estimates the reciprocal of the condition number of a symmetric (Hermitian)
positive-definite matrix A:
κ1(A) = ||A||1 ||A-1||1 (since A is symmetric or Hermitian, κ∞(A) = κ1(A)).

1 j i ij ∞ i j ij
• call ?potrf to compute the Cholesky factorization of A.
Input Parameters
a, work REAL for spocon
578
DOUBLE PRECISION for dpocon
COMPLEX for cpocon
DOUBLE COMPLEX for zpocon.
Arrays: a(lda,*), work(*).
The array a contains the factored matrix A, as returned by ?potrf. The
second dimension of a must be at least max(1,n).
The array work is a workspace for the routine. The dimension of work
must be at least max(1, 3*n) for real flavors and max(1, 2*n) for
complex flavors.
anorm REAL for single precision flavors
rwork REAL for cpocon
DOUBLE PRECISION for zpocon.
Workspace array, DIMENSION at least max(1, n).
Output Parameters
rcond REAL for single precision flavors
or even singular.

Specific details for the routine pocon interface are as follows:
579
Application Notes
complex flavors.
?ppcon
of a packed symmetric (Hermitian) positive-definite
matrix.
Syntax
FORTRAN 77:
call sppcon( uplo, n, ap, anorm, rcond, work, iwork, info )
call dppcon( uplo, n, ap, anorm, rcond, work, iwork, info )
call cppcon( uplo, n, ap, anorm, rcond, work, rwork, info )
call zppcon( uplo, n, ap, anorm, rcond, work, rwork, info )
Fortran 95:
call ppcon( ap, anorm, rcond [,uplo] [,info] )
Description
The routine estimates the reciprocal of the condition number of a packed symmetric (Hermitian)
positive-definite matrix A:

1 j i ij ∞ i j ij
580
• call ?pptrf to compute the Cholesky factorization of A.
Input Parameters
ap, work REAL for sppcon
DOUBLE PRECISION for dppcon
COMPLEX for cppcon
DOUBLE COMPLEX for zppcon.
Arrays: ap(*), work(*).
The array ap contains the packed factored matrix A, as returned by
?pptrf. The dimension of ap must be at least max(1,n(n+1)/2).
complex flavors.
rwork REAL for cppcon
DOUBLE PRECISION for zppcon.
Output Parameters
or even singular.
581

Specific details for the routine ppcon interface are as follows:
Application Notes
complex flavors.
?pbcon
of a symmetric (Hermitian) positive-definite band
matrix.
Syntax
FORTRAN 77:
call spbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, iwork, info )
call dpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, iwork, info )
call cpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, rwork, info )
call zpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, rwork, info )
Fortran 95:
call pbcon( ab, anorm, rcond [,uplo] [,info] )
Description
582
The routine estimates the reciprocal of the condition number of a symmetric (Hermitian)
positive-definite band matrix A:

1 j i ij ∞ i j ij
• call ?pbtrf to compute the Cholesky factorization of A.
Input Parameters
If uplo = 'U', the upper triangular factor is stored in ab.
If uplo = 'L', the lower triangular factor is stored in ab.
A; kd ≥ 0.
ldab INTEGER. The first dimension of the array ab. (ldab ≥ kd +1).
ab, work REAL for spbcon
DOUBLE PRECISION for dpbcon
COMPLEX for cpbcon
DOUBLE COMPLEX for zpbcon.
The array ab contains the factored matrix A in band form, as returned
by ?pbtrf. The second dimension of ab must be at least max(1, n).
complex flavors.
rwork REAL for cpbcon
DOUBLE PRECISION for zpbcon.
583
Output Parameters
or even singular.

Specific details for the routine pbcon interface are as follows:
Application Notes
Each solution requires approximately 4*n(kd + 1) floating-point operations for real flavors
and 16*n(kd + 1) for complex flavors.
584
?ptcon
of a symmetric (Hermitian) positive-definite
tridiagonal matrix.
Syntax
FORTRAN 77:
call sptcon( n, d, e, anorm, rcond, work, info )
call dptcon( n, d, e, anorm, rcond, work, info )
call cptcon( n, d, e, anorm, rcond, work, info )
call zptcon( n, d, e, anorm, rcond, work, info )
Fortran 95:
call ptcon( d, e, anorm, rcond [,info] )
Description
The routine computes the reciprocal of the condition number (in the 1-norm) of a real symmetric
or complex Hermitian positive-definite tridiagonal matrix using the factorization A = L*D*LT
for real flavors and A = L*D*LH for complex flavors or A = UT*D*U for real flavors and A =
UH*D*U for complex flavors computed by ?pttrf :
The norm ||A-1|| is computed by a direct method, and the reciprocal of the condition number
is computed as rcond = 1 / (||A|| ||A-1||).
• compute anorm as ||A|| = max Σ |a |

1 j i ij
• call ?pttrf to compute the factorization of A.
Input Parameters
585
d, work REAL for single precision flavors

Arrays, dimension (n).
The array d contains the n diagonal elements of the diagonal matrix D
from the factorization of A, as computed by ?pttrf ;
work is a workspace array.
e REAL for sptcon
DOUBLE PRECISION for dptcon
COMPLEX for cptcon
DOUBLE COMPLEX for zptcon.
Array, DIMENSION (n -1).
Contains off-diagonal elements of the unit bidiagonal factor U or L from
the factorization computed by ?pttrf .
The 1- norm of the original matrix A (see Description).
Output Parameters
or even singular.
info INTEGER.
If info = 0, the execution is successful.

Specific details for the routine gtcon interface are as follows:
586
Application Notes
and 16*n(kd + 1) for complex flavors.
?sycon
of a symmetric matrix.
Syntax
FORTRAN 77:
call ssycon( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call dsycon( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call csycon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zsycon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
Fortran 95:
call sycon( a, ipiv, anorm, rcond [,uplo] [,info] )
Description
The routine estimates the reciprocal of the condition number of a symmetric matrix A:
κ1(A) = ||A||1 ||A-1||1 (since A is symmetric, κ∞(A) = κ1(A)).
• compute anorm (either ||A||1 = maxj Σi |aij| or ||A||∞ = maxi Σj |aij|)

• call ?sytrf to compute the factorization of A.
587
Input Parameters
factorization A = P*U*D*UT*PT.
factorization A = P*L*D*LT*PT.
a, work REAL for ssycon
DOUBLE PRECISION for dsycon
COMPLEX for csycon
DOUBLE COMPLEX for zsycon.
The array a contains the factored matrix A, as returned by ?sytrf. The
The array work is a workspace for the routine.
The dimension of work must be at least max(1, 2*n).
ipiv INTEGER. Array, DIMENSION at least max(1, n).
The array ipiv, as returned by ?sytrf.
Output Parameters
or even singular.
588
Specific details for the routine sycon interface are as follows:
Application Notes
complex flavors.
?hecon
of a Hermitian matrix.
Syntax
FORTRAN 77:
call checon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zhecon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
Fortran 95:
call hecon( a, ipiv, anorm, rcond [,uplo] [,info] )
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A:
κ1(A) = ||A||1 ||A-1||1 (since A is Hermitian, κ∞(A) = κ1(A)).
589
• compute anorm (either ||A||1 =maxj Σi |aij| or ||A||∞ =maxi Σj |aij|)

• call ?hetrf to compute the factorization of A.
Input Parameters
a, work COMPLEX for checon
DOUBLE COMPLEX for zhecon.
The array a contains the factored matrix A, as returned by ?hetrf. The
The array ipiv, as returned by ?hetrf.
Output Parameters
or even singular.
590

Specific details for the routine hecon interface are as follows:
Application Notes
systems of linear equations A*x = b; the number is usually 5 and never more than 11. Each
solution requires approximately 8n2 floating-point operations.
?spcon
of a packed symmetric matrix.
Syntax
FORTRAN 77:
call sspcon( uplo, n, ap, ipiv, anorm, rcond, work, iwork, info )
call dspcon( uplo, n, ap, ipiv, anorm, rcond, work, iwork, info )
call cspcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call zspcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
Fortran 95:
call spcon( ap, ipiv, anorm, rcond [,uplo] [,info] )
591
Description
The routine estimates the reciprocal of the condition number of a packed symmetric matrix A:
κ1(A) = ||A||1 ||A-1||1 (since A is symmetric, κ∞(A) = κ1(A)).
• compute anorm (either ||A||1 = maxj Σi |aij| or ||A||∞ = maxi Σj |aij|)

• call ?sptrf to compute the factorization of A.
Input Parameters
If uplo = 'U', the array ap stores the packed upper triangular factor
U of the factorization A = P*U*D*UT*PT.
If uplo = 'L', the array ap stores the packed lower triangular factor
L of the factorization A = P*L*D*LT*PT.
ap, work REAL for sspcon
DOUBLE PRECISION for dspcon
COMPLEX for cspcon
DOUBLE COMPLEX for zspcon.
?sptrf. The dimension of ap must be at least max(1,n(n+1)/2).
The array ipiv, as returned by ?sptrf.
592
Output Parameters
rcond = 0 if the estimate underflows; in this case the matrix is singular
or even singular.

Specific details for the routine spcon interface are as follows:
Application Notes
complex flavors.
593
?hpcon
of a packed Hermitian matrix.
Syntax
FORTRAN 77:
call chpcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call zhpcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
Fortran 95:
call hpcon( ap, ipiv, anorm, rcond [,uplo] [,info] )
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A:
κ1(A) = ||A||1 ||A-1||1 (since A is Hermitian, κ∞(A) = k1(A)).
• compute anorm (either ||A||1 =maxj Σi |aij| or ||A||∞ =maxi Σj |aij|)

• call ?hptrf to compute the factorization of A.
Input Parameters
If uplo = 'U', the array ap stores the packed upper triangular factor
U of the factorization A = P*U*D*UT*PT.
If uplo = 'L', the array ap stores the packed lower triangular factor
L of the factorization A = P*L*D*LT*PT.
ap, work COMPLEX for chpcon
DOUBLE COMPLEX for zhpcon.
594
?hptrf. The dimension of ap must be at least max(1,n(n+1)/2).
ipiv INTEGER.
Array, DIMENSION at least max(1, n). The array ipiv, as returned by
?hptrf.
Output Parameters
or even singular.

Specific details for the routine hbcon interface are as follows:
Application Notes
systems of linear equations A*x = b; the number is usually 5 and never more than 11. Each
solution requires approximately 8n2 floating-point operations.
595
?trcon
of a triangular matrix.
Syntax
FORTRAN 77:
call strcon( norm, uplo, diag, n, a, lda, rcond, work, iwork, info )
call dtrcon( norm, uplo, diag, n, a, lda, rcond, work, iwork, info )
call ctrcon( norm, uplo, diag, n, a, lda, rcond, work, rwork, info )
call ztrcon( norm, uplo, diag, n, a, lda, rcond, work, rwork, info )
Fortran 95:
call trcon( a, rcond [,uplo] [,diag] [,norm] [,info] )
Description
The routine estimates the reciprocal of the condition number of a triangular matrix A in either
the 1-norm or infinity-norm:
-1
κ1(A) =||A||1 ||A ||1 = κ∞(A ) = κ∞(A )
T H
-1
κ∞ (A) =||A||∞ ||A ||∞ =k1 (A ) = κ1 (A ) .
T H
Input Parameters
If uplo = 'U', the array a stores the upper triangle of A, other array
elements are not referenced.
596
If uplo = 'L', the array a stores the lower triangle of A, other array
elements are not referenced.
to be 1 and not referenced in the array a.
a, work REAL for strcon
DOUBLE PRECISION for dtrcon
COMPLEX for ctrcon
DOUBLE COMPLEX for ztrcon.
The array a contains the matrix A. The second dimension of a must be
at least max(1,n).
complex flavors.
rwork REAL for ctrcon
DOUBLE PRECISION for ztrcon.
Output Parameters
or even singular.
597

Specific details for the routine trcon interface are as follows:
Application Notes
Each solution requires approximately n2 floating-point operations for real flavors and 4n2
operations for complex flavors.
?tpcon
of a packed triangular matrix.
Syntax
FORTRAN 77:
call stpcon( norm, uplo, diag, n, ap, rcond, work, iwork, info )
call dtpcon( norm, uplo, diag, n, ap, rcond, work, iwork, info )
call ctpcon( norm, uplo, diag, n, ap, rcond, work, rwork, info )
call ztpcon( norm, uplo, diag, n, ap, rcond, work, rwork, info )
Fortran 95:
call tpcon( ap, rcond [,uplo] [,diag] [,norm] [,info] )
598
Description
The routine estimates the reciprocal of the condition number of a packed triangular matrix A
in either the 1-norm or infinity-norm:
-1
κ1(A) =||A||1 ||A ||1 = κ∞(A ) = κ∞(A )
T H
-1
κ∞(A) =||A||∞ ||A ||∞ =κ1 (A ) = κ1(A ) .
T H
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Indicates whether A is upper or
lower triangular:
If uplo = 'U', the array ap stores the upper triangle of A in packed
form.
If uplo = 'L', the array ap stores the lower triangle of A in packed
form.
ap, work REAL for stpcon
DOUBLE PRECISION for dtpcon
COMPLEX for ctpcon
DOUBLE COMPLEX for ztpcon.
The array ap contains the packed matrix A. The dimension of ap must
be at least max(1,n(n+1)/2). The array work is a workspace for the
routine.
599
rwork REAL for ctpcon
DOUBLE PRECISION for ztpcon.
Output Parameters
or even singular.

Specific details for the routine tpcon interface are as follows:
Application Notes
Each solution requires approximately n2 floating-point operations for real flavors and 4n2
600
?tbcon
of a triangular band matrix.
Syntax
FORTRAN 77:
call stbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, iwork, info )
call dtbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, iwork, info )
call ctbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, rwork, info )
call ztbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, rwork, info )
Fortran 95:
call tbcon( ab, rcond [,uplo] [,diag] [,norm] [,info] )
Description
The routine estimates the reciprocal of the condition number of a triangular band matrix A in
either the 1-norm or infinity-norm:
-1
κ1(A) =||A||1 ||A ||1 = κ∞(A ) = κ∞(A )
T H
-1
κ∞(A) =||A||∞ ||A ||∞ =κ1 (A ) = κ1(A ) .
T H
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Indicates whether A is upper or
lower triangular:
If uplo = 'U', the array ap stores the upper triangle of A in packed
form.
601
If uplo = 'L', the array ap stores the lower triangle of A in packed

form.
A; kd ≥ 0.
ab, work REAL for stbcon
DOUBLE PRECISION for dtbcon
COMPLEX for ctbcon
DOUBLE COMPLEX for ztbcon.
The array ab contains the band matrix A. The second dimension of ab
must be at least max(1,n). The array work is a workspace for the
routine.
ldab INTEGER. The first dimension of the array ab. (ldab ≥ kd +1).
rwork REAL for ctbcon
DOUBLE PRECISION for ztbcon.
Output Parameters
or even singular.
602
Specific details for the routine tbcon interface are as follows:
Application Notes
and 8*n(kd + 1) operations for complex flavors.
Refining the Solution and Estimating Its Error

This section describes the LAPACK routines for refining the computed solution of a system of
linear equations and estimating the solution error. You can call these routines after factorizing
the matrix of the system of equations and computing the solution (see Routines for Matrix
Factorization and Routines for Solving Systems of Linear Equations).
603
?gerfs
Refines the solution of a system of linear equations
with a general matrix and estimates its error.
Syntax
FORTRAN 77:
call sgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
berr, work, iwork, info )
call dgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
call cgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
berr, work, rwork, info )
call zgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
Fortran 95:
call gerfs( a, af, ipiv, b, x [,trans] [,ferr] [,berr] [,info] )
Description
The routine performs an iterative refinement of the solution to a system of linear equations
H
A*X = B or AT*X = B or A *X = B with a general matrix A, with multiple right-hand sides. For
each computed solution vector x, the routine computes the component-wise backward error
β. This error is the smallest relative perturbation in elements of A and b such that x is the exact
solution of the perturbed system:
|δaij|/|aij| ≤ β|aij|, |δbi|/|bi| ≤ β |bi| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||∞/||x||∞ (here xe is the exact solution).
• call the factorization routine ?getrf

• call the solver routine ?getrs.
604
Input Parameters
If trans = 'N', the system has the form A*X = B.
T
If trans = 'T', the system has the form A *X = B.
H
If trans = 'C', the system has the form A *X = B.
a,af,b,x,work REAL for sgerfs
DOUBLE PRECISION for dgerfs
COMPLEX for cgerfs
DOUBLE COMPLEX for zgerfs.
Arrays:
a(lda,*) contains the original matrix A, as supplied to ?getrf.
af(ldaf,*) contains the factored matrix A, as returned by ?getrf.
b(ldb,*) contains the right-hand side matrix B.
x(ldx,*) contains the solution matrix X.
work(*) is a workspace array.
The second dimension of a and af must be at least max(1, n); the
second dimension of b and x must be at least max(1, nrhs); the
dimension of work must be at least max(1, 3*n) for real flavors and
max(1, 2*n) for complex flavors.
ldaf INTEGER. The first dimension of af; ldaf ≥ max(1, n).
ldx INTEGER. The first dimension of x; ldx ≥ max(1, n).
ipiv INTEGER.
The ipiv array, as returned by ?getrf.
iwork INTEGER.
rwork REAL for cgerfs
DOUBLE PRECISION for zgerfs.
605
Output Parameters
x The refined solution matrix X.
ferr, berr REAL for single precision flavors
Arrays, DIMENSION at least max(1, nrhs). Contain the component-wise
forward and backward errors, respectively, for each solution vector.

Specific details for the routine gerfs interface are as follows:
af Holds the matrix AF of size (n, n).
x Holds the matrix X of size (n, nrhs).
ferr Holds the vector of length (nrhs).
berr Holds the vector of length (nrhs).
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate
the actual error.
For each right-hand side, computation of the backward error involves a minimum of 4n2
floating-point operations (for real flavors) or 16n2 operations (for complex flavors). In addition,
each step of iterative refinement involves 6n2 operations (for real flavors) or 24n2 operations
(for complex flavors); the number of iterations may range from 1 to 5. Estimating the forward
606
error involves solving a number of systems of linear equations A*x = b; the number is usually
4 or 5 and never more than 11. Each solution requires approximately 2n2 floating-point
operations for real flavors or 8n2 for complex flavors.
?gbrfs
with a general band matrix and estimates its error.
Syntax
FORTRAN 77:
call sgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x,
ldx, ferr, berr, work, iwork, info )
call dgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x,
call cgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x,
ldx, ferr, berr, work, rwork, info )
call zgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x,
Fortran 95:
call gbrfs( ab, afb, ipiv, b, x [,kl] [,trans] [,ferr] [,berr] [,info] )
Description
H
A*X = B or AT*X = B or A *X = B with a band matrix A, with multiple right-hand sides. For
|δaij|/|aij| ≤ β|aij|, |δbi|/|bi| ≤ β|bi| such that (A + δA)x = (b + δb).
607
• call the factorization routine ?gbtrf

• call the solver routine ?gbtrs.
Input Parameters
T
H
kl INTEGER. The number of sub-diagonals within the band of A; kl ≥ 0.
ku INTEGER. The number of super-diagonals within the band of A; ku ≥ 0.
ab,afb,b,x,work REAL for sgbrfs
DOUBLE PRECISION for dgbrfs
COMPLEX for cgbrfs
DOUBLE COMPLEX for zgbrfs.
Arrays:
ab(ldab,*) contains the original band matrix A, as supplied to ?gbtrf,
but stored in rows from 1 to kl + ku + 1.
afb(ldafb,*) contains the factored band matrix A, as returned by
?gbtrf.
The second dimension of ab and afb must be at least max(1, n); the
ldab INTEGER. The first dimension of ab.
ldafb INTEGER. The first dimension of afb .
608
ipiv INTEGER.
?gbtrf.
rwork REAL for cgbrfs
DOUBLE PRECISION for zgbrfs.
Output Parameters
info INTEGER. If info =0, the execution is successful.

Specific details for the routine gbrfs interface are as follows:
ab Holds the array A of size (kl+ku+1,n).
afb Holds the array AF of size (2*kl*ku+1,n).
ku Restored as ku = lda-kl-1.
609
Application Notes
the actual error.
For each right-hand side, computation of the backward error involves a minimum of 4n(kl +
ku) floating-point operations (for real flavors) or 16n(kl + ku) operations (for complex flavors).
In addition, each step of iterative refinement involves 2n(4kl + 3ku) operations (for real flavors)
or 8n(4kl + 3ku) operations (for complex flavors); the number of iterations may range from
1 to 5. Estimating the forward error involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2n2 floating-point operations for real flavors or 8n2 for complex flavors.
?gtrfs
with a tridiagonal matrix and estimates its error.
Syntax
FORTRAN 77:
call sgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
call dgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
call cgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
call zgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
Fortran 95:
call gtrfs( dl, d, du, dlf, df, duf, du2, ipiv, b, x [,trans] [,ferr] [,berr]
[,info] )
Description
610
H
A*X = B or AT*X = B or A *X = B with a tridiagonal matrix A, with multiple right-hand sides.
For each computed solution vector x, the routine computes the component-wise backward
error β. This error is the smallest relative perturbation in elements of A and b such that x is
the exact solution of the perturbed system:
• call the factorization routine ?gttrf

• call the solver routine ?gttrs.
Input Parameters
T
H
dl,d,du,dlf, REAL for sgtrfs
df,duf,du2, DOUBLE PRECISION for dgtrfs
b,x,work COMPLEX for cgtrfs
DOUBLE COMPLEX for zgtrfs.
Arrays:
dl, dimension (n -1), contains the subdiagonal elements of A.
d, dimension (n), contains the diagonal elements of A.
du, dimension (n -1), contains the superdiagonal elements of A.
dlf, dimension (n -1), contains the (n - 1) multipliers that define the
matrix L from the LU factorization of A as computed by ?gttrf.
df, dimension (n), contains the n diagonal elements of the upper
triangular matrix U from the LU factorization of A.
611
duf, dimension (n -1), contains the (n - 1) elements of the first

superdiagonal of U.
du2, dimension (n -2), contains the (n - 2) elements of the second
superdiagonal of U.
b(ldb,nrhs) contains the right-hand side matrix B.
x(ldx,nrhs) contains the solution matrix X, as computed by ?gttrs.
work(*) is a workspace array; the dimension of work must be at least
max(1, 3*n) for real flavors and max(1, 2*n) for complex flavors.
ipiv INTEGER.
?gttrf.
rwork REAL for cgtrfs
DOUBLE PRECISION for zgtrfs.
Workspace array, DIMENSION (n). Used for complex flavors only.
Output Parameters
Arrays, DIMENSION at least max(1,nrhs). Contain the component-wise

Specific details for the routine gtrfs interface are as follows:
612
dlf Holds the vector of length (n-1).
df Holds the vector of length n.
duf Holds the vector of length (n-1).
b Holds the matrix B of size (n,nrhs).
x Holds the matrix X of size (n,nrhs).
?porfs
with a symmetric (Hermitian) positive-definite
matrix and estimates its error.
Syntax
FORTRAN 77:
call sporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call dporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call cporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call zporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
Fortran 95:
call porfs( a, af, b, x [,uplo] [,ferr] [,berr] [,info] )
613
Description
A*X = B with a symmetric (Hermitian) positive definite matrix A, with multiple right-hand sides.
For each computed solution vector x, the routine computes the component-wise backward
• call the factorization routine ?potrf

• call the solver routine ?potrs.
Input Parameters
a,af,b,x,work REAL for sporfs
DOUBLE PRECISION for dporfs
COMPLEX for cporfs
DOUBLE COMPLEX for zporfs.
Arrays:
a(lda,*) contains the original matrix A, as supplied to ?potrf.
af(ldaf,*) contains the factored matrix A, as returned by ?potrf.
614
rwork REAL for cporfs
DOUBLE PRECISION for zporfs.
Output Parameters

Specific details for the routine porfs interface are as follows:
af Holds the matrix AF of size (n,n).
615

Application Notes
the actual error.
?pprfs
with a packed symmetric (Hermitian)
positive-definite matrix and estimates its error.
Syntax
FORTRAN 77:
call spprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call dpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call cpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
Fortran 95:
call pprfs( ap, afp, b, x [,uplo] [,ferr] [,berr] [,info] )
616
Description
A*X = B with a packed symmetric (Hermitian)positive definite matrix A, with multiple right-hand
sides. For each computed solution vector x, the routine computes the component-wise backward
• call the factorization routine ?pptrf

• call the solver routine ?pptrs.
Input Parameters
ap,afp,b,x,work REAL for spprfs
DOUBLE PRECISION for dpprfs
COMPLEX for cpprfs
DOUBLE COMPLEX for zpprfs.
Arrays:
ap(*) contains the original packed matrix A, as supplied to ?pptrf.
afp(*) contains the factored packed matrix A, as returned by ?pptrf.
617
The dimension of arrays ap and afp must be at least max(1,n(n+1)/2);

the second dimension of b and x must be at least max(1, nrhs); the
rwork REAL for cpprfs
DOUBLE PRECISION for zpprfs.
Output Parameters
ferr, berr REAL for single precision flavors.

Specific details for the routine pprfs interface are as follows:
afp Holds the array AF of size (n*(n+1)/2).
618
Application Notes
the actual error.
(for complex flavors); the number of iterations may range from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x =
b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires
?pbrfs
with a band symmetric (Hermitian) positive-definite
matrix and estimates its error.
Syntax
FORTRAN 77:
call spbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr,
call dpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr,
call cpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr,
call zpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr,
Fortran 95:
call pbrfs( ab, afb, b, x [,uplo] [,ferr] [,berr] [,info] )
Description
619
A*X = B with a symmetric (Hermitian) positive definite band matrix A, with multiple right-hand
sides. For each computed solution vector x, the routine computes the component-wise backward
• call the factorization routine ?pbtrf

• call the solver routine ?pbtrs.
Input Parameters
A; kd ≥ 0.
ab,afb,b,x,work REAL for spbrfs
DOUBLE PRECISION for dpbrfs
COMPLEX for cpbrfs
DOUBLE COMPLEX for zpbrfs.
Arrays:
ab(ldab,*) contains the original band matrix A, as supplied to ?pbtrf.
afb(ldafb,*) contains the factored band matrix A, as returned by
?pbtrf.
620
The second dimension of ab and afb must be at least max(1, n); the
ldab INTEGER. The first dimension of ab; ldab ≥ kd + 1.
ldafb INTEGER. The first dimension of afb; ldafb ≥ kd + 1.
rwork REAL for cpbrfs
DOUBLE PRECISION for zpbrfs.
Output Parameters

Specific details for the routine pbrfs interface are as follows:
ab Holds the array A of size (kd+1, n).
afb Holds the array AF of size (kd+1, n).
621

Application Notes
the actual error.
For each right-hand side, computation of the backward error involves a minimum of 8n*kd
floating-point operations (for real flavors) or 32n*kd operations (for complex flavors). In
addition, each step of iterative refinement involves 12n*kd operations (for real flavors) or
48n*kd operations (for complex flavors); the number of iterations may range from 1 to 5.
b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately
4n*kd floating-point operations for real flavors or 16n*kd for complex flavors.
?ptrfs
with a symmetric (Hermitian) positive-definite
tridiagonal matrix and estimates its error.
Syntax
FORTRAN 77:
call sptrfs( n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, info )
call dptrfs( n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, info )
call cptrfs( uplo, n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zptrfs( uplo, n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
Fortran 95:
call ptrfs( d, df, e, ef, b, x [,ferr] [,berr] [,info] )
call ptrfs( d, df, e, ef, b, x [,uplo] [,ferr] [,berr] [,info] )
622
Description
A*X = B with a symmetric (Hermitian) positive definite tridiagonal matrix A, with multiple
right-hand sides. For each computed solution vector x, the routine computes the
component-wise backward error β. This error is the smallest relative perturbation in elements
of A and b such that x is the exact solution of the perturbed system:
• call the factorization routine ?pttrf

• call the solver routine ?pttrs.
Input Parameters
uplo CHARACTER*1. Used for complex flavors only. Must be 'U' or 'L'.
Specifies whether the superdiagonal or the subdiagonal of the tridiagonal
matrix A is stored and how A is factored:
If uplo = 'U', the array e stores the superdiagonal of A, and A is
factored as UH*D*U.
If uplo = 'L', the array e stores the subdiagonal of A, and A is factored
as L*D*LH.
d, df, rwork REAL for single precision flavors DOUBLE PRECISION for double precision
flavors
Arrays: d(n), df(n), rwork(n).
The array d contains the n diagonal elements of the tridiagonal matrix
A.
The array df contains the n diagonal elements of the diagonal matrix
D from the factorization of A as computed by ?pttrf.
623
The array rwork is a workspace array used for complex flavors only.
e,ef,b,x,work REAL for sptrfs
DOUBLE PRECISION for dptrfs
COMPLEX for cptrfs
DOUBLE COMPLEX for zptrfs.
Arrays: e(n -1), ef(n -1), b(ldb,nrhs), x(ldx,nrhs), work(*).
The array e contains the (n - 1) off-diagonal elements of the tridiagonal
matrix A (see uplo).
The array ef contains the (n - 1) off-diagonal elements of the unit
bidiagonal factor U or L from the factorization computed by ?pttrf
(see uplo).
The array x contains the solution matrix X as computed by ?pttrs.
The array work is a workspace array. The dimension of work must be
at least 2*n for real flavors, and at least n for complex flavors.
ldx INTEGER. The leading dimension of x; ldx ≥ max(1, n).
Output Parameters
info INTEGER.

Specific details for the routine ptrfs interface are as follows:
624
ef Holds the vector of length (n-1).
uplo Used in complex flavors only. Must be 'U' or 'L'. The default value is
'U'.
?syrfs
with a symmetric matrix and estimates its error.
Syntax
FORTRAN 77:
call ssyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
call dsyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
call csyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
call zsyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
Fortran 95:
call syrfs( a, af, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Description
625
A*X = B with a symmetric full-storage matrix A, with multiple right-hand sides. For each
computed solution vector x, the routine computes the component-wise backward error β.
This error is the smallest relative perturbation in elements of A and b such that x is the exact
• call the factorization routine ?sytrf

• call the solver routine ?sytrs.
Input Parameters
a,af,b,x,work REAL for ssyrfs
DOUBLE PRECISION for dsyrfs
COMPLEX for csyrfs
DOUBLE COMPLEX for zsyrfs.
Arrays:
a(lda,*) contains the original matrix A, as supplied to ?sytrf.
af(ldaf,*) contains the factored matrix A, as returned by ?sytrf.
626
ipiv INTEGER.
?sytrf.
rwork REAL for csyrfs
DOUBLE PRECISION for zsyrfs.
Output Parameters

Specific details for the routine syrfs interface are as follows:
627
Application Notes
the actual error.
?herfs
with a complex Hermitian matrix and estimates its
error.
Syntax
FORTRAN 77:
call cherfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
call zherfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr,
Fortran 95:
call herfs( a, af, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Description
628
A*X = B with a complex Hermitian full-storage matrix A, with multiple right-hand sides. For
• call the factorization routine ?hetrf

• call the solver routine ?hetrs.
Input Parameters
a,af,b,x,work COMPLEX for cherfs
DOUBLE COMPLEX for zherfs.
Arrays:
a(lda,*) contains the original matrix A, as supplied to ?hetrf.
af(ldaf,*) contains the factored matrix A, as returned by ?hetrf.
dimension of work must be at least max(1, 2*n).
629

ipiv INTEGER.
?hetrf.
rwork REAL for cherfs
DOUBLE PRECISION for zherfs.
Output Parameters
ferr, berr REAL for cherfs
DOUBLE PRECISION for zherfs.

Specific details for the routine herfs interface are as follows:
630
Application Notes
the actual error.
operations. In addition, each step of iterative refinement involves 24n2 operations; the number
of iterations may range from 1 to 5.
8n2 floating-point operations.
The real counterpart of this routine is ?ssyrfs/?dsyrfs
?sprfs
with a packed symmetric matrix and estimates the
solution error.
Syntax
FORTRAN 77:
call ssprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dsprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call csprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zsprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
Fortran 95:
call sprfs( ap, afp, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Description
631
A*X = B with a packed symmetric matrix A, with multiple right-hand sides. For each computed
solution vector x, the routine computes the component-wise backward error β. This error is
the smallest relative perturbation in elements of A and b such that x is the exact solution of
the perturbed system:
• call the factorization routine ?sptrf

• call the solver routine ?sptrs.
Input Parameters
ap,afp,b,x,work REAL for ssprfs
DOUBLE PRECISION for dsprfs
COMPLEX for csprfs
DOUBLE COMPLEX for zsprfs.
Arrays:
ap(*) contains the original packed matrix A, as supplied to ?sptrf.
afp(*) contains the factored packed matrix A, as returned by ?sptrf.
The dimension of arrays ap and afp must be at least max(1,
n(n+1)/2); the second dimension of b and x must be at least max(1,
nrhs); the dimension of work must be at least max(1, 3*n) for real
flavors and max(1, 2*n) for complex flavors.
632
ipiv INTEGER.
?sptrf.
rwork REAL for csprfs
DOUBLE PRECISION for zsprfs.
Output Parameters

Specific details for the routine sprfs interface are as follows:
633
Application Notes
the actual error.
(for complex flavors); the number of iterations may range from 1 to 5.
b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires
?hprfs
with a packed complex Hermitian matrix and
estimates the solution error.
Syntax
FORTRAN 77:
call chprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zhprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
Fortran 95:
call hprfs( ap, afp, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Description
A*X = B with a packed complex Hermitian matrix A, with multiple right-hand sides. For each
634
• call the factorization routine ?hptrf

• call the solver routine ?hptrs.
Input Parameters
ap,afp,b,x,work COMPLEX for chprfs
DOUBLE COMPLEX for zhprfs.
Arrays:
ap(*) contains the original packed matrix A, as supplied to ?hptrf.
afp(*) contains the factored packed matrix A, as returned by ?hptrf.
the second dimension of b and x must be at least max(1,nrhs); the
dimension of work must be at least max(1, 2*n).
ipiv INTEGER.
?hptrf.
rwork REAL for chprfs
DOUBLE PRECISION for zhprfs.
635
Output Parameters
ferr, berr REAL for chprfs.
DOUBLE PRECISION for zhprfs.
Arrays, DIMENSION at least max(1,nrhs). Contain the component-wise

Specific details for the routine hprfs interface are as follows:
Application Notes
the actual error.
operations. In addition, each step of iterative refinement involves 24n2 operations; the number
of iterations may range from 1 to 5.
8n2 floating-point operations.
The real counterpart of this routine is ?ssprfs/?dsprfs.
636
?trrfs
Estimates the error in the solution of a system of
linear equations with a triangular matrix.
Syntax
FORTRAN 77:
call strrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call dtrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call ctrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call ztrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
Fortran 95:
call trrfs( a, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Description
The routine estimates the errors in the solution to a system of linear equations A*X = B or
H
AT*X = B or A *X = B with a triangular matrix A, with multiple right-hand sides. For each
The routine also estimates the component-wise forward error in the computed solution ||x
- xe||∞/||x||∞ (here xe is the exact solution).
Before calling this routine, call the solver routine ?trtrs.
637
Input Parameters
T
H
If diag = 'U', then A is unit triangular: diagonal elements of A are
assumed to be 1 and not referenced in the array a.
a, b, x, work REAL for strrfs
DOUBLE PRECISION for dtrrfs
COMPLEX for ctrrfs
DOUBLE COMPLEX for ztrrfs.
Arrays:
a(lda,*) contains the upper or lower triangular matrix A, as specified
by uplo.
The second dimension of a must be at least max(1,n); the second
dimension of b and x must be at least max(1,nrhs); the dimension of
work must be at least max(1,3*n) for real flavors and max(1,2*n)
for complex flavors.
638
rwork REAL for ctrrfs
DOUBLE PRECISION for ztrrfs.
Output Parameters

Specific details for the routine trrfs interface are as follows:
Application Notes
the actual error.
A call to this routine involves, for each right-hand side, solving a number of systems of linear
equations A*x = b; the number of systems is usually 4 or 5 and never more than 11. Each
solution requires approximately n2 floating-point operations for real flavors or 4n2 for complex
flavors.
639
?tprfs
linear equations with a packed triangular matrix.
Syntax
FORTRAN 77:
call stprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dtprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call ctprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call ztprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
Fortran 95:
call tprfs( ap, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Description
H
AT*X = B or A *X = B with a packed triangular matrix A, with multiple right-hand sides. For
Before calling this routine, call the solver routine ?tptrs.
640
Input Parameters
T
H
If diag = 'N', A is not a unit triangular matrix.
If diag = 'U', A is unit triangular: diagonal elements of A are assumed
ap, b, x, work REAL for stprfs
DOUBLE PRECISION for dtprfs
COMPLEX for ctprfs
DOUBLE COMPLEX for ztprfs.
Arrays:
ap(*) contains the upper or lower triangular matrix A, as specified by
uplo.
The dimension of ap must be at least max(1,n(n+1)/2); the second
dimension of b and x must be at least max(1,nrhs); the dimension of
work must be at least max(1,3*n) for real flavors and max(1,2*n)
rwork REAL for ctprfs
641
DOUBLE PRECISION for ztprfs.

Output Parameters

Specific details for the routine tprfs interface are as follows:
Application Notes
the actual error.
solution requires approximately n2 floating-point operations for real flavors or 4n2 for complex
flavors.
642
?tbrfs
linear equations with a triangular band matrix.
Syntax
FORTRAN 77:
call stbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr,
call dtbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr,
call ctbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr,
call ztbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr,
Fortran 95:
call tbrfs( ab, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Description
H
AT*X = B or A *X = B with a triangular band matrix A, with multiple right-hand sides. For each
Before calling this routine, call the solver routine ?tbtrs.
643
Input Parameters
T
H
If diag = 'N', A is not a unit triangular matrix.
kd INTEGER. The number of super-diagonals or sub-diagonals in the matrix
A; kd ≥ 0.
ab, b, x, work REAL for stbrfs
DOUBLE PRECISION for dtbrfs
COMPLEX for ctbrfs
DOUBLE COMPLEX for ztbrfs.
Arrays:
ab(ldab,*) contains the upper or lower triangular matrix A, as specified
by uplo, in band storage format.
The second dimension of a must be at least max(1,n); the second
dimension of b and x must be at least max(1,nrhs). The dimension
of work must be at least max(1,3*n) for real flavors and max(1,2*n)
644
rwork REAL for ctbrfs
DOUBLE PRECISION for ztbrfs.
Output Parameters

Specific details for the routine tbrfs interface are as follows:
Application Notes
the actual error.
645
solution requires approximately 2n*kd floating-point operations for real flavors or 8n*kd
Routines for Matrix Inversion

It is seldom necessary to compute an explicit inverse of a matrix. In particular, do not attempt
to solve a system of equations Ax = b by first computing A-1 and then forming the matrix-vector
product x = A-1b. Call a solver routine instead (see Routines for Solving Systems of Linear
Equations); this is more efficient and more accurate.
However, matrix inversion routines are provided for the rare occasions when an explicit inverse
matrix is needed.
?getri
Computes the inverse of an LU-factored general
matrix.
Syntax
FORTRAN 77:
call sgetri( n, a, lda, ipiv, work, lwork, info )
call dgetri( n, a, lda, ipiv, work, lwork, info )
call cgetri( n, a, lda, ipiv, work, lwork, info )
call zgetri( n, a, lda, ipiv, work, lwork, info )
Fortran 95:
call getri( a, ipiv [,info] )
Description
The routine computes the inverse inv(A) of a general matrix A. Before calling this routine, call
?getrf to factorize A.
646
Input Parameters
a, work REAL for sgetri
DOUBLE PRECISION for dgetri
COMPLEX for cgetri
DOUBLE COMPLEX for zgetri.
a(lda,*) contains the factorization of the matrix A, as returned by
?getrf: A = P*L*U.
The second dimension of a must be at least max(1,n).
work(*) is a workspace array of dimension at least max(1,lwork).
ipiv INTEGER.
The ipiv array, as returned by ?getrf.
lwork INTEGER. The size of the work array; lwork ≥ n.
issued by xerbla.
See Application Notes below for the suggested value of lwork.
Output Parameters
a Overwritten by the n-by-n matrix inv(A).
If info = i, the i-th diagonal element of the factor U is zero, U is
singular, and the inversion could not be completed.
647

Specific details for the routine getri interface are as follows:
Application Notes
algorithm.
workspace query.
workspace.
The computed inverse X satisfies the following error bound:
|XA - I| ≤ c(n)ε|X|P|L||U|,
where c(n) is a modest linear function of n; ε is the machine precision; I denotes the identity
matrix; P, L, and U are the factors of the matrix factorization A = P*L*U.
648
?potri
Computes the inverse of a symmetric (Hermitian)
Syntax
FORTRAN 77:
call spotri( uplo, n, a, lda, info )
call dpotri( uplo, n, a, lda, info )
call cpotri( uplo, n, a, lda, info )
call zpotri( uplo, n, a, lda, info )
Fortran 95:
call potri( a [,uplo] [,info] )
Description
The routine computes the inverse inv(A) of a symmetric positive definite or, for complex
flavors, Hermitian positive-definite matrix A. Before calling this routine, call ?potrf to factorize
A.
Input Parameters
a REAL for spotri
DOUBLE PRECISION for dpotri
COMPLEX for cpotri
DOUBLE COMPLEX for zpotri.
Array a(lda,*). Contains the factorization of the matrix A, as returned
by ?potrf.
649
The second dimension of a must be at least max(1, n).

Output Parameters
info INTEGER.
If info = i, the i-th diagonal element of the Cholesky factor (and
therefore the factor itself) is zero, and the inversion could not be
completed.

Specific details for the routine potri interface are as follows:
Application Notes
The computed inverse X satisfies the following error bounds:
||XA - I||2 ≤ c(n)εκ2(A), ||AX - I||2 ≤ c(n)εκ2(A),
where c(n) is a modest linear function of n, and ε is the machine precision; I denotes the
identity matrix.
The 2-norm ||A||2 of a matrix A is defined by ||A||2 = maxx·x=1(Ax·Ax)1/2, and the condition
number κ2(A) is defined by κ2(A) = ||A||2 ||A-1||2.
650
?pptri
Computes the inverse of a packed symmetric
(Hermitian) positive-definite matrix
Syntax
FORTRAN 77:
call spptri( uplo, n, ap, info )
call dpptri( uplo, n, ap, info )
call cpptri( uplo, n, ap, info )
call zpptri( uplo, n, ap, info )
Fortran 95:
call pptri( ap [,uplo] [,info] )
Description
The routine computes the inverse inv(A) of a symmetric positive definite or, for complex
flavors, Hermitian positive-definite matrix A in packed form. Before calling this routine, call
?pptrf to factorize A.
Input Parameters
Indicates whether the upper or lower triangular factor is stored in ap:
If uplo = 'U', then the upper triangular factor is stored.
If uplo = 'L', then the lower triangular factor is stored.
ap REAL for spptri
DOUBLE PRECISION for dpptri
COMPLEX for cpptri
DOUBLE COMPLEX for zpptri.
Array, DIMENSION at least max(1, n(n+1)/2).
651
Contains the factorization of the packed matrix A, as returned by ?pp-

trf.
The dimension ap must be at least max(1,n(n+1)/2).
Output Parameters
ap Overwritten by the packed n-by-n matrix inv(A).
info INTEGER.
If info = i, the i-th diagonal element of the Cholesky factor (and
therefore the factor itself) is zero, and the inversion could not be
completed.

Specific details for the routine pptri interface are as follows:
Application Notes
||XA - I||2 ≤ c(n)εκ2(A), ||AX - I||2 ≤ c(n)εκ2(A),
where c(n) is a modest linear function of n, and ε is the machine precision; I denotes the
identity matrix.
The 2-norm ||A||2 of a matrix A is defined by ||A||2 =maxx·x=1(Ax·Ax)1/2, and the condition
number κ2(A) is defined by κ2(A) = ||A||2 ||A-1||2 .
652
?sytri
Computes the inverse of a symmetric matrix.
Syntax
FORTRAN 77:
call ssytri( uplo, n, a, lda, ipiv, work, info )
call dsytri( uplo, n, a, lda, ipiv, work, info )
call csytri( uplo, n, a, lda, ipiv, work, info )
call zsytri( uplo, n, a, lda, ipiv, work, info )
Fortran 95:
call sytri( a, ipiv [,uplo] [,info] )
Description
The routine computes the inverse inv(A) of a symmetric matrix A. Before calling this routine,
call ?sytrf to factorize A.
Input Parameters
If uplo = 'U', the array a stores the Bunch-Kaufman factorization A
= P*U*D*UT*PT.
If uplo = 'L', the array a stores the Bunch-Kaufman factorization A
= P*L*D*LT*PT.
a, work REAL for ssytri
DOUBLE PRECISION for dsytri
COMPLEX for csytri
DOUBLE COMPLEX for zsytri.
Arrays:
653

?sytrf.
The dimension of work must be at least max(1,2*n).
ipiv INTEGER.
The ipiv array, as returned by ?sytrf.
Output Parameters
info INTEGER.
If info =-i, the i-th parameter had an illegal value.
If info = i, the i-th diagonal element of D is zero, D is singular, and
the inversion could not be completed.

Specific details for the routine sytri interface are as follows:
Application Notes
|D*UT*PT*X*P*U - I| ≤ c(n)ε(|D||UT|PT|X|P|U| + |D||D-1|)
for uplo = 'U', and
|D*LT*PT*X*P*L - I| ≤ c(n)ε(|D||LT|PT|X|P|L| + |D||D-1|)
654
for uplo = 'L'. Here c(n) is a modest linear function of n, and ε is the machine precision; I
denotes the identity matrix.
?hetri
Computes the inverse of a complex Hermitian
matrix.
Syntax
FORTRAN 77:
call chetri( uplo, n, a, lda, ipiv, work, info )
call zhetri( uplo, n, a, lda, ipiv, work, info )
Fortran 95:
call hetri( a, ipiv [,uplo] [,info] )
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A. Before calling this
routine, call ?hetrf to factorize A.
Input Parameters
If uplo = 'U', the array a stores the Bunch-Kaufman factorization A
= P*U*D*UH*PT.
If uplo = 'L', the array a stores the Bunch-Kaufman factorization A
= P*L*D*LH*PT.
a, work COMPLEX for chetri
DOUBLE COMPLEX for zhetri.
Arrays:
655

?hetrf.
The dimension of work must be at least max(1,n).
ipiv INTEGER.
?hetrf.
Output Parameters
info INTEGER.

Specific details for the routine hetri interface are as follows:
Application Notes
|D*UH*PT*X*P*U - I| ≤ c(n)ε(|D||UH|PT|X|P|U| + |D||D-1|)
for uplo = 'U', and
|D*LH*PT*X*P*L - I| ≤ c(n)ε(|D||LH|PT|X|P|L| + |D||D-1|)
656
The real counterpart of this routine is ?sytri.
?sptri
Computes the inverse of a symmetric matrix using
packed storage.
Syntax
FORTRAN 77:
call ssptri( uplo, n, ap, ipiv, work, info )
call dsptri( uplo, n, ap, ipiv, work, info )
call csptri( uplo, n, ap, ipiv, work, info )
call zsptri( uplo, n, ap, ipiv, work, info )
Fortran 95:
call sptri( ap, ipiv [,uplo] [,info] )
Description
The routine computes the inverse inv(A) of a packed symmetric matrix A. Before calling this
routine, call ?sptrf to factorize A.
Input Parameters
If uplo = 'U', the array ap stores the Bunch-Kaufman factorization
A = P*U*D*UT*PT.
If uplo = 'L', the array ap stores the Bunch-Kaufman factorization
A = P*L*D*LT*PT.
657

ap, work REAL for ssptri
DOUBLE PRECISION for dsptri
COMPLEX for csptri
DOUBLE COMPLEX for zsptri.
Arrays:
ap(*) contains the factorization of the matrix A, as returned by ?sptrf.
ipiv INTEGER.
?sptrf.
Output Parameters
ap Overwritten by the n-by-n matrix inv(A) in packed form.
info INTEGER.

Specific details for the routine sptri interface are as follows:
Application Notes
|D*UT*PT*X*P*U - I| ≤ c(n)ε(|D||UT|PT|X|P|U| + |D||D-1|)
658
for uplo = 'U', and
|D*LT*PT*X*P*L - I| ≤ c(n)ε(|D||LT|PT|X|P|L| + |D||D-1|)
?hptri
Computes the inverse of a complex Hermitian
matrix using packed storage.
Syntax
FORTRAN 77:
call chptri( uplo, n, ap, ipiv, work, info )
call zhptri( uplo, n, ap, ipiv, work, info )
Fortran 95:
call hptri( ap, ipiv [,uplo] [,info] )
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A using packed storage.
Before calling this routine, call ?hptrf to factorize A.
Input Parameters
If uplo = 'U', the array ap stores the packed Bunch-Kaufman
If uplo = 'L', the array ap stores the packed Bunch-Kaufman
659
ap, work COMPLEX for chptri

DOUBLE COMPLEX for zhptri.
Arrays:
ap(*) contains the factorization of the matrix A, as returned by ?hptrf.
ipiv INTEGER.
The ipiv array, as returned by ?hptrf.
Output Parameters
ap Overwritten by the n-by-n matrix inv(A).
info INTEGER.

Specific details for the routine hptri interface are as follows:
Application Notes
|D*UH*PT*X*P*U - I| ≤ c(n)ε(|D||UH|PT|X|P|U| + |D||D-1|)
for uplo = 'U', and
|D*LH*PT*X*PL - I| ≤ c(n)ε(|D||LH|PT|X|P|L| + |D||D-1|)
660
The real counterpart of this routine is ?sptri.
?trtri
Computes the inverse of a triangular matrix.
Syntax
FORTRAN 77:
call strtri( uplo, diag, n, a, lda, info )
call dtrtri( uplo, diag, n, a, lda, info )
call ctrtri( uplo, diag, n, a, lda, info )
call ztrtri( uplo, diag, n, a, lda, info )
Fortran 95:
call trtri( a [,uplo] [,diag] [,info] )
Description
The routine computes the inverse inv(A) of a triangular matrix A.
Input Parameters
to be 1 and not referenced in the array a.
661

a REAL for strtri
DOUBLE PRECISION for dtrtri
COMPLEX for ctrtri
DOUBLE COMPLEX for ztrtri.
Array: DIMENSION (,*).
Contains the matrix A.
Output Parameters
info INTEGER.
If info = i, the i-th diagonal element of A is zero, A is singular, and

Specific details for the routine trtri interface are as follows:
Application Notes
|XA - I| ≤ c(n)ε |X||A|
|XA - I| ≤ c(n)ε |A-1||A||X|,
matrix.
662
?tptri
Computes the inverse of a triangular matrix using
packed storage.
Syntax
FORTRAN 77:
call stptri( uplo, diag, n, ap, info )
call dtptri( uplo, diag, n, ap, info )
call ctptri( uplo, diag, n, ap, info )
call ztptri( uplo, diag, n, ap, info )
Fortran 95:
call tptri( ap [,uplo] [,diag] [,info] )
Description
The routine computes the inverse inv(A) of a packed triangular matrix A.
Input Parameters
ap REAL for stptri
663
DOUBLE PRECISION for dtptri

COMPLEX for ctptri
DOUBLE COMPLEX for ztptri.
Array, DIMENSION at least max(1,n(n+1)/2).
Contains the packed triangular matrix A.
Output Parameters
ap Overwritten by the packed n-by-n matrix inv(A) .
info INTEGER.
If info = i, the i-th diagonal element of A is zero, A is singular, and

Specific details for the routine tptri interface are as follows:
Application Notes
|XA - I| ≤ c(n)ε |X||A|
|X - A-1| ≤ c(n)ε |A-1||A||X|,
matrix.
664
Routines for Matrix Equilibration

Routines described in this section are used to compute scaling factors needed to equilibrate a
matrix. Note that these routines do not actually scale the matrices.
?geequ
Computes row and column scaling factors intended
to equilibrate a matrix and reduce its condition
number.
Syntax
FORTRAN 77:
call sgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call dgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call cgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call zgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
Fortran 95:
call geequ( a, r, c [,rowcnd] [,colcnd] [,amax] [,info] )
Description
The routine computes row and column scalings intended to equilibrate an m-by-n matrix A and
reduce its condition number. The output array r returns the row scale factors and the array c
the column scale factors. These factors are chosen to try to make the largest element in each
row and column of the matrix B with elements bij=r(i)*aij*c(j) have absolute value 1.
See ?laqge auxiliary function that uses scaling factors computed by ?geequ.
Input Parameters
m INTEGER. The number of rows of the matrix A; m ≥ 0.
n INTEGER. The number of columns of the matrix A; n ≥ 0.
665
a REAL for sgeequ

DOUBLE PRECISION for dgeequ
COMPLEX for cgeequ
DOUBLE COMPLEX for zgeequ.
Array: DIMENSION (lda,*).
Contains the m-by-n matrix A whose equilibration factors are to be
computed.
lda INTEGER. The leading dimension of a; lda ≥ max(1, m).
Output Parameters
r, c REAL for single precision flavors
Arrays: r(m), c(n).
If info = 0, or info > m, the array r contains the row scale factors
of the matrix A.
If info = 0 , the array c contains the column scale factors of the
matrix A.
rowcnd REAL for single precision flavors
If info = 0 or info > m, rowcnd contains the ratio of the smallest
r(i) to the largest r(i).
colcnd REAL for single precision flavors
If info = 0, colcnd contains the ratio of the smallest c(i) to the
largest c(i).
amax REAL for single precision flavors
Absolute value of the largest element of the matrix A.
info INTEGER.
If info = i and
i ≤ m, the i-th row of A is exactly zero;
i > m, the (i-m)th column of A is exactly zero.
666
Specific details for the routine geequ interface are as follows:
a Holds the matrix A of size (m, n).
r Holds the vector of length (m).
c Holds the vector of length n.
Application Notes
All the components of r and c are restricted to be between SMLNUM = smallest safe number
and BIGNUM= largest safe number. Use of these scaling factors is not guaranteed to reduce
the condition number of A but works well in practice.
If rowcnd ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by r.
If colcnd ≥ 0.1, it is not worth scaling by c.
If amax is very close to overflow or very close to underflow, the matrix A should be scaled.
?gbequ
to equilibrate a band matrix and reduce its
condition number.
Syntax
FORTRAN 77:
call sgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call dgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call cgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call zgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
Fortran 95:
call gbequ( ab, r, c [,kl] [,rowcnd] [,colcnd] [,amax] [,info] )
667
Description
The routine computes row and column scalings intended to equilibrate an m-by-n band matrix
A and reduce its condition number. The output array r returns the row scale factors and the
array c the column scale factors. These factors are chosen to try to make the largest element
in each row and column of the matrix B with elements bij=r(i)*aij*c(j) have absolute value
1.
See ?laqgb auxiliary function that uses scaling factors computed by ?gbequ.
Input Parameters
m INTEGER. The number of rows of the matrix A; m ≥ 0.
n INTEGER. The number of columns of the matrix A; n ≥ 0.
ku INTEGER. The number of superdiagonals within the band of A; ku ≥
0.
ab REAL for sgbequ
DOUBLE PRECISION for dgbequ
COMPLEX for cgbequ
DOUBLE COMPLEX for zgbequ.
Array, DIMENSION (ldab,*).
Contains the original band matrix A stored in rows from 1 to kl + ku
+ 1.
The second dimension of ab must be at least max(1,n);
ldab INTEGER. The leading dimension of ab; ldab ≥ kl+ku+1.
Output Parameters
Arrays: r(m), c(n).
If info = 0, or info > m, the array r contains the row scale factors
of the matrix A.
668
If info = 0 , the array c contains the column scale factors of the
matrix A.
rowcnd REAL for single precision flavors
If info = 0 or info > m, rowcnd contains the ratio of the smallest
r(i) to the largest r(i).
colcnd REAL for single precision flavors
If info = 0, colcnd contains the ratio of the smallest c(i) to the
largest c(i).
info INTEGER.
If info = i and
i ≤ m, the i-th row of A is exactly zero;
i > m, the (i-m)th column of A is exactly zero.

Specific details for the routine gbequ interface are as follows:
r Holds the vector of length (m).
c Holds the vector of length n.
Application Notes
All the components of r and c are restricted to be between SMLNUM = smallest safe number
and BIGNUM= largest safe number. Use of these scaling factors is not guaranteed to reduce
the condition number of A but works well in practice.
669
If rowcnd ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by r.
If colcnd ≥ 0.1, it is not worth scaling by c.
?poequ
to equilibrate a symmetric (Hermitian) positive
definite matrix and reduce its condition number.
Syntax
FORTRAN 77:
call spoequ( n, a, lda, s, scond, amax, info )
call dpoequ( n, a, lda, s, scond, amax, info )
call cpoequ( n, a, lda, s, scond, amax, info )
call zpoequ( n, a, lda, s, scond, amax, info )
Fortran 95:
call poequ( a, s [,scond] [,amax] [,info] )
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian)
positive-definite matrix A and reduce its condition number (with respect to the two-norm). The
output array s returns scale factors computed as
670
These factors are chosen so that the scaled matrix B with elements bij=s(i)*aij*s(j) has
diagonal elements equal to 1.
This choice of s puts the condition number of B within a factor n of the smallest possible condition
number over all possible diagonal scalings.
See ?laqsy auxiliary function that uses scaling factors computed by ?poequ.
Input Parameters
a REAL for spoequ
DOUBLE PRECISION for dpoequ
COMPLEX for cpoequ
DOUBLE COMPLEX for zpoequ.
Array: DIMENSION (lda,*).
Contains the n-by-n symmetric or Hermitian positive definite matrix A
whose scaling factors are to be computed. Only diagonal elements of
A are referenced.
lda INTEGER. The leading dimension of a; lda ≥ max(1, m).
Output Parameters
s REAL for single precision flavors
Array, DIMENSION (n).
If info = 0, the array s contains the scale factors for A.
scond REAL for single precision flavors
If info = 0, scond contains the ratio of the smallest s(i) to the largest
s(i).
info INTEGER.
If info = i, the i-th diagonal element of A is nonpositive.
671

Specific details for the routine poequ interface are as follows:
s Holds the vector of length n.
Application Notes
If scond ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by s.
?ppequ
definite matrix in packed storage and reduce its
condition number.
Syntax
FORTRAN 77:
call sppequ( uplo, n, ap, s, scond, amax, info )
call dppequ( uplo, n, ap, s, scond, amax, info )
call cppequ( uplo, n, ap, s, scond, amax, info )
call zppequ( uplo, n, ap, s, scond, amax, info )
Fortran 95:
call ppequ( ap, s [,scond] [,amax] [,uplo] [,info] )
Description
672
positive definite matrix A in packed storage and reduce its condition number (with respect to
the two-norm). The output array s returns scale factors computed as
diagonal elements equal to 1.
This choice of s puts the condition number of B within a factor n of the smallest possible condition
number over all possible diagonal scalings.
See ?laqsp auxiliary function that uses scaling factors computed by ?ppequ.
Input Parameters
the array ap:
matrix A.
matrix A.
ap REAL for sppequ
DOUBLE PRECISION for dppequ
COMPLEX for cppequ
DOUBLE COMPLEX for zppequ.
Array, DIMENSION at least max(1,n(n+1)/2). The array ap contains
Output Parameters
673

s(i).
info INTEGER.

Specific details for the routine ppequ interface are as follows:
Application Notes
674
?pbequ
to equilibrate a symmetric (Hermitian)
positive-definite band matrix and reduce its
condition number.
Syntax
FORTRAN 77:
call spbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call dpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call cpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call zpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
Fortran 95:
call pbequ( ab, s [,scond] [,amax] [,uplo] [,info] )
Description
positive definite matrix A in packed storage and reduce its condition number (with respect to
the two-norm). The output array s returns scale factors computed as
diagonal elements equal to 1. This choice of s puts the condition number of B within a factor n
of the smallest possible condition number over all possible diagonal scalings.
See ?laqsb auxiliary function that uses scaling factors computed by ?pbequ.
675
Input Parameters
the array ab:
If uplo = 'U', the array ab stores the upper triangular part of the
matrix A.
If uplo = 'L', the array ab stores the lower triangular part of the
matrix A.
A; kd ≥ 0.
ab REAL for spbequ
DOUBLE PRECISION for dpbequ
COMPLEX for cpbequ
DOUBLE COMPLEX for zpbequ.
The array ap contains either the upper or the lower triangular part of
the matrix A (as specified by uplo) in band storage (see Matrix Storage
Schemes).
The second dimension of ab must be at least max(1, n).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kd +1.
Output Parameters
s(i).
info INTEGER.
676

Specific details for the routine pbequ interface are as follows:
Application Notes
Driver Routines
Table 3-3 lists the LAPACK driver routines for solving systems of linear equations with real or
complex matrices.
Table 3-3 Driver Routines for Solving Systems of Linear Equations
Matrix type, storage Simple Driver Expert Driver
scheme
general ?gesv ?gesvx
general band ?gbsv ?gbsvx
general tridiagonal ?gtsv ?gtsvx
symmetric/Hermitian ?posv ?posvx

positive-definite
symmetric/Hermitian ?ppsv ?ppsvx

positive-definite, storage
677
Matrix type, storage Simple Driver Expert Driver

scheme
symmetric/Hermitian ?pbsv ?pbsvx

positive-definite, band
symmetric/Hermitian ?ptsv ?ptsvx

positive-definite, tridiagonal
symmetric/Hermitian ?sysv/?hesv ?sysvx/?hesvx

indefinite
symmetric/Hermitian ?spsv/?hpsv ?spsvx/?hpsvx

indefinite, packed storage
complex symmetric ?sysv ?sysvx
complex symmetric, packed ?spsv ?spsvx

storage
In this table ? stands for s (single precision real), d (double precision real), c (single precision
complex), or z (double precision complex). In the description of ?gesv routine ? stands for
combined character codes ds and zc for the mixed precision subroutines.
678
?gesv
Computes the solution to the system of linear
equations with a square matrix A and multiple
right-hand sides.
Syntax
FORTRAN 77:
call sgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call dgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call cgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call zgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call dsgesv( n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, iter, info
)
call zcgesv( n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, iter, info
)
Fortran 95:
call gesv( a, b [,ipiv] [,info] )
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n matrix,
the columns of matrix B are individual right-hand sides, and the columns of X are the
corresponding solutions.
The LU decomposition with partial pivoting and row interchanges is used to factor A as A =
P*L*U, where P is a permutation matrix, L is unit lower triangular, and U is upper triangular.
The factored form of A is then used to solve the system of equations A*X = B.
The dsgesv and zcgesv are mixed precision iterative refinement subroutines for exploiting
fast single precision hardware. They first attempt to factorize the matrix in single precision
(dsgesv) or single complex precision (zcgesv) and use this factorization within an iterative
refinement procedure to produce a solution with double precision (dsgesv) / double complex
679
precision (zcgesv) normwise backward error quality (see below). If the approach fails, the
method switches to a double precision or double complex precision factorization respectively
and computes the solution.
The iterative refinement is not going to be a winning strategy if the ratio single precision
performance over double precision performance is too small. A reasonable strategy should take
the number of right-hand sides and the size of the matrix into account. This might be done
with a call to ilaenv in the future. At present, iterative refinement is implemented.
The iterative refinement process is stopped if
iter > itermax
or for all the right-hand sides:
rnmr < sqrt(n)*xnrm*anrm*eps*bwdmax,
where
• iter is the number of the current iteration in the iterativerefinement process

• rnmr is the infinity-norm of the residual
• xnrm is the infinity-norm of the solution
• anrm is the infinity-operator-norm of the matrix A
• eps is the machine epsilon returned by dlamch (‘Epsilon’).
The values itermax and bwdmax are fixed to 30 and 1.0d+00 respectively.
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the
matrix A; n ≥ 0.
a, b REAL for sgesv
DOUBLE PRECISION for dgesv and dsgesv
COMPLEX for cgesv
DOUBLE COMPLEX for zgesv and zcgesv.
The array a contains the n-by-n coefficient matrix A.
The array b contains the n-by-nrhs matrix of right hand side matrix B.
The second dimension of a must be at least max(1, n), the second
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, n).
680
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the array x; ldx ≥ max(1, n).
work DOUBLE PRECISION for dsgesv
DOUBLE COMPLEX for zcgesv.
Workspace array, DIMENSION at least max(1,n*nrhs). This array is
used to hold the residual vectors.
swork REAL for dsgesv
COMPLEX for zcgesv.
Workspace array, DIMENSION at least max(1,n*nrhs). This array is
used to use the single precision matrix and the right-hand sides or
solutions in single precision.
Output Parameters
a Overwritten by the factors L and U from the factorization of A = P*L*U;
the unit diagonal elements of L are not stored.
If iterative refinement has been successfully used (info= 0 and iter≥
0, see description below), then A is unchanged.
If double precision factorization has been used (info= 0 and iter <
0, see description below), then the array A contains the factors L and
U from the factorization A = P*L*U; the unit diagonal elements of L
are not stored.
b Overwritten by the solution matrix X for dgesv, sgesv,zgesv,zgesv.
Unchanged for dsgesv and zcgesv.
ipiv INTEGER.
Array, DIMENSION at least max(1, n). The pivot indices that define
the permutation matrix P; row i of the matrix was interchanged with
row ipiv(i). Corresponds to the single precision factorization (if info=
0 and iter ≥ 0) or the double precision factorization (if info= 0 and
iter < 0).
x DOUBLE PRECISION for dsgesv
DOUBLE COMPLEX for zcgesv.
Array, DIMENSION (ldx, nrhs). If info = 0, contains the n-by-nrhs
solution matrix X.
iter INTEGER.
681
If iter < 0: iterative refinement has failed, double precision

factorization has been performed
• If iter = -1: taking into account machine parameters, n, nrhs, it

is a priori not worth working in single precision
• If iter = -2: overflow of an entry when moving from double to
single precision
• If iter = -3: failure of sgetrf
• If iter = -31: stop the iterative refinement after the 30th iteration.
If iter > 0: iterative refinement has been successfully used. Returns

the number of iterations.
If info = i, U(i, i) (computed in double precision for mixed precision
subroutines) is exactly zero. The factorization has been completed, but
the factor U is exactly singular, so the solution could not be computed.

Specific details for the routine gesv interface are as follows:
NOTE. Fortran 95 Interface is so far not available for the mixed precision subroutines
dsgesv/zcgesv.
See Also
• Driver Routines
• ilaenv
• dlamch
682
• sgetrf
?gesvx
equations with a square matrix A and multiple
right-hand sides, and provides error bounds on the
solution.
Syntax
FORTRAN 77:
call sgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call dgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call cgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
call zgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
Fortran 95:
call gesvx( a, b, x [,af] [,ipiv] [,fact] [,trans] [,equed] [,r] [,c] [,ferr]
[,berr] [,rcond] [,rpvgrw] [,info] )
Description
The routine uses the LU factorization to compute the solution to a real or complex system of
linear equations A*X = B, where A is an n-by-n matrix, the columns of matrix B are individual
right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?gesvx performs the following steps:
1. If fact = 'E', real scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
683
trans = 'T': (diag(r)*A*diag(c))T*inv(diag(r))*X = diag(c)*B

trans = 'C': (diag(r)*A*diag(c))H*inv(diag(r))*X = diag(c)*B
Whether or not the system will be equilibrated depends on the scaling of the matrix A, but
if equilibration is used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if
trans='N') or diag(c)*B (if trans = 'T' or 'C').
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after
equilibration if fact = 'E') as A = P*L*U, where P is a permutation matrix, L is a unit
lower triangular matrix, and U is upper triangular.
3. If some Ui,i= 0, so that U is exactly singular, then the routine returns with info = i.
Otherwise, the factored form of A is used to estimate the condition number of the matrix A.
If the reciprocal of the condition number is less than machine precision, info = n + 1 is
returned as a warning, but the routine still goes on to solve for X and compute error bounds
as described below.
4. The system of equations is solved for X using the factored form of A.
5. Iterative refinement is applied to improve the computed solution matrix and calculate error
bounds and backward error estimates for it.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or
diag(r) (if trans = 'T' or 'C') so that it solves the original system before equilibration.
Input Parameters
fact CHARACTER*1. Must be 'F', 'N', or 'E'.
Specifies whether or not the factored form of the matrix A is supplied
on entry, and if not, whether the matrix A should be equilibrated before
it is factored.
If fact = 'F': on entry, af and ipiv contain the factored form of A.
If equed is not 'N', the matrix A has been equilibrated with scaling
factors given by r and c.
a, af, and ipiv are not modified.
If fact = 'N', the matrix A will be copied to af and factored.
If fact = 'E', the matrix A will be equilibrated if necessary, then
copied to af and factored.
trans CHARACTER*1. Must be 'N', 'T', or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form A*X = B (No transpose).
T
If trans = 'T', the system has the form A *X = B (Transpose).
684
H
If trans = 'C', the system has the form A *X = B (Transpose for real
flavors, conjugate transpose for complex flavors).
n INTEGER. The number of linear equations; the order of the matrix A;
n ≥ 0.
nrhs INTEGER. The number of right hand sides; the number of columns of
the matrices B and X; nrhs ≥ 0.
a,af,b,work REAL for sgesvx
DOUBLE PRECISION for dgesvx
COMPLEX for cgesvx
DOUBLE COMPLEX for zgesvx.
Arrays: a(lda,*), af(ldaf,*), b(ldb,*), work(*).
The array a contains the matrix A. If fact = 'F' and equed is not 'N',
then A must have been equilibrated by the scaling factors in r and/or
c. The second dimension of a must be at least max(1,n).
The array af is an input argument if fact = 'F'. It contains the
factored form of the matrix A, that is, the factors L and U from the
factorization A = P*L*U as computed by ?getrf. If equed is not 'N',
then af is the factored form of the equilibrated matrix A. The second
dimension of af must be at least max(1,n).
work(*) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex
flavors.
ipiv INTEGER.
Array, DIMENSION at least max(1, n). The array ipiv is an input
argument if fact = 'F'. It contains the pivot indices from the
factorization A = P*L*U as computed by ?getrf; row i of the matrix
was interchanged with row ipiv(i).
equed CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.
685
equed is an input argument if fact = 'F'. It specifies the form of

equilibration that was done:
If equed = 'N', no equilibration was done (always true if fact =
'N').
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is,
A has been replaced by diag(r)*A*diag(c).
Arrays: r(n), c(n). The array r contains the row scale factors for A,
and the array c contains the column scale factors for A. These arrays
are input arguments if fact = 'F' only; otherwise they are output
arguments.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed
= 'N' or 'C', r is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of r must be
positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if
equed = 'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.
ldx INTEGER. The first dimension of the output array x; ldx ≥ max(1,
n).
iwork INTEGER. Workspace array, DIMENSION at least max(1, n); used in
real flavors only.
rwork REAL for single precision flavors
Workspace array, DIMENSION at least max(1, 2*n); used in complex
flavors only.
Output Parameters
x REAL for sgesvx
COMPLEX for cgesvx
686
Array, DIMENSION (ldx,*).
If info = 0 or info = n+1, the array x contains the solution matrix
X to the original system of equations. Note that A and B are modified
on exit if equed ≠ 'N', and the solution to the equilibrated system
is:
diag(C)-1*X, if trans = 'N' and equed = 'C' or 'B'; diag(R)-1*X,
if trans = 'T' or 'C' and equed = 'R' or 'B'. The second dimension
of x must be at least max(1,nrhs).
a Array a is not modified on exit if fact = 'F' or 'N', or if fact = 'E'
and equed = 'N'. If equed ≠ 'N', A is scaled on exit as follows:
equed = 'R': A = diag(R)*A
equed = 'C': A = A*diag(c)
equed = 'B': A = diag(R)*A*diag(c).
af If fact = 'N' or 'E', then af is an output argument and on exit
returns the factors L and U from the factorization A = PLU of the original
matrix A (if fact = 'N') or of the equilibrated matrix A (if fact =
'E'). See the description of a for the form of the equilibrated matrix.
b Overwritten by diag(r)*B if trans = 'N' and equed = 'R'or 'B';
overwritten by diag(c)*B if trans = 'T' or 'C' and equed = 'C'
or 'B';
not changed if equed = 'N'.
r, c These arrays are output arguments if fact ≠ 'F'. See the description
of r, c in Input Arguments section.
An estimate of the reciprocal condition number of the matrix A after
equilibration (if done). If rcond is less than the machine precision, in
particular, if rcond = 0, the matrix is singular to working precision.
This condition is indicated by a return code of info > 0.
ferr REAL for single precision flavors
Array, DIMENSION at least max(1, nrhs). Contains the estimated
forward error bound for each solution vector x (j) (the j-th column
of the solution matrix X). If xtrue is the true solution corresponding to
x(j), ferr(j) is an estimated upper bound for the magnitude of the
687
largest element in (x(j) - xtrue) divided by the magnitude of the

largest element in x(j). The estimate is as reliable as the estimate for
rcond, and is almost always a slight overestimate of the true error.
berr REAL for single precision flavors
Array, DIMENSION at least max(1, nrhs). Contains the component-wise
relative backward error for each solution vector x(j), that is, the
smallest relative change in any element of A or B that makes x(j) an
exact solution.
ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = P*L*U of the
original matrix A (if fact = 'N') or of the equilibrated matrix A (if
fact = 'E').
equed If fact ≠ 'F' , then equed is an output argument. It specifies the
form of equilibration that was done (see the description of equed in
Input Arguments section).
work, rwork On exit, work(1) for real flavors, or rwork(1) for complex flavors,
contains the reciprocal pivot growth factor norm(A)/norm(U). The "max
absolute element" norm is used. If work(1) for real flavors, or
rwork(1) for complex flavors is much less than 1, then the stability
of the LU factorization of the (equilibrated) matrix A could be poor. This
also means that the solution x, condition estimator rcond, and forward
error bound ferr could be unreliable. If factorization fails with 0 <
info ≤ n, then work(1) for real flavors, or rwork(1) for complex
flavors contains the reciprocal pivot growth factor for the leading info
columns of A.
If info = i, and i ≤ n, then U(i, i) is exactly zero. The factorization
has been completed, but the factor U is exactly singular, so the solution
and error bounds could not be computed; rcond = 0 is returned.
If info = i, and i = n+1, then U is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to working
precision. Nevertheless, the solution and error bounds are computed
because there are a number of situations where the computed solution
can be more accurate than the value of rcond would suggest.
688
Specific details for the routine gesvx interface are as follows:
r Holds the vector of length n. Default value for each element is r(i) =
1.0_WP.
c Holds the vector of length n. Default value for each element is c(i) =
1.0_WP.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then both arguments af and ipiv must be present; otherwise, an error
is returned.
equed Must be 'N', 'B', 'C', or 'R'. The default value is 'N'.
rpvgrw Real value that contains the reciprocal pivot growth factor
norm(A)/norm(U).
689
?gbsv
equations with a band matrix A and multiple
right-hand sides.
Syntax
FORTRAN 77:
call sgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call dgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call cgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call zgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
Fortran 95:
call gbsv( ab, b [,kl] [,ipiv] [,info] )
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is
an n-by-n band matrix with kl subdiagonals and ku superdiagonals, the columns of matrix B
are individual right-hand sides, and the columns of X are the corresponding solutions.
The LU decomposition with partial pivoting and row interchanges is used to factor A as A = L*U,
where L is a product of permutation and unit lower triangular matrices with kl subdiagonals,
and U is upper triangular with kl+ku superdiagonals. The factored form of A is then used to
solve the system of equations A*X = B.
Input Parameters
n INTEGER. The order of A. The number of rows in B; n ≥ 0.
nrhs INTEGER. The number of right-hand sides. The number of columns in
B; nrhs ≥ 0.
690
ab, b REAL for sgbsv
DOUBLE PRECISION for dgbsv
COMPLEX for cgbsv
DOUBLE COMPLEX for zgbsv.
Schemes). The second dimension of ab must be at least max(1, n).
be at least max(1,nrhs).
ldab INTEGER. The first dimension of the array ab. (ldab ≥ 2kl + ku +1)
Output Parameters
ab Overwritten by L and U. The diagonal and kl + ku superdiagonals of U
are stored in the first 1 + kl + ku rows of ab. The multipliers used to
form L are stored in the next kl rows.
ipiv INTEGER.
Array, DIMENSION at least max(1, n). The pivot indices: row i was
interchanged with row ipiv(i).
If info = i, U(i, i) is exactly zero. The factorization has been
completed, but the factor U is exactly singular, so the solution could
not be computed.

Specific details for the routine gbsv interface are as follows:
691

?gbsvx
Computes the solution to the real or complex
system of linear equations with a band matrix A
and multiple right-hand sides, and provides error
bounds on the solution.
Syntax
FORTRAN 77:
call sgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed,
r, c, b, ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call dgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed,
r, c, b, ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call cgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed,
r, c, b, ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
call zgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed,
r, c, b, ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
Fortran 95:
call gbsvx( ab, b, x [,kl] [,afb] [,ipiv] [,fact] [,trans] [,equed] [,r] [,c]
[,ferr] [,berr] [,rcond] [,rpvgrw] [,info] )
Description
T H
linear equations A*X = B, A *X = B, or A *X = B, where A is a band matrix of order n with kl
subdiagonals and ku superdiagonals, the columns of matrix B are individual right-hand sides,
and the columns of X are the corresponding solutions.
The routine ?gbsvx performs the following steps:
692
1. If fact = 'E', real scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c) *inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))T *inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))H *inv(diag(r))*X = diag(c)*B
Whether the system will be equilibrated depends on the scaling of the matrix A, but if
equilibration is used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration
if fact = 'E') as A = L*U, where L is a product of permutation and unit lower triangular
matrices with kl subdiagonals, and U is upper triangular with kl+ku superdiagonals.
3. If some Ui,i = 0, so that U is exactly singular, then the routine returns with info = i.
as described below.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or
diag(r) (if trans = 'T' or 'C') so that it solves the original system before equilibration.
Input Parameters
Specifies whether the factored form of the matrix A is supplied on entry,
and if not, whether the matrix A should be equilibrated before it is
factored.
If fact = 'F': on entry, afb and ipiv contain the factored form of
A. If equed is not 'N', the matrix A is equilibrated with scaling factors
given by r and c.
ab, afb, and ipiv are not modified.
If fact = 'N', the matrix A will be copied to afb and factored.
copied to afb and factored.
693

T
H
If trans = 'C', the system has the form A *X = B (Transpose for real
flavors, conjugate transpose for complex flavors).
n INTEGER. The number of linear equations, the order of the matrix A;
n ≥ 0.
nrhs IN TEGER. The number of right hand sides, the number of columns of
ab,afb,b,work REAL for sgesvx
COMPLEX for cgesvx
Schemes). The second dimension of ab must be at least max(1, n).
If fact = 'F' and equed is not 'N', then A must have been
equilibrated by the scaling factors in r and/or c.
The array afb is an input argument if fact = 'F'. The second
dimension of afb must be at least max(1,n). It contains the factored
form of the matrix A, that is, the factors L and U from the factorization
A = L*U as computed by ?gbtrf. U is stored as an upper triangular
band matrix with kl + ku superdiagonals in the first 1 + kl + ku rows
of afb. The multipliers used during the factorization are stored in the
next kl rows. If equed is not 'N', then afb is the factored form of the
equilibrated matrix A.
flavors.
ldab INTEGER. The first dimension of ab; ldab ≥ kl+ku+1.
694
ldafb INTEGER. The first dimension of afb; ldafb ≥ 2*kl+ku+1.
ipiv INTEGER.
argument if fact = 'F'. It contains the pivot indices from the
factorization A = L*U as computed by ?gbtrf; row i of the matrix
was interchanged with row ipiv(i).
equed CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.
If equed = 'N', no equilibration was done (always true if fact =
'N').
If equed = 'R', row equilibration was done, that is, A has been
If equed = 'C', column equilibration was done, that is, A has been
if equed = 'B', both row and column equilibration was done, that is,
A has been replaced by diag(r)*A*diag(c).
Arrays: r(n), c(n).
The array r contains the row scale factors for A, and the array c contains
the column scale factors for A. These arrays are input arguments if
fact = 'F' only; otherwise they are output arguments.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed
= 'N' or 'C', r is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of r must be
positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if
equed = 'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.
n).
real flavors only.
695
rwork REAL for single precision flavors

Workspace array, DIMENSION at least max(1, n); used in complex
flavors only.
Output Parameters
x REAL for sgbsvx
DOUBLE PRECISION for dgbsvx
COMPLEX for cgbsvx
DOUBLE COMPLEX for zgbsvx.
X to the original system of equations. Note that A and B are modified
on exit if equed ≠ 'N', and the solution to the equilibrated system
is: inv(diag(c))*X, if trans = 'N' and equed = 'C' or 'B';
inv(diag(r))*X, if trans = 'T' or 'C' and equed = 'R' or 'B'.
The second dimension of x must be at least max(1,nrhs).
ab Array ab is not modified on exit if fact = 'F' or 'N', or if fact =
'E' and equed = 'N'.
If equed ≠ 'N', A is scaled on exit as follows:
equed = 'R': A = diag(r)*A
equed = 'B': A = diag(r)*A*diag(c).
afb If fact = 'N' or 'E', then afb is an output argument and on exit
returns details of the LU factorization of the original matrix A (if fact
= 'N') or of the equilibrated matrix A (if fact = 'E'). See the
description of ab for the form of the equilibrated matrix.
b Overwritten by diag(r)*b if trans = 'N' and equed = 'R' or 'B';
overwritten by diag(c)*b if trans = 'T' or 'C' and equed = 'C'
or 'B';
not changed if equed = 'N'.
r, c These arrays are output arguments if fact ≠ 'F'. See the description
of r, c in Input Arguments section.
696
equilibration (if done).
If rcond is less than the machine precision (in particular, if rcond =0),
the matrix is singular to working precision. This condition is indicated
by a return code of info>0.
forward error bound for each solution vector x(j) (the j-th column of
the solution matrix X). If xtrue is the true solution corresponding to
exact solution.
ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = L*U of the original
matrix A (if fact = 'N') or of the equilibrated matrix A (if fact =
'E').
work, rwork On exit, work(1) for real flavors, or rwork(1) for complex flavors,
contains the reciprocal pivot growth factor norm(A)/norm(U). The "max
absolute element" norm is used. If work(1) for real flavors, or
rwork(1) for complex flavors is much less than 1, then the stability
of the LU factorization of the (equilibrated) matrix A could be poor. This
also means that the solution x, condition estimator rcond, and forward
error bound ferr could be unreliable. If factorization fails with 0 <
697
info ≤ n, then work(1) for real flavors, or rwork(1) for complex

flavors contains the reciprocal pivot growth factor for the leading info
columns of A.
has been completed, but the factor U is exactly singular, so the solution
and error bounds could not be computed; rcond = 0 is returned. If
info = i, and i = n+1, then U is nonsingular, but rcond is less than
machine precision, meaning that the matrix is singular to working

Specific details for the routine gbsvx interface are as follows:
afb Holds the array AF of size (2*kl+ku+1,n).
r Holds the vector of length n. Default value for each element is r(i) =
1.0_WP.
c Holds the vector of length n. Default value for each element is c(i) =
1.0_WP.
equed Must be 'N', 'B', 'C', or 'R'. The default value is 'N'.
698
then both arguments af and ipiv must be present; otherwise, an error
is returned.
rpvgrw Real value that contains the reciprocal pivot growth factor
norm(A)/norm(U).
?gtsv
equations with a tridiagonal matrix A and multiple
right-hand sides.
Syntax
FORTRAN 77:
call sgtsv( n, nrhs, dl, d, du, b, ldb, info )
call dgtsv( n, nrhs, dl, d, du, b, ldb, info )
call cgtsv( n, nrhs, dl, d, du, b, ldb, info )
call zgtsv( n, nrhs, dl, d, du, b, ldb, info )
Fortran 95:
call gtsv( dl, d, du, b [,info] )
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n tridiagonal
matrix, the columns of matrix B are individual right-hand sides, and the columns of X are the
corresponding solutions. The routine uses Gaussian elimination with partial pivoting.
T
Note that the equation A *X = B may be solved by interchanging the order of the arguments
du and dl.
699
Input Parameters
n INTEGER. The order of A, the number of rows in B; n ≥ 0.
nrhs INTEGER. The number of right-hand sides, the number of columns in
B; nrhs ≥ 0.
dl, d, du, b REAL for sgtsv
DOUBLE PRECISION for dgtsv
COMPLEX for cgtsv
DOUBLE COMPLEX for zgtsv.
Arrays: dl(n - 1), d(n), du(n - 1), b(ldb,*).
The array dl contains the (n - 1) subdiagonal elements of A.
The array d contains the diagonal elements of A.
The array du contains the (n - 1) superdiagonal elements of A.
Output Parameters
dl Overwritten by the (n-2) elements of the second superdiagonal of the
upper triangular matrix U from the LU factorization of A. These elements
are stored in dl(1), ... , dl(n-2).
d Overwritten by the n diagonal elements of U.
du Overwritten by the (n-1) elements of the first superdiagonal of U.
If info = i, U(i, i) is exactly zero, and the solution has not been
computed. The factorization has not been completed unless i = n.

Specific details for the routine gtsv interface are as follows:
700
?gtsvx
Computes the solution to the real or complex
system of linear equations with a tridiagonal matrix
A and multiple right-hand sides, and provides error
bounds on the solution.
Syntax
FORTRAN 77:
call sgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb,
x, ldx, rcond, ferr, berr, work, iwork, info )
call dgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb,
call cgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb,
x, ldx, rcond, ferr, berr, work, rwork, info )
call zgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb,
x, ldx, rcond, ferr, berr, work, rwork, info )
Fortran 95:
call gtsvx( dl, d, du, b, x [,dlf] [,df] [,duf] [,du2] [,ipiv] [,fact] [,trans]
[,ferr] [,berr] [,rcond] [,info] )
Description
T H
linear equations A*X = B, A *X = B, or A *X = B, where A is a tridiagonal matrix of order n,
701
The routine ?gtsvx performs the following steps:
1. If fact = 'N', the LU decomposition is used to factor the matrix A as A = L*U, where L
is a product of permutation and unit lower bidiagonal matrices and U is an upper triangular
matrix with nonzeroes in only the main diagonal and first two superdiagonals.
2. If some Ui,i = 0, so that U is exactly singular, then the routine returns with info = i.
as described below.
Input Parameters
fact CHARACTER*1. Must be 'F' or 'N'.
Specifies whether or not the factored form of the matrix A has been
supplied on entry.
If fact = 'F': on entry, dlf, df, duf, du2, and ipiv contain the
factored form of A; arrays dl, d, du, dlf, df, duf, du2, and ipiv will
not be modified.
If fact = 'N', the matrix A will be copied to dlf, df, and duf and
factored.
T
H
If trans = 'C', the system has the form A *X = B (Conjugate
transpose).
n INTEGER. The number of linear equations, the order of the matrix A;
n ≥ 0.
nrhs INTEGER. The number of right hand sides, the number of columns of
dl,d,du,dlf,df, REAL for sgtsvx
duf,du2,b, DOUBLE PRECISION for dgtsvx
x,work COMPLEX for cgtsvx
702
DOUBLE COMPLEX for zgtsvx.
Arrays:
dl, dimension (n -1), contains the subdiagonal elements of A.
d, dimension (n), contains the diagonal elements of A.
du, dimension (n -1), contains the superdiagonal elements of A.
dlf, dimension (n -1). If fact = 'F' , then dlf is an input argument
and on entry contains the (n -1) multipliers that define the matrix L
from the LU factorization of A as computed by ?gttrf.
df, dimension (n). If fact = 'F' , then df is an input argument and
on entry contains the n diagonal elements of the upper triangular matrix
U from the LU factorization of A.
duf, dimension (n -1). If fact = 'F' , then duf is an input argument
and on entry contains the (n -1) elements of the first superdiagonal of
U.
du2, dimension (n -2). If fact = 'F' , then du2 is an input argument
and on entry contains the (n-2) elements of the second superdiagonal
of U.
b(ldb*) contains the right-hand side matrix B. The second dimension
of b must be at least max(1, nrhs).
x(ldx*) contains the solution matrix X. The second dimension of x
must be at least max(1, nrhs).
work(*) is a workspace array;
the dimension of work must be at least max(1, 3*n) for real flavors
ipiv INTEGER.
Array, DIMENSION at least max(1, n). If fact = 'F' , then ipiv is
an input argument and on entry contains the pivot indices, as returned
by ?gttrf.
rwork REAL for cgtsvx
DOUBLE PRECISION for zgtsvx.
Workspace array, DIMENSION (n). Used for complex flavors only.
703
Output Parameters
x REAL for sgtsvx
DOUBLE PRECISION for dgtsvx
COMPLEX for cgtsvx
DOUBLE COMPLEX for zgtsvx.
X. The second dimension of x must be at least max(1, nrhs).
dlf If fact = 'N' , then dlf is an output argument and on exit contains
the (n-1) multipliers that define the matrix L from the LU factorization
of A.
df If fact = 'N' , then df is an output argument and on exit contains
the n diagonal elements of the upper triangular matrix U from the LU
factorization of A.
duf If fact = 'N' , then duf is an output argument and on exit contains
the (n-1) elements of the first superdiagonal of U.
du2 If fact = 'N' , then du2 is an output argument and on exit contains
the (n-2) elements of the second superdiagonal of U.
ipiv The array ipiv is an output argument if fact = 'N' and, on exit,
contains the pivot indices from the factorization A = L*U ; row i of
the matrix was interchanged with row ipiv(i). The value of ipiv(i)
will always be i or i+1; ipiv(i)=i indicates a row interchange was
not required.
An estimate of the reciprocal condition number of the matrix A. If rcond
is less than the machine precision (in particular, if rcond =0), the matrix
is singular to working precision. This condition is indicated by a return
code of info>0.
704
exact solution.
has not been completed unless i = n, but the factor U is exactly
singular, so the solution and error bounds could not be computed;
rcond = 0 is returned. If info = i, and i = n + 1, then U is
nonsingular, but rcond is less than machine precision, meaning that
the matrix is singular to working precision. Nevertheless, the solution
and error bounds are computed because there are a number of
situations where the computed solution can be more accurate than the
value of rcond would suggest.

Specific details for the routine gtsvx interface are as follows:
dlf Holds the vector of length (n-1).
duf Holds the vector of length (n-1).
705

fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then the
arguments dlf, df, duf, du2, and ipiv must be present; otherwise,
an error is returned.
?posv
equations with a symmetric or Hermitian
positive-definite matrix A and multiple right-hand
sides.
Syntax
FORTRAN 77:
call sposv( uplo, n, nrhs, a, lda, b, ldb, info )
call dposv( uplo, n, nrhs, a, lda, b, ldb, info )
call cposv( uplo, n, nrhs, a, lda, b, ldb, info )
call zposv( uplo, n, nrhs, a, lda, b, ldb, info )
Fortran 95:
call posv( a, b [,uplo] [,info] )
Description
an n-by-n symmetric/Hermitian positive-definite matrix, the columns of matrix B are individual
The Cholesky decomposition is used to factor A as

T H
A = U *U (real flavors) and A = U *U (complex flavors), if uplo = 'U'
or A = L*LT (real flavors) and A = L*LH (complex flavors), if uplo = 'L',
706
where U is an upper triangular matrix and L is a lower triangular matrix. The factored form of
A is then used to solve the system of equations A*X = B.
Input Parameters
Indicates whether the upper or lower triangular part of A is stored:
B; nrhs ≥ 0.
a, b REAL for sposv
DOUBLE PRECISION for dposv
COMPLEX for cposv
DOUBLE COMPLEX for zposv.
Arrays: a(lda,*), b(ldb,*). The array a contains the upper or the
lower triangular part of the matrix A (see uplo). The second dimension
of a must be at least max(1, n).
Output Parameters
a If info = 0, the upper or lower triangular part of a is overwritten by
the Cholesky factor U or L, as specified by uplo.
itself) is not positive definite, so the factorization could not be
completed, and the solution has not been computed.
707

Specific details for the routine posv interface are as follows:
?posvx
Uses the Cholesky factorization to compute the
solution to the system of linear equations with a
symmetric or Hermitian positive definite matrix A,
and provides error bounds on the solution.
Syntax
FORTRAN 77:
call sposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, iwork, info )
call dposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, iwork, info )
call cposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, rwork, info )
call zposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, rwork, info )
Fortran 95:
call posvx( a, b, x [,uplo] [,af] [,fact] [,equed] [,s] [,ferr] [,berr] [,rcond]
[,info] )
Description
708
The routine uses the Cholesky factorization A=UT*U (real flavors) / A=UH*U (complex flavors)
or A=L*LT (real flavors) / A=L*LH (complex flavors) to compute the solution to a real or complex
system of linear equations A*X = B, where A is a n-by-n real symmetric/Hermitian positive
definite matrix, the columns of matrix B are individual right-hand sides, and the columns of X
are the corresponding solutions.
The routine ?posvx performs the following steps:
1. If fact = 'E', real scaling factors s are computed to equilibrate the system:
diag(s)*A*diag(s)*inv(diag(s))*X = diag(s)*B.
if equilibration is used, A is overwritten by diag(s)*A*diag(s) and B by diag(s)*B.
2. If fact = 'N' or 'E', the Cholesky decomposition is used to factor the matrix A (after
equilibration if fact = 'E') as
A = UT*U (real), A = UH*U (complex), if uplo = 'U',
or A = L*LT (real), A = L*LH (complex) , if uplo = 'L',
where U is an upper triangular matrix and L is a lower triangular matrix.

3. If the leading i-by-i principal minor is not positive-definite, then the routine returns with
info = i. Otherwise, the factored form of A is used to estimate the condition number of
the matrix A. If the reciprocal of the condition number is less than machine precision, info
= n + 1 is returned as a warning, but the routine still goes on to solve for X and compute
error bounds as described below.
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the
original system before equilibration.
Input Parameters
it is factored.
709
If fact = 'F': on entry, af contains the factored form of A. If equed

= 'Y', the matrix A has been equilibrated with scaling factors given
by s.
a and af will not be modified.
copied to af and factored.
B; nrhs ≥ 0.
a,af,b,work REAL for sposvx
DOUBLE PRECISION for dposvx
COMPLEX for cposvx
DOUBLE COMPLEX for zposvx.
The array a contains the matrix A as specified by uplo. If fact = 'F'
and equed = 'Y', then A must have been equilibrated by the scaling
factors in s, and a must contain the equilibrated matrix
diag(s)*A*diag(s). The second dimension of a must be at least
max(1,n).
The array af is an input argument if fact = 'F'. It contains the
triangular factor U or L from the Cholesky factorization of A in the same
storage format as A. If equed is not 'N', then af is the factored form
of the equilibrated matrix diag(s)*A*diag(s). The second dimension
of af must be at least max(1,n).
flavors.
710
equed CHARACTER*1. Must be 'N' or 'Y'.
if equed = 'N', no equilibration was done (always true if fact =
'N');
if equed = 'Y', equilibration was done, that is, A has been replaced
by diag(s)*A*diag(s).
Array, DIMENSION (n). The array s contains the scale factors for A. This
array is an input argument if fact = 'F' only; otherwise it is an output
argument.
If equed = 'N', s is not accessed.
If fact = 'F' and equed = 'Y', each element of s must be positive.
n).
real flavors only.
rwork REAL for cposvx
DOUBLE PRECISION for zposvx.
flavors only.
Output Parameters
x REAL for sposvx
DOUBLE PRECISION for dposvx
COMPLEX for cposvx
DOUBLE COMPLEX for zposvx.
711

X to the original system of equations. Note that if equed = 'Y', A and
B are modified on exit, and the solution to the equilibrated system is
inv(diag(s))*X. The second dimension of x must be at least
max(1,nrhs).
a Array a is not modified on exit if fact = 'F' or 'N', or if fact = 'E'
and equed = 'N'.
If fact = 'E' and equed = 'Y', A is overwritten by
diag(s)*A*diag(s).
af If fact = 'N' or 'E', then af is an output argument and on exit
returns the triangular factor U or L from the Cholesky factorization
A=UT*U or A=L*LT (real routines), A=UH*U or A=L*LH (complex routines)
of the original matrix A (if fact = 'N'), or of the equilibrated matrix
A (if fact = 'E'). See the description of a for the form of the
equilibrated matrix.
b Overwritten by diag(s)*B , if equed = 'Y'; not changed if equed =
'N'.
s This array is an output argument if fact ≠ 'F'. See the description
of s in Input Arguments section.
equilibration (if done). If rcond is less than the machine precision (in
particular, if rcond =0), the matrix is singular to working precision.
This condition is indicated by a return code of info>0.
712
exact solution.
If info = i, and i ≤ n, the leading minor of order i (and therefore
the matrix A itself) is not positive-definite, so the factorization could
not be completed, and the solution and error bounds could not be
computed; rcond =0 is returned.
If info = i, and i = n + 1, then U is nonsingular, but rcond is less

Specific details for the routine posvx interface are as follows:
s Holds the vector of length n. Default value for each element is s(i) =
1.0_WP.
713
then af must be present; otherwise, an error is returned.
equed Must be 'N' or 'Y'. The default value is 'N'.
?ppsv
equations with a symmetric (Hermitian) positive
definite packed matrix A and multiple right-hand
sides.
Syntax
FORTRAN 77:
call sppsv( uplo, n, nrhs, ap, b, ldb, info )
call dppsv( uplo, n, nrhs, ap, b, ldb, info )
call cppsv( uplo, n, nrhs, ap, b, ldb, info )
call zppsv( uplo, n, nrhs, ap, b, ldb, info )
Fortran 95:
call ppsv( ap, b [,uplo] [,info] )
Description
an n-by-n real symmetric/Hermitian positive-definite matrix stored in packed format, the
columns of matrix B are individual right-hand sides, and the columns of X are the corresponding
solutions.
T H
A is then used to solve the system of equations A*X = B.
714
Input Parameters
B; nrhs ≥ 0.
ap, b REAL for sppsv
DOUBLE PRECISION for dppsv
COMPLEX for cppsv
DOUBLE COMPLEX for zppsv.
Arrays: ap(*), b(ldb,*). The array ap contains the upper or the lower
triangular part of the matrix A (as specified by uplo) in packed storage
(see Matrix Storage Schemes). The dimension of ap must be at least
max(1,n(n+1)/2).
Output Parameters
ap If info = 0, the upper or lower triangular part of A in packed storage
is overwritten by the Cholesky factor U or L, as specified by uplo.
itself) is not positive-definite, so the factorization could not be

715
Specific details for the routine ppsv interface are as follows:

?ppsvx
symmetric (Hermitian) positive definite packed
matrix A, and provides error bounds on the
solution.
Syntax
FORTRAN 77:
call sppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, iwork, info )
call dppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, iwork, info )
call cppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, rwork, info )
call zppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, rwork, info )
Fortran 95:
call ppsvx( ap, b, x [,uplo] [,af] [,fact] [,equed] [,s] [,ferr] [,berr]
[,rcond] [,info] )
Description
716
system of linear equations A*X = B, where A is a n-by-n symmetric or Hermitian positive-definite
matrix stored in packed format, the columns of matrix B are individual right-hand sides, and
the columns of X are the corresponding solutions.
The routine ?ppsvx performs the following steps:

= n+1 is returned as a warning, but the routine still goes on to solve for X and compute
Input Parameters
it is factored.
717
If fact = 'F': on entry, afp contains the factored form of A. If equed

by s.
ap and afp will not be modified.
If fact = 'N', the matrix A will be copied to afp and factored.
copied to afp and factored.
nrhs INTEGER. The number of right-hand sides; the number of columns in
B; nrhs ≥ 0.
ap,afp,b,work REAL for sppsvx
DOUBLE PRECISION for dppsvx
COMPLEX for cppsvx
DOUBLE COMPLEX for zppsvx.
Arrays: ap(*), afp(*), b(ldb,*), work(*).
The array ap contains the upper or lower triangle of the original
symmetric/Hermitian matrix A in packed storage (see Matrix Storage
Schemes). In case when fact = 'F' and equed = 'Y', ap must
contain the equilibrated matrix diag(s)*A*diag(s).
The array afp is an input argument if fact = 'F' and contains the
triangular factor U or L from the Cholesky factorization of A in the same
storage format as A. If equed is not 'N', then afp is the factored form
of the equilibrated matrix A.
n(n+1)/2); the second dimension of b must be at least max(1,nrhs);
the dimension of work must be at least max(1, 3*n) for real flavors
718
if equed = 'N', no equilibration was done (always true if fact =
'N');
by diag(s)A*diag(s).
argument.
n).
real flavors only.
rwork REAL for cppsvx;
DOUBLE PRECISION for zppsvx.
flavors only.
Output Parameters
x REAL for sppsvx
DOUBLE PRECISION for dppsvx
COMPLEX for cppsvx
DOUBLE COMPLEX for zppsvx.
X to the original system of equations. Note that if equed = 'Y', A and
max(1,nrhs).
ap Array ap is not modified on exit if fact = 'F' or 'N', or if fact =
'E' and equed = 'N'.
719
If fact = 'E' and equed = 'Y', A is overwritten by

diag(s)*A*diag(s).
afp If fact = 'N' or 'E', then afp is an output argument and on exit
A (if fact = 'E'). See the description of ap for the form of the
'N'.
particular, if rcond = 0), the matrix is singular to working precision.
exact solution.
720
computed; rcond = 0 is returned.

Specific details for the routine ppsvx interface are as follows:
afp Holds the matrix AF of size (n*(n+1)/2).
s Holds the vector of length n. Default value for each element is s(i) =
1.0_WP.
721
?pbsv
equations with a symmetric or Hermitian
positive-definite band matrix A and multiple
right-hand sides.
Syntax
FORTRAN 77:
call spbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call dpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call cpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call zpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
Fortran 95:
call pbsv( ab, b [,uplo] [,info] )
Description
an n-by-n symmetric/Hermitian positive definite band matrix, the columns of matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.

T H
where U is an upper triangular band matrix and L is a lower triangular band matrix, with the
same number of superdiagonals or subdiagonals as A. The factored form of A is then used to
solve the system of equations A*X = B.
Input Parameters
722
kd INTEGER. The number of superdiagonals of the matrix A if uplo = 'U',
or the number of subdiagonals if uplo = 'L'; kd ≥ 0.
B; nrhs ≥ 0.
ab, b REAL for spbsv
DOUBLE PRECISION for dpbsv
COMPLEX for cpbsv
DOUBLE COMPLEX for zpbsv.
Arrays: ab(ldab, *), b(ldb,*). The array ab contains the upper or
the lower triangular part of the matrix A (as specified by uplo) in band
storage (see Matrix Storage Schemes). The second dimension of ab
Output Parameters
ab The upper or lower triangular part of A (in band storage) is overwritten
by the Cholesky factor U or L, as specified by uplo, in the same storage
format as A.
itself) is not positive-definite, so the factorization could not be
723

Specific details for the routine pbsv interface are as follows:
?pbsvx
symmetric (Hermitian) positive-definite band
matrix A, and provides error bounds on the
solution.
Syntax
FORTRAN 77:
call spbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb,
call dpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb,
call cpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb,
call zpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb,
Fortran 95:
call pbsvx( ab, b, x [,uplo] [,afb] [,fact] [,equed] [,s] [,ferr] [,berr]
[,rcond] [,info] )
Description
724
system of linear equations A*X = B, where A is a n-by-n symmetric or Hermitian positive
definite band matrix, the columns of matrix B are individual right-hand sides, and the columns
of X are the corresponding solutions.
The routine ?pbsvx performs the following steps:
where U is an upper triangular band matrix and L is a lower triangular band matrix.
3. If the leading i-by-i principal minor is not positive definite, then the routine returns with
Input Parameters
it is factored.
725
If fact = 'F': on entry, afb contains the factored form of A. If equed

by s.
ab and afb will not be modified.
If fact = 'N', the matrix A will be copied to afb and factored.
copied to afb and factored.
A; kd ≥ 0.
B; nrhs ≥ 0.
ab,afb,b,work REAL for spbsvx
DOUBLE PRECISION for dpbsvx
COMPLEX for cpbsvx
DOUBLE COMPLEX for zpbsvx.
Arrays: ab(ldab,*), afb(ldab,*), b(ldb,*), work(*).
The array ab contains the upper or lower triangle of the matrix A in
band storage (see Matrix Storage Schemes).
If fact = 'F' and equed = 'Y', then ab must contain the equilibrated
matrix diag(s)*A*diag(s). The second dimension of ab must be at
least max(1, n).
The array afb is an input argument if fact = 'F'. It contains the
triangular factor U or L from the Cholesky factorization of the band
matrix A in the same storage format as A. If equed = 'Y', then afb
is the factored form of the equilibrated matrix A. The second dimension
of afb must be at least max(1, n).
726
The dimension of work must be at least max(1,3*n) for real flavors,
and at least max(1,2*n) for complex flavors.
ldab INTEGER. The first dimension of ab; ldab ≥ kd+1.
ldafb INTEGER. The first dimension of afb; ldafb ≥ kd+1.
if equed = 'N', no equilibration was done (always true if fact = 'N')
by diag(s)*A*diag(s).
argument.
n).
real flavors only.
rwork REAL for cpbsvx
DOUBLE PRECISION for zpbsvx.
flavors only.
Output Parameters
x REAL for spbsvx
DOUBLE PRECISION for dpbsvx
COMPLEX for cpbsvx
DOUBLE COMPLEX for zpbsvx.
727
If info = 0 or info = n+1, the array x contains the solution matrix X

to the original system of equations. Note that if equed = 'Y', A and
max(1,nrhs).
ab On exit, if fact = 'E' and equed = 'Y', A is overwritten by
diag(s)*A*diag(s).
afb If fact = 'N' or 'E', then afb is an output argument and on exit
A (if fact = 'E'). See the description of ab for the form of the
'N'.
728
exact solution.
equed If fact ≠'F' , then equed is an output argument. It specifies the form
of equilibration that was done (see the description of equed in Input
Arguments section).
the matrix A itself) is not positive definite, so the factorization could
computed; rcond =0 is returned. If info = i, and i = n + 1, then U
is nonsingular, but rcond is less than machine precision, meaning that
the matrix is singular to working precision. Nevertheless, the solution
and error bounds are computed because there are a number of
situations where the computed solution can be more accurate than the
value of rcond would suggest.

Specific details for the routine pbsvx interface are as follows:
afb Holds the array AF of size (kd+1,n).
s Holds the vector with the number of elements n. Default value for each
element is s(i) = 1.0_WP.
ferr Holds the vector with the number of elements nrhs.
berr Holds the vector with the number of elements nrhs.
729
?ptsv
equations with a symmetric or Hermitian positive
definite tridiagonal matrix A and multiple
right-hand sides.
Syntax
FORTRAN 77:
call sptsv( n, nrhs, d, e, b, ldb, info )
call dptsv( n, nrhs, d, e, b, ldb, info )
call cptsv( n, nrhs, d, e, b, ldb, info )
call zptsv( n, nrhs, d, e, b, ldb, info )
Fortran 95:
call ptsv( d, e, b [,info] )
Description
an n-by-n symmetric/Hermitian positive-definite tridiagonal matrix, the columns of matrix B
A is factored as A = L*D*LT (real flavors) or A = L*D*LH (complex flavors), and the factored
form of A is then used to solve the system of equations A*X = B.
Input Parameters
B; nrhs ≥ 0.
730
d REAL for single precision flavors
Array, dimension at least max(1, n). Contains the diagonal elements
of the tridiagonal matrix A.
e, b REAL for sptsv
DOUBLE PRECISION for dptsv
COMPLEX for cptsv
DOUBLE COMPLEX for zptsv.
Arrays: e(n - 1) , b(ldb,*). The array e contains the (n - 1)
subdiagonal elements of A.
Output Parameters
d Overwritten by the n diagonal elements of the diagonal matrix D from
the L*D*LT (real)/ L*D*LH (complex) factorization of A.
e Overwritten by the (n - 1) subdiagonal elements of the unit bidiagonal
factor L from the factorization of A.
itself) is not positive-definite, and the solution has not been computed.
The factorization has not been completed unless i = n.

Specific details for the routine ptsv interface are as follows:
731
?ptsvx
Uses factorization to compute the solution to the
system of linear equations with a symmetric
(Hermitian) positive definite tridiagonal matrix A,
and provides error bounds on the solution.
Syntax
FORTRAN 77:
call sptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr,
work, info )
call dptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr,
work, info )
call cptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call zptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
Fortran 95:
call ptsvx( d, e, b, x [,df] [,ef] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Description
The routine uses the Cholesky factorization A = L*D*LT (real)/A = L*D*LH (complex) to
compute the solution to a real or complex system of linear equations A*X = B, where A is a
n-by-n symmetric or Hermitian positive definite tridiagonal matrix, the columns of matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.
The routine ?ptsvx performs the following steps:
1. If fact = 'N', the matrix A is factored as A = L*D*LT (real flavors)/A = L*D*LH (complex
flavors), where L is a unit lower bidiagonal matrix and D is diagonal. The factorization can
also be regarded as having the form A = UT*D*U (real flavors)/A = UH*D*U (complex
flavors).
732
Input Parameters
on entry.
If fact = 'F': on entry, df and ef contain the factored form of A.
Arrays d, e, df, and ef will not be modified.
If fact = 'N', the matrix A will be copied to df and ef , and factored.
B; nrhs ≥ 0.
d, df, rwork REAL for single precision flavors
Arrays: d(n), df(n), rwork(n).
The array d contains the n diagonal elements of the tridiagonal matrix
A.
The array df is an input argument if fact = 'F' and on entry contains
the n diagonal elements of the diagonal matrix D from the L*D*LT (real)/
L*D*LH (complex) factorization of A.
The array rwork is a workspace array used for complex flavors only.
e,ef,b,work REAL for sptsvx
DOUBLE PRECISION for dptsvx
COMPLEX for cptsvx
DOUBLE COMPLEX for zptsvx.
Arrays: e(n -1), ef(n -1), b(ldb*), work(*). The array e contains
the (n - 1) subdiagonal elements of the tridiagonal matrix A.
733
The array ef is an input argument if fact = 'F' and on entry contains

the (n - 1) subdiagonal elements of the unit bidiagonal factor L from
the L*D*LT (real)/ L*D*LH (complex) factorization of A.
The array work is a workspace array. The dimension of work must be
at least 2*n for real flavors, and at least n for complex flavors.
ldx INTEGER. The leading dimension of x; ldx ≥ max(1, n).
Output Parameters
x REAL for sptsvx
DOUBLE PRECISION for dptsvx
COMPLEX for cptsvx
DOUBLE COMPLEX for zptsvx.
X to the system of equations. The second dimension of x must be at
least max(1,nrhs).
df, ef These arrays are output arguments if fact = 'N'. See the description
of df, ef in Input Arguments section.
734
exact solution.
computed; rcond =0 is returned.

Specific details for the routine ptsvx interface are as follows:
ef Holds the vector of length (n-1).
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
both arguments af and ipiv must be present; otherwise, an error is
returned.
735
?sysv
equations with a real or complex symmetric matrix
A and multiple right-hand sides.
Syntax
FORTRAN 77:
call ssysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call dsysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call csysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zsysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
Fortran 95:
call sysv( a, b [,uplo] [,ipiv] [,info] )
Description
an n-by-n symmetric matrix, the columns of matrix B are individual right-hand sides, and the
columns of X are the corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*UT or A = L*D*LT , where U (or
L) is a product of permutation and unit upper (lower) triangular matrices, and D is symmetric
and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
Input Parameters
736
B; nrhs ≥ 0.
a, b, work REAL for ssysv
DOUBLE PRECISION for dsysv
COMPLEX for csysv
DOUBLE COMPLEX for zsysv.
Arrays: a(lda,*), b(ldb,*), work(*).
The array a contains the upper or the lower triangular part of the
symmetric matrix A (see uplo). The second dimension of a must be at
least max(1, n).
work is a workspace array, dimension at least max(1,lwork).
lwork INTEGER. The size of the work array; lwork ≥ 1.
issued by xerbla. See Application Notes below for details and for the
suggested value of lwork.
Output Parameters
a If info = 0, a is overwritten by the block-diagonal matrix D and the
multipliers used to obtain the factor U (or L) from the factorization of
A as computed by ?sytrf.
b If info = 0, b is overwritten by the solution matrix X.
ipiv INTEGER.
interchanges and the block structure of D, as determined by ?sytrf.
If ipiv(i) = k >0, then dii is a 1-by-1 diagonal block, and the i-th
row and column of A was interchanged with the k-th row and column.
737
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2

If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
exactly singular, so the solution could not be computed.

Specific details for the routine sysv interface are as follows:
Application Notes
algorithm.
738
workspace query.
workspace.
?sysvx
Uses the diagonal pivoting factorization to compute
the solution to the system of linear equations with
a real or complex symmetric matrix A, and provides
error bounds on the solution.
Syntax
FORTRAN 77:
call ssysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx,
rcond, ferr, berr, work, lwork, iwork, info )
call dsysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx,
rcond, ferr, berr, work, lwork, iwork, info )
call csysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx,
rcond, ferr, berr, work, lwork, rwork, info )
call zsysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx,
Fortran 95:
call sysvx( a, b, x [,uplo] [,af] [,ipiv] [,fact] [,ferr] [,berr] [,rcond]
[,info] )
Description
The routine uses the diagonal pivoting factorization to compute the solution to a real or complex
system of linear equations A*X = B, where A is a n-by-n symmetric matrix, the columns of
matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
739
The routine ?sysvx performs the following steps:
1. If fact = 'N', the diagonal pivoting method is used to factor the matrix A. The form of the
factorization is A = U*D*UT or A = L*D*LT, where U (or L) is a product of permutation and
unit upper (lower) triangular matrices, and D is symmetric and block diagonal with 1-by-1
and 2-by-2 diagonal blocks.
2. If some di,i= 0, so that D is exactly singular, then the routine returns with info = i.
If the reciprocal of the condition number is less than machine precision, info = n+1 is
as described below.
Input Parameters
supplied on entry.
Arrays a, af, and ipiv will not be modified.
B; nrhs ≥ 0.
a,af,b,work REAL for ssysvx
DOUBLE PRECISION for dsysvx
COMPLEX for csysvx
DOUBLE COMPLEX for zsysvx.
740
symmetric matrix A (see uplo). The second dimension of a must be at
least max(1,n).
The array af is an input argument if fact = 'F'. It contains he block
diagonal matrix D and the multipliers used to obtain the factor U or L
from the factorization A = U*D*UT orA = L*D*LT as computed by
?sytrf. The second dimension of af must be at least max(1,n).
work(*) is a workspace array, dimension at least max(1,lwork).
ipiv INTEGER.
argument if fact = 'F'. It contains details of the interchanges and
the block structure of D, as determined by ?sytrf.
If ipiv(i) = k > 0, then dii is a 1-by-1 diagonal block, and the i-th
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a
2-by-2 block in rows/columns i and i-1, and (i-1)-th row and column
of A was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a
2-by-2 block in rows/columns i and i+1, and (i+1)-th row and column
of A was interchanged with the m-th row and column.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1,
n).
lwork INTEGER. The size of the work array.
741

real flavors only.
rwork REAL for csysvx;
DOUBLE PRECISION for zsysvx.
flavors only.
Output Parameters
x REAL for ssysvx
DOUBLE PRECISION for dsysvx
COMPLEX for csysvx
DOUBLE COMPLEX for zsysvx.
least max(1,nrhs).
af, ipiv These arrays are output arguments if fact = 'N'.
See the description of af, ipiv in Input Arguments section.
is less than the machine precision (in particular, if rcond = 0), the
matrix is singular to working precision. This condition is indicated by a
return code of info > 0.
742
exact solution.
work(1) If info=0, on exit work(1) contains the minimum value of lwork
If info = i, and i ≤ n, then dii is exactly zero. The factorization has
been completed, but the block diagonal matrix D is exactly singular, so
the solution and error bounds could not be computed; rcond = 0 is
returned.
If info = i, and i = n + 1, then D is nonsingular, but rcond is less

Specific details for the routine sysvx interface are as follows:
returned.
743
Application Notes
For real flavors, lwork must be at least 3*n, and for complex flavors at least 2*n. For better
performance, try using lwork = n*blocksize, where blocksize is the optimal block size for
?sytrf.
workspace query.
workspace.
?hesv
equations with a Hermitian matrix A and multiple
right-hand sides.
Syntax
FORTRAN 77:
call chesv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zhesv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
Fortran 95:
call hesv( a, b [,uplo] [,ipiv] [,info] )
Description
744
The routine solves for X the complex system of linear equations A*X = B, where A is an n-by-n
symmetric matrix, the columns of matrix B are individual right-hand sides, and the columns of
X are the corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*UH or A = L*D*LH , where U (or
L) is a product of permutation and unit upper (lower) triangular matrices, and D is Hermitian
Input Parameters
how A is factored:
matrix A, and A is factored as U*D*UH.
A, and A is factored as L*D*LH.
B; nrhs ≥ 0.
a, b, work COMPLEX for chesv
DOUBLE COMPLEX for zhesv.
Arrays: a(lda,*), b(ldb,*), work(*). The array a contains the upper
or the lower triangular part of the Hermitian matrix A (see uplo). The
second dimension of a must be at least max(1, n).
work is a workspace array, dimension at least max(1,lwork).
lwork INTEGER. The size of the work array (lwork ≥ 1).
745

Output Parameters
a If info = 0, a is overwritten by the block-diagonal matrix D and the
multipliers used to obtain the factor U (or L) from the factorization of
A as computed by ?hetrf.
ipiv INTEGER.
interchanges and the block structure of D, as determined by ?hetrf.

Specific details for the routine hesv interface are as follows:
746
Application Notes
algorithm.
workspace query.
workspace.
?hesvx
the solution to the complex system of linear
equations with a Hermitian matrix A, and provides
error bounds on the solution.
Syntax
FORTRAN 77:
call chesvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx,
call zhesvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx,
747
Fortran 95:
call hesvx( a, b, x [,uplo] [,af] [,ipiv] [,fact] [,ferr] [,berr] [,rcond]
[,info] )
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex system
of linear equations A*X = B, where A is an n-by-n Hermitian matrix, the columns of matrix B
The routine ?hesvx performs the following steps:
factorization is A = U*D*UH or A = L*D*LH, where U (or L) is a product of permutation and
unit upper (lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1
2. If some di,i = 0, so that D is exactly singular, then the routine returns with info = i.
as described below.
Input Parameters
supplied on entry.
Arrays a, af, and ipiv are not modified.
If fact = 'N', the matrix A is copied to af and factored.
748
how A is factored:
Hermitian matrix A, and A is factored as U*D*UH.
If uplo = 'L', the array a stores the lower triangular part of the
Hermitian matrix A; A is factored as L*D*LH.
B; nrhs ≥ 0.
a,af,b,work COMPLEX for chesvx
DOUBLE COMPLEX for zhesvx.
Hermitian matrix A (see uplo). The second dimension of a must be at
least max(1,n).
The array af is an input argument if fact = 'F'. It contains he block
from the factorization A = U*D*UH or A = L*D*LH as computed by
?hetrf. The second dimension of af must be at least max(1,n).
work(*) is a workspace array of dimension at least max(1, lwork).
ipiv INTEGER.
the block structure of D, as determined by ?hetrf.
749

n).
rwork REAL for chesvx
DOUBLE PRECISION for zhesvx.
Output Parameters
x COMPLEX for chesvx
DOUBLE COMPLEX for zhesvx.
least max(1,nrhs).
af, ipiv These arrays are output arguments if fact = 'N'. See the description
of af, ipiv in Input Arguments section.
rcond REAL for chesvx
ferr REAL for chesvx
750
rcon, and is almost always a slight overestimate of the true error.
berr REAL for chesvx
exact solution.
returned.

Specific details for the routine hesvx interface are as follows:
751

returned.
Application Notes
The value of lwork must be at least 2*n. For better performance, try using lwork =
n*blocksize, where blocksize is the optimal block size for ?hetrf.
workspace query.
workspace.
?spsv
equations with a real or complex symmetric matrix
A stored in packed format, and multiple right-hand
sides.
Syntax
FORTRAN 77:
call sspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call dspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call cspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
752
Fortran 95:
call spsv( ap, b [,uplo] [,ipiv] [,info] )
Description
an n-by-n symmetric matrix stored in packed format, the columns of matrix B are individual
The diagonal pivoting method is used to factor A as A = U*D*UT or A = L*D*LT , where U (or
L) is a product of permutation and unit upper (lower) triangular matrices, and D is symmetric
Input Parameters
B; nrhs ≥ 0.
ap, b REAL for sspsv
DOUBLE PRECISION for dspsv
COMPLEX for cspsv
DOUBLE COMPLEX for zspsv.
753
Output Parameters
ap The block-diagonal matrix D and the multipliers used to obtain the factor
U (or L) from the factorization of A as computed by ?sptrf, stored as
a packed triangular matrix in the same storage format as A.
ipiv INTEGER.
interchanges and the block structure of D, as determined by ?sptrf.
If ipiv(i) = k > 0, then dii is a 1-by-1 block, and the i-th row and
column of A was interchanged with the k-th row and column.

Specific details for the routine spsv interface are as follows:
ipiv Holds the vector with the number of elements n.
754
?spsvx
a real or complex symmetric matrix A stored in
packed format, and provides error bounds on the
solution.
Syntax
FORTRAN 77:
call sspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr,
call dspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr,
call cspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr,
call zspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr,
Fortran 95:
call spsvx( ap, b, x [,uplo] [,afp] [,ipiv] [,fact] [,ferr] [,berr] [,rcond]
[,info] )
Description
The routine uses the diagonal pivoting factorization to compute the solution to a real or complex
system of linear equations A*X = B, where A is a n-by-n symmetric matrix stored in packed
format, the columns of matrix B are individual right-hand sides, and the columns of X are the
The routine ?spsvx performs the following steps:
755
1. If fact = 'N', the diagonal pivoting method is used to factor the matrix A. The form of
the factorization is A = U*D*UT orA = L*D*LT, where U (or L) is a product of permutation
and unit upper (lower) triangular matrices, and D is symmetric and block diagonal with
1-by-1 and 2-by-2 diagonal blocks.
as described below.
Input Parameters
supplied on entry.
If fact = 'F': on entry, afp and ipiv contain the factored form of
A. Arrays ap, afp, and ipiv are not modified.
If fact = 'N', the matrix A is copied to afp and factored.
how A is factored:
symmetric matrix A, and A is factored as U*D*UT.
symmetric matrix A; A is factored as L*D*LT.
B; nrhs ≥ 0.
ap,afp,b,work REAL for sspsvx
DOUBLE PRECISION for dspsvx
COMPLEX for cspsvx
DOUBLE COMPLEX for zspsvx.
756
The array ap contains the upper or lower triangle of the symmetric
matrix A in packed storage (see Matrix Storage Schemes).
The array afp is an input argument if fact = 'F'. It contains the block
from the factorization A = U*D*UT or A = L*D*LT as computed by
?sptrf, in the same storage format as A.
n(n+1)/2); the second dimension of b must be at least max(1,nrhs);
the dimension of work must be at least max(1,3*n) for real flavors
and max(1,2*n) for complex flavors.
ipiv INTEGER.
the block structure of D, as determined by ?sptrf.
n).
real flavors only.
rwork REAL for cspsvx
DOUBLE PRECISION for zspsvx.
flavors only.
757
Output Parameters
x REAL for sspsvx
DOUBLE PRECISION for dspsvx
COMPLEX for cspsvx
DOUBLE COMPLEX for zspsvx.
least max(1,nrhs).
afp, ipiv These arrays are output arguments if fact = 'N'. See the description
of afp, ipiv in Input Arguments section.
forward and relative backward errors, respectively, for each solution
vector.
returned.
758
Specific details for the routine spsvx interface are as follows:
returned.
?hpsv
equations with a Hermitian matrix A stored in
packed format, and multiple right-hand sides.
Syntax
FORTRAN 77:
call chpsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zhpsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
Fortran 95:
call hpsv( ap, b [,uplo] [,ipiv] [,info] )
Description
759
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n Hermitian
matrix stored in packed format, the columns of matrix B are individual right-hand sides, and
the columns of X are the corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*UH or A = L*D*LH, where U (or
L) is a product of permutation and unit upper (lower) triangular matrices, and D is Hermitian
Input Parameters
B; nrhs ≥ 0.
ap, b COMPLEX for chpsv
DOUBLE COMPLEX for zhpsv.
Output Parameters
ap The block-diagonal matrix D and the multipliers used to obtain the factor
U (or L) from the factorization of A as computed by ?hptrf, stored as
a packed triangular matrix in the same storage format as A.
ipiv INTEGER.
760
interchanges and the block structure of D, as determined by ?hptrf.
If ipiv(i) = k > 0, then dii is a 1-by-1 block, and the i-th row and
column of A was interchanged with the k-th row and column.

Specific details for the routine hpsv interface are as follows:
761
?hpsvx
a Hermitian matrix A stored in packed format, and
provides error bounds on the solution.
Syntax
FORTRAN 77:
call chpsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr,
call zhpsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr,
Fortran 95:
call hpsvx( ap, b, x [,uplo] [,afp] [,ipiv] [,fact] [,ferr] [,berr] [,rcond]
[,info] )
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex system
of linear equations A*X = B, where A is a n-by-n Hermitian matrix stored in packed format,
The routine ?hpsvx performs the following steps:
factorization is A = U*D*UH or A = L*D*LH, where U (or L) is a product of permutation and
unit upper (lower) triangular matrices, and D is a Hermitian and block diagonal with 1-by-1
762
as described below.
Input Parameters
supplied on entry.
If fact = 'F': on entry, afp and ipiv contain the factored form of
A. Arrays ap, afp, and ipiv are not modified.
If fact = 'N', the matrix A is copied to afp and factored.
how A is factored:
Hermitian matrix A, and A is factored as U*D*UH.
Hermitian matrix A, and A is factored as L*D*LH.
B; nrhs ≥ 0.
ap,afp,b,work COMPLEX for chpsvx
DOUBLE COMPLEX for zhpsvx.
The array ap contains the upper or lower triangle of the Hermitian
matrix A in packed storage (see Matrix Storage Schemes).
The array afp is an input argument if fact = 'F'. It contains the block
from the factorization A = U*D*UH or A = L*D*LH as computed by
?hptrf, in the same storage format as A.
763
the second dimension of b must be at least max(1,nrhs); the dimension
of work must be at least max(1,2*n).
ipiv INTEGER.
the block structure of D, as determined by ?hptrf.
n).
rwork REAL for chpsvx
DOUBLE PRECISION for zhpsvx.
Output Parameters
x COMPLEX for chpsvx
DOUBLE COMPLEX for zhpsvx.
least max(1,nrhs).
afp, ipiv These arrays are output arguments if fact = 'N'. See the description
of afp, ipiv in Input Arguments section.
rcond REAL for chpsvx
764
ferr REAL for chpsvx
berr REAL for chpsvx
exact solution.
returned.

Specific details for the routine hpsvx interface are as follows:
765

returned.
766
LAPACK Routines: Least Squares
and Eigenvalue Problems 4
This chapter describes the Intel® Math Kernel Library implementation of routines from the LAPACK package
that are used for solving linear least squares problems, eigenvalue and singular value problems, as well
as performing a number of related computational tasks.
Sections in this chapter include descriptions of LAPACK computational routines and driver routines. For
full reference on LAPACK routines and related information see [LUG].
Least Squares Problems. A typical least squares problem is as follows: given a matrix A and a vector
b, find the vector x that minimizes the sum of squares Σi((Ax)i - bi)2 or, equivalently, find the vector
x that minimizes the 2-norm ||Ax - b||2.
In the most usual case, A is an m-by-n matrix with m ≥ n and rank(A) = n. This problem is also referred
to as finding the least squares solution to an overdetermined system of linear equations (here we have
more equations than unknowns). To solve this problem, you can use the QR factorization of the matrix A
(see QR Factorization).
If m < n and rank(A) = m, there exist an infinite number of solutions x which exactly satisfy Ax = b,
and thus minimize the norm ||Ax - b||2. In this case it is often useful to find the unique solution that
minimizes ||x||2. This problem is referred to as finding the minimum-norm solution to an
underdetermined system of linear equations (here we have more unknowns than equations). To solve
this problem, you can use the LQ factorization of the matrix A (see LQ Factorization).
In the general case you may have a rank-deficient least squares problem, with rank(A)< min(m, n):
find the minimum-norm least squares solution that minimizes both ||x||2 and ||Ax - b||2. In this
case (or when the rank of A is in doubt) you can use the QR factorization with pivoting or singular value
decomposition (see Singular Value Decomposition).
Eigenvalue Problems. The eigenvalue problems (from German eigen “own”) are stated as follows:
given a matrix A, find the eigenvalues λ and the corresponding eigenvectors z that satisfy the equation
Az = λz (right eigenvectors z)
or the equation
zHA = λzH (left eigenvectors z).

If A is a real symmetric or complex Hermitian matrix, the above two equations are equivalent, and the
problem is called a symmetric eigenvalue problem. Routines for solving this type of problems are described
in the section Symmetric Eigenvalue Problems .
Routines for solving eigenvalue problems with nonsymmetric or non-Hermitian matrices are described in
the section Nonsymmetric Eigenvalue Problems .
767
The library also includes routines that handle generalized symmetric-definite eigenvalue problems:
find the eigenvalues λ and the corresponding eigenvectors x that satisfy one of the following equations:
Az = λBz, ABz = λz, or BAz = λz,

where A is symmetric or Hermitian, and B is symmetric positive-definite or Hermitian positive-definite.
Routines for reducing these problems to standard symmetric eigenvalue problems are described in
the section Generalized Symmetric-Definite Eigenvalue Problems .
To solve a particular problem, you usually call several computational routines. Sometimes you need
to combine the routines of this chapter with other LAPACK routines described in Chapter 3 as well
as with BLAS routines described in Chapter 2 .
For example, to solve a set of least squares problems minimizing ||Ax - b||2 for all columns b of
a given matrix B (where A and B are real matrices), you can call ?geqrf to form the factorization A
= QR, then call ?ormqr to compute C = QHB and finally call the BLAS routine ?trsm to solve for X
the system of equations RX = C.
Another way is to call an appropriate driver routine that performs several tasks in one call. For
example, to solve the least squares problem the driver routine ?gels can be used.
WARNING. LAPACK routines require that input matrices do not contain INF or NaN
values. When input data is inappropriate for LAPACK, problems may arise, including
possible hangs.
Starting from release 8.0, Intel MKL along with FORTRAN 77 interface to LAPACK computational and
driver routines supports also Fortran 95 interface which uses simplified routine calls with shorter
argument lists. The calling sequence for Fortran 95 interface is given in the syntax section of the
routine description immediately after FORTRAN 77 calls.

For each routine in this chapter, when calling it from the FORTRAN 77 program you can use
the LAPACK name.
LAPACK names have the structure xyyzzz, which is described below.
The initial letter x indicates the data type:
768
LAPACK Routines: Least Squares and Eigenvalue Problems 4
The second and third letters yy indicate the matrix type and storage scheme:
bd bidiagonal matrix
ge general matrix
gb general band matrix
hs upper Hessenberg matrix
or (real) orthogonal matrix
op (real) orthogonal matrix (packed storage)
un (complex) unitary matrix
up (complex) unitary matrix (packed storage)
pt symmetric or Hermitian positive-definite tridiagonal matrix
sy symmetric matrix
sp symmetric matrix (packed storage)
sb (real) symmetric band matrix
st (real) symmetric tridiagonal matrix
he Hermitian matrix
hp Hermitian matrix (packed storage)
hb (complex) Hermitian band matrix
tr triangular or quasi-triangular matrix.
The last three letters zzz indicate the computation performed, for example:
qrf form the QR factorization
lqf form the LQ factorization.
Thus, the routine sgeqrf forms the QR factorization of general real matrices in single precision;
the corresponding routine for complex matrices is cgeqrf.
Names of the LAPACK computational and driver routines for Fortran 95 interface in Intel MKL
are the same as FORTRAN 77 names but without the first letter that indicates the data type.
For example, the name of the routine that forms the QR factorization of general real matrices
in Fortran 95 interface is geqrf. Handling of different data types is done through defining a
specific internal parameter referring to a module block with named constants for single and
double precision.
For details on the design of Fortran 95 interface for LAPACK computational and driver routines
in Intel MKL and for the general information on how the optional arguments are reconstructed,
see Fortran 95 Interface Conventions in chapter 3 .
769

LAPACK routines use the following matrix storage schemes:
• Band storage: an m-by-n band matrix with kl sub-diagonals and ku super-diagonals is
stored compactly in a two-dimensional array ab with kl+ku+1 rows and n columns. Columns
of the matrix are stored in the corresponding columns of the array, and diagonals of the
matrix are stored in rows of the array.
In Chapters 3 and 4 , arrays that hold matrices in packed storage have names ending in p;
arrays with matrices in band storage have names ending in b. For more information on matrix
storage schemes, see “Matrix Arguments” in Appendix B .
In addition to the mathematical notation used in previous chapters, descriptions of routines in
this chapter use the following notation:
λi Eigenvalues of the matrix A (for the definition of eigenvalues, see

Eigenvalue Problems).
σi Singular values of the matrix A. They are equal to square roots of the
H
eigenvalues of A A. (For more information, see Singular Value
Decomposition).
||x||2 The 2-norm of the vector x: ||x||2 = (Σi|xi|2)1/2 = ||x||E .
||A||2 The 2-norm (or spectral norm) of the matrix A.
||A||2 = maxiσi, ||A||22= max|x|=1(Ax·Ax).
||A||E The Euclidean norm of the matrix A: ||A||E2 = ΣiΣj|aij|2 (for
vectors, the Euclidean norm and the 2-norm are equal: ||x||E =
||x||2).
q(x, y) The acute angle between vectors x and y:
cos q(x, y) = |x·y| / (||x||2||y||2).
770
In the sections that follow, the descriptions of LAPACK computational routines are given. These
routines perform distinct computational tasks that can be used for:
Orthogonal Factorizations
Singular Value Decomposition
Symmetric Eigenvalue Problems
Generalized Symmetric-Definite Eigenvalue Problems
Nonsymmetric Eigenvalue Problems
Generalized Nonsymmetric Eigenvalue Problems
Generalized Singular Value Decomposition
See also the respective driver routines.
This section describes the LAPACK routines for the QR (RQ) and LQ (QL) factorization of
matrices. Routines for the RZ factorization as well as for generalized QR and RQ factorizations
are also included.
QR Factorization. Assume that A is an m-by-n matrix to be factored.
If m ≥ n, the QR factorization is given by
where R is an n-by-n upper triangular matrix with real diagonal elements, and Q is an m-by-m
orthogonal (or unitary) matrix.
You can use the QR factorization for solving the following least squares problem: minimize ||Ax
- b||2 where A is a full-rank m-by-n matrix (m≥n). After factoring the matrix, compute the
solution x by solving Rx = (Q1)Tb.
If m < n, the QR factorization is given by
A = QR = Q(R1R2)
771
where R is trapezoidal, R1 is upper triangular and R2 is rectangular.

The LAPACK routines do not form the matrix Q explicitly. Instead, Q is represented as a product
of min(m, n) elementary reflectors. Routines are provided to work with Q in this representation.
LQ Factorization LQ factorization of an m-by-n matrix A is as follows. If m ≤ n,
where L is an m-by-m lower triangular matrix with real diagonal elements, and Q is an n-by-n
orthogonal (or unitary) matrix.
If m > n, the LQ factorization is
where L1 is an n-by-n lower triangular matrix, L2 is rectangular, and Q is an n-by-n orthogonal

(or unitary) matrix.
You can use the LQ factorization to find the minimum-norm solution of an underdetermined
system of linear equations Ax = b where A is an m-by-n matrix of rank m (m < n). After
factoring the matrix, compute the solution vector x as follows: solve Ly = b for y, and then
compute x = (Q1)Hy.
Table 4-1 lists LAPACK routines (FORTRAN 77 interface) that perform orthogonal factorization
of matrices. Respective routine names in Fortran 95 interface are without the first symbol (see
Routine Naming Conventions).
Table 4-1 Computational Routines for Orthogonal Factorization
Matrix type, factorization Factorize without Factorize Generate Apply
pivoting with pivoting matrix Q matrix Q
general matrices, QR ?geqrf ?geqpf ?orgqr ?ormqr

factorization ?geqp3 ?ungqr ?unmqr
general matrices, RQ ?gerqf ?orgrq ?ormrq

factorization ?ungrq ?unmrq
772
Matrix type, factorization Factorize without Factorize Generate Apply
pivoting with pivoting matrix Q matrix Q
general matrices, LQ ?gelqf ?orglq ?ormlq

factorization ?unglq ?unmlq
general matrices, QL ?geqlf ?orgql ?ormql

factorization ?ungql ?unmql
trapezoidal matrices, RZ ?tzrzf ?ormrz

factorization ?unmrz
pair of matrices, generalized ?ggqrf

QR factorization
pair of matrices, generalized ?ggrqf

RQ factorization
?geqrf
Computes the QR factorization of a general m-by-n
matrix.
Syntax
FORTRAN 77:
call sgeqrf(m, n, a, lda, tau, work, lwork, info)
call dgeqrf(m, n, a, lda, tau, work, lwork, info)
call cgeqrf(m, n, a, lda, tau, work, lwork, info)
call zgeqrf(m, n, a, lda, tau, work, lwork, info)
Fortran 95:
call geqrf(a [, tau] [,info])
Description
The routine forms the QR factorization of a general m-by-n matrix A (see Orthogonal
Factorizations). No pivoting is performed.
773
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of
min(m, n) elementary reflectors. Routines are provided to work with Q in this representation.
for details.
Input Parameters
n INTEGER. The number of columns in A (n ≥ 0).
a, work REAL for sgeqrf
DOUBLE PRECISION for dgeqrf
COMPLEX for cgeqrf
DOUBLE COMPLEX for zgeqrf.
Arrays: a(lda,*) contains the matrix A. The second
work is a workspace array, its dimension max(1, lwork).
lda INTEGER. The first dimension of a; at least max(1, m).
If lwork = -1, then a workspace query is assumed; the
routine only calculates the optimal size of the work array,
returns this value as the first entry of the work array, and
no error message related to lwork is issued by xerbla.
Output Parameters
a Overwritten by the factorization data as follows:
If m ≥ n, the elements below the diagonal are overwritten
by the details of the unitary matrix Q, and the upper triangle
is overwritten by the corresponding elements of the upper
triangular matrix R.
774
If m < n, the strictly lower triangular part is overwritten by
the details of the unitary matrix Q, and the remaining
elements are overwritten by the corresponding elements of
the m-by-n upper trapezoidal matrix R.
tau REAL for sgeqrf
DOUBLE PRECISION for dgeqrf
COMPLEX for cgeqrf
Array, DIMENSION at least max (1, min(m, n)). Contains
additional information on the matrix Q.
work(1) If info = 0, on exit work(1) contains the minimum value
of lwork required for optimum performance. Use this lwork
for subsequent runs.
info INTEGER.

FORTRAN 77 counterparts. For general conventions applied to skip redundant or restorable
Specific details for the routine geqrf interface are the following:

tau Holds the vector of length min(m,n)
Application Notes
algorithm.
775
workspace query.
workspace.
The computed factorization is the exact factorization of a matrix A + E, where
||E||2 = O(ε)||A||2.
3
(4/3)n if m = n,
2
(2/3)n (3m-n) if m > n,
2
(2/3)m (3n-m) if m < n.
The number of operations for complex flavors is 4 times greater.

To solve a set of least squares problems minimizing ||A*x - b||2 for all columns b of a given
matrix B, you can call the following:
?geqrf (this routine) to factorize A = QR;

T
?ormqr to compute C = Q *B (for real matrices);
H
?unmqr to compute C = Q *B (for complex matrices);
?trsm (a BLAS routine) to solve R*X = C.
(The columns of the computed X are the least squares solution vectors x.)
To compute the elements of Q explicitly, call

?orgqr (for real matrices)
?ungqr (for complex matrices).
776
?geqpf
matrix with pivoting.
Syntax
FORTRAN 77:
call sgeqpf(m, n, a, lda, jpvt, tau, work, info)
call dgeqpf(m, n, a, lda, jpvt, tau, work, info)
call cgeqpf(m, n, a, lda, jpvt, tau, work, rwork, info)
call zgeqpf(m, n, a, lda, jpvt, tau, work, rwork, info)
Fortran 95:
call geqpf(a, jpvt [,tau] [,info])
Description
The routine is deprecated and has been replaced by routine ?geqp3.
The routine ?geqpf forms the QR factorization of a general m-by-n matrix A with column pivoting:
A*P = Q*R (see Orthogonal Factorizations). Here P denotes an n-by-n permutation matrix.
Input Parameters
a, work REAL for sgeqpf
DOUBLE PRECISION for dgeqpf
COMPLEX for cgeqpf
DOUBLE COMPLEX for zgeqpf.
Arrays: a (lda,*) contains the matrix A. The second
777
work (lwork) is a workspace array. The size of the work

array must be at least max(1, 3*n) for real flavors and at
least max(1, n) for complex flavors.
jpvt INTEGER. Array, DIMENSION at least max(1, n).
On entry, if jpvt(i) > 0, the i-th column of A is moved
to the beginning of A*P before the computation, and fixed
in place during the computation.
If jpvt(i) = 0, the ith column of A is a free column (that
is, it may be interchanged during the computation with any
other free column).
rwork REAL for cgeqpf
DOUBLE PRECISION for zgeqpf.
A workspace array, DIMENSION at least max(1, 2*n).
Output Parameters
by the details of the unitary (orthogonal) matrix Q, and the
upper triangle is overwritten by the corresponding elements
of the upper triangular matrix R.
the details of the matrix Q, and the remaining elements are
overwritten by the corresponding elements of the m-by-n
upper trapezoidal matrix R.
tau REAL for sgeqpf
DOUBLE PRECISION for dgeqpf
COMPLEX for cgeqpf
DOUBLE COMPLEX for zgeqpf.
additional information on the matrix Q.
jpvt Overwritten by details of the permutation matrix P in the
factorization A*P = Q*R. More precisely, the columns of
A*P are the columns of A in the following order:
jpvt(1), jpvt(2), ..., jpvt(n).
info INTEGER.
778

Specific details for the routine geqpf interface are the following:
jpvt Holds the vector of length n.
Application Notes
||E||2 = O(ε)||A||2.
3
(4/3)n if m = n,
2
(2/3)n (3m-n) if m > n,
2
(2/3)m (3n-m) if m < n.

?geqpf (this routine) to factorize A*P = Q*R;
?ormqr to compute C = QT*B (for real matrices);
?unmqr to compute C = QH*B (for complex matrices);
(The columns of the computed X are the permuted least squares solution vectors x; the output
array jpvt specifies the permutation order.)

779
?geqp3
matrix with column pivoting using level 3 BLAS.
Syntax
FORTRAN 77:
call sgeqp3(m, n, a, lda, jpvt, tau, work, lwork, info)
call dgeqp3(m, n, a, lda, jpvt, tau, work, lwork, info)
call cgeqp3(m, n, a, lda, jpvt, tau, work, lwork, rwork, info)
call zgeqp3(m, n, a, lda, jpvt, tau, work, lwork, rwork, info)
Fortran 95:
call geqp3(a, jpvt [,tau] [,info])
Description
The routine forms the QR factorization of a general m-by-n matrix A with column pivoting: A*P
= Q*R (see Orthogonal Factorizations) using Level 3 BLAS. Here P denotes an n-by-n permutation
matrix. Use this routine instead of ?geqpf for better performance.
Input Parameters
a, work REAL for sgeqp3
DOUBLE PRECISION for dgeqp3
COMPLEX for cgeqp3
DOUBLE COMPLEX for zgeqp3.
780
Arrays:
a (lda,*) contains the matrix A.
work is a workspace array, its dimension max(1,
lwork).
lwork INTEGER. The size of the work array; must be at least
max(1, 3*n+1) for real flavors, and at least max(1, n+1)
no error message related to lwork is issued by xerbla. See
Application Notes below for details.
jpvt INTEGER.
On entry, if jpvt(i) ≠ 0, the i-th column of A is moved
to the beginning of AP before the computation, and fixed in
place during the computation.
If jpvt(i) = 0, the i-th column of A is a free column (that
is, it may be interchanged during the computation with any
other free column).
rwork REAL for cgeqp3
DOUBLE PRECISION for zgeqp3.
A workspace array, DIMENSION at least max(1, 2*n). Used
in complex flavors only.
Output Parameters
upper triangle is overwritten by the corresponding elements
of the upper triangular matrix R.
781

upper trapezoidal matrix R.
tau REAL for sgeqp3
DOUBLE PRECISION for dgeqp3
COMPLEX for cgeqp3
DOUBLE COMPLEX for zgeqp3.
scalar factors of the elementary reflectors for the matrix Q.
jpvt Overwritten by details of the permutation matrix P in the
factorization A*P = Q*R. More precisely, the columns of AP
are the columns of A in the following order:
jpvt(1), jpvt(2), ..., jpvt(n).
info INTEGER.

Specific details for the routine geqp3 interface are the following:
jpvt Holds the vector of length n.
Application Notes
?geqp3 (this routine) to factorize A*P = Q*R;
?ormqr to compute C = QT*B (for real matrices);
?unmqr to compute C = QH*B (for complex matrices);
782
(The columns of the computed X are the permuted least squares solution vectors x; the output
array jpvt specifies the permutation order.)

workspace query.
workspace.
?orgqr
Generates the real orthogonal matrix Q of the QR
factorization formed by ?geqrf.
Syntax
FORTRAN 77:
call sorgqr(m, n, k, a, lda, tau, work, lwork, info)
call dorgqr(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call orgqr(a, tau [,info])
Description
783
The routine generates the whole or part of m-by-m orthogonal matrix Q of the QR factorization
formed by the routines sgeqrf/dgeqrf or sgeqpf/dgeqpf. Use this routine after a call to
sgeqrf/dgeqrf or sgeqpf/dgeqpf.
Usually Q is determined from the QR factorization of an m by p matrix A with m ≥ p. To compute

the whole matrix Q, use:
call ?orgqr(m, m, p, a, lda, tau, work, lwork, info)
To compute the leading p columns of Q (which form an orthonormal basis in the space spanned
by the columns of A):
call ?orgqr(m, p, p, a, lda, tau, work, lwork, info)
k
To compute the matrix Q of the QR factorization of leading k columns of the matrix A:
call ?orgqr(m, m, k, a, lda, tau, work, lwork, info)
k
To compute the leading k columns of Q (which form an orthonormal basis in the space spanned
by leading k columns of the matrix A:):
call ?orgqr(m, k, k, a, lda, tau, work, lwork, info)
Input Parameters
m INTEGER. The order of the orthogonal matrix Q (m ≥ 0).
n INTEGER. The number of columns of Q to be computed
(0 ≤ n ≤ m).
k INTEGER. The number of elementary reflectors whose
product defines the matrix Q (0 ≤ k ≤ n).
a, tau, work REAL for sorgqr
DOUBLE PRECISION for dorgqr
Arrays:
a(lda,*) and tau(*) are the arrays returned by sgeqrf /
dgeqrf or sgeqpf / dgeqpf.
The dimension of tau must be at least max(1, k).
784
Output Parameters
a Overwritten by n leading columns of the m-by-m orthogonal
matrix Q.
info INTEGER.

Specific details for the routine orgqr interface are the following:
tau Holds the vector of length (k)
Application Notes
algorithm.
785
workspace query.
workspace.
The computed Q differs from an exactly orthogonal matrix by a matrix E such that
||E||2 = O(ε)|*|A||2 where ε is the machine precision.

The total number of floating-point operations is approximately 4*m*n*k - 2*(m + n)*k2 +
(4/3)*k3.
If n = k, the number is approximately (2/3)*n2*(3m - n).
The complex counterpart of this routine is ?ungqr.
?ormqr
Multiplies a real matrix by the orthogonal matrix
Q of the QR factorization formed by ?geqrf or
?geqpf.
Syntax
FORTRAN 77:
call sormqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call ormqr(a, tau, c [,side] [,trans] [,info])
Description
T
The routine multiplies a real matrix C by Q or Q , where Q is the orthogonal matrix Q of the QR
factorization formed by the routines sgeqrf/dgeqrf or sgeqpf/dgeqpf.
786
Depending on the parameters side and trans, the routine can form one of the matrix products
Q*C, QT*C, C*Q, or C*QT (overwriting the result on C).
Input Parameters
side CHARACTER*1. Must be either 'L' or 'R'.
T
If side ='L', Q or Q is applied to C from the left.
T
If side ='R', Q or Q is applied to C from the right.
trans CHARACTER*1. Must be either 'N' or 'T'.
If trans ='N', the routine multiplies C by Q.
T
If trans ='T', the routine multiplies C by Q .
m INTEGER. The number of rows in the matrix C (m ≥ 0).
n INTEGER. The number of columns in C (n ≥ 0).
product defines the matrix Q. Constraints:
0 ≤ k ≤ m if side ='L';
0 ≤ k ≤ n if side ='R'.
a, tau, c, work REAL for sgeqrf
DOUBLE PRECISION for dgeqrf.
Arrays:
a(lda,*) and tau(*) are the arrays returned by sgeqrf /
dgeqrf or sgeqpf / dgeqpf. The second dimension of a
must be at least max(1, k). The dimension of tau must be
at least max(1, k).
c(ldc,*) contains the matrix C.
The second dimension of c must be at least max(1, n)
lwork).
lda INTEGER. The first dimension of a. Constraints:
lda ≥ max(1, m) if side = 'L';
lda ≥ max(1, n) if side = 'R'.
ldc INTEGER. The first dimension of c. Constraint:
ldc ≥ max(1, m).
lwork INTEGER. The size of the work array. Constraints:
787
lwork ≥ max(1, n) if side = 'L';

lwork ≥ max(1, m) if side = 'R'.
Output Parameters
c Overwritten by the product Q*C, QT*C, C*Q, or C*QT (as
specified by side and trans).
info INTEGER.

Specific details for the routine ormqr interface are the following:
a Holds the matrix A of size (r,k).
r = m if side = 'L'.
r = n if side = 'R'.
tau Holds the vector of length (k).
trans Must be 'N' or 'T'. The default value is 'N'.
788
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork =
m*blocksize (if side = 'R') where blocksize is a machine-dependent value (typically, 16
to 64) required for optimum performance of the blocked algorithm.
workspace query.
workspace.
The complex counterpart of this routine is ?unmqr.
?ungqr
Generates the complex unitary matrix Q of the QR
factorization formed by ?geqrf.
Syntax
FORTRAN 77:
call cungqr(m, n, k, a, lda, tau, work, lwork, info)
call zungqr(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call ungqr(a, tau [,info])
Description
789
The routine generates the whole or part of m-by-m unitary matrix Q of the QR factorization formed
by the routines cgeqrf/zgeqrf or cgeqpf/zgeqpf. Use this routine after a call to cgeqrf/zge-
qrf or cgeqpf/zgeqpf.
Usually Q is determined from the QR factorization of an m by p matrix A with m ≥ p. To compute

call ?ungqr(m, m, p, a, lda, tau, work, lwork, info)
To compute the leading p columns of Q (which form an orthonormal basis in the space spanned
by the columns of A):
call ?ungqr(m, p, p, a, lda, tau, work, lwork, info)
k
To compute the matrix Q of the QR factorization of the leading k columns of the matrix A:
call ?ungqr(m, m, k, a, lda, tau, work, lwork, info)
k
To compute the leading k columns of Q (which form an orthonormal basis in the space spanned
by the leading k columns of the matrix A):
call ?ungqr(m, k, k, a, lda, tau, work, lwork, info)
Input Parameters
m INTEGER. The order of the unitary matrix Q (m ≥ 0).
n INTEGER. The number of columns of Q to be computed
(0 ≤ n ≤ m).
product defines the matrix Q (0 ≤ k ≤ n).
a, tau, work COMPLEX for cungqr
DOUBLE COMPLEX for zungqr
Arrays: a(lda,*) and tau(*) are the arrays returned by
cgeqrf/zgeqrf or cgeqpf/zgeqpf.
790
Output Parameters
a Overwritten by n leading columns of the m-by-m unitary
matrix Q.
info INTEGER.

Specific details for the routine ungqr interface are the following:
Application Notes
For better performance, try using lwork =n*blocksize, where blocksize is a
algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first
run, or set lwork = -1.
In first case the routine completes the task, though probably not so fast as with a recommended
workspace, and provides the recommended workspace in the first element of the corresponding
array work on exit. Use this value (work(1)) for subsequent runs.
791
If lwork = -1, then the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work). This operation is called a workspace
query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the
routine returns immediately with an error exit and does not provide any information on the
recommended workspace.
The computed Q differs from an exactly unitary matrix by a matrix E such that ||E||2 =
O(ε)*||A||2, where ε is the machine precision.
The total number of floating-point operations is approximately 16*m*n*k - 8*(m + n)*k2
+ (16/3)*k3.
If n = k, the number is approximately (8/3)*n2*(3m - n).
The real counterpart of this routine is ?orgqr.
?unmqr
Multiplies a complex matrix by the unitary matrix
Q of the QR factorization formed by ?geqrf.
Syntax
FORTRAN 77:
call cunmqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call unmqr(a, tau, c [,side] [,trans] [,info])
Description
The routine multiplies a rectangular complex matrix C by Q or QH, where Q is the unitary matrix
Q of the QR factorization formed by the routines cgeqrf/zgeqrf or cgeqpf/zgeqpf.
H H
Q*C, Q *C, C*Q, or C*Q (overwriting the result on C).
792
Input Parameters
If side = 'L', Q or QH is applied to C from the left.
If side = 'R', Q or QH is applied to C from the right.
trans CHARACTER*1. Must be either 'N' or 'C'.
If trans = 'N', the routine multiplies C by Q.
H
If trans = 'C', the routine multiplies C by Q .
0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, c, tau, work COMPLEX for cgeqrf
Arrays:
a(lda,*) and tau(*) are the arrays returned by cgeqrf /
zgeqrf or cgeqpf / zgeqpf.
The second dimension of a must be at least max(1, k).
lda ≥ max(1, m) if side = 'L';
lda ≥ max(1, n) if side = 'R'.
ldc INTEGER. The first dimension of c. Constraint:
ldc ≥ max(1, m).
793

See Application notes for the suggested value of lwork.
Output Parameters
H H
c Overwritten by the product Q*C, Q *C, C*Q, or C*Q (as
info INTEGER.

Specific details for the routine unmqr interface are the following:
Application Notes
794
query.
The real counterpart of this routine is ?ormqr.
?gelqf
Computes the LQ factorization of a general m-by-n
matrix.
Syntax
FORTRAN 77:
call sgelqf(m, n, a, lda, tau, work, lwork, info)
call dgelqf(m, n, a, lda, tau, work, lwork, info)
call cgelqf(m, n, a, lda, tau, work, lwork, info)
call zgelqf(m, n, a, lda, tau, work, lwork, info)
Fortran 95:
call gelqf(a [, tau] [,info])
Description
The routine forms the LQ factorization of a general m-by-n matrix A (see Orthogonal
Factorizations). No pivoting is performed.
795
for details.
Input Parameters
a, work REAL for sgelqf
DOUBLE PRECISION for dgelqf
COMPLEX for cgelqf
DOUBLE COMPLEX for zgelqf.
Arrays:
a(lda,*) contains the matrix A.
lwork INTEGER. The size of the work array; at least max(1, m).
Output Parameters
If m ≤ n, the elements above the diagonal are overwritten
lower triangle is overwritten by the corresponding elements
of the lower triangular matrix L.
796
If m > n, the strictly upper triangular part is overwritten by
lower trapezoidal matrix L.
tau REAL for sgelqf
DOUBLE PRECISION for dgelqf
COMPLEX for cgelqf
DOUBLE COMPLEX for zgelqf.
Array, DIMENSION at least max(1, min(m, n)).
Contains additional information on the matrix Q.
info INTEGER.

Specific details for the routine gelqf interface are the following:
tau Holds the vector of length min(m,n).
Application Notes
For better performance, try using lwork =m*blocksize, where blocksize is a
algorithm.
797
workspace query.
workspace.
||E||2 = O(ε) ||A||2.

3
(4/3)n if m = n,
2
(2/3)n (3m-n) if m > n,
2
(2/3)m (3n-m) if m < n.

To find the minimum-norm solution of an underdetermined least squares problem minimizing
||A*x - b||2 for all columns b of a given matrix B, you can call the following:
?gelqf (this routine) to factorize A = L*Q;
?trsm (a BLAS routine) to solve L*Y = B for Y;
?ormlq to compute X = (Q1)T*Y (for real matrices);
?unmlq to compute X = (Q1)H*Y (for complex matrices).
(The columns of the computed X are the minimum-norm solution vectors x. Here A is an m-by-n
matrix with m < n; Q1 denotes the first m columns of Q).

?orglq (for real matrices)
?unglq (for complex matrices).
798
?orglq
Generates the real orthogonal matrix Q of the LQ
factorization formed by ?gelqf.
Syntax
FORTRAN 77:
call sorglq(m, n, k, a, lda, tau, work, lwork, info)
call dorglq(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call orglq(a, tau [,info])
Description
The routine generates the whole or part of n-by-n orthogonal matrix Q of the LQ factorization
formed by the routines sgelqf/gelqf. Use this routine after a call to sgelqf/dgelqf.
Usually Q is determined from the LQ factorization of an p-by-n matrix A with n ≥ p. To compute

call ?orglq(n, n, p, a, lda, tau, work, lwork, info)
To compute the leading p rows of Q, which form an orthonormal basis in the space spanned by
the rows of A, use:
call ?orglq(p, n, p, a, lda, tau, work, lwork, info)
k
To compute the matrix Q of the LQ factorization of A's leading k rows, use:
call ?orglq(n, n, k, a, lda, tau, work, lwork, info)
k
To compute the leading k rows of Q , which form an orthonormal basis in the space spanned
by A's leading k rows, use:
call ?orgqr(k, n, k, a, lda, tau, work, lwork, info)
Input Parameters
m INTEGER. The number of rows of Q to be computed
(0 ≤ m ≤ n).
799
n INTEGER. The order of the orthogonal matrix Q (n ≥ m).

product defines the matrix Q (0 ≤ k ≤ m).
a, tau, work REAL for sorglq
DOUBLE PRECISION for dorglq
sgelqf/dgelqf.
lwork).
Output Parameters
a Overwritten by m leading rows of the n-by-n orthogonal
matrix Q.
info INTEGER.

Specific details for the routine orglq interface are the following:
800
Application Notes
algorithm.
workspace query.
workspace.
The computed Q differs from an exactly orthogonal matrix by a matrix E such that ||E||2 =
(4/3)*k3.
If m = k, the number is approximately (2/3)*m2*(3n - m).
The complex counterpart of this routine is ?unglq.
801
?ormlq
Q of the LQ factorization formed by ?gelqf.
Syntax
FORTRAN 77:
call sormlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call ormlq(a, tau, c [,side] [,trans] [,info])
Description
T
The routine multiplies a real m-by-n matrix C by Q or Q , where Q is the orthogonal matrix Q of
the LQ factorization formed by the routine sgelqf/dgelqf.
Input Parameters
T
If side = 'L', Q or Q is applied to C from the left.
T
If side = 'R', Q or Q is applied to C from the right.
T
If trans = 'T', the routine multiplies C by Q .
0 ≤ k ≤ m if side = 'L';
802
0 ≤ k ≤ n if side = 'R'.
a, c, tau, work REAL for sormlq
DOUBLE PRECISION for dormlq.
Arrays:
a(lda,*) and tau(*) are arrays returned by ?gelqf.
The second dimension of a must be:
at least max(1, m) if side = 'L';
at least max(1, n) if side = 'R'.
lwork).
lda INTEGER. The first dimension of a; lda ≥ max(1, k).
ldc INTEGER. The first dimension of c; ldc ≥ max(1, m).
Output Parameters
info INTEGER.
803

Specific details for the routine ormlq interface are the following:
a Holds the matrix A of size (k,m).
Application Notes
run or set lwork= -1.
If you set lwork= -1, the routine returns immediately and provides the recommended workspace
query.
workspace.
The complex counterpart of this routine is ?unmlq.
804
?unglq
Generates the complex unitary matrix Q of the LQ
factorization formed by ?gelqf.
Syntax
FORTRAN 77:
call cunglq(m, n, k, a, lda, tau, work, lwork, info)
call zunglq(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call unglq(a, tau [,info])
Description
The routine generates the whole or part of n-by-n unitary matrix Q of the LQ factorization formed
by the routines cgelqf/zgelqf. Use this routine after a call to cgelqf/zgelqf.
Usually Q is determined from the LQ factorization of an p-by-n matrix A with n < p. To compute
call ?unglq(n, n, p, a, lda, tau, work, lwork, info)
To compute the leading p rows of Q, which form an orthonormal basis in the space spanned by
the rows of A, use:
call ?unglq(p, n, p, a, lda, tau, work, lwork, info)
k
To compute the matrix Q of the LQ factorization of the leading k rows of the matrix A, use:
call ?unglq(n, n, k, a, lda, tau, work, lwork, info)
k
To compute the leading k rows of Q , which form an orthonormal basis in the space spanned
by the leading k rows of the matrix A, use:
call ?ungqr(k, n, k, a, lda, tau, work, lwork, info)
Input Parameters
m INTEGER. The number of rows of Q to be computed (0 ≤ m
≤ n).
805
n INTEGER. The order of the unitary matrix Q (n ≥ m).

product defines the matrix Q (0 ≤ k ≤ m).
a, tau, work COMPLEX for cunglq
DOUBLE COMPLEX for zunglq
sgelqf/dgelqf.
Output Parameters
a Overwritten by m leading rows of the n-by-n unitary matrix
Q.
info INTEGER.

Specific details for the routine unglq interface are the following:
806
Application Notes
For better performance, try using lwork = m*blocksize, where blocksize is a
algorithm.
query.
The computed Q differs from an exactly unitary matrix by a matrix E such that ||E||2 =
(16/3)*k3.
If m = k, the number is approximately (8/3)*m2*(3n - m) .
The real counterpart of this routine is ?orglq.
?unmlq
Q of the LQ factorization formed by ?gelqf.
Syntax
FORTRAN 77:
call cunmlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
807
Fortran 95:
call unmlq(a, tau, c [,side] [,trans] [,info])
Description
H
The routine multiplies a real m-by-n matrix C by Q or Q , where Q is the unitary matrix Q of the
LQ factorization formed by the routine cgelqf/zgelqf.
H H
Input Parameters
H
H
If trans = 'C', the routine multiplies C by QH.
0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, c, tau, work COMPLEX for cunmlq
DOUBLE COMPLEX for zunmlq.
Arrays:
a(lda,*) and tau(*) are arrays returned by ?gelqf.
The second dimension of a must be:
at least max(1, m) if side = 'L';
at least max(1, n) if side = 'R'.
808
Output Parameters
H H
info INTEGER.

Specific details for the routine unmlq interface are the following:
809
Application Notes
query.
The real counterpart of this routine is ?ormlq.
?geqlf
Computes the QL factorization of a general m-by-n
matrix.
Syntax
FORTRAN 77:
call sgeqlf(m, n, a, lda, tau, work, lwork, info)
call dgeqlf(m, n, a, lda, tau, work, lwork, info)
call cgeqlf(m, n, a, lda, tau, work, lwork, info)
call zgeqlf(m, n, a, lda, tau, work, lwork, info)
Fortran 95:
call geqlf(a [, tau] [,info])
810
Description
The routine forms the QL factorization of a general m-by-n matrix A. No pivoting is performed.
for details.
Input Parameters
a, work REAL for sgeqlf
DOUBLE PRECISION for dgeqlf
COMPLEX for cgeqlf
DOUBLE COMPLEX for zgeqlf.
Arrays: a(lda,*) contains the matrix A.
lwork INTEGER. The size of the work array; at least max(1, n).
Output Parameters
a Overwritten on exit by the factorization data as follows:
811
if m ≥ n, the lower triangle of the subarray a(m-n+1:m, 1:n)

contains the n-by-n lower triangular matrix L; if m ≤ n, the
elements on and below the (n-m)-th superdiagonal contain
the m-by-n lower trapezoidal matrix L; in both cases, the
remaining elements, with the array tau, represent the
orthogonal/unitary matrix Q as a product of elementary
reflectors.
tau REAL for sgeqlf
DOUBLE PRECISION for dgeqlf
COMPLEX for cgeqlf
DOUBLE COMPLEX for zgeqlf.
Array, DIMENSION at least max(1, min(m, n)). Contains
scalar factors of the elementary reflectors for the matrix Q.
of lwork required for optimum performance.
info INTEGER.

Specific details for the routine geqlf interface are the following:
Application Notes
algorithm.
812
workspace query.
workspace.
Related routines include:
?orgql to generate matrix Q (for real matrices);
?ungql to generate matrix Q (for complex matrices);
?ormql to apply matrix Q (for real matrices);
?unmql to apply matrix Q (for complex matrices).
?orgql
Generates the real matrix Q of the QL factorization
formed by ?geqlf.
Syntax
FORTRAN 77:
call sorgql(m, n, k, a, lda, tau, work, lwork, info)
call dorgql(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call orgql(a, tau [,info])
Description
813
The routine generates an m-by-n real matrix Q with orthonormal columns, which is defined as
the last n columns of a product of k elementary reflectors H(i) of order m: Q = H(k) *...*
H(2)*H(1) as returned by the routines sgeqlf/dgeqlf. Use this routine after a call to sge-
qlf/dgeqlf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m≥ 0).
n INTEGER. The number of columns of the matrix Q (m≥ n≥
0).
product defines the matrix Q (n≥ k≥ 0).
a, tau, work REAL for sorgql
DOUBLE PRECISION for dorgql
Arrays: a(lda,*), tau(*).
On entry, the (n - k + i)th column of a must contain the
vector which defines the elementary reflector H(i), for i =
1,2,...,k, as returned by sgeqlf/dgeqlf in the last k
columns of its array argument a; tau(i) must contain the
scalar factor of the elementary reflector H(i), as returned
by sgeqlf/dgeqlf;
lwork).
Output Parameters
a Overwritten by the m-by-n matrix Q.
814
info INTEGER.

Specific details for the routine orgql interface are the following:
Application Notes
algorithm.
workspace query.
workspace.
The complex counterpart of this routine is ?ungql.
815
?ungql
Generates the complex matrix Q of the QL
factorization formed by ?geqlf.
Syntax
FORTRAN 77:
call cungql(m, n, k, a, lda, tau, work, lwork, info)
call zungql(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call ungql(a, tau [,info])
Description
The routine generates an m-by-n complex matrix Q with orthonormal columns, which is defined
as the last n columns of a product of k elementary reflectors H(i) of order m: Q = H(k) *...*
H(2)*H(1) as returned by the routines cgeqlf/zgeqlf . Use this routine after a call to cge-
qlf/zgeqlf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m≥0).
n INTEGER. The number of columns of the matrix Q (m≥n≥0).
product defines the matrix Q (n≥k≥0).
a, tau, work COMPLEX for cungql
DOUBLE COMPLEX for zungql
Arrays: a(lda,*), tau(*), work(lwork).
On entry, the (n - k + i)th column of a must contain the
vector which defines the elementary reflector H(i), for i =
1,2,...,k, as returned by cgeqlf/zgeqlf in the last k
columns of its array argument a;
816
tau(i) must contain the scalar factor of the
elementaryreflector H(i), as returned by cgeqlf/zgeqlf;
Output Parameters
info INTEGER.

Specific details for the routine ungql interface are the following:
Application Notes
algorithm.
817
query.
The real counterpart of this routine is ?orgql.
?ormql
Q of the QL factorization formed by ?geqlf.
Syntax
FORTRAN 77:
call sormql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call ormql(a, tau, c [,side] [,trans] [,info])
Description
T
The routine multiplies a real m-by-n matrix C by Q or Q , where Q is the orthogonal matrix Q of
the QL factorization formed by the routine sgeqlf/dgeqlf .
Depending on the parameters side and trans, the routine ?ormql can form one of the matrix
products Q*C, QT*C, C*Q, or C*QT (overwriting the result over C).
818
Input Parameters
T
T
T
m INTEGER. The number of rows in the matrix C (m≥ 0).
n INTEGER. The number of columns in C (n≥ 0).
0 ≤k≤m if side = 'L';
0 ≤k≤n if side = 'R'.
a, tau, c, work REAL for sormql
DOUBLE PRECISION for dormql.
Arrays: a(lda,*), tau(*), c(ldc,*).
On entry, the ith column of a must contain the vector which
defines the elementary reflector Hi, for i = 1,2,...,k, as
returned by sgeqlf/dgeqlf in the last k columns of its
array argument a.
tau(i) must contain the scalar factor of the elementary
reflector Hi, as returned by sgeqlf/dgeqlf.
c(ldc,*) contains the m-by-n matrix C.
lwork).
lda INTEGER. The first dimension of a;
if side = 'L', lda≥ max(1, m);
if side = 'R', lda≥ max(1, n).
ldc INTEGER. The first dimension of c; ldc≥ max(1, m).
819
lwork≥ max(1, n) if side = 'L';

lwork≥ max(1, m) if side = 'R'.
Output Parameters
info INTEGER.

Specific details for the routine ormql interface are the following:
820
Application Notes
workspace query.
workspace.
The complex counterpart of this routine is ?unmql.
?unmql
Q of the QL factorization formed by ?geqlf.
Syntax
FORTRAN 77:
call cunmql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call unmql(a, tau, c [,side] [,trans] [,info])
Description
821
H
The routine multiplies a complex m-by-n matrix C by Q or Q , where Q is the unitary matrix Q of
the QL factorization formed by the routine cgeqlf/zgeqlf .
Depending on the parameters side and trans, the routine ?unmql can form one of the matrix
H H
products Q*C, Q *C, C*Q, or C*Q (overwriting the result over C).
Input Parameters
H
H
H
0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, tau, c, work COMPLEX for cunmql
DOUBLE COMPLEX for zunmql.
Arrays: a(lda,*), tau(*), c(ldc,*), work(lwork).
On entry, the i-th column of a must contain the vector
which defines the elementary reflector H(i), for i = 1,2,...,k,
as returned by cgeqlf/zgeqlf in the last k columns of its
array argument a.
reflector H(i), as returned by cgeqlf/zgeqlf.
822
Output Parameters
H H
info INTEGER.

Specific details for the routine unmql interface are the following:
side Must be 'L' or 'L'. The default value is 'L'.
823
Application Notes
query.
The real counterpart of this routine is ?ormql.
?gerqf
Computes the RQ factorization of a general m-by-n
matrix.
Syntax
FORTRAN 77:
call sgerqf(m, n, a, lda, tau, work, lwork, info)
call dgerqf(m, n, a, lda, tau, work, lwork, info)
call cgerqf(m, n, a, lda, tau, work, lwork, info)
call zgerqf(m, n, a, lda, tau, work, lwork, info)
Fortran 95:
call gerqf(a [, tau] [,info])
824
Description
The routine forms the RQ factorization of a general m-by-n matrix A. No pivoting is performed.
for details.
Input Parameters
a, work REAL for sgerqf
DOUBLE PRECISION for dgerqf
COMPLEX for cgerqf
DOUBLE COMPLEX for zgerqf.
Arrays:
a(lda,*) contains the m-by-n matrix A.
lwork INTEGER. The size of the work array;
lwork ≥ max(1, m).
Output Parameters
825
if m ≤ n, the upper triangle of the subarray

a(1:m, n-m+1:n ) contains the m-by-m upper triangular matrix
R;
if m ≥ n, the elements on and above the (m-n)th subdiagonal
contain the m-by-n upper trapezoidal matrix R;
in both cases, the remaining elements, with the array tau,
represent the orthogonal/unitary matrix Q as a product of
min(m,n) elementary reflectors.
tau REAL for sgerqf
DOUBLE PRECISION for dgerqf
COMPLEX for cgerqf
DOUBLE COMPLEX for zgerqf.
Array, DIMENSION at least max (1, min(m, n)).
Contains scalar factors of the elementary reflectors for the
matrix Q.
info INTEGER.

Specific details for the routine gerqf interface are the following:
Application Notes
algorithm.
826
workspace query.
workspace.
?orgrq to generate matrix Q (for real matrices);
?ungrq to generate matrix Q (for complex matrices);
?ormrq to apply matrix Q (for real matrices);
?unmrq to apply matrix Q (for complex matrices).
?orgrq
Generates the real matrix Q of the RQ factorization
formed by ?gerqf.
Syntax
FORTRAN 77:
call sorgrq(m, n, k, a, lda, tau, work, lwork, info)
call dorgrq(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call orgrq(a, tau [,info])
Description
827
The routine generates an m-by-n real matrix Q with orthonormal rows, which is defined as the
last m rows of a product of k elementary reflectors H(i) of order n: Q = H(1)*
H(2)*...*H(k)as returned by the routines sgerqf/dgerqf. Use this routine after a call to
sgerqf/dgerqf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m≥ 0).
n INTEGER. The number of columns of the matrix Q (n≥ m).
product defines the matrix Q (m≥ k≥ 0).
a, tau, work REAL for sorgrq
DOUBLE PRECISION for dorgrq
Arrays: a(lda,*), tau(*).
On entry, the (m - k + i)-th row of a must contain the vector
as returned by sgerqf/dgerqf in the last k rows of its array
argument a;
reflector H(i), as returned by sgerqf/dgerqf;
Output Parameters
828
info INTEGER.

Specific details for the routine orgrq interface are the following:
Application Notes
algorithm.
workspace query.
workspace.
The complex counterpart of this routine is ?ungrq.
829
?ungrq
Generates the complex matrix Q of the RQ
factorization formed by ?gerqf.
Syntax
FORTRAN 77:
call cungrq(m, n, k, a, lda, tau, work, lwork, info)
call zungrq(m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call ungrq(a, tau [,info])
Description
The routine generates an m-by-n complex matrix Q with orthonormal rows, which is defined as
the last m rows of a product of k elementary reflectors H(i) of order n: Q = H(1)H*
H(2)H*...*H(k)H as returned by the routines sgerqf/dgerqf. Use this routine after a call to
sgerqf/dgerqf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m≥0).
n INTEGER. The number of columns of the matrix Q (n≥m ).
product defines the matrix Q (m≥k≥0).
a, tau, work REAL for cungrq
DOUBLE PRECISION for zungrq
Arrays: a(lda,*), tau(*), work(lwork).
On entry, the (m - k + i)th row of a must contain the vector
as returned by sgerqf/dgerqf in the last k rows of its array
argument a;
830
reflector H(i), as returned by sgerqf/dgerqf;
Output Parameters
info INTEGER.

Specific details for the routine ungrq interface are the following:
Application Notes
algorithm.
831
query.
The real counterpart of this routine is ?orgrq.
?ormrq
Q of the RQ factorization formed by ?gerqf.
Syntax
FORTRAN 77:
call sormrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call ormrq(a, tau, c [,side] [,trans] [,info])
Description
T
The routine multiplies a real m-by-n matrix C by Q or Q , where Q is the real orthogonal matrix
defined as a product of k elementary reflectors Hi : Q = H1 H2 ... Hk as returned by the RQ
factorization routine sgerqf/dgerqf .
Q*C, QT*C, C*Q, or C*QT (overwriting the result over C).
832
Input Parameters
T
T
T
0 ≤ k ≤ m, if side = 'L';
0 ≤ k ≤ n, if side = 'R'.
a, tau, c, work REAL for sormrq
DOUBLE PRECISION for dormrq.
On entry, the ith row of a must contain the vector which
defines the elementary reflector Hi, for i = 1,2,...,k, as
returned by sgerqf/dgerqf in the last k rows of its array
argument a.
The second dimension of a must be at least max(1, m) if
side = 'L', and at least max(1, n) if side = 'R'.
reflector Hi, as returned by sgerqf/dgerqf.
833

Output Parameters
info INTEGER.

Specific details for the routine ormrq interface are the following:
Application Notes
834
workspace query.
workspace.
The complex counterpart of this routine is ?unmrq.
?unmrq
Q of the RQ factorization formed by ?gerqf.
Syntax
FORTRAN 77:
call cunmrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call unmrq(a, tau, c [,side] [,trans] [,info])
Description
H
The routine multiplies a complex m-by-n matrix C by Q or Q , where Q is the complex unitary
matrix defined as a product of k elementary reflectors H(i) of order n: Q = H(1)H*
H(2)H*...*H(k)Has returned by the RQ factorization routine cgerqf/zgerqf .
H H
Q*C, Q *C, C*Q, or C*Q (overwriting the result over C).
835
Input Parameters
H
H
H
0 ≤ k ≤ m, if side = 'L';
0 ≤ k ≤ n, if side = 'R'.
a, tau, c, work COMPLEX for cunmrq
DOUBLE COMPLEX for zunmrq.
defines the elementary reflector H(i), for i = 1,2,...,k, as
returned by cgerqf/zgerqf in the last k rows of its array
argument a.
reflector H(i), as returned by cgerqf/zgerqf.
lwork).
lda INTEGER. The first dimension of a; lda ≥ max(1, k) .
836
Output Parameters
H H
info INTEGER.

Specific details for the routine unmrq interface are the following:
Application Notes
837
query.
The real counterpart of this routine is ?ormrq.
?tzrzf
Reduces the upper trapezoidal matrix A to upper
triangular form.
Syntax
FORTRAN 77:
call stzrzf(m, n, a, lda, tau, work, lwork, info)
call dtzrzf(m, n, a, lda, tau, work, lwork, info)
call ctzrzf(m, n, a, lda, tau, work, lwork, info)
call ztzrzf(m, n, a, lda, tau, work, lwork, info)
Fortran 95:
call tzrzf(a [, tau] [,info])
Description
The routine reduces the m-by-n (m ≤ n) real/complex upper trapezoidal matrix A to upper
triangular form by means of orthogonal/unitary transformations. The upper trapezoidal matrix
A is factored as
A = (R 0)*Z,
where Z is an n-by-n orthogonal/unitary matrix and R is an m-by-m upper triangular matrix.
838
See ?larz that applies an elementary reflector returned by ?tzrzf to a general matrix.
Input Parameters
n INTEGER. The number of columns in A (n ≥ m).
a, work REAL for stzrzf
DOUBLE PRECISION for dtzrzf
COMPLEX for ctzrzf
DOUBLE COMPLEX for ztzrzf.
Arrays: a(lda,*), work(lwork).The leading m-by-n upper
trapezoidal part of the array a contains the matrix A to be
factorized.
lwork ≥ max(1, m).
Output Parameters
the leading m-by-m upper triangular part of a contains the
upper triangular matrix R, and elements m +1 to n of the
first m rows of a, with the array tau, represent the
orthogonal matrix Z as a product of m elementary reflectors.
tau REAL for stzrzf
DOUBLE PRECISION for dtzrzf
COMPLEX for ctzrzf
DOUBLE COMPLEX for ztzrzf.
Array, DIMENSION at least max (1, m). Contains scalar
factors of the elementary reflectors for the matrix Z.
839

info INTEGER.

Specific details for the routine tzrzf interface are the following:
tau Holds the vector of length (m).
Application Notes
algorithm.
query.
?ormrz to apply matrix Q (for real matrices)
?unmrz to apply matrix Q (for complex matrices).
840
?ormrz
defined from the factorization formed by ?tzrzf.
Syntax
FORTRAN 77:
call sormrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call dormrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call ormrz(a, tau, c, l [, side] [,trans] [,info])
Description
T
The routine multiplies a real m-by-n matrix C by Q or Q , where Q is the real orthogonal matrix
defined as a product of k elementary reflectors H(i) of order n: Q = H(1)* H(2)*...*H(k)
as returned by the factorization routine stzrzf/dtzrzf .
Q*C, QT*C, C*Q, or C*QT (overwriting the result over C).
The matrix Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
T
T
T
841

0 ≤ k ≤ m, if side = 'L';
0 ≤ k ≤ n, if side = 'R'.
l INTEGER.
The number of columns of the matrix A containing the
meaningful part of the Householder reflectors. Constraints:
0 ≤ l ≤ m, if side = 'L';
0 ≤ l ≤ n, if side = 'R'.
a, tau, c, work REAL for sormrz
DOUBLE PRECISION for dormrz.
returned by stzrzf/dtzrzf in the last k rows of its array
argument a.
reflector H(i), as returned by stzrzf/dtzrzf.
lda INTEGER. The first dimension of a; lda ≥ max(1, k) .
842
Output Parameters
info INTEGER.

Specific details for the routine ormrz interface are the following:
Application Notes
workspace query.
843
workspace.
The complex counterpart of this routine is ?unmrz.
?unmrz
defined from the factorization formed by ?tzrzf.
Syntax
FORTRAN 77:
call cunmrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call zunmrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call unmrz(a, tau, c, l [,side] [,trans] [,info])
Description
H
The routine multiplies a complex m-by-n matrix C by Q or Q , where Q is the unitary matrix
defined as a product of k elementary reflectors H(i):
Q = H(1)H* H(2)H*...*H(k)H as returned by the factorization routine ctzrzf/ztzrzf .

H H
Q*C, Q *C, C*Q, or C*Q (overwriting the result over C).
The matrix Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
H
H
844
H
0 ≤ k ≤ m, if side = 'L';
0 ≤ k ≤ n, if side = 'R'.
l INTEGER.
The number of columns of the matrix A containing the
meaningful part of the Householder reflectors. Constraints:
0 ≤ l ≤ m, if side = 'L';
0 ≤ l ≤ n, if side = 'R'.
a, tau, c, work COMPLEX for cunmrz
DOUBLE COMPLEX for zunmrz.
returned by ctzrzf/ztzrzf in the last k rows of its array
argument a.
reflector H(i), as returned by ctzrzf/ztzrzf.
lwork).
845

Output Parameters
H H
info INTEGER.
If info = -i, the ith parameter had an illegal value.

Specific details for the routine unmrz interface are the following:
Application Notes
846
query.
The real counterpart of this routine is ?ormrz.
?ggqrf
Computes the generalized QR factorization of two
matrices.
Syntax
FORTRAN 77:
call sggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call dggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call cggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call zggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
Fortran 95:
call ggqrf(a, b [,taua] [,taub] [,info])
Description
The routine forms the generalized QR factorization of an n-by-m matrix A and an n-by-p matrix
B as A = Q*R, B = Q*T*Z, where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p
orthogonal/unitary matrix, and R and T assume one of the forms:
847
or
where R11 is upper triangular, and
where T12 or T21 is a p-by-p upper triangular matrix.
In particular, if B is square and nonsingular, the GQR factorization of A and B implicitly gives the
-1
QR factorization of B A as:
B-1*A = ZH*(T-1*R)
Input Parameters
n INTEGER. The number of rows of the matrices A and B (n
≥ 0).
m INTEGER. The number of columns in A (m ≥ 0).
848
p INTEGER. The number of columns in B (p ≥ 0).
a, b, work REAL for sggqrf
DOUBLE PRECISION for dggqrf
COMPLEX for cggqrf
DOUBLE COMPLEX for zggqrf.
Arrays: a(lda,*) contains the matrix A.
The second dimension of a must be at least max(1, m).
b(ldb,*) contains the matrix B.
The second dimension of b must be at least max(1, p).
ldb INTEGER. The first dimension of b; at least max(1, n).
max(1, n, m, p).
Output Parameters
a, b Overwritten by the factorization data as follows:
on exit, the elements on and above the diagonal of the array
a contain the min(n,m)-by-m upper trapezoidal matrix R (R
is upper triangular if n ≥ m);the elements below the
diagonal, with the array taua, represent the
orthogonal/unitary matrix Q as a product of min(n,m)
elementary reflectors ;
if n ≤ p, the upper triangle of the subarray b(1:n, p-n+1:p
) contains the n-by-n upper triangular matrix T;
if n > p, the elements on and above the (n-p)th subdiagonal
contain the n-by-p upper trapezoidal matrix T; the remaining
elements, with the array taub, represent the
orthogonal/unitary matrix Z as a product of elementary
reflectors.
taua, taub REAL for sggqrf
849
DOUBLE PRECISION for dggqrf

COMPLEX for cggqrf
DOUBLE COMPLEX for zggqrf.
Arrays, DIMENSION at least max (1, min(n, m)) for taua and
at least max (1, min(n, p)) for taub. The array taua contains
the scalar factors of the elementary reflectors which
represent the orthogonal/unitary matrix Q.
The array taub contains the scalar factors of the elementary
reflectors which represent the orthogonal/unitary matrix Z.
info INTEGER.

Specific details for the routine ggqrf interface are the following:
a Holds the matrix A of size (n,m).
b Holds the matrix B of size (n,p).
taua Holds the vector of length min(n,m).
taub Holds the vector of length min(n,p).
Application Notes
For better performance, try using lwork ≥ max(n,m, p)*max(nb1,nb2,nb3), where nb1 is
the optimal blocksize for the QR factorization of an n-by-m matrix, nb2 is the optimal blocksize
for the RQ factorization of an n-by-p matrix, and nb3 is the optimal blocksize for a call of ?or-
mqr/?unmqr.
850
workspace query.
workspace.
?ggrqf
Computes the generalized RQ factorization of two
matrices.
Syntax
FORTRAN 77:
call sggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call dggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call cggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call zggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
Fortran 95:
call ggrqf(a, b [,taua] [,taub] [,info])
Description
The routine forms the generalized RQ factorization of an m-by-n matrix A and an p-by-n matrix
B as A = R*Q, B = Z*T*Q, where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p
orthogonal/unitary matrix, and R and T assume one of the forms:
851
or
where R11 or R21 is upper triangular, and
or
where T11 is upper triangular.
In particular, if B is square and nonsingular, the GRQ factorization of A and B implicitly gives the
-1
RQ factorization of A*B as:
A*B-1 = (R*T-1)*ZH
Input Parameters
m INTEGER. The number of rows of the matrix A (m ≥ 0).
852
p INTEGER. The number of rows in B (p ≥ 0).
n INTEGER. The number of columns of the matrices A and B
(n ≥ 0).
a, b, work REAL for sggrqf
DOUBLE PRECISION for dggrqf
COMPLEX for cggrqf
DOUBLE COMPLEX for zggrqf.
Arrays:
b(ldb,*) contains the p-by-n matrix B.
The second dimension of b must be at least max(1, n).
ldb INTEGER. The first dimension of b; at least max(1, p).
max(1, n, m, p).
Output Parameters
a, b Overwritten by the factorization data as follows:
on exit, if m ≤ n, the upper triangle of the subarray a(1:m,
n-m+1:n ) contains the m-by-m upper triangular matrix R;
if m > n, the elements on and above the (m-n)th subdiagonal
contain the m-by-n upper trapezoidal matrix R;
the remaining elements, with the array taua, represent the
reflectors; the elements on and above the diagonal of the
array b contain the min(p,n)-by-n upper trapezoidal matrix
T (T is upper triangular if p ≥ n); the elements below the
853
diagonal, with the array taub, represent the

orthogonal/unitary matrix Z as a product of elementary
reflectors.
taua, taub REAL for sggrqf
DOUBLE PRECISION for dggrqf
COMPLEX for cggrqf
DOUBLE COMPLEX for zggrqf.
Arrays, DIMENSION at least max (1, min(m, n)) for taua and
at least max (1, min(p, n)) for taub.
The array taua contains the scalar factors of the elementary
reflectors which represent the orthogonal/unitary matrix Q.
info INTEGER.

Specific details for the routine ggrqf interface are the following:
b Holds the matrix A of size (p,n).
taua Holds the vector of length min(m,n).
taub Holds the vector of length min(p,n).
Application Notes
For better performance, try using
lwork ≥ max(n,m, p)*max(nb1,nb2,nb3),
854
where nb1 is the optimal blocksize for the RQ factorization of an m-by-n matrix, nb2 is the optimal
blocksize for the QR factorization of an p-by-n matrix, and nb3 is the optimal blocksize for a call
of ?ormrq/?unmrq.
run or set lwork= -1.
If you set lwork= -1, the routine returns immediately and provides the recommended workspace
query.
workspace.

This section describes LAPACK routines for computing the singular value decomposition (SVD)
of a general m-by-n matrix A:
A = UΣVH.
In this decomposition, U and V are unitary (for complex A) or orthogonal (for real A); Σ is an
m-by-n diagonal matrix with real diagonal elements σi:
σ1 < σ2 < ... < σmin(m, n) < 0.
The diagonal elements σi are singular values of A. The first min(m, n) columns of the matrices
U and V are, respectively, left and right singular vectors of A. The singular values and singular
vectors satisfy
Avi = σiui and AHui = σivi

where ui and vi are the i-th columns of U and V, respectively.
To find the SVD of a general matrix A, call the LAPACK routine ?gebrd or ?gbbrd for reducing
A to a bidiagonal matrix B by a unitary (orthogonal) transformation: A = QBPH. Then call ?bdsqr,
which forms the SVD of a bidiagonal matrix: B = U1ΣV1H.
855
Thus, the sought-for SVD of A is given by A = UΣVH =(QU1)Σ(V1HPH).

Table 4-2 lists LAPACK routines (FORTRAN 77 interface) that perform singular value
decomposition of matrices. Respective routine names in Fortran 95 interface are without the
first symbol (see Routine Naming Conventions).
Table 4-2 Computational Routines for Singular Value Decomposition (SVD)
Operation Real matrices Complex matrices
Reduce A to a bidiagonal matrix B: A = ?gebrd ?gebrd

QBPH (full storage)
Reduce A to a bidiagonal matrix B: A = ?gbbrd ?gbbrd

QBPH (band storage)
Generate the orthogonal (unitary) matrix ?orgbr ?ungbr

Q or P
Apply the orthogonal (unitary) matrix Q ?ormbr ?unmbr

or P
Form singular value decomposition of the ?bdsqr ?bdsdc ?bdsqr

H
bidiagonal matrix B: B = UΣV
856
Figure 4-1 Decision Tree: Singular Value Decomposition
Figure 4-1 “Decision Tree: Singular Value Decomposition” presents a decision tree that helps
you choose the right sequence of routines for SVD, depending on whether you need singular
values only or singular vectors as well, whether A is real or complex, and so on.
You can use the SVD to find a minimum-norm solution to a (possibly) rank-deficient least
squares problem of minimizing ||Ax - b||2. The effective rank k of the matrix A can be
determined as the number of singular values which exceed a suitable threshold. The
minimum-norm solution is
x = Vk(Σk)-1c
where Σk is the leading k-by-k submatrix of Σ, the matrix Vk consists of the first k columns of
V = PV1, and the vector c consists of the first k elements of UHb = U1HQHb.
857
?gebrd
Reduces a general matrix to bidiagonal form.
Syntax
FORTRAN 77:
call sgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call dgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call cgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call zgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
Fortran 95:
call gebrd(a [, d] [,e] [,tauq] [,taup] [,info])
Description
The routine reduces a general m-by-n matrix A to a bidiagonal matrix B by an orthogonal (unitary)
transformation.
If m ≥ n, the reduction is given by
where B1 is an n-by-n upper diagonal matrix, Q and P are orthogonal or, for a complex A, unitary
matrices; Q1 consists of the first n columns of Q.
If m < n, the reduction is given by

A = Q*B*PH = Q*(B10)*PH = Q1*B1*P1H,
where B1 is an m-by-m lower diagonal matrix, Q and P are orthogonal or, for a complex A, unitary
matrices; P1 consists of the first m rows of P.
The routine does not form the matrices Q and P explicitly, but represents them as products of
elementary reflectors. Routines are provided to work with the matrices Q and P in this
representation:
858
If the matrix A is real,
• to compute Q and P explicitly, call ?orgbr.

• to multiply a general matrix by Q or P, call ?ormbr.
If the matrix A is complex,
• to compute Q and P explicitly, call ?ungbr.

• to multiply a general matrix by Q or P, call ?unmbr.
Input Parameters
a, work REAL for sgebrd
DOUBLE PRECISION for dgebrd
COMPLEX for cgebrd
DOUBLE COMPLEX for zgebrd.
Arrays:
lwork INTEGER.
The dimension of work; at least max(1, m, n).
Output Parameters
a If m ≥ n, the diagonal and first super-diagonal of a are
overwritten by the upper bidiagonal matrix B. Elements
below the diagonal are overwritten by details of Q, and the
remaining elements are overwritten by details of P.
859
If m < n, the diagonal and first sub-diagonal of a are

overwritten by the lower bidiagonal matrix B. Elements
above the diagonal are overwritten by details of P, and the
remaining elements are overwritten by details of Q.
d REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Contains the diagonal elements of B.
e REAL for single-precision flavors
Array, DIMENSION at least max(1, min(m, n) - 1).
Contains the off-diagonal elements of B.
tauq, taup REAL for sgebrd
COMPLEX for cgebrd
Arrays, DIMENSION at least max (1, min(m, n)). Contain
further details of the matrices Q and P.
info INTEGER.

Specific details for the routine gebrd interface are the following:
d Holds the vector of length min(m,n).
e Holds the vector of length min(m,n)-1.
tauq Holds the vector of length min(m,n).
taup Holds the vector of length min(m,n).
860
Application Notes
For better performance, try using lwork = (m + n)*blocksize, where blocksize is a
algorithm.
workspace query.
workspace.
The computed matrices Q, B, and P satisfy QBPH = A + E, where ||E||2 = c(n)ε ||A||2,
c(n) is a modestly increasing function of n, and ε is the machine precision.
(4/3)*n2*(3*m - n) for m ≥ n,
(4/3)*m2*(3*n - m) for m < n.
The number of operations for complex flavors is four times greater.
If n is much less than m, it can be more efficient to first form the QR factorization of A by calling
?geqrf and then reduce the factor R to bidiagonal form. This requires approximately 2*n2*(m
+ n) floating-point operations.
If m is much less than n, it can be more efficient to first form the LQ factorization of A by calling
?gelqf and then reduce the factor L to bidiagonal form. This requires approximately 2*m2*(m
+ n) floating-point operations.
861
?gbbrd
Reduces a general band matrix to bidiagonal form.
Syntax
FORTRAN 77:
call sgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc,
work, info)
call dgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc,
work, info)
call cgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc,
work, rwork, info)
call zgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc,
work, rwork, info)
Fortran 95:
call gbbrd(ab [, c] [,d] [,e] [,q] [,pt] [,kl] [,m] [,info])
Description
The routine reduces an m-by-n band matrix A to upper bidiagonal matrix B: A = Q*B*PH. Here
the matrices Q and P are orthogonal (for real A) or unitary (for complex A). They are determined
as products of Givens rotation matrices, and may be formed explicitly by the routine if required.
The routine can also update a matrix C as follows: C = QH*C.
Input Parameters
vect CHARACTER*1. Must be 'N' or 'Q' or 'P' or 'B'.
H
If vect = 'N', neither Q nor P is generated.
If vect = 'Q', the routine generates the matrix Q.
H
If vect = 'P', the routine generates the matrix P .
H
If vect = 'B', the routine generates both Q and P .
862
ncc INTEGER. The number of columns in C (ncc ≥ 0).
kl INTEGER. The number of sub-diagonals within the band of
A (kl ≥ 0).
ku INTEGER. The number of super-diagonals within the band
of A (ku ≥ 0).
ab, c, work REAL for sgbbrd
DOUBLE PRECISION for dgbbrd COMPLEX for cgbbrd
DOUBLE COMPLEX for zgbbrd.
Arrays:
ab(ldab,*) contains the matrix A in band storage (see Matrix
Storage Schemes).
c(ldc,*) contains an m-by-ncc matrix C.
If ncc = 0, the array c is not referenced.
The second dimension of c must be at least max(1, ncc).
The dimension of work must be at least 2*max(m, n) for
real flavors, or max(m, n) for complex flavors.
ldab INTEGER. The first dimension of the array ab (ldab ≥ kl
+ ku + 1).
ldq INTEGER. The first dimension of the output array q.
ldq ≥ max(1, m) if vect = 'Q' or 'B', ldq ≥ 1
otherwise.
ldpt INTEGER. The first dimension of the output array pt.
ldpt ≥ max(1, n) if vect = 'P' or 'B', ldpt ≥ 1
otherwise.
ldc INTEGER. The first dimension of the array c.
ldc ≥ max(1, m) if ncc > 0; ldc ≥ 1 if ncc = 0.
rwork REAL for cgbbrd DOUBLE PRECISION for zgbbrd.
A workspace array, DIMENSION at least max(m, n).
Output Parameters
ab Overwritten by values generated during the reduction.
863

Array, DIMENSION at least max(1, min(m, n)). Contains the
diagonal elements of the matrix B.
Array, DIMENSION at least max(1, min(m, n) - 1).
Contains the off-diagonal elements of B.
q, pt REAL for sgebrd
COMPLEX for cgebrd
Arrays:
q(ldq,*) contains the output m-by-m matrix Q.
The second dimension of q must be at least max(1, m).
T
p(ldpt,*) contains the output n-by-n matrix P .
The second dimension of pt must be at least max(1, n).
info INTEGER.

Specific details for the routine gbbrd interface are the following:
c Holds the matrix C of size (m,ncc).
d Holds the vector with the number of elements min(m,n).
e Holds the vector with the number fo elements min(m,n)-1.
q Holds the matrix Q of size (m,m).
pt Holds the matrix PT of size (n,n).
m If omitted, assumed m = n.
864
vect Restored based on the presence of arguments q and pt as follows:
vect = 'B', if both q and pt are present,
vect = 'Q', if q is present and pt omitted, vect = 'P',
if q is omitted and pt present, vect = 'N', if both q and pt are
omitted.
Application Notes
The computed matrices Q, B, and P satisfy Q*B*PH = A + E, where ||E||2 = c(n)ε ||A||2,
If m = n, the total number of floating-point operations for real flavors is approximately the
sum of:
6*n2*(kl + ku) if vect = 'N' and ncc = 0,
3*n2*ncc*(kl + ku - 1)/(kl + ku) if C is updated, and
H
3*n3*(kl + ku - 1)/(kl + ku) if either Q or P is generated (double this if both).
To estimate the number of operations for complex flavors, use the same formulas with the
coefficients 20 and 10 (instead of 6 and 3).
?orgbr
T
Generates the real orthogonal matrix Q or P
determined by ?gebrd.
Syntax
FORTRAN 77:
call sorgbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call dorgbr(vect, m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call orgbr(a, tau [,vect] [,info])
Description
865
The routine generates the whole or part of the orthogonal matrices Q and PT formed by the
routines sgebrd/dgebrd. Use this routine after a call to sgebrd/dgebrd. All valid combinations
of arguments are described in Input parameters. In most cases you need the following:
To compute the whole m-by-m matrix Q:
call ?orgbr('Q', m, m, n, a ... )
(note that the array a must have at least m columns).
To form the n leading columns of Q if m > n:

call ?orgbr('Q', m, n, n, a ... )
T
To compute the whole n-by-n matrix P :
call ?orgbr('P', n, n, m, a ... )
(note that the array a must have at least n rows).
To form the m leading rows of PT if m < n:

call ?orgbr('P', m, n, m, a ... )
Input Parameters
vect CHARACTER*1. Must be 'Q' or 'P'.
T
T
m INTEGER. The number of required rows of Q or P .
T
n INTEGER. The number of required columns of Q or P .
k INTEGER. One of the dimensions of A in ?gebrd:
If vect = 'Q', the number of columns in A;
If vect = 'P', the number of rows in A.
Constraints: m ≥ 0, n ≥ 0, k ≥ 0.
For vect = 'Q': k ≤ n ≤ m if m > k, or m = n if m ≤ k.
For vect = 'P': k ≤ m ≤ n if n > k, or m = n if n ≤ k.
a, work REAL for sorgbr
DOUBLE PRECISION for dorgbr.
Arrays:
a(lda,*) is the array a as returned by ?gebrd.
866
tau REAL for sorgbr
DOUBLE PRECISION for dorgbr.
For vect = 'Q', the array tauq as returned by ?gebrd.
For vect = 'P', the array taup as returned by ?gebrd.
The dimension of tau must be at least max(1, min(m, k))
for vect = 'Q', or max(1, min(m, k)) for vect = 'P'.
Output Parameters
T
a Overwritten by the orthogonal matrix Q or P (or the leading
rows or columns thereof) as specified by vect, m, and n.
info INTEGER.

Specific details for the routine orgbr interface are the following:
tau Holds the vector of length min(m,k) where
k = m, if vect = 'P',
k = n, if vect = 'Q'.
vect Must be 'Q' or 'P'. The default value is 'Q'.
867
Application Notes
For better performance, try using lwork = min(m,n)*blocksize, where blocksize is a
algorithm.
workspace query.
workspace.
The computed matrix Q differs from an exactly orthogonal matrix by a matrix E such that ||E||2
= O(ε).
The approximate numbers of floating-point operations for the cases listed in Description are
as follows:
To form the whole of Q:
(4/3)*n*(3m2 - 3m*n + n2) if m > n;
(4/3)*m3 if m ≤ n.
To form the n leading columns of Q when m > n:
(2/3)*n2*(3m - n2) if m > n.

T
To form the whole of P :
(4/3)*n3 if m ≥ n;
(4/3)*m*(3n2 - 3m*n + m2) if m < n.
T
To form the m leading columns of P when m < n:
(2/3)*n2*(3m - n2) if m > n.
868
The complex counterpart of this routine is ?ungbr.
?ormbr
Multiplies an arbitrary real matrix by the real
T
orthogonal matrix Q or P determined by ?gebrd.
Syntax
FORTRAN 77:
call sormbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork,
info)
call dormbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork,
info)
Fortran 95:
call ormbr(a, tau, c [,vect] [,side] [,trans] [,info])
Description
Given an arbitrary real matrix C, this routine forms one of the matrix products Q*C, QT*C, C*Q,
C*Q,T, P*C, PT*C, C*P, C*PT, where Q and P are orthogonal matrices computed by a call to
sgebrd/dgebrd. The routine overwrites the product on C.
Input Parameters
T
In the descriptions below, r denotes the order of Q or P :
If side = 'L', r = m; if side = 'R', r = n.

T
If vect = 'Q', then Q or Q is applied to C.
If vect = 'P', then P or PT is applied to C.
side CHARACTER*1. Must be 'L' or 'R'.
If side = 'L', multipliers are applied to C from the left.
If side = 'R', they are applied to C from the right.
trans CHARACTER*1. Must be 'N' or 'T'.
If trans = 'N', then Q or P is applied to C.
869
T T
If trans = 'T', then Q or P is applied to C.
m INTEGER. The number of rows in C.
n INTEGER. The number of columns in C.
a, c, work REAL for sormbr
DOUBLE PRECISION for dormbr.
Arrays:
Its second dimension must be at least max(1, min(r,k)) for
vect = 'Q', or max(1, r)) for vect = 'P'.
c(ldc,*) holds the matrix C.
Its second dimension must be at least max(1, n).
lda ≥ max(1, r) if vect = 'Q';
lda ≥ max(1, min(r,k)) if vect = 'P'.
tau REAL for sormbr
DOUBLE PRECISION for dormbr.
Array, DIMENSION at least max (1, min(r, k)).
870
Output Parameters
c Overwritten by the product Q*C, QT*C, C*Q, C*Q,T, P*C,
PT*C, C*P, or C*PT, as specified by vect, side, and trans.
info INTEGER.

Specific details for the routine ormbr interface are the following:
a Holds the matrix A of size (r,min(nq,k)) where
r = nq, if vect = 'Q',
r = min(nq,k), if vect = 'P',
nq = m, if side = 'L',
nq = n, if side = 'R',
tau Holds the vector of length min(nq,k).
Application Notes
For better performance, try using
lwork = n*blocksize for side = 'L', or
lwork = m*blocksize for side = 'R',
871
where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum

performance of the blocked algorithm.
workspace query.
workspace.
The computed product differs from the exact product by a matrix E such that ||E||2 =
O(ε)*||C||2.
The total number of floating-point operations is approximately
2*n*k(2*m - k) if side = 'L' and m ≥ k;
2*m*k(2*n - k) if side = 'R' and n ≥ k;

2*m2*n if side = 'L' and m < k;
2*n2*m if side = 'R' and n < k.
The complex counterpart of this routine is ?unmbr.
872
?ungbr
H
Generates the complex unitary matrix Q or P
determined by ?gebrd.
Syntax
FORTRAN 77:
call cungbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call zungbr(vect, m, n, k, a, lda, tau, work, lwork, info)
Fortran 95:
call ungbr(a, tau [,vect] [,info])
Description
The routine generates the whole or part of the unitary matrices Q and PH formed by the routines
cgebrd/zgebrd. Use this routine after a call to cgebrd/zgebrd. All valid combinations of
arguments are described in Input Parameters;in most cases you need the following:
To compute the whole m-by-m matrix Q, use:
call ?ungbr('Q', m, m, n, a ... )
(note that the array a must have at least m columns).
To form the n leading columns of Q if m > n, use:

call ?ungbr('Q', m, n, n, a ... )
H
To compute the whole n-by-n matrix P , use:
call ?ungbr('P', n, n, m, a ... )
(note that the array a must have at least n rows).
To form the m leading rows of PH if m < n, use:

call ?ungbr('P', m, n, m, a ... )
Input Parameters
873

H
H
m INTEGER. The number of required rows of Q or P .
H
n INTEGER. The number of required columns of Q or P .
For vect = 'Q': k ≤ n ≤ m if m > k, or m = n if m ≤ k.
For vect = 'P': k ≤ m ≤ n if n > k, or m = n if n ≤ k.
a, work COMPLEX for cungbr
DOUBLE COMPLEX for zungbr.
Arrays:
lwork).
tau COMPLEX for cungbr
DOUBLE COMPLEX for zungbr.
The dimension of tau must be at least max(1, min(m, k))
for vect = 'Q', or max(1, min(m, k)) for vect = 'P'.
Constraint: lwork < max(1, min(m, n)).
Output Parameters
T
a Overwritten by the orthogonal matrix Q or P (or the leading
rows or columns thereof) as specified by vect, m, and n.
874
info INTEGER.

Specific details for the routine ungbr interface are the following:
tau Holds the vector of length min(m,k) where
Application Notes
For better performance, try using lwork = min(m,n)*blocksize, where blocksize is a
algorithm.
query.
875
= O(ε).
The approximate numbers of possible floating-point operations are listed below:
To compute the whole matrix Q:
(16/3)n(3m2 - 3m*n + n2) if m > n;
(16/3)m3 if m ≤ n.
To form the n leading columns of Q when m > n:
(8/3)n2(3m - n2).
H
To compute the whole matrix P :
(16/3)n3 if m ≥ n;
(16/3)m(3n2 - 3m*n + m2) if m < n.
H
To form the m leading columns of P when m < n:
(8/3)n2(3m - n2) if m > n.

The real counterpart of this routine is ?orgbr.
?unmbr
Multiplies an arbitrary complex matrix by the
unitary matrix Q or P determined by ?gebrd.
Syntax
FORTRAN 77:
call cunmbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork,
info)
call zunmbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork,
info)
Fortran 95:
call unmbr(a, tau, c [,vect] [,side] [,trans] [,info])
876
Description
H
Given an arbitrary complex matrix C, this routine forms one of the matrix products Q*C, Q *C,
H H H
C*Q, C*Q , P*C, P *C, C*P, or C*P , where Q and P are unitary matrices computed by a call to
cgebrd/zgebrd. The routine overwrites the product on C.
Input Parameters
H
In the descriptions below, r denotes the order of Q or P :

H
If vect = 'Q', then Q or Q is applied to C.
H
If vect = 'P', then P or P is applied to C.
If side = 'L', multipliers are applied to C from the left.
If side = 'R', they are applied to C from the right.
trans CHARACTER*1. Must be 'N' or 'C'.
If trans = 'N', then Q or P is applied to C.
H H
If trans = 'C', then Q or P is applied to C.
m INTEGER. The number of rows in C.
n INTEGER. The number of columns in C.
a, c, work COMPLEX for cunmbr
DOUBLE COMPLEX for zunmbr.
Arrays:
Its second dimension must be at least max(1, min(r,k)) for
vect = 'Q', or max(1, r)) for vect = 'P'.
c(ldc,*) holds the matrix C.
Its second dimension must be at least max(1, n).
877

lwork).
lda ≥ max(1, r) if vect = 'Q';
lda ≥ max(1, min(r,k)) if vect = 'P'.
tau COMPLEX for cunmbr
DOUBLE COMPLEX for zunmbr.
Array, DIMENSION at least max (1, min(r, k)).
lwork ≥ 1 if n=0 or m=0.
For optimum performance lwork ≥ max(1,n*nb) if side
= 'L', and lwork ≥ max(1,m*nb) if side = 'R', where
nb is the optimal blocksize. (nb = 0 if m = 0 or n =
0.)
Output Parameters
H H H
c Overwritten by the product Q*C, Q *C, C*Q, C*Q , P*C, P *C,
H
C*P, or C*P , as specified by vect, side, and trans.
info INTEGER.
878
Specific details for the routine unmbr interface are the following:
a Holds the matrix A of size (r,min(nq,k)) where
r = nq, if vect = 'Q',
r = min(nq,k), if vect = 'P',
nq = m, if side = 'L',
nq = n, if side = 'R',
tau Holds the vector of length min(nq,k).
Application Notes
For better performance, use
lwork = n*blocksize for side = 'L', or
lwork = m*blocksize for side = 'R',
query.
879
O(ε)*||C||2.
The total number of floating-point operations is approximately
8*n*k(2*m - k) if side = 'L' and m ≥ k;
8*m*k(2*n - k) if side = 'R' and n ≥ k;

8*m2*n if side = 'L' and m < k;
8*n2*m if side = 'R' and n < k.
The real counterpart of this routine is ?ormbr.
?bdsqr
Computes the singular value decomposition of a
general matrix that has been reduced to bidiagonal
form.
Syntax
FORTRAN 77:
call sbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info)
call dbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info)
call cbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info)
call zbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info)
Fortran 95:
call rbdsqr(d, e [,vt] [,u] [,c] [,uplo] [,info])
call bdsqr(d, e [,vt] [,u] [,c] [,uplo] [,info])
880
Description
The routine computes the singular values and, optionally, the right and/or left singular vectors
from the Singular Value Decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal
matrix B using the implicit zero-shift QR algorithm. The SVD of B has the form B = Q*S*PH
where S is the diagonal matrix of singular values, Q is an orthogonal matrix of left singular
vectors, and P is an orthogonal matrix of right singular vectors. If left singular vectors are
requested, this subroutine actually returns U *Q instead of Q, and, if right singular vectors are
requested, this subroutine returns PH *VT instead of PH, for given real/complex input matrices
U and VT. When U and VT are the orthogonal/unitary matrices that reduce a general matrix A
to bidiagonal form: A = U*B*VT, as computed by ?gebrd, then
A = (U*Q)*S*(PH*VT)
is the SVD of A. Optionally, the subroutine may also compute QH *C for a given real/complex
input matrix C.
See also ?lasq1, ?lasq2, ?lasq3, ?lasq4, ?lasq5, ?lasq6 used by this routine.
Input Parameters
If uplo = 'U', B is an upper bidiagonal matrix.
If uplo = 'L', B is a lower bidiagonal matrix.
n INTEGER. The order of the matrix B (n ≥ 0).
ncvt INTEGER. The number of columns of the matrix VT, that is,
the number of right singular vectors (ncvt ≥ 0).
Set ncvt = 0 if no right singular vectors are required.
nru INTEGER. The number of rows in U, that is, the number of
left singular vectors (nru ≥ 0).
Set nru = 0 if no left singular vectors are required.
ncc INTEGER. The number of columns in the matrix C used for
computing the product Q *C (ncc ≥ 0). Set ncc = 0 if no
H
matrix C is supplied.
d, e, work REAL for single-precision flavors
881
Arrays:
d(*) contains the diagonal elements of B.
The dimension of d must be at least max(1, n).
e(*) contains the (n-1) off-diagonal elements of B.
The dimension of e must be at least max(1, n). e(n) is
used for workspace.
The dimension of work must be at least max(1, 2*n) if
ncvt = nru = ncc = 0;
max(1, 4*(n-1)) otherwise.
vt, u, c REAL for sbdsqr
DOUBLE PRECISION for dbdsqr
COMPLEX for cbdsqr
DOUBLE COMPLEX for zbdsqr.
Arrays:
vt(ldvt,*) contains an n-by-ncvt matrix VT.
The second dimension of vt must be at least max(1, ncvt).
vt is not referenced if ncvt = 0.
u(ldu,*) contains an nru by n unit matrix U.
The second dimension of u must be at least max(1, n).
u is not referenced if nru = 0.
c(ldc,*) contains the matrix C for computing the product
QH*C.
The second dimension of c must be at least max(1, ncc).
The array is not referenced if ncc = 0.
ldvt INTEGER. The first dimension of vt. Constraints:
ldvt ≥ max(1, n) if ncvt > 0;
ldvt ≥ 1 if ncvt = 0.
ldu INTEGER. The first dimension of u. Constraint:
ldu ≥ max(1, nru).
ldc INTEGER. The first dimension of c. Constraints:
ldc ≥ max(1, n) if ncc > 0;ldc ≥ 1 otherwise.
882
Output Parameters
d On exit, if info = 0, overwritten by the singular values in
decreasing order (see info).
e On exit, if info = 0, e is destroyed. See also info below.
H
c Overwritten by the product Q *C.
H
vt On exit, this array is overwritten by P *VT.
u On exit, this array is overwritten by U *Q .
info INTEGER.
If info = i, the algorithm failed to converge; i specifies
how many off-diagonals did not converge.
In this case, d and e contain on exit the diagonal and
off-diagonal elements, respectively, of a bidiagonal matrix
orthogonally equivalent to B.

Specific details for the routine bdsqr interface are the following:
d Holds the vector of length (n).
e Holds the vector of length (n).
vt Holds the matrix VT of size (n, ncvt).
u Holds the matrix U of size (nru,n).
c Holds the matrix C of size (n,ncc).
ncvt If argument vt is present, then ncvt is equal to the number of columns
in matrix VT; otherwise, ncvt is set to zero.
nru If argument u is present, then nru is equal to the number of rows in
matrix U; otherwise, nru is set to zero.
ncc If argument c is present, then ncc is equal to the number of columns
in matrix C; otherwise, ncc is set to zero.
883
Note that two variants of Fortran 95 interface for bdsqr routine are needed because of an
ambiguous choice between real and complex cases appear when vt, u, and c are omitted. Thus,
the name rbdsqr is used in real cases (single or double precision), and the name bdsqr is
used in complex cases (single or double precision).
Application Notes
Each singular value and singular vector is computed to high relative accuracy. However, the
reduction to bidiagonal form (prior to calling the routine) may decrease the relative accuracy
in the small singular values of the original matrix if its singular values vary widely in magnitude.
If si is an exact singular value of B, and si is the corresponding computed value, then
|si - σi| ≤ p*(m,n)*ε*σi
where p(m, n) is a modestly increasing function of m and n, and ∑ is the machine precision.
If only singular values are computed, they are computed more accurately than when some
singular vectors are also computed (that is, the function p(m, n) is smaller).
If ui is the corresponding exact left singular vector of B, and wi is the corresponding computed
left singular vector, then the angle θ(ui, wi) between them is bounded as follows:
θ(ui, wi) ≤ p(m,n)*ε / min i≠j(|σi - σj|/|σi + σj|).
Here mini≠j(|σi - σj|/|σi + σj|) is the relative gap between σi and the other singular
values. A similar error bound holds for the right singular vectors.
2
The total number of real floating-point operations is roughly proportional to n if only the singular
values are computed. About 6n2*nru additional operations (12n2*nru for complex flavors) are
required to compute the left singular vectors and about 6n2*ncvt operations (12n2*ncvt for
complex flavors) to compute the right singular vectors.
884
?bdsdc
real bidiagonal matrix using a divide and conquer
method.
Syntax
FORTRAN 77:
call sbdsdc(uplo, compq, n, d, e, u, ldu, vt, ldvt, q, iq, work, iwork, info)
call dbdsdc(uplo, compq, n, d, e, u, ldu, vt, ldvt, q, iq, work, iwork, info)
Fortran 95:
call bdsdc(d, e [,u] [,vt] [,q] [,iq] [,uplo] [,info])
Description
The routine computes the Singular Value Decomposition (SVD) of a real n-by-n (upper or lower)
bidiagonal matrix B: B = U*Σ*VT, using a divide and conquer method, where Σ is a diagonal
matrix with non-negative diagonal elements (the singular values of B), and U and V are
orthogonal matrices of left and right singular vectors, respectively. ?bdsdc can be used to
compute all singular values, and optionally, singular vectors or singular vectors in compact
form.
This rotuine uses ?lasd0, ?lasd1, ?lasd2, ?lasd3, ?lasd4, ?lasd5, ?lasd6, ?lasd7, ?lasd8,
?lasd9, ?lasda, ?lasdq, ?lasdt.
Input Parameters
If uplo = 'U', B is an upper bidiagonal matrix.
If uplo = 'L', B is a lower bidiagonal matrix.
compq CHARACTER*1. Must be 'N', 'P', or 'I'.
If compq = 'N', compute singular values only.
If compq = 'P', compute singular values and compute
singular vectors in compact form.
885
If compq = 'I', compute singular values and singular

vectors.
d, e, work REAL for sbdsdc
DOUBLE PRECISION for dbdsdc.
Arrays:
d(*) contains the n diagonal elements of the bidiagonal
matrix B.
e(*) contains the off-diagonal elements of the bidiagonal
matrix B.
The dimension of e must be at least max(1, n).
The dimension of work must be at least:
max(1, 4*n), if compq = 'N';
max(1, 6*n), if compq = 'P';
2
max(1, 3*n +4*n), if compq = 'I'.
ldu INTEGER. The first dimension of the output array u; ldu ≥
1.
If singular vectors are desired, then ldu ≥ max(1, n).
ldvt INTEGER. The first dimension of the output array vt; ldvt
≥ 1.
If singular vectors are desired, then ldvt ≥ max(1, n).
iwork INTEGER. Workspace array, dimension at least max(1, 8*n).
Output Parameters
d If info = 0, overwritten by the singular values of B.
e On exit, e is overwritten.
u, vt, q REAL for sbdsdc
DOUBLE PRECISION for dbdsdc.
Arrays: u(ldu,*), vt(ldvt,*), q(*).
If compq = 'I', then on exit u contains the left singular
vectors of the bidiagonal matrix B, unless info ≠ 0
(seeinfo). For other values of compq, u is not referenced.
886
The second dimension of u must be at least max(1,n).
if compq = 'I', then on exit vt contains the right singular
vectors of the bidiagonal matrix B, unless info ≠ 0
(seeinfo). For other values of compq, vt is not referenced.
The second dimension of vt must be at least max(1,n).
If compq = 'P', then on exit, if info = 0, q and iq contain
the left and right singular vectors in a compact form.
Specifically, q contains all the REAL (for sbdsdc) or DOUBLE
PRECISION (for dbdsdc) data for singular vectors. For other
values of compq, q is not referenced. See Application notes
for details.
iq INTEGER.
Array: iq(*).
If compq = 'P', then on exit, if info = 0, q and iq contain
the left and right singular vectors in a compact form.
Specifically, iq contains all the INTEGER data for singular
vectors. For other values of compq, iq is not referenced.
See Application notes for details.
info INTEGER.
If info = i, the algorithm failed to compute a singular
value. The update process of divide and conquer failed.

Specific details for the routine bdsdc interface are the following:
e Holds the vector of length n.
u Holds the matrix U of size (n,n).
vt Holds the matrix VT of size (n,n).
q Holds the vector of length (ldq), where
887
ldq ≥ n*(11 + 2*smlsiz + 8*int(log_2(n/(smlsiz + 1))))

and smlsiz is returned by ilaenv and is equal to the maximum size
of the subproblems at the bottom of the computation tree (usually
about 25).
compq Restored based on the presence of arguments u, vt, q, and iq as
follows:
compq = 'N', if none of u, vt, q, and iq are present,
compq = 'I', if both u and vt are present. Arguments u and vt must
either be both present or both omitted,
compq = 'P', if both q and iq are present. Arguments q and iq must
either be both present or both omitted.
Note that there will be an error condition if all of u, vt, q, and iq
arguments are present simultaneously.
See Also
• Singular Value Decomposition
• ?lasd0
• ?lasd1
• ?lasd2
• ?lasd3
• ?lasd4
• ?lasd5
• ?lasd6
• ?lasd7
• ?lasd8
• ?lasd9
• ?lasda
• ?lasdq
• ?lasdt
Symmetric Eigenvalue Problems

Symmetric eigenvalue problems are posed as follows: given an n-by-n real symmetric or
complex Hermitian matrix A, find the eigenvalues λ and the corresponding eigenvectors z that
satisfy the equation
888
Az = λz (or, equivalently, zHA = λzH).
In such eigenvalue problems, all n eigenvalues are real not only for real symmetric but also for
complex Hermitian matrices A, and there exists an orthonormal system of n eigenvectors. If A
is a symmetric or Hermitian positive-definite matrix, all eigenvalues are positive.
To solve a symmetric eigenvalue problem with LAPACK, you usually need to reduce the matrix
to tridiagonal form and then solve the eigenvalue problem with the tridiagonal matrix obtained.
LAPACK includes routines for reducing the matrix to a tridiagonal form by an orthogonal (or
unitary) similarity transformation A = QTQH as well as for solving tridiagonal symmetric
eigenvalue problems. These routines (for FORTRAN 77 interface) are listed in Table 4-3 .
Respective routine names in Fortran 95 interface are without the first symbol (see Routine
Naming Conventions).
There are different routines for symmetric eigenvalue problems, depending on whether you
need all eigenvectors or only some of them or eigenvalues only, whether the matrix A is
positive-definite or not, and so on.
These routines are based on three primary algorithms for computing eigenvalues and
eigenvectors of symmetric problems: the divide and conquer algorithm, the QR algorithm, and
bisection followed by inverse iteration. The divide and conquer algorithm is generally more
efficient and is recommended for computing all eigenvalues and eigenvectors. Furthermore, to
solve an eigenvalue problem using the divide and conquer algorithm, you need to call only one
routine. In general, more than one routine has to be called if the QR algorithm or bisection
followed by inverse iteration is used.
Decision tree in Figure 4-2 will help you choose the right routine or sequence of routines for
eigenvalue problems with real symmetric matrices. A similar decision tree for complex Hermitian
matrices is presented in Figure 4-3 .
889
Figure 4-2 Decision Tree: Real Symmetric Eigenvalue Problems
890
Figure 4-3 Decision Tree: Complex Hermitian Eigenvalue Problems
Table 4-3 Computational Routines for Solving Symmetric Eigenvalue Problems

Operation Real symmetric matrices Complex Hermitian
matrices
Reduce to tridiagonal form A = QTQH ?sytrd ?syrdb ?hetrd ?herdb

(full storage)
Reduce to tridiagonal form A = QTQH ?sptrd ?hptrd

(packed storage)
891
Operation Real symmetric matrices Complex Hermitian

matrices
Reduce to tridiagonal form A = QTQH ?sbtrd ?hbtrd

(band storage).
Generate matrix Q (full storage) ?orgtr ?ungtr
Generate matrix Q (packed storage) ?opgtr ?upgtr
Apply matrix Q (full storage) ?ormtr ?unmtr
Apply matrix Q (packed storage) ?opmtr ?upmtr
Find all eigenvalues of a tridiagonal ?sterf

matrix T
Find all eigenvalues and eigenvectors ?steqr ?stedc ?steqr ?stedc

of a tridiagonal matrix T
Find all eigenvalues and eigenvectors ?pteqr ?pteqr

of a tridiagonal positive-definite
matrix T.
Find selected eigenvalues of a ?stebz ?stegr ?stegr

tridiagonal matrix T
Find selected eigenvectors of a ?stein ?stegr ?stein ?stegr

Find selected eigenvalues and ?stemr ?stemr

eigenvectors of f a real symmetric
Compute the reciprocal condition ?disna ?disna

numbers for the eigenvectors
892
?sytrd
Reduces a real symmetric matrix to tridiagonal
form.
Syntax
FORTRAN 77:
call ssytrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call dsytrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
Fortran 95:
call sytrd(a, tau [,uplo] [,info])
Description
The routine reduces a real symmetric matrix A to symmetric tridiagonal form T by an orthogonal
similarity transformation: A = Q*T*QT. The orthogonal matrix Q is not formed explicitly but is
represented as a product of n-1 elementary reflectors. Routines are provided for working with
Q in this representation (see Application Notes below).
This routine calls ?latrd to reduce a real symmetric matrix to tridiagonal form by an orthogonal
similarity transformation.
Input Parameters
If uplo = 'U', a stores the upper triangular part of A.
If uplo = 'L', a stores the lower triangular part of A.
n INTEGER. The order of the matrix A (n ≥ 0).
a, work REAL for ssytrd
DOUBLE PRECISION for dsytrd.
a(lda,*) is an array containing either upper or lower
triangular part of the matrix A, as specified by uplo.
893

Output Parameters
a Overwritten by the tridiagonal matrix T and details of the
orthogonal matrix Q, as specified by uplo.
d, e, tau REAL for ssytrd
DOUBLE PRECISION for dsytrd.
Arrays:
d(*) contains the diagonal elements of the matrix T.
e(*) contains the off-diagonal elements of T.
The dimension of e must be at least max(1, n-1).
tau(*) stores further details of the orthogonal matrix Q in
the first n-1 elements. tau(n) is used as workspace.
The dimension of tau must be at least max(1, n).
work(1) If info=0, on exit work(1) contains the minimum value of
lwork required for optimum performance. Use this lwork
info INTEGER.

Specific details for the routine sytrd interface are the following:
tau Holds the vector of length (n-1).
894
Application Notes
algorithm.
query.
The computed matrix T is exactly similar to a matrix A+E, where ||E||2 = c(n)*ε*||A||2,
3
The approximate number of floating-point operations is (4/3)n .
After calling this routine, you can call the following:

?orgtr to form the computed matrix Q explicitly
?ormtr to multiply a real matrix by Q.
The complex counterpart of this routine is ?hetrd.
895
?syrdb
form with Successive Bandwidth Reduction
approach.
Syntax
FORTRAN 77:
call ssyrdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
call dsyrdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
Description
The routine reduces a real symmetric matrix A to symmetric tridiagonal form T by an orthogonal
similarity transformation: A = Q*T*QT and optionally multiplies matrix Z by Q, or simply forms
Q.
This routine reduces a full symmetric matrix to the banded symmetric form, and then to the
tridiagonal symmetric form with a Successive Bandwidth Reduction approach after Prof.
C.Bischof's works (see for instance, [Bischof92]). ?syrdb is functionally close to ?sytrd routine
but the tridiagonal form may differ from those obtained by ?sytrd. Unlike ?sytrd, the
orthogonal matrix Q cannot be restored from the details of matrix A on exit.
Input Parameters
jobz CHARACTER*1. Must be 'N' or 'V'.
If jobz = 'N', then only A is reduced to T.
If jobz = 'V', then A is reduced to T and A contains Q on
exit.
If jobz = 'U', then A is reduced to T and Z contains Z*Q
on exit.
896
kd INTEGER. The bandwidth of the banded matrix B (kd ≥ 1).
a,z, work REAL for ssyrdb.
DOUBLE PRECISION for dsyrdb.
z(ldz,*), the second dimension of z must be at least max(1,
n).
If jobz = 'U', then the matrix z is multiplied by Q.
If jobz = 'N' or 'V', then z is not referenced.
work(lwork) is a workspace array.
ldz INTEGER. The first dimension of z; at least max(1, n). Not
referenced if jobz = 'N'
lwork INTEGER. The size of the work array (lwork ≥
(2kd+1)n+kd).
Output Parameters
a If jobz = 'V', then overwritten by Q matrix.
If jobz = 'N' or 'U', then overwritten by the banded
matrix B and details of the orthogonal matrix QB to reduce
A to B as specified by uplo.
z On exit,
if jobz = 'U', then the matrix z is overwritten by Z*Q.
d, e, tau DOUBLE PRECISION.
Arrays:
897

tau(*) stores further details of the orthogonal matrix Q.
The dimension of tau must be at least max(1, n-kd-1).
info INTEGER.
Application Notes
For better performance, try using lwork = n*(3*kd+3).
query.
For better performance, try using kd equal to 40 if n ≤ 2000 and 64 otherwise.
Try using ?syrdb instead of ?sytrd on large matrices obtaining only eigenvalues - when no
eigenvectors are needed, especially in multi-threaded environment. ?syrdb becomes faster
beginning approximately with n = 1000, and much faster at larger matrices with a better
scalability than ?sytrd.
Avoid applying ?syrdb for computing eigenvectors due to the two-step reduction, that is, the
number of operations needed to apply orthogonal transformations to Z is doubled compared
to the traditional one-step reduction. In that case it is better to apply ?sytrd and
?ormtr/?orgtr to obtain tridiagonal form along with the orthogonal transformation matrix Q.
898
?herdb
Reduces a complex Hermitian matrix to tridiagonal
form with Successive Bandwidth Reduction
approach.
Syntax
FORTRAN 77:
call cherdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
call zherdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
Description
The routine reduces a complex Hermitian matrix A to symmetric tridiagonal form T by a unitary
similarity transformation: A = Q*T*QT and optionally multiplies matrix Z by Q, or simply forms
Q.
This routine reduces a full Hermitian matrix to the banded Hermitian form, and then to the
tridiagonal symmetric form with a Successive Bandwidth Reduction approach after Prof.
C.Bischof's works (see for instance, [Bischof92]). ?herdb is functionally close to ?hetrd routine
but the tridiagonal form may differ from those obtained by ?hetrd. Unlike ?hetrd, the
orthogonal matrix Q cannot be restored from the details of matrix A on exit.
Input Parameters
If jobz = 'N', then only A is reduced to T.
If jobz = 'V', then A is reduced to T and A contains Q on
exit.
If jobz = 'U', then A is reduced to T and Z contains Z*Q
on exit.
899
kd INTEGER. The bandwidth of the banded matrix B (kd ≥ 1).

a,z, work COMPLEX for cherdb.
DOUBLE COMPLEX for zherdb.
z(ldz,*), the second dimension of z must be at least max(1,
n).
If jobz = 'U', then the matrix z is multiplied by Q.
ldz INTEGER. The first dimension of z; at least max(1, n). Not
referenced if jobz = 'N'
lwork INTEGER. The size of the work array (lwork ≥
(2kd+1)n+kd).
Output Parameters
a If jobz = 'V', then overwritten by Q matrix.
If jobz = 'N' or 'U', then overwritten by the banded
matrix B and details of the unitary matrix QB to reduce A to
B as specified by uplo.
z On exit,
if jobz = 'U', then the matrix z is overwritten by Z*Q .
d, e COMPLEX for cherdb.
Arrays:
900
tau(*) stores further details of the orthogonal matrix Q.
The dimension of tau must be at least max(1, n-kd-1).
tau COMPLEX for cherdb.
Array, DIMENSION at least max(1, n-1)
Stores further details of the unitary matrix QB. The dimension
of tau must be at least max(1, n-kd-1).
info INTEGER.
Application Notes
For better performance, try using lwork = n*(3*kd+3).
query.
For better performance, try using kd equal to 40 if n ≤ 2000 and 64 otherwise.
Try using ?herdb instead of ?hetrd on large matrices obtaining only eigenvalues - when no
eigenvectors are needed, especially in multi-threaded environment. ?herdb becomes faster
beginning approximately with n = 1000, and much faster at larger matrices with a better
scalability than ?hetrd.
901
Avoid applying ?herdb for computing eigenvectors due to the two-step reduction, that is, the
number of operations needed to apply orthogonal transformations to Z is doubled compared
to the traditional one-step reduction. In that case it is better to apply ?hetrd and ?un-
mtr/?ungtr to obtain tridiagonal form along with the unitary transformation matrix Q.
?orgtr
Generates the real orthogonal matrix Q determined
by ?sytrd.
Syntax
FORTRAN 77:
call sorgtr(uplo, n, a, lda, tau, work, lwork, info)
call dorgtr(uplo, n, a, lda, tau, work, lwork, info)
Fortran 95:
call orgtr(a, tau [,uplo] [,info])
Description
The routine explicitly generates the n-by-n orthogonal matrix Q formed by ?sytrd when reducing
a real symmetric matrix A to tridiagonal form: A = Q*T*QT. Use this routine after a call to
?sytrd.
Input Parameters
Use the same uplo as supplied to ?sytrd.
n INTEGER. The order of the matrix Q (n ≥ 0).
a, tau, work REAL for sorgtr
DOUBLE PRECISION for dorgtr.
Arrays:
a(lda,*) is the array a as returned by ?sytrd.
tau(*) is the array tau as returned by ?sytrd.
902
The dimension of tau must be at least max(1, n-1).
lwork).
Output Parameters
a Overwritten by the orthogonal matrix Q.
info INTEGER.

Specific details for the routine orgtr interface are the following:
Application Notes
For better performance, try using lwork = (n-1)*blocksize, where blocksize is a
algorithm.
903
workspace query.
workspace.
= O(ε), where ε is the machine precision.
3
The approximate number of floating-point operations is (4/3)n .
The complex counterpart of this routine is ?ungtr.
?ormtr
Multiplies a real matrix by the real orthogonal
matrix Q determined by ?sytrd.
Syntax
FORTRAN 77:
call sormtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call dormtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call ormtr(a, tau, c [,side] [,uplo] [,trans] [,info])
Description
904
T
The routine multiplies a real matrix C by Q or Q , where Q is the orthogonal matrix Q formed by
?sytrd when reducing a real symmetric matrix A to tridiagonal form: A = Q*T*QT. Use this
routine after a call to ?sytrd.
Input Parameters
In the descriptions below, r denotes the order of Q:

T
T
Use the same uplo as supplied to ?sytrd.
T
a, c, tau, work REAL for sormtr
DOUBLE PRECISION for dormtr
a(lda,*) and tau are the arrays returned by ?sytrd.
The second dimension of a must be at least max(1, r).
The dimension of tau must be at least max(1, r-1).
lwork).
lda INTEGER. The first dimension of a; lda ≥ max(1, r).
ldc INTEGER. The first dimension of c; ldc ≥ max(1, n).
905

Output Parameters
info INTEGER.

Specific details for the routine ormtr interface are the following:
a Holds the matrix A of size (r,r).
tau Holds the vector of length (r-1).
906
Application Notes
For better performance, try using lwork = n*blocksize for side = 'L', or lwork =
m*blocksize for side = 'R', where blocksize is a machine-dependent value (typically, 16
workspace query.
workspace.
O(ε)*||C||2.
The total number of floating-point operations is approximately 2*m2*n, if side = 'L', or
2*n2*m, if side = 'R'.
The complex counterpart of this routine is ?unmtr.
?hetrd
form.
Syntax
FORTRAN 77:
call chetrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call zhetrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
907
Fortran 95:
call hetrd(a, tau [,uplo] [,info])
Description
The routine reduces a complex Hermitian matrix A to symmetric tridiagonal form T by a unitary
similarity transformation: A = Q*T*QH. The unitary matrix Q is not formed explicitly but is
represented as a product of n-1 elementary reflectors. Routines are provided to work with Q in
this representation. (They are described later in this section .)
This routine calls ?latrd to reduce a complex Hermitian matrix A to Hermitian tridiagonal form
by a unitary similarity transformation.
Input Parameters
a, work COMPLEX for chetrd
DOUBLE COMPLEX for zhetrd.
908
Output Parameters
a Overwritten by the tridiagonal matrix T and details of the
unitary matrix Q, as specified by uplo.
d, e REAL for chetrd
DOUBLE PRECISION for zhetrd.
Arrays:
tau COMPLEX for chetrd DOUBLE COMPLEX for zhetrd.
Array, DIMENSION at least max(1, n-1). Stores further
details of the unitary matrix Q.
info INTEGER.

Specific details for the routine hetrd interface are the following:
909
Application Notes
algorithm.
workspace query.
workspace.
The computed matrix T is exactly similar to a matrix A + E, where ||E||2 = c(n)*ε*||A||2,

The approximate number of floating-point operations is (16/3)n3.

?ungtr to form the computed matrix Q explicitly
?unmtr to multiply a complex matrix by Q.
The real counterpart of this routine is ?sytrd.
910
?ungtr
Generates the complex unitary matrix Q
determined by ?hetrd.
Syntax
FORTRAN 77:
call cungtr(uplo, n, a, lda, tau, work, lwork, info)
call zungtr(uplo, n, a, lda, tau, work, lwork, info)
Fortran 95:
call ungtr(a, tau [,uplo] [,info])
Description
The routine explicitly generates the n-by-n unitary matrix Q formed by ?hetrd when reducing
a complex Hermitian matrix A to tridiagonal form: A = Q*T*QH. Use this routine after a call to
?hetrd.
Input Parameters
Use the same uplo as supplied to ?hetrd.
a, tau, work COMPLEX for cungtr
DOUBLE COMPLEX for zungtr.
Arrays:
a(lda,*) is the array a as returned by ?hetrd.
tau(*) is the array tau as returned by ?hetrd.
911

Output Parameters
a Overwritten by the unitary matrix Q.
info INTEGER.

Specific details for the routine ungtr interface are the following:
Application Notes
For better performance, try using lwork = (n-1)*blocksize, where blocksize is a
algorithm.
912
query.
The computed matrix Q differs from an exactly unitary matrix by a matrix E such that ||E||2
The real counterpart of this routine is ?orgtr.
?unmtr
Multiplies a complex matrix by the complex unitary
matrix Q determined by ?hetrd.
Syntax
FORTRAN 77:
call cunmtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call zunmtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
Fortran 95:
call unmtr(a, tau, c [,side] [,uplo] [,trans] [,info])
Description
H
The routine multiplies a complex matrix C by Q or Q , where Q is the unitary matrix Q formed
by ?hetrd when reducing a complex Hermitian matrix A to tridiagonal form: A = Q*T*QH. Use
this routine after a call to ?hetrd.
H H
913
Input Parameters

H
H
Use the same uplo as supplied to ?hetrd.
H
a, c, tau, work COMPLEX for cunmtr
DOUBLE COMPLEX for zunmtr.
a(lda,*) and tau are the arrays returned by ?hetrd.
The second dimension of a must be at least max(1, r).
lwork).
lda INTEGER. The first dimension of a; lda ≥ max(1, r).
914
Output Parameters
H H
info INTEGER.

Specific details for the routine unmtr interface are the following:
Application Notes
For better performance, try using lwork = n*blocksize (for side = 'L') or lwork =
m*blocksize (for side = 'R') where blocksize is a machine-dependent value (typically,
16 to 64) required for optimum performance of the blocked algorithm.
915
query.
O(ε)*||C||2, where ε is the machine precision.
The total number of floating-point operations is approximately 8*m2*n if side = 'L' or 8*n2*m
if side = 'R'.
The real counterpart of this routine is ?ormtr.
?sptrd
form using packed storage.
Syntax
FORTRAN 77:
call ssptrd(uplo, n, ap, d, e, tau, info)
call dsptrd(uplo, n, ap, d, e, tau, info)
Fortran 95:
call sptrd(ap, tau [,uplo] [,info])
Description
The routine reduces a packed real symmetric matrix A to symmetric tridiagonal form T by an
orthogonal similarity transformation: A = Q*T*QT. The orthogonal matrix Q is not formed
explicitly but is represented as a product of n-1 elementary reflectors. Routines are provided
for working with Q in this representation (see Application Notes below).
916
Input Parameters
If uplo = 'U', ap stores the packed upper triangle of A.
If uplo = 'L', ap stores the packed lower triangle of A.
ap REAL for ssptrd
DOUBLE PRECISION for dsptrd.
Array, DIMENSION at least max(1, n(n+1)/2). Contains either
upper or lower triangle of A (as specified by uplo) in packed
form.
Output Parameters
ap Overwritten by the tridiagonal matrix T and details of the
d, e, tau REAL for ssptrd
DOUBLE PRECISION for dsptrd.
Arrays:
tau(*) stores further details of the matrix Q.
info INTEGER.

Specific details for the routine sptrd interface are the following:
tau Holds the vector with the number of elements n-1.
917
d Holds the vector with the number of elements n.

e Holds the vector with the number of elements n-1.
Application Notes
c(n) is a modestly increasing function of n, and ε is the machine precision. The approximate
number of floating-point operations is (4/3)n3.

?opgtr to form the computed matrix Q explicitly
?opmtr to multiply a real matrix by Q.
The complex counterpart of this routine is ?hptrd.
?opgtr
by ?sptrd.
Syntax
FORTRAN 77:
call sopgtr(uplo, n, ap, tau, q, ldq, work, info)
call dopgtr(uplo, n, ap, tau, q, ldq, work, info)
Fortran 95:
call opgtr(ap, tau, q [,uplo] [,info])
Description
The routine explicitly generates the n-by-n orthogonal matrix Q formed by ?sptrd when reducing
a packed real symmetric matrix A to tridiagonal form: A = Q*T*QT. Use this routine after a call
to ?sptrd.
918
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Use the same uplo as
supplied to ?sptrd.
ap, tau REAL for sopgtr
DOUBLE PRECISION for dopgtr.
Arrays ap and tau, as returned by ?sptrd.
The dimension of ap must be at least max(1, n(n+1)/2).
ldq INTEGER. The first dimension of the output array q; at least
max(1, n).
work REAL for sopgtr
Workspace array, DIMENSION at least max(1, n-1).
Output Parameters
q REAL for sopgtr
Array, DIMENSION (ldq,*).
Contains the computed matrix Q.
The second dimension of q must be at least max(1, n).
info INTEGER.

Specific details for the routine opgtr interface are the following:
tau Holds the vector with the number of elements n - 1.
q Holds the matrix Q of size (n,n).
919
Application Notes
The complex counterpart of this routine is ?upgtr.
?opmtr
Multiplies a real matrix by the real orthogonal
matrix Q determined by ?sptrd.
Syntax
FORTRAN 77:
call sopmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call dopmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
Fortran 95:
call opmtr(ap, tau, c [,side] [,uplo] [,trans] [,info])
Description
T
The routine multiplies a real matrix C by Q or Q , where Q is the orthogonal matrix Q formed by
?sptrd when reducing a packed real symmetric matrix A to tridiagonal form: A = Q*T*QT. Use
this routine after a call to ?sptrd.
Input Parameters

T
920
T
Use the same uplo as supplied to ?sptrd.
T
ap, tau, c, work REAL for sopmtr
DOUBLE PRECISION for dopmtr.
ap and tau are the arrays returned by ?sptrd.
The dimension of ap must be at least max(1, r(r+1)/2).
The dimension of work must be at least
max(1, n) if side = 'L';
max(1, m) if side = 'R'.
Output Parameters
info INTEGER.

Specific details for the routine opmtr interface are the following:
ap Holds the array A of size (r*(r+1)/2), where
921
tau Holds the vector with the number of elements r - 1.
Application Notes
The computed product differs from the exact product by a matrix E such that ||E||2 = O(ε)
||C||2, where ε is the machine precision.
The total number of floating-point operations is approximately 2*m2*n if side = 'L', or 2*n2*m
if side = 'R'.
The complex counterpart of this routine is ?upmtr.
?hptrd
form using packed storage.
Syntax
FORTRAN 77:
call chptrd(uplo, n, ap, d, e, tau, info)
call zhptrd(uplo, n, ap, d, e, tau, info)
Fortran 95:
call hptrd(ap, tau [,uplo] [,info])
Description
922
The routine reduces a packed complex Hermitian matrix A to symmetric tridiagonal form T by
H
a unitary similarity transformation: A = Q*T*Q . The unitary matrix Q is not formed explicitly
but is represented as a product of n-1 elementary reflectors. Routines are provided for working
with Q in this representation (see Application Notes below).
Input Parameters
If uplo = 'U', ap stores the packed upper triangle of A.
If uplo = 'L', ap stores the packed lower triangle of A.
ap COMPLEX for chptrd
DOUBLE COMPLEX for zhptrd.
Array, DIMENSION at least max(1, n(n+1)/2). Contains either
upper or lower triangle of A (as specified by uplo) in packed
form.
Output Parameters
ap Overwritten by the tridiagonal matrix T and details of the
d, e REAL for chptrd
DOUBLE PRECISION for zhptrd.
Arrays:
tau COMPLEX for chptrd
DOUBLE COMPLEX for zhptrd.
Arrays, DIMENSION at least max(1, n-1). Contains further
details of the orthogonal matrix Q.
info INTEGER.
923

Specific details for the routine hptrd interface are the following:
e Holds the vector with the number of elements n - 1.
Application Notes

?upgtr to form the computed matrix Q explicitly
?upmtr to multiply a complex matrix by Q.
The real counterpart of this routine is ?sptrd.
?upgtr
determined by ?hptrd.
Syntax
FORTRAN 77:
call cupgtr(uplo, n, ap, tau, q, ldq, work, info)
call zupgtr(uplo, n, ap, tau, q, ldq, work, info)
Fortran 95:
call upgtr(ap, tau, q [,uplo] [,info])
924
Description
The routine explicitly generates the n-by-n unitary matrix Q formed by ?hptrd when reducing
H
a packed complex Hermitian matrix A to tridiagonal form: A = Q*T*Q . Use this routine after a
call to ?hptrd.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Use the same uplo as
supplied to ?hptrd.
ap, tau COMPLEX for cupgtr
DOUBLE COMPLEX for zupgtr.
Arrays ap and tau, as returned by ?hptrd.
The dimension of ap must be at least max(1, n(n+1)/2).
ldq INTEGER. The first dimension of the output array q;
at least max(1, n).
work COMPLEX for cupgtr
Workspace array, DIMENSION at least max(1, n-1).
Output Parameters
q COMPLEX for cupgtr
Contains the computed matrix Q.
info INTEGER.
925

Specific details for the routine upgtr interface are the following:
Application Notes
The real counterpart of this routine is ?opgtr.
?upmtr
Q determined by ?hptrd.
Syntax
FORTRAN 77:
call cupmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call zupmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
Fortran 95:
call upmtr(ap, tau, c [,side] [,uplo] [,trans] [,info])
Description
926
H
The routine multiplies a complex matrix C by Q or Q , where Q is the unitary matrix formed by
H
?hptrd when reducing a packed complex Hermitian matrix A to tridiagonal form: A = Q*T*Q .
Use this routine after a call to ?hptrd.
H H
Input Parameters

H
H
Use the same uplo as supplied to ?hptrd.
H
ap, tau, c, COMPLEX for cupmtr
DOUBLE COMPLEX for zupmtr.
ap and tau are the arrays returned by ?hptrd.
The dimension of ap must be at least max(1, r(r+1)/2).
The dimension of work must be at least
max(1, n) if side = 'L';
max(1, m) if side = 'R'.
927
Output Parameters
H H
info INTEGER.

Specific details for the routine upmtr interface are the following:
ap Holds the array A of size (r*(r+1)/2), where
uplo Must be 'U' or 'L'.The default value is 'U'.
Application Notes
The total number of floating-point operations is approximately 8*m2*n if side = 'L' or 8*n2*m
if side = 'R'.
The real counterpart of this routine is ?opmtr.
928
?sbtrd
Reduces a real symmetric band matrix to
tridiagonal form.
Syntax
FORTRAN 77:
call ssbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call dsbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
Fortran 95:
call sbtrd(ab[, q] [,vect] [,uplo] [,info])
Description
The routine reduces a real symmetric band matrix A to symmetric tridiagonal form T by an
orthogonal similarity transformation: A = Q*T*QT. The orthogonal matrix Q is determined as
a product of Givens rotations.
If required, the routine can also form the matrix Q explicitly.
Input Parameters
vect CHARACTER*1. Must be 'V' or 'N'.
If vect = 'V', the routine returns the explicit matrix Q.
If vect = 'N', the routine does not return Q.
If uplo = 'U', ab stores the upper triangular part of A.
If uplo = 'L', ab stores the lower triangular part of A.
n INTEGER. The order of the matrix A (n≥0).
kd INTEGER. The number of super- or sub-diagonals in A
(kd≥0).
ab, work REAL for ssbtrd
DOUBLE PRECISION for dsbtrd.
929
ab (ldab,*) is an array containing either upper or lower

triangular part of the matrix A (as specified by uplo) in band
storage format.
The dimension of work must be at least max(1, n).
ldab INTEGER. The first dimension of ab; at least kd+1.
ldq INTEGER. The first dimension of q. Constraints:
ldq ≥ max(1, n) if vect = 'V';
ldq ≥ 1 if vect = 'N'.
Output Parameters
ab On exit, the array ab is overwritten.
d, e, q REAL for ssbtrd
DOUBLE PRECISION for dsbtrd.
Arrays:
q(ldq,*) is not referenced if vect = 'N'.
If vect = 'V', q contains the n-by-n matrix Q.
The second dimension of q must be:
at least max(1, n) if vect = 'V';
at least 1 if vect = 'N'.
info INTEGER.

Specific details for the routine sbtrd interface are the following:
930
vect If omitted, this argument is restored based on the presence of argument
q as follows: vect = 'V', if q is present, vect = 'N', if q is omitted.
If present, vect must be equal to 'V' or 'U' and the argument q must
also be present. Note that there will be an error condition if vect is
present and q omitted.
Application Notes
c(n) is a modestly increasing function of n, and ε is the machine precision. The computed
matrix Q differs from an exactly orthogonal matrix by a matrix E such that ||E||2 = O(ε).
The total number of floating-point operations is approximately 6n2*kd if vect = 'N', with
3n3*(kd-1)/kd additional operations if vect = 'V'.
The complex counterpart of this routine is ?hbtrd.
?hbtrd
Reduces a complex Hermitian band matrix to
tridiagonal form.
Syntax
FORTRAN 77:
call chbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call zhbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
Fortran 95:
call hbtrd(ab [, q] [,vect] [,uplo] [,info])
Description
931
The routine reduces a complex Hermitian band matrix A to symmetric tridiagonal form T by a
H
unitary similarity transformation: A = Q*T*Q . The unitary matrix Q is determined as a product
of Givens rotations.
If required, the routine can also form the matrix Q explicitly.
Input Parameters
vect CHARACTER*1. Must be 'V' or 'N'.
If vect = 'V', the routine returns the explicit matrix Q.
If vect = 'N', the routine does not return Q.
(kd ≥ 0).
ab, work COMPLEX for chbtrd
DOUBLE COMPLEX for zhbtrd.
triangular part of the matrix A (as specified by uplo) in band
storage format.
ldab INTEGER. The first dimension of ab; at least kd+1.
ldq INTEGER. The first dimension of q. Constraints:
ldq ≥ max(1, n) if vect = 'V';
ldq ≥ 1 if vect = 'N'.
Output Parameters
ab On exit, the array ab is overwritten.
d, e REAL for chbtrd
DOUBLE PRECISION for zhbtrd.
Arrays:
932
q COMPLEX for chbtrd
DOUBLE COMPLEX for zhbtrd.
If vect = 'N', q is not referenced.
If vect = 'V', q contains the n-by-n matrix Q.
The second dimension of q must be:
at least max(1, n) if vect = 'V';
at least 1 if vect = 'N'.
info INTEGER.

Specific details for the routine hbtrd interface are the following:
vect If omitted, this argument is restored based on the presence of argument
q as follows: vect = 'V', if q is present, vect = 'N', if q is omitted.
If present, vect must be equal to 'V' or 'U' and the argument q must
also be present. Note that there will be an error condition if vect is
present and q omitted.
933
Application Notes
c(n) is a modestly increasing function of n, and ε is the machine precision. The computed matrix
Q differs from an exactly unitary matrix by a matrix E such that ||E||2 = O(ε).
The total number of floating-point operations is approximately 20n2*kd if vect = 'N', with
10n3*(kd-1)/kd additional operations if vect = 'V'.
The real counterpart of this routine is ?sbtrd.
?sterf
Computes all eigenvalues of a real symmetric
tridiagonal matrix using QR algorithm.
Syntax
FORTRAN 77:
call ssterf(n, d, e, info)
call dsterf(n, d, e, info)
Fortran 95:
call sterf(d, e [,info])
Description
The routine computes all the eigenvalues of a real symmetric tridiagonal matrix T (which can
be obtained by reducing a symmetric or Hermitian matrix to tridiagonal form). The routine uses
a square-root-free variant of the QR algorithm.
If you need not only the eigenvalues but also the eigenvectors, call ?steqr.
Input Parameters
n INTEGER. The order of the matrix T (n ≥ 0).
d, e REAL for ssterf
DOUBLE PRECISION for dsterf.
934
Arrays:
d(*) contains the diagonal elements of T.
Output Parameters
d The n eigenvalues in ascending order, unless info > 0.
See also info.
e On exit, the array is overwritten; see info.
info INTEGER.
If info = i, the algorithm failed to find all the eigenvalues
after 30n iterations:
i off-diagonal elements have not converged to zero. On
exit, d and e contain, respectively, the diagonal and
off-diagonal elements of a tridiagonal matrix orthogonally
similar to T.

Specific details for the routine sterf interface are the following:
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T+E such that ||E||2 =
O(ε)*||T||2, where ε is the machine precision.
If λi is an exact eigenvalue, and mi is the corresponding computed value, then
|μi - λi| ≤ c(n)*ε*||T||2
935
where c(n) is a modestly increasing function of n.
The total number of floating-point operations depends on how rapidly the algorithm converges.
Typically, it is about 14n2.
?steqr
Computes all eigenvalues and eigenvectors of a
symmetric or Hermitian matrix reduced to
tridiagonal form (QR algorithm).
Syntax
FORTRAN 77:
call ssteqr(compz, n, d, e, z, ldz, work, info)
call dsteqr(compz, n, d, e, z, ldz, work, info)
call csteqr(compz, n, d, e, z, ldz, work, info)
call zsteqr(compz, n, d, e, z, ldz, work, info)
Fortran 95:
call rsteqr(d, e [,z] [,compz] [,info])
call steqr(d, e [,z] [,compz] [,info])
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a real symmetric
tridiagonal matrix T. In other words, the routine can compute the spectral factorization: T =
Z*Λ*ZT. Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λi; Z is an
orthogonal matrix whose columns are eigenvectors. Thus,
T*zi = λi*zi for i = 1, 2, ..., n.

The routine normalizes the eigenvectors so that ||zi||2 = 1.
936
You can also use the routine for computing the eigenvalues and eigenvectors of an arbitrary
real symmetric (or complex Hermitian) matrix A reduced to tridiagonal form T: A = Q*T*QH.
In this case, the spectral factorization is as follows: A = Q*T*QH = (Q*Z)*Λ*(Q*Z)H. Before
calling ?steqr, you must reduce A to tridiagonal form and generate the explicit matrix Q by
calling the following routines:
for real matrices: for complex matrices:
full storage ?sytrd, ?orgtr ?hetrd, ?ungtr
packed storage ?sptrd, ?opgtr ?hptrd, ?upgtr
band storage ?sbtrd (vect='V') ?hbtrd (vect='V')
If you need eigenvalues only, it's more efficient to call ?sterf. If T is positive-definite, ?pteqr
can compute small eigenvalues more accurately than ?steqr.
To solve the problem by a single call, use one of the divide and conquer routines ?stevd,
?syevd, ?spevd, or ?sbevd for real symmetric matrices or ?heevd, ?hpevd, or ?hbevd for
complex Hermitian matrices.
Input Parameters
compz CHARACTER*1. Must be 'N' or 'I' or 'V'.
If compz = 'N', the routine computes eigenvalues only.
If compz = 'I', the routine computes the eigenvalues and
eigenvectors of the tridiagonal matrix T.
If compz = 'V', the routine computes the eigenvalues and
eigenvectors of A (and the array z must contain the matrix
Q on entry).
Arrays:
937
The dimension of work must be:

at least 1 if compz = 'N';
at least max(1, 2*n-2) if compz = 'V' or 'I'.
z REAL for ssteqr
DOUBLE PRECISION for dsteqr
COMPLEX for csteqr
DOUBLE COMPLEX for zsteqr.
Array, DIMENSION (ldz, *)
If compz = 'N' or 'I', z need not be set.
If vect = 'V', z must contain the n-by-n matrix Q.
The second dimension of z must be:
at least max(1, n) if compz = 'V' or 'I'.
work (lwork) is a workspace array.
ldz INTEGER. The first dimension of z. Constraints:
ldz ≥ 1 if compz = 'N';
ldz ≥ max(1, n) if compz = 'V' or 'I'.
Output Parameters
d The n eigenvalues in ascending order, unless info > 0.
See also info.
z If info = 0, contains the n orthonormal eigenvectors,
stored by columns. (The i-th column corresponds to the
ith eigenvalue.)
info INTEGER.
If info = i, the algorithm failed to find all the eigenvalues
after 30n iterations: i off-diagonal elements have not
converged to zero. On exit, d and e contain, respectively,
the diagonal and off-diagonal elements of a tridiagonal
matrix orthogonally similar to T.
938
Specific details for the routine steqr interface are the following:
z Holds the matrix Z of size (n,n).
compz If omitted, this argument is restored based on the presence of argument
z as follows:
compz = 'I', if z is present,
compz = 'N', if z is omitted.
If present, compz must be equal to 'I' or 'V' and the argument z
must also be present. Note that there will be an error condition if compz
is present and z omitted.
Note that two variants of Fortran 95 interface for steqr routine are
needed because of an ambiguous choice between real and complex
cases appear when z is omitted. Thus, the name rsteqr is used in real
cases (single or double precision), and the name steqr is used in
complex cases (single or double precision).
Application Notes
If λi is an exact eigenvalue, and μi is the corresponding computed value, then
|μi - λi| ≤ c(n)*ε*||T||2
If zi is the corresponding exact eigenvector, and wi is the corresponding computed vector, then
the angle θ(zi, wi) between them is bounded as follows:
θ(zi, wi) ≤ c(n)*ε*||T||2 / mini≠j|λi - λj|.
Typically, it is about
939
24n2 if compz = 'N';

7n3 (for complex flavors, 14n3) if compz = 'V' or 'I'.
?stemr
Computes selected eigenvalues and eigenvectors
of a real symmetric tridiagonal matrix.
Syntax
FORTRAN 77:
call sstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz,
tryrac, work, lwork, iwork, liwork, info)
call dstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz,
call cstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz,
call zstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz,
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric
tridiagonal matrix T. Any such unreduced matrix has a well defined set of pairwise different
real eigenvalues, the corresponding real eigenvectors are pairwise orthogonal.
The spectrum may be computed either completely or partially by specifying either an interval
(vl,vu] or a range of indices il:iu for the desired eigenvalues.
Depending on the number of desired eigenvalues, these are computed either by bisection or
the dqds algorithm. Numerically orthogonal eigenvectors are computed by the use of various
T
suitable L*D*L factorizations near clusters of close eigenvalues (referred to as RRRs, Relatively
Robust Representations). An informal sketch of the algorithm follows.
For each unreduced block (submatrix) of T,
940
a. Compute T - sigma*I = L*D*LT, so that L and D define all the wanted eigenvalues to
high relative accuracy. This means that small relative changes in the entries of L and D cause
only small relative changes in the eigenvalues and eigenvectors. The standard (unfactored)
representation of the tridiagonal matrix T does not have this property in general.
b. Compute the eigenvalues to suitable accuracy. If the eigenvectors are desired, the algorithm
attains full accuracy of the computed eigenvalues only right before the corresponding vectors
have to be computed, see steps c and d.
c. For each cluster of close eigenvalues, select a new shift close to the cluster, find a new
factorization, and refine the shifted eigenvalues to suitable accuracy.
d. For each eigenvalue with a large enough relative separation compute the corresponding
eigenvector by forming a rank revealing twisted factorization. Go back to step c for any
clusters that remain.
For more details, see: [Dhillon04], [Dhillon04-02], [Dhillon97]

The routine works only on machines which follow IEEE-754 floating-point standard in their
handling of infinities and NaNs (NaN stands for "not a number"). This permits the use of efficient
inner loops avoiding a check for zero divisors.
LAPACK routines can be used to reduce a complex Hermitean matrix to real symmetric tridiagonal
form.
(Any complex Hermitean tridiagonal matrix has real values on its diagonal and potentially
complex numbers on its off-diagonals. By applying a similarity transform with an appropriate
diagonal matrix diag(1,e{i \phy_1}, ... , e{i \phy_{n-1}}), the complex Hermitean matrix
can be transformed into a real symmetric matrix and complex arithmetic can be entirely avoided.)
While the eigenvectors of the real symmetric tridiagonal matrix are real, the eigenvectors of
original complex Hermitean matrix have complex entries in general. Since LAPACK drivers
overwrite the matrix data with the eigenvectors, zstemr accepts complex workspace to facilitate
interoperability with zunmtr or zupmtr.
Input Parameters
If jobz = 'N', then only eigenvalues are computed.
If jobz = 'V', then eigenvalues and eigenvectors are
computed.
range CHARACTER*1. Must be 'A' or 'V' or 'I'.
If range = 'A', the routine computes all eigenvalues.
If range = 'V', the routine computes all eigenvalues in
the half-open interval: (vl, vu].
941
If range = 'I', the routine computes eigenvalues with

indices il to iu.
n INTEGER. The order of the matrix T (n≥0).
d REAL for single precision flavors
Contains n diagonal elements of the tridiagonal matrix T.
e REAL for single precision flavors
Array, DIMENSION (n-1).
Contains (n-1) off-diagonal elements of the tridiagonal
matrix T in elements 1 to n-1 of e. e(n) need not be set
on input, but is used internally as workspace.
vl, vu REAL for single precision flavors
If range = 'V', the lower and upper bounds of the interval
to be searched for eigenvalues. Constraint: vl<vu.
If range = 'A' or 'I', vl and vu are not referenced.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the
smallest and largest eigenvalues to be returned.
Constraint: 1≤il≤iu≤n, if n>0.
If range = 'A' or 'V', il and iu are not referenced.
ldz INTEGER. The leading dimension of the output array z.
if jobz = 'V', then ldz ≥ max(1, n);
ldz ≥ 1 otherwise.
nzc INTEGER. The number of eigenvectors to be held in the
array z.
If range = 'A', then nzc≥max(1, n);
If range = 'V', then nzc is greater than or equal to the
number of eigenvalues in the half-open interval: (vl, vu].
If range = 'I', then nzc≥il+iu+1.
942
This value is returned as the first entry of the array z, and
no error message related to nzc is issued by the routine
xerbla.
tryrac LOGICAL.
If tryrac = .TRUE., it indicates that the code should check
whether the tridiagonal matrix defines its eigenvalues to
high relative accuracy. If so, the code uses relative-accuracy
preserving algorithms that might be (a bit) slower depending
on the matrix. If the matrix does not define its eigenvalues
to high relative accuracy, the code can uses possibly faster
algorithms.
If tryrac = .FALSE., the code is not required to guarantee
relatively accurate eigenvalues and can use the fastest
possible techniques.
work REAL for single precision flavors
Workspace array, DIMENSION (lwork).
lwork INTEGER.
The dimension of the array work,
lwork≥max(1, 18*n).
If lwork=-1, then a workspace query is assumed; the
iwork INTEGER.
Workspace array, DIMENSION (liwork).
liwork INTEGER.
The dimension of the array iwork.
lwork≥max(1, 10*n) if the eigenvectors are desired, and
lwork≥max(1, 8*n) if only the eigenvalues are to be
computed.
If liwork=-1, then a workspace query is assumed; the
routine only calculates the optimal size of the iwork array,
returns this value as the first entry of the iwork array, and
no error message related to liwork is issued by xerbla.
943
Output Parameters
e On exit, the array e is overwritten.
m INTEGER.
The total number of eigenvalues found, 0≤m≤n.
If range = 'A', then m=n, and if If range = 'I', then
m=iu-il+1.
w REAL for single precision flavors
The first m elements contain the selected eigenvalues
in ascending order.
z REAL for sstemr
DOUBLE PRECISION for dstemr
COMPLEX for cstemr
DOUBLE COMPLEX for zstemr.
Array z(ldz, *), the second dimension of z must be at least
max(1, m).
If jobz = 'V', and info = 0, then the first m columns of
z contain the orthonormal eigenvectors of the matrix T
corresponding to the selected eigenvalues, with the i-th
column of z holding the eigenvector associated with w(i).
If jobz = 'N', then z is not referenced.
Note: you must ensure that at least max(1,m) columns are
supplied in the array z ; if range = 'V', the exact value
of m is not known in advance and an can be computed with
a workspace query by setting nzc=-1, see description of
the parameter nzc.
isuppz INTEGER.
Array, DIMENSION (2*max(1, m)).
The support of the eigenvectors in z, that is the indices
indicating the nonzero elements in z. The i-th computed
eigenvector is nonzero only in elements isuppz(2*i-1)
through isuppz(2*i). This is relevant in the case when the
matrix is split. isuppz is only accessed when jobz = 'V'
and n>0.
944
tryrac On exit, TRUE. tryrac is set to .FALSE. if the matrix does
not define its eigenvalues to high relative accuracy.
work(1) On exit, if info = 0, then work(1) returns the optimal
(and minimal) size of lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the optimal
size of liwork.
info INTEGER.
If = 0, the execution is successful.
If info = 1, internal error in ?larre occurred,
if info = 2, internal error in ?larrv occurred.
?stedc
Computes all eigenvalues and eigenvectors of a
symmetric tridiagonal matrix using the divide and
conquer method.
Syntax
FORTRAN 77:
call sstedc(compz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call dstedc(compz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call cstedc(compz, n, d, e, z, ldz, work, lwork, rwork, lrwork, iwork, liwork,
info)
call zstedc(compz, n, d, e, z, ldz, work, lwork, rwork, lrwork, iwork, liwork,
info)
Fortran 95:
call rstedc(d, e [,z] [,compz] [,info])
call stedc(d, e [,z] [,compz] [,info])
Description
945
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a symmetric
tridiagonal matrix using the divide and conquer method. The eigenvectors of a full or band real
symmetric or complex Hermitian matrix can also be found if ?sytrd/?hetrd or ?sptrd/?hptrd
or ?sbtrd/?hbtrd has been used to reduce this matrix to tridiagonal form.
See also ?laed0, ?laed1, ?laed2, ?laed3, ?laed4, ?laed5, ?laed6, ?laed7, ?laed8,
?laed9, and ?laeda used by this function.
Input Parameters
eigenvectors of the tridiagonal matrix.
eigenvectors of original symmetric/Hermitian matrix. On
entry, the array z must contain the orthogonal/unitary
matrix used to reduce the original matrix to tridiagonal form.
n INTEGER. The order of the symmetric tridiagonal matrix (n
≥ 0).
d, e, rwork REAL for single-precision flavors
Arrays:
d(*) contains the diagonal elements of the tridiagonal
matrix.
e(*) contains the subdiagonal elements of the tridiagonal
matrix.
rwork is a workspace array, its dimension max(1,
lrwork).
z, work REAL for sstedc
DOUBLE PRECISION for dstedc
COMPLEX for cstedc
DOUBLE COMPLEX for zstedc.
Arrays: z(ldz, *), work(*).
If compz = 'V', then, on entry, z must contain the
orthogonal/unitary matrix used to reduce the original matrix
to tridiagonal form.
946
The second dimension of z must be at least max(1, n).
lwork).
lwork INTEGER. The dimension of the array work.
If compz = 'N'or 'I', or n ≤ 1, lwork must be at least
1.
If compz = 'V' and n > 1, lwork must be at least n*n.
Note that for compz = 'V', and if n is less than or equal
to the minimum divide size, usually 25, then lwork need
only be 1.
routine only calculates the optimal size of the work, rwork
and iwork arrays, returns these values as the first entries
of the work, rwork and iwork arrays, and no error message
related to lwork or lrwork or liwork is issued by xerbla.
See Application Notes for the required value of lwork.
lrwork INTEGER. The dimension of the array rwork (used for
complex flavors only).
If compz = 'N', or n ≤ 1, lrwork must be at least 1.
If compz = 'V' and n > 1, lrwork must be at least
(1+3*n+2*n*lg(n)+3*n*n), where lg(n)is the smallest
integer k such that 2**k≥n.
If compz = 'I' and n > 1, lrwork must be at least
(1+4*n+2*n*n).
Note that for compz = 'V'or 'I', and if n is less than or
equal to the minimum divide size, usually 25, then lrwork
need only be max(1, 2*(n-1)).
If lrwork = -1, then a workspace query is assumed; the
See Application Notes for the required value of lrwork.
947
iwork INTEGER. Workspace array, its dimension max(1, liwork).

liwork INTEGER. The dimension of the array iwork.
If compz = 'N', or n ≤ 1, liwork must be at least 1.
If compz = 'V' and n > 1, liwork must be at least
(6+6*n+5*n*lg(n), where lg(n)is the smallest integer k
such that 2**k≥n.
If compz = 'I' and n > 1, liwork must be at least
(3+5*n).
Note that for compz = 'V'or 'I', and if n is less than or
equal to the minimum divide size, usually 25, then liwork
need only be 1.
If liwork = -1, then a workspace query is assumed; the
See Application Notes for the required value of liwork.
Output Parameters
d The n eigenvalues in ascending order, unless info ≠ 0.
See also info.
z If info = 0, then if compz = 'V', z contains the
orthonormal eigenvectors of the original
symmetric/Hermitian matrix, and if compz = 'I', z contains
the orthonormal eigenvectors of the symmetric tridiagonal
matrix. If compz = 'N', z is not referenced.
lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the optimal
lrwork (for complex flavors only).
liwork.
info INTEGER.
948
If info = -i, the i-th parameter had an illegal value. If
info = i, the algorithm failed to compute an eigenvalue
while working on the submatrix lying in rows and columns
i/(n+1) through mod(i, n+1).

Specific details for the routine stedc interface are the following:
z as follows: compz = 'I', if z is present, compz = 'N', if z is omitted.
Note that two variants of Fortran 95 interface for stedc routine are needed because of an
ambiguous choice between real and complex cases appear when z and work are omitted. Thus,
the name rstedc is used in real cases (single or double precision), and the name stedc is
used in complex cases (single or double precision).
Application Notes
The required size of workspace arrays must be as follows.
For sstedc/dstedc:
If compz = 'N' or n ≤ 1 then lwork must be at least 1.
If compz = 'V' and n > 1 then lwork must be at least (1 + 3n + 2n·lgn + 3n2), where
lg(n) = smallest integer k such that 2k≥ n.
If compz = 'I' and n > 1 then lwork must be at least (1 + 4n + n2).
If compz = 'N' or n ≤ 1 then liwork must be at least 1.
If compz = 'V' and n > 1 then liwork must be at least (6 + 6n + 5n·lgn).
949
If compz = 'I' and n > 1 then liwork must be at least (3 + 5n).
For cstedc/zstedc:
If compz = 'N' or'I', or n ≤ 1, lwork must be at least 1.

2
If compz = 'V' and n > 1, lwork must be at least n .
If compz = 'N' or n ≤ 1, lrwork must be at least 1.

2
If compz = 'V' and n > 1, lrwork must be at least (1 + 3n + 2n·lgn + 3n ), where lg(n ) =
smallest integer k such that 2k≥ n.
2
If compz = 'I' and n > 1, lrwork must be at least(1 + 4n + 2n ).
The required value of liwork for complex flavors is the same as for real flavors.
If lwork (or liwork or lrwork, if supplied) is equal to -1, then the routine returns immediately
and provides the recommended workspace in the first element of the corresponding array
(work, iwork, rwork). This operation is called a workspace query.
Note that if lwork (liwork, lrwork) is less than the minimal required value and is not equal
to -1, the routine returns immediately with an error exit and does not provide any information
on the recommended workspace.
?stegr
Syntax
FORTRAN 77:
call sstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz,
work, lwork, iwork, liwork, info)
call dstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz,
call cstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz,
call zstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz,
950
Fortran 95:
call rstegr(d, e, w [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol]
[,info])
call stegr(d, e, w [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol]
[,info])
Description
tridiagonal matrix T. Any such unreduced matrix has a well defined set of pairwise different
real eigenvalues, the corresponding real eigenvectors are pairwise orthogonal.
The spectrum may be computed either completely or partially by specifying either an interval
(vl,vu] or a range of indices il:iu for the desired eigenvalues.
?sregr is a compatibility wrapper around the improved ?stemr routine. See its description for
further details.
Note that the abstol parameter no longer provides any benefit and hence is no longer used.
See also auxiliary ?lasq2 ?lasq5, ?lasq6 , used by this routine.
Input Parameters
If job = 'N', then only eigenvalues are computed.
If job = 'V', then eigenvalues and eigenvectors are
computed.
If range = 'V', the routine computes eigenvalues
lambda(i) in the half-open interval:
vl< lambda(i) ≤ vu.
indices il to iu.
d, e, work REAL for single precision flavors
951

Arrays:
e(*) contains the subdiagonal elements of T in elements 1
to n-1; e(n) need not be set on input, but it is used as a
workspace.
The dimension of e must be at least max(1, n).
vl, vu REAL for single precision flavors
to be searched for eigenvalues.
Constraint: vl< vu.
il, iu INTEGER.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0.
abstol REAL for single precision flavors
Unused. Was the absolute error tolerance for the
eigenvalues/eigenvectors in previous versions.
Constraints:
ldz < 1 if jobz = 'N';
ldz < max(1, n) jobz = 'V', an.
lwork INTEGER.
The dimension of the array work,
lwork≥max(1, 18*n) if jobz = 'V', and
lwork≥max(1, 12*n) if jobz = 'N'.
952
Application Notes below for details.
iwork INTEGER.
Workspace array, DIMENSION (liwork).
liwork INTEGER.
The dimension of the array iwork, lwork ≥ max(1, 10*n)
if the eigenvectors are desired, and lwork ≥ max(1, 8*n)
if only the eigenvalues are to be computed..
See Application Notes below for details.
Output Parameters
d, e On exit, d and e are overwritten.
m INTEGER. The total number of eigenvalues found,
0 ≤ m ≤ n.
If range = 'A', m = n, and if range = 'I', m = iu-il+1.
w REAL for single precision flavors
The selected eigenvalues in ascending order, stored in w(1)
to w(m).
z REAL for sstegr
DOUBLE PRECISION for dstegr
COMPLEX for cstegr
DOUBLE COMPLEX for zstegr.
Array z(ldz, *), the second dimension of z must be at least
max(1, m).
953
If jobz = 'V', and if info = 0, the first m columns of z

contain the orthonormal eigenvectors of the matrix T
supplied in the array z ; if range = 'V', the exact value
of m is not known in advance and an upper bound must be
used. Supplying n columns is always safe.
isuppz INTEGER.
Array, DIMENSION at least (2*max(1, m)).
The support of the eigenvectors in z, that is the indices
indicating the nonzero elements in z. The i-th computed
eigenvector is nonzero only in elements isuppz(2*i-1)
through isuppz(2*i). This is relevant in the case when
the matrix is split. isuppz is only accessed when jobz =
'V', and n > 0.
work(1) On exit, if info = 0, then work(1) returns the required
minimal size of lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required
minimal size of liwork.
info INTEGER.
If info = 1x, internal error in ?larre occurred,
If info = 2x, internal error in ?larrv occurred. Here the
digit x = abs(iinfo) < 10, where iinfo is the non-zero
error code returned by ?larre or ?larrv, respectively.

Specific details for the routine stegr interface are the following:
954
w Holds the vector of length n.
z Holds the matrix Z of size (n,m).
isuppz Holds the vector of length (2*m).
vl Default value for this argument is vl = - HUGE (vl) where HUGE(a)
means the largest machine number of the same precision as argument
a.
vu Default value for this argument is vu = HUGE (vl).
il Default value for this argument is il = 1.
iu Default value for this argument is iu = n.
abstol Default value for this argument is abstol = 0.0_WP.
jobz Restored based on the presence of the argument z as follows:
jobz = 'V', if z is present,
jobz = 'N', if z is omitted.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu
are present and at the same time one of or both il and iu are present.
Note that two variants of Fortran 95 interface for stegr routine are needed because of an
ambiguous choice between real and complex cases appear when z is omitted. Thus, the name
rstegr is used in real cases (single or double precision), and the name stegr is used in complex
cases (single or double precision).
Application Notes
2
Currently ?stegr is only set up to find all the n eigenvalues and eigenvectors of T in O(n )
time, that is, only range = 'A' is supported.
?stegr works only on machines which follow IEEE-754 floating-point standard in their handling
of infinities and NaNs. Normal execution of ?stegr may create NaNs and infinities and hence
may abort due to a floating point exception in environments which do not conform to the
IEEE-754 standard.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for
the first run, or set lwork = -1 (liwork = -1).
955
If lwork (or liwork) has any of admissible sizes, which is no less than the minimal value
described, then the routine completes the task, though probably not so fast as with a
recommended workspace, and provides the recommended workspace in the first element of
the corresponding array (work, iwork) on exit. Use this value (work(1), iwork(1)) for
subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork). This
operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then
the routine returns immediately with an error exit and does not provide any information on the
?pteqr
Computes all eigenvalues and (optionally) all
eigenvectors of a real symmetric positive-definite
tridiagonal matrix.
Syntax
FORTRAN 77:
call spteqr(compz, n, d, e, z, ldz, work, info)
call dpteqr(compz, n, d, e, z, ldz, work, info)
call cpteqr(compz, n, d, e, z, ldz, work, info)
call zpteqr(compz, n, d, e, z, ldz, work, info)
Fortran 95:
call rpteqr(d, e [,z] [,compz] [,info])
call pteqr(d, e [,z] [,compz] [,info])
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a real symmetric
positive-definite tridiagonal matrix T. In other words, the routine can compute the spectral
factorization: T = Z*Λ*ZT.
956
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λi; Z is an orthogonal
matrix whose columns are eigenvectors. Thus,
T*zi = λi*zi for i = 1, 2, ..., n.

(The routine normalizes the eigenvectors so that ||zi||2 = 1.)
You can also use the routine for computing the eigenvalues and eigenvectors of real symmetric
(or complex Hermitian) positive-definite matrices A reduced to tridiagonal form T: A = Q*T*QH.
In this case, the spectral factorization is as follows: A = Q*T*QH = (QZ)*Λ*(QZ)H. Before
calling ?pteqr, you must reduce A to tridiagonal form and generate the explicit matrix Q by
calling the following routines:
for real matrices: for complex matrices:
full storage ?sytrd, ?orgtr ?hetrd, ?ungtr
packed storage ?sptrd, ?opgtr ?hptrd, ?upgtr
band storage ?sbtrd (vect='V') ?hbtrd (vect='V')
H
The routine first factorizes T as L*D*L where L is a unit lower bidiagonal matrix, and D is a
diagonal matrix. Then it forms the bidiagonal matrix B = L*D1/2 and calls ?bdsqr to compute
the singular values of B, which are the same as the eigenvalues of T.
Input Parameters
eigenvectors of A (and the array z must contain the matrix
Q on entry).
Arrays:
957

The dimension of work must be:
at least max(1, 4*n-4) if compz = 'V' or 'I'.
z REAL for spteqr
DOUBLE PRECISION for dpteqr
COMPLEX for cpteqr
DOUBLE COMPLEX for zpteqr.
Array, DIMENSION (ldz,*)
If compz = 'N' or 'I', z need not be set.
If vect = 'V', z must contains the n-by-n matrix Q.
at least max(1, n) if compz = 'V' or 'I'.
Output Parameters
d The n eigenvalues in descending order, unless info > 0.
See also info.
e On exit, the array is overwritten.
z If info = 0, contains the n orthonormal eigenvectors,
stored by columns. (The i-th column corresponds to the
i-th eigenvalue.)
info INTEGER.
If info = i, the leading minor of order i (and hence T
itself) is not positive-definite.
If info = n + i, the algorithm for computing singular
values failed to converge; i off-diagonal elements have not
converged to zero.
958
Specific details for the routine pteqr interface are the following:
z as follows:
Note that two variants of Fortran 95 interface for pteqr routine are needed because of an
ambiguous choice between real and complex cases appear when z is omitted. Thus, the name
rpteqr is used in real cases (single or double precision), and the name pteqr is used in complex
cases (single or double precision).
Application Notes
If λi is an exact eigenvalue, and μi is the corresponding computed value, then
|μi - λi| ≤ c(n)*ε*K*λi
where c(n) is a modestly increasing function of n, ε is the machine precision, and K = ||DTD||2
*||(DTD)-1||2, D is diagonal with dii = tii-1/2.
θ(ui, wi) ≤ c(n)εK / mini≠j(|λi - λj|/|λi + λj|).
Here mini≠j(|λi - λj|/|λi + λj|) is the relative gap between λi and the other eigenvalues.
Typically, it is about
959
30n2 if compz = 'N';

6n3 (for complex flavors, 12n3) if compz = 'V' or 'I'.
?stebz
Computes selected eigenvalues of a real symmetric
tridiagonal matrix by bisection.
Syntax
FORTRAN 77:
call sstebz (range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit, w,
iblock, isplit, work, iwork, info)
call dstebz (range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit, w,
iblock, isplit, work, iwork, info)
Fortran 95:
call stebz(d, e, m, nsplit, w, iblock, isplit [, order] [,vl] [,vu] [,il]
[,iu] [,abstol] [,info])
Description
The routine computes some (or all) of the eigenvalues of a real symmetric tridiagonal matrix
T by bisection. The routine searches for zero or negligible off-diagonal elements to see if T splits
into block-diagonal form T = diag(T1, T2, ...). Then it performs bisection on each of the
blocks Ti and returns the block index of each computed eigenvalue, so that a subsequent call
to ?stein can also take advantage of the block structure.
See also ?laebz.
Input Parameters
lambda(i) in the half-open interval: vl < lambda(i) ≤
vu.
960
indices il to iu.
order CHARACTER*1. Must be 'B' or 'E'.
If order = 'B', the eigenvalues are to be ordered from
smallest to largest within each split-off block.
If order = 'E', the eigenvalues for the entire matrix are
to be ordered from smallest to largest.
vl, vu REAL for sstebz
DOUBLE PRECISION for dstebz.
vl < lambda(i) ≤ vu.
il, iu INTEGER. Constraint: 1 ≤ il ≤ iu ≤ n.
If range = 'I', the routine computes eigenvalues
lambda(i) such that il≤ i ≤ iu (assuming that the
eigenvalues lambda(i) are in ascending order).
abstol REAL for sstebz
The absolute tolerance to which each eigenvalue is required.
An eigenvalue (or cluster) is considered to have converged
if it lies in an interval of width abstol.
If abstol ≤ 0.0, then the tolerance is taken as eps*|T|,
where eps is the machine precision, and |T| is the 1-norm
of the matrix T.
d, e, work REAL for sstebz
Arrays:
961
The dimension of work must be at least max(1, 4n).

iwork INTEGER. Workspace.
Array, DIMENSION at least max(1, 3n).
Output Parameters
m INTEGER. The actual number of eigenvalues found.
nsplit INTEGER. The number of diagonal blocks detected in T.
w REAL for sstebz
Array, DIMENSION at least max(1, n). The computed
eigenvalues, stored in w(1) to w(m).
iblock, isplit INTEGER.
Arrays, DIMENSION at least max(1, n).
A positive value iblock(i) is the block number of the
eigenvalue stored in w(i) (see also info).
The leading nsplit elements of isplit contain points at
which T splits into blocks Ti as follows: the block T1 contains
rows/columns 1 to isplit(1); the block T2 contains
rows/columns isplit(1)+1 to isplit(2), and so on.
info INTEGER.
If info = 1, for range = 'A' or 'V', the algorithm failed
to compute some of the required eigenvalues to the desired
accuracy; iblock(i)<0 indicates that the eigenvalue stored
in w(i) failed to converge.
If info = 2, for range = 'I', the algorithm failed to
compute some of the required eigenvalues. Try calling the
routine again with range = 'A'.
If info = 3:
for range = 'A' or 'V', same as info = 1;
for range = 'I', same as info = 2.
If info = 4, no eigenvalues have been computed. The
floating-point arithmetic on the computer is not behaving
as expected.
962
Specific details for the routine stebz interface are the following:
iblock Holds the vector of length n.
isplit Holds the vector of length n.
order Must be 'B' or 'E'. The default value is 'B'.
vl Default value for this argument is vl = - HUGE (vl) where HUGE(a)
means the largest machine number of the same precision as argument
a.
vu Default value for this argument is vu = HUGE (vl).
abstol Default value for this argument is abstol = 0.0_WP.
range = 'A', if none of vl, vu, il,
iu is present, Note that there will be an error condition if one of or both
vl and vu are present and at the same time one of or both il and iu
are present.
Application Notes
The eigenvalues of T are computed to high relative accuracy which means that if they vary
widely in magnitude, then any small eigenvalues will be computed more accurately than, for
example, with the standard QR method. However, the reduction to tridiagonal form (prior to
calling the routine) may exclude the possibility of obtaining high relative accuracy in the small
eigenvalues of the original matrix if its eigenvalues vary widely in magnitude.
963
?stein
Computes the eigenvectors corresponding to
specified eigenvalues of a real symmetric
tridiagonal matrix.
Syntax
FORTRAN 77:
call sstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call dstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call cstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call zstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
Fortran 95:
call stein(d, e, w, iblock, isplit, z [,ifailv] [,info])
Description
The routine computes the eigenvectors of a real symmetric tridiagonal matrix T corresponding
to specified eigenvalues, by inverse iteration. It is designed to be used in particular after the
specified eigenvalues have been computed by ?stebz with order = 'B', but may also be
used when the eigenvalues have been computed by other routines.
If you use this routine after ?stebz, it can take advantage of the block structure by performing
inverse iteration on each block Ti separately, which is more efficient than using the whole
matrix T.
If T has been formed by reduction of a full symmetric or Hermitian matrix A to tridiagonal form,
you can transform eigenvectors of T to eigenvectors of A by calling ?ormtr or ?opmtr (for real
flavors) or by calling ?unmtr or ?upmtr (for complex flavors).
Input Parameters
m INTEGER. The number of eigenvectors to be returned.
d, e, w REAL for single-precision flavors
964
Arrays:
e(*) contains the sub-diagonal elements of T stored in
elements 1 to n-1
w(*) contains the eigenvalues of T, stored in w(1) to w(m)
(as returned by ?stebz). Eigenvalues of T1 must be supplied
first, in non-decreasing order; then those of T2, again in
non-decreasing order, and so on. Constraint:
if iblock(i) = iblock(i+1), w(i) ≤ w(i+1).
The dimension of w must be at least max(1, n).
iblock, isplit INTEGER.
Arrays, DIMENSION at least max(1, n). The arrays iblock
and isplit, as returned by ?stebz with order = 'B'.
If you did not call ?stebz with order = 'B', set all
elements of iblock to 1, and isplit(1) to n.)
ldz INTEGER. The first dimension of the output array z; ldz ≥
max(1, n).
work REAL for single-precision flavors
Workspace array, DIMENSION at least max(1, 5n).
iwork INTEGER.
Output Parameters
z REAL for sstein
DOUBLE PRECISION for dstein
COMPLEX for cstein
DOUBLE COMPLEX for zstein.
Array, DIMENSION (ldz, *).
If info = 0, z contains the m orthonormal eigenvectors,
stored by columns. (The ith column corresponds to the i-th
specified eigenvalue.)
ifailv INTEGER.
965
Array, DIMENSION at least max(1, m).

If info = i > 0, the first i elements of ifailv contain the
indices of any eigenvectors that failed to converge.
info INTEGER.
If info = i, then i eigenvectors (as indicated by the
parameter ifailv) each failed to converge in 5 iterations.
The current iterates are stored in the corresponding columns
of the array z.

Specific details for the routine stein interface are the following:
iblock Holds the vector of length n.
isplit Holds the vector of length n.
z Holds the matrix Z of size (n,m).
ifailv Holds the vector of length (m).
Application Notes
Each computed eigenvector zi is an exact eigenvector of a matrix T+Ei, where ||Ei||2 =
O(ε)*||T||2. However, a set of eigenvectors computed by this routine may not be orthogonal
to so high a degree of accuracy as those computed by ?steqr.
966
?disna
Computes the reciprocal condition numbers for the
eigenvectors of a symmetric/ Hermitian matrix or
for the left or right singular vectors of a general
matrix.
Syntax
FORTRAN 77:
call sdisna(job, m, n, d, sep, info)
call ddisna(job, m, n, d, sep, info)
Fortran 95:
call disna(d, sep [,job] [,minmn] [,info])
Description
The routine computes the reciprocal condition numbers for the eigenvectors of a real symmetric
or complex Hermitian matrix or for the left or right singular vectors of a general m-by-n matrix.
The reciprocal condition number is the 'gap' between the corresponding eigenvalue or singular
value and the nearest other one.
The bound on the error, measured by angle in radians, in the i-th computed vector is given
by
slamch('E')*(anorm/sep(i))
where anorm = ||A||2 = max( |d(j)| ). sep(i) is not allowed to be smaller than slam-
ch('E')*anorm in order to limit the size of the error bound.
?disna may also be used to compute error bounds for eigenvectors of the generalized symmetric
definite eigenproblem.
Input Parameters
job CHARACTER*1. Must be 'E','L', or 'R'. Specifies for which
problem the reciprocal condition numbers should be
computed:
967
job = 'E': for the eigenvectors of a symmetric/Hermitian

matrix;
job = 'L': for the left singular vectors of a general matrix;
job = 'R': for the right singular vectors of a general matrix.
m INTEGER. The number of rows of the matrix (m ≥ 0).
n INTEGER.
If job = 'L', or 'R', the number of columns of the matrix
(n ≥ 0). Ignored if job = 'E'.
d REAL for sdisna
DOUBLE PRECISION for ddisna.
Array, dimension at least max(1,m) if job = 'E', and at
least max(1, min(m,n)) if job = 'L' or 'R'.
This array must contain the eigenvalues (if job = 'E') or
singular values (if job = 'L' or 'R') of the matrix, in either
increasing or decreasing order.
If singular values, they must be non-negative.
Output Parameters
sep REAL for sdisna
DOUBLE PRECISION for ddisna.
Array, dimension at least max(1,m) if job = 'E', and at
least max(1, min(m,n)) if job = 'L' or 'R'. The reciprocal
condition numbers of the vectors.
info INTEGER.

Specific details for the routine disna interface are the following:
d Holds the vector of length min(m,n).
sep Holds the vector of length min(m,n).
job Must be 'E', 'L', or 'R'. The default value is 'E'.
968
minmn Indicates which of the values m or n is smaller. Must be either 'M' or
'N', the default is 'M'.
If job = 'E', this argument is superfluous, If job = 'L' or 'R', this
argument is used by the routine.
Generalized Symmetric-Definite Eigenvalue Problems

Generalized symmetric-definite eigenvalue problems are as follows: find the eigenvalues λ
and the corresponding eigenvectors z that satisfy one of these equations:
Az = λBz, ABz = λz, or BAz = λz,

where A is an n-by-n symmetric or Hermitian matrix, and B is an n-by-n symmetric
positive-definite or Hermitian positive-definite matrix.
In these problems, there exist n real eigenvectors corresponding to real eigenvalues (even for
complex Hermitian matrices A and B).
Routines described in this section allow you to reduce the above generalized problems to
standard symmetric eigenvalue problem Cy = λy, which you can solve by calling LAPACK
routines described earlier in this chapter (see Symmetric Eigenvalue Problems).
Different routines allow the matrices to be stored either conventionally or in packed storage.
Prior to reduction, the positive-definite matrix B must first be factorized using either ?potrf
or ?pptrf.
The reduction routine for the banded matrices A and B uses a split Cholesky factorization for
which a specific routine ?pbstf is provided. This refinement halves the amount of work required
to form matrix C.
Table 4-4 lists LAPACK routines (FORTRAN 77 interface) that can be used to solve generalized
symmetric-definite eigenvalue problems. Respective routine names in Fortran 95 interface are
without the first symbol (see Routine Naming Conventions).
Table 4-4 Computational Routines for Reducing Generalized Eigenproblems to Standard
Problems
Matrix Reduce to standard Reduce to standard Reduce to standard Factorize
type problems (full problems (packed problems (band band
storage) storage) matrices) matrix
real ?sygst ?spgst ?sbgst ?pbstf

symmetric
matrices
969
Matrix Reduce to standard Reduce to standard Reduce to standard Factorize

type problems (full problems (packed problems (band band
storage) storage) matrices) matrix
complex ?hegst ?hpgst ?hbgst ?pbstf

Hermitian
matrices
?sygst
Reduces a real symmetric-definite generalized
eigenvalue problem to the standard form.
Syntax
FORTRAN 77:
call ssygst(itype, uplo, n, a, lda, b, ldb, info)
call dsygst(itype, uplo, n, a, lda, b, ldb, info)
Fortran 95:
call sygst(a, b [,itype] [,uplo] [,info])
Description
The routine reduces real symmetric-definite generalized eigenproblems
A*z = λ*B*z, A*B*z = λ*z, or B*A*z = λ*z
to the standard form C*y = λ*y. Here A is a real symmetric matrix, and B is a real symmetric
positive-definite matrix. Before calling this routine, call ?potrf to compute the Cholesky
factorization: B = UT*U or B = L*LT.
Input Parameters
itype INTEGER. Must be 1 or 2 or 3.
If itype = 1, the generalized eigenproblem is A*z =
lambda*B*z
for uplo = 'U': C = inv(UT)*A*inv(U), z = inv(U)*y;
970
for uplo = 'L': C = inv(L)*A*inv(LT), z = inv(LT)*y.
If itype = 2, the generalized eigenproblem is A*B*z =
lambda*z
for uplo = 'U': C = U*A*UT, z = inv(U)*y;
for uplo = 'L': C = LT*A*L, z = inv(LT)*y.
If itype = 3, the generalized eigenproblem is B*A*z =
lambda*z
for uplo = 'U': C = U*A*UT, z = UT*y;
for uplo = 'L': C = LT*A*L, z = L*y.
If uplo = 'U', the array a stores the upper triangle of A;
you must supply B in the factored form B = UT*U.
If uplo = 'L', the array a stores the lower triangle of A;
you must supply B in the factored form B = L*LT.
n INTEGER. The order of the matrices A and B (n ≥ 0).
a, b REAL for ssygst
DOUBLE PRECISION for dsygst.
Arrays:
a(lda,*) contains the upper or lower triangle of A.
b(ldb,*) contains the Cholesky-factored matrix B:
B = UT*U or B = L*LT (as returned by ?potrf).
Output Parameters
a The upper or lower triangle of A is overwritten by the upper
or lower triangle of C, as specified by the arguments itype
and uplo.
info INTEGER.
971

Specific details for the routine sygst interface are the following:
b Holds the matrix B of size (n,n).
itype Must be 1, 2, or 3. The default value is 1.
Application Notes
Forming the reduced matrix C is a stable procedure. However, it involves implicit multiplication
by inv(B) (if itype = 1) or B (if itype = 2 or 3). When the routine is used as a step in the
computation of eigenvalues and eigenvectors of the original problem, there may be a significant
loss of accuracy if B is ill-conditioned with respect to inversion.
The approximate number of floating-point operations is n3.
?hegst
Reduces a complex Hermitian-definite generalized
Syntax
FORTRAN 77:
call chegst(itype, uplo, n, a, lda, b, ldb, info)
call zhegst(itype, uplo, n, a, lda, b, ldb, info)
Fortran 95:
call hegst(a, b [,itype] [,uplo] [,info])
Description
The routine reduces complex Hermitian-definite generalized eigenvalue problems
972
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
to the standard form Cy = λy. Here the matrix A is complex Hermitian, and B is complex
Hermitian positive-definite. Before calling this routine, you must call ?potrf to compute the
Cholesky factorization: B = UH*U or B = L*LH.
Input Parameters
lambda*B*z
for uplo = 'U': C = inv(UH)*A*inv(U), z = inv(U)*y;
for uplo = 'L': C = inv(L)*A*inv(LH), z = inv(LH)*y.
lambda*z
for uplo = 'U': C = U*A*UH, z = inv(U)*y;
for uplo = 'L': C = LH*A*L, z = inv(LH)*y.
lambda*z
for uplo = 'U': C = U*A*UH, z = UH*y;
for uplo = 'L': C = LH*A*L, z = L*y.
If uplo = 'U', the array a stores the upper triangle of A;
you must supply B in the factored form B = UH*U.
If uplo = 'L', the array a stores the lower triangle of A;
you must supply B in the factored form B = L*LH.
a, b COMPLEX for chegstDOUBLE COMPLEX for zhegst.
Arrays:
a(lda,*) contains the upper or lower triangle of A.
b(ldb,*) contains the Cholesky-factored matrix B:
B = UH*U or B = L*LH (as returned by ?potrf).
973
Output Parameters
a The upper or lower triangle of A is overwritten by the upper
and uplo.
info INTEGER.

Specific details for the routine hegst interface are the following:
Application Notes
3
The approximate number of floating-point operations is n .
974
?spgst
eigenvalue problem to the standard form using
packed storage.
Syntax
FORTRAN 77:
call sspgst(itype, uplo, n, ap, bp, info)
call dspgst(itype, uplo, n, ap, bp, info)
Fortran 95:
call spgst(ap, bp [,itype] [,uplo] [,info])
Description
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x
to the standard form C*y = λ*y, using packed matrix storage. Here A is a real symmetric
matrix, and B is a real symmetric positive-definite matrix. Before calling this routine, call ?pptrf
to compute the Cholesky factorization: B = UT*U or B = L*LT.
Input Parameters
lambda*B*z
for uplo = 'U': C = inv(UT)*A*inv(U), z = inv(U)*y;
for uplo = 'L': C = inv(L)*A*inv(LT), z = inv(LT)*y.
lambda*z
for uplo = 'U': C = U*A*UT, z = inv(U)*y;
for uplo = 'L': C = LT*A*L, z = inv(LT)*y.
975

lambda*z
for uplo = 'U': C = U*A*UT, z = UT*y;
for uplo = 'L': C = LT*A*L, z = L*y.
If uplo = 'U', ap stores the packed upper triangle of A;
you must supply B in the factored form B = UT*U.
If uplo = 'L', ap stores the packed lower triangle of A;
you must supply B in the factored form B = L*LT.
ap, bp REAL for sspgst
DOUBLE PRECISION for dspgst.
Arrays:
ap(*) contains the packed upper or lower triangle of A.
The dimension of ap must be at least max(1, n*(n+1)/2).
b p(*) contains the packed Cholesky factor of B (as returned
by ?pptrf with the same uplo value).
The dimension of bp must be at least max(1, n*(n+1)/2).
Output Parameters
ap The upper or lower triangle of A is overwritten by the upper
and uplo.
info INTEGER.

Specific details for the routine spgst interface are the following:
bp Holds the array B of size (n*(n+1)/2).
976
Application Notes
3
?hpgst
eigenvalue problem to the standard form using
packed storage.
Syntax
FORTRAN 77:
call chpgst(itype, uplo, n, ap, bp, info)
call zhpgst(itype, uplo, n, ap, bp, info)
Fortran 95:
call hpgst(ap, bp [,itype] [,uplo] [,info])
Description
A*z = λ*B*z, A*B*z = λ*z, or B*A*z = λ*z.
to the standard form C*y = λ*y, using packed matrix storage. Here A is a real symmetric
matrix, and B is a real symmetric positive-definite matrix. Before calling this routine, you must
call ?pptrf to compute the Cholesky factorization: B = UH*U or B = L*LH.
977
Input Parameters
lambda*B*z
for uplo = 'U': C = inv(UH)*A*inv(U), z = inv(U)*y;
for uplo = 'L': C = inv(L)*A*inv(LH), z = inv(LH)*y.
lambda*z
for uplo = 'U': C = U*A*UH, z = inv(U)*y;
for uplo = 'L': C = LH*A*L, z = inv(LH)*y.
lambda*z
for uplo = 'U': C = U*A*UH, z = UH*y;
for uplo = 'L': C = LH*A*L, z = L*y.
If uplo = 'U', ap stores the packed upper triangle of A;
you must supply B in the factored form B = UH*U.
If uplo = 'L', ap stores the packed lower triangle of A;
you must supply B in the factored form B = L*LH.
ap, bp COMPLEX for chpgstDOUBLE COMPLEX for zhpgst.
Arrays:
ap(*) contains the packed upper or lower triangle of A.
The dimension of a must be at least max(1, n*(n+1)/2).
bp(*) contains the packed Cholesky factor of B (as returned
by ?pptrf with the same uplo value).
The dimension of b must be at least max(1, n*(n+1)/2).
Output Parameters
ap The upper or lower triangle of A is overwritten by the upper
and uplo.
info INTEGER.
978
Specific details for the routine hpgst interface are the following:
Application Notes
3
?sbgst
eigenproblem for banded matrices to the standard
form using the factorization performed by ?pbstf.
Syntax
FORTRAN 77:
call ssbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, info)
call dsbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, info)
Fortran 95:
call sbgst(ab, bb [,x] [,uplo] [,info])
Description
979
To reduce the real symmetric-definite generalized eigenproblem A*z = λ*B*z to the standard
form C*y=λ*y, where A, B and C are banded, this routine must be preceded by a call to spb-
stf/dpbstf, which computes the split Cholesky factorization of the positive-definite matrix B:
B=ST*S. The split Cholesky factorization, compared with the ordinary Cholesky factorization,
allows the work to be approximately halved.
This routine overwrites A with C = XT*A*X, where X = inv(S)*Q and Q is an orthogonal matrix
chosen (implicitly) to preserve the bandwidth of A. The routine also has an option to allow the
accumulation of X, and then, if z is an eigenvector of C, X*z is an eigenvector of the original
system.
Input Parameters
vect CHARACTER*1. Must be 'N' or 'V'.
If vect = 'N', then matrix X is not returned;
If vect = 'V', then matrix X is returned.
ka INTEGER. The number of super- or sub-diagonals in A
(ka ≥ 0).
kb INTEGER. The number of super- or sub-diagonals in B
(ka ≥ kb ≥ 0).
ab, bb, work REAL for ssbgst
DOUBLE PRECISION for dsbgst
triangular part of the symmetric matrix A (as specified by
uplo) in band storage format.
The second dimension of the array ab must be at least
max(1, n).
bb (ldbb,*) is an array containing the banded split Cholesky
factor of B as specified by uplo, n and kb and returned by
spbstf/dpbstf.
The second dimension of the array bb must be at least
max(1, n).
980
work(*) is a workspace array, dimension at least max(1,
2*n)
ldab INTEGER. The first dimension of the array ab; must be at
least ka+1.
ldbb INTEGER. The first dimension of the array bb; must be at
least kb+1.
ldx The first dimension of the output array x. Constraints: if
vect = 'N', then ldx ≥ 1;
if vect = 'V', then ldx ≥ max(1, n).
Output Parameters
ab On exit, this array is overwritten by the upper or lower
triangle of C as specified by uplo.
x REAL for ssbgst
DOUBLE PRECISION for dsbgst
Array.
If vect = 'V', then x (ldx,*) contains the n-by-n matrix
X = inv(S)*Q.
If vect = 'N', then x is not referenced.
The second dimension of x must be:
at least max(1, n), if vect = 'V';
at least 1, if vect = 'N'.
info INTEGER.

Specific details for the routine sbgst interface are the following:
ab Holds the array A of size (ka+1,n).
bb Holds the array B of size (kb+1,n).
x Holds the matrix X of size (n,n).
981

vect Restored based on the presence of the argument x as follows:
vect = 'V', if x is present,
vect = 'N', if x is omitted.
Application Notes
Forming the reduced matrix C involves implicit multiplication by inv(B). When the routine is
used as a step in the computation of eigenvalues and eigenvectors of the original problem,
there may be a significant loss of accuracy if B is ill-conditioned with respect to inversion.
If ka and kb are much less than n then the total number of floating-point operations is
approximately 6n2*kb, when vect = 'N'. Additional (3/2)n3*(kb/ka) operations are required
when vect = 'V'.
?hbgst
eigenproblem for banded matrices to the standard
form using the factorization performed by ?pbstf.
Syntax
FORTRAN 77:
call chbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, rwork,
info)
call zhbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, rwork,
info)
Fortran 95:
call hbgst(ab, bb [,x] [,uplo] [,info])
Description
982
To reduce the complex Hermitian-definite generalized eigenproblem A*z = λ*B*z to the
standard form C*x = λ*y, where A, B and C are banded, this routine must be preceded by a
call to cpbstf/zpbstf, which computes the split Cholesky factorization of the positive-definite
matrix B: B = SH*S. The split Cholesky factorization, compared with the ordinary Cholesky
factorization, allows the work to be approximately halved.
This routine overwrites A with C = XH*A*X, where X = inv(S)*Q, and Q is a unitary matrix
chosen (implicitly) to preserve the bandwidth of A. The routine also has an option to allow the
accumulation of X, and then, if z is an eigenvector of C, X*z is an eigenvector of the original
system.
Input Parameters
vect CHARACTER*1. Must be 'N' or 'V'.
If vect = 'N', then matrix X is not returned;
If vect = 'V', then matrix X is returned.
(ka ≥ 0).
(ka ≥ kb ≥ 0).
ab, bb, work COMPLEX for chbgstDOUBLE COMPLEX for zhbgst
triangular part of the Hermitian matrix A (as specified by
max(1, n).
bb (ldbb,*) is an array containing the banded split Cholesky
factor of B as specified by uplo, n and kb and returned by
cpbstf/zpbstf.
max(1, n).
983

n)
least ka+1.
least kb+1.
ldx The first dimension of the output array x. Constraints:
if vect = 'N', then ldx ≥ 1;
if vect = 'V', then ldx ≥ max(1, n).
rwork REAL for chbgst
DOUBLE PRECISION for zhbgst
Workspace array, dimension at least max(1, n)
Output Parameters
ab On exit, this array is overwritten by the upper or lower
triangle of C as specified by uplo.
x COMPLEX for chbgst
DOUBLE COMPLEX for zhbgst
Array.
If vect = 'V', then x (ldx,*) contains the n-by-n matrix
X = inv(S)*Q.
If vect = 'N', then x is not referenced.
The second dimension of x must be:
at least max(1, n), if vect = 'V';
at least 1, if vect = 'N'.
info INTEGER.

Specific details for the routine hbgst interface are the following:
984
x Holds the matrix X of size (n,n).
vect Restored based on the presence of the argument x as follows: vect
= 'V', if x is present, vect = 'N', if x is omitted.
Application Notes
Forming the reduced matrix C involves implicit multiplication by inv(B). When the routine is
used as a step in the computation of eigenvalues and eigenvectors of the original problem,
there may be a significant loss of accuracy if B is ill-conditioned with respect to inversion. The
total number of floating-point operations is approximately 20n2*kb, when vect = 'N'.
Additional 5n3*(kb/ka) operations are required when vect = 'V'. All these estimates assume
that both ka and kb are much less than n.
?pbstf
Computes a split Cholesky factorization of a real
symmetric or complex Hermitian positive-definite
banded matrix used in ?sbgst/?hbgst .
Syntax
FORTRAN 77:
call spbstf(uplo, n, kb, bb, ldbb, info)
call dpbstf(uplo, n, kb, bb, ldbb, info)
call cpbstf(uplo, n, kb, bb, ldbb, info)
call zpbstf(uplo, n, kb, bb, ldbb, info)
Fortran 95:
call pbstf(bb [, uplo] [,info])
Description
The routine computes a split Cholesky factorization of a real symmetric or complex Hermitian
positive-definite band matrix B. It is to be used in conjunction with ?sbgst/?hbgst.
985
The factorization has the form B = ST*S (or B = SH*S for complex flavors), where S is a band
matrix of the same bandwidth as B and the following structure: S is upper triangular in the first
(n+kb)/2 rows and lower triangular in the remaining rows.
Input Parameters
If uplo = 'U', bb stores the upper triangular part of B.
If uplo = 'L', bb stores the lower triangular part of B.
(kb ≥ 0).
bb REAL for spbstf
DOUBLE PRECISION for dpbstf
COMPLEX for cpbstf
DOUBLE COMPLEX for zpbstf.
bb (ldbb,*) is an array containing either upper or lower
triangular part of the matrix B (as specified by uplo) in band
storage format.
max(1, n).
ldbb INTEGER. The first dimension of bb; must be at least kb+1.
Output Parameters
bb On exit, this array is overwritten by the elements of the
split Cholesky factor S.
info INTEGER.
If info = i, then the factorization could not be completed,
because the updated element bii would be the square root
of a negative number; hence the matrix B is not
positive-definite.
986
Specific details for the routine pbstf interface are the following:
Application Notes
The computed factor S is the exact factor of a perturbed matrix B + E, where

2
The total number of floating-point operations for real flavors is approximately n(kb+1) . The
number of operations for complex flavors is 4 times greater. All these estimates assume that
kb is much less than n.
After calling this routine, you can call ?sbgst/?hbgst to solve the generalized eigenproblem
Az = λBz, where A and B are banded and B is positive-definite.

This section describes LAPACK routines for solving nonsymmetric eigenvalue problems,
computing the Schur factorization of general matrices, as well as performing a number of related
A nonsymmetric eigenvalue problem is as follows: given a nonsymmetric (or non-Hermitian)
matrix A, find the eigenvalues λ and the corresponding eigenvectors z that satisfy the equation
Az = λz (right eigenvectors z)
or the equation
zHA = λzH (left eigenvectors z).
987
Nonsymmetric eigenvalue problems have the following properties:
• The number of eigenvectors may be less than the matrix order (but is not less than the
number of distinct eigenvalues of A).
• Eigenvalues may be complex even for a real matrix A.
• If a real nonsymmetric matrix has a complex eigenvalue a+bi corresponding to an eigenvector
z, then a-bi is also an eigenvalue. The eigenvalue a-bi corresponds to the eigenvector
whose elements are complex conjugate to the elements of z.
To solve a nonsymmetric eigenvalue problem with LAPACK, you usually need to reduce the
matrix to the upper Hessenberg form and then solve the eigenvalue problem with the Hessenberg
matrix obtained. Table 4-5 lists LAPACK routines (FORTRAN 77 interface) for reducing the
matrix to the upper Hessenberg form by an orthogonal (or unitary) similarity transformation
A = QHQH as well as routines for solving eigenvalue problems with Hessenberg matrices, forming
the Schur factorization of such matrices and computing the corresponding condition numbers.
Decision tree in Figure 4-4 helps you choose the right routine or sequence of routines for an
eigenvalue problem with a real nonsymmetric matrix. If you need to solve an eigenvalue problem
with a complex non-Hermitian matrix, use the decision tree shown in Figure 4-5 .
Table 4-5 Computational Routines for Solving Nonsymmetric Eigenvalue Problems
Operation performed Routines for real matrices Routines for complex
matrices
Reduce to Hessenberg ?gehrd, ?gehrd

form A = QHQH
Generate the matrix Q ?orghr ?unghr
Apply the matrix Q ?ormhr ?unmhr
Balance matrix ?gebal ?gebal
Transform eigenvectors ?gebak ?gebak

of balanced matrix to
those of the original
matrix
Find eigenvalues and ?hseqr ?hseqr

Schur factorization (QR
algorithm)
988
Operation performed Routines for real matrices Routines for complex
matrices
Find eigenvectors from ?hsein ?hsein

Hessenberg form
(inverse iteration)
Find eigenvectors from ?trevc ?trevc

Schur factorization
Estimate sensitivities of ?trsna ?trsna

eigenvalues and
eigenvectors
Reorder Schur ?trexc ?trexc

factorization
Reorder Schur ?trsen ?trsen

factorization, find the
invariant subspace and
estimate sensitivities
Solves Sylvester's ?trsyl ?trsyl

equation.
989
Figure 4-4 Decision Tree: Real Nonsymmetric Eigenvalue Problems
990
Figure 4-5 Decision Tree: Complex Non-Hermitian Eigenvalue Problems
991
?gehrd
Reduces a general matrix to upper Hessenberg
form.
Syntax
FORTRAN 77:
call sgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call dgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call cgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call zgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
Fortran 95:
call gehrd(a [, tau] [,ilo] [,ihi] [,info])
Description
The routine reduces a general matrix A to upper Hessenberg form H by an orthogonal or unitary
similarity transformation A = Q*H*QH. Here H has real subdiagonal elements.
elementary reflectors. Routines are provided to work with Q in this representation.
Input Parameters
ilo, ihi INTEGER. If A is an output by ?gebal, then ilo and ihi
must contain the values returned by that routine. Otherwise
ilo = 1 and ihi = n. (If n > 0, then 1 ≤ ilo ≤ ihi ≤
n; if n = 0, ilo = 1 and ihi = 0.)
a, work REAL for sgehrd
DOUBLE PRECISION for dgehrd
COMPLEX for cgehrd
DOUBLE COMPLEX for zgehrd.
Arrays:
992
work (lwork) is a workspace array.
Output Parameters
a Overwritten by the upper Hessenberg matrix H and details
of the matrix Q. The subdiagonal elements of H are real.
tau REAL for sgehrd
DOUBLE PRECISION for dgehrd
COMPLEX for cgehrd
DOUBLE COMPLEX for zgehrd.
Array, DIMENSION at least max (1, n-1).
Contains additional information on the matrix Q.
info INTEGER.
If info = -i , the i-th parameter had an illegal value.

Specific details for the routine gehrd interface are the following:
ilo Default value for this argument is ilo = 1.
993
ihi Default value for this argument is ihi = n.
Application Notes
algorithm.
workspace query.
workspace.
The computed Hessenberg matrix H is exactly similar to a nearby matrix A + E, where ||E||2
< c(n)ε||A||2, c(n) is a modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations for real flavors is (2/3)*(ihi -
ilo)2(2ihi + 2ilo + 3n); for complex flavors it is 4 times greater.
?orghr
by ?gehrd.
Syntax
FORTRAN 77:
call sorghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call dorghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
994
Fortran 95:
call orghr(a, tau [,ilo] [,ihi] [,info])
Description
The routine explicitly generates the orthogonal matrix Q that has been determined by a preceding
call to sgehrd/dgehrd. (The routine ?gehrd reduces a real general matrix A to upper Hessenberg
form H by an orthogonal similarity transformation, A = Q*H*QT, and represents the matrix Q
as a product of ihi-ilo elementary reflectors. Here ilo and ihi are values determined by
sgebal/dgebal when balancing the matrix; if the matrix has not been balanced, ilo = 1 and
ihi = n.)
The matrix Q generated by ?orghr has the structure:
where Q22 occupies rows and columns ilo to ihi.
Input Parameters
ilo, ihi INTEGER. These must be the same parameters ilo and ihi,
respectively, as supplied to ?gehrd. (If n > 0, then 1 ≤
ilo ≤ ihi ≤ n; if n = 0, ilo = 1 and ihi = 0.)
a, tau, work REAL for sorghr
DOUBLE PRECISION for dorghr
Arrays: a(lda,*) contains details of the vectors which define
the elementary reflectors, as returned by ?gehrd.
995
tau(*) contains further details of the elementary reflectors,

as returned by ?gehrd.
The dimension of tau must be at least max (1, n-1).
lwork ≥ max(1, ihi-ilo).
Output Parameters
a Overwritten by the n-by-n orthogonal matrix Q.
info INTEGER.

Specific details for the routine orghr interface are the following:
996
Application Notes
For better performance, try using lwork =(ihi-ilo)*blocksize where blocksize is a
algorithm.
workspace query.
workspace.
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 = O(ε),
where ε is the machine precision.
The approximate number of floating-point operations is (4/3)(ihi-ilo)3.
The complex counterpart of this routine is ?unghr.
?ormhr
Multiplies an arbitrary real matrix C by the real
orthogonal matrix Q determined by ?gehrd.
Syntax
FORTRAN 77:
call sormhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork,
info)
call dormhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork,
info)
997
Fortran 95:
call ormhr(a, tau, c [,ilo] [,ihi] [,side] [,trans] [,info])
Description
The routine multiplies a matrix C by the orthogonal matrix Q that has been determined by a
preceding call to sgehrd/dgehrd. (The routine ?gehrd reduces a real general matrix A to upper
Hessenberg form H by an orthogonal similarity transformation, A = Q*H*QT, and represents
the matrix Q as a product of ihi-ilo elementary reflectors. Here ilo and ihi are values
determined by sgebal/dgebal when balancing the matrix;if the matrix has not been balanced,
ilo = 1 and ihi = n.)
With ?ormhr, you can form one of the matrix products Q*C, QT*C, C*Q, or C*QT, overwriting
the result on C (which may be any real rectangular matrix).
A common application of ?ormhr is to transform a matrix V of eigenvectors of H to the matrix

QV of eigenvectors of A.
Input Parameters
If side = 'L', then the routine forms Q*C or QT*C.
If side = 'R', then the routine forms C*Q or C*QT.
trans CHARACTER*1. Must be 'N' or 'T'.
If trans = 'N' , then Q is applied to C.
T
If trans = 'T', then Q is applied to C.
m INTEGER. The number of rows in C (m ≥ 0).
respectively, as supplied to ?gehrd.
If m > 0 and side = 'L', then 1 ≤ ilo ≤ ihi ≤ m.
If m = 0 and side = 'L' , then ilo = 1 and ihi = 0.
If n > 0 and side = 'R', then 1 ≤ ilo ≤ ihi ≤ n.
If n = 0 and side = 'R' , then ilo = 1 and ihi = 0.
a, tau, c, work REAL for sormhr
998
DOUBLE PRECISION for dormhr
Arrays:
a(lda,*) contains details of the vectors which define the
elementary reflectors, as returned by ?gehrd.
side = 'L' and at least max(1, n) if side = 'R'.
as returned by ?gehrd .
The dimension of tau must be at least max (1, m-1)
if side = 'L' and at least max (1, n-1) if side = 'R'.
c(ldc,*) contains the m by n matrix C. The second dimension
of c must be at least max(1, n).
lwork).
lda INTEGER. The first dimension of a; at least max(1, m) if
side = 'L' and at least max (1, n) if side = 'R'.
ldc INTEGER. The first dimension of c; at least max(1, m).
If side = 'L', lwork ≥ max(1, n).
If side = 'R', lwork ≥ max(1, m).
Output Parameters
c C is overwritten by product Q*C, QT*C, C*Q, or C*QT as
specified by side and trans.
info INTEGER.
999

Specific details for the routine ormhr interface are the following:
Application Notes
For better performance, lwork should be at least n*blocksize if side = 'L' and at least
m*blocksize if side = 'R', where blocksize is a machine-dependent value (typically, 16 to
64) required for optimum performance of the blocked algorithm.
workspace query.
workspace.
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 =
O(ε)|*|C||2, where ε is the machine precision.
1000
The approximate number of floating-point operations is
2n(ihi-ilo)2 if side = 'L';
2m(ihi-ilo)2 if side = 'R'.
The complex counterpart of this routine is ?unmhr.
?unghr
determined by ?gehrd.
Syntax
FORTRAN 77:
call cunghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call zunghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
Fortran 95:
call unghr(a, tau [,ilo] [,ihi] [,info])
Description
The routine is intended to be used following a call to cgehrd/zgehrd, which reduces a complex
matrix A to upper Hessenberg form H by a unitary similarity transformation: A = Q*H*QH.
?gehrd represents the matrix Q as a product of ihi-ilo elementary reflectors. Here ilo and
ihi are values determined by cgebal/zgebal when balancing the matrix; if the matrix has
not been balanced, ilo = 1 and ihi = n.
Use the routine ?unghr to generate Q explicitly as a square matrix. The matrix Q has the
structure:
1001
where Q22 occupies rows and columns ilo to ihi.
Input Parameters
respectively, as supplied to ?gehrd . (If n > 0, then 1 ≤
ilo ≤ ihi ≤ n. If n = 0, then ilo = 1 and ihi = 0.)
a, tau, work COMPLEX for cunghr
DOUBLE COMPLEX for zunghr.
Arrays:
a(lda,*) contains details of the vectors which define the
as returned by ?gehrd .
The dimension of tau must be at least max (1, n-1).
lwork ≥ max(1, ihi-ilo).
Output Parameters
a Overwritten by the n-by-n unitary matrix Q.
1002
info INTEGER.

Specific details for the routine unghr interface are the following:
Application Notes
For better performance, try using lwork = (ihi-ilo)*blocksize, where blocksize is a
algorithm.
query.
1003
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 = O(ε),
where ε is the machine precision.
The approximate number of real floating-point operations is (16/3)(ihi-ilo)3.
The real counterpart of this routine is ?orghr.
?unmhr
Multiplies an arbitrary complex matrix C by the
complex unitary matrix Q determined by ?gehrd.
Syntax
FORTRAN 77:
call cunmhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork,
info)
call zunmhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork,
info)
Fortran 95:
call unmhr(a, tau, c [,ilo] [,ihi] [,side] [,trans] [,info])
Description
The routine multiplies a matrix C by the unitary matrix Q that has been determined by a preceding
call to cgehrd/zgehrd. (The routine ?gehrd reduces a real general matrix A to upper Hessenberg
form H by an orthogonal similarity transformation, A = Q*H*QH, and represents the matrix Q
as a product of ihi-ilo elementary reflectors. Here ilo and ihi are values determined by
cgebal/zgebal when balancing the matrix; if the matrix has not been balanced, ilo = 1 and
ihi = n.)
H H
With ?unmhr, you can form one of the matrix products Q*C, Q *C, C*Q, or C*Q , overwriting the
result on C (which may be any complex rectangular matrix). A common application of this
routine is to transform a matrix V of eigenvectors of H to the matrix QV of eigenvectors of A.
1004
Input Parameters
H
If side = 'L', then the routine forms Q*C or Q *C.
H
If side = 'R', then the routine forms C*Q or C*Q .
trans CHARACTER*1. Must be 'N' or 'C'.
If trans = 'N' , then Q is applied to C.
H
If trans = 'T', then Q is applied to C.
m INTEGER. The number of rows in C (m≥ 0).
n INTEGER. The number of columns in C (n≥ 0).
respectively, as supplied to ?gehrd .
If m > 0 and side = 'L', then 1 ≤ilo≤ihi≤m.
If m = 0 and side = 'L' , then ilo = 1 and ihi = 0.
If n > 0 and side = 'R', then 1 ≤ilo≤ihi≤n.
If n = 0 and side = 'R' , then ilo =1 and ihi = 0.
a, tau, c, work COMPLEX for cunmhr
DOUBLE COMPLEX for zunmhr.
Arrays:
a (lda,*) contains details of the vectors which define the
side = 'L' and at least max(1, n) if side = 'R'.
as returned by ?gehrd.
The dimension of tau must be at least max (1, m-1)
if side = 'L' and at least max (1, n-1) if side = 'R'.
c (ldc,*) contains the m-by-n matrix C. The second
dimension of c must be at least max(1, n).
lda INTEGER. The first dimension of a; at least max(1, m) if
side = 'L' and at least max (1, n) if side = 'R'.
If side = 'L', lwork≥ max(1,n).
1005
If side = 'R', lwork≥ max(1,m).

Output Parameters
H H
c C is overwritten by Q*C, or Q *C, or C*Q , or C*Q as specified
by side and trans.
info INTEGER.

Specific details for the routine unmhr interface are the following:
1006
Application Notes
For better performance, lwork should be at least n*blocksize if side = 'L' and at least
m*blocksize if side = 'R', where blocksize is a machine-dependent value (typically, 16 to
64) required for optimum performance of the blocked algorithm.
query.
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 =
8n(ihi-ilo)2 if side = 'L';
8m(ihi-ilo)2 if side = 'R'.
The real counterpart of this routine is ?ormhr.
1007
?gebal
Balances a general matrix to improve the accuracy
of computed eigenvalues and eigenvectors.
Syntax
FORTRAN 77:
call sgebal(job, n, a, lda, ilo, ihi, scale, info)
call dgebal(job, n, a, lda, ilo, ihi, scale, info)
call cgebal(job, n, a, lda, ilo, ihi, scale, info)
call zgebal(job, n, a, lda, ilo, ihi, scale, info)
Fortran 95:
call gebal(a [, scale] [,ilo] [,ihi] [,job] [,info])
Description
The routine balances a matrix A by performing either or both of the following two similarity
transformations:
(1) The routine first attempts to permute A to block upper triangular form:
where P is a permutation matrix, and A'11 and A'33 are upper triangular. The diagonal elements
of A'11 and A'33 are eigenvalues of A. The rest of the eigenvalues of A are the eigenvalues of
the central diagonal block A'22, in rows and columns ilo to ihi. Subsequent operations to
compute the eigenvalues of A (or its Schur factorization) need only be applied to these rows
and columns; this can save a significant amount of work if ilo > 1 and ihi < n.
1008
If no suitable permutation exists (as is often the case), the routine sets ilo = 1 and ihi =
n, and A'22 is the whole of A.
(2) The routine applies a diagonal similarity transformation to A', to make the rows and columns
of A'22 as close in norm as possible:
This scaling can reduce the norm of the matrix (that is, ||A'2'2|| < ||A'22||), and hence
reduce the effect of rounding errors on the accuracy of computed eigenvalues and eigenvectors.
Input Parameters
job CHARACTER*1. Must be 'N' or 'P' or 'S' or 'B'.
If job = 'N', then A is neither permuted nor scaled (but
ilo, ihi, and scale get their values).
If job = 'P', then A is permuted but not scaled.
If job = 'S', then A is scaled but not permuted.
If job = 'B', then A is both scaled and permuted.
a REAL for sgebal
DOUBLE PRECISION for dgebal
COMPLEX for cgebal
DOUBLE COMPLEX for zgebal.
Arrays:
The second dimension of a must be at least max(1, n). a is
not referenced if job = 'N'.
1009
Output Parameters
a Overwritten by the balanced matrix (a is not referenced if
job = 'N').
ilo, ihi INTEGER. The values ilo and ihi such that on exit a(i,j)
is zero if i > j and 1 ≤ j < ilo or ihi < i ≤ n.
If job = 'N' or 'S', then ilo = 1 and ihi = n.
scale REAL for single-precision flavors DOUBLE PRECISION for
double-precision flavors
Contains details of the permutations and scaling factors.
More precisely, if pj is the index of the row and column
interchanged with row and column j, and dj is the scaling
factor used to balance row and column j, then
scale(j) = pj for j = 1, 2,..., ilo-1, ihi+1,...,
n;
scale(j) = dj for j = ilo, ilo + 1,..., ihi.
The order in which the interchanges are made is n to ihi+1,
then 1 to ilo-1.
info INTEGER.

Specific details for the routine gebal interface are the following:
scale Holds the vector of length n.
job Must be 'B', 'S', 'P', or 'N'. The default value is 'B'.
1010
Application Notes
The errors are negligible, compared with those in subsequent computations.
If the matrix A is balanced by this routine, then any eigenvectors computed subsequently are
eigenvectors of the matrix A'' and hence you must call ?gebak to transform them back to
eigenvectors of A.
If the Schur vectors of A are required, do not call this routine with job = 'S' or 'B', because
then the balancing transformation is not orthogonal (not unitary for complex flavors).
If you call this routine with job = 'P', then any Schur vectors computed subsequently are
Schur vectors of the matrix A'', and you need to call ?gebak (with side = 'R') to transform
them back to Schur vectors of A.
2
The total number of floating-point operations is proportional to n .
?gebak
Transforms eigenvectors of a balanced matrix to
those of the original nonsymmetric matrix.
Syntax
FORTRAN 77:
call sgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call dgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call cgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call zgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
Fortran 95:
call gebak(v, scale [,ilo] [,ihi] [,job] [,side] [,info])
Description
1011
The routine is intended to be used after a matrix A has been balanced by a call to ?gebal, and
eigenvectors of the balanced matrix A''22 have subsequently been computed. For a description
of balancing, see ?gebal. The balanced matrix A'' is obtained as A''= D*P*A*PT*inv(D),
where P is a permutation matrix and D is a diagonal scaling matrix. This routine transforms the
eigenvectors as follows:
T
if x is a right eigenvector of A'', then P *inv(D)*x is a right eigenvector of A; if x is a left
eigenvector of A'', then PT*D*y is a left eigenvector of A.
Input Parameters
job CHARACTER*1. Must be 'N' or 'P' or 'S' or 'B'. The same
parameter job as supplied to ?gebal.
If side = 'L' , then left eigenvectors are transformed.
If side = 'R', then right eigenvectors are transformed.
n INTEGER. The number of rows of the matrix of eigenvectors
(n ≥ 0).
ilo, ihi INTEGER. The values ilo and ihi, as returned by ?gebal.
(If n > 0, then 1 ≤ ilo ≤ ihi ≤ n;
if n = 0, then ilo = 1 and ihi = 0.)
scale REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors
Contains details of the permutations and/or the scaling
factors used to balance the original general matrix, as
returned by ?gebal.
m INTEGER. The number of columns of the matrix of
eigenvectors (m ≥ 0).
v REAL for sgebak
DOUBLE PRECISION for dgebak
COMPLEX for cgebak
DOUBLE COMPLEX for zgebak.
Arrays:
v (ldv,*) contains the matrix of left or right eigenvectors
to be transformed.
The second dimension of v must be at least max(1, m).
1012
ldv INTEGER. The first dimension of v; at least max(1, n).
Output Parameters
v Overwritten by the transformed eigenvectors.
info INTEGER.

Specific details for the routine gebak interface are the following:
v Holds the matrix V of size (n,m).
Application Notes
The errors in this routine are negligible.
The approximate number of floating-point operations is approximately proportional to m*n.
1013
?hseqr
Computes all eigenvalues and (optionally) the
Schur factorization of a matrix reduced to
Hessenberg form.
Syntax
FORTRAN 77:
call shseqr(job, compz, n, ilo, ihi, h, ldh, wr, wi, z, ldz, work, lwork,
info)
call dhseqr(job, compz, n, ilo, ihi, h, ldh, wr, wi, z, ldz, work, lwork,
info)
call chseqr(job, compz, n, ilo, ihi, h, ldh, w, z, ldz, work, lwork, info)
call zhseqr(job, compz, n, ilo, ihi, h, ldh, w, z, ldz, work, lwork, info)
Fortran 95:
call hseqr(h, wr, wi [,ilo] [,ihi] [,z] [,job] [,compz] [,info])
call hseqr(h, w [,ilo] [,ihi] [,z] [,job] [,compz] [,info])
Description
The routine computes all the eigenvalues, and optionally the Schur factorization, of an upper
Hessenberg matrix H: H = Z*T*ZH, where T is an upper triangular (or, for real flavors,
quasi-triangular) matrix (the Schur form of H), and Z is the unitary or orthogonal matrix whose
columns are the Schur vectors zi.
You can also use this routine to compute the Schur factorization of a general matrix A which
has been reduced to upper Hessenberg form H:
A = Q*H*QH, where Q is unitary (orthogonal for real flavors);

A = (QZ)*H*(QZ)H.
In this case, after reducing A to Hessenberg form by ?gehrd, call ?orghr to form Q explicitly
and then pass Q to ?hseqr with compz = 'V'.
1014
You can also call ?gebal to balance the original matrix before reducing it to Hessenberg form
by ?hseqr, so that the Hessenberg matrix H will have the structure:
where H11 and H33 are upper triangular.
If so, only the central diagonal block H22 (in rows and columns ilo to ihi) needs to be further
reduced to Schur form (the blocks H12 and H23 are also affected). Therefore the values of ilo
and ihi can be supplied to ?hseqr directly. Also, after calling this routine you must call ?gebak
to permute the Schur vectors of the balanced matrix to those of the original matrix.
If ?gebal has not been called, however, then ilo must be set to 1 and ihi to n. Note that if
the Schur factorization of A is required, ?gebal must not be called with job = 'S' or 'B',
because the balancing transformation is not unitary (for real flavors, it is not orthogonal).
?hseqr uses a multishift form of the upper Hessenberg QR algorithm. The Schur vectors are
normalized so that ||zi||2 = 1, but are determined only to within a complex factor of absolute
value 1 (for the real flavors, to within a factor ±1).
Input Parameters
job CHARACTER*1. Must be 'E' or 'S'.
If job = 'E', then eigenvalues only are required.
If job = 'S', then the Schur form T is required.
If compz = 'N', then no Schur vectors are computed (and
the array z is not referenced).
If compz = 'I', then the Schur vectors of H are computed
(and the array z is initialized by the routine).
If compz = 'V', then the Schur vectors of A are computed
(and the array z must contain the matrix Q on entry).
n INTEGER. The order of the matrix H (n ≥ 0).
1015
ilo, ihi INTEGER. If A has been balanced by ?gebal, then ilo and
ihi must contain the values returned by ?gebal. Otherwise,
ilo must be set to 1 and ihi to n.
h, z, work REAL for shseqr
DOUBLE PRECISION for dhseqr
COMPLEX for chseqr
DOUBLE COMPLEX for zhseqr.
Arrays:
h(ldh,*) The n-by-n upper Hessenberg matrix H.
The second dimension of h must be at least max(1, n).
z(ldz,*)
If compz = 'V', then z must contain the matrix Q from the
reduction to Hessenberg form.
If compz = 'I', then z need not be set.
If compz = 'N', then z is not referenced.
The second dimension of z must be
at least max(1, n) if compz = 'V' or 'I';
at least 1 if compz = 'N'.
The dimension of work must be at least max (1, n).
ldh INTEGER. The first dimension of h; at least max(1, n).
ldz INTEGER. The first dimension of z;
If compz = 'N', then ldz ≥ 1.
If compz = 'V' or 'I', then ldz ≥ max(1, n).
lwork ≥ max(1, n)is sufficient, but lwork typically as
large as 6*n may be required for optimal performance. A
workspace query to determine the optimal workspace size
is recommended.
routine only estimates the optimal size of the work array,
Application Notes for details.
1016
Output Parameters
w COMPLEX for chseqr
DOUBLE COMPLEX for zhseqr.
Array, DIMENSION at least max (1, n). Contains the
computed eigenvalues, unless info>0. The eigenvalues are
stored in the same order as on the diagonal of the Schur
form T (if computed).
wr, wi REAL for shseqr
DOUBLE PRECISION for dhseqr
Arrays, DIMENSION at least max (1, n) each.
Contain the real and imaginary parts, respectively, of the
computed eigenvalues, unless info > 0. Complex conjugate
pairs of eigenvalues appear consecutively with the
eigenvalue having positive imaginary part first. The
eigenvalues are stored in the same order as on the diagonal
of the Schur form T (if computed).
h If info = 0 and job = 'S', h contains the upper triangular
matrix T from the Schur decomposition (the Schur form).
If info = 0 and job = 'E', the contents of h are
unspecified on exit. (The output value of h when info >
0 is given under the description of info below.)
z If compz = 'V' and info = 0, then z contains Q*Z.
If compz = 'I' and info = 0, then z contains the unitary
or orthogonal matrix Z of the Schur vectors of H.
lwork.
info INTEGER.
If info = i, elements 1,2, ..., ilo-1 and i+1, i+2, ..., n
of wr and wi contain the real and imaginary parts of those
eigenvalues that have been succesively found.
1017
If info > 0, and job = 'E', then on exit, the remaining

unconverged eigenvalues are the eigenvalues of the upper
Hessenberg matrix rows and columns ilo through info of
the final output value of H.
If info > 0, and job = 'S', then on exit (initial value of
H)*U = U*(final value of H), where U is a unitary matrix. The
final value of H is upper Hessenberg and triangular in rows
and columns info+1 through ihi.
If info > 0, and compz = 'V', then on exit (final value
of Z) = (initial value of Z)*U, where U is the unitary matrix
(regardless of the value of job).
If info > 0, and compz = 'I', then on exit (final value
of Z) = U, where U is the unitary matrix (regardless of the
value of job).
If info > 0, and compz = 'N', then Z is not accessed.

Specific details for the routine hseqr interface are the following:
h Holds the matrix H of size (n,n).
wr Holds the vector of length n. Used in real flavors only.
wi Holds the vector of length n. Used in real flavors only.
w Holds the vector of length n. Used in complex flavors only.
job Must be 'E' or 'S'. The default value is 'E'.
1018
Application Notes
The computed Schur factorization is the exact factorization of a nearby matrix H + E, where
||E||2 < O(ε) ||H||2/si, and ε is the machine precision.
If λi is an exact eigenvalue, and μi is the corresponding computed value, then |λi - μi|≤
c(n)*ε*||H||2/si , where c(n) is a modestly increasing function of n, and si is the reciprocal
condition number of λi. The condition numbers si may be computed by calling ?trsna.
The total number of floating-point operations depends on how rapidly the algorithm converges;
typical numbers are as follows.
3
If only eigenvalues are 7n for real flavors
3
computed: 25n for complex flavors.
3
If the Schur form is 10n for real flavors
3
computed: 35n for complex flavors.
3
If the full Schur 20n for real flavors
3
factorization is 70n for complex flavors.
computed:
workspace query.
workspace.
1019
?hsein
Computes selected eigenvectors of an upper
Hessenberg matrix that correspond to specified
eigenvalues.
Syntax
FORTRAN 77:
call shsein(job, eigsrc, initv, select, n, h, ldh, wr, wi, vl, ldvl, vr, ldvr,
mm, m, work, ifaill, ifailr, info)
call dhsein(job, eigsrc, initv, select, n, h, ldh, wr, wi, vl, ldvl, vr, ldvr,
mm, m, work, ifaill, ifailr, info)
call chsein(job, eigsrc, initv, select, n, h, ldh, w, vl, ldvl, vr, ldvr, mm,
m, work, rwork, ifaill, ifailr, info)
call zhsein(job, eigsrc, initv, select, n, h, ldh, w, vl, ldvl, vr, ldvr, mm,
m, work, rwork, ifaill, ifailr, info)
Fortran 95:
call hsein(h, wr, wi, select [, vl] [,vr] [,ifaill] [,ifailr] [,initv]
[,eigsrc] [,m] [,info])
call hsein(h, w, select [,vl] [,vr] [,ifaill] [,ifailr] [,initv] [,eigsrc]
[,m] [,info])
Description
The routine computes left and/or right eigenvectors of an upper Hessenberg matrix H,
corresponding to selected eigenvalues.
The right eigenvector x and the left eigenvector y, corresponding to an eigenvalue λ, are defined
by: H*x = λ*x and yH*H = λ*yH (or HH*y = λ**y). Here λ denotes the conjugate of λ.
*
The eigenvectors are computed by inverse iteration. They are scaled so that, for a real
eigenvector x, max|xi| = 1, and for a complex eigenvector, max(|Rexi| + |Imxi|) = 1.
If H has been formed by reduction of a general matrix A to upper Hessenberg form, then
eigenvectors of H may be transformed to eigenvectors of A by ?ormhr or ?unmhr.
1020
Input Parameters
job CHARACTER*1. Must be 'R' or 'L' or 'B'.
If job = 'R', then only right eigenvectors are computed.
If job = 'L', then only left eigenvectors are computed.
If job = 'B', then all eigenvectors are computed.
eigsrc CHARACTER*1. Must be 'Q' or 'N'.
If eigsrc = 'Q', then the eigenvalues of H were found
using ?hseqr; thus if H has any zero sub-diagonal elements
(and so is block triangular), then the j-th eigenvalue can
be assumed to be an eigenvalue of the block containing the
j-th row/column. This property allows the routine to perform
inverse iteration on just one diagonal block. If eigsrc =
'N', then no such assumption is made and the routine
performs inverse iteration using the whole matrix.
initv CHARACTER*1. Must be 'N' or 'U'.
If initv = 'N', then no initial estimates for the selected
eigenvectors are supplied.
If initv = 'U', then initial estimates for the selected
eigenvectors are supplied in vl and/or vr.
select LOGICAL.
Array, DIMENSION at least max (1, n). Specifies which
eigenvectors are to be computed.
For real flavors:
To obtain the real eigenvector corresponding to the real
eigenvalue wr(j), set select(j) to .TRUE.
To select the complex eigenvector corresponding to the
complex eigenvalue (wr(j), wi(j)) with complex conjugate
(wr(j+1), wi(j+1)), set select(j) and/or select(j+1) to
.TRUE.; the eigenvector corresponding to the first
eigenvalue in the pair is computed.
For complex flavors:
To select the eigenvector corresponding to the eigenvalue
w(j), set select(j) to .TRUE.
h, vl, vr, REAL for shsein
DOUBLE PRECISION for dhsein
1021
COMPLEX for chsein

DOUBLE COMPLEX for zhsein.
Arrays:
h(ldh,*) The n-by-n upper Hessenberg matrix H.
(ldvl,*)
If initv = 'V' and job = 'L' or 'B', then vl must
contain starting vectors for inverse iteration for the left
eigenvectors. Each starting vector must be stored in the
same column or columns as will be used to store the
corresponding eigenvector.
If initv = 'N', then vl need not be set.
The second dimension of vl must be at least max(1, mm) if
job = 'L' or 'B' and at least 1 if job = 'R'.
The array vl is not referenced if job = 'R'.
vr(ldvr,*)
If initv = 'V' and job = 'R' or 'B', then vr must
contain starting vectors for inverse iteration for the right
eigenvectors. Each starting vector must be stored in the
same column or columns as will be used to store the
corresponding eigenvector.
If initv = 'N', then vr need not be set.
The second dimension of vr must be at least max(1, mm) if
job = 'R' or 'B' and at least 1 if job = 'L'.
The array vr is not referenced if job = 'L'.
DIMENSION at least max (1, n*(n+2)) for real flavors and
at least max (1, n*n) for complex flavors.
w COMPLEX for chsein
DOUBLE COMPLEX for zhsein.
Array, DIMENSION at least max (1, n).
Contains the eigenvalues of the matrix H.
If eigsrc = 'Q', the array must be exactly as returned by
?hseqr.
wr, wi REAL for shsein
DOUBLE PRECISION for dhsein
1022
eigenvalues of the matrix H. Complex conjugate pairs of
values must be stored in consecutive elements of the arrays.
If eigsrc = 'Q', the arrays must be exactly as returned
by ?hseqr.
ldvl INTEGER. The first dimension of vl.
If job = 'L' or 'B', ldvl ≥ max(1,n).
If job = 'R', ldvl ≥ 1.
ldvr INTEGER. The first dimension of vr.
If job = 'R' or 'B', ldvr ≥ max(1,n).
If job = 'L', ldvr ≥1.
mm INTEGER. The number of columns in vl and/or vr.
Must be at least m, the actual number of columns required
(see Output Parameters below).
For real flavors, m is obtained by counting 1 for each selected
real eigenvector and 2 for each selected complex eigenvector
(see select).
For complex flavors, m is the number of selected
eigenvectors (see select).
Constraint:
0 ≤ mm ≤ n.
rwork REAL for chsein
DOUBLE PRECISION for zhsein.
Output Parameters
select Overwritten for real flavors only.
If a complex eigenvector was selected as specified above,
then select(j) is set to .TRUE. and select(j+1) to
.FALSE.
w The real parts of some elements of w may be modified, as
close eigenvalues are perturbed slightly in searching for
independent eigenvectors.
1023
wr Some elements of wr may be modified, as close eigenvalues

are perturbed slightly in searching for independent
eigenvectors.
vl, vr If job = 'L' or 'B', vl contains the computed left
eigenvectors (as specified by select).
If job = 'R' or 'B', vr contains the computed right
eigenvectors (as specified by select).
The eigenvectors are stored consecutively in the columns
of the array, in the same order as their eigenvalues.
For real flavors: a real eigenvector corresponding to a
selected real eigenvalue occupies one column; a complex
eigenvector corresponding to a selected complex eigenvalue
occupies two columns: the first column holds the real part
and the second column holds the imaginary part.
m INTEGER. For real flavors: the number of columns of vl
and/or vr required to store the selected eigenvectors.
For complex flavors: the number of selected eigenvectors.
ifaill, ifailr INTEGER.
Arrays, DIMENSION at least max(1, mm) each.
ifaill(i) = 0 if the ith column of vl converged;
ifaill(i) = j > 0 if the eigenvector stored in the i-th
column of vl (corresponding to the jth eigenvalue) failed
to converge.
ifailr(i) = 0 if the ith column of vr converged;
ifailr(i) = j > 0 if the eigenvector stored in the i-th
column of vr (corresponding to the jth eigenvalue) failed
to converge.
For real flavors: if the ith and (i+1)th columns of vl contain
a selected complex eigenvector, then ifaill(i) and
ifaill(i+1) are set to the same value. A similar rule holds
for vr and ifailr.
The array ifaill is not referenced if job = 'R'. The array
ifailr is not referenced if job = 'L'.
info INTEGER.
1024
If info > 0, then i eigenvectors (as indicated by the
parameters ifaill and/or ifailr above) failed to converge.
The corresponding columns of vl and/or vr contain no useful
information.

Specific details for the routine hsein interface are the following:
select Holds the vector of length n.
vl Holds the matrix VL of size (n,mm).
vr Holds the matrix VR of size (n,mm).
ifaill Holds the vector of length (mm). Note that there will be an error condition
if ifaill is present and vl is omitted.
ifailr Holds the vector of length (mm). Note that there will be an error condition
if ifailr is present and vr is omitted.
initv Must be 'N' or 'U'. The default value is 'N'.
eigsrc Must be 'N' or 'Q'. The default value is 'N'.
job Restored based on the presence of arguments vl and vr as follows:
job = 'B', if both vl and vr are present,
job = 'L', if vl is present and vr omitted,
job = 'R', if vl is omitted and vr present,
Note that there will be an error condition if both vl and vr are omitted.
Application Notes
Each computed right eigenvector x i is the exact eigenvector of a nearby matrix A + Ei, such
that ||Ei|| < O(ε)||A||. Hence the residual is small:
||Axi - λixi|| = O(ε)||A||.
1025
However, eigenvectors corresponding to close or coincident eigenvalues may not accurately

span the relevant subspaces.
Similar remarks apply to computed left eigenvectors.
?trevc
Computes selected eigenvectors of an upper
(quasi-) triangular matrix computed by ?hseqr.
Syntax
FORTRAN 77:
call strevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work,
info)
call dtrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work,
info)
call ctrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work,
rwork, info)
call ztrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work,
rwork, info)
Fortran 95:
call trevc(t [, howmny] [,select] [,vl] [,vr] [,m] [,info])
Description
The routine computes some or all of the right and/or left eigenvectors of an upper triangular
matrix T (or, for real flavors, an upper quasi-triangular matrix T). Matrices of this type are
produced by the Schur factorization of a general matrix: A = Q*T*QH, as computed by ?hseqr.
The right eigenvector x and the left eigenvector y of T corresponding to an eigenvalue w, are
defined by:
H
T*x = w*x, yH*T = w*yH, where y denotes the conjugate transpose of y.
The eigenvalues are not input to this routine, but are read directly from the diagonal blocks of
T.
1026
This routine returns the matrices X and/or Y of right and left eigenvectors of T, or the products
Q*X and/or Q*Y, where Q is an input matrix.
If Q is the orthogonal/unitary factor that reduces a matrix A to Schur form T, then Q*X and Q*Y
are the matrices of right and left eigenvectors of A.
Input Parameters
side CHARACTER*1. Must be 'R' or 'L' or 'B'.
If side = 'R' , then only right eigenvectors are computed.
If side = 'L' , then only left eigenvectors are computed.
If side = 'B', then all eigenvectors are computed.
howmny CHARACTER*1. Must be 'A' or 'B' or 'S'.
If howmny = 'A' , then all eigenvectors (as specified by
side) are computed.
If howmny = 'B' , then all eigenvectors (as specified by
side) are computed and backtransformed by the matrices
supplied in vl and vr.
If howmny = 'S' , then selected eigenvectors (as specified
by side and select) are computed.
select LOGICAL.
If howmny = 'S', select specifies which eigenvectors are
to be computed.
If howmny = 'A' or 'B', select is not referenced.
For real flavors:
If omega(j) is a real eigenvalue, the corresponding real
eigenvector is computed if select(j) is .TRUE..
If omega(j) and omega(j+1) are the real and imaginary parts
of a complex eigenvalue, the corresponding complex
eigenvector is computed if either select(j) or select(j+1)
is .TRUE., and on exit select(j) is set to .TRUE.and
select(j+1) is set to .FALSE..
The eigenvector corresponding to the j-th eigenvalue is
computed if select(j) is .TRUE..
t, vl, vr, REAL for strevc
1027
DOUBLE PRECISION for dtrevc

COMPLEX for ctrevc
DOUBLE COMPLEX for ztrevc.
Arrays:
t(ldt,*) contains the n-by-n matrix T in Schur canonical
form.
The second dimension of t must be at least max(1, n).
vl(ldvl,*)
If howmny = 'B' and side = 'L' or 'B', then vl must
contain an n-by-n matrix Q (usually the matrix of Schur
vectors returned by ?hseqr).
If howmny = 'A' or 'S', then vl need not be set.
side = 'L' or 'B' and at least 1 if side = 'R'.
The array vl is not referenced if side = 'R'.
vr (ldvr,*)
If howmny = 'B' and side = 'R' or 'B', then vr must
contain an n-by-n matrix Q (usually the matrix of Schur
vectors returned by ?hseqr). .
If howmny = 'A' or 'S', then vr need not be set.
side = 'R' or 'B' and at least 1 if side = 'L'.
The array vr is not referenced if side = 'L'.
DIMENSION at least max (1, 3*n) for real flavors and at
least max (1, 2*n) for complex flavors.
ldt INTEGER. The first dimension of t; at least max(1, n).
If side = 'L' or 'B', ldvl ≥ n.
If side = 'R', ldvl ≥ 1.
If side = 'R' or 'B', ldvr ≥ n.
If side = 'L', ldvr ≥ 1.
mm INTEGER. The number of columns in the arrays vl and/or
vr. Must be at least m (the precise number of columns
required).
1028
If howmny = 'A' or 'B', m = n.
If howmny = 'S': for real flavors, m is obtained by counting
1 for each selected real eigenvector and 2 for each selected
complex eigenvector;
for complex flavors, m is the number of selected eigenvectors
(see select).
Constraint: 0 ≤ m ≤ n.
rwork REAL for ctrevc
DOUBLE PRECISION for ztrevc.
Workspace array, DIMENSION at least max (1, n).
Output Parameters
select If a complex eigenvector of a real matrix was selected as
specified above, then select(j) is set to .TRUE. and
select(j+1) to .FALSE.
vl, vr If side = 'L' or 'B', vl contains the computed left
eigenvectors (as specified by howmny and select).
If side = 'R' or 'B', vr contains the computed right
eigenvectors (as specified by howmny and select).
The eigenvectors are stored consecutively in the columns
of the array, in the same order as their eigenvalues.
For real flavors: corresponding to each real eigenvalue is a
real eigenvector, occupying one column;corresponding to
each complex conjugate pair of eigenvalues is a complex
eigenvector, occupying two columns; the first column holds
the real part and the second column holds the imaginary
part.
m INTEGER.
For complex flavors: the number of selected eigenvectors.
If howmny = 'A' or 'B', m is set to n.
For real flavors: the number of columns of vl and/or vr
actually used to store the selected eigenvectors.
info INTEGER.
1029

Specific details for the routine trevc interface are the following:
t Holds the matrix T of size (n,n).
side If omitted, this argument is restored based on the presence of
arguments vl and vr as follows:
side = 'B', if both vl and vr are present,
side = 'L', if vr is omitted,
side = 'R', if vl is omitted.
howmny If omitted, this argument is restored based on the presence of argument
select as follows:
howmny = 'V', if q is present,
howmny = 'N', if q is omitted.
If present, vect = 'V' or 'U' and the argument q must also be
present.
Note that there will be an error condition if both select and howmny
are present.
Application Notes
If x i is an exact right eigenvector and yi is the corresponding computed eigenvector, then the
angle θ(yi, xi) between them is bounded as follows: θ(yi,xi)≤(c(n)ε||T||2)/sepi where
sepi is the reciprocal condition number of xi. The condition number sepi may be computed by
calling ?trsna.
1030
?trsna
Estimates condition numbers for specified
eigenvalues and right eigenvectors of an upper
(quasi-) triangular matrix.
Syntax
FORTRAN 77:
call strsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm,
m, work, ldwork, iwork, info)
call dtrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm,
m, work, ldwork, iwork, info)
call ctrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm,
m, work, ldwork, rwork, info)
call ztrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm,
m, work, ldwork, rwork, info)
Fortran 95:
call trsna(t [, s] [,sep] [,vl] [,vr] [,select] [,m] [,info])
Description
The routine estimates condition numbers for specified eigenvalues and/or right eigenvectors
of an upper triangular matrix T (or, for real flavors, upper quasi-triangular matrix T in canonical
Schur form). These are the same as the condition numbers of the eigenvalues and right
eigenvectors of an original matrix A = Z*T*ZH (with unitary or, for real flavors, orthogonal Z),
from which T may have been derived.
The routine computes the reciprocal of the condition number of an eigenvalue lambda(i) as
si = |vH*u|/(||u||E||v||E), where u and v are the right and left eigenvectors of T,
respectively, corresponding to lambda(i). This reciprocal condition number always lies between
zero (ill-conditioned) and one (well-conditioned).
An approximate error estimate for a computed eigenvalue lambda(i)is then given by
ε*||T||/si, where ε is the machine precision.
1031
To estimate the reciprocal of the condition number of the right eigenvector corresponding to
lambda(i), the routine first calls ?trexc to reorder the eigenvalues so that lambda(i) is in
the leading position:
The reciprocal condition number of the eigenvector is then estimated as sepi, the smallest
singular value of the matrix T22 - lambda(i)*I. This number ranges from zero (ill-conditioned)
to very large (well-conditioned).
An approximate error estimate for a computed right eigenvector u corresponding to lambda(i)
is then given by ε*||T||/sepi.
Input Parameters
job CHARACTER*1. Must be 'E' or 'V' or 'B'.
If job = 'E', then condition numbers for eigenvalues only
are computed.
If job = 'V', then condition numbers for eigenvectors only
are computed.
If job = 'B', then condition numbers for both eigenvalues
and eigenvectors are computed.
howmny CHARACTER*1. Must be 'A' or 'S'.
If howmny = 'A' , then the condition numbers for all
eigenpairs are computed.
If howmny = 'S' , then condition numbers for selected
eigenpairs (as specified by select) are computed.
select LOGICAL.
Array, DIMENSION at least max (1, n) if howmny = 'S' and
at least 1 otherwise.
Specifies the eigenpairs for which condition numbers are to
be computed if howmny= 'S'.
For real flavors:
1032
To select condition numbers for the eigenpair corresponding
to the real eigenvalue lambda(j), select(j) must be set
.TRUE.;
to select condition numbers for the eigenpair corresponding
to a complex conjugate pair of eigenvalues lambda(j) and
lambda(j+1), select(j) and/or select(j+1) must be set
.TRUE.
For complex flavors
to the eigenvalue lambda(j), select(j) must be set .TRUE.
select is not referenced if howmny = 'A'.
t, vl, vr, work REAL for strsna
DOUBLE PRECISION for dtrsna
COMPLEX for ctrsna
DOUBLE COMPLEX for ztrsna.
Arrays:
t(ldt,*) contains the n-by-n matrix T.
vl(ldvl,*)
If job = 'E' or 'B', then vl must contain the left
H
eigenvectors of T (or of any matrix Q*T*Q with Q unitary
or orthogonal) corresponding to the eigenpairs specified by
howmny and select. The eigenvectors must be stored in
consecutive columns of vl, as returned by ?trevc or
?hsein.
job = 'E' or 'B' and at least 1 if job = 'V'.
The array vl is not referenced if job = 'V'.
vr(ldvr,*)
If job = 'E' or 'B', then vr must contain the right
H
eigenvectors of T (or of any matrix Q*T*Q with Q unitary
or orthogonal) corresponding to the eigenpairs specified by
howmny and select. The eigenvectors must be stored in
consecutive columns of vr, as returned by ?trevc or
?hsein.
1033

job = 'E' or 'B' and at least 1 if job = 'V'.
The array vr is not referenced if job = 'V'.
work is a workspace array, its dimension (ldwork,n+6).
The array work is not referenced if job = 'E'.
If job = 'E' or 'B', ldvl ≥ max(1,n).
If job = 'V', ldvl ≥ 1.
If job = 'E' or 'B', ldvr ≥ max(1,n).
If job = 'R', ldvr ≥ 1.
mm INTEGER. The number of elements in the arrays s and sep,
and the number of columns in vl and vr (if used). Must be
at least m (the precise number required).
If howmny = 'A', m = n;
if howmny = 'S', for real flavors m is obtained by counting
1 for each selected real eigenvalue and 2 for each selected
complex conjugate pair of eigenvalues.
for complex flavors m is the number of selected eigenpairs
(see select). Constraint:
0 ≤ m ≤ n.
ldwork INTEGER. The first dimension of work.
If job = 'V' or 'B', ldwork ≥ max(1,n).
If job = 'E', ldwork ≥ 1.
rwork REAL for ctrsna, ztrsna.
Array, DIMENSION at least max (1, n). The array is not
referenced if job = 'E'.
iwork INTEGER for strsna, dtrsna.
Array, DIMENSION at least max (1, 2*(n - 1)). The array is
not referenced if job = 'E'.
1034
Output Parameters
s REAL for single-precision flavors
Array, DIMENSION at least max(1, mm) if job = 'E' or 'B'
and at least 1 if job = 'V'.
Contains the reciprocal condition numbers of the selected
eigenvalues if job = 'E' or 'B', stored in consecutive
elements of the array. Thus s(j), sep(j) and the j-th
columns of vl and vr all correspond to the same eigenpair
(but not in general the j th eigenpair unless all eigenpairs
have been selected).
For real flavors: for a complex conjugate pair of eigenvalues,
two consecutive elements of S are set to the same value.
The array s is not referenced if job = 'V'.
sep REAL for single-precision flavors
Array, DIMENSION at least max(1, mm) if job = 'V' or 'B'
and at least 1 if job = 'E'. Contains the estimated
reciprocal condition numbers of the selected right
eigenvectors if job = 'V' or 'B', stored in consecutive
elements of the array.
For real flavors: for a complex eigenvector, two consecutive
elements of sep are set to the same value; if the eigenvalues
cannot be reordered to compute sep(j), then sep(j) is set
to zero; this can only occur when the true value would be
very small anyway. The array sep is not referenced if job
= 'E'.
m INTEGER.
For complex flavors: the number of selected eigenpairs.
If howmny = 'A', m is set to n.
For real flavors: the number of elements of s and/or sep
actually used to store the estimated condition numbers.
info INTEGER.
1035

Specific details for the routine trsna interface are the following:
s Holds the vector of length (mm).
sep Holds the vector of length (mm).
job Restored based on the presence of arguments s and sep as follows:
job = 'B', if both s and sep are present,
job = 'E', if s is present and sep omitted,
job = 'V', if s is omitted and sep present.
Note an error condition if both s and sep are omitted.
howmny Restored based on the presence of the argument select as follows:
howmny = 'S', if select is present,
howmny = 'A', if select is omitted.
Note that the arguments s, vl, and vr must either be all present or all omitted.
Otherwise, an error condition is observed.
Application Notes
The computed values sepi may overestimate the true value, but seldom by a factor of more
than 3.
1036
?trexc
Reorders the Schur factorization of a general
matrix.
Syntax
FORTRAN 77:
call strexc(compq, n, t, ldt, q, ldq, ifst, ilst, work, info)
call dtrexc(compq, n, t, ldt, q, ldq, ifst, ilst, work, info)
call ctrexc(compq, n, t, ldt, q, ldq, ifst, ilst, info)
call ztrexc(compq, n, t, ldt, q, ldq, ifst, ilst, info)
Fortran 95:
call trexc(t, ifst, ilst [,q] [,info])
Description
The routine reorders the Schur factorization of a general matrix A = Q*T*QH, so that the
diagonal element or block of T with row index ifst is moved to row ilst.
The reordered Schur form S is computed by an unitary (or, for real flavors, orthogonal) similarity
transformation: S = ZH*T*Z. Optionally the updated matrix P of Schur vectors is computed as
P = Q*Z, giving A = P*S*PH.
Input Parameters
compq CHARACTER*1. Must be 'V' or 'N'.
If compq = 'V', then the Schur vectors (Q) are updated.
If compq = 'N', then no Schur vectors are updated.
t, q REAL for strexc
DOUBLE PRECISION for dtrexc
COMPLEX for ctrexc
DOUBLE COMPLEX for ztrexc.
Arrays:
1037
t(ldt,*) contains the n-by-n matrix T.

q(ldq,*)
If compq = 'V', then q must contain Q (Schur vectors).
If compq = 'N', then q is not referenced.
The second dimension of q must be at least max(1, n)
if compq = 'V' and at least 1 if compq = 'N'.
ldq INTEGER. The first dimension of q;
If compq = 'N', then ldq≥ 1.
If compq = 'V', then ldq≥ max(1, n).
ifst, ilst INTEGER. 1 ≤ ifst ≤ n; 1 ≤ ilst ≤ n.
Must specify the reordering of the diagonal elements (or
blocks, which is possible for real flavors) of the matrix T.
The element (or block) with row index ifst is moved to row
ilst by a sequence of exchanges between adjacent
elements (or blocks).
work REAL for strexc
DOUBLE PRECISION for dtrexc.
Output Parameters
t Overwritten by the updated matrix S.
q If compq = 'V', q contains the updated matrix of Schur
vectors.
ifst, ilst Overwritten for real flavors only.
If ifst pointed to the second row of a 2 by 2 block on entry,
it is changed to point to the first row; ilst always points
to the first row of the block in its final position (which may
differ from its input value by ±1).
info INTEGER.
1038
Specific details for the routine trexc interface are the following:
compq Restored based on the presence of the argument q as follows:
compq = 'V', if q is present,
compq = 'N', if q is omitted.
Application Notes
The computed matrix S is exactly similar to a matrix T+E, where ||E||2 = O(ε)*||T||2, and
ε is the machine precision.
Note that if a 2 by 2 diagonal block is involved in the re-ordering, its off-diagonal elements are
in general changed; the diagonal elements and the eigenvalues of the block are unchanged
unless the block is sufficiently ill-conditioned, in which case they may be noticeably altered. It
is possible for a 2 by 2 block to break into two 1 by 1 blocks, that is, for a pair of complex
eigenvalues to become purely real.
The values of eigenvalues however are never changed by the re-ordering.
for real flavors: 6n(ifst-ilst) if compq = 'N';
12n(ifst-ilst) if compq = 'V';
for complex flavors: 20n(ifst-ilst) if compq = 'N';
40n(ifst-ilst) if compq = 'V'.
1039
?trsen
Reorders the Schur factorization of a matrix and
(optionally) computes the reciprocal condition
numbers and invariant subspace for the selected
cluster of eigenvalues.
Syntax
FORTRAN 77:
call strsen(job, compq, select, n, t, ldt, q, ldq, wr, wi, m, s, sep, work,
lwork, iwork, liwork, info)
call dtrsen(job, compq, select, n, t, ldt, q, ldq, wr, wi, m, s, sep, work,
call ctrsen(job, compq, select, n, t, ldt, q, ldq, w, m, s, sep, work, lwork,
info)
call ztrsen(job, compq, select, n, t, ldt, q, ldq, w, m, s, sep, work, lwork,
info)
Fortran 95:
call trsen(t, select [,wr] [,wi] [,m] [,s] [,sep] [,q] [,info])
call trsen(t, select [,w] [,m] [,s] [,sep] [,q] [,info])
Description
H
The routine reorders the Schur factorization of a general matrix A = Q*T*Q so that a selected
cluster of eigenvalues appears in the leading diagonal elements (or, for real flavors, diagonal
blocks) of the Schur form. The reordered Schur form R is computed by an unitary (orthogonal)
similarity transformation: R = ZH*T*Z. Optionally the updated matrix P of Schur vectors is
H
computed as P = Q*Z, giving A = P*R*P .
Let
1040
where the selected eigenvalues are precisely the eigenvalues of the leading m-by-m submatrix
T11. Let P be correspondingly partitioned as (Q1 Q2) where Q1 consists of the first m columns of
Q. Then A*Q1 = Q1*T11, and so the m columns of Q1 form an orthonormal basis for the invariant
subspace corresponding to the selected cluster of eigenvalues.
Optionally the routine also computes estimates of the reciprocal condition numbers of the
average of the cluster of eigenvalues and of the invariant subspace.
Input Parameters
job CHARACTER*1. Must be 'N' or 'E' or 'V' or 'B'.
If job = 'N', then no condition numbers are required.
If job = 'E', then only the condition number for the cluster
of eigenvalues is computed.
If job = 'V', then only the condition number for the
invariant subspace is computed.
If job = 'B', then condition numbers for both the cluster
and the invariant subspace are computed.
compq CHARACTER*1. Must be 'V' or 'N'.
If compq = 'V', then Q of the Schur vectors is updated.
If compq = 'N', then no Schur vectors are updated.
select LOGICAL.
Specifies the eigenvalues in the selected cluster. To select
an eigenvalue lambda(j), select(j) must be .TRUE.
For real flavors: to select a complex conjugate pair of
eigenvalues lambda(j) and lambda(j+1) (corresponding 2
by 2 diagonal block), select(j) and/or select(j+1) must
be .TRUE.; the complex conjugate lambda(j)and
lambda(j+1) must be either both included in the cluster or
both excluded.
t, q, work REAL for strsen
1041
DOUBLE PRECISION for dtrsen

COMPLEX for ctrsen
DOUBLE COMPLEX for ztrsen.
Arrays:
t (ldt,*) The n-by-n T.
q (ldq,*)
If compq = 'V', then q must contain Q of Schur vectors.
The second dimension of q must be at least max(1, n) if
compq = 'V' and at least 1 if compq = 'N'.
If compq = 'N', then ldq ≥ 1.
If compq = 'V', then ldq ≥ max(1, n).
If job = 'V' or 'B', lwork ≥ max(1,2*m*(n-m)).
If job = 'E', then lwork ≥ max(1, m*(n-m))
If job = 'N', then lwork ≥ 1 for complex flavors and
lwork ≥ max(1,n) for real flavors.
iwork INTEGER.iwork(liwork) is a workspace array. The array
iwork is not referenced if job = 'N' or 'E'.
The actual amount of workspace required cannot exceed
2
n /2 if job = 'V' or 'B'.
liwork INTEGER.
If job = 'V' or 'B', liwork ≥ max(1,2m(n-m)).
If job = 'E' or 'E', liwork ≥ 1.
1042
See Application Notes for details.
Output Parameters
t Overwritten by the updated matrix R.
q If compq = 'V', q contains the updated matrix of Schur
vectors; the first m columns of the Q form an orthogonal
basis for the specified invariant subspace.
w COMPLEX for ctrsen
DOUBLE COMPLEX for ztrsen.
Array, DIMENSION at least max(1, n). The recorded
eigenvalues of R. The eigenvalues are stored in the same
order as on the diagonal of R.
wr, wi REAL for strsen
DOUBLE PRECISION for dtrsen
Arrays, DIMENSION at least max(1, n). Contain the real and
imaginary parts, respectively, of the reordered eigenvalues
of R. The eigenvalues are stored in the same order as on
the diagonal of R. Note that if a complex eigenvalue is
sufficiently ill-conditioned, then its value may differ
significantly from its value before reordering.
m INTEGER.
For complex flavors: the number of the specified invariant
subspaces, which is the same as the number of selected
eigenvalues (see select).
For real flavors: the dimension of the specified invariant
subspace. The value of m is obtained by counting 1 for each
selected real eigenvalue and 2 for each selected complex
conjugate pair of eigenvalues (see select).
Constraint: 0 ≤ m ≤ n.
1043
If job = 'E' or 'B', s is a lower bound on the reciprocal

condition number of the average of the selected cluster of
eigenvalues.
If m = 0 or n, then s = 1.
For real flavors: if info = 1, then s is set to zero.s is not
referenced if job = 'N' or 'V'.
sep REAL for single-precision flavors DOUBLE PRECISION for
double-precision flavors.
If job = 'V' or 'B', sep is the estimated reciprocal
condition number of the specified invariant subspace.
If m = 0 or n, then sep = |T|.
For real flavors: if info = 1, then sep is set to zero.
sep is not referenced if job = 'N' or 'E'.
work(1) On exit, if info = 0, then work(1) returns the optimal size
of lwork.
size of liwork.
info INTEGER.

Specific details for the routine trsen interface are the following:
compq Restored based on the presence of the argument q as follows: compq
= 'V', if q is present, compq = 'N', if q is omitted.
1044
job Restored based on the presence of arguments s and sep as follows:
job = 'B', if both s and sep are present,
job = 'E', if s is present and sep omitted,
job = 'V', if s is omitted and sep present,
job = 'N', if both s and sep are omitted.
Application Notes
The computed matrix R is exactly similar to a matrix T+E, where ||E||2 = O(ε)|*|T||2, and
ε is the machine precision. The computed s cannot underestimate the true reciprocal condition
number by more than a factor of (min(m, n-m))1/2; sep may differ from the true value by
2
(m*n-m )1/2. The angle between the computed invariant subspace and the true subspace is
O(ε)*||A||2/sep. Note that if a 2-by-2 diagonal block is involved in the re-ordering, its
off-diagonal elements are in general changed; the diagonal elements and the eigenvalues of
the block are unchanged unless the block is sufficiently ill-conditioned, in which case they may
be noticeably altered. It is possible for a 2-by-2 block to break into two 1-by-1 blocks, that is,
for a pair of complex eigenvalues to become purely real. The values of eigenvalues however
are never changed by the re-ordering.
the first run or set lwork = -1 (liwork = -1).
described, the routine completes the task, though probably not so fast as with a recommended
array (work, iwork) on exit. Use this value (work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended
workspace in the first element of the corresponding array (work, iwork). This operation is called
a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the
1045
?trsyl
Solves Sylvester equation for real quasi-triangular
or complex triangular matrices.
Syntax
FORTRAN 77:
call strsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call dtrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call ctrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call ztrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
Fortran 95:
call trsyl(a, b, c, scale [, trana] [,tranb] [,isgn] [,info])
Description
The routine solves the Sylvester matrix equation op(A)*X ± X*op(B) = α*C, where op(A)
H
= A or A , and the matrices A and B are upper triangular (or, for real flavors, upper
quasi-triangular in canonical Schur form); α ≤ 1 is a scale factor determined by the routine to
avoid overflow in X; A is m-by-m, B is n-by-n, and C and X are both m-by-n. The matrix X is
obtained by a straightforward process of back substitution.
The equation has a unique solution if and only if αi ± βi ≠ 0, where {αi} and {βi} are the
eigenvalues of A and B, respectively, and the sign (+ or -) is the same as that used in the
equation to be solved.
Input Parameters
trana CHARACTER*1. Must be 'N' or 'T' or 'C'.
If trana = 'N' , then op(A) = A.
If trana = 'T', then op(A) = AT (real flavors only).
If trana = 'C' then op(A) = AH.
tranb CHARACTER*1. Must be 'N' or 'T' or 'C'.
1046
If tranb = 'N' , then op(B) = B.
If tranb = 'T', then op(B) = BT (real flavors only).
If tranb = 'C', then op(B) = BH.
isgn INTEGER. Indicates the form of the Sylvester equation.
If isgn = +1, op(A)*X + X*op(B) = alpha*C.
If isgn = -1, op(A)*X - X*op(B) = alpha*C.
m INTEGER. The order of A, and the number of rows in X and
C (m ≥ 0).
n INTEGER. The order of B, and the number of columns in X
and C (n ≥ 0).
a, b, c REAL for strsyl
DOUBLE PRECISION for dtrsyl
COMPLEX for ctrsyl
DOUBLE COMPLEX for ztrsyl.
Arrays:
The second dimension of c must be at least max(1, n).
ldc INTEGER. The first dimension of c; at least max(1, n).
Output Parameters
c Overwritten by the solution matrix X.
The value of the scale factor α.
info INTEGER.
1047
If info = 1, A and B have common or close eigenvalues

perturbed values were used to solve the equation.

Specific details for the routine trsyl interface are the following:
a Holds the matrix A of size (m,m).
trana Must be 'N', 'C', or 'T'. The default value is 'N'.
tranb Must be 'N', 'C', or 'T'. The default value is 'N'.
isgn Must be +1 or -1. The default value is +1.
Application Notes
Let X be the exact, Y the corresponding computed solution, and R the residual matrix: R = C
- (AY ± YB). Then the residual is always small:
||R||F = O(ε)*(||A||F +||B||F)*||Y||F.

However, Y is not necessarily the exact solution of a slightly perturbed equation; in other words,
the solution is not backwards stable.
For the forward error, the following bound holds:
||Y - X||F ≤||R||F/sep(A,B)

but this may be a considerable overestimate. See [Golub96] for a definition of sep(A, B).
The approximate number of floating-point operations for real flavors is m*n*(m + n). For
complex flavors it is 4 times greater.
Generalized Nonsymmetric Eigenvalue Problems

This section describes LAPACK routines for solving generalized nonsymmetric eigenvalue
problems, reordering the generalized Schur factorization of a pair of matrices, as well as
performing a number of related computational tasks.
1048
A generalized nonsymmetric eigenvalue problem is as follows: given a pair of nonsymmetric
(or non-Hermitian) n-by-n matrices A and B, find the generalized eigenvalues λ and the
corresponding generalized eigenvectors x and y that satisfy the equations
Ax = λBx (right generalized eigenvectors x)

and
yHA = λyHB (left generalized eigenvectors y).

Table 4-6 lists LAPACK routines (FORTRAN 77 interface) used to solve the generalized
nonsymmetric eigenvalue problems and the generalized Sylvester equation. Respective routine
names in Fortran 95 interface are without the first symbol (see Routine Naming Conventions).
Table 4-6 Computational Routines for Solving Generalized Nonsymmetric Eigenvalue
Problems
Routine Operation performed
name
?gghrd Reduces a pair of matrices to generalized upper Hessenberg form using

orthogonal/unitary transformations.
?ggbal Balances a pair of general real or complex matrices.
?ggbak Forms the right or left eigenvectors of a generalized eigenvalue problem.
?hgeqz Implements the QZ method for finding the generalized eigenvalues of the
matrix pair (H,T).
?tgevc Computes some or all of the right and/or left generalized eigenvectors of a
pair of upper triangular matrices
?tgexc Reorders the generalized Schur decomposition of a pair of matrices (A,B) so

that one diagonal block of (A,B) moves to another row index.
?tgsen Reorders the generalized Schur decomposition of a pair of matrices (A,B) so

that a selected cluster of eigenvalues appears in the leading diagonal blocks
of (A,B).
?tgsyl Solves the generalized Sylvester equation.
?tgsna Estimates reciprocal condition numbers for specified eigenvalues and/or

eigenvectors of a pair of matrices in generalized real Schur canonical form.
1049
?gghrd
Reduces a pair of matrices to generalized upper
Hessenberg form using orthogonal/unitary
transformations.
Syntax
FORTRAN 77:
call sgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call dgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call cgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call zgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
Fortran 95:
call gghrd(a, b [,ilo] [,ihi] [,q] [,z] [,compq] [,compz] [,info])
Description
The routine reduces a pair of real/complex matrices (A,B) to generalized upper Hessenberg
form using orthogonal/unitary transformations, where A is a general matrix and B is upper
triangular. The form of the generalized eigenvalue problem is A*x = λ*B*x, and B is typically
made upper triangular by computing its QR factorization and moving the orthogonal matrix Q
to the left side of the equation.
This routine simultaneously reduces A to a Hessenberg matrix H:
QH*A*Z = H
and transforms B to another upper triangular matrix T:
QH*B*Z = T
in order to reduce the problem to its standard form H*y = λ*T*y, where y = ZH*x.
The orthogonal/unitary matrices Q and Z are determined as products of Givens rotations. They
may either be formed explicitly, or they may be postmultiplied into input matrices Q1 and Z1,
so that
1050
Q1*A*Z1H = (Q1*Q)*H*(Z1*Z)H
Q1*B*Z1H = (Q1*Q)*T*(Z1*Z)H
If Q1 is the orthogonal/unitary matrix from the QR factorization of B in the original equation A*x
= λ*B*x, then the routine ?gghrd reduces the original problem to generalized Hessenberg
form.
Input Parameters
compq CHARACTER*1. Must be 'N', 'I', or 'V'.
If compq = 'N', matrix Q is not computed.
If compq = 'I', Q is initialized to the unit matrix, and the
orthogonal/unitary matrix Q is returned;
If compq = 'V', Q must contain an orthogonal/unitary
matrix Q1 on entry, and the product Q1*Q is returned.
compz CHARACTER*1. Must be 'N', 'I', or 'V'.
If compz = 'N', matrix Z is not computed.
If compz = 'I', Z is initialized to the unit matrix, and the
orthogonal/unitary matrix Z is returned;
If compz = 'V', Z must contain an orthogonal/unitary
matrix Z1 on entry, and the product Z1*Z is returned.
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of A
which are to be reduced. It is assumed that A is already
upper triangular in rows and columns 1:ilo-1 and ihi+1:n.
Values of ilo and ihi are normally set by a previous call
to ?ggbal; otherwise they should be set to 1 and n
respectively.
Constraint:
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n;
if n = 0, then ilo = 1 and ihi = 0.
a, b, q, z REAL for sgghrd
DOUBLE PRECISION for dgghrd
COMPLEX for cgghrd
DOUBLE COMPLEX for zgghrd.
Arrays:
1051
a(lda,*) contains the n-by-n general matrix A. The second

b(ldb,*) contains the n-by-n upper triangular matrix B.
q (ldq,*)
If compq = 'V', then q must contain the orthogonal/unitary
matrix Q1, typically from the QR factorization of B.
z (ldz,*)
If compq = 'N', then z is not referenced.
If compq = 'V', then z must contain the orthogonal/unitary
matrix Z1.
If compq = 'I'or 'V', then ldq ≥ max(1, n).
If compq = 'N', then ldz ≥ 1.
If compq = 'I'or 'V', then ldz ≥ max(1, n).
Output Parameters
a On exit, the upper triangle and the first subdiagonal of A
are overwritten with the upper Hessenberg matrix H, and
the rest is set to zero.
b On exit, overwritten by the upper triangular matrix T =
H
Q *B*Z. The elements below the diagonal are set to zero.
q If compq = 'I', then q contains the orthogonal/unitary
matrix Q, ;
If compq = 'V', then q is overwritten by the product Q1*Q.
z If compq = 'I', then z contains the orthogonal/unitary
matrix Z;
If compq = 'V', then z is overwritten by the product Z1*Z.
1052
info INTEGER.

Specific details for the routine gghrd interface are the following:
compq If omitted, this argument is restored based on the presence of argument
q as follows: compq = 'I', if q is present, compq = 'N', if q is omitted.
If present, compq must be equal to 'I' or 'V' and the argument q
must also be present. Note that there will be an error condition if compq
is present and q omitted.
1053
?ggbal
Balances a pair of general real or complex matrices.
Syntax
FORTRAN 77:
call sggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call dggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call cggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call zggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
Fortran 95:
call ggbal(a, b [,ilo] [,ihi] [,lscale] [,rscale] [,job] [,info])
Description
The routine balances a pair of general real/complex matrices (A,B). This involves, first, permuting
A and B by similarity transformations to isolate eigenvalues in the first 1 to ilo-1 and last
ihi+1 to n elements on the diagonal;and second, applying a diagonal similarity transformation
to rows and columns ilo to ihi to make the rows and columns as close in norm as possible.
Both steps are optional. Balancing may reduce the 1-norm of the matrices, and improve the
accuracy of the computed eigenvalues and/or eigenvectors in the generalized eigenvalue problem
A*x = λ*B*x.
Input Parameters
job CHARACTER*1. Specifies the operations to be performed on
A and B. Must be 'N' or 'P' or 'S' or 'B'.
If job = 'N', then no operations are done; simply set ilo
=1, ihi=n, lscale(i) =1.0 and rscale(i)=1.0 for
i = 1,..., n.
If job = 'P', then permute only.
If job = 'S', then scale only.
If job = 'B', then both permute and scale.
1054
a, b REAL for sggbal
DOUBLE PRECISION for dggbal
COMPLEX for cggbal
DOUBLE COMPLEX for zggbal.
Arrays:
a(lda,*) contains the matrix A. The second dimension of a
b(ldb,*) contains the matrix B. The second dimension of b
work REAL for single precision flavors
Workspace array, DIMENSION at least max(1, 6n) when
job = 'S'or 'B', or at least 1 when job = 'N'or 'P'.
Output Parameters
a, b Overwritten by the balanced matrices A and B, respectively.
If job = 'N', a and b are not referenced.
ilo, ihi INTEGER. ilo and ihi are set to integers such that on exit
a(i,j)=0 and b(i,j)=0 if i>j and j=1,...,ilo-1 or
i=ihi+1,..., n.
If job = 'N'or 'S', then ilo = 1 and ihi = n.
lscale, rscale REAL for single precision flavors
lscale contains details of the permutations and scaling
factors applied to the left side of A and B.
If Pj is the index of the row interchanged with row j, and
Dj is the scaling factor applied to row j, then
lscale(j) = Pj, for j = 1,..., ilo-1
= Dj, for j = ilo,...,ihi
= Pj, for j = ihi+1,..., n.
rscale contains details of the permutations and scaling
factors applied to the right side of A and B.
1055
If Pj is the index of the column interchanged with column

j, and Dj is the scaling factor applied to column j, then
rscale(j) = Pj, for j = 1,..., ilo-1
= Dj, for j = ilo,...,ihi
= Pj, for j = ihi+1,..., n
then 1 to ilo-1.
info INTEGER.

Specific details for the routine ggbal interface are the following:
lscale Holds the vector of length (n).
rscale Holds the vector of length (n).
1056
?ggbak
Forms the right or left eigenvectors of a generalized
eigenvalue problem.
Syntax
FORTRAN 77:
call sggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call dggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call cggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call zggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
Fortran 95:
call ggbak(v [, ilo] [,ihi] [,lscale] [,rscale] [,job] [,info])
Description
The routine forms the right or left eigenvectors of a real/complex generalized eigenvalue problem
A*x = λ*B*x
by backward transformation on the computed eigenvectors of the balanced pair of matrices
output by ?ggbal.
Input Parameters
job CHARACTER*1. Specifies the type of backward transformation
required. Must be 'N', 'P', 'S', or 'B'.
If job = 'N', then no operations are done; return.
If job = 'P', then do backward transformation for
permutation only.
If job = 'S', then do backward transformation for scaling
only.
If job = 'B', then do backward transformation for both
permutation and scaling. This argument must be the same
as the argument job supplied to ?ggbal.
1057

If side = 'L' , then v contains left eigenvectors.
If side = 'R' , then v contains right eigenvectors.
n INTEGER. The number of rows of the matrix V (n ≥ 0).
ilo, ihi INTEGER. The integers ilo and ihi determined by ?gebal.
Constraint:
lscale, rscale REAL for single precision flavors
The array lscale contains details of the permutations and/or
scaling factors applied to the left side of A and B, as returned
by ?ggbal.
The array rscale contains details of the permutations and/or
scaling factors applied to the right side of A and B, as
returned by ?ggbal.
m INTEGER. The number of columns of the matrix V
(m ≥ 0).
v REAL for sggbak
DOUBLE PRECISION for dggbak
COMPLEX for cggbak
DOUBLE COMPLEX for zggbak.
Array v(ldv,*). Contains the matrix of right or left
eigenvectors to be transformed, as returned by ?tgevc.
ldv INTEGER. The first dimension of v; at least max(1, n).
Output Parameters
v Overwritten by the transformed eigenvectors
info INTEGER.
1058
Specific details for the routine ggbak interface are the following:
v Holds the matrix V of size (n,m).
lscale Holds the vector of length n.
rscale Holds the vector of length n.
side If omitted, this argument is restored based on the presence of
arguments lscale and rscale as follows:
side = 'L', if lscale is present and rscale omitted,
side = 'R', if lscale is omitted and rscale present.
Note that there will be an error condition if both lscale and rscale
are present or if they both are omitted.
?hgeqz
Implements the QZ method for finding the
generalized eigenvalues of the matrix pair (H,T).
Syntax
FORTRAN 77:
call shgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alphar, alphai,
beta, q, ldq, z, ldz, work, lwork, info)
call dhgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alphar, alphai,
beta, q, ldq, z, ldz, work, lwork, info)
call chgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alpha, beta, q,
ldq, z, ldz, work, lwork, rwork, info)
call zhgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alpha, beta, q,
ldq, z, ldz, work, lwork, rwork, info)
1059
Fortran 95:
call hgeqz(h, t [,ilo] [,ihi] [,alphar] [,alphai] [,beta] [,q] [,z] [,job]
[,compq] [,compz] [,info])
call hgeqz(h, t [,ilo] [,ihi] [,alpha] [,beta] [,q] [,z] [,job] [,compq] [,
compz] [,info])
Description
The routine computes the eigenvalues of a real/complex matrix pair (H,T), where H is an upper
Hessenberg matrix and T is upper triangular, using the double-shift version (for real flavors)
or single-shift version (for complex flavors) of the QZ method. Matrix pairs of this type are
produced by the reduction to generalized upper Hessenberg form of a real/complex matrix pair
(A,B):
A = Q1*H*Z1H , B = Q1*T*Z1H,
as computed by ?gghrd.
For real flavors:
If job = 'S', then the Hessenberg-triangular pair (H,T) is also reduced to generalized Schur
form,
H = Q*S*ZT, T = Q *P*ZT,
where Q and Z are orthogonal matrices, P is an upper triangular matrix, and S is a
quasi-triangular matrix with 1-by-1 and 2-by-2 diagonal blocks. The 1-by-1 blocks correspond
to real eigenvalues of the matrix pair (H,T) and the 2-by-2 blocks correspond to complex
conjugate pairs of eigenvalues.
Additionally, the 2-by-2 upper triangular diagonal blocks of P corresponding to 2-by-2 blocks
of S are reduced to positive diagonal form, that is, if S(j+1,j) is non-zero, then P(j+1,j) =
P(j,j+1) = 0, P(j,j) > 0, and P(j+1,j+1) > 0.
If job = 'S', then the Hessenberg-triangular pair (H,T) is also reduced to generalized Schur
form,
H = Q* S*ZH, T = Q*P*ZH,
where Q and Z are unitary matrices, and S and P are upper triangular.
1060
For all function flavors:
Optionally, the orthogonal/unitary matrix Q from the generalized Schur factorization may be
postmultiplied into an input matrix Q1, and the orthogonal/unitary matrix Z may be postmultiplied
into an input matrix Z1.
If Q1 and Z1 are the orthogonal/unitary matrices from ?gghrd that reduced the matrix pair (A,B)
to generalized upper Hessenberg form, then the output matrices Q1Q and Z 1Z are the
orthogonal/unitary factors from the generalized Schur factorization of (A,B):
A = (Q1Q)*S *(Z1Z)H, B = (Q1Q)*P*(Z1Z)H.

To avoid overflow, eigenvalues of the matrix pair (H,T) (equivalently, of (A,B)) are computed
as a pair of values (alpha,beta). For chgeqz/zhgeqz, alpha and beta are complex, and for
shgeqz/dhgeqz, alpha is complex and beta real. If beta is nonzero, λ = alpha/beta is an
eigenvalue of the generalized nonsymmetric eigenvalue problem (GNEP)
A*x = λ*B*x
and if alpha is nonzero, μ = beta/alpha is an eigenvalue of the alternate form of the GNEP
μ*A*y = B*y .
Real eigenvalues (for real flavors) or the values of alpha and beta for the i-th eigenvalue (for
complex flavors) can be read directly from the generalized Schur form:
alpha = S(i,i), beta = P(i,i).
Input Parameters
job CHARACTER*1. Specifies the operations to be performed.
Must be 'E' or 'S'.
If job = 'E', then compute eigenvalues only;
If job = 'S', then compute eigenvalues and the Schur
form.
compq CHARACTER*1. Must be 'N', 'I', or 'V'.
If compq = 'N', left Schur vectors (q) are not computed;
If compq = 'I', q is initialized to the unit matrix and the
matrix of left Schur vectors of (H,T) is returned;
If compq = 'V', q must contain an orthogonal/unitary
matrix Q1 on entry and the product Q1*Q is returned.
compz CHARACTER*1. Must be 'N', 'I', or 'V'.
1061
If compz = 'N', right Schur vectors (z) are not computed;

If compz = 'I', z is initialized to the unit matrix and the
matrix of right Schur vectors of (H,T) is returned;
If compz = 'V', z must contain an orthogonal/unitary
matrix Z1 on entry and the product Z1*Z is returned.
n INTEGER. The order of the matrices H, T, Q, and Z
(n ≥ 0).
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of H
which are in Hessenberg form. It is assumed that H is
already upper triangular in rows and columns 1:ilo-1 and
ihi+1:n.
Constraint:
h, t, q, z, work REAL for shgeqz
DOUBLE PRECISION for dhgeqz
COMPLEX for chgeqz
DOUBLE COMPLEX for zhgeqz.
Arrays:
On entry, h(ldh,*) contains the n-by-n upper Hessenberg
matrix H.
On entry, t(ldt,*) contains the n-by-n upper triangular
matrix T.
q (ldq,*):
On entry, if compq = 'V', this array contains the
orthogonal/unitary matrix Q1 used in the reduction of (A,B)
to generalized Hessenberg form.
z (ldz,*):
On entry, if compz = 'V', this array contains the
orthogonal/unitary matrix Z1 used in the reduction of (A,B)
to generalized Hessenberg form.
1062
If compq = 'I'or 'V', then ldq ≥ max(1, n).
If compq = 'N', then ldz ≥ 1.
If compq = 'I'or 'V', then ldz ≥ max(1, n).
lwork ≥ max(1, n).
rwork REAL for chgeqz
DOUBLE PRECISION for zhgeqz.
Workspace array, DIMENSION at least max(1, n). Used in
complex flavors only.
Output Parameters
h For real flavors:
If job = 'S', then, on exit, h contains the upper
quasi-triangular matrix S from the generalized Schur
factorization; 2-by-2 diagonal blocks (corresponding to
complex conjugate pairs of eigenvalues) are returned in
standard form, with h(i,i) = h(i+1, i+1) and h(i+1,
i) * h(i, i+1) < 0.
If job = 'E', then on exit the diagonal blocks of h match
those of S, but the rest of h is unspecified.
If job = 'S', then, on exit, h contains the upper triangular
matrix S from the generalized Schur factorization.
1063
If job = 'E', then on exit the diagonal of h matches that

of S, but the rest of h is unspecified.
t If job = 'S', then, on exit, t contains the upper triangular
matrix P from the generalized Schur factorization.
For real flavors:
2-by-2 diagonal blocks of P corresponding to 2-by-2 blocks
of S are reduced to positive diagonal form, that is, if h(j+1,j)
is non-zero, then t(j+1,j)=t(j,j+1)=0 and t(j,j) and
t(j+1,j+1) will be positive.
If job = 'E', then on exit the diagonal blocks of t match
those of P, but the rest of t is unspecified.
if job = 'E', then on exit the diagonal of t matches that
of P, but the rest of t is unspecified.
alphar, alphai REAL for shgeqz;
DOUBLE PRECISION for dhgeqz.
Arrays, DIMENSION at least max(1, n). The real and
imaginary parts, respectively, of each scalar alpha defining
an eigenvalue of GNEP.
If alphai(j) is zero, then the j-th eigenvalue is real; if
positive, then the j-th and (j+1)-th eigenvalues are a
complex conjugate pair, with
alphai(j+1) = -alphai(j).
alpha COMPLEX for chgeqz;
The complex scalars alpha that define the eigenvalues of
GNEP. alphai(i) = S(i,i) in the generalized Schur
factorization.
beta REAL for shgeqz
DOUBLE PRECISION for dhgeqz
COMPLEX for chgeqz
For real flavors:
The scalars beta that define the eigenvalues of GNEP.
1064
Together, the quantities alpha = (alphar(j),
alphai(j)) and beta = beta(j) represent the j-th
eigenvalue of the matrix pair (A,B), in one of the forms
lambda = alpha/beta or mu = beta/alpha. Since either
lambda or mu may overflow, they should not, in general, be
computed.
The real non-negative scalars beta that define the
eigenvalues of GNEP.
beta(i) = P(i,i) in the generalized Schur factorization.
Together, the quantities alpha = alpha(j) and beta =
beta(j) represent the j-th eigenvalue of the matrix pair
(A,B), in one of the forms lambda = alpha/beta or mu =
beta/alpha. Since either lambda or mu may overflow, they
should not, in general, be computed.
q On exit, if compq = 'I', q is overwritten by the
orthogonal/unitary matrix of left Schur vectors of the pair
(H,T), and if compq = 'V', q is overwritten by the
orthogonal/unitary matrix of left Schur vectors of (A,B).
z On exit, if compz = 'I', z is overwritten by the
orthogonal/unitary matrix of right Schur vectors of the pair
(H,T), and if compz = 'V', z is overwritten by the
orthogonal/unitary matrix of right Schur vectors of (A,B).
work(1) If info ≥ 0, on exit, work(1) contains the minimum value
info INTEGER.
If info = 1,..., n, the QZ iteration did not converge.
(H,T) is not in Schur form, but alphar(i), alphai(i) (for real
flavors), alpha(i) (for complex flavors), and beta(i),
i=info+1,..., n should be correct.
If info = n+1,...,2n, the shift calculation failed.
(H,T) is not in Schur form, but alphar(i), alphai(i) (for real
flavors), alpha(i) (for complex flavors), and beta(i), i
=info-n+1,..., n should be correct.
1065

Specific details for the routine hgeqz interface are the following:
alphar Holds the vector of length n. Used in real flavors only.
alphai Holds the vector of length n. Used in real flavors only.
alpha Holds the vector of length n. Used in complex flavors only.
beta Holds the vector of length n.
job Must be 'E' or 'S'. The default value is 'E'.
compq If omitted, this argument is restored based on the presence of argument
q as follows:
compq = 'I', if q is present,
compq = 'N', if q is omitted.
If present, compq must be equal to 'I' or 'V' and the argument q
must also be present.
Note that there will be an error condition if compq is present and q
omitted.
z as follows:
must also be present.
Note an error condition if compz is present and z is omitted.
1066
Application Notes
workspace query.
workspace.
?tgevc
Computes some or all of the right and/or left
generalized eigenvectors of a pair of upper
triangular matrices.
Syntax
FORTRAN 77:
call stgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm,
m, work, info)
call dtgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm,
m, work, info)
call ctgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm,
m, work, rwork, info)
call ztgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm,
m, work, rwork, info)
Fortran 95:
call tgevc(s, p [,howmny] [,select] [,vl] [,vr] [,m] [,info])
1067
Description
The routine computes some or all of the right and/or left eigenvectors of a pair of real/complex
matrices (S,P), where S is quasi-triangular (for real flavors) or upper triangular (for complex
flavors) and P is upper triangular.
Matrix pairs of this type are produced by the generalized Schur factorization of a real/complex
matrix pair (A,B):
A = Q*S*ZH, B = Q*P*ZH
as computed by ?gghrd plus ?hgeqz.
The right eigenvector x and the left eigenvector y of (S,P) corresponding to an eigenvalue w
are defined by:
S*x = w*P*x, yH*S = w*yH*P
The eigenvalues are not input to this routine, but are computed directly from the diagonal
blocks or diagonal elements of S and P.
This routine returns the matrices X and/or Y of right and left eigenvectors of (S,P), or the
products Z*X and/or Q*Y, where Z and Q are input matrices.
If Q and Z are the orthogonal/unitary factors from the generalized Schur factorization of a matrix
pair (A,B), then Z*X and Q*Y are the matrices of right and left eigenvectors of (A,B).
Input Parameters
side CHARACTER*1. Must be 'R', 'L', or 'B'.
If side = 'R' , compute right eigenvectors only.
If side = 'L' , compute left eigenvectors only.
If side = 'B', compute both right and left eigenvectors.
howmny CHARACTER*1. Must be 'A' , 'B', or 'S'.
If howmny = 'A' , compute all right and/or left
eigenvectors.
If howmny = 'B' , compute all right and/or left
eigenvectors, backtransformed by the matrices in vr and/or
vl.
If howmny = 'S' , compute selected right and/or left
eigenvectors, specified by the logical array select.
1068
select LOGICAL.
If howmny = 'S', select specifies the eigenvectors to be
computed.
If howmny = 'A'or 'B', select is not referenced.
For real flavors:
If omega(j) is a real eigenvalue, the corresponding real
eigenvector is computed if select(j) is .TRUE..
If omega(j) and omega(j+1) are the real and imaginary parts
of a complex eigenvalue, the corresponding complex
eigenvector is computed if either select(j) or select(j+1)
is .TRUE., and on exit select(j) is set to .TRUE.and
select(j+1) is set to .FALSE..
The eigenvector corresponding to the j-th eigenvalue is
computed if select(j) is .TRUE..
n INTEGER. The order of the matrices S and P (n ≥ 0).
s, p, vl, vr, work REAL for stgevc
DOUBLE PRECISION for dtgevc
COMPLEX for ctgevc
DOUBLE COMPLEX for ztgevc.
Arrays:
s(lds,*) contains the matrix S from a generalized Schur
factorization as computed by ?hgeqz. This matrix is upper
quasi-triangular for real flavors, and upper triangular for
complex flavors.
The second dimension of s must be at least max(1, n).
p(ldp,*) contains the upper triangular matrix P from a
generalized Schur factorization as computed by ?hgeqz.
For real flavors, 2-by-2 diagonal blocks of P corresponding
to 2-by-2 blocks of S must be in positive diagonal form.
For complex flavors, P must have real diagonal elements.
The second dimension of p must be at least max(1, n).
If side = 'L' or 'B' and howmny = 'B', vl(ldvl,*) must
contain an n-by-n matrix Q (usually the orthogonal/unitary
matrix Q of left Schur vectors returned by ?hgeqz). The
second dimension of vl must be at least max(1, mm).
1069
If side = 'R' , vl is not referenced.

If side = 'R' or 'B' and howmny = 'B', vr(ldvr,*) must
contain an n-by-n matrix Z (usually the orthogonal/unitary
matrix Z of right Schur vectors returned by ?hgeqz). The
second dimension of vr must be at least max(1, mm).
If side = 'L', vr is not referenced.
DIMENSION at least max (1, 6*n) for real flavors and at
least max (1, 2*n) for complex flavors.
lds INTEGER. The first dimension of s; at least max(1, n).
ldp INTEGER. The first dimension of p; at least max(1, n).
ldvl INTEGER. The first dimension of vl;
If side = 'L' or 'B', then ldvl ≥n.
If side = 'R', then ldvl ≥ 1.
ldvr INTEGER. The first dimension of vr;
If side = 'R' or 'B', then ldvr ≥n.
If side = 'L', then ldvr ≥ 1.
mm INTEGER. The number of columns in the arrays vl and/or
vr (mm ≥ m).
rwork REAL for ctgevc DOUBLE PRECISION for ztgevc.
Workspace array, DIMENSION at least max (1, 2*n). Used
in complex flavors only.
Output Parameters
vl On exit, if side = 'L' or 'B', vl contains:
if howmny = 'A', the matrix Y of left eigenvectors of (S,P);
if howmny = 'B', the matrix Q*Y;
if howmny = 'S', the left eigenvectors of (S,P) specified by
select, stored consecutively in the columns of vl, in the
same order as their eigenvalues.
For real flavors:
A complex eigenvector corresponding to a complex
eigenvalue is stored in two consecutive columns, the first
holding the real part, and the second the imaginary part.
vr On exit, if side = 'R' or 'B', vr contains:
1070
if howmny = 'A', the matrix X of right eigenvectors of (S,P);
if howmny = 'B', the matrix Z*X;
if howmny = 'S', the right eigenvectors of (S,P) specified
by select, stored consecutively in the columns of vr, in the
same order as their eigenvalues.
For real flavors:
A complex eigenvector corresponding to a complex
eigenvalue is stored in two consecutive columns, the first
holding the real part, and the second the imaginary part.
m INTEGER. The number of columns in the arrays vl and/or
vr actually used to store the eigenvectors.
For real flavors:
Each selected real eigenvector occupies one column and
each selected complex eigenvector occupies two columns.
Each selected eigenvector occupies one column.
info INTEGER.
For real flavors:
if info = i>0, the 2-by-2 block (i:i+1) does not have a
complex eigenvalue.

Specific details for the routine tgevc interface are the following:
s Holds the matrix S of size (n,n).
p Holds the matrix P of size (n,n).
side Restored based on the presence of arguments vl and vr as follows:
side = 'B', if both vl and vr are present,
1071
side = 'L', if vl is present and vr omitted,

side = 'R', if vl is omitted and vr present,
howmny If omitted, this argument is restored based on the presence of argument
select as follows:
howmny = 'S', if select is present,
howmny = 'A', if select is omitted.
If present, howmny must be equal to 'A' or 'B' and the argument
select must be omitted.
Note that there will be an error condition if both howmny and select
are present.
?tgexc
Reorders the generalized Schur decomposition of
a pair of matrices (A,B) so that one diagonal block
of (A,B) moves to another row index.
Syntax
FORTRAN 77:
call stgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, work,
lwork, info)
call dtgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, work,
lwork, info)
call ctgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, info)
call ztgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, info)
Fortran 95:
call tgexc(a, b [,ifst] [,ilst] [,z] [,q] [,info])
Description
The routine reorders the generalized real-Schur/Schur decomposition of a real/complex matrix
pair (A,B) using an orthogonal/unitary equivalence transformation
1072
(A,B) = Q*(A,B)*ZH,
so that the diagonal block of (A, B) with row index ifst is moved to row ilst. Matrix pair (A,
B) must be in a generalized real-Schur/Schur canonical form (as returned by ?gges), that is,
A is block upper triangular with 1-by-1 and 2-by-2 diagonal blocks and B is upper triangular.
Optionally, the matrices Q and Z of generalized Schur vectors are updated.
Q(in)*A(in)*Z(in)' = Q(out)*A(out)*Z(out)'
Q(in)*B(in)*Z(in)' = Q(out)*B(out)*Z(out)'.
Input Parameters
wantq, wantz LOGICAL.
If wantq = .TRUE., update the left transformation matrix
Q;
If wantq = .FALSE., do not update Q;
If wantz = .TRUE., update the right transformation matrix
Z;
If wantz = .FALSE., do not update Z.
a, b, q, REAL for stgexc
DOUBLE PRECISION for dtgexc
COMPLEX for ctgexc
DOUBLE COMPLEX for ztgexc.
Arrays:
b(ldb,*) contains the matrix B. The second dimension of b
q (ldq,*)
If wantq = .FALSE., then q is not referenced.
If wantq = .TRUE., then q must contain the
orthogonal/unitary matrix Q.
z (ldz,*)
If wantz = .FALSE., then z is not referenced.
If wantz = .TRUE., then z must contain the
orthogonal/unitary matrix Z.
1073

If wantq = .FALSE., then ldq ≥ 1.
If wantq = .TRUE., then ldq ≥ max(1, n).
If wantz = .FALSE., then ldz ≥ 1.
If wantz = .TRUE., then ldz ≥ max(1, n).
ifst, ilst INTEGER. Specify the reordering of the diagonal blocks of
(A, B). The block with row index ifst is moved to row ilst,
by a sequence of swapping between adjacent blocks.
Constraint: 1 ≤ ifst, ilst ≤ n.
work REAL for stgexc;
DOUBLE PRECISION for dtgexc.
Workspace array, DIMENSION (lwork). Used in real flavors
only.
lwork INTEGER. The dimension of work; must be at least 4n +16.
Output Parameters
a, b Overwritten by the updated matrices A and B.
ifst, ilst Overwritten for real flavors only.
If ifst pointed to the second row of a 2 by 2 block on entry,
it is changed to point to the first row; ilst always points
to the first row of the block in its final position (which may
differ from its input value by ±1).
info INTEGER.
1074
If info = 1, the transformed matrix pair (A, B) would be
too far from generalized Schur form; the problem is
ill-conditioned. (A, B) may have been partially reordered,
and ilst points to the first row of the current position of
the block being moved.

Specific details for the routine tgexc interface are the following:
wantq Restored based on the presence of the argument q as follows:
wantq = .TRUE, if q is present,
wantq = .FALSE, if q is omitted.
wantz Restored based on the presence of the argument z as follows:
wantz = .TRUE, if z is present,
wantz = .FALSE, if z is omitted.
Application Notes
query.
1075
?tgsen
Reorders the generalized Schur decomposition of
a pair of matrices (A,B) so that a selected cluster
of eigenvalues appears in the leading diagonal
blocks of (A,B).
Syntax
FORTRAN 77:
call stgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alphar, alphai,
beta, q, ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call dtgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alphar, alphai,
beta, q, ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call ctgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alpha, beta, q,
ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call ztgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alpha, beta, q,
ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
Fortran 95:
call tgsen(a, b, select [,alphar] [,alphai] [,beta] [,ijob] [,q] [,z] [,pl]
[,pr] [,dif] [,m] [,info])
call tgsen(a, b, select [,alpha] [,beta] [,ijob] [,q] [,z] [,pl] [,pr] [, dif]
[,m] [,info])
Description
The routine reorders the generalized real-Schur/Schur decomposition of a real/complex matrix
pair (A, B) (in terms of an orthogonal/unitary equivalence transformation Q*(A,B)*Z, so that
a selected cluster of eigenvalues appears in the leading diagonal blocks of the pair (A, B). The
leading columns of Q and Z form orthonormal/unitary bases of the corresponding left and right
eigenspaces (deflating subspaces).
(A, B) must be in generalized real-Schur/Schur canonical form (as returned by ?gges), that is,
A and B are both upper triangular.
?tgsen also computes the generalized eigenvalues
1076
ωj = (alphar(j) + alphai(j)*i)/beta(j) (for real flavors)
ωj = alpha(j)/beta(j) (for complex flavors)

of the reordered matrix pair (A, B).
Optionally, the routine computes the estimates of reciprocal condition numbers for eigenvalues
and eigenspaces. These are Difu[(A11, B11), (A22, B22)] and Difl[(A11, B11), (A22,
B22)], that is, the separation(s) between the matrix pairs (A11, B11) and (A22, B22) that
correspond to the selected cluster and the eigenvalues outside the cluster, respectively, and
norms of "projections" onto left and right eigenspaces with respect to the selected cluster in
the (1,1)-block.
Input Parameters
ijob INTEGER. Specifies whether condition numbers are required
for the cluster of eigenvalues (pl and pr) or the deflating
subspaces Difu and Difl.
If ijob =0, only reorder with respect to select;
If ijob =1, reciprocal of norms of "projections" onto left
and right eigenspaces with respect to the selected cluster
(pl and pr);
If ijob =2, compute upper bounds on Difu and Difl, using
F-norm-based estimate (dif (1:2));
If ijob =3, compute estimate of Difu and Difl, sing
1-norm-based estimate (dif (1:2)). This option is about 5
times as expensive as ijob =2;
If ijob =4,>compute pl, pr and dif (i.e., options 0, 1 and
2 above). This is an economic version to get it all;
If ijob =5, compute pl, pr and dif (i.e., options 0, 1 and
3 above).
wantq, wantz LOGICAL.
If wantq = .TRUE., update the left transformation matrix
Q;
If wantq = .FALSE., do not update Q;
If wantz = .TRUE., update the right transformation matrix
Z;
If wantz = .FALSE., do not update Z.
select LOGICAL.
1077
Array, DIMENSION at least max (1, n). Specifies the

eigenvalues in the selected cluster.
To select an eigenvalue omega(j), select(j) must be
.TRUE. For real flavors: to select a complex conjugate pair
of eigenvalues omega(j) and omega(j+1) (corresponding 2
by 2 diagonal block), select(j) and/or select(j+1) must
be set to .TRUE.; the complex conjugate omega(j) and
omega(j+1) must be either both included in the cluster or
both excluded.
a, b, q, z, work REAL for stgsen
DOUBLE PRECISION for dtgsen
COMPLEX for ctgsen
DOUBLE COMPLEX for ztgsen.
Arrays:
For real flavors: A is upper quasi-triangular, with (A, B) in
generalized real Schur canonical form.
For complex flavors: A is upper triangular, in generalized
Schur canonical form.
For real flavors: B is upper triangular, with (A, B) in
generalized real Schur canonical form.
For complex flavors: B is upper triangular, in generalized
Schur canonical form. The second dimension of b must be
at least max(1, n).
q (ldq,*)
If wantq = .TRUE., then q is an n-by-n matrix;
If wantq = .FALSE., then q is not referenced.
z (ldz,*)
If wantz = .TRUE., then z is an n-by-n matrix;
If wantz = .FALSE., then z is not referenced.
lwork).
1078
ldq INTEGER. The first dimension of q; ldq ≥ 1.
If wantq = .TRUE., then ldq ≥ max(1, n).
ldz INTEGER. The first dimension of z; ldz ≥ 1.
If wantz = .TRUE., then ldz ≥ max(1, n).
For real flavors:
If ijob = 1, 2, or 4, lwork ≥ max(4n+16, 2m(n-m)).
If ijob = 3 or 5, lwork ≥ max(4n+16, 4m(n-m)).
If ijob = 1, 2, or 4, lwork ≥ max(1, 2m(n-m)).
If ijob = 3 or 5, lwork ≥ max(1, 4m(n-m)).
If ijob =0, iwork is not referenced.
liwork INTEGER. The dimension of the array iwork.
For real flavors:
If ijob = 1, 2, or 4, liwork ≥ n+6.
If ijob = 3 or 5, liwork ≥ max(n+6, 2m(n-m)).
If ijob = 1, 2, or 4, liwork ≥ n+2.
If ijob = 3 or 5, liwork ≥ max(n+2, 2m(n-m)).
1079
Output Parameters
a, b Overwritten by the reordered matrices A and B, respectively.
alphar, alphai REAL for stgsen;
DOUBLE PRECISION for dtgsen.
Arrays, DIMENSION at least max(1, n). Contain values that
form generalized eigenvalues in real flavors.
See beta.
alpha COMPLEX for ctgsen;
Array, DIMENSION at least max(1, n). Contain values that
form generalized eigenvalues in complex flavors.
See beta.
beta REAL for stgsen
DOUBLE PRECISION for dtgsen
COMPLEX for ctgsen
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,...,
n, will be the generalized eigenvalues.
alphar(j) + alphai(j)*i and beta(j), j=1,..., n are
the diagonals of the complex Schur form (S,T) that would
result if the 2-by-2 diagonal blocks of the real generalized
Schur form of (A,B) were further reduced to triangular form
using complex unitary transformations.
positive, then the j-th and (j+1)-st eigenvalues are a
complex conjugate pair, with alphai(j+1) negative.
The diagonal elements of A and B, respectively, when the
pair (A,B) has been reduced to generalized Schur form.
alpha(i)/beta(i), i=1,..., n are the generalized
eigenvalues.
1080
q If wantq =.TRUE., then, on exit, Q has been postmultiplied
by the left orthogonal transformation matrix which reorder
(A, B). The leading m columns of Q form orthonormal bases
for the specified pair of left eigenspaces (deflating
subspaces).
z If wantz =.TRUE., then, on exit, Z has been postmultiplied
by the left orthogonal transformation matrix which reorder
(A, B). The leading m columns of Z form orthonormal bases
for the specified pair of left eigenspaces (deflating
subspaces).
m INTEGER.
The dimension of the specified pair of left and right
eigen-spaces (deflating subspaces); 0 ≤ m ≤ n.
pl, pr REAL for single precision flavors;
If ijob = 1, 4, or 5, pl and pr are lower bounds on the
reciprocal of the norm of "projections" onto left and right
eigenspaces with respect to the selected cluster.
0 < pl, pr ≤ 1. If m = 0 or m = n, pl = pr = 1.
If ijob = 0, 2 or 3, pl and pr are not referenced
dif REAL for single precision flavors;DOUBLE PRECISION for
double precision flavors.
Array, DIMENSION (2).
If ijob ≥ 2, dif(1:2) store the estimates of Difu and
Difl.
If ijob = 2 or 4, dif(1:2) are F-norm-based upper bounds
on Difu and Difl.
If ijob = 3 or 5, dif(1:2) are 1-norm-based estimates of
Difu and Difl.
If m = 0 or n, dif(1:2) = F-norm([A, B]).
If ijob = 0 or 1, dif is not referenced.
work(1) If ijob is not 0 and info = 0, on exit, work(1) contains
the minimum value of lwork required for optimum
performance. Use this lwork for subsequent runs.
1081
iwork(1) If ijob is not 0 and info = 0, on exit, iwork(1) contains

the minimum value of liwork required for optimum
performance. Use this liwork for subsequent runs.
info INTEGER.
If info = 1, Reordering of (A, B) failed because the
transformed matrix pair (A, B) would be too far from
generalized Schur form; the problem is very ill-conditioned.
(A, B) may have been partially reordered.
If requested, 0 is returned in dif(*), pl and pr.

Specific details for the routine tgsen interface are the following:
dif Holds the vector of length (2).
ijob Must be 0, 1, 2, 3, 4, or 5. The default value is 0.
wantq Restored based on the presence of the argument q as follows:
wantq = .TRUE, if q is present,
wantq = .FALSE, if q is omitted.
wantz Restored based on the presence of the argument z as follows:
wantz = .TRUE, if z is present,
wantz = .FALSE, if z is omitted.
1082
Application Notes
a workspace query.
?tgsyl
Solves the generalized Sylvester equation.
Syntax
FORTRAN 77:
call stgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, dif, work, lwork, iwork, info)
call dtgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
call ctgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
call ztgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
Fortran 95:
call tgsyl(a, b, c, d, e, f [,ijob] [,trans] [,scale] [,dif] [,info])
1083
Description
The routine solves the generalized Sylvester equation:
A*R-L*B = scale*C
D*R-L*E = scale*F
where R and L are unknown m-by-n matrices, (A, D), (B, E) and (C, F) are given matrix pairs of
size m-by-m, n-by-n and m-by-n, respectively, with real/complex entries. (A, D) and (B, E) must
be in generalized real-Schur/Schur canonical form, that is, A, B are upper
quasi-triangular/triangular and D, E are upper triangular.
The solution (R, L) overwrites (C, F). The factor scale, 0≤scale≤1, is an output scaling factor
chosen to avoid overflow.
In matrix notation the above equation is equivalent to the following: solve Z*x = scale*b,
where Z is defined as
Here Ik is the identity matrix of size k and X' is the transpose/conjugate-transpose of X. kron(X,
Y) is the Kronecker product between the matrices X and Y.
If trans = 'T' (for real flavors), or trans = 'C' (for complex flavors), the routine ?tgsyl
solves the transposed/conjugate-transposed system Z'*y = scale*b, which is equivalent to
solve for R and L in
A'*R+D'*L = scale*C
R*B'+L*E' = scale*(-F)
This case (trans = 'T' for stgsyl/dtgsyl or trans = 'C' for ctgsyl/ztgsyl) is used to
compute an one-norm-based estimate of Dif[(A, D), (B, E)], the separation between the
matrix pairs (A,D) and (B,E), using slacon/clacon.
1084
If ijob ≥ 1, ?tgsyl computes a Frobenius norm-based estimate of Dif[(A, D), (B,E)].
That is, the reciprocal of a lower bound on the reciprocal of the smallest singular value of Z.
This is a level 3 BLAS algorithm.
Input Parameters
If trans = 'N', solve the generalized Sylvester equation.
If trans = 'T', solve the 'transposed' system (for real
flavors only).
If trans = 'C', solve the ' conjugate transposed' system
(for complex flavors only).
ijob INTEGER. Specifies what kind of functionality to be
performed:
If ijob =0 , solve the generalized Sylvester equation only;
If ijob =1, perform the functionality of ijob =0 and ijob
=3;
If ijob =2, perform the functionality of ijob =0 and ijob
=4;
If ijob =3, only an estimate of Dif[(A, D), (B, E)] is
computed (look ahead strategy is used);
If ijob =4, only an estimate of Dif[(A, D), (B,E)] is
computed (?gecon on sub-systems is used). If trans =
'T' or 'C', ijob is not referenced.
m INTEGER. The order of the matrices A and D, and the row
dimension of the matrices C, F, R and L.
n INTEGER. The order of the matrices B and E, and the column
dimension of the matrices C, F, R and L.
a, b, c, d, e, f, work REAL for stgsyl
DOUBLE PRECISION for dtgsyl
COMPLEX for ctgsyl
DOUBLE COMPLEX for ztgsyl.
Arrays:
a(lda,*) contains the upper quasi-triangular (for real flavors)
or upper triangular (for complex flavors) matrix A.
1085
b(ldb,*) contains the upper quasi-triangular (for real flavors)

or upper triangular (for complex flavors) matrix B. The
second dimension of b must be at least max(1, n).
c (ldc,*) contains the right-hand-side of the first matrix
equation in the generalized Sylvester equation (as defined
by trans)
The second dimension of c must be at least max(1, n).
d (ldd,*) contains the upper triangular matrix D.
The second dimension of d must be at least max(1, m).
e (lde,*) contains the upper triangular matrix E.
The second dimension of e must be at least max(1, n).
f (ldf,*) contains the right-hand-side of the second matrix
equation in the generalized Sylvester equation (as defined
by trans)
The second dimension of f must be at least max(1, n).
lwork).
ldd INTEGER. The first dimension of d; at least max(1, m).
lde INTEGER. The first dimension of e; at least max(1, n).
ldf INTEGER. The first dimension of f; at least max(1, m).
lwork INTEGER.
The dimension of the array work. lwork ≥ 1.
If ijob = 1 or 2 and trans = 'N', lwork ≥ max(1,
2*m*n).
iwork INTEGER. Workspace array, DIMENSION at least (m+n+6)
for real flavors, and at least (m+n+2) for complex flavors.
1086
Output Parameters
c If ijob=0, 1, or 2, overwritten by the solution R.
If ijob=3 or 4 and trans = 'N', c holds R, the solution
achieved during the computation of the Dif-estimate.
f If ijob=0, 1, or 2, overwritten by the solution L.
If ijob=3 or 4 and trans = 'N', f holds L, the solution
achieved during the computation of the Dif-estimate.
dif REAL for single-precision flavors
On exit, dif is the reciprocal of a lower bound of the
reciprocal of the Dif-function, that is, dif is an upper bound
of Dif[(A, D), (B, E)] = sigma_min(Z), where Z as
in (2).
If ijob = 0, or trans = 'T' (for real flavors), or trans
= 'C' (for complex flavors), dif is not touched.
On exit, scale is the scaling factor in the generalized
Sylvester equation.
If 0 < scale < 1, c and f hold the solutions R and L,
respectively, to a slightly perturbed system but the input
matrices A, B, D and E have not been changed.
If scale = 0, c and f hold the solutions R and L,
respectively, to the homogeneous system with C = F = 0.
Normally, scale = 1.
work(1) If info = 0, work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for
subsequent runs.
info INTEGER.
If info > 0, (A, D) and (B, E) have common or close
eigenvalues.
1087

Specific details for the routine tgsyl interface are the following:
a Holds the matrix A of size (m,m).
d Holds the matrix D of size (m,m).
e Holds the matrix E of size (n,n).
f Holds the matrix F of size (m,n).
ijob Must be 0, 1, 2, 3, or 4. The default value is 0.
Application Notes
query.
1088
?tgsna
Estimates reciprocal condition numbers for specified
eigenvalues and/or eigenvectors of a pair of
matrices in generalized real Schur canonical form.
Syntax
FORTRAN 77:
call stgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s,
dif, mm, m, work, lwork, iwork, info)
call dtgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s,
call ctgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s,
call ztgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s,
Fortran 95:
call tgsna(a, b [,s] [,dif] [,vl] [,vr] [,select] [,m] [,info])
Description
The real flavors stgsna/dtgsna of this routine estimate reciprocal condition numbers for
specified eigenvalues and/or eigenvectors of a matrix pair (A, B) in generalized real Schur
T T
canonical form (or of any matrix pair (Q*A*Z , Q*B*Z ) with orthogonal matrices Q and Z.
(A, B) must be in generalized real Schur form (as returned by sgges/dgges), that is, A is block
upper triangular with 1-by-1 and 2-by-2 diagonal blocks. B is upper triangular.
The complex flavors ctgsna/ztgsna estimate reciprocal condition numbers for specified
eigenvalues and/or eigenvectors of a matrix pair (A, B). (A, B) must be in generalized Schur
canonical form , that is, A and B are both upper triangular.
1089
Input Parameters
job CHARACTER*1. Specifies whether condition numbers are
required for eigenvalues or eigenvectors. Must be 'E' or
'V' or 'B'.
If job = 'E', for eigenvalues only (compute s ).
If job = 'V', for eigenvectors only (compute dif ).
If job = 'B', for both eigenvalues and eigenvectors
(compute both s and dif).
howmny CHARACTER*1. Must be 'A' or 'S'.
If howmny = 'A' , compute condition numbers for all
eigenpairs.
If howmny = 'S' , compute condition numbers for selected
eigenpairs specified by the logical array select.
select LOGICAL.
If howmny = 'S', select specifies the eigenpairs for which
condition numbers are required.
If howmny = 'A', select is not referenced.
For real flavors:
to a real eigenvalue omega(j), select(j) must be set to
.TRUE.; to select condition numbers corresponding to a
complex conjugate pair of eigenvalues omega(j) and
omega(j+1), either select(j) or select(j+1) must be set
to .TRUE.
To select condition numbers for the corresponding j-th
eigenvalue and/or eigenvector, select(j) must be set to
.TRUE..
n INTEGER. The order of the square matrix pair (A, B)
(n ≥ 0).
a, b, vl, vr, work REAL for stgsna
DOUBLE PRECISION for dtgsna
COMPLEX for ctgsna
DOUBLE COMPLEX for ztgsna.
Arrays:
1090
a(lda,*) contains the upper quasi-triangular (for real flavors)
or upper triangular (for complex flavors) matrix A in the pair
(A, B).
b(ldb,*) contains the upper triangular matrix B in the pair
(A, B). The second dimension of b must be at least max(1,
n).
If job = 'E' or 'B', vl(ldvl,*) must contain left
eigenvectors of (A, B), corresponding to the eigenpairs
specified by howmny and select. The eigenvectors must be
stored in consecutive columns of vl, as returned by ?tgevc.
If job = 'V', vl is not referenced. The second dimension
of vl must be at least max(1, m).
If job = 'E' or 'B', vr(ldvr,*) must contain right
eigenvectors of (A, B), corresponding to the eigenpairs
specified by howmny and select. The eigenvectors must be
stored in consecutive columns of vr, as returned by ?tgevc.
If job = 'V', vr is not referenced. The second dimension
of vr must be at least max(1, m).
lwork).
If job = 'E', work is not referenced.
ldvl INTEGER. The first dimension of vl; ldvl ≥ 1.
If job = 'E' or 'B', then ldvl ≥ max(1, n).
ldvr INTEGER. The first dimension of vr; ldvr ≥ 1.
If job = 'E' or 'B', then ldvr ≥ max(1, n).
mm INTEGER. The number of elements in the arrays s and dif
(mm ≥ m).
lwork ≥ max(1, n).
If job = 'V' or 'B', lwork ≥ 2*n*(n+2)+16 for real
flavors, and lwork ≥ max(1, 2*n*n) for complex flavors.
1091

iwork INTEGER. Workspace array, DIMENSION at least (n+6) for
real flavors, and at least (n+2) for complex flavors.
If ijob = 'E', iwork is not referenced.
Output Parameters
Array, DIMENSION (mm ).
If job = 'E' or 'B', contains the reciprocal condition
numbers of the selected eigenvalues, stored in consecutive
If job = 'V', s is not referenced.
For real flavors:
For a complex conjugate pair of eigenvalues two consecutive
elements of s are set to the same value. Thus, s(j), dif(j),
and the j-th columns of vl and vr all correspond to the
same eigenpair (but not in general the j-th eigenpair, unless
all eigenpairs are selected).
dif REAL for single-precision flavors
Array, DIMENSION (mm ).
If job = 'V' or 'B', contains the estimated reciprocal
condition numbers of the selected eigenvectors, stored in
consecutive elements of the array.
If the eigenvalues cannot be reordered to compute dif(j),
dif(j) is set to 0; this can only occur when the true value
would be very small anyway.
If job = 'E', dif is not referenced.
For real flavors:
For a complex eigenvector, two consecutive elements of
dif are set to the same value.
1092
For each eigenvalue/vector specified by select, dif stores
a Frobenius norm-based estimate of Difl.
m INTEGER. The number of elements in the arrays s and dif
used to store the specified condition numbers; for each
selected eigenvalue one element is used.
work(1) work(1)
If job is not 'E' and info = 0, on exit, work(1) contains
the minimum value of lwork required for optimum
performance. Use this lwork for subsequent runs.
info INTEGER.

Specific details for the routine tgsna interface are the following:
s Holds the vector of length (mm).
dif Holds the vector of length (mm).
howmny Restored based on the presence of the argument select as follows:
howmny = 'S', if select is present, howmny = 'A', if select is
omitted.
job Restored based on the presence of arguments s and dif as follows:
job = 'B', if both s and dif are present, job = 'E', if s is present
and dif omitted, job = 'V', if s is omitted and dif present, Note that
there will be an error condition if both s and dif are omitted.
1093
Application Notes
query.
Generalized Singular Value Decomposition

This section describes LAPACK computational routines used for finding the generalized singular
value decomposition (GSVD) of two matrices A and B as
UHAQ = D1*(0 R),
VHBQ = D2*(0 R),
where U, V, and Q are orthogonal/unitary matrices, R is a nonsingular upper triangular matrix,
and D1, D2 are “diagonal” matrices of the structure detailed in the routines description section.
Table 4-7 lists LAPACK routines (FORTRAN 77 interface) that perform generalized singular
value decomposition of matrices. Respective routine names in Fortran 95 interface are without
the first symbol (see Routine Naming Conventions).
Table 4-7 Computational Routines for Generalized Singular Value Decomposition
Routine name Operation performed
?ggsvp Computes the preprocessing decomposition for the generalized

SVD
?tgsja Computes the generalized SVD of two upper triangular or

trapezoidal matrices
You can use routines listed in the above table as well as the driver routine ?ggsvd to find the
GSVD of a pair of general rectangular matrices.
1094
?ggsvp
Computes the preprocessing decomposition for the
generalized SVD.
Syntax
FORTRAN 77:
call sggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u,
ldu, v, ldv, q, ldq, iwork, tau, work, info)
call dggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u,
ldu, v, ldv, q, ldq, iwork, tau, work, info)
call cggsvp (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u,
ldu, v, ldv, q, ldq, iwork, rwork, tau, work, info)
call zggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u,
ldu, v, ldv, q, ldq, iwork, rwork, tau, work, info)
Fortran 95:
call ggsvp(a, b, tola, tolb [, k] [,l] [,u] [,v] [,q] [,info])
Description
The routine computes orthogonal matrices U, V and Q such that
1095
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is
l-by-l upper triangular if m-k-l ≥0, otherwise A23 is (m-k)-by-l upper trapezoidal. The sum
H H H
k+l is equal to the effective numerical rank of the (m+p)-by-n matrix (A ,B ) .
This decomposition is the preprocessing step for computing the Generalized Singular Value
Decomposition (GSVD), see subroutine ?ggsvd.
Input Parameters
jobu CHARACTER*1. Must be 'U' or 'N'.
If jobu = 'U', orthogonal/unitary matrix U is computed.
If jobu = 'N', U is not computed.
jobv CHARACTER*1. Must be 'V' or 'N'.
If jobv = 'V', orthogonal/unitary matrix V is computed.
If jobv = 'N', V is not computed.
jobq CHARACTER*1. Must be 'Q' or 'N'.
If jobq = 'Q', orthogonal/unitary matrix Q is computed.
If jobq = 'N', Q is not computed.
p INTEGER. The number of rows of the matrix B (p ≥ 0).
(n ≥ 0).
1096
a, b, tau, work REAL for sggsvp
DOUBLE PRECISION for dggsvp
COMPLEX for cggsvp
DOUBLE COMPLEX for zggsvp.
Arrays:
tau(*) is a workspace array.
The dimension of tau must be at least max(1, n).
The dimension of work must be at least max(1, 3n, m, p).
tola, tolb REAL for single-precision flavors
tola and tolb are the thresholds to determine the effective
numerical rank of matrix B and a subblock of A. Generally,
they are set to
tola = max(m, n)*||A||*MACHEPS,
tolb = max(p, n)*||B||*MACHEPS.
The size of tola and tolb may affect the size of backward
errors of the decomposition.
ldu INTEGER. The first dimension of the output array u . ldu
≥ max(1, m) if jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The first dimension of the output array v . ldv
≥ max(1, p) if jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The first dimension of the output array q . ldq
≥ max(1, n) if jobq = 'Q'; ldq ≥ 1 otherwise.
rwork REAL for cggsvp
DOUBLE PRECISION for zggsvp.
Workspace array, DIMENSION at least max(1, 2n). Used in
1097
Output Parameters
a Overwritten by the triangular (or trapezoidal) matrix
described in the Description section.
b Overwritten by the triangular matrix described in the
Description section.
k, l INTEGER. On exit, k and l specify the dimension of
subblocks. The sum k + l is equal to effective numerical
H H H
rank of (A , B ) .
u, v, q REAL for sggsvp
DOUBLE PRECISION for dggsvp
COMPLEX for cggsvp
DOUBLE COMPLEX for zggsvp.
Arrays:
If jobu = 'U', u(ldu,*) contains the orthogonal/unitary
matrix U.
The second dimension of u must be at least max(1, m).
If jobu = 'N', u is not referenced.
If jobv = 'V', v(ldv,*) contains the orthogonal/unitary
matrix V.
If jobv = 'N', v is not referenced.
If jobq = 'Q', q(ldq,*) contains the orthogonal/unitary
matrix Q.
If jobq = 'N', q is not referenced.
info INTEGER.

Specific details for the routine ggsvp interface are the following:
1098
b Holds the matrix B of size (p,n).
u Holds the matrix U of size (m,m).
v Holds the matrix V of size (p,m).
jobu Restored based on the presence of the argument u as follows:
jobu = 'U', if u is present,
jobu = 'N', if u is omitted.
jobv Restored based on the presence of the argument v as follows:
jobz = 'V', if v is present,
jobz = 'N', if v is omitted.
jobq Restored based on the presence of the argument q as follows:
jobz = 'Q', if q is present,
jobz = 'N', if q is omitted.
?tgsja
Computes the generalized SVD of two upper
triangular or trapezoidal matrices.
Syntax
FORTRAN 77:
call stgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha,
beta, u, ldu, v, ldv, q, ldq, work, ncycle, info)
call dtgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha,
call ctgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha,
call ztgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha,
Fortran 95:
call tgsja(a, b, tola, tolb, k, l [,u] [,v] [,q] [,jobu] [,jobv] [,jobq]
[,alpha] [,beta] [,ncycle] [,info])
1099
Description
The routine computes the generalized singular value decomposition (GSVD) of two real/complex
upper triangular (or trapezoidal) matrices A and B. On entry, it is assumed that matrices A and
B have the following forms, which may be obtained by the preprocessing subroutine ?ggsvp
from a general m-by-n matrix A and p-by-n matrix B:
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is
l-by-l upper triangular if m-k-l ≥0, otherwise A23 is (m-k)-by-l upper trapezoidal.
On exit,
1100
UH*A*Q = D1*(0 R), VH*B*Q = D2*(0 R),
where U, V and Q are orthogonal/unitary matrices, R is a nonsingular upper triangular matrix,
and D1 and D2 are “diagonal” matrices, which are of the following structures:
If m-k-l ≥0,
where
C = diag(alpha(k+1),...,alpha(k+l))
S = diag(beta(k+1),...,beta(k+l))
C2 + S2 = I
R is stored in a(1:k+l, n-k-l+1:n ) on exit.
1101
If m-k-l < 0,
where
C = diag(alpha(K+1),...,alpha(m)),
S = diag(beta(K+1),...,beta(m)),
C2 + S2 = I
1102
On exit, is stored in a(1:m, n-k-l+1:n ) and R33 is stored
in b(m-k+1:l, n+m-k-l+1:n ).
The computation of the orthogonal/unitary transformation matrices U, V or Q is optional. These

matrices may either be formed explicitly, or they may be postmultiplied into input matrices U1,
V1, or Q1.
Input Parameters
jobu CHARACTER*1. Must be 'U', 'I', or 'N'.
If jobu = 'U', u must contain an orthogonal/unitary matrix
U1 on entry.
If jobu = 'I', u is initialized to the unit matrix.
If jobu = 'N', u is not computed.
jobv CHARACTER*1. Must be 'V', 'I', or 'N'.
If jobv = 'V', v must contain an orthogonal/unitary matrix
V1 on entry.
If jobv = 'I', v is initialized to the unit matrix.
If jobv = 'N', v is not computed.
jobq CHARACTER*1. Must be 'Q', 'I', or 'N'.
If jobq = 'Q', q must contain an orthogonal/unitary matrix
Q1 on entry.
If jobq = 'I', q is initialized to the unit matrix.
If jobq = 'N', q is not computed.
(n ≥ 0).
k, l INTEGER. Specify the subblocks in the input matrices A and
B, whose GSVD is computed.
a,b,u,v,q,work REAL for stgsja
DOUBLE PRECISION for dtgsja
COMPLEX for ctgsja
DOUBLE COMPLEX for ztgsja.
1103
Arrays:
If jobu = 'U', u(ldu,*) must contain a matrix U1 (usually
the orthogonal/unitary matrix returned by ?ggsvp).
The second dimension of u must be at least max(1, m).
If jobv = 'V', v(ldv,*) must contain a matrix V1 (usually
The second dimension of v must be at least max(1, p).
If jobq = 'Q', q(ldq,*) must contain a matrix Q1 (usually
ldu INTEGER. The first dimension of the array u .
ldu ≥ max(1, m) if jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The first dimension of the array v .
ldv ≥ max(1, p) if jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The first dimension of the array q .
ldq ≥ max(1, n) if jobq = 'Q'; ldq ≥ 1 otherwise.
tola, tolb REAL for single-precision flavors
tola and tolb are the convergence criteria for the
Jacobi-Kogbetliantz iteration procedure. Generally, they are
the same as used in ?ggsvp:
tola = max(m, n)*|A|*MACHEPS,
tolb = max(p, n)*|B|*MACHEPS.
Output Parameters
a On exit, a(n-k+1:n, 1:min(k+l, m)) contains the triangular
matrix R or part of R.
1104
b On exit, if necessary, b(m-k+1: l, n+m-k-l+1: n)) contains
a part of R.
alpha, beta REAL for single-precision flavors
Arrays, DIMENSION at least max(1, n). Contain the
generalized singular value pairs of A and B:
alpha(1:k) = 1,
beta(1:k) = 0,
and if m-k-l ≥ 0,
alpha(k+1:k+l) = diag(C),
beta(k+1:k+l) = diag(S),
or if m-k-l < 0,
alpha(k+1:m)= C, alpha(m+1:k+l)=0
beta(K+1:M) = S,
beta(m+1:k+l) = 1.
Furthermore, if k+l < n,
alpha(k+l+1:n)= 0 and
beta(k+l+1:n) = 0.
u If jobu = 'I', u contains the orthogonal/unitary matrix U.
If jobu = 'U', u contains the product U1*U.
v If jobv = 'I', v contains the orthogonal/unitary matrix U.
If jobv = 'V', v contains the product V1*V.
q If jobq = 'I', q contains the orthogonal/unitary matrix U.
If jobq = 'Q', q contains the product Q1*Q.
ncycle INTEGER. The number of cycles required for convergence.
info INTEGER.
If info = 1, the procedure does not converge after MAXIT
cycles.
1105

Specific details for the routine tgsja interface are the following:
u Holds the matrix U of size (m,m).
v Holds the matrix V of size (p,p).
alpha Holds the vector of length n.
jobu If omitted, this argument is restored based on the presence of argument
u as follows:
jobu = 'U', if u is present,
jobu = 'N', if u is omitted.
If present, jobu must be equal to 'I' or 'U' and the argument u must
also be present.
Note that there will be an error condition if jobu is present and u
omitted.
jobv If omitted, this argument is restored based on the presence of argument
v as follows:
jobv = 'V', if v is present,
jobv = 'N', if v is omitted.
If present, jobv must be equal to 'I' or 'V' and the argument v must
also be present.
Note that there will be an error condition if jobv is present and v
omitted.
jobq If omitted, this argument is restored based on the presence of argument
q as follows:
jobq = 'Q', if q is present,
jobq = 'N', if q is omitted.
If present, jobq must be equal to 'I' or 'Q' and the argument q must
also be present.
1106
Note that there will be an error condition if jobq is present and q
omitted.
Driver Routines
Each of the LAPACK driver routines solves a complete problem. To arrive at the solution, driver
routines typically call a sequence of appropriate computational routines.
Driver routines are described in the following sections :
Linear Least Squares (LLS) Problems
Generalized LLS Problems
Symmetric Eigenproblems
Nonsymmetric Eigenproblems
Generalized Symmetric Definite Eigenproblems
Generalized Nonsymmetric Eigenproblems
Linear Least Squares (LLS) Problems

This section describes LAPACK driver routines used for solving linear least squares problems.
Table 4-8 lists all such routines for FORTRAN 77 interface. Respective routine names in
Fortran 95 interface are without the first symbol (see Routine Naming Conventions).
Table 4-8 Driver Routines for Solving LLS Problems
Routine Name Operation performed
?gels Uses QR or LQ factorization to solve a overdetermined or underdetermined

linear system with full rank matrix.
?gelsy Computes the minimum-norm solution to a linear least squares problem

using a complete orthogonal factorization of A.
?gelss Computes the minimum-norm solution to a linear least squares problem

using the singular value decomposition of A.
?gelsd Computes the minimum-norm solution to a linear least squares problem

using the singular value decomposition of A and a divide and conquer
method.
1107
?gels
Uses QR or LQ factorization to solve a
overdetermined or underdetermined linear system
with full rank matrix.
Syntax
FORTRAN 77:
call sgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call dgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call cgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call zgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
Fortran 95:
call gels(a, b [,trans] [,info])
Description
The routine solves overdetermined or underdetermined real/ complex linear systems involving
an m-by-n matrix A, or its transpose/ conjugate-transpose, using a QR or LQ factorization of A.
It is assumed that A has full rank.
The following options are provided:
1. If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system,
that is, solve the least squares problem
minimize ||b - A*x||2
2. If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system
A*X = B.
3. If trans = 'T' or 'C' and m ≥ n: find the minimum norm solution of an undetermined
system AH*X = B.
4. If trans = 'T' or 'C' and m < n: find the least squares solution of an overdetermined
system, that is, solve the least squares problem
1108
minimize ||b - AH*x||2
Several right hand side vectors b and solution vectors x can be handled in a single call; they
are stored as the columns of the m-by-nrhs right hand side matrix B and the n-by-nrh solution
matrix X.
Input Parameters
If trans = 'N', the linear system involves matrix A;
If trans = 'T', the linear system involves the transposed
T
matrix A (for real flavors only);
If trans = 'C', the linear system involves the
H
conjugate-transposed matrix A (for complex flavors only).
n INTEGER. The number of columns of the matrix A
(n ≥ 0).
nrhs INTEGER. The number of right-hand sides; the number of
columns in B (nrhs ≥ 0).
a, b, work REAL for sgels
DOUBLE PRECISION for dgels
COMPLEX for cgels
DOUBLE COMPLEX for zgels.
Arrays:
b(ldb,*) contains the matrix B of right hand side vectors,
stored columnwise; B is m-by-nrhs if trans = 'N' , or
n-by-nrhs if trans = 'T' or 'C'.
The second dimension of b must be at least max(1, nrhs).
ldb INTEGER. The first dimension of b; must be at least max(1,
m, n).
lwork INTEGER. The size of the work array; must be at least min
(m, n)+max(1, m, n, nrhs).
1109

Output Parameters
a On exit, overwritten by the factorization data as follows:
if m ≥ n, array a contains the details of the QR factorization
of the matrix A as returned by ?geqrf;
if m < n, array a contains the details of the LQ factorization
of the matrix A as returned by ?gelqf.
b If info = 0, b overwritten by the solution vectors, stored
columnwise:
if trans = 'N' and m ≥ n, rows 1 to n of b contain the
least squares solution vectors; the residual sum of squares
for the solution in each column is given by the sum of
squares of modulus of elements n+1 to m in that column;
if trans = 'N' and m < n, rows 1 to n of b contain the
minimum norm solution vectors;
if trans = 'T' or 'C' and m ≥ n, rows 1 to m of b contain
the minimum norm solution vectors; if trans = 'T' or 'C'
and m < n, rows 1 to m of b contain the least squares
solution vectors; the residual sum of squares for the solution
in each column is given by the sum of squares of modulus
of elements m+1 to n in that column.
info INTEGER.
If info = i, the i-th diagonal element of the triangular
factor of A is zero, so that A does not have full rank; the
least squares solution could not be computed.
1110
Specific details for the routine gels interface are the following:
b Holds the matrix of size max(m,n)-by-nrhs.
If trans = 'N', then, on entry, the size of b is m-by-nrhs,
If trans = 'T', then, on entry, the size of b is n-by-nrhs,
Application Notes
For better performance, try using lwork = min (m, n)+max(1, m, n, nrhs)*blocksize,
workspace query.
workspace.
1111
?gelsy
Computes the minimum-norm solution to a linear
least squares problem using a complete orthogonal
factorization of A.
Syntax
FORTRAN 77:
call sgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, info)
call dgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, info)
call cgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, rwork,
info)
call zgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, rwork,
info)
Fortran 95:
call gelsy(a, b [,rank] [,jpvt] [,rcond] [,info])
Description
The routine computes the minimum-norm solution to a real/complex linear least squares
problem:
using a complete orthogonal factorization of A. A is an m-by-n matrix which may be rank-deficient.
are stored as the columns of the m-by-nrhs right hand side matrix B and the n-by-nrhs solution
matrix X.
The routine first computes a QR factorization with column pivoting:
1112
with R11 defined as the largest leading submatrix whose estimated condition number is less
than 1/rcond. The order of R11, rank, is the effective rank of A. Then, R22 is considered to be
negligible, and R12 is annihilated by orthogonal/unitary transformations from the right, arriving
at the complete orthogonal factorization:
The minimum-norm solution is then
where Q1 consists of the first rank columns of Q. This routine is basically identical to the
original?gelsx except three differences:
• The call to the subroutine ?geqpf has been substituted by the call to the subroutine ?geqp3.
This subroutine is a BLAS-3 version of the QR factorization with column pivoting.
• Matrix B (the right hand side) is updated with BLAS-3.
• The permutation of matrix B (the right hand side) is faster and more simple.
Input Parameters
(n ≥ 0).
a, b, work REAL for sgelsy
1113
DOUBLE PRECISION for dgelsy

COMPLEX for cgelsy
DOUBLE COMPLEX for zgelsy.
Arrays:
b(ldb,*) contains the m-by-nrhs right hand side matrix B.
m, n).
jpvt INTEGER.
On entry, if jpvt(i)≠ 0, the i-th column of A is permuted
to the front of AP, otherwise the i-th column of A is a free
column.
rcond REAL for single-precision flavors
rcond is used to determine the effective rank of A, which is
defined as the order of the largest leading triangular
submatrix R11 in the QR factorization with pivoting of A,
whose estimated condition number < 1/rcond.
Application Notes for the suggested value of lwork.
rwork REAL for cgelsy DOUBLE PRECISION for zgelsy.
Output Parameters
a On exit, overwritten by the details of the complete
orthogonal factorization of A.
1114
b Overwritten by the n-by-nrhs solution matrix X.
jpvt On exit, if jpvt(i)= k, then the i-th column of AP was the
k-th column of A.
rank INTEGER. The effective rank of A, that is, the order of the
submatrix R11. This is the same as the order of the submatrix
T11 in the complete orthogonal factorization of A.
info INTEGER.

Specific details for the routine gelsy interface are the following:
b Holds the matrix of size max(m,n)-by-nrhs. On entry, contains the
m-by-nrhs right hand side matrix B, On exit, overwritten by the
n-by-nrhs solution matrix X.
jpvt Holds the vector of length n. Default value for this element is jpvt(i)
= 0.
rcond Default value for this element is rcond = 100*EPSILON(1.0_WP).
Application Notes
For real flavors:
The unblocked strategy requires that:
lwork ≥ max( mn+3n+1, 2*mn + nrhs ),

where mn = min( m, n ).
The block algorithm requires that:
lwork ≥ max( mn+2n+nb*(n+1), 2*mn+nb*nrhs ),

where nb is an upper bound on the blocksize returned by ilaenv for the routines sgeqp3/dge-
qp3, stzrzf/dtzrzf, stzrqf/dtzrqf, sormqr/dormqr, and sormrz/dormrz.
1115

The unblocked strategy requires that:
lwork ≥ mn + max( 2*mn, n+1, mn + nrhs ),

where mn = min( m, n ).
The block algorithm requires that:
lwork < mn + max(2*mn, nb*(n+1), mn+mn*nb, mn+ nb*nrhs ),
where nb is an upper bound on the blocksize returned by ilaenv for the routines cgeqp3/zge-
qp3, ctzrzf/ztzrzf, ctzrqf/ztzrqf, cunmqr/zunmqr, and cunmrz/zunmrz.
workspace query.
workspace.
1116
?gelss
least squares problem using the singular value
decomposition of A.
Syntax
FORTRAN 77:
call sgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, info)
call dgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, info)
call cgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork,
info)
call zgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork,
info)
Fortran 95:
call gelss(a, b [,rank] [,s] [,rcond] [,info])
Description
The routine computes the minimum norm solution to a real linear least squares problem:
using the singular value decomposition (SVD) of A. A is an m-by-n matrix which may be
rank-deficient. Several right hand side vectors b and solution vectors x can be handled in a
single call; they are stored as the columns of the m-by-nrhs right hand side matrix B and the
n-by-nrhs solution matrix X. The effective rank of A is determined by treating as zero those
singular values which are less than rcond times the largest singular value.
Input Parameters
(n ≥ 0).
1117

columns in B
(nrhs ≥ 0).
a, b, work REAL for sgelss
DOUBLE PRECISION for dgelss
COMPLEX for cgelss
DOUBLE COMPLEX for zgelss.
Arrays:
m, n).
rcond is used to determine the effective rank of A. Singular
values s(i) ≤ rcond *s(1) are treated as zero.
If rcond <0, machine precision is used instead.
lwork INTEGER. The size of the work array; lwork≥ 1.
rwork REAL for cgelss
DOUBLE PRECISION for zgelss.
Workspace array used in complex flavors only. DIMENSION
at least max(1, 5*min(m, n)).
Output Parameters
a On exit, the first min(m, n) rows of A are overwritten with
its right singular vectors, stored row-wise.
1118
If m≥n and rank = n, the residual sum-of-squares for the
solution in the i-th column is given by the sum of squares
of modulus of elements n+1:m in that column.
Array, DIMENSION at least max(1, min(m, n)). The singular
values of A in decreasing order. The condition number of A
in the 2-norm is
k2(A) = s(1)/ s(min(m, n)) .
rank INTEGER. The effective rank of A, that is, the number of
singular values which are greater than rcond *s(1).
work(1) If info = 0, on exit, work(1) contains the minimum value
info INTEGER.
If info = i, then the algorithm for computing the SVD
failed to converge; i indicates the number of off-diagonal
elements of an intermediate bidiagonal form which did not
converge to zero.

Specific details for the routine gelss interface are the following:
s Holds the vector of length min(m,n).
1119
Application Notes
For real flavors:
lwork ≥ 3*min(m, n)+ max( 2*min(m, n), max(m, n), nrhs)

lwork ≥ 2*min(m, n)+ max(m, n, nrhs)

For good performance, lwork should generally be larger.
workspace query.
workspace.
1120
?gelsd
least squares problem using the singular value
decomposition of A and a divide and conquer
method.
Syntax
FORTRAN 77:
call sgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, iwork,
info)
call dgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, iwork,
info)
call cgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork,
iwork, info)
call zgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork,
iwork, info)
Fortran 95:
call gelsd(a, b [,rank] [,s] [,rcond] [,info])
Description
The routine computes the minimum-norm solution to a real linear least squares problem:
using the singular value decomposition (SVD) of A. A is an m-by-n matrix which may be
rank-deficient.
are stored as the columns of the m-by-nrhs right hand side matrix B and the n-by-nrhs solution
matrix X.
The problem is solved in three steps:
1. Reduce the coefficient matrix A to bidiagonal form with Householder transformations, reducing
the original problem into a "bidiagonal least squares problem" (BLS).
1121
2. Solve the BLS using a divide and conquer approach.

3. Apply back all the Householder transformations to solve the original least squares problem.
The effective rank of A is determined by treating as zero those singular values which are less
than rcond times the largest singular value.
The routine uses auxiliary routines ?lals0 and ?lalsa.
Input Parameters
(n ≥ 0).
a, b, work REAL for sgelsd
DOUBLE PRECISION for dgelsd
COMPLEX for cgelsd
DOUBLE COMPLEX for zgelsd.
Arrays:
m, n).
rcond is used to determine the effective rank of A. Singular
values s(i) ≤ rcond *s(1) are treated as zero. If rcond
≤ 0, machine precision is used instead.
lwork INTEGER. The size of the work array; lwork ≥ 1.
1122
routine only calculates the optimal size of the array work
and the minimum sizes of the arrays rwork and iwork, and
returns these values as the first entries of the work, rwork
and iwork arrays, and no error message related to lwork
is issued by xerbla.
iwork INTEGER. Workspace array. See Application Notes for the
suggested dimension of iwork.
rwork REAL for cgelsd
DOUBLE PRECISION for zgelsd.
Workspace array, used in complex flavors only. See
Application Notes for the suggested dimension of rwork.
Output Parameters
a On exit, A has been overwritten.
If m ≥ n and rank = n, the residual sum-of-squares for
the solution in the i-th column is given by the sum of
squares of modulus of elements n+1:m in that column.
Array, DIMENSION at least max(1, min(m, n)). The singular
values of A in decreasing order. The condition number of A
in the 2-norm is
k2(A) = s(1)/ s(min(m, n)).
rank INTEGER. The effective rank of A, that is, the number of
singular values which are greater than rcond *s(1).
rwork(1) If info = 0, on exit, rwork(1) returns the minimum size
of the workspace array iwork required for optimum
performance.
1123
iwork(1) If info = 0, on exit, iwork(1) returns the minimum size

of the workspace array iwork required for optimum
performance.
info INTEGER.
If info = i, then the algorithm for computing the SVD
failed to converge; i indicates the number of off-diagonal
elements of an intermediate bidiagonal form that did not
converge to zero.

Specific details for the routine gelsd interface are the following:
s Holds the vector of length min(m,n).
Application Notes
The divide and conquer algorithm makes very mild assumptions about floating point arithmetic.
It will work on machines with a guard digit in add/subtract. It could conceivably fail on
hexadecimal or decimal machines without guard digits, but we know of none.
The exact minimum amount of workspace needed depends on m, n and nrhs. The size lwork
of the workspace array work must be as given below.
For real flavors:
If m ≥ n,
lwork ≥ 12n + 2n*smlsiz + 8n*nlvl + n*nrhs + (smlsiz+1)2;

If m < n,
1124
lwork ≥ 12m + 2m*smlsiz + 8m*nlvl + m*nrhs + (smlsiz+1)2;
If m ≥ n,
lwork< 2n + n*nrhs;
If m < n,
lwork ≥ 2m + m*nrhs;
where smlsiz is returned by ilaenv and is equal to the maximum size of the subproblems at
the bottom of the computation tree (usually about 25), and
nlvl = INT( log2( min( m, n )/(smlsiz+1)) ) + 1.
workspace query.
workspace.
The dimension of the workspace array iwork must be at least
3*min( m, n )*nlvl + 11*min( m, n ).

The dimension of the workspace array iwork (for complex flavors) must be at least max(1,
lrwork).
lrwork ≥ 10n + 2n*smlsiz + 8n*nlvl + 3*smlsiz*nrhs + (smlsiz+1)2 if m ≥ n, and
lrwork ≥ 10m + 2m*smlsiz + 8m*nlvl + 3*smlsiz*nrhs + (smlsiz+1)2 if m < n.
1125
Generalized LLS Problems

This section describes LAPACK driver routines used for solving generalized linear least squares
problems. Table 4-9 lists all such routines for FORTRAN 77 interface. Respective routine names
in Fortran 95 interface are without the first symbol (see Routine Naming Conventions).
Table 4-9 Driver Routines for Solving Generalized LLS Problems
?gglse Solves the linear equality-constrained least squares problem using a

generalized RQ factorization.
?ggglm Solves a general Gauss-Markov linear model problem using a generalized

QR factorization.
?gglse
Solves the linear equality-constrained least squares
problem using a generalized RQ factorization.
Syntax
FORTRAN 77:
call sgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call dgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call cgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call zgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
Fortran 95:
call gglse(a, b, c, d, x [,info])
Description
The routine solves the linear equality-constrained least squares (LSE) problem:
minimize ||c - A*x||2 subject to B*x = d
1126
where A is an m-by-n matrix, B is a p-by-n matrix, c is a given m-vector, and d is a given p-vector.
It is assumed that p ≤ n ≤ m+p, and
These conditions ensure that the LSE problem has a unique solution, which is obtained using
a generalized RQ factorization of the matrices (B, A) given by
B=(0 R)*Q, A=Z*T*Q
Input Parameters
(n ≥ 0).
p INTEGER. The number of rows of the matrix B
(0 ≤ p ≤ n ≤ m+p).
a, b, c, d, work REAL for sgglse
DOUBLE PRECISION for dgglse
COMPLEX for cgglse
DOUBLE COMPLEX for zgglse.
Arrays:
c(*), dimension at least max(1, m), contains the right hand
side vector for the least squares part of the LSE problem.
d(*), dimension at least max(1, p), contains the right hand
side vector for the constrained equation.
1127

lwork ≥ max(1, m+n+p).
Output Parameters
x REAL for sgglse
DOUBLE PRECISION for dgglse
COMPLEX for cgglse
DOUBLE COMPLEX for zgglse.
Array, DIMENSION at least max(1, n). On exit, contains the
solution of the LSE problem.
a On exit, the elements on and above the diagonal of the
array contain the min(m,n)-by-n upper trapezoidal matrix
T.
b On exit, the upper triangle of the subarray b(1:p,
n-p+1:n) contains the p-by-p upper triangular matrix R.
d On exit, d is destroyed.
c On exit, the residual sum-of-squares for the solution is given
by the sum of squares of elements n-p+1 to m of vector c.
info INTEGER.
If info = 1, the upper triangular factor R associated with
B in the generalized RQ factorization of the pair (B, A) is
singular, so that rank(B) < P; the least squares solution
could not be computed.
If info = 2, the (n-p)-by-(n-p) part of the upper
trapezoidal factor T associated with A in the generalized RQ
factorization of the pair (B, A) is singular, so that
1128
; the least squares solution could not be computed.

Specific details for the routine gglse interface are the following:
c Holds the vector of length (m).
d Holds the vector of length (p).
x Holds the vector of length n.
Application Notes
For optimum performance, use
lwork ≥ p + min(m, n) + max(m, n)*nb,

where nb is an upper bound for the optimal blocksizes for ?geqrf, ?gerqf, ?ormqr/?unmqr
and ?ormrq/?unmrq.
You may set lwork to -1. The routine returns immediately and provides the recommended
workspace query.
workspace.
1129
?ggglm
Solves a general Gauss-Markov linear model
problem using a generalized QR factorization.
Syntax
FORTRAN 77:
call sggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call dggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call cggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call zggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
Fortran 95:
call ggglm(a, b, d, x, y [,info])
Description
The routine solves a general Gauss-Markov linear model (GLM) problem:
minimizex ||y||2 subject to d = A*x + B*y
where A is an n-by-m matrix, B is an n-by-p matrix, and d is a given n-vector. It is assumed
that m ≤ n ≤ m+p, and rank(A) = m and rank(A B) = n.
Under these assumptions, the constrained equation is always consistent, and there is a unique
solution x and a minimal 2-norm solution y, which is obtained using a generalized QR factorization
of the matrices (A, B ) given by
In particular, if matrix B is square nonsingular, then the problem GLM is equivalent to the
following weighted linear least squares problem
minimizex ||B-1(d-A*x)||2.
1130
Input Parameters
n INTEGER. The number of rows of the matrices A and B (n
≥ 0).
m INTEGER. The number of columns in A (m ≥ 0).
p INTEGER. The number of columns in B (p ≥ n - m).
a, b, d, work REAL for sggglm
DOUBLE PRECISION for dggglm
COMPLEX for cggglm
DOUBLE COMPLEX for zggglm.
Arrays:
a(lda,*) contains the n-by-m matrix A.
b(ldb,*) contains the n-by-p matrix B.
The second dimension of b must be at least max(1, p).
d(*), dimension at least max(1, n), contains the left hand
side of the GLM equation.
lwork INTEGER. The size of the work array; lwork ≥ max(1,
n+m+p).
Output Parameters
x, y REAL for sggglm
DOUBLE PRECISION for dggglm
COMPLEX for cggglm
DOUBLE COMPLEX for zggglm.
Arrays x(*), y(*). DIMENSION at least max(1, m) for x and
at least max(1, p) for y.
1131
On exit, x and y are the solutions of the GLM problem.

a On exit, the upper triangular part of the array a contains
the m-by-m upper triangular matrix R.
b On exit, if n ≤ p, the upper triangle of the subarray
b(1:n,p-n+1:p) contains the n-by-n upper triangular
matrix T; if n > p, the elements on and above the (n-p)-th
subdiagonal contain the n-by-p upper trapezoidal matrix T.
d On exit, d is destroyed
info INTEGER.
If info = 1, the upper triangular factor R associated with
A in the generalized QR factorization of the pair (A, B) is
singular, so that rank(A) < m; the least squares solution
could not be computed.
If info = 2, the bottom (n-m)-by-(n-m) part of the upper
trapezoidal factor T associated with B in the generalized QR
factorization of the pair (A, B) is singular, so that rank(A
B) < n; the least squares solution could not be computed.

Specific details for the routine ggglm interface are the following:
a Holds the matrix A of size (n,m).
b Holds the matrix B of size (n,p).
x Holds the vector of length (m).
y Holds the vector of length (p).
1132
Application Notes
For optimum performance, use
lwork ≥ m + min(n, p) + max(n, p)*nb,

where nb is an upper bound for the optimal blocksizes for ?geqrf, ?gerqf, ?ormqr/?unmqr
and ?ormrq/?unmrq.
You may set lwork to -1. The routine returns immediately and provides the recommended
workspace query.
workspace.
This section describes LAPACK driver routines used for solving symmetric eigenvalue problems.
See also computational routines computational routines that can be called to solve these
problems. Table 4-10 lists all such driver routines for FORTRAN 77 interface. Respective routine
names in Fortran 95 interface are without the first symbol (see Routine Naming Conventions).
Table 4-10 Driver Routines for Solving Symmetric Eigenproblems
?syev/?heev Computes all eigenvalues and, optionally, eigenvectors of a real

symmetric / Hermitian matrix.
?syevd/?heevd Computes all eigenvalues and (optionally) all eigenvectors of a real

symmetric / Hermitian matrix using divide and conquer algorithm.
?syevx/?heevx Computes selected eigenvalues and, optionally, eigenvectors of a

symmetric / Hermitian matrix.
?syevr/?heevr Computes selected eigenvalues and, optionally, eigenvectors of a real

symmetric / Hermitian matrix using the Relatively Robust
Representations.
?spev/?hpev Computes all eigenvalues and, optionally, eigenvectors of a real

symmetric / Hermitian matrix in packed storage.
1133
?spevd/?hpevd Uses divide and conquer algorithm to compute all eigenvalues and
(optionally) all eigenvectors of a real symmetric / Hermitian matrix held
in packed storage.
?spevx/?hpevx Computes selected eigenvalues and, optionally, eigenvectors of a real

symmetric / Hermitian matrix in packed storage.
?sbev /?hbev Computes all eigenvalues and, optionally, eigenvectors of a real

symmetric / Hermitian band matrix.
?sbevd/?hbevd Computes all eigenvalues and (optionally) all eigenvectors of a real

symmetric / Hermitian band matrix using divide and conquer algorithm.
?sbevx/?hbevx Computes selected eigenvalues and, optionally, eigenvectors of a real

symmetric / Hermitian band matrix.
?stev Computes all eigenvalues and, optionally, eigenvectors of a real

symmetric tridiagonal matrix.
?stevd Computes all eigenvalues and (optionally) all eigenvectors of a real

symmetric tridiagonal matrix using divide and conquer algorithm.
?stevx Computes selected eigenvalues and eigenvectors of a real symmetric

tridiagonal matrix.
?stevr Computes selected eigenvalues and, optionally, eigenvectors of a real

symmetric tridiagonal matrix using the Relatively Robust
Representations.
?syev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric matrix.
Syntax
FORTRAN 77:
call ssyev(jobz, uplo, n, a, lda, w, work, lwork, info)
call dsyev(jobz, uplo, n, a, lda, w, work, lwork, info)
1134
Fortran 95:
call syev(a, w [,jobz] [,uplo] [,info])
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric matrix
A.
Note that for most cases of real symmetric eigenvalue problems the default choice should be
?syevr function as its underlying algorithm is faster and uses less workspace.
Input Parameters
computed.
a, work REAL for ssyev
DOUBLE PRECISION for dsyev
Arrays:
triangular part of the symmetric matrix A, as specified by
uplo.
lda INTEGER. The first dimension of the array a.
Must be at least max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 3n-1).
1135

Output Parameters
a On exit, if jobz = 'V', then if info = 0, array a contains
the orthonormal eigenvectors of the matrix A.
If jobz = 'N', then on exit the lower triangle
(if uplo = 'L') or the upper triangle (if uplo = 'U') of
A, including the diagonal, is overwritten.
w REAL for ssyev
DOUBLE PRECISION for dsyev
If info = 0, contains the eigenvalues of the matrix A in
ascending order.
work(1) On exit, if lwork > 0, then work(1) returns the required
info INTEGER.
If info = i, then the algorithm failed to converge; i
indicates the number of elements of an intermediate
tridiagonal form which did not converge to zero.

Specific details for the routine syev interface are the following:
job Must be 'N' or 'V'. The default value is 'N'.
1136
Application Notes
For optimum performance set lwork ≥ (nb+2)*n, where nb is the blocksize for ?sytrd returned
by ilaenv.
workspace query.
workspace.
If lwork has any of admissible sizes, which is no less than the minimal value described, then
the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array on
exit. Use this value (work(1)) for subsequent runs.
in the first element of the corresponding array work. This operation is called a workspace query.
1137
?heev
eigenvectors of a Hermitian matrix.
Syntax
FORTRAN 77:
call cheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
call zheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
Fortran 95:
call heev(a, w [,jobz] [,uplo] [,info])
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a complex Hermitian
matrix A.
Note that for most cases of complex Hermitian eigenvalue problems the default choice should
be ?heevr function as its underlying algorithm is faster and uses less workspace.
Input Parameters
computed.
a, work COMPLEX for cheev
DOUBLE COMPLEX for zheev
Arrays:
1138
triangular part of the Hermitian matrix A, as specified by
uplo.
lda INTEGER. The first dimension of the array a. Must be at
least max(1, n).
lwork INTEGER.
The dimension of the array work. C
onstraint: lwork ≥ max(1, 2n-1).
rwork REAL for cheev
DOUBLE PRECISION for zheev.
Workspace array, DIMENSION at least max(1, 3n-2).
Output Parameters
a On exit, if jobz = 'V', then if info = 0, array a contains
the orthonormal eigenvectors of the matrix A.
If jobz = 'N', then on exit the lower triangle
(if uplo = 'L') or the upper triangle (if uplo = 'U') of
A, including the diagonal, is overwritten.
w REAL for cheev
DOUBLE PRECISION for zheev
ascending order.
info INTEGER.
1139


Specific details for the routine heev interface are the following:
job Must be 'N' or 'V'. The default value is 'N'.
Application Notes
For optimum performance use
lwork ≥ (nb+1)*n,
where nb is the blocksize for ?hetrd returned by ilaenv.
workspace query.
workspace.
1140
?syevd
eigenvectors of a real symmetric matrix using
divide and conquer algorithm.
Syntax
FORTRAN 77:
call ssyevd(jobz, uplo, n, a, lda, w, work, lwork, iwork, liwork, info)
call dsyevd(jobz, uplo, n, a, lda, w, work, lwork, iwork, liwork, info)
Fortran 95:
call syevd(a, w [,jobz] [,uplo] [,info])
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric
matrix A. In other words, it can compute the spectral factorization of A as: A = Z*Λ*ZT.
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λi, and Z is the
orthogonal matrix whose columns are the eigenvectors zi. Thus,
A*zi = λi*zi for i = 1, 2, ..., n.

If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to
compute eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses
the Pal-Walker-Kahan variant of the QL or QR algorithm.
?syevr function as its underlying algorithm is faster and uses less workspace. ?syevd requires
more workspace but is faster in some cases, especially for large matrices.
Input Parameters
computed.
1141

a REAL for ssyevd
DOUBLE PRECISION for dsyevd
Array, DIMENSION (lda, *).
uplo.
work REAL for ssyevd
DOUBLE PRECISION for dsyevd.
Workspace array, DIMENSION at least lwork.
lwork INTEGER.
Constraints:
if n ≤ 1, then lwork ≥ 1;
if jobz = 'N' and n > 1, then lwork ≥ 2*n + 1;
if jobz = 'V' and n > 1, then lwork ≥ 2*n2+ 6*n + 1.
routine only calculates the required sizes of the work and
iwork arrays, returns these values as the first entries of the
work and iwork arrays, and no error message related to
lwork or liwork is issued by xerbla. See Application Notes
for details.
iwork INTEGER.
Workspace array, its dimension max(1, liwork).
liwork INTEGER.
Constraints:
if n ≤ 1, then liwork ≥ 1;
1142
if jobz = 'N' and n > 1, then liwork ≥ 1;
if jobz = 'V' and n > 1, then liwork ≥ 5*n + 3.
for details.
Output Parameters
w REAL for ssyevd
DOUBLE PRECISION for dsyevd
ascending order. See also info.
a If jobz = 'V', then on exit this array is overwritten by the
orthogonal matrix Z which contains the eigenvectors of A.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required
info INTEGER.
indicates the number of off-diagonal elements of an
intermediate tridiagonal form which did not converge to
zero.
If info = i, and jobz = 'V', then the algorithm failed to
compute an eigenvalue while working on the submatrix lying
in rows and columns info/(n+1) through mod(info,n+1).

1143
Specific details for the routine syevd interface are the following:
jobz Must be 'N' or 'V'. The default value is 'N'.
Application Notes
subsequent runs.
The complex analogue of this routine is ?heevd
1144
?heevd
eigenvectors of a complex Hermitian matrix using
Syntax
FORTRAN 77:
call cheevd(jobz, uplo, n, a, lda, w, work, lwork, rwork, lrwork, iwork,
liwork, info)
call zheevd(jobz, uplo, n, a, lda, w, work, lwork, rwork, lrwork, iwork,
liwork, info)
Fortran 95:
call heevd(a, w [,job] [,uplo] [,info])
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a complex
Hermitian matrix A. In other words, it can compute the spectral factorization of A as: A =
Z*Λ*ZH.
Here Λ is a real diagonal matrix whose diagonal elements are the eigenvalues λi, and Z is the
(complex) unitary matrix whose columns are the eigenvectors zi. Thus,
A*zi = λi*zi for i = 1, 2, ..., n.

Note that for most cases of complex Hermetian eigenvalue problems the default choice should
be ?heevr function as its underlying algorithm is faster and uses less workspace. ?heevd
requires more workspace but is faster in some cases, especially for large matrices.
Input Parameters
1145

computed.
a COMPLEX for cheevd
DOUBLE COMPLEX for zheevd
uplo.
least max(1, n).
work COMPLEX for cheevd
DOUBLE COMPLEX for zheevd.
Workspace array, DIMENSION max(1, lwork).
lwork INTEGER.
The dimension of the array work. Constraints:
if jobz = 'N' and n > 1, then lwork ≥ n+1;
if jobz = 'V' and n > 1, then lwork ≥ n2+2*n.
rwork REAL for cheevd
DOUBLE PRECISION for zheevd
Workspace array, DIMENSION at least lrwork.
lrwork INTEGER.
The dimension of the array rwork. Constraints:
1146
if n ≤ 1, then lrwork ≥ 1;
if job = 'N' and n > 1, then lrwork ≥ n;
if job = 'V' and n > 1, then lrwork ≥ 2*n2+ 5*n + 1.
liwork INTEGER.
The dimension of the array iwork. Constraints: if n ≤ 1, then
liwork ≥ 1;
if jobz = 'V' and n > 1, then liwork ≥ 5*n+3.
Output Parameters
w REAL for cheevd
DOUBLE PRECISION for zheevd
a If jobz = 'V', then on exit this array is overwritten by the
unitary matrix Z which contains the eigenvectors of A.
work(1) On exit, if lwork > 0, then the real part of work(1) returns
the required minimal size of lwork.
rwork(1) On exit, if lrwork > 0, then rwork(1) returns the required
minimal size of lrwork.
1147

info INTEGER.
If info = i, and jobz = 'N', then the algorithm failed to
converge; i off-diagonal elements of an intermediate
tridiagonal form did not converge to zero;
if info = i, and jobz = 'V', then the algorithm failed to
in rows and columns info/(n+1) through mod(info, n+1).

Specific details for the routine heevd interface are the following:
w Holds the vector of length (n).
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A + E such that ||E||2 =
If you are in doubt how much workspace to supply, use a generous value of lwork (liwork or
lrwork) for the first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and set any of admissible lwork (liwork or lrwork) sizes, which
is no less than the minimal value described, the routine completes the task, though probably
not so fast as with a recommended workspace, and provides the recommended workspace in
the first element of the corresponding array (work, iwork, rwork) on exit. Use this value
(work(1), iwork(1), rwork(1)) for subsequent runs.
1148
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and
provides the recommended workspace in the first element of the corresponding array (work,
iwork, rwork). This operation is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not
-1, the routine returns immediately with an error exit and does not provide any information on
the recommended workspace.
The real analogue of this routine is ?syevd. See also ?hpevd for matrices held in packed storage,
and ?hbevd for banded matrices.
?syevx
Computes selected eigenvalues and, optionally,
eigenvectors of a symmetric matrix.
Syntax
FORTRAN 77:
call ssyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, work, lwork, iwork, ifail, info)
call dsyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, work, lwork, iwork, ifail, info)
Fortran 95:
call syevx(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Description
matrix A. Eigenvalues and eigenvectors can be selected by specifying either a range of values
or a range of indices for the desired eigenvalues.
?syevr function as its underlying algorithm is faster and uses less workspace. ?syevx is faster
for a few selected eigenvalues.
1149
Input Parameters
computed.
range CHARACTER*1. Must be 'A', 'V', or 'I'.
If range = 'A', all eigenvalues will be found.
If range = 'V', all eigenvalues in the half-open interval
(vl, vu] will be found.
If range = 'I', the eigenvalues with indices il through
iu will be found.
a, work REAL for ssyevx
DOUBLE PRECISION for dsyevx.
Arrays:
uplo.
least max(1, n).
vl, vu REAL for ssyevx
to be searched for eigenvalues; vl ≤ vu. Not referenced if
range = 'A'or 'I'.
il, iu INTEGER.
If range = 'I', the indices of the smallest and largest
eigenvalues to be returned.
Constraints: 1 ≤ il ≤ iu ≤ n, if n > 0;
il = 1 and iu = 0 , if n = 0.
1150
Not referenced if range = 'A'or 'V'.
abstol REAL for ssyevx
The absolute error tolerance for the eigenvalues. See
Application Notes for more information.
1.
If jobz = 'V', then ldz ≥ max(1, n).
lwork INTEGER.
If n ≤ 1 then lwork ≥ 1, otherwise lwork=8*n.
iwork INTEGER. Workspace array, DIMENSION at least max(1, 5n).
Output Parameters
a On exit, the lower triangle (if uplo = 'L') or the upper
triangle (if uplo = 'U') of A, including the diagonal, is
overwritten.
m INTEGER. The total number of eigenvalues found;
0 ≤ m ≤ n.
w REAL for ssyevx
DOUBLE PRECISION for dsyevx
Array, DIMENSION at least max(1, n). The first m elements
contain the selected eigenvalues of the matrix A in ascending
order.
z REAL for ssyevx
Array z(ldz,*) contains eigenvectors.
The second dimension of z must be at least max(1, m).
1151
If jobz = 'V', then if info = 0, the first m columns of z

contain the orthonormal eigenvectors of the matrix A
If an eigenvector fails to converge, then that column of z
contains the latest approximation to the eigenvector, and
the index of the eigenvector is returned in ifail.
supplied in the array z; if range = 'V', the exact value of
m is not known in advance and an upper bound must be
used.
ifail INTEGER.
If jobz = 'V', then if info = 0, the first m elements of
ifail are zero; if info > 0, then ifail contains the indices
of the eigenvectors that failed to converge.
If jobz = 'V', then ifail is not referenced.
info INTEGER.
If info = i, then i eigenvectors failed to converge; their
indices are stored in the array ifail.

Specific details for the routine syevx interface are the following:
ifail Holds the vector of length n.
1152
vl Default value for this element is vl = -HUGE(vl).
vu Default value for this element is vu = HUGE(vl).
abstol Default value for this element is abstol = 0.0_WP.
jobz Restored based on the presence of the argument z as follows: jobz
= 'V', if z is present, jobz = 'N', if z is omitted Note that there will
be an error condition if ifail is present and z is omitted.
range = 'V', if one of or both vl and vu are present, range = 'I',
if one of or both il and iu are present, range = 'A', if none of vl,
vu, il, iu is present, Note that there will be an error condition if one
of or both vl and vu are present and at the same time one of or both
il and iu are present.
Application Notes
For optimum performance use lwork ≥ (nb+3)*n, where nb is the maximum of the blocksize
for ?sytrd and ?ormtr returned by ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run
or set lwork = -1.
If lwork has any of admissible sizes, which is no less than the minimal value described, then
the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work
on exit. Use this value (work(1)) for subsequent runs.
in the first element of the corresponding array work. This operation is called a workspace query.
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval
[a,b] of width less than or equal to abstol+ε*max(|a|,|b|), where ε is the machine precision.
1153
If abstol is less than or equal to zero, then ε*|T| is used as tolerance, where|T| is the 1-norm
of the tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues are computed
most accurately when abstol is set to twice the underflow threshold 2*slamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try
setting abstol to 2*slamch('S').
?heevx
Syntax
FORTRAN 77:
call cheevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, work, lwork, rwork, iwork, ifail, info)
call zheevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, work, lwork, rwork, iwork, ifail, info)
Fortran 95:
call heevx(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian
Note that for most cases of complex Hermetian eigenvalue problems the default choice should
be ?heevr function as its underlying algorithm is faster and uses less workspace. ?heevx is
faster for a few selected eigenvalues.
Input Parameters
1154
computed.
range CHARACTER*1. Must be 'A', 'V', or 'I'.
iu will be found.
a, work COMPLEX for cheevx
DOUBLE COMPLEX for zheevx.
Arrays:
uplo.
least max(1, n).
vl, vu REAL for cheevx
DOUBLE PRECISION for zheevx.
range = 'A'or 'I'.
il, iu INTEGER.
If range = 'I', the indices of the smallest and largest
eigenvalues to be returned. Constraints:
1 ≤ il ≤ iu ≤ n, if n > 0;il = 1 and iu = 0, if n = 0. Not
referenced if range = 'A'or 'V'.
abstol REAL for cheevx
DOUBLE PRECISION for zheevx. The absolute error tolerance
for the eigenvalues. See Application Notes for more
information.
1155

1.
If jobz = 'V', then ldz ≥max(1, n).
lwork INTEGER.
lwork ≥ 1 if n≤1; otherwise at least 2*n.
rwork REAL for cheevx
DOUBLE PRECISION for zheevx.
Output Parameters
overwritten.
m INTEGER. The total number of eigenvalues found; 0 ≤ m ≤
n.
w REAL for cheevx
DOUBLE PRECISION for zheevx
order.
z COMPLEX for cheevx
DOUBLE COMPLEX for zheevx.
Array z(ldz,*) contains eigenvectors.
1156
If jobz = 'N', then z is not referenced. Note: you must
ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known
in advance and an upper bound must be used.
ifail INTEGER.
ifail are zero; if info > 0, then ifail contains the indices
If jobz = 'V', then ifail is not referenced.
info INTEGER.

Specific details for the routine heevx interface are the following:
z Holds the matrix Z of size (n, n).
1157

be an error condition if ifail is present and z is omitted.
Application Notes
for ?hetrd and ?unmtr returned by ilaenv.
workspace query.
workspace.
[a,b] of width less than or equal to abstol+ε*max(|a|,|b|) , where ε is the machine precision.
1158
If abstol is less than or equal to zero, then ε*|T| will be used in its place, where |T| is the
1-norm of the tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues will
be computed most accurately when abstol is set to twice the underflow threshold 2*slamch('S'),
not zero.
setting abstol to 2*slamch('S').
?syevr
eigenvectors of a real symmetric matrix using the
Relatively Robust Representations.
Syntax
FORTRAN 77:
call ssyevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, isuppz, work, lwork, iwork, liwork, info)
call dsyevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, isuppz, work, lwork, iwork, liwork, info)
Fortran 95:
call syevr(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol]
[,info])
Description
The routine first reduces the matrix A to tridiagonal form T with a call to ?sytrd. Then, whenever
possible, ?syevr calls ?stemr to compute the eigenspectrum using Relatively Robust
Representations. ?stemr computes eigenvalues by the dqds algorithm, while orthogonal
T
eigenvectors are computed from various “good” L*D*L representations (also known as Relatively
1159
Robust Representations). Gram-Schmidt orthogonalization is avoided as far as possible. More

specifically, the various steps of the algorithm are as follows. For the each unreduced block of
T:
a. Compute T - σ*I = L*D*LT, so that L and D define all the wanted eigenvalues to high
relative accuracy. This means that small relative changes in the entries of D and L cause
have to be computed, see Steps c) and d).
d. For each eigenvalue with a large enough relative separation, compute the corresponding
eigenvector by forming a rank revealing twisted factorization. Go back to Step c) for any
The desired accuracy of the output can be specified by the input parameter abstol.
The routine ?syevr calls ?stemr when the full spectrum is requested on machines that conform
to the IEEE-754 floating point standard. ?syevr calls ?stebz and ?stein on non-IEEE machines
and when partial spectrum requests are made.
Note that ?syevr is preferable for most cases of real symmetric eigenvalue problems as its
underlying algorithm is fast and uses less workspace.
Input Parameters
computed.
vl < lambda(i) ≤ vu.
indices il to iu.
1160
For range = 'V'or 'I' and iu-il < n-1, sstebz/dstebz
and sstein/dstein are called.
a, work REAL for ssyevr
DOUBLE PRECISION for dsyevr.
Arrays:
uplo.
least max(1, n).
vl, vu REAL for ssyevr
Constraint: vl< vu.
il, iu INTEGER.
Constraint:
1 ≤ il ≤ iu ≤ n, if n > 0;
il=1 and iu=0, if n = 0.
abstol REAL for ssyevr
DOUBLE PRECISION for dsyevr. The absolute error tolerance
to which each eigenvalue/eigenvector is required.
If jobz = 'V', the eigenvalues and eigenvectors output
have residual norms bounded by abstol, and the dot
products between different eigenvectors are bounded by
abstol.
1161
If abstol < n *eps*|T|, then n *eps*|T| is used

instead, where eps is the machine precision, and |T| is the
1-norm of the matrix T. The eigenvalues are computed to
an accuracy of eps*|T| irrespective of abstol.
If high relative accuracy is important, set abstol to ?lam-
ch('S').
Constraints:
ldz ≥ 1 and
ldz ≥ max(1, n) if jobz = 'V'.
lwork INTEGER.
Constraint: lwork ≥ max(1, 26n).
liwork INTEGER.
The dimension of the array iwork, lwork ≥ max(1, 10n).
Output Parameters
overwritten.
m INTEGER. The total number of eigenvalues found, 0 ≤ m ≤
n.
w, z REAL for ssyevr
1162
Arrays:
w(*), DIMENSION at least max(1, n), contains the selected
eigenvalues in ascending order, stored in w(1) to w(m);
z(ldz, *), the second dimension of z must be at least
max(1, m).
If jobz = 'N', then z is not referenced. Note that you must
ensure that at least max(1, m) columns are supplied in the
array z ; if range = 'V', the exact value of m is not known
in advance and an upper bound must be used.
isuppz INTEGER.
Array, DIMENSION at least 2 *max(1, m).
The support of the eigenvectors in z, i.e., the indices
indicating the nonzero elements in z. The i-th eigenvector
is nonzero only in elements isuppz( 2i-1) through isuppz(
2i ). Referenced only if eigenvectors are needed (jobz =
'V') and all eigenvalues are needed, that is, range = 'A'
or range = 'I' and il = 1 and iu = n.
info INTEGER.
If info = i, an internal error has occurred.

Specific details for the routine syevr interface are the following:
1163

z Holds the matrix Z of size (n, n), where the values n and m are
significant.
isuppz Holds the vector of length (2*m), where the values (2*m) are significant.
be an error condition if isuppz is present and z is omitted.
Application Notes
for ?sytrd and ?ormtr returned by ilaenv.
subsequent runs.
1164
Normal execution of ?stegr may create NaNs and infinities and hence may abort due to a
floating point exception in environments which do not handle NaNs and infinities in the IEEE
standard default manner.
?heevr
eigenvectors of a Hermitian matrix using the
Relatively Robust Representations.
Syntax
FORTRAN 77:
call cheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
call zheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z,
ldz, isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
Fortran 95:
call heevr(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol]
[,info])
Description
The routine first reduces the matrix A to tridiagonal form T with a call to ?hetrd. Then, whenever
possible, ?heevr calls ?stegr to compute the eigenspectrum using Relatively Robust
Representations. ?stegr computes eigenvalues by the dqds algorithm, while orthogonal
T
1165

specifically, the various steps of the algorithm are as follows. For each unreduced block
(submatrix) of T:
a. Compute T - σ*I = L*D*LT, so that L and D define all the wanted eigenvalues to high
relative accuracy. This means that small relative changes in the entries of D and L cause
have to be computed, see Steps c) and d).
d. For each eigenvalue with a large enough relative separation, compute the corresponding
eigenvector by forming a rank revealing twisted factorization. Go back to Step c) for any
The routine ?heevr calls ?stemr when the full spectrum is requested on machines which
conform to the IEEE-754 floating point standard, or ?stebz and ?stein on non-IEEE machines
and when partial spectrum requests are made.
Note that the routine ?heevr is preferable for most cases of complex Hermitian eigenvalue
problems as its underlying algorithm is fast and uses less workspace.
Input Parameters
computed.
lambda(i) in the half-open interval: vl< lambda(i) ≤
vu.
indices il to iu.
1166
For range = 'V'or 'I', sstebz/dstebz and
cstein/zstein are called.
a, work COMPLEX for cheevr
DOUBLE COMPLEX for zheevr.
Arrays:
uplo.
vl, vu REAL for cheevr
DOUBLE PRECISION for zheevr.
Constraint: vl< vu.
il, iu INTEGER.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0 if
n = 0.
abstol REAL for cheevr
The absolute error tolerance to which each
eigenvalue/eigenvector is required.
abstol.
1167
If abstol < n *eps*|T|, then n *eps*|T| will be used

in its place, where eps is the machine precision, and |T| is
the 1-norm of the matrix T. The eigenvalues are computed
to an accuracy of eps*|T| irrespective of abstol.
ch('S').
Constraints:
ldz ≥ 1 if jobz = 'N';
lwork INTEGER.
Constraint: lwork ≥ max(1, 2n).
rwork REAL for cheevr
Workspace array, DIMENSION max(1, lwork).
lrwork INTEGER.
The dimension of the array rwork;
lwork ≥ max(1, 24n).
liwork INTEGER.
The dimension of the array iwork,
lwork ≥ max(1, 10n).
1168
Output Parameters
overwritten.
0 ≤ m ≤ n.
w REAL for cheevr
Array, DIMENSION at least max(1, n), contains the selected
eigenvalues in ascending order, stored in w(1) to w(m).
z COMPLEX for cheevr
DOUBLE COMPLEX for zheevr.
Array z(ldz, *); the second dimension of z must be at least
max(1, m).
used.
isuppz INTEGER.
is nonzero only in elements isuppz(2i-1) through
1169
isuppz(2i). Referenced only if eigenvectors are needed

(jobz = 'V') and all eigenvalues are needed, that is, range
= 'A' or range = 'I' and il = 1 and iu = n.
rwork(1) On exit, if info = 0, then rwork(1) returns the required
info INTEGER.

Specific details for the routine heevr interface are the following:
significant.
isuppz Holds the vector of length (2*n), where the values (2*m) are significant.
be an error condition if isuppz is present and z is omitted.
1170
Application Notes
for ?hetrd and ?unmtr returned by ilaenv.
If you are in doubt how much workspace to supply, use a generous value of lwork (or lrwork,
or liwork) for the first run or set lwork = -1 (lrwork = -1, liwork = -1).
If you choose the first option and set any of admissible lwork (or lrwork, liwork) sizes, which
the first element of the corresponding array (work, rwork, iwork) on exit. Use this value
(work(1), rwork(1), iwork(1)) for subsequent runs.
workspace in the first element of the corresponding array (work, rwork, iwork). This operation
is called a workspace query.
Note that if you set lwork (lrwork, liwork) to less than the minimal required value and not
Normal execution of ?stegr may create NaNs and infinities and hence may abort due to a
floating point exception in environments which do not handle NaNs and infinities in the IEEE
standard default manner.
1171
?spev
eigenvectors of a real symmetric matrix in packed
storage.
Syntax
FORTRAN 77:
call sspev(jobz, uplo, n, ap, w, z, ldz, work, info)
call dspev(jobz, uplo, n, ap, w, z, ldz, work, info)
Fortran 95:
call spev(ap, w [,uplo] [,z] [,info])
Description
The routine computes all the eigenvalues and, optionally, eigenvectors of a real symmetric
matrix A in packed storage.
Input Parameters
computed.
If uplo = 'U', ap stores the packed upper triangular part
of A.
If uplo = 'L', ap stores the packed lower triangular part
of A.
ap, work REAL for sspev
DOUBLE PRECISION for dspev
Arrays:
1172
ap(*) contains the packed upper or lower triangle of
symmetric matrix A, as specified by uplo.
work (*) is a workspace array, DIMENSION at least max(1, 3n).
Constraints:
if jobz = 'N', then ldz ≥ 1;
if jobz = 'V', then ldz ≥ max(1, n).
Output Parameters
w, z REAL for sspev
DOUBLE PRECISION for dspev
Arrays:
w(*), DIMENSION at least max(1, n).
If info = 0, w contains the eigenvalues of the matrix A in
ascending order.
z(ldz,*). The second dimension of z must be at least
max(1, n).
If jobz = 'V', then if info = 0, z contains the
orthonormal eigenvectors of the matrix A, with the i-th
ap On exit, this array is overwritten by the values generated
during the reduction to tridiagonal form. The elements of
the diagonal and the off-diagonal of the tridiagonal matrix
overwrite the corresponding elements of A.
info INTEGER.
1173

Specific details for the routine spev interface are the following:
w Holds the vector with the number of elements n.
= 'V', if z is present, jobz = 'N', if z is omitted.
?hpev
eigenvectors of a Hermitian matrix in packed
storage.
Syntax
FORTRAN 77:
call chpev(jobz, uplo, n, ap, w, z, ldz, work, rwork, info)
call zhpev(jobz, uplo, n, ap, w, z, ldz, work, rwork, info)
Fortran 95:
call hpev(ap, w [,uplo] [,z] [,info])
Description
The routine computes all the eigenvalues and, optionally, eigenvectors of a complex Hermitian
matrix A in packed storage.
Input Parameters
1174
computed.
of A.
of A.
ap, work COMPLEX for chpev
DOUBLE COMPLEX for zhpev.
Arrays:
Hermitian matrix A, as specified by uplo.
work (*) is a workspace array, DIMENSION at least max(1, 2n-1).
Constraints:
if jobz = 'V', then ldz ≥ max(1, n) .
rwork REAL for chpev
DOUBLE PRECISION for zhpev.
Output Parameters
w REAL for chpev
DOUBLE PRECISION for zhpev.
If info = 0, w contains the eigenvalues of the matrix A in
ascending order.
z COMPLEX for chpev
DOUBLE COMPLEX for zhpev.
Array z(ldz,*).
1175

info INTEGER.

Specific details for the routine hpev interface are the following:
1176
?spevd
Uses divide and conquer algorithm to compute all
eigenvalues and (optionally) all eigenvectors of a
real symmetric matrix held in packed storage.
Syntax
FORTRAN 77:
call sspevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, iwork, liwork, info)
call dspevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, iwork, liwork, info)
Fortran 95:
call spevd(ap, w [,uplo] [,z] [,info])
Description
matrix A (held in packed storage). In other words, it can compute the spectral factorization of
A as:
A = Z*Λ*ZT.
A*zi = λi*zi for i = 1, 2, ..., n.

Input Parameters
computed.
1177

of A.
of A.
ap, work REAL for sspevd
DOUBLE PRECISION for dspevd
Arrays:
The dimension of ap must be at least max(1, n*(n+1)/2)
Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ 2*n;
if jobz = 'V' and n > 1, then
lwork ≥ n2+ 6*n + 1.
for details.
liwork INTEGER.
Constraints:
1178
for details.
Output Parameters
w, z REAL for sspevd
DOUBLE PRECISION for dspevd
Arrays:
z(ldz,*).
The second dimension of z must be: at least 1 if jobz =
'N';at least max(1, n) if jobz = 'V'.
If jobz = 'V', then this array is overwritten by the
lwork.
liwork.
info INTEGER.
1179

Specific details for the routine spevd interface are the following:
Application Notes
If lwork = -1 (liwork = -1) , the routine returns immediately and provides the recommended
a workspace query.
The complex analogue of this routine is ?hpevd.
See also ?syevd for matrices held in full storage, and ?sbevd for banded matrices.
1180
?hpevd
Uses divide and conquer algorithm to compute all
eigenvalues and (optionally) all eigenvectors of a
complex Hermitian matrix held in packed storage.
Syntax
FORTRAN 77:
call chpevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, rwork, lrwork, iwork,
liwork, info)
call zhpevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, rwork, lrwork, iwork,
liwork, info)
Fortran 95:
call hpevd(ap, w [,uplo] [,z] [,info])
Description
Hermitian matrix A (held in packed storage). In other words, it can compute the spectral
factorization of A as: A = Z*Λ*ZH.
A*zi = λi*zi for i = 1, 2, ..., n.

Input Parameters
computed.
1181

of A.
of A.
ap, work COMPLEX for chpevd
DOUBLE COMPLEX for zhpevd
Arrays:
Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ n;
if jobz = 'V' and n > 1, then lwork ≥ 2*n.
rwork REAL for chpevd
DOUBLE PRECISION for zhpevd
Workspace array, its dimension max(1, lrwork).
lrwork INTEGER.
The dimension of the array rwork. Constraints:
1182
if jobz = 'N' and n > 1, then lrwork ≥ n;
if jobz = 'V' and n > 1, then lrwork ≥ 2*n2 + 5*n +
1.
liwork INTEGER.
Constraints:
Output Parameters
w REAL for chpevd
DOUBLE PRECISION for zhpevd
z COMPLEX for chpevd
DOUBLE COMPLEX for zhpevd
Array, DIMENSION (ldz,*).
at least 1 if jobz = 'N';
1183
at least max(1, n) if jobz = 'V'.

If jobz = 'V', then this array is overwritten by the unitary
matrix Z which contains the eigenvectors of A.
info INTEGER.

Specific details for the routine hpevd interface are the following:
1184
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T + E such that ||E||2 =
The real analogue of this routine is ?spevd.
See also ?heevd for matrices held in full storage, and ?hbevd for banded matrices.
?spevx
eigenvectors of a real symmetric matrix in packed
storage.
Syntax
FORTRAN 77:
call sspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz,
work, iwork, ifail, info)
call dspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz,
work, iwork, ifail, info)
1185
Fortran 95:
call spevx(ap, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Description
matrix A in packed storage. Eigenvalues and eigenvectors can be selected by specifying either
a range of values or a range of indices for the desired eigenvalues.
Input Parameters
computed.
lambda(i) in the half-open interval: vl< lambda(i)≤ vu.
indices il to iu.
of A.
of A.
ap, work REAL for sspevx
DOUBLE PRECISION for dspevx
Arrays:
ap(*) contains the packed upper or lower triangle of the
1186
work(*) is a workspace array, DIMENSION at least max(1,
8n).
vl, vu REAL for sspevx
Constraint: vl< vu.
il, iu INTEGER.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
abstol REAL for sspevx
The absolute error tolerance to which each eigenvalue is
required. See Application notes for details on error tolerance.
Constraints:
Output Parameters
0 ≤ m ≤ n. If range = 'A', m = n, and if range = 'I',
m = iu-il+1.
w, z REAL for sspevx
Arrays:
1187

If info = 0, contains the selected eigenvalues of the matrix
A in ascending order.
z(ldz,*).
used.
ifail INTEGER.
ifail are zero; if info > 0, the ifail contains the indices
the eigenvectors that failed to converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.

Specific details for the routine spevx interface are the following:
1188
significant.
ifail Holds the vector with the number of elements n.
jobz = 'N', if z is omitted
Note that there will be an error condition if ifail is present and z is
omitted.
Application Notes
If abstol is less than or equal to zero, then ε*||T||1 will be used in its place, where T is the
tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues will be computed
most accurately when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
setting abstol to 2*?lamch('S').
1189
?hpevx
eigenvectors of a Hermitian matrix in packed
storage.
Syntax
FORTRAN 77:
call chpevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz,
work, rwork, iwork, ifail, info)
call zhpevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz,
work, rwork, iwork, ifail, info)
Fortran 95:
call hpevx(ap, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Description
matrix A in packed storage. Eigenvalues and eigenvectors can be selected by specifying either
a range of values or a range of indices for the desired eigenvalues.
Input Parameters
computed.
lambda(i) in the half-open interval: vl<lambda(i) ≤ vu.
indices il to iu.
1190
of A.
of A.
ap, work COMPLEX for chpevx
DOUBLE COMPLEX for zhpevx
Arrays:
2n).
vl, vu REAL for chpevx
DOUBLE PRECISION for zhpevx
Constraint: vl< vu.
il, iu INTEGER.
n = 0.
abstol REAL for chpevx
Constraints:
rwork REAL for chpevx
1191

Output Parameters
n.
w REAL for chpevx
If info = 0, contains the selected eigenvalues of the matrix
z COMPLEX for chpevx
DOUBLE COMPLEX for zhpevx
Array z(ldz,*).
used.
ifail INTEGER.
1192
info INTEGER.

Specific details for the routine hpevx interface are the following:
significant.
omitted.
1193

Application Notes
?sbev
eigenvectors of a real symmetric band matrix.
Syntax
FORTRAN 77:
call ssbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, info)
call dsbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, info)
Fortran 95:
call sbev(ab, w [,uplo] [,z] [,info])
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric band
matrix A.
Input Parameters
1194
computed.
(kd ≥ 0).
ab, work REAL for ssbev
DOUBLE PRECISION for dsbev.
Arrays:
work (*) is a workspace array.
The dimension of work must be at least max(1, 3n-2).
ldab INTEGER. The leading dimension of ab; must be at least kd
+1.
Constraints:
Output Parameters
w, z REAL for ssbev
DOUBLE PRECISION for dsbev
Arrays:
ascending order.
z(ldz,*).
1195

ab On exit, this array is overwritten by the values generated
during the reduction to tridiagonal form.
If uplo = 'U', the first superdiagonal and the diagonal of
the tridiagonal matrix T are returned in rows kd and kd+1
of ab, and if uplo = 'L', the diagonal and first subdiagonal
of T are returned in the first two rows of ab.
info INTEGER.

Specific details for the routine sbev interface are the following:
1196
?hbev
eigenvectors of a Hermitian band matrix.
Syntax
FORTRAN 77:
call chbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, rwork, info)
call zhbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, rwork, info)
Fortran 95:
call hbev(ab, w [,uplo] [,z] [,info])
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a complex Hermitian
band matrix A.
Input Parameters
computed.
(kd ≥ 0).
ab, work COMPLEX for chbev
DOUBLE COMPLEX for zhbev.
Arrays:
1197

+1.
Constraints:
rwork REAL for chbev
DOUBLE PRECISION for zhbev
Output Parameters
w REAL for chbev
DOUBLE PRECISION for zhbev
If info = 0, contains the eigenvalues in ascending order.
z COMPLEX for chbev
DOUBLE COMPLEX for zhbev.
Array z(ldz,*).
info INTEGER.
1198
If info = i, then the algorithm failed to converge;
i indicates the number of elements of an intermediate

Specific details for the routine hbev interface are the following:
?sbevd
eigenvectors of a real symmetric band matrix using
Syntax
FORTRAN 77:
call ssbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, iwork, liwork,
info)
call dsbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, iwork, liwork,
info)
Fortran 95:
call sbevd(ab, w [,uplo] [,z] [,info])
1199
Description
band matrix A. In other words, it can compute the spectral factorization of A as:
A = Z*Λ*ZT
A*zi = λi*zi for i = 1, 2, ..., n.

Input Parameters
computed.
(kd ≥ 0).
ab, work REAL for ssbevd
DOUBLE PRECISION for dsbevd.
Arrays:
lwork).
1200
ldab INTEGER. The leading dimension of ab; must be at least
kd+1.
Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ 2n;
if jobz = 'V' and n > 1, then lwork ≥ 2*n2 + 5*n +
1.
routine only calculates the optimal size of the work and
for details.
liwork INTEGER.
The dimension of the array iwork. Constraints: if n ≤ 1,
then liwork < 1; if job = 'N' and n > 1, then liwork
< 1; if job = 'V' and n > 1, then liwork < 5*n+3.
for details.
Output Parameters
w, z REAL for ssbevd
DOUBLE PRECISION for dsbevd
Arrays:
1201

z(ldz,*).
at least 1 if job = 'N';
at least max(1, n) if job = 'V'.
If job = 'V', then this array is overwritten by the
The i-th column of Z contains the eigenvector which
corresponds to the eigenvalue w(i).
If job = 'N', then z is not referenced.
info INTEGER.

Specific details for the routine sbevd interface are the following:
1202
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T+E such that
||E||2=O(ε)*||T||2, where ε is the machine precision.
If any of admissible lwork (or liwork) has any of admissible sizes, which is no less than the
subsequent runs.
a workspace query.
Note that if work (liwork) is less than the minimal required value and is not equal to -1, the
The complex analogue of this routine is ?hbevd.
See also ?syevd for matrices held in full storage, and ?spevd for matrices held in packed
storage.
1203
?hbevd
eigenvectors of a complex Hermitian band matrix
using divide and conquer algorithm.
Syntax
FORTRAN 77:
call chbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, rwork, lrwork,
iwork, liwork, info)
call zhbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, rwork, lrwork,
Fortran 95:
call hbevd(ab, w [,uplo] [,z] [,info])
Description
Hermitian band matrix A. In other words, it can compute the spectral factorization of A as: A
= Z*Λ*ZH.
A*zi = λi*zi for i = 1, 2, ..., n.

Input Parameters
computed.
1204
(kd ≥ 0).
ab, work COMPLEX for chbevd
DOUBLE COMPLEX for zhbevd.
Arrays:
work (*) is a workspace array , its dimension max(1,
lwork).
ldab INTEGER. The leading dimension of ab; must be at least
kd+1.
Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ n;
if jobz = 'V' and n > 1, then lwork ≥ 2*n2.
rwork REAL for chbevd
1205
DOUBLE PRECISION for zhbevd

Workspace array, DIMENSION at least lrwork.
lrwork INTEGER.
The dimension of the array rwork.
Constraints:
if jobz = 'N' and n > 1, then lrwork ≥ n;
if jobz = 'V' and n > 1, then lrwork ≥ 2*n2 + 5*n +
1.
iwork INTEGER. Workspace array, DIMENSION max(1, liwork).
liwork INTEGER.
Constraints:
if jobz = 'N' or n ≤ 1, then liwork ≥ 1;
Output Parameters
w REAL for chbevd
DOUBLE PRECISION for zhbevd
z COMPLEX for chbevd
1206
DOUBLE COMPLEX for zhbevd
Array, DIMENSION (ldz,*).
If jobz = 'V', then this array is overwritten by the unitary
matrix Z which contains the eigenvectors of A. The i-th
column of Z contains the eigenvector which corresponds to
the eigenvalue w(i).
work(1) On exit, if lwork > 0, then the real part of work(1) returns
the required minimal size of lwork.
rwork(1) On exit, if lrwork > 0, then rwork(1) returns the required
info INTEGER.

Specific details for the routine hbevd interface are the following:
1207

Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T + E such that ||E||2 =
O(ε)||T||2, where ε is the machine precision.
The real analogue of this routine is ?sbevd.
See also ?heevd for matrices held in full storage, and ?hpevd for matrices held in packed
storage.
?sbevx
eigenvectors of a real symmetric band matrix.
Syntax
FORTRAN 77:
call ssbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol,
m, w, z, ldz, work, iwork, ifail, info)
call dsbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol,
m, w, z, ldz, work, iwork, ifail, info)
1208
Fortran 95:
call sbevx(ab, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q]
[,abstol] [,info])
Description
band matrix A. Eigenvalues and eigenvectors can be selected by specifying either a range of
values or a range of indices for the desired eigenvalues.
Input Parameters
computed.
lambda(i) in the half-open interval: vl<lambda(i)≤vu.
If range = 'I', the routine computes eigenvalues in range
il to iu.
(kd ≥ 0).
ab, work REAL for ssbevx
DOUBLE PRECISION for dsbevx.
Arrays:
1209

+1.
vl, vu REAL for ssbevx
DOUBLE PRECISION for dsbevx.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for chpevx
ldq, ldz INTEGER. The leading dimensions of the output arrays q
and z, respectively.
Constraints:
ldq ≥ 1, ldz ≥ 1;
If jobz = 'V', then ldq ≥ max(1, n) and ldz ≥ max(1,
n).
Output Parameters
q REAL for ssbevx DOUBLE PRECISION for dsbevx.
Array, DIMENSION (ldz,n).
If jobz = 'V', the n-by-n orthogonal matrix is used in the
reduction to tridiagonal form.
If jobz = 'N', the array q is not referenced.
1210
n.
w, z REAL for ssbevx
DOUBLE PRECISION for dsbevx
Arrays:
w(*), DIMENSION at least max(1, n). The first m elements
of w contain the selected eigenvalues of the matrix A in
ascending order.
z(ldz,*).
used.
ifail INTEGER.
info INTEGER.
1211


Specific details for the routine sbevx interface are the following:
significant.
q Holds the matrix Q of size (n, n).
Note that there will be an error condition if either ifail or q is present
and z is omitted.
1212
Application Notes
If abstol is less than or equal to zero, then ε*||T||1 is used as tolerance, where T is the
?hbevx
eigenvectors of a Hermitian band matrix.
Syntax
FORTRAN 77:
call chbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol,
m, w, z, ldz, work, rwork, iwork, ifail, info)
call zhbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol,
m, w, z, ldz, work, rwork, iwork, ifail, info)
Fortran 95:
call hbevx(ab, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q]
[,abstol] [,info])
Description
band matrix A. Eigenvalues and eigenvectors can be selected by specifying either a range of
values or a range of indices for the desired eigenvalues.
Input Parameters
1213

computed.
lambda(i) in the half-open interval: vl< lambda(i) ≤
vu.
indices il to iu.
(kd ≥ 0).
ab, work COMPLEX for chbevx
DOUBLE COMPLEX for zhbevx.
Arrays:
+1.
vl, vu REAL for chbevx
DOUBLE PRECISION for zhbevx.
Constraint: vl< vu.
il, iu INTEGER.
1214
n = 0.
abstol REAL for chbevx
DOUBLE PRECISION for zhbevx.
ldq, ldz INTEGER. The leading dimensions of the output arrays q
and z, respectively.
Constraints:
ldq ≥ 1, ldz ≥ 1;
If jobz = 'V', then ldq ≥ max(1, n) and ldz ≥ max(1,
n).
rwork REAL for chbevx
DOUBLE PRECISION for zhbevx
iwork INTEGER. Workspace array, DIMENSION at least max(1,
5n).
Output Parameters
q COMPLEX for chbevx DOUBLE COMPLEX for zhbevx.
Array, DIMENSION (ldz,n).
If jobz = 'V', the n-by-n unitary matrix is used in the
reduction to tridiagonal form.
If jobz = 'N', the array q is not referenced.
0 ≤ m ≤ n.
w REAL for chbevx
DOUBLE PRECISION for zhbevx
order.
1215
z COMPLEX for chbevx

DOUBLE COMPLEX for zhbevx.
Array z(ldz,*).
used.
ifail INTEGER.
info INTEGER.
1216
Specific details for the routine hbevx interface are the following:
significant.
Note that there will be an error condition if either ifail or q is present
and z is omitted.
Application Notes
[a,b] of width less than or equal to abstol + ε * max( |a|,|b| ), where ε is the machine
precision.
1217
?stev
eigenvectors of a real symmetric tridiagonal matrix.
Syntax
FORTRAN 77:
call sstev(jobz, n, d, e, z, ldz, work, info)
call dstev(jobz, n, d, e, z, ldz, work, info)
Fortran 95:
call stev(d, e [,z] [,info])
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix A.
Input Parameters
computed.
d, e, work REAL for sstev
DOUBLE PRECISION for dstev.
Arrays:
1218
d(*) contains the n diagonal elements of the tridiagonal
matrix A.
e(*) contains the n-1 subdiagonal elements of the tridiagonal
matrix A.
The dimension of e must be at least max(1, n-1). The n-th
element of this array is used as workspace.
The dimension of work must be at least max(1, 2n-2).
If jobz = 'N', work is not referenced.
ldz INTEGER. The leading dimension of the output array z; ldz
≥ 1. If jobz = 'V' then ldz ≥ max(1, n).
Output Parameters
d On exit, if info = 0, contains the eigenvalues of the matrix
z REAL for sstev
DOUBLE PRECISION for dstev
column of z holding the eigenvector associated with the
eigenvalue returned in d(i).
If job = 'N', then z is not referenced.
e On exit, this array is overwritten with intermediate results.
info INTEGER.
If info = i, then the algorithm failed to converge;
i elements of e did not converge to zero.

1219
Specific details for the routine stev interface are the following:
?stevd
eigenvectors of a real symmetric tridiagonal matrix
using divide and conquer algorithm.
Syntax
FORTRAN 77:
call sstevd(jobz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call dstevd(jobz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
Fortran 95:
call stevd(d, e [,z] [,info])
Description
tridiagonal matrix T. In other words, the routine can compute the spectral factorization of T
as: T = Z*Λ*ZT.
T*zi = λi*zi for i = 1, 2, ..., n.
1220
There is no complex analogue of this routine.
Input Parameters
computed.
d, e, work REAL for sstevd
DOUBLE PRECISION for dstevd.
Arrays:
matrix T.
e(*) contains the n-1 off-diagonal elements of T.
The dimension of work must be at least lwork.
Constraints:
ldz ≥ 1 if job = 'N';
ldz < max(1, n) if job = 'V'.
lwork INTEGER.
Constraints:
if jobz = 'N' or n ≤ 1, then lwork ≥ 1;
if jobz = 'V' and n > 1, then lwork ≥ n2 + 4*n + 1.
1221

for details.
liwork INTEGER.
Constraints:
if jobz = 'N' or n ≤ 1, then liwork ≥ 1;
for details.
Output Parameters
d On exit, if info = 0, contains the eigenvalues of the matrix
T in ascending order.
See also info.
z REAL for sstevd
DOUBLE PRECISION for dstevd
If jobz = 'V', then this array is overwritten by the
orthogonal matrix Z which contains the eigenvectors of T.
e On exit, this array is overwritten with intermediate results.
info INTEGER.
1222

Specific details for the routine stevd interface are the following:
Application Notes
If λi is an exact eigenvalue, and mi is the corresponding computed value, then
|μi - λi| ≤ c(n)*ε*||T||2

θ(zi, wi) ≤ c(n)*ε*||T||2 / min i≠j|λi - λj|.
Thus the accuracy of a computed eigenvector depends on the gap between its eigenvalue and
all the other eigenvalues.
1223
subsequent runs.
?stevx
Syntax
FORTRAN 77:
call sstevx(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, work,
iwork, ifail, info)
call dstevx(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, work,
iwork, ifail, info)
Fortran 95:
call stevx(d, e, w [, z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Description
tridiagonal matrix A. Eigenvalues and eigenvectors can be selected by specifying either a range
of values or a range of indices for the desired eigenvalues.
1224
Input Parameters
computed.
lambda(i) in the half-open interval: vl<lambda(i)≤ vu.
indices il to iu.
d, e, work REAL for sstevx
DOUBLE PRECISION for dstevx.
Arrays:
matrix A.
e(*) contains the n-1 subdiagonal elements of A.
vl, vu REAL for sstevx
Constraint: vl< vu.
il, iu INTEGER.
n = 0.
1225
abstol REAL for sstevx

DOUBLE PRECISION for dstevx. The absolute error tolerance
to which each eigenvalue is required. See Application notes
for details on error tolerance.
ldz INTEGER. The leading dimensions of the output array z;
ldz ≥ 1. If jobz = 'V', then ldz ≥ max(1, n).
Output Parameters
0 ≤ m ≤ n.
w, z REAL for sstevx
Arrays:
The first m elements of w contain the selected eigenvalues
of the matrix A in ascending order.
z(ldz,*).
used.
d, e On exit, these arrays may be multiplied by a constant factor
chosen to avoid overflow or underflow in computing the
eigenvalues.
ifail INTEGER.
1226
info INTEGER.

Specific details for the routine stevx interface are the following:
significant.
omitted.
1227

Application Notes
If abstol is less than or equal to zero, then ε*||A||1 is used instead. Eigenvalues are computed
If this routine returns with info > 0, indicating that some eigenvectors did not converge, set
abstol to 2*?lamch('S').
?stevr
eigenvectors of a real symmetric tridiagonal matrix
using the Relatively Robust Representations.
Syntax
FORTRAN 77:
call sstevr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz,
call dstevr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz,
Fortran 95:
call stevr(d, e, w [, z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol]
[,info])
Description
1228
tridiagonal matrix T. Eigenvalues and eigenvectors can be selected by specifying either a range
of values or a range of indices for the desired eigenvalues.
Whenever possible, the routine calls ?stemr to compute the eigenspectrum using Relatively
Robust Representations. ?stegr computes eigenvalues by the dqds algorithm, while orthogonal
T
specifically, the various steps of the algorithm are as follows. For the i-th unreduced block of
T:
a. Compute T - σ = L *D *L T, such that L *D *L T is a relatively robust representation.

i i i i i i i
b. Compute the eigenvalues, λj, of Li*Di*LiT to high relative accuracy by the dqds algorithm.
c. If there is a cluster of close eigenvalues, “choose” σi close to the cluster, and go to Step
(a).
d. Given the approximate eigenvalue λj of Li*Di*LiT, compute the corresponding eigenvector
by forming a rank-revealing twisted factorization.
The routine ?stevr calls ?stemr when the full spectrum is requested on machines which
conform to the IEEE-754 floating point standard. ?stevr calls ?stebz and ?stein on non-IEEE
machines and when partial spectrum requests are made.
Input Parameters
computed.
vl<lambda(i) ≤ vu.
indices il to iu.
1229
For range = 'V'or 'I' and iu-il < n-1, sstebz/dstebz

and sstein/dstein are called.
d, e, work REAL for sstevr
DOUBLE PRECISION for dstevr.
Arrays:
matrix T.
e(*) contains the n-1 subdiagonal elements of A.
lwork).
vl, vu REAL for sstevr
Constraint: vl< vu.
il, iu INTEGER.
Constraint: 1 ≤ il ≤ iu≤ n, if n > 0; il=1 and iu=0 if
n = 0.
abstol REAL for sstevr
The absolute error tolerance to which each
eigenvalue/eigenvector is required.
abstol. If abstol < n *eps*|T|, then n *eps*|T| will
1230
be used in its place, where eps is the machine precision,
and |T| is the 1-norm of the matrix T. The eigenvalues are
computed to an accuracy of eps*|T| irrespective of abstol.
ch('S').
Constraints:
ldz ≥ 1 if jobz = 'N';
lwork INTEGER.
The dimension of the array work. Constraint:
lwork ≥ max(1, 20*n).
for details.
iwork INTEGER.
liwork INTEGER.
The dimension of the array iwork,
lwork ≥ max(1, 10*n).
for details.
Output Parameters
m = iu-il+1.
w, z REAL for sstevr
1231

Arrays:
of the matrix T in ascending order.
z(ldz,*).
used.
d, e On exit, these arrays may be multiplied by a constant factor
chosen to avoid overflow or underflow in computing the
eigenvalues.
isuppz INTEGER.
is nonzero only in elements isuppz( 2i-1) through isuppz(
2i ).
Implemented only for range = 'A' or 'I' and iu-il =
n-1.
info INTEGER.
1232
Specific details for the routine stevr interface are the following:
significant.
isuppz Holds the vector of length (2*n), where the values (2*m) are significant.
omitted.
Application Notes
Normal execution of the routine ?stegr may create NaNs and infinities and hence may abort
due to a floating point exception in environments which do not handle NaNs and infinities in
the IEEE standard default manner.
1233
subsequent runs.
Nonsymmetric Eigenproblems
This section describes LAPACK driver routines used for solving nonsymmetric eigenproblems.
See also computational routines that can be called to solve these problems.
Table 4-11 lists all such driver routines for FORTRAN 77 interface. Respective routine names
Table 4-11 Driver Routines for Solving Nonsymmetric Eigenproblems
?gees Computes the eigenvalues and Schur factorization of a general matrix,

and orders the factorization so that selected eigenvalues are at the top
left of the Schur form.
?geesx Computes the eigenvalues and Schur factorization of a general matrix,

orders the factorization and computes reciprocal condition numbers.
?geev Computes the eigenvalues and left and right eigenvectors of a general
matrix.
?geevx Computes the eigenvalues and left and right eigenvectors of a general
matrix, with preliminary matrix balancing, and computes reciprocal
condition numbers for the eigenvalues and right eigenvectors.
1234
?gees
Computes the eigenvalues and Schur factorization
of a general matrix, and orders the factorization
so that selected eigenvalues are at the top left of
the Schur form.
Syntax
FORTRAN 77:
call sgees(jobvs, sort, select, n, a, lda, sdim, wr, wi, vs, ldvs, work,
lwork, bwork, info)
call dgees(jobvs, sort, select, n, a, lda, sdim, wr, wi, vs, ldvs, work,
lwork, bwork, info)
call cgees(jobvs, sort, select, n, a, lda, sdim, w, vs, ldvs, work, lwork,
rwork, bwork, info)
call zgees(jobvs, sort, select, n, a, lda, sdim, w, vs, ldvs, work, lwork,
rwork, bwork, info)
Fortran 95:
call gees(a, wr, wi [,vs] [,select] [,sdim] [,info])
call gees(a, w [,vs] [,select] [,sdim] [,info])
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues, the
real Schur form T, and, optionally, the matrix of Schur vectors Z. This gives the Schur
factorization A = Z*T*ZH.
Optionally, it also orders the eigenvalues on the diagonal of the real-Schur/Schur form so that
selected eigenvalues are at the top left. The leading columns of Z then form an orthonormal
basis for the invariant subspace corresponding to the selected eigenvalues.
A real matrix is in real-Schur form if it is upper quasi-triangular with 1-by-1 and 2-by-2 blocks.
2-by-2 blocks will be standardized in the form
1235
where b*c < 0. The eigenvalues of such a block are

A complex matrix is in Schur form if it is upper triangular.
Input Parameters
jobvs CHARACTER*1. Must be 'N' or 'V'.
If jobvs = 'N', then Schur vectors are not computed.
If jobvs = 'V', then Schur vectors are computed.
sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or
not to order the eigenvalues on the diagonal of the Schur
form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', eigenvalues are ordered (see select).
select LOGICAL FUNCTION of two REAL arguments for real flavors.
LOGICAL FUNCTION of one COMPLEX argument for complex
flavors.
select must be declared EXTERNAL in the calling subroutine.
If sort = 'S', select is used to select eigenvalues to sort
to the top left of the Schur form.
If sort = 'N', select is not referenced.
For real flavors:
An eigenvalue wr(j)+sqrt(-1)*wi(j) is selected if
select(wr(j), wi(j)) is true; that is, if either one of a
complex conjugate pair of eigenvalues is selected, then both
complex eigenvalues are selected.
Note that a selected complex eigenvalue may no longer
satisfy select(wr(j), wi(j))= .TRUE. after ordering, since
ordering may change the value of complex eigenvalues
(especially if the eigenvalue is ill-conditioned); in this case
info may be set to n+2 (see info below).
An eigenvalue w(j) is selected if select(w(j)) is true.
1236
a, work REAL for sgees
DOUBLE PRECISION for dgees
COMPLEX for cgees
DOUBLE COMPLEX for zgees.
Arrays:
a(lda,*) is an array containing the n-by-n matrix A.
lwork).
least max(1, n).
ldvs INTEGER. The leading dimension of the output array vs.
Constraints:
ldvs ≥ 1;
ldvs ≥ max(1, n) if jobvs = 'V'.
lwork INTEGER.
Constraint:
lwork ≥ max(1, 3n) for real flavors;
lwork ≥ max(1, 2n) for complex flavors.
rwork REAL for cgees
DOUBLE PRECISION for zgees
bwork LOGICAL. Workspace array, DIMENSION at least max(1, n).
Not referenced if sort = 'N'.
Output Parameters
a On exit, this array is overwritten by the real-Schur/Schur
form T.
1237
sdim INTEGER.
If sort = 'N', sdim= 0.
If sort = 'S', sdim is equal to the number of eigenvalues
(after sorting) for which select is true.
Note that for real flavors complex conjugate pairs for which
select is true for either eigenvalue count as 2.
wr, wi REAL for sgees
Arrays, DIMENSION at least max (1, n) each. Contain the
real and imaginary parts, respectively, of the computed
eigenvalues, in the same order that they appear on the
diagonal of the output real-Schur form T. Complex conjugate
eigenvalue having positive imaginary part first.
w COMPLEX for cgees
Array, DIMENSION at least max(1, n). Contains the computed
eigenvalues. The eigenvalues are stored in the same order
as they appear on the diagonal of the output Schur form T.
vs REAL for sgees
COMPLEX for cgees
Array vs(ldvs,*);the second dimension of vs must be at
least max(1, n).
If jobvs = 'V', vs contains the orthogonal/unitary matrix
Z of Schur vectors.
If jobvs = 'N', vs is not referenced.
info INTEGER.
If info = i, and
i ≤ n:
1238
the QR algorithm failed to compute all the eigenvalues;
elements 1:ilo-1 and i+1:n of wr and wi (for real flavors)
or w (for complex flavors) contain those eigenvalues which
have converged; if jobvs = 'V', vs contains the matrix
which reduces A to its partially converged Schur form;
i = n+1:
the eigenvalues could not be reordered because some
eigenvalues were too close to separate (the problem is very
ill-conditioned);
i = n+2:
after reordering, round-off changed values of some complex
eigenvalues so that leading eigenvalues in the Schur form
no longer satisfy select = .TRUE.. This could also be
caused by underflow due to scaling.

Specific details for the routine gees interface are the following:
vs Holds the matrix VS of size (n, n).
jobvs Restored based on the presence of the argument vs as follows:
jobvs = 'V', if vs is present,
jobvs = 'N', if vs is omitted.
sort Restored based on the presence of the argument select as follows:
sort = 'S', if select is present,
sort = 'N', if select is omitted.
Application Notes
1239
workspace query.
workspace.
?geesx
of a general matrix, orders the factorization and
computes reciprocal condition numbers.
Syntax
FORTRAN 77:
call sgeesx(jobvs, sort, select, sense, n, a, lda, sdim, wr, wi, vs, ldvs,
rconde, rcondv, work, lwork, iwork, liwork, bwork, info)
call dgeesx(jobvs, sort, select, sense, n, a, lda, sdim, wr, wi, vs, ldvs,
rconde, rcondv, work, lwork, iwork, liwork, bwork, info)
call cgeesx(jobvs, sort, select, sense, n, a, lda, sdim, w, vs, ldvs, rconde,
rcondv, work, lwork, rwork, bwork, info)
call zgeesx(jobvs, sort, select, sense, n, a, lda, sdim, w, vs, ldvs, rconde,
rcondv, work, lwork, rwork, bwork, info)
Fortran 95:
call geesx(a, wr, wi [,vs] [,select] [,sdim] [,rconde] [,rcondev] [,info])
call geesx(a, w [,vs] [,select] [,sdim] [,rconde] [,rcondev] [,info])
Description
1240
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues, the
real-Schur/Schur form T, and, optionally, the matrix of Schur vectors Z. This gives the Schur
factorization A = Z*T*ZH.
Optionally, it also orders the eigenvalues on the diagonal of the real-Schur/Schur form so that
selected eigenvalues are at the top left; computes a reciprocal condition number for the average
of the selected eigenvalues (rconde); and computes a reciprocal condition number for the right
invariant subspace corresponding to the selected eigenvalues (rcondv). The leading columns
of Z form an orthonormal basis for this invariant subspace.
For further explanation of the reciprocal condition numbers rconde and rcondv, see [LUG],
Section 4.10 (where these quantities are called s and sep respectively).
A real matrix is in real-Schur form if it is upper quasi-triangular with 1-by-1 and 2-by-2 blocks.
2-by-2 blocks will be standardized in the form
where b*c < 0. The eigenvalues of such a block are

A complex matrix is in Schur form if it is upper triangular.
Input Parameters
jobvs CHARACTER*1. Must be 'N' or 'V'.
If jobvs = 'N', then Schur vectors are not computed.
If jobvs = 'V', then Schur vectors are computed.
not to order the eigenvalues on the diagonal of the Schur
form.
If sort = 'S', eigenvalues are ordered (see select).
select LOGICAL FUNCTION of two REAL arguments for real flavors.
LOGICAL FUNCTION of one COMPLEX argument for complex
flavors.
select must be declared EXTERNAL in the calling subroutine.
1241
If sort = 'S', select is used to select eigenvalues to sort

If sort = 'N', select is not referenced.
For real flavors:
An eigenvalue wr(j)+sqrt(-1)*wi(j) is selected if
select(wr(j), wi(j)) is true; that is, if either one of a
complex conjugate pair of eigenvalues is selected, then both
complex eigenvalues are selected.
satisfy select(wr(j), wi(j)) = .TRUE. after ordering, since
ordering may change the value of complex eigenvalues
(especially if the eigenvalue is ill-conditioned); in this case
info may be set to n+2 (see info below).
An eigenvalue w(j) is selected if select(w(j)) is true.
sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines
which reciprocal condition number are computed.
If sense = 'N', none are computed;
If sense = 'E', computed for average of selected
eigenvalues only;
If sense = 'V', computed for selected right invariant
subspace only;
If sense = 'B', computed for both.
If sense is 'E', 'V', or 'B', then sort must equal 'S'.
a, work REAL for sgeesx
DOUBLE PRECISION for dgeesx
COMPLEX for cgeesx
DOUBLE COMPLEX for zgeesx.
Arrays:
least max(1, n).
ldvs INTEGER. The leading dimension of the output array vs.
Constraints:
1242
ldvs ≥ 1;
ldvs ≥ max(1, n)if jobvs = 'V'.
lwork INTEGER.
The dimension of the array work. Constraint:
lwork ≥ max(1, 3n) for real flavors;
Also, if sense = 'E', 'V', or 'B', then
lwork ≥ n+2*sdim*(n-sdim) for real flavors;
lwork ≥ 2*sdim*(n-sdim) for complex flavors;
where sdim is the number of selected eigenvalues computed
by this routine.
Note that 2*sdim*(n-sdim) ≤ n*n/2. Note also that an
error is only returned if lwork<max(1, 2*n), but if sense
= 'E', or 'V', or 'B' this may not be large enough.
For good performance, lwork must generally be larger.
routine only calculates upper bound on the optimal size of
the array work, returns this value as the first entry of the
work array, and no error message related to lwork is issued
by xerbla.
iwork INTEGER.
Workspace array, DIMENSION (liwork). Used in real flavors
only. Not referenced if sense = 'N' or 'E'.
liwork INTEGER.
The dimension of the array iwork. Used in real flavors only.
Constraint:
liwork ≥ 1;
if sense = 'V' or 'B', liwork ≥ sdim*(n-sdim).
rwork REAL for cgeesx
DOUBLE PRECISION for zgeesx
1243
Output Parameters
a On exit, this array is overwritten by the real-Schur/Schur
form T.
sdim INTEGER.
(after sorting) for which select is true.
select is true for either eigenvalue count as 2.
wr, wi REAL for sgeesx
eigenvalues, in the same order that they appear on the
diagonal of the output real-Schur form T. Complex conjugate
eigenvalue having positive imaginary part first.
w COMPLEX for cgeesx
eigenvalues. The eigenvalues are stored in the same order
as they appear on the diagonal of the output Schur form T.
vs REAL for sgeesx
COMPLEX for cgeesx
Array vs(ldvs,*);the second dimension of vs must be at
least max(1, n).
If jobvs = 'V', vs contains the orthogonal/unitary matrix
Z of Schur vectors.
If jobvs = 'N', vs is not referenced.
rconde, rcondv REAL for single precision flavors DOUBLE PRECISION for
If sense = 'E' or 'B', rconde contains the reciprocal
condition number for the average of the selected
eigenvalues.
If sense = 'N' or 'V', rconde is not referenced.
1244
If sense = 'V' or 'B', rcondv contains the reciprocal
condition number for the selected right invariant subspace.
If sense = 'N' or 'E', rcondv is not referenced.
info INTEGER.
If info = i, and
i ≤ n:
the QR algorithm failed to compute all the eigenvalues;
elements 1:ilo-1 and i+1:n of wr and wi (for real flavors)
or w (for complex flavors) contain those eigenvalues which
have converged; if jobvs = 'V', vs contains the
transformation which reduces A to its partially converged
Schur form;
i = n+1:
the eigenvalues could not be reordered because some
eigenvalues were too close to separate (the problem is very
ill-conditioned);
i = n+2:
after reordering, roundoff changed values of some complex
eigenvalues so that leading eigenvalues in the Schur form
no longer satisfy select = .TRUE.. This could also be
caused by underflow due to scaling.

Specific details for the routine geesx interface are the following:
wr Holds the vector of length (n). Used in real flavors only.
wi Holds the vector of length (n). Used in real flavors only.
w Holds the vector of length (n). Used in complex flavors only.
vs Holds the matrix VS of size (n, n).
1245
jobvs Restored based on the presence of the argument vs as follows:

jobvs = 'V', if vs is present,
jobvs = 'N', if vs is omitted.
sense Restored based on the presence of arguments rconde and rcondv as
follows:
sense = 'B', if both rconde and rcondv are present,
sense = 'E', if rconde is present and rcondv omitted,
sense = 'V', if rconde is omitted and rcondv present,
sense = 'N', if both rconde and rcondv are omitted.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (or liwork)
for the first run or set lwork = -1 (liwork = -1).
If you choose the first option and set any of admissible lwork (or liwork) sizes, which is no
less than the minimal value described, the routine completes the task, though probably not so
fast as with a recommended workspace, and provides the recommended workspace in the first
element of the corresponding array (work, iwork) on exit. Use this value (work(1), iwork(1))
a workspace query.
Note that if you set lwork (liwork) to less than the minimal required value and not -1, the
1246
?geev
Computes the eigenvalues and left and right
eigenvectors of a general matrix.
Syntax
FORTRAN 77:
call sgeev(jobvl, jobvr, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, work, lwork,
info)
call dgeev(jobvl, jobvr, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, work, lwork,
info)
call cgeev(jobvl, jobvr, n, a, lda, w, vl, ldvl, vr, ldvr, work, lwork, rwork,
info)
call zgeev(jobvl, jobvr, n, a, lda, w, vl, ldvl, vr, ldvr, work, lwork, rwork,
info)
Fortran 95:
call geev(a, wr, wi [,vl] [,vr] [,info])
call geev(a, w [,vl] [,vr] [,info])
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues and,
optionally, the left and/or right eigenvectors. The right eigenvector v(j) of A satisfies
A*v(j)= λ(j)*v(j)
where λ(j) is its eigenvalue.
The left eigenvector u(j) of A satisfies
u(j)H*A = λ(j)*u(j)H
H
where u(j) denotes the conjugate transpose of u(j). The computed eigenvectors are normalized
to have Euclidean norm equal to 1 and largest component real.
1247
Input Parameters
jobvl CHARACTER*1. Must be 'N' or 'V'.
If jobvl = 'N', then left eigenvectors of A are not
computed.
If jobvl = 'V', then left eigenvectors of A are computed.
jobvr CHARACTER*1. Must be 'N' or 'V'.
If jobvr = 'N', then right eigenvectors of A are not
computed.
If jobvr = 'V', then right eigenvectors of A are computed.
a, work REAL for sgeev
DOUBLE PRECISION for dgeev
COMPLEX for cgeev
DOUBLE COMPLEX for zgeev.
Arrays:
least max(1, n).
ldvl, ldvr INTEGER. The leading dimensions of the output arrays vl
and vr, respectively.
Constraints:
ldvl ≥ 1; ldvr ≥ 1.
If jobvl = 'V', ldvl ≥ max(1, n);
If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
Constraint:
lwork ≥ max(1, 3n), and if jobvl = 'V' or jobvr =
'V', lwork < max(1, 4n) (for real flavors);
lwork < max(1, 2n) (for complex flavors).
1248
rwork REAL for cgeev
DOUBLE PRECISION for zgeev
Output Parameters
a On exit, this array is overwritten by intermediate results.
wr, wi REAL for sgeev
computed eigenvalues. Complex conjugate pairs of
eigenvalues appear consecutively with the eigenvalue having
positive imaginary part first.
w COMPLEX for cgeev
Contains the computed eigenvalues.
vl, vr REAL for sgeev
COMPLEX for cgeev
Arrays:
vl(ldvl,*);the second dimension of vl must be at least
max(1, n).
If jobvl = 'V', the left eigenvectors u(j) are stored one
after another in the columns of vl, in the same order as
their eigenvalues.
If jobvl = 'N', vl is not referenced.
For real flavors:
If the j-th eigenvalue is real, then u(j) = vl(:,j), the
j-th column of vl.
1249
If the j-th and (j+1)-st eigenvalues form a complex

conjugate pair, then u(j) = vl(:,j) + i*vl(:,j+1)
and u(j+1) = vl(:,j)- i*vl(:,j+1), where i =
sqrt(-1).
u(j) = vl(:,j), the j-th column of vl.
vr(ldvr,*); the second dimension of vr must be at least
max(1, n).
If jobvr = 'V', the right eigenvectors v(j) are stored one
after another in the columns of vr, in the same order as
their eigenvalues.
If jobvr = 'N', vr is not referenced.
For real flavors:
If the j-th eigenvalue is real, then v(j) = vr(:,j), the
j-th column of vr.
conjugate pair, then v(j) = vr(:,j) + i*vr(:,j+1)
and v(j+1) = vr(:,j) - i*vr(:,j+1), where i =
sqrt(-1).
v(j) = vr(:,j), the j-th column of vr.
info INTEGER.
If info = i, the QR algorithm failed to compute all the
eigenvalues, and no eigenvectors have been computed;
elements i+1:n of wr and wi (for real flavors) or w (for
complex flavors) contain those eigenvalues which have
converged.

Specific details for the routine geev interface are the following:
1250
vl Holds the matrix VL of size (n, n).
vr Holds the matrix VR of size (n, n).
jobvl Restored based on the presence of the argument vl as follows:
jobvl = 'V', if vl is present,
jobvl = 'N', if vl is omitted.
jobvr Restored based on the presence of the argument vr as follows:
jobvr = 'V', if vr is present,
jobvr = 'N', if vr is omitted.
Application Notes
workspace query.
workspace.
1251
?geevx
Computes the eigenvalues and left and right
eigenvectors of a general matrix, with preliminary
matrix balancing, and computes reciprocal
condition numbers for the eigenvalues and right
eigenvectors.
Syntax
FORTRAN 77:
call sgeevx(balanc, jobvl, jobvr, sense, n, a, lda, wr, wi, vl, ldvl, vr,
ldvr, ilo, ihi, scale, abnrm, rconde, rcondv, work, lwork, iwork, info)
call dgeevx(balanc, jobvl, jobvr, sense, n, a, lda, wr, wi, vl, ldvl, vr,
ldvr, ilo, ihi, scale, abnrm, rconde, rcondv, work, lwork, iwork, info)
call cgeevx(balanc, jobvl, jobvr, sense, n, a, lda, w, vl, ldvl, vr, ldvr,
ilo, ihi, scale, abnrm, rconde, rcondv, work, lwork, rwork, info)
call zgeevx(balanc, jobvl, jobvr, sense, n, a, lda, w, vl, ldvl, vr, ldvr,
ilo, ihi, scale, abnrm, rconde, rcondv, work, lwork, rwork, info)
Fortran 95:
call geevx(a, wr, wi [,vl] [,vr] [,balanc] [,ilo] [,ihi] [,scale] [,abnrm] [,
rconde] [,rcondv] [,info])
call geevx(a, w [,vl] [,vr] [,balanc] [,ilo] [,ihi] [,scale] [,abnrm] [,rconde]
[, rcondv] [,info])
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues and,
optionally, the left and/or right eigenvectors.
Optionally also, it computes a balancing transformation to improve the conditioning of the
eigenvalues and eigenvectors (ilo, ihi, scale, and abnrm), reciprocal condition numbers for
the eigenvalues (rconde), and reciprocal condition numbers for the right eigenvectors (rcondv).
The right eigenvector v(j) of A satisfies
1252
A*v(j) = λ(j)*v(j)
where λ(j) is its eigenvalue.
The left eigenvector u(j) of A satisfies
u(j)H*A = λ(j)*u(j)H
where u(j)H denotes the conjugate transpose of u(j). The computed eigenvectors are
normalized to have Euclidean norm equal to 1 and largest component real.
Balancing a matrix means permuting the rows and columns to make it more nearly upper
triangular, and applying a diagonal similarity transformation D*A*inv(D), where D is a diagonal
matrix, to make its rows and columns closer in norm and the condition numbers of its eigenvalues
and eigenvectors smaller. The computed reciprocal condition numbers correspond to the balanced
matrix. Permuting rows and columns will not change the condition numbers in exact arithmetic)
but diagonal scaling will. For further explanation of balancing, see [LUG], Section 4.10.
Input Parameters
balanc CHARACTER*1. Must be 'N', 'P', 'S', or 'B'. Indicates
how the input matrix should be diagonally scaled and/or
permuted to improve the conditioning of its eigenvalues.
If balanc = 'N', do not diagonally scale or permute;
If balanc = 'P', perform permutations to make the matrix
more nearly upper triangular. Do not diagonally scale;
If balanc = 'S', diagonally scale the matrix, i.e. replace
A by D*A*inv(D), where D is a diagonal matrix chosen to
make the rows and columns of A more equal in norm. Do
not permute;
If balanc = 'B', both diagonally scale and permute A.
Computed reciprocal condition numbers will be for the matrix
after balancing and/or permuting. Permuting does not
change condition numbers (in exact arithmetic), but
balancing does.
If jobvl = 'N', left eigenvectors of A are not computed;
If jobvl = 'V', left eigenvectors of A are computed.
If sense = 'E' or 'B', then jobvl must be 'V'.
If jobvr = 'N', right eigenvectors of A are not computed;
1253
If jobvr = 'V', right eigenvectors of A are computed.

If sense = 'E' or 'B', then jobvr must be 'V'.
If sense = 'E', computed for eigenvalues only;
If sense = 'V', computed for right eigenvectors only;
If sense = 'B', computed for eigenvalues and right
eigenvectors.
If sense is 'E' or 'B', both left and right eigenvectors must
also be computed (jobvl = 'V' and jobvr = 'V').
a, work REAL for sgeevx
DOUBLE PRECISION for dgeevx
COMPLEX for cgeevx
DOUBLE COMPLEX for zgeevx.
Arrays:
least max(1, n).
ldvl, ldvr INTEGER. The leading dimensions of the output arrays vl
Constraints:
ldvl ≥ 1; ldvr ≥ 1.
If jobvl = 'V', ldvl ≥ max(1, n);
If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
For real flavors:
If sense = 'N' or 'E', lwork ≥ max(1, 2n), and if jobvl
= 'V' or jobvr = 'V', lwork ≥ 3n;
If sense = 'V' or 'B', lwork ≥ n*(n+6).
1254
If sense = 'N'or 'E', lwork ≥ max(1, 2n);
If sense = 'V' or 'B', lwork ≥ n2+2n. For good
performance, lwork must generally be larger.
rwork REAL for cgeevx
DOUBLE PRECISION for zgeevx
iwork INTEGER.
Workspace array, DIMENSION at least max(1, 2n-2). Used
in real flavors only. Not referenced if sense = 'N' or 'E'.
Output Parameters
a On exit, this array is overwritten.
If jobvl = 'V' or jobvr = 'V', it contains the
real-Schur/Schur form of the balanced version of the input
matrix A.
wr, wi REAL for sgeevx
eigenvalues. Complex conjugate pairs of eigenvalues appear
consecutively with the eigenvalue having positive imaginary
part first.
w COMPLEX for cgeevx
eigenvalues.
vl, vr REAL for sgeevx
COMPLEX for cgeevx
1255

Arrays:
vl(ldvl,*); the second dimension of vl must be at least
max(1, n).
If jobvl = 'V', the left eigenvectors u(j) are stored one
after another in the columns of vl, in the same order as
their eigenvalues.
For real flavors:
j-th column of vl.
and (j+1) = vl(:,j) - i*vl(:,j+1), where i =
sqrt(-1).
max(1, n).
If jobvr = 'V', the right eigenvectors v(j) are stored one
after another in the columns of vr, in the same order as
their eigenvalues.
For real flavors:
j-th column of vr.
and v(j+1) = vr(:,j) - i*vr(:,j+1), where i =
sqrt(-1) .
ilo, ihi INTEGER. ilo and ihi are integer values determined when
A was balanced.
The balanced A(i,j) = 0 if i > j and j = 1,..., ilo-1
or i = ihi+1,..., n.
If balanc = 'N' or 'S', ilo = 1 and ihi = n.
1256
Array, DIMENSION at least max(1, n). Details of the
permutations and scaling factors applied when balancing A.
If P(j) is the index of the row and column interchanged with
row and column j, and D(j) is the scaling factor applied to
row and column j, then
scale(j) = P(j), for j = 1,...,ilo-1
= D(j), for j = ilo,...,ihi
= P(j) for j = ihi+1,..., n.
then 1 to ilo-1.
abnrm REAL for single-precision flavors
The one-norm of the balanced matrix (the maximum of the
sum of absolute values of elements of any column).
Arrays, DIMENSION at least max(1, n) each.
rconde(j) is the reciprocal condition number of the j-th
eigenvalue.
rcondv(j) is the reciprocal condition number of the j-th right
eigenvector.
info INTEGER.
If info = i, the QR algorithm failed to compute all the
eigenvalues, and no eigenvectors or condition numbers have
been computed; elements 1:ilo-1 and i+1:n of wr and wi
(for real flavors) or w (for complex flavors) contain
eigenvalues which have converged.

1257
Specific details for the routine geevx interface are the following:
rconde Holds the vector of length n.
rcondv Holds the vector of length n.
balanc Must be 'N', 'B', 'P' or 'S'. The default value is 'N'.
follows:
Application Notes
workspace query.
1258
workspace.

This section describes LAPACK driver routines used for solving singular value problems. See
also computational routines computational routines that can be called to solve these problems.
Table 4-12 Driver Routines for Singular Value Decomposition
?gesvd Computes the singular value decomposition of a general rectangular

matrix.
?gesdd Computes the singular value decomposition of a general rectangular matrix

using a divide and conquer method.
?ggsvd Computes the generalized singular value decomposition of a pair of general

rectangular matrices.
?gesvd
general rectangular matrix.
Syntax
FORTRAN 77:
call sgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
call dgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
call cgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork,
info)
call zgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork,
info)
1259
Fortran 95:
call gesvd(a, s [,u] [,vt] [,ww] [,job] [,info])
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix
A, optionally computing the left and/or right singular vectors. The SVD is written
A = U*Σ*VH
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an
m-by-m orthogonal/unitary matrix, and V is an n-by-n orthogonal/unitary matrix. The diagonal
elements of Σ are the singular values of A; they are real and non-negative, and are returned
in descending order. The first min(m, n) columns of U and V are the left and right singular vectors
of A.
H
Note that the routine returns V , not V.
Input Parameters
jobu CHARACTER*1. Must be 'A', 'S', 'O', or 'N'. Specifies
options for computing all or part of the matrix U.
If jobu = 'A', all m columns of U are returned in the array
u;
if jobu = 'S', the first min(m, n) columns of U (the left
singular vectors) are returned in the array u;
if jobu = 'O', the first min(m, n) columns of U (the left
singular vectors) are overwritten on the array a;
if jobu = 'N', no columns of U (no left singular vectors)
are computed.
jobvt CHARACTER*1. Must be 'A', 'S', 'O', or 'N'. Specifies
H
options for computing all or part of the matrix V .
H
If jobvt = 'A', all n rows of V are returned in the array
vt;
H
if jobvt = 'S', the first min(m,n) rows of V (the right
singular vectors) are returned in the array vt;
1260
H
if jobvt = 'O', the first min(m,n) rows of V (the right
singular vectors) are overwritten on the array a;
H
if jobvt = 'N', no rows of V (no right singular vectors)
are computed.
jobvt and jobu cannot both be 'O'.
a, work REAL for sgesvd
DOUBLE PRECISION for dgesvd
COMPLEX for cgesvd
DOUBLE COMPLEX for zgesvd.
Arrays:
a(lda,*) is an array containing the m-by-n matrix A.
Must be at least max(1, m).
ldu, ldvt INTEGER. The leading dimensions of the output arrays u
and vt, respectively.
Constraints:
ldu ≥ 1; ldvt < 1.
If jobu = 'S' or 'A', ldu ≥ m;
If jobvt = 'A', ldvt≥ n;
If jobvt = 'S', ldvt ≥ min(m, n).
lwork INTEGER.
The dimension of the array work; lwork ≥ 1.
Constraints:
lwork ≥ max(3*min(m, n)+max(m, n), 5*min(m,n))
(for real flavors);
lwork ≥ 2*min(m, n)+max(m, n) (for complex flavors).
1261

rwork REAL for cgesvd
DOUBLE PRECISION for zgesvd
Workspace array, DIMENSION at least max(1, 5*min(m, n)).
Used in complex flavors only.
Output Parameters
a On exit,
If jobu = 'O', a is overwritten with the first min(m,n)
columns of U (the left singular vectors, stored columnwise);
If jobvt = 'O', a is overwritten with the first min(m, n)
H
rows of V (the right singular vectors, stored rowwise);
If jobu≠'O' and jobvt≠'O', the contents of a are
destroyed.
s REAL for single precision flavors DOUBLE PRECISION for
Array, DIMENSION at least max(1, min(m,n)). Contains the
singular values of A sorted so that s(i) < s(i+1).
u, vt REAL for sgesvd
DOUBLE PRECISION for dgesvd
COMPLEX for cgesvd
DOUBLE COMPLEX for zgesvd.
Arrays:
u(ldu,*); the second dimension of u must be at least max(1,
m) if jobu = 'A', and at least max(1, min(m, n)) if jobu
= 'S'.
If jobu = 'A', u contains the m-by-m orthogonal/unitary
matrix U.
If jobu = 'S', u contains the first min(m, n) columns of U
(the left singular vectors, stored columnwise).
If jobu = 'N' or 'O', u is not referenced.
vt(ldvt,*); the second dimension of vt must be at least
max(1, n).
1262
If jobvt = 'A', vt contains the n-by-n orthogonal/unitary
matrix VH.
H
If jobvt = 'S', vt contains the first min(m, n) rows of V
(the right singular vectors, stored rowwise).
If jobvt = 'N'or 'O', vt is not referenced.
work On exit, if info = 0, then work(1) returns the required
For real flavors:
If info > 0, work(2:min(m,n)) contains the unconverged
superdiagonal elements of an upper bidiagonal matrix B
whose diagonal is in s (not necessarily sorted). B satisfies
A=u*B*vt, so it has the same singular values as A, and
singular vectors related by u and vt.
rwork On exit (for complex flavors), if info > 0,
rwork(1:min(m,n)-1) contains the unconverged
superdiagonal elements of an upper bidiagonal matrix B
whose diagonal is in s (not necessarily sorted). B satisfies
A = u*B*vt, so it has the same singular values as A, and
singular vectors related by u and vt.
info INTEGER.
If info = i, then if ?bdsqr did not converge, i specifies
how many superdiagonals of the intermediate bidiagonal
form B did not converge to zero.

Specific details for the routine gesvd interface are the following:
s Holds the vector of length min(m, n).
u Holds the matrix U of size (m,min(m, n)).
vt Holds the matrix VT of size (min(m, n),n).
1263
ww Holds the vector of length min(m, n)-1. ww contains the unconverged

superdiagonal elements of an upper bidiagonal matrix B whose diagonal
is in s (not necessarily sorted). B satisfies A = U*B*VT, so it has the
same singular values as A, and singular vectors related by U and VT.
job Must be either 'N', or 'U', or 'V'. The default value is 'N'.
If job = 'U', and u is not present, then u is returned in the array a.
If job = 'V', and vt is not present, then vt is returned in the array
a.
jobu Restored based on the presence of the argument u, value of job and
sizes of arrays u and a as follows:
jobu = 'A', if u is present and the number of columns in u is equal
to the number of rows in a,
jobu = 'S', if u is present and the number of columns in u is not equal
jobu = 'O', if u is not present and job is equal to 'U',
jobu = 'N', if u is not present and job is not equal to 'U'.
jobvt Restored based on the presence of the argument vt, value of job and
sizes of arrays vt and a as follows:
jobvt = 'A', if vt is present and the number of columns in vt is equal
jobvt = 'S', if vt is present and the number of columns in vt is not
equal to the number of rows in a,
jobvt = 'O', if vt is not present and job is equal to 'V',
jobvt = 'N', if vt is not present and job is not equal to 'V',
Application Notes
workspace query.
1264
workspace.
?gesdd
general rectangular matrix using a divide and
conquer method.
Syntax
FORTRAN 77:
call sgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, iwork, info)
call dgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, iwork, info)
call cgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, iwork,
info)
call zgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, iwork,
info)
Fortran 95:
call gesdd(a, s [,u] [,vt] [,jobz] [,info])
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix
A, optionally computing the left and/or right singular vectors.
If singular vectors are desired, it uses a divide and conquer algorithm. The SVD is written
A = U*ΣVH,
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an
m-by-m orthogonal/unitary matrix, and V is an n-by-n orthogonal/unitary matrix. The diagonal
elements of Σ are the singular values of A; they are real and non-negative, and are returned
in descending order. The first min(m, n) columns of U and V are the left and right singular vectors
of A.
1265
H
Note that the routine returns V , not V.
Input Parameters
jobz CHARACTER*1. Must be 'A', 'S', 'O', or 'N'.
Specifies options for computing all or part of the matrix U.
T
If jobz = 'A', all m columns of U and all n rows of V are
returned in the arrays u and vt;
if jobz = 'S', the first min(m, n) columns of U and the first
T
min(m, n) rows of V are returned in the arrays u and vt;
if jobz = 'O', then
if m ≥ n, the first n columns of U are overwritten in the array
T
a and all rows of V are returned in the array vt;
if m < n, all columns of U are returned in the array u and the
T
first m rows of V are overwritten in the array a;
T
if jobz = 'N', no columns of U or rows of V are computed.
a, work REAL for sgesdd
DOUBLE PRECISION for dgesdd
COMPLEX for cgesdd
DOUBLE COMPLEX for zgesdd.
Arrays: a(lda,*) is an array containing the m-by-n matrix
A.
least max(1, m).
ldu, ldvt INTEGER. The leading dimensions of the output arrays u
and vt, respectively.
Constraints:
ldu ≥ 1; ldvt ≥ 1.
If jobz = 'S' or 'A', or jobz = 'O' and m < n,
then ldu ≥ m;
If jobz = 'A' or jobz = 'O' and m < n,
then ldvt ≥ n;
1266
If jobz = 'S', ldvt ≥ min(m, n).
lwork INTEGER.
The dimension of the array work; lwork ≥ 1.
returns this value as the work(1), and no error message
related to lwork is issued by xerbla.
rwork REAL for cgesdd
DOUBLE PRECISION for zgesdd
Workspace array, DIMENSION at least max(1, 5*min(m,n))
if jobz = 'N'.
Otherwise, the dimension of rwork must be at least max(1,
5*(min(m,n))2 + 7*min(m,n)). This array is used in
iwork INTEGER. Workspace array, DIMENSION at least max(1, 8
*min(m, n)).
Output Parameters
a On exit:
If jobz = 'O', then if m≥ n, a is overwritten with the first
n columns of U (the left singular vectors, stored columnwise).
T
If m < n, a is overwritten with the first m rows of V (the right
singular vectors, stored rowwise);
If jobz≠'O', the contents of a are destroyed.
s REAL for single precision flavors DOUBLE PRECISION for
Array, DIMENSION at least max(1, min(m,n)). Contains the
singular values of A sorted so that s(i) ≥ s(i+1).
u, vt REAL for sgesdd
DOUBLE PRECISION for dgesdd
COMPLEX for cgesdd
DOUBLE COMPLEX for zgesdd.
Arrays:
1267

m) if jobz = 'A' or jobz = 'O' and m < n.
If jobz = 'S', the second dimension of u must be at least
max(1, min(m, n)).
If jobz = 'A'or jobz = 'O' and m < n, u contains the
m-by-m orthogonal/unitary matrix U.
If jobz = 'S', u contains the first min(m, n) columns of U
(the left singular vectors, stored columnwise).
If jobz = 'O' and m≥n, or jobz = 'N', u is not referenced.
vt(ldvt,*); the second dimension of vt must be at least
max(1, n).
If jobz = 'A'or jobz = 'O' and m≥n, vt contains the
T
n-by-n orthogonal/unitary matrix V .
T
If jobz = 'S', vt contains the first min(m, n) rows of V
(the right singular vectors, stored rowwise).
If jobz = 'O' and m < n, or jobz = 'N', vt is not
referenced.
info INTEGER.
If info = i, then ?bdsdc did not converge, updating
process failed.

Specific details for the routine gesdd interface are the following:
s Holds the vector of length min(m, n).
u Holds the matrix U of size (m,min(m, n)).
vt Holds the matrix VT of size (min(m, n),n).
job Must be 'N', 'A', 'S', or 'O'. The default value is 'N'.
1268
Application Notes
For real flavors:
If jobz = 'N', lwork ≥ 3*min(m, n) + max (max(m,n), 6*min(m, n));
If jobz = 'O', lwork ≥ 3*(min(m, n))2 + max (max(m, n), 5*(min(m, n))2 +
4*min(m, n));
If jobz = 'S' or 'A', lwork ≥ 3*(min(m, n))2 + max (max(m, n), 4*(min(m, n))2 +
4*min(m, n))
If jobz = 'N', lwork ≥ 2*min(m, n) + max(m, n);
If jobz = 'O', lwork ≥ 2*(min(m, n))2 + max(m, n) + 2*min(m, n);
If jobz = 'S' or 'A', lwork ≥ (min(m, n))2 + max(m, n) + 2*min(m, n);
workspace query.
workspace.
1269
?ggsvd
Computes the generalized singular value
decomposition of a pair of general rectangular
matrices.
Syntax
FORTRAN 77:
call sggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u,
ldu, v, ldv, q, ldq, work, iwork, info)
call dggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u,
ldu, v, ldv, q, ldq, work, iwork, info)
call cggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u,
ldu, v, ldv, q, ldq, work, rwork, iwork, info)
call zggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u,
ldu, v, ldv, q, ldq, work, rwork, iwork, info)
Fortran 95:
call ggsvd(a, b, alpha, beta [, k] [,l] [,u] [,v] [,q] [,iwork] [,info])
Description
The routine computes the generalized singular value decomposition (GSVD) of an m-by-n
real/complex matrix A and p-by-n real/complex matrix B:
UH*A*Q = D1*(0 R), VH*B*Q = D2*(0 R),

where U, V and Q are orthogonal/unitary matrices.
H H H
Let k+l = the effective numerical rank of the matrix (A , B ) , then R is a (k+l)-by-(k+l)
nonsingular upper triangular matrix, D1 and D2 are m-by-(k+l) and p-by-(k+l) “diagonal”
matrices and of the following structures, respectively:
If m-k-l ≥0,
1270
where
C = diag(alpha(K+1),..., alpha(K+l))
S = diag(beta(K+1),...,beta(K+l))
C2 + S2 = I
R is stored in a(1:k+l, n-k-l+1:n ) on exit.
If m-k-l < 0,
1271
where
C = diag(alpha(K+1),..., alpha(m)),
S = diag(beta(K+1),...,beta(m)),
C2 + S2 = I
On exit, is stored in a(1:m, n-k-l+1:n ) and R33 is stored in b(m-k+1:l,

n+m-k-l+1:n ).
The routine computes C, S, R, and optionally the orthogonal/unitary transformation matrices

U, V and Q.
1272
In particular, if B is an n-by-n nonsingular matrix, then the GSVD of A and B implicitly gives
-1
the SVD of A*B :
A*B-1 = U*(D1 D2-1)*VH.

H H H
If (A , B ) has orthonormal columns, then the GSVD of A and B is also equal to the CS
decomposition of A and B. Furthermore, the GSVD can be used to derive the solution of the
eigenvalue problem:
AH*A*x = λ*BH*B*x.
Input Parameters
jobu CHARACTER*1. Must be 'U' or 'N'.
If jobu = 'U', orthogonal/unitary matrix U is computed.
If jobu = 'N', U is not computed.
jobv CHARACTER*1. Must be 'V' or 'N'.
If jobv = 'V', orthogonal/unitary matrix V is computed.
If jobv = 'N', V is not computed.
jobq CHARACTER*1. Must be 'Q' or 'N'.
If jobq = 'Q', orthogonal/unitary matrix Q is computed.
If jobq = 'N', Q is not computed.
(n ≥ 0).
a, b, work REAL for sggsvd
DOUBLE PRECISION for dggsvd
COMPLEX for cggsvd
DOUBLE COMPLEX for zggsvd.
Arrays:
The dimension of work must be at least max(3n, m, p)+n.
1273

ldu INTEGER. The first dimension of the array u .
ldu ≥ max(1, m) if jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The first dimension of the array v .
ldv ≥ max(1, p) if jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The first dimension of the array q .
ldq ≥ max(1, n) if jobq = 'Q'; ldq ≥ 1 otherwise.
iwork INTEGER.
rwork REAL for cggsvd DOUBLE PRECISION for zggsvd.
Output Parameters
k, l INTEGER. On exit, k and l specify the dimension of the
subblocks. The sum k+l is equal to the effective numerical
H H H
rank of (A , B ) .
a On exit, a contains the triangular matrix R or part of R.
b On exit, b contains part of the triangular matrix R if m-k-l
< 0.
alpha, beta REAL for single-precision flavors
Contain the generalized singular value pairs of A and B:
alpha(1:k) = 1,
beta(1:k) = 0,
and if m-k-l ≥ 0,
alpha(k+1:k+l) = C,
beta(k+1:k+l) = S,
or if m-k-l < 0,
alpha(k+1:m)= C, alpha(m+1:k+l)=0
beta(k+1:m) = S, beta(m+1:k+l) = 1
and
1274
alpha(k+l+1:n) = 0
beta(k+l+1:n) = 0.
u, v, q REAL for sggsvd
DOUBLE PRECISION for dggsvd
COMPLEX for cggsvd
DOUBLE COMPLEX for zggsvd.
Arrays:
m).
If jobu = 'U', u contains the m-by-m orthogonal/unitary
matrix U.
v(ldv,*); the second dimension of v must be at least max(1,
p).
If jobv = 'V', v contains the p-by-p orthogonal/unitary
matrix V.
q(ldq,*); the second dimension of q must be at least max(1,
n).
If jobq = 'Q', q contains the n-by-n orthogonal/unitary
matrix Q.
iwork On exit, iwork stores the sorting information.
info INTEGER.
If info = 1, the Jacobi-type procedure failed to converge.
For further details, see subroutine ?tgsja.

Specific details for the routine ggsvd interface are the following:
b Holds the matrix B of size (p, n).
1275
alpha Holds the vector of length n.

u Holds the matrix U of size (m, m).
v Holds the matrix V of size (p, p).
iwork Holds the vector of length n.
jobu Restored based on the presence of the argument u as follows:
jobu = 'U', if u is present, jobu = 'N', if u is omitted.
jobv Restored based on the presence of the argument v as follows:
jobz = 'V', if v is present,
jobz = 'N', if v is omitted.
jobq Restored based on the presence of the argument q as follows:
jobz = 'Q', if q is present,
jobz = 'N', if q is omitted.
Generalized Symmetric Definite Eigenproblems

This section describes LAPACK driver routines used for solving generalized symmetric definite
eigenproblems. See also computational routines that can be called to solve these problems.
Table 4-13 Driver Routines for Solving Generalized Symmetric Definite Eigenproblems
?sygv/?hegv Computes all eigenvalues and, optionally, eigenvectors of a real /

complex generalized symmetric /Hermitian definite eigenproblem.
?sygvd/?hegvd Computes all eigenvalues and, optionally, eigenvectors of a real /

complex generalized symmetric /Hermitian definite eigenproblem. If
eigenvectors are desired, it uses a divide and conquer method.
?sygvx/?hegvx Computes selected eigenvalues and, optionally, eigenvectors of a real

/ complex generalized symmetric /Hermitian definite eigenproblem.
?spgv/?hpgv Computes all eigenvalues and, optionally, eigenvectors of a real /

complex generalized symmetric /Hermitian definite eigenproblem with
matrices in packed storage.
1276
?spgvd/?hpgvd Computes all eigenvalues and, optionally, eigenvectors of a real /

matrices in packed storage. If eigenvectors are desired, it uses a divide
and conquer method.
?spgvx/?hpgvx Computes selected eigenvalues and, optionally, eigenvectors of a real

/ complex generalized symmetric /Hermitian definite eigenproblem with
matrices in packed storage.
?sbgv/?hbgv Computes all eigenvalues and, optionally, eigenvectors of a real /

banded matrices.
?sbgvd/?hbgvd Computes all eigenvalues and, optionally, eigenvectors of a real /

banded matrices. If eigenvectors are desired, it uses a divide and
conquer method.
?sbgvx/?hbgvx Computes selected eigenvalues and, optionally, eigenvectors of a real

/ complex generalized symmetric /Hermitian definite eigenproblem with
banded matrices.
?sygv
eigenvectors of a real generalized symmetric
Syntax
FORTRAN 77:
call ssygv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, info)
call dsygv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, info)
Fortran 95:
call sygv(a, b, w [,itype] [,jobz] [,uplo] [,info])
1277
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized
symmetric-definite eigenproblem, of the form

Here A and B are assumed to be symmetric and B is also positive definite.
Input Parameters
Specifies the problem type to be solved:
if itype = 1, the problem type is A*x = lambda*B*x;
if itype = 2, the problem type is A*B*x = lambda*x;
if itype = 3, the problem type is B*A*x = lambda*x.
If jobz = 'N', then compute eigenvalues only.
If jobz = 'V', then compute eigenvalues and eigenvectors.
If uplo = 'U', arrays a and b store the upper triangles of
A and B;
If uplo = 'L', arrays a and b store the lower triangles of
A and B.
a, b, work REAL for ssygv
DOUBLE PRECISION for dsygv.
Arrays:
a(lda,*) contains the upper or lower triangle of the
b(ldb,*) contains the upper or lower triangle of the
symmetric positive definite matrix B, as specified by uplo.
lwork).
1278
lwork INTEGER.
The dimension of the array work;
lwork ≥ max(1, 3n-1).
Output Parameters
a On exit, if jobz = 'V', then if info = 0, a contains the
matrix Z of eigenvectors. The eigenvectors are normalized
as follows:
if itype = 1 or 2, ZT*B*Z = I;
if itype = 3, ZT*inv(B)*Z = I;
If jobz = 'N', then on exit the upper triangle (if uplo =
'U') or the lower triangle (if uplo = 'L') of A, including
the diagonal, is destroyed.
b On exit, if info ≤ n, the part of b containing the matrix is
overwritten by the triangular factor U or L from the Cholesky
factorization B = UT*U or B = L*LT.
w REAL for ssygv
DOUBLE PRECISION for dsygv.
info INTEGER.
If info = -i, the i-th argument had an illegal value.
If info > 0, spotrf/dpotrf and ssyev/dsyev returned
an error code:
1279
If info = i ≤ n, ssyev/dsyev failed to converge, and i

off-diagonal elements of an intermediate tridiagonal did not
converge to zero;
If info = n + i, for 1 ≤ i ≤ n, then the leading minor
of order i of B is not positive-definite. The factorization of
B could not be completed and no eigenvalues or eigenvectors
were computed.

Specific details for the routine sygv interface are the following:
b Holds the matrix B of size (n, n).
Application Notes
For optimum performance use lwork ≥ (nb+2)*n, where nb is the blocksize for ssytrd/dsytrd
returned by ilaenv.
a workspace query.
1280
?hegv
eigenvectors of a complex generalized Hermitian
Syntax
FORTRAN 77:
call chegv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, info)
call zhegv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, info)
Fortran 95:
call hegv(a, b, w [,itype] [,jobz] [,uplo] [,info])
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex
generalized Hermitian-definite eigenproblem, of the form

Here A and B are assumed to be Hermitian and B is also positive definite.
Input Parameters
itype INTEGER. Must be 1 or 2 or 3. Specifies the problem type
to be solved:
1281

A and B;
A and B.
a, b, work COMPLEX for chegv
DOUBLE COMPLEX for zhegv.
Arrays:
Hermitian positive definite matrix B, as specified by uplo.
lwork INTEGER.
The dimension of the array work; lwork ≥ max(1, 2n-1).
rwork REAL for chegv
DOUBLE PRECISION for zhegv.
Output Parameters
as follows:
if itype = 1 or 2, ZH*B*Z = I;
if itype = 3, ZH*inv(B)*Z = I;
1282
factorization B = UH*U or B = L*LH.
w REAL for chegv
DOUBLE PRECISION for zhegv.
info INTEGER.
If info = -i, the i-th argument has an illegal value.
If info > 0, cpotrf/zpotrf and cheev/zheev return an
error code:
If info = i ≤ n, cheev/zheev fails to converge, and i
off-diagonal elements of an intermediate tridiagonal do not
converge to zero;
B can not be completed and no eigenvalues or eigenvectors
are computed.

Specific details for the routine hegv interface are the following:
1283

Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the blocksize for chetrd/zhetrd
returned by ilaenv.
workspace query.
workspace.
?sygvd
definite eigenproblem. If eigenvectors are desired,
it uses a divide and conquer method.
Syntax
FORTRAN 77:
call ssygvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, iwork,
liwork, info)
call dsygvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, iwork,
liwork, info)
Fortran 95:
call sygvd(a, b, w [,itype] [,jobz] [,uplo] [,info])
1284
Description
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x .

Here A and B are assumed to be symmetric and B is also positive definite.
If eigenvectors are desired, it uses a divide and conquer algorithm.
Input Parameters
to be solved:
A and B;
A and B.
a, b, work REAL for ssygvd
DOUBLE PRECISION for dsygvd.
Arrays:
1285

lwork INTEGER.
Constraints:
If n ≤ 1, lwork ≥ 1;
If jobz = 'N' and n>1, lwork < 2n+1;
If jobz = 'V' and n>1, lwork < 2n2+6n+1.
for details.
iwork INTEGER.
Workspace array, its dimension max(1, lwork).
liwork INTEGER.
Constraints:
If n ≤ 1, liwork ≥ 1;
If jobz = 'N' and n>1, liwork ≥ 1;
If jobz = 'V' and n>1, liwork ≥ 5n+3.
for details.
Output Parameters
as follows:
1286
w REAL for ssygvd
DOUBLE PRECISION for dsygvd.
info INTEGER.
If info > 0= i ≤ n, and jobz = 'N', then the algorithm
failed to converge; i off-diagonal elements of an
intermediate tridiagonal form did not converge to zero; if
jobz = 'V', then the algorithm failed to compute an
eigenvalue while working on the submatrix lying in rows
and columns info/(n+1) through mod(info,n+1);
If info > 0= n + i, for 1 ≤ i ≤ n, then the leading minor
were computed.

Specific details for the routine sygvd interface are the following:
1287

Application Notes
a workspace query.
?hegvd
definite eigenproblem. If eigenvectors are desired,
it uses a divide and conquer method.
Syntax
FORTRAN 77:
call chegvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork,
lrwork, iwork, liwork, info)
call zhegvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork,
Fortran 95:
call hegvd(a, b, w [,itype] [,jobz] [,uplo] [,info])
1288
Description

Here A and B are assumed to be Hermitian and B is also positive definite.
Input Parameters
to be solved:
A and B;
A and B.
a, b, work COMPLEX for chegvd
DOUBLE COMPLEX for zhegvd.
Arrays:
1289

lwork INTEGER.
Constraints:
If jobz = 'N' and n>1, lwork ≥ n+1;
If jobz = 'V' and n>1, lwork ≥ n2+2n.
rwork REAL for chegvd
DOUBLE PRECISION for zhegvd.
Workspace array, DIMENSION max(1, lrwork).
lrwork INTEGER.
Constraints:
If n ≤ 1, lrwork ≥ 1;
If jobz = 'N' and n>1, lrwork ≥ n;
If jobz = 'V' and n>1, lrwork ≥ 2n2+5n+1.
iwork INTEGER.
Workspace array, DIMENSION max(1, liwork).
liwork INTEGER.
Constraints:
1290
Output Parameters
as follows:
if itype = 1 or 2, ZH* B*Z = I;
w REAL for chegvd
DOUBLE PRECISION for zhegvd.
info INTEGER.
1291
If info = i, and jobz = 'N', then the algorithm failed to

converge; i off-diagonal elements of an intermediate
tridiagonal form did not converge to zero;
if info = i, and jobz = 'V', then the algorithm failed to
in rows and columns info/(n+1) through mod(info, n+1).
were computed.

Specific details for the routine hegvd interface are the following:
Application Notes
1292
?sygvx
Syntax
FORTRAN 77:
call ssygvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu,
abstol, m, w, z, ldz, work, lwork, iwork, ifail, info)
call dsygvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu,
abstol, m, w, z, ldz, work, lwork, iwork, ifail, info)
Fortran 95:
call sygvx(a, b, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a real generalized

Here A and B are assumed to be symmetric and B is also positive definite. Eigenvalues and
eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
Input Parameters
to be solved:
1293

vl<lambda(i)≤ vu.
indices il to iu.
A and B;
A and B.
a, b, work REAL for ssygvx
DOUBLE PRECISION for dsygvx.
Arrays:
vl, vu REAL for ssygvx
Constraint: vl< vu.
1294
il, iu INTEGER.
if n = 0.
abstol REAL for ssygvx
DOUBLE PRECISION for dsygvx. The absolute error tolerance
for the eigenvalues. See Application Notes for more
information.
Constraints:
ldz ≥ 1; if jobz = 'V', ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work;
lwork < max(1, 8n).
iwork INTEGER.
Output Parameters
a On exit, the upper triangle (if uplo = 'U') or the lower
triangle (if uplo = 'L') of A, including the diagonal, is
overwritten.
m = iu-il+1.
1295
w, z REAL for ssygvx

Arrays:
in ascending order.
z(ldz,*).
The eigenvectors are normalized as follows:
used.
ifail INTEGER.
info INTEGER.
If info = -i, the ith argument had an illegal value.
If info > 0, spotrf/dpotrf and ssyevx/dsyevx returned
an error code:
1296
If info = i ≤ n, ssyevx/dsyevx failed to converge, and
i eigenvectors failed to converge. Their indices are stored
in the array ifail;
were computed.

Specific details for the routine sygvx interface are the following:
significant.
omitted.
1297

Application Notes
For optimum performance use lwork ≥ (nb+3)*n, where nb is the blocksize for ssytrd/dsytrd
returned by ilaenv.
query.
1298
?hegvx
Syntax
FORTRAN 77:
call chegvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu,
abstol, m, w, z, ldz, work, lwork, rwork, iwork, ifail, info)
call zhegvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu,
abstol, m, w, z, ldz, work, lwork, rwork, iwork, ifail, info)
Fortran 95:
call hegvx(a, b, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a complex

Here A and B are assumed to be Hermitian and B is also positive definite. Eigenvalues and
Input Parameters
to be solved:
1299

vl<lambda(i)≤ vu.
indices il to iu.
A and B;
A and B.
a, b, work COMPLEX for chegvx
DOUBLE COMPLEX for zhegvx.
Arrays:
vl, vu REAL for chegvx
DOUBLE PRECISION for zhegvx.
Constraint: vl< vu.
il, iu INTEGER.
1300
if n = 0.
abstol REAL for chegvx
Constraints:
lwork INTEGER.
The dimension of the array work; lwork ≥ max(1, 2n).
rwork REAL for chegvx
iwork INTEGER.
Output Parameters
a On exit, the upper triangle (if uplo = 'U') or the lower
triangle (if uplo = 'L') of A, including the diagonal, is
overwritten.
m = iu-il+1.
w REAL for chegvx
1301

in ascending order.
z COMPLEX for chegvx
DOUBLE COMPLEX for zhegvx.
Array z(ldz,*). The second dimension of z must be at least
max(1, m).
used.
ifail INTEGER.
info INTEGER.
If info = -i, the ith argument had an illegal value.
If info > 0, cpotrf/zpotrf and cheevx/zheevx returned
an error code:
1302
If info = i ≤ n, cheevx/zheevx failed to converge, and
in the array ifail;
were computed.

Specific details for the routine hegvx interface are the following:
significant.
omitted.
1303

Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the blocksize for chetrd/zhetrd
returned by ilaenv.
workspace query.
workspace.
1304
?spgv
definite eigenproblem with matrices in packed
storage.
Syntax
FORTRAN 77:
call sspgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, info)
call dspgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, info)
Fortran 95:
call spgv(ap, bp, w [,itype] [,uplo] [,z] [,info])
Description

Here A and B are assumed to be symmetric, stored in packed format, and B is also positive
definite.
Input Parameters
to be solved:
1305
If uplo = 'U', arrays ap and bp store the upper triangles

of A and B;
If uplo = 'L', arrays ap and bp store the lower triangles
of A and B.
ap, bp, work REAL for sspgv
DOUBLE PRECISION for dspgv.
Arrays:
bp(*) contains the packed upper or lower triangle of the
symmetric matrix B, as specified by uplo.
3n).
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
Output Parameters
ap On exit, the contents of ap are overwritten.
bp On exit, contains the triangular factor U or L from the
Cholesky factorization B = UT*U or B = L*LT, in the same
storage format as B.
w, z REAL for sspgv
Arrays:
z(ldz,*).
If jobz = 'V', then if info = 0, z contains the matrix Z
of eigenvectors. The eigenvectors are normalized as follows:
1306
info INTEGER.
If info > 0, spptrf/dpptrf and sspev/dspev returned
an error code:
If info = i ≤ n, sspev/dspev failed to converge, and i
converge to zero;
were computed.

Specific details for the routine spgv interface are the following:
1307
?hpgv
storage.
Syntax
FORTRAN 77:
call chpgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, rwork, info)
call zhpgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, rwork, info)
Fortran 95:
call hpgv(ap, bp, w [,itype] [,uplo] [,z] [,info])
Description

Here A and B are assumed to be Hermitian, stored in packed format, and B is also positive
definite.
Input Parameters
to be solved:
1308
of A and B;
of A and B.
ap, bp, work COMPLEX for chpgv
DOUBLE COMPLEX for zhpgv.
Arrays:
Hermitian matrix B, as specified by uplo.
2n-1).
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
rwork REAL for chpgv
DOUBLE PRECISION for zhpgv.
Output Parameters
Cholesky factorization B = UH*U or B = L*LH, in the same
w REAL for chpgv
DOUBLE PRECISION for zhpgv.
z COMPLEX for chpgv
DOUBLE COMPLEX for zhpgv.
Array z(ldz,*).
1309

info INTEGER.
If info > 0, cpptrf/zpptrf and chpev/zhpev returned
an error code:
If info = i ≤ n, chpev/zhpev failed to converge, and i
converge to zero;
were computed.

Specific details for the routine hpgv interface are the following:
1310
?spgvd
storage. If eigenvectors are desired, it uses a divide
and conquer method.
Syntax
FORTRAN 77:
call sspgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, iwork,
liwork, info)
call dspgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, iwork,
liwork, info)
Fortran 95:
call spgvd(ap, bp, w [,itype] [,uplo] [,z] [,info])
Description

definite.
Input Parameters
to be solved:
1311

of A and B;
of A and B.
ap, bp, work REAL for sspgvd
DOUBLE PRECISION for dspgvd.
Arrays:
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
lwork INTEGER.
Constraints:
If jobz = 'N' and n>1, lwork ≥ 2n;
If jobz = 'V' and n>1, lwork ≥ 2n2+6n+1.
for details.
iwork INTEGER.
Workspace array, its dimension max(1, lwork).
liwork INTEGER.
1312
Constraints:
for details.
Output Parameters
w, z REAL for sspgv
Arrays:
z(ldz,*).
info INTEGER.
1313

If info > 0, spptrf/dpptrf and sspevd/dspevd returned
an error code:
If info = i ≤ n, sspevd/dspevd failed to converge, and
i off-diagonal elements of an intermediate tridiagonal did
not converge to zero;
were computed.

Specific details for the routine spgvd interface are the following:
Application Notes
subsequent runs.
1314
?hpgvd
storage. If eigenvectors are desired, it uses a divide
and conquer method.
Syntax
FORTRAN 77:
call chpgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, rwork,
call zhpgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, rwork,
Fortran 95:
call hpgvd(ap, bp, w [,itype] [,uplo] [,z] [,info])
Description

definite.
1315
Input Parameters
to be solved:
of A and B;
of A and B.
ap, bp, work COMPLEX for chpgvd
DOUBLE COMPLEX for zhpgvd.
Arrays:
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
lwork INTEGER.
Constraints:
If jobz = 'N' and n>1, lwork ≥ n;
If jobz = 'V' and n>1, lwork ≥ 2n.
1316
rwork REAL for chpgvd
DOUBLE PRECISION for zhpgvd.
Workspace array, its dimension max(1, lrwork).
lrwork INTEGER.
Constraints:
If jobz = 'V' and n>1, lrwork ≥ 2n2+5n+1.
iwork INTEGER.
liwork INTEGER.
Constraints:
1317
Output Parameters
w REAL for chpgvd
DOUBLE PRECISION for zhpgvd.
z COMPLEX for chpgvd
DOUBLE COMPLEX for zhpgvd.
Array z(ldz,*).
info INTEGER.
If info > 0, cpptrf/zpptrf and chpevd/zhpevd returned
an error code:
If info = i ≤ n, chpevd/zhpevd failed to converge, and
i off-diagonal elements of an intermediate tridiagonal did
not converge to zero;
1318
were computed.

Specific details for the routine hpgvd interface are the following:
Application Notes
1319
?spgvx
storage.
Syntax
FORTRAN 77:
call sspgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m,
w, z, ldz, work, iwork, ifail, info)
call dspgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m,
w, z, ldz, work, iwork, ifail, info)
Fortran 95:
call spgvx(ap, bp, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Description

definite. Eigenvalues and eigenvectors can be selected by specifying either a range of values
Input Parameters
to be solved:
1320
vl<lambda(i)≤ vu.
indices il to iu.
of A and B;
of A and B.
ap, bp, work REAL for sspgvx
DOUBLE PRECISION for dspgvx.
Arrays:
8n).
vl, vu REAL for sspgvx
Constraint: vl< vu.
il, iu INTEGER.
1321
if n = 0.
abstol REAL for sspgvx
Constraints:
iwork INTEGER.
Output Parameters
m = iu-il+1.
w, z REAL for sspgvx
Arrays:
z(ldz,*).
1322
used.
ifail INTEGER.
info INTEGER.
If info > 0, spptrf/dpptrf and sspevx/dspevx returned
an error code:
If info = i ≤ n, sspevx/dspevx failed to converge, and
in the array ifail;
were computed.

Specific details for the routine spgvx interface are the following:
1323
significant.
omitted.
Application Notes
If abstol is less than or equal to zero, then ε*||T||1 is used instead, where T is the tridiagonal
matrix obtained by reducing A to tridiagonal form. Eigenvalues are computed most accurately
when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
1324
?hpgvx
eigenvectors of a generalized Hermitian definite
eigenproblem with matrices in packed storage.
Syntax
FORTRAN 77:
call chpgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m,
w, z, ldz, work, rwork, iwork, ifail, info)
call zhpgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m,
w, z, ldz, work, rwork, iwork, ifail, info)
Fortran 95:
call hpgvx(ap, bp, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Description

definite. Eigenvalues and eigenvectors can be selected by specifying either a range of values
Input Parameters
to be solved:
1325

indices il to iu.
of A and B;
of A and B.
ap, bp, work COMPLEX for chpgvx
DOUBLE COMPLEX for zhpgvx.
Arrays:
2n).
vl, vu REAL for chpgvx
DOUBLE PRECISION for zhpgvx.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
1326
abstol REAL for chpgvx
The absolute error tolerance for the eigenvalues.
See Application Notes for more information.
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
rwork REAL for chpgvx
iwork INTEGER.
Output Parameters
m = iu-il+1.
w REAL for chpgvx
z COMPLEX for chpgvx
DOUBLE COMPLEX for zhpgvx.
Array z(ldz,*).
1327
used.
ifail INTEGER.
info INTEGER.
If info > 0, cpptrf/zpptrf and chpevx/zhpevx returned
an error code:
If info = i ≤ n, chpevx/zhpevx failed to converge, and
in the array ifail;
were computed.

Specific details for the routine hpgvx interface are the following:
1328
significant.
omitted.
Application Notes
1329
?sbgv
definite eigenproblem with banded matrices.
Syntax
FORTRAN 77:
call ssbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, info)
call dsbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, info)
Fortran 95:
call sbgv(ab, bb, w [,uplo] [,z] [,info])
Description
symmetric-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B are assumed
to be symmetric and banded, and B is also positive definite.
Input Parameters
If uplo = 'U', arrays ab and bb store the upper triangles
of A and B;
If uplo = 'L', arrays ab and bb store the lower triangles
of A and B.
(ka ≥ 0).
1330
kb INTEGER. The number of super- or sub-diagonals in B (kb
≥ 0).
ab, bb, work REAL for ssbgv
DOUBLE PRECISION for dsbgv
Arrays:
max(1, n).
bb(ldbb,*) is an array containing either upper or lower
triangular part of the symmetric matrix B (as specified by
max(1, n).
3n)
least ka+1.
least kb+1.
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
Output Parameters
ab On exit, the contents of ab are overwritten.
bb On exit, contains the factor S from the split Cholesky
factorization B = ST*S, as returned by spbstf/dpbstf.
w, z REAL for ssbgv
DOUBLE PRECISION for dsbgv
Arrays:
z(ldz,*).
1331

of eigenvectors, with the i-th column of z holding the
eigenvector associated with w(i). The eigenvectors are
normalized so that ZT*B*Z = I.
info INTEGER.
If info > 0, and
if i ≤ n, the algorithm failed to converge, and i off-diagonal
elements of an intermediate tridiagonal did not converge to
zero;
if info = n + i, for 1 ≤ i ≤ n, then spbstf/dpbstf
returned info = i and B is not positive-definite. The
factorization of B could not be completed and no eigenvalues
or eigenvectors were computed.

Specific details for the routine sbgv interface are the following:
1332
?hbgv
Syntax
FORTRAN 77:
call chbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, rwork,
info)
call zhbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, rwork,
info)
Fortran 95:
call hbgv(ab, bb, w [,uplo] [,z] [,info])
Description
generalized Hermitian-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B
are Hermitian and banded matrices, and matrix B is also positive definite.
Input Parameters
of A and B;
of A and B.
1333
(ka ≥ 0).
≥ 0).
ab, bb, work COMPLEX for chbgv
DOUBLE COMPLEX for zhbgv
Arrays:
max(1, n).
triangular part of the Hermitian matrix B (as specified by
max(1, n).
n).
least ka+1.
least kb+1.
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
rwork REAL for chbgv
DOUBLE PRECISION for zhbgv.
Output Parameters
H
factorization B = S *S, as returned by cpbstf/zpbstf.
w REAL for chbgv
DOUBLE PRECISION for zhbgv.
1334
z COMPLEX for chbgv
DOUBLE COMPLEX for zhbgv
Array z(ldz,*).
normalized so that ZH*B*Z = I.
info INTEGER.
If info > 0, and
zero;
if info = n + i, for 1 ≤ i ≤ n, then cpbstf/zpbstf

Specific details for the routine hbgv interface are the following:
1335
?sbgvd
definite eigenproblem with banded matrices. If
eigenvectors are desired, it uses a divide and
conquer method.
Syntax
FORTRAN 77:
call ssbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork,
call dsbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork,
Fortran 95:
call sbgvd(ab, bb, w [,uplo] [,z] [,info])
Description
to be symmetric and banded, and B is also positive definite.
Input Parameters
of A and B;
of A and B.
1336
(ka ≥ 0).
≥ 0).
ab, bb, work REAL for ssbgvd
DOUBLE PRECISION for dsbgvd
Arrays:
max(1, n).
max(1, n).
least ka+1.
least kb+1.
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
lwork INTEGER.
Constraints:
If jobz = 'N' and n>1, lwork ≥ 3n;
If jobz = 'V' and n>1, lwork ≥ 2n2+5n+1.
1337

for details.
iwork INTEGER.
liwork INTEGER.
Constraints:
for details.
Output Parameters
w, z REAL for ssbgvd
DOUBLE PRECISION for dsbgvd
Arrays:
z(ldz,*).
T
normalized so that Z *B*Z = I.
1338
info INTEGER.
If info > 0, and
zero;

Specific details for the routine sbgvd interface are the following:
Application Notes
1339
a workspace query.
?hbgvd
definite eigenproblem with banded matrices. If
eigenvectors are desired, it uses a divide and
conquer method.
Syntax
FORTRAN 77:
call chbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork,
rwork, lrwork, iwork, liwork, info)
call zhbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork,
rwork, lrwork, iwork, liwork, info)
Fortran 95:
call hbgvd(ab, bb, w [,uplo] [,z] [,info])
Description
are assumed to be Hermitian and banded, and B is also positive definite.
1340
Input Parameters
of A and B;
of A and B.
(ka≥0).
≥ 0).
ab, bb, work COMPLEX for chbgvd
DOUBLE COMPLEX for zhbgvd
Arrays:
max(1, n).
max(1, n).
least ka+1.
least kb+1.
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
1341
lwork INTEGER.
Constraints:
If jobz = 'N' and n>1, lwork ≥ n;
If jobz = 'V' and n>1, lwork ≥ 2n2.
rwork REAL for chbgvd
DOUBLE PRECISION for zhbgvd.
Workspace array, DIMENSION max(1, lrwork).
lrwork INTEGER.
Constraints:
If jobz = 'V' and n>1, lrwork ≥ 2n2+5n +1.
iwork INTEGER.
liwork INTEGER.
Constraints:
1342
Output Parameters
H
w REAL for chbgvd
DOUBLE PRECISION for zhbgvd.
Array, DIMENSION at least max(1, n) .
z COMPLEX for chbgvd
DOUBLE COMPLEX for zhbgvd
Array z(ldz,*) .
of eigenvectors , with the i-th column of z holding the
info INTEGER.
If info > 0, and
1343

zero;

Specific details for the routine hbgvd interface are the following:
Application Notes
1344
?sbgvx
Syntax
FORTRAN 77:
call ssbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu,
il, iu, abstol, m, w, z, ldz, work, iwork, ifail, info)
call dsbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu,
il, iu, abstol, m, w, z, ldz, work, iwork, ifail, info)
Fortran 95:
call sbgvx(ab, bb, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q]
[,abstol] [,info])
Description
to be symmetric and banded, and B is also positive definite. Eigenvalues and eigenvectors can
be selected by specifying either all eigenvalues, a range of values or a range of indices for the
desired eigenvalues.
Input Parameters
1345

vl<lambda(i)≤ vu.
If range = 'I', the routine computes eigenvalues in range
il to iu.
of A and B;
of A and B.
(ka ≥ 0).
≥ 0).
ab, bb, work REAL for ssbgvx
DOUBLE PRECISION for dsbgvx
Arrays:
max(1, n).
max(1, n).
work(*) is a workspace array, DIMENSION (7*n).
least ka+1.
least kb+1.
vl, vu REAL for ssbgvx
DOUBLE PRECISION for dsbgvx.
1346
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for ssbgvx
DOUBLE PRECISION for dsbgvx.
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
ldq INTEGER. The leading dimension of the output array q; ldq
< 1.
If jobz = 'V', ldq < max(1, n).
iwork INTEGER.
Output Parameters
m = iu-il+1.
w, z, q REAL for ssbgvx
DOUBLE PRECISION for dsbgvx
Arrays:
w(*), DIMENSION at least max(1, n) .
z(ldz,*) .
1347

of eigenvectors , with the i-th column of z holding the
normalized so that ZT*B*Z = I.
q(ldq,*) .
If jobz = 'V', then q contains the n-by-n matrix used in
the reduction of A*x = lambda*B*x to standard form, that
is, C*x= lambda*x and consequently C to tridiagonal form.
If jobz = 'N', then q is not referenced.
ifail INTEGER.
Array, DIMENSION (m).
info INTEGER.
If info > 0, and
zero;

Specific details for the routine sbgvx interface are the following:
1348
Note that there will be an error condition if ifail or q is present and
z is omitted.
Application Notes
1349
?hbgvx
Syntax
FORTRAN 77:
call chbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu,
il, iu, abstol, m, w, z, ldz, work, rwork, iwork, ifail, info)
call zhbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu,
il, iu, abstol, m, w, z, ldz, work, rwork, iwork, ifail, info)
Fortran 95:
call hbgvx(ab, bb, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q]
[,abstol] [,info])
Description
are assumed to be Hermitian and banded, and B is also positive definite. Eigenvalues and
eigenvectors can be selected by specifying either all eigenvalues, a range of values or a range
of indices for the desired eigenvalues.
Input Parameters
1350
indices il to iu.
of A and B;
of A and B.
(ka ≥ 0).
≥ 0).
ab, bb, work COMPLEX for chbgvx
DOUBLE COMPLEX for zhbgvx
Arrays:
max(1, n).
max(1, n).
n).
least ka+1.
least kb+1.
vl, vu REAL for chbgvx
DOUBLE PRECISION for zhbgvx.
1351
Constraint: vl< vu.

il, iu INTEGER.
if n = 0.
abstol REAL for chbgvx
≥ 1. If jobz = 'V', ldz ≥ max(1, n).
ldq INTEGER. The leading dimension of the output array q; ldq
≥ 1. If jobz = 'V', ldq ≥ max(1, n).
rwork REAL for chbgvx
iwork INTEGER.
Output Parameters
H
m = iu-il+1.
w REAL for chbgvx
Array w(*), DIMENSION at least max(1, n).
z, q COMPLEX for chbgvx
1352
DOUBLE COMPLEX for zhbgvx
Arrays:
z(ldz,*).
q(ldq,*).
If jobz = 'V', then q contains the n-by-n matrix used in
the reduction of Ax = λBx to standard form, that is, Cx =
λ x and consequently C to tridiagonal form.
If jobz = 'N', then q is not referenced.
ifail INTEGER.
info INTEGER.
If info > 0, and
zero;
1353

Specific details for the routine hbgvx interface are the following:
Note that there will be an error condition if ifail or q is present and
z is omitted.
Application Notes
1354
Generalized Nonsymmetric Eigenproblems

This section describes LAPACK driver routines used for solving generalized nonsymmetric
eigenproblems. See also computational routines computational routines that can be called to
solve these problems. Table 4-14 lists all such driver routines for FORTRAN 77 interface.
Table 4-14 Driver Routines for Solving Generalized Nonsymmetric Eigenproblems
?gges Computes the generalized eigenvalues, Schur form, and the left and/or
right Schur vectors for a pair of nonsymmetric matrices.
?ggesx Computes the generalized eigenvalues, Schur form, and, optionally, the
left and/or right matrices of Schur vectors.
?ggev Computes the generalized eigenvalues, and the left and/or right
generalized eigenvectors for a pair of nonsymmetric matrices.
?ggevx Computes the generalized eigenvalues, and, optionally, the left and/or
right generalized eigenvectors.
1355
?gges
Computes the generalized eigenvalues, Schur form,
and the left and/or right Schur vectors for a pair
of nonsymmetric matrices.
Syntax
FORTRAN 77:
call sgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar,
alphai, beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info)
call dgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar,
alphai, beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info)
call cgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta,
vsl, ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info)
call zgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta,
vsl, ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info)
Fortran 95:
call gges(a, b, alphar, alphai, beta [,vsl] [,vsr] [,select] [,sdim] [,info])
call gges(a, b, alpha, beta [, vsl] [,vsr] [,select] [,sdim] [,info])
Description
The routine computes for a pair of n-by-n real/complex nonsymmetric matrices (A,B), the
generalized eigenvalues, the generalized real/complex Schur form (S,T), optionally, the left
and/or right matrices of Schur vectors (vsl and vsr). This gives the generalized Schur
factorization
(A,B) = ( vsl*S *vsrH, vsl*T*vsrH )
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in
the leading diagonal blocks of the upper quasi-triangular matrix S and the upper triangular
matrix T. The leading columns of vsl and vsr then form an orthonormal/unitary basis for the
corresponding left and right eigenspaces (deflating subspaces).
(If only the generalized eigenvalues are needed, use the driver ?ggev instead, which is faster.)
1356
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha / beta = w,
such that A - w*B is singular. It is usually represented as the pair (alpha, beta), as there
is a reasonable interpretation for beta=0 or for both being zero. A pair of matrices (S,T) is in
generalized real Schur form if T is upper triangular with non-negative diagonal and S is block
upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1 blocks correspond to real generalized
eigenvalues, while 2-by-2 blocks of S will be “standardized" by making the corresponding
elements of T have the form:
and the pair of corresponding 2-by-2 blocks in S and T will have a complex conjugate pair of
generalized eigenvalues. A pair of matrices (S,T) is in generalized complex Schur form if S and
T are upper triangular and, in addition, the diagonal of T are non-negative real numbers.
Input Parameters
jobvsl CHARACTER*1. Must be 'N' or 'V'.
If jobvsl = 'N', then the left Schur vectors are not
computed.
If jobvsl = 'V', then the left Schur vectors are computed.
jobvsr CHARACTER*1. Must be 'N' or 'V'.
If jobvsr = 'N', then the right Schur vectors are not
computed.
If jobvsr = 'V', then the right Schur vectors are
computed.
not to order the eigenvalues on the diagonal of the
generalized Schur form.
If sort = 'S', eigenvalues are ordered (see selctg).
selctg LOGICAL FUNCTION of three REAL arguments for real
flavors.
LOGICAL FUNCTION of two COMPLEX arguments for complex
flavors.
selctg must be declared EXTERNAL in the calling subroutine.
1357
If sort = 'S', selctg is used to select eigenvalues to sort

If sort = 'N', selctg is not referenced.
For real flavors:
An eigenvalue (alphar(j) + alphai(j))/beta(j) is selected
if selctg(alphar(j), alphai(j), beta(j)) is true; that is, if
either one of a complex conjugate pair of eigenvalues is
selected, then both complex eigenvalues are selected.
Note that in the ill-conditioned case, a selected complex
eigenvalue may no longer satisfy selctg(alphar(j),
alphai(j), beta(j)) = .TRUE. after ordering. In this
case info is set to n+2 .
An eigenvalue alpha(j) / beta(j) is selected if
selctg(alpha(j), beta(j)) is true.
satisfy selctg(alpha(j), beta(j)) = .TRUE. after
ordering, since ordering may change the value of complex
eigenvalues (especially if the eigenvalue is ill-conditioned);
in this case info is set to n+2 (see info below).
n INTEGER. The order of the matrices A, B, vsl, and vsr (n
≥ 0).
a, b, work REAL for sgges
DOUBLE PRECISION for dgges
COMPLEX for cgges
DOUBLE COMPLEX for zgges.
Arrays:
a(lda,*) is an array containing the n-by-n matrix A (first of
the pair of matrices).
b(ldb,*) is an array containing the n-by-n matrix B (second
of the pair of matrices).
least max(1, n).
1358
ldb INTEGER. The first dimension of the array b. Must be at
least max(1, n).
ldvsl, ldvsr INTEGER. The first dimensions of the output matrices vsl
and vsr, respectively. Constraints:
ldvsl ≥ 1. If jobvsl = 'V', ldvsl ≥ max(1, n).
ldvsr ≥ 1. If jobvsr = 'V', ldvsr ≥ max(1, n).
lwork INTEGER.
lwork ≥ max(1, 8n+16) for real flavors;
rwork REAL for cgges
DOUBLE PRECISION for zgges
This array is used in complex flavors only.
bwork LOGICAL.
Output Parameters
a On exit, this array has been overwritten by its generalized
Schur form S.
b On exit, this array has been overwritten by its generalized
Schur form T.
sdim INTEGER.
(after sorting) for which selctg is true.
selctg is true for either eigenvalue count as 2.
alphar, alphai REAL for sgges;
1359
DOUBLE PRECISION for dgges.

Arrays, DIMENSION at least max(1, n) each. Contain values
that form generalized eigenvalues in real flavors.
See beta.
alpha COMPLEX for cgges;
form generalized eigenvalues in complex flavors. See beta.
beta REAL for sgges
COMPLEX for cgges
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n, will
be the generalized eigenvalues.
alphar(j) + alphai(j)*i and beta(j), j=1,..., n are the
diagonals of the complex Schur form (S,T) that would result
if the 2-by-2 diagonal blocks of the real generalized Schur
form of (A,B) were further reduced to triangular form using
complex unitary transformations. If alphai(j) is zero, then
the j-th eigenvalue is real; if positive, then the j-th and
(j+1)-st eigenvalues are a complex conjugate pair, with
alphai(j+1) negative.
On exit, alpha(j)/beta(j), j=1,..., n, will be the generalized
eigenvalues. alpha(j), j=1,...,n, and beta(j), j=1,..., n are
the diagonals of the complex Schur form (S,T) output by
cgges/zgges. The beta(j) will be non-negative real.
See also Application Notes below.
vsl, vsr REAL for sgges
COMPLEX for cgges
Arrays:
vsl(ldvsl,*), the second dimension of vsl must be at least
max(1, n).
1360
If jobvsl = 'V', this array will contain the left Schur
vectors.
If jobvsl = 'N', vsl is not referenced.
vsr(ldvsr,*), the second dimension of vsr must be at least
max(1, n).
If jobvsr = 'V', this array will contain the right Schur
vectors.
If jobvsr = 'N', vsr is not referenced.
info INTEGER.
If info = i, and
i ≤ n:
the QZ iteration failed. (A, B) is not in Schur form, but
alphar(j), alphai(j) (for real flavors), or alpha(j) (for
complex flavors), and beta(j), j=info+1,..., n should
be correct.
i > n: errors that usually indicate LAPACK problems:
i = n+1: other than QZ iteration failed in ?hgeqz;
i = n+2: after reordering, roundoff changed values of some
complex eigenvalues so that leading eigenvalues in the
generalized Schur form no longer satisfy selctg = .TRUE..
This could also be caused due to scaling;
i = n+3: reordering failed in ?tgsen.

Specific details for the routine gges interface are the following:
1361

vsl Holds the matrix VSL of size (n, n).
vsr Holds the matrix VSR of size (n, n).
jobvsl Restored based on the presence of the argument vsl as follows:
jobvsl = 'V', if vsl is present,
jobvsl = 'N', if vsl is omitted.
jobvsr Restored based on the presence of the argument vsr as follows:
jobvsr = 'V', if vsr is present,
jobvsr = 'N', if vsr is omitted.
Application Notes
workspace query.
workspace.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and
beta(j) may even be zero. Thus, you should avoid simply computing the ratio. However, alphar
and alphai will be always less than and usually comparable with norm(A) in magnitude, and
beta always less than and usually comparable with norm(B).
1362
?ggesx
Computes the generalized eigenvalues, Schur form,
and, optionally, the left and/or right matrices of
Schur vectors.
Syntax
FORTRAN 77:
call sggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim,
alphar, alphai, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork,
iwork, liwork, bwork, info)
call dggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim,
alphar, alphai, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork,
call cggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim,
alpha, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, rwork,
call zggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim,
alpha, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, rwork,
Fortran 95:
call ggesx(a, b, alphar, alphai, beta [,vsl] [,vsr] [,select] [,sdim] [,rconde]
[, rcondv] [,info])
call ggesx(a, b, alpha, beta [, vsl] [,vsr] [,select] [,sdim] [,rconde]
[,rcondv] [, info])
Description
generalized eigenvalues, the generalized real/complex Schur form (S,T), optionally, the left
and/or right matrices of Schur vectors (vsl and vsr). This gives the generalized Schur
factorization
(A,B) = ( vsl*S *vsrH, vsl*T*vsrH )
1363
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in
the leading diagonal blocks of the upper quasi-triangular matrix S and the upper triangular
matrix T; computes a reciprocal condition number for the average of the selected eigenvalues
(rconde); and computes a reciprocal condition number for the right and left deflating subspaces
corresponding to the selected eigenvalues (rcondv). The leading columns of vsl and vsr then
form an orthonormal/unitary basis for the corresponding left and right eigenspaces (deflating
subspaces).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha / beta =
w, such that A - w*B is singular. It is usually represented as the pair (alpha, beta), as there
is a reasonable interpretation for beta=0 or for both being zero. A pair of matrices (S,T) is in
generalized real Schur form if T is upper triangular with non-negative diagonal and S is block
upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1 blocks correspond to real generalized
eigenvalues, while 2-by-2 blocks of S will be “standardized" by making the corresponding
elements of T have the form:
and the pair of corresponding 2-by-2 blocks in S and T will have a complex conjugate pair of
generalized eigenvalues. A pair of matrices (S,T) is in generalized complex Schur form if S and
T are upper triangular and, in addition, the diagonal of T are non-negative real numbers.
Input Parameters
jobvsl CHARACTER*1. Must be 'N' or 'V'.
If jobvsl = 'N', then the left Schur vectors are not
computed.
If jobvsl = 'V', then the left Schur vectors are computed.
jobvsr CHARACTER*1. Must be 'N' or 'V'.
If jobvsr = 'N', then the right Schur vectors are not
computed.
If jobvsr = 'V', then the right Schur vectors are
computed.
1364
not to order the eigenvalues on the diagonal of the
generalized Schur form.
If sort = 'S', eigenvalues are ordered (see selctg).
selctg LOGICAL FUNCTION of three REAL arguments for real
flavors.
LOGICAL FUNCTION of two COMPLEX arguments for complex
flavors.
selctg must be declared EXTERNAL in the calling subroutine.
If sort = 'S', selctg is used to select eigenvalues to sort
If sort = 'N', selctg is not referenced.
For real flavors:
An eigenvalue (alphar(j) + alphai(j))/beta(j) is selected
if selctg(alphar(j), alphai(j), beta(j)) is true; that is, if
either one of a complex conjugate pair of eigenvalues is
selected, then both complex eigenvalues are selected.
Note that in the ill-conditioned case, a selected complex
eigenvalue may no longer satisfy selctg(alphar(j),
alphai(j), beta(j)) = .TRUE. after ordering. In this
case info is set to n+2.
An eigenvalue alpha(j) / beta(j) is selected if
selctg(alpha(j), beta(j)) is true.
satisfy selctg(alpha(j), beta(j)) = .TRUE. after
ordering, since ordering may change the value of complex
eigenvalues (especially if the eigenvalue is ill-conditioned);
in this case info is set to n+2 (see info below).
If sense = 'E', computed for average of selected
eigenvalues only;
If sense = 'V', computed for selected deflating subspaces
only;
If sense = 'B', computed for both.
1365
If sense is 'E', 'V', or 'B', then sort must equal 'S'.

n INTEGER. The order of the matrices A, B, vsl, and vsr (n
≥ 0).
a, b, work REAL for sggesx
DOUBLE PRECISION for dggesx
COMPLEX for cggesx
DOUBLE COMPLEX for zggesx.
Arrays:
ldb INTEGER. The first dimension of the array b.
ldvsl, ldvsr INTEGER. The first dimensions of the output matrices vsl
and vsr, respectively. Constraints:
ldvsl ≥ 1. If jobvsl = 'V', ldvsl ≥ max(1, n).
ldvsr ≥ 1. If jobvsr = 'V', ldvsr ≥ max(1, n).
lwork INTEGER.
For real flavors:
If n=0 then lwork≥1.
If n>0 and sense = 'N', then lwork ≥ max(8*n,
6*n+16).
If n>0 and sense = 'E', 'V', or 'B', then lwork ≥
max(8*n, 6*n+16, 2*sdim*(n-sdim));
If n=0 then lwork≥1.
If n>0 and sense = 'N', then lwork ≥ max(1, 2*n);
1366
If n>0 and sense = 'E', 'V', or 'B', then lwork ≥
max(1, 2*n, 2*sdim*(n-sdim)).
Note that 2*sdim*(n-sdim) ≤ n*n/2.
An error is only returned if lwork < max(8*n, 6*n+16)for
real flavors, and lwork < max(1, 2*n) for complex flavors,
but if sense = 'E', 'V', or 'B', this may not be large
enough.
If lwork=-1, then a workspace query is assumed; the
routine only calculates the bound on the optimal size of the
work array and the minimum size of the iwork array,
returns these values as the first entries of the work and
iwork arrays, and no error message related to lwork or
liwork is issued by xerbla.
rwork REAL for cggesx
DOUBLE PRECISION for zggesx
iwork INTEGER.
liwork INTEGER.
If sense = 'N', or n=0, then liwork≥1,
otherwise liwork ≥ (n+6) for real flavors, and liwork ≥
(n+2) for complex flavors.
If liwork=-1, then a workspace query is assumed; the
routine only calculates the bound on the optimal size of the
work array and the minimum size of the iwork array,
returns these values as the first entries of the work and
iwork arrays, and no error message related to lwork or
liwork is issued by xerbla.
bwork LOGICAL.
1367
Output Parameters
a On exit, this array has been overwritten by its generalized
Schur form S.
b On exit, this array has been overwritten by its generalized
Schur form T.
sdim INTEGER.
(after sorting) for which selctg is true.
selctg is true for either eigenvalue count as 2.
alphar, alphai REAL for sggesx;
DOUBLE PRECISION for dggesx.
See beta.
alpha COMPLEX for cggesx;
beta REAL for sggesx
COMPLEX for cggesx
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n will
alphar(j) + alphai(j)*i and beta(j), j=1,..., n are the
diagonals of the complex Schur form (S,T) that would result
if the 2-by-2 diagonal blocks of the real generalized Schur
form of (A,B) were further reduced to triangular form using
complex unitary transformations. If alphai(j) is zero, then
the j-th eigenvalue is real; if positive, then the j-th and
(j+1)-st eigenvalues are a complex conjugate pair, with
alphai(j+1) negative.
1368
On exit, alpha(j)/beta(j), j=1,..., n will be the generalized
eigenvalues. alpha(j), j=1,..., n, and beta(j), j=1,...,n are
the diagonals of the complex Schur form (S,T) output by
cggesx/zggesx. The beta(j) will be non-negative real.
vsl, vsr REAL for sggesx
COMPLEX for cggesx
Arrays:
vsl(ldvsl,*), the second dimension of vsl must be at least
max(1, n).
If jobvsl = 'V', this array will contain the left Schur
vectors.
If jobvsl = 'N', vsl is not referenced.
vsr(ldvsr,*), the second dimension of vsr must be at least
max(1, n).
If jobvsr = 'V', this array will contain the right Schur
vectors.
If jobvsr = 'N', vsr is not referenced.
rconde, rcondv REAL for single precision flavors
Arrays, DIMENSION (2) each
If sense = 'E' or 'B', rconde(1) and rconde(2) contain
the reciprocal condition numbers for the average of the
selected eigenvalues.
Not referenced if sense = 'N' or 'V'.
If sense = 'V' or 'B', rcondv(1) and rcondv(2) contain
the reciprocal condition numbers for the selected deflating
subspaces.
Not referenced if sense = 'N' or 'E'.
info INTEGER.
1369

If info = i, and
i ≤ n:
the QZ iteration failed. (A, B) is not in Schur form, but
alphar(j), alphai(j) (for real flavors), or alpha(j) (for
be correct.
i = n+2: after reordering, roundoff changed values of some
complex eigenvalues so that leading eigenvalues in the
generalized Schur form no longer satisfy selctg = .TRUE..
This could also be caused due to scaling;
i = n+3: reordering failed in ?tgsen.

Specific details for the routine ggesx interface are the following:
vsl Holds the matrix VSL of size (n, n).
vsr Holds the matrix VSR of size (n, n).
rconde Holds the vector of length (2).
rcondv Holds the vector of length (2).
jobvsl Restored based on the presence of the argument vsl as follows:
jobvsl = 'V', if vsl is present,
jobvsl = 'N', if vsl is omitted.
jobvsr Restored based on the presence of the argument vsr as follows:
1370
jobvsr = 'V', if vsr is present,
jobvsr = 'N', if vsr is omitted.
follows:
Note that there will be an error condition if rconde or rcondv are present and select is omitted.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (or liwork)
for the first run or set lwork = -1 (liwork = -1).
If you choose the first option and set any of admissible lwork (or liwork) sizes, which is no
less than the minimal value described, the routine completes the task, though probably not so
fast as with a recommended workspace, and provides the recommended workspace in the first
element of the corresponding array (work, iwork) on exit. Use this value (work(1), iwork(1))
a workspace query.
Note that if you set lwork (liwork) to less than the minimal required value and not -1, the
and alphai will be always less than and usually comparable with norm(A) in magnitude, and
beta always less than and usually comparable with norm(B).
1371
?ggev
Computes the generalized eigenvalues, and the
left and/or right generalized eigenvectors for a pair
of nonsymmetric matrices.
Syntax
FORTRAN 77:
call sggev(jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl,
vr, ldvr, work, lwork, info)
call dggev(jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl,
vr, ldvr, work, lwork, info)
call cggev(jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr,
work, lwork, rwork, info)
call zggev(jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr,
work, lwork, rwork, info)
Fortran 95:
call ggev(a, b, alphar, alphai, beta [,vl] [,vr] [,info])
call ggev(a, b, alpha, beta [, vl] [,vr] [,info])
Description
generalized eigenvalues, and optionally, the left and/or right generalized eigenvectors.
A generalized eigenvalue for a pair of matrices (A,B) is a scalar λ or a ratio alpha / beta = λ,
such that A - λ*B is singular. It is usually represented as the pair (alpha, beta), as there is
a reasonable interpretation for beta =0 and even for both being zero.
The right generalized eigenvector v(j) corresponding to the generalized eigenvalue λ(j) of
(A,B) satisfies
A*v(j) = λ(j)*B*v(j).
1372
The left generalized eigenvector u(j) corresponding to the generalized eigenvalue λ(j) of
(A,B) satisfies
u(j)H*A = λ(j)*u(j)H*B
where u(j)H denotes the conjugate transpose of u(j).
Input Parameters
If jobvl = 'N', the left generalized eigenvectors are not
computed;
If jobvl = 'V', the left generalized eigenvectors are
computed.
If jobvr = 'N', the right generalized eigenvectors are not
computed;
If jobvr = 'V', the right generalized eigenvectors are
computed.
n INTEGER. The order of the matrices A, B, vl, and vr (n ≥
0).
a, b, work REAL for sggev
DOUBLE PRECISION for dggev
COMPLEX for cggev
DOUBLE COMPLEX for zggev.
Arrays:
least max(1, n).
ldb INTEGER. The first dimension of the array b. Must be at
least max(1, n).
1373
ldvl, ldvr INTEGER. The first dimensions of the output matrices vl

Constraints:
ldvl ≥ 1. If jobvl = 'V', ldvl ≥ max(1, n).
ldvr ≥ 1. If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
lwork ≥ max(1, 8n+16) for real flavors;
rwork REAL for cggev
DOUBLE PRECISION for zggev
Output Parameters
a, b On exit, these arrays have been overwritten.
alphar, alphai REAL for sggev;
DOUBLE PRECISION for dggev.
See beta.
alpha COMPLEX for cggev;
beta REAL for sggev
COMPLEX for cggev
1374
For real flavors:
On exit, (alphar(j)+ alphai(j)*i)/beta(j), j=1,..., n, are
the generalized eigenvalues.
On exit, alpha(j)/beta(j), j=1,..., n, are the generalized
eigenvalues.
vl, vr REAL for sggev
COMPLEX for cggev
Arrays:
max(1, n).
If jobvl = 'V', the left generalized eigenvectors u(j) are
stored one after another in the columns of vl, in the same
order as their eigenvalues. Each eigenvector is scaled so
the largest component has abs(Re) + abs(Im) = 1.
For real flavors:
j-th column of vl.
and u(j+1) = vl(:,j) - i*vl(:,j+1), where i =
sqrt(-1).
max(1, n).
If jobvr = 'V', the right generalized eigenvectors v(j) are
stored one after another in the columns of vr, in the same
order as their eigenvalues. Each eigenvector is scaled so
the largest component has abs(Re) + abs(Im) = 1.
1375
For real flavors:

j-th column of vr.
and v(j+1) = vr(:,j) - i*vr(:,j+1).
info INTEGER.
If info = i, and
i ≤ n: the QZ iteration failed. No eigenvectors have been
calculated, but alphar(j), alphai(j) (for real flavors), or
alpha(j) (for complex flavors), and beta(j), j=info+1,...,
n should be correct.
i = n+2: error return from ?tgevc.

Specific details for the routine ggev interface are the following:
1376
Application Notes
workspace query.
workspace.
and alphai (for real flavors) or alpha (for complex flavors) will be always less than and usually
comparable with norm(A) in magnitude, and beta always less than and usually comparable
with norm(B).
1377
?ggevx
Computes the generalized eigenvalues, and,
optionally, the left and/or right generalized
eigenvectors.
Syntax
FORTRAN 77:
call sggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alphar, alphai,
beta, vl, ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde,
rcondv, work, lwork, iwork, bwork, info)
call dggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alphar, alphai,
beta, vl, ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde,
rcondv, work, lwork, iwork, bwork, info)
call cggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alpha, beta, vl,
ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work,
lwork, rwork, iwork, bwork, info)
call zggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alpha, beta, vl,
ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work,
lwork, rwork, iwork, bwork, info)
Fortran 95:
call ggevx(a, b, alphar, alphai, beta [,vl] [,vr] [,balanc] [,ilo] [,ihi] [,
lscale] [,rscale] [,abnrm] [,bbnrm] [,rconde] [,rcondv] [,info])
call ggevx(a, b, alpha, beta [, vl] [,vr] [,balanc] [,ilo] [,ihi] [,lscale]
[, rscale] [,abnrm] [,bbnrm] [,rconde] [,rcondv] [,info])
Description
generalized eigenvalues, and optionally, the left and/or right generalized eigenvectors.
1378
Optionally also, it computes a balancing transformation to improve the conditioning of the
eigenvalues and eigenvectors (ilo, ihi, lscale, rscale, abnrm, and bbnrm), reciprocal condition
numbers for the eigenvalues (rconde), and reciprocal condition numbers for the right
eigenvectors (rcondv).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar λ or a ratio alpha / beta = λ,
such that A - λ*B is singular. It is usually represented as the pair (alpha, beta), as there is
a reasonable interpretation for beta=0 and even for both being zero. The right generalized
eigenvector v(j) corresponding to the generalized eigenvalue λ(j) of (A,B) satisfies
A*v(j) = λ(j)*B*v(j).
The left generalized eigenvector u(j) corresponding to the generalized eigenvalue λ(j) of
(A,B) satisfies
u(j)H*A = λ(j)*u(j)H*B
where u(j)H denotes the conjugate transpose of u(j).
Input Parameters
balanc CHARACTER*1. Must be 'N', 'P', 'S', or 'B'. Specifies the
balance option to be performed.
If balanc = 'N', do not diagonally scale or permute;
If balanc = 'P', permute only;
If balanc = 'S', scale only;
If balanc = 'B', both permute and scale.
Computed reciprocal condition numbers will be for the
matrices after balancing and/or permuting. Permuting does
not change condition numbers (in exact arithmetic), but
balancing does.
If jobvl = 'N', the left generalized eigenvectors are not
computed;
If jobvl = 'V', the left generalized eigenvectors are
computed.
If jobvr = 'N', the right generalized eigenvectors are not
computed;
1379
If jobvr = 'V', the right generalized eigenvectors are

computed.
If sense = 'E', computed for eigenvalues only;
If sense = 'V', computed for eigenvectors only;
If sense = 'B', computed for eigenvalues and
eigenvectors.
n INTEGER. The order of the matrices A, B, vl, and vr (n ≥
0).
a, b, work REAL for sggevx
DOUBLE PRECISION for dggevx
COMPLEX for cggevx
DOUBLE COMPLEX for zggevx.
Arrays:
ldb INTEGER. The first dimension of the array b.
ldvl, ldvr INTEGER. The first dimensions of the output matrices vl
Constraints:
ldvl ≥ 1. If jobvl = 'V', ldvl ≥ max(1, n).
ldvr ≥ 1. If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
The dimension of the array work. lwork ≥ max(1, 2*n);
For real flavors:
1380
If balanc = 'S', or 'B', or jobvl = 'V', or jobvr =
'V', then lwork ≥ max(1, 6*n);
if sense = 'E', or 'B', then lwork ≥ max(1, 10*n);
if sense = 'V', or 'B', lwork ≥ (2n2+ 8*n+16).
if sense = 'E', lwork ≥ max(1, 4*n);
if sense = 'V', or 'B', lwork ≥max(1, 2*n2+ 2*n).
rwork REAL for cggevx
DOUBLE PRECISION for zggevx
Workspace array, DIMENSION at least max(1, 6*n) if
balanc = 'S', or 'B', and at least max(1, 2*n)
otherwise.
iwork INTEGER.
Workspace array, DIMENSION at least (n+6) for real flavors
and at least (n+2) for complex flavors.
Not referenced if sense = 'E'.
Not referenced if sense = 'N'.
Output Parameters
a, b On exit, these arrays have been overwritten.
If jobvl = 'V' or jobvr = 'V' or both, then a contains
the first part of the real Schur form of the “balanced”
versions of the input A and B, and b contains its second part.
alphar, alphai REAL for sggevx;
DOUBLE PRECISION for dggevx.
See beta.
alpha COMPLEX for cggevx;
1381

beta REAL for sggevx
COMPLEX for cggevx
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n, will
On exit, alpha(j)/beta(j), j=1,..., n, will be the generalized
eigenvalues.
vl, vr REAL for sggevx
COMPLEX for cggevx
Arrays:
max(1, n).
If jobvl = 'V', the left generalized eigenvectors u(j) are
stored one after another in the columns of vl, in the same
order as their eigenvalues. Each eigenvector will be scaled
so the largest component have abs(Re) + abs(Im) = 1.
For real flavors:
j-th column of vl.
and u(j+1) = vl(:,j) - i*vl(:,j+1), where i =
sqrt(-1).
1382
max(1, n).
If jobvr = 'V', the right generalized eigenvectors v(j) are
stored one after another in the columns of vr, in the same
order as their eigenvalues. Each eigenvector will be scaled
so the largest component have abs(Re) + abs(Im) = 1.
For real flavors:
j-th column of vr.
and v(j+1) = vr(:,j) - i*vr(:,j+1).
ilo, ihi INTEGER. ilo and ihi are integer values such that on exit
A(i,j) = 0 and B(i,j) = 0 if i > j and j = 1,...,
ilo-1 or i = ihi+1,..., n.
If balanc = 'N' or 'S', ilo = 1 and ihi = n.
lscale, rscale REAL for single-precision flavors
lscale contains details of the permutations and scaling
factors applied to the left side of A and B.
If PL(j) is the index of the row interchanged with row j,
and DL(j) is the scaling factor applied to row j, then
lscale(j) = PL(j), for j = 1,..., ilo-1
= DL(j), for j = ilo,...,ihi
= PL(j) for j = ihi+1,..., n.
then 1 to ilo-1.
rscale contains details of the permutations and scaling
factors applied to the right side of A and B.
If PR(j) is the index of the column interchanged with
column j, and DR(j) is the scaling factor applied to column
j, then
rscale(j) = PR(j), for j = 1,..., ilo-1
1383
= DR(j), for j = ilo,...,ihi

= PR(j) for j = ihi+1,..., n.
then 1 to ilo-1.
abnrm, bbnrm REAL for single-precision flavors
The one-norms of the balanced matrices A and B,
respectively.
If sense = 'E', or 'B', rconde contains the reciprocal
condition numbers of the eigenvalues, stored in consecutive
elements of the array. For a complex conjugate pair of
eigenvalues two consecutive elements of rconde are set to
the same value. Thus rconde(j), rcondv(j), and the j-th
columns of vl and vr all correspond to the same eigenpair
(but not in general the j-th eigenpair, unless all eigenpairs
are selected).
If sense = 'N', or 'V', rconde is not referenced.
If sense = 'V', or 'B', rcondv contains the estimated
reciprocal condition numbers of the eigenvectors, stored in
consecutive elements of the array. For a complex
eigenvector two consecutive elements of rcondv are set to
the same value.
If the eigenvalues cannot be reordered to compute
rcondv(j), rcondv(j) is set to 0; this can only occur when
the true value would be very small anyway.
If sense = 'N', or 'E', rcondv is not referenced.
info INTEGER.
If info = i, and
i ≤ n:
1384
the QZ iteration failed. No eigenvectors have been calculated,
but alphar(j), alphai(j) (for real flavors), or alpha(j) (for
be correct.
i = n+2: error return from ?tgevc.

Specific details for the routine ggevx interface are the following:
lscale Holds the vector of length n.
rscale Holds the vector of length n.
rconde Holds the vector of length n.
rcondv Holds the vector of length n.
balanc Must be 'N', 'B', or 'P'. The default value is 'N'.
1385

follows:
Application Notes
workspace query.
workspace.
and alphai (for real flavors) or alpha (for complex flavors) will be always less than and usually
comparable with norm(A) in magnitude, and beta always less than and usually comparable
with norm(B).
1386
LAPACK Auxiliary and Utility
Routines 5
This chapter describes the Intel® Math Kernel Library implementation of LAPACK auxiliary and utility
routines. The library includes auxiliary routines for both real and complex data.
Auxiliary Routines
Routine naming conventions, mathematical notation, and matrix storage schemes used for LAPACK
auxiliary routines are the same as for the driver and computational routines described in previous
chapters.
The table below summarizes information about the available LAPACK auxiliary routines.
Table 5-1 LAPACK Auxiliary Routines
Routine Name Data Description
Types
?lacgv c, z Conjugates a complex vector.
?lacrm c, z Multiplies a complex matrix by a square real matrix.
?lacrt c, z Performs a linear transformation of a pair of complex vectors.
?laesy c, z Computes the eigenvalues and eigenvectors of a 2-by-2 complex

symmetric matrix.
?rot c, z Applies a plane rotation with real cosine and complex sine to a
pair of complex vectors.
?spmv c, z Computes a matrix-vector product for complex vectors using a

complex symmetric packed matrix
?spr c, z Performs the symmetrical rank-1 update of a complex symmetric

packed matrix.
?symv c, z Computes a matrix-vector product for a complex symmetric

matrix.
?syr c, z Performs the symmetric rank-1 update of a complex symmetric

matrix.
1387

Types
i?max1 c, z Finds the index of the vector element whose real part has
maximum absolute value.
?sum1 sc, dz Forms the 1-norm of the complex vector using the true
absolute value.
?gbtf2 s, d, c, z Computes the LU factorization of a general band matrix using

the unblocked version of the algorithm.
?gebd2 s, d, c, z Reduces a general matrix to bidiagonal form using an

unblocked algorithm.
?gehd2 s, d, c, z Reduces a general square matrix to upper Hessenberg form

using an unblocked algorithm.
?gelq2 s, d, c, z Computes the LQ factorization of a general rectangular matrix

?geql2 s, d, c, z Computes the QL factorization of a general rectangular matrix

?geqr2 s, d, c, z Computes the QR factorization of a general rectangular

matrix using an unblocked algorithm.
?gerq2 s, d, c, z Computes the RQ factorization of a general rectangular

matrix using an unblocked algorithm.
?gesc2 s, d, c, z Solves a system of linear equations using the LU factorization

with complete pivoting computed by ?getc2.
?getc2 s, d, c, z Computes the LU factorization with complete pivoting of the

general n-by-n matrix.
?getf2 s, d, c, z Computes the LU factorization of a general m-by-n matrix

using partial pivoting with row interchanges (unblocked
algorithm).
?gtts2 s, d, c, z Solves a system of linear equations with a tridiagonal matrix

using the LU factorization computed by ?gttrf.
?isnan s, d, Tests input for NaN.
1388
LAPACK Auxiliary and Utility Routines 5
Types
?laisnan s, d, Tests input for NaN by comparing itwo arguments for

inequality.
?labrd s, d, c, z Reduces the first nb rows and columns of a general matrix

to a bidiagonal form.
?lacn2 s, d, c, z Estimates the 1-norm of a square matrix, using reverse

communication for evaluating matrix-vector products.
?lacon s, d, c, z Estimates the 1-norm of a square matrix, using reverse

?lacpy s, d, c, z Copies all or part of one two-dimensional array to another.
?ladiv s, d, c, z Performs complex division in real arithmetic, avoiding

unnecessary overflow.
?lae2 s, d Computes the eigenvalues of a 2-by-2 symmetric matrix.
?laebz s, d Computes the number of eigenvalues of a real symmetric

tridiagonal matrix which are less than or equal to a given
value, and performs other tasks required by the routine
?stebz.
?laed0 s, d, c, z Used by ?stedc. Computes all eigenvalues and corresponding

eigenvectors of an unreduced symmetric tridiagonal matrix
using the divide and conquer method.
?laed1 s, d Used by sstedc/dstedc. Computes the updated eigensystem

of a diagonal matrix after modification by a rank-one
symmetric matrix. Used when the original matrix is
tridiagonal.
?laed2 s, d Used by sstedc/dstedc. Merges eigenvalues and deflates

secular equation. Used when the original matrix is tridiagonal.
?laed3 s, d Used by sstedc/dstedc. Finds the roots of the secular

equation and updates the eigenvectors. Used when the
original matrix is tridiagonal.
1389

Types
?laed4 s, d Used by sstedc/dstedc. Finds a single root of the secular

equation.
?laed5 s, d Used by sstedc/dstedc. Solves the 2-by-2 secular equation.
?laed6 s, d Used by sstedc/dstedc. Computes one Newton step in

solution of the secular equation.
?laed7 s, d, c, z Used by ?stedc. Computes the updated eigensystem of a

diagonal matrix after modification by a rank-one symmetric
matrix. Used when the original matrix is dense.
?laed8 s, d, c, z Used by ?stedc. Merges eigenvalues and deflates secular

equation. Used when the original matrix is dense.
?laed9 s, d Used by sstedc/dstedc. Finds the roots of the secular

equation and updates the eigenvectors. Used when the
original matrix is dense.
?laeda s, d Used by ?stedc. Computes the Z vector determining the

rank-one modification of the diagonal matrix. Used when the
?laein s, d, c, z Computes a specified right or left eigenvector of an upper

Hessenberg matrix by inverse iteration.
?laev2 s, d, c, z Computes the eigenvalues and eigenvectors of a 2-by-2

symmetric/Hermitian matrix.
?laexc s, d Swaps adjacent diagonal blocks of a real upper

quasi-triangular matrix in Schur canonical form, by an
orthogonal similarity transformation.
?lag2 s, d Computes the eigenvalues of a 2-by-2 generalized eigenvalue

problem, with scaling as necessary to avoid over-/underflow.
?lags2 s, d Computes 2-by-2 orthogonal matrices U, V, and Q, and

applies them to matrices A and B such that the rows of the
transformed A and B are parallel.
1390
Types
?lagtf s, d Computes an LU factorization of a matrix T-λI, where T is a

general tridiagonal matrix, and λ a scalar, using partial
pivoting with row interchanges.
?lagtm s, d, c, z Performs a matrix-matrix product of the form C = αab+βC,

where A is a tridiagonal matrix, B and C are rectangular
matrices, and α and β are scalars, which may be 0, 1, or -1.
?lagts s, d Solves the system of equations (T-λI)x = y or (T-λI) x = y

T
,where T is a general tridiagonal matrix and λ a scalar, using

the LU factorization computed by ?lagtf.
?lagv2 s, d Computes the Generalized Schur factorization of a real 2-by-2

matrix pencil (A,B) where B is upper triangular.
?lahqr s, d, c, z Computes the eigenvalues and Schur factorization of an

upper Hessenberg matrix, using the double-shift/single-shift
QR algorithm.
?lahrd s, d, c, z Reduces the first nb columns of a general rectangular matrix

A so that elements below the k-th subdiagonal are zero, and
returns auxiliary matrices which are needed to apply the
transformation to the unreduced part of A.
?lahr2 s, d, c, z Reduces the specified number of first columns of a general

rectangular matrix A so that elements below thespecified
subdiagonal are zero, and returns auxiliary matrices which
are needed to apply the transformation to the unreduced
part of A.
?laic1 s, d, c, z Applies one step of incremental condition estimation.
?laln2 s, d Solves a 1-by-1 or 2-by-2 linear system of equations of the

specified form.
?lals0 s, d, c, z Applies back multiplying factors in solving the least squares

problem using divide and conquer SVD approach. Used by
?gelsd.
1391

Types
?lalsa s, d, c, z Computes the SVD of the coefficient matrix in compact form.

Used by ?gelsd.
?lalsd s, d, c, z Uses the singular value decomposition of A to solve the least

squares problem.
?lamrg s, d Creates a permutation list to merge the entries of two

independently sorted sets into a single set sorted in
ascending order.
?laneg s, d Computes the Sturm count.
?langb s, d, c, z Returns the value of the 1-norm, Frobenius norm,

infinity-norm, or the largest absolute value of any element
of general band matrix.
?lange s, d, c, z Returns the value of the 1-norm, Frobenius norm,

of a general rectangular matrix.
?langt s, d, c, z Returns the value of the 1-norm, Frobenius norm,

of a general tridiagonal matrix.
?lanhs s, d, c, z Returns the value of the 1-norm, Frobenius norm,

of an upper Hessenberg matrix.
?lansb s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or

the infinity norm, or the element of largest absolute value
of a symmetric band matrix.
?lanhb c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a Hermitian band matrix.
?lansp s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a symmetric matrix supplied in packed form.
1392
Types
?lanhp c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a complex Hermitian matrix supplied in packed form.
?lanst/?lanht s, d/c, z Returns the value of the 1-norm, or the Frobenius norm, or
of a real symmetric or complex Hermitian tridiagonal matrix.
?lansy s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a real/complex symmetric matrix.
?lanhe c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a complex Hermitian matrix.
?lantb s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a triangular band matrix.
?lantp s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a triangular matrix supplied in packed form.
?lantr s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or

of a trapezoidal or triangular matrix.
?lanv2 s, d Computes the Schur factorization of a real 2-by-2

nonsymmetric matrix in standard form.
?lapll s, d, c, z Measures the linear dependence of two vectors.
?lapmt s, d, c, z Performs a forward or backward permutation of the columns

of a matrix.
2 2
?lapy2 s, d Returns sqrt(x +y ).
2 2 2
?lapy3 s, d Returns sqrt(x +y +z ).
?laqgb s, d, c, z Scales a general band matrix, using row and column scaling
factors computed by ?gbequ.
1393

Types
?laqge s, d, c, z Scales a general rectangular matrix, using row and column

scaling factors computed by ?geequ.
?laqhb c, z Scales a Hermetian band matrix, using scaling factors

computed by ?pbequ.
?laqp2 s, d, c, z Computes a QR factorization with column pivoting of the

matrix block.
?laqps s, d, c, z Computes a step of QR factorization with column pivoting of

a real m-by-n matrix A by using BLAS level 3.
?laqr0 s, d, c, z Computes the eigenvalues of a Hessenberg matrix, and

optionally the matrices from the Schur decomposition.
?laqr1 s, d, c, z Sets a scalar multiple of the first column of the product of

2-by-2 or 3-by-3 matrix H and specified shifts.
?laqr2 s, d, c, z Performs the orthogonal/unitary similarity transformation of

a Hessenberg matrix to detect and deflate fully converged
eigenvalues from a trailing principal submatrix (aggressive
early deflation).
?laqr3 s, d, c, z Performs the orthogonal/unitary similarity transformation of

a Hessenberg matrix to detect and deflate fully converged
eigenvalues from a trailing principal submatrix (aggressive
early deflation).
?laqr4 s, d, c, z Computes the eigenvalues of a Hessenberg matrix, and

optionally the matrices from the Schur decomposition.
?laqr5 s, d, c, z Performs a single small-bulge multi-shift QR sweep.
?laqsb s, d, c, z Scales a symmetric/Hermitian band matrix, using scaling

factors computed by ?pbequ.
?laqsp s, d, c, z Scales a symmetric/Hermitian matrix in packed storage,

using scaling factors computed by ?ppequ.
?laqsy s, d, c, z Scales a symmetric/Hermitian matrix, using scaling factors

computed by ?poequ.
1394
Types
?laqtr s, d Solves a real quasi-triangular system of equations, or a

complex quasi-triangular system of special form, in real
arithmetic.
?lar1v s, d, c, z Computes the (scaled) r-th column of the inverse of the

submatrix in rows b1 through bn of the tridiagonal matrix
ldL - σI.
T
?lar2v s, d, c, z Applies a vector of plane rotations with real cosines and

real/complex sines from both sides to a sequence of 2-by-2
symmetric/Hermitian matrices.
?larf s, d, c, z Applies an elementary reflector to a general rectangular

matrix.
?larfb s, d, c, z Applies a block reflector or its transpose/conjugate-transpose

to a general rectangular matrix.
?larfg s, d, c, z Generates an elementary reflector (Householder matrix).

H
?larft s, d, c, z Forms the triangular factor T of a block reflector H = I - vtv
?larfx s, d, c, z Applies an elementary reflector to a general rectangular

matrix, with loop unrolling when the reflector has order ≤
10.
?largv s, d, c, z Generates a vector of plane rotations with real cosines and

real/complex sines.
?larnv s, d, c, z Returns a vector of random numbers from a uniform or

normal distribution.
?larra s, d Computes the splitting points with the specified threshold.
?larrb s, d Provides limited bisection to locate eigenvalues for more

accuracy.
?larrc s, d Computes the number of eigenvalues of the symmetric

tridiagonal matrix.
1395

Types
?larrd s, d Computes the eigenvalues of a symmetric tridiagonal matrix

to suitable accuracy.
?larre s, d Given the tridiagonal matrix T, sets small off-diagonal

elements to zero and for each unreduced block Ti, finds base
representations and eigenvalues.
?larrf s, d Finds a new relatively robust representation such that at

least one of the eigenvalues is relatively isolated.
?larrj s, d Performs refinement of the initial estimates of the

eigenvalues of the matrix T.
?larrk s, d Computes one eigenvalue of a symmetric tridiagonal matrix

T to suitable accuracy.
?larrr s, d Performs tests to decide whether the symmetric tridiagonal

matrix T warrants expensive computations which guarantee
high relative accuracy in the eigenvalues.
?larrv s, d, c, z Computes the eigenvectors of the tridiagonal matrix T = L

T T
D L given L, D and the eigenvalues of L D L .
?lartg s, d, c, z Generates a plane rotation with real cosine and real/complex

sine.
?lartv s, d, c, z Applies a vector of plane rotations with real cosines and

real/complex sines to the elements of a pair of vectors.
?laruv s, d Returns a vector of n random real numbers from a uniform

distribution.
?larz s, d, c, z Applies an elementary reflector (as returned by ?tzrzf) to

a general matrix.
?larzb s, d, c, z Applies a block reflector or its transpose/conjugate-transpose

to a general matrix.
H
?larzt s, d, c, z Forms the triangular factor T of a block reflector H = I - vtv .
?las2 s, d Computes singular values of a 2-by-2 triangular matrix.
1396
Types
?lascl s, d, c, z Multiplies a general rectangular matrix by a real scalar

defined as cto/cfrom.
?lasd0 s, d Computes the singular values of a real upper bidiagonal

n-by-m matrix B with diagonal d and off-diagonal e. Used
by ?bdsdc.
?lasd1 s, d Computes the SVD of an upper bidiagonal matrix B of the

specified size. Used by ?bdsdc.
?lasd2 s, d Merges the two sets of singular values together into a single
sorted set. Used by ?bdsdc.
?lasd3 s, d Finds all square roots of the roots of the secular equation,
as defined by the values in D and Z, and then updates the
singular vectors by matrix multiplication. Used by ?bdsdc.
?lasd4 s, d Computes the square root of the i-th updated eigenvalue of

a positive symmetric rank-one modification to a positive
diagonal matrix. Used by ?bdsdc.
?lasd5 s, d Computes the square root of the i-th eigenvalue of a positive

symmetric rank-one modification of a 2-by-2 diagonal
matrix.Used by ?bdsdc.
?lasd6 s, d Computes the SVD of an updated upper bidiagonal matrix

obtained by merging two smaller ones by appending a row.
Used by ?bdsdc.
?lasd7 s, d Merges the two sets of singular values together into a single
sorted set. Then it tries to deflate the size of the problem.
Used by ?bdsdc.
?lasd8 s, d Finds the square roots of the roots of the secular equation,
and stores, for each element in D, the distance to its two
nearest poles. Used by ?bdsdc.
?lasd9 s, d Finds the square roots of the roots of the secular equation,
and stores, for each element in D, the distance to its two
nearest poles. Used by ?bdsdc.
1397

Types
?lasda s, d Computes the singular value decomposition (SVD) of a real

upper bidiagonal matrix with diagonal d and off-diagonal e.
Used by ?bdsdc.
?lasdq s, d Computes the SVD of a real bidiagonal matrix with diagonal

d and off-diagonal e. Used by ?bdsdc.
?lasdt s, d Creates a tree of subproblems for bidiagonal divide and

conquer. Used by ?bdsdc.
?laset s, d, c, z Initializes the off-diagonal elements and the diagonal

elements of a matrix to given values.
?lasq1 s, d Computes the singular values of a real square bidiagonal

matrix. Used by ?bdsqr.
?lasq2 s, d Computes all the eigenvalues of the symmetric positive

definite tridiagonal matrix associated with the qd Array Z to
high relative accuracy. Used by ?bdsqr and ?stegr.
?lasq3 s, d Checks for deflation, computes a shift and calls dqds. Used
by ?bdsqr.
?lasq4 s, d Computes an approximation to the smallest eigenvalue using

values of d from the previous transform. Used by ?bdsqr.
?lasq5 s, d Computes one dqds transform in ping-pong form. Used by

?bdsqr and ?stegr.
?lasq6 s, d Computes one dqd transform in ping-pong form. Used by

?bdsqr and ?stegr.
?lasr s, d, c, z Applies a sequence of plane rotations to a general rectangular

matrix.
?lasrt s, d Sorts numbers in increasing or decreasing order.
?lassq s, d, c, z Updates a sum of squares represented in scaled form.
?lasv2 s, d Computes the singular value decomposition of a 2-by-2

triangular matrix.
1398
Types
?laswp s, d, c, z Performs a series of row interchanges on a general

rectangular matrix.
?lasy2 s, d Solves the Sylvester matrix equation where the matrices are
of order 1 or 2.
?lasyf s, d, c, z Computes a partial factorization of a real/complex symmetric

matrix, using the diagonal pivoting method.
?lahef c, z Computes a partial factorization of a complex Hermitian

indefinite matrix, using the diagonal pivoting method.
?latbs s, d, c, z Solves a triangular banded system of equations.
?latdf s, d, c, z Uses the LU factorization of the n-by-n matrix computed by

?getc2 and computes a contribution to the reciprocal
Dif-estimate.
?latps s, d, c, z Solves a triangular system of equations with the matrix held

in packed storage.
?latrd s, d, c, z Reduces the first nb rows and columns of a

symmetric/Hermitian matrix A to real tridiagonal form by an
orthogonal/unitary similarity transformation.
?latrs s, d, c, z Solves a triangular system of equations with the scale factor

set to prevent overflow.
?latrz s, d, c, z Factors an upper trapezoidal matrix by means of

H H
?lauu2 s, d, c, z Computes the product UU or L L, where U and L are upper
or lower triangular matrices (unblocked algorithm).
H H
?lauum s, d, c, z Computes the product UU or L L, where U and L are upper
or lower triangular matrices (blocked algorithm).
?lazq3 s, d, Checks for deflation, computes a shift and calls dqds
?lazq4 s, d, Computes an approximation to the smallest eigenvalue using

values of d from the previous transform.
1399

Types
?org2l/?ung2l s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from

a QL factorization determined by ?geqlf (unblocked
algorithm).
?org2r/?ung2r s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from

a QR factorization determined by ?geqrf (unblocked
algorithm).
?orgl2/?ungl2 s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from

an LQ factorization determined by ?gelqf (unblocked
algorithm).
?orgr2/?ungr2 s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from

an RQ factorization determined by ?gerqf (unblocked
algorithm).
?orm2l/?unm2l s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix

from a QL factorization determined by ?geqlf (unblocked
algorithm).
?orm2r/?unm2r s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix

from a QR factorization determined by ?geqrf (unblocked
algorithm).
?orml2/?unml2 s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix

from a LQ factorization determined by ?gelqf (unblocked
algorithm).
?ormr2/?unmr2 s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix

from a RQ factorization determined by ?gerqf (unblocked
algorithm).
?ormr3/?unmr3 s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix

from a RZ factorization determined by ?tzrzf (unblocked
algorithm).
?pbtf2 s, d, c, z Computes the Cholesky factorization of a symmetric/

Hermitian positive definite band matrix (unblocked
algorithm).
1400
Types
?potf2 s, d, c, z Computes the Cholesky factorization of a

symmetric/Hermitian positive definite matrix (unblocked
algorithm).
?ptts2 s, d, c, z Solves a tridiagonal system of the form AX=B using the L D

H
L factorization computed by ?pttrf.
?rscl s, d, cs, Multiplies a vector by the reciprocal of a real scalar.

zd
?sygs2/?hegs2 s, d/c, z Reduces a symmetric/Hermitian definite generalized

eigenproblem to standard form, using the factorization results
obtained from ?potrf (unblocked algorithm).
?sytd2/?hetd2 s, d/c, z Reduces a symmetric/Hermitian matrix to real symmetric

tridiagonal form by an orthogonal/unitary similarity
transformation (unblocked algorithm).
?sytf2 s, d, c, z Computes the factorization of a real/complex symmetric

indefinite matrix, using the diagonal pivoting method
(unblocked algorithm).
?hetf2 c, z Computes the factorization of a complex Hermitian matrix,

using the diagonal pivoting method (unblocked algorithm).
?tgex2 s, d, c, z Swaps adjacent diagonal blocks in an upper (quasi) triangular

matrix pair by an orthogonal/unitary equivalence
transformation.
?tgsy2 s, d, c, z Solves the generalized Sylvester equation (unblocked

algorithm).
?trti2 s, d, c, z Computes the inverse of a triangular matrix (unblocked

algorithm).
clag2z c→z Converts a complex single precision matrix to a complex

double precision matrix.
dlag2s d→s Converts a double precision matrix to a single precision

matrix.
1401

Types
slag2d s→d Converts a single precision matrix to a double precision

matrix.
zlag2c z→c Converts a complex double precision matrix to a complex

single precision matrix.
?lacgv
Conjugates a complex vector.
Syntax
call clacgv( n, x, incx )
call zlacgv( n, x, incx )
Description
This routine is declared in mkl_lapack.fi for FORTRAN 77 interface and in mkl_lapack.h
for C interface.
The routine conjugates a complex vector x of length n and increment incx (see “Vector
Arguments in BLAS” in Appendix B).
Input Parameters
n INTEGER. The length of the vector x (n ≥ 0).
x COMPLEX for clacgv
COMPLEX*16 for zlacgv.
Array, dimension (1+(n-1)* |incx|).
Contains the vector of length n to be conjugated.
incx INTEGER. The spacing between successive elements of x.
Output Parameters
x On exit, overwritten with conjg(x).
1402
?lacrm
Multiplies a complex matrix by a square real
matrix.
Syntax
call clacrm( m, n, a, lda, b, ldb, c, ldc, rwork )
call zlacrm( m, n, a, lda, b, ldb, c, ldc, rwork )
Description
for C interface.
The routine performs a simple matrix-matrix multiplication of the form
C = A*B,
where A is m-by-n and complex, B is n-by-n and real, C is m-by-n and complex.
Input Parameters
m INTEGER. The number of rows of the matrix A and of the
matrix C (m ≥ 0).
n INTEGER. The number of columns and rows of the matrix B
and the number of columns of the matrix C
(n ≥ 0).
a COMPLEX for clacrm
COMPLEX*16 for zlacrm
Array, DIMENSION (lda, n). Contains the m-by-n matrix
A.
lda INTEGER. The leading dimension of the array a, lda ≥
max(1, m).
b REAL for clacrm
DOUBLE PRECISION for zlacrm
Array, DIMENSION (ldb, n). Contains the n-by-n matrix
B.
1403
ldb INTEGER. The leading dimension of the array b, ldb ≥ max(1,

n).
ldc INTEGER. The leading dimension of the output array c, ldc
≥ max(1, n).
rwork REAL for clacrm
DOUBLE PRECISION for zlacrm
Workspace array, DIMENSION (2*m*n).
Output Parameters
c COMPLEX for clacrm
COMPLEX*16 for zlacrm
Array, DIMENSION (ldc, n). Contains the m-by-n matrix C.
?lacrt
Performs a linear transformation of a pair of
complex vectors.
Syntax
call clacrt( n, cx, incx, cy, incy, c, s )
call zlacrt( n, cx, incx, cy, incy, c, s )
Description
for C interface.
The routine performs the following transformation
where c, s are complex scalars and x, y are complex vectors.
1404
Input Parameters
n INTEGER. The number of elements in the vectors cx and cy
(n ≥ 0).
cx, cy COMPLEX for clacrt
COMPLEX*16 for zlacrt
Arrays, dimension (n).
Contain input vectors x and y, respectively.
incx INTEGER. The increment between successive elements of
cx.
incy INTEGER. The increment between successive elements of
cy.
c, s COMPLEX for clacrt
COMPLEX*16 for zlacrt
Complex scalars that define the transform matrix
Output Parameters
cx On exit, overwritten with c*x + s*y .
cy On exit, overwritten with -s*x + c*y .
?laesy
Computes the eigenvalues and eigenvectors of a
2-by-2 complex symmetric matrix, and checks that
the norm of the matrix of eigenvectors is larger
than a threshold value.
Syntax
call claesy( a, b, c, rt1, rt2, evscal, cs1, sn1 )
call zlaesy( a, b, c, rt1, rt2, evscal, cs1, sn1 )
1405
Description
for C interface.
The routine performs the eigendecomposition of a 2-by-2 symmetric matrix
provided the norm of the matrix of eigenvectors is larger than some threshold value.
rt1 is the eigenvalue of larger absolute value, and rt2 of smaller absolute value. If the
eigenvectors are computed, then on return (cs1, sn1) is the unit eigenvector for rt1, hence
Input Parameters
a, b, c COMPLEX for claesy
COMPLEX*16 for zlaesy
Elements of the input matrix.
Output Parameters
rt1, rt2 COMPLEX for claesy
Eigenvalues of larger and smaller modulus, respectively.
evscal COMPLEX for claesy
The complex value by which the eigenvector matrix was
scaled to make it orthonormal. If evscal is zero, the
eigenvectors were not computed. This means one of two
things: the 2-by-2 matrix could not be diagonalized, or the
norm of the matrix of eigenvectors before scaling was larger
than the threshold value thresh (set to 0.1E0).
1406
cs1, sn1 COMPLEX for claesy
If evscal is not zero, then (cs1, sn1) is the unit right
eigenvector for rt1.
?rot
Applies a plane rotation with real cosine and
complex sine to a pair of complex vectors.
Syntax
call crot( n, cx, incx, cy, incy, c, s )
call zrot( n, cx, incx, cy, incy, c, s )
Description
for C interface.
The routine applies a plane rotation, where the cosine (c) is real and the sine (s) is complex,
and the vectors cx and cy are complex. This routine has its real equivalents in BLAS (see ?rot
in Chapter 2).
Input Parameters
n INTEGER. The number of elements in the vectors cx and
cy.
cx, cy COMPLEX for crot
COMPLEX*16 for zrot
Arrays of dimension (n), contain input vectors x and y,
respectively.
cx.
cy.
c REAL for crot
DOUBLE PRECISION for zrot
s COMPLEX for crot
1407
COMPLEX*16 for zrot

Values that define a rotation
where c*c + s*conjg(s) = 1.0.
Output Parameters
cx On exit, overwritten with c*x + s*y.
cy On exit, overwritten with -conjg(s)*x + c*y.
?spmv
Computes a matrix-vector product for complex
vectors using a complex symmetric packed matrix.
Syntax
call cspmv( uplo, n, alpha, ap, x, incx, beta, y, incy )
call zspmv( uplo, n, alpha, ap, x, incx, beta, y, incy )
Description
for C interface.
The ?spmv routines perform a matrix-vector operation defined as
y := alpha*a*x + beta*y,
where:
alpha and beta are complex scalars,
x and y are n-element complex vectors
a is an n-by-n complex symmetric matrix, supplied in packed form.
These routines have their real equivalents in BLAS (see ?spmv in Chapter 2 ).
1408
Input Parameters
triangular part of the matrix a is supplied in the packed
array ap.
matrix a is supplied in the array ap.
If uplo = 'L' or 'l', the lower triangular part of the
matrix a is supplied in the array ap .
n INTEGER.
Specifies the order of the matrix a.
alpha, beta COMPLEX for cspmv
COMPLEX*16 for zspmv
Specify complex scalars alpha and beta. When beta is
supplied as zero, then y need not be set on input.
ap COMPLEX for cspmv
Array, DIMENSION at least ((n*(n + 1))/2). Before entry,
A(1, 1), ap(2) and ap(3) contain A(1, 2) and A(2, 2)
respectively, and so on. Before entry, with uplo = 'L' or
'l', the array ap must contain the lower triangular part of
the symmetric matrix packed sequentially,
column-by-column, so that ap(1) contains a(1, 1), ap(2)
so on.
x COMPLEX for cspmv
n-element vector x.
incx INTEGER. Specifies the increment for the elements of x. The
value of incx must not be zero.
y COMPLEX for cspmv
1409

n-element vector y.
Output Parameters
?spr
Performs the symmetrical rank-1 update of a
complex symmetric packed matrix.
Syntax
call cspr( uplo, n, alpha, x, incx, ap )
call zspr( uplo, n, alpha, x, incx, ap )
Description
for C interface.
The ?spr routines perform a matrix-vector operation defined as
a:= alpha*x*conjg(x') + a,
where:
alpha is a complex scalar
x is an n-element complex vector
a is an n-by-n complex symmetric matrix, supplied in packed form.
These routines have their real equivalents in BLAS (see ?spr in Chapter 2).
1410
Input Parameters
triangular part of the matrix a is supplied in the packed
array ap, as follows:
matrix a is supplied in the array ap.
If uplo = 'L' or 'l', the lower triangular part of the
matrix a is supplied in the array ap .
n INTEGER.
Specifies the order of the matrix a.
alpha COMPLEX for cspr
COMPLEX*16 for zspr
x COMPLEX for cspr
COMPLEX*16 for zspr
Array, DIMENSION at least (1 + (n - 1)*abs(incx)). Before
vector x.
ap COMPLEX for cspr
COMPLEX*16 for zspr
Array, DIMENSION at least ((n*(n + 1))/2). Before entry,
with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the symmetric matrix packed sequentially,
column-by-column, so that ap(1) contains A(1,1), ap(2)
and ap(3) contain A(1, 2) and A(2,2) respectively, and
so on.
Before entry, with uplo = 'L' or 'l', the array ap must
1411
Note that the imaginary parts of the diagonal elements need

not be set, they are assumed to be zero, and on exit they
are set to zero.
Output Parameters
?symv
Computes a matrix-vector product for a complex
symmetric matrix.
Syntax
call csymv( uplo, n, alpha, a, lda, x, incx, beta, y, incy )
call zsymv( uplo, n, alpha, a, lda, x, incx, beta, y, incy )
Description
for C interface.
The routine performs the matrix-vector operation defined as
y := alpha*a*x + beta*y,
where:
alpha and beta are complex scalars
x and y are n-element complex vectors
a is an n-by-n symmetric complex matrix.
These routines have their real equivalents in BLAS (see ?symv in Chapter 2).
Input Parameters
triangular part of the array a is used:
1412
array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the
array a is used.
alpha, beta COMPLEX for csymv
COMPLEX*16 for zsymv
Specify the scalars alpha and beta. When beta is supplied
as zero, then y need not be set on input.
a COMPLEX for csymv
Array, DIMENSION (lda, n). Before entry with uplo = 'U'
a must contain the upper triangular part of the symmetric
matrix and the strictly lower triangular part of a is not
referenced. Before entry with uplo = 'L' or 'l', the
leading n-by-n lower triangular part of the array a must
lda INTEGER. Specifies the first dimension of A as declared in
max(1,n).
x COMPLEX for csymv
n-element vector x.
y COMPLEX for csymv
n-element vector y.
1413

Output Parameters
?syr
Performs the symmetric rank-1 update of a
complex symmetric matrix.
Syntax
call csyr( uplo, n, alpha, x, incx, a, lda )
call zsyr( uplo, n, alpha, x, incx, a, lda )
Description
for C interface.
The routine performs the symmetric rank 1 operation defined as
a := alpha*x*x' + a,
where:
alpha is a complex scalar
x is an n-element complex vector
a is an n-by-n complex symmetric matrix.
These routines have their real equivalents in BLAS (see ?syr in Chapter 2).
Input Parameters
triangular part of the array a is used:
array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the
array a is used.
1414
alpha COMPLEX for csyr
COMPLEX*16 for zsyr
x COMPLEX for csyr
COMPLEX*16 for zsyr
n-element vector x.
a COMPLEX for csyr
COMPLEX*16 for zsyr
Array, DIMENSION (lda, n). Before entry with uplo =
'U' or 'u', the leading n-by-n upper triangular part of the
array a must contain the upper triangular part of the
symmetric matrix and the strictly lower triangular part of a
is not referenced.
max(1,n).
Output Parameters
updated matrix.
updated matrix.
1415
i?max1
Finds the index of the vector element whose real
part has maximum absolute value.
Syntax
index = icmax1( n, cx, incx )
index = izmax1( n, cx, incx )
Description
This function is declared in mkl_lapack.fi for FORTRAN 77 interface and in mkl_lapack.h
for C interface.
Given a complex vector cx, the i?max1 functions return the index of the vector element whose
real part has maximum absolute value. These functions are based on the BLAS functions ica-
max/izamax, but using the absolute value of the real part. They are designed for use with
clacon/zlacon.
Input Parameters
n INTEGER. Specifies the number of elements in the vector
cx.
cx COMPLEX for icmax1
COMPLEX*16 for izmax1
Contains the input vector.
incx INTEGER. Specifies the spacing between successive elements
of cx.
Output Parameters
index INTEGER. Contains the index of the vector element whose
real part has maximum absolute value.
1416
?sum1
Forms the 1-norm of the complex vector using the
true absolute value.
Syntax
res = scsum1( n, cx, incx )
res = dzsum1( n, cx, incx )
Description
for C interface.
Given a complex vector cx, scsum1/dzsum1 functions take the sum of the absolute values of
vector elements and return a single/DOUBLE PRECISION result, respectively. These functions
are based on scasum/dzasum from Level 1 BLAS, but use the true absolute value and were
designed for use with clacon/zlacon.
Input Parameters
n INTEGER. Specifies the number of elements in the vector
cx.
cx COMPLEX for scsum1
COMPLEX*16 for dzsum1
Contains the input vector whose elements will be summed.
incx INTEGER. Specifies the spacing between successive elements
of cx (incx > 0).
Output Parameters
res REAL for scsum1
DOUBLE PRECISION for dzsum1
Contains the sum of absolute values.
1417
?gbtf2
Computes the LU factorization of a general band
matrix using the unblocked version of the
algorithm.
Syntax
call sgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call dgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call cgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call zgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
Description
for C interface.
The routine forms the LU factorization of a general real/complex m-by-n band matrix A with kl
sub-diagonals and ku super-diagonals. The routine uses partial pivoting with row interchanges
and implements the unblocked version of the algorithm, calling Level 2 BLAS. See also ?gbtrf.
Input Parameters
A (kl ≥ 0).
of A (ku ≥ 0).
ab REAL for sgbtf2
DOUBLE PRECISION for dgbtf2
COMPLEX for cgbtf2
COMPLEX*16 for zgbtf2.
The array ab contains the matrix A in band storage (see
Matrix Arguments).
1418
ldab INTEGER. The first dimension of the array ab.
(ldab ≥ 2kl + ku +1)
Output Parameters
ab Overwritten by details of the factorization. The diagonal and
kl + ku super-diagonals of U are stored in the first 1 + kl
+ ku rows of ab. The multipliers used during the factorization
are stored in the next kl rows.
ipiv INTEGER.
Array, DIMENSION at least max(1,min(m,n)).
The pivot indices: row i was interchanged with row ipiv(i).
info INTEGER. If info =0, the execution is successful.
If info = i, uii is 0. The factorization has been completed,
but U is exactly singular. Division by 0 will occur if you use
the factor U for solving a system of linear equations.
?gebd2
Reduces a general matrix to bidiagonal form using
an unblocked algorithm.
Syntax
call sgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call dgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call cgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call zgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
Description
for C interface.
The routine reduces a general m-by-n matrix A to upper or lower bidiagonal form B by an
orthogonal (unitary) transformation: Q'*A*P = B
If m ≥ n, B is upper bidiagonal; if m < n, B is lower bidiagonal.
1419
The routine does not form the matrices Q and P explicitly, but represents them as products of
elementary reflectors. if m ≥ n,
Q = H(1)*H(2)*...*H(n), and P = G(1)*G(2)*...*G(n-1)

if m < n,
Q = H(1)*H(2)*...*H(m-1), and P = G(1)*G(2)*...*G(m)
Each H(i) and G(i) has the form
H(i) = I - tauq*v*v' and G(i) = I - taup*u*u'
where tauq and taup are scalars (real for sgebd2/dgebd2, complex for cgebd2/zgebd2), and
v and u are vectors (real for sgebd2/dgebd2, complex for cgebd2/zgebd2).
Input Parameters
a, work REAL for sgebd2
DOUBLE PRECISION for dgebd2
COMPLEX for cgebd2
COMPLEX*16 for zgebd2.
Arrays:
a(lda,*) contains the m-by-n general matrix A to be
reduced. The second dimension of a must be at least max(1,
n).
work(*) is a workspace array, the dimension of work must
be at least max(1, m, n).
Output Parameters
a if m ≥ n, the diagonal and first super-diagonal of a are
overwritten with the upper bidiagonal matrix B. Elements
below the diagonal, with the array tauq, represent the
reflectors, and elements above the first superdiagonal, with
the array taup, represent the orthogonal/unitary matrix p
as a product of elementary reflectors.
1420
if m < n, the diagonal and first sub-diagonal of a are
overwritten by the lower bidiagonal matrix B. Elements
below the first subdiagonal, with the array tauq, represent
the orthogonal/unitary matrix Q as a product of elementary
reflectors, and elements above the diagonal, with the array
taup, represent the orthogonal/unitary matrix p as a product
of elementary reflectors.
Contains the diagonal elements of the bidiagonal matrix B:
d(i) = a(i, i).
DOUBLE PRECISION for double-precision flavors. Array,
DIMENSION at least max(1, min(m, n) - 1).
Contains the off-diagonal elements of the bidiagonal matrix
B:
if m ≥ n, e(i) = a(i, i+1) for i = 1,2,..., n-1;
if m < n, e(i) = a(i+1, i) for i = 1,2,..., m-1.
tauq, taup REAL for sgebd2
DOUBLE PRECISION for dgebd2
COMPLEX for cgebd2
COMPLEX*16 for zgebd2.
Arrays, DIMENSION at least max (1, min(m, n)).
Contain scalar factors of the elementary reflectors which
represent orthogonal/unitary matrices Q and p, respectively.
info INTEGER.
1421
?gehd2
Reduces a general square matrix to upper
Hessenberg form using an unblocked algorithm.
Syntax
call sgehd2( n, ilo, ihi, a, lda, tau, work, info )
call dgehd2( n, ilo, ihi, a, lda, tau, work, info )
call cgehd2( n, ilo, ihi, a, lda, tau, work, info )
call zgehd2( n, ilo, ihi, a, lda, tau, work, info )
Description
for C interface.
The routine reduces a real/complex general matrix A to upper Hessenberg form H by an
orthogonal or unitary similarity transformation Q'*A*Q = H.
elementary reflectors.
Input Parameters
n INTEGER The order of the matrix A (n ≥ 0).
ilo, ihi INTEGER. It is assumed that A is already upper triangular
in rows and columns 1:ilo -1 and ihi+1:n.
If A has been output by ?gebal, then
ilo and ihi must contain the values returned by that
routine. Otherwise they should be set to ilo = 1 and ihi
= n. Constraint: 1 ≤ ilo ≤ ihi ≤ max(1, n).
a, work REAL for sgehd2
DOUBLE PRECISION for dgehd2
COMPLEX for cgehd2
COMPLEX*16 for zgehd2.
Arrays:
a (lda,*) contains the n-by-n matrix A to be reduced. The
1422
work (n) is a workspace array.
Output Parameters
a On exit, the upper triangle and the first subdiagonal of A
are overwritten with the upper Hessenberg matrix H and
the elements below the first subdiagonal, with the array
tau, represent the orthogonal/unitary matrix Q as a product
of elementary reflectors. See Application Notes below.
tau REAL for sgehd2
DOUBLE PRECISION for dgehd2
COMPLEX for cgehd2
COMPLEX*16 for zgehd2.
Array, DIMENSION at least max (1, n-1).
Contains the scalar factors of elementary reflectors. See
Application Notes below.
info INTEGER.
Application Notes
The matrix Q is represented as a product of (ihi - ilo) elementary reflectors
Q = H(ilo)*H(ilo +1)*...*H(ihi -1)

Each H(i) has the form
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i) = 0, v(i+1)
= 1 and v(ihi+1:n) = 0.
On exit, v(i+2:ihi) is stored in a(i+2:ihi, i) and tau in tau(i).
The contents of a are illustrated by the following example, with n = 7, ilo = 2 and ihi =
6:
1423
where a denotes an element of the original matrix A, h denotes a modified element of the upper
Hessenberg matrix H, and vi denotes an element of the vector defining H(i).
?gelq2
Computes the LQ factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgelq2( m, n, a, lda, tau, work, info )
call dgelq2( m, n, a, lda, tau, work, info )
call cgelq2( m, n, a, lda, tau, work, info )
call zgelq2( m, n, a, lda, tau, work, info )
Description
for C interface.
The routine computes an LQ factorization of a real/complex m-by-n matrix A as A = L*Q.
min(m, n) elementary reflectors :
Q = H(k) ... H(2) H(1) (or Q = H(k)' ... H(2)' H(1)' for complex flavors), where k
= min(m, n)
1424
H(i) = I - tau*v*v'
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1)
= 0 and v(i) = 1.
On exit, v(i+1:n) is stored in a(i, i+1:n).
Input Parameters
a, work REAL for sgelq2
DOUBLE PRECISION for dgelq2
COMPLEX for cgelq2
COMPLEX*16 for zgelq2.
Arrays: a(lda,*) contains the m-by-n matrix A. The second
work(m) is a workspace array.
Output Parameters
on exit, the elements on and below the diagonal of the array
a contain the m-by-min(n,m) lower trapezoidal matrix L (L
is lower triangular if n ≥ m); the elements above the
diagonal, with the array tau, represent the
orthogonal/unitary matrix Q as a product of min(n,m)
elementary reflectors.
tau REAL for sgelq2
DOUBLE PRECISION for dgelq2
COMPLEX for cgelq2
COMPLEX*16 for zgelq2.
Contains scalar factors of the elementary reflectors.
info INTEGER.
1425
?geql2
Computes the QL factorization of a general
Syntax
call sgeql2( m, n, a, lda, tau, work, info )
call dgeql2( m, n, a, lda, tau, work, info )
call cgeql2( m, n, a, lda, tau, work, info )
call zgeql2( m, n, a, lda, tau, work, info )
Description
for C interface.
The routine computes a QL factorization of a real/complex m-by-n matrix A as A = Q*L.
Q = H(k)* ... *H(2)*H(1), where k = min(m, n)

H(i) = I - tau*v*v'
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with
v(m-k+i+1:m) = 0 and v(m-k+i) = 1.
On exit, v(1:m-k+i-1) is stored in a(1:m-k+i-1, n-k+i).
Input Parameters
a, work REAL for sgeql2
DOUBLE PRECISION for dgeql2
COMPLEX for cgeql2
1426
COMPLEX*16 for zgeql2.
Arrays:
Output Parameters
on exit, if m ≥ n, the lower triangle of the subarray
a(m-n+1:m, 1:n) contains the n-by-n lower triangular
matrix L; if m < n, the elements on and below the (n-m)th
superdiagonal contain the m-by-n lower trapezoidal matrix
L; the remaining elements, with the array tau, represent
the orthogonal/unitary matrix Q as a product of elementary
reflectors.
tau REAL for sgeql2
DOUBLE PRECISION for dgeql2
COMPLEX for cgeql2
COMPLEX*16 for zgeql2.
info INTEGER.
1427
?geqr2
Computes the QR factorization of a general
Syntax
call sgeqr2( m, n, a, lda, tau, work, info )
call dgeqr2( m, n, a, lda, tau, work, info )
call cgeqr2( m, n, a, lda, tau, work, info )
call zgeqr2( m, n, a, lda, tau, work, info )
Description
for C interface.
The routine computes a QR factorization of a real/complex m-by-n matrix A as A = Q*R.
Q = H(1)*H(2)* ... *H(k), where k = min(m, n)

H(i) = I - tau*v*v'
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1)
= 0 and v(i) = 1.
On exit, v(i+1:m) is stored in a(i+1:m, i).
Input Parameters
a, work REAL for sgeqr2
DOUBLE PRECISION for dgeqr2
COMPLEX for cgeqr2
COMPLEX*16 for zgeqr2.
Arrays:
1428
work(n) is a workspace array.
Output Parameters
on exit, the elements on and above the diagonal of the array
a contain the min(n,m)-by-n upper trapezoidal matrix R (R
is upper triangular if m ≥ n); the elements below the
diagonal, with the array tau, represent the
reflectors.
tau REAL for sgeqr2
DOUBLE PRECISION for dgeqr2
COMPLEX for cgeqr2
COMPLEX*16 for zgeqr2.
info INTEGER.
?gerq2
Computes the RQ factorization of a general
Syntax
call sgerq2( m, n, a, lda, tau, work, info )
call dgerq2( m, n, a, lda, tau, work, info )
call cgerq2( m, n, a, lda, tau, work, info )
call zgerq2( m, n, a, lda, tau, work, info )
1429
Description
for C interface.
The routine computes a RQ factorization of a real/complex m-by-n matrix A as A = R*Q.
Q = H(1)*H(2)* ... *H(k), where k = min(m, n)

H(i) = I - tau*v*v'
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with
v(n-k+i+1:n) = 0 and v(n-k+i) = 1.
On exit, v(1:n-k+i-1) is stored in a(m-k+i, 1:n-k+i-1).
Input Parameters
a, work REAL for sgerq2
DOUBLE PRECISION for dgerq2
COMPLEX for cgerq2
COMPLEX*16 for zgerq2.
Arrays:
Output Parameters
on exit, if m ≤ n, the upper triangle of the subarray a(1:m,
n-m+1:n ) contains the m-by-m upper triangular matrix R;
if m > n, the elements on and above the (m-n)-th
subdiagonal contain the m-by-n upper trapezoidal matrix R;
1430
the remaining elements, with the array tau, represent the
reflectors.
tau REAL for sgerq2
DOUBLE PRECISION for dgerq2
COMPLEX for cgerq2
COMPLEX*16 for zgerq2.
info INTEGER.
?gesc2
Solves a system of linear equations using the LU
factorization with complete pivoting computed by
?getc2.
Syntax
call sgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call dgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call cgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call zgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
Description
for C interface.
The routine solves a system of linear equations
A*X = scale*RHS
with a general n-by-n matrix A using the LU factorization with complete pivoting computed by
?getc2.
1431
Input Parameters
n INTEGER. The order of the matrix A.
a, rhs REAL for sgesc2
DOUBLE PRECISION for dgesc2
COMPLEX for cgesc2
COMPLEX*16 for zgesc2.
Arrays:
a(lda,*) contains the LU part of the factorization of the
n-by-n matrix A computed by ?getc2:
A = P*L*U*Q.
The second dimension of a must be at least max(1, n);
rhs(n) contains on entry the right hand side vector for the
system of equations.
ipiv INTEGER.
Array, DIMENSION at least max(1,n).
The pivot indices: for 1 ≤ i ≤ n, row i of the matrix has
been interchanged with row ipiv(i).
jpiv INTEGER.
The pivot indices: for 1 ≤ j ≤ n, column j of the matrix
has been interchanged with column jpiv(j).
Output Parameters
rhs On exit, overwritten with the solution vector X.
scale REAL for sgesc2/cgesc2
DOUBLE PRECISION for dgesc2/zgesc2
Contains the scale factor. scale is chosen in the range 0 ≤
scale ≤ 1 to prevent overflow in the solution.
1432
?getc2
Computes the LU factorization with complete
pivoting of the general n-by-n matrix.
Syntax
call sgetc2( n, a, lda, ipiv, jpiv, info )
call dgetc2( n, a, lda, ipiv, jpiv, info )
call cgetc2( n, a, lda, ipiv, jpiv, info )
call zgetc2( n, a, lda, ipiv, jpiv, info )
Description
for C interface.
The routine computes an LU factorization with complete pivoting of the n-by-n matrix A. The
factorization has the form A = P*L*U*Q, where P and Q are permutation matrices, L is lower
triangular with unit diagonal elements and U is upper triangular.
The LU factorization computed by this routine is used by ?latdf to compute a contribution to

the reciprocal Dif-estimate.
Input Parameters
a REAL for sgetc2
DOUBLE PRECISION for dgetc2
COMPLEX for cgetc2
COMPLEX*16 for zgetc2.
Array a(lda,*) contains the n-by-n matrix A to be factored.
The second dimension of a must be at least max(1, n);
1433
Output Parameters
a On exit, the factors L and U from the factorization A =
P*L*U*Q; the unit diagonal elements of L are not stored. If
U(k, k) appears to be less than smin, U(k, k) is given the
value of smin, that is giving a nonsingular perturbed system.
ipiv INTEGER.
The pivot indices: for 1 ≤ i ≤ n , row i of the matrix has
jpiv INTEGER.
The pivot indices: for 1 ≤ j ≤ n , column j of the matrix
has been interchanged with column jpiv(j).
info INTEGER.
If info = k >0, U(k, k) is likely to produce overflow if we
try to solve for x in A*x = b. So U is perturbed to avoid the
overflow.
?getf2
matrix using partial pivoting with row interchanges
Syntax
call sgetf2( m, n, a, lda, ipiv, info )
call dgetf2( m, n, a, lda, ipiv, info )
call cgetf2( m, n, a, lda, ipiv, info )
call zgetf2( m, n, a, lda, ipiv, info )
Description
for C interface.
1434
The routine computes the LU factorization of a general m-by-n matrix A using partial pivoting
with row interchanges. The factorization has the form
A = P*L*U
where p is a permutation matrix, L is lower triangular with unit diagonal elements (lower
trapezoidal if m > n) and U is upper triangular (upper trapezoidal if m < n).
Input Parameters
a REAL for sgetf2
DOUBLE PRECISION for dgetf2
COMPLEX for cgetf2
COMPLEX*16 for zgetf2.
Array, DIMENSION (lda,*). Contains the matrix A to be
factored. The second dimension of a must be at least max(1,
n).
Output Parameters
a Overwritten by L and U. The unit diagonal elements of L are
not stored.
ipiv INTEGER.
Array, DIMENSION at least max(1,min(m,n)).
The pivot indices: for 1 ≤ i ≤ n , row i was interchanged
with row ipiv(i).
If info = i >0, uii is 0. The factorization has been
completed, but U is exactly singular. Division by 0 will occur
if you use the factor U for solving a system of linear
equations.
1435
?gtts2
tridiagonal matrix using the LU factorization
computed by ?gttrf.
Syntax
call sgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call dgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call cgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call zgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
Description
for C interface.
The routine solves for X one of the following systems of linear equations with multiple right
hand sides:
A*X = B, AT*X = B, or AH*X = B (for complex matrices only), with a tridiagonal matrix A
using the LU factorization computed by ?gttrf.
Input Parameters
itrans INTEGER. Must be 0, 1, or 2.
Indicates the form of the equations to be solved:
If itrans = 0, then A*X = B (no transpose).
If itrans = 1, then AT*X = B (transpose).
If itrans = 2, then AH*X = B (conjugate transpose).
nrhs INTEGER. The number of right-hand sides, i.e., the number
of columns in B (nrhs ≥ 0).
dl,d,du,du2,b REAL for sgtts2
DOUBLE PRECISION for dgtts2
COMPLEX for cgtts2
COMPLEX*16 for zgtts2.
1436
Arrays: dl(n - 1), d(n ), du(n - 1), du2(n - 2),
b(ldb,nrhs).
The array dl contains the (n - 1) multipliers that define the
matrix L from the LU factorization of A.
The array d contains the n diagonal elements of the upper
The array du contains the (n - 1) elements of the first
super-diagonal of U.
The array du2 contains the (n - 2) elements of the second
The array b contains the matrix B whose columns are the
right-hand sides for the systems of equations.
ldb INTEGER. The leading dimension of b; must be ldb ≥
max(1, n).
ipiv INTEGER.
The pivot indices array, as returned by ?gttrf.
Output Parameters
?isnan
Tests input for NaN.
Syntax
val = sisnan( sin )
val = disnan( din )
Description
for C interface.
This logical routine returns .TRUE. if its argument is NaN, and .FALSE. otherwise.
1437
Input Parameters
sin REAL for sisnan
Input to test for NaN.
din DOUBLE PRECISION for disnan
Input to test for NaN.
Output Parameters
val Logical. Result of the test.
?laisnan
Tests input for NaN.
Syntax
val = slaisnan( sin1, sin2 )
val = dlaisnan( din1, din2 )
Description
for C interface.
This logical routine checks for NaNs (NaN stands for 'Not A Number') by comparing its two
arguments for inequality. NaN is the only floating-point value where NaN ≠ NaN returns .TRUE.
To check for NaNs, pass the same variable as both arguments.
This routine is not for general use. It exists solely to avoid over-optimization in ?isnan.
Input Parameters
sin1, sin2 REAL for sisnan
Two numbers to compare for inequality.
din2, din2 DOUBLE PRECISION for disnan
Two numbers to compare for inequality.
Output Parameters
val Logical. Result of the comparison.
1438
?labrd
Reduces the first nb rows and columns of a general
matrix to a bidiagonal form.
Syntax
call slabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call dlabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call clabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call zlabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
Description
for C interface.
The routine reduces the first nb rows and columns of a general m-by-n matrix A to upper or
lower bidiagonal form by an orthogonal/unitary transformation Q'*A*P, and returns the matrices
X and Y which are needed to apply the transformation to the unreduced part of A.
if m ≥ n, A is reduced to upper bidiagonal form; if m < n, to lower bidiagonal form.
The matrices Q and P are represented as products of elementary reflectors: Q = H(1)*(2)*

...*H(nb), and P = G(1)*G(2)* ...*G(nb)
Each H(i) and G(i) has the form
H(i) = I - tauq*v*v' and G(i) = I - taup*u*u'

where tauq and taup are scalars, and v and u are vectors.
The elements of the vectors v and u together form the m-by-nb matrix V and the nb-by-n matrix
U' which are needed, with X and Y, to apply the transformation to the unreduced part of the
matrix, using a block update of the form: A := A - V*Y' - X*U'.
This is an auxiliary routine called by ?gebrd.
Input Parameters
1439
nb INTEGER. The number of leading rows and columns of A to

be reduced.
a REAL for slabrd
DOUBLE PRECISION for dlabrd
COMPLEX for clabrd
COMPLEX*16 for zlabrd.
Array a(lda,*) contains the matrix A to be reduced. The
ldx INTEGER. The first dimension of the output array x; must
beat least max(1, m).
ldy INTEGER. The first dimension of the output array y; must
beat least max(1, n).
Output Parameters
a On exit, the first nb rows and columns of the matrix are
overwritten; the rest of the array is unchanged.
if m ≥ n, elements on and below the diagonal in the first nb
columns, with the array tauq, represent the
reflectors; and elements above the diagonal in the first nb
rows, with the array taup, represent the orthogonal/unitary
matrix p as a product of elementary reflectors.
if m < n, elements below the diagonal in the first nb
reflectors, and elements on and above the diagonal in the
first nb rows, with the array taup, represent the
orthogonal/unitary matrix p as a product of elementary
reflectors.
d, e REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors. Arrays,
DIMENSION (nb) each. The array d contains the diagonal
elements of the first nb rows and columns of the reduced
matrix:
d(i) = a(i,i).
1440
The array e contains the off-diagonal elements of the first
nb rows and columns of the reduced matrix.
tauq, taup REAL for slabrd
COMPLEX for clabrd
Arrays, DIMENSION (nb) each. Contain scalar factors of the
elementary reflectors which represent the orthogonal/unitary
matrices Q and P, respectively.
x, y REAL for slabrd
COMPLEX for clabrd
Arrays, dimension x(ldx, nb), y(ldy, nb).
The array x contains the m-by-nb matrix X required to update
the unreduced part of A.
The array y contains the n-by-nb matrix Y required to update
Application Notes
if m ≥ n, then for the elementary reflectors H(i) and G(i),
v(1:i-1) = 0, v(i) = 1, and v(i:m) is stored on exit in a(i:m, i); u(1:i) = 0, u(i+1)
= 1, and u(i+1:n) is stored on exit in a(i, i+1:n);
tauq is stored in tauq(i) and taup in taup(i).
if m < n,
v(1:i) = 0, v(i+1) = 1, and v(i+1:m) is stored on exit in a(i+2:m, i) ; u(1:i-1) = 0,
u(i) = 1, and u(i:n) is stored on exit in a(i, i+1:n); tauq is stored in tauq(i) and taup
in taup(i).
The contents of a on exit are illustrated by the following examples with nb = 2:
1441
where a denotes an element of the original matrix which is unchanged, vi denotes an element
of the vector defining H(i), and ui an element of the vector defining G(i).
?lacn2
Estimates the 1-norm of a square matrix, using
reverse communication for evaluating matrix-vector
products.
Syntax
call slacn2( n, v, x, isgn, est, kase, isave )
call dlacn2( n, v, x, isgn, est, kase, isave )
call clacn2( n, v, x, est, kase, isave )
call zlacn2( n, v, x, est, kase, isave )
Description
for C interface.
The routine estimates the 1-norm of a square, real or complex matrix A. Reverse communication
is used for evaluating matrix-vector products.
1442
Input Parameters
v, x REAL for slacn2
DOUBLE PRECISION for dlacn2
COMPLEX for clacn2
COMPLEX*16 for zlacn2.
Arrays, DIMENSION (n) each.
v is a workspace array.
x is used as input after an intermediate return.
isgn INTEGER.
Workspace array, DIMENSION (n), used with real flavors
only.
est REAL for slacn2/clacn2
DOUBLE PRECISION for dlacn2/zlacn2
On entry with kase set to 1 or 2, and isave(1) = 1, est
must be unchanged from the previous call to the routine.
kase INTEGER.
On the initial call to the routine, kase must be set to 0.
isave INTEGER. Array, DIMENSION (3).
Contains variables from the previous call to the routine.
Output Parameters
est An estimate (a lower bound) for norm(A).
kase On an intermediate return, kase is set to 1 or 2, indicating
whether x is overwritten by A*x or A'*x.
On the final return, kase is set to 0.
v On the final return, v = A*w, where est =
norm(v)/norm(w) (w is not returned).
x On an intermediate return, x is overwritten by
A*x , if kase = 1,
A'*x , if kase = 2,
(A' is the conjugate transpose of A), and the routine must
be re-called with all the other parameters unchanged.
isave This parameter is used to save variables between calls to
the routine.
1443
?lacon
reverse communication for evaluating matrix-vector
products.
Syntax
call slacon( n, v, x, isgn, est, kase )
call dlacon( n, v, x, isgn, est, kase )
call clacon( n, v, x, est, kase )
call zlacon( n, v, x, est, kase )
Description
for C interface.
The routine estimates the 1-norm of a square, real/complex matrix A. Reverse communication
is used for evaluating matrix-vector products.
WARNING. The ?lacon routine is not thread-safe. It is deprecated and retained for
the backward compatibility only. Use the thread-safe ?lacn2 routine instead.
Input Parameters
v, x REAL for slacon
DOUBLE PRECISION for dlacon
COMPLEX for clacon
COMPLEX*16 for zlacon.
Arrays, DIMENSION (n) each.
v is a workspace array.
x is used as input after an intermediate return.
isgn INTEGER.
Workspace array, DIMENSION (n), used with real flavors
only.
1444
est REAL for slacon/clacon
DOUBLE PRECISION for dlacon/zlacon
An estimate that with kase=1 or 2 should be unchanged
from the previous call to ?lacon.
kase INTEGER.
On the initial call to ?lacon, kase should be 0.
Output Parameters
est REAL for slacon/clacon
DOUBLE PRECISION for dlacon/zlacon
An estimate (a lower bound) for norm(A).
kase On an intermediate return, kase will be 1 or 2, indicating
whether x should be overwritten by A*x or A'*x. On the
final return from ?lacon, kase will again be 0.
v On the final return, v = A*w, where est =
norm(v)/norm(w) (w is not returned).
x On an intermediate return, x should be overwritten by
A*x , if kase = 1,
A'*x , if kase = 2,
(where for complex flavors A' is the conjugate transpose
of A), and ?lacon must be re-called with all the other
parameters unchanged.
?lacpy
Copies all or part of one two-dimensional array to
another.
Syntax
call slacpy( uplo, m, n, a, lda, b, ldb )
call dlacpy( uplo, m, n, a, lda, b, ldb )
call clacpy( uplo, m, n, a, lda, b, ldb )
call zlacpy( uplo, m, n, a, lda, b, ldb )
1445
Description
for C interface.
The routine copies all or part of a two-dimensional matrix A to another matrix B.
Input Parameters
uplo CHARACTER*1.
Specifies the part of the matrix A to be copied to B.
If uplo = 'U', the upper triangular part of ;
if uplo = 'L', the lower triangular part of A.
Otherwise, all of the matrix A is copied.
a REAL for slacpy
DOUBLE PRECISION for dlacpy
COMPLEX for clacpy
COMPLEX*16 for zlacpy.
Array a(lda,*), contains the m-by-n matrix A.
If uplo = 'U', only the upper triangle or trapezoid is
accessed; if uplo = 'L', only the lower triangle or
trapezoid is accessed.
lda INTEGER. The first dimension of a; lda ≥ max(1, m).
ldb INTEGER. The first dimension of the output array b; ldb ≥
max(1, m).
Output Parameters
b REAL for slacpy
DOUBLE PRECISION for dlacpy
COMPLEX for clacpy
COMPLEX*16 for zlacpy.
Array b(ldb,*), contains the m-by-n matrix B.
The second dimension of b must be at least max(1,n).
On exit, B = A in the locations specified by uplo.
1446
?ladiv
Performs complex division in real arithmetic,
avoiding unnecessary overflow.
Syntax
call sladiv( a, b, c, d, p, q )
call dladiv( a, b, c, d, p, q )
res = cladiv( x, y )
res = zladiv( x, y )
Description
for C interface.
The routines sladiv/dladiv perform complex division in real arithmetic as
Complex functions cladiv/zladiv compute the result as

res = x/y,
where x and y are complex. The computation of x / y will not overflow on an intermediary step
unless the results overflows.
Input Parameters
a, b, c, d REAL for sladiv
DOUBLE PRECISION for dladiv
The scalars a, b, c, and d in the above expression (for real
flavors only).
x, y COMPLEX for cladiv
COMPLEX*16 for zladiv
The complex scalars x and y (for complex flavors only).
1447
Output Parameters
p, q REAL for sladiv
DOUBLE PRECISION for dladiv
The scalars p and q in the above expression (for real flavors
only).
res COMPLEX for cladiv
DOUBLE COMPLEX for zladiv
Contains the result of division x / y.
?lae2
Computes the eigenvalues of a 2-by-2 symmetric
matrix.
Syntax
call slae2( a, b, c, rt1, rt2 )
call dlae2( a, b, c, rt1, rt2 )
Description
for C interface.
The routines sla2/dlae2 compute the eigenvalues of a 2-by-2 symmetric matrix
On return, rt1 is the eigenvalue of larger absolute value, and rt1 is the eigenvalue of smaller
absolute value.
Input Parameters
a, b, c REAL for slae2
DOUBLE PRECISION for dlae2
The elements a, b, and c of the 2-by-2 matrix above.
1448
Output Parameters
rt1, rt2 REAL for slae2
DOUBLE PRECISION for dlae2
The computed eigenvalues of larger and smaller absolute
value, respectively.
Application Notes
rt1 is accurate to a few ulps barring over/underflow. rt2 may be inaccurate if there is massive
cancellation in the determinant a*c-b*b; higher precision or correctly rounded or correctly
truncated arithmetic would be needed to compute rt2 accurately in all cases.
Overflow is possible only if rt1 is within a factor of 5 of overflow. Underflow is harmless if the
input data is 0 or exceeds
underflow_threshold / macheps.
?laebz
Computes the number of eigenvalues of a real
symmetric tridiagonal matrix which are less than
or equal to a given value, and performs other tasks
required by the routine ?stebz.
Syntax
call slaebz( ijob, nitmax, n, mmax, minp, nbmin, abstol, reltol, pivmin, d,
e, e2, nval, ab, c, mout, nab, work, iwork, info )
call dlaebz( ijob, nitmax, n, mmax, minp, nbmin, abstol, reltol, pivmin, d,
e, e2, nval, ab, c, mout, nab, work, iwork, info )
Description
for C interface.
The routine ?laebz contains the iteration loops which compute and use the function n(w), which
is the count of eigenvalues of a symmetric tridiagonal matrix T less than or equal to its argument
w. It performs a choice of two types of loops:
ijob =1, followed by
1449
ijob =2: It takes as input a list of intervals and returns a list of sufficiently
small intervals whose union contains the same eigenvalues as the union
of the original intervals. The input intervals are (ab(j,1),ab(j,2)],
j=1,...,minp. The output interval (ab(j,1),ab(j,2)] will contain
eigenvalues nab(j,1)+1,...,nab(j,2), where 1 ≤ j ≤ mout.
ijob =3: It performs a binary search in each input interval
(ab(j,1),ab(j,2)] for a point w(j) such that n(w(j))=nval(j),
and uses c(j) as the starting point of the search. If such a w(j) is found,
then on output ab(j,1)=ab(j,2)=w. If no such w(j) is found, then on
output (ab(j,1),ab(j,2)] will be a small interval containing the point
where n(w) jumps through nval(j), unless that point lies outside the
initial interval.
Note that the intervals are in all cases half-open intervals, that is, of the form (a,b], which
includes b but not a .
To avoid underflow, the matrix should be scaled so that its largest element is no greater than
overflow1/2 * overflow1/4 in absolute value. To assure the most accurate computation of
small eigenvalues, the matrix should be scaled to be not much smaller than that, either.
NOTE. In general, the arguments are not checked for unreasonable values.
Input Parameters
ijob INTEGER. Specifies what is to be done:
= 1: Compute nab for the initial intervals.
= 2: Perform bisection iteration to find eigenvalues of T.
= 3: Perform bisection iteration to invert n(w), i.e., to find
a point which has a specified number of eigenvalues of T to
its left. Other values will cause ?laebz to return with
info=-1.
nitmax INTEGER. The maximum number of "levels" of bisection to
be performed, i.e., an interval of width W will not be made
smaller than 2-nitmax*W. If not all intervals have converged
after nitmax iterations, then info is set to the number of
non-converged intervals.
1450
n INTEGER. The dimension n of the tridiagonal matrix T. It
must be at least 1.
mmax INTEGER. The maximum number of intervals. If more than
mmax intervals are generated, then ?laebz will quit with
info=mmax+1.
minp INTEGER. The initial number of intervals. It may not be
greater than mmax.
nbmin INTEGER. The smallest number of intervals that should be
processed using a vector loop. If zero, then only the scalar
loop will be used.
abstol REAL for slaebz
DOUBLE PRECISION for dlaebz.
The minimum (absolute) width of an interval. When an
interval is narrower than abstol, or than reltol times the
larger (in magnitude) endpoint, then it is considered to be
sufficiently small, i.e., converged. This must be at least zero.
reltol REAL for slaebz
The minimum relative width of an interval. When an interval
is narrower than abstol, or than reltol times the larger
(in magnitude) endpoint, then it is considered to be
sufficiently small, i.e., converged. Note: this should always
be at least radix*machine epsilon.
pivmin REAL for slaebz
The minimum absolute value of a "pivot" in the Sturm
sequence loop. This value must be at least (max
|e(j)**2|*safe_min) and at least safe_min, where
safe_min is at least the smallest number that can divide
one without overflow.
d, e, e2 REAL for slaebz
Arrays, dimension (n) each. The array d contains the
diagonal elements of the tridiagonal matrix T.
The array e contains the off-diagonal elements of the
tridiagonal matrix T in positions 1 through n-1. e(n)vis
arbitrary.
1451
The array e2 contains the squares of the off-diagonal

elements of the tridiagonal matrix T. e2(n) is ignored.
nval INTEGER.
Array, dimension (minp).
If ijob=1 or 2, not referenced.
If ijob=3, the desired values of n(w).
ab REAL for slaebz
Array, dimension (mmax,2) The endpoints of the intervals.
ab(j,1) is a(j), the left endpoint of the j-th interval, and
ab(j,2) is b(j), the right endpoint of the j-th interval.
c REAL for slaebz
Array, dimension (mmax)
If ijob=1, ignored.
If ijob=2, workspace.
If ijob=3, then on input c(j) should be initialized to the
first search point in the binary search.
nab INTEGER.
Array, dimension (mmax,2)
If ijob=2, then on input, nab(i,j) should be set. It must
satisfy the condition:
n(ab(i,1)) ≤ nab(i,1) ≤ nab(i,2) ≤ n(ab(i,2)),
which means that in interval i only eigenvalues
nab(i,1)+1,...,nab(i,2) are considered. Usually,
nab(i,j)=n(ab(i,j)), from a previous call to ?laebz
with ijob=1.
If ijob=3, normally, nab should be set to some distinctive
value(s) before ?laebz is called.
work REAL for slaebz
Workspace array, dimension (mmax).
iwork INTEGER.
Workspace array, dimension (mmax).
1452
Output Parameters
nval The elements of nval will be reordered to correspond with
the intervals in ab. Thus, nval(j) on output will not, in
general be the same as nval(j) on input, but it will
correspond with the interval (ab(j,1),ab(j,2)] on output.
ab The input intervals will, in general, be modified, split, and
reordered by the calculation.
mout INTEGER.
If ijob=1, the number of eigenvalues in the intervals.
If ijob=2 or 3, the number of intervals output.
If ijob=3, mout will equal minp.
nab If ijob=1, then on output nab(i,j) will be set to N(ab(i,j)).
If ijob=2, then on output, nab(i,j) will contain max(na(k,
min(nb(k), N(ab(i,j)))), where k is the index of the input
interval that the output interval (ab(j,1),ab(j,2)] came from,
and na(k) and nb(k) are the input values of nab(k,1) and
nab(k,2).
If ijob=3, then on output, nab(i,j) contains N(ab(i,j)), unless
N(w) > nval(i) for all search points w, in which case nab(i,1)
will not be modified, i.e., the output value will be the same
as the input value (modulo reorderings, see nval and ab),
or unless N(w) < nval(i) for all search points w, in which
case nab(i,2) will not be modified.
info INTEGER.
If info = 0 - all intervals converged
If info = 1--mmax - the last info interval did not
converge.
If info = mmax+1 - more than mmax intervals were
generated
Application Notes
This routine is intended to be called only by other LAPACK routines, thus the interface is less
user-friendly. It is intended for two purposes:
(a) finding eigenvalues. In this case, ?laebz should have one or more initial intervals set up
in ab, and ?laebz should be called with ijob=1. This sets up nab, and also counts the
eigenvalues. Intervals with no eigenvalues would usually be thrown out at this point. Also, if
1453
not all the eigenvalues in an interval i are desired, nab(i,1) can be increased or nab(i,2)
decreased. For example, set nab(i,1)=nab(i,2)-1 to get the largest eigenvalue. ?laebz is then
called with ijob=2 and mmax no smaller than the value of mout returned by the call with ijob=1.
After this (ijob=2) call, eigenvalues nab(i,1)+1 through nab(i,2) are approximately ab(i,1) (or
ab(i,2)) to the tolerance specified by abstol and reltol.
(b) finding an interval (a',b'] containing eigenvalues w(f),...,w(l). In this case, start with a
Gershgorin interval (a,b). Set up ab to contain 2 search intervals, both initially (a,b). One nval
element should contain f-1 and the other should contain l, while c should contain a and b,
respectively. nab(i,1) should be -1 and nab(i,2) should be n+1, to flag an error if the desired
interval does not lie in (a,b). ?laebz is then called with ijob=3. On exit, if w(f-1) < w(f),
then one of the intervals -- j -- will have ab(j,1)=ab(j,2) and nab(j,1)=nab(j,2)=f-1,
while if, to the specified tolerance, w(f-k)=...=w(f+r), k > 0 and r ≥ 0, then the interval
will have n(ab(j,1))=nab(j,1)=f-k and n(ab(j,2))=nab(j,2)=f+r. The cases w(l) <
w(l+1) and w(l-r)=...=w(l+k) are handled similarly.
?laed0
Used by ?stedc. Computes all eigenvalues and
corresponding eigenvectors of an unreduced
symmetric tridiagonal matrix using the divide and
conquer method.
Syntax
call slaed0( icompq, qsiz, n, d, e, q, ldq, qstore, ldqs, work, iwork, info
)
call dlaed0( icompq, qsiz, n, d, e, q, ldq, qstore, ldqs, work, iwork, info
)
call claed0( qsiz, n, d, e, q, ldq, qstore, ldqs, rwork, iwork, info )
call zlaed0( qsiz, n, d, e, q, ldq, qstore, ldqs, rwork, iwork, info )
Description
for C interface.
Real flavors of this routine compute all eigenvalues and (optionally) corresponding eigenvectors
of a symmetric tridiagonal matrix using the divide and conquer method.
1454
Complex flavors claed0/zlaed0 compute all eigenvalues of a symmetric tridiagonal matrix
which is one diagonal block of those from reducing a dense or band Hermitian matrix and
corresponding eigenvectors of the dense or band matrix.
Input Parameters
icompq INTEGER. Used with real flavors only.
If icompq = 0, compute eigenvalues only.
If icompq = 1, compute eigenvectors of original dense
symmetric matrix also.
On entry, the array q must contain the orthogonal matrix
used to reduce the original matrix to tridiagonal form.
If icompq = 2, compute eigenvalues and eigenvectors of
the tridiagonal matrix.
qsiz INTEGER.
The dimension of the orthogonal/unitary matrix used to
reduce the full matrix to tridiagonal form; qsiz ≥ n (for
real flavors, qsiz ≥ n if icompq = 1).
n INTEGER. The dimension of the symmetric tridiagonal matrix
(n ≥ 0).
d, e, rwork REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors. Arrays:
d(*) contains the main diagonal of the tridiagonal matrix.
e(*) contains the off-diagonal elements of the tridiagonal
matrix. The dimension of e must be at least max(1, n-1).
rwork(*) is a workspace array used in complex flavors only.
The dimension of rwork must be at least (1
+3n+2nlg(n)+3n2), where lg(n) = smallest integer k such
that 2k ≥ n.
q, qstore REAL for slaed0
DOUBLE PRECISION for dlaed0
COMPLEX for claed0
COMPLEX*16 for zlaed0.
Arrays: q(ldq, *), qstore(ldqs, *). The second dimension
of these arrays must be at least max(1, n).
For real flavors:
1455
If icompq = 0, array q is not referenced.

If icompq = 1, on entry, q is a subset of the columns of
the orthogonal matrix used to reduce the full matrix to
tridiagonal form corresponding to the subset of the full
matrix which is being decomposed at this time.
If icompq = 2, on entry, q will be the identity matrix. The
array qstore is a workspace array referenced only when
icompq = 1. Used to store parts of the eigenvector matrix
when the updating matrix multiplies take place.
On entry, q must contain an qsiz-by-n matrix whose
columns are unitarily orthonormal. It is a part of the unitary
matrix that reduces the full dense Hermitian matrix to a
(reducible) symmetric tridiagonal matrix. The array qstore
is a workspace array used to store parts of the eigenvector
matrix when the updating matrix multiplies take place.
ldq INTEGER. The first dimension of the array q; ldq ≥ max(1,
n).
ldqs INTEGER. The first dimension of the array qstore; ldqs ≥
max(1, n).
work REAL for slaed0
DOUBLE PRECISION for dlaed0.
Workspace array, used in real flavors only.
If icompq = 0 or 1, the dimension of work must be at
least (1 +3n+2nlg(n)+2n2), where lg(n) = smallest
integer k such that 2k ≥ n.
If icompq = 2, the dimension of work must be at least
(4n+n2).
iwork INTEGER.
Workspace array.
For real flavors, if icompq = 0 or 1, and for complex flavors,
the dimension of iwork must be at least (6+6n+5nlg(n)).
For real flavors, if icompq = 2, the dimension of iwork
must be at least (3+5n).
1456
Output Parameters
d On exit, contains eigenvalues in ascending order.
e On exit, the array is destroyed.
q If icompq = 2, on exit, q contains the eigenvectors of the
tridiagonal matrix.
info INTEGER.
If info = i >0, the algorithm failed to compute an
eigenvalue while working on the submatrix lying in rows
and columns i/(n+1) through mod(i, n+1).
?laed1
Used by sstedc/dstedc. Computes the updated
eigensystem of a diagonal matrix after modification
by a rank-one symmetric matrix. Used when the
original matrix is tridiagonal.
Syntax
call slaed1( n, d, q, ldq, indxq, rho, cutpnt, work, iwork, info )
call dlaed1( n, d, q, ldq, indxq, rho, cutpnt, work, iwork, info )
Description
for C interface.
The routine ?laed1 computes the updated eigensystem of a diagonal matrix after modification
by a rank-one symmetric matrix. This routine is used only for the eigenproblem which requires
all eigenvalues and eigenvectors of a tridiagonal matrix. ?laed7 handles the case in which
eigenvalues only or eigenvalues and eigenvectors of a full symmetric matrix (which was reduced
to tridiagonal form) are desired.
T = Q(in)*(D(in)+ rho*Z*Z')*Q'(in) = Q(out)*D(out)*Q'(out)
where Z = Q'u, u is a vector of length n with ones in the cutpnt and (cutpnt+1) -th elements
and zeros elsewhere. The eigenvectors of the original matrix are stored in Q, and the eigenvalues
are in D. The algorithm consists of three stages:
1457
The first stage consists of deflating the size of the problem when there are multiple eigenvalues
or if there is a zero in the z vector. For each such occurrence the dimension of the secular
equation problem is reduced by one. This stage is performed by the routine ?laed2.
The second stage consists of calculating the updated eigenvalues. This is done by finding the
roots of the secular equation via the routine ?laed4 (as called by ?laed3). This routine also
calculates the eigenvectors of the current problem.
The final stage consists of computing the updated eigenvectors directly using the updated
eigenvalues. The eigenvectors for the current problem are multiplied with the eigenvectors
from the overall problem.
Input Parameters
(n ≥ 0).
d, q, work REAL for slaed1
Arrays:
d(*) contains the eigenvalues of the rank-1-perturbed
matrix. The dimension of d must be at least max(1, n).
q(ldq, *) contains the eigenvectors of the
rank-1-perturbed matrix. The second dimension of q must
be at least max(1, n).
work(*) is a workspace array, dimension at least (4n+n2).
n).
indxq INTEGER. Array, dimension (n).
On entry, the permutation which separately sorts the two
subproblems in d into ascending order.
rho REAL for slaed1
The subdiagonal entry used to create the rank-1
modification. This parameter can be modified by ?laed2,
where it is input/output.
cutpnt INTEGER.
The location of the last eigenvalue in the leading sub-matrix.
min(1,n) ≤ cutpnt ≤ n/2.
1458
iwork INTEGER.
Workspace array, dimension (4n).
Output Parameters
d On exit, contains the eigenvalues of the repaired matrix.
q On exit, q contains the eigenvectors of the repaired
tridiagonal matrix.
indxq On exit, contains the permutation which will reintegrate the
subproblems back into sorted order, that is, d( indxq(i
= 1, n )) will be in ascending order.
info INTEGER.
info = 1, an eigenvalue did not converge.
?laed2
Used by sstedc/dstedc. Merges eigenvalues and
deflates secular equation. Used when the original
matrix is tridiagonal.
Syntax
call slaed2( k, n, n1, d, q, ldq, indxq, rho, z, dlamda, w, q2, indx, indxc,
indxp, coltyp, info )
call dlaed2( k, n, n1, d, q, ldq, indxq, rho, z, dlamda, w, q2, indx, indxc,
indxp, coltyp, info )
Description
for C interface.
The routine ?laed2 merges the two sets of eigenvalues together into a single sorted set. Then
it tries to deflate the size of the problem. There are two ways in which deflation can occur:
when two or more eigenvalues are close together or if there is a tiny entry in the z vector. For
each such occurrence the order of the related secular equation problem is reduced by one.
1459
Input Parameters
k INTEGER. The number of non-deflated eigenvalues, and the
order of the related secular equation (0 ≤ k ≤ n).
(n ≥ 0).
n1 INTEGER. The location of the last eigenvalue in the leading
sub-matrix; min(1,n) ≤ n1 ≤ n/2.
d, q, z REAL for slaed2
Arrays:
d(*) contains the eigenvalues of the two submatrices to be
combined. The dimension of d must be at least max(1, n).
q(ldq, *) contains the eigenvectors of the two submatrices
in the two square blocks with corners at (1,1), (n1,n1) and
(n1+1,n1+1), (n,n). The second dimension of q must be at
least max(1, n).
z(*) contains the updating vector (the last row of the first
sub-eigenvector matrix and the first row of the second
sub-eigenvector matrix).
n).
On entry, the permutation which separately sorts the two
subproblems in d into ascending order. Note that elements
in the second half of this permutation must first have n1
added to their values.
rho REAL for slaed2
On entry, the off-diagonal element associated with the
rank-1 cut which originally split the two submatrices which
are now being recombined.
indx, indxp INTEGER.
Workspace arrays, dimension (n) each. Array indx contains
the permutation used to sort the contents of dlamda into
ascending order.
1460
Array indxp contains the permutation used to place deflated
values of d at the end of the array.
indxp(1:k) points to the nondeflated d-values and
indxp(k+1:n) points to the deflated eigenvalues.
coltyp INTEGER.
Workspace array, dimension (n).
During execution, a label which will indicate which of the
following types a column in the q2 matrix is:
1 : non-zero in the upper half only;
2 : dense;
3 : non-zero in the lower half only;
4 : deflated.
Output Parameters
d On exit, d contains the trailing (n-k) updated eigenvalues
(those which were deflated) sorted into increasing order.
q On exit, q contains the trailing (n-k) updated eigenvectors
(those which were deflated) in its last n-k columns.
z On exit, z content is destroyed by the updating process.
indxq Destroyed on exit.
rho On exit, rho has been modified to the value required by
?laed3.
dlamda, w, q2 REAL for slaed2
2 2
Arrays: dlamda(n), w(n), q2(n1 +(n-n1) ).
The array dlamda contains a copy of the first k eigenvalues
which is used by ?laed3 to form the secular equation.
The array w contains the first k values of the final
deflation-altered z-vector which is passed to ?laed3.
The array q2 contains a copy of the first k eigenvectors
which is used by ?laed3 in a matrix multiply (sgemm/dgemm)
to solve for the new eigenvectors.
indxc INTEGER. Array, dimension (n).
1461
The permutation used to arrange the columns of the deflated

q matrix into three groups: the first group contains non-zero
elements only at and above n1, the second contains
non-zero elements only below n1, and the third is dense.
coltyp On exit, coltyp(i) is the number of columns of type i, for
i=1 to 4 only (see the definition of types in the description
of coltyp in Input Parameters).
info INTEGER.
?laed3
Used by sstedc/dstedc. Finds the roots of the
secular equation and updates the eigenvectors.
Used when the original matrix is tridiagonal.
Syntax
call slaed3( k, n, n1, d, q, ldq, rho, dlamda, q2, indx, ctot, w, s, info )
call dlaed3( k, n, n1, d, q, ldq, rho, dlamda, q2, indx, ctot, w, s, info )
Description
for C interface.
The routine ?laed3 finds the roots of the secular equation, as defined by the values in d, w,
and rho, between 1 and k.
It makes the appropriate calls to ?laed4 and then updates the eigenvectors by multiplying the
matrix of eigenvectors of the pair of eigensystems being combined by the matrix of eigenvectors
of the k-by-k system which is solved here.
This code makes very mild assumptions about floating point arithmetic. It will work on machines
with a guard digit in add/subtract, or on those binary machines without guard digits which
subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or Cray-2. It could conceivably fail on
hexadecimal or decimal machines without guard digits, but none are known.
1462
Input Parameters
k INTEGER. The number of terms in the rational function to
be solved by ?laed4 (k ≥ 0).
n INTEGER. The number of rows and columns in the q matrix.
n ≥ k (deflation may result in n >k).
n1 INTEGER. The location of the last eigenvalue in the leading
sub-matrix; min(1,n) ≤ n1 ≤ n/2.
q REAL for slaed3
Array q(ldq, *). The second dimension of q must be at
least max(1, n).
Initially, the first k columns of this array are used as
workspace.
n).
rho REAL for slaed3
The value of the parameter in the rank one update equation.
rho ≥ 0 required.
dlamda, q2, w REAL for slaed3
Arrays: dlamda(k), q2(ldq2, *), w(k).
The first k elements of the array dlamda contain the old
roots of the deflated updating problem. These are the poles
of the secular equation.
The first k columns of the array q2 contain the non-deflated
eigenvectors for the split problem. The second dimension
of q2 must be at least max(1, n).
The first k elements of the array w contain the components
of the deflation-adjusted updating vector.
indx INTEGER. Array, dimension (n).
The permutation used to arrange the columns of the deflated
q matrix into three groups (see ?laed2).
1463
The rows of the eigenvectors found by ?laed4 must be

likewise permuted before the matrix multiply can take place.
ctot INTEGER. Array, dimension (4).
A count of the total number of the various types of columns
in q, as described in indx. The fourth column type is any
column which has been deflated.
s REAL for slaed3
Workspace array, dimension (n1+1)*k .
Will contain the eigenvectors of the repaired matrix which
will be multiplied by the previously accumulated eigenvectors
to update the system.
Output Parameters
d REAL for slaed3
Array, dimension at least max(1, n).
d(i) contains the updated eigenvalues for 1 ≤ i ≤ k.
q On exit, the columns 1 to k of q contain the updated
eigenvectors.
dlamda May be changed on output by having lowest order bit set
to zero on Cray X-MP, Cray Y-MP, Cray-2, or Cray C-90, as
described above.
w Destroyed on exit.
info INTEGER.
If info = 1, an eigenvalue did not converge.
1464
?laed4
Used by sstedc/dstedc. Finds a single root of the
secular equation.
Syntax
call slaed4( n, i, d, z, delta, rho, dlam, info )
call dlaed4( n, i, d, z, delta, rho, dlam, info )
Description
for C interface.
This routine computes the i-th updated eigenvalue of a symmetric rank-one modification to a
diagonal matrix whose elements are given in the array d, and that
D(i) < D(j) for i < j

and that rho > 0. This is arranged by the calling routine, and is no loss in generality. The
rank-one modified system is thus
diag(D) + rho*Z * transpose(Z).
where we assume the Euclidean norm of Z is 1.
The method consists of approximating the rational functions in the secular equation by simpler
interpolating rational functions.
Input Parameters
n INTEGER. The length of all arrays.
i INTEGER. The index of the eigenvalue to be computed; 1
≤ i ≤ n.
d, z REAL for slaed4
Arrays, dimension (n) each. The array d contains the original
eigenvalues. It is assumed that they are in order, d(i) <
d(j) for i < j.
The array z contains the components of the updating vector
Z.
1465
rho REAL for slaed4

The scalar in the symmetric updating formula.
Output Parameters
delta REAL for slaed4
Array, dimension (n).
If n ≠ 1, delta contains (d(j) - lambda_i) in its j-th
component. If n = 1, then delta(1) = 1. The vector delta
contains the information necessary to construct the
eigenvectors.
dlam REAL for slaed4
The computed lambda_i, the i-th updated eigenvalue.
info INTEGER.
If info = 1, the updating process failed.
?laed5
Used by sstedc/dstedc. Solves the 2-by-2 secular
equation.
Syntax
call slaed5( i, d, z, delta, rho, dlam )
call dlaed5( i, d, z, delta, rho, dlam )
Description
for C interface.
The routine computes the i-th eigenvalue of a symmetric rank-one modification of a 2-by-2
diagonal matrix
diag(D) + rho*Z * transpose(Z).
The diagonal elements in the array D are assumed to satisfy
1466
D(i) < D(j) for i < j .
We also assume rho > 0 and that the Euclidean norm of the vector Z is one.
Input Parameters
i INTEGER. The index of the eigenvalue to be computed;
1 ≤ i ≤ 2.
Arrays, dimension (2) each. The array d contains the original
eigenvalues. It is assumed that d(1) < d(2).
The array z contains the components of the updating vector.
rho REAL for slaed5
Output Parameters
delta REAL for slaed5
Array, dimension (2).
The vector delta contains the information necessary to
construct the eigenvectors.
dlam REAL for slaed5
The computed lambda_i, the i-th updated eigenvalue.
?laed6
Used by sstedc/dstedc. Computes one Newton
step in solution of the secular equation.
Syntax
call slaed6( kniter, orgati, rho, d, z, finit, tau, info )
call dlaed6( kniter, orgati, rho, d, z, finit, tau, info )
1467
Description
for C interface.
The routine computes the positive or negative root (closest to the origin) of
It is assumed that if orgati = .TRUE. the root is between d(2) and d(3);otherwise it is between
d(1) and d(2) This routine is called by ?laed4 when necessary. In most cases, the root sought
is the smallest in magnitude, though it might not be in some extremely rare situations.
Input Parameters
kniter INTEGER.
Refer to ?laed4 for its significance.
orgati LOGICAL.
If orgati = .TRUE., the needed root is between d(2) and
d(3); otherwise it is between d(1) and d(2). See ?laed4
for further details.
rho REAL for slaed6
Refer to the equation for f(x) above.
Arrays, dimension (3) each.
The array d satisfies d(1) < d(2) < d(3).
Each of the elements in the array z must be positive.
finit REAL for slaed6
The value of f(x) at 0. It is more accurate than the one
evaluated inside this routine (if someone wants to do so).
Output Parameters
tau REAL for slaed6
1468
The root of the equation for f(x).
info INTEGER.
If info = 1, failure to converge.
?laed7
Used by ?stedc. Computes the updated
eigensystem of a diagonal matrix after modification
by a rank-one symmetric matrix. Used when the
Syntax
call slaed7( icompq, n, qsiz, tlvls, curlvl, curpbm, d, q, ldq, indxq, rho,
cutpnt, qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, iwork, info
)
call dlaed7( icompq, n, qsiz, tlvls, curlvl, curpbm, d, q, ldq, indxq, rho,
cutpnt, qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, iwork, info
)
call claed7( n, cutpnt, qsiz, tlvls, curlvl, curpbm, d, q, ldq, rho, indxq,
qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, rwork, iwork, info
)
call zlaed7( n, cutpnt, qsiz, tlvls, curlvl, curpbm, d, q, ldq, rho, indxq,
qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, rwork, iwork, info
)
Description
for C interface.
The routine ?laed7 computes the updated eigensystem of a diagonal matrix after modification
by a rank-one symmetric matrix. This routine is used only for the eigenproblem which requires
all eigenvalues and optionally eigenvectors of a dense symmetric/Hermitian matrix that has
been reduced to tridiagonal form. For real flavors, slaed1/dlaed1 handles the case in which
all eigenvalues and eigenvectors of a symmetric tridiagonal matrix are desired.
T = Q(in)*(D(in)+rho*Z*Z')*Q'(in) = Q(out)*D(out)*Q'(out)
1469
where Z = Q'*u, u is a vector of length n with ones in the cutpnt and (cutpnt + 1) -th
elements and zeros elsewhere. The eigenvectors of the original matrix are stored in Q, and the
eigenvalues are in D. The algorithm consists of three stages:
The first stage consists of deflating the size of the problem when there are multiple eigenvalues
or if there is a zero in the z vector. For each such occurrence the dimension of the secular
equation problem is reduced by one. This stage is performed by the routine slaed8/dlaed8
(for real flavors) or by the routine slaed2/dlaed2 (for complex flavors).
The second stage consists of calculating the updated eigenvalues. This is done by finding the
roots of the secular equation via the routine ?laed4 (as called by ?laed9 or ?laed3). This
routine also calculates the eigenvectors of the current problem.
The final stage consists of computing the updated eigenvectors directly using the updated
eigenvalues. The eigenvectors for the current problem are multiplied with the eigenvectors
from the overall problem.
Input Parameters
symmetric matrix also. On entry, the array q must contain
the orthogonal matrix used to reduce the original matrix to
tridiagonal form.
(n ≥ 0).
cutpnt INTEGER. The location of the last eigenvalue in the leading
sub-matrix. min(1,n) ≤ cutpnt ≤ n .
qsiz INTEGER.
tlvls INTEGER. The total number of merging levels in the overall
divide and conquer tree.
curlvl INTEGER. The current level in the overall merge routine, 0
≤ curlvl ≤ tlvls .
1470
curpbm INTEGER. The current problem in the current level in the
overall merge routine (counting from upper left to lower
right).
d REAL for slaed7/claed7
DOUBLE PRECISION for dlaed7/zlaed7.
Array, dimension at least max(1, n).
Array d(*) contains the eigenvalues of the rank-1-perturbed
matrix.
q, work REAL for slaed7
COMPLEX for claed7
Arrays:
q(ldq, *) contains the eigenvectors of the
rank-1-perturbed matrix. The second dimension of q must
work(*) is a workspace array, dimension at least
(3n+qsiz*n) for real flavors and at least (qsiz*n) for
complex flavors.
n).
Contains the permutation that separately sorts the two
sub-problems in d into ascending order.
rho REAL for slaed7 /claed7
The subdiagonal element used to create the rank-1
modification.
qstore REAL for slaed7/claed7
Array, dimension (n2+1). Serves also as output parameter.
Stores eigenvectors of submatrices encountered during
divide and conquer, packed together. qptr points to
beginning of the submatrices.
1471
qptr INTEGER. Array, dimension (n+2). Serves also as output

parameter. List of indices pointing to beginning of
submatrices stored in qstore. The submatrices are
numbered starting at the bottom left of the divide and
conquer tree, from left to right and bottom to top.
prmptr, perm, givptr INTEGER. Arrays, dimension (n lgn) each.
The array prmptr(*) contains a list of pointers which indicate
where in perm a level's permutation is stored. prmptr(i+1)
- prmptr(i) indicates the size of the permutation and also
the size of the full, non-deflated problem.
The array perm(*) contains the permutations (from deflation
and sorting) to be applied to each eigenblock. This
parameter can be modified by ?laed8, where it is output.
The array givptr(*) contains a list of pointers which indicate
where in givcol a level's Givens rotations are stored.
givptr(i+1) - givptr(i) indicates the number of Givens
rotations.
givcol INTEGER. Array, dimension (2, n lgn).
Each pair of numbers indicates a pair of columns to take
place in a Givens rotation.
givnum REAL for slaed7/claed7
Array, dimension (2, n lgn).
Each number indicates the S value to be used in the
corresponding Givens rotation.
iwork INTEGER.
Workspace array, dimension (4n ).
rwork REAL for claed7
DOUBLE PRECISION for zlaed7.
Workspace array, dimension (3n+2qsiz*n). Used in complex
flavors only.
Output Parameters
d On exit, contains the eigenvalues of the repaired matrix.
q On exit, q contains the eigenvectors of the repaired
tridiagonal matrix.
1472
Contains the permutation that reintegrates the subproblems
back into a sorted order, that is,
d(indxq(i = 1, n)) will be in the ascending order.
rho This parameter can be modified by ?laed8, where it is
input/output.
prmptr, perm, givptr INTEGER. Arrays, dimension (n lgn) each.
The array prmptr contains an updated list of pointers.
The array perm contains an updated permutation.
The array givptr contains an updated list of pointers.
givcol This parameter can be modified by ?laed8, where it is
output.
givnum This parameter can be modified by ?laed8, where it is
output.
info INTEGER.
If info = 1, an eigenvalue did not converge.
?laed8
Used by ?stedc. Merges eigenvalues and deflates
secular equation. Used when the original matrix is
dense.
Syntax
call slaed8( icompq, k, n, qsiz, d, q, ldq, indxq, rho, cutpnt, z, dlamda,
q2, ldq2, w, perm, givptr, givcol, givnum, indxp, indx, info )
call dlaed8( icompq, k, n, qsiz, d, q, ldq, indxq, rho, cutpnt, z, dlamda,
q2, ldq2, w, perm, givptr, givcol, givnum, indxp, indx, info )
call claed8( k, n, qsiz, q, ldq, d, rho, cutpnt, z, dlamda, q2, ldq2, w, indxp,
indx, indxq, perm, givptr, givcol, givnum, info )
call zlaed8( k, n, qsiz, q, ldq, d, rho, cutpnt, z, dlamda, q2, ldq2, w, indxp,
indx, indxq, perm, givptr, givcol, givnum, info )
1473
Description
for C interface.
The routine merges the two sets of eigenvalues together into a single sorted set. Then it tries
to deflate the size of the problem. There are two ways in which deflation can occur: when two
or more eigenvalues are close together or if there is a tiny element in the z vector. For each
such occurrence the order of the related secular equation problem is reduced by one.
Input Parameters
symmetric matrix also.
On entry, the array q must contain the orthogonal matrix
used to reduce the original matrix to tridiagonal form.
(n ≥ 0).
cutpnt INTEGER. The location of the last eigenvalue in the leading
sub-matrix. min(1,n) ≤ cutpnt ≤ n .
qsiz INTEGER.
d, z REAL for slaed8/claed8
Arrays, dimension at least max(1, n) each. The array d(*)
contains the eigenvalues of the two submatrices to be
combined.
On entry, z(*) contains the updating vector (the last row
of the first sub-eigenvector matrix and the first row of the
second sub-eigenvector matrix). The contents of z are
destroyed by the updating process.
q REAL for slaed8
COMPLEX for claed8
1474
Array
q(ldq, *). The second dimension of q must be at least
max(1, n). On entry, q contains the eigenvectors of the
partially solved system which has been previously updated
in matrix multiplies with other partially solved eigensystems.
For real flavors, If icompq = 0, q is not referenced.
n).
ldq2 INTEGER. The first dimension of the output array q2; ldq2
≥ max(1, n).
The permutation that separately sorts the two sub-problems
in d into ascending order. Note that elements in the second
half of this permutation must first have cutpnt added to
their values in order to be accurate.
rho REAL for slaed8/claed8
On entry, the off-diagonal element associated with the
rank-1 cut which originally split the two submatrices which
are now being recombined.
Output Parameters
k INTEGER. The number of non-deflated eigenvalues, and the
order of the related secular equation.
d On exit, contains the trailing (n-k) updated eigenvalues
z On exit, the updating process destroys the contents of z.
q On exit, q contains the trailing (n-k) updated eigenvectors
(those which were deflated) in its last (n-k) columns.
The permutation of merged eigenvalues set.
rho On exit, rho has been modified to the value required by
?laed3.
dlamda, w REAL for slaed8/claed8
1475

Arrays, dimension (n) each. The array dlamda(*) contains
a copy of the first k eigenvalues which will be used by
?laed3 to form the secular equation.
The array w(*) will hold the first k values of the final
deflation-altered z-vector and will be passed to ?laed3.
q2 REAL for slaed8
COMPLEX for claed8
Array
q2(ldq2, *). The second dimension of q2 must be at least
max(1, n).
Contains a copy of the first k eigenvectors which will be
used by slaed7/dlaed7 in a matrix multiply (sgemm/dgemm)
to update the new eigenvectors. For real flavors, If icompq
= 0, q2 is not referenced.
indxp, indx INTEGER. Workspace arrays, dimension (n) each.
The array indxp(*) will contain the permutation used to
place deflated values of d at the end of the array. On output,
indxp(1:k) points to the nondeflated d-values and
indxp(k+1:n) points to the deflated eigenvalues.
The array indx(*) will contain the permutation used to sort
the contents of d into ascending order.
perm INTEGER. Array, dimension (n ).
Contains the permutations (from deflation and sorting) to
be applied to each eigenblock.
givptr INTEGER. Contains the number of Givens rotations which
took place in this subproblem.
givcol INTEGER. Array, dimension (2, n ).
givnum REAL for slaed8/claed8
Array, dimension (2, n).
1476
info INTEGER.
?laed9
Used by sstedc/dstedc. Finds the roots of the
secular equation and updates the eigenvectors.
Used when the original matrix is dense.
Syntax
call slaed9( k, kstart, kstop, n, d, q, ldq, rho, dlamda, w, s, lds, info )
call dlaed9( k, kstart, kstop, n, d, q, ldq, rho, dlamda, w, s, lds, info )
Description
for C interface.
The routine finds the roots of the secular equation, as defined by the values in d, z, and rho,
between kstart and kstop. It makes the appropriate calls to slaed4/dlaed4 and then stores
the new matrix of eigenvectors for use in calculating the next level of z vectors.
Input Parameters
be solved by slaed4/dlaed4 (k ≥ 0).
kstart, kstop INTEGER. The updated eigenvalues lambda(i),
kstart ≤ i ≤ kstop are to be computed.
1 ≤ kstart ≤ kstop ≤ k.
n INTEGER. The number of rows and columns in the Q matrix.
n ≥ k (deflation may result in n > k).
q REAL for slaed9
Workspace array, dimension (ldq, *).
1477

n).
rho REAL for slaed9
The value of the parameter in the rank one update equation.
rho ≥ 0 required.
dlamda, w REAL for slaed9
Arrays, dimension (k) each. The first k elements of the array
dlamda(*) contain the old roots of the deflated updating
problem. These are the poles of the secular equation.
The first k elements of the array w(*) contain the
components of the deflation-adjusted updating vector.
lds INTEGER. The first dimension of the output array s; lds ≥
max(1, k).
Output Parameters
d REAL for slaed9
Array, dimension (n). Elements in d(i) are not referenced
for 1 ≤ i < kstart or kstop < i ≤ n.
s REAL for slaed9
Array, dimension (lds, *) .
The second dimension of s must be at least max(1, k).
Will contain the eigenvectors of the repaired matrix which
will be stored for subsequent z vector calculation and
multiplied by the previously accumulated eigenvectors to
update the system.
dlamda On exit, the value is modified to make sure all dlamda(i)
- dlamda(j) can be computed with high relative accuracy,
barring overflow and underflow.
w Destroyed on exit.
info INTEGER.
1478
info = 1, the eigenvalue did not converge.
?laeda
Used by ?stedc. Computes the Z vector
determining the rank-one modification of the
diagonal matrix. Used when the original matrix is
dense.
Syntax
call slaeda( n, tlvls, curlvl, curpbm, prmptr, perm, givptr, givcol, givnum,
q, qptr, z, ztemp, info )
call dlaeda( n, tlvls, curlvl, curpbm, prmptr, perm, givptr, givcol, givnum,
q, qptr, z, ztemp, info )
Description
for C interface.
The routine ?laeda computes the Z vector corresponding to the merge step in the curlvl-th
step of the merge process with tlvls steps for the curpbm-th problem.
Input Parameters
(n ≥ 0).
tlvls INTEGER. The total number of merging levels in the overall
divide and conquer tree.
curlvl INTEGER. The current level in the overall merge routine, 0
≤ curlvl ≤ tlvls .
curpbm INTEGER. The current problem in the current level in the
overall merge routine (counting from upper left to lower
right).
prmptr, perm, givptr INTEGER. Arrays, dimension (n lgn ) each.
1479
The array prmptr(*) contains a list of pointers which indicate

where in perm a level's permutation is stored. prmptr(i+1)
- prmptr(i) indicates the size of the permutation and also
the size of the full, non-deflated problem.
The array perm(*) contains the permutations (from deflation
and sorting) to be applied to each eigenblock.
The array givptr(*) contains a list of pointers which indicate
where in givcol a level's Givens rotations are stored.
givptr(i+1) - givptr(i) indicates the number of Givens
rotations.
givcol INTEGER. Array, dimension (2, n lgn ).
givnum REAL for slaeda
DOUBLE PRECISION for dlaeda.
Array, dimension (2, n lgn).
q REAL for slaeda
2
Array, dimension ( n ).
Contains the square eigenblocks from previous levels, the
starting positions for blocks are given by qptr.
qptr INTEGER. Array, dimension (n+2). Contains a list of pointers
which indicate where in q an eigenblock is stored. sqrt(
qptr(i+1) - qptr(i)) indicates the size of the block.
ztemp REAL for slaeda
Output Parameters
z REAL for slaeda
Array, dimension (n). Contains the updating vector (the last
row of the first sub-eigenvector matrix and the first row of
the second sub-eigenvector matrix).
1480
info INTEGER.
?laein
Computes a specified right or left eigenvector of
an upper Hessenberg matrix by inverse iteration.
Syntax
call slaein( rightv, noinit, n, h, ldh, wr, wi, vr, vi, b, ldb, work, eps3,
smlnum, bignum, info )
call dlaein( rightv, noinit, n, h, ldh, wr, wi, vr, vi, b, ldb, work, eps3,
smlnum, bignum, info )
call claein( rightv, noinit, n, h, ldh, w, v, b, ldb, rwork, eps3, smlnum,
info )
call zlaein( rightv, noinit, n, h, ldh, w, v, b, ldb, rwork, eps3, smlnum,
info )
Description
for C interface.
The routine ?laein uses inverse iteration to find a right or left eigenvector corresponding to
the eigenvalue (wr,wi) of a real upper Hessenberg matrix H (for real flavors slaein/dlaein)
or to the eigenvalue w of a complex upper Hessenberg matrix H (for complex flavors
claein/zlaein).
Input Parameters
rightv LOGICAL.
If rightv = .TRUE., compute right eigenvector;
if rightv = .FALSE., compute left eigenvector.
noinit LOGICAL.
If noinit = .TRUE., no initial vector is supplied in (vr,vi)
or in v (for complex flavors);
1481
if noinit = .FALSE., initial vector is supplied in (vr,vi)

or in v (for complex flavors).
h REAL for slaein
DOUBLE PRECISION for dlaein
COMPLEX for claein
COMPLEX*16 for zlaein.
Array h(ldh, *).
Contains the upper Hessenberg matrix H.
ldh INTEGER. The first dimension of the array h; ldh ≥ max(1,
n).
wr, wi REAL for slaein
DOUBLE PRECISION for dlaein.
The real and imaginary parts of the eigenvalue of H whose
corresponding right or left eigenvector is to be computed
(for real flavors of the routine).
w COMPLEX for claein
The eigenvalue of H whose corresponding right or left
eigenvector is to be computed (for complex flavors of the
routine).
vr, vi REAL for slaein
Arrays, dimension (n) each. Used for real flavors only. On
entry, if noinit = .FALSE. and wi = 0.0, vr must contain
a real starting vector for inverse iteration using the real
eigenvalue wr;
if noinit = .FALSE. and wi ≠ 0.0, vr and vi must
contain the real and imaginary parts of a complex starting
vector for inverse iteration using the complex eigenvalue
(wr,wi);otherwise vr and vi need not be set.
v COMPLEX for claein
1482
Array, dimension (n). Used for complex flavors only. On
entry, if noinit = .FALSE., v must contain a starting
vector for inverse iteration; otherwise v need not be set.
b REAL for slaein
DOUBLE PRECISION for dlaein
COMPLEX for claein
Workspace array b(ldb, *). The second dimension of b must
ldb INTEGER. The first dimension of the array b;
ldb ≥ n+1 for real flavors;
ldb ≥ max(1, n) for complex flavors.
work REAL for slaein
Used for real flavors only.
rwork REAL for claein
DOUBLE PRECISION for zlaein.
Used for complex flavors only.
eps3, smlnum REAL for slaein/claein
DOUBLE PRECISION for dlaein/zlaein.
eps3 is a small machine-dependent value which is used to
perturb close eigenvalues, and to replace zero pivots.
smlnum is a machine-dependent value close to underflow
threshold.
bignum REAL for slaein
bignum is a machine-dependent value close to overflow
threshold. Used for real flavors only.
Output Parameters
vr, vi On exit, if wi = 0.0 (real eigenvalue), vr contains the
computed real eigenvector; if wi ≠ 0.0 (complex
eigenvalue), vr and vi contain the real and imaginary parts
1483
of the computed complex eigenvector. The eigenvector is

normalized so that the component of largest magnitude has
magnitude 1; here the magnitude of a complex number
(x,y) is taken to be |x| + |y|.
vi is not referenced if wi = 0.0.
v On exit, v contains the computed eigenvector, normalized
so that the component of largest magnitude has magnitude
1; here the magnitude of a complex number (x,y) is taken
to be |x| + |y|.
info INTEGER.
If info = 1, inverse iteration did not converge. For real
flavors, vr is set to the last iterate, and so is vi, if wi ≠
0.0. For complex flavors, v is set to the last iterate.
?laev2
Computes the eigenvalues and eigenvectors of a
2-by-2 symmetric/Hermitian matrix.
Syntax
call slaev2( a, b, c, rt1, rt2, cs1, sn1 )
call dlaev2( a, b, c, rt1, rt2, cs1, sn1 )
call claev2( a, b, c, rt1, rt2, cs1, sn1 )
call zlaev2( a, b, c, rt1, rt2, cs1, sn1 )
Description
for C interface.
The routine performs the eigendecomposition of a 2-by-2 symmetric matrix
1484
(for claev2/zlaev2).
On return, rt1 is the eigenvalue of larger absolute value, rt2 of smaller absolute value, and
(cs1, sn1) is the unit right eigenvector for rt1, giving the decomposition
(for slaev2/dlaev2),
or
(for claev2/zlaev2).
Input Parameters
a, b, c REAL for slaev2
DOUBLE PRECISION for dlaev2
COMPLEX for claev2
COMPLEX*16 for zlaev2.
Elements of the input matrix.
Output Parameters
rt1, rt2 REAL for slaev2/claev2
DOUBLE PRECISION for dlaev2/zlaev2.
Eigenvalues of larger and smaller absolute value,
respectively.
cs1 REAL for slaev2/claev2
DOUBLE PRECISION for dlaev2/zlaev2.
sn1 REAL for slaev2
DOUBLE PRECISION for dlaev2
COMPLEX for claev2
1485
COMPLEX*16 for zlaev2.

The vector (cs1, sn1) is the unit right eigenvector for rt1.
Application Notes
rt1 is accurate to a few ulps barring over/underflow. rt2 may be inaccurate if there is massive
cancellation in the determinant a*c-b*b; higher precision or correctly rounded or correctly
truncated arithmetic would be needed to compute rt2 accurately in all cases. cs1 and sn1 are
accurate to a few ulps barring over/underflow. Overflow is possible only if rt1 is within a factor
of 5 of overflow. Underflow is harmless if the input data is 0 or exceeds underflow_threshold
/ macheps.
?laexc
Swaps adjacent diagonal blocks of a real upper
quasi-triangular matrix in Schur canonical form,
by an orthogonal similarity transformation.
Syntax
call slaexc( wantq, n, t, ldt, q, ldq, j1, n1, n2, work, info )
call dlaexc( wantq, n, t, ldt, q, ldq, j1, n1, n2, work, info )
Description
for C interface.
The routine swaps adjacent diagonal blocks T11 and T22 of order 1 or 2 in an upper
quasi-triangular matrix T by an orthogonal similarity transformation.
T must be in Schur canonical form, that is, block upper triangular with 1-by-1 and 2-by-2
diagonal blocks; each 2-by-2 diagonal block has its diagonal elements equal and its off-diagonal
elements of opposite sign.
Input Parameters
wantq LOGICAL.
If wantq = .TRUE., accumulate the transformation in the
matrix Q;
If wantq = .FALSE., do not accumulate the transformation.
1486
t, q REAL for slaexc
DOUBLE PRECISION for dlaexc
Arrays:
t(ldt,*) contains on entry the upper quasi-triangular
matrix T, in Schur canonical form.
q(ldq,*) contains on entry, if wantq = .TRUE., the
orthogonal matrix Q. If wantq = .FALSE., q is not
referenced. The second dimension of q must be at least
max(1, n).
If wantq = .FALSE., then ldq ≥ 1.
If wantq = .TRUE., then ldq ≥ max(1,n).
j1 INTEGER. The index of the first row of the first block T11.
n1 INTEGER. The order of the first block T11
(n1 = 0, 1, or 2).
n2 INTEGER. The order of the second block T22
(n2 = 0, 1, or 2).
work REAL for slaexc;
DOUBLE PRECISION for dlaexc.
Workspace array, DIMENSION (n).
Output Parameters
t On exit, the updated matrix T, again in Schur canonical
form.
q On exit, if wantq = .TRUE., the updated matrix Q.
info INTEGER.
If info = 1, the transformed matrix T would be too far
from Schur form; the blocks are not swapped and T and Q
are unchanged.
1487
?lag2
Computes the eigenvalues of a 2-by-2 generalized
eigenvalue problem, with scaling as necessary to
avoid over-/underflow.
Syntax
call slag2( a, lda, b, ldb, safmin, scale1, scale2, wr1, wr2, wi )
call dlag2( a, lda, b, ldb, safmin, scale1, scale2, wr1, wr2, wi )
Description
for C interface.
The routine computes the eigenvalues of a 2 x 2 generalized eigenvalue problem A - w *B, with
scaling as necessary to avoid over-/underflow. The scaling factor, s, results in a modified
eigenvalue equation
s*A - w*B ,
where s is a non-negative scaling factor chosen so that w, w*B, and s*A do not overflow and,
if possible, do not underflow, either.
Input Parameters
a, b REAL for slag2
DOUBLE PRECISION for dlag2
Arrays:
a(lda,2) contains, on entry, the 2 x 2 matrix A. It is
assumed that its 1-norm is less than 1/safmin. Entries less
than sqrt(safmin)*norm(A) are subject to being treated
as zero.
b(ldb,2) contains, on entry, the 2 x 2 upper triangular
matrix B. It is assumed that the one-norm of B is less than
1/safmin. The diagonals should be at least sqrt(safmin)
times the largest element of B (in absolute value); if a
diagonal is smaller than that, then +/- sqrt(safmin) will
be used instead of that diagonal.
lda INTEGER. The first dimension of a; lda ≥ 2.
1488
ldb INTEGER. The first dimension of b; ldb ≥ 2.
safmin REAL for slag2;
DOUBLE PRECISION for dlag2.
The smallest positive number such that 1/safmin does not
overflow. (This should always be ?lamch('S') - it is an
argument in order to avoid having to call ?lamch
frequently.)
Output Parameters
scale1 REAL for slag2;
A scaling factor used to avoid over-/underflow in the
eigenvalue equation which defines the first eigenvalue. If
the eigenvalues are complex, then the eigenvalues are (wr1
+/- wii)/scale1 (which may lie outside the exponent
range of the machine), scale1=scale2, and scale1 will
always be positive.
If the eigenvalues are real, then the first (real) eigenvalue
is wr1/scale1, but this may overflow or underflow, and in
fact, scale1 may be zero or less than the underflow
threshhold if the exact eigenvalue is sufficiently large.
scale2 REAL for slag2;
A scaling factor used to avoid over-/underflow in the
eigenvalue equation which defines the second eigenvalue.
If the eigenvalues are complex, then scale2=scale1. If
the eigenvalues are real, then the second (real) eigenvalue
is wr2/scale2, but this may overflow or underflow, and in
fact, scale2 may be zero or less than the underflow
threshold if the exact eigenvalue is sufficiently large.
wr1 REAL for slag2;
If the eigenvalue is real, then wr1 is scale1 times the
eigenvalue closest to the (2,2) element of A*inv(B).
If the eigenvalue is complex, then wr1=wr2 is scale1 times
the real part of the eigenvalues.
wr2 REAL for slag2;
1489

If the eigenvalue is real, then wr2 is scale2 times the other
eigenvalue. If the eigenvalue is complex, then wr1=wr2 is
scale1 times the real part of the eigenvalues.
wi REAL for slag2;
If the eigenvalue is real, then wi is zero. If the eigenvalue
is complex, then wi is scale1 times the imaginary part of
the eigenvalues. wi will always be non-negative.
?lags2
Computes 2-by-2 orthogonal matrices U, V, and
Q, and applies them to matrices A and B such that
the rows of the transformed A and B are parallel.
Syntax
call slags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq )
call dlags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq )
Description
for C interface.
The routine computes 2-by-2 orthogonal matrices U, V and Q, such that if upper = .TRUE.,
then
and
1490
or if upper = .FALSE., then
and
The rows of the transformed A and B are parallel, where
Here Z' denotes the transpose of Z.
Input Parameters
upper LOGICAL.
If upper = .TRUE., the input matrices A and B are upper
triangular; If upper = .FALSE., the input matrices A and
B are lower triangular.
a1, a2, a3 REAL for slags2
DOUBLE PRECISION for dlags2
On entry, a1, a2 and a3 are elements of the input 2-by-2
upper (lower) triangular matrix A.
b1, b2, b3 REAL for slags2
On entry, b1, b2 and b3 are elements of the input 2-by-2
upper (lower) triangular matrix B.
1491
Output Parameters
csu, snu REAL for slags2
The desired orthogonal matrix U.
csv, snv REAL for slags2
The desired orthogonal matrix V.
csq, snq REAL for slags2
The desired orthogonal matrix Q.
?lagtf
Computes an LU factorization of a matrix T-λ*I,
where T is a general tridiagonal matrix, and λ is a
scalar, using partial pivoting with row interchanges.
Syntax
call slagtf( n, a, lambda, b, c, tol, d, in, info )
call dlagtf( n, a, lambda, b, c, tol, d, in, info )
Description
for C interface.
The routine factorizes the matrix (T - lambda*I), where T is an n-by-n tridiagonal matrix and
lambda is a scalar, as
T - lambda*I = P*L*U,
where P is a permutation matrix, L is a unit lower tridiagonal matrix with at most one non-zero
sub-diagonal elements per column and U is an upper triangular matrix with at most two non-zero
super-diagonal elements per column. The factorization is obtained by Gaussian elimination with
partial pivoting and implicit row scaling. The parameter lambda is included in the routine so
that ?lagtf may be used, in conjunction with ?lagts, to obtain eigenvectors of T by inverse
iteration.
1492
Input Parameters
a, b, c REAL for slagtf
DOUBLE PRECISION for dlagtf
Arrays, dimension a(n), b(n-1), c(n-1):
On entry, a(*) must contain the diagonal elements of the
matrix T.
On entry, b(*) must contain the (n-1) super-diagonal
elements of T.
On entry, c(*) must contain the (n-1) sub-diagonal
elements of T.
tol REAL for slagtf
On entry, a relative tolerance used to indicate whether or
not the matrix (T - lambda*I) is nearly singular. tol should
normally be chose as approximately the largest relative
error in the elements of T. For example, if the elements of
T are correct to about 4 significant figures, then tol should
be set to about 5*10-4. If tol is supplied as less than eps,
where eps is the relative machine precision, then the value
eps is used in place of tol.
Output Parameters
a On exit, a is overwritten by the n diagonal elements of the
upper triangular matrix U of the factorization of T.
b On exit, b is overwritten by the n-1 super-diagonal elements
of the matrix U of the factorization of T.
c On exit, c is overwritten by the n-1 sub-diagonal elements
of the matrix L of the factorization of T.
d REAL for slagtf
Array, dimension (n-2).
On exit, d is overwritten by the n-2 second super-diagonal
elements of the matrix U of the factorization of T.
in INTEGER.
1493

On exit, in contains details of the permutation matrix p. If
an interchange occurred at the k-th step of the elimination,
then in(k) = 1, otherwise in(k) = 0. The element in(n)
returns the smallest positive integer j such that
abs(u(j,j)) ≤ norm((T - lambda*I)(j))*tol,
where norm( A(j)) denotes the sum of the absolute values
of the j-th row of the matrix A.
If no such j exists then in(n) is returned as zero. If in(n)
is returned as positive, then a diagonal element of U is small,
indicating that (T - lambda*I) is singular or nearly
singular.
info INTEGER.
If info = -k, the k-th parameter had an illegal value.
?lagtm
Performs a matrix-matrix product of the form C =
alpha*A*B+beta*C, where A is a tridiagonal
matrix, B and C are rectangular matrices, and
alpha and beta are scalars, which may be 0, 1,
or -1.
Syntax
call slagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call dlagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call clagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call zlagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
Description
for C interface.
The routine performs a matrix-vector product of the form:
B := alpha*A*X + beta*B
1494
where A is a tridiagonal matrix of order n, B and X are n-by-nrhs matrices, and alpha and beta
are real scalars, each of which may be 0., 1., or -1.
Input Parameters
If trans = 'N', then B := alpha*A*X + beta*B (no
transpose);
If trans = 'T', then B := alpha*AT*X + beta*B
(transpose);
If trans = 'C', then B := alpha*AH*X + beta*B
(conjugate transpose)
nrhs INTEGER. The number of right-hand sides, i.e., the number
of columns in X and B (nrhs ≥ 0).
alpha, beta REAL for slagtm/clagtm
DOUBLE PRECISION for dlagtm/zlagtm
Specify the scalars alpha and beta respectively. alpha
must be 0., 1., or -1.; otherwise, it is assumed to be 0.
beta must be 0., 1., or -1.; otherwise, it is assumed to be
1.
dl, d, du REAL for slagtm
DOUBLE PRECISION for dlagtm
COMPLEX for clagtm
COMPLEX*16 for zlagtm.
Arrays: dl(n - 1), d(n), du(n - 1).
The array dl contains the (n - 1) sub-diagonal elements of
T.
The array d contains the n diagonal elements of T.
The array du contains the (n - 1) super-diagonal elements
of T.
x, b REAL for slagtm
DOUBLE PRECISION for dlagtm
COMPLEX for clagtm
COMPLEX*16 for zlagtm.
Arrays:
1495
x(ldx,*) contains the n-by-nrhs matrix X. The second

dimension of x must be at least max(1, nrhs).
b(ldb,*) contains the n-by-nrhs matrix B. The second
dimension of b must be at least max(1, nrhs).
ldx INTEGER. The leading dimension of the array x; ldx ≥
max(1, n).
ldb INTEGER. The leading dimension of the array b; ldb ≥
max(1, n).
Output Parameters
b Overwritten by the matrix expression B := alpha*A*X +
beta*B
?lagts
Solves the system of equations (T - lambda*I)*x
T
= y or (T - lambda*I) *x = y ,where T is a general
tridiagonal matrix and lambda is a scalar, using
the LU factorization computed by ?lagtf.
Syntax
call slagts( job, n, a, b, c, d, in, y, tol, info )
call dlagts( job, n, a, b, c, d, in, y, tol, info )
Description
for C interface.
The routine may be used to solve for x one of the systems of equations:
(T - lambda*I)*x = y or (T - lambda*I)'*x = y,
where T is an n-by-n tridiagonal matrix, following the factorization of (T - lambda*I) as
T - lambda*I = P*L*U,
computed by the routine ?lagtf.
1496
The choice of equation to be solved is controlled by the argument job, and in each case there
is an option to perturb zero or very small diagonal elements of U, this option being intended
for use in applications such as inverse iteration.
Input Parameters
job INTEGER. Specifies the job to be performed by ?lagts as
follows:
= 1: The equations (T - lambda*I)x = y are to be solved,
but diagonal elements of U are not to be perturbed.
= -1: The equations (T - lambda*I)x = y are to be solved
and, if overflow would otherwise occur, the diagonal
elements of U are to be perturbed. See argument tol below.
= 2: The equations (T - lambda*I)'x = y are to be
solved, but diagonal elements of U are not to be perturbed.
= -2: The equations (T - lambda*I)'x = y are to be
solved and, if overflow would otherwise occur, the diagonal
elements of U are to be perturbed. See argument tol below.
a, b, c, d REAL for slagts
DOUBLE PRECISION for dlagts
Arrays, dimension a(n), b(n-1), c(n-1), d(n-2):
On entry, a(*) must contain the diagonal elements of U as
returned from ?lagtf.
On entry, b(*) must contain the first super-diagonal
elements of U as returned from ?lagtf.
On entry, c(*) must contain the sub-diagonal elements of
L as returned from ?lagtf.
On entry, d(*) must contain the second super-diagonal
elements of U as returned from ?lagtf.
in INTEGER.
On entry, in(*) must contain details of the matrix p as
returned from ?lagtf.
y REAL for slagts
DOUBLE PRECISION for dlagts
Array, dimension (n). On entry, the right hand side vector
y.
1497
tol REAL for slagtf

DOUBLE PRECISION for dlagtf.
On entry, with job < 0, tol should be the minimum
perturbation to be made to very small diagonal elements of
U. tol should normally be chosen as about eps*norm(U),
where eps is the relative machine precision, but if tol is
supplied as non-positive, then it is reset to eps*max( abs(
u(i,j)) ). If job > 0 then tol is not referenced.
Output Parameters
y On exit, y is overwritten by the solution vector x.
tol On exit, tol is changed as described in Input Parameters
section above, only if tol is non-positive on entry. Otherwise
tol is unchanged.
info INTEGER.
info = i >0, overflow would occur when computing the
ith element of the solution vector x. This can only occur
when job is supplied as positive and either means that a
diagonal element of U is very small, or that the elements of
the right-hand side vector y are very large.
?lagv2
Computes the Generalized Schur factorization of
a real 2-by-2 matrix pencil (A,B) where B is upper
triangular.
Syntax
call slagv2( a, lda, b, ldb, alphar, alphai, beta, csl, snl, csr, snr )
call dlagv2( a, lda, b, ldb, alphar, alphai, beta, csl, snl, csr, snr )
Description
for C interface.
1498
The routine computes the Generalized Schur factorization of a real 2-by-2 matrix pencil (A,B)
where B is upper triangular. The routine computes orthogonal (rotation) matrices given by csl,
snl and csr, snr such that:
1) if the pencil (A,B) has two real eigenvalues (include 0/0 or 1/0 types), then
2) if the pencil (A,B) has a pair of complex conjugate eigenvalues, then
where b11 ≥ b22>0.
Input Parameters
a, b REAL for slagv2
DOUBLE PRECISION for dlagv2
Arrays:
a(lda,2) contains the 2-by-2 matrix A;
b(ldb,2) contains the upper triangular 2-by-2 matrix B.
lda INTEGER. The leading dimension of the array a;
1499
lda ≥ 2.
ldb INTEGER. The leading dimension of the array b;
ldb ≥ 2.
Output Parameters
a On exit, a is overwritten by the “A-part” of the generalized
Schur form.
b On exit, b is overwritten by the “B-part” of the generalized
Schur form.
alphar, alphai, beta REAL for slagv2
DOUBLE PRECISION for dlagv2.
Arrays, dimension (2) each.
(alphar(k) + i*alphai(k))/beta(k) are the
eigenvalues of the pencil (A,B), k=1,2 and i = sqrt(-1).
Note that beta(k) may be zero.
csl, snl REAL for slagv2
The cosine and sine of the left rotation matrix, respectively.
csr, snr REAL for slagv2
The cosine and sine of the right rotation matrix, respectively.
1500
?lahqr
of an upper Hessenberg matrix, using the
double-shift/single-shift QR algorithm.
Syntax
call slahqr( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz,
info )
call dlahqr( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz,
info )
call clahqr( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, info
)
call zlahqr( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, info
)
Description
for C interface.
The routine is an auxiliary routine called by ?hseqr to update the eigenvalues and Schur
decomposition already computed by ?hseqr, by dealing with the Hessenberg submatrix in rows
and columns ilo to ihi.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., the full Schur form T is required;
If wantt = .FALSE., eigenvalues only are required.
wantz LOGICAL.
If wantz = .TRUE., the matrix of Schur vectors Z is
required;
If wantz = .FALSE., Schur vectors are not required.
ilo, ihi INTEGER.
1501
It is assumed that h is already upper quasi-triangular in

rows and columns ihi+1:n, and that h(ilo,ilo-1) = 0
(unless ilo = 1). The routine ?lahqr works primarily with
the Hessenberg submatrix in rows and columns ilo to ihi,
but applies transformations to all of h if wantt = .TRUE..
Constraints:
1 ≤ ilo ≤ max(1,ihi); ihi ≤ n.
h, z REAL for slahqr
DOUBLE PRECISION for dlahqr
COMPLEX for clahqr
COMPLEX*16 for zlahqr.
Arrays:
h(ldh,*) contains the upper Hessenberg matrix h.
z(ldz,*)
If wantz = .TRUE., then, on entry, z must contain the
current matrix z of transformations accumulated by ?hseqr.
If wantz = .FALSE., then z is not referenced. The second
dimension of z must be at least max(1, n).
ldz INTEGER. The first dimension of z; at least max(1, n).
iloz, ihiz INTEGER. Specify the rows of z to which transformations
must be applied if wantz = .TRUE..
1 ≤ iloz ≤ ilo; ihi ≤ ihiz ≤ n.
Output Parameters
h On exit, if info= 0 and wantt = .TRUE., then,
• for slahqr/dlahqr, h is upper quasi-triangular in rows

and columns ilo:ihi with any 2-by-2 diagonal blocks
in standard form.
• for clahqr/zlahqr, h is upper triangular in rows and
columns ilo:ihi.
1502
If info= 0 and wantt = .FALSE., the contents of h are
unspecified on exit. If info is positive, see description of
info for the output state of h.
wr, wi REAL for slahqr
DOUBLE PRECISION for dlahqr
Arrays, DIMENSION at least max(1, n) each. Used with real
flavors only. The real and imaginary parts, respectively, of
the computed eigenvalues ilo to ihi are stored in the
corresponding elements of wr and wi. If two eigenvalues
are computed as a complex conjugate pair, they are stored
in consecutive elements of wr and wi, say the i-th and
(i+1)-th, with wi(i)> 0 and wi(i+1) < 0.
If wantt = .TRUE., the eigenvalues are stored in the same
order as on the diagonal of the Schur form returned in h,
with wr(i) = h(i,i), and, if h(i:i+1, i:i+1) is a 2-by-2
diagonal block, wi(i) = sqrt(h(i+1,i)*h(i,i+1)) and
wi(i+1) = -wi(i).
w COMPLEX for clahqr
COMPLEX*16 for zlahqr.
Array, DIMENSION at least max(1, n). Used with complex
flavors only. The computed eigenvalues ilo to ihi are
stored in the corresponding elements of w.
If wantt = .TRUE., the eigenvalues are stored in the same
order as on the diagonal of the Schur form returned in h,
with w(i) = h(i,i).
z If wantz = .TRUE., then, on exit z has been updated;
transformations are applied only to the submatrix
z(iloz:ihiz, ilo:ihi).
info INTEGER.
With info > 0,
• if info = i, ?lahqr failed to compute all the eigenvalues

ilo to ihi in a total of 30 iterations per eigenvalue;
elements i+1:ihi of wr and wi (for slahqr/dlahqr) or
w (for clahqr/zlahqr) contain those eigenvalues which
have been successfully computed.
1503
• if wantt is .FALSE., then on exit the remaining

unconverged eigenvalues are the eigenvalues of the
upper Hessenberg matrix rows and columns ilo through
info of the final output value of h.
• if wantt is .TRUE., then on exit
(initial value of h)*u = u*(final value of h),
(*)
where u is an orthognal matrix. The final value of h is
upper Hessenberg and triangular in rows and columns
info+1 through ihi.
• if wantz is .TRUE., then on exit
(final value of z) = (initial value of z)* u,
where u is an orthognal matrix in (*) regardless of the

value of wantt.
?lahrd
Reduces the first nb columns of a general
rectangular matrix A so that elements below the
k-th subdiagonal are zero, and returns auxiliary
matrices which are needed to apply the
Syntax
call slahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call dlahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call clahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call zlahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
Description
for C interface.
1504
The routine reduces the first nb columns of a real/complex general n-by-(n-k+1) matrix A so
that elements below the k-th subdiagonal are zero. The reduction is performed by an
orthogonal/unitary similarity transformation Q'*A*Q. The routine returns the matrices V and T
which determine Q as a block reflector I - V*T*V', and also the matrix Y = A*V*T.
The matrix Q is represented as products of nb elementary reflectors:
Q = H(1)*H(2)*... *H(nb)
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector.
This is an obsolete auxiliary routine. Please use the new routine ?lahr2 instead.
Input Parameters
k INTEGER. The offset for the reduction. Elements below the
k-th subdiagonal in the first nb columns are reduced to zero.
nb INTEGER. The number of columns to be reduced.
a REAL for slahrd
DOUBLE PRECISION for dlahrd
COMPLEX for clahrd
COMPLEX*16 for zlahrd.
Array a(lda, n-k+1) contains the n-by-(n-k+1) general
matrix A to be reduced.
ldt INTEGER. The first dimension of the output array t; must
be at least max(1, nb).
ldy INTEGER. The first dimension of the output array y; must
Output Parameters
a On exit, the elements on and above the k-th subdiagonal
in the first nb columns are overwritten with the
corresponding elements of the reduced matrix; the elements
1505
below the k-th subdiagonal, with the array tau, represent

the matrix Q as a product of elementary reflectors. The other
columns of a are unchanged. See Application Notes below.
tau REAL for slahrd
COMPLEX for clahrd
Array, DIMENSION (nb).
t, y REAL for slahrd
COMPLEX for clahrd
Arrays, dimension t(ldt, nb), y(ldy, nb).
The array t contains upper triangular matrix T.
The array y contains the n-by-nb matrix Y .
Application Notes
For the elementary reflector H(i),
v(1:i+k-1) = 0, v(i+k) = 1; v(i+k+1:n) is stored on exit in a(i+k+1:n, i) and tau is

stored in tau(i).
The elements of the vectors v together form the (n-k+1)-by-nb matrix V which is needed, with
T and Y, to apply the transformation to the unreduced part of the matrix, using an update of
the form:
A := (I - V*T*V') * (A - Y*V').
The contents of A on exit are illustrated by the following example with n = 7, k = 3 and nb
= 2:
1506
See Also
• Auxiliary Routines
• ?lahr2
?lahr2
Reduces the specified number of first columns of
a general rectangular matrix A so that elements
below the specified subdiagonal are zero, and
returns auxiliary matrices which are needed to
apply the transformation to the unreduced part of
A.
Syntax
call slahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call dlahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call clahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call zlahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
1507
Description
for C interface.
The routine reduces the first nb columns of a real/complex general n-by-(n-k+1) matrix A so
that elements below the k-th subdiagonal are zero. The reduction is performed by an
orthogonal/unitary similarity transformation Q'*A*Q. The routine returns the matrices V and T
which determine Q as a block reflector I - V*T*V', and also the matrix Y = A*V*T.
The matrix Q is represented as products of nb elementary reflectors:
Q = H(1)*H(2)*... *H(nb)
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector.
This is an auxiliary routine called by ?gehrd.
Input Parameters
k INTEGER. The offset for the reduction. Elements below the
k-th subdiagonal in the first nb columns are reduced to zero
(k < n).
nb INTEGER. The number of columns to be reduced.
a REAL for slahr2
DOUBLE PRECISION for dlahr2
COMPLEX for clahr2
COMPLEX*16 for zlahr2.
Array, DIMENSION (lda, n-k+1) contains the n-by-(n-k+1)
general matrix A to be reduced.
lda INTEGER. The first dimension of the array a; lda ≥ max(1,
n).
ldt INTEGER. The first dimension of the output array t; ldt ≥
nb.
ldy INTEGER. The first dimension of the output array y; ldy ≥
n.
1508
Output Parameters
a On exit, the elements on and above the k-th subdiagonal
corresponding elements of the reduced matrix; the elements
below the k-th subdiagonal, with the array tau, represent
the matrix Q as a product of elementary reflectors. The other
columns of a are unchanged. See Application Notes below.
tau REAL for slahr2
COMPLEX for clahr2
t, y REAL for slahr2
COMPLEX for clahr2
Arrays, dimension t(ldt, nb), y(ldy, nb).
The array t contains upper triangular matrix T.
The array y contains the n-by-nb matrix Y .
Application Notes
For the elementary reflector H(i),
v(1:i+k-1) = 0, v(i+k) = 1; v(i+k+1:n) is stored on exit in a(i+k+1:n, i) and tau is
stored in tau(i).
the form:
A := (I - V*T*V') * (A - Y*V').
The contents of A on exit are illustrated by the following example with n = 7, k = 3 and nb
= 2:
1509
?laic1
Applies one step of incremental condition
estimation.
Syntax
call slaic1( job, j, x, sest, w, gamma, sestpr, s, c )
call dlaic1( job, j, x, sest, w, gamma, sestpr, s, c )
call claic1( job, j, x, sest, w, gamma, sestpr, s, c )
call zlaic1( job, j, x, sest, w, gamma, sestpr, s, c )
Description
for C interface.
The routine ?laic1 applies one step of incremental condition estimation in its simplest version.
Let x, ||x||2 = 1 (where ||a||2 denotes the 2-norm of a), be an approximate singular vector
of an j-by-j lower triangular matrix L, such that
||L*x||2 = sest
Then ?laic1 computes sestpr, s, c such that the vector
1510
is an approximate singular vector of
in the sense that

||Lhat*xhat||2 = sestpr.
Depending on job, an estimate for the largest or smallest singular value is computed.
Note that [s c]' and sestpr2 is an eigenpair of the system (for slaic1/claic)
where alpha = x'*w ;

or of the system (for claic1/zlaic)
where alpha = conjg(x)'*w.
Input Parameters
job INTEGER.
If job =1, an estimate for the largest singular value is
computed;
1511
If job =2, an estimate for the smallest singular value is

computed;
j INTEGER. Length of x and w.
x, w REAL for slaic1
DOUBLE PRECISION for dlaic1
COMPLEX for claic1
COMPLEX*16 for zlaic1.
Arrays, dimension (j) each. Contain vectors x and w,
respectively.
sest REAL for slaic1/claic1;
DOUBLE PRECISION for dlaic1/zlaic1.
Estimated singular value of j-by-j matrix L.
gamma REAL for slaic1
COMPLEX for claic1
The diagonal element gamma.
Output Parameters
sestpr REAL for slaic1/claic1;
DOUBLE PRECISION for dlaic1/zlaic1.
Estimated singular value of (j+1)-by-(j+1) matrix Lhat.
s, c REAL for slaic1
COMPLEX for claic1
Sine and cosine needed in forming xhat.
1512
?laln2
Solves a 1-by-1 or 2-by-2 linear system of
equations of the specified form.
Syntax
call slaln2( ltrans, na, nw, smin, ca, a, lda, d1, d2, b, ldb, wr, wi, x, ldx,
scale, xnorm, info )
call dlaln2( ltrans, na, nw, smin, ca, a, lda, d1, d2, b, ldb, wr, wi, x, ldx,
scale, xnorm, info )
Description
for C interface.
The routine solves a system of the form
(ca*A - w*D)*X = s*B, or (ca*A' - w*D)*X = s*B
with possible scaling (s) and perturbation of A (A' means A-transpose.)
A is an na-by-na real matrix, ca is a real scalar, D is an na-by-na real diagonal matrix, w is a

real or complex value, and X and B are na-by-1 matrices: real if w is real, complex if w is complex.
The parameter na may be 1 or 2.
If w is complex, X and B are represented as na-by-2 matrices, the first column of each being
the real part and the second being the imaginary part.
The routine computes the scaling factor s ( ≤ 1 ) so chosen that X can be computed without
overflow. X is further scaled if necessary to assure that norm(ca*A - w*D)*norm(X) is less
than overflow.
If both singular values of (ca&*A - w*D) are less than smin, smin*I (where I stands for
identity) will be used instead of (ca*A - w*D). If only one singular value is less than smin, one
element of (ca*A - w*D) will be perturbed enough to make the smallest singular value roughly
smin.
If both singular values are at least smin, (ca*A - w*D) will not be perturbed. In any case,
the perturbation will be at most some small multiple of max(smin, ulp*norm(ca*A - w*D)).
The singular values are computed by infinity-norm approximations, and thus will only be correct
to a factor of 2 or so.
1513
NOTE. All input quantities are assumed to be smaller than overflow by a reasonable
factor (see bignum).
Input Parameters
trans LOGICAL.
If trans = .TRUE., A- transpose will be used.
If trans = .FALSE., A will be used (not transposed.)
na INTEGER. The size of the matrix A, possible values 1 or 2.
nw INTEGER. This parameter must be 1 if w is real, and 2 if w
is complex. Possible values 1 or 2.
smin REAL for slaln2
DOUBLE PRECISION for dlaln2.
The desired lower bound on the singular values of A.
This should be a safe distance away from underflow or
overflow, for example, between
(underflow/machine_precision) and (machine_precision
* overflow). (See bignum and ulp).
ca REAL for slaln2
The coefficient by which A is multiplied.
a REAL for slaln2
Array, DIMENSION (lda,na).
The na-by-na matrix A.
lda INTEGER. The leading dimension of a. Must be at least na.
d1, d2 REAL for slaln2
The (1,1) and (2,2) elements in the diagonal matrix D,
respectively. d2 is not used if nw = 1.
b REAL for slaln2
Array, DIMENSION (ldb,nw). The na-by-nw matrix B
(right-hand side). If nw =2 (w is complex), column 1 contains
the real part of B and column 2 contains the imaginary part.
1514
ldb INTEGER. The leading dimension of b. Must be at least na.
wr, wi REAL for slaln2
The real and imaginary part of the scalar w, respectively.
wi is not used if nw = 1.
ldx INTEGER. The leading dimension of the output array x. Must
be at least na.
Output Parameters
x REAL for slaln2
Array, DIMENSION (ldx,nw). The na-by-nw matrix X
(unknowns), as computed by the routine. If nw = 2 (w is
complex), on exit, column 1 will contain the real part of X
and column 2 will contain the imaginary part.
scale REAL for slaln2
The scale factor that B must be multiplied by to insure that
overflow does not occur when computing X. Thus (ca*A -
w*D) X will be scale*B, not B (ignoring perturbations of
A.) It will be at most 1.
xnorm REAL for slaln2
The infinity-norm of X, when X is regarded as an na-by-nw
real matrix.
info INTEGER.
An error flag. It will be zero if no error occurs, a negative
number if an argument is in error, or a positive number if
(ca*A - w*D) had to be perturbed. The possible values
are:
If info = 0: no error occurred, and (ca*A - w*D) did not
have to be perturbed.
If info = 1: (ca*A - w*D) had to be perturbed to make
its smallest (or only) singular value greater than smin.
NOTE. For higher speed, this routine does not check the inputs for errors.
1515
?lals0
Applies back multiplying factors in solving the least
squares problem using divide and conquer SVD
approach. Used by ?gelsd.
Syntax
call slals0 ( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr,
givcol, ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, info )
call dlals0 ( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr,
givcol, ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, info )
call clals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr,
givcol, ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, rwork, info )
call zlals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr,
givcol, ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, rwork, info )
Description
for C interface.
The routine applies back the multiplying factors of either the left or right singular vector matrix
of a diagonal matrix appended by a row to the right hand side matrix B in solving the least
squares problem using the divide-and-conquer SVD approach.
For the left singular vector matrix, three types of orthogonal matrices are involved:
(1L) Givens rotations: the number of such rotations is givptr;the pairs of columns/rows they
were applied to are stored in givcol;and the c- and s-values of these rotations are stored in
givnum.
(2L) Permutation. The (nl+1)-st row of B is to be moved to the first row, and for j=2:n,
perm(j)-th row of B is to be moved to the j-th row.
(3L) The left singular vector matrix of the remaining matrix.

For the right singular vector matrix, four types of orthogonal matrices are involved:
(1R) The right singular vector matrix of the remaining matrix.
(2R) If sqre = 1, one extra Givens rotation to generate the right null space.
(3R) The inverse transformation of (2L).
1516
(4R) The inverse transformation of (1L).
Input Parameters
icompq INTEGER. Specifies whether singular vectors are to be
computed in factored form:
If icompq = 0: Left singular vector matrix.
If icompq = 1: Right singular vector matrix.
nl INTEGER. The row dimension of the upper block.
nl ≥ 1.
nr INTEGER. The row dimension of the lower block.
nr ≥ 1.
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
If sqre = 1: the lower block is an nr-by-(nr+1) rectangular
matrix. The bidiagonal matrix has row dimension n = nl
+ nr + 1, and column dimension m = n + sqre.
nrhs INTEGER. The number of columns of B and bx.
Must be at least 1.
b REAL for slals0
DOUBLE PRECISION for dlals0
COMPLEX for clals0
COMPLEX*16 for zlals0.
Array, DIMENSION ( ldb, nrhs ).
Contains the right hand sides of the least squares problem
in rows 1 through m.
ldb INTEGER. The leading dimension of b.
Must be at least max(1,max( m, n )).
bx REAL for slals0
COMPLEX for clals0
COMPLEX*16 for zlals0.
Workspace array, DIMENSION ( ldbx, nrhs ).
ldbx INTEGER. The leading dimension of bx.
perm INTEGER. Array, DIMENSION (n).
1517
The permutations (from deflation and sorting) applied to

the two blocks.
givptr INTEGER. The number of Givens rotations which took place
in this subproblem.
givcol INTEGER. Array, DIMENSION ( ldgcol, 2 ). Each pair of
numbers indicates a pair of rows/columns involved in a
Givens rotation.
ldgcol INTEGER. The leading dimension of givcol, must be at least
n.
givnum REAL for slals0/clals0
DOUBLE PRECISION for dlals0/zlals0
Array, DIMENSION ( ldgnum, 2 ). Each number indicates the
c or s value used in the corresponding Givens rotation.
ldgnum INTEGER. The leading dimension of arrays difr, poles and
givnum, must be at least k.
poles REAL for slals0/clals0
Array, DIMENSION ( ldgnum, 2 ). On entry, poles(1:k, 1)
contains the new singular values obtained from solving the
secular equation, and poles(1:k, 2) is an array containing
the poles in the secular equation.
difl REAL for slals0/clals0
Array, DIMENSION ( k ). On entry, difl(i) is the distance
between i-th updated (undeflated) singular value and the
i-th (undeflated) old singular value.
difr REAL for slals0/clals0
Array, DIMENSION ( ldgnum, 2 ). On entry, difr(i, 1)
contains the distances between i-th updated (undeflated)
singular value and the i+1-th (undeflated) old singular
value. And difr(i, 2) is the normalizing factor for the i-th
right singular vector.
z REAL for slals0/clals0
1518
Array, DIMENSION ( k ). Contains the components of the
deflation-adjusted updating row vector.
K INTEGER. Contains the dimension of the non-deflated matrix.
This is the order of the related secular equation. 1 ≤ k ≤
n.
c REAL for slals0/clals0
Contains garbage if sqre =0 and the c value of a Givens
rotation related to the right null space if sqre = 1.
s REAL for slals0/clals0
Contains garbage if sqre =0 and the s value of a Givens
work REAL for slals0
Workspace array, DIMENSION ( k ). Used with real flavors
only.
rwork REAL for clals0
DOUBLE PRECISION for zlals0
Workspace array, DIMENSION (k*(1+nrhs) + 2*nrhs). Used
with complex flavors only.
Output Parameters
b On exit, contains the solution X in rows 1 through n.
info INTEGER.
If info = 0: successful exit.
If info = -i < 0, the i-th argument had an illegal value.
1519
?lalsa
Computes the SVD of the coefficient matrix in
compact form. Used by ?gelsd.
Syntax
call slalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl,
difr, z, poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info
)
call dlalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl,
difr, z, poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info
)
call clalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl,
difr, z, poles, givptr, givcol, ldgcol, perm, givnum, c, s, rwork, iwork,
info )
call zlalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl,
difr, z, poles, givptr, givcol, ldgcol, perm, givnum, c, s, rwork, iwork,
info )
Description
for C interface.
The routine is an intermediate step in solving the least squares problem by computing the SVD
of the coefficient matrix in compact form. The singular vectors are computed as products of
simple orthogonal matrices.
If icompq = 0, ?lalsa applies the inverse of the left singular vector matrix of an upper
bidiagonal matrix to the right hand side; and if icompq = 1, the routine applies the right
singular vector matrix to the right hand side. The singular vector matrices were generated in
the compact form by ?lalsa.
Input Parameters
icompq INTEGER. Specifies whether the left or the right singular
vector matrix is involved. If icompq = 0: left singular vector
matrix is used
If icompq = 1: right singular vector matrix is used.
1520
smlsiz INTEGER. The maximum size of the subproblems at the
bottom of the computation tree.
n INTEGER. The row and column dimensions of the upper
bidiagonal matrix.
nrhs INTEGER. The number of columns of b and bx. Must be at
least 1.
b REAL for slalsa
DOUBLE PRECISION for dlalsa
COMPLEX for clalsa
COMPLEX*16 for zlalsa
Array, DIMENSION (ldb, nrhs). Contains the right hand sides
of the least squares problem in rows 1 through m.
ldb INTEGER. The leading dimension of b in the calling
subprogram. Must be at least max(1,max( m, n )).
ldbx INTEGER. The leading dimension of the output array bx.
u REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, smlsiz). On entry, u contains the
left singular vector matrices of all subproblems at the bottom
level.
ldu INTEGER, ldu ≥ n. The leading dimension of arrays u, vt,
difl, difr, poles, givnum, and z.
vt REAL for slalsa/clalsa
Array, DIMENSION (ldu, smlsiz +1). On entry, contains
the right singular vector matrices of all subproblems at the
bottom level.
k INTEGER array, DIMENSION ( n ).
difl REAL for slalsa/clalsa
Array, DIMENSION (ldu, nlvl), where nlvl = int(log2(n
/(smlsiz+1))) + 1.
difr REAL for slalsa/clalsa
1521
Array, DIMENSION (ldu, 2*nlvl). On entry, difl(*, i)

and difr(*, 2i -1) record distances between singular values
on the i-th level and singular values on the (i -1)-th level,
and difr(*, 2i) record the normalizing factors of the right
singular vectors matrices of subproblems on i-th level.
z REAL for slalsa/clalsa
Array, DIMENSION (ldu, nlvl . On entry, z(1, i) contains
the components of the deflation- adjusted updating the row
vector for subproblems on the i-th level.
poles REAL for slalsa/clalsa
Array, DIMENSION (ldu, 2*nlvl).
On entry, poles(*, 2i-1: 2i) contains the new and old
singular values involved in the secular equations on the i-th
level.
givptr INTEGER. Array, DIMENSION ( n ).
On entry, givptr( i ) records the number of Givens
rotations performed on the i-th problem on the computation
tree.
givcol INTEGER. Array, DIMENSION ( ldgcol, 2*nlvl ). On entry,
for each i, givcol(*, 2i-1: 2i) records the locations of
Givens rotations performed on the i-th level on the
computation tree.
ldgcol INTEGER, ldgcol ≥ n. The leading dimension of arrays
givcol and perm.
perm INTEGER. Array, DIMENSION ( ldgcol, nlvl ). On entry,
perm(*, i) records permutations done on the i-th level of
the computation tree.
givnum REAL for slalsa/clalsa
Array, DIMENSION (ldu, 2*nlvl). On entry, givnum(*, 2i-1
: 2i) records the c and s values of Givens rotations
performed on the i-th level on the computation tree.
c REAL for slalsa/clalsa
1522
Array, DIMENSION ( n ). On entry, if the i-th subproblem
is not square, c( i ) contains the c value of a Givens rotation
related to the right null space of the i-th subproblem.
s REAL for slalsa/clalsa
Array, DIMENSION ( n ). On entry, if the i-th subproblem
is not square, s( i ) contains the s-value of a Givens rotation
related to the right null space of the i-th subproblem.
work REAL for slalsa
Workspace array, DIMENSION at least (n). Used with real
flavors only.
rwork REAL for clalsa
DOUBLE PRECISION for zlalsa
Workspace array, DIMENSION at least max(n,
(smlsz+1)*nrhs*3). Used with complex flavors only.
iwork INTEGER.
Workspace array, DIMENSION at least (3n).
Output Parameters
b On exit, contains the solution X in rows 1 through n.
bx REAL for slalsa
COMPLEX for clalsa
COMPLEX*16 for zlalsa
Array, DIMENSION (ldbx, nrhs). On exit, the result of
applying the left or right singular vector matrix to b.
info INTEGER. If info = 0: successful exit
1523
?lalsd
Uses the singular value decomposition of A to solve
the least squares problem.
Syntax
call slalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, iwork,
info )
call dlalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, iwork,
info )
call clalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, rwork,
iwork, info )
call zlalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, rwork,
iwork, info )
Description
for C interface.
The routine uses the singular value decomposition of A to solve the least squares problem of
finding X to minimize the Euclidean norm of each column of A*X-B, where A is n-by-n upper
bidiagonal, and X and B are n-by-nrhs. The solution X overwrites B.
The singular values of A smaller than rcond times the largest singular value are treated as zero
in solving the least squares problem; in this case a minimum norm solution is returned. The
actual singular values are returned in d in ascending order.
subtract like the Cray XMP, Cray YMP, Cray C 90, or Cray 2.
It could conceivably fail on hexadecimal or decimal machines without guard digits, but we know
of none.
Input Parameters
uplo CHARACTER*1.
If uplo = 'U', d and e define an upper bidiagonal matrix.
If uplo = 'L', d and e define a lower bidiagonal matrix.
1524
smlsiz INTEGER. The maximum size of the subproblems at the
bottom of the computation tree.
n INTEGER. The dimension of the bidiagonal matrix.
n ≥ 0.
nrhs INTEGER. The number of columns of B. Must be at least 1.
d REAL for slalsd/clalsd
DOUBLE PRECISION for dlalsd/zlalsd
Array, DIMENSION (n). On entry, d contains the main
diagonal of the bidiagonal matrix.
e REAL for slalsd/clalsd
Array, DIMENSION (n-1). Contains the super-diagonal entries
of the bidiagonal matrix. On exit, e is destroyed.
b REAL for slalsd
DOUBLE PRECISION for dlalsd
COMPLEX for clalsd
COMPLEX*16 for zlalsd
Array, DIMENSION (ldb,nrhs).
On input, b contains the right hand sides of the least squares
problem. On output, b contains the solution X.
ldb INTEGER. The leading dimension of b in the calling
subprogram. Must be at least max(1,n).
rcond REAL for slalsd/clalsd
The singular values of A less than or equal to rcond times
the largest singular value are treated as zero in solving the
least squares problem. If rcond is negative, machine
precision is used instead. For example, for the least squares
problem diag(S)*X=B, where diag(S) is a diagonal matrix
of singular values, the solution is X(i)=B(i)/S(i) if S(i)
is greater than rcond *max(S), and X(i)=0 if S(i) is less
than or equal to rcond *max(S).
rank INTEGER. The number of singular values of A greater than
rcond times the largest singular value.
work REAL for slalsd
DOUBLE PRECISION for dlalsd
1525
COMPLEX for clalsd

COMPLEX*16 for zlalsd
Workspace array.
DIMENSION for real flavors at least
(9n+2n*smlsiz+8n*nlvl+n*nrhs+(smlsiz+1)2),
where
nlvl = max(0, int(log2(n/(smlsiz+1))) + 1).
DIMENSION for complex flavors is (n*nrhs).
rwork REAL for clalsd
DOUBLE PRECISION for zlalsd
Workspace array, used with complex flavors only.
DIMENSION at least (9n + 2n*smlsiz + 8n*nlvl +
3*mlsiz*nrhs + (smlsiz+1)2),
where
nlvl = max(0, int(log2(min(m,n)/(smlsiz+1))) +
1).
iwork INTEGER.
Workspace array of DIMENSION (3n*nlvl + 11n).
Output Parameters
d On exit, if info = 0, d contains singular values of the
bidiagonal matrix.
e On exit, destroyed.
b On exit, b contains the solution X.
info INTEGER.
If info > 0: The algorithm failed to compute a singular
value while working on the submatrix lying in rows and
columns info/(n+1) through mod(info,n+1).
1526
?lamrg
Creates a permutation list to merge the entries of
two independently sorted sets into a single set
sorted in acsending order.
Syntax
call slamrg( n1, n2, a, strd1, strd2, index )
call dlamrg( n1, n2, a, strd1, strd2, index )
Description
for C interface.
The routine creates a permutation list which will merge the elements of a (which is composed
of two independently sorted sets) into a single set which is sorted in ascending order.
Input Parameters
n1, n2 INTEGER. These arguments contain the respective lengths
of the two sorted lists to be merged.
a REAL for slamrg
DOUBLE PRECISION for dlamrg.
Array, DIMENSION (n1+n2).
The first n1 elements of a contain a list of numbers which
are sorted in either ascending or descending order. Likewise
for the final n2 elements.
strd1, strd2 INTEGER.
These are the strides to be taken through the array a.
Allowable strides are 1 and -1. They indicate whether a
subset of a is sorted in ascending (strdx = 1) or
descending (strdx = -1) order.
Output Parameters
index INTEGER. Array, DIMENSION (n1+n2).
On exit, this array will contain a permutation such that if
b(i) = a(index(i)) for i=1, n1+n2, then b will be sorted
in ascending order.
1527
?laneg
Computes the Sturm count, the number of negative
pivots encountered while factoring tridiagonal
T
T-sigma*I = L*D*L .
Syntax
value = slaneg( n, d, lld, sigma, pivmin, r )
value = dlaneg( n, d, lld, sigma, pivmin, r )
Description
for C interface.
The routine computes the Sturm count, the number of negative pivots encountered while
T
factoring tridiagonal T-sigma*I = L*D*L . This implementation works directly on the factors
without forming the tridiagonal matrix T. The Sturm count is also the number of eigenvalues
of T less than sigma. This routine is called from ?larb. The current routine does not use the
pivmin parameter but rather requires IEEE-754 propagation of infinities and NaNs (NaN stands
for 'Not A Number'). This routine also has no input range restrictions but does require default
exception handling such that x/0 produces Inf when x is non-zero, and Inf/Inf produces
NaN. (For more information see [Marques06]).
Input Parameters
n INTEGER. The order of the matrix.
d REAL for slaneg
DOUBLE PRECISION for dlaneg
Contains n diagonal elements of the matrix D.
lld REAL for slaneg
Contains (n-1) elements L(i)*L(i)*D(i).
sigma REAL for slaneg
Shift amount in T-sigma*I = L*D*L**T.
pivmin REAL for slaneg
1528
The minimum pivot in the Sturm sequence. May be used
when zero pivots are encountered on non-IEEE-754
architectures.
r INTEGER.
The twist index for the twisted factorization that is used for
the negcount.
Output Parameters
value INTEGER. The number of negative pivots encountered while
factoring.
?langb
Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of general band matrix.
Syntax
val = slangb( norm, n, kl, ku, ab, ldab, work )
val = dlangb( norm, n, kl, ku, ab, ldab, work )
val = clangb( norm, n, kl, ku, ab, ldab, work )
val = zlangb( norm, n, kl, ku, ab, ldab, work )
Description
for C interface.
The function returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or
the element of largest absolute value of an n-by-n band matrix A, with kl sub-diagonals and
ku super-diagonals.
Input Parameters
norm CHARACTER*1. Specifies the value to be returned by the
routine:
1529
= 'M' or 'm': val = max(abs(Aij)), largest absolute

value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the
matrix A (maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix
A (maximum row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm
of the matrix A (square root of sum of squares).
n INTEGER. The order of the matrix A. n ≥ 0. When n = 0,
?langb is set to zero.
kl INTEGER. The number of sub-diagonals of the matrix A. kl
≥ 0.
ku INTEGER. The number of super-diagonals of the matrix A.
ku ≥ 0.
ab REAL for slangb
DOUBLE PRECISION for dlangb
COMPLEX for clangb
COMPLEX*16 for zlangb
Array, DIMENSION (ldab,n).
The band matrix A, stored in rows 1 to kl+ku+1. The j-th
column of A is stored in the j-th column of the array ab as
follows:
ab(ku+1+i-j,j) = a(i,j)
for max(1,j-ku) ≤ i ≤ min(n,j+kl).
ldab INTEGER. The leading dimension of the array ab.
ldab ≥ kl+ku+1.
work REAL for slangb/clangb
DOUBLE PRECISION for dlangb/zlangb
Workspace array, DIMENSION (max(1,lwork)), where
lwork ≥ n when norm = 'I'; otherwise, work is not
referenced.
Output Parameters
val REAL for slangb/clangb
1530
DOUBLE PRECISION for dlangb/zlangb
Value returned by the function.
?lange
element of a general rectangular matrix.
Syntax
val = slange( norm, m, n, a, lda, work )
val = dlange( norm, m, n, a, lda, work )
val = clange( norm, m, n, a, lda, work )
val = zlange( norm, m, n, a, lda, work )
Description
for C interface.
The function ?lange returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a real/complex matrix A.
Input Parameters
routine:
m INTEGER. The number of rows of the matrix A.
m ≥ 0. When m = 0, ?lange is set to zero.
n INTEGER. The number of columns of the matrix A.
1531
n ≥ 0. When n = 0, ?lange is set to zero.

a REAL for slange
DOUBLE PRECISION for dlange
COMPLEX for clange
COMPLEX*16 for zlange
Array, DIMENSION (lda,n).
The m-by-n matrix A.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(m,1).
work REAL for slange and clange.
DOUBLE PRECISION for dlange and zlange.
Workspace array, DIMENSION max(1,lwork), where lwork
≥ m when norm = 'I'; otherwise, work is not referenced.
Output Parameters
val REAL for slange/clange
DOUBLE PRECISION for dlange/zlange
?langt
element of a general tridiagonal matrix.
Syntax
val = slangt( norm, n, dl, d, du )
val = dlangt( norm, n, dl, d, du )
val = clangt( norm, n, dl, d, du )
val = zlangt( norm, n, dl, d, du )
Description
for C interface.
1532
The routine returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or
the element of largest absolute value of a real/complex tridiagonal matrix A.
Input Parameters
routine:
= 'M' or 'm': val = max(abs(Aij)), largest absolute value
of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix
A (maximum column sum),
?langt is set to zero.
dl, d, du REAL for slangt
DOUBLE PRECISION for dlangt
COMPLEX for clangt
COMPLEX*16 for zlangt
Arrays: dl (n-1), d (n), du (n-1).
The array dl contains the (n-1) sub-diagonal elements of
A.
The array d contains the diagonal elements of A.
The array du contains the (n-1) super-diagonal elements of
A.
Output Parameters
val REAL for slangt/clangt
DOUBLE PRECISION for dlangt/zlangt
1533
?lanhs
element of an upper Hessenberg matrix.
Syntax
val = slanhs( norm, n, a, lda, work )
val = dlanhs( norm, n, a, lda, work )
val = clanhs( norm, n, a, lda, work )
val = zlanhs( norm, n, a, lda, work )
Description
for C interface.
The function ?lanhs returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a Hessenberg matrix A.
The value val returned by the function is:
val = max(abs(Aij)), if norm = 'M' or 'm'

= norm1(A), if norm = '1' or 'O' or 'o'
= normI(A), if norm = 'I' or 'i'
= normF(A), if norm = 'F', 'f', 'E' or 'e'
where norm1 denotes the 1-norm of a matrix (maximum column sum), normI denotes the
infinity norm of a matrix (maximum row sum) and normF denotes the Frobenius norm of a
matrix (square root of sum of squares). Note that max(abs(Aij)) is not a consistent matrix
norm.
Input Parameters
routine as described above.
n ≥ 0. When n = 0, ?lanhs is set to zero.
1534
a REAL for slanhs
DOUBLE PRECISION for dlanhs
COMPLEX for clanhs
COMPLEX*16 for zlanhs
Array, DIMENSION (lda,n). The n-by-n upper Hessenberg
matrix A; the part of A below the first sub-diagonal is not
referenced.
lda ≥ max(n,1).
work REAL for slanhs and clanhs.
DOUBLE PRECISION for dlange and zlange.
lwork ≥ n when norm = 'I'; otherwise, work is not
referenced.
Output Parameters
val REAL for slanhs/clanhs
DOUBLE PRECISION for dlanhs/zlanhs
?lansb
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of
largest absolute value of a symmetric band matrix.
Syntax
val = slansb( norm, uplo, n, k, ab, ldab, work )
val = dlansb( norm, uplo, n, k, ab, ldab, work )
val = clansb( norm, uplo, n, k, ab, ldab, work )
val = zlansb( norm, uplo, n, k, ab, ldab, work )
Description
for C interface.
1535
The function ?lansb returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of an n-by-n real/complex symmetric band
matrix A, with k super-diagonals.
Input Parameters
routine:
of the matrix A.
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the
band matrix A is supplied. If uplo = 'U': upper triangular
part is supplied; If uplo = 'L': lower triangular part is
supplied.
n INTEGER. The order of the matrix A. n ≥ 0.
When n = 0, ?lansb is set to zero.
k INTEGER. The number of super-diagonals or sub-diagonals
of the band matrix A. k ≥ 0.
ab REAL for slansb
DOUBLE PRECISION for dlansb
COMPLEX for clansb
COMPLEX*16 for zlansb
Array, DIMENSION (ldab,n).
The upper or lower triangle of the symmetric band matrix
A, stored in the first k+1 rows of ab. The j-th column of A
is stored in the j-th column of the array ab as follows:
if uplo = 'U', ab(k+1+i-j,j) = a(i,j)
for max(1,j-k) ≤ i≤ j;
if uplo = 'L', ab(1+i-j,j) = a(i,j) for
j≤i≤min(n,j+k).
1536
ldab ≥ k+1.
work REAL for slansb and clansb.
DOUBLE PRECISION for dlansb and zlansb.
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise,
work is not referenced.
Output Parameters
val REAL for slansb/clansb
DOUBLE PRECISION for dlansb/zlansb
?lanhb
largest absolute value of a Hermitian band matrix.
Syntax
val = clanhb( norm, uplo, n, k, ab, ldab, work )
val = zlanhb( norm, uplo, n, k, ab, ldab, work )
Description
for C interface.
the element of largest absolute value of an n-by-n Hermitian band matrix A, with k
super-diagonals.
Input Parameters
routine:
of the matrix A.
1537

uplo CHARACTER*1.
band matrix A is supplied.
If uplo = 'U': upper triangular part is supplied;
If uplo = 'L': lower triangular part is supplied.
?lanhb is set to zero.
k INTEGER. The number of super-diagonals or sub-diagonals
of the band matrix A.
k ≥ 0.
ab COMPLEX for clanhb.
COMPLEX*16 for zlanhb.
Array, DIMENSION (ldaB,n). The upper or lower triangle of
the Hermitian band matrix A, stored in the first k+1 rows
of ab. The j-th column of A is stored in the j-th column of
the array ab as follows:
if uplo = 'U', ab(k+1+i-j,j) = a(i,j)
for max(1,j-k) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = a(i,j) for j ≤ i ≤
min(n,j+k).
Note that the imaginary parts of the diagonal elements need
not be set and are assumed to be zero.
ldab INTEGER. The leading dimension of the array ab. ldab ≥
k+1.
work REAL for clanhb.
DOUBLE PRECISION for zlanhb.
Workspace array, DIMENSION max(1, lwork), where
1538
Output Parameters
val REAL for slanhb/clanhb
DOUBLE PRECISION for dlanhb/zlanhb
?lansp
largest absolute value of a symmetric matrix
supplied in packed form.
Syntax
val = slansp( norm, uplo, n, ap, work )
val = dlansp( norm, uplo, n, ap, work )
val = clansp( norm, uplo, n, ap, work )
val = zlansp( norm, uplo, n, ap, work )
Description
for C interface.
The function ?lansp returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a real/complex symmetric matrix A, supplied
in packed form.
Input Parameters
routine:
of the matrix A.
1539

uplo CHARACTER*1.
symmetric matrix A is supplied.
If uplo = 'U': Upper triangular part of A is supplied
If uplo = 'L': Lower triangular part of A is supplied.
n INTEGER. The order of the matrix A. n ≥ 0. When
n = 0, ?lansp is set to zero.
ap REAL for slansp
DOUBLE PRECISION for dlansp
COMPLEX for clansp
COMPLEX*16 for zlansp
Array, DIMENSION (n(n+1)/2).
The upper or lower triangle of the symmetric matrix A,
packed columnwise in a linear array. The j-th column of A
is stored in the array ap as follows:
if uplo = 'U', ap(i + (j-1)j/2) = A(i,j) for 1 ≤ i
≤ j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = A(i,j) for j
≤ i ≤ n.
work REAL for slansp and clansp.
DOUBLE PRECISION for dlansp and zlansp.
Output Parameters
val REAL for slansp/clansp
DOUBLE PRECISION for dlansp/zlansp
1540
?lanhp
largest absolute value of a complex Hermitian
matrix supplied in packed form.
Syntax
val = clanhp( norm, uplo, n, ap, work )
val = zlanhp( norm, uplo, n, ap, work )
Description
for C interface.
The function ?lanhp returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a complex Hermitian matrix A, supplied in
packed form.
Input Parameters
routine:
of the matrix A.
uplo CHARACTER*1.
Hermitian matrix A is supplied.
If uplo = 'U': Upper triangular part of A is supplied
If uplo = 'L': Lower triangular part of A is supplied.
n ≥ 0. When n = 0, ?lanhp is set to zero.
1541
ap COMPLEX for clanhp.

COMPLEX*16 for zlanhp.
Array, DIMENSION (n(n+1)/2). The upper or lower triangle
of the Hermitian matrix A, packed columnwise in a linear
array. The j-th column of A is stored in the array ap as
follows:
≤ j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = A(i,j) for j
≤ i ≤ n.
work REAL for clanhp.
DOUBLE PRECISION for zlanhp.
Output Parameters
val REAL for clanhp.
DOUBLE PRECISION for zlanhp.
?lanst/?lanht
largest absolute value of a real symmetric or
complex Hermitian tridiagonal matrix.
Syntax
val = slanst( norm, n, d, e )
val = dlanst( norm, n, d, e )
val = clanht( norm, n, d, e )
val = zlanht( norm, n, d, e )
1542
Description
for C interface.
The functions ?lanst/?lanht return the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a real symmetric or a complex Hermitian
tridiagonal matrix A.
Input Parameters
routine:
n ≥ 0. When n = 0, ?lanst/?lanht is set to zero.
d REAL for slanst/clanht
DOUBLE PRECISION for dlanst/zlanht
Array, DIMENSION (n). The diagonal elements of A.
e REAL for slanst
DOUBLE PRECISION for dlanst
COMPLEX for clanht
COMPLEX*16 for zlanht
The (n-1) sub-diagonal or super-diagonal elements of A.
Output Parameters
val REAL for slanst/clanht
DOUBLE PRECISION for dlanst/zlanht
1543
?lansy
largest absolute value of a real/complex symmetric
matrix.
Syntax
val = slansy( norm, uplo, n, a, lda, work )
val = dlansy( norm, uplo, n, a, lda, work )
val = clansy( norm, uplo, n, a, lda, work )
val = zlansy( norm, uplo, n, a, lda, work )
Description
for C interface.
The function ?lansy returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a real/complex symmetric matrix A.
Input Parameters
routine:
uplo CHARACTER*1.
symmetric matrix A is to be referenced.
= 'U': Upper triangular part of A is referenced.
= 'L': Lower triangular part of A is referenced
1544
?lansy is set to zero.
a REAL for slansy
DOUBLE PRECISION for dlansy
COMPLEX for clansy
COMPLEX*16 for zlansy
Array, DIMENSION (lda,n). The symmetric matrix A.
If uplo = 'U', the leading n-by-n upper triangular part of
a contains the upper triangular part of the matrix A, and
the strictly lower triangular part of a is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of
a contains the lower triangular part of the matrix A, and the
strictly upper triangular part of a is not referenced.
lda ≥ max(n,1).
work REAL for slansy and clansy.
DOUBLE PRECISION for dlansy and zlansy.
Output Parameters
val REAL for slansy/clansy
DOUBLE PRECISION for dlansy/zlansy
?lanhe
largest absolute value of a complex Hermitian
matrix.
Syntax
val = clanhe( norm, uplo, n, a, lda, work )
val = zlanhe( norm, uplo, n, a, lda, work )
1545
Description
for C interface.
The function ?lanhe returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a complex Hermitian matrix A.
Input Parameters
routine:
of the matrix A.
uplo CHARACTER*1.
Hermitian matrix A is to be referenced.
= 'U': Upper triangular part of A is referenced.
= 'L': Lower triangular part of A is referenced
?lanhe is set to zero.
a COMPLEX for clanhe.
COMPLEX*16 for zlanhe.
Array, DIMENSION (lda,n). The Hermitian matrix A.
lda ≥ max(n,1).
1546
work REAL for clanhe.
DOUBLE PRECISION for zlanhe.
Workspace array, DIMENSION (max(1,lwork)) , where
Output Parameters
val REAL for clanhe.
DOUBLE PRECISION for zlanhe.
?lantb
largest absolute value of a triangular band matrix.
Syntax
val = slantb( norm, uplo, diag, n, k, ab, ldab, work )
val = dlantb( norm, uplo, diag, n, k, ab, ldab, work )
val = clantb( norm, uplo, diag, n, k, ab, ldab, work )
val = zlantb( norm, uplo, diag, n, k, ab, ldab, work )
Description
for C interface.
The function ?lantb returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of an n-by-n triangular band matrix A, with ( k
+ 1 ) diagonals.
Input Parameters
routine:
1547

uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': Upper triangular
= 'L': Lower triangular.
diag CHARACTER*1.
Specifies whether or not the matrix A is unit triangular.
= 'N': Non-unit triangular
= 'U': Unit triangular.
?lantb is set to zero.
k INTEGER. The number of super-diagonals of the matrix A if
uplo = 'U', or the number of sub-diagonals of the matrix
A if uplo = 'L'. k ≥ 0.
ab REAL for slantb
DOUBLE PRECISION for dlantb
COMPLEX for clantb
COMPLEX*16 for zlantb
Array, DIMENSION (ldab,n). The upper or lower triangular
band matrix A, stored in the first k+1 rows of ab.
The j-th column of A is stored in the j-th column of the
array ab as follows:
if uplo = 'U', ab(k+1+i-j,j) = a(i,j) for max(1,j-k)
≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = a(i,j) for j≤ i≤
min(n,j+k).
Note that when diag = 'U', the elements of the array ab
corresponding to the diagonal elements of the matrix A are
not referenced, but are assumed to be one.
1548
ldab ≥ k+1.
work REAL for slantb and clantb.
DOUBLE PRECISION for dlantb and zlantb.
lwork ≥ n when norm = 'I' ; otherwise, work is not
referenced.
Output Parameters
val REAL for slantb/clantb.
DOUBLE PRECISION for dlantb/zlantb.
?lantp
largest absolute value of a triangular matrix
supplied in packed form.
Syntax
val = slantp( norm, uplo, diag, n, ap, work )
val = dlantp( norm, uplo, diag, n, ap, work )
val = clantp( norm, uplo, diag, n, ap, work )
val = zlantp( norm, uplo, diag, n, ap, work )
Description
for C interface.
The function ?lantp returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a triangular matrix A, supplied in packed
form.
Input Parameters
routine:
1549

uplo CHARACTER*1.
= 'L': Lower triangular.
diag CHARACTER*1.
= 'U': Unit triangular.
n ≥ 0. When n = 0, ?lantp is set to zero.
ap REAL for slantp
DOUBLE PRECISION for dlantp
COMPLEX for clantp
COMPLEX*16 for zlantp
The upper or lower triangular matrix A, packed columnwise
in a linear array. The j-th column of A is stored in the array
ap as follows:
if uplo = 'U', AP(i + (j-1)j/2) = a(i,j) for 1≤ i≤
j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = a(i,j) for
j≤ i≤ n.
Note that when diag = 'U', the elements of the array ap
corresponding to the diagonal elements of the matrix A are
not referenced, but are assumed to be one.
work REAL for slantp and clantp.
DOUBLE PRECISION for dlantp and zlantp.
1550
lwork ≥ n when norm = 'I' ; otherwise, work is not
referenced.
Output Parameters
val REAL for slantp/clantp.
DOUBLE PRECISION for dlantp/zlantp.
?lantr
largest absolute value of a trapezoidal or triangular
matrix.
Syntax
val = slantr( norm, uplo, diag, m, n, a, lda, work )
val = dlantr( norm, uplo, diag, m, n, a, lda, work )
val = clantr( norm, uplo, diag, m, n, a, lda, work )
val = zlantr( norm, uplo, diag, m, n, a, lda, work )
Description
for C interface.
The function ?lantr returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a trapezoidal or triangular matrix A.
Input Parameters
routine:
1551

uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower trapezoidal.
= 'U': Upper trapezoidal
= 'L': Lower trapezoidal.
Note that A is triangular instead of trapezoidal if m = n.
diag CHARACTER*1.
Specifies whether or not the matrix A has unit diagonal.
= 'N': Non-unit diagonal
= 'U': Unit diagonal.
m INTEGER. The number of rows of the matrix A. m ≥ 0, and
if uplo = 'U', m ≤ n.
When m = 0, ?lantr is set to zero.
n INTEGER. The number of columns of the matrix A. n ≥ 0,
and if uplo = 'L', n ≤ m.
When n = 0, ?lantr is set to zero.
a REAL for slantr
DOUBLE PRECISION for dlantr
COMPLEX for clantr
COMPLEX*16 for zlantr
The trapezoidal matrix A (A is triangular if m = n).
If uplo = 'U', the leading m-by-n upper trapezoidal part
of the array a contains the upper trapezoidal matrix, and
the strictly lower triangular part of A is not referenced.
If uplo = 'L', the leading m-by-n lower trapezoidal part
of the array a contains the lower trapezoidal matrix, and
the strictly upper triangular part of A is not referenced. Note
that when diag = 'U', the diagonal elements of A are not
referenced and are assumed to be one.
lda ≥ max(m,1).
1552
work REAL for slantr/clantrp.
DOUBLE PRECISION for dlantr/zlantr.
lwork ≥ m when norm = 'I' ; otherwise, work is not
referenced.
Output Parameters
val REAL for slantr/clantrp.
DOUBLE PRECISION for dlantr/zlantr.
?lanv2
Computes the Schur factorization of a real 2-by-2
nonsymmetric matrix in standard form.
Syntax
call slanv2( a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn )
call dlanv2( a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn )
Description
for C interface.
The routine computes the Schur factorization of a real 2-by-2 nonsymmetric matrix in standard
form:
where either
1. cc = 0 so that aa and dd are real eigenvalues of the matrix, or

2. aa = dd and bb*cc < 0, so that aa ± sqrt(bb*cc) are complex conjugate eigenvalues.
1553
The routine was adjusted to reduce the risk of cancellation errors, when computing real
eigenvalues, and to ensure, if possible, that abs(rt1r) ≥ abs(rt2r).
Input Parameters
a, b, c, d REAL for slanv2
DOUBLE PRECISION for dlanv2.
On entry, elements of the input matrix.
Output Parameters
a, b, c, d On exit, overwritten by the elements of the standardized
Schur form.
rt1r, rt1i, rt2r, rt2i REAL for slanv2
The real and imaginary parts of the eigenvalues.
If the eigenvalues are a complex conjugate pair, rt1i >
0.
cs, sn REAL for slanv2
Parameters of the rotation matrix.
?lapll
Measures the linear dependence of two vectors.
Syntax
call slapll( n, x, incx, Y, incy, ssmin )
call dlapll( n, x, incx, Y, incy, ssmin )
call clapll( n, x, incx, Y, incy, ssmin )
call zlapll( n, x, incx, Y, incy, ssmin )
Description
for C interface.
Given two column vectors x and y of length n, let
1554
A = (x y) be the n-by-2 matrix.
The routine ?lapll first computes the QR factorization of A as A = Q*R and then computes the
SVD of the 2-by-2 upper triangular matrix R. The smaller singular value of R is returned in
ssmin, which is used as the measurement of the linear dependency of the vectors x and y.
Input Parameters
n INTEGER. The length of the vectors x and y.
x REAL for slapll
DOUBLE PRECISION for dlapll
COMPLEX for clapll
COMPLEX*16 for zlapll
Array, DIMENSION (1+(n-1)incx).
On entry, x contains the n-vector x.
y REAL for slapll
DOUBLE PRECISION for dlapll
COMPLEX for clapll
COMPLEX*16 for zlapll
Array, DIMENSION (1+(n-1)incy).
On entry, y contains the n-vector y.
x; incx > 0.
y; incy > 0.
Output Parameters
x On exit, x is overwritten.
y On exit, y is overwritten.
ssmin REAL for slapll/clapll
DOUBLE PRECISION for dlapll/zlapll
The smallest singular value of the n-by-2 matrix A = (x y).
1555
?lapmt
Performs a forward or backward permutation of
the columns of a matrix.
Syntax
call slapmt( forwrd, m, n, x, ldx, k )
call dlapmt( forwrd, m, n, x, ldx, k )
call clapmt( forwrd, m, n, x, ldx, k )
call zlapmt( forwrd, m, n, x, ldx, k )
Description
for C interface.
The routine ?lapmt rearranges the columns of the m-by-n matrix X as specified by the
permutation k(1),k(2),...,k(n) of the integers 1,...,n.
If forwrd = .TRUE., forward permutation:
X(*,k(j)) is moved to X(*,j) for j=1,2,...,n.
If forwrd = .FALSE., backward permutation:
X(*,j) is moved to X(*,k(j)) for j = 1,2,...,n.
Input Parameters
forwrd LOGICAL.
If forwrd = .TRUE., forward permutation
If forwrd = .FALSE., backward permutation
m INTEGER. The number of rows of the matrix X. m ≥ 0.
n INTEGER. The number of columns of the matrix X. n ≥ 0.
x REAL for slapmt
DOUBLE PRECISION for dlapmt
COMPLEX for clapmt
COMPLEX*16 for zlapmt
Array, DIMENSION (ldx,n). On entry, the m-by-n matrix X.
1556
ldx INTEGER. The leading dimension of the array X, ldx ≥
max(1,m).
k INTEGER. Array, DIMENSION (n). On entry, k contains the
permutation vector and is used as internal workspace.
Output Parameters
x On exit, x contains the permuted matrix X.
k On exit, k is reset to its original value.
?lapy2
Returns sqrt(x2+y2).
Syntax
val = slapy2( x, y )
val = dlapy2( x, y )
Description
for C interface.
2 2
The function ?lapy2 returns sqrt(x +y ), avoiding unnecessary overflow or harmful underflow.
Input Parameters
x, y REAL for slapy2
DOUBLE PRECISION for dlapy2
Specify the input values x and y.
Output Parameters
val REAL for slapy2
DOUBLE PRECISION for dlapy2.
1557
?lapy3
Returns sqrt(x2+y2+z2).
Syntax
val = slapy3( x, y, z )
val = dlapy3( x, y, z )
Description
for C interface.
2 2 2
The function ?lapy3 returns sqrt(x +y +z ), avoiding unnecessary overflow or harmful
underflow.
Input Parameters
x, y, z REAL for slapy3
DOUBLE PRECISION for dlapy3
Specify the input values x, y and z.
Output Parameters
val REAL for slapy3
DOUBLE PRECISION for dlapy3.
?laqgb
Scales a general band matrix, using row and
column scaling factors computed by ?gbequ.
Syntax
call slaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call dlaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call claqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call zlaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
1558
Description
for C interface.
The routine equilibrates a general m-by-n band matrix A with kl subdiagonals and ku
superdiagonals using the row and column scaling factors in the vectors r and c.
Input Parameters
m INTEGER. The number of rows of the matrix A. m ≥ 0.
n INTEGER. The number of columns of the matrix A. n ≥ 0.
kl INTEGER. The number of subdiagonals within the band of
A. kl ≥ 0.
ku INTEGER. The number of superdiagonals within the band of
A. ku ≥ 0.
ab REAL for slaqgb
DOUBLE PRECISION for dlaqgb
COMPLEX for claqgb
COMPLEX*16 for zlaqgb
Array, DIMENSION (ldab,n). On entry, the matrix A in band
storage, in rows 1 to kl+ku+1. The j-th column of A is
stored in the j-th column of the array ab as follows:
ab(ku+1+i-j,j) = A(i,j) for max(1,j-ku) ≤ i ≤
min(m,j+kl).
lda ≥ kl+ku+1.
amax REAL for slaqgb/claqgb
DOUBLE PRECISION for dlaqgb/zlaqgb
Absolute value of largest matrix entry.
r, c REAL for slaqgb/claqgb
Arrays r (m), c (n). Contain the row and column scale factors
for A, respectively.
rowcnd REAL for slaqgb/claqgb
1559
Ratio of the smallest r(i) to the largest r(i).

colcnd REAL for slaqgb/claqgb
Ratio of the smallest c(i) to the largest c(i).
Output Parameters
ab On exit, the equilibrated matrix, in the same storage format
as A.
See equed for the form of the equilibrated matrix.
equed CHARACTER*1.
Specifies the form of equilibration that was done.
If equed = 'N': No equilibration
If equed = 'R': Row equilibration, that is, A has been
If equed = 'C': Column equilibration, that is, A has been
If equed = 'B': Both row and column equilibration, that
is, A has been replaced by diag(r)*A*diag(c).
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following
meaning. thresh is a threshold value used to decide if row or column scaling should be done
based on the ratio of the row or column scaling factors. If rowcnd < thresh, row scaling is
done, and if colcnd < thresh, column scaling is done. large and small are threshold values
used to decide if row scaling should be done based on the absolute size of the largest matrix
element. If amax > large or amax < small, row scaling is done.
1560
?laqge
Scales a general rectangular matrix, using row and
column scaling factors computed by ?geequ.
Syntax
call slaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call dlaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call claqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call zlaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
Description
for C interface.
The routine equilibrates a general m-by-n matrix A using the row and column scaling factors in
the vectors r and c.
Input Parameters
m ≥ 0.
n ≥ 0.
a REAL for slaqge
DOUBLE PRECISION for dlaqge
COMPLEX for claqge
COMPLEX*16 for zlaqge
Array, DIMENSION (lda,n). On entry, the m-by-n matrix A.
lda ≥ max(m,1).
r REAL for slanqge/claqge
DOUBLE PRECISION for dlaqge/zlaqge
Array, DIMENSION (m). The row scale factors for A.
c REAL for slanqge/claqge
1561

Array, DIMENSION (n). The column scale factors for A.
rowcnd REAL for slanqge/claqge
Ratio of the smallest r(i) to the largest r(i).
colcnd REAL for slanqge/claqge
Ratio of the smallest c(i) to the largest c(i).
amax REAL for slanqge/claqge
Output Parameters
a On exit, the equilibrated matrix.
See equed for the form of the equilibrated matrix.
equed CHARACTER*1.
If equed = 'N': No equilibration
If equed = 'R': Row equilibration, that is, A has been
If equed = 'C': Column equilibration, that is, A has been
If equed = 'B': Both row and column equilibration, that
is, A has been replaced by diag(r)*A*diag(c).
Application Notes
meaning. thresh is a threshold value used to decide if row or column scaling should be done
based on the ratio of the row or column scaling factors. If rowcnd < thresh, row scaling is
done, and if colcnd < thresh, column scaling is done. large and small are threshold values
used to decide if row scaling should be done based on the absolute size of the largest matrix
element. If amax > large or amax < small, row scaling is done.
1562
?laqhb
Scales a Hermetian band matrix, using scaling
Syntax
call claqhb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call zlaqhb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
Description
for C interface.
The routine equilibrates a Hermetian band matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
band matrix A is stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
n ≥ 0.
kd INTEGER. The number of super-diagonals of the matrix A if
uplo = 'U', or the number of sub-diagonals if uplo =
'L'.
kd ≥ 0.
ab COMPLEX for claqhb
COMPLEX*16 for zlaqhb
Array, DIMENSION (ldab,n). On entry, the upper or lower
triangle of the band matrix A, stored in the first kd+1 rows
of the array. The j-th column of A is stored in the j-th
column of the array ab as follows:
if uplo = 'U', ab(kd+1+i-j,j) = A(i,j) for
max(1,j-kd) ≤ i ≤ j;
1563
if uplo = 'L', ab(1+i-j,j) = A(i,j) for j ≤ i ≤

min(n,j+kd).
ldab ≥ kd+1.
scond REAL for claqsb
DOUBLE PRECISION for zlaqsb
Ratio of the smallest s(i) to the largest s(i).
amax REAL for claqsb
Output Parameters
ab On exit, if info = 0, the triangular factor U or L from the
Cholesky factorization A = U'*U or A = L*L' of the band
matrix A, in the same storage format as A.
s REAL for claqsb
Array, DIMENSION (n). The scale factors for A.
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been
replaced by diag(s)*A*diag(s).
Application Notes
meaning. thresh is a threshold value used to decide if scaling should be based on the ratio of
the scaling factors. If scond < thresh, scaling is done.
The values large and small are threshold values used to decide if scaling should be done based
on the absolute size of the largest matrix element. If amax > large or amax < small, scaling
is done.
1564
?laqp2
Computes a QR factorization with column pivoting
of the matrix block.
Syntax
call slaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call dlaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call claqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call zlaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
Description
for C interface.
The routine computes a QR factorization with column pivoting of the block A(offset+1:m,1:n).
The block A(1:offset,1:n) is accordingly pivoted, but not factorized.
Input Parameters
offset INTEGER. The number of rows of the matrix A that must be
pivoted but no factorized. offset ≥ 0.
a REAL for slaqp2
DOUBLE PRECISION for dlaqp2
COMPLEX for claqp2
COMPLEX*16 for zlaqp2
Array, DIMENSION (lda,n). On entry, the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a. lda ≥
max(1,m).
jpvt INTEGER.
1565
On entry, if jpvt(i) ≠ 0, the i-th column of A is permuted

to the front of A*P (a leading column); if jpvt(i) = 0, the
i-th column of A is a free column.
vn1, vn2 REAL for slaqp2/claqp2
DOUBLE PRECISION for dlaqp2/zlaqp2
Arrays, DIMENSION (n) each. Contain the vectors with the
partial and exact column norms, respectively.
work REAL for slaqp2
COMPLEX for claqp2
COMPLEX*16 for zlaqp2 Workspace array, DIMENSION (n).
Output Parameters
a On exit, the upper triangle of block A(offset+1:m,1:n) is
the triangular factor obtained; the elements in block
A(offset+1:m,1:n) below the diagonal, together with the
array tau, represent the orthogonal matrix Q as a product
of elementary reflectors. Block A(1:offset,1:n) has been
accordingly pivoted, but not factorized.
jpvt On exit, if jpvt(i) = k, then the i-th column of A*P was
the k-th column of A.
tau REAL for slaqp2
COMPLEX for claqp2
COMPLEX*16 for zlaqp2
Array, DIMENSION (min(m,n)).
The scalar factors of the elementary reflectors.
vn1, vn2 Contain the vectors with the partial and exact column norms,
respectively.
1566
?laqps
Computes a step of QR factorization with column
pivoting of a real m-by-n matrix A by using BLAS
level 3.
Syntax
call slaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf
)
call dlaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf
)
call claqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf
)
call zlaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf
)
Description
for C interface.
The routine computes a step of QR factorization with column pivoting of a real m-by-n matrix A
by using BLAS level 3. The routine tries to factorize NB columns from A starting from the row
offset+1, and updates all of the matrix with BLAS level 3 routine ?gemm.
In some cases, due to catastrophic cancellations, ?laqps cannot factorize NB columns. Hence,
the actual number of factorized columns is returned in kb.
Block A(1:offset,1:n) is accordingly pivoted, but not factorized.
Input Parameters
offset INTEGER. The number of rows of A that have been factorized
in previous steps.
nb INTEGER. The number of columns to factorize.
a REAL for slaqps
1567
DOUBLE PRECISION for dlaqps

COMPLEX for claqps
COMPLEX*16 for zlaqps
On entry, the m-by-n matrix A.
lda ≥ max(1,m).
jpvt INTEGER. Array, DIMENSION (n).
If jpvt(I) = k then column k of the full matrix A has been
permuted into position i in AP.
vn1, vn2 REAL for slaqps/claqps
DOUBLE PRECISION for dlaqps/zlaqps
Arrays, DIMENSION (n) each. Contain the vectors with the
partial and exact column norms, respectively.
auxv REAL for slaqps
COMPLEX for claqps
Array, DIMENSION (nb). Auxiliary vector.
f REAL for slaqps
COMPLEX for claqps
Array, DIMENSION (ldf,nb). Matrix F' = L*Y'*A.
ldf INTEGER. The leading dimension of the array f.
ldf ≥ max(1,n).
Output Parameters
kb INTEGER. The number of columns actually factorized.
a On exit, block A(offset+1:m,1:kb) is the triangular factor
obtained and block A(1:offset,1:n) has been accordingly
pivoted, but no factorized. The rest of the matrix, block
A(offset+1:m,kb+1:n) has been updated.
1568
jpvt INTEGER array, DIMENSION (n). If jpvt(I) = k then
column k of the full matrix A has been permuted into position
i in AP.
tau REAL for slaqps
COMPLEX for claqps
Array, DIMENSION (kb). The scalar factors of the elementary
reflectors.
vn1, vn2 The vectors with the partial and exact column norms,
respectively.
auxv Auxiliary vector.
f Matrix F' = L*Y'*A.
?laqr0
Computes the eigenvalues of a Hessenberg matrix,
and optionally the marixes from the Schur
decomposition.
Syntax
call slaqr0( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz,
work, lwork, info )
call dlaqr0( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz,
work, lwork, info )
call claqr0( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work,
lwork, info )
call zlaqr0( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work,
lwork, info )
Description
for C interface.
1569
The routine computes the eigenvalues of a Hessenberg matrix H, and, optionally, the matrices
T and Z from the Schur decomposition H=Z*T*ZH, where T is an upper quasi-triangular/triangular
matrix (the Schur form), and Z is the orthogonal/unitary matrix of Schur vectors.
Optionally Z may be postmultiplied into an input orthogonal/unitary matrix Q so that this routine
can give the Schur factorization of a matrix A which has been reduced to the Hessenberg form
H by the orthogonal/unitary matrix Q: A = Q*H*QH = (QZ)*H*(QZ)H.
Input Parameters
wantt LOGICAL.
If wantt = .FALSE., only eigenvalues are required.
wantz LOGICAL.
required;
n INTEGER. The order of the Hessenberg matrix H. (n ≥ 0).
ilo, ihi INTEGER.
It is assumed that H is already upper triangular in rows and
columns 1:ilo-1 and ihi+1:n, and if ilo > 1 then
H(ilo, ilo-1) = 0.
ilo and ihi are normally set by a previous call to cgebal,
and then passed to cgehrd when the matrix output by
cgebal is reduced to Hessenberg form. Otherwise, ilo and
ihi should be set to 1 and n, respectively.
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n.
If n=0, then ilo=1 and ihi=0
h REAL for slaqr0
DOUBLE PRECISION for dlaqr0
COMPLEX for claqr0
COMPLEX*16 for zlaqr0.
Array, DIMENSION (ldh, n), contains the upper Hessenberg
matrix H.
ldh INTEGER. The leading dimension of the array h. ldh ≥
max(1, n).
1570
iloz, ihiz INTEGER. Specify the rows of Z to which transformations
must be applied if wantz is .TRUE., 1 ≤ iloz ≤ ilo;
ihi ≤ ihiz ≤ n.
z REAL for slaqr0
COMPLEX for claqr0
Array, DIMENSION (ldz, ihi), contains the matrix Z if
wantz is .TRUE.. If wantz is .FALSE., z is not referenced.
ldz INTEGER. The leading dimension of the array z.
If wantz is .TRUE., then ldz ≥ max(1, ihiz). Otherwise,
ldz ≥ 1.
work REAL for slaqr0
COMPLEX for claqr0
Workspace array with dimension lwork.
lwork ≥ max(1,n) is sufficient, but for the optimal
performance a greater workspace may be required, typically
as large as 6*n.
It is recommended to use the workspace query to determine
the optimal workspace size. If lwork=-1,then the routine
performs a workspace query: it estimates the optimal
workspace size for the given values of the input parameters
n, ilo, and ihi. The estimate is returned in work(1). No
error messages related to the lwork is issued by xerbla.
Neither H nor Z are accessed.
Output Parameters
h If info=0 , and wantt is .TRUE., then h contains the upper
quasi-triangular/triangular matrix T from the Schur
decomposition (the Schur form).
If info=0 , and wantt is .FALSE., then the contents of h
are unspecified on exit.
1571
(The output values of h when info > 0 are given under

the description of the info parameter below.)
The routine may explicitly set h(i,j) for i>j and
j=1,2,...ilo-1 or j=ihi+1, ihi+2,...n.
work(1) On exit work(1) contains the minimum value of lwork
required for optimum performance.
w COMPLEX for claqr0
Arrays, DIMENSION(n). The computed eigenvalues of
h(ilo:ihi, ilo:ihi) are stored in w(ilo:ihi). If wantt is
.TRUE., then the eigenvalues are stored in the same order
as on the diagonal of the Schur form returned in h, with
w(i) = h(i,i).
wr, wi REAL for slaqr0
Arrays, DIMENSION(ihi) each. The real and imaginary parts,
respectively, of the computed eigenvalues of h(ilo:ihi,
ilo:ihi) are stored in the wr(ilo:ihi) and wi(ilo:ihi).
If two eigenvalues are computed as a complex conjugate
pair, they are stored in consecutive elements of wr and wi,
say the i-th and (i+1)-th, with wi(i)> 0 and wi(i+1) <
0. If wantt is .TRUE. , then the eigenvalues are stored in
the same order as on the diagonal of the Schur form
returned in h, with wr(i) = h(i,i), and if
h(i:i+1,i:i+1)is a 2-by-2 diagonal block, then
wi(i)=sqrt(-h(i+1,i)*h(i,i+1)).
z If wantz is .TRUE., then z(ilo:ihi, iloz:ihiz) is
replaced by z(ilo:ihi, iloz:ihiz)*U, where U is the
orthogonal/unitary Schur factor of h(ilo:ihi, ilo:ihi).
If wantz is .FALSE., z is not referenced.
(The output values of z when info > 0 are given under
info INTEGER.
= 0: the execution is successful.
1572
> 0: if info = i, then the routine failed to compute all the
eigenvalues. Elements 1:ilo-1 and i+1:n of wr and wi
contain those eigenvalues which have been successfully
computed.
> 0: if wantt is .FALSE., then the remaining unconverged
eigenvalues are the eigenvalues of the upper Hessenberg
matrix rows and columns ilo through info of the final
output value of h.
> 0: if wantt is .TRUE., then (initial value of h)*U =
U*(final value of h, where U is an orthogonal/unitary
matrix. The final value of h is upper Hessenberg and
quasi-triangular/triangular in rows and columns info+1
through ihi.
> 0: if wantz is .TRUE., then (final value of z(ilo:ihi,
iloz:ihiz))=(initial value of z(ilo:ihi, iloz:ihiz)*U,
where U is the orthogonal/unitary matrix in the previous
expression (regardless of the value of wantt).
> 0: if wantz is .FALSE., then z is not accessed.
?laqr1
Sets a scalar multiple of the first column of the
product of 2-by-2 or 3-by-3 matrix H and specified
shifts.
Syntax
call slaqr1( n, h, ldh, sr1, si1, sr2, si2, v )
call dlaqr1( n, h, ldh, sr1, si1, sr2, si2, v )
call claqr1( n, h, ldh, s1, s2, v )
call zlaqr1( n, h, ldh, s1, s2, v )
Description
for C interface.
Given a 2-by-2 or 3-by-3 matrix H, this routine sets v to a scalar multiple of the first column
of the product
1573
K = (H - s1*I)*(H - s2*I), or K = (H - (sr1 + i*si1)*I)*(H - (sr2 + i*si2)*I)

scaling to avoid overflows and most underflows.
It is assumed that either 1) sr1 = sr2 and si1 = -si2, or 2) si1 = si2 = 0.
This is useful for starting double implicit shift bulges in the QR algorithm.
Input Parameters
n INTEGER.
The order of the matrix H. n must be equal to 2 or 3.
sr1, si2, sr2, si2 REAL for slaqr1
Shift values that define K in the formula above.
s1, s2 COMPLEX for claqr1
Shift values that define K in the formula above.
h REAL for slaqr1
COMPLEX for claqr1
Array, DIMENSION (ldh, n), contains 2-by-2 or 3-by-3 matrix
H in the formula above.
ldh INTEGER.
The leading dimension of the array h just as declared in the
calling routine. ldh ≥ n.
Output Parameters
v REAL for slaqr1
COMPLEX for claqr1
Array with dimension (n).
A scalar multiple of the first column of the matrix K in the
formula above.
1574
?laqr2
Performs the orthogonal/unitary similarity
transformation of a Hessenberg matrix to detect
and deflate fully converged eigenvalues from a
trailing principal submatrix (aggressive early
deflation).
Syntax
call slaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
nd, sr, si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call dlaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
call claqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
nd, sh, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call zlaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
Description
for C interface.
The routine accepts as input an upper Hessenberg matrix H and performs an orthogonal/unitary
similarity transformation designed to detect and deflate fully converged eigenvalues from a
trailing principal submatrix. On output H has been overwritten by a new Hessenberg matrix
that is a perturbation of an orthogonal/unitary similarity transformation of H. It is to be hoped
that the final version of H has many zero subdiagonal entries.
This subroutine is identical to ?laqr3 except that it avoids recursion by calling ?lahqr instead
of ?laqr4.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., then the Hessenberg matrix H is fully
updated so that the quasi-triangular/triangular Schur factor
may be computed (in cooperation with the calling
subroutine).
1575
If wantt = .FALSE., then only enough of H is updated to

preserve the eigenvalues.
wantz LOGICAL.
If wantz = .TRUE., then the orthogonal/unitary matrix Z
is updated so that the orthogonal/unitary Schur factor may
be computed (in cooperation with the calling subroutine).
If wantz = .FALSE., then Z is not referenced.
n INTEGER. The order of the Hessenberg matrix H and (if
wantz = .TRUE.) the order of the orthogonal/unitary matrix
Z.
ktop INTEGER.
It is assumed that either ktop=1 or h(ktop,ktop-1)=0.
ktop and kbot together determine an isolated block along
the diagonal of the Hessenberg matrix.
kbot INTEGER.
It is assumed without a check that either kbot=n or
h(kbot+1,kbot)=0. ktop and kbot together determine an
isolated block along the diagonal of the Hessenberg matrix.
nw INTEGER.
Size of the deflation window. 1 ≤ nw ≤ (kbot-ktop+1).
h REAL for slaqr2
COMPLEX for claqr2
Array, DIMENSION (ldh, n), on input the initial n-by-n
section of h stores the Hessenberg matrix H undergoing
aggressive early deflation.
ldh INTEGER. The leading dimension of the array h just as
declared in the calling subroutine. ldh≥n.
must be applied if wantz is .TRUE.. 1 ≤ iloz ≤ ihiz ≤
n.
z REAL for slaqr2
COMPLEX for claqr2
1576
wantz is .TRUE.. If wantz is .FALSE., then z is not
referenced.
ldz INTEGER. The leading dimension of the array z just as
declared in the calling subroutine. ldz ≥ 1.
v REAL for slaqr2
COMPLEX for claqr2
Workspace array with dimension (ldv, nw). An nw-by-nw
work array.
ldv INTEGER. The leading dimension of the array v just as
declared in the calling subroutine. ldv ≥ nw.
nh INTEGER. The number of column of t. nh ≥ nw.
t REAL for slaqr2
COMPLEX for claqr2
Workspace array with dimension (ldt, nw).
ldt INTEGER. The leading dimension of the array t just as
declared in the calling subroutine. ldt≥nw.
nv INTEGER. The number of rows of work array wv available
for workspace. nv≥nw.
wv REAL for slaqr2
COMPLEX for claqr2
Workspace array with dimension (ldwv, nw).
ldwv INTEGER. The leading dimension of the array wv just as
declared in the calling subroutine. ldwv≥nw.
COMPLEX for claqr2
1577

lwork=2*nw) is sufficient, but for the optimal performance
a greater workspace may be required.
If lwork=-1,then the routine performs a workspace query:
it estimates the optimal workspace size for the given values
of the input parameters n, nw, ktop, and kbot. The estimate
is returned in work(1). No error messages related to the
lwork is issued by xerbla. Neither H nor Z are accessed.
Output Parameters
h On output h has been transformed by an orthogonal/unitary
similarity transformation, perturbed, and the returned to
Hessenberg form that (it is to be hoped) has some zero
subdiagonal entries.
work(1) On exit work(1) is set to an estimate of the optimal value
of lwork for the given values of the input parameters n,
nw, ktop, and kbot.
z If wantz is .TRUE., then the orthogonal/unitary similarity
transformation is accumulated into z(iloz:ihiz, ilo:ihi)
from the right.
If wantz is .FALSE., then z is unreferenced.
nd INTEGER. The number of converged eigenvalues uncovered
by the routine.
ns INTEGER. The number of unconverged, that is approximate
eigenvalues returned in sr, si or in sh that may be used
as shifts by the calling subroutine.
sh COMPLEX for claqr2
Arrays, DIMENSION (kbot).
The approximate eigenvalues that may be used for shifts
are stored in the sh(kbot-nd-ns+1)through the
sh(kbot-nd).
The converged eigenvalues are stored in the
sh(kbot-nd+1)through the sh(kbot).
1578
sr, si REAL for slaqr2
Arrays, DIMENSION (kbot) each.
The real and imaginary parts of the approximate eigenvalues
that may be used for shifts are stored in the
sr(kbot-nd-ns+1)through the sr(kbot-nd), and
si(kbot-nd-ns+1) through the si(kbot-nd), respectively.
The real and imaginary parts of converged eigenvalues are
stored in the sr(kbot-nd+1)through the sr(kbot), and
si(kbot-nd+1) through the si(kbot), respectively.
?laqr3
Performs the orthogonal/unitary similarity
transformation of a Hessenberg matrix to detect
and deflate fully converged eigenvalues from a
trailing principal submatrix (aggressive early
deflation).
Syntax
call slaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
call dlaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
call claqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
call zlaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns,
Description
for C interface.
The routine accepts as input an upper Hessenberg matrix H and performs an orthogonal/unitary
similarity transformation designed to detect and deflate fully converged eigenvalues from a
trailing principal submatrix. On output H has been overwritten by a new Hessenberg matrix
that is a perturbation of an orthogonal/unitary similarity transformation of H. It is to be hoped
that the final version of H has many zero subdiagonal entries.
1579
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., then the Hessenberg matrix H is fully
updated so that the quasi-triangular/triangular Schur factor
may be computed (in cooperation with the calling
subroutine).
If wantt = .FALSE., then only enough of H is updated to
preserve the eigenvalues.
wantz LOGICAL.
If wantz = .TRUE., then the orthogonal/unitary matrix Z
is updated so that the orthogonal/unitary Schur factor may
be computed (in cooperation with the calling subroutine).
If wantz = .FALSE., then Z is not referenced.
n INTEGER. The order of the Hessenberg matrix H and (if
wantz = .TRUE.) the order of the orthogonal/unitary matrix
Z.
ktop INTEGER.
It is assumed that either ktop=1 or h(ktop,ktop-1)=0.
ktop and kbot together determine an isolated block along
the diagonal of the Hessenberg matrix.
kbot INTEGER.
It is assumed without a check that either kbot=n or
h(kbot+1,kbot)=0. ktop and kbot together determine an
isolated block along the diagonal of the Hessenberg matrix.
nw INTEGER.
Size of the deflation window. 1≤nw≤(kbot-ktop+1).
h REAL for slaqr3
COMPLEX for claqr3
Array, DIMENSION (ldh, n), on input the initial n-by-n
section of h stores the Hessenberg matrix H undergoing
aggressive early deflation.
declared in the calling subroutine. ldh≥n.
1580
must be applied if wantz is .TRUE.. 1≤iloz≤ihiz≤n.
z REAL for slaqr3
COMPLEX for claqr3
referenced.
declared in the calling subroutine. ldz≥1.
v REAL for slaqr3
COMPLEX for claqr3
Workspace array with dimension (ldv, nw). An nw-by-nw
work array.
declared in the calling subroutine. ldv≥nw.
nh INTEGER. The number of column of t. nh≥nw.
t REAL for slaqr3
COMPLEX for claqr3
Workspace array with dimension (ldt, nw).
ldt INTEGER. The leading dimension of the array t just as
declared in the calling subroutine. ldt≥nw.
nv INTEGER. The number of rows of work array wv available
for workspace. nv≥nw.
wv REAL for slaqr3
COMPLEX for claqr3
Workspace array with dimension (ldwv, nw).
1581

declared in the calling subroutine. ldwv≥nw.
COMPLEX for claqr3
lwork=2*nw) is sufficient, but for the optimal performance
a greater workspace may be required.
If lwork=-1,then the routine performs a workspace query:
it estimates the optimal workspace size for the given values
of the input parameters n, nw, ktop, and kbot. The estimate
is returned in work(1). No error messages related to the
lwork is issued by xerbla. Neither H nor Z are accessed.
Output Parameters
h On output h has been transformed by an orthogonal/unitary
similarity transformation, perturbed, and the returned to
Hessenberg form that (it is to be hoped) has some zero
subdiagonal entries.
work(1) On exit work(1) is set to an estimate of the optimal value
of lwork for the given values of the input parameters n,
nw, ktop, and kbot.
z If wantz is .TRUE., then the orthogonal/unitary similarity
transformation is accumulated into z(iloz:ihiz, ilo:ihi)
from the right.
nd INTEGER. The number of converged eigenvalues uncovered
by the routine.
ns INTEGER. The number of unconverged, that is approximate
eigenvalues returned in sr, si or in sh that may be used
as shifts by the calling subroutine.
sh COMPLEX for claqr3
Arrays, DIMENSION (kbot).
1582
The approximate eigenvalues that may be used for shifts
are stored in the sh(kbot-nd-ns+1)through the
sh(kbot-nd).
The converged eigenvalues are stored in the
sh(kbot-nd+1)through the sh(kbot).
Arrays, DIMENSION (kbot) each.
The real and imaginary parts of the approximate eigenvalues
that may be used for shifts are stored in the
sr(kbot-nd-ns+1)through the sr(kbot-nd), and
si(kbot-nd-ns+1) through the si(kbot-nd), respectively.
The real and imaginary parts of converged eigenvalues are
stored in the sr(kbot-nd+1)through the sr(kbot), and
si(kbot-nd+1) through the si(kbot), respectively.
?laqr4
Computes the eigenvalues of a Hessenberg matrix,
and optionally the matrices from the Schur
decomposition.
Syntax
call slaqr4( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz,
work, lwork, info )
call dlaqr4( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz,
work, lwork, info )
call claqr4( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work,
lwork, info )
call zlaqr4( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work,
lwork, info )
Description
for C interface.
1583
The routine computes the eigenvalues of a Hessenberg matrix H, and, optionally, the matrices
T and Z from the Schur decomposition H=Z*T*ZH, where T is an upper quasi-triangular/triangular
matrix (the Schur form), and Z is the orthogonal/unitary matrix of Schur vectors.
Optionally Z may be postmultiplied into an input orthogonal/unitary matrix Q so that this routine
can give the Schur factorization of a matrix A which has been reduced to the Hessenberg form
H by the orthogonal/unitary matrix Q: A = Q*H*QH = (QZ)*H*(QZ)H.
This routine implements one level of recursion for ?laqr0. It is a complete implementation of
the small bulge multi-shift QR algorithm. It may be called by ?laqr0 and, for large enough
deflation window size, it may be called by ?laqr3. This routine is identical to ?laqr0 except
that it calls ?laqr2 instead of ?laqr3.
Input Parameters
wantt LOGICAL.
wantz LOGICAL.
required;
n INTEGER. The order of the Hessenberg matrix H. (n ≥ 0).
ilo, ihi INTEGER.
columns 1:ilo-1 and ihi+1:n, and if ilo > 1 then
h(ilo, ilo-1) = 0.
ilo and ihi are normally set by a previous call to cgebal,
and then passed to cgehrd when the matrix output by
cgebal is reduced to Hessenberg form. Otherwise, ilo and
ihi should be set to 1 and n, respectively.
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n.
If n=0, then ilo=1 and ihi=0
h REAL for slaqr4
COMPLEX for claqr4
1584
Array, DIMENSION (ldh, n), contains the upper Hessenberg
matrix H.
ldh INTEGER. The leading dimension of the array h. ldh ≥
max(1, n).
must be applied if wantz is .TRUE., 1 ≤ iloz ≤ ilo;
ihi ≤ ihiz ≤ n.
z REAL for slaqr4
COMPLEX for claqr4
wantz is .TRUE.. If wantz is .FALSE., z is not referenced.
ldz INTEGER. The leading dimension of the array z.
If wantz is .TRUE., then ldz ≥ max(1, ihiz). Otherwise,
ldz ≥ 1.
COMPLEX for claqr4
lwork ≥ max(1,n) is sufficient, but for the optimal
performance a greater workspace may be required, typically
as large as 6*n.
It is recommended to use the workspace query to determine
the optimal workspace size. If lwork=-1,then the routine
performs a workspace query: it estimates the optimal
workspace size for the given values of the input parameters
n, ilo, and ihi. The estimate is returned in work(1). No
error messages related to the lwork is issued by xerbla.
Neither H nor Z are accessed.
1585
Output Parameters
h If info=0 , and wantt is .TRUE., then h contains the upper
quasi-triangular/triangular matrix T from the Schur
decomposition (the Schur form).
If info=0 , and wantt is .FALSE., then the contents of h
are unspecified on exit.
(The output values of h when info > 0 are given under
The routines may explicitly set h(i,j) for i>j and
j=1,2,...ilo-1 or j=ihi+1, ihi+2,...n.
w COMPLEX for claqr4
Arrays, DIMENSION(n). The computed eigenvalues of
h(ilo:ihi, ilo:ihi) are stored in w(ilo:ihi). If wantt is
.TRUE., then the eigenvalues are stored in the same order
as on the diagonal of the Schur form returned in h, with
w(i) = h(i,i).
wr, wi REAL for slaqr4
Arrays, DIMENSION(ihi) each. The real and imaginary parts,
respectively, of the computed eigenvalues of h(ilo:ihi,
ilo:ihi) are stored in the wr(ilo:ihi) and wi(ilo:ihi).
If two eigenvalues are computed as a complex conjugate
pair, they are stored in consecutive elements of wr and wi,
say the i-th and (i+1)-th, with wi(i)> 0 and wi(i+1) <
0. If wantt is .TRUE. , then the eigenvalues are stored in
the same order as on the diagonal of the Schur form
returned in h, with wr(i) = h(i,i), and if
h(i:i+1,i:i+1)is a 2-by-2 diagonal block, then
wi(i)=sqrt(-h(i+1,i)*h(i,i+1)).
z If wantz is .TRUE., then z(ilo:ihi, iloz:ihiz) is
replaced by z(ilo:ihi, iloz:ihiz)*U, where U is the
orthogonal/unitary Schur factor of h(ilo:ihi, ilo:ihi).
If wantz is .FALSE., z is not referenced.
1586
(The output values of z when info > 0 are given under
info INTEGER.
> 0: if info = i, then the routine failed to compute all the
eigenvalues. Elements 1:ilo-1 and i+1:n of wr and wi
contain those eigenvalues which have been successfully
computed.
> 0: if wantt is .FALSE., then the remaining unconverged
eigenvalues are the eigenvalues of the upper Hessenberg
matrix rows and columns ilo through info of the final
output value of h.
> 0: if wantt is .TRUE., then (initial value of h)*U =
U*(final value of h, where U is an orthogonal/unitary
matrix. The final value of h is upper Hessenberg and
quasi-triangular/triangular in rows and columns info+1
through ihi.
> 0: if wantz is .TRUE., then (final value of z(ilo:ihi,
iloz:ihiz))=(initial value of z(ilo:ihi, iloz:ihiz)*U,
where U is the orthogonal/unitary matrix in the previous
expression (regardless of the value of wantt).
> 0: if wantz is .FALSE., then z is not accessed.
?laqr5
Performs a single small-bulge multi-shift QR sweep.
Syntax
call slaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, sr, si, h, ldh,
iloz, ihiz, z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call dlaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, sr, si, h, ldh,
iloz, ihiz, z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call claqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, s, h, ldh, iloz,
ihiz, z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call zlaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, s, h, ldh, iloz,
ihiz, z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
1587
Description
for C interface.
This auxiliary routine called by ?laqr0 performs a single small-bulge multi-shift QR sweep.
Input Parameters
wantt LOGICAL.
wantt = .TRUE. if the quasi-triangular/triangular Schur
factor is computed.
wantt is set to .FALSE. otherwise.
wantz LOGICAL.
wantz = .TRUE. if the orthogonal/unitary Schur factor is
computed.
wantz is set to .FALSE. otherwise.
kacc22 INTEGER. Possible values are 0, 1, or 2.
Specifies the computation mode of far-from-diagonal
orthogonal updates.
= 0: the routine does not accumulate reflections and does
not use matrix-matrix multiply to update far-from-diagonal
matrix entries.
= 1: the routine accumulates reflections and uses
matrix-matrix multiply to update the far-from-diagonal
matrix entries.
= 2: the routine accumulates reflections, uses matrix-matrix
multiply to update the far-from-diagonal matrix entries, and
takes advantage of 2-by-2 block structure during matrix
multiplies.
n INTEGER. The order of the Hessenberg matrix H upon which
the routine operates.
ktop, kbot INTEGER.
It is assumed without a check that either ktop=1 or
h(ktop,ktop-1)=0, and either kbot=n or
h(kbot+1,kbot)=0.
nshfts INTEGER.
Number of simultaneous shifts, must be positive and even.
1588
Arrays, DIMENSION (nshfts) each.
sr contains the real parts and si contains the
imaginary parts of the nshfts shifts of origin that
define the multi-shift QR sweep.
s COMPLEX for claqr5
Arrays, DIMENSION (nshfts).
s contains the shifts of origin that define the multi-shift
QR sweep.
h REAL for slaqr5
COMPLEX for claqr5
Array, DIMENSION (ldh, n), on input contains the
Hessenberg matrix.
declared in the calling routine. ldh ≥ max(1, n).
must be applied if wantz is .TRUE.. 1 ≤ iloz ≤ ihiz ≤
n.
z REAL for slaqr5
COMPLEX for claqr5
referenced.
declared in the calling routine. ldz ≥ n.
v REAL for slaqr5
COMPLEX for claqr5
Workspace array with dimension (ldv, nshfts/2).
1589

declared in the calling routine. ldv ≥ 3.
u REAL for slaqr5
COMPLEX for claqr5
Workspace array with dimension (ldu, 3*nshfts-3).
ldu INTEGER. The leading dimension of the array u just as
declared in the calling routine. ldu ≥ 3*nshfts-3.
nh INTEGER. The number of column in the array wh available
for workspace. nh ≥ 1.
wh REAL for slaqr5
COMPLEX for claqr5
Workspace array with dimension (ldwh, nh)
ldwh INTEGER. The leading dimension of the array wh just as
declared in the calling routine. ldwh ≥ 3*nshfts-3
nv INTEGER. The number of rows of the array wv available for
workspace. nv ≥ 1.
wv REAL for slaqr5
COMPLEX for claqr5
Workspace array with dimension (ldwv, 3*nshfts-3).
declared in the calling routine. ldwv ≥ nv.
Output Parameters
h On output a multi-shift QR Sweep with shifts
sr(j)+i*si(j) or s(j) is applied to the isolated diagonal
block in rows and columns ktop through kbot .
1590
z If wantz is .TRUE., then the QR Sweep orthogonal/unitary
similarity transformation is accumulated into z(iloz:ihiz,
ilo:ihi) from the right.
?laqsb
Scales a symmetric band matrix, using scaling
Syntax
call slaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call dlaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call claqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call zlaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
Description
for C interface.
The routine equilibrates a symmetric band matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
symmetric matrix A is stored.
n ≥ 0.
uplo = 'U', or the number of sub-diagonals if uplo =
'L'.
kd ≥ 0.
ab REAL for slaqsb
1591
DOUBLE PRECISION for dlaqsb

COMPLEX for claqsb
COMPLEX*16 for zlaqsb
Array, DIMENSION (ldab,n). On entry, the upper or lower
triangle of the symmetric band matrix A, stored in the first
kd+1 rows of the array. The j-th column of A is stored in
the j-th column of the array ab as follows:
if uplo = 'U', ab(kd+1+i-j,j) = A(i,j) for
max(1,j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = A(i,j) for j ≤ i ≤
min(n,j+kd).
ldab ≥ kd+1.
s REAL for slaqsb/claqsb
DOUBLE PRECISION for dlaqsb/zlaqsb
scond REAL for slaqsb/claqsb
amax REAL for slaqsb/claqsb
Output Parameters
ab On exit, if info = 0, the triangular factor U or L from the
Cholesky factorization A = U'*U or A = L*L' of the band
matrix A, in the same storage format as A.
equed CHARACTER*1.
1592
Application Notes
the scaling factors. If scond < thresh, scaling is done. large and small are threshold values
used to decide if scaling should be done based on the absolute size of the largest matrix element.
If amax > large or amax < small, scaling is done.
?laqsp
Scales a symmetric/Hermitian matrix in packed
storage, using scaling factors computed by ?ppequ.
Syntax
call slaqsp( uplo, n, ap, s, scond, amax, equed )
call dlaqsp( uplo, n, ap, s, scond, amax, equed )
call claqsp( uplo, n, ap, s, scond, amax, equed )
call zlaqsp( uplo, n, ap, s, scond, amax, equed )
Description
for C interface.
The routine ?laqsp equilibrates a symmetric matrix A using the scaling factors in the vector
s.
Input Parameters
uplo CHARACTER*1.
ap REAL for slaqsp
DOUBLE PRECISION for dlaqsp
COMPLEX for claqsp
1593
COMPLEX*16 for zlaqsp

On entry, the upper or lower triangle of the symmetric
matrix A, packed columnwise in a linear array. The j-th
column of A is stored in the array ap as follows:
≤ j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = A(i,j) for
j≤i≤n.
s REAL for slaqsp/claqsp
DOUBLE PRECISION for dlaqsp/zlaqsp
scond REAL for slaqsp/claqsp
amax REAL for slaqsp/claqsp
Output Parameters
ap On exit, the equilibrated matrix: diag(s)*A*diag(s), in
the same storage format as A.
equed CHARACTER*1.
Application Notes
1594
?laqsy
Scales a symmetric/Hermitian matrix, using scaling
factors computed by ?poequ.
Syntax
call slaqsy( uplo, n, a, lda, s, scond, amax, equed )
call dlaqsy( uplo, n, a, lda, s, scond, amax, equed )
call claqsy( uplo, n, a, lda, s, scond, amax, equed )
call zlaqsy( uplo, n, a, lda, s, scond, amax, equed )
Description
for C interface.
The routine equilibrates a symmetric matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
n ≥ 0.
a REAL for slaqsy
DOUBLE PRECISION for dlaqsy
COMPLEX for claqsy
COMPLEX*16 for zlaqsy
Array, DIMENSION (lda,n). On entry, the symmetric matrix
A.
1595

lda ≥ max(n,1).
s REAL for slaqsy/claqsy
DOUBLE PRECISION for dlaqsy/zlaqsy
scond REAL for slaqsy/claqsy
amax REAL for slaqsy/claqsy
Output Parameters
a On exit, if equed = 'Y', the equilibrated matrix:
diag(s)*A*diag(s).
equed CHARACTER*1.
If equed = 'Y': Equilibration was done, i.e., A has been
Application Notes
1596
?laqtr
Solves a real quasi-triangular system of equations,
or a complex quasi-triangular system of special
form, in real arithmetic.
Syntax
call slaqtr( ltran, lreal, n, t, ldt, b, w, scale, x, work, info )
call dlaqtr( ltran, lreal, n, t, ldt, b, w, scale, x, work, info )
Description
for C interface.
The routine ?laqtr solves the real quasi-triangular system
op(T) * p = scale*c, if lreal = .TRUE.
or the complex quasi-triangular systems
op(T + iB)*(p+iq) = scale*(c+id), if lreal = .FALSE.
in real arithmetic, where T is upper quasi-triangular.
If lreal = .FALSE., then the first diagonal block of T must be 1-by-1, B is the specially
structured matrix
op(A) = A or A', A' denotes the conjugate transpose of matrix A.

On input,
1597
This routine is designed for the condition number estimation in routine ?trsna.
Input Parameters
ltran LOGICAL.
On entry, ltran specifies the option of conjugate transpose:
= .FALSE., op(T + iB) = T + iB,
= .TRUE., op(T + iB) = (T + iB)'.
lreal LOGICAL.
On entry, lreal specifies the input matrix structure:
= .FALSE., the input is complex
= .TRUE., the input is real.
n INTEGER.
On entry, n specifies the order of T + iB. n ≥ 0.
t REAL for slaqtr
DOUBLE PRECISION for dlaqtr
Array, dimension (ldt,n). On entry, t contains a matrix in
Schur canonical form. If lreal = .FALSE., then the first
diagonal block of t must be 1-by-1.
ldt INTEGER. The leading dimension of the matrix T.
ldt ≥ max(1,n).
b REAL for slaqtr
Array, dimension (n). On entry, b contains the elements to
form the matrix B as described above. If lreal = .TRUE.,
b is not referenced.
w REAL for slaqtr
On entry, w is the diagonal element of the matrix B.
If lreal = .TRUE., w is not referenced.
1598
x REAL for slaqtr
Array, dimension (2n). On entry, x contains the right hand
side of the system.
work REAL for slaqtr
Output Parameters
scale REAL for slaqtr
On exit, scale is the scale factor.
x On exit, X is overwritten by the solution.
info INTEGER.
If info = 1: the some diagonal 1-by-1 block has been
perturbed by a small number smin to keep nonsingularity.
If info = 2: the some diagonal 2-by-2 block has been
perturbed by a small number in ?laln2 to keep
nonsingularity.
1599
?lar1v
Computes the (scaled) r-th column of the inverse
of the submatrix in rows b1 through bn of
tridiagonal matrix.
Syntax
call slar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc,
negcnt, ztz, mingma, r, isuppz, nrminv, resid, rqcorr, work )
call dlar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc,
call clar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc,
call zlar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc,
Description
for C interface.
The routine ?lar1v computes the (scaled) r-th column of the inverse of the submatrix in rows
b1 through bn of the tridiagonal matrix L*D*LT - λ*I. When λ is close to an eigenvalue, the
computed vector is an accurate eigenvector. Usually, r corresponds to the index where the
eigenvector is largest in magnitude.
The following steps accomplish this computation :
• Stationary qd transform, L*D*LT - λ*I = L(+)*D(+)*L(+)T
• Progressive qd transform, L*D*LT - λ*I = U(-)*D(-)*U(-)T,
• Computation of the diagonal elements of the inverse of L*D*LT - λ*I by combining the
above transforms, and choosing r as the index where the diagonal of the inverse is (one of
the) largest in magnitude.
• Computation of the (scaled) r-th column of the inverse using the twisted factorization
obtained by combining the top part of the stationary and the bottom part of the progressive
transform.
1600
Input Parameters
T
n INTEGER. The order of the matrix L*D*L .
T
b1 INTEGER. First index of the submatrix of L*D*L .
T
bn INTEGER. Last index of the submatrix of L*D*L .
lambda REAL for slar1v/clar1v
DOUBLE PRECISION for dlar1v/zlar1v
The shift. To compute an accurate eigenvector, lambda
T
should be a good approximation to an eigenvalue of L*D*L .
l REAL for slar1v/clar1v
The (n-1) subdiagonal elements of the unit bidiagonal matrix
L, in elements 1 to n-1.
d REAL for slar1v/clar1v
The n diagonal elements of the diagonal matrix D.
ld REAL for slar1v/clar1v
The n-1 elements Li*Di.
lld REAL for slar1v/clar1v
The n-1 elements Li*Li*Di.
pivmin REAL for slar1v/clar1v
The minimum pivot in the Sturm sequence.
gaptol REAL for slar1v/clar1v
Tolerance that indicates when eigenvector entries are
negligible with respect to their contribution to the residual.
z REAL for slar1v
DOUBLE PRECISION for dlar1v
COMPLEX for clar1v
1601
COMPLEX*16 for zlar1v

Array, DIMENSION (n). All entries of z must be set to 0.
wantnc LOGICAL.
Specifies whether negcnt has to be computed.
r INTEGER.
The twist index for the twisted factorization used to compute
z. On input, 0 ≤ r ≤ n. If r is input as 0, r is set to the index
where (L*D*LT - lambda*I)-1 is largest in magnitude. If
1 ≤ r ≤ n, r is unchanged.
work REAL for slar1v/clar1v
Output Parameters
z REAL for slar1v
COMPLEX for clar1v
Array, DIMENSION (n). The (scaled) r-th column of the
inverse. z(r) is returned to be 1.
negcnt INTEGER. If wantnc is .TRUE. then negcnt = the number
of pivots < pivmin in the matrix factorization L*D*LT, and
negcnt = -1 otherwise.
ztz REAL for slar1v/clar1v
The square of the 2-norm of z.
mingma REAL for slar1v/clar1v
The reciprocal of the largest (in magnitude) diagonal element
of the inverse of L*D*LT - lambda*I.
r On output, r is the twist index used to compute z. Ideally,
r designates the position of the maximum entry in the
eigenvector.
1602
isuppz INTEGER. Array, DIMENSION (2). The support of the vector
in Z, that is, the vector z is nonzero only in elements
isuppz(1) through isuppz(2).
nrminv REAL for slar1v/clar1v
Equals 1/sqrt( ztz ).
resid REAL for slar1v/clar1v
The residual of the FP vector.
resid = ABS( mingma )/sqrt( ztz ).
rqcorr REAL for slar1v/clar1v
The Rayleigh Quotient correction to lambda.
rqcorr = mingma/ztz.
?lar2v
Applies a vector of plane rotations with real cosines
and real/complex sines from both sides to a
sequence of 2-by-2 symmetric/Hermitian matrices.
Syntax
call slar2v( n, x, y, z, incx, c, s, incc )
call dlar2v( n, x, y, z, incx, c, s, incc )
call clar2v( n, x, y, z, incx, c, s, incc )
call zlar2v( n, x, y, z, incx, c, s, incc )
Description
for C interface.
The routine ?lar2v applies a vector of real/complex plane rotations with real cosines from both
sides to a sequence of 2-by-2 real symmetric or complex Hermitian matrices, defined by the
elements of the vectors x, y and z. For i = 1,2,...,n
1603
Input Parameters
n INTEGER. The number of plane rotations to be applied.
x, y, z REAL for slar2v
COMPLEX for clar2v
Arrays, DIMENSION (1+(n-1)*incx) each. Contain the
vectors x, y and z, respectively. For all flavors of ?lar2v,
elements of x and y are assumed to be real.
incx INTEGER. The increment between elements of x, y, and z.
incx > 0.
c REAL for slar2v/clar2v
Array, DIMENSION (1+(n-1)*incc). The cosines of the plane
rotations.
s REAL for slar2v
COMPLEX for clar2v
Array, DIMENSION (1+(n-1)*incc). The sines of the plane
rotations.
incc INTEGER. The increment between elements of c and s. incc
> 0.
Output Parameters
x, y, z Vectors x, y and z, containing the results of transform.
1604
?larf
Applies an elementary reflector to a general
rectangular matrix.
Syntax
call slarf( side, m, n, v, incv, tau, c, ldc, work )
call dlarf( side, m, n, v, incv, tau, c, ldc, work )
call clarf( side, m, n, v, incv, tau, c, ldc, work )
call zlarf( side, m, n, v, incv, tau, c, ldc, work )
Description
for C interface.
The routine applies a real/complex elementary reflector H to a real/complex m-by-n matrix C,
from either the left or the right. H is represented in the form
H = I - tau*v*v',
where tau is a real/complex scalar and v is a real/complex vector.
If tau = 0, then H is taken to be the unit matrix. For clarf/zlarf, to apply H' (the conjugate
transpose of H), supply conjg(tau) instead of tau.
Input Parameters
side CHARACTER*1.
If side = 'L': form H*C
If side = 'R': form C*H.
m INTEGER. The number of rows of the matrix C.
n INTEGER. The number of columns of the matrix C.
v REAL for slarf
DOUBLE PRECISION for dlarf
COMPLEX for clarf
COMPLEX*16 for zlarf
Array, DIMENSION
(1 + (m-1)*abs(incv)) if side = 'L' or
1605
(1 + (n-1)*abs(incv)) if side = 'R'. The vector v in

the representation of H. v is not used if tau = 0.
incv INTEGER. The increment between elements of v.
incv ≠ 0.
tau REAL for slarf
COMPLEX for clarf
The value tau in the representation of H.
c REAL for slarf
COMPLEX for clarf
On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c.
ldc ≥ max(1,m).
work REAL for slarf
COMPLEX for clarf
Workspace array, DIMENSION
(n) if side = 'L' or
(m) if side = 'R'.
Output Parameters
c On exit, C is overwritten by the matrix H*C if side = 'L',
or C*H if side = 'R'.
1606
?larfb
Applies a block reflector or its
transpose/conjugate-transpose to a general
rectangular matrix.
Syntax
call slarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc,
work, ldwork )
call dlarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc,
work, ldwork )
call clarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc,
work, ldwork )
call zlarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc,
work, ldwork )
Description
for C interface.
The routine ?larfb applies a complex block reflector H or its transpose H' to a complex m-by-n
matrix C from either left or right.
Input Parameters
side CHARACTER*1.
If side = 'L': apply H or H' from the left
If side = 'R': apply H or H' from the right
trans CHARACTER*1.
If trans = 'N': apply H (No transpose)
If trans = 'C': apply H' (Conjugate transpose)
direct CHARACTER*1.
Indicates how H is formed from a product of elementary
reflectors
If direct = 'F': H = H(1)*H(2)*. . . *H(k) (forward)
If direct = 'B': H = H(k)* . . . H(2)*H(1)
(backward)
1607
storev CHARACTER*1.
Indicates how the vectors which define the elementary
reflectors are stored:
If storev = 'C': Column-wise
If storev = 'R': Row-wise
k INTEGER. The order of the matrix T (equal to the number
of elementary reflectors whose product defines the block
reflector).
v REAL for slarfb
DOUBLE PRECISION for dlarfb
COMPLEX for clarfb
COMPLEX*16 for zlarfb
Array, DIMENSION
(ldv, k) if storev = 'C'
(ldv, m) if storev = 'R' and side = 'L'
(ldv, n) if storev = 'R' and side = 'R'
The matrix v.
ldv INTEGER. The leading dimension of the array v.
If storev = 'C' and side = 'L', ldv ≥ max(1,m);
if storev = 'C' and side = 'R', ldv ≥ max(1,n);
if storev = 'R', ldv ≥ k.
t REAL for slarfb
COMPLEX for clarfb
Array, DIMENSION (ldt,k).
Contains the triangular k-by-k matrix T in the representation
of the block reflector.
LDT INTEGER. The leading dimension of the array t.
ldt ≥ k.
c REAL for slarfb
COMPLEX for clarfb
1608
ldc ≥ max(1,m).
work REAL for slarfb
COMPLEX for clarfb
Workspace array, DIMENSION (ldwork, k).
ldwork INTEGER. The leading dimension of the array work.
If side = 'L', ldwork ≥ max(1, n);
if side = 'R', ldwork ≥ max(1, m).
Output Parameters
c On exit, c is overwritten by H*C, or H'*C, or C*H, or C*H'.
?larfg
Generates an elementary reflector (Householder
matrix).
Syntax
call slarfg( n, alpha, x, incx, tau )
call dlarfg( n, alpha, x, incx, tau )
call clarfg( n, alpha, x, incx, tau )
call zlarfg( n, alpha, x, incx, tau )
Description
for C interface.
The routine ?larfg generates a real/complex elementary reflector H of order n, such that
1609
where alpha and beta are scalars (with beta real for all flavors), and x is an (n-1)-element
real/complex vector. H is represented in the form
where tau is a real/complex scalar and v is a real/complex (n-1)-element vector. Note that for
clarfg/zlarfg, H is not Hermitian.
If the elements of x are all zero (and, for complex flavors, alpha is real), then tau = 0 and H
is taken to be the unit matrix.
Otherwise, 1 ≤ tau ≤ 2 (for real flavors), or
1 ≤ Re(tau) ≤ 2 and abs(tau-1) ≤ 1 (for complex flavors).
Input Parameters
n INTEGER. The order of the elementary reflector.
alpha REAL for slarfg
DOUBLE PRECISION for dlarfg
COMPLEX for clarfg
COMPLEX*16 for zlarfg On entry, the value alpha.
x REAL for slarfg
COMPLEX for clarfg
COMPLEX*16 for zlarfg
Array, DIMENSION (1+(n-2)*abs(incx)).
On entry, the vector x.
incx INTEGER.
The increment between elements of x. incx > 0.
1610
Output Parameters
alpha On exit, it is overwritten with the value beta.
x On exit, it is overwritten with the vector v.
tau REAL for slarfg
COMPLEX for clarfg
COMPLEX*16 for zlarfg The value tau.
?larft
Forms the triangular factor T of a block reflector H
H
= I - V*T*V .
Syntax
call slarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call dlarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call clarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call zlarft( direct, storev, n, k, v, ldv, tau, t, ldt )
Description
for C interface.
The routine ?larft forms the triangular factor T of a real/complex block reflector H of order
n, which is defined as a product of k elementary reflectors.
If direct = 'F', H = H(1)*H(2)* . . .*H(k) and T is upper triangular;
If direct = 'B', H = H(k)*. . .*H(2)*H(1) and T is lower triangular.
If storev = 'C', the vector which defines the elementary reflector H(i) is stored in the i-th
column of the array v, and H = I - V*T*V' .
If storev = 'R', the vector which defines the elementary reflector H(i) is stored in the i-th
row of the array v, and H = I - V'*T*V.
Input Parameters
direct CHARACTER*1.
1611
Specifies the order in which the elementary reflectors are

multiplied to form the block reflector:
= 'F': H = H(1)*H(2)*. . . *H(k) (forward)
= 'B': H = H(k)*. . .*H(2)*H(1) (backward)
storev CHARACTER*1.
Specifies how the vectors which define the elementary
reflectors are stored (see also Application Notes below):
= 'C': column-wise
= 'R': row-wise.
n INTEGER. The order of the block reflector H. n ≥ 0.
k INTEGER. The order of the triangular factor T (equal to the
number of elementary reflectors). k ≥ 1.
v REAL for slarft
DOUBLE PRECISION for dlarft
COMPLEX for clarft
COMPLEX*16 for zlarft
Array, DIMENSION
(ldv, k) if storev = 'C' or
(ldv, n) if storev = 'R'.
The matrix V.
If storev = 'C', ldv ≥ max(1,n);
tau REAL for slarft
COMPLEX for clarft
Array, DIMENSION (k). tau(i) must contain the scalar factor
of the elementary reflector H(i).
ldt INTEGER. The leading dimension of the output array t. ldt
≥ k.
Output Parameters
t REAL for slarft
1612
COMPLEX for clarft
Array, DIMENSION (ldt,k). The k-by-k triangular factor T
of the block reflector. If direct = 'F', T is upper
triangular; if direct = 'B', T is lower triangular. The rest
of the array is not used.
v The matrix V.
Application Notes
The shape of the matrix V and the storage of the vectors which define the H(i) is best illustrated
by the following example with n = 5 and k = 3. The elements equal to 1 are not stored; the
corresponding array elements are modified but restored on exit. The rest of the array is not
used.
1613
?larfx
rectangular matrix, with loop unrolling when the
reflector has order less than or equal to 10.
Syntax
call slarfx( side, m, n, v, tau, c, ldc, work )
call dlarfx( side, m, n, v, tau, c, ldc, work )
call clarfx( side, m, n, v, tau, c, ldc, work )
call zlarfx( side, m, n, v, tau, c, ldc, work )
Description
for C interface.
The routine ?larfx applies a real/complex elementary reflector H to a real/complex m-by-n
matrix C, from either the left or the right.
H is represented in the form
H = I - tau*v*v', where tau is a real/complex scalar and v is a real/complex vector.

If tau = 0, then H is taken to be the unit matrix
1614
Input Parameters
side CHARACTER*1.
If side = 'R': form C*H.
v REAL for slarfx
DOUBLE PRECISION for dlarfx
COMPLEX for clarfx
COMPLEX*16 for zlarfx
Array, DIMENSION
(m) if side = 'L' or
(n) if side = 'R'.
The vector v in the representation of H.
tau REAL for slarfx
COMPLEX for clarfx
c REAL for slarfx
COMPLEX for clarfx
Array, DIMENSION (ldc,n). On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. lda ≥ (1,m).
work REAL for slarfx
COMPLEX for clarfx
(m) if side = 'R'.
work is not referenced if H has order < 11.
1615
Output Parameters
?largv
Generates a vector of plane rotations with real
cosines and real/complex sines.
Syntax
call slargv( n, x, incx, y, incy, c, incc )
call dlargv( n, x, incx, y, incy, c, incc )
call clargv( n, x, incx, y, incy, c, incc )
call zlargv( n, x, incx, y, incy, c, incc )
Description
for C interface.
The routine generates a vector of real/complex plane rotations with real cosines, determined
by elements of the real/complex vectors x and y.
For slargv/dlargv:
For clargv/zlargv:
1616
where c(i)2 + abs(s(i))2 = 1 and the following conventions are used (these are the same
as in clartg/zlartg but differ from the BLAS Level 1 routine crotg/zrotg):
If yi = 0, then c(i) = 1 and s(i) = 0;
If xi = 0, then c(i) = 0 and s(i) is chosen so that ri is real.
Input Parameters
n INTEGER. The number of plane rotations to be generated.
x, y REAL for slargv
DOUBLE PRECISION for dlargv
COMPLEX for clargv
COMPLEX*16 for zlargv
Arrays, DIMENSION (1+(n-1)*incx) and (1+(n-1)*incy),
respectively. On entry, the vectors x and y.
incx INTEGER. The increment between elements of x.
incx > 0.
incy INTEGER. The increment between elements of y.
incy > 0.
incc INTEGER. The increment between elements of the output
array c. incc > 0.
Output Parameters
x On exit, x(i) is overwritten by ai (for real flavors), or by ri
(for complex flavors), for i = 1,...,n.
y On exit, the sines s(i) of the plane rotations.
c REAL for slargv/clargv
DOUBLE PRECISION for dlargv/zlargv
Array, DIMENSION (1+(n-1)*incc). The cosines of the plane
rotations.
1617
?larnv
Returns a vector of random numbers from a
uniform or normal distribution.
Syntax
call slarnv( idist, iseed, n, x )
call dlarnv( idist, iseed, n, x )
call clarnv( idist, iseed, n, x )
call zlarnv( idist, iseed, n, x )
Description
for C interface.
The routine ?larnv returns a vector of n random real/complex numbers from a uniform or
This routine calls the auxiliary routine ?laruv to generate random real numbers from a uniform
(0,1) distribution, in batches of up to 128 using vectorisable code. The Box-Muller method is
used to transform numbers from a uniform to a normal distribution.
Input Parameters
idist INTEGER. Specifies the distribution of the random numbers:
for slarnv and dlanrv:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1).
for clarnv and zlanrv:
= 1: real and imaginary parts each uniform (0,1)
= 2: real and imaginary parts each uniform (-1,1)
= 3: real and imaginary parts each normal (0,1)
= 4: uniformly distributed on the disc abs(z) < 1
= 5: uniformly distributed on the circle abs(z) = 1
iseed INTEGER. Array, DIMENSION (4).
On entry, the seed of the random number generator; the
array elements must be between 0 and 4095, and iseed(4)
must be odd.
1618
n INTEGER. The number of random numbers to be generated.
Output Parameters
x REAL for slarnv
DOUBLE PRECISION for dlarnv
COMPLEX for clarnv
COMPLEX*16 for zlarnv
Array, DIMENSION (n). The generated random numbers.
iseed On exit, the seed is updated.
?larra
Computes the splitting points with the specified
threshold.
Syntax
call slarra( n, d, e, e2, spltol, tnrm, nsplit, isplit, info )
call dlarra( n, d, e, e2, spltol, tnrm, nsplit, isplit, info )
Description
for C interface.
The routine computes the splitting points with the specified threshold and sets any "small"
off-diagonal elements to zero.
Input Parameters
n INTEGER. The order of the matrix (n > 1).
d REAL for slarra
DOUBLE PRECISION for dlarra
e REAL for slarra
1619
First (n-1) entries contain the subdiagonal elements of the

tridiagonal matrix T; e(n) need not be set.
e2 REAL for slarra
First (n-1) entries contain the squares of the subdiagonal
elements of the tridiagonal matrix T; e2(n) need not be
set.
spltol REAL for slarra
The threshold for splitting. Two criteria can be used:
spltol<0 : criterion based on absolute off-diagonal value;
spltol>0 : criterion that preserves relative accuracy.
tnrm REAL for slarra
The norm of the matrix.
Output Parameters
e On exit, the entries e(isplit(i)), 1 ≤ i ≤ nsplit, are
set to zero, the other entries of e are untouched.
e2 On exit, the entries e2(isplit(i)), 1 ≤ i ≤ nsplit,
are set to zero.
nsplit INTEGER.
The number of blocks the matrix T splits into. 1 ≤ nsplit
≤ n
isplit INTEGER.
The splitting points, at which T breaks up into blocks. The
first block consists of rows/columns 1 to isplit(1), the
second of rows/columns isplit(1)+1 through isplit(2),
and so on, and the nsplit-th consists of rows/columns
isplit(nsplit-1)+1 through isplit(nsplit)=n.
info INTEGER.
= 0: successful exit.
1620
?larrb
Provides limited bisection to locate eigenvalues for
more accuracy.
Syntax
call slarrb( n, d, lld, ifirst, ilast, rtol1, rtol2, offset, w, wgap, werr,
work, iwork, pivmin, spdiam, twist, info )
call dlarrb( n, d, lld, ifirst, ilast, rtol1, rtol2, offset, w, wgap, werr,
work, iwork, pivmin, spdiam, twist, info )
Description
for C interface.
T
Given the relatively robust representation (RRR) L*D*L , the routine does “limited” bisection
T
to refine the eigenvalues of L*D*L , w( ifirst-offset ) through w( ilast-offset ), to more
accuracy. Initial guesses for these eigenvalues are input in w. The corresponding estimate of
the error in these guesses and their gaps are input in werr and wgap, respectively. During
bisection, intervals [left, right] are maintained by storing their mid-points and semi-widths
in the arrays w and werr respectively.
Input Parameters
n INTEGER. The order of the matrix.
d REAL for slarrb
DOUBLE PRECISION for dlarrb
Array, DIMENSION (n). The n diagonal elements of the
diagonal matrix D.
lld REAL for slarrb
The n-1 elements Li*Li*Di.
ifirst INTEGER. The index of the first eigenvalue to be computed.
ilast INTEGER. The index of the last eigenvalue to be computed.
rtol1, rtol2 REAL for slarrb
1621
Tolerance for the convergence of the bisection intervals. An

interval [left, right] has converged if
RIGHT-LEFT.LT.MAX( rtol1*gap,
rtol2*max(|left|,|right|) ), where gap is the
(estimated) distance to the nearest eigenvalue.
offset INTEGER. Offset for the arrays w, wgap and werr, that is,
the ifirst-offset through ilast-offset elements of these
arrays are to be used.
w REAL for slarrb
Array, DIMENSION (n). On input, w( ifirst-offset ) through
w( ilast-offset ) are estimates of the eigenvalues of
T
L*D*L indexed ifirst through ilast.
wgap REAL for slarrb
Array, DIMENSION (n-1). The estimated gaps between
T
consecutive eigenvalues of L*D*L , that is, wgap(i-offset)
is the gap between eigenvalues i and i+1. Note that if
IFIRST.EQ.ILAST then wgap(ifirst-offset) must be set
to 0.
werr REAL for slarrb
Array, DIMENSION (n). On input, werr(ifirst-offset)
through werr(ilast-offset) are the errors in the estimates
of the corresponding elements in w.
work REAL for slarrb
pivmin REAL for slarrb
The minimum pivot in the Sturm sequence.
spdiam REAL for slarrb
The spectral diameter of the matrix.
twist INTEGER. The twist index for the twisted factorization that
is used for the negcount.
1622
twist = n: Compute negcount from L*D*LT - lambda*i
= L+* D+ *L+T
= U-*D-*U-T
= Nr*D r*Nr
iwork INTEGER.
Output Parameters
w On output, the estimates of the eigenvalues are“refined”.
wgap On output, the gaps are refined.
werr On output, “refined” errors in the estimates of w.
info INTEGER.
Error flag.
?larrc
Computes the number of eigenvalues of the
symmetric tridiagonal matrix.
Syntax
call slarrc( jobt, n, vl, vu, d, e, pivmin, eigcnt, lcnt, rcnt, info )
call dlarrc( jobt, n, vl, vu, d, e, pivmin, eigcnt, lcnt, rcnt, info )
Description
for C interface.
The routine finds the number of eigenvalues of the symmetric tridiagonal matrix T or of its
factorization L*D*LT in the specified interval.
Input Parameters
jobt CHARACTER*1.
1623
= 'T': computes Sturm count for matrix T.

= 'L': computes Sturm count for matrix L*D*LT.
n INTEGER.
The order of the matrix. (n > 1).
vl,vu REAL for slarrc
DOUBLE PRECISION for dlarrc
The lower and upper bounds for the eigenvalues.
d REAL for slarrc
If jobt= 'T': contains the n diagonal elements of the
tridiagonal matrix T.
If jobt= 'L': contains the n diagonal elements of the
diagonal matrix D.
e REAL for slarrc
If jobt= 'T': contains the (n-1)offdiagonal elements of
the matrix T.
If jobt= 'L': contains the (n-1)offdiagonal elements of
the matrix L.
pivmin REAL for slarrc
The minimum pivot in the Sturm sequence for the matrix
T.
Output Parameters
eigcnt INTEGER.
The number of eigenvalues of the symmetric tridiagonal
matrix T that are in the half-open interval (vl,vu].
lcnt,rcnt INTEGER.
The left and right negcounts of the interval.
info INTEGER.
Now it is not used and always is set to 0.
1624
?larrd
Computes the eigenvalues of a symmetric
tridiagonal matrix to suitable accuracy.
Syntax
call slarrd( range, order, n, vl, vu, il, iu, gers, reltol, d, e, e2, pivmin,
nsplit, isplit, m, w, werr, wl, wu, iblock, indexw, work, iwork, info )
call dlarrd( range, order, n, vl, vu, il, iu, gers, reltol, d, e, e2, pivmin,
nsplit, isplit, m, w, werr, wl, wu, iblock, indexw, work, iwork, info )
Description
for C interface.
The routine computes the eigenvalues of a symmetric tridiagonal matrix T to suitable accuracy.
This is an auxiliary code to be called from ?stemr. The user may ask for all eigenvalues, all
eigenvalues in the half-open interval (vl, vu], or the il-th through iu-th eigenvalues.
To avoid overflow, the matrix must be scaled so that its largest element is no greater than
(overflow1/2*underflow1/4) in absolute value, and for greatest accuracy, it should not be
much smaller than that. (For more details see [Kahan66].
Input Parameters
range CHARACTER.
= 'A': ("All") all eigenvalues will be found.
= 'V': ("Value") all eigenvalues in the half-open interval
= 'I': ("Index") the il-th through iu-th eigenvalues will
be found.
order CHARACTER.
= 'B': ("By block") the eigenvalues will be grouped by
split-off block (see iblock, isplit below) and ordered
from smallest to largest within the block.
= 'E': ("Entire matrix") the eigenvalues for the entire
matrix will be ordered from smallest to largest.
n INTEGER. The order of the tridiagonal matrix T (n ≥ 1).
1625
vl,vu REAL for slarrd

DOUBLE PRECISION for dlarrd
If range = 'V': the lower and upper bounds of the
interval to be searched for eigenvalues. Eigenvalues less
than or equal to vl, or greater than vu, will not be returned.
vl < vu.
If range = 'A' or 'I': not referenced.
il,iu INTEGER.
If range = 'I': the indices (in ascending order) of the
smallest and largest eigenvalues to be returned. 1 ≤ il ≤
iu ≤ n, if n > 0; il=1 and iu=0 if n=0.
If range = 'A' or 'V': not referenced.
gers REAL for slarrd

Array, DIMENSION (2*n).
The n Gerschgorin intervals (the i-th Gerschgorin
interval is (gers(2*i-1), gers(2*i)).
reltol REAL for slarrd
is narrower than reltol times the larger (in magnitude)
endpoint, then it is considered to be sufficiently small, that
is converged. Note: this should always be at least
radix*machine epsilon.
d REAL for slarrd
e REAL for slarrd
matrix T.
e2 REAL for slarrd
1626
Contains (n-1) squared off-diagonal elements of the
tridiagonal matrix T.
pivmin REAL for slarrd
T.
nsplit INTEGER.
The number of diagonal blocks the matrix T . 1 ≤ nsplit
≤ n
isplit INTEGER.
Arrays, DIMENSION (n).
The splitting points, at which T breaks up into submatrices.
The first submatrix consists of rows/columns 1 to
isplit(1), the second of rows/columns isplit(1)+1
through isplit(2), and so on, and the nsplit-th consists
of rows/columns isplit(nsplit-1)+1 through
isplit(nsplit)=n.
(Only the first nsplit elements actually is used, but since
the user cannot know a priori value of nsplit, n words
must be reserved for isplit.)
work REAL for slarrd
iwork INTEGER.
Output Parameters
m INTEGER.
The actual number of eigenvalues found. 0 ≤ m ≤ n. (See
also the description of info=2,3.)
w REAL for slarrd
1627
The first m elements of w contain the eigenvalue

approximations. ?laprd computes an interval Ij = (aj,
bj] that includes eigenvalue j. The eigenvalue
approximation is given as the interval midpoint w(j)=
(aj+bj)/2. The corresponding error is bounded by werr(j)
= abs(aj-bj)/2.
werr REAL for slarrd
The error bound on the corresponding eigenvalue
approximation in w.
wl, wu REAL for slarrd
The interval (wl, wu] contains all the wanted eigenvalues.
If range = 'V': then wl=vl and wu=vu.
If range = 'A': then wl and wu are the global Gerschgorin
bounds on the spectrum.
If range = 'I': then wl and wu are computed by ?laebz
from the index range specified.
iblock INTEGER.
At each row/column j where e(j) is zero or small, the
matrix T is considered to split into a block diagonal matrix.
If info = 0, then iblock(i) specifies to which block (from
1 to the number of blocks) the eigenvalue w(i) belongs.
(The routine may use the remaining n-m elements as
workspace.)
indexw INTEGER.
The indices of the eigenvalues within each block (submatrix);
for example, indexw(i)= j and iblock(i)=k imply that
the i-th eigenvalue w(i) is the j-th eigenvalue in block k.
info INTEGER.
< 0: if info = -i, the i-th argument has an illegal value
> 0: some or all of the eigenvalues fail to converge or are
not computed:
1628
=1 or 3: bisection fail to converge for some eigenvalues;
these eigenvalues are flagged by a negative block number.
The effect is that the eigenvalues may not be as accurate
as the absolute and relative tolerances.
=2 or 3: range='I' only: not all of the eigenvalues il:iu
are found.
=4: range='I', and the Gershgorin interval initially used
is too small. No eigenvalues are computed.
?larre
Given the tridiagonal matrix T, sets small
off-diagonal elements to zero and for each
unreduced block Ti, finds base representations and
eigenvalues.
Syntax
call slarre( range, n, vl, vu, il, iu, d, e, e2, rtol1, rtol2, spltol, nsplit,
isplit, m, w, werr, wgap, iblock, indexw, gers, pivmin, work, iwork, info )
call dlarre( range, n, vl, vu, il, iu, d, e, e2, rtol1, rtol2, spltol, nsplit,
isplit, m, w, werr, wgap, iblock, indexw, gers, pivmin, work, iwork, info )
Description
for C interface.
To find the desired eigenvalues of a given real symmetric tridiagonal matrix T, the routine sets
any “small” off-diagonal elements to zero, and for each unreduced block Ti, it finds
• a suitable shift at one end of the block spectrum

• the base representation, Ti - σi *I = Li*Di*LiT, and
T
• eigenvalues of each Li*Di*Li .
The representations and eigenvalues found are then used by ?stemr to compute the eigenvectors
of a symmetric tridiagonal matrix. The accuracy varies depending on whether bisection is used
to find a few eigenvalues or the dqds algorithm (subroutine ?lasq2) to compute all and discard
any unwanted one. As an added benefit, ?larre also outputs the n Gerschgorin intervals for
T
the matrices Li*Di*Li .
1629
Input Parameters
range CHARACTER.
= 'A': ("All") all eigenvalues will be found.
= 'V': ("Value") all eigenvalues in the half-open interval
= 'I': ("Index") the il-th through iu-th eigenvalues of
the entire matrix will be found.
n INTEGER. The order of the matrix. n > 0.
vl, vu REAL for slarre
DOUBLE PRECISION for dlarre
If range='V', the lower and upper bounds for the
eigenvalues. Eigenvalues less than or equal to vl, or greater
than vu, are not returned. vl < vu.
il, iu INTEGER.
If range='I', the indices (in ascending order) of the
smallest and largest eigenvalues to be returned. 1 ≤ il ≤
iu ≤ n.
d REAL for slarre
The n diagonal elements of the diagonal matrices T.
e REAL for slarre
Array, DIMENSION (n). The first (n-1) entries contain the
subdiagonal elements of the tridiagonal matrix T; e(n) need
not be set.
e2 REAL for slarre
Array, DIMENSION (n). The first (n-1) entries contain the
squares of the subdiagonal elements of the tridiagonal
matrix T; e2(n) need not be set.
rtol1, rtol2 REAL for slarre
1630
Parameters for bisection. An interval [LEFT,RIGHT] has
converged if RIGHT-LEFT.LT.MAX( rtol1*gap,
rtol2*max(|LEFT|,|RIGHT|) ).
spltol REAL for slarre
The threshold for splitting.
work REAL for slarre
iwork INTEGER.
Output Parameters
vl, vu On exit, if range='I' or ='A', contain the bounds on the
desired part of the spectrum.
d On exit, the n diagonal elements of the diagonal matrices
Di .
e On exit, the subdiagonal elements of the unit bidiagonal
matrices Li . The entries e( isplit( i) ), 1 ≤ i ≤ nsplit,
contain the base points sigmai on output.
e2 On exit, the entries e2( isplit( i) ), 1 ≤ i ≤ nsplit, have
been set to zero.
nsplit INTEGER. The number of blocks T splits into. 1 ≤ nsplit
≤ n.
isplit INTEGER. Array, DIMENSION (n). The splitting points, at
which T breaks up into blocks. The first block consists of
rows/columns 1 to isplit(1), the second of rows/columns
isplit(1)+1 through isplit(2), etc., and the nsplit-th
consists of rows/columns isplit(nsplit-1)+1 through
isplit(nsplit)=n.
m INTEGER. The total number of eigenvalues (of all the
T
Li*Di*Li ) found.
w REAL for slarre
1631
Array, DIMENSION (n). The first m elements contain the

eigenvalues. The eigenvalues of each of the blocks,
T
Li*Di*Li , are sorted in ascending order. The routine may
use the remaining n-m elements as workspace.
werr REAL for slarre
Array, DIMENSION (n). The error bound on the corresponding
eigenvalue in w.
wgap REAL for slarre
Array, DIMENSION (n). The separation from the right
neighbor eigenvalue in w. The gap is only with respect to
the eigenvalues of the same block as each block has its own
representation tree. Exception: at the right end of a block
the left gap is stored.
iblock INTEGER. Array, DIMENSION (n).
The indices of the blocks (submatrices) associated with the
corresponding eigenvalues in w; iblock(i)=1 if eigenvalue
w(i) belongs to the first block from the top, =2 if w(i)
belongs to the second block, etc.
indexw INTEGER. Array, DIMENSION (n).
for example, indexw(i)= 10 and iblock(i)=2 imply that
the i-th eigenvalue w(i) is the 10-th eigenvalue in the
second block.
gers REAL for slarre
Array, DIMENSION (2*n). The n Gerschgorin intervals (the
i-th Gerschgorin interval is (gers(2*i-1), gers(2*i)).
pivmin REAL for slarre
The minimum pivot in the Sturm sequence for T .
info INTEGER.
If info = 0: successful exit
If info > 0: A problem occured in ?larre. If info = 5,
the Rayleigh Quotient Iteration failed to converge to full
accuracy.
1632
If info < 0: One of the called subroutines signaled an
internal problem. Inspection of the corresponding parameter
info for further information is required.
• If info = -1, there is a problem in ?larrd

• If info = -2, no base representation could be found in
maxtry iterations. Increasing maxtry and recompilation
might be a remedy.
• If info = -3, there is a problem in ?larrb when
computing the refined root representation for ?lasq2.
• If info = -4, there is a problem in ?larrb when
preforming bisection on the desired part of the spectrum.
• If info = -5, there is a problem in ?lasq2.
• If info = -6, there is a problem in ?lasq2.
See Also
• ?stemr
• ?lasq2
• ?larrb
• ?larrd
?larrf
Finds a new relatively robust representation such
that at least one of the eigenvalues is relatively
isolated.
Syntax
call slarrf( n, d, l, ld, clstrt, clend, w, wgap, werr, spdiam, clgapl, clgapr,
pivmin, sigma, dplus, lplus, work, info )
call dlarrf( n, d, l, ld, clstrt, clend, w, wgap, werr, spdiam, clgapl, clgapr,
pivmin, sigma, dplus, lplus, work, info )
1633
Description
for C interface.
T
Given the initial representation L*D*L and its cluster of close eigenvalues (in a relative measure),
w(clstrt), w(clstrt+1), ... w(clend), the routine ?larrf finds a new relatively robust
representation
L*D*LT - σi*I = L(+)*D(+)*L(+)T

T
such that at least one of the eigenvalues of L(+)*D*(+)*L(+) is relatively isolated.
Input Parameters
n INTEGER. The order of the matrix (subblock, if the matrix
is splitted).
d REAL for slarrf
DOUBLE PRECISION for dlarrf
diagonal matrix D.
l REAL for slarrf
The (n-1) subdiagonal elements of the unit bidiagonal matrix
L.
ld REAL for slarrf
The n-1 elements Li*Di.
clstrt INTEGER. The index of the first eigenvalue in the cluster.
clend INTEGER. The index of the last eigenvalue in the cluster.
w REAL for slarrf
Array, DIMENSION ≥ (clend -clstrt+1). The eigenvalue
T
approximations of L*D*L in ascending order. w(clstrt)
through w(clend) form the cluster of relatively close
eigenvalues.
wgap REAL for slarrf
1634
Array, DIMENSION ≥ (clend -clstrt+1). The separation
from the right neighbor eigenvalue in w.
werr REAL for slarrf
Array, DIMENSION ≥ (clend -clstrt+1). On input, werr
contains the semiwidth of the uncertainty interval of the
corresponding eigenvalue approximation in w.
spdiam REAL for slarrf
Estimate of the spectral diameter obtained from the
Gerschgorin intervals.
clgapl, clgapr REAL for slarrf
Absolute gap on each end of the cluster. Set by the calling
routine to protect against shifts too close to eigenvalues
outside the cluster.
pivmin REAL for slarrf
The minimum pivot allowed in the Sturm sequence.
work REAL for slarrf
Output Parameters
wgap On output, the gaps are refined.
sigma REAL for slarrf
T
The shift used to form L(+)*D*(+)*L(+) .
dplus REAL for slarrf
diagonal matrix D(+).
lplus REAL for slarrf
1635
Array, DIMENSION (n). The first (n-1) elements of lplus

contain the subdiagonal elements of the unit bidiagonal
matrix L(+).
?larrj
Performs refinement of the initial estimates of the
eigenvalues of the matrix T.
Syntax
call slarrj( n, d, e2, ifirst, ilast, rtol, offset, w, werr, work, iwork,
pivmin, spdiam, info )
call dlarrj( n, d, e2, ifirst, ilast, rtol, offset, w, werr, work, iwork,
pivmin, spdiam, info )
Description
for C interface.
Given the initial eigenvalue approximations of T, this routine does bisection to refine the
eigenvalues of T, w(ifirst-offset) through w(ilast-offset) , to more accuracy. Initial
guesses for these eigenvalues are input in w, the corresponding estimate of the error in these
guesses in werr. During bisection, intervals [a,b] are maintained by storing their mid-points
and semi-widths in the arrays w and werr respectively.
Input Parameters
n INTEGER. The order of the matrix T.
d REAL for slarrj
DOUBLE PRECISION for dlarrj
Contains n diagonal elements of the matrix T.
e2 REAL for slarrj
Contains (n-1) squared sub-diagonal elements of the T.
ifirst INTEGER.
The index of the first eigenvalue to be computed.
1636
ilast INTEGER.
The index of the last eigenvalue to be computed.
rtol REAL for slarrj
Tolerance for the convergence of the bisection intervals. An
interval [a,b] is considered to be converged if (b-a) ≤
rtol*max(|a|,|b|).
offset INTEGER.
Offset for the arrays w and werr, that is the
ifirst-offset through ilast-offset elements of
these arrays are to be used.
w REAL for slarrj
On input, w(ifirst-offset) through w(ilast-offset)
are estimates of the eigenvalues of L*D*LT indexed ifirst
through ilast.
werr REAL for slarrj
On input, werr(ifirst-offset) through
werr(ilast-offset) are the errors in the estimates of
the corresponding elements in w.
work REAL for slarrj
iwork INTEGER.
pivmin REAL for slarrj
T.
spdiam REAL for slarrj
The spectral diameter of the matrix T.
1637
Output Parameters
w On exit, contains the refined estimates of the eigenvalues.
werr On exit, contains the refined errors in the estimates of the
corresponding elements in w.
info INTEGER.
Now it is not used and always is set to 0.
?larrk
Computes one eigenvalue of a symmetric
tridiagonal matrix T to suitable accuracy.
Syntax
call slarrk( n, iw, gl, gu, d, e2, pivmin, reltol, w, werr, info )
call dlarrk( n, iw, gl, gu, d, e2, pivmin, reltol, w, werr, info )
Description
for C interface.
The routine computes one eigenvalue of a symmetric tridiagonal matrix T to suitable accuracy.
This is an auxiliary code to be called from ?stemr.
To avoid overflow, the matrix must be scaled so that its largest element is no greater than
(overflow1/2*underflow1/4) in absolute value, and for greatest accuracy, it should not be
much smaller than that. For more details see [Kahan66].
Input Parameters
n INTEGER. The order of the matrix T. (n ≥ 1).
iw INTEGER.
The index of the eigenvalue to be returned.
gl, gu REAL for slarrk
DOUBLE PRECISION for dlarrk
An upper and a lower bound on the eigenvalue.
d REAL for slarrk
1638
e2 REAL for slarrk
Contains (n-1) squared off-diagonal elements of the T.
pivmin REAL for slarrk
T.
reltol REAL for slarrk
is narrower than reltol times the larger (in magnitude)
endpoint, then it is considered to be sufficiently small, that
is converged. Note: this should always be at least
radix*machine epsilon.
Output Parameters
w REAL for slarrk
Contains the eigenvalue approximation.
werr REAL for slarrk
Contains the error bound on the corresponding eigenvalue
approximation in w.
info INTEGER.
= 0: Eigenvalue converges
= -1: Eigenvalue does not converge
1639
?larrr
Performs tests to decide whether the symmetric
tridiagonal matrix T warrants expensive
computations which guarantee high relative
accuracy in the eigenvalues.
Syntax
call slarrr( n, d, e, info )
call dlarrr( n, d, e, info )
Description
for C interface.
The routine performs tests to decide whether the symmetric tridiagonal matrix T warrants
expensive computations which guarantee high relative accuracy in the eigenvalues.
Input Parameters
n INTEGER. The order of the matrix T. (n > 0).
d REAL for slarrr
DOUBLE PRECISION for dlarrr
e REAL for slarrr
DOUBLE PRECISION for dlarrr
The first (n-1) entries contain sub-diagonal elements of
the tridiagonal matrix T; e(n) is set to 0.
Output Parameters
info INTEGER.
= 0: the matrix warrants computations preserving relative
accuracy (default value).
= -1: the matrix warrants computations guaranteeing only
absolute accuracy.
1640
?larrv
Computes the eigenvectors of the tridiagonal
T
matrix T = L*D* L given L, D and the eigenvalues
T
of L*D* L .
Syntax
call slarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1,
rtol2, w, werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info
)
call dlarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1,
)
call clarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1,
)
call zlarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1,
)
Description
for C interface.
T
The routine ?larrv computes the eigenvectors of the tridiagonal matrix T = L*D* L given L,
T
D and approximations to the eigenvalues of L*D* L .
The input eigenvalues should have been computed by slarre for real flavors (slarrv/clarrv)
and by dlarre for double precision flavors (dlarre/zlarre).
Input Parameters
n INTEGER. The order of the matrix. n ≥ 0.
vl, vu REAL for slarrv/clarrv
DOUBLE PRECISION for dlarrv/zlarrv
1641
Lower and upper bounds respectively of the interval that

contains the desired eigenvalues. vl < vu. Needed to
compute gaps on the left or right end of the extremal
eigenvalues in the desired range.
d REAL for slarrv/clarrv
Array, DIMENSION (n). On entry, the n diagonal elements
of the diagonal matrix D.
l REAL for slarrv/clarrv
On entry, the (n-1) subdiagonal elements of the unit
bidiagonal matrix L are contained in elements 1 to n-1 of L
if the matrix is not splitted. At the end of each block the
corresponding shift is stored as given by slarre for real
flavors and by dlarre for double precision flavors.
pivmin REAL for slarrv/clarrv
The minimum pivot allowed in the Sturm sequence.
isplit INTEGER. Array, DIMENSION (n).
The splitting points, at which T breaks up into blocks. The
first block consists of rows/columns 1 to isplit(1), the
second of rows/columns isplit(1)+1 through isplit(2),
etc.
m INTEGER. The total number of eigenvalues found.
m = iu - il +1.
dol, dou INTEGER.
If you want to compute only selected eigenvectors from all
the eigenvalues supplied, specify an index range dol:dou.
Or else apply the setting dol=1, dou=m. Note that dol and
dou refer to the order in which the eigenvalues are stored
in w.
If you want to compute only selected eigenpairs, then the
columns dol-1 to dou+1 of the eigenvector space Z contain
the computed eigenvectors. All other columns of Z are set
to zero.
1642
minrgp, rtol1, rtol2 REAL for slarrv/clarrv
Parameters for bisection. An interval [LEFT,RIGHT] has
converged if RIGHT-LEFT.LT.MAX( rtol1*gap,
rtol2*max(|LEFT|,|RIGHT|) ).
w REAL for slarrv/clarrv
Array, DIMENSION (n). The first m elements of w contain the
approximate eigenvalues for which eigenvectors are to be
computed. The eigenvalues should be grouped by split-off
block and ordered from smallest to largest within the block
(the output array w from ?larre is expected here). These
eigenvalues are set with respect to the shift of the
corresponding root representation for their block.
werr REAL for slarrv/clarrv
semiwidth of the uncertainty interval of the corresponding
eigenvalue in w.
wgap REAL for slarrv/clarrv
Array, DIMENSION (n). The separation from the right
neighbor eigenvalue in w.
iblock INTEGER. Array, DIMENSION (n).
The indices of the blocks (submatrices) associated with the
corresponding eigenvalues in w; iblock(i)=1 if eigenvalue
w(i) belongs to the first block from the top, =2 if w(i)
belongs to the second block, etc.
indexw INTEGER. Array, DIMENSION (n).
for example, indexw(i)= 10 and iblock(i)=2 imply that
the i-th eigenvalue w(i) is the 10-th eigenvalue in the
second block.
gers REAL for slarrv/clarrv
1643
Array, DIMENSION (2*n). The n Gerschgorin intervals (the

i-th Gerschgorin interval is (gers(2*i-1), gers(2*i)).
The Gerschgorin intervals should be computed from the
original unshifted matrix.
ldz INTEGER. The leading dimension of the output array Z. ldz
≥ 1, and if jobz = 'V', ldz ≥ max(1,n).
work REAL for slarrv/clarrv
iwork INTEGER.
Output Parameters
d On exit, d may be overwritten.
l On exit, l is overwritten.
w On exit, w holds the eigenvalues of the unshifted matrix.
werr On exit, werr contains refined values of its input
approximations.
wgap On exit, wgap contains refined values of its input
approximations. Very small gaps are changed.
z REAL for slarrv
DOUBLE PRECISION for dlarrv
COMPLEX for clarrv
COMPLEX*16 for zlarrv
Array, DIMENSION (ldz, max(1,m) ).
If info = 0, the first m columns of z contain the orthonormal
eigenvectors of the matrix T corresponding to the input
eigenvalues, with the i-th column of z holding the
eigenvector associated with w(i).
NOTE. The user must ensure that at least max(1,m)

columns are supplied in the array z.
isuppz INTEGER .
1644
Array, DIMENSION (2*max(1,m)). The support of the
eigenvectors in z, that is, the indices indicating the nonzero
elements in z. The i-th eigenvector is nonzero only in
elements isuppz(2i-1) through isuppz(2i).
info INTEGER.
If info = 0: successful exit
If info > 0: A problem occured in ?larrv. If info = 5,
the Rayleigh Quotient Iteration failed to converge to full
accuracy.
If info < 0: One of the called subroutines signaled an
internal problem. Inspection of the corresponding parameter
info for further information is required.
• If info = -1, there is a problem in ?larrb when refining

a child eigenvalue;
• If info = -2, there is a problem in ?larrf when
computing the relatively robust representation (RRR) of
a child. When a child is inside a tight cluster, it can be
difficult to find an RRR. A partial remedy from the user's
point of view is to make the parameter minrgp smaller
and recompile. However, as the orthogonality of the
computed vectors is proportional to 1/minrgp, you should
be aware that you might be trading in precision when
you decrease minrgp.
• If info = -3, there is a problem in ?larrb when refining
a single eigenvalue after the Rayleigh correction was
rejected.
See Also
• ?larrb
• ?larre
• ?larrf
1645
?lartg
Generates a plane rotation with real cosine and
real/complex sine.
Syntax
call slartg( f, g, cs, sn, r )
call dlartg( f, g, cs, sn, r )
call clartg( f, g, cs, sn, r )
call zlartg( f, g, cs, sn, r )
Description
for C interface.
The routine generates a plane rotation so that
where cs2 + |sn|2 = 1
This is a slower, more accurate version of the BLAS Level 1 routine ?rotg, except for the
following differences.
For slartg/dlartg:
f and g are unchanged on return;
If g=0, then cs=1 and sn=0;
If f=0 and g ≠ 0, then cs=0 and sn=1 without doing any floating point operations (saves work
in ?bdsqr when there are zeros on the diagonal);
If f exceeds g in magnitude, cs will be positive.
For clartg/zlartg:
f and g are unchanged on return;
1646
If g=0, then cs=1 and sn=0;
If f=0, then cs=0 and sn is chosen so that r is real.
Input Parameters
f, g REAL for slartg
DOUBLE PRECISION for dlartg
COMPLEX for clartg
COMPLEX*16 for zlartg
The first and second component of vector to be rotated.
Output Parameters
cs REAL for slartg/clartg
DOUBLE PRECISION for dlartg/zlartg
The cosine of the rotation.
sn REAL for slartg
COMPLEX for clartg
The sine of the rotation.
r REAL for slartg
COMPLEX for clartg
The nonzero component of the rotated vector.
1647
?lartv
Applies a vector of plane rotations with real cosines
and real/complex sines to the elements of a pair
of vectors.
Syntax
call slartv( n, x, incx, y, incy, c, s, incc )
call dlartv( n, x, incx, y, incy, c, s, incc )
call clartv( n, x, incx, y, incy, c, s, incc )
call zlartv( n, x, incx, y, incy, c, s, incc )
Description
for C interface.
The routine applies a vector of real/complex plane rotations with real cosines to elements of
the real/complex vectors x and y. For i = 1,2,...,n
Input Parameters
n INTEGER. The number of plane rotations to be applied.
x, y REAL for slartv
DOUBLE PRECISION for dlartv
COMPLEX for clartv
COMPLEX*16 for zlartv
Arrays, DIMENSION (1+(n-1)*incx) and (1+(n-1)*incy),
respectively. The input vectors x and y.
incx INTEGER. The increment between elements of x. incx > 0.
incy INTEGER. The increment between elements of y. incy > 0.
c REAL for slartv/clartv
1648
DOUBLE PRECISION for dlartv/zlartv
Array, DIMENSION (1+(n-1)*incc).
The cosines of the plane rotations.
s REAL for slartv
DOUBLE PRECISION for dlartv
COMPLEX for clartv
COMPLEX*16 for zlartv
Array, DIMENSION (1+(n-1)*incc).
The sines of the plane rotations.
incc INTEGER. The increment between elements of c and s. incc
> 0.
Output Parameters
x, y The rotated vectors x and y.
?laruv
Returns a vector of n random real numbers from
a uniform distribution.
Syntax
call slaruv( iseed, n, x )
call dlaruv( iseed, n, x )
Description
for C interface.
The routine ?laruv returns a vector of n random real numbers from a uniform (0,1) distribution
(n ≤ 128).
This is an auxiliary routine called by ?larnv.
Input Parameters
iseed INTEGER. Array, DIMENSION (4). On entry, the seed of the
random number generator; the array elements must be
between 0 and 4095, and iseed(4) must be odd.
1649
n INTEGER. The number of random numbers to be generated.

n ≤ 128.
Output Parameters
x REAL for slaruv
DOUBLE PRECISION for dlaruv
Array, DIMENSION (n). The generated random numbers.
seed On exit, the seed is updated.
?larz
Applies an elementary reflector (as returned by
?tzrzf) to a general matrix.
Syntax
call slarz( side, m, n, l, v, incv, tau, c, ldc, work )
call dlarz( side, m, n, l, v, incv, tau, c, ldc, work )
call clarz( side, m, n, l, v, incv, tau, c, ldc, work )
call zlarz( side, m, n, l, v, incv, tau, c, ldc, work )
Description
for C interface.
The routine ?larz applies a real/complex elementary reflector H to a real/complex m-by-n
matrix C, from either the left or the right. H is represented in the form
H = I-tau*v*v',
If tau = 0, then H is taken to be the unit matrix.

H
For complex flavors, to apply H (the conjugate transpose of H), supply conjg(tau) instead of
tau.
H is a product of k elementary reflectors as returned by ?tzrzf.
1650
Input Parameters
side CHARACTER*1.
If side = 'R': form C*H
l INTEGER. The number of entries of the vector v containing
the meaningful part of the Householder vectors.
If side = 'L', m ≥ L ≥ 0,
if side = 'R', n ≥ L ≥ 0.
v REAL for slarz
DOUBLE PRECISION for dlarz
COMPLEX for clarz
COMPLEX*16 for zlarz
Array, DIMENSION (1+(l-1)*abs(incv)).
The vector v in the representation of H as returned by
?tzrzf.
v is not used if tau = 0.
incv INTEGER. The increment between elements of v.
incv ≠ 0.
tau REAL for slarz
COMPLEX for clarz
c REAL for slarz
COMPLEX for clarz
ldc ≥ max(1,m).
work REAL for slarz
1651

COMPLEX for clarz
(m) if side = 'R'.
Output Parameters
?larzb
transpose/conjugate-transpose to a general matrix.
Syntax
call slarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc,
work, ldwork )
call dlarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc,
work, ldwork )
call clarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc,
work, ldwork )
call zlarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc,
work, ldwork )
Description
for C interface.
T H
The routine applies a real/complex block reflector H or its transpose H (or H for complex
flavors) to a real/complex distributed m-by-n matrix C from the left or the right. Currently, only
storev = 'R' and direct = 'B' are supported.
Input Parameters
side CHARACTER*1.
1652
T H
If side = 'L': apply H or H /H from the left
T H
If side = 'R': apply H or H /H from the right
trans CHARACTER*1.
If trans = 'N': apply H (No transpose)
T H
If trans='C': apply H /H (transpose/conjugate transpose)
direct CHARACTER*1.
reflectors
= 'F': H = H(1)*H(2)*...*H(k) (forward, not supported)
= 'B': H = H(k)*...*H(2)*H(1) (backward)
storev CHARACTER*1.
Indicates how the vectors which define the elementary
= 'C': Column-wise (not supported)
= 'R': Row-wise.
k INTEGER. The order of the matrix T (equal to the number
of elementary reflectors whose product defines the block
reflector).
l INTEGER. The number of columns of the matrix V containing
the meaningful part of the Householder reflectors.
If side = 'L', m ≥ l ≥ 0, if side = 'R', n ≥ l ≥ 0.
v REAL for slarzb
DOUBLE PRECISION for dlarzb
COMPLEX for clarzb
COMPLEX*16 for zlarzb
Array, DIMENSION (ldv, nv).
If storev = 'C', nv = k;
if storev = 'R', nv = l.
If storev = 'C', ldv ≥ l; if storev = 'R', ldv ≥ k.
t REAL for slarzb
COMPLEX for clarzb
1653

Array, DIMENSION (ldt,k). The triangular k-by-k matrix T
in the representation of the block reflector.
ldt INTEGER. The leading dimension of the array t.
ldt ≥ k.
c REAL for slarzb
COMPLEX for clarzb
Array, DIMENSION (ldc,n). On entry, the m-by-n matrix C.
ldc ≥ max(1,m).
work REAL for slarzb
COMPLEX for clarzb
Workspace array, DIMENSION (ldwork, k).
ldwork INTEGER. The leading dimension of the array work.
If side = 'L', ldwork ≥ max(1, n);
if side = 'R', ldwork ≥ max(1, m).
Output Parameters
T H
c On exit, C is overwritten by H*C, or H /H *C, or C*H, or
T H
C*H /H .
1654
?larzt
Forms the triangular factor T of a block reflector H
H
= I - V*T*V .
Syntax
call slarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call dlarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call clarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call zlarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
Description
for C interface.
The routine forms the triangular factor T of a real/complex block reflector H of order > n, which
is defined as a product of k elementary reflectors.
If direct = 'F', H = H(1)*H(2)*...*H(k), and T is upper triangular.
If direct = 'B', H = H(k)*...*H(2)*H(1), and T is lower triangular.
column of the array v, and H = I-V*T*V'
row of the array v, and H = I-V'*T*V
Currently, only storev = 'R' and direct = 'B' are supported.
Input Parameters
direct CHARACTER*1.
If direct = 'F': H = H(1)*H(2)*...*H(k) (forward,
not supported)
If direct = 'B': H = H(k)*...*H(2)*H(1) (backward)
storev CHARACTER*1.
1655
Specifies how the vectors which define the elementary

reflectors are stored (see also Application Notes below):
If storev = 'C': column-wise (not supported)
If storev = 'R': row-wise
n INTEGER. The order of the block reflector H. n ≥ 0.
k INTEGER. The order of the triangular factor T (equal to the
number of elementary reflectors). k ≥ 1.
v REAL for slarzt
DOUBLE PRECISION for dlarzt
COMPLEX for clarzt
COMPLEX*16 for zlarzt
Array, DIMENSION
(ldv, k) if storev = 'C'
(ldv, n) if storev = 'R' The matrix V.
If storev = 'C', ldv ≥ max(1,n);
tau REAL for slarzt
COMPLEX for clarzt
of the elementary reflector H(i).
ldt INTEGER. The leading dimension of the output array t.
ldt ≥ k.
Output Parameters
t REAL for slarzt
COMPLEX for clarzt
Array, DIMENSION (ldt,k). The k-by-k triangular factor T
of the block reflector. If direct = 'F', T is upper
triangular; if direct = 'B', T is lower triangular. The rest
of the array is not used.
1656
v The matrix V. See Application Notes below.
Application Notes
used.
1657
?las2
Computes singular values of a 2-by-2 triangular
matrix.
Syntax
call slas2( f, g, h, ssmin, ssmax )
call dlas2( f, g, h, ssmin, ssmax )
Description
for C interface.
The routine ?las2 computes the singular values of the 2-by-2 matrix
1658
On return, ssmin is the smaller singular value and SSMAX is the larger singular value.
Input Parameters
f, g, h REAL for slas2
DOUBLE PRECISION for dlas2
The (1,1), (1,2) and (2,2) elements of the 2-by-2 matrix,
respectively.
Output Parameters
ssmin, ssmax REAL for slas2
DOUBLE PRECISION for dlas2
The smaller and the larger singular values, respectively.
Application Notes
Barring over/underflow, all output quantities are correct to within a few units in the last place
(ulps), even in the absence of a guard digit in addition/subtraction. In ieee arithmetic, the
code works correctly if one matrix element is infinite. Overflow will not occur unless the largest
singular value itself overflows, or is within a few ulps of overflow. (On machines with partial
overflow, like the Cray, overflow may occur if the largest singular value is within a factor of 2
of overflow.) Underflow is harmless if underflow is gradual. Otherwise, results may correspond
to a matrix modified by perturbations of size near the underflow threshold.
?lascl
Multiplies a general rectangular matrix by a real
scalar defined as cto/cfrom.
Syntax
call slascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call dlascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call clascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call zlascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
1659
Description
for C interface.
The routine ?lascl multiplies the m-by-n real/complex matrix A by the real scalar cto/cfrom.
The operation is performed without over/underflow as long as the final result cto*A(i,j)/cfrom
does not over/underflow.
type specifies that A may be full, upper triangular, lower triangular, upper Hessenberg, or
banded.
Input Parameters
type CHARACTER*1. This parameter specifies the storage type of
the input matrix.
= 'G': A is a full matrix.
= 'L': A is a lower triangular matrix.
= 'U': A is an upper triangular matrix.
= 'H': A is an upper Hessenberg matrix.
= 'B': A is a symmetric band matrix with lower bandwidth
kl and upper bandwidth ku and with the only the lower half
stored
= 'Q': A is a symmetric band matrix with lower bandwidth
kl and upper bandwidth ku and with the only the upper half
stored.
= 'Z': A is a band matrix with lower bandwidth kl and
upper bandwidth ku.
kl INTEGER. The lower bandwidth of A. Referenced only if type
= 'B', 'Q' or 'Z'.
ku INTEGER. The upper bandwidth of A. Referenced only if type
= 'B', 'Q' or 'Z'.
cfrom, cto REAL for slascl/clascl
DOUBLE PRECISION for dlascl/zlascl
The matrix A is multiplied by cto/cfrom. A(i,j) is computed
without over/underflow if the final result cto*A(i,j)/cfrom
can be represented without over/underflow. cfrom must be
nonzero.
1660
a REAL for slascl
DOUBLE PRECISION for dlascl
COMPLEX for clascl
COMPLEX*16 for zlascl
Array, DIMENSION (lda, n). The matrix to be multiplied by
cto/cfrom. See type for the storage type.
lda ≥ max(1,m).
Output Parameters
a The multiplied matrix A.
info INTEGER.
If info = 0 - successful exit
?lasd0
Computes the singular values of a real upper
bidiagonal n-by-m matrix B with diagonal d and
off-diagonal e. Used by ?bdsdc.
Syntax
call slasd0( n, sqre, d, e, u, ldu, vt, ldvt, smlsiz, iwork, work, info )
call dlasd0( n, sqre, d, e, u, ldu, vt, ldvt, smlsiz, iwork, work, info )
Description
for C interface.
Using a divide and conquer approach, the routine ?lasd0 computes the singular value
decomposition (SVD) of a real upper bidiagonal n-by-m matrix B with diagonal d and offdiagonal
e, where m = n + sqre.
The algorithm computes orthogonal matrices U and VT such that B = U*S*VT. The singular
values S are overwritten on d.
1661
The related subroutine ?lasda computes only the singular values, and optionally, the singular
vectors in compact form.
Input Parameters
n INTEGER. On entry, the row dimension of the upper
bidiagonal matrix. This is also the dimension of the main
diagonal array d.
sqre INTEGER. Specifies the column dimension of the bidiagonal
matrix.
If sqre = 0: the bidiagonal matrix has column dimension
m = n.
m = n+1.
d REAL for slasd0
DOUBLE PRECISION for dlasd0
e REAL for slasd0
Array, DIMENSION (m-1). Contains the subdiagonal entries
ldu INTEGER. On entry, leading dimension of the output array
u.
ldvt INTEGER. On entry, leading dimension of the output array
vt.
smlsiz INTEGER. On entry, maximum size of the subproblems at
the bottom of the computation tree.
iwork INTEGER.
Workspace array, dimension must be at least (8n).
work REAL for slasd0
Workspace array, dimension must be at least (3m2+2m).
Output Parameters
d On exit d, If info = 0, contains singular values of the
bidiagonal matrix.
1662
u REAL for slasd0
Array, DIMENSION at least (ldq, n). On exit, u contains the
left singular vectors.
vt REAL for slasd0
Array, DIMENSION at least (ldvt, m). On exit, vt' contains
the right singular vectors.
info INTEGER.
If info = 1, an singular value did not converge.
?lasd1
Computes the SVD of an upper bidiagonal matrix
B of the specified size. Used by ?bdsdc.
Syntax
call slasd1( nl, nr, sqre, d, alpha, beta, u, ldu, vt, ldvt, idxq, iwork,
work, info )
call dlasd1( nl, nr, sqre, d, alpha, beta, u, ldu, vt, ldvt, idxq, iwork,
work, info )
Description
for C interface.
The routine computes the SVD of an upper bidiagonal n-by-m matrix B, where n = nl + nr
+ 1 and m = n + sqre.
The routine ?lasd1 is called from ?lasd0.
A related subroutine ?lasd7 handles the case in which the singular values (and the singular
vectors in factored form) are desired.
?lasd1 computes the SVD as follows:
1663
= U(out)*(D(out) 0)*VT(out)
where Z' = (Z1' a Z2' b) = u'*VT', and u is a vector of dimension m with alpha and beta
in the nl+1 and nl+2-th entries and zeros elsewhere; and the entry b is empty if sqre = 0.
The left singular vectors of the original matrix are stored in u, and the transpose of the right
singular vectors are stored in vt, and the singular values are in d. The algorithm consists of
three stages:
1. The first stage consists of deflating the size of the problem when there are multiple singular
values or when there are zeros in the Z vector. For each such occurrence the dimension of
the secular equation problem is reduced by one. This stage is performed by the routine
?lasd2.
2. The second stage consists of calculating the updated singular values. This is done by finding
the square roots of the roots of the secular equation via the routine ?lasd4 (as called by
?lasd3). This routine also calculates the singular vectors of the current problem.
3. The final stage consists of computing the updated singular vectors directly using the updated
singular values. The singular vectors for the current problem are multiplied with the singular
vectors from the overall problem.
Input Parameters
nl ≥ 1.
nr ≥ 1.
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
If sqre = 1: the lower block is an nr-by-(nr+1) rectangular
matrix. The bidiagonal matrix has row dimension n = nl
+ nr + 1, and column dimension m = n + sqre.
d REAL for slasd1
1664
Array, DIMENSION (nl+nr+1). n = nl+nr+1. On entry
d(1:nl,1:nl) contains the singular values of the upper
block; and d(nl+2:n) contains the singular values of the
lower block.
alpha REAL for slasd1
Contains the diagonal element associated with the added
row.
beta REAL for slasd1
Contains the off-diagonal element associated with the added
row.
u REAL for slasd1
Array, DIMENSION (ldu, n). On entry u(1:nl, 1:nl)
contains the left singular vectors of the upper block;
u(nl+2:n, nl+2:n) contains the left singular vectors of
the lower block.
ldu INTEGER. The leading dimension of the array U.
ldu ≥ max(1, n).
vt REAL for slasd1
Array, DIMENSION (ldvt, m), where m = n + sqre.
On entry vt(1:nl+1, 1:nl+1)' contains the right singular
vectors of the upper block; vt(nl+2:m, nl+2:m)' contains
the right singular vectors of the lower block.
ldvt INTEGER. The leading dimension of the array vt.
ldvt ≥ max(1, M).
iwork INTEGER.
Workspace array, DIMENSION (4n).
Workspace array, DIMENSION (3m2 + 2m).
1665
Output Parameters
d On exit d(1:n) contains the singular values of the modified
matrix.
alpha On exit, the diagonal element associated with the added
row deflated by max( abs( alpha ), abs( beta ),
abs( D(I) ) ), I = 1,n.
beta On exit, the off-diagonal element associated with the added
row deflated by max( abs( alpha ), abs( beta ),
abs( D(I) ) ), I = 1,n.
u On exit u contains the left singular vectors of the bidiagonal
matrix.
vt On exit vt' contains the right singular vectors of the
bidiagonal matrix.
idxq INTEGER
Array, DIMENSION (n). Contains the permutation which will
reintegrate the subproblem just solved back into sorted
order, that is, d(idxq( i = 1, n )) will be in ascending
order.
info INTEGER.
?lasd2
Merges the two sets of singular values together
into a single sorted set. Used by ?bdsdc.
Syntax
call slasd2( nl, nr, sqre, k, d, z, alpha, beta, u, ldu, vt, ldvt, dsigma,
u2, ldu2, vt2, ldvt2, idxp, idx, idxp, idxq, coltyp, info )
call dlasd2( nl, nr, sqre, k, d, z, alpha, beta, u, ldu, vt, ldvt, dsigma,
u2, ldu2, vt2, ldvt2, idxp, idx, idxp, idxq, coltyp, info )
1666
Description
for C interface.
The routine ?lasd2 merges the two sets of singular values together into a single sorted set.
Then it tries to deflate the size of the problem. There are two ways in which deflation can occur:
when two or more singular values are close together or if there is a tiny entry in the Z vector.
For each such occurrence the order of the related secular equation problem is reduced by one.
Input Parameters
nl ≥ 1.
nr ≥ 1.
sqre INTEGER.
If sqre = 0): the lower block is an nr-by-nr square matrix
If sqre = 1): the lower block is an nr-by-(nr+1)
rectangular matrix. The bidiagonal matrix has n = nl +
nr + 1 rows and m = n + sqre ≥ n columns.
d REAL for slasd2
Array, DIMENSION (n). On entry d contains the singular
values of the two submatrices to be combined.
row.
row.
u REAL for slasd2
1667
Array, DIMENSION (ldu, n). On entry u contains the left

singular vectors of two submatrices in the two square blocks
with corners at (1,1), (nl, nl), and (nl+2, nl+2), (n,n).
ldu INTEGER. The leading dimension of the array u.
ldu ≥ n.
ldu2 INTEGER. The leading dimension of the output array u2.
ldu2 ≥ n.
vt REAL for slasd2
Array, DIMENSION (ldvt, m). On entry, vt' contains the right
singular vectors of two submatrices in the two square blocks
with corners at (1,1), (nl+1, nl+1), and (nl+2, nl+2), (m,
m).
ldvt INTEGER. The leading dimension of the array vt. ldvt ≥
m.
ldvt2 INTEGER. The leading dimension of the output array vt2.
ldvt2 ≥ m.
idxp INTEGER.
Workspace array, DIMENSION (n). This will contain the
permutation used to place deflated values of D at the end
of the array. On output idxp(2:k) points to the nondeflated
d-values and idxp(k+1:n) points to the deflated singular
values.
idx INTEGER.
permutation used to sort the contents of d into ascending
order.
coltyp INTEGER.
Workspace array, DIMENSION (n). As workspace, this array
contains a label that indicates which of the following types
a column in the u2 matrix or a row in the vt2 matrix is:
1 : non-zero in the upper half only
2 : non-zero in the lower half only
3 : dense
4 : deflated.
1668
idxq INTEGER. Array, DIMENSION (n). This parameter contains
the permutation that separately sorts the two sub-problems
in D into ascending order. Note that entries in the first half
of this permutation must first be moved one position
backward; and entries in the second half must first have
nl+1 added to their values.
Output Parameters
k INTEGER. Contains the dimension of the non-deflated matrix,
n.
d On exit D contains the trailing (n-k) updated singular values
u On exit u contains the trailing (n-k) updated left singular
vectors (those which were deflated) in its last n-k columns.
z REAL for slasd2
Array, DIMENSION (n). On exit, z contains the updating row
vector in the secular equation.
dsigma REAL for slasd2
Array, DIMENSION (n). Contains a copy of the diagonal
elements (k-1 singular values and one zero) in the secular
equation.
u2 REAL for slasd2
Array, DIMENSION (ldu2, n). Contains a copy of the first
k-1 left singular vectors which will be used by ?lasd3 in a
matrix multiply (?gemm) to solve for the new left singular
vectors. u2 is arranged into four blocks. The first block
contains a column with 1 at nl+1 and zero everywhere else;
the second block contains non-zero entries only at and above
nl; the third contains non-zero entries only below nl+1;
and the fourth is dense.
1669
vt On exit, vt' contains the trailing (n-k) updated right singular

vectors (those which were deflated) in its last n-k columns.
In case sqre =1, the last row of vt spans the right null
space.
vt2 REAL for slasd2
Array, DIMENSION (ldvt2, n). vt2' contains a copy of the
first k right singular vectors which will be used by ?lasd3
in a matrix multiply (?gemm) to solve for the new right
singular vectors. vt2 is arranged into three blocks. The first
block contains a row that corresponds to the special 0
diagonal element in sigma; the second block contains
non-zeros only at and before nl +1; the third block contains
non-zeros only at and after nl +2.
idxc INTEGER. Array, DIMENSION (n). This will contain the
permutation used to arrange the columns of the deflated u
matrix into three groups: the first group contains non-zero
entries only at and above nl, the second contains non-zero
entries only below nl+2, and the third is dense.
coltyp On exit, it is an array of dimension 4, with coltyp(i) being
the dimension of the i-th type columns.
info INTEGER.
If info = 0): successful exit
?lasd3
Finds all square roots of the roots of the secular
equation, as defined by the values in D and Z, and
then updates the singular vectors by matrix
multiplication. Used by ?bdsdc.
Syntax
call slasd3( nl, nr, sqre, k, d, q, ldq, dsigma, u, ldu, u2, ldu2, vt, ldvt,
vt2, ldvt2, idxc, ctot, z, info )
call dlasd3( nl, nr, sqre, k, d, q, ldq, dsigma, u, ldu, u2, ldu2, vt, ldvt,
vt2, ldvt2, idxc, ctot, z, info )
1670
Description
for C interface.
The routine ?lasd3 finds all the square roots of the roots of the secular equation, as defined
by the values in D and Z.
It makes the appropriate calls to ?lasd4 and then updates the singular vectors by matrix
multiplication.
Input Parameters
nl ≥ 1.
nr ≥ 1.
sqre INTEGER.
If sqre = 0): the lower block is an nr-by-nr square matrix.
If sqre = 1): the lower block is an nr-by-(nr+1)
rectangular matrix. The bidiagonal matrix has n = nl +
nr + 1 rows and m = n + sqre ≥ n columns.
k INTEGER.The size of the secular equation, 1 ≤ k ≤ n.
q REAL for slasd3
Workspace array, DIMENSION at least (ldq, k).
ldq INTEGER. The leading dimension of the array Q.
ldq ≥ k.
Array, DIMENSION (k). The first k elements of this array
contain the old roots of the deflated updating problem. These
are the poles of the secular equation.
ldu INTEGER. The leading dimension of the array u.
ldu ≥ n.
1671
u2 REAL for slasd3

Array, DIMENSION (ldu2, n).
The first k columns of this matrix contain the non-deflated
left singular vectors for the split problem.
ldu2 INTEGER. The leading dimension of the array u2.
ldu2 ≥ n.
ldvt INTEGER. The leading dimension of the array vt.
ldvt ≥ n.
vt2 REAL for slasd3
Array, DIMENSION (ldvt2, n).
The first k columns of vt2' contain the non-deflated right
singular vectors for the split problem.
ldvt2 INTEGER. The leading dimension of the array vt2.
ldvt2 ≥ n.
idxc INTEGER. Array, DIMENSION (n).
The permutation used to arrange the columns of u (and
rows of vt) into three groups: the first group contains
non-zero entries only at and above (or before) nl +1; the
second contains non-zero entries only at and below (or after)
nl+2; and the third is dense. The first column of u and the
row of vt are treated separately, however. The rows of the
singular vectors found by ?lasd4 must be likewise permuted
before the matrix multiplies can take place.
ctot INTEGER. Array, DIMENSION (4). A count of the total number
of the various types of columns in u (or rows in vt), as
described in idxc.
The fourth column type is any column which has been
deflated.
z REAL for slasd3
contain the components of the deflation-adjusted updating
row vector.
1672
Output Parameters
d REAL for slasd3
Array, DIMENSION (k). On exit the square roots of the roots
of the secular equation, in ascending order.
u REAL for slasd3
Array, DIMENSION (ldu, n).
The last n - k columns of this matrix contain the deflated
left singular vectors.
vt REAL for slasd3
Array, DIMENSION (ldvt, m).
The last m - k columns of vt' contain the deflated right
singular vectors.
vt2 Destroyed on exit.
z Destroyed on exit.
info INTEGER.
If info = 0): successful exit.
Application Notes
subtract like the Cray XMP, Cray YMP, Cray C 90, or Cray 2. It could conceivably fail on
hexadecimal or decimal machines without guard digits, but we know of none.
1673
?lasd4
Computes the square root of the i-th updated
eigenvalue of a positive symmetric rank-one
modification to a positive diagonal matrix. Used by
?bdsdc.
Syntax
call slasd4( n, i, d, z, delta, rho, sigma, work, info )
call dlasd4( n, i, d, z, delta, rho, sigma, work, info )
Description
for C interface.
The routine computes the square root of the i-th updated eigenvalue of a positive symmetric
rank-one modification to a positive diagonal matrix whose entries are given as the squares of
the corresponding entries in the array d, and that 0 ≤ d(i) < d(j) for i < j and that rho
> 0. This is arranged by the calling routine, and is no loss in generality. The rank-one modified
system is thus
diag(d)*diag(d) + rho*Z*ZT,
where the Euclidean norm of Z is equal to 1.The method consists of approximating the rational
functions in the secular equation by simpler interpolating rational functions.
Input Parameters
n INTEGER. The length of all arrays.
i INTEGER.
The index of the eigenvalue to be computed. 1 ≤ i ≤ n.
d REAL for slasd4
The original eigenvalues. They must be in order, 0 ≤ d(i)
< d(j) for i < j.
z REAL for slasd4
1674
The components of the updating vector.
rho REAL for slasd4
Workspace array, DIMENSION (n ).
If n ≠ 1, work contains (d(j) + sigma_i) in its j-th
component.
If n = 1, then work( 1 ) = 1.
Output Parameters
delta REAL for slasd4
If n ≠ 1, delta contains (d(j) - sigma_i) in its j-th
component.
If n = 1, then delta (1) = 1. The vector delta contains
the information necessary to construct the (singular)
eigenvectors.
sigma REAL for slasd4
The computed sigma_i, the i-th updated eigenvalue.
info INTEGER.
= 0: successful exit
> 0: If info = 1, the updating process failed.
1675
?lasd5
Computes the square root of the i-th eigenvalue
of a positive symmetric rank-one modification of
a 2-by-2 diagonal matrix.Used by ?bdsdc.
Syntax
call slasd5( i, d, z, delta, rho, dsigma, work )
call dlasd5( i, d, z, delta, rho, dsigma, work )
Description
for C interface.
The routine computes the square root of the i-th eigenvalue of a positive symmetric rank-one
modification of a 2-by-2 diagonal matrix diag(d)*diag(d)+rho*Z*ZT
The diagonal entries in the array d must satisfy 0 ≤ d(i) < d(j) for i<i, rho mustbe greater
than 0, and that the Euclidean norm of the vector Z is equal to 1.
Input Parameters
i INTEGER.The index of the eigenvalue to be computed. i =
1 or i = 2.
d REAL for slasd5
Array, dimension (2 ).
The original eigenvalues, 0 ≤ d(1) < d(2).
z REAL for slasd5
Array, dimension ( 2 ).
The components of the updating vector.
rho REAL for slasd5
DOUBLE PRECISION for dlasd5.
1676
Workspace array, dimension ( 2 ). Contains (d(j) +
sigma_i) in its j-th component.
Output Parameters
delta REAL for slasd5
Array, dimension ( 2 ).
Contains (d(j) - sigma_i) in its j-th component. The
vector delta contains the information necessary to construct
the eigenvectors.
The computed sigma_i , the i-th updated eigenvalue.
?lasd6
Computes the SVD of an updated upper bidiagonal
matrix obtained by merging two smaller ones by
appending a row. Used by ?bdsdc.
Syntax
call slasd6( icompq, nl, nr, sqre, d, vf, vl, alpha, beta, idxq, perm, givptr,
givcol, ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, iwork,
info )
call dlasd6( icompq, nl, nr, sqre, d, vf, vl, alpha, beta, idxq, perm, givptr,
givcol, ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, iwork,
info )
Description
for C interface.
The routine ?lasd6 computes the SVD of an updated upper bidiagonal matrix B obtained by
merging two smaller ones by appending a row. This routine is used only for the problem which
requires all singular values and optionally singular vector matrices in factored form. B is an
1677
n-by-m matrix with n = nl + nr + 1 and m = n + sqre. A related subroutine, ?lasd1,

handles the case in which all singular values and singular vectors of the bidiagonal matrix are
desired. ?lasd6 computes the SVD as follows:
= U(out)*(D(out)*VT(out)
where Z' = (Z1' a Z2' b) = u'*VT', and u is a vector of dimension m with alpha and beta
in the nl+1 and nl+2-th entries and zeros elsewhere; and the entry b is empty if sqre = 0.
The singular values of B can be computed using D1, D2, the first components of all the right
singular vectors of the lower block, and the last components of all the right singular vectors of
the upper block. These components are stored and updated in vf and vl, respectively, in
?lasd6. Hence U and VT are not explicitly referenced.
The singular values are stored in D. The algorithm consists of two stages:
1. The first stage consists of deflating the size of the problem when there are multiple singular
values or if there is a zero in the Z vector. For each such occurrence the dimension of the
secular equation problem is reduced by one. This stage is performed by the routine ?lasd7.
2. The second stage consists of calculating the updated singular values. This is done by finding
the roots of the secular equation via the routine ?lasd4 (as called by ?lasd8). This routine
also updates vf and vl and computes the distances between the updated singular values
and the old singular values. ?lasd6 is called from ?lasda.
Input Parameters
computed in factored form:
= 0: Compute singular values only
= 1: Compute singular vectors in factored form as well.
nl ≥ 1.
1678
nr ≥ 1.
sqre INTEGER .
= 0: the lower block is an nr-by-nr square matrix.
= 1: the lower block is an nr-by-(nr+1) rectangular matrix.
The bidiagonal matrix has row dimension n=nl+nr+1, and
column dimension m = n + sqre.
d REAL for slasd6
Array, dimension ( nl+nr+1 ). On entry d(1:nl,1:nl)
contains the singular values of the upper block, and
d(nl+2:n) contains the singular values of the lower block.
vf REAL for slasd6
Array, dimension ( m ).
On entry, vf(1:nl+1) contains the first components of all
right singular vectors of the upper block; and vf(nl+2:m)
contains the first components of all right singular vectors
of the lower block.
vl REAL for slasd6
On entry, vl(1:nl+1) contains the last components of all
right singular vectors of the upper block; and vl(nl+2:m)
contains the last components of all right singular vectors of
the lower block.
row.
row.
ldgcol INTEGER.The leading dimension of the output array givcol,
must be at least n.
ldgnum INTEGER.
1679
The leading dimension of the output arrays givnum and

poles, must be at least n.
Workspace array, dimension ( 4m ).
iwork
INTEGER.
Workspace array, dimension ( 3n ).
Output Parameters
d On exit d(1:n) contains the singular values of the modified
matrix.
vf On exit, vf contains the first components of all right singular
vectors of the bidiagonal matrix.
vl On exit, vl contains the last components of all right singular
alpha On exit, the diagonal element associated with the added
row deflated by max(abs(alpha), abs(beta),
abs(D(I))), I = 1,n.
beta On exit, the off-diagonal element associated with the added
row deflated by max(abs(alpha), abs(beta),
abs(D(I))), I = 1,n.
idxq INTEGER.
Array, dimension (n). This contains the permutation which
will reintegrate the subproblem just solved back into sorted
order, that is, d( idxq( i = 1, n ) ) will be in ascending
order.
perm
INTEGER.
Array, dimension (n). The permutations (from deflation
and sorting) to be applied to each block. Not referenced if
icompq = 0.
givptr INTEGER. The number of Givens rotations which took place
in this subproblem. Not referenced if icompq = 0.
givcol
INTEGER.
1680
Array, dimension ( ldgcol, 2 ). Each pair of numbers
indicates a pair of columns to take place in a Givens rotation.
Not referenced if icompq = 0.
givnum REAL for slasd6
Array, dimension ( ldgnum, 2 ). Each number indicates the
C or S value to be used in the corresponding Givens rotation.
poles REAL for slasd6
Array, dimension ( ldgnum, 2 ). On exit, poles(1,*) is an
array containing the new singular values obtained from
solving the secular equation, and poles(2,*) is an array
containing the poles in the secular equation. Not referenced
if icompq = 0.
difl REAL for slasd6
Array, dimension (n). On exit, difl(i) is the distance
between i-th updated (undeflated) singular value and the
i-th (undeflated) old singular value.
difr REAL for slasd6
Array, dimension (ldgnum, 2 ) if icompq = 1 and
dimension (n) if icompq = 0.
On exit, difr(i, 1) is the distance between i-th updated
(undeflated) singular value and the i+1-th (undeflated) old
singular value. If icompq = 1, difr(1: k, 2) is an array
containing the normalizing factors for the right singular
vector matrix.
See ?lasd8 for details on difl and difr.
z REAL for slasd6
The first elements of this array contain the components of
the deflation-adjusted updating row vector.
1681
k INTEGER. Contains the dimension of the non-deflated matrix.

n.
c REAL for slasd6
c contains garbage if sqre =0 and the C-value of a Givens
s REAL for slasd6
s contains garbage if sqre =0 and the S-value of a Givens
info
INTEGER.
< 0: if info = -i, the i-th argument had an illegal value.
> 0: if info = 1, an singular value did not converge
?lasd7
Merges the two sets of singular values together
into a single sorted set. Then it tries to deflate the
size of the problem. Used by ?bdsdc.
Syntax
call slasd7( icompq, nl, nr, sqre, k, d, z, zw, vf, vfw, vl, vlw, alpha, beta,
dsigma, idx, idxp, idxq, perm, givptr, givcol, ldgcol, givnum, ldgnum, c, s,
info )
call dlasd7( icompq, nl, nr, sqre, k, d, z, zw, vf, vfw, vl, vlw, alpha, beta,
dsigma, idx, idxp, idxq, perm, givptr, givcol, ldgcol, givnum, ldgnum, c, s,
info )
Description
for C interface.
1682
The routine ?lasd7 merges the two sets of singular values together into a single sorted set.
Then it tries to deflate the size of the problem. There are two ways in which deflation can occur:
when two or more singular values are close together or if there is a tiny entry in the Z vector.
For each such occurrence the order of the related secular equation problem is reduced by one.
?lasd7 is called from ?lasd6.
Input Parameters
computed in compact form, as follows:
= 0: Compute singular values only.
= 1: Compute singular vectors of upper bidiagonal matrix
in compact form.
nl ≥ 1.
nr ≥ 1.
sqre INTEGER.
= 0: the lower block is an nr-by-nr square matrix.
= 1: the lower block is an nr-by-(nr+1) rectangular matrix.
The bidiagonal matrix has n = nl + nr + 1 rows and m
= n + sqre ≥ n columns.
d REAL for slasd7
Array, DIMENSION (n). On entry d contains the singular
values of the two submatrices to be combined.
zw REAL for slasd7
Array, DIMENSION ( m ).
Workspace for z.
vf REAL for slasd7
Array, DIMENSION ( m ). On entry, vf(1:nl+1) contains
the first components of all right singular vectors of the upper
block; and vf(nl+2:m) contains the first components of all
right singular vectors of the lower block.
vfw REAL for slasd7
1683

Workspace for vf.
vl REAL for slasd7
On entry, vl(1:nl+1) contains the last components of all
right singular vectors of the upper block; and vl(nl+2:m)
contains the last components of all right singular vectors of
the lower block.
VLW REAL for slasd7
Workspace for VL.
row.
row.
idx INTEGER.
permutation used to sort the contents of d into ascending
order.
idxp INTEGER.
permutation used to place deflated values of d at the end
of the array.
idxq INTEGER.
This contains the permutation which separately sorts the
two sub-problems in d into ascending order. Note that
entries in the first half of this permutation must first be
moved one position backward; and entries in the second
half must first have nl+1 added to their values.
1684
ldgcol INTEGER.The leading dimension of the output array givcol,
must be at least n.
ldgnum INTEGER. The leading dimension of the output array givnum,
must be at least n.
Output Parameters
k INTEGER. Contains the dimension of the non-deflated matrix,
this is the order of the related secular equation.
1 ≤ k ≤ n.
d On exit, d contains the trailing (n-k) updated singular values
z REAL for slasd7
On exit, Z contains the updating row vector in the secular
equation.
vf On exit, vf contains the first components of all right singular
vl On exit, vl contains the last components of all right singular
Array, DIMENSION (n). Contains a copy of the diagonal
elements (k-1 singular values and one zero) in the secular
equation.
idxp On output, idxp(2: k) points to the nondeflated d-values
and idxp( k+1:n) points to the deflated singular values.
perm INTEGER.
The permutations (from deflation and sorting) to be applied
to each singular block. Not referenced if icompq = 0.
givptr INTEGER.
The number of Givens rotations which took place in this
subproblem. Not referenced if icompq = 0.
givcol INTEGER.
1685
Array, DIMENSION ( ldgcol, 2 ). Each pair of numbers

indicates a pair of columns to take place in a Givens rotation.
givnum REAL for slasd7
Array, DIMENSION ( ldgnum, 2 ). Each number indicates the
C or S value to be used in the corresponding Givens rotation.
c REAL for slasd7.
If sqre =0, then c contains garbage, and if sqre = 1, then
c contains C-value of a Givens rotation related to the right
null space.
S REAL for slasd7.
If sqre =0, then s contains garbage, and if sqre = 1, then
s contains S-value of a Givens rotation related to the right
null space.
info INTEGER.
?lasd8
Finds the square roots of the roots of the secular
equation, and stores, for each element in D, the
distance to its two nearest poles. Used by ?bdsdc.
Syntax
call slasd8( icompq, k, d, z, vf, vl, difl, difr, lddifr, dsigma, work, info
)
call dlasd8( icompq, k, d, z, vf, vl, difl, difr, lddifr, dsigma, work, info
)
Description
for C interface.
1686
The routine ?lasd8 finds the square roots of the roots of the secular equation, as defined by
the values in dsigma and z. It makes the appropriate calls to ?lasd4, and stores, for each
element in d, the distance to its two nearest poles (elements in dsigma). It also updates the
arrays vf and vl, the first and last components of all the right singular vectors of the original
bidiagonal matrix. ?lasd8 is called from ?lasd6.
Input Parameters
computed in factored form in the calling routine:
= 1: Compute singular vectors in factored form as well.
be solved by ?lasd4. k ≥ 1.
z REAL for slasd8
Array, DIMENSION ( k ).
The first k elements of this array contain the components
of the deflation-adjusted updating row vector.
vf REAL for slasd8
On entry, vf contains information passed through dbede8.
vl REAL for slasd8
Array, DIMENSION ( k ). On entry, vl contains information
passed through dbede8.
lddifr INTEGER. The leading dimension of the output array difr,
must be at least k.
The first k elements of this array contain the old roots of
the deflated updating problem. These are the poles of the
secular equation.
1687
Workspace array, DIMENSION at least (3k).
Output Parameters
d REAL for slasd8
On output, D contains the updated singular values.
vf On exit, vf contains the first k components of the first
components of all right singular vectors of the bidiagonal
matrix.
vl On exit, vl contains the first k components of the last
matrix.
Array, DIMENSION ( k ). On exit, difl(i) = d(i) -
dsigma(i).
Array,
DIMENSION ( lddifr, 2 ) if icompq = 1 and
DIMENSION ( k ) if icompq = 0.
On exit, difr(i,1) = d(i) - dsigma(i+1), difr(k,1)
is not defined and will not be referenced. If icompq = 1,
difr(1:k,2) is an array containing the normalizing factors
for the right singular vector matrix.
info INTEGER.
> 0: If info = 1, an singular value did not converge.
1688
?lasd9
Finds the square roots of the roots of the secular
equation, and stores, for each element in D, the
distance to its two nearest poles. Used by ?bdsdc.
Syntax
call slasd9( icompq, ldu, k, d, z, vf, vl, difl, difr, dsigma, work, info )
call dlasd9( icompq, ldu, k, d, z, vf, vl, difl, difr, dsigma, work, info )
Description
for C interface.
The routine ?lasd9 finds the square roots of the roots of the secular equation, as defined by
the values in dsigma and z. It makes the appropriate calls to ?lasd4, and stores, for each
element in d, the distance to its two nearest poles (elements in dsigma). It also updates the
arrays vf and vl, the first and last components of all the right singular vectors of the original
bidiagonal matrix. ?lasd9 is called from ?lasd7.
Input Parameters
computed in factored form in the calling routine:
If icompq = 0, compute singular values only;
If icompq = 1, compute singular vector matrices in factored
form also.
be solved by slasd4. k ≥ 1.
Array, DIMENSION(k).
The first k elements of this array contain the old roots of
the deflated updating problem. These are the poles of the
secular equation.
z REAL for slasd9
1689

contain the components of the deflation-adjusted updating
row vector.
vf REAL for slasd9
Array, DIMENSION(k). On entry, vf contains information
passed through sbede8.
vl REAL for slasd9
Array, DIMENSION(k). On entry, vl contains information
passed through sbede8.
Workspace array, DIMENSION at least (3k).
Output Parameters
d REAL for slasd9
Array, DIMENSION(k). d(i) contains the updated singular
values.
vf On exit, vf contains the first k components of the first
matrix.
vl On exit, vl contains the first k components of the last
matrix.
Array, DIMENSION (k).
On exit, difl(i) = d(i) - dsigma(i).
Array,
DIMENSION (ldu, 2) if icompq =1 and
DIMENSION (k) if icompq = 0.
1690
On exit, difr(i, 1) = d(i) - dsigma(i+1), difr(k, 1)
is not defined and will not be referenced.
If icompq = 1, difr(1:k, 2) is an array containing the
normalizing factors for the right singular vector matrix.
info INTEGER.
> 0: If info = 1, an singular value did not converge
?lasda
Computes the singular value decomposition (SVD)
of a real upper bidiagonal matrix with diagonal d
and off-diagonal e. Used by ?bdsdc.
Syntax
call slasda( icompq, smlsiz, n, sqre, d, e, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
call dlasda( icompq, smlsiz, n, sqre, d, e, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
Description
for C interface.
Using a divide and conquer approach, ?lasda computes the singular value decomposition (SVD)
of a real upper bidiagonal n-by-m matrix B with diagonal d and off-diagonal e, where m = n +
sqre.
The algorithm computes the singular values in the SVD B = U*S*VT. The orthogonal matrices
U and VT are optionally computed in compact form. A related subroutine ?lasd0 computes the
singular values and the singular vectors in explicit form.
Input Parameters
icompq INTEGER.
Specifies whether singular vectors are to be computed in
compact form, as follows:
1691
= 1: Compute singular vectors of upper bidiagonal matrix

in compact form.
smlsiz INTEGER.
The maximum size of the subproblems at the bottom of the
computation tree.
n INTEGER. The row dimension of the upper bidiagonal matrix.
This is also the dimension of the main diagonal array d.
sqre INTEGER. Specifies the column dimension of the bidiagonal
matrix.
m = n
m = n + 1.
d REAL for slasda
DOUBLE PRECISION for dlasda.
e REAL for slasda
Array, DIMENSION ( m - 1 ). Contains the subdiagonal entries
ldu INTEGER. The leading dimension of arrays u, vt, difl, difr,
poles, givnum, and z. ldu ≥ n.
ldgcol INTEGER. The leading dimension of arrays givcol and perm.
ldgcol ≥ n.
work REAL for slasda
Workspace array, DIMENSION (6n+(smlsiz+1)2).
iwork INTEGER.
Workspace array, Dimension must be at least (7n).
Output Parameters
d On exit d, if info = 0, contains the singular values of the
bidiagonal matrix.
u REAL for slasda
1692
Array, DIMENSION (ldu, smlsiz) if icompq =1.
If icompq = 1, on exit, u contains the left singular vector
matrices of all subproblems at the bottom level.
vt REAL for slasda
Array, DIMENSION ( ldu, smlsiz+1 ) if icompq = 1, and
not referenced if icompq = 0. If icompq = 1, on exit, vt'
contains the right singular vector matrices of all subproblems
at the bottom level.
k INTEGER.
Array, DIMENSION (n) if icompq = 1 and
DIMENSION (1) if icompq = 0.
If icompq = 1, on exit, k(i) is the dimension of the i-th
secular equation on the computation tree.
difl REAL for slasda
Array, DIMENSION ( ldu, nlvl ),
where nlvl = floor(log2(n/smlsiz)).
difr REAL for slasda
Array,
DIMENSION ( ldu, 2 nlvl ) if icompq = 1 and
DIMENSION (n) if icompq = 0.
If icompq = 1, on exit, difl(1:n, i) and difr(1:n,2i -1)
record distances between singular values on the i-th level
and singular values on the (i -1)-th level, and difr(1:n, 2i
) contains the normalizing factors for the right singular
vector matrix. See ?lasd8 for details.
z REAL for slasda
Array,
DIMENSION ( ldu, nlvl ) if icompq = 1 and
DIMENSION (n) if icompq = 0. The first k elements of z(1,
i) contain the components of the deflation-adjusted updating
row vector for subproblems on the i-th level.
1693
poles REAL for slasda

DOUBLE PRECISION for dlasda
Array, DIMENSION (ldu, 2*nlvl)
if icompq = 1, and not referenced if icompq = 0. If icompq
= 1, on exit, poles(1, 2i - 1) and poles(1, 2i) contain the
new and old singular values involved in the secular equations
on the i-th level.
givptr INTEGER. Array, DIMENSION (n) if icompq = 1, and not
referenced if icompq = 0. If icompq = 1, on exit, givptr(
i ) records the number of Givens rotations performed on
the i-th problem on the computation tree.
givcol INTEGER .
Array, DIMENSION (ldgcol, 2*nlvl) if icompq = 1, and
not referenced if icompq = 0. If icompq = 1, on exit, for
each i, givcol(1, 2 i - 1) and givcol(1, 2 i) record the
locations of Givens rotations performed on the i-th level on
perm INTEGER . Array, DIMENSION ( ldgcol, nlvl ) if icompq =
1, and not referenced if icompq = 0. If icompq = 1, on
exit, perm (1, i) records permutations done on the i-th
level of the computation tree.
givnum REAL for slasda
Array DIMENSION ( ldu, 2*nlvl ) if icompq = 1, and not
referenced if icompq = 0. If icompq = 1, on exit, for each
i, givnum(1, 2 i - 1) and givnum(1, 2 i) record the C- and
S-values of Givens rotations performed on the i-th level on
c REAL for slasda
Array,
DIMENSION (n) if icompq = 1, and
If icompq = 1 and the i-th subproblem is not square, on
exit, c(i) contains the C-value of a Givens rotation related
to the right null space of the i-th subproblem.
s REAL for slasda
1694
Array,
DIMENSION (n) icompq = 1, and
If icompq = 1 and the i-th subproblem is not square, on
exit, s(i) contains the S-value of a Givens rotation related
to the right null space of the i-th subproblem.
info INTEGER.
< 0: if info = -i, the i-th argument had an illegal value
> 0: If info = 1, an singular value did not converge
?lasdq
Computes the SVD of a real bidiagonal matrix with
diagonal d and off-diagonal e. Used by ?bdsdc.
Syntax
call slasdq( uplo, sqre, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc,
work, info )
call dlasdq( uplo, sqre, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc,
work, info )
Description
for C interface.
The routine ?lasdq computes the singular value decomposition (SVD) of a real (upper or lower)
bidiagonal matrix with diagonal d and off-diagonal e, accumulating the transformations if
desired. If B is the input bidiagonal matrix, the algorithm computes orthogonal matrices Q and
P such that B = Q*S*PT. The singular values S are overwritten on d.
The input matrix U is changed to U*Q if desired.
The input matrix VT is changed to PT*VT if desired.
The input matrix C is changed to QT*C if desired.
1695
Input Parameters
uplo CHARACTER*1. On entry, uplo specifies whether the input
bidiagonal matrix is upper or lower bidiagonal.
If uplo = 'U' or 'u' , B is upper bidiagonal;
If uplo = 'L' or 'l' , B is lower bidiagonal.
sqre INTEGER.
= 0: then the input matrix is n-by-n.
= 1: then the input matrix is n-by-(n+1) if uplu = 'U' and
(n+1)-by-n if uplu
= 'L'. The bidiagonal matrix has n = nl + nr + 1 rows
and m = n + sqre ≥ n columns.
n INTEGER. On entry, n specifies the number of rows and
columns in the matrix. n must be at least 0.
ncvt INTEGER. On entry, ncvt specifies the number of columns
of the matrix VT. ncvt must be at least 0.
nru INTEGER. On entry, nru specifies the number of rows of the
matrix U. nru must be at least 0.
ncc INTEGER. On entry, ncc specifies the number of columns
of the matrix C. ncc must be at least 0.
d REAL for slasdq
DOUBLE PRECISION for dlasdq.
Array, DIMENSION (n). On entry, d contains the diagonal
entries of the bidiagonal matrix.
e REAL for slasdq
Array, DIMENSION is (n-1) if sqre = 0 and n if sqre = 1.
On entry, the entries of e contain the off-diagonal entries
of the bidiagonal matrix.
vt REAL for slasdq
Array, DIMENSION (ldvt, ncvt). On entry, contains a matrix
T
which on exit has been premultiplied by P , dimension
n-by-ncvt if sqre = 0 and (n+1)-by-ncvt if sqre = 1 (not
referenced if ncvt=0).
1696
ldvt INTEGER. On entry, ldvt specifies the leading dimension
of vt as declared in the calling (sub) program. ldvt must
be at least 1. If ncvt is nonzero, ldvt must also be at least
n.
u REAL for slasdq
Array, DIMENSION (ldu, n). On entry, contains a matrix
which on exit has been postmultiplied by Q, dimension
nru-by-n if sqre = 0 and nru-by-(n+1) if sqre = 1 (not
referenced if nru=0).
ldu INTEGER. On entry, ldu specifies the leading dimension of
u as declared in the calling (sub) program. ldu must be at
least max(1, nru ) .
c REAL for slasdq
Array, DIMENSION (ldc, ncc). On entry, contains an
n-by-ncc matrix which on exit has been premultiplied by
Q', dimension n-by-ncc if sqre = 0 and (n+1)-by-ncc if
sqre = 1 (not referenced if ncc=0).
ldc INTEGER. On entry, ldc specifies the leading dimension of
C as declared in the calling (sub) program. ldc must be at
least 1. If ncc is non-zero, ldc must also be at least n.
work REAL for slasdq
Array, DIMENSION (4n). This is a workspace array. Only
referenced if one of ncvt, nru, or ncc is nonzero, and if n
is at least 2.
Output Parameters
d On normal exit, d contains the singular values in ascending
order.
e On normal exit, e will contain 0. If the algorithm does not
converge, d and e will contain the diagonal and
superdiagonal entries of a bidiagonal matrix orthogonally
equivalent to the one given as input.
vt On exit, the matrix has been premultiplied by P'.
1697
u On exit, the matrix has been postmultiplied by Q.

c On exit, the matrix has been premultiplied by Q'.
info INTEGER. On exit, a value of 0 indicates a successful exit.
If info < 0, argument number -info is illegal. If info >
0, the algorithm did not converge, and info specifies how
many superdiagonals did not converge.
?lasdt
Creates a tree of subproblems for bidiagonal divide
and conquer. Used by ?bdsdc.
Syntax
call slasdt( n, lvl, nd, inode, ndiml, ndimr, msub )
call dlasdt( n, lvl, nd, inode, ndiml, ndimr, msub )
Description
for C interface.
The routine creates a tree of subproblems for bidiagonal divide and conquer.
Input Parameters
n INTEGER. On entry, the number of diagonal elements of the
bidiagonal matrix.
msub INTEGER. On entry, the maximum row dimension each
subproblem at the bottom of the tree can be of.
Output Parameters
lvl INTEGER. On exit, the number of levels on the computation
tree.
nd INTEGER. On exit, the number of nodes on the tree.
inode INTEGER.
Array, DIMENSION (n). On exit, centers of subproblems.
ndiml INTEGER .
1698
Array, DIMENSION (n). On exit, row dimensions of left
children.
ndimr INTEGER .
Array, DIMENSION (n). On exit, row dimensions of right
children.
?laset
Initializes the off-diagonal elements and the
diagonal elements of a matrix to given values.
Syntax
call slaset( uplo, m, n, alpha, beta, a, lda )
call dlaset( uplo, m, n, alpha, beta, a, lda )
call claset( uplo, m, n, alpha, beta, a, lda )
call zlaset( uplo, m, n, alpha, beta, a, lda )
Description
for C interface.
The routine initializes an m-by-n matrix A to beta on the diagonal and alpha on the off-diagonals.
Input Parameters
uplo CHARACTER*1. Specifies the part of the matrix A to be set.
If uplo = 'U', upper triangular part is set; the strictly
lower triangular part of A is not changed.
If uplo = 'L': lower triangular part is set; the strictly
upper triangular part of A is not changed.
Otherwise: All of the matrix A is set.
n ≥ 0.
alpha, beta REAL for slaset
DOUBLE PRECISION for dlaset
1699
COMPLEX for claset

COMPLEX*16 for zlaset.
The constants to which the off-diagonal and diagonal
elements are to be set, respectively.
a REAL for slaset
DOUBLE PRECISION for dlaset
COMPLEX for claset
COMPLEX*16 for zlaset.
On entry, the m-by-n matrix A.
lda ≥ max(1,m).
Output Parameters
a On exit, the leading m-by-n submatrix of A is set as follows:
if uplo = 'U', A(i,j) = alpha, 1≤i≤j-1, 1≤j≤n,
if uplo = 'L', A(i,j) = alpha, j+1≤i≤m, 1≤j≤n,
otherwise, A(i,j) = alpha, 1≤i≤m, 1≤j≤n, i ≠ j,
and, for all uplo, A(i,i) = beta, 1≤i≤min(m, n).
?lasq1
Computes the singular values of a real square
bidiagonal matrix. Used by ?bdsqr.
Syntax
call slasq1( n, d, e, work, info )
call dlasq1( n, d, e, work, info )
Description
for C interface.
1700
The routine ?lasq1 computes the singular values of a real n-by-n bidiagonal matrix with diagonal
d and off-diagonal e. The singular values are computed to high relative accuracy, in the absence
of denormalization, underflow and overflow.
Input Parameters
n INTEGER.The number of rows and columns in the matrix. n
≥ 0.
d REAL for slasq1
DOUBLE PRECISION for dlasq1.
On entry, d contains the diagonal elements of the bidiagonal
matrix whose SVD is desired.
e REAL for slasq1
On entry, elements e(1:n-1) contain the off-diagonal
elements of the bidiagonal matrix whose SVD is desired.
work REAL for slasq1
Output Parameters
d On normal exit, d contains the singular values in decreasing
order.
e On exit, e is overwritten.
info INTEGER.
= 0: successful exit;
< 0: if info = -i, the i-th argument had an illegal value;
> 0: the algorithm failed:
= 1, a split was marked by a positive value in e;
= 2, current block of z not diagonalized after 30n iterations
(in inner while loop);
= 3, termination criterion of outer while loop not met
(program created more than n unreduced blocks.
1701
?lasq2
Computes all the eigenvalues of the symmetric
positive definite tridiagonal matrix associated with
the qd array z to high relative accuracy. Used by
?bdsqr and ?stegr.
Syntax
call slasq2( n, z, info )
call dlasq2( n, z, info )
Description
for C interface.
The routine ?lasq2 computes all the eigenvalues of the symmetric positive definite tridiagonal
matrix associated with the qd array z to high relative accuracy, in the absence of
denormalization, underflow and overflow.
To see the relation of z to the tridiagonal matrix, let L be a unit lower bidiagonal matrix with
subdiagonals z(2,4,6,,..) and let U be an upper bidiagonal matrix with 1's above and diagonal
z(1,3,5,,..). The tridiagonal is LU or, if you prefer, the symmetric tridiagonal to which it is
similar.
Input Parameters
n INTEGER. The number of rows and columns in the matrix.
n ≥ 0.
z REAL for slasq2
Array, DIMENSION (4 * n).
On entry, z holds the qd array.
1702
Output Parameters
z On exit, entries 1 to n hold the eigenvalues in decreasing
order, z(2n+1) holds the trace, and z(2n+2) holds the sum
of the eigenvalues. If n > 2, then z(2n+3) holds the
2
iteration count, z(2n+4) holds ndivs/nin , and z(2n+5)
holds the percentage of shifts that failed.
info INTEGER.
= 0: successful exit;
< 0: if the i-th argument is a scalar and had an illegal value,
then info = -i, if the i-th argument is an array and the
j-entry had an illegal value, then info = -(i*100+ j);
> 0: the algorithm failed:
= 1, a split was marked by a positive value in e;
= 2, current block of z not diagonalized after 30*n iterations
(in inner while loop);
= 3, termination criterion of outer while loop not met
(program created more than n unreduced blocks).
Application Notes
The routine ?lasq2 defines a logical variable, ieee, which is .TRUE. on machines which follow
ieee-754 floating-point standard in their handling of infinities and NaNs, and .FALSE. otherwise.
This variable is passed to ?lazq3.
?lasq3
Checks for deflation, computes a shift and calls
dqds. Used by ?bdsqr.
Syntax
call slasq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee
)
call dlasq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee
)
1703
Description
for C interface.
The routine ?lasq3 checks for deflation, computes a shift, and calls dqds. In case of failure, it
changes shifts, and tries again until output is positive.
WARNING. The ?lasq3 routine is not thread-safe. It is deprecated and retained for
the backward compatibility only. Use the thread-safe ?lazq3 routine instead.
Input Parameters
i0 INTEGER. First index.
n0 INTEGER. Last index.
z REAL for slasq3
Array, DIMENSION (4n). z holds the qd array.
pp INTEGER. pp=0 for ping, pp=1 for pong.
desig REAL for slasq3
Lower order part of sigma.
qmax REAL for slasq3
Maximum value of q.
ieee LOGICAL.
Flag for ieee or non-ieee arithmetic (passed to ?lasq5).
Output Parameters
dmin REAL for slasq3
Minimum value of d.
sigma REAL for slasq3
Sum of shifts used in current segment.
desig Lower order part of sigma.
1704
nfail INTEGER. Number of times shift was too big.
iter INTEGER. Number of iterations.
ndiv INTEGER. Number of divisions.
?lasq4
Computes an approximation to the smallest
eigenvalue using values of d from the previous
transform. Used by ?bdsqr.
Syntax
call slasq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau, ttype
)
call dlasq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau, ttype
)
Description
for C interface.
The routine computes an approximation tau to the smallest eigenvalue using values of d from
the previous transform.
WARNING. The ?lasq4 routine is not thread-safe. It is deprecated and retained for
the backward compatibility only. Use the thread-safe ?lazq4 routine instead.
Input Parameters
z REAL for slasq4
Array, DIMENSION (4n).
Z holds the qd array.
n0in INTEGER. The value of n0 at start of eigtest.
1705

Minimum value of d.
dmin1 REAL for slasq4
Minimum value of d, excluding d(n0).
Minimum value of d, excluding d(n0)
and d(n0-1).
dn REAL for slasq4
DOUBLE PRECISION for dlasq4. Contains d(n).
dn1 REAL for slasq4
DOUBLE PRECISION for dlasq4. Contains d(n-1).
dn2 REAL for slasq4
DOUBLE PRECISION for dlasq4. Contains d(n-2).
Output Parameters
tau REAL for slasq4
Shift.
ttype INTEGER. Shift type.
?lasq5
Computes one dqds transform in ping-pong form.
Used by ?bdsqr and ?stegr.
Syntax
call slasq5( i0, n0, z, pp, tau, dmin, dmin1, dmin2, dn, dnm1, dnm2, ieee )
call dlasq5( i0, n0, z, pp, tau, dmin, dmin1, dmin2, dn, dnm1, dnm2, ieee )
Description
for C interface.
1706
The routine computes one dqds transform in ping-pong form: one version for ieee machines,
another for non-ieee machines.
Input Parameters
z REAL for slasq5
Array, DIMENSION (4n). z holds the qd array. emin is stored
in z(4*n0)
to avoid an extra argument.
tau REAL for slasq5
This is the shift.
ieee LOGICAL. Flag for IEEE or non-IEEE arithmetic.
Output Parameters
Minimum value of d.
Minimum value of d, excluding d(n0) and d(n0-1).
dn REAL for slasq5
DOUBLE PRECISION for dlasq5. Contains d(n0), the last
value of d.
dnm1 REAL for slasq5
DOUBLE PRECISION for dlasq5. Contains d(n0-1).
1707
?lasq6
Computes one dqd transform in ping-pong form.
Used by ?bdsqr and ?stegr.
Syntax
call slasq6( i0, n0, z, pp, dmin, dmin1, dmin2, dn, dnm1, dnm2 )
call dlasq6( i0, n0, z, pp, dmin, dmin1, dmin2, dn, dnm1, dnm2 )
Description
for C interface.
The routine ?lasq6 computes one dqd (shift equal to zero) transform in ping-pong form, with
protection against underflow and overflow.
Input Parameters
z REAL for slasq6
Array, DIMENSION (4n). Z holds the qd array. emin is stored
in z(4*n0) to avoid an extra argument.
Output Parameters
Minimum value of d.
Minimum value of d, excluding d(n0) and d(n0-1).
dn REAL for slasq6
1708
DOUBLE PRECISION for dlasq6. Contains d(n0), the last
value of d.
?lasr
Applies a sequence of plane rotations to a general
rectangular matrix.
Syntax
call slasr( side, pivot, direct, m, n, c, s, a, lda )
call dlasr( side, pivot, direct, m, n, c, s, a, lda )
call clasr( side, pivot, direct, m, n, c, s, a, lda )
call zlasr( side, pivot, direct, m, n, c, s, a, lda )
Description
for C interface.
The routine applies a sequence of plane rotations to a real/complex matrix A, from the left or
the right.
A := P*A, when side = 'L' ( Left-hand side )
A := A*P', when side = 'R' ( Right-hand side )
where P is an orthogonal matrix consisting of a sequence of plane rotations with z = m when
side = 'L' and z = n when side = 'R'.
When direct = 'F' (Forward sequence), then
P = P(z-1)*...P(2)*P(1),
and when direct = 'B' (Backward sequence), then
P = P(1)*P(2)*...*P(z-1),
where P( k ) is a plane rotation matrix defined by the 2-by-2 plane rotation:
1709
When pivot = 'V' ( Variable pivot ), the rotation is performed for the plane (k, k + 1), that
is, P(k) has the form
where R(k) appears as a rank-2 modification to the identity matrix in rows and columns k and
k+1.
When pivot = 'T' ( Top pivot ), the rotation is performed for the plane (1,k+1), so P(k)
has the form
1710
where R(k) appears in rows and columns k and k+1.
Similarly, when pivot = 'B' ( Bottom pivot ), the rotation is performed for the plane (k,z),
giving P(k) the form
where R(k) appears in rows and columns k and z. The rotations are performed without ever
forming P(k) explicitly.
Input Parameters
side CHARACTER*1. Specifies whether the plane rotation matrix
P is applied to A on the left or the right.
1711
= 'L': left, compute A := P*A

= 'R': right, compute A:= A*P'
direct CHARACTER*1. Specifies whether P is a forward or backward
sequence of plane rotations.
= 'F': forward, P = P(z-1)*...*P(2)*P(1)
= 'B': backward, P = P(1)*P(2)*...*P(z-1)
pivot CHARACTER*1. Specifies the plane for which P(k) is a plane
rotation matrix.
= 'V': Variable pivot, the plane (k, k+1)
= 'T': Top pivot, the plane (1, k+1)
= 'B': Bottom pivot, the plane (k, z)
If m ≤ 1, an immediate return is effected.
If n ≤ 1, an immediate return is effected.
c, s REAL for slasr/clasr
DOUBLE PRECISION for dlasr/zlasr.
Arrays, DIMENSION
(m-1) if side = 'L',
(n-1) if side = 'R' .
c(k) and s(k) contain the cosine and sine of the plane
rotations respectively that define the 2-by-2 plane rotation
part (R(k)) of the P(k) matrix as described above in
Description.
a REAL for slasr
DOUBLE PRECISION for dlasr
COMPLEX for clasr
COMPLEX*16 for zlasr.
The m-by-n matrix A.
lda ≥ max(1,m).
1712
Output Parameters
a On exit, A is overwritten by P*A if side = 'R', or by A*P'
if side = 'L'.
?lasrt
Sorts numbers in increasing or decreasing order.
Syntax
call slasrt( id, n, d, info )
call dlasrt( id, n, d, info )
Description
for C interface.
The routine ?lasrt sorts the numbers in d in increasing order (if id = 'I') or in decreasing
order (if id = 'D'). It uses Quick Sort, reverting to Insertion Sort on arrays of size ≤ 20.
32
Dimension of stack limits n to about 2 .
Input Parameters
id CHARACTER*1.
= 'I': sort d in increasing order;
= 'D': sort d in decreasing order.
n INTEGER. The length of the array d.
d REAL for slasrt
DOUBLE PRECISION for dlasrt.
On entry, the array to be sorted.
Output Parameters
d On exit, d has been sorted into increasing order
(d(1) ≤ ... ≤ d(n)) or into decreasing order
(d(1) ≥ ... ≥ d(n)), depending on id.
info INTEGER.
1713
?lassq
Updates a sum of squares represented in scaled
form.
Syntax
call slassq( n, x, incx, scale, sumsq )
call dlassq( n, x, incx, scale, sumsq )
call classq( n, x, incx, scale, sumsq )
call zlassq( n, x, incx, scale, sumsq )
Description
for C interface.
The real routines slassq/dlassq return the values scl and smsq such that
scl2 * smsq = x(1)2 +...+ x(n)2 + scale2 *sumsq,

where x( i ) = x(1 + ( i - 1) incx).
The value of sumsq is assumed to be non-negative and scl returns the value
scl = max( scale, abs(x(i))).

Values scale and sumsq must be supplied in scale and sumsq, and scl and smsq are overwritten
on scale and sumsq, respectively.
The complex routines classq/zlassq return the values scl and ssq such that
scl2 * ssq = x(1)2 +...+ x(n)2 + scale2 *sumsq,

where x(i) = abs(x(1 +(i - 1)*incx)).
The value of sumsq is assumed to be at least unity and the value of ssq will then satisfy 1.0
≤ ssq ≤ sumsq + 2n
scale is assumed to be non-negative and scl returns the value
scl = max( scale, abs(real(x(i))), abs(aimag(x(i)))).
1714
Values scale and sumsq must be supplied in scale and sumsq, and scl and ssq are overwritten
on scale and sumsq, respectively.
All routines ?lassq make only one pass through the vector x.
Input Parameters
n INTEGER. The number of elements to be used from the
vector x.
x REAL for slassq
DOUBLE PRECISION for dlassq
COMPLEX for classq
COMPLEX*16 for zlassq.
The vector for which a scaled sum of squares is computed:
x(i) = x(1+(i-1)*incx), 1 ≤ i ≤ n.
incx INTEGER. The increment between successive values of the
vector x. incx > 0.
scale REAL for slassq/classq
DOUBLE PRECISION for dlassq/zlassq.
On entry, the value scale in the equation above.
sumsq REAL for slassq/classq
DOUBLE PRECISION for dlassq/zlassq.
On entry, the value sumsq in the equation above.
Output Parameters
scale On exit, scale is overwritten with scl, the scaling factor
for the sum of squares.
sumsq For real flavors:
On exit, sumsq is overwritten with the value smsq in the
equation above.
On exit, sumsq is overwritten with the value ssq in the
equation above.
1715
?lasv2
2-by-2 triangular matrix.
Syntax
call slasv2( f, g, h, ssmin, ssmax, snr, csr, snl, csl )
call dlasv2( f, g, h, ssmin, ssmax, snr, csr, snl, csl )
Description
for C interface.
The routine ?lasv2 computes the singular value decomposition of a 2-by-2 triangular matrix
On return, abs(ssmax) is the larger singular value, abs(ssmin) is the smaller singular value,
and (csl,snl) and (csr,snr) are the left and right singular vectors for abs(ssmax), giving the
decomposition
Input Parameters
f, g, h REAL for slasv2
DOUBLE PRECISION for dlasv2.
The (1,1), (1,2) and (2,2) elements of the 2-by-2 matrix,
respectively.
Output Parameters
ssmin, ssmax REAL for slasv2
1716
abs(ssmin) and abs(ssmax) is the smaller and the larger
singular value, respectively.
snl, csl REAL for slasv2
The vector (csl, snl) is a unit left singular vector for the
singular value abs(ssmax).
snr, csr REAL for slasv2
The vector (csr, snr) is a unit right singular vector for the
singular value abs(ssmax).
Application Notes
Any input parameter may be aliased with any output parameter.
Barring over/underflow and assuming a guard digit in subtraction, all output quantities are
correct to within a few units in the last place (ulps).
In ieee arithmetic, the code works correctly if one matrix element is infinite. Overflow will not
occur unless the largest singular value itself overflows or is within a few ulps of overflow. (On
machines with partial overflow, like the Cray, overflow may occur if the largest singular value
is within a factor of 2 of overflow.) Underflow is harmless if underflow is gradual. Otherwise,
results may correspond to a matrix modified by perturbations of size near the underflow
threshold.
?laswp
Performs a series of row interchanges on a general
rectangular matrix.
Syntax
call slaswp( n, a, lda, k1, k2, ipiv, incx )
call dlaswp( n, a, lda, k1, k2, ipiv, incx )
call claswp( n, a, lda, k1, k2, ipiv, incx )
call zlaswp( n, a, lda, k1, k2, ipiv, incx )
1717
Description
for C interface.
The routine performs a series of row interchanges on the matrix A. One row interchange is
initiated for each of rows k1 through k2 of A.
Input Parameters
a REAL for slaswp
DOUBLE PRECISION for dlaswp
COMPLEX for claswp
COMPLEX*16 for zlaswp.
On entry, the matrix of column dimension n to which the
row interchanges will be applied.
k1 INTEGER. The first element of ipiv for which a row
interchange will be done.
k2 INTEGER. The last element of ipiv for which a row
ipiv INTEGER.
Array, DIMENSION (k2*|incx|).
The vector of pivot indices. Only the elements in positions
k1 through k2 of ipiv are accessed.
ipiv(k) = l implies rows k and l are to be interchanged.
incx INTEGER. The increment between successive values of ipiv.
If ipiv is negative, the pivots are applied in reverse order.
Output Parameters
a On exit, the permuted matrix.
1718
?lasy2
Solves the Sylvester matrix equation where the
matrices are of order 1 or 2.
Syntax
call slasy2( ltranl, ltranr, isgn, n1, n2, tl, ldtl, tr, ldtr, b, ldb, scale,
x, ldx, xnorm, info )
call dlasy2( ltranl, ltranr, isgn, n1, n2, tl, ldtl, tr, ldtr, b, ldb, scale,
x, ldx, xnorm, info )
Description
for C interface.
The routine solves for the n1-by-n2 matrix X, 1 ≤ n1, n2 ≤ 2, in
op(TL)*X + isgn*X*op(TR) = scale*B,

where
TL is n1-by-n1,
TR is n2-by-n2,
B is n1-by-n2,
T T
and isgn = 1 or -1. op(T) = T or T , where T denotes the transpose of T.
Input Parameters
ltranl LOGICAL.
On entry, ltranl specifies the op(TL):
= .FALSE., op(TL) = TL,
= .TRUE., op(TL) = (TL)T.
ltranr LOGICAL.
On entry, ltranr specifies the op(TR):
= .FALSE., op(TR) = TR,
= .TRUE., op(TR) = (TR)T.
isgn INTEGER. On entry, isgn specifies the sign of the equation
as described before. isgn may only be 1 or -1.
1719
n1 INTEGER. On entry, n1 specifies the order of matrix TL.

n1 may only be 0, 1 or 2.
n2 INTEGER. On entry, n2 specifies the order of matrix TR.
n2 may only be 0, 1 or 2.
tl REAL for slasy2
DOUBLE PRECISION for dlasy2.
Array, DIMENSION (ldtl,2).
On entry, tl contains an n1-by-n1 matrix TL.
ldtl INTEGER.The leading dimension of the matrix TL.
ldtl ≥ max(1,n1).
tr REAL for slasy2
Array, DIMENSION (ldtr,2). On entry, tr contains an
n2-by-n2 matrix TR.
ldtr INTEGER. The leading dimension of the matrix TR.
ldtr ≥ max(1,n2).
b REAL for slasy2
Array, DIMENSION (ldb,2). On entry, the n1-by-n2 matrix
B contains the right-hand side of the equation.
ldb INTEGER. The leading dimension of the matrix B.
ldb ≥ max(1,n1).
ldx INTEGER. The leading dimension of the output matrix X.
ldx ≥ max(1,n1).
Output Parameters
scale REAL for slasy2
On exit, scale contains the scale factor.
scale is chosen less than or equal to 1 to prevent the
solution overflowing.
x REAL for slasy2
1720
Array, DIMENSION (ldx,2). On exit, x contains the n1-by-n2
solution.
xnorm REAL for slasy2
On exit, xnorm is the infinity-norm of the solution.
info INTEGER. On exit, info is set to 0: successful exit. 1: TL
and TR have too close eigenvalues, so TL or TR is perturbed
to get a nonsingular equation.
?lasyf
Computes a partial factorization of a real/complex
symmetric matrix, using the diagonal pivoting
method.
Syntax
call slasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call dlasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call clasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Description
for C interface.
The routine ?lasyf computes a partial factorization of a real/complex symmetric matrix A using
the Bunch-Kaufman diagonal pivoting method. The partial factorization has the form:
1721
where the order of D is at most nb.
The actual order is returned in the argument kb, and is either nb or nb-1, or n if n ≤ nb.
This is an auxiliary routine called by ?sytrf. It uses blocked code (calling Level 3 BLAS) to
update the submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').
Input Parameters
uplo CHARACTER*1.
symmetric matrix A is stored:
= 'L': Lower triangular
nb INTEGER. The maximum number of columns of the matrix
A that should be factored. nb should be at least 2 to allow
for 2-by-2 pivot blocks.
a REAL for slasyf
DOUBLE PRECISION for dlasyf
COMPLEX for clasyf
COMPLEX*16 for zlasyf.
Array, DIMENSION (lda, n). If uplo = 'U', the leading
n-by-n upper triangular part of a contains the upper
triangular part of the matrix A, and the strictly lower
triangular part of a is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of a contains the lower
triangular part of the matrix A, and the strictly upper
max(1,n).
w REAL for slasyf
DOUBLE PRECISION for dlasyf
1722
COMPLEX for clasyf
COMPLEX*16 for zlasyf.
Workspace array, DIMENSION (ldw, nb).
ldw INTEGER. The leading dimension of the array w. ldw ≥
max(1,n).
Output Parameters
kb INTEGER. The number of columns of A that were actually
factored kb is either nb-1 or nb, or n if n ≤ nb.
a On exit, a contains details of the partial factorization.
ipiv INTEGER. Array, DIMENSION (n ). Details of the interchanges
and the block structure of D.
If uplo = 'U', only the last kb elements of ipiv are set;
if uplo = 'L', only the first kb elements are set.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows
and columns k-1 and -ipiv(k) were interchanged and
D(k-1:k, k-1:k) is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows
and columns k+1 and -ipiv(k) were interchanged and
D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
> 0: if info = k, D(k, k) is exactly zero. The factorization
has been completed, but the block diagonal matrix D is
exactly singular.
1723
?lahef
Computes a partial factorization of a complex
Hermitian indefinite matrix, using the diagonal
pivoting method.
Syntax
call clahef( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlahef( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Description
for C interface.
The routine ?lahef computes a partial factorization of a complex Hermitian matrix A, using
the Bunch-Kaufman diagonal pivoting method. The partial factorization has the form:
where the order of D is at most nb.
The actual order is returned in the argument kb, and is either nb or nb-1, or n if n ≤ nb.
Note that U' denotes the conjugate transpose of U.
This is an auxiliary routine called by ?hetrf. It uses blocked code (calling Level 3 BLAS) to
update the submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').
Input Parameters
uplo CHARACTER*1.
1724
Hermitian matrix A is stored:
= 'U': upper triangular
= 'L': lower triangular
nb INTEGER. The maximum number of columns of the matrix
A that should be factored. nb should be at least 2 to allow
for 2-by-2 pivot blocks.
a COMPLEX for clahef
COMPLEX*16 for zlahef.
On entry, the Hermitian matrix A.
A contains the upper triangular part of the matrix A, and
A contains the lower triangular part of the matrix A, and the
strictly upper triangular part of A is not referenced.
max(1,n).
w COMPLEX for clahef
COMPLEX*16 for zlahef.
Workspace array, DIMENSION (ldw, nb).
ldw INTEGER. The leading dimension of the array w. ldw ≥
max(1,n).
Output Parameters
kb INTEGER. The number of columns of A that were actually
factored kb is either nb-1 or nb, or n if n ≤ nb.
a On exit, A contains details of the partial factorization.
ipiv INTEGER.
Array, DIMENSION (n ). Details of the interchanges and the
block structure of D.
If uplo = 'U', only the last kb elements of ipiv are set;
1725
if uplo = 'L', only the first kb elements are set.

If ipiv(k) > 0, then rows and columns k and ipiv(k) are
interchanged and D(k, k) is a 1-by-1 diagonal block.
and columns k-1 and -ipiv(k) are interchanged and
D(k-1:k, k-1:k) is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows
and columns k+1 and -ipiv(k) are interchanged and D(
k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
> 0: if info = k, D(k, k) is exactly zero. The factorization
exactly singular.
?latbs
Solves a triangular banded system of equations.
Syntax
call slatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm,
info )
call dlatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm,
info )
call clatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm,
info )
call zlatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm,
info )
Description
for C interface.
The routine solves one of the triangular systems
A*x = s*b, or AT*x = s*b, or AH*x = s*b (for complex flavors)
1726
T
with scaling to prevent overflow, where A is an upper or lower triangular band matrix. Here A
H
denotes the transpose of A, A denotes the conjugate transpose of A, x and b are n-element
vectors, and s is a scaling factor, usually less than or equal to 1, chosen so that the components
of x will be less than the overflow threshold. If the unscaled problem will not cause overflow,
the Level 2 BLAS routine ?tbsv is called. If the matrix A is singular (A(j, j)=0 for some j),
then s is set to 0 and a non-trivial solution to A*x = 0 is returned.
Input Parameters
uplo CHARACTER*1.
trans CHARACTER*1.
Specifies the operation applied to A.
= 'N': solve A*x = s*b (no transpose)
T
= 'T': solve A *x = s*b (transpose)
H
= 'C': solve A *x = s*b (conjugate transpose)
diag CHARACTER*1.
Specifies whether the matrix A is unit triangular
= 'N': non-unit triangular
= 'U': unit triangular
normin CHARACTER*1.
Specifies whether cnorm is set.
= 'Y': cnorm contains the column norms on entry;
= 'N': cnorm is not set on entry. On exit, the norms is
computed and stored in cnorm.
kd INTEGER. The number of subdiagonals or superdiagonals in
the triangular matrix A. kb ≥ 0.
ab REAL for slatbs
DOUBLE PRECISION for dlatbs
COMPLEX for clatbs
COMPLEX*16 for zlatbs.
Array, DIMENSION (ldab, n).
1727
The upper or lower triangular band matrix A, stored in the

first kb+1 rows of the array. The j-th column of A is stored
in the j-th column of the array ab as follows:
if uplo = 'U', ab(kd+1+i -j,j) = A(i ,j) for max(1,
j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i -j,j) = A(i ,j) for j ≤ i ≤
min(n, j+kd).
kb+1.
x REAL for slatbs
DOUBLE PRECISION for dlatbs
COMPLEX for clatbs
COMPLEX*16 for zlatbs.
On entry, the right hand side b of the triangular system.
cnorm REAL for slatbs/clatbs
DOUBLE PRECISION for dlatbs/zlatbs.
If NORMIN = 'Y', cnorm is an input argument and cnorm(j)
contains the norm of the off-diagonal part of the j-th column
of A.
If trans = 'N', cnorm(j) must be greater than or equal
to the infinity-norm, and if trans = 'T' or 'C' , cnorm(j)
must be greater than or equal to the 1-norm.
Output Parameters
scale REAL for slatbs/clatbs
DOUBLE PRECISION for dlatbs/zlatbs.
The scaling factor s for the triangular system as described
above. If scale = 0, the matrix A is singular or badly
scaled, and the vector x is an exact or approximate solution
to Ax = 0.
cnorm If normin = 'N', cnorm is an output argument and
cnorm(j) returns the 1-norm of the off-diagonal part of the
j-th column of A.
1728
info INTEGER.
< 0: if info = -k, the k-th argument had an illegal value
?latdf
Uses the LU factorization of the n-by-n matrix
computed by ?getc2 and computes a contribution
to the reciprocal Dif-estimate.
Syntax
call slatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call dlatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call clatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call zlatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
Description
for C interface.
The routine ?latdf uses the LU factorization of the n-by-n matrix Z computed by ?getc2 and
computes a contribution to the reciprocal Dif-estimate by solving Z*x = b for x, and choosing
the right-hand side b such that the norm of x is as large as possible. On entry rhs = b holds
the contribution from earlier solved sub-systems, and on return rhs = x.
The factorization of Z returned by ?getc2 has the form Z = P*L*U*Q, where P and Q are
permutation matrices. L is lower triangular with unit diagonal elements and U is upper triangular.
Input Parameters
ijob INTEGER.
ijob = 2: First compute an approximative null-vector e of
Z using ?gecon, e is normalized, and solve for Z*x = ±e-f
with the sign giving the greater value of 2-norm(x). This
option is about 5 times as expensive as default.
ijob ≠ 2 (default): Local look ahead strategy where all
entries of the right-hand side b is chosen as either +1 or -1
.
1729
n INTEGER. The number of columns of the matrix Z.

z REAL for slatdf/clatdf
DOUBLE PRECISION for dlatdf/zlatdf.
Array, DIMENSION (ldz, n)
On entry, the LU part of the factorization of the n-by-n
matrix Z computed by ?getc2: Z = P*L*U*Q.
ldz INTEGER. The leading dimension of the array Z. lda ≥
max(1, n).
rhs REAL for slatdf/clatdf
On entry, rhs contains contributions from other subsystems.
rdsum REAL for slatdf/clatdf
On entry, the sum of squares of computed contributions to
the Dif-estimate under computation by ?tgsyL, where the
scaling factor rdscal has been factored out. If trans =
'T', rdsum is not touched.
Note that rdsum only makes sense when ?tgsy2 is called
by ?tgsyL.
rdscal REAL for slatdf/clatdf
On entry, scaling factor used to prevent overflow in rdsum.
If trans = T', rdscal is not touched.
Note that rdscal only makes sense when ?tgsy2 is called
by ?tgsyL.
ipiv INTEGER.
The pivot indices; for 1 ≤ i ≤ n, row i of the matrix has
jpiv INTEGER.
The pivot indices; for 1 ≤j≤ n, column j of the matrix has
been interchanged with column jpiv(j).
1730
Output Parameters
rhs On exit, rhs contains the solution of the subsystem with
entries according to the value of ijob.
rdsum On exit, the corresponding sum of squares updated with
the contributions from the current sub-system.
If trans = 'T', rdsum is not touched.
rdscal On exit, rdscal is updated with respect to the current
contributions in rdsum.
If trans = 'T', rdscal is not touched.
?latps
Solves a triangular system of equations with the
matrix held in packed storage.
Syntax
call slatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call dlatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call clatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call zlatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
Description
for C interface.
The routine ?latps solves one of the triangular systems
with scaling to prevent overflow, where A is an upper or lower triangular matrix stored in packed
T H
form. Here A denotes the transpose of A, A denotes the conjugate transpose of A, x and b are
n-element vectors, and s is a scaling factor, usually less than or equal to 1, chosen so that the
components of x will be less than the overflow threshold. If the unscaled problem does not
cause overflow, the Level 2 BLAS routine ?tpsv is called. If the matrix A is singular (A(j, j)
= 0 for some j), then s is set to 0 and a non-trivial solution to A*x = 0 is returned.
1731
Input Parameters
uplo CHARACTER*1.
= 'L': uower triangular
trans CHARACTER*1.
= 'T': solve AT*x = s*b (transpose)
= 'C': solve AH*x = s*b (conjugate transpose)
diag CHARACTER*1.
Specifies whether the matrix A is unit triangular.
= 'U': unit triangular
normin CHARACTER*1.
Specifies whether cnorm is set.
= 'N': cnorm is not set on entry. On exit, the norms will
be computed and stored in cnorm.
ap REAL for slatps
DOUBLE PRECISION for dlatps
COMPLEX for clatps
COMPLEX*16 for zlatps.
The upper or lower triangular matrix A, packed columnwise
in a linear array. The j-th column of A is stored in the array
ap as follows:
if uplo = 'U', ap(i + (j-1)j/2) = A(i,j) for 1≤ i ≤
j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = A(i, j) for
j≤i≤n.
x REAL for slatps DOUBLE PRECISION for dlatps
COMPLEX for clatps
COMPLEX*16 for zlatps.
1732
Array, DIMENSION (n)
cnorm REAL for slatps/clatps
DOUBLE PRECISION for dlatps/zlatps.
If normin = 'Y', cnorm is an input argument and cnorm(j)
contains the norm of the off-diagonal part of the j-th column
of A.
If trans = 'N', cnorm(j) must be greater than or equal
to the infinity-norm, and if trans = 'T' or 'C' , cnorm(j)
Output Parameters
x On exit, x is overwritten by the solution vector x.
scale REAL for slatps/clatps
DOUBLE PRECISION for dlatps/zlatps.
The scaling factor s for the triangular system as described
above.
If scale = 0, the matrix A is singular or badly scaled, and
the vector x is an exact or approximate solution to A*x =
0.
j-th column of A.
info INTEGER.
1733
?latrd
Reduces the first nb rows and columns of a
symmetric/Hermitian matrix A to real tridiagonal
form by an orthogonal/unitary similarity
transformation.
Syntax
call slatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call dlatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call clatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call zlatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
Description
for C interface.
The routine ?latrd reduces nb rows and columns of a real symmetric or complex Hermitian
matrix A to symmetric/Hermitian tridiagonal form by an orthogonal/unitary similarity
transformation QT*A*Q for real flavors, QH*A*Q for complex flavors, and returns the matrices
V and W which are needed to apply the transformation to the unreduced part of A.
If uplo = 'U', ?latrd reduces the last nb rows and columns of a matrix, of which the upper
triangle is supplied;
if uplo = 'L', ?latrd reduces the first nb rows and columns of a matrix, of which the lower
triangle is supplied.
This is an auxiliary routine called by ?sytrd/?hetrd.
Input Parameters
uplo CHARACTER*1.
symmetric/Hermitian matrix A is stored:
nb INTEGER. The number of rows and columns to be reduced.
1734
a REAL for slatrd
DOUBLE PRECISION for dlatrd
COMPLEX for clatrd
COMPLEX*16 for zlatrd.
On entry, the symmetric/Hermitian matrix A
(1,n).
ldw INTEGER.
The leading dimension of the output array w. ldw ≥
max(1,n).
Output Parameters
a On exit, if uplo = 'U', the last nb columns have been
reduced to tridiagonal form, with the diagonal elements
overwriting the diagonal elements of a; the elements above
the diagonal with the array tau, represent the
reflectors;
if uplo = 'L', the first nb columns have been reduced to
tridiagonal form, with the diagonal elements overwriting the
diagonal elements of a; the elements below the diagonal
with the array tau, represent the orthogonal/unitary matrix
Q as a product of elementary reflectors.
e REAL for slatrd/clatrd
DOUBLE PRECISION for dlatrd/zlatrd.
If uplo = 'U', e(n-nb:n-1) contains the superdiagonal
elements of the last nb columns of the reduced matrix;
if uplo = 'L', e(1:nb) contains the subdiagonal elements
of the first nb columns of the reduced matrix.
1735
tau REAL for slatrd

COMPLEX for clatrd
The scalar factors of the elementary reflectors, stored in
tau(n-nb:n-1) if uplo = 'U', and in tau(1:nb) if uplo
= 'L'.
w REAL for slatrd
COMPLEX for clatrd
The n-by-nb matrix W required to update the unreduced part
of A.
Application Notes
If uplo = 'U', the matrix Q is represented as a product of elementary reflectors
Q = H(n)*H(n-1)*...*H(n-nb+1)
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(i:n) = 0 and v(i-1)
= 1; v(1: i-1) is stored on exit in a(1: i-1, i), and tau in tau(i-1).
If uplo = 'L', the matrix Q is represented as a product of elementary reflectors
Q = H(1)*H(2)*...*H(nb)
Each H(i) has the form H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1: i) = 0 and
v(i+1) = 1; v( i+1:n) is stored on exit in a(i+1:n, i), and tau in tau(i).
The elements of the vectors v together form the n-by-nb matrix V which is needed, with W, to
apply the transformation to the unreduced part of the matrix, using a symmetric/Hermitian
rank-2k update of the form:
A := A - VW' - WV'.
The contents of a on exit are illustrated by the following examples with n = 5 and nb = 2:
1736
where d denotes a diagonal element of the reduced matrix, a denotes an element of the original
matrix that is unchanged, and vi denotes an element of the vector defining H(i).
?latrs
scale factor set to prevent overflow.
Syntax
call slatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call dlatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call clatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call zlatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
Description
for C interface.
The routine solves one of the triangular systems
T
with scaling to prevent overflow. Here A is an upper or lower triangular matrix, A denotes the
H
transpose of A, A denotes the conjugate transpose of A, x and b are n-element vectors, and s
is a scaling factor, usually less than or equal to 1, chosen so that the components of x will be
1737
less than the overflow threshold. If the unscaled problem will not cause overflow, the Level 2
BLAS routine ?trsv is called. If the matrix A is singular (A(j,j) = 0 for some j), then s is
set to 0 and a non-trivial solution to A*x = 0 is returned.
Input Parameters
uplo CHARACTER*1.
trans CHARACTER*1.
= 'T': solve AT*x = s*b (transpose)
= 'C': solve AH*x = s*b (conjugate transpose)
diag CHARACTER*1.
normin CHARACTER*1.
Specifies whether cnorm has been set or not.
= 'N': cnorm is not set on entry. O
n exit, the norms will be computed and stored in cnorm.
n INTEGER. The order of the matrix A. n ≥ 0
a REAL for slatrs
DOUBLE PRECISION for dlatrs
COMPLEX for clatrs
COMPLEX*16 for zlatrs.
Array, DIMENSION (lda, n). Contains the triangular matrix
A.
the array a contains the upper triangular matrix, and the
strictly lower triangular part of A is not referenced.
the array a contains the lower triangular matrix, and the
1738
If diag = 'U', the diagonal elements of A are also not
referenced and are assumed to be 1.
max(1, n).
x REAL for slatrs
DOUBLE PRECISION for dlatrs
COMPLEX for clatrs
COMPLEX*16 for zlatrs.
cnorm REAL for slatrs/clatrs
DOUBLE PRECISION for dlatrs/zlatrs.
If normin = 'Y', cnorm is an input argument and cnorm
(j) contains the norm of the off-diagonal part of the j-th
column of A.
If trans = 'N', cnorm (j) must be greater than or equal
to the infinity-norm, and if trans = 'T' or 'C', cnorm(j)
Output Parameters
x On exit, x is overwritten by the solution vector x.
scale REAL for slatrs/clatrs
DOUBLE PRECISION for dlatrs/zlatrs.
Array, DIMENSION (lda, n). The scaling factor s for the
triangular system as described above.
the vector x is an exact or approximate solution to A*x =
0.
j-th column of A.
info INTEGER.
1739
Application Notes
A rough bound on x is computed; if that is less than overflow, ?trsv is called, otherwise,
specific code is used which checks for possible overflow or divide-by-zero at every operation.
A columnwise scheme is used for solving Ax = b. The basic algorithm if A is lower triangular
is
x[1:n] := b[1:n]
for j = 1, ..., n
x(j) := x(j) / A(j,j)
x[j+1:n] := x[j+1:n] - x(j)*a[j+1:n,j]
end
Define bounds on the components of x after j iterations of the loop:
M(j) = bound on x[1:j]

G(j) = bound on x[j+1:n]
Initially, let M(0) = 0 and G(0) = max{x(i), i=1,...,n}.
Then for iteration j+1 we have
M(j+1) ≤ G(j) / | a(j+1,j+1)|
G(j+1) ≤ G(j) + M(j+1)*| a[j+2:n,j+1]|
≤ G(j)(1 + cnorm(j+1)/ | a(j+1,j+1)|,

where cnorm(j+1) is greater than or equal to the infinity-norm of column j+1 of a, not counting
the diagonal. Hence
and
1740
Since |x(j)| ≤ M(j), we use the Level 2 BLAS routine ?trsv if the reciprocal of the largest
M(j), j=1,..,n, is larger than max(underflow, 1/overflow).
The bound on x(j) is also used to determine when a step in the columnwise method can be
performed without fear of overflow. If the computed bound is greater than a large constant, x
is scaled to prevent overflow, but if the bound overflows, x is set to 0, x(j) to 1, and scale to
0, and a non-trivial solution to Ax = 0 is found.
Similarly, a row-wise scheme is used to solve ATx = b or AHx = b. The basic algorithm for A
upper triangular is
for j = 1, ..., n
x(j) := ( b(j) - A[1:j-1,j]' x[1:j-1]) / A(j,j)
end
We simultaneously compute two bounds
G(j) = bound on ( b(i) - A[1:i-1,i]'*x[1:i-1]), 1≤ i≤ j
M(j) = bound on x(i), 1≤ i≤ j

The initial values are G(0) = 0, M(0) = max{ b(i), i=1,..,n}, and we add the constraint
G(j) ≥ G(j-1) and M(j) ≥ M(j-1) for j ≥ 1.
Then the bound on x(j) is
M(j) ≤ M(j-1) *(1 + cnorm(j)) / | A(j,j)|
and we can safely call ?trsv if 1/M(n) and 1/G(n) are both greater than max(underflow,
1/overflow).
1741
?latrz
Factors an upper trapezoidal matrix by means of
Syntax
call slatrz( m, n, l, a, lda, tau, work )
call dlatrz( m, n, l, a, lda, tau, work )
call clatrz( m, n, l, a, lda, tau, work )
call zlatrz( m, n, l, a, lda, tau, work )
Description
for C interface.
The routine ?latrz factors the m-by-(m+l) real/complex upper trapezoidal matrix
[A1 A2] = [A(1:m,1:m) A(1: m, n-l+1:n)]

as ( R 0 )* Z, by means of orthogonal/unitary transformations. Z is an (m+l)-by-(m+l)
orthogonal/unitary matrix and R and A1 are m-by -m upper triangular matrices.
Input Parameters
l INTEGER. The number of columns of the matrix A containing
the meaningful part of the Householder vectors.
n-m ≥ l ≥ 0.
a REAL for slatrz
DOUBLE PRECISION for dlatrz
COMPLEX for clatrz
COMPLEX*16 for zlatrz.
On entry, the leading m-by-n upper trapezoidal part of the
array a must contain the matrix to be factorized.
1742
max(1,m).
work REAL for slatrz
COMPLEX for clatrz
Workspace array, DIMENSION (m).
Output Parameters
a On exit, the leading m-by-m upper triangular part of a
contains the upper triangular matrix R, and elements n-l+1
to n of the first m rows of a, with the array tau, represent
the orthogonal/unitary matrix Z as a product of m elementary
reflectors.
tau REAL for slatrz
COMPLEX for clatrz
The scalar factors of the elementary reflectors.
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, z(k),
which is used to introduce zeros into the (m - k + 1)-th row of A, is given in the form
where
1743
tau is a scalar and z(k) is an l-element vector. tau and z(k) are chosen to annihilate the
elements of the k-th row of A2.
The scalar tau is returned in the k-th element of tau and the vector u(k) in the k-th row of
A2, such that the elements of z(k) are in a(k, l+1), ..., a(k, n).
The elements of r are returned in the upper triangular part of A1.
Z is given by
Z = Z(1)*Z(2)*...*Z(m).
?lauu2
Computes the product U*UT(U*UH) or LT*L (LH*L),
where U and L are upper or lower triangular
matrices (unblocked algorithm).
Syntax
call slauu2( uplo, n, a, lda, info )
call dlauu2( uplo, n, a, lda, info )
call clauu2( uplo, n, a, lda, info )
call zlauu2( uplo, n, a, lda, info )
Description
for C interface.
The routine ?lauu2 computes the product U*UT or LT*L for real flavors, and LH*L or LH*L for
complex flavors. Here the triangular factor U or L is stored in the upper or lower triangular part
of the array a.
If uplo = 'U' or 'u' , then the upper triangle of the result is stored, overwriting the factor U
in A.
If uplo = 'L' or 'l', then the lower triangle of the result is stored, overwriting the factor L
in A.
This is the unblocked form of the algorithm, calling BLAS Level 2 Routines.
1744
Input Parameters
uplo CHARACTER*1.
Specifies whether the triangular factor stored in the array
a is upper or lower triangular:
n INTEGER. The order of the triangular factor U or L. n ≥ 0.
a REAL for slauu2
DOUBLE PRECISION for dlauu2
COMPLEX for clauu2
COMPLEX*16 for zlauu2.
Array, DIMENSION (lda, n). On entry, the triangular factor
U or L.
max(1,n).
Output Parameters
a On exit,
if uplo = 'U', then the upper triangle of a is overwritten
with the upper triangle of the product U*UT(U*UH);
if uplo = 'L', then the lower triangle of a is overwritten
with the lower triangle of the product LT*L (LH*L).
info INTEGER.
1745
?lauum
Computes the product U*UT(U*UH) or LT*L (LH*L),
where U and L are upper or lower triangular
matrices (blocked algorithm).
Syntax
call slauum( uplo, n, a, lda, info )
call dlauum( uplo, n, a, lda, info )
call clauum( uplo, n, a, lda, info )
call zlauum( uplo, n, a, lda, info )
Description
for C interface.
The routine ?lauum computes the product U*UT or LT*L for real flavors, and LH*L or LH*L for
complex flavors. Here the triangular factor U or L is stored in the upper or lower triangular part
of the array a.
If uplo = 'U' or 'u', then the upper triangle of the result is stored, overwriting the factor U
in A.
in A.
This is the blocked form of the algorithm, calling BLAS Level 3 Routines.
Input Parameters
uplo CHARACTER*1.
Specifies whether the triangular factor stored in the array
a is upper or lower triangular:
n INTEGER. The order of the triangular factor U or L. n ≥ 0.
a REAL for slauum
DOUBLE PRECISION for dlauum
COMPLEX for clauum
1746
COMPLEX*16 for zlauum .
On entry, the triangular factor U or L.
max(1,n).
Output Parameters
a On exit,
if uplo = 'U', then the upper triangle of a is overwritten
with the upper triangle of the product U*UT(U*UH);
if uplo = 'L', then the lower triangle of a is overwritten
with the lower triangle of the product LT*L (LH*L).
info INTEGER.
?lazq3
Checks for deflation, computes a shift and calls
dqds.
Syntax
call slazq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee,
ttype, dmin1, dmin2, dn, dn1, dn2, tau )
call dlazq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee,
ttype, dmin1, dmin2, dn, dn1, dn2, tau )
Description
for C interface.
The routine ?lazq3 checks for deflation, computes a shift (tau) and calls dqds. In case of
failure, it changes shifts, and tries again until output is positive.
This routine is a thread safe version of ?lasq3 routine, which passes ttype, dmin1, dmin2, dn,
dn1, dn2, and tau through the argument list in place of declaring them in a SAVE statement.
1747
Input Parameters
i0 INTEGER.
First index.
n0 INTEGER.
Last index.
z REAL for slasq3
Array, DIMENSION (4*n). z holds the qd array.
pp INTEGER.
pp=0 for ping, pp=1 for pong.
desig REAL for slazq3
DOUBLE PRECISION for dlazq3.
Lower order part of sigma.
qmax REAL for slazq3
Maximum value of q.
ieee LOGICAL.
Flag for IEEE or non-IEEE arithmetic (passed to the routine).
ttype INTEGER. Shift type.
dmin1 REAL for slazq3
Minimum value of d, excluding d(n0). Should be 0 on entry
at the first iteration and should not be modified further.
Minimum value of d, excluding d(n0) and d(n0-1). Should
be 0 on entry at the first iteration and should not be
modified further.
dn REAL for slazq3
Contains d(n). Should be 0 on entry at the first iteration
and should not be modified further.
dn1 REAL for slazq3
1748
Contains d(n-1). Should be 0 on entry at the first iteration
dn2 REAL for slazq3
Contains d(n-2). Should be 0 on entry at the first iteration
tau REAL for slazq3
Shift value
Output Parameters
dmin REAL for slazq3
Minimum value of d.
sigma REAL for slazq3
Sum of shifts used in current segment.
desig Lower order part of sigma.
nfail INTEGER.
Number of times shift was too big.
iter INTEGER.
Number of iterations.
ndiv INTEGER.
Number of divisions.
ttype Shift type.
dmin1 Minimum value of d, excluding d(n0).
dmin2 Minimum value of d, excluding d(n0) and d(n0-1).
dn d(n).
dn1 d(n-1).
dn2 d(n-2).
tau Shift value.
1749
?lazq4
Computes an approximation to the smallest
eigenvalue using values of d from the previous
transform.
Syntax
call slazq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau,
ttype, g )
call dlazq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau,
ttype, g )
Description
for C interface.
The routine computes an approximation tau to the smallest eigenvalue using values of d from
the previous transform.
This routine is a thread safe version of ?lasq4 routine, which passes g through the argument
list in place of declaring g in a SAVE statement.
Input Parameters
i0 INTEGER.
First index.
n0 INTEGER.
Last index.
z REAL for slazq4
Array, DIMENSION (4*n).
z holds the qd array.
pp INTEGER.
pp=0 for ping, pp=1 for pong.
n0in INTEGER. The value of n0 at start of eigtest.
dmin REAL for slazq4
Minimum value of d.
1750
Minimum value of d, excluding d(n0)
and d(n0-1).
dn REAL for slazq4
Contains d(n).
dn1 REAL for slazq4
Contains d(n-1).
dn2 REAL for slazq4
Contains d(n-2).
g REAL for slazq4
Shift coefficient.
Output Parameters
tau REAL for slazq4
Shift value.
ttype INTEGER.
Shift type.
g Shift coefficient.
1751
?org2l/?ung2l
Generates all or part of the orthogonal/unitary
matrix Q from a QL factorization determined by
?geqlf (unblocked algorithm).
Syntax
call sorg2l( m, n, k, a, lda, tau, work, info )
call dorg2l( m, n, k, a, lda, tau, work, info )
call cung2l( m, n, k, a, lda, tau, work, info )
call zung2l( m, n, k, a, lda, tau, work, info )
Description
for C interface.
The routine ?org2l/?ung2l generates an m-by-n real/complex matrix Q with orthonormal
columns, which is defined as the last n columns of a product of k elementary reflectors of order
m:
Q = H(k)*...*H(2)*H(1) as returned by ?geqlf.
Input Parameters
m INTEGER. The number of rows of the matrix Q. m ≥ 0.
n INTEGER. The number of columns of the matrix Q. m ≥ n
≥ 0.
product defines the matrix Q. n ≥ k ≥ 0.
a REAL for sorg2l
DOUBLE PRECISION for dorg2l
COMPLEX for cung2l
COMPLEX*16 for zung2l.
1752
On entry, the (n -k+i)-th column must contain the vector
which defines the elementary reflector H(i), for i = 1,2,...,
k, as returned by ?geqlf in the last k columns of its array
argument A.
lda INTEGER. The first dimension of the array a. lda ≥
max(1,m).
tau REAL for sorg2l
COMPLEX for cung2l
reflector H(i), as returned by ?geqlf.
work REAL for sorg2l
COMPLEX for cung2l
Output Parameters
a On exit, the m-by-n matrix Q.
info INTEGER.
1753
?org2r/?ung2r
matrix Q from a QR factorization determined by
?geqrf (unblocked algorithm).
Syntax
call sorg2r( m, n, k, a, lda, tau, work, info )
call dorg2r( m, n, k, a, lda, tau, work, info )
call cung2r( m, n, k, a, lda, tau, work, info )
call zung2r( m, n, k, a, lda, tau, work, info )
Description
for C interface.
The routine ?org2r/?ung2r generates an m-by-n real/complex matrix Q with orthonormal
columns, which is defined as the first n columns of a product of k elementary reflectors of order
m
Q = H(1)*H(2)*...*H(k)
as returned by ?geqrf.
Input Parameters
n INTEGER. The number of columns of the matrix Q. m ≥ n
≥ 0.
product defines the matrix Q. n ≥ k ≥ 0.
a REAL for sorg2r
DOUBLE PRECISION for dorg2r
COMPLEX for cung2r
COMPLEX*16 for zung2r.
1754
On entry, the i-th column must contain the vector which
defines the elementary reflector H(i), for i = 1,2,..., k, as
returned by ?geqrf in the first k columns of its array
argument a.
lda INTEGER. The first DIMENSION of the array a. lda ≥
max(1,m).
tau REAL for sorg2r
COMPLEX for cung2r
reflector H(i), as returned by ?geqrf.
work REAL for sorg2r
COMPLEX for cung2r
Output Parameters
info INTEGER.
1755
?orgl2/?ungl2
matrix Q from an LQ factorization determined by
?gelqf (unblocked algorithm).
Syntax
call sorgl2( m, n, k, a, lda, tau, work, info )
call dorgl2( m, n, k, a, lda, tau, work, info )
call cungl2( m, n, k, a, lda, tau, work, info )
call zungl2( m, n, k, a, lda, tau, work, info )
Description
for C interface.
The routine ?orgl2/?ungl2 generates a m-by-n real/complex matrix Q with orthonormal rows,
which is defined as the first m rows of a product of k elementary reflectors of order n
Q = H(k)*...*H(2)*H(1)for real flavors, or Q = (H(k))H*...*(H(2))H*(H(1))H for complex

flavors as returned by ?gelqf.
Input Parameters
n INTEGER. The number of columns of the matrix Q. n ≥ m.
product defines the matrix Q. m ≥ k ≥ 0.
a REAL for sorgl2
DOUBLE PRECISION for dorgl2
COMPLEX for cungl2
COMPLEX*16 for zungl2.
Array, DIMENSION (lda, n). On entry, the i-th row must
contain the vector which defines the elementary reflector
H(i), for i = 1,2,..., k, as returned by ?gelqf in
the first k rows of its array argument a.
1756
max(1,m).
tau REAL for sorgl2
COMPLEX for cungl2
reflector H(i), as returned by ?gelqf.
work REAL for sorgl2
COMPLEX for cungl2
Output Parameters
info INTEGER.
< 0: if info = -i, the i-th argument has an illegal value.
?orgr2/?ungr2
matrix Q from an RQ factorization determined by
?gerqf (unblocked algorithm).
Syntax
call sorgr2( m, n, k, a, lda, tau, work, info )
call dorgr2( m, n, k, a, lda, tau, work, info )
call cungr2( m, n, k, a, lda, tau, work, info )
call zungr2( m, n, k, a, lda, tau, work, info )
1757
Description
for C interface.
The routine ?orgr2/?ungr2 generates an m-by-n real matrix Q with orthonormal rows, which
is defined as the last m rows of a product of k elementary reflectors of order n
Q = H(1)*H(2)*...*H(k) for real flavors, or Q = (H(1))H*(H(2))H*...*(H(k))H for complex

flavors as returned by ?gerqf.
Input Parameters
n INTEGER. The number of columns of the matrix Q. n ≥ m
k INTEGER.
The number of elementary reflectors whose product defines
the matrix Q. m ≥ k ≥ 0.
a REAL for sorgr2
DOUBLE PRECISION for dorgr2
COMPLEX for cungr2
COMPLEX*16 for zungr2.
On entry, the ( m- k+i)-th row must contain the vector which
defines the elementary reflector H(i), for i = 1,2,...,
k, as returned by ?gerqf in the last k rows of its array
argument a.
max(1,m).
tau REAL for sorgr2
COMPLEX for cungr2
Array, DIMENSION (k).tau(i) must contain the scalar factor
of the elementary reflector H(i), as returned by ?gerqf.
work REAL for sorgr2
COMPLEX for cungr2
1758
Output Parameters
info INTEGER.
?orm2l/?unm2l
Multiplies a general matrix by the
orthogonal/unitary matrix from a QL factorization
determined by ?geqlf (unblocked algorithm).
Syntax
call sorm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Description
for C interface.
The routine ?orm2l/?unm2l overwrites the general real/complex m-by-n matrix C with
Q*C if side = 'L' and trans = 'N', or

QT*C / QH*C if side = 'L' and trans = 'T' (for real flavors) or trans = 'C' (for complex
flavors), or
C*Q if side = 'R' and trans = 'N', or
C*QT / C*QH if side = 'R' and trans = 'T' (for real flavors) or trans = 'C' (for complex
flavors).
Here Q is a real orthogonal or complex unitary matrix defined as the product of k elementary
reflectors
1759
Q = H(k)*...*H(2)*H(1) as returned by ?geqlf.

Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
side CHARACTER*1.
= 'L': apply Q or QT / QH from the left
= 'R': apply Q or QT / QH from the right
trans CHARACTER*1.
= 'N': apply Q (no transpose)
T
= 'T': apply Q (transpose, for real flavors)
H
= 'C': apply Q (conjugate transpose, for complex flavors)
m INTEGER. The number of rows of the matrix C. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.
product defines the matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sorm2l
DOUBLE PRECISION for dorm2l
COMPLEX for cunm2l
COMPLEX*16 for zunm2l.
Array, DIMENSION (lda,k).
The i-th column must contain the vector which defines the
elementary reflector H(i), for i = 1,2,..., k, as returned
by ?geqlf in the last k columns of its array argument a.
The array a is modified by the routine but restored on exit.
If side = 'L', lda ≥ max(1, m)
if side = 'R', lda ≥ max(1, n).
tau REAL for sorm2l
COMPLEX for cunm2l
1760
of the elementary reflector H(i), as returned by ?geqlf.
c REAL for sorm2l
COMPLEX for cunm2l
ldc INTEGER. The leading dimension of the array C. ldc ≥
max(1,m).
work REAL for sorm2l
COMPLEX for cunm2l
Workspace array, DIMENSION:
(n) if side = 'L',
(m) if side = 'R'.
Output Parameters
T
c On exit, c is overwritten by Q*C or Q *C / QH*C, or C*Q, or
C*QT / C*QH.
info INTEGER.
1761
?orm2r/?unm2r
orthogonal/unitary matrix from a QR factorization
determined by ?geqrf (unblocked algorithm).
Syntax
call sorm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Description
for C interface.
The routine ?orm2r/?unm2r overwrites the general real/complex m-by-n matrix C with

flavors), or
flavors).
reflectors
Q = H(1)*H(2)*...*H(k) as returned by ?geqrf.
Input Parameters
side CHARACTER*1.
trans CHARACTER*1.
1762
T
H
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sorm2r
DOUBLE PRECISION for dorm2r
COMPLEX for cunm2r
COMPLEX*16 for zunm2r.
Array, DIMENSION (lda,k).
The i-th column must contain the vector which defines the
by ?geqrf in the first k columns of its array argument a.
The array a is modified by the routine but restored on exit.
If side = 'L', lda ≥ max(1, m);
if side = 'R', lda ≥ max(1, n).
tau REAL for sorm2r
COMPLEX for cunm2r
reflector H(i), as returned by ?geqrf.
c REAL for sorm2r
COMPLEX for cunm2r
1763
ldc INTEGER. The leading dimension of the array c. ldc ≥

max(1,m).
work REAL for sorm2r
COMPLEX for cunm2r
(n) if side = 'L',
(m) if side = 'R'.
Output Parameters
T
C*QT / C*QH.
info INTEGER.
?orml2/?unml2
orthogonal/unitary matrix from a LQ factorization
determined by ?gelqf (unblocked algorithm).
Syntax
call sorml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Description
for C interface.
The routine ?orml2/?unml2 overwrites the general real/complex m-by-n matrix C with
1764
flavors), or
flavors).
reflectors
Q = H(k)*...*H(2)*H(1)for real flavors, or Q = (H(k))H*...*(H(2))H*(H(1))H for complex
flavors as returned by ?gelqf.
Input Parameters
side CHARACTER*1.
trans CHARACTER*1.
T
H
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sorml2
DOUBLE PRECISION for dorml2
COMPLEX for cunml2
COMPLEX*16 for zunml2.
Array, DIMENSION
(lda, m) if side = 'L',
(lda, n) if side = 'R'
1765
The i-th row must contain the vector which defines the
by ?gelqf in the first k rows of its array argument a. The
array a is modified by the routine but restored on exit.
max(1,k).
tau REAL for sorml2
COMPLEX for cunml2
reflector H(i), as returned by ?gelqf.
c REAL for sorml2
COMPLEX for cunml2
Array, DIMENSION (ldc, n) On entry, the m-by-n matrix C.
max(1,m).
work REAL for sorml2
COMPLEX for cunml2
(n) if side = 'L',
(m) if side = 'R'
Output Parameters
T
C*QT / C*QH.
info INTEGER.
1766
?ormr2/?unmr2
orthogonal/unitary matrix from a RQ factorization
determined by ?gerqf (unblocked algorithm).
Syntax
call sormr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dormr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunmr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunmr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Description
for C interface.
The routine ?ormr2/?unmr2 overwrites the general real/complex m-by-n matrix C with

flavors), or
flavors).
reflectors
Q = H(1)*H(2)*...*H(k) for real flavors, or Q = (H(1))H*(H(2))H*...*(H(k))H as returned
by ?gerqf.
Input Parameters
side CHARACTER*1.
1767
trans CHARACTER*1.
T
H
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sormr2
DOUBLE PRECISION for dormr2
COMPLEX for cunmr2
COMPLEX*16 for zunmr2.
Array, DIMENSION
elementary reflector H(i), for i = 1,2,...,k, as returned
by ?gerqf in the last k rows of its array argument a. The
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,k).
tau REAL for sormr2
COMPLEX for cunmr2
reflector H(i), as returned by ?gerqf.
c REAL for sormr2
COMPLEX for cunmr2
1768
max(1,m).
work REAL for sormr2
COMPLEX for cunmr2
(n) if side = 'L',
(m) if side = 'R'
Output Parameters
T
C*QT / C*QH.
info INTEGER.
?ormr3/?unmr3
orthogonal/unitary matrix from a RZ factorization
determined by ?tzrzf (unblocked algorithm).
Syntax
call sormr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call dormr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call cunmr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call zunmr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
Description
for C interface.
The routine ?ormr3/?unmr3 overwrites the general real/complex m-by-n matrix C with
1769

flavors), or
flavors).
reflectors
Q = H(1)*H(2)*...*H(k) as returned by ?tzrzf.
Input Parameters
side CHARACTER*1.
trans CHARACTER*1.
T
H
If side = 'L', m ≥ k ≥ 0;
if side = 'R' , n ≥ k ≥ 0.
l INTEGER. The number of columns of the matrix A containing
If side = 'L', m ≥ l ≥ 0,
if side = 'R', n ≥ l ≥ 0.
a REAL for sormr3
COMPLEX for cunmr3
1770
Array, DIMENSION
elementary reflector H(i), for i = 1,2,...,k, as returned
by ?tzrzf in the last k rows of its array argument a. The
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,k).
tau REAL for sormr3
COMPLEX for cunmr3
reflector H(i), as returned by ?tzrzf.
c REAL for sormr3
COMPLEX for cunmr3
max(1,m).
work REAL for sormr3
COMPLEX for cunmr3
(n) if side = 'L',
(m) if side = 'R'.
Output Parameters
T
C*QT / C*QH.
1771
info INTEGER.
?pbtf2
symmetric/ Hermitian positive-definite band matrix
Syntax
call spbtf2( uplo, n, kd, ab, ldab, info )
call dpbtf2( uplo, n, kd, ab, ldab, info )
call cpbtf2( uplo, n, kd, ab, ldab, info )
call zpbtf2( uplo, n, kd, ab, ldab, info )
Description
for C interface.
The routine computes the Cholesky factorization of a real symmetric or complex Hermitian
positive definite band matrix A.
The factorization has the form

A = UT*U for real flavors, A = UH*U for complex flavors if uplo = 'U', or
A = L*LT for real flavors, A = L*LH for complex flavors if uplo = 'L',
where U is an upper triangular matrix, and L is lower triangular. This is the unblocked version
of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
1772
uplo = 'U' , or the number of sub-diagonals if uplo =
'L'.
kd ≥ 0.
ab REAL for spbtf2
DOUBLE PRECISION for dpbtf2
COMPLEX for cpbtf2
COMPLEX*16 for zpbtf2.
On entry, the upper or lower triangle of the symmetric/
Hermitian band matrix A, stored in the first kd+1 rows of
the array. The j-th column of A is stored in the j-th column
of the array ab as follows:
if uplo = 'U', ab(kd+1+i -j,j) = A(i, j for max(1,
j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i -j,j) = A(i, j for j ≤ i ≤
min(n, j+kd).
kd+1.
Output Parameters
ab On exit, If info = 0, the triangular factor U or L from the
Cholesky factorization A=UT*U (A=UH*U), or A= L*LT (A =
L*LH) of the band matrix A, in the same storage format as
A.
info INTEGER.
> 0: if info = k, the leading minor of order k is not positive
definite, and the factorization could not be completed.
1773
?potf2
symmetric/Hermitian positive-definite matrix
Syntax
call spotf2( uplo, n, a, lda, info )
call dpotf2( uplo, n, a, lda, info )
call cpotf2( uplo, n, a, lda, info )
call zpotf2( uplo, n, a, lda, info )
Description
for C interface.
The routine ?potf2 computes the Cholesky factorization of a real symmetric or complex
Hermitian positive definite matrix A. The factorization has the form
A = UT*U for real flavors, A = UH*U for complex flavors if uplo = 'U', or
A = L*LT for real flavors, A = L*LH for complex flavors if uplo = 'L',
where U is an upper triangular matrix, and L is lower triangular.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines
Input Parameters
uplo CHARACTER*1.
symmetric/Hermitian matrix A is stored.
a REAL for spotf2
DOUBLE PRECISION or dpotf2
COMPLEX for cpotf2
COMPLEX*16 for zpotf2.
1774
On entry, the symmetric/Hermitian matrix A.
lda ≥ max(1,n).
Output Parameters
a On exit, If info = 0, the factor U or L from the Cholesky
factorization A=UT*U (A=UH*U), or A= L*LT (A = L*LH).
info INTEGER.
> 0: if info = k, the leading minor of order k is not positive
definite, and the factorization could not be completed.
?ptts2
Solves a tridiagonal system of the form A*X=B
using the L*D*LH/L*D*LH factorization computed
by ?pttrf.
Syntax
call sptts2( n, nrhs, d, e, b, ldb )
call dptts2( n, nrhs, d, e, b, ldb )
call cptts2( iuplo, n, nrhs, d, e, b, ldb )
call zptts2( iuplo, n, nrhs, d, e, b, ldb )
Description
for C interface.
1775
The routine ?ptts2 solves a tridiagonal system of the form

A*X = B
Real flavors sptts2/dptts2 use the L*D*LT factorization of A computed by spttrf/dpttrf,
and complex flavors cptts2/zptts2 use the UH*D*U or L*D*LH factorization of A computed by
cpttrf/zpttrf.
D is a diagonal matrix specified in the vector d, U (or L) is a unit bidiagonal matrix whose
superdiagonal (subdiagonal) is specified in the vector e, and X and B are n-by-nrhs matrices.
Input Parameters
iuplo INTEGER. Used with complex flavors only.
Specifies the form of the factorization, and whether the
vector e is the superdiagonal of the upper bidiagonal factor
U or the subdiagonal of the lower bidiagonal factor L.
= 1: A = UH*D*U , e is the superdiagonal of U;
= 0: A = L*D*LH, e is the subdiagonal of L
n INTEGER. The order of the tridiagonal matrix A. n ≥ 0.
nrhs INTEGER. The number of right hand sides, that is, the
number of columns of the matrix B. nrhs ≥ 0.
d REAL for sptts2/cptts2
DOUBLE PRECISION for dptts2/zptts2.
The n diagonal elements of the diagonal matrix D from the
factorization of A.
e REAL for sptts2
DOUBLE PRECISION for dptts2
COMPLEX for cptts2
COMPLEX*16 for zptts2.
Contains the (n-1) subdiagonal elements of the unit
bidiagonal factor L from the L*D*LT (for real flavors) or
L*D*LH (for complex flavors when iuplo = 0) factorization
of A.
For complex flavors when iuplo = 1, e contains the (n-1)
superdiagonal elements of the unit bidiagonal factor U from
the factorization A = UH*D*U.
1776
B REAL for sptts2/cptts2
DOUBLE PRECISION for dptts2/zptts2.
Array, DIMENSION (ldb, nrhs).
On entry, the right hand side vectors B for the system of
linear equations.
ldb INTEGER. The leading dimension of the array B. ldb ≥
max(1,n).
Output Parameters
b On exit, the solution vectors, X.
?rscl
Multiplies a vector by the reciprocal of a real scalar.
Syntax
call srscl( n, sa, sx, incx )
call drscl( n, sa, sx, incx )
call csrscl( n, sa, sx, incx )
call zdrscl( n, sa, sx, incx )
Description
for C interface.
The routine ?rscl multiplies an n-element real/complex vector x by the real scalar 1/a. This
is done without overflow or underflow as long as the final result x/a does not overflow or
underflow.
Input Parameters
n INTEGER. The number of components of the vector x.
sa REAL for srscl/csrscl
DOUBLE PRECISION for drscl/zdrscl.
1777
The scalar a which is used to divide each component of the

vector x. sa must be ≥ 0, or the subroutine will divide by
zero.
sx REAL for srscl
DOUBLE PRECISION for drscl
COMPLEX for csrscl
COMPLEX*16 for zdrscl.
Array, DIMENSION (1+(n-1)* |incx|).
The n-element vector x.
incx INTEGER. The increment between successive values of the
vector sx.
If incx > 0, sx(1)=x(1), and sx(1+(i-1)*incx)=x(i),
1<i≤n.
Output Parameters
sx On exit, the result x/a.
?sygs2/?hegs2
Reduces a symmetric/Hermitian definite
generalized eigenproblem to standard form, using
the factorization results obtained from ?potrf
Syntax
call ssygs2( itype, uplo, n, a, lda, b, ldb, info )
call dsygs2( itype, uplo, n, a, lda, b, ldb, info )
call chegs2( itype, uplo, n, a, lda, b, ldb, info )
call zhegs2( itype, uplo, n, a, lda, b, ldb, info )
Description
for C interface.
The routine ?sygs2/?hegs2 reduces a real symmetric-definite or a complex Hermitian-definite
generalized eigenproblem to standard form.
1778
If itype = 1, the problem is
A*x = λ*B*x
and A is overwritten by inv(U')*A*inv(U), or inv(L)*A*inv(L').
If itype = 2 or 3, the problem is
A*B*x = λ*x, or B*A*x = λ*x,

and A is overwritten by U*A*U' or L'*A*L. Here U'(L') is the transpose (conjugate transpose)
of U (L).
B must be previously factorized as U'*U or L*L' by ?potrf.
Input Parameters
itype INTEGER.
= 1: compute inv(U')*A*inv(U), or inv(L)*A*inv(L');
= 2 or 3: compute U*A*U', or L'*A*L.
triangular part of the symmetric/Hermitian matrix A is
stored, and how B has been factorized.
n INTEGER. The order of the matrices A and B. n ≥ 0.
a REAL for ssygs2
DOUBLE PRECISION for dsygs2
COMPLEX for chegs2
COMPLEX*16 for zhegs2.
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,n).
1779
b REAL for ssygs2

DOUBLE PRECISION for dsygs2
COMPLEX for chegs2
COMPLEX*16 for zhegs2.
The triangular factor from the Cholesky factorization of B
as returned by ?potrf.
ldb INTEGER. The leading dimension of the array b. ldb ≥
max(1,n).
Output Parameters
a On exit, If info = 0, the transformed matrix, stored in the
same format as A.
info INTEGER.
?sytd2/?hetd2
Reduces a symmetric/Hermitian matrix to real
symmetric tridiagonal form by an
orthogonal/unitary similarity
transformation(unblocked algorithm).
Syntax
call ssytd2( uplo, n, a, lda, d, e, tau, info )
call dsytd2( uplo, n, a, lda, d, e, tau, info )
call chetd2( uplo, n, a, lda, d, e, tau, info )
call zhetd2( uplo, n, a, lda, d, e, tau, info )
Description
for C interface.
1780
The routine ?sytd2/?hetd2 reduces a real symmetric/complex Hermitian matrix A to real
symmetric tridiagonal form T by an orthogonal/unitary similarity transformation: QT*A*Q = T
(QH*A*Q = T ).
Input Parameters
uplo CHARACTER*1.
a REAL for ssytd2
DOUBLE PRECISION for dsytd2
COMPLEX for chetd2
COMPLEX*16 for zhetd2.
max(1,n).
Output Parameters
a On exit, if uplo = 'U', the diagonal and first superdiagonal
of a are overwritten by the corresponding elements of the
tridiagonal matrix T, and the elements above the first
superdiagonal, with the array tau, represent the
reflectors;
1781
if uplo = 'L', the diagonal and first subdiagonal of a are

overwritten by the corresponding elements of the tridiagonal
matrix T, and the elements below the first subdiagonal, with
the array tau, represent the orthogonal/unitary matrix Q as
a product of elementary reflectors.
d REAL for ssytd2/chetd2
DOUBLE PRECISION for dsytd2/zhetd2.
The diagonal elements of the tridiagonal matrix T:
d(i) = a(i,i).
e REAL for ssytd2/chetd2
DOUBLE PRECISION for dsytd2/zhetd2.
The off-diagonal elements of the tridiagonal matrix T:
e(i) = a(i,i+1) if uplo = 'U',
e(i) = a(i+1,i) if uplo = 'L'.
tau REAL for ssytd2
DOUBLE PRECISION for dsytd2
COMPLEX for chetd2
COMPLEX*16 for zhetd2.
The first n-1 elements contain scalar factors of the
elementary reflectors. tau(n) is used as workspace.
info INTEGER.
1782
?sytf2
Computes the factorization of a real/complex
symmetric indefinite matrix, using the diagonal
pivoting method (unblocked algorithm).
Syntax
call ssytf2( uplo, n, a, lda, ipiv, info )
call dsytf2( uplo, n, a, lda, ipiv, info )
call csytf2( uplo, n, a, lda, ipiv, info )
call zsytf2( uplo, n, a, lda, ipiv, info )
Description
for C interface.
The routine ?sytf2 computes the factorization of a real/complex symmetric matrix A using the
Bunch-Kaufman diagonal pivoting method:
A = U*D*UT(A = U*D*UH), or A = L*D*LT (A = L*D*LH),
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, and D is
symmetric and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
symmetric matrix A is stored
a REAL for ssytf2
DOUBLE PRECISION for dsytf2
COMPLEX for csytf2
COMPLEX*16 for zsytf2.
1783
On entry, the symmetric matrix A.

lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,n).
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used
to obtain the factor U or L.
ipiv INTEGER.
Details of the interchanges and the block structure of D
If ipiv(k) > 0, then rows and columns k and ipiv(k) are
interchanged and D(k,k) is a 1-by-1 diagonal block.
and columns k-1 and -ipiv(k) are interchanged and D(k,k)
is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv( k) = ipiv( k+1)< 0, then
rows and columns k+1 and -ipiv(k) were interchanged and
D(k:k+1,k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
< 0: if info = -k, the k-th argument has an illegal value
> 0: if info = k, D(k,k) is exactly zero. The factorization
are completed, but the block diagonal matrix D is exactly
singular, and division by zero will occur if it is used to solve
a system of equations.
1784
?hetf2
Computes the factorization of a complex Hermitian
matrix, using the diagonal pivoting method
Syntax
call chetf2( uplo, n, a, lda, ipiv, info )
call zhetf2( uplo, n, a, lda, ipiv, info )
Description
for C interface.
The routine computes the factorization of a complex Hermitian matrix A using the Bunch-Kaufman
diagonal pivoting method:
A = U*D*U' or A = L*D*L'
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, U' is the
conjugate transpose of U, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
A COMPLEX for chetf2
COMPLEX*16 for zhetf2.
On entry, the Hermitian matrix A.
1785

max(1,n).
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used
to obtain the factor U or L.
ipiv INTEGER. Array, DIMENSION (n).
Details of the interchanges and the block structure of D
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv( k-1) < 0, then
rows and columns k-1 and -ipiv(k) were interchanged and
D(k-1:k,k-1:k ) is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv(k) = ipiv( k+1) < 0, then
rows and columns k+1 and -ipiv(k) were interchanged and
D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
> 0: if info = k, D(k,k) is exactly zero. The factorization
exactly singular, and division by zero will occur if it is used
to solve a system of equations.
1786
?tgex2
Swaps adjacent diagonal blocks in an upper (quasi)
triangular matrix pair by an orthogonal/unitary
equivalence transformation.
Syntax
call stgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, n1, n2,
work, lwork, info )
call dtgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, n1, n2,
work, lwork, info )
call ctgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, info )
call ztgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, info )
Description
for C interface.
The real routines stgex2/dtgex2 swap adjacent diagonal blocks (A11, B11) and (A22, B22) of
size 1-by-1 or 2-by-2 in an upper (quasi) triangular matrix pair (A, B) by an orthogonal
equivalence transformation. (A, B) must be in generalized real Schur canonical form (as returned
by sgges/dgges), that is, A is block upper triangular with 1-by-1 and 2-by-2 diagonal blocks.
B is upper triangular.
The complex routines ctgex2/ztgex2 swap adjacent diagonal 1-by-1 blocks (A11, B11) and
(A22, B22) in an upper triangular matrix pair (A, B) by an unitary equivalence transformation.
(A, B) must be in generalized Schur canonical form, that is, A and B are both upper triangular.
All routines optionally update the matrices Q and Z of generalized Schur vectors:
Q(in)*A(in)*Z(in)' = Q(out)*A(out)*Z(out)'
Q(in)*B(in)*Z(in)' = Q(out)*B(out)*Z(out)', where Z' is the transpose (conjugate
transpose) of Z.
Input Parameters
wantq LOGICAL.
If wantq = .TRUE. : update the left transformation matrix
Q;
1787
If wantq = .FALSE. : do not update Q.

wantz LOGICAL.
If wantz = .TRUE. : update the right transformation matrix
Z;
If wantz = .FALSE.: do not update Z.
n INTEGER. The order of the matrices A and B. n ≥ 0.
a, b REAL for stgex2 DOUBLE PRECISION for dtgex2
COMPLEX for ctgex2
COMPLEX*16 for ztgex2.
Arrays, DIMENSION (lda, n) and (ldb, n), respectively.
On entry, the matrices A and B in the pair (A, B).
max(1,n).
ldb INTEGER. The leading dimension of the array b. ldb ≥
max(1,n).
q, z REAL for stgex2 DOUBLE PRECISION for dtgex2
COMPLEX for ctgex2
COMPLEX*16 for ztgex2.
Arrays, DIMENSION (ldq, n) and (ldz, n), respectively.
On entry, if wantq = .TRUE., q contains the
orthogonal/unitary matrix Q, and if wantz = .TRUE., z
contains the orthogonal/unitary matrix Z.
ldq INTEGER. The leading dimension of the array q. ldq ≥ 1.
If wantq = .TRUE., ldq ≥ n.
ldz INTEGER. The leading dimension of the array z. ldz ≥ 1.
If wantz = .TRUE., ldz ≥ n.
j1 INTEGER.
The index to the first block (A11, B11). 1 ≤ j1 ≤ n.
n1 INTEGER. Used with real flavors only. The order of the first
block (A11, B11). n1 = 0, 1 or 2.
n2 INTEGER. Used with real flavors only. The order of the
second block (A22, B22). n2 = 0, 1 or 2.
1788
work REAL for stgex2
DOUBLE PRECISION for dtgex2.
Workspace array, DIMENSION (max(1,lwork)). Used with
real flavors only.
lwork≥max(n*(n2+n1), 2*(n2+n1)2)
Output Parameters
a On exit, the updated matrix A.
B On exit, the updated matrix B.
Q On exit, the updated matrix Q.
Not referenced if wantq = .FALSE..
z On exit, the updated matrix Z.
Not referenced if wantz = .FALSE..
info INTEGER.
=0: Successful exit For stgex2/dtgex2: If info = 1, the
transformed matrix (A, B) would be too far from generalized
Schur form; the blocks are not swapped and (A, B) and (Q,
Z) are unchanged. The problem of swapping is too
ill-conditioned. If info = -16: lwork is too small.
Appropriate value for lwork is returned in work(1).
For ctgex2/ztgex2:
If info = 1, the transformed matrix pair (A, B) would be
too far from generalized Schur form; the problem is
ill-conditioned.
1789
?tgsy2
Solves the generalized Sylvester equation
Syntax
call stgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f,
ldf, scale, rdsum, rdscal, iwork, pq, info )
call dtgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f,
call ctgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f,
call ztgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f,
Description
for C interface.
The routine ?tgsy2 solves the generalized Sylvester equation:
A*R-L*B=scale*C (1)
D*R-L*E=scale*F
using Level 1 and 2 BLAS, where R and L are unknown m-by-n matrices, (A, D), ( B, E) and (C,
F) are given matrix pairs of size m-by -m, n-by-n and m-by-n, respectively. For stgsy2/dtgsy2,
pairs (A, D) and (B, E) must be in generalized Schur canonical form, that is, A, B are upper quasi
triangular and D, E are upper triangular. For ctgsy2/ztgsy2, matrices A, B, D and E are upper
triangular (that is, (A, D) and (B, E) in generalized Schur form).
The solution (R, L) overwrites (C, F).
0 ≤ scale ≤ 1 is an output scaling factor chosen to avoid overflow.

In matrix notation, solving equation (1) corresponds to solve
Z*x = scale*b
where Z is defined as
1790
Here Ik is the identity matrix of size k and X' is the transpose (conjugate transpose) of X.
kron(X, Y) denotes the Kronecker product between the matrices X and Y.
If trans = 'T' , solve the transposed (conjugate transposed) system
Z'*y = scale*b
for y, which is equivalent to solve for R and L in
A'*R+D'*L=scale*C (3)
R*B'+L*E'=scale*(-F)
This case is used to compute an estimate of Dif[(A,D),(B,E)] = sigma_min(Z) using
reverse communication with ?lacon.
?tgsy2 also (for ijob ≥ 1) contributes to the computation in ?tgsyl of an upper bound on
the separation between two matrix pairs. Then the input (A, D), (B, E) are sub-pencils of the
matrix pair (two matrix pairs) in ?tgsyl. See ?tgsyl for details.
Input Parameters
trans CHARACTER*1.
If trans = 'N', solve the generalized Sylvester equation
(1);
If trans = 'T': solve the 'transposed' system (3).
ijob INTEGER. Specifies what kind of functionality is to be
performed.
If ijob = 0: solve (1) only.
If ijob = 1: a contribution from this subsystem to a
Frobenius norm-based estimate of the separation between
two matrix pairs is computed (look ahead strategy is used);
If ijob = 2: a contribution from this subsystem to a
Frobenius norm-based estimate of the separation between
two matrix pairs is computed (?gecon on sub-systems is
used).
Not referenced if trans = 'T'.
1791
m INTEGER. On entry, m specifies the order of A and D, and

the row dimension of C, F, R and L.
n INTEGER. On entry, n specifies the order of B and E, and
the column dimension of C, F, R and L.
a, b REAL for stgsy2
DOUBLE PRECISION for dtgsy2
COMPLEX for ctgsy2
COMPLEX*16 for ztgsy2.
Arrays, DIMENSION (lda, m) and (ldb, n), respectively. On
entry, a contains an upper (quasi) triangular matrix A, and
b contains an upper (quasi) triangular matrix B.
max(1, m).
ldb INTEGER.
The leading dimension of the array b. ldb ≥ max(1, n).
c, f REAL for stgsy2
COMPLEX for ctgsy2
Arrays, DIMENSION (ldc, n) and (ldf, n), respectively. On
entry, c contains the right-hand-side of the first matrix
equation in (1), and f contains the right-hand-side of the
second matrix equation in (1).
max(1, m).
d, e REAL for stgsy2
COMPLEX for ctgsy2
Arrays, DIMENSION (ldd, m) and (lde, n), respectively. On
entry, d contains an upper triangular matrix D, and e
contains an upper triangular matrix E.
ldd INTEGER. The leading dimension of the array d. ldd ≥
max(1, m).
1792
lde INTEGER. The leading dimension of the array e. lde ≥
max(1, n).
ldf INTEGER. The leading dimension of the array f. ldf ≥
max(1, m).
rdsum REAL for stgsy2/ctgsy2
DOUBLE PRECISION for dtgsy2/ztgsy2.
On entry, the sum of squares of computed contributions to
the Dif-estimate under computation by ?tgsyL, where the
scaling factor rdscal has been factored out.
rdscal REAL for stgsy2/ctgsy2
On entry, scaling factor used to prevent overflow in rdsum.
iwork INTEGER. Used with real flavors only.
Workspace array, DIMENSION (m+n+2).
Output Parameters
c On exit, if ijob = 0, c is overwritten by the solution R.
f On exit, if ijob = 0, f is overwritten by the solution L.
scale REAL for stgsy2/ctgsy2
On exit, 0 ≤ scale ≤ 1. If 0 < scale < 1, the solutions
R and L (C and F on entry) hold the solutions to a slightly
perturbed system, but the input matrices A, B, D and E are
not changed. If scale = 0, R and L hold the solutions to
the homogeneous system with C = F = 0. Normally scale
= 1.
rdsum On exit, the corresponding sum of squares updated with
the contributions from the current sub-system.
If trans = 'T', rdsum is not touched.
Note that rdsum only makes sense when ?tgsy2 is called
by ?tgsyl.
rdscal On exit, rdscal is updated with respect to the current
contributions in rdsum.
If trans = 'T' , rdscal is not touched.
1793
Note that rdscal only makes sense when ?tgsy2 is called

by ?tgsyl.
pq INTEGER. Used with real flavors only.
On exit, the number of subsystems (of size 2-by-2, 4-by-4
and 8-by-8) solved by the routine stgsy2/dtgsy2.
info INTEGER. On exit, if info is set to
= 0: Successful exit
< 0: If info = -i, the i-th argument has an illegal value.
> 0: The matrix pairs (A, D) and (B, E) have common or
very close eigenvalues.
?trti2
Computes the inverse of a triangular matrix
Syntax
call strti2( uplo, diag, n, a, lda, info )
call dtrti2( uplo, diag, n, a, lda, info )
call ctrti2( uplo, diag, n, a, lda, info )
call ztrti2( uplo, diag, n, a, lda, info )
Description
for C interface.
The routine ?trti2 computes the inverse of a real/complex upper or lower triangular matrix.
This is the Level 2 BLAS version of the algorithm.
Input Parameters
uplo CHARACTER*1.
diag CHARACTER*1.
1794
a REAL for strti2
DOUBLE PRECISION for dtrti2
COMPLEX for ctrti2
COMPLEX*16 for ztrti2.
On entry, the triangular matrix A.
the array a contains the upper triangular matrix, and the
strictly lower triangular part of a is not referenced.
If diag = 'U', the diagonal elements of a are also not
max(1,n).
Output Parameters
a On exit, the (triangular) inverse of the original matrix, in
the same storage format.
info INTEGER.
clag2z
Converts a complex single precision matrix to a
complex double precision matrix.
Syntax
call clag2z( m, n, sa, ldsa, a, lda, info )
1795
Description
for C interface.
The routine converts a complex single precision matrix SA to a complex double precision matrix
A.
Note that while it is possible to overflow while converting from double to single, it is not possible
to overflow when converting from single to double.
This is a helper routine so there is no argument checking.
Input Parameters
m INTEGER. The number of lines of the matrix A (m ≥ 0).
n INTEGER. The number of columns in the matrix A (n ≥ 0).
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥
max(1, m).
a DOUBLE PRECISION array, DIMENSION (lda, n).
On entry, contains the m-by-n coefficient matrix A.
lda INTEGER. The leading dimension of the array a; lda ≥
max(1, m).
Output Parameters
sa REAL array, DIMENSION (ldsa, n).
On exit, contains the m-by-n coefficient matrix SA.
info INTEGER.
dlag2s
Converts a double precision matrix to a single
precision matrix.
Syntax
call dlag2s( m, n, a, lda, sa, ldsa, info )
1796
Description
for C interface.
The routine converts a double precision matrix SA to a single precision matrix A.
RMAX is the overflow for the single precision arithmetic. dlag2s checks that all the entries of
A are between -RMAX and RMAX. If not, the convertion is aborted and a flag is raised.
Input Parameters
max(1, m).
max(1, m).
Output Parameters
On exit, if info = 0, contains the m-by-n coefficient matrix
SA.
info INTEGER.
If info = k, the (i, j) entry of the matrix A has overflowed
when moving from double precision to single precision. k is
given by k = (i-1)*lda+j.
1797
slag2d
Converts a single precision matrix to a double
precision matrix.
Syntax
call slag2d( m, n, sa, ldsa, a, lda, info )
Description
for C interface.
The routine converts a single precision matrix SA to a double precision matrix A.
Note that while it is possible to overflow while converting from double to single, it is not possible
to overflow when converting from single to double.
Input Parameters
max(1, m).
max(1, m).
Output Parameters
SA.
info INTEGER.
1798
zlag2c
Converts a complex double precision matrix to a
complex single precision matrix.
Syntax
call zlag2c( m, n, a, lda, sa, ldsa, info )
Description
for C interface.
The routine converts a double precision complex matrix SA to a single precision complex matrix
A.
RMAX is the overflow for the single precision arithmetic. zlag2c checks that all the entries of
A are between -RMAX and RMAX. If not, the convertion is aborted and a flag is raised.
Input Parameters
max(1, m).
max(1, m).
Output Parameters
SA.
info INTEGER.
1799
If info = k, the (i, j) entry of the matrix A has overflowed

when moving from double precision to single precision. k is
given by k = (i-1)*lda+j.
Utility Functions and Routines

This section describes LAPACK utility functions and routines. Summary information about these
routines is given in the following table:
Table 5-2 LAPACK Utility Routines
Types
ilaver Returns the version of the Lapack library.
ilaenv Environmental enquiry function which returns values for

tuning algorithmic performance.
iparmq Environmental enquiry function which returns values for

tuning algorithmic performance.
ieeeck Checks if the infinity and NaN arithmetic is safe. Called by

ilaenv.
lsame Tests two characters for equality regardless of case.
lsamen Tests two character strings for equality regardless of case.
?labad s, d Returns the square root of the underflow and overflow

thresholds if the exponent-range is very large.
?lamch s, d Determines machine parameters for floating-point arithmetic.
?lamc1 s, d Called from ?lamc2. Determines machine parameters given

by beta, t, rnd, ieee1.
?lamc2 s, d Used by ?lamch. Determines machine parameters specified

in its arguments list.
?lamc3 s, d Called from ?lamc1-?lamc5. Intended to force a and b to

be stored prior to doing the addition of a and b.
1800
Types
?lamc4 s, d This is a service routine for ?lamc2.
?lamc5 s, d Called from ?lamc2. Attempts to compute the largest

machine floating-point number, without overflow.
second/dsecnd Return user time for a process.
xerbla Error handling routine called by LAPACK routines.
ilaver
Returns the version of the Lapack library.
Syntax
call ilaver( vers_major, vers_minor, vers_patch )
Description
for C interface.
The routine returns the version of the LAPACK library.
Output Parameters
vers_major INTEGER.
Returns the major version of the LAPACK library.
vers_minor INTEGER.
Returns the minor version from the major version of the
LAPACK library.
vers_patch INTEGER.
Returns the patch version from the minor version of the
LAPACK library.
1801
ilaenv
Environmental enquiry function that returns values
for tuning algorithmic performance.
Syntax
value = ilaenv( ispec, name, opts, n1, n2, n3, n4 )
Description
for C interface.
The enquiry function ilaenv is called from the LAPACK routines to choose problem-dependent
parameters for the local environment. See ispec below for a description of the parameters.
This version provides a set of parameters that should give good, but not optimal, performance
on many of the currently available computers. Users are encouraged to modify this subroutine
to set the tuning parameters for their particular machine using the option and problem size
information in the arguments.
This routine will not function correctly if it is converted to all lower case. Converting it to all
upper case is allowed.
Input Parameters
ispec INTEGER.
Specifies the parameter to be returned as the value of
ilaenv:
= 1: the optimal blocksize; if this value is 1, an unblocked
algorithm will give the best performance.
= 2: the minimum block size for which the block routine
should be used; if the usable block size is less than this
value, an unblocked routine should be used.
= 3: the crossover point (in a block routine, for n less than
this value, an unblocked routine should be used)
= 4: the number of shifts, used in the nonsymmetric
eigenvalue routines (deprecated)
= 5: the minimum column dimension for blocking to be
used; rectangular blocks must have dimension at least
k-by-m, where k is given by ilaenv(2,...) and m by
ilaenv(5,...)
1802
= 6: the crossover point for the SVD (when reducing an
m-by-n matrix to bidiagonal form, if max(m,n)/min(m,n)
exceeds this value, a QR factorization is used first to reduce
the matrix to a triangular form.)
= 7: the number of processors
= 8: the crossover point for the multishift QR and QZ
methods for nonsymmetric eigenvalue problems
(deprecated).
= 9: maximum size of the subproblems at the bottom of
the computation tree in the divide-and-conquer algorithm
(used by ?gelsd and ?gesdd)
=10: ieee NaN arithmetic can be trusted not to trap
=11: infinity arithmetic can be trusted not to trap
12 ≤ ispec ≤ 16: ?hseqr or one of its subroutines, see
iparmq for detailed explanation.
name CHARACTER*(*). The name of the calling subroutine, in
either upper case or lower case.
opts CHARACTER*(*). The character options to the subroutine
name, concatenated into a single character string. For
example, uplo = 'U', trans = 'T', and diag = 'N' for
a triangular routine would be specified as opts = 'UTN'.
n1, n2, n3, n4 INTEGER. Problem dimensions for the subroutine name;
these may not all be required.
Output Parameters
value INTEGER.
If value ≥ 0: the value of the parameter specified by
ispec;
If value = -k < 0: the k-th argument had an illegal value.
Application Notes
The following conventions have been used when calling ilaenv from the LAPACK routines:
1. opts is a concatenation of all of the character options to subroutine name, in the same order
that they appear in the argument list for name, even if they are not used in determining the
value of the parameter specified by ispec.
1803
2. The problem dimensions n1, n2, n3, n4 are specified in the order that they appear in the
argument list for name. n1 is used first, n2 second, and so on, and unused problem dimensions
are passed a value of -1.
3. The parameter value returned by ilaenv is checked for validity in the calling subroutine.
For example, ilaenv is used to retrieve the optimal blocksize for strtri as follows:
nb = ilaenv( 1, 'strtri', uplo // diag, n, -1, -1, -1> )
if( nb.le.1 ) nb = max( 1, n )
Below is an example of ilaenv usage in C language:

#include <stdio.h>
#include "mkl.h"
int main(void)
{
int size = 1000;
int ispec = 1;
int dummy = -1;
int blockSize1 = ilaenv(&ispec, "dsytrd", "U", &size, &dummy, &dummy, &dummy);
int blockSize2 = ilaenv(&ispec, "dormtr", "LUN", &size, &size, &dummy, &dummy);
printf("DSYTRD blocksize = %d\n", blockSize1);
printf("DORMTR blocksize = %d\n", blockSize2);
return 0;
}
See Also
• Utility Functions and Routines
• ?hseqr
• iparmq
1804
iparmq
Environmental enquiry function which returns
values for tuning algorithmic performance.
Syntax
value = iparmq( ispec, name, opts, n, ilo, ihi, lwork )
Description
for C interface.
The function sets problem and machine dependent parameters useful for ?hseqr and its
subroutines. It is called whenever ilaenv is called with 12≤ispec≤16.
Input Parameters
ispec INTEGER.
Specifies the parameter to be returned as the value of
iparmq:
= 12: (inmin) Matrices of order nmin or less are sent
directly to ?lahqr, the implicit double shift QR algorithm.
nmin must be at least 11.
= 13: (inwin) Size of the deflation window. This is best set
greater than or equal to the number of simultaneous shifts
ns. Larger matrices benefit from larger deflation windows.
= 14: (inibl) Determines when to stop nibbling and invest
in an (expensive) multi-shift QR sweep. If the aggressive
early deflation subroutine finds ld converged eigenvalues
from an order nw deflation window and
ld>(nw*nibble)/100, then the next QR sweep is skipped
and early deflation is applied immediately to the remaining
active diagonal block. Setting iparmq(ispec=14)=0 causes
TTQRE to skip a multi-shift QR sweep whenever early
deflation finds a converged eigenvalue. Setting ipar-
mq(ispec=14) greater than or equal to 100 prevents TTQRE
from skipping a multi-shift QR sweep.
= 15: (nshfts) The number of simultaneous shifts in a
multi-shift QR iteration.
1805
= 16: (iacc22) iparmq is set to 0, 1 or 2 with the following

meanings.
0: During the multi-shift QR sweep, ?laqr5 does not
accumulate reflections and does not use matrix-matrix
multiply to update the far-from-diagonal matrix entries.
1: During the multi-shift QR sweep, ?laqr5 and/or ?laqr3
accumulates reflections and uses matrix-matrix multiply to
update the far-from-diagonal matrix entries.
2: During the multi-shift QR sweep, ?laqr5 accumulates
reflections and takes advantage of 2-by-2 block structure
during matrix-matrix multiplies.
(If ?trrm is slower than ?gemm, then iparmq(ispec=16)=1
may be more efficient than iparmq(ispec=16)=2 despite
the greater level of arithmetic work implied by the latter
choice.)
name CHARACTER*(*). The name of the calling subroutine.
opts CHARACTER*(*). This is a concatenation of the string
arguments to TTQRE.
n INTEGER. n is the order of the Hessenberg matrix H.
ilo, ihi INTEGER.
columns 1:ilo-1 and ihi+1:n.
lwork INTEGER.
The amount of workspace available.
Output Parameters
value INTEGER.
If value ≥ 0: the value of the parameter specified by
iparmq;
If value = -k < 0: the k-th argument had an illegal value.
Application Notes
The following conventions have been used when calling ilaenv from the LAPACK routines:
1. opts is a concatenation of all of the character options to subroutine name, in the same order
that they appear in the argument list for name, even if they are not used in determining the
value of the parameter specified by ispec.
1806
2. The problem dimensions n1, n2, n3, n4 are specified in the order that they appear in the
argument list for name. n1 is used first, n2 second, and so on, and unused problem dimensions
are passed a value of -1.
3. The parameter value returned by ilaenv is checked for validity in the calling subroutine.
For example, ilaenv is used to retrieve the optimal blocksize for strtri as follows:
nb = ilaenv( 1, 'strtri', uplo // diag, n, -1, -1, -1> )
if( nb.le.1 ) nb = max( 1, n )
ieeeck
Checks if the infinity and NaN arithmetic is safe.
Called by ilaenv.
Syntax
ival = ieeeck( ispec, zero, one )
Description
for C interface.
The function ieeeck is called from ilaenv to verify that infinity and possibly NaN arithmetic
is safe, that is, will not trap.
Input Parameters
ispec INTEGER.
Specifies whether to test just for infinity arithmetic or both
for infinity and NaN arithmetic:
If ispec = 0: Verify infinity arithmetic only.
If ispec = 1: Verify infinity and NaN arithmetic.
zero REAL. Must contain the value 0.0
This is passed to prevent the compiler from optimizing away
this code.
one REAL. Must contain the value 1.0
This is passed to prevent the compiler from optimizing away
this code.
1807
Output Parameters
ival INTEGER.
If ival = 0: Arithmetic failed to produce the correct
answers.
If ival = 1: Arithmetic produced the correct answers.
lsamen
Tests two character strings for equality regardless
of case.
Syntax
val = lsamen(n, ca, cb)
Description
for C interface.
This logical function tests if the first n letters of the string ca are the same as the first n letters
of cb, regardless of case. The function lsamen returns .TRUE. if ca and cb are equivalent
except for case and .FALSE. otherwise. lsamen also returns .FALSE. if len(ca) or len(cb)
is less than n.
Input Parameters
n INTEGER. The number of characters in ca and cb to be
compared.
ca, cb CHARACTER*(*). Specify two character strings of length at
least n to be compared. Only the first n characters of each
string will be accessed.
Output Parameters
val LOGICAL. Result of the comparison.
1808
?labad
Returns the square root of the underflow and
overflow thresholds if the exponent-range is very
large.
Syntax
call slabad( small, large )
call dlabad( small, large )
Description
for C interface.
The routine takes as input the values computed by slamch/dlamch for underflow and overflow,
and returns the square root of each of these values if the log of large is sufficiently large. This
subroutine is intended to identify machines with a large exponent range, such as the Crays,
and redefine the underflow and overflow limits to be the square roots of the values computed
by ?lamch. This subroutine is needed because ?lamch does not compensate for poor arithmetic
in the upper half of the exponent range, as is found on a Cray.
Input Parameters
small REAL for slabad
DOUBLE PRECISION for dlabad.
The underflow threshold as computed by ?lamch.
large REAL for slabad
DOUBLE PRECISION for dlabad.
The overflow threshold as computed by ?lamch.
Output Parameters
small On exit, if log10(large) is sufficiently large, the square
root of small, otherwise unchanged.
large On exit, if log10(large) is sufficiently large, the square
root of large, otherwise unchanged.
1809
?lamch
Determines machine parameters for floating-point
arithmetic.
Syntax
val = slamch( cmach )
val = dlamch( cmach )
Description
for C interface.
The function ?lamch determines single precision and double precision machine parameters.
Input Parameters
cmach CHARACTER*1. Specifies the value to be returned by ?lamch:
= 'E' or 'e', val = eps
= 'S' or 's , val = sfmin
= 'B' or 'b', val = base
= 'P' or 'p', val = eps*base
= 'n' or 'n', val = t
= 'R' or 'r', val = rnd
= 'm' or 'm', val = emin
= 'U' or 'u', val = rmin
= 'L' or 'l', val = emax
= 'O' or 'o', val = rmax
where
eps = relative machine precision;
sfmin = safe minimum, such that 1/sfmin does not
overflow;
base = base of the machine;
prec = eps*base;
t = number of (base) digits in the mantissa;
rnd = 1.0 when rounding occurs in addition, 0.0 otherwise;
emin = minimum exponent before (gradual) underflow;
rmin = underflow_threshold - base**(emin-1);
1810
emax = largest exponent before overflow;
rmax = overflow_threshold - (base**emax)*(1-eps).
Output Parameters
val REAL for slamch
DOUBLE PRECISION for dlamch
?lamc1
Called from ?lamc2. Determines machine
parameters given by beta, t, rnd, ieee1.
Syntax
call slamc1( beta, t, rnd, ieee1 )
call dlamc1( beta, t, rnd, ieee1 )
Description
for C interface.
The routine ?lamc1 determines machine parameters given by beta, t, rnd, ieee1.
Output Parameters
beta INTEGER. The base of the machine.
t INTEGER. The number of (beta) digits in the mantissa.
rnd LOGICAL.
Specifies whether proper rounding ( rnd = .TRUE. ) or
chopping ( rnd = .FALSE. ) occurs in addition. This may
not be a reliable guide to the way in which the machine
performs its arithmetic.
ieee1 LOGICAL.
Specifies whether rounding appears to be done in the ieee
'round to nearest' style.
1811
?lamc2
Used by ?lamch. Determines machine parameters
specified in its arguments list.
Syntax
call slamc2( beta, t, rnd, eps, emin, rmin, emax, rmax )
call dlamc2( beta, t, rnd, eps, emin, rmin, emax, rmax )
Description
for C interface.
The routine ?lamc2 determines machine parameters specified in its arguments list.
Output Parameters
beta INTEGER. The base of the machine.
t INTEGER. The number of (beta) digits in the mantissa.
rnd LOGICAL.
Specifies whether proper rounding (rnd = .TRUE.) or
chopping (rnd = .FALSE.) occurs in addition. This may
not be a reliable guide to the way in which the machine
performs its arithmetic.
eps REAL for slamc2
DOUBLE PRECISION for dlamc2
The smallest positive number such that
fl(1.0 - eps) < 1.0,
where fl denotes the computed value.
emin INTEGER. The minimum exponent before (gradual) underflow
occurs.
rmin REAL for slamc2
The smallest normalized number for the machine, given by
baseemin-1,
where base is the floating point value of beta.
emax INTEGER.The maximum exponent before overflow occurs.
1812
rmax REAL for slamc2
The largest positive number for the machine, given by
baseemax(1 - eps), where base is the floating point value of
beta.
?lamc3
Called from ?lamc1-?lamc5. Intended to force a
and b to be stored prior to doing the addition of a
and b.
Syntax
val = slamc3( a, b )
val = dlamc3( a, b )
Description
for C interface.
The routine is intended to force A and B to be stored prior to doing the addition of A and B, for
use in situations where optimizers might hold one of these in a register.
Input Parameters
a, b REAL for slamc3
The values a and b.
Output Parameters
val REAL for slamc3
The result of adding values a and b.
1813
?lamc4
This is a service routine for ?lamc2.
Syntax
call slamc4( emin, start, base )
call dlamc4( emin, start, base )
Description
for C interface.
This is a service routine for ?lamc2.
Input Parameters
start REAL for slamc4
The starting point for determining emin.
base INTEGER. The base of the machine.
Output Parameters
emin INTEGER. The minimum exponent before (gradual)
underflow, computed by setting a = start and dividing by
base until the previous a can not be recovered.
?lamc5
Called from ?lamc2. Attempts to compute the
largest machine floating-point number, without
overflow.
Syntax
call slamc5( beta, p, emin, ieee, emax, rmax)
call dlamc5( beta, p, emin, ieee, emax, rmax)
1814
Description
for C interface.
The routine ?lamc5 attempts to compute rmax, the largest machine floating-point number,
without overflow. It assumes that emax + abs(emin) sum approximately to a power of 2. It
will fail on machines where this assumption does not hold, for example, the Cyber 205 (emin
= -28625, emax = 28718). It will also fail if the value supplied for emin is too large (that is,
too close to zero), probably with overflow.
Input Parameters
beta INTEGER. The base of floating-point arithmetic.
p INTEGER. The number of base beta digits in the mantissa
of a floating-point value.
emin INTEGER. The minimum exponent before (gradual)
underflow.
ieee LOGICAL. A logical flag specifying whether or not the
arithmetic system is thought to comply with the IEEE
standard.
Output Parameters
emax INTEGER. The largest exponent before overflow.
rmax REAL for slamc5
The largest machine floating-point number.
second/dsecnd
Return user time for a process.
Syntax
val = second()
val = dsecnd()
1815
Description
for C interface.
The functions second/dsecnd return the user time for a process in seconds. These versions
get the time from the system function etime. The difference is that dsecnd returns the result
with double precision.
Output Parameters
val REAL for second
DOUBLE PRECISION for dsecnd
User time for a process.
1816
ScaLAPACK Routines 6
This chapter describes the Intel® Math Kernel Library implementation of routines from the ScaLAPACK
package for distributed-memory architectures. Routines are supported for both real and complex dense
and band matrices to perform the tasks of solving systems of linear equations, solving linear least-squares
problems, eigenvalue and singular value problems, as well as performing a number of related computational
tasks.
Intel MKL ScaLAPACK routines are written in FORTRAN 77 with exception of a few utility routines written
in C to exploit the IEEE arithmetic. All routines are available in all precision types: single precision, double
precision, complex and double complex precision. C declarations of ScaLAPACK routines can be found in
the mkl_scalapack.h header file.
NOTE. ScaLAPACK routines are provided only with Intel® MKL versions for Linux* and
Windows* OSs.
Sections in this chapter include descriptions of ScaLAPACK computational routines that perform distinct
computational tasks, as well as driver routines for solving standard types of problems in one call.
Generally, ScaLAPACK runs on a network of computers using MPI as a message-passing layer and a set
of prebuilt communication subprograms (BLACS), as well as a set of BLAS optimized for the target
architecture. Intel MKL version of ScaLAPACK is optimized for Intel® processors. For the detailed system
and environment requirements, see Intel® MKL Release Notes and Intel® MKL User's Guide.
For full reference on ScaLAPACK routines and related information, see [SLUG].
Overview
The model of the computing environment for ScaLAPACK is represented as a one-dimensional array
of processes (for operations on band or tridiagonal matrices) or also a two-dimensional process grid
(for operations on dense matrices). To use ScaLAPACK, all global matrices or vectors should be
distributed on this array or grid prior to calling the ScaLAPACK routines.
ScaLAPACK uses the two-dimensional block-cyclic data distribution as a layout for dense matrix
computations. This distribution provides good work balance between available processors, as well
as gives the opportunity to use BLAS Level 3 routines for optimal local computations. Information
about the data distribution that is required to establish the mapping between each global array and
its corresponding process and memory location is contained in the so called array descriptor
associated with each global array. An example of an array descriptor structure is given in Table 6-1
.
1817
Table 6-1 Content of the array descriptor for dense matrices

Array Element # Name Definition
1 dtype Descriptor type ( =1 for dense matrices)
2 ctxt BLACS context handle for the process grid
3 m Number of rows in the global array
4 n Number of columns in the global array
5 mb Row blocking factor
6 nb Column blocking factor
7 rsrc Process row over which the first row of the global array is
distributed
8 csrc Process column over which the first column of the global array
is distributed
9 lld Leading dimension of the local array
The number of rows and columns of a global dense matrix that a particular process in a grid
receives after data distributing is denoted by LOCr() and LOCc(), respectively. To compute
these numbers, you can use the ScaLAPACK tool routine numroc.
After the block-cyclic distribution of global data is done, you may choose to perform an operation
on a submatrix of the global matrix A, which is contained in the global subarray sub(A), defined
by the following 6 values (for dense matrices):
m The number of rows of sub(A)
n The number of columns of sub(A)
a A pointer to the local array containing the entire global array A
ia The row index of sub(A) in the global array
ja The column index of sub(A) in the global array
desca The array descriptor for the global array

For each routine introduced in this chapter, you can use the ScaLAPACK name. The naming
convention for ScaLAPACK routines is similar to that used for LAPACK routines (see Routine
Naming Conventions in Chapter 4). A general rule is that each routine name in ScaLAPACK,
which has an LAPACK equivalent, is simply the LAPACK name prefixed by initial letter p.
ScaLAPACK names have the structure p?yyzzz or p?yyzz, which is described below.
The initial letter p is a distinctive prefix of ScaLAPACK routines and is present in each such
routine.
The second symbol ? indicates the data type:
1818
The second and third letters yy indicate the matrix type as:
ge general
gb general band
gg a pair of general matrices (for a generalized problem)
dt general tridiagonal (diagonally dominant-like)
db general band (diagonally dominant-like)
po symmetric or Hermitian positive-definite
pb symmetric or Hermitian positive-definite band
pt symmetric or Hermitian positive-definite tridiagonal
sy symmetric
st symmetric tridiagonal (real)
he Hermitian
or orthogonal
tr triangular (or quasi-triangular)
tz trapezoidal
un unitary
For computational routines, the last three letters zzz indicate the computation performed and
have the same meaning as for LAPACK routines.
For driver routines, the last two letters zz or three letters zzz have the following meaning:
sv a simple driver for solving a linear system
svx an expert driver for solving a linear system
ls a driver for solving a linear least squares problem
ev a simple driver for solving a symmetric eigenvalue problem
evx an expert driver for solving a symmetric eigenvalue problem
svd a driver for computing a singular value decomposition
gvx an expert driver for solving a generalized symmetric definite eigenvalue
problem
1819
Simple driver here means that the driver just solves the general problem, whereas an expert
driver is more versatile and can also optionally perform some related computations (such, for
example, as refining the solution and computing error bounds after the linear system is solved).
In the sections that follow, the descriptions of ScaLAPACK computational routines are given.
These routines perform distinct computational tasks that can be used for:
• Solving Systems of Linear Equations

• Orthogonal Factorizations and LLS Problems
• Symmetric Eigenproblems
• Nonsymmetric Eigenproblems
• Singular Value Decomposition
• Generalized Symmetric-Definite Eigenproblems
See also the respective driver routines.
Linear Equations
ScaLAPACK supports routines for the systems of equations with the following types of matrices:
• general
• general banded
• general diagonally dominant-like banded (including general tridiagonal)
• symmetric or Hermitian positive-definite
• symmetric or Hermitian positive-definite banded
• symmetric or Hermitian positive-definite tridiagonal
A diagonally dominant-like matrix is defined as a matrix for which it is known in advance that
pivoting is not required in the LU factorization of this matrix.
For the above matrix types, the library includes routines for performing the following
computations: factoring the matrix; equilibrating the matrix; solving a system of linear
equations; estimating the condition number of a matrix; refining the solution of linear equations
and computing its error bounds; inverting the matrix. Note that for some of the listed matrix
types only part of the computational routines are provided (for example, routines that refine
the solution are not provided for band or tridiagonal matrices). See Table 6-2 for full list of
available routines.
1820
To solve a particular problem, you can either call two or more computational routines or call a
corresponding driver routine that combines several tasks in one call. Thus, to solve a system
of linear equations with a general matrix, you can first call p?getrf(LU factorization) and then
p?getrs(computing the solution). Then, you might wish to call p?gerfs to refine the solution
and get the error bounds. Alternatively, you can just use the driver routine p?gesvx which
performs all these tasks in one call.
Table 6-2 lists the ScaLAPACK computational routines for factorizing, equilibrating, and inverting
matrices, estimating their condition numbers, solving systems of equations with real matrices,
refining the solution, and estimating its error.
Table 6-2 Computational Routines for Systems of Linear Equations
Matrix type, storage Factorize Equilibrate Solve Condition Estimate Invert
scheme matrix matrix system number error matrix
general (partial p?getrf p?geequ p?getrs p?gecon p?gerfs p?getri
pivoting)
general band (partial p?gbtrf p?gbtrs
pivoting)
general band (no p?dbtrf p?dbtrs
pivoting)
general tridiagonal p?dttrf p?dttrs
(no pivoting)
symmetric/Hermitian p?potrf p?poequ p?potrs p?pocon p?porfs p?potri
positive-definite
symmetric/Hermitian p?pbtrf p?pbtrs
positive-definite,
band
symmetric/Hermitian p?pttrf p?pttrs
positive-definite,
tridiagonal
triangular p?trtrs p?trcon p?trrfs p?trtri
In this table ? stands for s (single precision real), d (double precision real), c (single precision
complex), or z (double precision complex).
Routines for Matrix Factorization

This section describes the ScaLAPACK routines for matrix factorization. The following
factorizations are supported:
• LU factorization of general matrices

• LU factorization of diagonally dominant-like matrices
• Cholesky factorization of real symmetric or complex Hermitian positive-definite matrices
1821
You can compute the factorizations using full and band storage of matrices.
p?getrf
distributed matrix.
Syntax
call psgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pdgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pcgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pzgetrf(m, n, a, ia, ja, desca, ipiv, info)
Description
For C interface, this routine is declared in mkl_scalapack.h file.
This routine forms the LU factorization of a general m-by-n distributed matrix sub(A) =
A(ia:ia+n-1, ja:ja+n-1) as
A = P*L*U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower
trapezoidal if m > n) and U is upper triangular (upper trapezoidal if m < n). L and U are stored
in sub(A).
The routine uses partial pivoting, with row interchanges.
Input Parameters
m (global) INTEGER. The number of rows in the distributed
submatrix sub(A); m≥0.
n (global) INTEGER. The number of columns in the distributed
submatrix sub(A); n≥0.
a (local)
REAL for psgetrf
DOUBLE PRECISION for pdgetrf
COMPLEX for pcgetrf
DOUBLE COMPLEX for pzgetrf.
1822
Pointer into the local memory to an array of local dimension
(lld_a, LOCc(ja+n-1)).
Contains the local pieces of the distributed matrix sub(A)
to be factored.
ia, ja (global) INTEGER. The row and column indices in the global
array A indicating the first row and the first column of the
submatrix A(ia:ia+n-1, ja:ja+n-1), respectively.
desca (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix A.
Output Parameters
a Overwritten by local pieces of the factors L and U from the
factorization A = P*L*U. The unit diagonal elements of L are
not stored.
ipiv (local) INTEGER array.
The dimension of ipiv is (LOCr(m_a)+ mb_a).
This array contains the pivoting information: local row i
was interchanged with global row ipiv(i). This array is
tied to the distributed matrix A.
info (global) INTEGER.
info < 0: if the i-th argument is an array and the j-th
entry had an illegal value, then info = -(i*100+j); if the
i-th argument is a scalar and had an illegal value, then
info = -i.
If info = i, uii is 0. The factorization has been completed,
but the factor U is exactly singular. Division by zero will
occur if you use the factor U for solving a system of linear
equations.
1823
p?gbtrf
Computes the LU factorization of a general n-by-n
banded distributed matrix.
Syntax
call psgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pdgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pcgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pzgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
Description
This routine computes the LU factorization of a general n-by-n real/complex banded distributed
matrix A(1:n, ja:ja+n-1) using partial pivoting with row interchanges.
The resulting factorization is not the same factorization as returned from the LAPACK routine
?gbtrf. Additional permutations are performed on the matrix for the sake of parallelism.
A(1:n, ja:ja+n-1) = P*L*U*Q
where P and Q are permutation matrices, and L and U are banded lower and upper triangular
matrices, respectively. The matrix Q represents reordering of columns for the sake of parallelism,
while P represents reordering of rows for numerical stability using classic partial pivoting.
Input Parameters
n (global) INTEGER. The number of rows and columns in the
distributed submatrix A(1:n, ja:ja+n-1); n ≥ 0.
bwl (global) INTEGER. The number of sub-diagonals within the
band of A
( 0 ≤ bwl ≤ n-1 ).
bwu (global) INTEGER. The number of super-diagonals within
the band of A
( 0 ≤ bwu ≤ n-1 ).
1824
a (local)
REAL for psgbtrf
DOUBLE PRECISION for pdgbtrf
COMPLEX for pcgbtrf
DOUBLE COMPLEX for pzgbtrf.
(lld_a, LOCc(ja+n-1) where
lld_a ≥ 2*bwl + 2*bwu +1.
Contains the local pieces of the n-by-n distributed banded
matrix A(1:n, ja:ja+n-1) to be factored.
ja (global) INTEGER. The index in the global array A that points
to the start of the matrix to be operated on ( which may be
either all of A or a submatrix of A).
If desca(dtype_) = 501, then dlen_ ≥ 7;
else if desca(dtype_) = 1, then dlen_ ≥ 9.
laf (local) INTEGER. The dimension of the array af.
Must be laf ≥
(NB+bwu)*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
If laf is not large enough, an error code will be returned
and the minimum acceptable size will be returned in af(1).
work (local) Same type as a. Workspace array of dimension lwork
.
lwork (local or global) INTEGER. The size of the work array (lwork
≥ 1). If lwork is too small, the minimal acceptable size will
be returned in work(1) and an error code is returned.
Output Parameters
a On exit, this array contains details of the factorization. Note
that additional permutations are performed on the matrix,
so that the factors returned are different from those returned
by LAPACK.
1825
The dimension of ipiv must be ≥ desca(NB).

Contains pivot indices for local factorizations. Note that you
should not alter the contents of this array between
factorization and solve.
af (local)
REAL for psgbtrf
DOUBLE PRECISION for pdgbtrf
COMPLEX for pcgbtrf
DOUBLE COMPLEX for pzgbtrf.
Array, dimension (laf).
Auxiliary Fillin space. Fillin is created during the factorization
routine p?gbtrf and this is stored in af.
Note that if a linear system is to be solved using p?gbtrs
after the factorization routine, af must not be altered after
the factorization.
work(1) On exit, work(1) contains the minimum value of lwork
info < 0:
If the ith argument is an array and the jth entry had an
illegal value, then info = -(i*100+j); if the ith argument
is a scalar and had an illegal value, then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor
info and factored locally was not nonsingular, and the
factorization was not completed. If info = k > NPROCS,
the submatrix stored on processor info-NPROCS
representing interactions with other processors was not
nonsingular, and the factorization was not completed.
1826
p?dbtrf
Computes the LU factorization of a n-by-n
diagonally dominant-like banded distributed matrix.
Syntax
call psdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pddbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pcdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pzdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
Description
The routine computes the LU factorization of a n-by-n real/complex diagonally dominant-like
banded distributed matrix A(1:n, ja:ja+n-1) without pivoting.
Note that the resulting factorization is not the same factorization as returned from LAPACK.
Additional permutations are performed on the matrix for the sake of parallelism.
Input Parameters
n (global) INTEGER. The number of rows and columns in the
distributed submatrix A(1:n, ja:ja+n-1); n ≥ 0.
band of A
(0 ≤ bwl ≤ n-1).
the band of A
(0 ≤ bwu ≤ n-1).
a (local)
REAL for psdbtrf
DOUBLE PRECISION for pddbtrf
COMPLEX for pcdbtrf
DOUBLE COMPLEX for pzdbtrf.
(lld_a,LOCc(ja+n-1)).
1827
Contains the local pieces of the n-by-n distributed banded

matrix A(1:n, ja:ja+n-1) to be factored.
2
Must be laf ≥ NB*(bwl+bwu)+*(max(bwl,bwu)) .
work (local) Same type as a. Workspace array of dimension
lwork.
lwork (local or global) INTEGER. The size of the work array, must
be lwork ≥ (max(bwl,bwu))2. If lwork is too small, the
minimal acceptable size will be returned in work(1) and an
error code is returned.
Output Parameters
a On exit, this array contains details of the factorization. Note
that additional permutations are performed on the matrix,
so that the factors returned are different from those returned
by LAPACK.
af (local)
REAL for psdbtrf
DOUBLE PRECISION for pddbtrf
COMPLEX for pcdbtrf
DOUBLE COMPLEX for pzdbtrf.
routine p?dbtrf and this is stored in af.
1828
Note that if a linear system is to be solved using p?dbtrs
the factorization.
info < 0:
If the ith argument is an array and the j-th entry had an
illegal value, then info = -(i*100+j); if the i-th
argument is a scalar and had an illegal value, then info =
-i. info > 0:
info and factored locally was not diagonally dominant-like,
and the factorization was not completed. If info = k >
NPROCS, the submatrix stored on processor info-NPROCS
p?potrf
symmetric (Hermitian) positive-definite distributed
matrix.
Syntax
call pspotrf(uplo, n, a, ia, ja, desca, info)
call pdpotrf(uplo, n, a, ia, ja, desca, info)
call pcpotrf(uplo, n, a, ia, ja, desca, info)
call pzpotrf(uplo, n, a, ia, ja, desca, info)
Description
positive-definite distributed n-by-n matrix A(ia:ia+n-1, ja:ja+n-1), denoted below as
sub(A).
1829

sub(A) = UH*U if uplo='U', or
sub(A) = L*LH if uplo='L'
Input Parameters
uplo (global) CHARACTER*1.
Indicates whether the upper or lower triangular part of
sub(A) is stored. Must be 'U' or 'L'.
If uplo = 'U', the array a stores the upper triangular part
H
of the matrix sub(A) that is factored as U *U.
If uplo = 'L', the array a stores the lower triangular part
H
of the matrix sub(A) that is factored as L*L .
n (global) INTEGER. The order of the distributed submatrix
sub(A) (n≥0).
a (local)
REAL for pspotrf
DOUBLE PRECISON for pdpotrf
COMPLEX for pcpotrf
DOUBLE COMPLEX for pzpotrf.
Pointer into the local memory to an array of dimension
On entry, this array contains the local pieces of the n-by-n
symmetric/Hermitian distributed matrix sub(A) to be
factored.
Depending on uplo, the array a contains either the upper
or the lower triangular part of the matrix sub(A) (see uplo).
submatrix sub(A), respectively.
1830
Output Parameters
a The upper or lower triangular part of a is overwritten by the
Cholesky factor U or L, as specified by uplo.
If info=0, the execution is successful;
info < 0: if the i-th argument is an array, and the j-th
info = -i.
If info = k >0, the leading minor of order k,
A(ia:ia+k-1, ja:ja+k-1), is not positive-definite, and
the factorization could not be completed.
p?pbtrf
symmetric (Hermitian) positive-definite banded
distributed matrix.
Syntax
call pspbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pdpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pcpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pzpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
Description
The routine computes the Cholesky factorization of an n-by-n real symmetric or complex
Hermitian positive-definite banded distributed matrix A(1:n, ja:ja+n-1).
The resulting factorization is not the same factorization as returned from LAPACK. Additional
permutations are performed on the matrix for the sake of parallelism.
The factorization has the form:
A(1:n, ja:ja+n-1) = P*UH*PT, if uplo='U', or
A(1:n, ja:ja+n-1) = P*L*LH*PT, if uplo='L',
1831
where P is a permutation matrix and U and L are banded upper and lower triangular matrices,
respectively.
Input Parameters
uplo (global) CHARACTER*1. Must be 'U' or 'L'.
If uplo = 'U', upper triangle of A(1:n, ja:ja+n-1) is
stored;
If uplo = 'L', lower triangle of A(1:n, ja:ja+n-1) is
stored.
A(1:n, ja:ja+n-1).
(n≥0).
bw (global) INTEGER.
The number of superdiagonals of the distributed matrix if
uplo = 'U', or the number of subdiagonals if uplo = 'U'
(bw≥0).
a (local)
REAL for pspbtrf
DOUBLE PRECISON for pdpbtrf
COMPLEX for pcpbtrf
DOUBLE COMPLEX for pzpbtrf.
On entry, this array contains the local pieces of the upper
or lower triangle of the symmetric/Hermitian band
distributed matrix A(1:n, ja:ja+n-1) to be factored.
Must be laf ≥ (NB+2*bw)*bw.
1832
work (local) Same type as a. Workspace array of dimension lwork
.
be lwork ≥ bw2.
Output Parameters
a On exit, if info=0, contains the permuted triangular factor
U or L from the Cholesky factorization of the band matrix
A(1:n, ja:ja+n-1), as specified by uplo.
af (local)
REAL for pspbtrf
DOUBLE PRECISON for pdpbtrf
COMPLEX for pcpbtrf
DOUBLE COMPLEX for pzpbtrf.
Array, dimension (laf). Auxiliary Fillin space. Fillin is created
during the factorization routine p?pbtrf and this is stored
in af. Note that if a linear system is to be solved using
p?pbtrs after the factorization routine, af must not be
altered.
info < 0:
info>0:
info and factored locally was not positive definite, and the
factorization was not completed.
1833
If info = k > NPROCS, the submatrix stored on processor

info-NPROCS representing interactions with other
processors was not nonsingular, and the factorization was
not completed.
p?pttrf
distributed matrix.
Syntax
call pspttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pdpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pcpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pzpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
Description
The routine computes the Cholesky factorization of an n-by-n real symmetric or complex
hermitian positive-definite tridiagonal distributed matrix A(1:n, ja:ja+n-1).
A(1:n, ja:ja+n-1) = P*L*D*LH*PT, or
A(1:n, ja:ja+n-1) = P*UH*D*U*PT,
where P is a permutation matrix, and U and L are tridiagonal upper and lower triangular matrices,
respectively.
Input Parameters
A(1:n, ja:ja+n-1)
(n ≥ 0).
1834
d, e (local)
REAL for pspttrf
DOUBLE PRECISON for pdpttrf
COMPLEX for pcpttrf
DOUBLE COMPLEX for pzpttrf.
Pointers into the local memory to arrays of dimension
(desca(nb_)) each.
On entry, the array d contains the local part of the global
vector storing the main diagonal of the distributed matrix
A.
On entry, the array e contains the local part of the global
vector storing the upper diagonal of the distributed matrix
A.
to the start of the matrix to be operated on (which may be
Must be laf ≥ NB+2.
work (local) Same type as d and e. Workspace array of dimension
lwork .
be at least
lwork ≥ 8*NPCOL.
Output Parameters
d, e On exit, overwritten by the details of the factorization.
af (local)
REAL for pspttrf
DOUBLE PRECISION for pdpttrf
1835
COMPLEX for pcpttrf

routine p?pttrf and this is stored in af.
Note that if a linear system is to be solved using p?pttrs
after the factorization routine, af must not be altered.
info < 0:
If the i-th argument is an array and the j-th entry had an
-i.
info > 0:
info and factored locally was not positive definite, and the
factorization was not completed.
processors was not nonsingular, and the factorization was
not completed.
p?dttrf
Computes the LU factorization of a diagonally
dominant-like tridiagonal distributed matrix.
Syntax
call psdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pddttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pcdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pzdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
1836
Description
The routine computes the LU factorization of an n-by-n real/complex diagonally dominant-like
tridiagonal distributed matrix A(1:n, ja:ja+n-1) without pivoting for stability.
A(1:n, ja:ja+n-1) = P*L*U*PT,
where P is a permutation matrix, and L and U are banded lower and upper triangular matrices,
respectively.
Input Parameters
n (global) INTEGER. The number of rows and columns to be
operated on, that is, the order of the distributed submatrix
A(1:n, ja:ja+n-1) (n ≥ 0).
dl, d, du (local)
REAL for pspttrf
DOUBLE PRECISON for pdpttrf
COMPLEX for pcpttrf
Pointers to the local arrays of dimension (desca(nb_))
each.
On entry, the array dl contains the local part of the global
vector storing the subdiagonal elements of the matrix.
Globally, dl(1) is not referenced, and dl must be aligned
with d.
On entry, the array d contains the local part of the global
vector storing the diagonal elements of the matrix.
On entry, the array du contains the local part of the global
vector storing the super-diagonal elements of the matrix.
du(n) is not referenced, and du must be aligned with d.
1837

array descriptor for the distributed matrix A. If
desca(dtype_) = 501, then dlen_ ≥ 7;
Must be laf ≥ 2*(NB+2) .
work (local) Same type as d. Workspace array of dimension
lwork.
be at least lwork ≥ 8*NPCOL.
Output Parameters
dl, d, du On exit, overwritten by the information containing the
factors of the matrix.
af (local)
REAL for psdttrf
DOUBLE PRECISION for pddttrf
COMPLEX for pcdttrf
DOUBLE COMPLEX for pzdttrf.
routine p?dttrf and this is stored in af.
Note that if a linear system is to be solved using p?dttrs
after the factorization routine, af must not be altered.
info < 0:
-i.
1838
info > 0:
info and factored locally was not diagonally dominant-like,
and the factorization was not completed. If info = k >
NPROCS, the submatrix stored on processor info-NPROCS
Routines for Solving Systems of Linear Equations

This section describes the ScaLAPACK routines for solving systems of linear equations. Before
calling most of these routines, you need to factorize the matrix of your system of equations
(see Routines for Matrix Factorization in this chapter). However, the factorization is not necessary
if your system of equations has a triangular matrix.
p?getrs
Solves a system of distributed linear equations with
a general square matrix, using the LU factorization
computed by p?getrf.
Syntax
call psgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pdgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pcgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pzgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
Description
The routine solves a system of distributed linear equations with a general n-by-n distributed
matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) using the LU factorization computed by p?getrf.
The system has one of the following forms specified by trans:
sub(A)*X = sub(B) (no transpose),

sub(A)T*X = sub(B) (transpose),
1839
sub(A)H*X = sub(B) (conjugate transpose),

where sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1).
Before calling this routine, you must call p?getrf to compute the LU factorization of sub(A).
Input Parameters
trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.
If trans = 'N', then sub(A)*X = sub(B) is solved for X.
If trans = 'T', then sub(A)T*X = sub(B) is solved for
X.
If trans = 'C', then sub(A)H *X = sub(B) is solved for
X.
n (global) INTEGER. The number of linear equations; the order
of the submatrix sub(A) (n≥0).
nrhs (global) INTEGER. The number of right hand sides; the
number of columns of the distributed submatrix sub(B)
(nrhs≥0).
a, b (global)
REAL for psgetrs
DOUBLE PRECISION for pdgetrs
COMPLEX for pcgetrs
DOUBLE COMPLEX for pzgetrs.
Pointers into the local memory to arrays of local dimension
a(lld_a, LOCc(ja+n-1)) and b(lld_b,
LOCc(jb+nrhs-1)), respectively.
On entry, the array a contains the local pieces of the factors
L and U from the factorization sub(A) = P*L*U; the unit
diagonal elements of L are not stored. On entry, the array
b contains the right hand sides sub(B).
1840
The dimension of ipiv is (LOCr(m_a) + mb_a). This array
contains the pivoting information: local row i of the matrix
was interchanged with the global row ipiv(i).
This array is tied to the distributed matrix A.
ib, jb (global) INTEGER. The row and column indices in the global
array B indicating the first row and the first column of the
submatrix sub(B), respectively.
descb (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix B.
Output Parameters
b On exit, overwritten by the solution distributed matrix X.
info INTEGER. If info=0, the execution is successful. info <
0:
-i.
p?gbtrs
Solves a system of distributed linear equations with
a general band matrix, using the LU factorization
computed by p?gbtrf.
Syntax
call psgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af,
laf, work, lwork, info)
call pdgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af,
call pcgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af,
call pzgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af,
1841
Description
The routine solves a system of distributed linear equations with a general band distributed
matrix sub(A) = A(1:n, ja:ja+n-1) using the LU factorization computed by p?gbtrf.
The system has one of the following forms specified by trans:
sub(A)*X = sub(B) (no transpose),

sub(A)T*X = sub(B) (transpose),
sub(A)H*X = sub(B) (conjugate transpose),
where sub(B) = B(ib:ib+n-1, 1:nrhs) .
Before calling this routine, you must call p?gbtrf to compute the LU factorization of sub(A).
Input Parameters
X.
If trans = 'C', then sub(A)H *X = sub(B) is solved for
X.
of the distributed submatrix sub(A) (n ≥ 0).
band of A ( 0 ≤ bwl ≤ n-1 ).
the band of A ( 0 ≤ bwu ≤ n-1 ).
(nrhs ≥ 0).
a, b (global)
REAL for psgbtrs
DOUBLE PRECISION for pdgbtrs
COMPLEX for pcgbtrs
1842
DOUBLE COMPLEX for pzgbtrs.
a(lld_a,LOCc(ja+n-1)) and b(lld_b,LOCc(nrhs)),
respectively.
The array a contains details of the LU factorization of the
distributed band matrix A.
On entry, the array b contains the local pieces of the right
hand sides B(ib:ib+n-1, 1:nrhs).
ib (global) INTEGER. The index in the global array A that points
Must be laf ≥ NB*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
work (local) Same type as a. Workspace array of dimension
lwork.
be at least lwork ≥ nrhs*(NB+2*bwl+4*bwu).
Output Parameters
The dimension of ipiv must be ≥ desca(NB).
1843
Contains pivot indices for local factorizations. Note that you

should not alter the contents of this array between
factorization and solve.
b On exit, overwritten by the local pieces of the solution
distributed matrix X.
af (local)
REAL for psgbtrs
DOUBLE PRECISION for pdgbtrs
COMPLEX for pcgbtrs
DOUBLE COMPLEX for pzgbtrs.
routine p?gbtrf and this is stored in af.
Note that if a linear system is to be solved using p?gbtrs
the factorization.
info < 0:
If the i-th argument is an array and the jth entry had an
-i.
p?potrs
Cholesky-factored symmetric/Hermitian distributed
Syntax
call pspotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pdpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pcpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pzpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
1844
Description
The routine p?potrs solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B) ,
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian
positive definite distributed matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1,
jb:jb+nrhs-1).
This routine uses Cholesky factorization
sub(A) = UH*U, or sub(A) = L*LH
computed by p?potrf.
Input Parameters
If uplo = 'U', upper triangle of sub(A) is stored;
If uplo = 'L', lower triangle of sub(A) is stored.
sub(A) (n≥0).
(nrhs≥0).
a, b (local)
REAL for pspotrs
DOUBLE PRECISION for pdpotrs
COMPLEX for pcpotrs
DOUBLE COMPLEX for pzpotrs.
a(lld_a,LOCc(ja+n-1)) and
b(lld_b,LOCc(jb+nrhs-1)), respectively.
The array a contains the factors L or U from the Cholesky
H H
factorization sub(A) = L*L or sub(A) = U *U, as computed
by p?potrf.
hand sides sub(B).
1845
Output Parameters
b Overwritten by the local pieces of the solution matrix X.
info < 0: if the i-th argument is an array and the j-th
info = -i.
p?pbtrs
Cholesky-factored symmetric/Hermitian
positive-definite band matrix.
Syntax
call pspbtrs( uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pdpbtrs( uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pcpbtrs( uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pzpbtrs( uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
1846
Description
The routine p?pbtrs solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B) ,
where sub(A) = A(1:n, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian
positive definite distributed band matrix, and sub(B) denotes the distributed matrix
B(ib:ib+n-1, 1:nrhs).
This routine uses Cholesky factorization
sub(A) = P*UH*U*PT, or sub(A) = P*L*LH*PT
computed by p?pbtrf.
Input Parameters
sub(A) (n≥0).
bw (global) INTEGER. The number of superdiagonals of the
distributed matrix if uplo = 'U', or the number of
subdiagonals if uplo = 'L' (bw≥0).
(nrhs≥0).
a, b (local)
REAL for pspbtrs
DOUBLE PRECISION for pdpbtrs
COMPLEX for pcpbtrs
DOUBLE COMPLEX for pzpbtrs.
a(lld_a,LOCc(ja+n-1)) and b(lld_b,LOCc(nrhs-1)),
respectively.
1847
The array a contains the permuted triangular factor U or L

from the Cholesky factorization sub(A) = P*UH*U*PT, or
H T
sub(A) = P*L*L *P of the band matrix A, as returned by
p?pbtrf.
On entry, the array b contains the local pieces of the
n-by-nrhs right hand side distributed matrix sub(B).
ib (global) INTEGER. The row index in the global array B
indicating the first row of the submatrix sub(B).
If descb(dtype_) = 502, then dlen_ ≥ 7;
else if descb(dtype_) = 1, then dlen_ ≥ 9.
af, work (local) Arrays, same type as a.
The array af is of dimension (laf). It contains auxiliary
Fillin space. Fillin is created during the factorization routine
p?dbtrf and this is stored in af.
The array work is a workspace array of dimension lwork.
Must be laf ≥ nrhs*bw.
lwork (local or global) INTEGER. The size of the array work, must
be at least lwork ≥ bw2.
Output Parameters
b On exit, if info=0, this array contains the local pieces of
the n-by-nrhs solution distributed matrix X.
1848
info < 0:
-i.
p?pttrs
distributed matrix using the factorization computed
by p?pttrf.
Syntax
call pspttrs(n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pdpttrs(n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pcpttrs(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pzpttrs(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
Description
The routine p?pttrs solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B) ,
where sub(A) = A(1:n, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian
positive definite tridiagonal distributed matrix, and sub(B) denotes the distributed matrix
B(ib:ib+n-1, 1:nrhs).
This routine uses the factorization
1849
sub(A) = P*L*D*LH*PT, or sub(A) = P*UH*D*U*PT

computed by p?pttrf.
Input Parameters
uplo (global, used in complex flavors only)
CHARACTER*1. Must be 'U' or 'L'.
sub(A) (n≥0).
(nrhs≥0).
d, e (local)
REAL for pspttrs
DOUBLE PRECISON for pdpttrs
COMPLEX for pcpttrs
DOUBLE COMPLEX for pzpttrs.
Pointers into the local memory to arrays of dimension
(desca(nb_)) each.
These arrays contain details of the factorization as returned
by p?pttrf
If desca(dtype_) = 501 or 502, then dlen_ ≥ 7;
b (local) Same type as d, e.
b(lld_b, LOCc(nrhs)).
1850
ib (global) INTEGER. The row index in the global array B that
points to the first row of the matrix to be operated on (which
may be either all of B or a submatrix of B).
af, work (local) REAL for pspttrs
DOUBLE PRECISION for pdpttrs
COMPLEX for pcpttrs
DOUBLE COMPLEX for pzpttrs.
Arrays of dimension (laf) and (lwork), respectively The
array af contains auxiliary Fillin space. Fillin is created
during the factorization routine p?pttrf and this is stored
in af.
The array work is a workspace array.
Must be laf ≥ NB+2.
If laf is not large enough, an error code is returned and
the minimum acceptable size will be returned in af(1).
be at least
lwork ≥ (10+2*min(100,nrhs))*NPCOL+4*nrhs.
Output Parameters
b On exit, this array contains the local pieces of the solution
info < 0:
if the i-th argument is an array and the j-th entry had an
illegal value, then info = -(i*100+j);
if the i-th argument is a scalar and had an illegal value,
then info = -i.
1851
p?dttrs
diagonally dominant-like tridiagonal distributed
matrix using the factorization computed by p?dt-
trf.
Syntax
call psdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pddttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pcdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pzdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
Description
The routine p?dttrs solves for X one of the systems of equations:
sub(A)*X = sub(B),
(sub(A))T*X = sub(B), or
(sub(A))H*X = sub(B),
where sub(A) = (1:n, ja:ja+n-1); is a diagonally dominant-like tridiagonal distributed
matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
This routine uses the LU factorization computed by p?dttrf.
Input Parameters
If trans = 'T', then (sub(A))T*X = sub(B) is solved
for X.
1852
If trans = 'C', then (sub(A))H*X = sub(B) is solved
for X.
sub(A) (n ≥ 0).
(nrhs ≥ 0).
dl, d, du (local)
REAL for psdttrs
DOUBLE PRECISON for pddttrs
COMPLEX for pcdttrs
DOUBLE COMPLEX for pzdttrs.
Pointers to the local arrays of dimension (desca(nb_))
each.
On entry, these arrays contain details of the factorization.
Globally, dl(1) and du(n) are not referenced; dl and du
must be aligned with d.
desca(dtype_) = 501 or 502, then dlen_ ≥ 7;
b (local) Same type as d.
b(lld_b,LOCc(nrhs)).
1853

af, work (local) REAL for psdttrs
DOUBLE PRECISION for pddttrs
COMPLEX for pcdttrs
DOUBLE COMPLEX for pzdttrs.
Arrays of dimension (laf) and (lwork), respectively.
The array af contains auxiliary Fillin space. Fillin is created
during the factorization routine p?dttrf and this is stored
in af. If a linear system is to be solved using p?dttrsafter
the factorization routine, af must not be altered.
Must be laf ≥
NB*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
be at least lwork ≥ 10*NPCOL+4*nrhs.
Output Parameters
0:
if the ith argument is an array and the j-th entry had an
-i.
1854
p?dbtrs
diagonally dominant-like banded distributed matrix
using the factorization computed by p?dbtrf.
Syntax
call psdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pddbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pcdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pzdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
Description
The routine p?dbtrs solves for X one of the systems of equations:
sub(A)*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is a diagonally dominant-like banded distributed matrix,
and sub(B) denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
This routine uses the LU factorization computed by p?dbtrf.
Input Parameters
If trans = 'T', then (sub(A))T*X = sub(B) is solved
for X.
1855
If trans = 'C', then (sub(A))H*X = sub(B) is solved

for X.
sub(A) (n ≥ 0).
bwl (global) INTEGER. The number of subdiagonals within the
band of A
( 0 ≤ bwl ≤ n-1 ).
bwu (global) INTEGER. The number of superdiagonals within the
band of A
( 0 ≤ bwu ≤ n-1 ).
(nrhs ≥ 0).
a, b (local)
REAL for psdbtrs
DOUBLE PRECISON for pddbtrs
COMPLEX for pcdbtrs
DOUBLE COMPLEX for pzdbtrs.
respectively.
On entry, the array a contains details of the LU factorization
of the band matrix A, as computed by p?dbtrf.
hand side distributed matrix sub(B).
1856
af, work (local)
REAL for psdbtrs
DOUBLE PRECISION for pddbtrs
COMPLEX for pcdbtrs
DOUBLE COMPLEX for pzdbtrs.
Arrays of dimension (laf) and (lwork), respectively The
array af contains auxiliary Fillin space. Fillin is created
during the factorization routine p?dbtrf and this is stored
in af.
Must be laf ≥ NB*(bwl+bwu)+6*(max(bwl,bwu))2 .
be at least
lwork ≥ (max(bwl,bwu))2.
Output Parameters
0:
1857
if the ith argument is an array and the j-th entry had an

-i.
p?trtrs
triangular distributed matrix.
Syntax
call pstrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
info)
call pdtrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
info)
call pctrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
info)
call pztrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
info)
Description
The routine solves for X one of the following systems of linear equations:
sub(A)*X = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a triangular distributed matrix of order n,
and sub(B) denotes the distributed matrix B(ib:ib+n-1, jb:jb+nrhs-1).
A check is made to verify that sub(A) is nonsingular.
Input Parameters
Indicates whether sub(A) is upper or lower triangular:
1858
If uplo = 'U', then sub(A) is upper triangular.
If uplo = 'L', then sub(A) is lower triangular.
X.
If trans = 'C', then sub(A)H*X = sub(B) is solved for
X.
diag (global) CHARACTER*1. Must be 'N' or 'U'.
If diag = 'N', then sub(A) is not a unit triangular matrix.
If diag = 'U', then sub(A) is unit triangular.
sub(A) (n≥0).
nrhs (global) INTEGER. The number of right-hand sides; i.e., the
number of columns of the distributed matrix sub(B)
(nrhs≥0).
a, b (local)
REAL for pstrtrs
DOUBLE PRECISION for pdtrtrs
COMPLEX for pctrtrs
DOUBLE COMPLEX for pztrtrs.
The array a contains the local pieces of the distributed
triangular matrix sub(A).
sub(A) contains the upper triangular matrix, and the strictly
lower triangular part of sub(A) is not referenced.
sub(A) contains the lower triangular matrix, and the strictly
upper triangular part of sub(A) is not referenced.
If diag = 'U', the diagonal elements of sub(A) are also
not referenced and are assumed to be 1.
1859

hand side distributed matrix sub(B).
Output Parameters
b On exit, if info=0, sub(B) is overwritten by the solution
matrix X.
info < 0:
if the ith argument is an array and the jth entry had an
is a scalar and had an illegal value, then info = -i;
info > 0:
if info = i, the i-th diagonal element of sub(A) is zero,
indicating that the submatrix is singular and the solutions
X have not been computed.
Routines for Estimating the Condition Number

This section describes the ScaLAPACK routines for estimating the condition number of a matrix.
The condition number is used for analyzing the errors in the solution of a system of linear
equations. Since the condition number may be arbitrarily large when the matrix is nearly
singular, the routines actually compute the reciprocal condition number.
1860
p?gecon
of a general distributed matrix in either the 1-norm
or the infinity-norm.
Syntax
call psgecon( norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork,
liwork, info)
call pdgecon( norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork,
liwork, info)
call pcgecon( norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork,
lrwork, info)
call pzgecon( norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork,
lrwork, info)
Description
The routine estimates the reciprocal of the condition number of a general distributed real/complex
matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) in either the 1-norm or infinity-norm, using
the LU factorization computed by p?getrf.
An estimate is obtained for ||(sub(A))-1||, and the reciprocal of the condition number is
computed as
Input Parameters
norm (global) CHARACTER*1. Must be '1' or 'O' or 'I'.
Specifies whether the 1-norm condition number or the
infinity-norm condition number is required.
If norm = '1' or 'O', then the 1-norm is used;
If norm = 'I', then the infinity-norm is used.
1861

sub(A) (n ≥ 0).
a (local)
REAL for psgecon
DOUBLE PRECISION for pdgecon
COMPLEX for pcgecon
DOUBLE COMPLEX for pzgecon.
a(lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the factors L and U
from the factorization sub(A) = P*L*U; the unit diagonal
elements of L are not stored.
anorm (global) REAL for single precision flavors, DOUBLE
PRECISION for double precision flavors.
If norm = '1' or 'O', the 1-norm of the original distributed
matrix sub(A);
If norm = 'I', the infinity-norm of the original distributed
matrix sub(A).
work (local)
REAL for psgecon
DOUBLE PRECISION for pdgecon
COMPLEX for pcgecon
DOUBLE COMPLEX for pzgecon.
The array work of dimension (lwork) is a workspace array.
lwork (local or global) INTEGER. The dimension of the array work.
For real flavors:
lwork must be at least
1862
lwork ≥ 2*LOCr(n+mod(ia-1,mb_a))+
2*LOCc(n+mod(ja-1,nb_a))+ max(2, max(nb_a*max(1,
ceil(NPROW-1, NPCOL)),
LOCc(n+mod(ja-1,nb_a))+nb_a*max(1, ceil(NPCOL-1,
NPROW))).
lwork ≥ 2*LOCr(n+mod(ia-1,mb_a))+ max(2,
max(nb_a*ceil(NPROW-1, NPCOL),
LOCc(n+mod(ja-1,nb_a))+ nb_a*ceil(NPCOL-1,
NPROW))).
LOCr and LOCc values can be computed using the ScaLAPACK
tool function numroc; NPROW and NPCOL can be determined
by calling the subroutine blacs_gridinfo.
iwork (local) INTEGER. Workspace array, DIMENSION (liwork).
Used in real flavors only.
liwork (local or global) INTEGER. The dimension of the array iwork;
used in real flavors only. Must be at least
liwork ≥ LOCr(n+mod(ia-1,mb_a)).
rwork (local) REAL for pcgecon
DOUBLE PRECISION for pzgecon
Workspace array, DIMENSION (lrwork). Used in complex
flavors only.
lrwork (local or global) INTEGER. The dimension of the array rwork;
used in complex flavors only. Must be at least
lrwork ≥ 2*LOCc(n+mod(ja-1,nb_a)).
Output Parameters
rcond (global) REAL for single precision flavors.
The reciprocal of the condition number of the distributed
matrix sub(A). See Description.
1863
iwork(1) On exit, iwork(1) contains the minimum value of liwork

required for optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork
required for optimum performance (for complex flavors).
info (global) INTEGER. If info=0, the execution is successful.
info < 0:
-i.
p?pocon
(in the 1 - norm) of a symmetric / Hermitian
positive-definite distributed matrix.
Syntax
call pspocon( uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork,
liwork, info)
call pdpocon( uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork,
liwork, info)
call pcpocon( uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork,
lrwork, info)
call pzpocon( uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork,
lrwork, info)
Description
The routine estimates the reciprocal of the condition number (in the 1 - norm) of a real
symmetric or complex Hermitian positive definite distributed matrix sub(A) = A(ia:ia+n-1,
ja:ja+n-1), using the Cholesky factorization sub(A) = UH*U or sub(A) = L*LH computed
by p?potrf.
An estimate is obtained for ||(sub(A))-1||, and the reciprocal of the condition number is
computed as
1864
Input Parameters
Specifies whether the factor stored in sub(A) is upper or
lower triangular.
If uplo = 'U', sub(A) stores the upper triangular factor U
of the Cholesky factorization sub(A) = UH*U.
If uplo = 'L', sub(A) stores the lower triangular factor L
of the Cholesky factorization sub(A) = L*LH.
sub(A) (n≥0).
a (local)
REAL for pspocon
DOUBLE PRECISION for pdpocon
COMPLEX for pcpocon
DOUBLE COMPLEX for pzpocon.
The array a contains the local pieces of the factors L or U
from the Cholesky factorization sub(A) = UH*U, or sub(A)
= L*LH, as computed by p?potrf.
anorm (global) REAL for single precision flavors,
The 1-norm of the symmetric/Hermitian distributed matrix
sub(A).
work (local)
1865
REAL for pspocon

DOUBLE PRECISION for pdpocon
COMPLEX for pcpocon
DOUBLE COMPLEX for pzpocon.
For real flavors:
2*LOCc(n+mod(ja-1,nb_a))+ max(2,
LOCc(n+mod(ja-1,nb_a))+
nb_a*ceil(NPCOL-1, NPROW))).
max(nb_a*max(1,ceil(NPROW-1, NPCOL)),
LOCc(n+mod(ja-1,nb_a))+ nb_a*max(1,ceil(NPCOL-1,
NPROW)))).
used in real flavors only. Must be at least liwork ≥
LOCr(n+mod(ia-1,mb_a)).
rwork (local) REAL for pcpocon
DOUBLE PRECISION for pzpocon
flavors only.
used in complex flavors only. Must be at least lrwork ≥
2*LOCc(n+mod(ja-1,nb_a)).
Output Parameters
1866
matrix sub(A).
info < 0:
p?trcon
of a triangular distributed matrix in either 1-norm
or infinity-norm.
Syntax
call pstrcon( norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork,
call pdtrcon( norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork,
call pctrcon( norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork,
rwork, lrwork, info)
call pztrcon( norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork,
rwork, lrwork, info)
Description
The routine estimates the reciprocal of the condition number of a triangular distributed matrix
sub(A) = A(ia:ia+n-1, ja:ja+n-1), in either the 1-norm or the infinity-norm.
1867
The norm of sub(A) is computed and an estimate is obtained for ||(sub(A))-1||, then the
reciprocal of the condition number is computed as
Input Parameters
norm (global) CHARACTER*1. Must be '1' or 'O' or 'I'.
Specifies whether the 1-norm condition number or the
infinity-norm condition number is required.
If norm = '1' or 'O', then the 1-norm is used;
If norm = 'I', then the infinity-norm is used.
If uplo = 'U', sub(A) is upper triangular. If uplo = 'L',
sub(A) is lower triangular.
diag (global) CHARACTER*1. Must be 'N' or 'U'.
If diag = 'N', sub(A) is non-unit triangular. If diag =
'U', sub(A) is unit triangular.
sub(A), (n≥0).
a (local)
REAL for pstrcon
DOUBLE PRECISION for pdtrcon
COMPLEX for pctrcon
DOUBLE COMPLEX for pztrcon.
The array a contains the local pieces of the triangular
distributed matrix sub(A).
this distributed matrix contains the upper triangular matrix,
and its strictly lower triangular part is not referenced.
1868
this distributed matrix contains the lower triangular matrix,
and its strictly upper triangular part is not referenced.
work (local)
REAL for pstrcon
DOUBLE PRECISION for pdtrcon
COMPLEX for pctrcon
DOUBLE COMPLEX for pztrcon.
For real flavors:
LOCc(n+mod(ja-1,nb_a))+ max(2,
max(nb_a*max(1,ceil(NPROW-1, NPCOL)),
LOCc(n+mod(ja-1,nb_a))+ nb_a*max(1,ceil(NPCOL-1,
NPROW))).
LOCc(n+mod(ja-1,nb_a))+ nb_a*ceil(NPCOL-1,
NPROW))).
liwork ≥ LOCr(n+mod(ia-1,mb_a)).
rwork (local) REAL for pcpocon
1869
DOUBLE PRECISION for pzpocon

flavors only.
used in complex flavors only. Must be at least
lrwork ≥ LOCc(n+mod(ja-1,nb_a)).
Output Parameters
matrix sub(A).
info < 0:
-i.
Refining the Solution and Estimating Its Error

This section describes the ScaLAPACK routines for refining the computed solution of a system
of linear equations and estimating the solution error. You can call these routines after factorizing
the matrix of the system of equations and computing the solution (see Routines for Matrix
Factorization and Solving Systems of Linear Equations).
1870
p?gerfs
Improves the computed solution to a system of
linear equations and provides error bounds and
backward error estimates for the solution.
Syntax
call psgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv,
b, ib, jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork,
info)
call pdgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv,
b, ib, jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork,
info)
call pcgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv,
b, ib, jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork,
info)
call pzgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv,
b, ib, jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork,
info)
Description
The routine improves the computed solution to one of the systems of linear equations
sub(A)*sub(X) = sub(B),
sub(A)T*sub(X) = sub(B), or
sub(A)H*sub(X) = sub(B) and provides error bounds and backward error estimates for the
solution.
Here sub(A) = A(ia:ia+n-1, ja:ja+n-1), sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1),
and sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1).
Input Parameters
1871
If trans = 'N', the system has the form sub(A)*sub(X)

= sub(B) (No transpose);
If trans = 'T', the system has the form sub(A)T*sub(X)
= sub(B) (Transpose);
If trans = 'C', the system has the form sub(A)H*sub(X)
= sub(B) (Conjugate transpose).
sub(A) (n ≥ 0).
nrhs (global) INTEGER. The number of right-hand sides, i.e., the
number of columns of the matrices sub(B) and sub(X) (nrhs
≥ 0).
a, af, b, x (local)
REAL for psgerfs
DOUBLE PRECISION for pdgerfs
COMPLEX for pcgerfs
DOUBLE COMPLEX for pzgerfs.
a(lld_a,
LOCc(ja+n-1)), af(lld_af,LOCc(jaf+n-1)),
b(lld_b,LOCc(jb+nrhs-1)), and
x(lld_x,LOCc(jx+nrhs-1)), respectively.
The array a contains the local pieces of the distributed
matrix sub(A).
The array af contains the local pieces of the distributed
factors of the matrix sub(A) = P*L*U as computed by
p?getrf.
The array b contains the local pieces of the distributed
matrix of right hand sides sub(B).
On entry, the array x contains the local pieces of the
distributed solution matrix sub(X).
1872
iaf, jaf (global) INTEGER. The row and column indices in the global
array AF indicating the first row and the first column of the
submatrix sub(AF), respectively.
descaf (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix AF.
ix, jx (global) INTEGER. The row and column indices in the global
array X indicating the first row and the first column of the
submatrix sub(X), respectively.
descx (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix X.
ipiv (local) INTEGER.
Array, dimension LOCr(m_af + mb_af.
This array contains pivoting information as computed by
p?getrf. If ipiv(i)=j, then the local row i was swapped
with the global row j.
work (local)
REAL for psgerfs
DOUBLE PRECISION for pdgerfs
COMPLEX for pcgerfs
DOUBLE COMPLEX for pzgerfs.
For real flavors:
lwork ≥ 3*LOCr(n+mod(ia-1,mb_a))
1873

liwork ≥ LOCr(n+mod(ib-1,mb_b)).
rwork (local) REAL for pcgerfs
DOUBLE PRECISION for pzgerfs
flavors only.
LOCr(n+mod(ib-1,mb_b))).
Output Parameters
x On exit, contains the improved solution vectors.
Arrays, dimension LOCc(jb+nrhs-1) each.
The array ferr contains the estimated forward error bound
for each solution vector of sub(X).
If XTRUE is the true solution corresponding to sub(X), ferr
is an estimated upper bound for the magnitude of the largest
element in (sub(X) - XTRUE) divided by the magnitude
of the largest element in sub(X). The estimate is as reliable
as the estimate for rcond, and is almost always a slight
overestimate of the true error.
This array is tied to the distributed matrix X.
The array berr contains the component-wise relative
backward error of each solution vector (that is, the smallest
relative change in any entry of sub(A) or sub(B) that makes
sub(X) an exact solution). This array is tied to the distributed
matrix X.
1874
info < 0:
-i.
p?porfs
Improves the computed solution to a system of
linear equations with symmetric/Hermitian positive
definite distributed matrix and provides error
bounds and backward error estimates for the
solution.
Syntax
call psporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib,
jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib,
jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pcporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib,
jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pzporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib,
jb, descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Description
The routine p?porfs improves the computed solution to the system of linear equations
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a real symmetric or complex Hermitian positive
definite distributed matrix and
sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1),
sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1)
1875
are right-hand side and solution submatrices, respectively. This routine also provides error
bounds and backward error estimates for the solution.
Input Parameters
symmetric/Hermitian matrix sub(A) is stored.
n (global) INTEGER. The order of the distributed matrix sub(A)
(n≥0).
nrhs (global) INTEGER. The number of right-hand sides, i.e., the
number of columns of the matrices sub(B) and sub(X)
(nrhs≥0).
a, af, b, x (local)
REAL for psporfs
DOUBLE PRECISION for pdporfs
COMPLEX for pcporfs
DOUBLE COMPLEX for pzporfs.
a(lld_a,LOCc(ja+n-1)), af(lld_af,LOCc(ja+n-1)),
b(lld_b,LOCc(jb+nrhs-1)), and
x(lld_x,LOCc(jx+nrhs-1)), respectively.
The array a contains the local pieces of the n-by-n
symmetric/Hermitian distributed matrix sub(A).
sub(A) contains the upper triangular part of the matrix, and
its strictly lower triangular part is not referenced.
sub(A) contains the lower triangular part of the distributed
matrix, and its strictly upper triangular part is not
referenced.
The array af contains the factors L or U from the Cholesky
factorization sub(A) = L*LH or sub(A) = UH*U, as
computed by p?potrf.
distributed matrix of right hand sides sub(B).
1876
On entry, the array x contains the local pieces of the solution
vectors sub(X).
array AF indicating the first row and the first column of the
submatrix sub(AF), respectively.
work (local)
REAL for psporfs
DOUBLE PRECISION for pdporfs
COMPLEX for pcporfs
DOUBLE COMPLEX for pzporfs.
lwork (local) INTEGER. The dimension of the array work.
For real flavors:
1877

rwork (local) REAL for pcporfs
DOUBLE PRECISION for pzporfs
flavors only.
Output Parameters
x On exit, contains the improved solution vectors.
element in (sub(X) - XTRUE)divided by the magnitude of
the largest element in sub(X). The estimate is as reliable
as the estimate for rcond, and is almost always a slight
matrix X.
1878
info < 0:
-i.
p?trrfs
Provides error bounds and backward error
estimates for the solution to a system of linear
equations with a distributed triangular coefficient
matrix.
Syntax
call pstrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdtrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pctrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pztrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Description
The routine p?trrfs provides error bounds and backward error estimates for the solution to
one of the systems of linear equations
sub(A)T*sub(X) = sub(B), or
sub(A)H*sub(X) = sub(B) ,
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a triangular matrix,
1879
sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1), and

sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1).
The solution matrix X must be computed by p?trtrs or some other means before entering
this routine. The routine p?trrfs does not do iterative refinement because doing so cannot
improve the backward error.
Input Parameters
If trans = 'N', the system has the form sub(A)*sub(X)
= sub(B) (No transpose);
If trans = 'T', the system has the form sub((A)T*sub(X)
= sub(B) (Transpose);
If trans = 'C', the system has the form sub(A)H*sub(X)
= sub(B) (Conjugate transpose).
If diag = 'N', then sub(A) is non-unit triangular.
(n≥0).
nrhs (global) INTEGER. The number of right-hand sides, that is,
the number of columns of the matrices sub(B) and sub(X)
(nrhs≥0).
a, b, x (local)
REAL for pstrrfs
DOUBLE PRECISION for pdtrrfs
COMPLEX for pctrrfs
DOUBLE COMPLEX for pztrrfs.
a(lld_a,LOCc(ja+n-1)), b(lld_b,LOCc(jb+nrhs-1)),
and x(lld_x,LOCc(jx+nrhs-1)), respectively.
1880
The array a contains the local pieces of the original triangular
referenced.
distributed matrix of right hand sides sub(B).
On entry, the array x contains the local pieces of the solution
vectors sub(X).
work (local)
REAL for pstrrfs
DOUBLE PRECISION for pdtrrfs
COMPLEX for pctrrfs
DOUBLE COMPLEX for pztrrfs.
1881

For real flavors:
lwork must be at least lwork ≥
3*LOCr(n+mod(ia-1,mb_a))
rwork (local) REAL for pctrrfs
DOUBLE PRECISION for pztrrfs
flavors only.
Output Parameters
element in (sub(X) - XTRUE) divided by the magnitude of
the largest element in sub(X). The estimate is as reliable as
the estimate for rcond, and is almost always a slight
1882
matrix X.
info < 0:
-i.
Routines for Matrix Inversion

This sections describes ScaLAPACK routines that compute the inverse of a matrix based on the
previously obtained factorization. Note that it is not recommended to solve a system of equations
Ax = b by first computing A-1 and then forming the matrix-vector product x = A-1b. Call a
solver routine instead (see Solving Systems of Linear Equations); this is more efficient and
more accurate.
p?getri
Computes the inverse of a LU-factored distributed
matrix.
Syntax
call psgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pdgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pcgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pzgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
1883
Description
The routine computes the inverse of a general distributed matrix sub(A) = A(ia:ia+n-1,
ja:ja+n-1) using the LU factorization computed by p?getrf. This method inverts U and then
computes the inverse of sub(A) by solving the system
inv(sub(A))*L = inv(U)
for inv(sub(A).
Input Parameters
sub(A) (n≥0).
a (local)
REAL for psgetri
DOUBLE PRECISION for pdgetri
COMPLEX for pcgetri
DOUBLE COMPLEX for pzgetri.
On entry, the array a contains the local pieces of the L and
U obtained by the factorization sub(A) = P*L*U computed
by p?getrf.
work (local)
REAL for psgetri
DOUBLE PRECISION for pdgetri
COMPLEX for pcgetri
DOUBLE COMPLEX for pzgetri.
lwork (local) INTEGER. The dimension of the array work. lwork
must be at least
1884
lwork≥LOCr(n+mod(ia-1,mb_a))*nb_a.
The array work is used to keep at most an entire column
block of sub(A).
iwork (local) INTEGER. Workspace array used for physically
transposing the pivots, DIMENSION (liwork).
liwork (local or global) INTEGER. The dimension of the array iwork.
The minimal value liwork of is determined by the following
code:
if NPROW == NPCOL then
liwork = LOCc(n_a + mod(ja-1,nb_a))+ nb_a
else
liwork = LOCc(n_a + mod(ja-1,nb_a)) +
max(ceil(ceil(LOCr(m_a)/mb_a)/(lcm/NPROW)),nb_a)
end if
where lcm is the least common multiple of process rows
and columns (NPROW and NPCOL).
Output Parameters
ipiv (local) INTEGER.
Array, dimension (LOCr(m_a)+ mb_a).
This array contains the pivoting information.
If ipiv(i)=j, then the local row i was swapped with the
global row j.
info < 0:
1885

-i.
info > 0:
If info = i, U(i,i) is exactly zero. The factorization has
been completed, but the factor U is exactly singular, and
division by zero will occur if it is used to solve a system of
equations.
p?potri
Computes the inverse of a symmetric/Hermitian
positive definite distributed matrix.
Syntax
call pspotri(uplo, n, a, ia, ja, desca, info)
call pdpotri(uplo, n, a, ia, ja, desca, info)
call pcpotri(uplo, n, a, ia, ja, desca, info)
call pzpotri(uplo, n, a, ia, ja, desca, info)
Description
The routine computes the inverse of a real symmetric or complex Hermitian positive definite
distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) using the Cholesky factorization
sub(A) = UH*U or sub(A) = L*LH computed by p?potrf.
Input Parameters
symmetric/Hermitian matrix sub(A) is stored.
If uplo = 'U', upper triangle of sub(A) is stored. If uplo
= 'L', lower triangle of sub(A) is stored.
1886
sub(A) (n≥0).
a (local)
REAL for pspotri
DOUBLE PRECISION for pdpotri
COMPLEX for pcpotri
DOUBLE COMPLEX for pzpotri.
On entry, the array a contains the local pieces of the
triangular factor U or L from the Cholesky factorization
sub(A) = UH*U, or sub(A) = L*LH, as computed by
p?potrf.
Output Parameters
a On exit, overwritten by the local pieces of the upper or lower
triangle of the (symmetric/Hermitian) inverse of sub(A).
info < 0:
-i.
info > 0:
If info = i, the (i, i) element of the factor U or L is zero,
and the inverse could not be computed.
1887
p?trtri
Computes the inverse of a triangular distributed
matrix.
Syntax
call pstrtri(uplo, diag, n, a, ia, ja, desca, info)
call pdtrtri(uplo, diag, n, a, ia, ja, desca, info)
call pctrtri(uplo, diag, n, a, ia, ja, desca, info)
call pztrtri(uplo, diag, n, a, ia, ja, desca, info)
Description
The routine computes the inverse of a real or complex upper or lower triangular distributed
matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1).
Input Parameters
Specifies whether the distributed matrix sub(A) is upper or
lower triangular.
If uplo = 'U', sub(A) is upper triangular.
If uplo = 'L', sub(A) is lower triangular.
Specifies whether or not the distributed matrix sub(A) is
unit triangular.
If diag = 'N', then sub(A) is non-unit triangular.
sub(A) (n≥0).
a (local)
REAL for pstrtri
DOUBLE PRECISION for pdtrtri
COMPLEX for pctrtri
DOUBLE COMPLEX for pztrtri.
1888
The array a contains the local pieces of the triangular
sub(A) contains the upper triangular matrix to be inverted,
and the strictly lower triangular part of sub(A) is not
referenced.
sub(A) contains the lower triangular matrix, and the strictly
Output Parameters
a On exit, overwritten by the (triangular) inverse of the
original matrix.
info < 0:
-i.
info > 0:
If info = k, A(ia+k-1, ja+k-1) is exactly zero. The
triangular matrix sub(A) is singular and its inverse can not
be computed.
Routines for Matrix Equilibration

ScaLAPACK routines described in this section are used to compute scaling factors needed to
equilibrate a matrix. Note that these routines do not actually scale the matrices.
1889
p?geequ
to equilibrate a general rectangular distributed
matrix and reduce its condition number.
Syntax
call psgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pdgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pcgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pzgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
Description
The routine computes row and column scalings intended to equilibrate an m-by-n distributed
matrix sub(A) = A(ia:ia+m-1, ja:ja+n-1) and reduce its condition number. The output
array r returns the row scale factors and the array c the column scale factors. These factors
are chosen to try to make the largest element in each row and column of the matrix B with
elements bij=r(i)*aij*c(j) have absolute value 1.
r(i) and c(j) are restricted to be between SMLNUM = smallest safe number and BIGNUM = largest
safe number. Use of these scaling factors is not guaranteed to reduce the condition number of
sub(A) but works well in practice.
The auxiliary function p?laqge uses scaling factors computed by p?geequ to scale a general
rectangular matrix.
Input Parameters
m (global) INTEGER. The number of rows to be operated on,
that is, the number of rows of the distributed submatrix
sub(A) (m ≥ 0).
n (global) INTEGER. The number of columns to be operated
on, that is, the number of columns of the distributed
submatrix sub(A) (n ≥ 0).
a (local)
REAL for psgeequ
1890
DOUBLE PRECISION for pdgeequ
COMPLEX for pcgeequ
DOUBLE COMPLEX for pzgeequ .
The array a contains the local pieces of the m-by-n
distributed matrix whose equilibration factors are to be
computed.
Output Parameters
r, c (local) REAL for single precision flavors;
Arrays, dimension LOCr(m_a) and LOCc(n_a), respectively.
If info = 0, or info > ia+m-1, the array r (ia:ia+m-1)
contains the row scale factors for sub(A). r is aligned with
the distributed matrix A, and replicated across every process
column. r is tied to the distributed matrix A.
If info = 0, the array c (ja:ja+n-1) contains the column
scale factors for sub(A). c is aligned with the distributed
matrix A, and replicated down every process row. c is tied
to the distributed matrix A.
rowcnd, colcnd (global) REAL for single precision flavors;
If info = 0 or info > ia+m-1, rowcnd contains the ratio
of the smallest r(i) to the largest r(i) (ia ≤ i ≤
ia+m-1). If rowcnd ≥ 0.1 and amax is neither too large
nor too small, it is not worth scaling by r (ia:ia+m-1).
If info = 0, colcnd contains the ratio of the smallest c(j)
to the largest c(j) (ja ≤ j ≤ ja+n-1).
If colcnd ≥ 0.1, it is not worth scaling by c(ja:ja+n-1).
1891
amax (global) REAL for single precision flavors;

Absolute value of the largest matrix element. If amax is very
close to overflow or very close to underflow, the matrix
should be scaled.
info < 0:
-i.
info > 0:
If info = i and
i≤ m, the ith row of the distributed matrix
sub(A) is exactly zero;
i > m, the (i-m)th column of the distributed
matrix sub(A) is exactly zero.
p?poequ
definite distributed matrix and reduce its condition
number.
Syntax
call pspoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pdpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pcpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pzpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
Description
1892
The routine computes row and column scalings intended to equilibrate a real symmetric or
complex Hermitian positive definite distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1)
and reduce its condition number (with respect to the two-norm). The output arrays sr and sc
return the row and column scale factors
These factors are chosen so that the scaled distributed matrix B with elements
bij=s(i)*aij*s(j) has ones on the diagonal.
This choice of sr and sc puts the condition number of B within a factor n of the smallest possible
condition number over all possible diagonal scalings.
The auxiliary function p?laqsy uses scaling factors computed by p?geequ to scale a general
rectangular matrix.
Input Parameters
sub(A) (n≥0).
a (local)
REAL for pspoequ
DOUBLE PRECISION for pdpoequ
COMPLEX for pcpoequ
DOUBLE COMPLEX for pzpoequ.
The array a contains the n-by-n symmetric/Hermitian
positive definite distributed matrix sub(A) whose scaling
factors are to be computed. Only the diagonal elements of
sub(A) are referenced.
1893
Output Parameters
sr, sc (local)
REAL for single precision flavors;
If info = 0, the array sr(ia:ia+n-1) contains the row
scale factors for sub(A). sr is aligned with the distributed
matrix A, and replicated across every process column. sr
is tied to the distributed matrix A.
If info = 0, the array sc (ja:ja+n-1) contains the
column scale factors for sub(A). sc is aligned with the
distributed matrix A, and replicated down every process
row. sc is tied to the distributed matrix A.
scond (global)
If info = 0, scond contains the ratio of the smallest sr(i)
( or sc(j)) to the largest sr(i) ( or sc(j)), with
ia≤i≤ia+n-1 and ja≤j≤ja+n-1.
If scond ≥ 0.1 and amax is neither too large nor too small,
it is not worth scaling by sr ( or sc ).
amax (global)
Absolute value of the largest matrix element. If amax is very
close to overflow or very close to underflow, the matrix
should be scaled.
info < 0:
-i.
info > 0:
If info = k, the k-th diagonal entry of sub(A) is nonpositive.
1894
This section describes the ScaLAPACK routines for the QR (RQ) and LQ (QL) factorization of
matrices. Routines for the RZ factorization as well as for generalized QR and RQ factorizations
are also included. For the mathematical definition of the factorizations, see the respective
LAPACK sections or refer to [SLUG].
Table 6-3 lists ScaLAPACK routines that perform orthogonal factorization of matrices.
Table 6-3 Computational Routines for Orthogonal Factorizations
Matrix type, Factorize without Factorize with Generate Apply matrix Q
factorization pivoting pivoting matrix Q
general matrices, QR p?geqrf p?geqpf p?orgqr p?ormqrp?un-
factorization p?ungqr mqr
general matrices, RQ p?gerqf p?orgrqp?un- p?ormrqp?un-
factorization grq mrq
general matrices, LQ p?gelqf p?or- p?ormlqp?un-
factorization glqp?unglq mlq
general matrices, QL p?geqlf p?orgqlp?ungql p?ormqlp?un-
factorization mql
trapezoidal matrices, p?tzrzf p?ormrzp?un-
RZ factorization mrz
pair of matrices, p?ggqrf
generalized QR
factorization
pair of matrices, p?ggrqf
generalized RQ
factorization
1895
p?geqrf
matrix.
Syntax
call psgeqrf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeqrf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeqrf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeqrf( m, n, a, ia, ja, desca, tau, work, lwork, info)
Description
This routine forms the QR factorization of a general m-by-n distributed matrix sub(A)=
A(ia:ia+m-1,ja:ja+n-1) as
A=Q*R
Input Parameters
submatrix sub(A); (m ≥ 0).
submatrix sub(A); (n ≥ 0).
a (local)
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf
COMPLEX for pcgeqrf
DOUBLE COMPLEX for pzgeqrf.
to be factored.
array a indicating the first row and the first column of the
submatrix A(ia:ia+m-1,ja:ja+n-1), respectively.
1896
array descriptor for the distributed matrix A
work (local).
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf.
COMPLEX for pcgeqrf.
DOUBLE COMPLEX for pzgeqrf
Workspace array of dimension lwork.
lwork (local or global) INTEGER, dimension of work, must be at
least lwork ≥ nb_a * (mp0+nq0+nb_a), where
iroff = mod(ia-1, mb_a), icoff = mod(ja-1, nb_a),
iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),
iacol = indxg2p(ja, nb_a, MYCOL, csrc_a, NPCOL),
mp0 = numroc(m+iroff, mb_a, MYROW, iarow, NPROW),
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL),
and numroc, indxg2p are ScaLAPACK tool functions; MYROW,
MYCOL, NPROW and NPCOL can be determined by calling the
subroutine blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace
query is assumed; the routine only calculates the minimum
and optimal size for all work arrays. Each of these values
is returned in the first entry of the corresponding work array,
and no error message is issued by pxerbla.
Output Parameters
a The elements on and above the diagonal of sub(A) contain
the min(m,n)-by-n upper trapezoidal matrix R (R is upper
triangular if m ≥ n); the elements below the diagonal, with
a product of elementary reflectors (see Application Notes
below).
tau (local)
REAL for psgeqrf
COMPLEX for pcgeqrf
1897
Array, DIMENSION LOCc(ja+min(m,n)-1).

Contains the scalar factor tau of elementary reflectors. tau
= 0, the execution is successful.
< 0, if the i-th argument is an array and the j-entry had
an illegal value, then info = - (i* 100+j), if the i-th
-i.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ja)*H(ja+1)*...*H(ja+k-1),
where k = min(m,n).
H(j) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1; v(i+1:m) is stored on exit in A(ia+i:ia+m-1, ja+i-1), and tau in tau(ja+i-1).
p?geqpf
matrix with pivoting.
Syntax
call psgeqpf( m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
call pdgeqpf( m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
call pcgeqpf( m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
call pzgeqpf( m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
Description
1898
This routine forms the QR factorization with column pivoting of a general m-by-n distributed
matrix sub(A)= A(ia:ia+m-1,ja:ja+n-1) as
sub(A)*P=Q*R
Input Parameters
m (global) INTEGER. The number of rows in the submatrix
sub(A) (m ≥ 0).
n (global) INTEGER. The number of columns in the submatrix
sub(A) (n ≥ 0).
a (local)
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf
COMPLEX for pcgeqpf
DOUBLE COMPLEX for pzgeqpf.
to be factored.
work (local).
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf.
COMPLEX for pcgeqpf.
DOUBLE COMPLEX for pzgeqpf
least
For real flavors:
lwork ≥ max(3,mp0+nq0) + LOCc (ja+n-1) + nq0.
lwork ≥ max(3,mp0+nq0) .
1899
Here
mp0 = numroc(m+iroff, mb_a, MYROW, iarow, NPROW
),
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL),
LOCc (ja+n-1) = numroc(ja+n-1, nb_a,
MYCOL,csrc_a, NPCOL), and numroc, indxg2p are
ScaLAPACK tool functions; MYROW, MYCOL, NPROW and NPCOL
can be determined by calling the subroutine blacs_gridin-
fo.
Output Parameters
a The elements on and above the diagonal of sub(A)contain
the min(m, n)-by-n upper trapezoidal matrix R (R is upper
triangular if m ≥ n); the elements below the diagonal, with
below).
ipiv (local) INTEGER. Array, DIMENSION LOCc(ja+n-1).
ipiv(i) = k, the local i-th column of sub(A)*P was the
global k-th column of sub(A). ipiv is tied to the distributed
matrix A.
tau (local)
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf
COMPLEX for pcgeqpf
DOUBLE COMPLEX for pzgeqpf.
Array, DIMENSION LOCc(ja+min(m, n)-1)).
1900
-i.
Application Notes
Q = H(1)*H(2)*...*H(n),
H = I - tau*v*v'
v(i) = 1; v(i+1:m) is stored on exit in A(ia+i-1:ia+m-1,ja+i-1).
The matrix P is represented in jpvt as follows: if jpvt(j)= i then the j-th column of P is the
i-th canonical unit vector.
p?orgqr
Generates the orthogonal matrix Q of the QR
factorization formed by p?geqrf.
Syntax
call psorgqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
This routine generates the whole or part of m-by-n real distributed matrix Q denoting
A(ia:ia+m-1, ja:ja+n-1) with orthonormal columns, which is defined as the first n columns
of a product of k elementary reflectors of order m
Q= H(1)*H(2)*...*H(k)
1901
as returned by p?geqrf.
Input Parameters
sub(Q) (m ≥ 0).
sub(Q) (m ≥ n ≥ 0).
k (global) INTEGER. The number of elementary reflectors
whose product defines the matrix Q (n ≥ k ≥ 0).
a (local)
REAL for psorgqr
DOUBLE PRECISION for pdorgqr
(lld_a, LOCc(ja+n-1)). The j-th column must contain
the vector which defines the elementary reflector H(j),
ja≤j≤ja +k-1, as returned by p?geqrf in the k columns
of its distributed matrix argument A(ia:*, ja:ja+k-1).
tau (local)
REAL for psorgqr
Array, DIMENSION LOCc(ja+k-1)).
Contains the scalar factor tau (j) of elementary reflectors
H(j) as returned by p?geqrf. tau is tied to the distributed
matrix A.
work (local)
REAL for psorgqr
Workspace array of dimension of lwork.
lwork (local or global) INTEGER, dimension of work.
1902
Must be at least lwork ≥ nb_a*(nqa0 + mpa0 + nb_a),
where
iroffa = mod(ia-1, mb_a), icoffa = mod(ja-1,
nb_a),
mpa0 = numroc(m+iroffa, mb_a, MYROW, iarow,
NPROW),
nqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol,
NPCOL);
indxg2p and numroc are ScaLAPACK tool functions; MYROW,
Output Parameters
a Contains the local pieces of the m-by-n distributed matrix
Q.
< 0: if the i-th argument is an array and the j-entry had
-i.
1903
p?ungqr
Generates the complex unitary matrix Q of the QR
factorization formed by p?geqrf.
Syntax
call pcungqr( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungqr( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
This routine generates the whole or part of m-by-n complex distributed matrix Q denoting
of a product of k elementary reflectors of order m
Q = H(1)*H(2)*...*H(k)
Input Parameters
sub(Q); (m≥0).
sub(Q) (m≥n≥0).
whose product defines the matrix Q (n≥k≥0).
a (local)
COMPLEX for pcungqr
DOUBLE COMPLEX for pzungqr
(lld_a, LOCc(ja+n-1)).The j-th column must contain the
vector which defines the elementary reflector H(j), ja≤ j≤
ja +k-1, as returned by p?geqrf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1).
1904
submatrix A, respectively.
tau (local)
COMPLEX for pcungqr
matrix A.
work (local)
COMPLEX for pcungqr
least lwork ≥ nb_a*(nqa0 + mpa0 + nb_a), where
iroffa = mod(ia-1, mb_a),
icoffa = mod(ja-1, nb_a),
mpa0 = numroc(m+iroffa, mb_a, MYROW, iarow, NPROW),
nqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol, NPCOL)
Output Parameters
Q.
1905

an illegal value, then info = -(i*100+j), if the i-th
-i.
p?ormqr
Multiplies a general matrix by the orthogonal matrix
Q of the QR factorization formed by p?geqrf.
Syntax
call psormqr( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdormqr( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
This routine overwrites the general real m-by-n distributed matrix sub(C) =
C(ic:ic+m-1,jc:jc+n-1) with
side ='L' side ='R'

trans = 'N': Q*sub(C) sub(C)*Q
T T
trans = 'T': Q *sub(C) sub(C)*Q
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k)

as returned by p?geqrf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
side (global) CHARACTER
1906
T
='L': Q or Q is applied from the left.
T
='R': Q or Q is applied from the right.
trans (global) CHARACTER
='N', no transpose, Q is applied.
T
='T', transpose, Q is applied.
matrix sub(C) (m≥0).
matrix sub(C) (n≥0).
whose product defines the matrix Q. Constraints:
If side = 'L', m≥k≥0
If side = 'R', n≥k≥0.
a (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr.
(lld_a, LOCc(ja+k-1)). The j-th column must contain
ja≤j≤ja+k-1, as returned by p?geqrf in the k columns of
its distributed matrix argument A(ia:*, ja:ja+k-1).
A(ia:*, ja:ja+k-1) is modified by the routine but
restored on exit.
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1)
If side ='R', lld_a ≥ max(1, LOCr(ia+n-1)
tau (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr
Array, DIMENSION LOCc(ja+k-1).
1907

matrix A.
c (local)
REAL for psormqr
(lld_c, LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C)
to be factored.
ic, jc (global) INTEGER. The row and column indices in the global
array c indicating the first row and the first column of the
submatrix C, respectively.
descc (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix C.
work (local)
REAL for psormqr
least:
if side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+mpc0)*nb_a)
+ nb_a*nb_a
else if side = 'R',
lwork ≥ max((nb_a*(nb_a-1))/2,
(nqc0+max(npa0+numroc(numroc(n+icoffc, nb_a, 0,
0, NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a) +
nb_a*nb_a
end if
where
lcmq = lcm/NPCOL with lcm = ilcm(NPROW, NPCOL),
1908
npa0= numroc(n+iroffa, mb_a, MYROW, iarow,
NPROW),
iroffc = mod(ic-1, mb_c),
icoffc = mod(jc-1, nb_c),
icrow = indxg2p(ic, mb_c, MYROW, rsrc_c, NPROW),
iccol = indxg2p(jc, nb_c, MYCOL, csrc_c, NPCOL),
mpc0= numroc(m+iroffc, mb_c, MYROW, icrow,
NPROW),
nqc0= numroc(n+icoffc, nb_c, MYCOL, iccol,
NPCOL),
ilcm, indxg2p and numroc are ScaLAPACK tool functions;
MYROW, MYCOL, NPROW and NPCOL can be determined by
calling the subroutine blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or QT*sub(C), or
sub(C)*QT, or sub(C)*Q.
-i.
1909
p?unmqr
Q of the QR factorization formed by p?geqrf.
Syntax
call pcunmqr( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info )
call pzunmqr( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info )
Description
This routine overwrites the general complex m-by-n distributed matrix sub (C) =
side ='L' side ='R'

H H
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k) as returned by p?geqrf. Q is of order m if side = 'L' and of order
n if side ='R'.
Input Parameters
H
H
H
='C', conjugate transpose, Q is applied.
1910
a (local)
COMPLEX for pcunmqr
DOUBLE COMPLEX for pzunmqr.
ja≤j≤ja +k-1, as returned by p?geqrf in the k columns
of its distributed matrix argument A(ia:*, ja:ja+k-1).
A(ia:*, ja:ja+k-1) is modified by the routine but
restored on exit.
If side= 'L', lld_a≥ max(1, LOCr(ia+m-1)
If side = 'R', lld_a≥ max(1, LOCr(ia+n-1)
tau (local)
COMPLEX for pcunmqr
DOUBLE COMPLEX for pzunmqr
matrix A.
c (local)
COMPLEX for pcunmqr
to be factored.
1911
work (local)
COMPLEX for pcunmqr
least:
If side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0 +
mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0 + max(npa0
+ numroc(numroc(n+icoffc, nb_a, 0, 0, NPCOL),
nb_a, 0, 0, lcmq), mpc0))*nb_a) + nb_a*nb_a
end if
where
lcmq = lcm/NPCOL with lcm = ilcm (NPROW, NPCOL),
npa0 = numroc(n+iroffa, mb_a, MYROW, iarow,
NPROW),
mpc0 = numroc(m+iroffc, mb_c, MYROW, icrow,
NPROW),
nqc0 = numroc(n+icoffc, nb_c, MYCOL, iccol,
NPCOL),
1912
Output Parameters
c Overwritten by the product Q*sub(C), or QH*sub(C), or
sub(C)*QH, or sub(C)*Q .
-i.
p?gelqf
Computes the LQ factorization of a general
rectangular matrix.
Syntax
call psgelqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgelqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgelqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgelqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
Description
This routine computes the LQ factorization of a real/complex distributed m-by-n matrix sub(A)=
A(ia:ia+m-1, ia:ia+n-1) = L*Q.
1913
Input Parameters
sub(Q) (m ≥ 0).
sub(Q) (n ≥ 0).
whose product defines the matrix Q (n ≥ k ≥ 0).
a (local)
REAL for psgelqf
DOUBLE PRECISION for pdgelqf
COMPLEX for pcgelqf
DOUBLE COMPLEX for pzgelqf
to be factored.
submatrix A((ia:ia+m-1, ia:ia+n-1), respectively.
work (local)
REAL for psgelqf
COMPLEX for pcgelqf
least lwork ≥ mb_a*(mp0 + nq0 + mb_a), where
iroff = mod(ia-1, mb_a),
icoff = mod(ja-1, nb_a),
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL)
1914
Output Parameters
a The elements on and below the diagonal of sub(A) contain
the m by min(m,n) lower trapezoidal matrix L (L is lower
trapezoidal if m ≤ n); the elements above the diagonal, with
below).
tau (local)
REAL for psgelqf
COMPLEX for pcgelqf
Array, DIMENSION LOCr(ia+min(m, n)-1)).
Contains the scalar factors of elementary reflectors. tau is
-i.
Application Notes
Q = H(ia+k-1)*H(ia+k-2)*...*H(ia),
1915
where k = min(m,n)
H(i) = I - tau*v*v'
v(i) = 1; v(i+1:n) is stored on exit in A(ia+i-1:ia+i-1,ja+n-1), and tau in tau
(ia+i-1).
p?orglq
Generates the real orthogonal matrix Q of the LQ
factorization formed by p?gelqf.
Syntax
call psorglq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorglq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
A(ia:ia+m-1, ja:ja+n-1) with orthonormal rows, which is defined as the first m rows of a
product of k elementary reflectors of order n
Q = H(k)*...* H(2)* H(1)

as returned by p?gelqf.
Input Parameters
sub(Q); (m≥0).
sub(Q) (n≥m≥0).
whose product defines the matrix Q (m≥k≥0).
a (local)
REAL for psorglq
1916
DOUBLE PRECISION for pdorglq
(lld_a, LOCc (ja+n-1)). On entry, the i-th row must
H(i), ia≤i≤ia+k-1, as returned by p?gelqf in the k rows
of its distributed matrix argument A(ia:ia+k -1, ja:*).
submatrix A((ia:ia+m-1, ja:ja+n-1), respectively.
work (local)
REAL for psorglq
least lwork ≥ mb_a*(mpa0+nqa0+mb_a), where
NPROW),
Output Parameters
a Contains the local pieces of the m-by-n distributed matrix Q
to be factored.
1917
tau (local)
REAL for psorglq
Array, DIMENSION LOCr(ia+k-1).
Contains the scalar factors tau of elementary reflectors H(i).
tau is tied to the distributed matrix A.
an illegal value, then info = -(i* 100+j), if the i-th
-i.
p?unglq
Generates the unitary matrix Q of the LQ
factorization formed by p?gelqf.
Syntax
call pcunglq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzunglq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
Q = (H(k))H...*(H(2))H*(H(1))H as returned by p?gelqf.
Input Parameters
sub(Q) (m≥0).
1918
sub(Q) (n≥m≥0).
a (local)
COMPLEX for pcunglq
DOUBLE COMPLEX for pzunglq
(lld_a, LOCc(ja+n-1)). On entry, the i-th row must
H(i), ia≤i≤ia+k-1, as returned by p?gelqf in the k rows
of its distributed matrix argument A(ia:ia+k-1, ja:*).
tau (local)
COMPLEX for pcunglq
Array, DIMENSION LOCr(ia+k-1)).
Contains the scalar factors tau of elementary reflectors H(i).
tau is tied to the distributed matrix A.
work (local)
COMPLEX for pcunglq
least lwork ≥ mb_a*(mpa0+nqa0+mb_a), where
NPROW),
1919

NPCOL)
Output Parameters
to be factored.
-i.
p?ormlq
Q of the LQ factorization formed by p?gelqf.
Syntax
call psormlq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, work,
lwork, info)
call pdormlq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, work,
lwork, info)
Description
1920
side ='L' side ='R'

T T
Q = H(k)...H(2) H(1)
as returned by p?gelqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
T
T
T
a (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq.
(lld_a, LOCc(ja+m-1)), if side = 'L' and (lld_a,
LOCc(ja+n-1)), if side = 'R'.The i-th row must contain
1921
the vector which defines the elementary reflector H(i),

ia≤i≤ia+k-1, as returned by p?gelqf in the k rows of its
distributed matrix argument A(ia:ia+k-1, ja:*).
A(ia:ia+k-1, ja:*) is modified by the routine but
restored on exit.
tau (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq
Contains the scalar factor tau (i) of elementary reflectors
H(i) as returned by p?gelqf. tau is tied to the distributed
matrix A.
c (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq
to be factored.
work (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq.
lwork (local or global) INTEGER, dimension of the array work;
must be at least:
If side = 'L',
1922
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0+max mqa0)+
numroc(numroc(m + iroffc, mb_a, 0, 0, NPROW),
mb_a, 0, 0, lcmp), nqc0))* mb_a) + mb_a*mb_a
else if side = 'R',
lwork ≥ max((mb_a* (mb_a-1))/2, (mpc0+nqc0)*mb_a
+ mb_a*mb_a
end if
where
lcmp = lcm/NPROW with lcm = ilcm (NPROW, NPCOL),
mqa0 = numroc(m+icoffa, nb_a, MYCOL, iacol,
NPCOL),
NPROW),
NPCOL),
Output Parameters
c Overwritten by the product Q*sub(C), or Q' *sub (C), or
sub(C)*Q', or sub(C)*Q
1923

-i.
p?unmlq
Multiplies a general matrix by the unitary matrix
Q of the LQ factorization formed by p?gelqf.
Syntax
call pcunmlq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info )
call pzunmlq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info )
Description
This routine overwrites the general complex m-by-n distributed matrix sub (C) = C
(ic:ic+m-1,jc:jc+n-1) with
side ='L' side ='R'

H H
Q = H(k)' ... H(2)' H(1)'

as returned by p?gelqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
H
H
1924
H
a (local)
COMPLEX for pcunmlq
DOUBLE COMPLEX for pzunmlq.
(lld_a, LOCc(ja+m-1)), if side = 'L', and (lld_a,
LOCc(ja+n-1)), if side = 'R', where lld_a ≥ max(1,
LOCr (ia+k-1)). The i-th column must contain the vector
which defines the elementary reflector H(i), ia≤i≤ia+k-1,
as returned by p?gelqf in the k rows of its distributed
matrix argument A(ia:ia+k-1, ja:*). A(ia:ia+k-1,
ja:*) is modified by the routine but restored on exit.
tau (local)
COMPLEX for pcunmlq
DOUBLE COMPLEX for pzunmlq
Array, DIMENSION LOCc(ia+k-1)).
H(i) as returned by p?gelqf. tau is tied to the distributed
matrix A.
c (local)
1925
COMPLEX for pcunmlq

to be factored.
work (local)
COMPLEX for pcunmlq
lwork (local or global) INTEGER, dimension of the array work;
must be at least:
If side = 'L',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0 + max mqa0)+
numroc(numroc(m + iroffc, mb_a, 0, 0, NPROW),
mb_a, 0, 0, lcmp), nqc0))*mb_a) + mb_a*mb_a
else if side = 'R',
lwork ≥ max((mb_a* (mb_a-1))/2, (mpc0 +
nqc0)*mb_a + mb_a*mb_a
end if
where
mqa0 = numroc(m + icoffa, nb_a, MYCOL, iacol,
NPCOL),
1926
NPROW),
NPCOL),
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub (C), or
-i.
p?geqlf
Computes the QL factorization of a general matrix.
Syntax
call psgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
1927
Description
This routine forms the QL factorization of a real/complex distributed m-by-n matrix sub(A) =
A(ia:ia+m-1, ja:ja+n-1) = Q*L.
Input Parameters
sub(Q); (m ≥ 0).
sub(Q) (n ≥ 0).
a (local)
REAL for psgeqlf
DOUBLE PRECISION for pdgeqlf
COMPLEX for pcgeqlf
DOUBLE COMPLEX for pzgeqlf
(lld_a, LOCc(ja+n-1)). Contains the local pieces of the
distributed matrix sub(A) to be factored.
submatrix A((ia:ia+m-1, ia:ia+n-1), respectively.
work (local)
REAL for psgeqlf
COMPLEX for pcgeqlf
least lwork ≥ nb_a*(mp0 + nq0 + nb_a), where
1928
numroc and indxg2p are ScaLAPACK tool functions; MYROW,
Output Parameters
a On exit, if m≥n, the lower triangle of the distributed
submatrix A(ia+m-n:ia+m-1, ja:ja+n-1) contains the
n-by-n lower triangular matrix L; if m≤n, the elements on
and below the (n-m)-th superdiagonal contain the m-by-n
lower trapezoidal matrix L; the remaining elements, with
below).
tau (local)
REAL for psgeqlf
COMPLEX for pcgeqlf
Array, DIMENSION LOCc(ja+n-1)).
Contains the scalar factors of elementary reflectors. tau is
1929

-i.
Application Notes
Q = H(ja+k-1)*...*H(ja+1)*H(ja),
where k = min(m,n)
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(m-k+i+1:m) = 0
and v(m-k+i) = 1; v(m-k+i-1) is stored on exit in A(ia+ia+m-k+i-2, ja+n-k+i-1), and
tau in tau (ja+n-k+i-1).
p?orgql
Generates the orthogonal matrix Q of the QL
factorization formed by p?geqlf.
Syntax
call psorgql( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgql( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
Q = H(k)*...*H(2)*H(1)
as returned by p?geqlf.
1930
Input Parameters
sub(Q); (m≥0).
sub(Q)(m≥n≥0).
a (local)
REAL for psorgql
DOUBLE PRECISION for pdorgql
(lld_a, LOCc (ja+n-1)). On entry, the j-th column must
H(j),ja+n-k≤j≤ja+n-1, as returned by p?geqlf in the k
columns of its distributed matrix argument
A(ia:*ja+n-k:ja+n-1).
submatrix A((ia:ia+m-1, ja:ja+n-1), respectively.
tau (local)
REAL for psorgql
Array, DIMENSION LOCc(ja+n-1)).
Contains the scalar factors tau(j) of elementary reflectors
H(j). tau is tied to the distributed matrix A.
work (local)
REAL for psorgql
least lwork ≥ nb_a*(nqa0+mpa0+nb_a), where
1931

NPROW),
NPCOL)
Output Parameters
to be factored.
an illegal value, then info = - (i*100+j), if the i-th
-i.
p?ungql
Generates the unitary matrix Q of the QL
factorization formed by p?geqlf.
Syntax
call pcungql( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungql( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
1932
Description
A(ia:ia+m-1, ja:ja+n-1) with orthonormal rows, which is defined as the first n columns of
a product of k elementary reflectors of order m
Q = (H(k))H...*(H(2))H*(H(1))H as returned by p?geqlf.
Input Parameters
sub(Q) (m≥0).
sub(Q) (m≥n≥0).
a (local)
COMPLEX for pcungql
DOUBLE COMPLEX for pzungql Pointer into the local memory
to an array of local dimension (lld_a, LOCc(ja+n-1)).
On entry, the j-th column must contain the vector which
defines the elementary reflector H(j), ja+n-k≤ j≤ ja+n-1,
as returned by p?geqlf in the k columns of its distributed
matrix argument A(ia:*, ja+n-k: ja+n-1).
tau (local)
COMPLEX for pcungql
DOUBLE COMPLEX for pzungql
Array, DIMENSION LOCr(ia+n-1)).
Contains the scalar factors tau (j) of elementary reflectors
H(j). tau is tied to the distributed matrix A.
1933
work (local)
COMPLEX for pcungql
DOUBLE COMPLEX for pzungql
least lwork ≥ nb_a*(nqa0 + mpa0 + nb_a), where
NPROW),
Output Parameters
to be factored.
-i.
1934
p?ormql
Q of the QL factorization formed by p?geqlf.
Syntax
call psormql( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdormql( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
This routine overwrites the general real m-by-n distributed matrix sub(C) = C
side ='L' side ='R'

T T
Q = H(k)' ... H(2)' H(1)'

as returned by p?geqlf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
T
T
T
1935

a (local)
REAL for psormql
DOUBLE PRECISION for pdormql.
ja≤j≤ja+k-1, as returned by p?gelqf in the k columns of
its distributed matrix argument A(ia:*,
ja:ja+k-1).A(ia:*, ja:ja+k-1) is modified by the
routine but restored on exit.
If side = 'L',lld_a ≥ max(1, LOCr(ia+m-1)),
If side = 'R', lld_a ≥ max(1, LOCr(ia+n-1)).
tau (local)
REAL for psormql
Array, DIMENSION LOCc(ja+n-1) ).
H(j) as returned by p?geqlf. tau is tied to the distributed
matrix A.
c (local)
REAL for psormql
1936
to be factored.
work (local)
REAL for psormql
least:
If side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+mpc0)*nb_a
+ nb_a*nb_a
else if side ='R',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+max npa0)+
numroc(numroc(n+icoffc, nb_a, 0, 0, NPCOL),
end if
where
lcmp = lcm/NPCOL with lcm = ilcm (NPROW, NPCOL),
npa0= numroc(n + iroffa, mb_a, MYROW, iarow,
NPROW),
NPROW),
NPCOL),
1937

Output Parameters
c Overwritten by the product Q* sub(C), or Q'*sub (C), or
sub(C)* Q', or sub(C)* Q
-i.
p?unmql
Q of the QL factorization formed by p?geqlf.
Syntax
call pcunmql( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunmql( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) = C
1938
side ='L' side ='R'
H H
trans = 'C': Q *sub(C) sub(C)*Q
Q = H(k)' ... H(2)' H(1)'

as returned by p?geqlf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
H
H
H
a (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql.
ja≤j≤ja+k-1, as returned by p?geqlf in the k columns of
its distributed matrix argument A(ia:*,
ja:ja+k-1).A(ia:*, ja:ja+k-1) is modified by the
If side = 'L',lld_a ≥ max(1, LOCr(ia+m-1)),
1939
If side = 'R', lld_a ≥ max(1, LOCr(ia+n-1)).

tau (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql
Array, DIMENSION LOCc(ia+n-1)).
H(j) as returned by p?geqlf. tau is tied to the distributed
matrix A.
c (local)
COMPLEX for pcunmql
to be factored.
work (local)
COMPLEX for pcunmql
least:
If side = 'L',
lwork ≥ max((nb_a* (nb_a-1))/2, (nqc0+mpc0)*nb_a
+ nb_a*nb_a
else if side ='R',
1940
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+max npa0)+
numroc(numroc(n+icoffc, nb_a, 0, 0, NPCOL),
end if
where
lcmp = lcm/NPCOL with lcm = ilcm (NPROW, NPCOL),
npa0 = numroc (n + iroffa, mb_a, MYROW, iarow,
NPROW),
NPROW),
NPCOL),
Output Parameters
c Overwritten by the product Q* sub(C), or Q' sub (C), or
1941

-i.
p?gerqf
Computes the RQ factorization of a general
rectangular matrix.
Syntax
call psgerqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgerqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgerqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgerqf( m, n, a, ia, ja, desca, tau, work, lwork, info)
Description
This routine forms the QR factorization of a general m-by-n distributed matrix sub(A)=
A(ia:ia+m-1,ja:ja+n-1) as
A= R*Q
Input Parameters
submatrix sub(A); (m≥0).
submatrix sub(A); (n≥0).
a (local)
REAL for psgeqrf
COMPLEX for pcgeqrf
1942
to be factored.
array descriptor for the distributed matrix A
work (local).
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf.
COMPLEX for pcgeqrf.
DOUBLE COMPLEX for pzgeqrf
least lwork ≥ mb_a*(mp0+nq0+mb_a), where
Output Parameters
a On exit, if m≤n, the upper triangle of A(ia:ia+m-1,
ja:ja+n-1) contains the m-by-m upper triangular matrix R;
if m≥n, the elements on and above the (m - n)-th subdiagonal
contain the m-by-n upper trapezoidal matrix R; the remaining
1943
elements, with the array tau, represent the

reflectors (see Application Notes below).
tau (local)
REAL for psgeqrf
COMPLEX for pcgeqrf
Array, DIMENSION LOCr(ia+m-1).
an illegal value, then info = -(i* 100+j), if the i-th
-i.
Application Notes
Q = H(ia)*H(ia+1)*...*H(ia+k-1),
where k = min(m,n).
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(n-k+i+1:n) = 0
and v(n-k+i) = 1; v(1:n-k+i-1)/conjg (v(1:n-k+i-1)) is stored on exit in
A(ia+m-k+i-1,ja:ja+n-k+i-2), and tau in tau(ia+m-k+i-1).
1944
p?orgrq
Generates the orthogonal matrix Q of the RQ
factorization formed by p?gerqf.
Syntax
call psorgrq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgrq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
A(ia:ia+m-1, ja:ja+n-1) with orthonormal columns, which is defined as the last m rows of
a product of k elementary reflectors of order m
Q= H(1)*H(2)*...*H(k)
as returned by p?gerqf.
Input Parameters
sub(Q); (m≥0).
sub(Q) (n≥m≥0).
a (local)
REAL for psorgrq
DOUBLE PRECISION for pdorgrq
(lld_a, LOCc(ja+n-1)). The i-th column must contain
ja≤j≤ja+k-1, as returned by p?gerqf in the k columns of
its distributed matrix argument A(ia:*, ja:ja+k-1).
1945
tau (local)
REAL for psorgrq
H(i) as returned by p?gerqf. tau is tied to the distributed
matrix A.
work (local)
REAL for psorgrq
least lwork≥mb_a*(mpa0 + nqa0 + mb_a), where
NPROW),
NPCOL)
1946
Output Parameters
Q.
-i.
p?ungrq
Generates the unitary matrix Q of the RQ
factorization formed by p?gerqf.
Syntax
call pcungrq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungrq( m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
This routine generates the m-by-n complex distributed matrix Q denoting
A(ia:ia+m-1,ja:ja+n-1) with orthonormal rows, which is defined as the last m rows of a
Q = (H(1))H*(H(2))H*...*(H(k))H as returned by p?gerqf.
Input Parameters
sub(Q); (m≥0).
sub(Q) (n≥m≥0).
1947

a (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrqc
(lld_a, LOCc(ja+n-1)). The i-th row must contain the
vector which defines the elementary reflector H(i),
ia+m-k≤i≤ia+m-1, as returned by p?gerqf in the k rows
of its distributed matrix argument A(ia+m-k:ia+m-1,
ja:*).
tau (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq
Array, DIMENSION LOCr(ia+m-1)).
matrix A.
work (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq
least lwork ≥ mb_a*(mpa0 +nqa0+mb_a), where
NPROW),
1948
Output Parameters
Q.
-i.
p?ormrq
Q of the RQ factorization formed by p?gerqf.
Syntax
call psormrq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdormrq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
1949
side ='L' side ='R'

T T
Q = H(1) H(2)... H(k)

as returned by p?gerqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
T
T
T
a (local)
REAL for psormqr
(lld_a, LOCc(ja+m-1)) if side = 'L', and (lld_a,
LOCc(ja+n-1)) if side = 'R'.
elementary reflector H(i), ia≤i≤ia+k-1, as returned by
p?gerqf in the k rows of its distributed matrix argument
A(ia:ia+k-1, ja:*).A(ia:ia+k-1, ja:*) is modified
by the routine but restored on exit.
1950
tau (local)
REAL for psormqr
matrix A.
c (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq
to be factored.
work (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq.
least:
If side = 'L',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0 + max(mqa0
+ numroc(numroc(n+iroffc, mb_a, 0, 0, NPROW),
else if side ='R',
1951
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0 +

nqc0)*mb_a) + mb_a*mb_a
end if
where
mqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol,
NPCOL),
NPROW),
NPCOL),
Output Parameters
1952
-i.
p?unmrq
Q of the RQ factorization formed by p?gerqf.
Syntax
call pcunmrq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunmrq( side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
This routine overwrites the general complex m-by-n distributed matrix sub (C)=
side ='L' side ='R'

H H
Q = H(1)' H(2)'... H(k)'

as returned by p?gerqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
H
H
1953
H
a (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq.
LOCc(ja+n-1)) if side = 'R'. The i-th row must contain
ia≤i≤ia+k-1, as returned by p?gerqf in the k rows of its
distributed matrix argument A(ia:ia +k-1,
ja*).A(ia:ia +k-1, ja*) is modified by the routine but
restored on exit.
tau (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq
matrix A.
c (local)
COMPLEX for pcunmrq
1954
to be factored.
work (local)
COMPLEX for pcunmrq
least:
If side = 'L',
max(mqa0+numroc(numroc(n+iroffc, mb_a, 0, 0,
NPROW), mb_a, 0, 0, lcmp), nqc0))*mb_a) +
mb_a*mb_a
else if side = 'R',
end if
where
lcmp = lcm/NPROW with lcm = ilcm(NPROW, NPCOL),
NPCOL),
NPROW),
1955

NPCOL),
Output Parameters
c Overwritten by the product Q* sub(C) or Q'*sub (C), or
-i.
p?tzrzf
Reduces the upper trapezoidal matrix A to upper
triangular form.
Syntax
call pstzrzf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdtzrzf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pctzrzf( m, n, a, ia, ja, desca, tau, work, lwork, info)
call pztzrzf( m, n, a, ia, ja, desca, tau, work, lwork, info)
Description
1956
The routine reduces the m-by-n (m ≤ n) real/complex upper trapezoidal matrix
sub(A)=(ia:ia+m-1,ja:ja+n-1) to upper triangular form by means of orthogonal/unitary
transformations. The upper trapezoidal matrix A is factored as
A = (R 0)*Z,
Input Parameters
sub(A); (m≥0).
sub(A) (n≥0).
a (local)
REAL for pstzrzf
DOUBLE PRECISION for pdtzrzf.
COMPLEX for pctzrzf.
DOUBLE COMPLEX for pztzrzf.
m-by-n distributed matrix sub (A) to be factored.
work (local)
REAL for pstzrzf
least lwork ≥ mb_a*(mp0+nq0+mb_a), where
1957

mp0 = numroc (m+iroff, mb_a, MYROW, iarow,
NPROW),
nq0 = numroc (n+icoff, nb_a, MYCOL, iacol,
NPCOL)
Output Parameters
a On exit, the leading m-by-m upper triangular part of sub(A)
contains the upper triangular matrix R, and elements m+1
to n of the first m rows of sub (A), with the array tau,
represent the orthogonal/unitary matrix Z as a product of
m elementary reflectors.
tau (local)
REAL for pstzrzf
Array, DIMENSION LOCr(ia+m-1)).
Contains the scalar factor of elementary reflectors. tau is
-i.
1958
Application Notes
The factorization is obtained by the Householder's method. The k-th transformation matrix,
Z(k), which is or whose conjugate transpose is used to introduce zeros into the (m - k +1)-th
row of sub(A), is given in the form
where
T(k) = i - tau*u(k)*u(k)',
tau is a scalar and Z(k) is an (n - m) element vector. tau and Z(k) are chosen to annihilate the
elements of the k-th row of sub(A). The scalar tau is returned in the k-th element of tau and
the vector u(k) in the k-th row of sub(A), such that the elements of Z(k) are in a(k, m +
1),..., a(k, n). The elements of R are returned in the upper triangular part of sub(A). Z is
given by
Z = Z(1) * Z(2) *... * Z(m).
p?ormrz
from a reduction to upper triangular form formed
by p?tzrzf.
Syntax
call psormrz( side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdormrz( side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
1959
Description
This routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1,
jc:jc+n-1) with
side ='L' side ='R'

T T
Q = H(1) H(2)... H(k)

as returned by p?tzrzf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
T
T
T
If side = 'L', m ≥ k ≥0
If side = 'R', n ≥ k ≥0.
l (global)
The columns of the distributed submatrix sub(A) containing
If side = 'L', m ≥ l ≥0
If side = 'R', n ≥ l ≥0.
1960
a (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz.
LOCc(ja+n-1)) if side = 'R', where lld_a ≥
max(1,LOCr(ia+k-1).
elementary reflector H(i), ia≤i≤ia+k-1, as returned by
p?tzrzf in the k rows of its distributed matrix argument
A(ia:ia+k-1, ja:*).A(ia:ia+k-1, ja:*) is modified
by the routine but restored on exit.
tau (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz
H(i) as returned by p?tzrzf. tau is tied to the distributed
matrix A.
c (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz
to be factored.
1961
work (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz.
least:
If side = 'L',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0 + max(mqa0
+ numroc(numroc(n+iroffc, mb_a, 0, 0, NPROW),
else if side ='R',
end if
where
nb_a),
mqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol,
NPCOL),
NPROW),
NPCOL),
1962
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub (C), or
-i.
p?unmrz
Multiplies a general matrix by the unitary
transformation matrix from a reduction to upper
triangular form determined by p?tzrzf.
Syntax
call pcunmrz( side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunmrz( side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) =
side ='L' side ='R'

H H
Q = H(1)' H(2)'... H(k)'
1963
as returned by pctzrzf/pztzrzf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
H
H
H
matrix sub(C), (m≥0).
matrix sub(C), (n≥0).
l (global) INTEGER. The columns of the distributed submatrix
sub(A) containing the meaningful part of the Householder
reflectors.
If side = 'L', m≥l≥0
If side = 'R', n≥l≥0.
a (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz.
LOCc(ja+n-1)) if side = 'R', where lld_a≥max(1, LOCr
(ja+k-1). The i-th row must contain the vector which
defines the elementary reflector H(i), ia≤i≤ia+k-1, as
returned by p?gerqf in the k rows of its distributed matrix
argument A(ia:ia +k-1, ja*). A(ia:ia +k-1, ja*)
is modified by the routine but restored on exit.
1964
tau (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz
matrix A.
c (local)
COMPLEX for pcunmrz
to be factored.
work (local)
COMPLEX for pcunmrz
least:
If side = 'L',
lwork ≥ max((mb_a*(mb_a-1))/2,
(mpc0+max(mqa0+numroc(numroc(n+iroffc, mb_a, 0,
0, NPROW), mb_a, 0, 0, lcmp), nqc0))*mb_a) +
mb_a*mb_a
else if side ='R',
1965
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0+nqc0)*mb_a)

+ mb_a*mb_a
end if
where
lcmp = lcm/NPROW with lcm = ilcm(NPROW, NPCOL),
NPCOL),
NPROW),
NPCOL),
Output Parameters
1966
-i.
p?ggqrf
Computes the generalized QR factorization.
Syntax
call psggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
call pdggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
call pcggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
call pzggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
Description
This routine forms the generalized QR factorization of an n-by-m matrix
sub(A) = A(ia:ia+n-1, ja:ja+m-1)

and an n-by-p matrix
sub(B) = B(ib:ib+n-1, jb:jb+p-1):

as
sub(A) = Q*R, sub(B) = Q*T*Z,
where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix, and
R and T assume one of the forms:
If n ≥ m
1967
or if n <m
where R11 is upper triangular, and
where T12 or T21 is an upper triangular matrix.
In particular, if sub(B) is square and nonsingular, the GQR factorization of sub(A) and sub(B)
implicitly gives the QR factorization of inv (sub(B))* sub (A):
inv(sub(B))*sub(A) = ZH*(inv(T)*R
Input Parameters
n (global) INTEGER. The number of rows in the distributed
matrices sub (A) and sub(B) (n≥0).
m (global) INTEGER. The number of columns in the distributed
matrix sub(A) (m≥0).
1968
p INTEGER. The number of columns in the distributed matrix
sub(B) (p≥0).
a (local)
REAL for psggqrf
DOUBLE PRECISION for pdggqrf
COMPLEX for pcggqrf
DOUBLE COMPLEX for pzggqrf.
(lld_a, LOCc(ja+m-1)). Contains the local pieces of the
n-by-m matrix sub(A) to be factored.
b (local)
REAL for psggqrf
COMPLEX for pcggqrf
(lld_b, LOCc(jb+p-1)). Contains the local pieces of the
n-by-p matrix sub(B) to be factored.
array b indicating the first row and the first column of the
submatrix B, respectively.
work (local)
REAL for psggqrf
COMPLEX for pcggqrf
lwork (local or global) INTEGER. Dimension of work, must be at
least
1969
lwork ≥ max(nb_a*(npa0+mqa0+nb_a),
max((nb_a*(nb_a-1))/2,
(pqb0+npb0)*nb_a)+nb_a*nb_a,
mb_b*(npb0+pqb0+mb_b)),
where
iroffa = mod(ia-1, mb_A),
npa0 = numroc (n+iroffa, mb_a, MYROW, iarow,
NPROW),
mqa0 = numroc (m+icoffa, nb_a, MYCOL, iacol,
NPCOL)
iroffb = mod(ib-1, mb_b),
icoffb = mod(jb-1, nb_b),
ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW),
ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL),
npb0 = numroc (n+iroffa, mb_b, MYROW, Ibrow,
NPROW),
pqb0 = numroc(m+icoffb, nb_b, MYCOL, ibcol,
NPCOL)
Output Parameters
a On exit, the elements on and above the diagonal of sub (A)
contain the min(n, m)-by-m upper trapezoidal matrix R (R is
upper triangular if n ≥ m); the elements below the diagonal,
with the array taua, represent the orthogonal/unitary matrix
Q as a product of min(n,m) elementary reflectors. (See
Application Notes below).
1970
taua, taub (local)
REAL for psggqrf
COMPLEX for pcggqrf
Arrays, DIMENSION LOCc(ja+min(n,m)-1)for taua and
LOCr(ib+n-1) for taub.
taua is tied to the distributed matrix A. (See Application
Notes below).
taub is tied to the distributed matrix B. (See Application
Notes below).
-i.
Application Notes
Q = H(ja)*H(ja+1)*...*H(ja+k-1),
where k = min(n,m).
H(i) = i - taua*v*v'
where taua is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1; v(i+1:n) is stored on exit in A(ia+i:ia+n-1,ja+i-1), and taua in
taua(ja+i-1).To form Q explicitly, use ScaLAPACK subroutine p?orgqr/p?ungqr. To use Q
to update another matrix, use ScaLAPACK subroutine p?ormqr/p?unmqr.
The matrix Z is represented as a product of elementary reflectors
1971
Z = H(ib)*H(ib+1)*...*H(ib+k-1), where k = min(n,p).

H(i) = i - taub*v*v'
where taub is a real/complex scalar, and v is a real/complex vector with v(p-k+i+1:p) = 0
and v(p-k+i) = 1; v(1:p-k+i-1) is stored on exit in B(ib+n-k+i-1,jb:jb+p-k+i-2),
and taub in taub(ib+n-k+i-1). To form Z explicitly, use ScaLAPACK subroutine p?or-
grq/p?ungrq. To use Z to update another matrix, use ScaLAPACK subroutine p?ormrq/p?unmrq.
p?ggrqf
Computes the generalized RQ factorization.
Syntax
call psggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
call pdggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
call pcggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
call pzggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work,
lwork, info)
Description
This routine forms the generalized RQ factorization of an m-by-n matrix sub(A)=(ia:ia+m-1,
ja:ja+n-1) and a p-by-n matrix sub(B)=(ib:ib+p-1, ja:ja+n-1):
sub(A) = R*Q, sub(B) = Z*T*Q,
where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix, and
R and T assume one of the forms:
1972
or
where R11 or R21 is upper triangular, and
or
11
where T is upper triangular.
In particular, if sub(B) is square and nonsingular, the GRQ factorization of sub(A) and sub(B)
implicitly gives the RQ factorization of sub (A)*inv(sub(B)):
sub(A)*inv(sub(B))= (R*inv(T))*Z'
where inv(sub(B)) denotes the inverse of the matrix sub(B), and Z' denotes the transpose
(conjugate transpose) of matrix Z.
Input Parameters
matrices sub (A) (m≥0).
p INTEGER. The number of rows in the distributed matrix
sub(B) (p≥0).
1973

matrices sub(A) and sub(B) (n≥0).
a (local)
REAL for psggrqf
DOUBLE PRECISION for pdggrqf
COMPLEX for pcggrqf
DOUBLE COMPLEX for pzggrqf.
m-by-n distributed matrix sub(A) to be factored.
b (local)
REAL for psggrqf
COMPLEX for pcggrqf
(lld_b, LOCc(jb+n-1)).
Contains the local pieces of the p-by-n matrix sub(B) to be
factored.
work (local)
REAL for psggrqf
COMPLEX for pcggrqf
lwork (local or global) INTEGER.
1974
Dimension of work, must be at least lwork ≥
max(mb_a*(mpa0+nqa0+mb_a),
max((mb_a*(mb_a-1))/2, (ppb0+nqb0)*mb_a) +
mb_a*mb_a, nb_b*(ppb0+nqb0+nb_b)), where
iroffa = mod(ia-1, mb_A),
mpa0 = numroc (m+iroffa, mb_a, MYROW, iarow,
NPROW),
nqa0 = numroc (m+icoffa, nb_a, MYCOL, iacol,
NPCOL)
ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW
),
ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL
),
ppb0 = numroc (p+iroffb, mb_b, MYROW, ibrow,
NPROW),
nqb0 = numroc (n+icoffb, nb_b, MYCOL, ibcol,
NPCOL)
Output Parameters
a On exit, if m≤n, the upper triangle of A(ia:ia+m-1,
ja+n-m:ja+n-1) contains the m-by-m upper triangular
matrix R; if m≥n, the elements on and above the (m-n)-th
subdiagonal contain the m-by-n upper trapezoidal matrix R;
1975
the remaining elements, with the array taua, represent the

orthogonal/unitary matrix Q as a product of min(n, m)
elementary reflectors (see Application Notes below).
taua, taub (local)
REAL for psggqrf
COMPLEX for pcggqrf
Arrays, DIMENSION LOCr(ia+m-1)for taua and
LOCc(jb+min(p,n)-1) for taub.
taua is tied to the distributed matrix A.(See Application
Notes below).
taub is tied to the distributed matrix B. (See Application
Notes below).
-i.
Application Notes
Q = H(ia)*H(ia+1)*...*H(ia+k-1),
where k = min(m,n).
H(i) = i - taua*v*v'
1976
where taua is a real/complex scalar, and v is a real/complex vector with v(n-k+i+1:n) = 0
and v(n-k+i) = 1; v(1:n-k+i-1) is stored on exit in A(ia+m-k+i-1, ja:ja+n-k+i-2),
and taua in taua(ia+m-k+i-1). To form Q explicitly, use ScaLAPACK subroutine p?orgrq/p?un-
grq. To use Q to update another matrix, use ScaLAPACK subroutine p?ormrq/p?unmrq.
The matrix Z is represented as a product of elementary reflectors
Z = H(jb)*H(jb+1)*...*H(jb+k-1), where k = min(p,n).

H(i) = i - taub*v*v'
where taub is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and
v(i)= 1; v(i+1:p) is stored on exit in B(ib+i:ib+p-1,jb+i-1), and taub in taub(jb+i-1).
To form Z explicitly, use ScaLAPACK subroutine p?orgqr/p?ungqr. To use Z to update another
matrix, use ScaLAPACK subroutine p?ormqr/p?unmqr.
To solve a symmetric eigenproblem with ScaLAPACK, you usually need to reduce the matrix to
real tridiagonal form T and then find the eigenvalues and eigenvectors of the tridiagonal matrix
T. ScaLAPACK includes routines for reducing the matrix to a tridiagonal form by an orthogonal
(or unitary) similarity transformation A = QTQH as well as for solving tridiagonal symmetric
eigenvalue problems. These routines are listed in Table 6-4 .
There are different routines for symmetric eigenproblems, depending on whether you need
eigenvalues only or eigenvectors as well, and on the algorithm used (either the QTQ algorithm,
or bisection followed by inverse iteration).
Table 6-4 Computational Routines for Solving Symmetric Eigenproblems
Operation Dense Orthogonal/unitary Symmetric
symmetric/Hermitian matrix tridiagonal
matrix matrix
Reduce to tridiagonal form A = p?sytrd/p?hetrd
H
QTQ
Multiply matrix after reduction p?ormtr/p?unmtr
Find all eigenvalues and steqr2*)
eigenvectors of a tridiagonal
matrix T by a QTQ method
Find selected eigenvalues of a p?stebz
tridiagonal matrix T via bisection
1977
Operation Dense Orthogonal/unitary Symmetric

symmetric/Hermitian matrix tridiagonal
matrix matrix
Find selected eigenvectors of a p?stein
tridiagonal matrix T by inverse
iteration
*)
This routine is described as part of auxiliary ScaLAPACK routines.
p?sytrd
Reduces a symmetric matrix to real symmetric
tridiagonal form by an orthogonal similarity
transformation.
Syntax
call pssytrd( uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
call pdsytrd( uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
Description
The routine reduces a real symmetric matrix sub(A) to symmetric tridiagonal form T by an
orthogonal similarity transformation:
Q'*sub(A)*Q = T,
where sub(A) = A(ia:ia+n-1,ja:ja+n-1).
Input Parameters
uplo (global) CHARACTER.
symmetric matrix sub(A) is stored:
If uplo = 'U', upper triangular
If uplo = 'L', lower triangular
(n≥0).
a (local)
REAL for pssytrd
DOUBLE PRECISION for pdsytrd.
1978
(lld_a, LOCc(ja+n-1)). On entry, this array contains
the local pieces of the symmetric distributed matrix sub(A).
sub(A) contains the lower triangular part of the matrix, and
its strictly upper triangular part is not referenced. See
Application Notes below.
work (local)
REAL for pssytrd
least:
lwork ≥ max(NB*(np +1), 3*NB),
where NB = mb_a = nb_a,
np = numroc(n, NB, MYROW, iarow, NPROW),
iarow = indxg2p(ia, NB, MYROW, rsrc_a, NPROW).
1979
Output Parameters
of sub(A) are overwritten by the corresponding elements of
the tridiagonal matrix T, and the elements above the first
superdiagonal, with the array tau, represent the orthogonal
matrix Q as a product of elementary reflectors; if uplo =
'L', the diagonal and first subdiagonal of sub(A) are
the array tau, represent the orthogonal matrix Q as a
product of elementary reflectors. See Application Notes
below.
d (local)
REAL for pssytrd
Arrays, DIMENSION LOCc(ja+n-1) .The diagonal elements
of the tridiagonal matrix T:
d(i)= A(i,i).
d is tied to the distributed matrix A.
e (local)
REAL for pssytrd
Arrays, DIMENSION LOCc(ja+n-1) if uplo = 'U',
LOCc(ja+n-2) otherwise.
e(i)= A(i,i+1) if uplo = 'U',
e(i) = A(i+1,i) if uplo = 'L'.
e is tied to the distributed matrix A.
tau (local)
REAL for pssytrd
Arrays, DIMENSION LOCc(ja+n-1). This array contains the
scalar factors tau of the elementary reflectors. tau is tied
1980
-i.
Application Notes
Q = H(n-1)... H(2) H(1).

H(i) = i - tau * v * v',

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1)
is stored on exit in A(ia:ia+i-2, ja+i), and tau in tau (ja+i-1).
Q = H(1) H(2)... H(n-1).

H(i) = i - tau * v * v',

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n)
is stored on exit in A(ia+i+1:ia+n-1,ja+i-1), and tau in tau(ja+i-1).
The contents of sub(A) on exit are illustrated by the following examples with n = 5:
If uplo = 'U':
If uplo = 'L':
1981
where d and e denote diagonal and off-diagonal elements of T, and v i denotes an element of
the vector defining H(i).
p?ormtr
Multiplies a general matrix by the orthogonal
transformation matrix from a reduction to
tridiagonal form determined by p?sytrd.
Syntax
call psormtr( side, uplo, trans, m, n, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdormtr( side, uplo, trans, m, n, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
This routine overwrites the general real distributed m-by-n matrix sub(C) =
side ='L' side ='R'

T T
where Q is a real orthogonal distributed matrix of order nq, with nq = m if side = 'L' and nq
= n if side = 'R'.
Q is defined as the product of nq elementary reflectors, as returned by p?sytrd.
If uplo = 'U', Q = H(nq-1)... H(2) H(1);
1982
If uplo = 'L', Q = H(1) H(2)... H(nq-1).
Input Parameters
T
T
T
= 'U': Upper triangle of A(ia:*, ja:*) contains
elementary reflectors from p?sytrd;
= 'L': Lower triangle of A(ia:*,ja:*) contains elementary
reflectors from p?sytrd
a (local)
REAL for psormtr
DOUBLE PRECISION for pdormtr.
(lld_a, LOCc(ja+m-1)) if side='L', or (lld_a,
Contains the vectors which define the elementary reflectors,
as returned by p?sytrd.
If side='L', lld_a ≥ max(1,LOCr(ia+m-1));
If side ='R', lld_a ≥ max(1, LOCr(ia+n-1)).
tau (local)
REAL for psormtr
1983

Array, DIMENSION of ltau where
if side = 'L' and uplo = 'U', ltau = LOCc(m_a),
if side = 'L' and uplo = 'L', ltau = LOCc(ja+m-2),
if side = 'R' and uplo = 'U', ltau = LOCc(n_a),
if side = 'R' and uplo = 'L', ltau = LOCc(ja+n-2).
reflector H(i), as returned by p?sytrd. tau is tied to the
distributed matrix A.
c (local) REAL for psormtr
(lld_a, LOCc (ja+n-1)). Contains the local pieces of
the distributed matrix sub (C).
work (local)
REAL for psormtr
least:
if uplo = 'U',
iaa= ia; jaa= ja+1, icc= ic; jcc= jc;
else uplo = 'L',
iaa= ia+1, jaa= ja;
If side = 'L',
icc= ic+1; jcc= jc;
else icc= ic; jcc= jc+1;
end if
end if
If side = 'L',
mi= m-1; ni= n
1984
else
If side = 'R',
mi= m; mi = n-1;
max(npa0+numroc(numroc(ni+icoffc, nb_a, 0, 0,
NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a)+
nb_a*nb_a
end if
where lcmq = lcm/NPCOL with lcm = ilcm(NPROW,
NPCOL),
iroffa = mod(iaa-1, mb_a),
icoffa = mod(jaa-1, nb_a),
iarow = indxg2p(iaa, mb_a, MYROW, rsrc_a, NPROW),
npa0 = numroc(ni+iroffa, mb_a, MYROW, iarow,
NPROW),
iroffc = mod(icc-1, mb_c),
icoffc = mod(jcc-1, nb_c),
icrow = indxg2p(icc, mb_c, MYROW, rsrc_c, NPROW),
iccol = indxg2p(jcc, nb_c, MYCOL, csrc_c, NPCOL),
mpc0 = numroc(mi+iroffc, mb_c, MYROW, icrow,
NPROW),
nqc0 = numroc(ni+icoffc, nb_c, MYCOL, iccol,
NPCOL),
calling the subroutine blacs_gridinfo. If lwork = -1,
then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and
optimal size for all work arrays. Each of these values is
returned in the first entry of the corresponding work array,
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or
sub(C)*Q', or sub(C)*Q.
1985

-i.
p?hetrd
Reduces a Hermitian matrix to Hermitian
tridiagonal form by a unitary similarity
transformation.
Syntax
call pchetrd( uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
call pzhetrd( uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
Description
The routine reduces a complex Hermitian matrix sub(A) to Hermitian tridiagonal form T by a
unitary similarity transformation:
Q'*sub(A)*Q = T
where sub(A) = A(ia:ia+n-1,ja:ja+n-1).
Input Parameters
Hermitian matrix sub(A) is stored:
If uplo = 'U', upper triangular
If uplo = 'L', lower triangular
(n≥0).
a (local)
1986
COMPLEX for pchetrd
DOUBLE COMPLEX for pzhetrd.
the local pieces of the Hermitian distributed matrix sub(A).
its strictly upper triangular part is not referenced. (see
work (local)
COMPLEX for pchetrd
least:
lwork≥max(NB*(np +1), 3*NB)
np = numroc(n, NB, MYROW, iarow, NPROW),
iarow = indxg2p(ia, NB, MYROW, rsrc_a, NPROW).
1987
Output Parameters
a On exit,
If uplo = 'U', the diagonal and first superdiagonal of
sub(A) are overwritten by the corresponding elements of
superdiagonal, with the array tau, represent the unitary
matrix Q as a product of elementary reflectors;if uplo =
'L', the diagonal and first subdiagonal of sub(A) are
the array tau, represent the unitary matrix Q as a product
of elementary reflectors (see Application Notes below).
d (local)
REAL for pchetrd
DOUBLE PRECISION for pzhetrd.
Arrays, DIMENSION LOCc(ja+n-1). The diagonal elements
d(i)= A(i,i).
e (local)
REAL for pchetrd
DOUBLE PRECISION for pzhetrd.
Arrays, DIMENSION LOCc(ja+n-1) if uplo = 'U';
LOCc(ja+n-2) - otherwise.
e(i)= A(i,i+1) if uplo = 'U',
e(i)= A(i+1,i) if uplo = 'L'.
tau (local) COMPLEX for pchetrd
Arrays, DIMENSION LOCc(ja+n-1). This array contains the
1988
-i.
Application Notes
Q = H(n-1)*...*H(2)*H(1).
H(i) = i - tau*v*v',
where tau is a complex scalar, and v is a complex vector with v(i+1:n) = 0 and v(i) = 1;
v(1:i-1) is stored on exit in A(ia:ia+i-2, ja+i), and tau in tau (ja+i-1).
Q = H(1)*H(2)*...*H(n-1).
H(i) = i - tau*v*v',
where tau is a complex scalar, and v is a complex vector with v(1:i) = 0 and v(i+1) = 1;
v(i+2:n) is stored on exit in A(ia+i+1:ia+n-1,ja+i-1), and tau in tau(ja+i-1).
The contents of sub(A) on exit are illustrated by the following examples with n = 5:
If uplo = 'U':
If uplo = 'L':
1989
where d and e denote diagonal and off-diagonal elements of T, and v i denotes an element of
p?unmtr
tridiagonal form determined by p?hetrd.
Syntax
call pcunmtr( side, uplo, trans, m, n, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunmtr( side, uplo, trans, m, n, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
This routine overwrites the general complex distributed m-by-n matrix sub(C) =
side ='L' side ='R'

H H
where Q is a complex unitary distributed matrix of order nq, with nq = m if side = 'L' and
nq = n if side = 'R'.
Q is defined as the product of nq-1 elementary reflectors, as returned by p?hetrd.
If uplo = 'U', Q = H(nq-1)... H(2) H(1);
1990
If uplo = 'L', Q = H(1) H(2)... H(nq-1).
Input Parameters
H
H
H
= 'U': Upper triangle of A(ia:*, ja:*) contains
elementary reflectors from p?hetrd;
= 'L': Lower triangle of A(ia:*,ja:*) contains elementary
reflectors from p?hetrd
a (local)
REAL for pcunmtr
DOUBLE PRECISION for pzunmtr.
(lld_a, LOCc(ja+m-1)) if side='L', or (lld_a,
as returned by p?hetrd.
If side='L', lld_a ≥ max(1,LOCr(ia+m-1));
If side ='R', lld_a ≥ max(1, LOCr(ia+n-1)).
tau (local)
COMPLEX for pcunmtr
1991
DOUBLE COMPLEX for pzunmtr.

Array, DIMENSION of ltau where
If side = 'L' and uplo = 'U', ltau = LOCc(m_a),
if side = 'L' and uplo = 'L', ltau = LOCc(ja+m-2),
if side = 'R' and uplo = 'U', ltau = LOCc(n_a),
if side = 'R' and uplo = 'L', ltau = LOCc(ja+n-2).
reflector H(i), as returned by p?hetrd. tau is tied to the
c (local) COMPLEX for pcunmtr
(lld_a, LOCc (ja+n-1)). Contains the local pieces of
the distributed matrix sub (C).
work (local)
COMPLEX for pcunmtr
least:
If uplo = 'U',
iaa= ia; jaa= ja+1, icc= ic; jcc= jc;
else uplo = 'L',
iaa= ia+1, jaa= ja;
If side = 'L',
icc= ic+1; jcc= jc;
else icc= ic; jcc= jc+1;
end if
end if
If side = 'L',
mi= m-1; ni= n
1992
else
If side = 'R',
mi= m; mi = n-1;
NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a) +
nb_a*nb_a
end if
NPCOL),
npa0 = numroc(ni+iroffa, mb_a, MYROW, iarow, NPROW),
NPROW),
NPCOL),
calling the subroutine blacs_gridinfo. If lwork = -1,
then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and
optimal size for all work arrays. Each of these values is
returned in the first entry of the corresponding work array,
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or
1993

an illegal value, then info = - i
p?stebz
Computes the eigenvalues of a symmetric
tridiagonal matrix by bisection.
Syntax
call psstebz(ictxt, range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit,
w, iblock, isplit, work, iwork, liwork, info)
call pdstebz(ictxt, range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit,
w, iblock, isplit, work, iwork, liwork, info)
Description
The routine computes the eigenvalues of a symmetric tridiagonal matrix in parallel. These may
be all eigenvalues, all eigenvalues in the interval [v1 vu], or the eigenvalues indexed il
through iu. A static partitioning of work is done at the beginning of p?stebz which results in
all processes finding an (almost) equal number of eigenvalues.
Input Parameters
ictxt (global) INTEGER. The BLACS context handle.
range (global) CHARACTER. Must be 'A' or 'V' or 'I'.
If range = 'V', the routine computes eigenvalues in the
interval [v1, vu].
If range ='I', the routine computes eigenvalues with
indices il to iu.
order (global) CHARACTER. Must be 'B' or 'E'.
If order = 'B', the eigenvalues are to be ordered from
smallest to largest within each split-off block.
1994
If order = 'E', the eigenvalues for the entire matrix are
to be ordered from smallest to largest.
n (global) INTEGER. The order of the tridiagonal matrix T
(n≥0).
vl, vu (global)
REAL for psstebz
DOUBLE PRECISION for pdstebz.
If range = 'V', the routine computes the lower and the
upper bounds for the eigenvalues on the interval [1, vu].
il, iu (global)
INTEGER. Constraint: 1≤il≤iu≤n.
If range = 'I', the index of the smallest eigenvalue is
returned for il and of the largest eigenvalue for iu
(assuming that the eigenvalues are in ascending order)
must be returned. il must be at least 1. iu must be at least
il and no greater than n.
abstol (global)
REAL for psstebz
The absolute tolerance to which each eigenvalue is required.
An eigenvalue (or cluster) is considered to have converged
if it lies in an interval of width abstol. If abstol≤0, then
the tolerance is taken as ulp||T||, where ulp is the
machine precision, and ||T|| means the 1-norm of T
Eigenvalues will be computed most accurately when abstol
is set to the underflow threshold slamch('U'), not 0. Note
that if eigenvectors are desired later by inverse iteration
(p?stein), abstol should be set to 2*p?lamch('S').
d (global)
REAL for psstebz
1995

To avoid overflow, the matrix must be scaled so that its
largest entry is no greater than the overflow(1/2) *
underflow(1/4) in absolute value, and for greatest accuracy,
it should not be much smaller than that.
e (global)
REAL for psstebz
Array, DIMENSION (n - 1).
matrix T. To avoid overflow, the matrix must be scaled so
that its largest entry is no greater than overflow(1/2) *
work (local)
REAL for psstebz
Array, DIMENSION max(5n, 7). This is a workspace array.
lwork (local) INTEGER. The size of the work array must be ≥
max(5n, 7).
iwork (local) INTEGER. Array, DIMENSION max(4n, 14). This is
a workspace array.
liwork (local) INTEGER. the size of the iwork array must ≥
max(4n, 14, NPROCS).
If liwork = -1, then liwork is global input and a
workspace query is assumed; the routine only calculates
the minimum and optimal size for all work arrays. Each of
these values is returned in the first entry of the
corresponding work array, and no error message is issued
by pxerbla.
1996
Output Parameters
m (global) INTEGER. The actual number of eigenvalues found.
0≤m≤n
nsplit (global) INTEGER. The number of diagonal blocks detected
in T. 1≤nsplit≤n
w (global)
REAL for psstebz
Array, DIMENSION (n). On exit, the first m elements of w
contain the eigenvalues on all processes.
iblock (global) INTEGER.
Array, DIMENSION (n). At each row/column j where e(j) is
zero or small, the matrix T is considered to split into a block
diagonal matrix. On exit iblock(i) specifies which block
(from 1 to the number of blocks) the eigenvalue w(i)
belongs to.
NOTE. In the (theoretically impossible) event that

bisection does not converge for some or all
eigenvalues, info is set to 1 and the ones for which
it did not are identified by a negative block number.
isplit (global) INTEGER.

Contains the splitting points, at which T breaks up into
submatrices. The first submatrix consists of rows/columns
1 to isplit(1), the second of rows/columns isplit(1)+1
through isplit(2), etc., and the nsplit-th consists of
rows/columns isplit(nsplit-1)+1 through
isplit(nsplit)=n. (Only the first nsplit elements are
used, but since the nsplit values are not known, n words
must be reserved for isplit.)
1997
If info < 0, if info = -i, the i-th argument has an illegal

value.
If info > 0, some or all of the eigenvalues fail to converge
or not computed.
If info = 1, bisection fails to converge for some
eigenvalues; these eigenvalues are flagged by a negative
block number. The effect is that the eigenvalues may not
be as accurate as the absolute and relative tolerances.
If info = 2, mismatch between the number of eigenvalues
output and the number desired.
If info = 3: range='i', and the Gershgorin interval
initially used is incorrect. No eigenvalues are computed.
Probable cause: the machine has a sloppy floating point
arithmetic. Increase the fudge parameter, recompile, and
try again.
p?stein
Computes the eigenvectors of a tridiagonal matrix
using inverse iteration.
Syntax
call psstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work,
lwork, iwork, liwork, ifail, iclustr, gap, info)
call pdstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work,
call pcstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work,
call pzstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work,
Description
The routine computes the eigenvectors of a symmetric tridiagonal matrix T corresponding to
specified eigenvalues, by inverse iteration. p?stein does not orthogonalize vectors that are
on different processes. The extent of orthogonalization is controlled by the input parameter
1998
lwork. Eigenvectors that are to be orthogonalized are computed by the same process. p?stein
decides on the allocation of work among the processes and then calls ?stein2 (modified LAPACK
routine) on each individual process. If insufficient workspace is allocated, the expected
orthogonalization may not be done.
NOTE. If the eigenvectors obtained are not orthogonal, increase lwork and run the code
again.
p = NPROW*NPCOL is the total number of processes.
Input Parameters
n (global) INTEGER. The order of the matrix T (n ≥ 0).
m (global) INTEGER. The number of eigenvectors to be
returned.
d, e, w (global)
REAL for single-precision flavors
Arrays: d(*) contains the diagonal elements of T.
DIMENSION (n).
DIMENSION (n-1).
w(*) contains all the eigenvalues grouped by split-off
block.The eigenvalues are supplied from smallest to largest
within the block. (Here the output array w from p?stebz
with order = 'B' is expected. The array should be replicated
in all processes.)
DIMENSION(m)
iblock (global) INTEGER.
Array, DIMENSION (n). The submatrix indices associated
with the corresponding eigenvalues in w--1 for eigenvalues
belonging to the first submatrix from the top, 2 for those
belonging to the second submatrix, etc. (The output array
iblock from p?stebz is expected here).
isplit (global) INTEGER.
1999
Array, DIMENSION (n). The splitting points, at which T breaks

up into submatrices. The first submatrix consists of
rows/columns 1 to isplit(1), the second of rows/columns
isplit(1)+1 through isplit(2), etc., and the nsplit-th
consists of rows/columns isplit(nsplit-1)+1 through
isplit(nsplit)=n . (The output array isplit from
p?stebz is expected here.)
orfac (global)
orfac specifies which eigenvectors should be orthogonalized.
Eigenvectors that correspond to eigenvalues within
orfac*||T|| of each other are to be orthogonalized.
However, if the workspace is insufficient (see lwork), this
tolerance may be decreased until all eigenvectors can be
stored in one process. No orthogonalization is done if orfac
is equal to zero. A default value of 1000 is used if orfac is
negative. orfac should be identical on all processes
iz, jz (global) INTEGER. The row and column indices in the global
array z indicating the first row and the first column of the
submatrix Z, respectively.
descz (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix Z.
work (local). REAL for single-precision flavors
lwork (local) INTEGER.
lwork controls the extent of orthogonalization which can be
done. The number of eigenvectors for which storage is
allocated on each process is nvec =
floor((lwork-max(5*n,np00*mq00))/n). Eigenvectors
corresponding to eigenvalue clusters of size nvec -
ceil(m/p) + 1 are guaranteed to be orthogonal (the
orthogonality is similar to that obtained from ?stein2).
2000
NOTE. lwork must be no smaller than
max(5*n,np00*mq00) + ceil(m/p)*n and should
have the same input value on all processes.
It is the minimum value of lwork input on different

processes that is significant.
iwork (local) INTEGER.
Workspace array, DIMENSION (3n+p+1).
liwork (local) INTEGER. The size of the array iwork. It must be
greater than (3*n+p+1).
by pxerbla.
Output Parameters
z (local)
REAL for psstein
DOUBLE PRECISION for pdstein
COMPLEX for pcstein
DOUBLE COMPLEX for pzstein.
Array, DIMENSION (descz(dlen_), n/NPCOL + NB). z
contains the computed eigenvectors associated with the
specified eigenvalues. Any vector which fails to converge is
set to its current iterate after MAXIT iterations (See
?stein2). On output, z is distributed across the p processes
in block cyclic format.
2001
work(1) On exit, work(1) gives a lower bound on the workspace

(lwork) that guarantees the user desired orthogonalization
(see orfac). Note that this may overestimate the minimum
workspace needed.
iwork On exit, iwork(1) contains the amount of integer workspace
required.
On exit, the iwork(2) through iwork(p+2) indicate the
eigenvectors computed by each process. Process i computes
eigenvectors indexed iwork(i+2)+1 through iwork(i+3).
ifail (global). INTEGER. Array, DIMENSION (m). On normal exit,
all elements of ifail are zero. If one or more eigenvectors
fail to converge after MAXIT iterations (as in ?stein), then
info > 0 is returned. If mod(info, m+1)>0, then for i=1
to mod(info,m+1), the eigenvector corresponding to the
eigenvalue w(ifail(i)) failed to converge (w refers to the
array of eigenvalues on output).
iclustr (global) INTEGER. Array, DIMENSION (2*p)
This output array contains indices of eigenvectors
corresponding to a cluster of eigenvalues that could not be
orthogonalized due to insufficient workspace (see lwork,
orfac and info). Eigenvectors corresponding to clusters of
eigenvalues indexed iclustr(2*I-1) to iclustr(2*I),
i = 1 to info/(m+1), could not be orthogonalized due to
lack of workspace. Hence the eigenvectors corresponding
to these clusters may not be orthogonal. iclustr is a zero
terminated array
---(iclustr(2*k).ne.0.and.iclustr(2*k+1).eq.0)
if and only if k is the number of clusters.
gap (global)
This output array contains the gap between eigenvalues
whose eigenvectors could not be orthogonalized. The info/m
output values in this array correspond to the info/(m+1)
clusters indicated by the array iclustr. As a result, the dot
product between eigenvectors corresponding to the i-th
cluster may be as high as (O(n)*macheps)/gap(i).
2002
If info < 0: If the i-th argument is an array and the
j-entry had an illegal value, then info = -(i*100+j),
If the i-th argument is a scalar and had an illegal value,
then info = -i.
If info < 0: if info = -i, the i-th argument had an illegal
value.
If info > 0: if mod(info, m+1) = i, then i eigenvectors
failed to converge in MAXIT iterations. Their indices are
stored in the array ifail. If info/(m+1) = i, then
eigenvectors corresponding to i clusters of eigenvalues
could not be orthogonalized due to insufficient workspace.
The indices of the clusters are stored in the array iclustr.

This section describes ScaLAPACK routines for solving nonsymmetric eigenvalue problems,
computing the Schur factorization of general matrices, as well as performing a number of related
To solve a nonsymmetric eigenvalue problem with ScaLAPACK, you usually need to reduce the
matrix to the upper Hessenberg form and then solve the eigenvalue problem with the Hessenberg
matrix obtained.
Table 6-5 lists ScaLAPACK routines for reducing the matrix to the upper Hessenberg form by
an orthogonal (or unitary) similarity transformation A = QHQH, as well as routines for solving
eigenproblems with Hessenberg matrices, and multiplying the matrix after reduction.
Table 6-5 Computational Routines for Solving Nonsymmetric Eigenproblems
Operation performed General matrix Orthogonal/Unitary Hessenberg matrix
matrix
Reduce to Hessenberg form A = p?gehrd
QHQH
Multiply the matrix after p?ormhr/p?unmhr
reduction
Find eigenvalues and Schur p?lahqr
factorization
2003
p?gehrd
form.
Syntax
call psgehrd( n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
call pdgehrd( n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
call pcgehrd( n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
call pzgehrd( n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
Description
This routine reduces a real/complex general distributed matrix sub (A) to upper Hessenberg
form H by an orthogonal or unitary similarity transformation
Q'*sub(A)*Q = H,
where sub(A) = A(ia+n-1:ia+n-1, ja+n-1:ja+n-1).
Input Parameters
(n≥0).
ilo, ihi (global) INTEGER.
It is assumed that sub(A) is already upper triangular in rows
ia:ia+ilo-2 and ia+ihi:ia+n-1 and columns
ja:ja+ilo-2 and ja+ihi:ja+n-1. (See Application Notes
below).
If n > 0, 1≤ilo≤ihi≤n; otherwise set ilo = 1, ihi =
n.
a (local) REAL for psgehrd
DOUBLE PRECISION for pdgehrd
COMPLEX for pcgehrd
DOUBLE COMPLEX for pzgehrd.
2004
the local pieces of the n-by-n general distributed matrix
sub(A) to be reduced.
work (local)
REAL for psgehrd
COMPLEX for pcgehrd
lwork (local or global) INTEGER, dimension of the array work.
lwork is local input and must be at least
lwork≥NB*NB + NB*max(ihip+1, ihlp+inlq)
iroffa = mod(ia-1, NB),
icoffa = mod(ja-1, NB),
ioff = mod(ia+ilo-2, NB), iarow = indxg2p(ia,
NB, MYROW, rsrc_a, NPROW), ihip =
numroc(ihi+iroffa, NB, MYROW, iarow, NPROW),
ilrow = indxg2p(ia+ilo-1, NB, MYROW, rsrc_a,
NPROW),
ihlp = numroc(ihi-ilo+ioff+1, NB, MYROW, ilrow,
NPROW),
ilcol = indxg2p(ja+ilo-1, NB, MYCOL, csrc_a,
NPCOL),
inlq = numroc(n-ilo+ioff+1, NB, MYCOL, ilcol,
NPCOL),
2005

Output Parameters
a On exit, the upper triangle and the first subdiagonal of
sub(A)are overwritten with the upper Hessenberg matrix H,
and the elements below the first subdiagonal, with the array
tau, represent the orthogonal/unitary matrix Q as a product
of elementary reflectors (see Application Notes below).
tau (local). REAL for psgehrd
COMPLEX for pcgehrd
Array, DIMENSION at least max(ja+n-2).
The scalar factors of the elementary reflectors (see
Application Notes below). Elements ja:ja+ilo-2 and
ja+ihi:ja+n-2 of tau are set to zero. tau is tied to the
-i.
Application Notes
The matrix Q is represented as a product of (ihi-ilo) elementary reflectors
Q = H(ilo)*H(ilo+1)*...*H(ihi-1).
H(i)= i - tau*v*v'
2006
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i)= 0, v(i+1)=
1 and v(ihi+1:n)= 0; v(i+2:ihi) is stored on exit in a(ia+ilo+i:ia+ihi-1,ja+ilo+i-2),
and tau in tau(ja+ilo+i-2). The contents of a(ia:ia+n-1,ja:ja+n-1) are illustrated by
the following example, with n = 7, ilo = 2 and ihi = 6:
on entry
on exit
where a denotes an element of the original matrix sub(A), H denotes a modified element of the
upper Hessenberg matrix H, and vi denotes an element of the vector defining H(ja+ilo+i-2).
2007
p?ormhr
Multiplies a general matrix by the orthogonal
Hessenberg form determined by p?gehrd.
Syntax
call psormhr(side, trans, m, n, ilo, ihi, a, ia, ja, desca, tau, c, ic, jc,
descc, work, lwork, info)
call pdormhr(side, trans, m, n, ilo, ihi, a, ia, ja, desca, tau, c, ic, jc,
Description
This routine overwrites the general real distributed m-by-n matrix sub(C)= C(ic:ic+m-1,
jc:jc+n-1) with
side ='L' side ='R'

T T
where Q is a real orthogonal distributed matrix of order nq, with nq = m if side = 'L' and nq
= n if side = 'R'.
Q is defined as the product of ihi-ilo elementary reflectors, as returned by p?gehrd.
Q = H(ilo) H(ilo+1)... H(ihi-1).
Input Parameters
T
T
T
matrix sub (C) (m≥0).
2008
n (global) INTEGER. The number of columns in he distributed
matrix sub (C) (n≥0).
ilo and ihi must have the same values as in the previous
call of p?gehrd. Q is equal to the unit matrix except for the
distributed submatrix
Q(ia+ilo:ia+ihi-1,ia+ilo:ja+ihi-1).
If side = 'L', 1≤ilo≤ihi≤max(1,m);
If side = 'R', 1≤ilo≤ihi≤max(1,n);
ilo and ihi are relative indexes.
a (local)
REAL for psormhr
DOUBLE PRECISION for pdormhr
(lld_a, LOCc(ja+m-1)) if side='L', and (lld_a,
as returned by p?gehrd.
tau (local)
REAL for psormhr
Array, DIMENSION LOCc(ja+m-2), if side = 'L', and
LOCc(ja+n-2) if side = 'R'.
This array contains the scalar factors tau(j) of the
elementary reflectors H(j) as returned by p?gehrd. tau is
c (local)
REAL for psormhr
(lld_c,LOCc(jc+n-1)).
2009
Contains the local pieces of the distributed matrix sub(C).

work (local)
REAL for psormhr
lwork must be at least iaa = ia + ilo; jaa =
ja+ilo-1;
If side = 'L',
mi = ihi-ilo; ni = n; icc = ic + ilo; jcc = jc;
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+mpc0)*nb_a)
+ nb_a*nb_a
else if side = 'R',
mi = m; ni = ihi-ilo; icc = ic; jcc = jc + ilo;
lwork ≥ max((nb_a*(nb_a-1))/2,
(nqc0+max(npa0+numroc(numroc(ni+icoffc, nb_a,
0, 0, NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a) +
nb_a*nb_a
end if
NPCOL),
NPROW),
iroffc = mod(icc-1, mb_c), icoffc = mod(jcc-1,
nb_c),
2010
NPROW),
NPCOL),
Output Parameters
c sub(C) is overwritten by Q*sub(C), or Q'*sub(C), or
-i.
p?unmhr
Hessenberg form determined by p?gehrd.
Syntax
call pcunmhr(side, trans, m, n, ilo, ihi, a, ia, ja, desca, tau, c, ic, jc,
call pzunmhr(side, trans, m, n, ilo, ihi, a, ia, ja, desca, tau, c, ic, jc,
2011
Description
This routine overwrites the general complex distributed m-by-n matrix sub(C) =
side ='L' side ='R'

H H
trans = 'H': Q *sub(C) sub(C)*Q
where Q is a complex unitary distributed matrix of order nq, with nq = m if side = 'L' and
nq = n if side = 'R'.
Q is defined as the product of ihi-ilo elementary reflectors, as returned by p?gehrd.
Q = H(ilo) H(ilo+1)... H(ihi-1).
Input Parameters
H
H
H
submatrix sub (C) (m≥0).
submatrix sub (C) (n≥0).
ilo, ihi (global) INTEGER
These must be the same parameters ilo and ihi,
respectively, as supplied to p?gehrd. Q is equal to the unit
matrix except in the distributed submatrix Q
(ia+ilo:ia+ihi-1,ia+ilo:ja+ihi-1).
If side ='L', then 1≤ilo≤ihi≤max(1,m).
If side = 'R', then 1≤ilo≤ihi≤max(1,n)
ilo and ihi are relative indexes.
2012
a (local)
COMPLEX for pcunmhr
DOUBLE COMPLEX for pzunmhr.
(lld_a, LOC c(ja+m-1)) if side='L', and (lld_a,
as returned by p?gehrd.
tau (local)
COMPLEX for pcunmhr
Array, DIMENSION LOCc(ja+m-2), if side = 'L', and
LOCc(ja+n-2) if side = 'R'.
This array contains the scalar factors tau(j) of the
elementary reflectors H(j) as returned by p?gehrd. tau is
c (local)
COMPLEX for pcunmhr
Contains the local pieces of the distributed matrix sub(C).
work (local)
COMPLEX for pcunmhr
2013
lwork (local or global)

lwork must be at least iaa = ia + ilo; jaa =
ja+ilo-1;
If side = 'L', mi = ihi-ilo; ni = n; icc = ic + ilo;
jcc = jc; lwork ≥ max((nb_a*(nb_a-1))/2,
(nqc0+mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
mi = m; ni = ihi-ilo; icc = ic; jcc = jc + ilo;
NPCOL), nb_a, 0, 0, lcmq ), mpc0))*nb_a) +
nb_a*nb_a
end if
NPCOL),
NPROW),
NPROW),
NPCOL),
2014
Output Parameters
c C is overwritten by Q* sub(C) or Q'*sub(C) or sub(C)*Q' or
sub(C)*Q.
-i.
p?lahqr
Computes the Schur decomposition and/or
eigenvalues of a matrix already in Hessenberg
form.
Syntax
call pslahqr(wantt, wantz, n, ilo, ihi, a, desca, wr, wi, iloz, ihiz, z,
descz, work, lwork, iwork, ilwork, info)
call pdlahqr(wantt, wantz, n, ilo, ihi, a, desca, wr, wi, iloz, ihiz, z,
descz, work, lwork, iwork, ilwork, info)
Description
This is an auxiliary routine used to find the Schur decomposition and/or eigenvalues of a matrix
already in Hessenberg form from columns ilo to ihi.
Input Parameters
wantt (global) LOGICAL
wantz (global) LOGICAL.
2015
If wantz = .TRUE., the matrix of Schur vectors z is

required;
n (global) INTEGER. The order of the Hessenberg matrix A
(and z if wantz). (n≥0).
It is assumed that A is already upper quasi-triangular in
rows and columns ihi+1:n, and that A(ilo, ilo-1) = 0
(unless ilo = 1). p?lahqr works primarily with the
Hessenberg submatrix in rows and columns ilo to ihi, but
applies transformations to all of h if wantt is .TRUE..
1≤ilo≤max(1,ihi); ihi ≤ n.
a (global)
REAL for pslahqr
DOUBLE PRECISION for pdlahqr
Array, DIMENSION (desca(lld_),*). On entry, the upper
Hessenberg matrix A.
iloz, ihiz (global) INTEGER. Specify the rows of z to which
transformations must be applied if wantz is .TRUE..
1≤iloz≤ilo; ihi≤ihiz≤n.
z (global ) REAL for pslahqr
Array. If wantz is .TRUE., on entry z must contain the
current matrix Z of transformations accumulated by
pdhseqr. If wantz is .FALSE., z is not referenced.
work (local)
REAL for pslahqr
2016
lwork (local) INTEGER. The dimension of work. lwork is assumed
big enough so that lwork≥3*n +
max(2*max(descz(lld_),desca(lld_)) + 2*LOCq(n),
7*ceil(n/hbl)/lcm(NPROW,NPCOL))).
If lwork = -1, then work(1)gets set to the above number
and the code returns immediately.
iwork (global and local) INTEGER array of size ilwork.
ilwork (local) INTEGER This holds some of the iblk integer arrays.
Output Parameters
a On exit, if wantt is .TRUE., A is upper quasi-triangular in
rows and columns ilo:ihi, with any 2-by-2 or larger
diagonal blocks not yet in standard form. If wantt is
.FALSE., the contents of A are unspecified on exit.
wr, wi (global replicated output)
REAL for pslahqr
Arrays, DIMENSION(n) each. The real and imaginary parts,
respectively, of the computed eigenvalues ilo to ihi are
stored in the corresponding elements of wr and wi. If two
eigenvalues are computed as a complex conjugate pair,
they are stored in consecutive elements of wr and wi, say
the i-th and (i+1)-th, with wi(i)> 0 and wi(i+1) < 0. If
wantt is .TRUE. , the eigenvalues are stored in the same
order as on the diagonal of the Schur form returned in A. A
may be returned with larger diagonal blocks until the next
release.
z On exit z has been updated; transformations are applied
only to the submatrix z(iloz:ihiz, ilo:ihi).
< 0: parameter number -info incorrect or inconsistent
2017
> 0: p?lahqr failed to compute all the eigenvalues ilo to

ihi in a total of 30*(ihi-ilo+1) iterations; if info = i,
elements i+1:ihi of wr and wi contain those eigenvalues
which have been successfully computed.

This section describes ScaLAPACK routines for computing the singular value decomposition
(SVD) of a general m-by-n matrix A (see “Singular Value Decomposition” in LAPACK chapter).
To find the SVD of a general matrix A, this matrix is first reduced to a bidiagonal matrix B by
a unitary (orthogonal) transformation, and then SVD of the bidiagonal matrix is computed.
Note that the SVD of B is computed using the LAPACK routine ?bdsqr .
Table 6-6 lists ScaLAPACK computational routines for performing this decomposition.
Table 6-6 Computational Routines for Singular Value Decomposition (SVD)
Operation General matrix Orthogonal/unitary matrix
Reduce A to a bidiagonal matrix p?gebrd
Multiply matrix after reduction p?ormbr/p?unmbr
p?gebrd
Reduces a general matrix to bidiagonal form.
Syntax
call psgebrd(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
call pdgebrd(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
call pcgebrd(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
call pzgebrd(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
Description
This routine reduces a real/complex general m-by-n distributed matrix sub(A)=
A(ia:ia+m-1,ja:ja+n-1) to upper or lower bidiagonal form B by an orthogonal/unitary
transformation:
Q'*sub(A)*P = B.
2018
If m≥ n, B is upper bidiagonal; if m < n, B is lower bidiagonal.
Input Parameters
matrix sub(A) (m≥0).
matrix sub(A) (n≥0).
a (local)
REAL for psgebrd
DOUBLE PRECISION for pdgebrd
COMPLEX for pcgebrd
DOUBLE COMPLEX for pzgebrd.
Real pointer into the local memory to an array of dimension
the distributed matrix sub (A).
work (local)
REAL for psgebrd
COMPLEX for pcgebrd
DOUBLE COMPLEX for pzgebrd. Workspace array of
dimension lwork.
least:
lwork ≥ nb*(mpa0 + nqa0+1)+ nqa0
where nb = mb_a = nb_a,
iroffa = mod(ia-1, nb),
icoffa = mod(ja-1, nb),
iarow = indxg2p(ia, nb, MYROW, rsrc_a, NPROW),
iacol = indxg2p (ja, nb, MYCOL, csrc_a, NPCOL),
2019
mpa0 = numroc(m +iroffa, nb, MYROW, iarow,

NPROW),
nqa0 = numroc(n +icoffa, nb, MYCOL, iacol,
NPCOL),
Output Parameters
a On exit, if m≥n, the diagonal and the first superdiagonal of
sub(A) are overwritten with the upper bidiagonal matrix B;
the elements below the diagonal, with the array tauq,
elementary reflectors, and the elements above the first
superdiagonal, with the array taup, represent the orthogonal
matrix P as a product of elementary reflectors. If m < n,
the diagonal and the first subdiagonal are overwritten with
the lower bidiagonal matrix B; the elements below the first
subdiagonal, with the array tauq, represent the
reflectors, and the elements above the diagonal, with the
array taup, represent the orthogonal matrix P as a product
of elementary reflectors. See Application Notes below.
d (local)
DIMENSION LOCc(ja+min(m,n)-1) if m≥n;
LOCr(ia+min(m,n)-1) otherwise. The distributed diagonal
elements of the bidiagonal matrix B: d(i) = a(i,i).
e (local)
2020
DIMENSION LOCr(ia+min(m,n)-1) if m≥n;
LOCc(ja+min(m,n)-2) otherwise. The distributed
off-diagonal elements of the bidiagonal distributed matrix
B:
If m≥n, e(i) = a(i,i+1) for i = 1,2,..., n-1; if m
< n, e(i) = a(i+1, i) for i = 1,2,...,m-1. e is tied
tauq, taup (local)
REAL for psgebrd
COMPLEX for pcgebrd
DOUBLE COMPLEX for pzgebrd.
Arrays, DIMENSION LOCc(ja+min(m,n)-1) for tauq and
LOCr(ia+min(m,n)-1) for taup. Contain the scalar factors
of the elementary reflectors which represent the
orthogonal/unitary matrices Q and P, respectively. tauq and
taup are tied to the distributed matrix A. See Application
Notes below.
-i.
Application Notes
The matrices Q and P are represented as products of elementary reflectors:
If m ≥ n,
Q = H(1)*H(2)*...*H(n), and P = G(1)*G(2)*...*G(n-1).

Each H(i) and G(i) has the form:
H(i)= i - tauq * v * v' and G(i) = i - taup*u*u'
2021
where tauq and taup are real/complex scalars, and v and u are real/complex vectors;
v(1:i-1) = 0, v(i) = 1, and v(i+1:m) is stored on exit in A(ia+i:ia+m-1,ja+i-1);

u(1:i) = 0, u(i+1) = 1, and u(i+2:n) is stored on exit in A (ia+i-1,ja+i+1:ja+n-1);
tauq is stored in tauq(ja+i-1) and taup in taup(ia+i-1).
If m < n,
Q = H(1)*H(2)*...*H(m-1), and P = G(1)* G(2)*...* G(m)

Each H (i) and G(i) has the form:
H(i)= i-tauq*v*v' and G(i)= i-taup*u*u'

here tauq and taup are real/complex scalars, and v and u are real/complex vectors;
v(1:i) = 0, v(i+1) = 1, and v(i+2:m) is stored on exit in A (ia+i:ia+m-1,ja+i-1);

u(1:i-1) = 0, u(i) = 1, and u(i+1:n) is stored on exit in A(ia+i-1,ja+i+1:ja+n-1);
tauq is stored in tauq(ja+i-1) and taup in taup(ia+i-1).
The contents of sub(A) on exit are illustrated by the following examples:
m = 6 and n = 5 (m > n):
m = 5 and n = 6 (m < n):
2022
where d and e denote diagonal and off-diagonal elements of B, vi denotes an element of the
vector defining H(i), and ui an element of the vector defining G(i).
p?ormbr
Multiplies a general matrix by one of the orthogonal
matrices from a reduction to bidiagonal form
determined by p?gebrd.
Syntax
call psormbr(vect, side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc,
call pdormbr(vect, side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc,
Description
If vect = 'Q', the routine overwrites the general real distributed m-by-n matrix sub(C) =
C(c:ic+m-1,jc:jc+n-1) with
side ='L' side ='R'

trans = 'N': Q sub(C) sub(C) Q
T T
trans = 'T': Q sub(C) sub(C) Q
If vect = 'P', the routine overwrites sub(C) with
side ='L' side ='R'

trans = 'N': P sub(C) sub(C) P
T T
trans = 'T': P sub(C) sub(C) P
T
Here Q and P are the orthogonal distributed matrices determined by p?gebrd when reducing
a real distributed matrix A(ia:*, ja:*) to bidiagonal form: A(ia:*,ja:*) = Q*B*PT. Q and
T
P are defined as products of elementary reflectors H(i) and G(i) respectively.
Let nq = m if side = 'L' and nq = n if side = 'R'. Thus nq is the order of the orthogonal
T
matrix Q or P that is applied.
2023
If vect = 'Q', A(ia:*, ja:*) is assumed to have been an nq-by-k matrix:
If nq ≥ k, Q = H(1) H(2)...H(k);
If nq < k, Q = H(1) H(2)...H(nq-1).

If vect = 'P', A(ia:*, ja:*) is assumed to have been a k-by-nq matrix:
If k < nq, P = G(1) G(2)...G(k);
If k ≥ nq, P = G(1) G(2)...G(nq-1).
Input Parameters
vect (global) CHARACTER.
T
If vect ='Q', then Q or Q is applied.
T
If vect ='P', then P or P is applied.
side (global) CHARACTER.
T T
If side ='L', then Q or Q , P or P is applied from the left.
T T
If side ='R', then Q or Q , P or P is applied from the right.
trans (global) CHARACTER.
If trans = 'N', no transpose, Q or P is applied.
T T
If trans = 'T', then Q or P is applied.
matrix sub (C).
matrix sub (C).
k (global) INTEGER.
If vect = 'Q', the number of columns in the original
distributed matrix reduced by p?gebrd;
If vect = 'P', the number of rows in the original
distributed matrix reduced by p?gebrd.
Constraints: k ≥ 0.
a (local)
REAL for psormbr
DOUBLE PRECISION for pdormbr.
(lld_a, LOCc(ja+min(nq,k)-1))
If vect='Q', and (lld_a, LOCc(ja+nq-1))
2024
If vect = 'P'.
nq = m if side = 'L', and nq = n otherwise.
The vectors which define the elementary reflectors H(i) and
G(i), whose products determine the matrices Q and P, as
returned by p?gebrd.
If vect = 'Q', lld_a≥max(1, LOCr(ia+nq-1));
If vect = 'P', lld_a≥max(1, LOCr(ia+min(nq, k)-1)).
tau (local)
REAL for psormbr
Array, DIMENSION LOCc(ja+min(nq, k)-1), if vect =
'Q', and LOCr(ia+min(nq, k)-1), if vect = 'P'.
reflector H(i) or G(i), which determines Q or P, as returned
by pdgebrd in its array argument tauq or taup. tau is tied
c (local) REAL for psormbr
DOUBLE PRECISION for pdormbr
(lld_a, LOCc (jc+n-1)).
Contains the local pieces of the distributed matrix sub (C).
work (local)
REAL for psormbr
2025

least:
If side = 'L'
nq = m;
if ((vect = 'Q' and nq≥k) or (vect is not equal to 'Q'
and nq>k)), iaa=ia; jaa=ja; mi=m; ni=n; icc=ic;
jcc=jc;
else
iaa= ia+1; jaa=ja; mi=m-1; ni=n; icc=ic+1; jcc= jc;
end if
else
If side = 'R', nq = n;
if((vect = 'Q' and nq≥k) or (vect is not equal
to 'Q' and nq>k)),
iaa=ia; jaa=ja; mi=m; ni=n; icc=ic; jcc=jc;
else
iaa= ia; jaa= ja+1; mi= m; ni= n-1; icc= ic; jcc=
jc+1;
end if
end if
If vect = 'Q',
If side = 'L', lwork≥max((nb_a*(nb_a-1))/2, (nqc0 +
mpc0)*nb_a) + nb_a * nb_a
else if side = 'R',
lwork≥max((nb_a*(nb_a-1))/2, (nqc0 + max(npa0 +
numroc(numroc(ni+icoffc, nb_a, 0, 0, NPCOL),
end if
else if vect is not equal to 'Q', if side = 'L',
lwork≥max((mb_a*(mb_a-1))/2, (mpc0 + max(mqa0 +
numroc(numroc(mi+iroffc, mb_a, 0, 0, NPROW),
else if side = 'R',
lwork≥max((mb_a*(mb_a-1))/2, (mpc0 + nqc0)*mb_a)
+ mb_a*mb_a
end if
end if
2026
where lcmp = lcm/NPROW, lcmq = lcm/NPCOL, with
lcm = ilcm(NPROW, NPCOL),
iacol = indxg2p(jaa, nb_a, MYCOL, csrc_a, NPCOL),
mqa0 = numroc(mi+icoffa, nb_a, MYCOL, iacol,
NPCOL),
NPROW),
NPROW),
NPCOL),
Output Parameters
c On exit, if vect='Q', sub(C) is overwritten by Q*sub(C), or
Q'*sub(C), or sub(C)*Q', or sub(C)*Q; if vect='P', sub(C)
is overwritten by P*sub(C), or P'*sub(C), or sub(C)*P, or
sub(C)*P'.
2027

-i.
p?unmbr
Multiplies a general matrix by one of the unitary
transformation matrices from a reduction to
bidiagonal form determined by p?gebrd.
Syntax
call pcunmbr(vect, side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc,
call pzunmbr(vect, side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc,
Description
If vect = 'Q', the routine overwrites the general complex distributed m-by-n matrix sub(C)
= C(ic:ic+m-1, jc:jc+n-1) with
side ='L' side ='R'

H H
If vect = 'P', the routine overwrites sub(C) with
side ='L' side ='R'

trans = 'N': P*sub(C) sub(C)*P
H H
trans = 'C': P *sub(C) sub(C)*P
H
Here Q and P are the unitary distributed matrices determined by p?gebrd when reducing a
complex distributed matrix A(ia:*, ja:*) to bidiagonal form: A(ia:*,ja:*) = Q*B*PH.
H
Q and P are defined as products of elementary reflectors H(i) and G(i) respectively.
2028
Let nq = m if side = 'L' and nq = n if side = 'R'. Thus nq is the order of the unitary
H
matrix Q or P that is applied.
If vect = 'Q', A(ia:*, ja:*) is assumed to have been an nq-by-k matrix:
If nq ≥ k, Q = H(1) H(2)... H(k);
If nq < k, Q = H(1) H(2)... H(nq-1).

If vect = 'P', A(ia:*, ja:*) is assumed to have been a k-by-nq matrix:
If k < nq, P = G(1) G(2)... G(k);
If k ≥ nq, P = G(1) G(2)... G(nq-1).
Input Parameters
vect (global) CHARACTER.
H
If vect ='Q', then Q or Q is applied.
H
If vect ='P', then P or P is applied.
H H
If side ='L', then Q or Q , P or P is applied from the left.
H H
If side ='R', then Q or Q , P or P is applied from the right.
If trans = 'N', no transpose, Q or P is applied.
H H
If trans = 'C', conjugate transpose, Q or P is applied.
matrix sub (C) m≥0.
matrix sub (C) n≥0.
k (global) INTEGER.
If vect = 'Q', the number of columns in the original
distributed matrix reduced by p?gebrd;
If vect = 'P', the number of rows in the original
distributed matrix reduced by p?gebrd.
Constraints: k ≥ 0.
a (local)
COMPLEX for psormbr
DOUBLE COMPLEX for pdormbr.
2029

(lld_a, LOCc(ja+min(nq,k)-1)) if vect='Q', and
(lld_a, LOCc(ja+nq-1)) if vect = 'P'.
nq = m if side = 'L', and nq = n otherwise.
The vectors which define the elementary reflectors H(i) and
G(i), whose products determine the matrices Q and P, as
returned by p?gebrd.
If vect = 'Q', lld_a ≥ max(1, LOCr(ia+nq-1));
If vect = 'P', lld_a ≥ max(1, LOCr(ia+min(nq,
k)-1)).
tau (local)
COMPLEX for pcunmbr
DOUBLE COMPLEX for pzunmbr.
Array, DIMENSION LOCc(ja+min(nq, k)-1), if vect =
'Q', and LOCr(ia+min(nq, k)-1), if vect = 'P'.
reflector H(i) or G(i), which determines Q or P, as returned
by p?gebrd in its array argument tauq or taup. tau is tied
c (local) COMPLEX for pcunmbr
DOUBLE COMPLEX for pzunmbr
(lld_a, LOCc (jc+n-1)).
Contains the local pieces of the distributed matrix sub (C).
work (local)
COMPLEX for pcunmbr
2030
DOUBLE COMPLEX for pzunmbr.
least:
If side = 'L'
nq = m;
if ((vect = 'Q' and nq ≥ k) or (vect is not equal
to 'Q' and nq > k)), iaa= ia; jaa= ja; mi= m; ni=
n; icc= ic; jcc= jc;
else
iaa= ia+1; jaa= ja; mi= m-1; ni= n; icc= ic+1; jcc=
jc;
end if
else
If side = 'R', nq = n;
if ((vect = 'Q' and nq ≥ k) or (vect is not equal
to 'Q' and nq ≥ k)),
iaa= ia; jaa= ja; mi= m; ni= n; icc= ic; jcc= jc;
else
iaa= ia; jaa= ja+1; mi= m; ni= n-1; icc= ic; jcc=
jc+1;
end if
end if
If vect = 'Q',
If side = 'L', lwork ≥ max((nb_a*(nb_a-1))/2,
(nqc0+mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a) +
nb_a*nb_a
end if
else if vect is not equal to 'Q',
if side = 'L',
2031

max(mqa0+numroc(numroc(mi+iroffc, mb_a, 0, 0,
NPROW), mb_a, 0, 0, lcmp), nqc0))*mb_a) +
mb_a*mb_a
else if side = 'R',
end if
end if
where lcmp = lcm/NPROW, lcmq = lcm/NPCOL, with lcm
= ilcm(NPROW, NPCOL),
iacol = indxg2p(jaa, nb_a, MYCOL, csrc_a, NPCOL),
mqa0 = numroc(mi+icoffa, nb_a, MYCOL, iacol, NPCOL),
npa0 = numroc(ni+iroffa, mb_a, MYROW, iarow, NPROW),
NPROW),
NPCOL),
2032
Output Parameters
c On exit, if vect='Q', sub(C) is overwritten by Q*sub(C),
or Q'*sub(C), or sub(C)*Q', or sub(C)*Q; if vect='P',
sub(C) is overwritten by P*sub(C), or P'*sub(C), or
sub(C)*P, or sub(C)*P'.
-i.
Generalized Symmetric-Definite Eigen Problems

This section describes ScaLAPACK routines that allow you to reduce the generalized
symmetric-definite eigenvalue problems (see Generalized Symmetric-Definite Eigenvalue
Problems in LAPACK chapters) to standard symmetric eigenvalue problem Cy = λy, which you
can solve by calling ScaLAPACK routines described earlier in this chapter (see Symmetric
Eigenproblems).
Table 6-7 lists these routines.
Table 6-7 Computational Routines for Reducing Generalized Eigenproblems to Standard
Problems
Operation Real symmetric matrices Complex Hermitian matrices
Reduce to standard problems p?sygst p?hegst
2033
p?sygst
Syntax
call pssygst( ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale,
info )
call pdsygst( ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale,
info )
Description
The routine reduces real symmetric-definite generalized eigenproblems to the standard form.
In the following sub(A) denotes A(ia:ia+n-1, ja:ja+n-1) and sub(B) denotes
B(ib:ib+n-1, jb:jb+n-1).
If ibtype = 1, the problem is
sub(A)*x = λ*sub(B)*x,
T T
and sub(A) is overwritten by inv(U )*sub(A)*inv(U), or inv(L)*sub(A)*inv(L ).
If ibtype = 2 or 3, the problem is
sub(A)*sub(B)*x = λ*x, or sub(B)*sub(A)*x = λ*x,

T T
and sub(A) is overwritten by U*sub(A)*U , or L *sub(A)*L.
T T
sub(B) must have been previously factorized as U *U or L*L by p?potrf.
Input Parameters
ibtype (global) INTEGER. Must be 1 or 2 or 3.
If itype = 1, compute inv(UT)*sub(A)*inv(U), or
inv(L)*sub(A)*inv(LT);
If itype = 2 or 3, compute U*sub(A)*UT, or
LT*sub(A)*L.
uplo (global) CHARACTER. Must be 'U' or 'L'.
2034
If uplo = 'U', the upper triangle of sub(A) is stored and
T
sub (B) is factored as U *U.
If uplo = 'L', the lower triangle of sub(A) is stored and
T
sub (B) is factored as L*L .
n (global) INTEGER. The order of the matrices sub (A) and
sub (B) (n ≥ 0).
a (local)
REAL for pssygst
DOUBLE PRECISION for pdsygst.
(lld_a, LOCc(ja+n-1)). On entry, the array contains the
local pieces of the n-by-n symmetric distributed matrix
sub(A).
its strictly upper triangular part is not referenced.
b (local)
REAL for pssygst
(lld_b, LOCc(jb+n-1)). On entry, the array contains the
local pieces of the triangular factor from the Cholesky
factorization of sub (B) as returned by p?potrf.
2035
Output Parameters
a On exit, if info = 0, the transformed matrix, stored in the
same format as sub(A).
scale (global)
REAL for pssygst
Amount by which the eigenvalues should be scaled to
compensate for the scaling performed in this routine. At
present, scale is always returned as 1.0, it is returned here
to allow for future enhancement.
If info = 0, the execution is successful. If info < 0, if the
i-th argument is an array and the j-entry had an illegal
value, then info = -(i*100+j), if the i-th argument is
a scalar and had an illegal value, then info = -i.
p?hegst
Reduces a Hermitian-definite generalized
Syntax
call pchegst( ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale,
info)
call pzhegst( ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale,
info)
Description
The routine reduces complex Hermitian-definite generalized eigenproblems to the standard
form.
In the following sub(A) denotes A(ia:ia+n-1, ja:ja+n-1) and sub(B) denotes
B(ib:ib+n-1, jb:jb+n-1).
2036
sub(A)*x = λ*sub(B)*x,
H H
and sub(A) is overwritten by inv(U )*sub(A)*inv(U), or inv(L)*sub(A)*inv(L ).
sub(A)*sub(B)*x = λ*x, or sub(B)*sub(A)*x = λ*x,

H H
and sub(A) is overwritten by U*sub(A)*U , or L *sub(A)*L.
H H
sub(B) must have been previously factorized as U *U or L*L by p?potrf.
Input Parameters
If itype = 1, compute inv(UH)*sub(A)*inv(U), or
inv(L)*sub(A)*inv(LH);
If itype = 2 or 3, compute U*sub(A)*UH, or
LH*sub(A)*L.
If uplo = 'U', the upper triangle of sub(A) is stored and
sub (B) is factored as UH*U.
If uplo = 'L', the lower triangle of sub(A) is stored and
sub (B) is factored as L*LH.
n (global) INTEGER. The order of the matrices sub (A) and
sub (B) (n≥0).
a (local)
COMPLEX for pchegst
DOUBLE COMPLEX for pzhegst.
(lld_a, LOCc(ja+n-1)). On entry, the array contains the
local pieces of the n-by-n Hermitian distributed matrix
sub(A). If uplo = 'U', the leading n-by-n upper triangular
part of sub(A) contains the upper triangular part of the
matrix, and its strictly lower triangular part is not
referenced. If uplo = 'L', the leading n-by-n lower
triangular part of sub(A) contains the lower triangular part
of the matrix, and its strictly upper triangular part is not
referenced.
2037
b (local)
COMPLEX for pchegst
DOUBLE COMPLEX for pzhegst.
(lld_b, LOCc(jb+n-1)). On entry, the array contains the
local pieces of the triangular factor from the Cholesky
factorization of sub (B) as returned by p?potrf.
Output Parameters
a On exit, if info = 0, the transformed matrix, stored in the
same format as sub(A).
scale (global)
REAL for pchegst
DOUBLE PRECISION for pzhegst.
Amount by which the eigenvalues should be scaled to
compensate for the scaling performed in this routine. At
present, scale is always returned as 1.0, it is returned here
to allow for future enhancement.
If info = 0, the execution is successful. If info <0, if the
value, then info = -(i100+j), if the i-th argument is a
scalar and had an illegal value, then info = -i.
2038
Driver Routines
Table 6-8 lists ScaLAPACK driver routines available for solving systems of linear equations,
linear least-squares problems, standard eigenvalue and singular value problems, and generalized
symmetric definite eigenproblems.
Table 6-8 ScaLAPACK Driver Routines
Type of Problem Matrix type, storage scheme Driver
Linear equations general (partial pivoting) p?gesv (simple driver)p?gesvx
(expert driver)
general band (partial pivoting) p?gbsv (simple driver)
general band (no pivoting) p?dbsv (simple driver)
general tridiagonal (no pivoting) p?dtsv (simple driver)
symmetric/Hermitian p?posv (simple driver)p?posvx
positive-definite (expert driver)
symmetric/Hermitian p?pbsv (simple driver)
positive-definite, band
symmetric/Hermitian p?ptsv (simple driver)
positive-definite, tridiagonal
Linear least squares problem general m-by-n p?gels
Symmetric eigenvalue symmetric/Hermitian p?syev (simple driver) p?syevx
problem / p?heevx (expert driver)
Singular value general m-by-n p?gesvd
decomposition
Generalized symmetric symmetric/Hermitian, one matrix p?sygvx / p?hegvx (expert
definite eigenvalue problem also positive-definite driver)
p?gesv
equations with a square distributed matrix and
multiple right-hand sides.
Syntax
call psgesv(n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pdgesv(n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pcgesv(n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pzgesv(n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
2039
Description
The routine p?gesv computes the solution to a real or complex system of linear equations
sub(A)*X = sub(B), where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is an n-by-n distributed
matrix and X and sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1) are n-by-nrhs distributed matrices.
The LU decomposition with partial pivoting and row interchanges is used to factor sub(A) as
sub(A) = P*L*U, where P is a permutation matrix, L is unit lower triangular, and U is upper
triangular. L and U are stored in sub(A). The factored form of sub(A) is then used to solve the
system of equations sub(A)*X = sub(B).
Input Parameters
sub(A) (n ≥ 0).
nrhs (global) INTEGER. The number of right hand sides, that is,
the number of columns of the distributed submatrices B and
X (nrhs ≥ 0).
a, b (local)
REAL for psgesv
DOUBLE PRECISION for pdgesv
COMPLEX for pcgesv
DOUBLE COMPLEX for pzgesv.
On entry, the array a contains the local pieces of the n-by-n
distributed matrix sub(A) to be factored.
On entry, the array b contains the right hand side distributed
matrix sub(B).
array A indicating the first row and the first column of
sub(A), respectively.
2040
array B indicating the first row and the first column of
sub(B), respectively.
Output Parameters
a Overwritten by the factors L and U from the factorization
sub(A) = P*L*U; the unit diagonal elements of L are not
stored .
b Overwritten by the solution distributed matrix X.
The dimension of ipiv is (LOCr(m_a)+mb_a). This array
contains the pivoting information. The (local) row i of the
matrix was interchanged with the (global) row ipiv(i).
info < 0:
-i.
info > 0:
If info = k, U(ia+k-1,ja+k-1) is exactly zero. The
factorization has been completed, but the factor U is exactly
singular, so the solution could not be computed.
2041
p?gesvx
Uses the LU factorization to compute the solution
to the system of linear equations with a square
matrix A and multiple right-hand sides, and
provides error bounds on the solution.
Syntax
call psgesvx(fact, trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
ipiv, equed, r, c, b, ib, jb, descb, x, ix, jx, descx, rcond, ferr, berr,
call pdgesvx(fact, trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
call pcgesvx(fact, trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
work, lwork, rwork, lrwork, info)
call pzgesvx(fact, trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
work, lwork, rwork, lrwork, info)
Description
linear equations AX = B, where A denotes the n-by-n submatrix A(ia:ia+n-1, ja:ja+n-1),
B denotes the n-by-nrhs submatrix B(ib:ib+n-1, jb:jb+nrhs-1) and X denotes the n-by-nrhs
submatrix X(ix:ix+n-1, jx:jx+nrhs-1).
In the following description, af stands for the subarray af(iaf:iaf+n-1, jaf:jaf+n-1).
The routine p?gesvx performs the following steps:
1. If fact = 'E', real scaling factors R and C are computed to equilibrate the system:
trans = 'N': diag(R)*A*diag(C) *diag(C)-1*X = diag(R)*B
trans = 'T': (diag(R)*A*diag(C))T *diag(R)-1*X = diag(C)*B
2042
trans = 'C': (diag(R)*A*diag(C))H *diag(R)-1*X = diag(C)*B
if equilibration is used, A is overwritten by diag(R)*A*diag(C) and B by diag(R)*B (if
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration
if fact = 'E') as A = P L U, where P is a permutation matrix, L is a unit lower triangular
matrix, and U is upper triangular.
3. The factored form of A is used to estimate the condition number of the matrix A. If the
reciprocal of the condition number is less than relative machine precision, steps 4 - 6 are
skipped.
6. If equilibration was used, the matrix X is premultiplied by diag(C) (if trans = 'N') or
diag(R) (if trans = 'T' or 'C') so that it solves the original system before equilibration.
Input Parameters
fact (global) CHARACTER*1. Must be 'F', 'N', or 'E'.
Specifies whether or not the factored form of the matrix A
is supplied on entry, and if not, whether the matrix A should
be equilibrated before it is factored.
If fact = 'F' then, on entry, af and ipiv contain the
factored form of A. If equed is not 'N', the matrix A has
been equilibrated with scaling factors given by r and c.
Arrays a, af, and ipiv are not modified.
If fact = 'N', the matrix A is copied to af and factored.
If fact = 'E', the matrix A is equilibrated if necessary,
then copied to af and factored.
trans (global) CHARACTER*1. Must be 'N', 'T', or 'C'.
If trans = 'N', the system has the form A*X = B (No
transpose);
T
If trans = 'T', the system has the form A *X = B
(Transpose);
2043
H
If trans = 'C', the system has the form A *X = B
(Conjugate transpose);
of the submatrix A (n ≥ 0).
number of columns of the distributed submatrices B and X
(nrhs ≥ 0).
a, af, b, work (local)
REAL for psgesvx
DOUBLE PRECISION for pdgesvx
COMPLEX for pcgesvx
DOUBLE COMPLEX for pzgesvx.
a(lld_a,LOCc(ja+n-1)), af(lld_af,LOCc(ja+n-1)),
b(lld_b,LOCc(jb+nrhs-1)), work(lwork), respectively.
The array a contains the matrix A. If fact = 'F' and equed
is not 'N', then A must have been equilibrated by the scaling
factors in r and/or c.
The array af is an input argument if fact = 'F'. In this
case it contains on entry the factored form of the matrix A,
that is, the factors L and U from the factorization A = P*L*U
as computed by p?getrf. If equed is not 'N', then af is
the factored form of the equilibrated matrix A.
The array b contains on entry the matrix B whose columns
are the right-hand sides for the systems of equations.
work(*) is a workspace array. The dimension of work is
(lwork).
submatrix A(ia:ia+n-1, ja:ja+n-1), respectively.
array af indicating the first row and the first column of the
subarray af(iaf:iaf+n-1, jaf:jaf+n-1), respectively.
2044
submatrix B(ib:ib+n-1, jb:jb+nrhs-1), respectively.
The dimension of ipiv is (LOCr(m_a)+mb_a).
The array ipiv is an input argument if fact = 'F' .
On entry, it contains the pivot indices from the factorization
A = P*L*U as computed by p?getrf; (local) row i of the
matrix was interchanged with the (global) row ipiv(i).
This array must be aligned with A(ia:ia+n-1, *).
equed (global) CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.
equed is an input argument if fact = 'F' . It specifies the
form of equilibration that was done:
If equed = 'N', no equilibration was done (always true if
fact = 'N');
If equed = 'R', row equilibration was done, that is, A has
been premultiplied by diag(r);
If equed = 'C', column equilibration was done, that is, A
has been postmultiplied by diag(c);
If equed = 'B', both row and column equilibration was
done; A has been replaced by diag(r)*A*diag(c).
r, c (local) REAL for single precision flavors;
The array r contains the row scale factors for A, and the
array c contains the column scale factors for A. These arrays
are input arguments if fact = 'F' only; otherwise they
are output arguments. If equed = 'R' or 'B', A is
multiplied on the left by diag(r); if equed = 'N' or 'C', r
is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of
r must be positive.
2045
If equed = 'C' or 'B', A is multiplied on the right by

diag(c); if equed = 'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of
c must be positive. Array r is replicated in every process
column, and is aligned with the distributed matrix A. Array
c is replicated in every process row, and is aligned with the
submatrix X(ix:ix+n-1, jx:jx+nrhs-1), respectively.
lwork (local or global) INTEGER. The dimension of the array work
; must be at least max(p?gecon(lwork),
p?gerfs(lwork))+LOCr(n_a) .
iwork (local, psgesvx/pdgesvx only) INTEGER. Workspace array.
The dimension of iwork is (liwork).
liwork (local, psgesvx/pdgesvx only) INTEGER. The dimension of
the array iwork , must be at least LOCr(n_a) .
rwork (local) REAL for pcgesvx
DOUBLE PRECISION for pzgesvx.
Workspace array, used in complex flavors only.
The dimension of rwork is (lrwork).
lrwork (local or global, pcgesvx/pzgesvx only) INTEGER. The
dimension of the array rwork;must be at least 2*LOCc(n_a)
.
Output Parameters
x (local)
REAL for psgesvx
DOUBLE PRECISION for pdgesvx
COMPLEX for pcgesvx
DOUBLE COMPLEX for pzgesvx.
x(lld_x,LOCc(jx+nrhs-1)).
2046
If info = 0, the array x contains the solution matrix X to
the original system of equations. Note that A and B are
modified on exit if equed ≠ 'N', and the solution to the
equilibrated system is:
diag(C)-1*X, if trans = 'N' and equed = 'C' or 'B';
and diag(R)-1*X, if trans = 'T' or 'C' and equed =
'R' or 'B'.
a Array a is not modified on exit if fact = 'F' or 'N', or if
fact = 'E' and equed = 'N'.
If equed ≠ 'N', A is scaled on exit as follows:
equed = 'R': A = diag(R)*A
equed = 'B': A = diag(R)*A*diag(c)
af If fact = 'N' or 'E', then af is an output argument and
on exit returns the factors L and U from the factorization A
= P*L*U of the original matrix A (if fact = 'N') or of the
equilibrated matrix A (if fact = 'E'). See the description
of a for the form of the equilibrated matrix.
b Overwritten by diag(R)*B if trans = 'N' and equed =
'R' or 'B';
overwritten by diag(c)*B if trans = 'T' and equed =
'C' or 'B'; not changed if equed = 'N'.
r, c These arrays are output arguments if fact ≠ 'F'.
See the description of r, c in Input Arguments section.
An estimate of the reciprocal condition number of the matrix
A after equilibration (if done). The routine sets rcond =0 if
the estimate underflows; in this case the matrix is singular
(to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may
be poorly conditioned or even singular.
ferr, berr (local) REAL for single precision flavors
2047
Arrays, DIMENSION LOCc(n_b) each. Contain the

component-wise forward and relative backward errors,
respectively, for each solution vector.
Arrays ferr and berr are both replicated in every process
row, and are aligned with the matrices B and X.
ipiv If fact = 'N' or 'E', then ipiv is an output argument
and on exit contains the pivot indices from the factorization
A = P*L*U of the original matrix A (if fact = 'N') or of
the equilibrated matrix A (if fact = 'E').
equed If fact ≠ 'F' , then equed is an output argument. It
specifies the form of equilibration that was done (see the
description of equed in Input Arguments section).
work(1) If info=0, on exit work(1) returns the minimum value of
lwork required for optimum performance.
iwork(1) If info=0, on exit iwork(1) returns the minimum value of
liwork required for optimum performance.
rwork(1) If info=0, on exit rwork(1) returns the minimum value of
lrwork required for optimum performance.
info < 0: if the ith argument is an array and the jth entry
had an illegal value, then info = -(i*100+j); if the ith
-i. If info = i, and i ≤ n, then U(i,i) is exactly zero.
The factorization has been completed, but the factor U is
exactly singular, so the solution and error bounds could not
be computed. If info = i, and i = n +1, then U is
nonsingular, but rcond is less than machine precision. The
factorization has been completed, but the matrix is singular
to working precision and the solution and error bounds have
not been computed.
2048
p?gbsv
equations with a general banded distributed matrix
and multiple right-hand sides.
Syntax
call psgbsv(n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, work, lwork,
info)
call pdgbsv(n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, work, lwork,
info)
call pcgbsv(n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, work, lwork,
info)
call pzgbsv(n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, work, lwork,
info)
Description
The routine p?gbsv computes the solution to a real or complex system of linear equations
sub(A)*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is an n-by-n real/complex general banded distributed
matrix with bwl subdiagonals and bwu superdiagonals, and X and sub(B)= B(ib:ib+n-1,
1:rhs) are n-by-nrhs distributed matrices.
The LU decomposition with partial pivoting and row interchanges is used to factor sub(A) as
sub(A) = P*L*U*Q, where P and Q are permutation matrices, and L and U are banded lower
and upper triangular matrices, respectively. The matrix Q represents reordering of columns for
the sake of parallelism, while P represents reordering of rows for numerical stability using classic
partial pivoting.
Input Parameters
sub(A) (n ≥ 0).
2049
bwl (global) INTEGER. The number of subdiagonals within the

band of A (0≤ bwl ≤ n-1 ).
bwu (global) INTEGER. The number of superdiagonals within the
band of A (0≤ bwu ≤ n-1 ).
(nrhs ≥ 0).
a, b (local)
REAL for psgbsv
DOUBLE PRECISON for pdgbsv
COMPLEX for pcgbsv
DOUBLE COMPLEX for pzgbsv.
respectively.
On entry, the array a contains the local pieces of the global
array A.
On entry, the array b contains the right hand side distributed
matrix sub(B).
work (local)
2050
REAL for psgbsv
DOUBLE PRECISON for pdgbsv
COMPLEX for pcgbsv
DOUBLE COMPLEX for pzgbsv.
Workspace array of dimension (lwork).
be at least lwork ≥
(NB+bwu)*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu) +
+ max(nrhs *(NB+2*bwl+4*bwu), 1).
Output Parameters
a On exit, contains details of the factorization. Note that the
resulting factorization is not the same factorization as
returned from LAPACK. Additional permutations are
performed on the matrix for the sake of parallelism.
The dimension of ipiv must be at least desca(NB). This
array contains pivot indices for local factorizations. You
should not alter the contents between factorization and
solve.
0:
If the ith argument is an array and the j-th entry had an
info > 0:
info and factored locally was not nonsingular, and the
factorization was not completed. If info = k > NPROCS,
the submatrix stored on processor info-NPROCS
2051
p?dbsv
Solves a general band system of linear equations.
Syntax
call psdbsv(n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
call pddbsv(n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
call pcdbsv(n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
call pzdbsv(n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
Description
The routine solves the system of linear equations
A(1:n, ja:ja+n-1)* X = B(ib:ib+n-1, 1:nrhs),
where A(1:n, ja:ja+n-1) is an n-by-n real/complex banded diagonally dominant-like
distributed matrix with bandwidth bwl, bwu.
Gaussian elimination without pivoting is used to factor a reordering of the matrix into LU.
Input Parameters
n (global) INTEGER. The order of the distributed submatrix A,
(n ≥ 0).
bwl (global) INTEGER. Number of subdiagonals. 0 ≤ bwl ≤
n-1.
bwu (global) INTEGER. Number of subdiagonals. 0 ≤ bwu ≤
n-1.
nrhs (global) INTEGER. The number of right-hand sides; the
number of columns of the distributed submatrix B, (nrhs
≥ 0).
a (local). REAL for psdbsv
DOUBLE PRECISION for pddbsv
COMPLEX for pcdbsv
DOUBLE COMPLEX for pzdbsv.
2052
Pointer into the local memory to an array with first
dimension lld_a ≥ (bwl+bwu+1) (stored in desca). On
entry, this array contains the local pieces of the distributed
matrix.
ja (global) INTEGER. The index in the global array a that points
desca (global and local) INTEGER array of dimension dlen.
If 1d type (dtype_a=501 or 502), dlen ≥ 7;
If 2d type (dtype_a=1), dlen ≥ 9.
The array descriptor for the distributed matrix A.
Contains information of mapping of A to memory.
b (local)
REAL for psdbsv
DOUBLE PRECISON for pddbsv
COMPLEX for pcdbsv
Pointer into the local memory to an array of local lead
dimension lld_b ≥ nb. On entry, this array contains the
local pieces of the right hand sides B(ib:ib+n-1, 1:nrhs).
ib (global) INTEGER. The row index in the global array b that
may be either all of b or a submatrix of B).
descb (global and local) INTEGER array of dimension dlen.
If 1d type (dtype_b =502), dlen ≥ 7;
If 2d type (dtype_b =1), dlen ≥ 9.
The array descriptor for the distributed matrix B.
Contains information of mapping of B to memory.
work (local).
REAL for psdbsv
DOUBLE PRECISON for pddbsv
COMPLEX for pcdbsv
2053
Temporary workspace. This space may be overwritten in

between calls to routines. work must be the size given in
lwork.
lwork (local or global) INTEGER. Size of user-input workspace
work. If lwork is too small, the minimal acceptable size will
be returned in work(1) and an error code is returned.
lwork ≥
nb(bwl+bwu)+6max(bwl,bwu)*max(bwl,bwu)+max((max(bwl,bwu)nrhs),
max(bwl,bwu)*max(bwl,bwu))
Output Parameters
a On exit, this array contains information containing details
of the factorization.
Note that permutations are performed on the matrix, so
that the factors returned are different from those returned
by LAPACK.
b On exit, this contains the local piece of the solutions
work On exit, work(1) contains the minimal lwork.
info (local) INTEGER. If info=0, the execution is successful.
< 0: If the i-th argument is an array and the j-entry had
-i.
> 0: If info = k < NPROCS, the submatrix stored on
processor info and factored locally was not positive definite,
and the factorization was not completed.
processors was not positive definite, and the factorization
was not completed.
2054
p?dtsv
Solves a general tridiagonal system of linear
equations.
Syntax
call psdtsv(n, nrhs, dl, d, du, ja, desca, b, ib, descb, work, lwork, info)
call pddtsv(n, nrhs, dl, d, du, ja, desca, b, ib, descb, work, lwork, info)
call pcdtsv(n, nrhs, dl, d, du, ja, desca, b, ib, descb, work, lwork, info)
call pzdtsv(n, nrhs, dl, d, du, ja, desca, b, ib, descb, work, lwork, info)
Description
A(1:n, ja:ja+n-1) * X = B(ib:ib+n-1, 1:nrhs),
where A(1:n, ja:ja+n-1) is an n-by-n complex tridiagonal diagonally dominant-like distributed
matrix.
Gaussian elimination without pivoting is used to factor a reordering of the matrix into L U.
Input Parameters
n (global) INTEGER. The order of the distributed submatrix A
(n ≥ 0).
nrhs INTEGER. The number of right hand sides; the number of
columns of the distributed matrix B (nrhs ≥ 0).
dl (local). REAL for psdtsv
DOUBLE PRECISION for pddtsv
COMPLEX for pcdtsv
DOUBLE COMPLEX for pzdtsv.
Pointer to local part of global vector storing the lower
diagonal of the matrix. Globally, dl(1)is not referenced, and
dl must be aligned with d. Must be of size > desca( nb_ ).
d (local). REAL for psdtsv
2055
COMPLEX for pcdtsv

Pointer to local part of global vector storing the main
diagonal of the matrix.
du (local). REAL for psdtsv
COMPLEX for pcdtsv
Pointer to local part of global vector storing the upper
diagonal of the matrix. Globally, du(n) is not referenced,
and du must be aligned with d.
b (local)
REAL for psdtsv
DOUBLE PRECISONfor pddtsv
COMPLEX for pcdtsv
dimension lld_b > nb. On entry, this array contains the
If 1d type (dtype_b =502), dlen ≥ 7;
If 2d type (dtype_b =1), dlen ≥ 9.
2056
work (local).
REAL for psdtsv
DOUBLE PRECISON for pddtsv
COMPLEX for pcdtsv
DOUBLE COMPLEX for pzdtsv. Temporary workspace. This
space may be overwritten in between calls to routines. work
must be the size given in lwork.
be returned in work(1) and an error code is returned. lwork
> (12*NPCOL+3*nb)+max((10+2*min(100,
nrhs))*NPCOL+4*nrhs, 8*NPCOL)
Output Parameters
dl On exit, this array contains information containing the *
d On exit, this array contains information containing the *
factors of the matrix. Must be of size > desca( nb_ ).
du On exit, this array contains information containing the *
factors of the matrix. Must be of size > desca( nb_ ).
-i.
> 0: If info = k < NPROCS, the submatrix stored on
was not completed.
2057
p?posv
Solves a symmetric positive definite system of
linear equations.
Syntax
call psposv(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pdposv(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pcposv(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pzposv(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
Description
The routine computes the solution to a real/complex system of linear equations
sub(A)*X = sub(B),
where sub(A) denotes A(ia:ia+n-1,ja:ja+n-1) and is an n-by-n symmetric/Hermitian
distributed positive definite matrix and X and sub(B) denoting B(ib:ib+n-1,jb:jb+nrhs-1)
are n-by-nrhs distributed matrices. The Cholesky decomposition is used to factor sub(A) as
sub(A) = UT*U, if uplo = 'U', or

sub(A) = L*LT, if uplo = 'L',
sub(A) is then used to solve the system of equations.
Input Parameters
uplo (global). CHARACTER. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of
sub(A) is stored.
sub(A) (n ≥ 0).
columns of the distributed submatrix sub(B) (nrhs ≥ 0).
a (local)
2058
REAL for psposv
DOUBLE PRECISION for pdposv
COMPLEX for pcposv
COMPLEX*16 for pzposv.
the local pieces of the n-by-n symmetric distributed matrix
sub(A) to be factored.
referenced.
b (local)
REAL for psposv
DOUBLE PRECISON for pdposv
COMPLEX for pcposv
COMPLEX*16 for pzposv.
(lld_b,LOC(jb+nrhs-1)). On entry, the local pieces of
the right hand sides distributed matrix sub(B).
2059
Output Parameters
a On exit, if info = 0, this array contains the local pieces of
the factor U or L from the Cholesky factorization sub(A) =
H
UH*U, or L*L .
b On exit, if info = 0, sub(B) is overwritten by the solution
If info =0, the execution is successful.
j-entry had an illegal value, then info = -(i*100+j), if
the i-th argument is a scalar and had an illegal value, then
info = -i.
If info > 0: If info = k, the leading minor of order k,
A(ia:ia+k-1, ja:ja+k-1) is not positive definite, and
the factorization could not be completed, and the solution
has not been computed.
p?posvx
Solves a symmetric or Hermitian positive definite
system of linear equations.
Syntax
call psposvx(fact, uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
equed, sr, sc, b, ib, jb, descb, x, ix, jx, descx, rcond, ferr, berr, work,
call pdposvx(fact, uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
call pcposvx(fact, uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
call pzposvx(fact, uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf,
2060
Description
The routine uses the Cholesky factorization A=UT*U or A=L*LT to compute the solution to a real
or complex system of linear equations
A(ia:ia+n-1, ja:ja+n-1)*X = B(ib:ib+n-1, jb:jb+nrhs-1),
where A(ia:ia+n-1, ja:ja+n-1) is a n-by-n matrix and X and B(ib:ib+n-1,jb:jb+nrhs-1)
are n-by-nrhs matrices.
In the following comments y denotes Y(iy:iy+m-1, jy:jy+k-1) a m-by-k matrix where y
can be a, af, b and x.
The routine p?posvx performs the following steps:
diag(sr)*A*diag(sc)*inv(diag(sc))*X = diag(sr)*B
if equilibration is used, A is overwritten by diag(sr)*A*diag(sc) and B by diag(sr)*B .
A = UT*U, if uplo = 'U', or
A = L*LT, if uplo = 'L',
3. The factored form of A is used to estimate the condition number of the matrix A. If the
reciprocal of the condition number is less than machine precision, steps 4-6 are skipped
6. If equilibration was used, the matrix X is premultiplied by diag(sr) so that it solves the
Input Parameters
fact (global) CHARACTER. Must be 'F', 'N', or 'E'.
2061
Specifies whether or not the factored form of the matrix A

is supplied on entry, and if not, whether the matrix A should
be equilibrated before it is factored.
If fact = 'F': on entry, af contains the factored form of
A. If equed = 'Y', the matrix A has been equilibrated with
scaling factors given by s. a and af will not be modified.
If fact = 'N', the matrix A will be copied to af and
factored.
If fact = 'E', the matrix A will be equilibrated if necessary,
then copied to af and factored.
Indicates whether the upper or lower triangular part of A is
stored.
sub(A) (n ≥ 0).
number of columns of the distributed submatrices B and X.
(nrhs ≥ 0).
a (local)
REAL for psposvx
DOUBLE PRECISION for pdposvx
COMPLEX for pcposvx
DOUBLE COMPLEX for pzposvx.
(lld_a, LOCc(ja+n-1)). On entry, the
symmetric/Hermitian matrix A, except if fact = 'F' and
equed = 'Y', then A must contain the equilibrated matrix
diag(sr)*A*diag(sc).
strictly upper triangular part of A is not referenced. A is not
modified if fact = 'F' or 'N', or if fact = 'E' and equed
= 'N' on exit.
2062
af (local)
REAL for psposvx
COMPLEX for pcposvx
(lld_af, LOCc(ja+n-1)).
If fact = 'F', then af is an input argument and on entry
contains the triangular factor U or L from the Cholesky
factorization A = UT*U or A = L*LT, in the same storage
format as A. If equed ≠ 'N', then af is the factored form
of the equilibrated matrix diag(sr)*A*diag(sc).
array af indicating the first row and the first column of the
submatrix AF, respectively.
equed (global). CHARACTER. Must be 'N' or 'Y'.
equed is an input argument if fact = 'F'. It specifies the
form of equilibration that was done:
If equed = 'N', no equilibration was done (always true if
fact = 'N');
If equed = 'Y', equilibration was done and A has been
replaced by diag(sr)*A*diag(sc).
sr (local)
REAL for psposvx
COMPLEX for pcposvx
Array, DIMENSION (lld_a).
2063
The array s contains the scale factors for A. This array is an

input argument if fact = 'F' only; otherwise it is an output
argument.
If fact = 'F' and equed = 'Y', each element of s must
be positive.
b (local)
REAL for psposvx
COMPLEX for pcposvx
(lld_b, LOCc(jb+nrhs-1)). On entry, the n-by-nrhs
right-hand side matrix B.
descb (global and local) INTEGER. Array, dimension (dlen_). The
x (local)
REAL for psposvx
COMPLEX for pcposvx
(lld_x, LOCc(jx+nrhs-1)).
array x indicating the first row and the first column of the
submatrix X, respectively.
work (local)
REAL for psposvx
COMPLEX for pcposvx
2064
The dimension of the array work. lwork is local input and
must be at least lwork = max(p?pocon(lwork),
p?porfs(lwork)) + LOCr(n_a).
lwork = 3*desca(lld_).
iwork (local) INTEGER. Workspace array, dimension (liwork).
liwork (local or global)
INTEGER. The dimension of the array iwork. liwork is local
input and must be at least liwork = desca(lld_) liwork
= LOCr(n_a).
by pxerbla.
Output Parameters
a On exit, if fact = 'E' and equed = 'Y', a is overwritten
by diag(sr)*a*diag(sc).
af If fact = 'N', then af is an output argument and on exit
returns the triangular factor U or L from the Cholesky
factorization A = UT*U or A = L*LT of the original matrix
A.
If fact = 'E', then af is an output argument and on exit
returns the triangular factor U or L from the Cholesky
factorization A = UT*U or A = L*LT of the equilibrated
matrix A (see the description of A for the form of the
equilibrated matrix).
equed If fact ≠ 'F' , then equed is an output argument. It
specifies the form of equilibration that was done (see the
description of equed in Input Arguments section).
2065
sr This array is an output argument if fact ≠ 'F'.

See the description of sr in Input Arguments section.
sc This array is an output argument if fact ≠ 'F'.
See the description of sc in Input Arguments section.
b On exit, if equed = 'N', b is not modified; if trans = 'N'
and equed = 'R' or 'B', b is overwritten by diag(r)*b;
if trans = 'T' or 'C' and equed = 'C' or 'B', b is
overwritten by diag(c)*b.
x (local)
REAL for psposvx
COMPLEX for pcposvx
If info = 0 the n-by-nrhs solution matrix X to the original
system of equations.
Note that A and B are modified on exit if equed ≠ 'N', and
the solution to the equilibrated system is
inv(diag(sc))*X if trans = 'N' and equed = 'C' or
'B', or
inv(diag(sr))*X if trans = 'T' or 'C' and equed =
'R' or 'B'.
rcond (global)
REAL for single precision flavors.
An estimate of the reciprocal condition number of the matrix
A after equilibration (if done). If rcond is less than the
machine precision (in particular, if rcond=0), the matrix is
singular to working precision. This condition is indicated by
a return code of info > 0.
ferr REAL for single precision flavors.
Arrays, DIMENSION at least max(LOC,n_b). The estimated
forward error bounds for each solution vector X(j) (the j-th
column of the solution matrix X). If xtrue is the true
solution, ferr(j) bounds the magnitude of the largest entry
in (X(j) - xtrue) divided by the magnitude of the largest
2066
entry in X(j). The quality of the error bound depends on
the quality of the estimate of norm(inv(A)) computed in
the code; if the estimate of norm(inv(A)) is accurate, the
error bound is guaranteed.
berr (local)
REAL for single precision flavors.
Arrays, DIMENSION at least max(LOC,n_b). The
componentwise relative backward error of each solution
vector X(j) (the smallest relative change in any entry of A
or B that makes X(j) an exact solution).
work(1) (local) On exit, work(1) returns the minimal and optimal
liwork.
> 0: if info = i, and i is ≤ n: if info = i, the leading
minor of order i of a is not positive definite, so the
factorization could not be completed, and the solution and
error bounds could not be computed.
= n+1: rcond is less than machine precision. The
factorization has been completed, but the matrix is singular
to working precision, and the solution and error bounds
have not been computed.
p?pbsv
Solves a symmetric/Hermitian positive definite
banded system of linear equations.
Syntax
call pspbsv(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
call pdpbsv(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
call pcpbsv(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
call pzpbsv(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, work, lwork, info)
2067
Description
A(1:n, ja:ja+n-1)*X = B(ib:ib+n-1, 1:nrhs),
where A(1:n, ja:ja+n-1) is an n-by-n real/complex banded symmetric positive definite
distributed matrix with bandwidth bw.
Cholesky factorization is used to factor a reordering of the matrix into L*L'.
Input Parameters
Indicates whether the upper or lower triangular of A is
stored.
If uplo = 'U', the upper triangular A is stored
If uplo = 'L', the lower triangular of A is stored.
n (global) INTEGER. The order of the distributed matrix A (n
≥ 0).
bw (global) INTEGER. The number of subdiagonals in L or U. 0
≤ bw ≤ n-1.
number of columns in B (nrhs ≥ 0).
a (local). REAL for pspbsv
DOUBLE PRECISON for pdpbsv
COMPLEX for pcpbsv
DOUBLE COMPLEX for pzpbsv.
dimension lld_a ≥ (bw+1) (stored in desca). On entry,
this array contains the local pieces of the distributed matrix
sub(A) to be factored.
2068
b (local)
REAL for pspbsv
COMPLEX for pcpbsv
dimension lld_b ≥ nb. On entry, this array contains the
If 1D type (dtype_b =502), dlen ≥ 7;
If 2D type (dtype_b =1), dlen ≥ 9.
work (local).
REAL for pspbsv
COMPLEX for pcpbsv
lwork.
be returned in work(1)and an error code is returned. lwork
≥ (nb+2*bw)*bw +max((bw*nrhs), bw*bw)
Output Parameters
a On exit, this array contains information containing details
of the factorization. Note that permutations are performed
on the matrix, so that the factors returned are different
from those returned by LAPACK.
b On exit, contains the local piece of the solutions distributed
matrix X.
2069

info (global). INTEGER. If info=0, the execution is successful.
-i.
> 0: If info = k ≤ NPROCS, the submatrix stored on
was not completed.
p?ptsv
Solves a symmetric or Hermitian positive definite
tridiagonal system of linear equations.
Syntax
call psptsv(n, nrhs, d, e, ja, desca, b, ib, descb, work, lwork, info)
call pdptsv(n, nrhs, d, e, ja, desca, b, ib, descb, work, lwork, info)
call pcptsv(n, nrhs, d, e, ja, desca, b, ib, descb, work, lwork, info)
call pzptsv(n, nrhs, d, e, ja, desca, b, ib, descb, work, lwork, info)
Description
A(1:n, ja:ja+n-1)*X = B(ib:ib+n-1, 1:nrhs),
where A(1:n, ja:ja+n-1) is an n-by-n real tridiagonal symmetric positive definite distributed
matrix.
Cholesky factorization is used to factor a reordering of the matrix into L*L'.
2070
Input Parameters
n (global) INTEGER. The order of matrix A (n ≥ 0).
number of columns of the distributed submatrix B (nrhs ≥
0).
d (local)
REAL for psptsv
DOUBLE PRECISON for pdptsv
COMPLEX for pcptsv
DOUBLE COMPLEX for pzptsv.
e (local)
REAL for psptsv
COMPLEX for pcptsv
diagonal of the matrix. Globally, du(n) is not referenced,
and du must be aligned with d.
b (local)
REAL for psptsv
COMPLEX for pcptsv
dimension lld_b ≥ nb.
2071
On entry, this array contains the local pieces of the right

hand sides B(ib:ib+n-1, 1:nrhs).
If 1d type (dtype_b = 502), dlen ≥ 7;
If 2d type (dtype_b = 1), dlen ≥ 9.
work (local).
REAL for psptsv
COMPLEX for pcptsv
lwork.
be returned in work(1) and an error code is returned. lwork
> (12*NPCOL+3*nb)+max((10+2*min(100,
nrhs))*NPCOL+4*nrhs, 8*NPCOL).
Output Parameters
d On exit, this array contains information containing the
factors of the matrix. Must be of size greater than or equal
to desca(nb_).
e On exit, this array contains information containing the
factors of the matrix. Must be of size greater than or equal
to desca(nb_).
2072
-i.
> 0: If info = k ≤ NPROCS, the submatrix stored on
was not completed.
p?gels
Solves overdetermined or underdetermined linear
systems involving a matrix of full rank.
Syntax
call psgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork,
info)
call pdgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork,
info)
call pcgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork,
info)
call pzgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork,
info)
Description
The routine solves overdetermined or underdetermined real/ complex linear systems involving
an m-by-n matrix sub(A) = A(ia:ia+m-1,ja:ja+n-1), or its transpose/ conjugate-transpose,
using a QTQ or LQ factorization of sub(A). It is assumed that sub(A) has full rank.
The following options are provided:
1. If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system,
2073
minimize ||sub(B) - sub(A)*X||

2. If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system
sub(A)*X = sub(B).
3. If trans = 'T' and m ≥ n: find the minimum norm solution of an undetermined system
sub(A)T*X = sub(B).
4. If trans = 'T' and m < n: find the least squares solution of an overdetermined system,
minimize ||sub(B) - sub(A)T*X||,
where sub(B) denotes B(ib:ib+m-1, jb:jb+nrhs-1) when trans = 'N' and
B(ib:ib+n-1, jb:jb+nrhs-1) otherwise. Several right hand side vectors b and solution
vectors x can be handled in a single call; when trans = 'N', the solution vectors are stored
as the columns of the n-by-nrhs right hand side matrix sub(B) and the m-by-nrhs right hand
side matrix sub(B) otherwise.
Input Parameters
trans (global) CHARACTER. Must be 'N', or 'T'.
If trans = 'N', the linear system involves matrix sub(A);
If trans = 'T', the linear system involves the transposed
T
matrix A (for real flavors only).
submatrix sub (A) (m ≥ 0).
submatrix sub (A) (n ≥ 0).
number of columns in the distributed submatrices sub(B)
and X. (nrhs ≥ 0).
a (local)
REAL for psgels
DOUBLE PRECISION for pdgels
COMPLEX for pcgels
DOUBLE COMPLEX for pzgels.
2074
(lld_a, LOCc(ja+n-1)). On entry, contains the m-by-n
matrix A.
b (local)
REAL for psgels
COMPLEX for pcgels
(lld_b, LOCc(jb+nrhs-1)). On entry, this array contains
the local pieces of the distributed matrix B of right-hand
side vectors, stored columnwise; sub(B) is m-by-nrhs if
trans='N', and n-by-nrhs otherwise.
work (local)
REAL for psgels
COMPLEX for pcgels
The dimension of the array work lwork is local input and
must be at least lwork ≥ ltau + max(lwf, lws), where
if m > n, then
ltau = numroc(ja+min(m,n)-1, nb_a, MYCOL,
csrc_a, NPCOL),
lwf = nb_a*(mpa0 + nqa0 + nb_a)
2075
lws = max((nb_a*(nb_a-1))/2, (nrhsqb0 +

mpb0)*nb_a) + nb_a*nb_a
else
ltau = numroc(ia+min(m,n)-1, mb_a, MYROW,
rsrc_a, NPROW),
lwf = mb_a * (mpa0 + nqa0 + mb_a)
lws = max((mb_a*(mb_a-1))/2, (npb0 + max(nqa0 +
numroc(numroc(n+iroffb, mb_a, 0, 0, NPROW),
mb_a, 0, 0, lcmp), nrhsqb0))*mb_a) + mb_a*mb_a
end if,
where lcmp = lcm/NPROW with lcm = ilcm(NPROW,
NPCOL),
iacol= indxg2p(ja, nb_a, MYROW, rsrc_a, NPROW)
NPROW),
NPCOL),
ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW),
ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL),
mpb0 = numroc(m+iroffb, mb_b, MYROW, icrow,
NPROW),
nqb0 = numroc(n+icoffb, nb_b, MYCOL, ibcol,
NPCOL),
MYROW, MYCOL, NPROW, and NPCOL can be determined by
2076
Output Parameters
a On exit, If m ≥ n, sub(A) is overwritten by the details of its
QR factorization as returned by p?geqrf; if m < n, sub(A)
is overwritten by details of its LQ factorization as returned
by p?gelqf.
b On exit, sub(B) is overwritten by the solution vectors, stored
columnwise: if trans = 'N' and m ≥ n, rows 1 to n of
sub(B) contain the least squares solution vectors; the
residual sum of squares for the solution in each column is
given by the sum of squares of elements n+1 to m in that
column;
If trans = 'N' and m < n, rows 1 to n of sub(B) contain
the minimum norm solution vectors;
If trans = 'T' and m ≥ n, rows 1 to m of sub(B) contain
the minimum norm solution vectors; if trans = 'T' and
m < n, rows 1 to m of sub(B) contain the least squares
solution vectors; the residual sum of squares for the solution
in each column is given by the sum of squares of elements
m+1 to n in that column.
-i.
2077
p?syev
of a symmetric matrix.
Syntax
call pssyev( jobz, uplo, n, a, ia, ja, desca, w, z, iz, jz, descz, work, lwork,
info )
call pdsyev( jobz, uplo, n, a, ia, ja, desca, w, z, iz, jz, descz, work, lwork,
info )
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric matrix
A by calling the recommended sequence of ScaLAPACK routines.
In its present form, the routine assumes a homogeneous system and makes no checks for
consistency of the eigenvalues or eigenvectors across the different processes. Because of this,
it is possible that a heterogeneous system may return incorrect results without any error
messages.
Input Parameters
np = the number of rows local to a given process.
nq = the number of columns local to a given process.
jobz (global). CHARACTER. Must be 'N' or 'V'. Specifies if it is

necessary to compute the eigenvectors:
If jobz ='N', then only eigenvalues are computed.
If jobz ='V', then eigenvalues and eigenvectors are
computed.
uplo (global). CHARACTER. Must be 'U' or 'L'. Specifies whether
the upper or lower triangular part of the symmetric matrix
A is stored:
n (global) INTEGER. The number of rows and columns of the
matrix A (n ≥ 0).
2078
a (local)
REAL for pssyev.
DOUBLE PRECISION for pdsyev.
Block cyclic array of global dimension (n, n) and local
dimension (lld_a, LOC c(ja+n-1)). On entry, the
symmetric matrix A.
If uplo = 'U', only the upper triangular part of A is used
to define the elements of the symmetric matrix.
If uplo = 'L', only the lower triangular part of A is used
work (local)
REAL for pssyev.
DOUBLE PRECISION for pdsyev.
Array, DIMENSION (lwork).
lwork (local) INTEGER. See below for definitions of variables used
to define lwork.
If no eigenvectors are requested (jobz = 'N'), then lwork
≥ 5*n + sizesytrd + 1,
where sizesytrd is the workspace for p?sytrd and is
max(NB*(np +1), 3*NB).
If eigenvectors are requested (jobz = 'V') then the
amount of workspace required to guarantee that all
eigenvectors are computed is:
qrmem = 2*n-2
lwmin = 5*n + n*ldc + max(sizemqrleft, qrmem) +
1
Variable definitions:
2079
nb = desca(mb_) = desca(nb_) = descz(mb_) =

descz(nb_);
nn = max(n, nb, 2);
desca(rsrc_) = desca(rsrc_) = descz(rsrc_) =
descz(csrc_) = 0
np = numroc(nn, nb, 0, 0, NPROW)
nq = numroc(max(n, nb, 2), nb, 0, 0, NPCOL)
nrc = numroc(n, nb, myprowc, 0, NPROCS)
ldc = max(1, nrc)
sizemqrleft is the workspace for p?ormtr when its side
argument is 'L'.
myprowc is defined when a new context is created as follows:
call blacs_get(desca(ctxt_), 0, contextc)
call blacs_gridinit(contextc, 'R', NPROCS, 1)
call blacs_gridinfo(contextc, nprowc, npcolc,
myprowc, mypcolc)
Output Parameters
a On exit, the lower triangle (if uplo='L') or the upper
triangle (if uplo='U') of A, including the diagonal, is
destroyed.
w (global). REAL for pssyev
DOUBLE PRECISION for pdsyev
On normal exit, the first m entries contain the selected
eigenvalues in ascending order.
z (local). REAL for pssyev
DOUBLE PRECISION for pdsyev
Array, global dimension (n, n), local dimension (lld_z,
LOCc(jz+n-1)). If jobz = 'V', then on normal exit the
first m columns of z contain the orthonormal eigenvectors
of the matrix corresponding to the selected eigenvalues.
2080
work(1) On output, work(1) returns the workspace needed to
guarantee completion. If the input parameters are incorrect,
work(1) may also be incorrect.
If jobz = 'N' work(1) = minimal (optimal) amount of
workspace
If jobz = 'V' work(1) = minimal workspace required to
generate all the eigenvectors.
j-entry had an illegal value, then info = -(i*100+j), if
the i-th argument is a scalar and had an illegal value, then
info = -i.
If info > 0:
If info= 1 through n, the i-th eigenvalue did not converge
in ?steqr2 after a total of 30n iterations.
If info= n+1, then p?syev has detected heterogeneity by
finding that eigenvalues were not identical across the
process grid. In this case, the accuracy of the results from
p?syev cannot be guaranteed.
p?syevx
eigenvectors of a symmetric matrix.
Syntax
call pssyevx(jobz, range, uplo, n, a, ia, ja, desca, vl, vu, il, iu, abstol,
m, nz, w, orfac, z, iz, jz, descz, work, lwork, iwork, liwork, ifail, iclustr,
gap, info)
call pdsyevx(jobz, range, uplo, n, a, ia, ja, desca, vl, vu, il, iu, abstol,
m, nz, w, orfac, z, iz, jz, descz, work, lwork, iwork, liwork, ifail, iclustr,
gap, info)
Description
2081
matrix A by calling the recommended sequence of ScaLAPACK routines. Eigenvalues and
Input Parameters
jobz (global). CHARACTER*1. Must be 'N' or 'V'. Specifies if it

is necessary to compute the eigenvectors:
computed.
range (global). CHARACTER*1. Must be 'A', 'V', or 'I'.
[vl, vu] will be found.
iu will be found.
uplo (global). CHARACTER*1. Must be 'U' or 'L'.
symmetric matrix A is stored:
matrix A (n ≥ 0).
a (local). REAL for pssyevx
DOUBLE PRECISION for pdsyevx.
dimension (lld_a, LOCc(ja+n-1)). On entry, the
symmetric matrix A.
2082
vl, vu (global)
REAL for pssyevx
range = 'A' or 'I'.
il, iu (global) INTEGER.
If range ='I', the indices of the smallest and largest
eigenvalues to be returned.
Constraints: il ≥ 1
min(il,n) ≤ iu ≤ n
Not referenced if range = 'A' or 'V'.
abstol (global). REAL for pssyevx
If jobz='V', setting abstol to p?lamch(context, 'U')
yields the most orthogonal eigenvectors.
The absolute error tolerance for the eigenvalues. An
approximate eigenvalue is accepted as converged when it
is determined to lie in an interval [a, b] of width less than
or equal to
abstol + eps * max(|a|,|b|),
where eps is the machine precision. If abstol is less than
or equal to zero, then eps*norm(T) will be used in its place,
where norm(T) is the 1-norm of the tridiagonal matrix
obtained by reducing A to tridiagonal form.
is set to twice the underflow threshold 2*p?lamch('S')
not zero. If this routine returns with
((mod(info,2).ne.0).or. * (mod(info/8,2).ne.0)),
indicating that some eigenvalues or eigenvectors did not
converge, try setting abstol to 2*p?lamch('S').
2083
orfac (global). REAL for pssyevx

Specifies which eigenvectors should be reorthogonalized.
Eigenvectors that correspond to eigenvalues which are within
tol=orfac*norm(A)of each other are to be
reorthogonalized. However, if the workspace is insufficient
(see lwork), tol may be decreased until all eigenvectors
to be reorthogonalized can be stored in one process. No
reorthogonalization will be done if orfac equals zero. A
default value of 1.0e-3 is used if orfac is negative. orfac
should be identical on all processes.
array descriptor for the distributed matrix Z.descz(ctxt_)
must equal desca(ctxt_).
work (local)
REAL for pssyevx.
See below for definitions of variables used to define lwork.
≥ 5*n + max(5*nn, NB*(np0 + 1)).
If eigenvectors are requested (jobz = 'V'), then the
lwork ≥ 5*n + max(5*nn, np0*mq0 + 2*NB*NB) +
iceil(neig, NPROW*NPCOL)*nn
The computed eigenvectors may not be orthogonal if the
minimal workspace is supplied and orfac is too small. If
you want to guarantee orthogonality (at the cost of
potentially poor performance) you should add the following
to lwork:
(clustersize-1)*n,
2084
where clustersize is the number of eigenvalues in the
largest cluster, where a cluster is defined as a set of close
eigenvalues:
{w(k),..., w(k+clustersize-1)| w(j+1) ≤ w(j)) +
orfac*2*norm(A)},
where
neig = number of eigenvectors requested
descz(nb_);
nn = max(n, nb, 2);
desca(rsrc_) = desca(nb_) = descz(rsrc_) =
descz(csrc_) = 0;
np0 = numroc(nn, nb, 0, 0, NPROW);
mq0 = numroc(max(neig, nb, 2), nb, 0, 0, NPCOL)
iceil(x, y) is a ScaLAPACK function returning ceiling(x/y)
If lwork is too small to guarantee orthogonality, p?syevx
attempts to maintain orthogonality in the clusters with the
smallest spacing between the eigenvalues.
If lwork is too small to compute all the eigenvectors
requested, no computation is performed and info= -23 is
returned.
Note that when range='V', number of requested
eigenvectors are not known until the eigenvalues are
computed. In this case and if lwork is large enough to
compute the eigenvalues, p?sygvx computes the
eigenvalues and as many eigenvectors as possible.
Relationship between workspace, orthogonality &
performance:
Greater performance can be achieved if adequate workspace
is provided. In some situations, performance can decrease
as the provided workspace increases above the workspace
amount shown below:
lwork ≥ max(lwork, 5*n + nsytrd_lwopt),
where lwork, as defined previously, depends upon the
number of eigenvectors requested, and
nsytrd_lwopt = n + 2*(anb+1)*(4*nps+2) + (nps +
3)*nps;
2085
anb = pjlaenv(desca(ctxt_), 3, 'p?syttrd', 'L',

0, 0, 0, 0);
sqnpc = int(sqrt(dble(NPROW * NPCOL)));
nps = max(numroc(n, 1, 0, 0, sqnpc), 2*anb);
numroc is a ScaLAPACK tool functions;
pjlaenv is a ScaLAPACK environmental inquiry function
For large n, no extra workspace is needed, however the
biggest boost in performance comes for small n, so it is wise
to provide the extra workspace (typically less than a
megabyte per process).
If clustersize > n/sqrt(NPROW*NPCOL), then providing
enough space to compute all the eigenvectors orthogonally
will cause serious degradation in performance. At the limit
(that is, clustersize = n-1) p?stein will perform no
better than ?stein on single processor.
For clustersize = n/sqrt(NPROW*NPCOL)
reorthogonalizing all eigenvectors will increase the total
execution time by a factor of 2 or more.
For clustersize > n/sqrt(NPROW*NPCOL) execution time
will grow as the square of the cluster size, all other factors
remaining equal and assuming enough workspace. Less
workspace means less reorthogonalization but faster
execution.
query is assumed; the routine only calculates the size
required for optimal performance for all work arrays. Each
of these values is returned in the first entry of the
corresponding work arrays, and no error message is issued
by pxerbla.
iwork (local) INTEGER. Workspace array.
liwork (local) INTEGER, dimension of iwork. liwork ≥ 6*nnp
Where: nnp = max(n, NPROW*NPCOL + 1, 4)
2086
by pxerbla.
Output Parameters
triangle (if uplo = 'U')of A, including the diagonal, is
overwritten.
m (global) INTEGER. The total number of eigenvalues found;
0 ≤ m ≤ n.
nz (global) INTEGER. Total number of eigenvectors computed.
0 ≤ nz ≤ m.
The number of columns of z that are filled.
If jobz ≠ 'V', nz is not referenced.
If jobz = 'V', nz = m unless the user supplies insufficient
space and p?syevx is not able to detect this before
beginning computation. To get all the eigenvectors
requested, the user must supply both sufficient space to
hold the eigenvectors in z (m.le.descz(n_)) and sufficient
workspace to compute them. (See lwork). p?syevx is
always able to detect insufficient space without computation
unless range.eq.'V'.
w (global). REAL for pssyevx
selected eigenvalues in ascending order.
z (local). REAL for pssyevx
LOCc(jz+n-1)).
If jobz = 'V', then on normal exit the first m columns of
z contain the orthonormal eigenvectors of the matrix
corresponding to the selected eigenvalues. If an eigenvector
fails to converge, then that column of z contains the latest
approximation to the eigenvector, and the index of the
eigenvector is returned in ifail.
2087

work(1) On exit, returns workspace adequate workspace to allow
optimal performance.
iwork(1) On return, iwork(1) contains the amount of integer
workspace required.
ifail (global) INTEGER.
If jobz = 'V', then on normal exit, the first m elements of
ifail are zero. If (mod(info,2). ne.0) on exit, then
ifail contains the indices of the eigenvectors that failed
to converge.
iclustr (global) INTEGER. Array, DIMENSION (2*NPROW*NPCOL)
This array contains indices of eigenvectors corresponding
to a cluster of eigenvalues that could not be reorthogonalized
due to insufficient workspace (see lwork, orfac and info).
Eigenvectors corresponding to clusters of eigenvalues
indexed iclustr(2*i-1) to iclustr(2*i), could not
be reorthogonalized due to lack of workspace. Hence the
eigenvectors corresponding to these clusters may not be
orthogonal. iclustr() is a zero terminated array.
(iclustr(2*k).ne.0. and. iclustr(2*k+1).eq.0)
if and only if k is the number of clusters.
iclustr is not referenced if jobz = 'N'.
gap (global)
REAL for pssyevx
Array, DIMENSION (NPROW*NPCOL)
This array contains the gap between eigenvalues whose
eigenvectors could not be reorthogonalized. The output
values in this array correspond to the clusters indicated by
the array iclustr. As a result, the dot product between
eigenvectors corresponding to the ith cluster may be as
high as (C*n)/gap(i) where C is a small constant.
If info < 0:
2088
If the i-th argument is an array and the j-entry had an
illegal value, then info = -(i*100+j), if the i-th
-i.
If info > 0: if (mod(info,2).ne.0), then one or more
eigenvectors failed to converge. Their indices are stored in
ifail. Ensure abstol=2.0*p?lamch('U').
If (mod(info/2,2).ne.0), then eigenvectors corresponding
to one or more clusters of eigenvalues could not be
reorthogonalized because of insufficient workspace.The
indices of the clusters are stored in the array iclustr.
If (mod(info/4,2).ne.0), then space limit prevented
p?syevxf rom computing all of the eigenvectors between
vl and vu. The number of eigenvectors computed is returned
in nz.
If (mod(info/8,2).ne.0), then p?stebz failed to compute
eigenvalues. Ensure abstol=2.0*p?lamch('U').
p?heevx
Syntax
call pcheevx( jobz, range, uplo, n, a, ia, ja, desca, vl, vu, il, iu, abstol,
m, nz, w, orfac, z, iz, jz, descz, work, lwork, rwork, lrwork, iwork, liwork,
ifail, iclustr, gap, info )
call pzheevx( jobz, range, uplo, n, a, ia, ja, desca, vl, vu, il, iu, abstol,
m, nz, w, orfac, z, iz, jz, descz, work, lwork, rwork, lrwork, iwork, liwork,
ifail, iclustr, gap, info )
Description
matrix A by calling the recommended sequence of ScaLAPACK routines. Eigenvalues and
2089
Input Parameters
jobz (global). CHARACTER*1. Must be 'N' or 'V'.

Specifies if it is necessary to compute the eigenvectors:
computed.
range (global). CHARACTER*1. Must be 'A', 'V', or 'I'.
[vl, vu] will be found.
iu will be found.
matrix A (n ≥ 0).
a (local).
COMPLEX for pcheevx
DOUBLE COMPLEX for pzheevx.
dimension (lld_a, LOC c(ja+n-1)). On entry, the
Hermitian matrix A.
to define the elements of the Hermitian matrix.
2090
desca(ctxt_) is incorrect, p?heevx cannot guarantee
correct error reporting
vl, vu (global)
REAL for pcheevx
DOUBLE PRECISION for pzheevx.
to be searched for eigenvalues; not referenced if range =
'A' or 'I'.
il, iu (global)
INTEGER. If range ='I', the indices of the smallest and
largest eigenvalues to be returned.
Constraints:
il ≥ 1; min(il,n) ≤ iu ≤ n.
Not referenced if range = 'A' or 'V'.
abstol (global).
REAL for pcheevx
is determined to lie in an interval [a, b] of width less than
or equal to abstol+eps*max(|a|,|b|), where eps is the
machine precision. If abstol is less than or equal to zero,
then eps*norm(T) will be used in its place, where norm(T)
is the 1-norm of the tridiagonal matrix obtained by reducing
A to tridiagonal form.
Eigenvalues are computed most accurately when abstol is
set to twice the underflow threshold 2*p?lamch('S'), not
zero. If this routine returns with
((mod(info,2).ne.0).or.(mod(info/8,2).ne.0)),
orfac (global). REAL for pcheevx
2091

tol=orfac*norm(A) of each other are to be
default value of 1.0e-3 is used if orfac is negative.
orfac should be identical on all processes.
array descriptor for the distributed matrix Z.descz( ctxt_
) must equal desca( ctxt_ ).
work (local)
COMPLEX for pcheevx
lwork (local). INTEGER. The dimension of the array work.
If only eigenvalues are requested:
lwork ≥ n + max(nb*(np0 + 1), 3)
If eigenvectors are requested:
lwork ≥ n + (np0+mq0+nb)*nb
with nq0 = numroc(nn, nb, 0, 0, NPCOL).
lwork ≥ 5*n + max(5*nn, np0*mq0+2*nb*nb) +
For optimal performance, greater workspace is needed, that
is
lwork ≥ max(lwork, nhetrd_lwork)
where lwork is as defined above, and nhetrd_lwork = n
+ 2*(anb+1)*(4*nps+2) + (nps+1)*nps
ictxt = desca(ctxt_)
anb = pjlaenv(ictxt, 3, 'pchettrd', 'L', 0, 0,
0, 0)
sqnpc = sqrt(dble(NPROW * NPCOL))
2092
nps = max(numroc(n, 1, 0, 0, sqnpc), 2*anb)
by pxerbla.
rwork (local)
REAL for pcheevx
Workspace array, DIMENSION (lrwork).
lrwork (local) INTEGER. The dimension of the array work.
See below for definitions of variables used to define lwork.
If no eigenvectors are requested (jobz = 'N'), then
lrwork ≥ 5*nn+4*n.
lrwork ≥ 4*n + max(5*nn, np0*mq0+2*nb*nb) +
values to lrwork:
(clustersize-1)*n,
eigenvalues:
{w(k),..., w(k+clustersize-1)|w(j+1) ≤
w(j)+orfac*2*norm(A)}.
neig = number of eigenvectors requested;
descz(nb_);
nn = max(n, NB, 2);
2093

descz(csrc_) = 0;
mq0 = numroc(max(neig, nb, 2), nb, 0, 0, NPCOL);
When lrwork is too small:
If lwork is too small to guarantee orthogonality, p?heevx
smallest spacing between the eigenvalues. If lwork is too
small to compute all the eigenvectors requested, no
computation is performed and info= -23 is returned. Note
that when range='V', p?heevx does not know how many
eigenvectors are requested until the eigenvalues are
computed. Therefore, when range='V' and as long as lwork
is large enough to allow p?heevx to compute the
eigenvalues, p?heevx will compute the eigenvalues and as
many eigenvectors as it can.
Relationship between workspace, orthogonality and
performance:
If clustersize ≥ n/sqrt(NPROW*NPCOL), then providing
will cause serious degradation in performance. In the limit
better than ?stein on 1 processor.
execution.
by pxerbla.
2094
liwork (local) INTEGER, dimension of iwork.
liwork ≥ 6*nnp
Where: nnp = max(n, NPROW*NPCOL+1, 4)
by pxerbla.
Output Parameters
a On exit, the lower triangle (if uplo = 'L'), or the upper
overwritten.
m (global) INTEGER. The total number of eigenvalues found;
0 ≤ m ≤ n.
0 ≤ nz ≤ m.
The number of columns of z that are filled.
space and p?heevx is not able to detect this before
workspace to compute them. (See lwork). p?heevx is
always able to detect insufficient space without computation
unless range.eq.'V'.
w (global).
REAL for pcheevx
selected eigenvalues in ascending order.
z (local).
2095
COMPLEX for pcheevx

LOCc(jz+n-1)).
If jobz ='V', then on normal exit the first m columns of z
contain the orthonormal eigenvectors of the matrix
work(1) On exit, returns workspace adequate workspace to allow
optimal performance.
rwork (local).
REAL for pcheevx
Array, DIMENSION (lrwork). On return, rwork(1) contains
the optimal amount of workspace required for efficient
execution.
If jobz='N' rwork(1) = optimal amount of workspace
required to compute eigenvalues efficiently.
If jobz='V' rwork(1) = optimal amount of workspace
required to compute eigenvalues and eigenvectors efficiently
with no guarantee on orthogonality.
If range='V', it is assumed that all eigenvectors may be
required.
iwork(1) (local)
On return, iwork(1) contains the amount of integer
workspace required.
If jobz ='V', then on normal exit, the first m elements of
ifail are zero. If (mod(info,2).ne.0) on exit, then ifail
contains the indices of the eigenvectors that failed to
converge.
iclustr (global) INTEGER.
Array, DIMENSION (2*NPROW*NPCOL).
2096
This array contains indices of eigenvectors corresponding
to a cluster of eigenvalues that could not be reorthogonalized
due to insufficient workspace (see lwork, orfac and info).
indexed iclustr(2*i-1) to iclustr(2*i), could not be
reorthogonalized due to lack of workspace. Hence the
(iclustr(2*k).ne.0. and. iclustr(2*k+1).eq.0) if
and only if k is the number of clusters. iclustr is not
referenced if jobz = 'N'.
gap (global)
REAL for pcheevx
Array, DIMENSION (NPROW*NPCOL)
eigenvectors corresponding to the i-th cluster may be as
high as (C*n)/gap(i) where C is a small constant.
If info < 0:
illegal value, then info = -(i*100+j), if the i-th
-i.
If info > 0:
If (mod(info,2).ne.0), then one or more eigenvectors
failed to converge. Their indices are stored in ifail. Ensure
abstol=2.0*p?lamch('U')
If (mod(info/2,2).ne.0), then eigenvectors corresponding
reorthogonalized because of insufficient workspace.The
2097

p?syevx from computing all of the eigenvectors between
in nz.
eigenvalues. Ensure abstol=2.0*p?lamch('U').
p?gesvd
general matrix, optionally computing the left and/or
right singular vectors.
Syntax
call psgesvd( jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt,
ivt, jvt, descvt, work, lwork, info )
call pdgesvd( jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt,
ivt, jvt, descvt, work, lwork, info )
call pcgesvd( jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt,
ivt, jvt, descvt, work, lwork, rwork, info )
call pzgesvd( jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt,
ivt, jvt, descvt, work, lwork, rwork, info )
Description
The routine computes the singular value decomposition (SVD) of an m-by-n matrix A, optionally
computing the left and/or right singular vectors. The SVD is written
A = U*Σ*VT,
where Σ is an m-by-n matrix that is zero except for its min(m, n) diagonal elements, U is an
m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of Σ
are the singular values of A and the columns of U and V are the corresponding right and left
singular vectors, respectively. The singular values are returned in array s in decreasing order
T
and only the first min(m,n) columns of U and rows of vt = V are computed.
2098
Input Parameters
mp = number of local rows in A and U
nq = number of local columns in A and VT
size = min(m, n)
sizeq = number of local columns in U
sizep = number of local rows in VT
jobu (global). CHARACTER*1. Specifies options for computing all

or part of the matrix U.
If jobu = 'V', the first size columns of U (the left singular
vectors) are returned in the array u;
If jobu ='N', no columns of U (no left singular vectors)are
computed.
jobvt (global) CHARACTER*1.
T
Specifies options for computing all or part of the matrix V .
T
If jobvt = 'V', the first size rows of V (the right singular
vectors) are returned in the array vt;
T
If jobvt = 'N', no rows of V (no right singular vectors)
are computed.
m (global) INTEGER. The number of rows of the matrix A (m
≥ 0).
n (global) INTEGER. The number of columns in A (n ≥ 0).
a (local). REAL for psgesvd
DOUBLE PRECISION for pdgesvd
COMPLEX for pcgesvd
COMPLEX*16 for pzgesvd
Block cyclic array, global dimension (m, n), local dimension
(mp, nq).
2099
iu, ju (global) INTEGER. The row and column indices in the global
submatrix U, respectively.
descu (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix U.
ivt, jvt (global) INTEGER. The row and column indices in the global
array vt indicating the first row and the first column of the
submatrix VT, respectively.
descvt (global and local) INTEGER array, dimension (dlen_). The
array descriptor for the distributed matrix VT.
work (local). REAL for psgesvd
COMPLEX for pcgesvd
Workspace array, dimension (lwork)
lwork (local) INTEGER. The dimension of the array work;
lwork > 2 + 6*sizeb + max(watobd, wbdtosvd),
where sizeb = max(m, n), and watobd and wbdtosvd
refer, respectively, to the workspace required to
bidiagonalize the matrix A and to go from the bidiagonal
matrix to the singular value decomposition U S VT.
For watobd, the following holds:
watobd = max(max(wp?lange,wp?gebrd),
max(wp?lared2d, wp?lared1d)),
where wp?lange, wp?lared1d, wp?lared2d, wp?gebrd are
the workspaces required respectively for the subprograms
p?lange, p?lared1d, p?lared2d, p?gebrd. Using the
standard notation
mp = numroc(m, mb, MYROW, desca(ctxt_), NPROW),
nq = numroc(n, nb, MYCOL, desca(lld_), NPCOL),
the workspaces required for the above subprograms are
wp?lange = mp,
wp?lared1d = nq0,
wp?lared2d = mp0,
wp?gebrd = nb*(mp + nq + 1) + nq,
2100
where nq0 and mp0 refer, respectively, to the values
obtained at MYCOL = 0 and MYROW = 0. In general, the
upper limit for the workspace is given by a workspace
required on processor (0,0):
watobd ≤ nb*(mp0 + nq0 + 1) + nq0.
In case of a homogeneous process grid this upper limit can
be used as an estimate of the minimum workspace for every
processor.
For wbdtosvd, the following holds:
wbdtosvd = size*(wantu*nru + wantvt*ncvt) +
max(w?bdsqr, max(wantu*wp?ormbrqln,
wantvt*wp?ormbrprt)),
where
wantu(wantvt) = 1, if left/right singular vectors are wanted,
and wantu(wantvt) = 0, otherwise. w?bdsqr,
wp?ormbrqln, and wp?ormbrprt refer respectively to the
workspace required for the subprograms ?bdsqr, p?orm-
br(qln), and p?ormbr(prt), where qln and prt are the
values of the arguments vect, side, and trans in the call
to p?ormbr. nru is equal to the local number of rows of the
matrix U when distributed 1-dimensional "column" of
processes. Analogously, ncvt is equal to the local number
of columns of the matrix VT when distributed across
1-dimensional "row" of processes. Calling the LAPACK
procedure ?bdsqr requires
w?bdsqr = max(1, 2*size + (2*size - 4)*
max(wantu, wantvt))
on every processor. Finally,
wp?ormbrqln = max((nb*(nb-1))/2,
(sizeq+mp)*nb)+nb*nb,
wp?ormbrprt = max((mb*(mb-1))/2,
(sizep+nq)*mb)+mb*mb,
size for the work array. The required workspace is returned
as the first element of work and no error message is issued
by pxerbla.
rwork REAL for psgesvd
2101

COMPLEX for pcgesvd
Workspace array, dimension (1 + 4*sizeb)
Output Parameters
a On exit, the contents of a are destroyed.
s (global). REAL for psgesvd
COMPLEX for pcgesvd
Array, DIMENSION (size).
Contains the singular values of A sorted so that s(i) ≥
s(i+1).
u (local). REAL for psgesvd
COMPLEX for pcgesvd
local dimension (mp, sizeq), global dimension (m, size)
If jobu = 'V', u contains the first min(m, n) columns of U.
If jobu = 'N' or 'O', u is not referenced.
vt (local). REAL for psgesvd
COMPLEX for pcgesvd
local dimension (sizep, nq), global dimension (size, n)
T
If jobvt = 'V', vt contains the first size rows of V if jobu
= 'N', vt is not referenced.
work On exit, if info = 0, then work(1) returns the required
rwork On exit, if info = 0, then rwork(1) returns the required
size of rwork.
If info < 0, If info = -i, the ith parameter had an illegal
value.
2102
If info > 0 i, then if ?bdsqr did not converge,
If info = min(m,n) + 1, then p?gesvd has detected
heterogeneity by finding that eigenvalues were not identical
across the process grid. In this case, the accuracy of the
results from p?gesvd cannot be guaranteed.
See Also
• Driver Routines
• ?bdsqr
• p?ormbr
• pxerbla
p?sygvx
Syntax
call pssygvx(ibtype, jobz, range, uplo, n, a, ia, ja, desca, b, ib, jb, descb,
vl, vu, il, iu, abstol, m, nz, w, orfac, z, iz, jz, descz, work, lwork, iwork,
liwork, ifail, iclustr, gap, info)
call pdsygvx(ibtype, jobz, range, uplo, n, a, ia, ja, desca, b, ib, jb, descb,
vl, vu, il, iu, abstol, m, nz, w, orfac, z, iz, jz, descz, work, lwork, iwork,
liwork, ifail, iclustr, gap, info)
Description
sub(A)*x = λ*sub(B)*x, sub(A) sub(B)*x = λ*x, or sub(B)*sub(A)*x = λ*x.
Here x denotes eigen vectors, λ (lambda) denotes eigenvalues, sub(A) denoting A(ia:ia+n-1,
ja:ja+n-1) is assumed to symmetric, and sub(B) denoting B(ib:ib+n-1, jb:jb+n-1) is
also positive definite.
2103
Input Parameters
If ibtype = 1, the problem type is sub(A)*x =
lambda*sub(B)*x;
If ibtype = 2, the problem type is sub(A)*sub(B)*x =
lambda*x;
If ibtype = 3, the problem type is sub(B)*sub(A)*x =
lambda*x.
If jobz ='N', then compute eigenvalues only.
If jobz ='V', then compute eigenvalues and eigenvectors.
range (global). CHARACTER*1. Must be 'A' or 'V' or 'I'.
interval: [vl, vu]
indices il through iu.
sub(A) and sub (B);
sub(A) and sub (B).
n (global). INTEGER. The order of the matrices sub(A) and
sub (B), n ≥ 0.
a (local)
REAL for pssygvx
DOUBLE PRECISION for pdsygvx.
sub(A).
sub(A) contains the upper triangular part of the matrix.
sub(A) contains the lower triangular part of the matrix.
2104
desca(ctxt_) is incorrect, p?sygvx cannot guarantee
correct error reporting.
b (local). REAL for pssygvx
(lld_b, LOCc(jb+n-1)). On entry, this array contains
sub(B).
sub(B) contains the upper triangular part of the matrix.
sub(A) contains the lower triangular part of the matrix.
array descriptor for the distributed matrix B. descb(ctxt_)
must be equal to desca(ctxt_).
vl, vu (global)
REAL for pssygvx
il, iu (global)
INTEGER.
smallest and largest eigenvalues to be returned. Constraint:
il ≥ 1, min(il, n)≤ iu ≤ n
abstol (global)
2105
REAL for pssygvx

is determined to lie in an interval [a,b] of width less than
or equal to
abstol + eps*max(|a|,|b|),
((mod(info,2).ne.0).or.*(mod(info/8,2).ne.0)),
orfac (global).
REAL for pssygvx
default value of 1.0e-3 is used if orfac is negative. orfac
array descriptor for the distributed matrix Z.descz(ctxt_)
must equal desca(ctxt_).
work (local)
2106
REAL for pssygvx
lwork (local) INTEGER.
Dimension of the array work. See below for definitions of
variables used to define lwork.
≥ 5*n + max(5*nn, NB*(np0 + 1)).
lwork ≥ 5*n + max(5*nn, np0*mq0 + 2*nb*nb) +
iceil(neig, NPROW*NPCOL)*nn.
you want to guarantee orthogonality at the cost of
potentially poor performance you should add the following
to lwork:
(clustersize-1)*n,
eigenvalues:
{w(k),..., w(k+clustersize-1)|w(j+1) ≤ w(j) +
orfac*2*norm(A)}
neig = number of eigenvectors requested,
descz(nb_),
nn = max(n, nb, 2),
descz(csrc_) = 0,
np0 = numroc(nn, nb, 0, 0, NPROW),
mq0 = numroc(max(neig, nb, 2), nb, 0, 0, NPCOL)
If lwork is too small to guarantee orthogonality, p?syevx
2107

returned.
Note that when range='V', number of requested
eigenvectors are not known until the eigenvalues are
computed. In this case and if lwork is large enough to
compute the eigenvalues, p?sygvx computes the
eigenvalues and as many eigenvectors as possible.
Greater performance can be achieved if adequate workspace
is provided. In some situations, performance can decrease
as the provided workspace increases above the workspace
amount shown below:
lwork ≥ max(lwork, 5*n + nsytrd_lwopt,
nsygst_lwopt), where
lwork, as defined previously, depends upon the number of
eigenvectors requested, and
nsytrd_lwopt = n + 2*(anb+1)*(4*nps+2) +
(nps+3)*nps
nsygst_lwopt = 2*np0*nb + nq0*nb + nb*nb
anb = pjlaenv(desca(ctxt_), 3, p?syttrd ', 'L',
0, 0, 0, 0)
sqnpc = int(sqrt(dble(NPROW * NPCOL)))
NB = desca(mb_)
np0 = numroc(n, nb, 0, 0, NPROW)
nq0 = numroc(n, nb, 0, 0, NPCOL)
For large n, no extra workspace is needed, however the
biggest boost in performance comes for small n, so it is wise
to provide the extra workspace (typically less than a
Megabyte per process).
2108
If clustersize ≥ n/sqrt(NPROW*NPCOL), then providing
will cause serious degradation in performance. At the limit
better than ?stein on a single processor.
execution.
by pxerbla.
liwork ≥ 6*nnp
Where:
nnp = max(n, NPROW*NPCOL + 1, 4)
If liwork = -1, then liwork is global input and a workspace
Output Parameters
a On exit,
If jobz = 'V', and if info = 0, sub(A) contains the
distributed matrix Z of eigenvectors. The eigenvectors are
normalized as follows:
for ibtype = 1 or 2, ZT*sub(B)*Z = i;
for ibtype = 3, ZT*inv(sub(B))*Z = i.
2109
If jobz = 'N', then on exit the upper triangle (if uplo='U')

or the lower triangle (if uplo='L') of sub(A), including the
diagonal, is destroyed.
b On exit, if info ≤ n, the part of sub(B) containing the
matrix is overwritten by the triangular factor U or L from
the Cholesky factorization sub(B) = UT*U or sub(B) =
L*LT.
m (global) INTEGER. The total number of eigenvalues found,
0 ≤ m ≤ n.
nz (global) INTEGER.
Total number of eigenvectors computed. 0 ≤ nz ≤ m. The
number of columns of z that are filled.
space and p?sygvx is not able to detect this before
workspace to compute them. (See lwork below.) p?sygvx
is always able to detect insufficient space without
computation unless range.eq.'V'.
w (global)
REAL for pssygvx
Array, DIMENSION (n). On normal exit, the first m entries
contain the selected eigenvalues in ascending order.
z (local).
REAL for pssygvx
global dimension (n, n), local dimension (lld_z,
LOCc(jz+n-1)).
2110
work If jobz='N' work(1) = optimal amount of workspace
required to compute eigenvalues efficiently
If jobz = 'V' work(1) = optimal amount of workspace
required.
ifail provides additional information when info.ne.0
If (mod(info/16,2).ne.0) then ifail(1) indicates the
order of the smallest minor which is not positive definite. If
(mod(info,2).ne.0) on exit, then ifail contains the
indices of the eigenvectors that failed to converge.
If neither of the above error conditions hold and jobz =
'V', then the first m elements of ifail are set to zero.
Array, DIMENSION (2*NPROW*NPCOL). This array contains
indices of eigenvectors corresponding to a cluster of
eigenvalues that could not be reorthogonalized due to
insufficient workspace (see lwork, orfac and info).
(iclustr(2*k).ne.0.and. iclustr(2*k+1).eq.0) if
and only if k is the number of clusters iclustr is not
referenced if jobz = 'N'.
gap (global)
REAL for pssygvx
2111
Array, DIMENSION (NPROW*NPCOL). This array contains the

gap between eigenvalues whose eigenvectors could not be
reorthogonalized. The output values in this array correspond
to the clusters indicated by the array iclustr. As a result,
the dot product between eigenvectors corresponding to the
i-th cluster may be as high as (C*n)/gap(i), where C is
a small constant.
If info <0: the i-th argument is an array and the j-entry
had an illegal value, then info = -(i*100+j), if the i-th
-i.
If info > 0:
failed to converge. Their indices are stored in ifail.
If (mod(info,2,2).ne.0), then eigenvectors corresponding
reorthogonalized because of insufficient workspace. The
p?sygvx from computing all of the eigenvectors between
in nz.
eigenvalues.
If (mod(info/16,2).ne.0), then B was not positive
definite. ifail(1) indicates the order of the smallest minor
which is not positive definite.
2112
p?hegvx
Syntax
call pchegvx(ibtype, jobz, range, uplo, n, a, ia, ja, desca, b, ib, jb, descb,
vl, vu, il, iu, abstol, m, nz, w, orfac, z, iz, jz, descz, work, lwork, rwork,
lrwork, iwork, liwork, ifail, iclustr, gap, info)
call pzhegvx(ibtype, jobz, range, uplo, n, a, ia, ja, desca, b, ib, jb, descb,
vl, vu, il, iu, abstol, m, nz, w, orfac, z, iz, jz, descz, work, lwork, rwork,
lrwork, iwork, liwork, ifail, iclustr, gap, info)
Description
sub(A)*x = λ*sub(B)*x, sub(A)*sub(B)*x = λ*x, or sub(B)*sub(A)*x = λ*x.

Here sub (A) denoting A(ia:ia+n-1, ja:ja+n-1) and sub(B) are assumed to be Hermitian
and sub(B) denoting B(ib:ib+n-1, jb:jb+n-1) is also positive definite.
Input Parameters
If ibtype = 1, the problem type is
sub(A)*x = lambda*sub(B)*x;
sub(A)*sub(B)*x = lambda*x;
sub(B)*sub(A)*x = lambda*x.
If jobz ='N', then compute eigenvalues only.
If jobz ='V', then compute eigenvalues and eigenvectors.
range (global). CHARACTER*1. Must be 'A' or 'V' or 'I'.
2113

interval: [vl, vu]
indices il through iu.
sub(A) and sub (B);
sub(A) and sub (B).
n (global). INTEGER.
The order of the matrices sub(A) and sub (B) (n ≥ 0).
a (local)
COMPLEX for pchegvx
DOUBLE COMPLEX for pzhegvx.
the local pieces of the n-by-n Hermitian distributed matrix
sub(A). If uplo = 'U', the leading n-by-n upper triangular
part of sub(A) contains the upper triangular part of the
matrix. If uplo = 'L', the leading n-by-n lower triangular
part of sub(A) contains the lower triangular part of the
matrix.
ia, ja (global) INTEGER.
The row and column indices in the global array a indicating
the first row and the first column of the submatrix A,
respectively.
desca (global and local) INTEGER array, dimension (dlen_).
The array descriptor for the distributed matrix A. If
desca(ctxt_) is incorrect, p?hegvx cannot guarantee
correct error reporting.
b (local).
COMPLEX for pchegvx
2114
(lld_b, LOCc(jb+n-1)). On entry, this array contains
the local pieces of the n-by-n Hermitian distributed matrix
sub(B).
sub(B) contains the upper triangular part of the matrix.
sub(B) contains the lower triangular part of the matrix.
ib, jb (global) INTEGER.
The row and column indices in the global array b indicating
the first row and the first column of the submatrix B,
respectively.
descb (global and local) INTEGER array, dimension (dlen_).
descb(ctxt_) must be equal to desca(ctxt_).
vl, vu (global)
REAL for pchegvx
DOUBLE PRECISION for pzhegvx.
il, iu (global)
INTEGER.
smallest and largest eigenvalues to be returned. Constraint:
il ≥ 1, min(il, n) ≤ iu ≤ n
abstol (global)
REAL for pchegvx
is determined to lie in an interval [a,b] of width less than
or equal to
abstol + eps*max(|a|,|b|),
2115

((mod(info,2).ne.0).or. * (mod(info/8,2).ne.0)),
orfac (global).
REAL for pchegvx
default value of 1.0E-3 is used if orfac is negative. orfac
array descriptor for the distributed matrix Z.descz( ctxt_
) must equal desca( ctxt_ ).
work (local)
COMPLEX for pchegvx
lwork (local).
INTEGER. The dimension of the array work.
If only eigenvalues are requested:
lwork ≥ n+ max(NB*(np0 + 1), 3)
If eigenvectors are requested:
2116
lwork ≥ n + (np0+ mq0 + NB)*NB
with nq0 = numroc(nn, NB, 0, 0, NPCOL).
For optimal performance, greater workspace is needed, that
is
lwork ≥ max(lwork, n, nhetrd_lwopt, nhegst_lwopt)
where lwork is as defined above, and
nhetrd_lwork = 2*(anb+1)*(4*nps+2) + (nps +
1)*nps;
nhegst_lwopt = 2*np0*nb + nq0*nb + nb*nb
nb = desca(mb_)
np0 = numroc(n, nb, 0, 0, NPROW)
nq0 = numroc(n, nb, 0, 0, NPCOL)
ictxt = desca(ctxt_)
anb = pjlaenv(ictxt, 3, 'p?hettrd', 'L', 0, 0,
0, 0)
sqnpc = sqrt(dble(NPROW * NPCOL))
by pxerbla.
rwork (local)
REAL for pchegvx
Workspace array, DIMENSION (lrwork).
lrwork (local) INTEGER. The dimension of the array rwork.
See below for definitions of variables used to define lrwork.
If no eigenvectors are requested (jobz = 'N'), then
lrwork ≥ 5*nn+4*n
2117

lrwork ≥ 4*n + max(5*nn, np0*mq0)+ iceil(neig,
NPROW*NPCOL)*nn
value to lrwork:
(clustersize-1)*n,
eigenvalues:
{w(k),..., w(k+clustersize-1)| w(j+1) ≤
w(j)+orfac*2*norm(A)}
neig = number of eigenvectors requested;
descz(nb_);
nn = max(n, nb, 2);
descz(csrc_) = 0 ;
mq0 = numroc(max(neig, nb, 2), nb, 0, 0, NPCOL);
iceil(x, y) is a ScaLAPACK function returning ceiling(x/y).
When lrwork is too small:
If lwork is too small to guarantee orthogonality, p?hegvx
returned. Note that when range='V', p?hegvx does not
know how many eigenvectors are requested until the
eigenvalues are computed. Therefore, when range='V' and
as long as lwork is large enough to allow p?hegvx to
compute the eigenvalues, p?hegvx will compute the
eigenvalues and as many eigenvectors as it can.
2118
Relationship between workspace, orthogonality &
performance:
If clustersize > n/sqrt(NPROW*NPCOL), then providing
will cause serious degradation in performance. In the limit
better than ?stein on 1 processor.
execution.
If lwork = -1, then lrwork is global input and a workspace
by pxerbla.
liwork ≥ 6*nnp
Where: nnp = max(n, NPROW*NPCOL + 1, 4)
by pxerbla.
Output Parameters
a On exit, if jobz = 'V', then if info = 0, sub(A) contains
the distributed matrix Z of eigenvectors.
If ibtype = 1 or 2, then ZH*sub(B)*Z = i;
2119
If ibtype = 3, then ZH*inv(sub(B))*Z = i.

If jobz = 'N', then on exit the upper triangle (if uplo='U')
or the lower triangle (if uplo='L') of sub(A), including the
diagonal, is destroyed.
b On exit, if info ≤ n, the part of sub(B) containing the
matrix is overwritten by the triangular factor U or L from
H H
the Cholesky factorization sub(B) = U *U, or sub(B) = L*L .
m (global) INTEGER. The total number of eigenvalues found,
0 ≤ m ≤ n.
0 < nz < m. The number of columns of z that are filled.
space and p?hegvx is not able to detect this before
hold the eigenvectors in z (m. le. descz(n_)) and
sufficient workspace to compute them. (See lwork below.)
The routine p?hegvx is always able to detect insufficient
space without computation unless range = 'V'.
w (global)
REAL for pchegvx
Array, DIMENSION (n). On normal exit, the first m entries
contain the selected eigenvalues in ascending order.
z (local).
COMPLEX for pchegvx
global dimension (n, n), local dimension (lld_z,
LOCc(jz+n-1)).
2120
work On exit, work(1) returns the optimal amount of workspace.
rwork On exit, rwork(1) contains the amount of workspace
required for optimal efficiency
If jobz='N' rwork(1) = optimal amount of workspace
required to compute eigenvalues efficiently
If jobz='V' rwork(1) = optimal amount of workspace
required when computing optimal workspace.
ifail provides additional information when info.ne.0
If (mod(info/16,2).ne.0), then ifail(1) indicates the
order of the smallest minor which is not positive definite.
If (mod(info,2).ne.0) on exit, then ifail(1) contains
the indices of the eigenvectors that failed to converge.
If neither of the above error conditions are held, and jobz
= 'V', then the first m elements of ifail are set to zero.
Array, DIMENSION (2*NPROW*NPCOL). This array contains
indices of eigenvectors corresponding to a cluster of
eigenvalues that could not be reorthogonalized due to
insufficient workspace (see lwork, orfac and info).
orthogonal.
iclustr() is a zero terminated array.
(iclustr(2*k).ne.0.and.clustr(2*k+1).eq.0) if and
only if k is the number of clusters.
iclustr is not referenced if jobz = 'N'.
gap (global)
REAL for pchegvx
2121
Array, DIMENSION (NPROW*NPCOL).

eigenvectors corresponding to the i-th cluster may be as
high as (C*n)/gap(i), where C is a small constant.
If info <0: the i-th argument is an array and the j-entry
had an illegal value, then info = -(i*100+j), if the i-th
-i.
If info > 0:
failed to converge. Their indices are stored in ifail.
If (mod(info,2,2).ne.0), then eigenvectors corresponding
reorthogonalized because of insufficient workspace. The
p?sygvx from computing all of the eigenvectors between
in nz.
eigenvalues.
If (mod(info/16,2).ne.0), then B was not positive
definite. ifail(1) indicates the order of the smallest minor
which is not positive definite.
2122
ScaLAPACK Auxiliary and Utility
Routines 7
This chapter describes the Intel® Math Kernel Library implementation of ScaLAPACK Auxiliary Routines
and Utility Functions and Routines. The library includes routines for both real and complex data.
NOTE. ScaLAPACK routines are provided only with Intel® MKL versions for Linux* and
Windows* OSs.
Routine naming conventions, mathematical notation, and matrix storage schemes used for ScaLAPACK
auxiliary and utility routines are the same as described in previous chapters. Some routines and functions
may have combined character codes, such as sc or dz. For example, the routine pscsum1 uses a complex
input array and returns a real value.
Auxiliary Routines
Table 7-1 ScaLAPACK Auxiliary Routines
Types
p?lacgv c,z Conjugates a complex vector.
p?max1 c,z Finds the index of the element whose real part has maximum
absolute value (similar to the Level 1 PBLAS p?amax, but using
the absolute value to the real part).
?combamax1 c,z Finds the element with maximum real part absolute value and
its corresponding global index.
p?sum1 sc,dz Forms the 1-norm of a complex vector similar to Level 1 PBLAS
p?asum, but using the true absolute value.
p?dbtrsv s,d,c,z Computes an LU factorization of a general tridiagonal matrix

with no pivoting. The routine is called by p?dbtrs.
p?dttrsv s,d,c,z Computes an LU factorization of a general band matrix, using

partial pivoting with row interchanges. The routine is called
by p?dttrs.
2123

Types
p?gebd2 s,d,c,z Reduces a general rectangular matrix to real bidiagonal

form by an orthogonal/unitary transformation (unblocked
algorithm).
p?gehd2 s,d,c,z Reduces a general matrix to upper Hessenberg form by an

orthogonal/unitary similarity transformation (unblocked
algorithm).
p?gelq2 s,d,c,z Computes an LQ factorization of a general rectangular

matrix (unblocked algorithm).
p?geql2 s,d,c,z Computes a QL factorization of a general rectangular matrix

p?geqr2 s,d,c,z Computes a QR factorization of a general rectangular matrix

p?gerq2 s,d,c,z Computes an RQ factorization of a general rectangular

matrix (unblocked algorithm).
p?getf2 s,d,c,z Computes an LU factorization of a general matrix, using

partial pivoting with row interchanges (local blocked
algorithm).
p?labrd s,d,c,z Reduces the first nb rows and columns of a general

rectangular matrix A to real bidiagonal form by an
orthogonal/unitary transformation, and returns auxiliary
matrices that are needed to apply the transformation to
p?lacon s,d,c,z Estimates the 1-norm of a square matrix, using the reverse
p?laconsb s,d Looks for two consecutive small subdiagonal elements.
p?lacp2 s,d,c,z Copies all or part of a distributed matrix to another

distributed matrix.
p?lacp3 s,d Copies from a global parallel array into a local replicated
array or vice versa.
2124
ScaLAPACK Auxiliary and Utility Routines 7
Types
p?lacpy s,d,c,z Copies all or part of one two-dimensional array to another.
p?laevswp s,d,c,z Moves the eigenvectors from where they are computed to
ScaLAPACK standard block cyclic array.
p?lahrd s,d,c,z Reduces the first nb columns of a general rectangular

matrix A so that elements below the kth subdiagonal are
zero, by an orthogonal/unitary transformation, and returns
auxiliary matrices that are needed to apply the
p?laiect s,d,c,z Exploits IEEE arithmetic to accelerate the computations of

eigenvalues. (C interface function).
p?lange s,d,c,z Returns the value of the 1-norm, Frobenius norm,

infinity-norm, or the largest absolute value of any element,
of a general rectangular matrix.
p?lanhs s,d,c,z Returns the value of the 1-norm, Frobenius norm,

of an upper Hessenberg matrix.
p?lansy, p?lanhe s,d,c,z/c,z Returns the value of the 1-norm, Frobenius norm,
of a real symmetric or complex Hermitian matrix.
p?lantr s,d,c,z Returns the value of the 1-norm, Frobenius norm,

of a triangular matrix.
p?lapiv s,d,c,z Applies a permutation matrix to a general distributed

matrix, resulting in row or column pivoting.
p?laqge s,d,c,z Scales a general rectangular matrix, using row and column
scaling factors computed by p?geequ.
p?laqsy s,d,c,z Scales a symmetric/Hermitian matrix, using scaling factors

computed by p?poequ.
2125

Types
p?lared1d s,d Redistributes an array assuming that the input array bycol
is distributed across rows and that all process columns
contain the same copy of bycol.
p?lared2d s,d Redistributes an array assuming that the input array byrow
is distributed across columns and that all process rows
contain the same copy of byrow .
p?larf s,d,c,z Applies an elementary reflector to a general rectangular

matrix.
p?larfb s,d,c,z Applies a block reflector or its

transpose/conjugate-transpose to a general rectangular
matrix.
p?larfc c,z Applies the conjugate transpose of an elementary reflector

p?larfg s,d,c,z Generates an elementary reflector (Householder matrix).
p?larft s,d,c,z Forms the triangular vector T of a block reflector H=I-VTVH
p?larz s,d,c,z Applies an elementary reflector as returned by p?tzrzf

p?larzb s,d,c,z Applies a block reflector or its

transpose/conjugate-transpose as returned by p?tzrzf to
a general matrix.
p?larzc c,z Applies (multiplies by) the conjugate transpose of an

elementary reflector as returned by p?tzrzf to a general
matrix.
p?larzt s,d,c,z Forms the triangular factor T of a block reflector H=I-VTVH

as returned by p?tzrzf.
p?lascl s,d,c,z Multiplies a general rectangular matrix by a real scalar

defined as Cto/Cfrom.
p?laset s,d,c,z Initializes the off-diagonal elements of a matrix to α and

the diagonal elements to β.
2126
Types
p?lasmsub s,d Looks for a small subdiagonal element from the bottom of
the matrix that it can safely set to zero.
p?lassq s,d,c,z Updates a sum of squares represented in scaled form.
p?laswp s,d,c,z Performs a series of row interchanges on a general

rectangular matrix.
p?latra s,d,c,z Computes the trace of a general square distributed matrix.
p?latrd s,d,c,z Reduces the first nb rows and columns of a

symmetric/Hermitian matrix A to real tridiagonal form by
an orthogonal/unitary similarity transformation.
p?latrz s,d,c,z Reduces an upper trapezoidal matrix to upper triangular

form by means of orthogonal/unitary transformations.
p?lauu2 s,d,c,z Computes the product UUH or LHL, where U and L are upper
or lower triangular matrices (local unblocked algorithm).
p?lauum s,d,c,z Computes the product UUH or LHL, where U and L are upper
or lower triangular matrices.
p?lawil s,d Forms the Wilkinson transform.
p?org2l/p?ung2l s,d,c,z Generates all or part of the orthogonal/unitary matrix Q

from a QL factorization determined by p?geqlf (unblocked
algorithm).
p?org2r/p?ung2r s,d,c,z Generates all or part of the orthogonal/unitary matrix Q

from a QR factorization determined by p?geqrf (unblocked
algorithm).
p?orgl2/p?ungl2 s,d,c,z Generates all or part of the orthogonal/unitary matrix Q

from an LQ factorization determined by p?gelqf (unblocked
algorithm).
p?orgr2/p?ungr2 s,d,c,z Generates all or part of the orthogonal/unitary matrix Q

from an RQ factorization determined by p?gerqf (unblocked
algorithm).
2127

Types
p?orm2l/p?unm2l s,d,c,z Multiplies a general matrix by the orthogonal/unitary matrix

from a QL factorization determined by p?geqlf (unblocked
algorithm).
p?orm2r/p?unm2r s,d,c,z Multiplies a general matrix by the orthogonal/unitary matrix

from a QR factorization determined by p?geqrf (unblocked
algorithm).
p?orml2/p?unml2 s,d,c,z Multiplies a general matrix by the orthogonal/unitary matrix

from an LQ factorization determined by p?gelqf (unblocked
algorithm).
p?ormr2/p?unmr2 s,d,c,z Multiplies a general matrix by the orthogonal/unitary matrix

from an RQ factorization determined by p?gerqf (unblocked
algorithm).
p?pbtrsv s,d,c,z Solves a single triangular linear system via frontsolve or

backsolve where the triangular matrix is a factor of a
banded matrix computed by p?pbtrf.
p?pttrsv s,d,c,z Solves a single triangular linear system via frontsolve or

backsolve where the triangular matrix is a factor of a
tridiagonal matrix computed by p?pttrf.
p?potf2 s,d,c,z Computes the Cholesky factorization of a

symmetric/Hermitian positive definite matrix (local
unblocked algorithm).
p?rscl s,d,cs,zd Multiplies a vector by the reciprocal of a real scalar.
p?sygs2/p?hegs2 s,d,c,z Reduces a symmetric/Hermitian definite generalized

eigenproblem to standard form, using the factorization
results obtained from p?potrf (local unblocked algorithm).
p?sytd2/p?hetd2 s,d,c,z Reduces a symmetric/Hermitian matrix to real symmetric

tridiagonal form by an orthogonal/unitary similarity
transformation (local unblocked algorithm).
p?trti2 s,d,c,z Computes the inverse of a triangular matrix (local

2128
Types
?lamsh s,d Sends multiple shifts through a small (single node) matrix
to maximize the number of bulges that can be sent through.
?laref s,d Applies Householder reflectors to matrices on either their

rows or columns.
?lasorte s,d Sorts eigenpairs by real and complex data types.
?lasrt2 s,d Sorts numbers in increasing or decreasing order.
?stein2 s,d Computes the eigenvectors corresponding to specified

eigenvalues of a real symmetric tridiagonal matrix, using
inverse iteration.
?dbtf2 s,d,c,z Computes an LU factorization of a general band matrix with

no pivoting (local unblocked algorithm).
?dbtrf s,d,c,z Computes an LU factorization of a general band matrix with

no pivoting (local blocked algorithm).
?dttrf s,d,c,z Computes an LU factorization of a general tridiagonal matrix

with no pivoting (local blocked algorithm).
?dttrsv s,d,c,z Solves a general tridiagonal system of linear equations

using the LU factorization computed by ?dttrf.
?pttrsv s,d,c,z Solves a symmetric (Hermitian) positive-definite tridiagonal

system of linear equations, using the LDLH factorization
computed by ?pttrf.
?steqr2 s,d Computes all eigenvalues and, optionally, eigenvectors of

a symmetric tridiagonal matrix using the implicit QL or QR
method.
2129
p?lacgv
Conjugates a complex vector.
Syntax
call pclacgv(n, x, ix, jx, descx, incx)
call pzlacgv(n, x, ix, jx, descx, incx)
Description
The routine conjugates a complex vector sub(x) of length n, where sub(x) denotes X(ix,
jx:jx+n-1) if incx = m_x, and X(ix:ix+n-1, jx) if incx = 1.
Input Parameters
n (global) INTEGER. The length of the distributed vector
sub(x).
x (local).
COMPLEX for pclacgv
COMPLEX*16 for pzlacgv.Pointer into the local memory to
an array of DIMENSION (lld_x,*). On entry the vector to
be conjugated x(i) = X(ix+(jx-1)*m_x+(i-1)*incx),
1 ≤ i ≤ n.
ix (global) INTEGER.The row index in the global array x
indicating the first row of sub(x).
jx (global) INTEGER. The column index in the global array x
indicating the first column of sub(x).
descx (global and local) INTEGER. Array, DIMENSION (dlen_). The
incx (global) INTEGER.The global increment for the elements of
X. Only two values of incx are supported in this version,
namely 1 and m_x. incx must not be zero.
Output Parameters
x (local).
2130
On exit, the conjugated vector.
p?max1
Finds the index of the element whose real part has
maximum absolute value (similar to the Level 1
PBLAS p?amax, but using the absolute value to the
real part).
Syntax
call pcmax1(n, amax, indx, x, ix, jx, descx, incx)
call pzmax1(n, amax, indx, x, ix, jx, descx, incx)
Description
The routine computes the global index of the maximum element in absolute value of a distributed
vector sub(x). The global index is returned in indx and the value is returned in amax, where
sub(x) denotes X(ix:ix+n-1, jx) if incx = 1, X(ix, jx:jx+n-1) if incx = m_x.
Input Parameters
n (global) pointer to INTEGER. The number of components of
the distributed vector sub(x). n ≥ 0.
x (local)
COMPLEX for pcmax1.
COMPLEX*16 for pzmax1
Array containing the local pieces of a distributed matrix of
dimension of at least ((jx-1)*m_x+ix+(n-1)*abs(incx)).
This array contains the entries of the distributed vector sub
(x).
ix (global) INTEGER.The row index in the global array X
jx (global) INTEGER. The column index in the global array X
indicating the first column of sub(x)
descx (global and local) INTEGER. Array, DIMENSION (dlen_). The
2131

namely 1 and m_x. incx must not be zero.
Output Parameters
amax (global output) pointer to REAL.The absolute value of the
largest entry of the distributed vector sub(x) only in the
scope of sub(x).
indx (global output) pointer to INTEGER.The global index of the
element of the distributed vector sub(x) whose real part
has maximum absolute value.
?combamax1
Finds the element with maximum real part absolute
value and its corresponding global index.
Syntax
call ccombamax1(v1, v2)
call zcombamax1(v1, v2)
Description
The routine finds the element having maximum real part absolute value as well as its
corresponding global index.
Input Parameters
v1 (local)
COMPLEX for ccombamax1
COMPLEX*16 for zcombamax1 Array, DIMENSION 2. The first
maximum absolute value element and its global index.
v1(1)=amax, v1(2)=indx.
v2 (local)
COMPLEX for ccombamax1
COMPLEX*16 for zcombamax1
2132
Array, DIMENSION 2. The second maximum absolute value
element and its global index. v2(1)=amax, v2(2)=indx.
Output Parameters
v1 (local).
The first maximum absolute value element and its global
index. v1(1)=amax, v1(2)=indx.
p?sum1
Forms the 1-norm of a complex vector similar to
Level 1 PBLAS p?asum, but using the true absolute
value.
Syntax
call pscsum1(n, asum, x, ix, jx, descx, incx)
call pdzsum1(n, asum, x, ix, jx, descx, incx)
Description
The routine returns the sum of absolute values of a complex distributed vector sub(x) in asum,
where sub(x) denotes X(ix:ix+n-1, jx:jx), if incx = 1, X(ix:ix, jx:jx+n-1), if incx
= m_x.
Based on p?asum from the Level 1 PBLAS. The change is to use the 'genuine' absolute value.
Input Parameters
n (global) pointer to INTEGER. The number of components of
the distributed vector sub(x). n ≥ 0.
x (local ) COMPLEX for pscsum1
COMPLEX*16 for pdzsum1.
dimension of at least ((jx-1)*m_x+ix+(n-1)*abs(incx)).
This array contains the entries of the distributed vector sub
(x).
2133
ix (global) INTEGER.The row index in the global array X

jx (global) INTEGER. The column index in the global array X
indicating the first column of sub(x)
descx (global and local) INTEGER. Array, DIMENSION 8. The array
descriptor for the distributed matrix X.
namely 1 and m_x.
Output Parameters
asum (local)
Pointer to REAL. The sum of absolute values of the
distributed vector sub(x) only in its scope.
p?dbtrsv
Computes an LU factorization of a general
triangular matrix with no pivoting. The routine is
called by p?dbtrs.
Syntax
call psdbtrsv(uplo, trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af,
call pddbtrsv(uplo, trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af,
call pcdbtrsv(uplo, trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af,
call pzdbtrsv(uplo, trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af,
Description
The routines solves a banded triangular system of linear equations
A(1 :n, ja:ja+n-1) * X = B(ib:ib+n-1, 1 :nrhs) or
2134
T H
A(1 :n, ja:ja+n-1) * X = B(ib:ib+n-1, 1 :nrhs) (for real flavors); A(1 :n, ja:ja+n-1) * X
= B(ib:ib+n-1, 1 :nrhs) (for complex flavors),
where A(1 :n, ja:ja+n-1) is a banded triangular matrix factor produced by the Gaussian
elimination code PD@(dom_pre)BTRF and is stored in A(1 :n, ja:ja+n-1) and af. The matrix
stored in A(1 :n, ja:ja+n-1) is either upper or lower triangular according to uplo, and the
T
choice of solving A(1 :n, ja:ja+n-1) or A(1 :n, ja:ja+n-1) is dictated by the user by the
parameter trans.
Routine p?dbtrf must be called first.
Input Parameters
If uplo='U', the upper triangle of A(1:n, ja:ja+n-1) is
stored,
if uplo = 'L', the lower triangle of A(1:n, ja:ja+n-1) is
stored.
If trans = 'N', solve with A(1:n, ja:ja+n-1),
if trans = 'C', solve with conjugate transpose A(1:n,
ja:ja+n-1).
A;(n≥ 0).
bwl (global) INTEGER. Number of subdiagonals. 0 ≤ bwl ≤ n-1.
bwu (global) INTEGER. Number of subdiagonals. 0 ≤ bwu ≤ n-1.
number of columns of the distributed submatrix B (nrhs≥
0).
a (local).
REAL for psdbtrsv
DOUBLE PRECISION for pddbtrsv
COMPLEX for pcdbtrsv
COMPLEX*16 for pzdbtrsv.
DIMENSION lld_a≥(bwl+bwu+1)(stored in desca). On entry,
this array contains the local pieces of the n-by-n
2135
unsymmetric banded distributed Cholesky factor L or LT*A(1

:n, ja:ja+n-1). This local portion is stored in the packed
banded format used in LAPACK. See the Application Notes
below and the ScaLAPACK manual for more detail on the
format of distributed matrices.
desca (global and local) INTEGER array of DIMENSION (dlen_).
if 1d type (dtype_a = 501 or 502), dlen≥ 7;
if 2d type (dtype_a = 1), dlen≥ 9. The array descriptor for
the distributed matrix A. Contains information of mapping
of A to memory.
b (local)
REAL for psdbtrsv
DIMENSION lld_b≥nb. On entry, this array contains the
local pieces of the right-hand sides B(ib:ib+n-1, 1:nrhs).
descb (global and local) INTEGER array of DIMENSION (dlen_).
if 1d type (dtype_b =502), dlen≥7;
if 2d type (dtype_b =1), dlen≥9. The array descriptor for
the distributed matrix B. Contains information of mapping
B to memory.
laf (local)
INTEGER. Size of user-input Auxiliary Filling space af.
laf must be ≥nb*(bwl+bwu)+6*max(bwl, bwu)*max(bwl,
bwu). If laf is not large enough, an error code is returned
work (local).
2136
REAL for psdbtrsv
between calls to routines.
work must be the size given in lwork.
Size of user-input workspace work. If lwork is too small,
the minimal acceptable size will be returned in work(1) and
an error code is returned.
lwork≥ max(bwl, bwu)*nrhs.
Output Parameters
a (local).
This local portion is stored in the packed banded format
used in LAPACK. Please see the ScaLAPACK manual for more
detail on the format of distributed matrices.
af (local).
REAL for psdbtrsv
Auxiliary Filling Space. Filling is created during the
factorization routine p?dbtrf and this is stored in af. If a
linear system is to be solved using p?dbtrf after the
factorization routine, af must not be altered after the
factorization.
work On exit, work( 1 ) contains the minimal lwork.
info (local).
INTEGER. If info = 0, the execution is successful.
2137

-i.
p?dttrsv
Computes an LU factorization of a general band
matrix, using partial pivoting with row
interchanges. The routine is called by p?dttrs.
Syntax
call psdttrsv(uplo, trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af,
call pddttrsv(uplo, trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af,
call pcdttrsv(uplo, trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af,
call pzdttrsv(uplo, trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af,
Description
The routine solves a tridiagonal triangular system of linear equations
A(1 :n, ja:ja+n-1)*X = B(ib:ib+n-1, 1 :nrhs) or
A(1 :n, ja:ja+n-1)T * X = B(ib:ib+n-1, 1 :nrhs) for real flavors; A(1 :n,
ja:ja+n-1)H * X = B(ib:ib+n-1, 1 :nrhs) for complex flavors,
where A(1 :n, ja:ja+n-1) is a tridiagonal matrix factor produced by the Gaussian elimination
code PS@(dom_pre)TTRF and is stored in A(1 :n, ja:ja+n-1) and af.
The matrix stored in A(1 :n, ja:ja+n-1) is either upper or lower triangular according to
uplo, and the choice of solving A(1 :n, ja:ja+n-1) or A(1 :n, ja:ja+n-1)T is dictated
by the user by the parameter trans.
Routine p?dttrf must be called first.
2138
Input Parameters
If uplo='U', the upper triangle of A(1:n, ja:ja+n-1) is
stored,
if uplo = 'L', the lower triangle of A(1:n, ja:ja+n-1) is
stored.
If trans = 'N', solve with A(1:n, ja:ja+n-1),
if trans = 'C', solve with conjugate transpose A(1:n,
ja:ja+n-1).
A;(n≥ 0).
number of columns of the distributed submatrix
B(ib:ib+n-1, 1:nrhs). (nrhs≥ 0).
dl (local).
REAL for psdttrsv
DOUBLE PRECISION for pddttrsv
COMPLEX for pcdttrsv
COMPLEX*16 for pzdttrsv.
Pointer to local part of global vector storing the lower
Globally, dl(1) is not referenced, and dl must be aligned
with d.
Must be of size ≥desca( nb_ ).
d (local).
REAL for psdttrsv
du (local).
REAL for psdttrsv
2139

Globally, du(n) is not referenced, and du must be aligned
with d.
desca (global and local). INTEGER array of DIMENSION (dlen_).
if 1d type (dtype_a = 501 or 502), dlen≥ 7;
if 2d type (dtype_a = 1), dlen≥ 9.
The array descriptor for the distributed matrix A. Contains
information of mapping of A to memory.
b (local)
REAL for psdttrsv
DIMENSION lld_b≥nb. On entry, this array contains the
local pieces of the right-hand sides B(ib:ib+n-1, 1
:nrhs).
ib (global). INTEGER. The row index in the global array b that
descb (global and local).INTEGER array of DIMENSION (dlen_).
if 1d type (dtype_b = 502), dlen≥7;
if 2d type (dtype_b = 1), dlen≥ 9.
The array descriptor for the distributed matrix B. Contains
information of mapping B to memory.
laf (local).
INTEGER.
Size of user-input Auxiliary Filling space af.
2140
laf must be ≥ 2*(nb+2). If laf is not large enough, an
error code is returned and the minimum acceptable size will
be returned in af(1).
work (local).
REAL for psdttrsv
work must be the size given in lwork.
lwork (local or global).INTEGER.
Size of user-input workspace work. If lwork is too small,
the minimal acceptable size will be returned in work(1) and
an error code is returned.
lwork≥ 10*npcol+4*nrhs.
Output Parameters
dl (local).
On exit, this array contains information containing the
d On exit, this array contains information containing the
factors of the matrix. Must be of size ≥desca (nb_ ).
af (local).
REAL for psdttrsv
Auxiliary Filling Space. Filling is created during the
factorization routine p?dttrf and this is stored in af. If a
linear system is to be solved using p?dttrs after the
factorization routine, af must not be altered after the
factorization.
2141
info (local). INTEGER.

if info< 0: If the i-th argument is an array and the j-entry
had an illegal value, then info = - (i*100+j), if the i-th
-i.
p?gebd2
Reduces a general rectangular matrix to real
bidiagonal form by an orthogonal/unitary
Syntax
call psgebd2(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
call pdgebd2(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
call pcgebd2(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
call pzgebd2(m, n, a, ia, ja, desca, d, e, tauq, taup, work, lwork, info)
Description
The routine reduces a real/complex general m-by-n distributed matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) to upper or lower bidiagonal form B by an orthogonal/unitary transformation:
Q'*sub(A)*P = B.
If m ≥ n, B is the upper bidiagonal; if m<n, B is the lower bidiagonal.
Input Parameters
m (global) INTEGER.
The number of rows of the distributed submatrix sub(A).
(m≥0).
n (global) INTEGER.
The order of the distributed submatrix sub(A). (n≥0).
a (local).
REAL for psgebd2
2142
DOUBLE PRECISION for pdgebd2
COMPLEX for pcgebd2
COMPLEX*16 for pzgebd2.
Pointer into the local memory to an array of
DIMENSION(lld_a, LOCc(ja+n-1)).
On entry, this array contains the local pieces of the general
desca (global and local) INTEGER array, DIMENSION (dlen_). The
work (local).
REAL for psgebd2
COMPLEX for pcgebd2
This is a workspace array of DIMENSION (lwork).
lwork is local input and must be at least lwork ≥
max(mpa0, nqa0),
where nb = mb_a = nb_a, iroffa = mod(ia-1, nb),
iarow = indxg2p(ia, nb, myrow, rsrc_a, nprow),
iacol = indxg2p(ja, nb, mycol, csrc_a, npcol),
mpa0 = numroc(m+iroffa, nb, myrow, iarow,
nprow),
nqa0 = numroc(n+icoffa, nb, mycol, iacol,
npcol).
indxg2p and numroc are ScaLAPACK tool functions; myrow,
mycol, nprow, and npcol can be determined by calling the
2143
Output Parameters
a (local).
On exit, if m ≥ n, the diagonal and the first superdiagonal
of sub(A) are overwritten with the upper bidiagonal matrix
B; the elements below the diagonal, with the array tauq,
elementary reflectors, and the elements above the first
superdiagonal, with the array taup, represent the orthogonal
matrix P as a product of elementary reflectors. If m < n,
the diagonal and the first subdiagonal are overwritten with
the lower bidiagonal matrix B; the elements below the first
subdiagonal, with the array tauq, represent the
reflectors, and the elements above the diagonal, with the
array taup, represent the orthogonal matrix P as a product
of elementary reflectors. See Applications Notes below.
d (local)
REAL for psgebd2
COMPLEX for pcgebd2
Array, DIMENSION LOCc(ja+min(m,n)-1) if m ≥ n;
elements of the bidiagonal matrix B: d(i) = a(i,i). d is
e (local)
REAL for psgebd2
COMPLEX for pcgebd2
Array, DIMENSION LOCc(ja+min(m,n)-1) if m ≥ n;
elements of the bidiagonal matrix B:
if m ≥ n, e(i) = a(i, i+1) for i = 1, 2, ... , n-1;
if m < n, e(i) = a(i+1, i) for i = 1, 2, ..., m-1.
2144
tauq (local).
REAL for psgebd2
COMPLEX for pcgebd2
Array, DIMENSIONLOCc(ja+min(m,n)-1). The scalar factors
of the elementary reflectors which represent the
orthogonal/unitary matrix Q. tauq is tied to the distributed
matrix A.
taup (local).
REAL for psgebd2
COMPLEX for pcgebd2
Array, DIMENSION LOCr(ia+min(m,n)-1). The scalar
factors of the elementary reflectors which represent the
orthogonal/unitary matrix P. taup is tied to the distributed
matrix A.
work On exit, work(1) returns the minimal and optimal lwork.
info (local)
INTEGER.
if info < 0: If the i-th argument is an array and the
j-entry had an illegal value, then info = - (i*100+j),
then info = -i.
Application Notes
If m≥n,
Q = H(1)*H(2)*...*H(n), and P = G(1)*G(2)*...*G(n-1)

H(i) = I - tauq*v*v', and G(i) = I - taup*u*u',

where tauq and taup are real/complex scalars, and v and u are real/complex vectors. v(1:
i-1) = 0, v(i) = 1, and v(i+i:m) is stored on exit in
2145
A(ia+i-ia+m-1, a+i-1);
u(1:i) = 0, u(i+1) = 1, and u(i+2:n) is stored on exit in A(ia+i-1, ja+i+1:ja+n-1);
tauq is stored in TAUQ(ja+i-1) and taup in TAUP(ia+i-1).
If m < n,
v(1: i) = 0, v(i+1) = 1, and v(i+2:m) is stored on exit in A(ia+i+1: ia+m-1, ja+i-1);
u(1: i-1) = 0, u(i) = 1, and u(i+1 :n) is stored on exit in A(ia+i-1,ja+i:ja+n-1);
tauq is stored in TAUQ(ja+i-1) and taup in TAUP(ia+i-1).
The contents of sub(A) on exit are illustrated by the following examples:
where d and e denote diagonal and off-diagonal elements of B, vi denotes an element of the
vector defining H(i), and ui an element of the vector defining G(i).
p?gehd2
Syntax
call psgehd2(n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
call pdgehd2(n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
call pcgehd2(n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
call pzgehd2(n, ilo, ihi, a, ia, ja, desca, tau, work, lwork, info)
Description
2146
The routine reduces a real/complex general distributed matrix sub(A) to upper Hessenberg
form H by an orthogonal/unitary similarity transformation: Q'*sub(A)*Q = H, where sub(A)
= A(ia+n-1 :ia+n-1, ja+n-1 :ja+n-1).
Input Parameters
n (global) INTEGER. The order of the distributed submatrix A.
(n≥ 0).
ilo, ihi (global) INTEGER. It is assumed that sub(A) is already upper
triangular in rows ia:ia+ilo-2 and ia+ihi:ia+n-1 and
columns ja:ja+jlo-2 and ja+jhi:ja+n-1. See Application
Notes for further information.
If n≥ 0, 1 ≤ ilo ≤ ihi ≤ n; otherwise set ilo = 1, ihi
= n.
a (local).
REAL for psgehd2
DOUBLE PRECISION for pdgehd2
COMPLEX for pcgehd2
COMPLEX*16 for pzgehd2.
general distributed matrix sub(A) to be reduced.
work (local).
REAL for psgehd2
COMPLEX for pcgehd2
lwork (local or global). INTEGER.
2147
lwork is local input and must be at least lwork≥nb + max(

npa0, nb ), where nb = mb_a = nb_a, iroffa = mod(
ia-1, nb ), iarow = indxg2p ( ia, nb, myrow,
rsrc_a, nprow ),npa0 = numroc(ihi+iroffa, nb,
myrow, iarow, nprow ).
indxg2p and numroc are ScaLAPACK tool functions;myrow,
Output Parameters
a (local). On exit, the upper triangle and the first subdiagonal
of sub(A) are overwritten with the upper Hessenberg matrix
H, and the elements below the first subdiagonal, with the
array tau, represent the orthogonal/unitary matrix Q as a
product of elementary reflectors. (see Application Notes
below).
tau (local).
REAL for psgehd2
COMPLEX for pcgehd2
Array, DIMENSION LOCc(ja+n-2) The scalar factors of the
elementary reflectors (see Application Notes below).
Elements ja:ja+ilo-2 and ja+ihi:ja+n-2 of tau are set
to zero. tau is tied to the distributed matrix A.
info (local).INTEGER.
if info < 0: If the i-th argument is an array and the j-entry
-i.
2148
Application Notes
The matrix Q is represented as a product of (ihi-ilo) elementary reflectors
Q = H(ilo)*H(ilo+1)*...*H(ihi-1).
H(i) = I - tau*v*v',
where tau is a real/complex scalar, and v is a real/complex vector with v(1: i)=0, v(i+1)=1
and v(ihi+1:n)=0; v(i+2:ihi) is stored on exit in A(ia+ilo+i:ia+ihi-1, ia+ilo+i-2),
and tau in tau(ja+ilo+i-2).
The contents of A(ia:ia+n-1, ja:ja+n-1) are illustrated by the following example, with n =
7, ilo = 2 and ihi = 6:
where a denotes an element of the original matrix sub(A), h denotes a modified element of the
upper Hessenberg matrix H, and vi denotes an element of the vector defining H(ja+ilo+i-2).
p?gelq2
Computes an LQ factorization of a general
rectangular matrix (unblocked algorithm).
Syntax
call psgelq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgelq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgelq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgelq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
2149
Description
The routine computes an LQ factorization of a real/complex distributed m-by-n matrix sub(A)
= A(ia:ia+m-1, ja:ja+n-1) = L*Q.
Input Parameters
m (global) INTEGER.
The number of rows to be operated on, that is, the number
of rows of the distributed submatrix sub(A). (m≥0).
n (global) INTEGER.
The number of columns to be operated on, that is, the
number of columns of the distributed submatrix sub(A).
(n≥0).
a (local).
REAL for psgelq2
DOUBLE PRECISION for pdgelq2
COMPLEX for pcgelq2
COMPLEX*16 for pzgelq2.
On entry, this array contains the local pieces of the m-by-n
distributed matrix sub(A) which is to be factored.
work (local).
REAL for psgelq2
COMPLEX for pcgelq2
2150
lwork is local input and must be at least lwork≥nq0 + max(
1, mp0 ),
where iroff = mod(ia-1, mb_a), icoff = mod(ja-1,
nb_a),
iarow = indxg2p(ia, mb_a, myrow, rsrc_a, nprow),
iacol = indxg2p(ja, nb_a, mycol, csrc_a, npcol),
mp0 = numroc(m+iroff, mb_a, myrow, iarow, nprow),
nq0 = numroc(n+icoff, nb_a, mycol, iacol, npcol),
Output Parameters
a (local).
On exit, the elements on and below the diagonal of sub(A)
contain the m by min(m,n) lower trapezoidal matrix L (L is
lower triangular if m ≤ n); the elements above the diagonal,
Q as a product of elementary reflectors (see Application
Notes below).
tau (local).
REAL for psgelq2
COMPLEX for pcgelq2
Array, DIMENSION LOCr(ia+min(m, n)-1). This array
contains the scalar factors of the elementary reflectors. tau
2151
info (local).INTEGER. If info = 0, the execution is successful.

-i.
Application Notes
Q =H(ia+k-1)*H(ia+k-2)*. . . *H(ia) for real flavors, Q

=(H(ia+k-1))H*(H(ia+k-2))H...*(H(ia))H for complex flavors,
where k = min(m,n).
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1: i-1) = 0 and
v(i) = 1; v(i+1: n) (for real flavors) or conjg(v(i+1: n)) (for complex flavors) is stored
on exit in A(ia+i-1,ja+i:ja+n-1), and tau in TAU(ia+i-1).
p?geql2
Computes a QL factorization of a general
Syntax
call psgeql2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeql2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeql2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeql2(m, n, a, ia, ja, desca, tau, work, lwork, info)
Description
The routine computes a QL factorization of a real/complex distributed m-by-n matrix sub(A)
= A(ia:ia+m-1, ja:ja+n-1)= Q *L.
2152
Input Parameters
m (global) INTEGER.
of rows of the distributed submatrix sub(A). (m≥ 0).
n (global) INTEGER.
number of columns of the distributed submatrix sub(A). (n≥
0).
a (local).
REAL for psgeql2
DOUBLE PRECISION for pdgeql2
COMPLEX for pcgeql2
COMPLEX*16 for pzgeql2.
Pointer into the local memory to an array of DIMENSION
(lld_a,LOCc (ja+n-1)).
work (local).
REAL for psgeql2
COMPLEX for pcgeql2
lwork is local input and must be at least lwork≥mp0 +
max(1, nq0),
nb_a),
2153

Output Parameters
a (local).
On exit,
if m ≥ n, the lower triangle of the distributed submatrix
A(ia+m-n:ia+m-1, ja:ja+n-1) contains the n-by-n lower
triangular matrix L;
if m ≤ n, the elements on and below the (n-m)-th
superdiagonal contain the m-by-n lower trapezoidal matrix
L; the remaining elements, with the array tau, represent
the orthogonal/ unitary matrix Q as a product of elementary
reflectors (see Application Notes below).
tau (local).
REAL for psgeql2
COMPLEX for pcgeql2
Array, DIMENSION LOCc(ja+n-1). This array contains the
scalar factors of the elementary reflectors. tau is tied to the
2154
If info = 0, the execution is successful. if info < 0: If the
value, then info = - (i*100+j), if the i-th argument is
a scalar and had an illegal value, then info = -i.
Application Notes
Q = H(ja+k-1)*...*H(ja+1)*H(ja), where k = min(m,n).

H(i) = I- tau *v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(m-k+i+1: m) = 0
and v(m-k+i) = 1; v(1: m-k+i-1) is stored on exit in A(ia:ia+m-k+i-2, ja+n-k+i-1), and
tau in TAU(ja+n-k+i-1).
p?geqr2
Computes a QR factorization of a general
Syntax
call psgeqr2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeqr2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeqr2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeqr2(m, n, a, ia, ja, desca, tau, work, lwork, info)
Description
The routine computes a QR factorization of a real/complex distributed m-by-n matrix sub(A)
= A(ia:ia+m-1, ja:ja+n-1)= Q*R.
Input Parameters
m (global). INTEGER.
2155

n (global).INTEGER. The number of columns to be operated
submatrix sub(A). (n≥0).
a (local).
REAL for psgeqr2
DOUBLE PRECISION for pdgeqr2
COMPLEX for pcgeqr2
COMPLEX*16 for pzgeqr2.
(lld_a, LOCc (ja+n-1)).
work (local).
REAL for psgeqr2
COMPLEX for pcgeqr2
lwork is local input and must be at least lwork≥mp0+max(1,
nq0),
nb_a),
nq0 = numroc(n+icoff, nb_a, mycol, iacol, npcol).
2156
Output Parameters
a (local).
On exit, the elements on and above the diagonal of sub(A)
contain the min(m,n) by n upper trapezoidal matrix R (R is
upper triangular if m≥n); the elements below the diagonal,
Q as a product of elementary reflectors (see Application
Notes below).
tau (local).
REAL for psgeqr2
COMPLEX for pcgeqr2
Array, DIMENSION LOCc(ja+min(m,n)-1). This array
contains the scalar factors of the elementary reflectors. tau
If info = 0, the execution is successful. if info < 0:
illegal value, then info = - (i*100+j),
then info = -i.
Application Notes
Q = H(ja)*H(ja+1)*. . .* H(ja+k-1), where k = min(m,n).
2157

H(j)= I - tau*v*v',
where tau is a real/complex scalar, and v is a real/complex vector with v(1: i-1) = 0 and v(i)
= 1; v(i+1: m) is stored on exit in A(ia+i:ia+m-1, ja+i-1), and tau in TAU(ja+i-1).
p?gerq2
Computes an RQ factorization of a general
Syntax
call psgerq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgerq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgerq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgerq2(m, n, a, ia, ja, desca, tau, work, lwork, info)
Description
The routine computes an RQ factorization of a real/complex distributed m-by-n matrix sub(A)
= A(ia:ia+m-1, ja:ja+n-1) = R*Q.
Input Parameters
submatrix sub(A). (n≥0).
a (local).
REAL for psgerq2
DOUBLE PRECISION for pdgerq2
COMPLEX for pcgerq2
COMPLEX*16 for pzgerq2.
2158
work (local).
REAL for psgerq2
COMPLEX for pcgerq2
lwork is local input and must be at least lwork≥nq0 +
max(1, mp0), where
mp0 = numroc( m+iroff, mb_a, myrow, iarow,
nprow),
Output Parameters
a (local).
On exit,
2159
if m ≤ n, the upper triangle of A(ia+m-n:ia+m-1, ja:ja+n-1)

contains the m-by-m upper triangular matrix R;
if m ≥ n, the elements on and above the (m-n)-th subdiagonal
contain the m-by-n upper trapezoidal matrix R; the remaining
elements, with the array tau, represent the orthogonal/
unitary matrix Q as a product of elementary reflectors (see
tau (local).
REAL for psgerq2
COMPLEX for pcgerq2
Array, DIMENSION LOCr(ia+m -1). This array contains the
-i.
Application Notes
Q = H(ia)*H(ia+1)*...*H(ia+k-1) for real flavors,

Q = (H(ia))H*(H(ia+1))H...*(H(ia+k-1))H for complex flavors,
where k = min(m, n).
where tau is a real/complex scalar, and v is a real/complex vector with v(n-k+i+1:n) = 0
and v(n-k+i) = 1; v(1:n-k+i-1) for real flavors or conjg(v(1:n-k+i-1)) for complex
flavors is stored on exit in A(ia+m-k+i-1, ja:ja+n-k+i-2), and tau in TAU(ia+m-k+i-1).
2160
p?getf2
Computes an LU factorization of a general matrix,
using partial pivoting with row interchanges (local
blocked algorithm).
Syntax
call psgetf2(m, n, a, ia, ja, desca, ipiv, info)
call pdgetf2(m, n, a, ia, ja, desca, ipiv, info)
call pcgetf2(m, n, a, ia, ja, desca, ipiv, info)
call pzgetf2(m, n, a, ia, ja, desca, ipiv, info)
Description
The routine computes an LU factorization of a general m-by-n distributed matrix sub(A) =
A(ia:ia+m-1, ja:ja+n-1) using partial pivoting with row interchanges.
The factorization has the form sub(A) = P * L* U, where P is a permutation matrix, L is
lower triangular with unit diagonal elements (lower trapezoidal if m>n), and U is upper triangular
(upper trapezoidal if m < n). This is the right-looking Parallel Level 2 BLAS version of the
algorithm.
Input Parameters
submatrix sub(A). (nb_a - mod(ja-1, nb_a)≥n≥0).
a (local).
REAL for psgetf2
DOUBLE PRECISION for pdgetf2
COMPLEX for pcgetf2
COMPLEX*16 for pzgetf2.
2161

Output Parameters
ipiv (local). INTEGER.
Array, DIMENSION(LOCr(m_a) + mb_a). This array contains
the pivoting information. ipiv(i) -> The global row that
local row i was swapped with. This array is tied to the
If info < 0:
• if the i-th argument is an array and the j-entry had an

illegal value, then info = -(i*100+j),
• if the i-th argument is a scalar and had an illegal value,
then info = - i.
If info > 0: If info = k, u(ia+k-1, ja+k-1 ) is exactly

zero. The factorization has been completed, but the factor
u is exactly singular, and division by zero will occur if it is
used to solve a system of equations.
2162
p?labrd
Reduces the first nb rows and columns of a general
rectangular matrix A to real bidiagonal form by an
orthogonal/unitary transformation, and returns
auxiliary matrices that are needed to apply the
Syntax
call pslabrd(m, n, nb, a, ia, ja, desca, d, e, tauq, taup, x, ix, jx, descx,
y, iy, jy, descy, work)
call pdlabrd(m, n, nb, a, ia, ja, desca, d, e, tauq, taup, x, ix, jx, descx,
call pclabrd(m, n, nb, a, ia, ja, desca, d, e, tauq, taup, x, ix, jx, descx,
call pzlabrd(m, n, nb, a, ia, ja, desca, d, e, tauq, taup, x, ix, jx, descx,
Description
The routine reduces the first nb rows and columns of a real/complex general m-by-n distributed
matrix sub(A) = A(ia:ia+m-1, ja:ja+n-1) to upper or lower bidiagonal form by an
orthogonal/unitary transformation Q'* A * P, and returns the matrices X and Y necessary to
apply the transformation to the unreduced part of sub(A).
If m ≥n, sub(A) is reduced to upper bidiagonal form; if m < n, sub(A) is reduced to lower
bidiagonal form.
This is an auxiliary routine called by p?gebrd.
Input Parameters
of rows of the distributed submatrix sub(A). (m ≥ 0).
2163

submatrix sub(A). (n ≥ 0).
nb (global) INTEGER.
The number of leading rows and columns of sub(A) to be
reduced.
a (local).
REAL for pslabrd
DOUBLE PRECISION for pdlabrd
COMPLEX for pclabrd
COMPLEX*16 for pzlabrd.
On entry, this array contains the local pieces of the general
descx (global and local) INTEGER array, DIMENSION (dlen_). The
iy, jy (global) INTEGER. The row and column indices in the global
array y indicating the first row and the first column of the
submatrix sub(Y), respectively.
descy (global and local) INTEGER array, DIMENSION (dlen_). The
array descriptor for the distributed matrix Y.
work (local).
REAL for pslabrd
COMPLEX for pclabrd
COMPLEX*16 for pzlabrd
Workspace array, DIMENSION(lwork)
2164
lwork ≥ nb_a + nq,
with nq = numroc(n+mod(ia-1, nb_y), nb_y, mycol,
iacol, npcol)
iacol = indxg2p (ja, nb_a, mycol, csrc_a, npcol)
Output Parameters
a (local)
On exit, the first nb rows and columns of the matrix are
overwritten; the rest of the distributed matrix sub(A) is
unchanged.
If m ≥ n, elements on and below the diagonal in the first
nb columns, with the array tauq, represent the
reflectors;and elements above the diagonal in the first nb
rows, with the array taup, represent the orthogonal/unitary
matrix P as a product of elementary reflectors.
If m < n, elements below the diagonal in the first nb
reflectors, and elements on and above the diagonal in the
first nb rows, with the array taup, represent the
orthogonal/unitary matrix P as a product of elementary
reflectors. See Application Notes below.
d (local).
REAL for pslabrd
COMPLEX for pclabrd
Array, DIMENSION LOCr(ia+min(m,n)-1) if m ≥ n;
LOCc(ja+min(m,n)-1) otherwise. The distributed diagonal
elements of the bidiagonal distributed matrix B:
d(i) = A(ia+i-1, ja+i-1).
2165
e (local).
REAL for pslabrd
COMPLEX for pclabrd
Array, DIMENSION LOCr(ia+min(m,n)-1) if m ≥ n;
LOCc(ja+min(m,n)-2) otherwise. The distributed
off-diagonal elements of the bidiagonal distributed matrix
B:
if m ≥ n, E(i) = A(ia+i-1, ja+i) for i = 1, 2, ...,
n-1;
if m<n, E(i) = A(ia+i, ja+i-1) for i = 1, 2, ...,
m-1.
tauq, taup (local).
REAL for pslabrd
COMPLEX for pclabrd
Array DIMENSION LOCc(ja+min(m, n)-1) for tauq,
DIMENSION LOCr(ia+min(m, n)-1) for taup. The scalar
factors of the elementary reflectors which represent the
orthogonal/unitary matrix Q for tauq, P for taup. tauq and
taup are tied to the distributed matrix A. See Application
Notes below.
x (local)
REAL for pslabrd
COMPLEX for pclabrd
(lld_x, nb). On exit, the local pieces of the distributed
m-by-nb matrix X(ix:ix+m-1, jx:jx+nb-1) required to
update the unreduced part of sub(A).
y (local).
REAL for pslabrd
2166
COMPLEX for pclabrd
(lld_y, nb). On exit, the local pieces of the distributed
n-by-nb matrix Y(iy:iy+n-1, jy:jy+nb-1) required to
update the unreduced part of sub(A).
Application Notes
Q = H(1)*H(2)*...*H(nb), and P = G(1)*G(2)*...*G(nb)

H(i) = I - tauq*v*v' , and G(i) = I - taup*u*u',
where tauq and taup are real/complex scalars, and v and u are real/complex vectors.
If m ≥ n, v(1: i-1 ) = 0, v(i) = 1, and v(i:m) is stored on exit in
A(ia+i-1:ia+m-1, ja+i-1); u(1:i) = 0, u(i+1 ) = 1, and u(i+1:n) is stored on exit

in A(ia+i-1, ja+i:ja+n-1); tauq is stored in TAUQ(ja+i-1) and taup in TAUP(ia+i-1).
If m < n, v(1: i) = 0, v(i+1 ) = 1, and v(i+1:m) is stored on exit in

A(ia+i+1:ia+m-1, ja+i-1); u(1:i-1 ) = 0, u(i) = 1, and u(i:n) is stored on exit in
A(ia+i-1, ja+i:ja+n-1); tauq is stored in TAUQ(ja+i-1) and taup in TAUP(ia+i-1). The
elements of the vectors v and u together form the m-by-nb matrix V and the nb-by-n matrix U'
which are necessary, with X and Y, to apply the transformation to the unreduced part of the
matrix, using a block update of the form: sub(A):= sub(A) - V*Y' - X*U'. The contents
of sub(A) on exit are illustrated by the following examples with nb = 2:
where a denotes an element of the original matrix which is unchanged, vi denotes an element
of the vector defining H(i), and ui an element of the vector defining G(i).
2167
p?lacon
the reverse communication for evaluating
matrix-vector products.
Syntax
call pslacon(n, v, iv, jv, descv, x, ix, jx, descx, isgn, est, kase)
call pdlacon(n, v, iv, jv, descv, x, ix, jx, descx, isgn, est, kase)
call pclacon(n, v, iv, jv, descv, x, ix, jx, descx, isgn, est, kase)
call pzlacon(n, v, iv, jv, descv, x, ix, jx, descx, isgn, est, kase)
Description
The routine estimates the 1-norm of a square, real/unitary distributed matrix A. Reverse
communication is used for evaluating matrix-vector products. x and v are aligned with the
distributed matrix A, this information is implicitly contained within iv, ix, descv, and descx.
Input Parameters
n (global).INTEGER. The length of the distributed vectors v
and x. n ≥ 0.
v (local).
REAL for pslacon
DOUBLE PRECISION for pdlacon
COMPLEX for pclacon
COMPLEX*16 for pzlacon.
LOCr(n+mod(iv-1, mb_v)). On the final return, v = a*w,
where est = norm(v)/norm(w) (w is not returned).
iv, jv (global) INTEGER. The row and column indices in the global
array v indicating the first row and the first column of the
submatrix V, respectively.
descv (global and local) INTEGER array, DIMENSION (dlen_). The
array descriptor for the distributed matrix V.
x (local).
2168
REAL for pslacon
DOUBLE PRECISION for pdlacon
COMPLEX for pclacon
COMPLEX*16 for pzlacon.
LOCr(n+mod(ix-1, mb_x)).
submatrix X, respectively.
descx (global and local) INTEGER array, DIMENSION (dlen_). The
isgn (local). INTEGER.
Array, DIMENSION LOCr(n+mod(ix-1, mb_x)). isgn is
aligned with x and v.
kase (local). INTEGER.
On the initial call to p?lacon, kase should be 0.
Output Parameters
x (local).
On an intermediate return, X should be overwritten by A*X,
if kase=1, A' *X, if kase=2,
p?lacon must be re-called with all the other parameters
unchanged.
est (global). REAL for single precision flavors
DOUBLE PRECISION for double precision flavors
kase (local)
INTEGER. On an intermediate return, kase is 1 or 2,
indicating whether X should be overwritten by A*X, or A'*X.
On the final return from p?lacon, kase is again 0.
2169
p?laconsb
Looks for two consecutive small subdiagonal
elements.
Syntax
call pslaconsb(a, desca, i, l, m, h44, h33, h43h34, buf, lwork)
call pdlaconsb(a, desca, i, l, m, h44, h33, h43h34, buf, lwork)
Description
The routine looks for two consecutive small subdiagonal elements by analyzing the effect of
starting a double shift QR iteration given by h44, h33, and h43h34 to see if this process makes
a subdiagonal negligible.
Input Parameters
a (global). REAL for pslaconsb
DOUBLE PRECISION for pdlaconsb
Array, DIMENSION (desca (lld_),*). On entry, the
Hessenberg matrix whose tridiagonal part is being scanned.
Unchanged on exit.
desca (global and local) INTEGER.
Array of DIMENSION (dlen_). The array descriptor for the
i (global) INTEGER.
The global location of the bottom of the unreduced submatrix
of A. Unchanged on exit.
l (global) INTEGER.
The global location of the top of the unreduced submatrix
h44, h33, h43h34 (global). REAL for pslaconsb
These three values are for the double shift QR iteration.
lwork (global) INTEGER.
2170
This must be at least 7*ceil(ceil( (i-l)/hbl
)/lcm(nprow, npcol)). Here lcm is least common multiple
and nprowxnpcol is the logical grid size.
Output Parameters
m (global). On exit, this yields the starting location of the QR
double shift. This will satisfy:
l ≤ m ≤ i-2.
buf (local).
REAL for pslaconsb
Array of size lwork.
lwork (global). On exit, lwork is the size of the work buffer.
p?lacp2
Copies all or part of a distributed matrix to another
distributed matrix.
Syntax
call pslacp2(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
call pdlacp2(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
call pclacp2(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
call pzlacp2(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
Description
The routine copies all or part of a distributed matrix A to another distributed matrix B. No
communication is performed, p?lacp2 performs a local copy sub(A):= sub(B), where sub(A)
denotes A(ia:ia+m-1, a:ja+n-1) and sub(B) denotes B(ib:ib+m-1, jb:jb+n-1).
p?lacp2 requires that only dimension of the matrix operands is distributed.
2171
Input Parameters
uplo (global) CHARACTER. Specifies the part of the distributed
matrix sub(A) to be copied:
= 'U': Upper triangular part is copied; the strictly lower
triangular part of sub(A) is not referenced;
= 'L': Lower triangular part is copied; the strictly upper
triangular part of sub(A) is not referenced.
Otherwise: all of the matrix sub(A) is copied.
m (global) INTEGER.
n (global) INTEGER.
number of columns of the distributed submatrix sub(A). (n
≥ 0).
a (local).
REAL for pslacp2
DOUBLE PRECISION for pdlacp2
COMPLEX for pclacp2
COMPLEX*16 for pzlacp2.
array A indicating the first row and the first column of
sub(A), respectively.
array B indicating the first row and the first column of
sub(B), respectively.
descb (global and local) INTEGER array, DIMENSION (dlen_). The
2172
Output Parameters
b (local).
REAL for pslacp2
COMPLEX for pclacp2
COMPLEX*16 for pzlacp2.
(lld_b, LOCc(jb+n-1) ). This array contains on exit the local
pieces of the distributed matrix sub( B ) set as follows:
if uplo = 'U', B(ib+i-1, jb+j-1) = A(ia+i-1, ja+j-1),
1 ≤ i ≤ j, 1 ≤ j ≤ n;
if uplo = 'L', B(ib+i-1, jb+j-1) = A(ia+i-1, ja+j-1),
j ≤ i ≤ m, 1≤ j ≤ n;
otherwise, B(ib+i-1, jb+j-1) = A(ia+i-1, ja+j-1), 1 ≤ i
≤ m, 1 ≤ j ≤ n.
p?lacp3
Copies from a global parallel array into a local
replicated array or vice versa.
Syntax
call pslacp3(m, i, a, desca, b, ldb, ii, jj, rev)
call pdlacp3(m, i, a, desca, b, ldb, ii, jj, rev)
Description
This is an auxiliary routine that copies from a global parallel array into a local replicated array
or vise versa. Note that the entire submatrix that is copied gets placed on one node or more.
The receiving node can be specified precisely, or all nodes can receive, or just one row or
column of nodes.
Input Parameters
m (global) INTEGER.
m is the order of the square submatrix that is copied.
2173
m ≥ 0. Unchanged on exit.
i (global) INTEGER. A(i, i) is the global location that the
copying starts from. Unchanged on exit.
a (global). REAL for pslacp3
Array, DIMENSION (desca(lld_),*). On entry, the parallel
matrix to be copied into or from.
b (local).
REAL for pslacp3
Array, DIMENSION (ldb, m). If rev = 0, this is the global
portion of the array A(i:i+m-1, i:i+m-1). If rev = 1,
this is the unchanged on exit.
ldb (local)
INTEGER.
The leading dimension of B.
ii (global) INTEGER. By using rev 0 and 1, data can be sent
out and returned again. If rev = 0, then ii is destination
row index for the node(s) receiving the replicated B. If ii
≥ 0, jj ≥ 0, then node (ii, jj) receives the data. If ii =
-1, jj ≥ 0, then all rows in column jj receive the data. If
ii ≥ 0, jj = -1, then all cols in row ii receive the data. f
ii = -1, jj = -1, then all nodes receive the data. If rev
!=0, then ii is the source row index for the node(s) sending
the replicated B.
jj (global) INTEGER. Similar description as ii above.
rev (global) INTEGER. Use rev = 0 to send global A into locally
replicated B (on node (ii, jj)). Use rev != 0 to send locally
replicated B from node (ii, jj) to its owner (which changes
depending on its location in A) into the global A.
2174
Output Parameters
a (global). On exit, if rev = 1, the copied data. Unchanged
on exit if rev = 0.
b (local). If rev = 1, this is unchanged on exit.
p?lacpy
Copies all or part of one two-dimensional array to
another.
Syntax
call pslacpy(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
call pdlacpy(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
call pclacpy(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
call pzlacpy(uplo, m, n, a, ia, ja, desca, b, ib, jb, descb)
Description
The routine copies all or part of a distributed matrix A to another distributed matrix B. No
communication is performed, p?lacpy performs a local copy sub(A):= sub(B), where sub(A)
denotes A(ia:ia+m-1,ja:ja+n-1) and sub(B) denotes B(ib:ib+m-1,jb:jb+n-1).
Input Parameters
uplo (global). CHARACTER. Specifies the part of the distributed
matrix sub(A) to be copied:
= 'U': Upper triangular part; the strictly lower triangular
part of sub(A) is not referenced;
= 'L': Lower triangular part; the strictly upper triangular
part of sub(A) is not referenced.
Otherwise: all of the matrix sub(A) is copied.
m (global) INTEGER.
of rows of the distributed submatrix sub(A). (m ≥0).
n (global) INTEGER.
2175

(n≥0).
a (local).
REAL for pslacpy
DOUBLE PRECISION for pdlacpy
COMPLEX for pclacpy
COMPLEX*16 for pzlacpy.
On entry, this array contains the local pieces of the
array B indicating the first row and the first column of sub(B)
respectively.
Output Parameters
b (local).
REAL for pslacpy
DOUBLE PRECISION for pdlacpy
COMPLEX for pclacpy
COMPLEX*16 for pzlacpy.
(lld_b, LOCc(jb+n-1) ). This array contains on exit the local
pieces of the distributed matrix sub( B ) set as follows:
if uplo = 'U', B(ib+i-1, jb+j-1) = A(ia+i-1, ja+j-1),
1≤i≤j, 1≤j≤n;
if uplo = 'L', B(ib+i-1, jb+j-1) = A(ia+i-1,
ja+j-1),j≤i≤m, 1≤j≤n;
2176
otherwise, B(ib+i-1, jb+j-1) = A(ia+i-1, ja+j-1), 1≤i≤m,
1≤j≤n.
p?laevswp
Moves the eigenvectors from where they are
computed to ScaLAPACK standard block cyclic
array.
Syntax
call pslaevswp(n, zin, ldzi, z, iz, jz, descz, nvs, key, rwork, lrwork)
call pdlaevswp(n, zin, ldzi, z, iz, jz, descz, nvs, key, rwork, lrwork)
call pclaevswp(n, zin, ldzi, z, iz, jz, descz, nvs, key, rwork, lrwork)
call pzlaevswp(n, zin, ldzi, z, iz, jz, descz, nvs, key, rwork, lrwork)
Description
The routine moves the eigenvectors (potentially unsorted) from where they are computed, to
a ScaLAPACK standard block cyclic array, sorted so that the corresponding eigenvalues are
sorted.
Input Parameters
The order of the matrix A. n ≥ 0.
zin (local).
REAL for pslaevswp
DOUBLE PRECISION for pdlaevswp
COMPLEX for pclaevswp
COMPLEX*16 for pzlaevswp. Array, DIMENSION (ldzi,
nvs(iam) ). The eigenvectors on input. Each eigenvector
resides entirely in one process. Each process holds a
2177
contiguous set of nvs(iam) eigenvectors. The first

eigenvector which the process holds is: sum for i=[0, iam-1)
of nvs(i).
ldzi (local)
INTEGER.The leading dimension of the zin array.
iz, jz (global) INTEGER.The row and column indices in the global
array Z indicating the first row and the first column of the
descz (global and local) INTEGER array, DIMENSION (dlen_). The
nvs (global) INTEGER.
Array, DIMENSION( nprocs+1)
nvs(i) = number of processes number of eigenvectors held
by processes [0, i-1)
nvs(1) = number of eigen vectors held by[0, 1 -1) = 0
nvs(nprocs+1)= number of eigen vectors held by [0,
nprocs)= total number of eigenvectors.
key (global) INTEGER.
Array, DIMENSION (n). Indicates the actual index (after
sorting) for each of the eigenvectors.
rwork (local).
REAL for pslaevswp
DOUBLE PRECISIONfor pdlaevswp
COMPLEX*16 for pzlaevswp. Array, DIMENSION (lrwork).
lrwork (local)
INTEGER. Dimension of work.
Output Parameters
z (local).
REAL for pslaevswp
DOUBLE PRECISION for pdlaevswp
COMPLEX*16 for pzlaevswp.
2178
Array, global DIMENSION (n, n), local DIMENSION
(descz(dlen_), nq). The eigenvectors on output. The
eigenvectors are distributed in a block cyclic manner in both
dimensions, with a block size of nb.
p?lahrd
Reduces the first nb columns of a general
rectangular matrix A so that elements below the
k-th subdiagonal are zero, by an orthogonal/unitary
transformation, and returns auxiliary matrices that
are needed to apply the transformation to the
unreduced part of A.
Syntax
call pslahrd(n, k, nb, a, ia, ja, desca, tau, t, y, iy, jy, descy, work)
call pdlahrd(n, k, nb, a, ia, ja, desca, tau, t, y, iy, jy, descy, work)
call pclahrd(n, k, nb, a, ia, ja, desca, tau, t, y, iy, jy, descy, work)
call pzlahrd(n, k, nb, a, ia, ja, desca, tau, t, y, iy, jy, descy, work)
Description
The routine reduces the first nb columns of a real general n-by-(n-k+1) distributed matrix
A(ia:ia+n-1 , ja:ja+n-k) so that elements below the k-th subdiagonal are zero. The
reduction is performed by an orthogonal/unitary similarity transformation Q'*A*Q. The routine
returns the matrices V and T which determine Q as a block reflector I-V*T*V', and also the
matrix Y = A*V*T.
This is an auxiliary routine called by p?gehrd. In the following comments sub(A) denotes
A(ia:ia+n-1, ja:ja+n-1).
Input Parameters
n (global) INTEGER.
The order of the distributed submatrix sub(A). n ≥ 0.
k (global) INTEGER.
2179
The offset for the reduction. Elements below the k-th

subdiagonal in the first nb columns are reduced to zero.
The number of columns to be reduced.
a (local).
REAL for pslahrd
DOUBLE PRECISION for pdlahrd
COMPLEX for pclahrd
COMPLEX*16 for pzlahrd.
(lld_a, LOCc(ja+n-k)). On entry, this array contains the
local pieces of the n-by-(n-k+1) general distributed matrix
A(ia:ia+n-1, ja:ja+n-k).
iy, jy (global) INTEGER. The row and column indices in the global
array Y indicating the first row and the first column of the
submatrix sub(Y), respectively.
descy (global and local) INTEGER array, DIMENSION (dlen_). The
array descriptor for the distributed matrix Y.
work (local).
REAL for pslahrd
COMPLEX for pclahrd
Output Parameters
a (local).
On exit, the elements on and above the k-th subdiagonal
corresponding elements of the reduced distributed
matrix;the elements below the k-th subdiagonal, with the
2180
array tau, represent the matrix Q as a product of elementary
reflectors. The other columns of A(ia:ia+n-1, ja:ja+n-k)
are unchanged. (See Application Notes below.)
tau (local)
REAL for pslahrd
COMPLEX for pclahrd
Array, DIMENSION LOCc(ja+n-2). The scalar factors of the
elementary reflectors (see Application Notes below). tau is
t (local)REAL for pslahrd
COMPLEX for pclahrd
Array, DIMENSION (nb_a, nb_a) The upper triangular matrix
T.
y (local).
REAL for pslahrd
COMPLEX for pclahrd
(lld_y, nb_a). On exit, this array contains the local pieces
of the n-by-nb distributed matrix Y. lld_y ≥ LOCr(ia+n-1).
Application Notes
The matrix Q is represented as a product of nb elementary reflectors
Q = H(1)*H(2)*...*H(nb).
H(i) = i-tau*v*v',
where tau is a real/complex scalar, and v is a real/complex vector with v(1: i+k-1)= 0, v(i+k)=
1; v(i+k+1:n) is stored on exit in A(ia+i+k:ia+n-1, ja+i-1), and tau in TAU(ja+i-1).
2181
the form: A(ia:ia+n-1, ja:ja+n-k) := (I-V*T*V')*(A(ia:ia+n-1, ja:ja+n-k)-Y*V'). The
contents of A(ia:ia+n-1, ja:ja+n-k) on exit are illustrated by the following example with n
= 7, k = 3, and nb = 2:
where a denotes an element of the original matrix A(ia:ia+n-1, ja:ja+n-k), h denotes a

modified element of the upper Hessenberg matrix H, and vi denotes an element of the vector
defining H(i).
p?laiect
Exploits IEEE arithmetic to accelerate the
computations of eigenvalues. (C interface function).
Syntax
void pslaiect(float *sigma, int *n, float *d, int *count);
void pdlaiectb(float *sigma, int *n, float *d, int *count);
void pdlaiectl(float *sigma, int *n, float *d, int *count);
Description
The routine computes the number of negative eigenvalues of (A- σI). This implementation of
the Sturm Sequence loop exploits IEEE arithmetic and has no conditionals in the innermost
loop. The signbit for real routine pslaiect is assumed to be bit 32. Double precision routines
pdlaiectb and pdlaiectl differ in the order of the double precision word storage and,
2182
consequently, in the signbit location. For pdlaiectb, the double precision word is stored in
the big-endian word order and the signbit is assumed to be bit 32. For pdlaiectl, the double
precision word is stored in the little-endian word order and the signbit is assumed to be bit 64.
Note that all arguments are call-by-reference so that this routine can be directly called from
Fortran code.
This is a ScaLAPACK internal subroutine and arguments are not checked for unreasonable
values.
Input Parameters
sigma Realfor pslaiect
DOUBLE PRECISIONfor pdlaiectb/pdlaiectl.
The shift. p?laiect finds the number of eigenvalues less
than equal to sigma.
n INTEGER. The order of the tridiagonal matrix T. n ≥ 1.
d Real for pslaiect
DOUBLE PRECISION for pdlaiectb/pdlaiectl.
Array of DIMENSION(2n -1).
On entry, this array contains the diagonals and the squares
of the off-diagonal elements of the tridiagonal matrix T.
These elements are assumed to be interleaved in memory
for better cache performance. The diagonal entries of T are
in the entries d(1), d(3),..., d(2n-1), while the
squares of the off-diagonal entries are d(2), d(4), ...,
d(2n-2). To avoid overflow, the matrix must be scaled so
that its largest entry is no greater than overflow(1/2) *
Output Parameters
n INTEGER. The count of the number of eigenvalues of T less
than or equal to sigma.
2183
p?lange
element, of a general rectangular matrix.
Syntax
val = pslange(norm, m, n, a, ia, ja, desca, work)
val = pdlange(norm, m, n, a, ia, ja, desca, work)
val = pclange(norm, m, n, a, ia, ja, desca, work)
val = pzlange(norm, m, n, a, ia, ja, desca, work)
Description
the element of largest absolute value of a distributed matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1).
Input Parameters
norm (global) CHARACTER. Specifies what value is returned by the
routine:
value of the matrix A, it s not a matrix norm.
of rows of the distributed submatrix sub(A). When m = 0,
p?lange is set to zero. m ≥ 0.
2184
When n = 0, p?lange is set to zero. n ≥ 0.
a (local).
Real for pslange
DOUBLE PRECISION for pdlange
COMPLEX for pclange
COMPLEX*16 for pzlange.
(lld_a, LOCc(ja+n-1)) containing the local pieces of the
work (local).
Real for pslange
DOUBLE PRECISION for pdlange
COMPLEX for pclange
COMPLEX*16 for pzlange.
Array DIMENSION (lwork).
lwork ≥ 0 if norm = 'M' or 'm' (not referenced),
nq0 if norm = '1', 'O' or 'o',
mp0 if norm = 'I' or 'i',
0 if norm = 'F', 'f', 'E' or 'e' (not referenced),
where
nb_a),
mp0 = numroc(m+iroffa, mb_a, myrow, iarow,
nprow),
nq0 = numroc(n+icoffa, nb_a, mycol, iacol,
npcol),
2185
indxg2p and numroc are ScaLAPACK tool routines; myrow,

Output Parameters
val The value returned by the routine.
p?lanhs
element, of an upper Hessenberg matrix.
Syntax
val = pslanhs(norm, n, a, ia, ja, desca, work)
val = pdlanhs(norm, n, a, ia, ja, desca, work)
val = pclanhs(norm, n, a, ia, ja, desca, work)
val = pzlanhs(norm, n, a, ia, ja, desca, work)
Description
the element of largest absolute value of an upper Hessenberg distributed matrix sub(A) =
A(ia:ia+m-1, ja:ja+n-1).
Input Parameters
routine:
2186
n (global) INTEGER.
When n = 0, p?lanhs is set to zero. n ≥ 0.
a (local).
Real for pslanhs
DOUBLE PRECISION for pdlanhs
COMPLEX for pclanhs
COMPLEX*16 for pzlanhs.
DIMENSION(lld_a, LOCc(ja+n-1)) containing the local
pieces of the distributed matrix sub(A).
The row and column indices in the global array A indicating
the first row and the first column of the submatrix sub(A),
respectively.
work (local).
Real for pslanhs
DOUBLE PRECISION for pdlanhs
COMPLEX for pclanhs
COMPLEX*16 for pzlanh.
where
iroffa = mod( ia-1, mb_a ), icoffa = mod( ja-1, nb_a ),
iarow = indxg2p( ia, mb_a, myrow, rsrc_a, nprow ),
iacol = indxg2p( ja, nb_a, mycol, csrc_a, npcol ),
mp0 = numroc( m+iroffa, mb_a, myrow, iarow, nprow ),
nq0 = numroc( n+icoffa, nb_a, mycol, iacol, npcol ),
2187
indxg2p and numroc are ScaLAPACK tool routines; myrow,

imycol, nprow, and npcol can be determined by calling the
Output Parameters
val The value returned by the fuction.
p?lansy, p?lanhe
element, of a real symmetric or a complex
Hermitian matrix.
Syntax
val = pslansy(norm, uplo, n, a, ia, ja, desca, work)
val = pdlansy(norm, uplo, n, a, ia, ja, desca, work)
val = pclansy(norm, uplo, n, a, ia, ja, desca, work)
val = pzlansy(norm, uplo, n, a, ia, ja, desca, work)
val = pclanhe(norm, uplo, n, a, ia, ja, desca, work)
val = pzlanhe(norm, uplo, n, a, ia, ja, desca, work)
Description
The routines return the value of the 1-norm, or the Frobenius norm, or the infinity norm, or
the element of largest absolute value of a distributed matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1).
Input Parameters
routine:
2188
uplo (global) CHARACTER. Specifies whether the upper or lower
triangular part of the symmetric matrix sub(A) is to be
referenced.
= 'U': Upper triangular part of sub(A) is referenced,
= 'L': Lower triangular part of sub(A) is referenced.
n (global) INTEGER.
The number of columns to be operated on i.e the number
of columns of the distributed submatrix sub(A). When n =
0, p?lansy is set to zero. n ≥ 0.
a (local).
Real for pslansy
DOUBLE PRECISION for pdlansy
COMPLEX for pclansy, pclanhe
COMPLEX*16 for pzlansy, pzlanhe.
(lld_a, LOCc(ja+n-1)) containing the local pieces of the
sub(A) contains the upper triangular matrix whose norm is
to be computed, and the strictly lower triangular part of this
matrix is not referenced. If uplo = 'L', the leading n-by-n
lower triangular part of sub(A) contains the lower triangular
matrix whose norm is to be computed, and the strictly upper
triangular part of sub(A) is not referenced.
work (local).
Real for pslansy
DOUBLE PRECISION for pdlansy
COMPLEX for pclansy, pclanhe
2189
COMPLEX*16 for pzlansy, pzlanhe.

2*nq0+np0+ldw if norm = '1', 'O' or 'o', 'I' or 'i',
where ldw is given by:
if( nprow.ne.npcol ) then
ldw = mb_a*ceil(ceil(np0/mb_a)/(lcm/nprow))
else
ldw = 0
end if
where lcm is the least common multiple of nprow and
npcol, lcm = ilcm( nprow, npcol ) and ceil denotes
the ceiling operation (iceil).
iroffa = mod(ia-1, mb_a ), icoffa = mod( ja-1, nb_a),
mp0 = numroc(m+iroffa, mb_a, myrow, iarow, nprow),
nq0 = numroc(n+icoffa, nb_a, mycol, iacol, npcol),
Output Parameters
2190
p?lantr
element, of a triangular matrix.
Syntax
val = pslantr(norm, uplo, diag, m, n, a, ia, ja, desca, work)
val = pdlantr(norm, uplo, diag, m, n, a, ia, ja, desca, work)
val = pclantr(norm, uplo, diag, m, n, a, ia, ja, desca, work)
val = pzlantr(norm, uplo, diag, m, n, a, ia, ja, desca, work)
Description
the element of largest absolute value of a trapezoidal or triangular distributed matrix sub(A)
= A(ia:ia+m-1, ja:ja+n-1).
Input Parameters
routine:
symmetric matrix sub(A) is to be referenced.
= 'U': Upper trapezoidal,
= 'L': Lower trapezoidal.
Note that sub(A) is triangular instead of trapezoidal if m =
n.
2191
diag (global). CHARACTER.

Specifies whether the distributed matrix sub(A) has unit
diagonal.
= 'N': Non-unit diagonal.
= 'U': Unit diagonal.
m (global) INTEGER.
p?lantr is set to zero. m ≥ 0.
n (global) INTEGER.
The number of columns to be operated on i.e the number
of columns of the distributed submatrix sub(A). When n =
0, p?lantr is set to zero. n ≥ 0.
a (local).
Real for pslantr
DOUBLE PRECISION for pdlantr
COMPLEX for pclantr
COMPLEX*16 for pzlantr.
The row and column indices in the global array a indicating
respectively.
work (local).
Real for pslantr
DOUBLE PRECISION for pdlantr
COMPLEX for pclantr
COMPLEX*16 for pzlantr.
2192
where lcm is the least common multiple of nprow and npcol
lcm = ilcm(nprow, npcol) and ceil denotes the ceiling
operation (iceil).
iroffa = mod(ia-1, mb_a ), icoffa = mod( ja-1,
nb_a),
mp0 = numroc(m+iroffa, mb_a, myrow, iarow,
nprow),
nq0 = numroc(n+icoffa, nb_a, mycol, iacol,
npcol),
Output Parameters
p?lapiv
Applies a permutation matrix to a general
distributed matrix, resulting in row or column
pivoting.
Syntax
call pslapiv(direc, rowcol, pivroc, m, n, a, ia, ja, desca, ipiv, ip, jp,
descip, iwork)
call pdlapiv(direc, rowcol, pivroc, m, n, a, ia, ja, desca, ipiv, ip, jp,
descip, iwork)
call pclapiv(direc, rowcol, pivroc, m, n, a, ia, ja, desca, ipiv, ip, jp,
descip, iwork)
call pzlapiv(direc, rowcol, pivroc, m, n, a, ia, ja, desca, ipiv, ip, jp,
descip, iwork)
2193
Description
The routine applies either P (permutation matrix indicated by ipiv) or inv(P) to a general m-by-n
distributed matrix sub(A) = A(ia:ia+m-1, ja:ja+n-1), resulting in row or column pivoting.
The pivot vector may be distributed across a process row or a column. The pivot vector should
be aligned with the distributed matrix A. This routine will transpose the pivot vector, if necessary.
For example, if the row pivots should be applied to the columns of sub(A), pass rowcol='C'
and pivroc='C'.
Input Parameters
direc (global) CHARACTER*1.
Specifies in which order the permutation is applied:
= 'F' (Forward). Applies pivots forward from top of matrix.
Computes P*sub(A).
= 'B' (Backward). Applies pivots backward from bottom of
matrix. Computes inv(P)*sub(A).
rowcol (global) CHARACTER*1.
Specifies if the rows or columns are to be permuted:
= 'R' Rows will be permuted,
= 'C' Columns will be permuted.
pivroc (global) CHARACTER*1.
Specifies whether ipiv is distributed over a process row or
column:
= 'R'ipiv is distributed over a process row,
= 'C'ipiv is distributed over a process column.
m (global) INTEGER.
p?lapiv is set to zero. m ≥ 0.
n (global) INTEGER.
When n = 0, p?lapiv is set to zero. n ≥ 0.
a (local).
Real for pslapiv
2194
DOUBLE PRECISION for pdlapiv
COMPLEX for pclapiv
COMPLEX*16 for pzlapiv.
respectively.
ipiv (local). INTEGER.
Array, DIMENSION (lipiv) ;
when rowcol='R' or 'r':
lipiv ≥ LOCr(ia+m-1) + mb_a if pivroc='C' or 'c',
lipiv ≥ LOCc(m + mod(jp-1, nb_p)) if pivroc='R' or
'r', and,
when rowcol='C' or 'c':
lipiv ≥ LOCr(n + mod(ip-1, mb_p)) if pivroc='C' or
'c',
lipiv ≥ LOCc(ja+n-1) + nb_a if pivroc='R' or 'r'.
This array contains the pivoting information. ipiv(i) is the
global row (column), local row (column) i was swapped
with. When rowcol='R' or 'r' and pivroc='C' or 'c',
or rowcol='C' or 'c' and pivroc='R' or 'r', the last
piece of this array of size mb_a (resp. nb_a) is used as
workspace. In those cases, this array is tied to the
ip, jp (global) INTEGER. The row and column indices in the global
array P indicating the first row and the first column of the
submatrix sub(P), respectively.
descip (global and local) INTEGER array, DIMENSION (dlen_). The
array descriptor for the distributed vector ipiv.
iwork (local). INTEGER.
2195
Array, DIMENSION (ldw), where ldw is equal to the

workspace necessary for transposition, and the storage of
the transposed ipiv :
Let lcm be the least common multiple of nprow and npcol.
If( rowcol.eq.'r' .and. pivroc. eq.'r') then
If( nprow.eq. npcol) then
ldw = LOCr( n_p + mod(jp-1, nb_p) ) + nb_p
else
ldw = LOCr( n_p + mod(jp-1, nb_p) )+
nb_p * ceil( ceil(LOCc(n_p)/nb_p) / (lcm/npcol) )
end if
else if( rowcol.eq.'c' .and. pivroc.eq.'c') then
if( nprow.eq.
npcol ) then
ldw = LOCc( m_p + mod(ip-1, mb_p) ) + mb_p
else
ldw = LOCc( m_p + mod(ip-1, mb_p) ) +
mb_p *ceil(ceil(LOCr(m_p)/mb_p) / (lcm/nprow) )
end if
else
iwork is not referenced.
end if.
Output Parameters
a (local).
On exit, the local pieces of the permuted distributed
submatrix.
2196
p?laqge
Scales a general rectangular matrix, using row and
column scaling factors computed by p?geequ .
Syntax
call pslaqge(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, equed)
call pdlaqge(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, equed)
call pclaqge(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, equed)
call pzlaqge(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, equed)
Description
The routine equilibrates a general m-by-n distributed matrix sub(A) = A(ia:ia+m-1, ja:ja+n-1)
using the row and scaling factors in the vectors r and c computed by p?geequ.
Input Parameters
of rows of the distributed submatrix sub(A). (m ≥0).
n (global).INTEGER.
≥0).
a (local).
REAL for pslaqge
DOUBLE PRECISION for pdlaqge
COMPLEX for pclaqge
COMPLEX*16 for pzlaqge.
On entry, this array contains the distributed matrix sub(A).
2197
r (local).
REAL for pslaqge
COMPLEX for pclaqge
Array, DIMENSION LOCr(m_a). The row scale factors for
sub(A). r is aligned with the distributed matrix A, and
replicated across every process column. r is tied to the
c (local).
REAL for pslaqge
COMPLEX for pclaqge
Array, DIMENSION LOCc(n_a). The row scale factors for
sub(A). c is aligned with the distributed matrix A, and
replicated across every process column. c is tied to the
rowcnd (local).
REAL for pslaqge
COMPLEX for pclaqge
The global ratio of the smallest r(i) to the largest r(i),
ia ≤ i ≤ ia+m-1.
colcnd (local).
REAL for pslaqge
COMPLEX for pclaqge
2198
The global ratio of the smallest c(i) to the largest c(i),
ia ≤ i ≤ ia+n-1.
amax (global). REAL for pslaqge
COMPLEX for pclaqge
Absolute value of largest distributed submatrix entry.
Output Parameters
a (local).
On exit, the equilibrated distributed matrix. See equed for
the form of the equilibrated distributed submatrix.
equed (global) CHARACTER.
= 'N': No equilibration
= 'R': Row equilibration, that is, sub(A) has been
pre-multiplied by diag(r(ia:ia+m-1)),
= 'C': column equilibration, that is, sub(A) has been
post-multiplied by diag(c(ja:ja+n-1)),
= 'B': Both row and column equilibration, that is, sub(A)
has been replaced by diag(r(ia:ia+m-1))* sub(A) *
diag(c(ja:ja+n-1)).
p?laqsy
Scales a symmetric/Hermitian matrix, using scaling
factors computed by p?poequ .
Syntax
call pslaqsy(uplo, n, a, ia, ja, desca, sr, sc, scond, amax, equed)
call pdlaqsy(uplo, n, a, ia, ja, desca, sr, sc, scond, amax, equed)
call pclaqsy(uplo, n, a, ia, ja, desca, sr, sc, scond, amax, equed)
call pzlaqsy(uplo, n, a, ia, ja, desca, sr, sc, scond, amax, equed)
2199
Description
The routine equilibrates a symmetric distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1)
using the scaling factors in the vectors sr and sc. The scaling factors are computed by
p?poequ.
Input Parameters
uplo (global) CHARACTER. Specifies the upper or lower triangular
part of the symmetric distributed matrix sub(A) is to be
referenced:
= 'U': Upper triangular part;
= 'L': Lower triangular part.
n (global) INTEGER.
The order of the distributed submatrix sub(A). n ≥ 0.
a (local).
REAL for pslaqsy
DOUBLE PRECISION for pdlaqsy
COMPLEX for pclaqsy
COMPLEX*16 for pzlaqsy.
distributed matrix sub(A). On entry, the local pieces of the
distributed symmetric matrix sub(A).
the strictly lower triangular part of sub(A) is not referenced.
the strictly upper triangular part of sub(A) is not referenced.
respectively.
2200
sr (local)
REAL for pslaqsy
COMPLEX for pclaqsy
Array, DIMENSION LOCr(m_a). The scale factors for
A(ia:ia+m-1, ja:ja+n-1). sr is aligned with the
distributed matrix A, and replicated across every process
column. sr is tied to the distributed matrix A.
sc (local)
REAL for pslaqsy
COMPLEX for pclaqsy
Array, DIMENSION LOCc(m_a). The scale factors for A
(ia:ia+m-1, ja:ja+n-1). sr is aligned with the distributed
matrix A, and replicated across every process column. sr
scond (global). REAL for pslaqsy
COMPLEX for pclaqsy
Ratio of the smallest sr(i) (respectively sc(j)) to the
largest sr(i) (respectively sc(j)), with ia ≤ i ≤ ia+n-1
and ja ≤ j ≤ ja+n-1.
amax (global).
REAL for pslaqsy
COMPLEX for pclaqsy
Absolute value of largest distributed submatrix entry.
Output Parameters
a On exit,
if equed = 'Y', the equilibrated matrix:
2201
diag(sr(ia:ia+n-1)) * sub(A) *
diag(sc(ja:ja+n-1)).
equed (global) CHARACTER*1.
= 'N': No equilibration.
= 'Y': Equilibration was done, that is, sub(A) has been
replaced by:
diag(sr(ia:ia+n-1))* sub(A) *
diag(sc(ja:ja+n-1)).
p?lared1d
Redistributes an array assuming that the input
array, bycol, is distributed across rows and that
all process columns contain the same copy of
bycol.
Syntax
call pslared1d(n, ia, ja, desc, bycol, byall, work, lwork)
call pdlared1d(n, ia, ja, desc, bycol, byall, work, lwork)
Description
The routine redistributes a 1D array. It assumes that the input array bycol is distributed across
rows and that all process column contain the same copy of bycol. The output array byall is
identical on all processes and contains the entire array.
Input Parameters
np = Number of local rows in bycol()
The number of elements to be redistributed. n ≥ 0.
ia, ja (global) INTEGER. ia, ja must be equal to 1.
desc (global and local) INTEGER array, DIMENSION 8. A 2D array
descriptor, which describes bycol.
bycol (local).
2202
REAL for pslared1d
DOUBLE PRECISION for pdlared1d
COMPLEX for pclared1d
COMPLEX*16 for pzlared1d.
Distributed block cyclic array global DIMENSION (n), local
DIMENSION np. bycol is distributed across the process rows.
All process columns are assumed to contain the same value.
work (local).
REAL for pslared1d
DIMENSION (lwork). Used to hold the buffers sent from one
process to another.
lwork (local)
INTEGER. The size of the work array. lwork ≥ numroc(n,
desc( nb_), 0, 0, npcol).
Output Parameters
byall (global). REAL for pslared1d
Global DIMENSION (n), local DIMENSION (n). byall is exactly
duplicated on all processes. It contains the same values as
bycol, but it is replicated across all processes rather than
being distributed.
2203
p?lared2d
Redistributes an array assuming that the input
array byrow is distributed across columns and that
all process rows contain the same copy of byrow.
Syntax
call pslared2d(n, ia, ja, desc, byrow, byall, work, lwork)
call pdlared2d(n, ia, ja, desc, byrow, byall, work, lwork)
Description
The routine redistributes a 1D array. It assumes that the input array byrow is distributed across
columns and that all process rows contain the same copy of byrow. The output array byall will
be identical on all processes and will contain the entire array.
Input Parameters
np = Number of local rows in byrow()
n (global) INTEGER.
The number of elements to be redistributed. n ≥ 0.
ia, ja (global) INTEGER. ia, ja must be equal to 1.
desc (global and local) INTEGER array, DIMENSION (dlen_). A
2D array descriptor, which describes byrow.
byrow (local).
REAL for pslared2d
Distributed block cyclic array global DIMENSION (n), local
DIMENSION np. bycol is distributed across the process
columns. All process rows are assumed to contain the same
value.
work (local).
REAL for pslared2d
2204
DIMENSION (lwork). Used to hold the buffers sent from one
process to another.
lwork (local).INTEGER. The size of the work array. lwork ≥
numroc(n, desc( nb_ ), 0, 0, npcol).
Output Parameters
byall (global). REAL for pslared2d
Global DIMENSION(n), local DIMENSION (n). byall is exactly
duplicated on all processes. It contains the same values as
bycol, but it is replicated across all processes rather than
being distributed.
p?larf
rectangular matrix.
Syntax
call pslarf(side, m, n, v, iv, jv, descv, incv, tau, c, ic, jc, descc, work)
call pdlarf(side, m, n, v, iv, jv, descv, incv, tau, c, ic, jc, descc, work)
call pclarf(side, m, n, v, iv, jv, descv, incv, tau, c, ic, jc, descc, work)
call pzlarf(side, m, n, v, iv, jv, descv, incv, tau, c, ic, jc, descc, work)
Description
The routine applies a real/complex elementary reflector Q (or QT) to a real/complex m-by-n
distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1), from either the left or the right. Q
is represented in the form
Q = I-tau*v*v',
2205
If tau = 0, then Q is taken to be the unit matrix.
Input Parameters
side (global). CHARACTER.
= 'L': form Q*sub(C),
= 'R': form sub(C)*Q, Q=QT.
m (global) INTEGER.
n (global) INTEGER.
≥ 0).
v (local).
REAL for pslarf
DOUBLE PRECISION for pdlarf
COMPLEX for pclarf
COMPLEX*16 for pzlarf.
(lld_v,*) containing the local pieces of the distributed
vectors V representing the Householder transformation Q,
v(iv:iv+m-1, jv) if side = 'L' and incv = 1,
v(iv, jv:jv+m-1) if side = 'L' and incv = m_v,
v(iv:iv+n-1, jv) if side = 'R' and incv = 1,
v(iv, jv:jv+n-1) if side = 'R' and incv = m_v.
The vector v is the representation of Q. v is not used if tau
= 0.
array V indicating the first row and the first column of the
submatrix sub(V), respectively.
incv (global) INTEGER.
The global increment for the elements of v. Only two values
of incv are supported in this version, namely 1 and m_v.
incv must not be zero.
2206
tau (local).
REAL for pslarf
COMPLEX for pclarf
Array, DIMENSION LOCc(jv) if incv = 1, and LOCr(iv)
otherwise. This array contains the Householder scalars
related to the Householder vectors.
tau is tied to the distributed matrix v.
c (local).
REAL for pslarf
COMPLEX for pclarf
DIMENSION(lld_c, LOCc(jc+n-1) ), containing the local
pieces of sub(C).
ic, jc (global) INTEGER.
The row and column indices in the global array c indicating
the first row and the first column of the submatrix sub(C),
respectively.
descc (global and local) INTEGER array, DIMENSION (dlen_). The
work (local).
REAL for pslarf
COMPLEX for pclarf
If incv = 1,
if side = 'L',
if ivcol = iccol,
lwork≥nqc0
else
lwork≥mpc0 + max( 1, nqc0 )
end if
else if side = 'R' ,
2207
lwork≥nqc0 + max( max( 1, mpc0), numroc(numroc( n+

icoffc,nb_v,0,0,npcol),nb_v,0,0,lcmq ) )
end if
else if incv = m_v,
if side = 'L',
lwork≥mpc0 + max( max( 1, nqc0 ), numroc(
numroc(m+iroffc,mb_v,0,0,nprow ),mb_v,0,0, lcmp )
)
else if side = 'R',
if ivrow = icrow,
lwork≥mpc0
else
lwork≥nqc0 + max( 1, mpc0 )
end if
end if
end if,
and lcm = ilcm( nprow, npcol ), lcmp = lcm/nprow, lcmq
= lcm/npcol,
iroffc = mod( ic-1, mb_c ), icoffc = mod( jc-1,
nb_c ),
icrow = indxg2p( ic, mb_c, myrow, rsrc_c, nprow
),
iccol = indxg2p( jc, nb_c, mycol, csrc_c, npcol
),
mpc0 = numroc( m+iroffc, mb_c, myrow, icrow,
nprow ),
nqc0 = numroc( n+icoffc, nb_c, mycol, iccol,
npcol ),
ilcm, indxg2p, and numroc are ScaLAPACK tool functions;
myrow, mycol, nprow, and npcol can be determined by
Output Parameters
c (local).
On exit, sub(C) is overwritten by the Q*sub(C) if side =
'L',
2208
or sub(C) * Q if side = 'R'.
p?larfb
transpose/conjugate-transpose to a general
rectangular matrix.
Syntax
call pslarfb(side, trans, direct, storev, m, n, k, v, iv, jv, descv, t, c,
ic, jc, descc, work)
call pdlarfb(side, trans, direct, storev, m, n, k, v, iv, jv, descv, t, c,
call pclarfb(side, trans, direct, storev, m, n, k, v, iv, jv, descv, t, c,
call pzlarfb(side, trans, direct, storev, m, n, k, v, iv, jv, descv, t, c,
Description
The routine applies a real/complex block reflector Q or its transpose QT/conjugate transpose QH
to a real/complex distributed m-by-n matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1) from the left
or the right.
Input Parameters
side (global).CHARACTER.
T H
if side = 'L': apply Q or Q for real flavors (Q for complex
flavors) from the Left;
T H
if side = 'R': apply Q or Q for real flavors (Q for complex
flavors) from the Right.
trans (global).CHARACTER.
if trans = 'N': no transpose, apply Q;
T
for real flavors, if trans='T': transpose, apply Q
for complex flavors, if trans = 'C': conjugate transpose,
apply QH;
2209
direct (global) CHARACTER. Indicates how Q is formed from a

product of elementary reflectors.
if direct = 'F': Q = H(1)*H(2)*...*H(k) (Forward)
if direct = 'B': Q = H(k)*...*H(2)*H(1) (Backward)
storev (global) CHARACTER.
Indicates how the vectors that define the elementary
if storev = 'C': Columnwise
if storev = 'R': Rowwise.
m (global) INTEGER.
of rows of the distributed submatrix sub(C). (m ≥ 0).
n (global) INTEGER.
number of columns of the distributed submatrix sub(C). (n
≥ 0).
k (global) INTEGER.
The order of the matrix T.
v (local).
REAL for pslarfb
DOUBLE PRECISION for pdlarfb
COMPLEX for pclarfb
COMPLEX*16 for pzlarfb.
( lld_v, LOCc(jv+k-1)) if storev = 'C',
(lld_v, LOCc(jv+m-1)) if storev = 'R' and side =
'L',
(lld_v, LOCc(jv+n-1) ) if storev = 'R' and side =
'R'.
It contains the local pieces of the distributed vectors V
representing the Householder transformation.
if storev = 'C' and side = 'L', lld_v ≥
max(1,LOCr(iv+m-1));
if storev = 'C' and side = 'R', lld_v ≥
max(1,LOCr(iv+n-1));
if storev = 'R', lld_v ≥ LOCr(jv+k-1).
2210
iv, jv (global) INTEGER.
The row and column indices in the global array V indicating
the first row and the first column of the submatrix sub(V),
respectively.
c (local).
REAL for pslarfb
COMPLEX for pclarfb
pieces of sub(C).
array C indicating the first row and the first column of the
submatrix sub(C), respectively.
work (local).
REAL for pslarfb
COMPLEX for pclarfb
If storev = 'C',
if side = 'L',
lwork ≥ ( nqc0 + mpc0 ) * k
else if side = 'R',
lwork ≥ ( nqc0 + max( npv0 + numroc( numroc( n +
icoffc, nb_v, 0, 0, npcol ), nb_v, 0, 0, lcmq ),
mpc0 ) ) * k
end if
else if storev = 'R' ,
if side = 'L' ,
lwork≥ ( mpc0 + max( mqv0 + numroc( numroc( m +
2211
iroffc, mb_v, 0, 0, nprow ), mb_v, 0, 0, lcmp ),

nqc0 ) ) * k
else if side = 'R',
lwork ≥ ( mpc0 + nqc0 ) * k
end if
end if,
where
lcmq = lcm / npcol with lcm = iclm( nprow, npcol ),
iroffv = mod( iv-1, mb_v ), icoffv = mod( jv-1, nb_v ),
ivrow = indxg2p( iv, mb_v, myrow, rsrc_v, nprow ),
ivcol = indxg2p( jv, nb_v, mycol, csrc_v, npcol ),
MqV0 = numroc( m+icoffv, nb_v, mycol, ivcol, npcol ),
NpV0 = numroc( n+iroffv, mb_v, myrow, ivrow, nprow ),
iroffc = mod( ic-1, mb_c ), icoffc = mod( jc-1, nb_c ),
icrow = indxg2p( ic, mb_c, myrow, rsrc_c, nprow ),
iccol = indxg2p( jc, nb_c, mycol, csrc_c, npcol ),
MpC0 = numroc( m+iroffc, mb_c, myrow, icrow, nprow ),
NpC0 = numroc( n+icoffc, mb_c, myrow, icrow, nprow ),
NqC0 = numroc( n+icoffc, nb_c, mycol, iccol, npcol ),
Output Parameters
t (local).
REAL for pslarfb
COMPLEX for pclarfb
Array, DIMENSION( mb_v, mb_v) if storev = 'R', and
( nb_v, nb_v) if storev = 'C'. The triangular matrix t
is the representation of the block reflector.
c (local).
On exit, sub(C) is overwritten by the Q*sub(C), or
Q'*sub(C), or sub(C)*Q, or sub(C)*Q'. Q' is transpose
(conjugate transpose) of Q.
2212
p?larfc
Applies the conjugate transpose of an elementary
reflector to a general matrix.
Syntax
call pclarfc(side, m, n, v, iv, jv, descv, incv, tau, c, ic, jc, descc, work)
call pzlarfc(side, m, n, v, iv, jv, descv, incv, tau, c, ic, jc, descc, work)
Description
H
The routine applies a complex elementary reflector Q to a complex m-by-n distributed matrix
sub(C) = C(ic:ic+m-1, jc:jc+n-1), from either the left or the right. Q is represented in the
form
Q = i-tau*v*v',
where tau is a complex scalar and v is a complex vector.
Input Parameters
side (global).CHARACTER.
if side = 'L': form QH*sub(C) ;
if side = 'R': form sub (C)*QH.
m (global) INTEGER.
n (global) INTEGER.
≥ 0).
v (local).
COMPLEX for pclarfc
COMPLEX*16 for pzlarfc.
2213

vectors v representing the Householder transformation Q,
v(iv:iv+m-1, jv) if side = 'L' and incv = 1,
v(iv, jv:jv+m-1) if side = 'L' and incv = m_v,
v(iv:iv+n-1, jv) if side = 'R' and incv = 1,
v(iv, jv:jv+n-1) if side = 'R' and incv = m_v.
The vector v is the representation of Q. v is not used if tau
= 0.
respectively.
tau (local)
COMPLEX for pclarfc
tau is tied to the distributed matrix V.
c (local).
COMPLEX for pclarfc
(lld_c, LOCc(jc+n-1) ), containing the local pieces of
sub(C).
The row and column indices in the global array C indicating
respectively.
2214
work (local).
COMPLEX for pclarfc
If incv = 1,
if side = 'L' ,
if ivcol = iccol,
lwork ≥ nqc0
else
lwork ≥ mpc0 + max( 1, nqc0 )
end if
else if side = 'R',
lwork ≥ nqc0 + max( max( 1, mpc0 ), numroc( numroc(
n+icoffc,nb_v,0,0,npcol ), nb_v,0,0,lcmq ) )
end if
else if incv = m_v,
if side = 'L',
lwork ≥ mpc0 + max( max( 1, nqc0 ), numroc( numroc(
m+iroffc,mb_v,0,0,nprow ),mb_v,0,0,lcmp ) )
if ivrow = icrow,
lwork ≥ mpc0
else
lwork ≥ nqc0 + max( 1, mpc0 )
end if
end if
end if,
and lcm = ilcm(nprow, npcol),
lcmp = lcm/nprow, lcmq = lcm/npcol,
iroffc = mod(ic-1, mb_c), icoffc = mod(jc-1,
nb_c),
icrow = indxg2p(ic, mb_c, myrow, rsrc_c, nprow),
iccol = indxg2p(jc, nb_c, mycol, csrc_c, npcol),
2215
mpc0 = numroc(m+iroffc, mb_c, myrow, icrow,

nprow),
nqc0 = numroc(n+icoffc, nb_c, mycol, iccol,
npcol),
ilcm, indxg2p, and numroc are ScaLAPACK tool
functions;myrow, mycol, nprow, and npcol can be
determined by calling the subroutine blacs_gridinfo.
Output Parameters
c (local).
On exit, sub(C) is overwritten by the QH*sub(C) if side =
'L', or sub(C) * QH if side = 'R'.
p?larfg
Generates an elementary reflector (Householder
matrix).
Syntax
call pslarfg(n, alpha, iax, jax, x, ix, jx, descx, incx, tau)
call pdlarfg(n, alpha, iax, jax, x, ix, jx, descx, incx, tau)
call pclarfg(n, alpha, iax, jax, x, ix, jx, descx, incx, tau)
call pzlarfg(n, alpha, iax, jax, x, ix, jx, descx, incx, tau)
Description
The routine generates a real/complex elementary reflector H of order n, such that
H*sub(X) = H*(x(iax, jax)) = (alpha), H'*H = i,

( x ) ( 0 )
where alpha is a scalar (a real scalar - for complex flavors), and sub(X) is an (n-1)-element
real/complex distributed vector X(ix:ix+n-2, jx) if incx = 1 and X(ix, jx:jx+n-2) if
incx = descx(m_). H is represented in the form
H = I-tau*(1)*(1 v') ,
2216
( v )
where tau is a real/complex scalar and v is a real/complex (n-1)-element vector. Note that H
is not Hermitian.
If the elements of sub(X) are all zero (and X(iax, jax) is real for complex flavors), then tau
= 0 and H is taken to be the unit matrix.
Otherwise 1 ≤ real(tau) ≤ 2 and abs(tau-1) ≤ 1.
Input Parameters
n (global) INTEGER.
The global order of the elementary reflector. n ≥ 0.
iax, jax (global) INTEGER.
The global row and column indices in x of X(iax, jax).
x (local).
Real for pslarfg
DOUBLE PRECISION for pdlarfg
COMPLEX for pclarfg
COMPLEX*16 for pzlarfg.
(lld_x, *). This array contains the local pieces of the
distributed vector sub(X). Before entry, the incremented
array sub(X) must contain vector x.
ix, jx (global) INTEGER.
The row and column indices in the global array X indicating
the first row and the first column of sub(X), respectively.
descx (global and local) INTEGER.
incx (global) INTEGER.
The global increment for the elements of x. Only two values
of incx are supported in this version, namely 1 and m_x.
Output Parameters
alpha (local)
2217
REAL for pslafg

DOUBLE PRECISION for pdlafg
COMPLEX for pclafg
COMPLEX*16 for pzlafg.
On exit, alpha is computed in the process scope having the
vector sub(X).
x (local).
On exit, it is overwritten with the vector v.
tau (local).
Real for pslarfg
DOUBLE PRECISION for pdlarfg
COMPLEX for pclarfg
COMPLEX*16 for pzlarfg.
Array, DIMENSION LOCc(jx) if incx = 1, and LOCr(ix)
tau is tied to the distributed matrix X.
p?larft
Forms the triangular vector T of a block reflector
H=I-V*T*VH.
Syntax
call pslarft(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
call pdlarft(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
call pclarft(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
call pzlarft(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
Description
The routine forms the triangular factor T of a real/complex block reflector H of order n, which
is defined as a product of k elementary reflectors.
If direct = 'F', H = H(1)*H(2)...*H(k), and T is upper triangular;
2218
column of the distributed matrix V, and
H = I-V*T*V'
row of the distributed matrix V, and
H = I-V'*T*V.
Input Parameters
direct (global) CHARACTER*1.
if direct = 'F': H = H(1)*H(2)*...*H(k) (forward)
if direct = 'B': H = H(k)*...*H(2)*H(1) (backward).
storev (global) CHARACTER*1.
Specifies how the vectors that define the elementary
reflectors are stored (See Application Notes below):
if storev = 'C': columnwise;
if storev = 'R': rowwise.
n (global) INTEGER.
The order of the block reflector H. n ≥ 0.
k (global) INTEGER.
The order of the triangular factor T, is equal to the number
of elementary reflectors.
1 ≤ k ≤ mb_v (= nb_v).
v REAL for pslarft
DOUBLE PRECISION for pdlarft
COMPLEX for pclarft
COMPLEX*16 for pzlarft.
Pointer into the local memory to an array of local DIMENSION
(LOCr(iv+n-1), LOCc(jv+k-1)) if storev = 'C', and
(LOCr(iv+k-1), LOCc(jv+n-1)) if storev = 'R'.
The distributed matrix V contains the Householder vectors.
(See Application Notes below).
2219
The row and column indices in the global array v indicating

respectively.
tau (local)
REAL for pslarft
COMPLEX for pclarft
Array, DIMENSIONLOCr(iv+k-1) if incv = m_v, and
LOCc(jv+k-1) otherwise. This array contains the
Householder scalars related to the Householder vectors.
work (local).
REAL for pslarft
COMPLEX for pclarft
Workspace array, DIMENSION (k*(k-1)/2).
Output Parameters
v REAL for pslarft
COMPLEX for pclarft
t (local)
REAL for pslarft
COMPLEX for pclarft
Array, DIMENSION (nb_v,nb_v) if storev = 'C', and
(mb_v,mb_v) otherwise. It contains the k-by-k triangular
factor of the block reflector associated with v. If direct =
'F', t is upper triangular;
if direct = 'B', t is lower triangular.
2220
Application Notes
The shape of the matrix V and the storage of the vectors that define the H(i) is best illustrated
used.
2221
p?larz
Applies an elementary reflector as returned by
p?tzrzf to a general matrix.
Syntax
call pslarz(side, m, n, l, v, iv, jv, descv, incv, tau, c, ic, jc, descc,
work)
call pdlarz(side, m, n, l, v, iv, jv, descv, incv, tau, c, ic, jc, descc,
work)
call pclarz(side, m, n, l, v, iv, jv, descv, incv, tau, c, ic, jc, descc,
work)
call pzlarz(side, m, n, l, v, iv, jv, descv, incv, tau, c, ic, jc, descc,
work)
Description
T
The routine applies a real/complex elementary reflector Q (or Q ) to a real/complex m-by-n
distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1), from either the left or the right. Q
is represented in the form
Q = I-tau*v*v',
Q is a product of k elementary reflectors as returned by p?tzrzf.
Input Parameters
if side = 'L': form Q*sub(C),
if side = 'R': form sub(C)*Q, Q = QT (for real flavors).
m (global) INTEGER.
n (global) INTEGER.
2222
≥ 0).
l (global). INTEGER.
the meaningful part of the Householder reflectors. If side
= 'L', m ≥ l ≥ 0,
if side = 'R', n ≥ l ≥ 0.
v (local).
REAL for pslarz
DOUBLE PRECISION for pdlarz
COMPLEX for pclarz
COMPLEX*16 for pzlarz.
v(iv:iv+l-1, jv) if side = 'L' and incv = 1,
v(iv, jv:jv+l-1) if side = 'L' and incv = m_v,
v(iv:iv+l-1, jv) if side = 'R' and incv = 1,
v(iv, jv:jv+l-1) if side = 'R' and incv = m_v.
The vector v in the representation of Q. v is not used if tau
= 0.
tau (local)
REAL for pslarz
COMPLEX for pclarz
2223

c (local).
REAL for pslarz
COMPLEX for pclarz
(lld_c, LOCc(jc+n-1) ), containing the local pieces of
sub(C).
respectively.
work (local).
REAL for pslarz
COMPLEX for pclarz
Array, DIMENSION (lwork)
If incv = 1,
if side = 'L' ,
if ivcol = iccol,
lwork ≥ NqC0
else
lwork ≥ MpC0 + max(1, NqC0)
end if
lwork ≥ NqC0 + max(max(1, MpC0),
numroc(numroc(n+icoffc,nb_v,0,0,npcol),nb_v,0,0,lcmq))
end if
else if incv = m_v,
if side = 'L' ,
2224
lwork ≥ MpC0 + max(max(1, NqC0),
numroc(numroc(m+iroffc,mb_v,0,0,nprow),mb_v,0,0,lcmp))
if ivrow = icrow,
lwork ≥ MpC0
else
lwork ≥ NqC0 + max(1, MpC0)
end if
end if
end if.
Here lcm is the least common multiple of nprow and npcol
and
lcm = ilcm( nprow, npcol ), lcmp = lcm / nprow,
lcmq = lcm / npcol,
iroffc = mod( ic-1, mb_c ), icoffc = mod( jc-1, nb_c ),
icrow = indxg2p( ic, mb_c, myrow, rsrc_c, nprow ),
iccol = indxg2p( jc, nb_c, mycol, csrc_c, npcol ),
mpc0 = numroc( m+iroffc, mb_c, myrow, icrow, nprow ),
nqc0 = numroc( n+icoffc, nb_c, mycol, iccol, npcol ),
Output Parameters
c (local).
On exit, sub(C) is overwritten by the Q*sub(C) if side =
'L', or sub(C)*Q if side = 'R'.
2225
p?larzb
transpose/conjugate-transpose as returned by
p?tzrzf to a general matrix.
Syntax
call pslarzb(side, trans, direct, storev, m, n, k, l, v, iv, jv, descv, t, c,
call pdlarzb(side, trans, direct, storev, m, n, k, l, v, iv, jv, descv, t, c,
call pclarzb(side, trans, direct, storev, m, n, k, l, v, iv, jv, descv, t, c,
call pzlarzb(side, trans, direct, storev, m, n, k, l, v, iv, jv, descv, t, c,
Description
T
The routine applies a real/complex block reflector Q or its transpose Q (conjugate transpose
QH for complex flavors) to a real/complex distributed m-by-n matrix sub(C) = C(ic:ic+m-1,
jc:jc+n-1) from the left or the right.
Input Parameters
T H
if side = 'L': apply Q or Q (Q for complex flavors) from
the Left;
T H
if side = 'R': apply Q or Q (Q for complex flavors) from
the Right.
if trans = 'N': No transpose, apply Q;
If trans='T': Transpose, apply QT (real flavors);
If trans='C': Conjugate transpose, apply QH (complex
flavors).
2226
direct (global) CHARACTER.
reflectors.
if direct = 'F': H = H(1)*H(2)*...*H(k) - forward
(not supported) ;
if direct = 'B': H = H(k)*...*H(2)*H(1) - backward.
Indicates how the vectors that define the elementary
if storev = 'C': columnwise (not supported ).
m (global) INTEGER.
n (global) INTEGER.
≥ 0).
k (global) INTEGER.
The order of the matrix T. (= the number of elementary
reflectors whose product defines the block reflector).
l (global) INTEGER.
If side = 'L', m ≥ l ≥ 0,
if side = 'R', n ≥ l ≥ 0.
v (local).
REAL for pslarzb
DOUBLE PRECISION for pdlarzb
COMPLEX for pclarzb
COMPLEX*16 for pzlarzb.
DIMENSION(lld_v, LOCc(jv+m-1)) if side = 'L',
(lld_v, LOCc(jv+m-1)) if side = 'R'.
2227
It contains the local pieces of the distributed vectors V

representing the Householder transformation as returned
by p?tzrzf.
lld_v ≥ LOCr(iv+k-1).
respectively.
t (local)
REAL for pslarzb
COMPLEX for pclarzb
Array, DIMENSION mb_v by mb_v.
The lower triangular matrix T in the representation of the
block reflector.
c (local).
REAL for pslarfb
COMPLEX for pclarfb
DIMENSION(lld_c, LOCc(jc+n-1)).
On entry, the m-by-n distributed matrix sub(C).
The row and column indices in the global array c indicating
respectively.
work (local).
REAL for pslarzb
COMPLEX for pclarzb
2228
If storev = 'C' ,
if side = 'L' ,
lwork ≥(nqc0 + mpc0)* k

lwork ≥ (nqc0 + max(npv0 + numroc(numroc(n+icoffc,

nb_v, 0, 0, npcol),
nb_v, 0, 0, lcmq), mpc0))* k
end if
else if storev = 'R' ,
if side = 'L' ,
lwork ≥ (mpc0 + max(mqv0 + numroc(numroc(m+iroffc,

mb_v, 0, 0, nprow),
mb_v, 0, 0, lcmp), nqc0))* k
lwork ≥ (mpc0 + nqc0) * k

end if
end if.
Here lcmq = lcm/npcol with lcm = iclm(nprow,
npcol),
iroffv = mod(iv-1, mb_v), icoffv = mod( jv-1,
nb_v),
ivrow = indxg2p(iv, mb_v, myrow, rsrc_v, nprow),
ivcol = indxg2p(jv, nb_v, mycol, csrc_v, npcol),
mqv0 = numroc(m+icoffv, nb_v, mycol, ivcol,
npcol),
npv0 = numroc(n+iroffv, mb_v, myrow, ivrow,
nprow),
iroffc = mod(ic-1, mb_c ), icoffc= mod( jc-1,
nb_c),
2229
icrow= indxg2p(ic, mb_c, myrow, rsrc_c, nprow),

iccol= indxg2p(jc, nb_c, mycol, csrc_c, npcol),
nprow),
npc0 = numroc(n+icoffc, mb_c, myrow, icrow,
nprow),
npcol),
Output Parameters
c (local).
On exit, sub(C) is overwritten by the Q*sub(C), or
Q'*sub(C), or sub(C)*Q, or sub(C)*Q', where Q' is the
transpose (conjugate transpose) of Q.
p?larzc
Applies (multiplies by) the conjugate transpose of
an elementary reflector as returned by p?tzrzf
Syntax
call pclarzc(side, m, n, l, v, iv, jv, descv, incv, tau, c, ic, jc, descc,
work)
call pzlarzc(side, m, n, l, v, iv, jv, descv, incv, tau, c, ic, jc, descc,
work)
Description
The routine applies a complex elementary reflector QH to a complex m-by-n distributed matrix
sub(C) = C(ic:ic+m-1, jc:jc+n-1), from either the left or the right. Q is represented in
the form
Q = i-tau*v*v',
2230
where tau is a complex scalar and v is a complex vector.
Input Parameters
if side = 'L': form QH*sub(C);
if side = 'R': form sub(C)*QH .
m (global) INTEGER.
n (global) INTEGER.
≥ 0).
l (global) INTEGER.
If side = 'L', m ≥ l ≥ 0,
if side = 'R', n ≥ l ≥ 0.
v (local).
COMPLEX for pclarzc
COMPLEX*16 for pzlarzc.
v(iv:iv+l-1, jv) if side = 'L' and incv = 1,
v(iv, jv:jv+l-1) if side = 'L' and incv = m_v,
v(iv:iv+l-1, jv) if side = 'R' and incv = 1,
v(iv, jv:jv+l-1) if side = 'R' and incv = m_v.
The vector v in the representation of Q. v is not used if tau
= 0.
2231

respectively.
incv (global). INTEGER.
tau (local)
COMPLEX for pclarzc
Array, DIMENSIONLOCc(jv) if incv = 1, and LOCr(iv)
c (local).
COMPLEX for pclarzc
pieces of sub(C).
respectively.
2232
work (local).
If incv = 1,
if side = 'L' ,
if ivcol = iccol,
lwork ≥ nqc0
else
lwork ≥ mpc0 + max(1, nqc0)

end if
lwork ≥ nqc0 + max(max(1, mpc0),

numroc(numroc(n+icoffc, nb_v, 0, 0, npcol),
nb_v, 0, 0, lcmq))
end if
else if incv = m_v,
if side = 'L' ,
lwork ≥ mpc0 + max(max(1, nqc0),

numroc(numroc(m+iroffc, mb_v, 0, 0, nprow),
mb_v, 0, 0, lcmp))
else if side = 'R',
if ivrow = icrow,
lwork ≥ mpc0
else
lwork ≥ nqc0 + max(1, mpc0)

end if
end if
end if
Here lcm is the least common multiple of nprow and npcol;
2233
lcm = ilcm(nprow, npcol), lcmp = lcm/nprow, lcmq=

lcm/npcol,
iroffc = mod(ic-1, mb_c), icoffc= mod(jc-1,
nb_c),
nprow),
npcol),
Output Parameters
c (local).
On exit, sub(C) is overwritten by the QH*sub(C) if side =
'L', or sub(C)*QH if side = 'R'.
p?larzt
Forms the triangular factor T of a block reflector
H
H=I-V*T*V as returned by p?tzrzf.
Syntax
call pslarzt(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
call pdlarzt(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
call pclarzt(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
call pzlarzt(direct, storev, n, k, v, iv, jv, descv, tau, t, work)
Description
The routine forms the triangular factor T of a real/complex block reflector H of order greater
than n, which is defined as a product of k elementary reflectors as returned by p?tzrzf.
If direct = 'F', H = H(1)*H(2)*...*H(k), and T is upper triangular;
2234
If storev = 'C', the vector which defines the elementary reflector H(i), is stored in the i-th
column of the array v, and
H = i-v*t*v'.
If storev = 'R', the vector, which defines the elementary reflector H(i), is stored in the
i-th row of the array v, and
H = i-v'*t*v
Input Parameters
direct (global) CHARACTER.
if direct = 'F': H = H(1)*H(2)*...*H(k) (Forward,
not supported)
if direct = 'B': H = H(k)*...*H(2)*H(1) (Backward).
Specifies how the vectors which defines the elementary
if storev = 'C': columnwise (not supported);
The order of the block reflector H. n ≥ 0.
k (global). INTEGER.
The order of the triangular factor T (= the number of
elementary reflectors).
1≤k≤mb_v(= nb_v).
v REAL for pslarzt
DOUBLE PRECISION for pdlarzt
COMPLEX for pclarzt
COMPLEX*16 for pzlarzt.
Pointer into the local memory to an array of local
DIMENSION(LOCr(iv+k-1), LOCc(jv+n-1)).
2235
The distributed matrix V contains the Householder vectors.

See Application Notes below.
tau (local)
REAL for pslarzt
COMPLEX for pclarzt
Array, DIMENSION LOCr(iv+k-1) if incv = m_v, and
LOCc(jv+k-1) otherwise. This array contains the
Householder scalars related to the Householder vectors.
work (local).
REAL for pslarzt
COMPLEX for pclarzt
Workspace array, DIMENSION(k*(k-1)/2).
Output Parameters
v REAL for pslarzt
COMPLEX for pclarzt
t (local)
REAL for pslarzt
COMPLEX for pclarzt
Array, DIMENSION (mb_v, mb_v). It contains the k-by-k
triangular factor of the block reflector associated with v. t
is lower triangular.
2236
Application Notes
used.
2237
p?lascl
Multiplies a general rectangular matrix by a real
scalar defined as Cto/Cfrom.
Syntax
call pslascl(type, cfrom, cto, m, n, a, ia, ja, desca, info)
call pdlascl(type, cfrom, cto, m, n, a, ia, ja, desca, info)
call pclascl(type, cfrom, cto, m, n, a, ia, ja, desca, info)
call pzlascl(type, cfrom, cto, m, n, a, ia, ja, desca, info)
Description
The routine multiplies the m-by-n real/complex distributed matrix sub(A) denoting A(ia:ia+m-1,
ja:ja+n-1) by the real/complex scalar cto/cfrom. This is done without over/underflow as
long as the final result cto*A(i,j)/cfrom does not over/underflow. type specifies that sub(A)
may be full, upper triangular, lower triangular or upper Hessenberg.
Input Parameters
type (global) CHARACTER.
type indicates the storage type of the input distributed
matrix.
if type = 'G': sub(A) is a full matrix,
if type = 'L': sub(A) is a lower triangular matrix,
if type = 'U': sub(A) is an upper triangular matrix,
if type = 'H': sub(A) is an upper Hessenberg matrix.
2238
cfrom, cto (global)
REAL for pslascl/pclascl
DOUBLE PRECISION for pdlascl/pzlascl.
The distributed matrix sub(A) is multiplied by cto/cfrom .
A(i,j) is computed without over/underflow if the final result
cto*A(i,j)/cfrom can be represented without
over/underflow. cfrom must be nonzero.
m (global) INTEGER.
n (global) INTEGER.
(n≥0).
a (local input/local output)
REAL for pslascl
DOUBLE PRECISION for pdlascl
COMPLEX for pclascl
COMPLEX*16 for pzlascl.
This array contains the local pieces of the distributed matrix
sub(A).
The column and row indices in the global array A indicating
the first row and column of the submatrix sub(A),
respectively.
desca (global and local) INTEGER .
Array of DIMENSION (dlen_).The array descriptor for the
Output Parameters
a (local).
On exit, this array contains the local pieces of the distributed
matrix multiplied by cto/cfrom.
info (local)
2239
INTEGER.
if info = 0: the execution is successful.
if info < 0: If the i-th argument is an array and the
j-entry had an illegal value, then info = -(i*100+j),
then info = -i.
p?laset
Initializes the offdiagonal elements of a matrix to
alpha and the diagonal elements to beta.
Syntax
call pslaset(uplo, m, n, alpha, beta, a, ia, ja, desca)
call pdlaset(uplo, m, n, alpha, beta, a, ia, ja, desca)
call pclaset(uplo, m, n, alpha, beta, a, ia, ja, desca)
call pzlaset(uplo, m, n, alpha, beta, a, ia, ja, desca)
Description
The routine initializes an m-by-n distributed matrix sub(A) denoting A(ia:ia+m-1, ja:ja+n-1)
to beta on the diagonal and alpha on the offdiagonals.
Input Parameters
Specifies the part of the distributed matrix sub(A) to be set:
if uplo = 'U': upper triangular part; the strictly lower
triangular part of sub(A) is not changed;
if uplo = 'L': lower triangular part; the strictly upper
triangular part of sub(A) is not changed.
Otherwise: all of the matrix sub(A) is set.
m (global) INTEGER.
n (global) INTEGER.
2240
(n≥0).
alpha (global).
REAL for pslaset
DOUBLE PRECISION for pdlaset
COMPLEX for pclaset
COMPLEX*16 for pzlaset.
The constant to which the offdiagonal elements are to be
set.
beta (global).
REAL for pslaset
COMPLEX for pclaset
The constant to which the diagonal elements are to be set.
Output Parameters
a (local).
REAL for pslaset
COMPLEX for pclaset
This array contains the local pieces of the distributed matrix
sub(A) to be set. On exit, the leading m-by-n submatrix
sub(A) is set as follows:
if uplo = 'U', A(ia+i-1, ja+j-1) = alpha, 1≤i≤j-1,
1≤j≤n,
if uplo = 'L', A(ia+i-1, ja+j-1) = alpha, j+1≤i≤ m,
1≤j≤ n,
otherwise, A(ia+i-1, ja+j-1) = alpha, 1≤i≤m, 1≤j≤n,
ia+i.ne.ja+j, and, for all uplo, A(ia+i-1, ja+i-1) =
beta, 1≤i≤min(m,n).
2241

The column and row indices in the global array A indicating
the first row and column of the submatrix sub(A),
respectively.
desca (global and local) INTEGER .
p?lasmsub
Looks for a small subdiagonal element from the
bottom of the matrix that it can safely set to zero.
Syntax
call pslasmsub(a, desca, i, l, k, smlnum, buf, lwork)
call pdlasmsub(a, desca, i, l, k, smlnum, buf, lwork)
Description
The routine looks for a small subdiagonal element from the bottom of the matrix that it can
safely set to zero. This routine does a global maximum and must be called by all processes.
Input Parameters
a (global)
REAL for pslasmsub
DOUBLE PRECISION for pdlasmsub
Array, DIMENSION(desca(lld_),*).
On entry, the Hessenberg matrix whose tridiagonal part is
being scanned. Unchanged on exit.
desca (global and local) INTEGER.
i (global) INTEGER.
The global location of the bottom of the unreduced submatrix
l (global) INTEGER.
2242
The global location of the top of the unreduced submatrix
of A.
Unchanged on exit.
smlnum (global)
REAL for pslasmsub
On entry, a “small number” for the given matrix. Unchanged
on exit.
lwork (global) INTEGER.
On exit, lwork is the size of the work buffer.
This must be at least 2*ceil(ceil((i-l)/hbl )/
lcm(nprow,npcol)). Here lcm is least common multiple,
and nprow x npcol is the logical grid size.
Output Parameters
k (global) INTEGER.
On exit, this yields the bottom portion of the unreduced
submatrix. This will satisfy: l ≤ m ≤ i-1.
buf (local).
REAL for pslasmsub
Array of size lwork.
p?lassq
Updates a sum of squares represented in scaled
form.
Syntax
call pslassq(n, x, ix, jx, descx, incx, scale, sumsq)
call pdlassq(n, x, ix, jx, descx, incx, scale, sumsq)
call pclassq(n, x, ix, jx, descx, incx, scale, sumsq)
call pzlassq(n, x, ix, jx, descx, incx, scale, sumsq)
2243
Description
The routine returns the values scl and smsq such that
scl2 * smsq = x(1)2 + ... + x(n)2 + scale2*sumsq,
where x(i) = sub(X) = X(ix + (jx-1)*descx(m_) + (i - 1)*incx) for pslassq/pd-

lassq ,
and x(i) = sub(X) = abs(X(ix + (jx-1)*descx(m_) + (i - 1)*incx) for
pclassq/pzlassq.
For real routines pslassq/pdlassq the value of sumsq is assumed to be non-negative and scl
returns the value
scl = max(scale, abs(x(i))).
For complex routines pclassq/pzlassq the value of sumsq is assumed to be at least unity and
the value of ssq will then satisfy
1.0 ≤ ssq ≤ sumsq +2n
Value scale is assumed to be non-negative and scl returns the value
For all routines p?lassq values scale and sumsq must be supplied in scale and sumsq
respectively, and scale and sumsq are overwritten by scl and ssq respectively.
All routines p?lassq make only one pass through the vector sub(x).
Input Parameters
n (global) INTEGER.
The length of the distributed vector sub(x ).
x REAL for pslassq
DOUBLE PRECISION for pdlassq
COMPLEX for pclassq
COMPLEX*16 for pzlassq.
The vector for which a scaled sum of squares is computed:
2244
x(ix + (jx-1)*m_x + (i - 1)*incx), 1 ≤ i ≤ n.
ix (global) INTEGER.
The row index in the global array X indicating the first row
of sub(X).
jx (global) INTEGER.
The column index in the global array X indicating the first
column of sub(X).
descx (global and local) INTEGER array of DIMENSION (dlen_).
The array descriptor for the distributed matrix X.
The global increment for the elements of X. Only two values
of incx are supported in this version, namely 1 and m_x.
The argument incx must not equal zero.
scale (local).
REAL for pslassq/pclassq
DOUBLE PRECISION for pdlassq/pzlassq.
On entry, the value scale in the equation above.
sumsq (local)
REAL for pslassq/pclassq
DOUBLE PRECISION for pdlassq/pzlassq.
On entry, the value sumsq in the equation above.
Output Parameters
scale (local).
On exit, scale is overwritten with scl , the scaling factor
for the sum of squares.
sumsq (local).
On exit, sumsq is overwritten with the value smsq, the basic
sum of squares from which scl has been factored out.
2245
p?laswp
Performs a series of row interchanges on a general
rectangular matrix.
Syntax
call pslaswp(direc, rowcol, n, a, ia, ja, desca, k1, k2, ipiv)
call pdlaswp(direc, rowcol, n, a, ia, ja, desca, k1, k2, ipiv)
call pclaswp(direc, rowcol, n, a, ia, ja, desca, k1, k2, ipiv)
call pzlaswp(direc, rowcol, n, a, ia, ja, desca, k1, k2, ipiv)
Description
The routine performs a series of row or column interchanges on the distributed matrix
sub(A)=A(ia:ia+n-1, ja:ja+n-1). One interchange is initiated for each of rows or columns
k1 through k2 of sub(A). This routine assumes that the pivoting information has already been
broadcast along the process row or column. Also note that this routine will only work for k1-k2
being in the same mb (or nb) block. If you want to pivot a full matrix, use p?lapiv.
Input Parameters
direc (global) CHARACTER.
Specifies in which order the permutation is applied:
= 'F' - forward,
= 'B' - backward.
rowcol (global) CHARACTER.
Specifies if the rows or columns are permuted:
= 'R' - rows,
= 'C' - columns.
n (global) INTEGER.
If rowcol='R', the length of the rows of the distributed
matrix A(*, ja:ja+n-1) to be permuted;
If rowcol='C', the length of the columns of the distributed
matrix A(ia:ia+n-1 , *) to be permuted;
a (local)
REAL for pslaswp
2246
DOUBLE PRECISION for pdlaswp
COMPLEX for pclaswp
COMPLEX*16 for pzlaswp.
(lld_a, *). On entry, this array contains the local pieces of
the distributed matrix to which the row/columns
interchanges will be applied.
ia (global) INTEGER.
The row index in the global array A indicating the first row
of sub(A).
ja (global) INTEGER.
The column index in the global array A indicating the first
column of sub(A).
k1 (global) INTEGER.
The first element of ipiv for which a row or column
k2 (global) INTEGER.
The last element of ipiv for which a row or column
ipiv (local)
INTEGER. Array, DIMENSION LOCr(m_a)+mb_a for row
pivoting and LOCr(n_a)+nb_a for column pivoting. This
array is tied to the matrix A, ipiv(k)=l implies rows (or
columns) k and l are to be interchanged.
Output Parameters
A (local)
REAL for pslaswp
DOUBLE PRECISION for pdlaswp
COMPLEX for pclaswp
COMPLEX*16 for pzlaswp.
On exit, the permuted distributed matrix.
2247
p?latra
Computes the trace of a general square distributed
matrix.
Syntax
val = pslatra(n, a, ia, ja, desca)
val = pdlatra(n, a, ia, ja, desca)
val = pclatra(n, a, ia, ja, desca)
val = pzlatra(n, a, ia, ja, desca)
Description
This function computes the trace of an n-by-n distributed matrix sub(A) denoting A(ia:ia+n-1,
ja:ja+n-1). The result is left on every process of the grid.
Input Parameters
n (global) INTEGER.
The number of rows and columns to be operated on, that
is, the order of the distributed submatrix sub(A). n ≥0.
a (local).
Real for pslatra
DOUBLE PRECISION for pdlatra
COMPLEX for pclatra
COMPLEX*16 for pzlatra.
pieces of the distributed matrix, the trace of which is to be
computed.
ia, ja (global) INTEGER. The row and column indices respectively
in the global array A indicating the first row and the first
column of the submatrix sub(A), respectively.
2248
Output Parameters
val The value returned by the fuction.
p?latrd
Reduces the first nb rows and columns of a
symmetric/Hermitian matrix A to real tridiagonal
transformation.
Syntax
call pslatrd(uplo, n, nb, a, ia, ja, desca, d, e, tau, w, iw, jw, descw, work)
call pdlatrd(uplo, n, nb, a, ia, ja, desca, d, e, tau, w, iw, jw, descw, work)
call pclatrd(uplo, n, nb, a, ia, ja, desca, d, e, tau, w, iw, jw, descw, work)
call pzlatrd(uplo, n, nb, a, ia, ja, desca, d, e, tau, w, iw, jw, descw, work)
Description
The routine reduces nb rows and columns of a real symmetric or complex Hermitian matrix
sub(A)= A(ia:ia+n-1, ja:ja+n-1) to symmetric/complex tridiagonal form by an
orthogonal/unitary similarity transformation Q'*sub(A)*Q, and returns the matrices V and W,
which are needed to apply the transformation to the unreduced part of sub(A).
If uplo = U, p?latrd reduces the last nb rows and columns of a matrix, of which the upper
triangle is supplied;
if uplo = L, p?latrd reduces the first nb rows and columns of a matrix, of which the lower
triangle is supplied.
This is an auxiliary routine called by p?sytrd/p?hetrd.
Input Parameters
symmetric/Hermitian matrix sub(A) is stored:
= L: Lower triangular.
2249
n (global) INTEGER.
is, the order of the distributed submatrix sub(A). n ≥ 0.
The number of rows and columns to be reduced.
a REAL for pslatrd
DOUBLE PRECISION for pdlatrd
COMPLEX for pclatrd
COMPLEX*16 for pzlatrd.
If uplo = U, the leading n-by-n upper triangular part of
If uplo = L, the leading n-by-n lower triangular part of
its strictly upper triangular part is not referenced.
of sub(A).
column of sub(A).
iw (global) INTEGER.
The row index in the global array W indicating the first row
of sub(W).
jw (global) INTEGER.
The column index in the global array W indicating the first
column of sub(W).
descw (global and local) INTEGER array of DIMENSION (dlen_).
The array descriptor for the distributed matrix W.
work (local)
2250
REAL for pslatrd
COMPLEX for pclatrd
Workspace array of DIMENSION (nb_a).
Output Parameters
a (local)
On exit, if uplo = 'U', the last nb columns have been
reduced to tridiagonal form, with the diagonal elements
overwriting the diagonal elements of sub(A); the elements
above the diagonal with the array tau represent the
reflectors;
if uplo = 'L', the first nb columns have been reduced to
tridiagonal form, with the diagonal elements overwriting the
diagonal elements of sub(A); the elements below the
diagonal with the array tau represent the orthogonal/unitary
matrix Q as a product of elementary reflectors.
d (local)
REAL for pslatrd/pclatrd
DOUBLE PRECISION for pdlatrd/pzlatrd.
Array, DIMENSION LOCc(ja+n-1).
The diagonal elements of the tridiagonal matrix T: d(i) =
a(i,i). d is tied to the distributed matrix A.
e (local)
REAL for pslatrd/pclatrd
DOUBLE PRECISION for pdlatrd/pzlatrd.
Array, DIMENSION LOCc(ja+n-1) if uplo = 'U',
LOCc(ja+n-2) otherwise.
e(i) = a(i, i + 1) if uplo = 'U',
e(i) = a(i + 1, i) if uplo = L.
tau (local)
REAL for pslatrd
2251
COMPLEX for pclatrd

Array, DIMENSION LOCc(ja+n-1). This array contains the
w (local)
REAL for pslatrd
COMPLEX for pclatrd
(lld_w, nb_w). This array contains the local pieces of the
n-by-nb_w matrix w required to update the unreduced part
of sub(A).
Application Notes
Q = H(n)*H(n-1)*...*H(n-nb+1)
H(i) = I - tau*v*v' ,
where tau is a real/complex scalar, and v is a real/complex vector with v(i:n) = 0 and v(i-1)
= 1; v(1:i-1) is stored on exit in A(ia:ia+i-1, ja+i), and tau in tau(ja+i-1).
If uplo = L, the matrix Q is represented as a product of elementary reflectors
Q = H(1)*H(2)*...*H(nb)
H(i) = I - tau*v*v' ,
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i) = 0 and v(i+1)
= 1; v(i+2: n) is stored on exit in A(ia+i+1: ia+n-1, ja+i-1), and tau in tau(ja+i-1).
The elements of the vectors v together form the n-by-nb matrix V which is needed, with W, to
apply the transformation to the unreduced part of the matrix, using a symmetric/Hermitian
rank-2k update of the form:
sub(A) := sub(A)-vw'-wv'.
The contents of a on exit are illustrated by the following examples with
2252
n = 5 and nb = 2:
where d denotes a diagonal element of the reduced matrix, a denotes an element of the original
matrix that is unchanged, and vi denotes an element of the vector defining H(i).
p?latrs
scale factor set to prevent overflow.
Syntax
call pslatrs(uplo, trans, diag, normin, n, a, ia, ja, desca, x, ix, jx, descx,
scale, cnorm, work)
call pdlatrs(uplo, trans, diag, normin, n, a, ia, ja, desca, x, ix, jx, descx,
scale, cnorm, work)
call pclatrs(uplo, trans, diag, normin, n, a, ia, ja, desca, x, ix, jx, descx,
scale, cnorm, work)
call pzlatrs(uplo, trans, diag, normin, n, a, ia, ja, desca, x, ix, jx, descx,
scale, cnorm, work)
Description
The routine solves a triangular system of equations Ax = sb, ATx = sb or AHx = sb, where
s is a scale factor set to prevent overflow. The description of the routine will be extended in
the future releases.
2253
Input Parameters
uplo CHARACTER*1.
trans CHARACTER*1.
Specifies the operation applied to Ax.
= 'N': Solve Ax = s*b (no transpose)
T
= 'T': Solve A x = s*b (transpose)
H
= 'C': Solve A x = s*b (conjugate transpose),
where s - is a scale factor
diag CHARACTER*1.
= 'U': Unit triangular
normin CHARACTER*1.
Specifies whether cnorm has been set or not.
= 'N': cnorm is not set on entry. On exit, the norms will
be computed and stored in cnorm.
n INTEGER.
The order of the matrix A. n ≥ 0
a REAL for pslatrs/pclatrs
DOUBLE PRECISION for pdlatrs/pzlatrs
Array, DIMENSION (lda, n). Contains the triangular matrix
A.
If uplo = U, the leading n-by-n upper triangular part of the
array a contains the upper triangular matrix, and the strictly
lower triangular part of a is not referenced.
If diag = 'U', the diagonal elements of a are also not
2254
x REAL for pslatrs/pclatrs
DOUBLE PRECISION for pdlatrs/pzlatrs
Array, DIMENSION (n). On entry, the right hand side b of
the triangular system.
ix (global) INTEGER.The row index in the global array x
The column index in the global array x indicating the first
column of sub(x).
descx (global and local) INTEGER.
Array, DIMENSION (dlen_). The array descriptor for the
cnorm REAL for pslatrs/pclatrs
DOUBLE PRECISION for pdlatrs/pzlatrs.
Array, DIMENSION (n). If normin = 'Y', cnorm is an input
argument and cnorm (j) contains the norm of the
off-diagonal part of the j-th column of A. If trans = 'N',
cnorm (j) must be greater than or equal to the infinity-norm,
and if trans = 'T' or 'C', cnorm(j) must be greater than
or equal to the 1-norm.
work (local).
REAL for pslatrs
DOUBLE PRECISION for pdlatrs
COMPLEX for pclatrs
COMPLEX*16 for pzlatrs.
Temporary workspace.
Output Parameters
X On exit, x is overwritten by the solution vector x.
scale REAL for pslatrs/pclatrs
DOUBLE PRECISION for pdlatrs/pzlatrs.
2255
Array, DIMENSION (lda, n). The scaling factor s for the

triangular system as described above.
the vector x is an exact or approximate solution to Ax =
0.
cnorm If normin = 'N', cnorm is an output argument and cnorm(j)
returns the 1-norm of the off-diagonal part of the j-th
column of A.
p?latrz
Reduces an upper trapezoidal matrix to upper
triangular form by means of orthogonal/unitary
transformations.
Syntax
call pslatrz(m, n, l, a, ia, ja, desca, tau, work)
call pdlatrz(m, n, l, a, ia, ja, desca, tau, work)
call pclatrz(m, n, l, a, ia, ja, desca, tau, work)
call pzlatrz(m, n, l, a, ia, ja, desca, tau, work)
Description
The routine reduces the m-by-n(m ≤ n) real/complex upper trapezoidal matrix sub(A) =
[A(ia:ia+m-1, ja:ja+m-1) A(ia:ia+m-1, ja+n-l:ja+n-1)] to upper triangular form by
means of orthogonal/unitary transformations.
The upper trapezoidal matrix sub(A) is factored as
sub(A) = ( R 0 )*Z,
Input Parameters
m (global) INTEGER.
of rows of the distributed submatrix sub(A). m ≥ 0.
2256
n (global) INTEGER.
number of columns of the distributed submatrix sub(A). n
≥ 0.
l (global) INTEGER.
The number of columns of the distributed submatrix sub(A)
containing the meaningful part of the Householder reflectors.
l > 0.
a (local)
REAL for pslatrz
DOUBLE PRECISION for pdlatrz
COMPLEX for pclatrz
COMPLEX*16 for pzlatrz.
DIMENSION(lld_a, LOCc(ja+n-1)). On entry, the local
pieces of the m-by-n distributed matrix sub(A), which is to
be factored.
of sub(A).
column of sub(A).
work (local)
REAL for pslatrz
COMPLEX for pclatrz
lwork ≥ nq0 + max(1, mp0), where
2257

numroc, indxg2p, and numroc are ScaLAPACK tool
functions; myrow, mycol, nprow, and npcol can be
determined by calling the subroutine blacs_gridinfo.
Output Parameters
a On exit, the leading m-by-m upper triangular part of sub(A)
contains the upper triangular matrix R, and elements n-l+1
to n of the first m rows of sub(A), with the array tau,
represent the orthogonal/unitary matrix Z as a product of
m elementary reflectors.
tau (local)
REAL for pslatrz
COMPLEX for pclatrz
Array, DIMENSION(LOCr(ja+m-1)). This array contains the
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, Z(k),
which is used (or, in case of complex routines, whose conjugate transpose is used) to introduce
zeros into the (m - k + 1)-th row of sub(A), is given in the form
where
2258
tau is a scalar and z( k ) is an (n-m)-element vector. tau and z( k ) are chosen to annihilate
the elements of the k-th row of sub(A). The scalar tau is returned in the k-th element of tau
and the vector u( k ) in the k-th row of sub(A), such that the elements of z(k ) are in a(
k, m + 1 ), ..., a( k, n ). The elements of R are returned in the upper triangular part
of sub(A).
Z is given by
Z = Z(1)Z(2)...Z(m).
p?lauu2
Computes the product U*U' or L'*L, where U and
L are upper or lower triangular matrices (local
Syntax
call pslauu2(uplo, n, a, ia, ja, desca)
call pdlauu2(uplo, n, a, ia, ja, desca)
call pclauu2(uplo, n, a, ia, ja, desca)
call pzlauu2(uplo, n, a, ia, ja, desca)
Description
The routine computes the product U*U' or L'*L, where the triangular factor U or L is stored in
the upper or lower triangular part of the distributed matrix
sub(A)= A(ia:ia+n-1, ja:ja+n-1).
in sub(A).
in sub(A).
This is the unblocked form of the algorithm, calling BLAS Level 2 Routines. No communication
is performed by this routine, the matrix to operate on should be strictly local to one process.
Input Parameters
2259
Specifies whether the triangular factor stored in the matrix

sub(A) is upper or lower triangular:
= U: upper triangular
= L: lower triangular.
n (global) INTEGER.
is, the order of the triangular factor U or L. n ≥ 0.
a (local)
REAL for pslauu2
DOUBLE PRECISION for pdlauu2
COMPLEX for pclauu2
COMPLEX*16 for pzlauu2.
DIMENSION(lld_a, LOCc(ja+n-1). On entry, the local
pieces of the triangular factor U or L.
of sub(A).
column of sub(A).
Output Parameters
a (local)
On exit, if uplo = 'U', the upper triangle of the distributed
matrix sub(A) is overwritten with the upper triangle of the
product U*U'; if uplo = 'L', the lower triangle of sub(A)
is overwritten with the lower triangle of the product L'*L.
2260
p?lauum
Computes the product U*U' or L'*L, where U and
L are upper or lower triangular matrices.
Syntax
call pslauum(uplo, n, a, ia, ja, desca)
call pdlauum(uplo, n, a, ia, ja, desca)
call pclauum(uplo, n, a, ia, ja, desca)
call pzlauum(uplo, n, a, ia, ja, desca)
Description
The routine computes the product U*U' or L'*L, where the triangular factor U or L is stored in
the upper or lower triangular part of the matrix sub(A)= A(ia:ia+n-1, ja:ja+n-1).
in sub(A). If uplo = 'L' or 'l', then the lower triangle of the result is stored, overwriting the
factor L in sub(A).
This is the blocked form of the algorithm, calling Level 3 PBLAS.
Input Parameters
Specifies whether the triangular factor stored in the matrix
sub(A) is upper or lower triangular:
= 'L': lower triangular.
n (global) INTEGER.
is, the order of the triangular factor U or L. n ≥ 0.
a (local)
REAL for pslauum
DOUBLE PRECISION for pdlauum
COMPLEX for pclauum
COMPLEX*16 for pzlauum.
2261

DIMENSION(lld_a, LOCc(ja+n-1). On entry, the local
pieces of the triangular factor U or L.
of sub(A).
column of sub(A).
Output Parameters
a (local)
On exit, if uplo = 'U', the upper triangle of the distributed
matrix sub(A) is overwritten with the upper triangle of the
product U*U' ; if uplo = 'L', the lower triangle of sub(A)
is overwritten with the lower triangle of the product L'*L.
p?lawil
Forms the Wilkinson transform.
Syntax
call pslawil(ii, jj, m, a, desca, h44, h33, h43h34, v)
call pdlawil(ii, jj, m, a, desca, h44, h33, h43h34, v)
Description
The routine gets the transform given by h44, h33, and h43h34 into v starting at row m.
Input Parameters
ii (global) INTEGER.
Row owner of h(m+2, m+2).
jj (global) INTEGER.
2262
Column owner of h(m+2, m+2).
m (global) INTEGER.
On entry, the location from where the transform starts (row
m). Unchanged on exit.
a (global)
REAL for pslawil
DOUBLE PRECISION for pdlawil
Array, DIMENSION (desca(lld_),*).
On entry, the Hessenberg matrix. Unchanged on exit.
desca (global and local) INTEGER
distributed matrix A. Unchanged on exit.
h43h34 (global)
REAL for pslawil
These three values are for the double shift QR iteration.
Unchanged on exit.
Output Parameters
v (global)
REAL for pslawil
Array of size 3 that contains the transform on output.
p?org2l/p?ung2l
matrix Q from a QL factorization determined by
p?geqlf (unblocked algorithm).
Syntax
call psorg2l(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorg2l(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pcung2l(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzung2l(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
2263
Description
The routine p?org2l/p?ung2l generates an m-by-n real/complex distributed matrix Q denoting
A(ia:ia+m-1, ja:ja+n-1) with orthonormal columns, which is defined as the last n columns
of a product of k elementary reflectors of order m:
Q = H(k)*...*H(2)*H(1) as returned by p?geqlf.
Input Parameters
m (global) INTEGER.
of rows of the distributed submatrix Q. m ≥ 0.
n (global) INTEGER.
number of columns of the distributed submatrix Q. m ≥ n
≥ 0.
k (global) INTEGER.
the matrix Q. n≥ k ≥ 0.
a REAL for psorg2l
DOUBLE PRECISION for pdorg2l
COMPLEX for pcung2l
COMPLEX*16 for pzung2l.
Pointer into the local memory to an array, DIMENSION
(lld_a, LOCc(ja+n-1).
On entry, the j-th column must contain the vector that
defines the elementary reflector H(j), ja+n-k ≤ j ≤
ja+n-k, as returned by p?geqlf in the k columns of its
distributed matrix argument A(ia:*,ja+n-k:ja+n-1).
of sub(A).
column of sub(A).
2264
tau (local)
REAL for psorg2l
COMPLEX for pcung2l
Array, DIMENSION LOCc(ja+n-1).
This array contains the scalar factor tau(j) of the
elementary reflector H(j), as returned by p?geqlf.
work (local)
REAL for psorg2l
COMPLEX for pcung2l
lwork is local input and must be at least lwork ≥ mpa0 +
max(1, nqa0), where
mpa0 = numroc(m+iroffa, mb_a, myrow, iarow,
nprow),
nqa0 = numroc(n+icoffa, nb_a, mycol, iacol,
npcol).
2265
Output Parameters
a On exit, this array contains the local pieces of the m-by-n
distributed matrix Q.
info (local) INTEGER.
an illegal value,
then info = - (i*100 +j),
then info = -i.
p?org2r/p?ung2r
matrix Q from a QR factorization determined by
p?geqrf (unblocked algorithm).
Syntax
call psorg2r(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorg2r(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pcung2r(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzung2r(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
The routine p?org2r/p?ung2r generates an m-by-n real/complex matrix Q denoting
of a product of k elementary reflectors of order m:
Q = H(1)*H(2)*...*H(k)
Input Parameters
m (global) INTEGER.
2266
of rows of the distributed submatrix Q.m ≥ 0.
n (global) INTEGER.
number of columns of the distributed submatrix Q. m ≥ n
≥ 0.
k (global) INTEGER.
the matrix Q. n ≥ k ≥ 0.
a REAL for psorg2r
DOUBLE PRECISION for pdorg2r
COMPLEX for pcung2r
COMPLEX*16 for pzung2r.
Pointer into the local memory to an array,
DIMENSION(lld_a, LOCc(ja+n-1).
defines the elementary reflector H(j), ja ≤ j ≤ ja+k-1,
as returned by p?geqrf in the k columns of its distributed
matrix argument A(ia:*,ja:ja+k-1).
of sub(A).
column of sub(A).
tau (local)
REAL for psorg2r
COMPLEX for pcung2r
Array, DIMENSION LOCc(ja+k-1).
2267
This array contains the scalar factor tau(j) of the

elementary reflector H(j), as returned by p?geqrf. This
array is tied to the distributed matrix A.
work (local)
REAL for psorg2r
COMPLEX for pcung2r
lwork is local input and must be at least lwork ≥ mpa0 +
max(1, nqa0),
where
iroffa = mod(ia-1, mb_a , icoffa = mod(ja-1,
nb_a),
nprow),
npcol).
Output Parameters
2268
an illegal value,
then info = - (i*100+j),
then info = -i.
p?orgl2/p?ungl2
matrix Q from an LQ factorization determined by
p?gelqf (unblocked algorithm).
Syntax
call psorgl2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgl2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pcungl2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungl2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
The routine p?orgl2/p?ungl2 generates a m-by-n real/complex matrix Q denoting
Q = H(k)*...*H(2)*H(1) (for real flavors),

Q = (H(k))H*...*(H(2))H*(H(1))H (for complex flavors) as returned by p?gelqf.
Input Parameters
m (global) INTEGER.
n (global) INTEGER.
number of columns of the distributed submatrix Q. n ≥ m
≥ 0.
2269
k (global) INTEGER.
a REAL for psorgl2
DOUBLE PRECISION for pdorgl2
COMPLEX for pcungl2
COMPLEX*16 for pzungl2.
(lld_a, LOCc(ja+n-1).
On entry, the i-th row must contain the vector that defines
the elementary reflector H(i), ia ≤ i ≤ ia+k-1, as
returned by p?gelqf in the k rows of its distributed matrix
argument A(ia:ia+k-1, ja:*).
of sub(A).
column of sub(A).
tau (local)
REAL for psorgl2
COMPLEX for pcungl2
Array, DIMENSION LOCr(ja+k-1). This array contains the
scalar factors tau(i) of the elementary reflectors H(i), as
returned by p?gelqf. This array is tied to the distributed
matrix A.
WORK (local)
REAL for psorgl2
COMPLEX for pcungl2
2270
lwork is local input and must be at least lwork ≥ nqa0 +
max(1, mpa0), where
nprow),
npcol).
Output Parameters
an illegal value,
then info = -i.
2271
p?orgr2/p?ungr2
matrix Q from an RQ factorization determined by
p?gerqf (unblocked algorithm).
Syntax
call psorgr2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgr2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pcungr2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungr2(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Description
The routine p?orgr2/p?ungr2 generates an m-by-n real/complex matrix Q denoting
A(ia:ia+m-1, ja:ja+n-1) with orthonormal rows, which is defined as the last m rows of a
Q = H(1)*H(2)*...*H(k) (for real flavors);

Q = (H(1))H*(H(2))H...*(H(k))H (for complex flavors) as returned by p?gerqf.
Input Parameters
m (global) INTEGER.
n (global) INTEGER.
number of columns of the distributed submatrix Q. n ≥ m
≥ 0.
k (global) INTEGER.
a REAL for psorgr2
DOUBLE PRECISION for pdorgr2
COMPLEX for pcungr2
2272
COMPLEX*16 for pzungr2.
DIMENSION(lld_a, LOCc(ja+n-1).
the elementary reflector H(i), ia+m-k ≤ i ≤ ia+m-1,
as returned by p?gerqf in the k rows of its distributed
matrix argument A(ia+m-k:ia+m-1, ja:*).
of sub(A).
column of sub(A).
tau (local)
REAL for psorgr2
COMPLEX for pcungr2
Array, DIMENSION LOCr(ja+m-1). This array contains the
scalar factors tau(i) of the elementary reflectors H(i), as
returned by p?gerqf. This array is tied to the distributed
matrix A.
work (local)
REAL for psorgr2
COMPLEX for pcungr2
lwork is local input and must be at least lwork ≥ nqa0 +
max(1, mpa0 ), where iroffa = mod( ia-1, mb_a ),
icoffa = mod( ja-1, nb_a ),
2273
iarow = indxg2p( ia, mb_a, myrow, rsrc_a, nprow

),
iacol = indxg2p( ja, nb_a, mycol, csrc_a, npcol
),
mpa0 = numroc( m+iroffa, mb_a, myrow, iarow,
nprow ),
nqa0 = numroc( n+icoffa, nb_a, mycol, iacol,
npcol ).
Output Parameters
an illegal value,
then info = -i.
2274
p?orm2l/p?unm2l
orthogonal/unitary matrix from a QL factorization
determined by p?geqlf (unblocked algorithm).
Syntax
call psorm2l(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdorm2l(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pcunm2l(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunm2l(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
The routine p?orm2l/p?unm2l overwrites the general real/complex m-by-n distributed matrix
sub (C)=C(ic:ic+m-1,jc:jc+n-1) with
Q*sub(C) if side = 'L' and trans = 'N', or
QT*sub(C) / QH*sub(C) if side = 'L' and trans = 'T' (for real flavors) or trans = 'C'
(for complex flavors), or
sub(C)*Q if side = 'R' and trans = 'N', or
sub(C)*QT / sub(C)*QH if side = 'R' and trans = 'T' (for real flavors) or trans = 'C'
(for complex flavors).
where Q is a real orthogonal or complex unitary distributed matrix defined as the product of k
elementary reflectors
Q = H(k)*...*H(2)*H(1) as returned by p?geqlf . Q is of order m if side = 'L' and of
order n if side = 'R'.
Input Parameters
T H
= 'L': apply Q or Q for real flavors (Q for complex flavors)
from the left,
2275
T H
= 'R': apply Q or Q for real flavors (Q for complex flavors)
from the right.
T
H
m (global) INTEGER.
of rows of the distributed submatrix sub(C). m ≥ 0.
n (global) INTEGER.
number of columns of the distributed submatrix sub(C). n
≥ 0.
k (global) INTEGER.
the matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a (local)
REAL for psorm2l
DOUBLE PRECISION for pdorm2l
COMPLEX for pcunm2l
COMPLEX*16 for pzunm2l.
DIMENSION(lld_a, LOCc(ja+k-1).
On entry, the j-th row must contain the vector that defines
the elementary reflector H(j), ja ≤ j ≤ ja+k-1, as
returned by p?geqlf in the k columns of its distributed
matrix argument A(ia:*,ja:ja+k-1). The argument
A(ia:*,ja:ja+k-1) is modified by the routine but restored
on exit.
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1)),
if side = 'R', lld_a ≥ max(1, LOCr(ia+n-1)).
2276
of sub(A).
column of sub(A).
tau (local)
REAL for psorm2l
COMPLEX for pcunm2l
Array, DIMENSIONLOCc(ja+n-1). This array contains the
scalar factor tau(j) of the elementary reflector H(j), as
returned by p?geqlf. This array is tied to the distributed
matrix A.
c (local)
REAL for psorm2l
COMPLEX for pcunm2l
DIMENSION(lld_c, LOCc(jc+n-1)).On entry, the local
pieces of the distributed matrix sub (C).
ic (global) INTEGER.
The row index in the global array C indicating the first row
of sub(C).
jc (global) INTEGER.
The column index in the global array C indicating the first
column of sub(C).
descc (global and local) INTEGER array of DIMENSION (dlen_).
The array descriptor for the distributed matrix C.
work (local)
REAL for psorm2l
COMPLEX for pcunm2l
2277

On exit, work(1) returns the minimal and optimal lwork.
if side = 'L', lwork ≥ mpc0 + max(1, nqc0),
if side = 'R', lwork ≥ nqc0 + max(max(1, mpc0),
numroc(numroc(n+icoffc, nb_a, 0, 0, npcol),
nb_a, 0, 0, lcmq)),
where
lcmq = lcm/npcol,
lcm = iclm(nprow, npcol),
Mqc0 = numroc(m+icoffc, nb_c, mycol, icrow,
nprow),
Npc0 = numroc(n+iroffc, mb_c, myrow, iccol,
npcol),
Output Parameters
c On exit, c is overwritten by Q*sub(C), or QT*sub(C)/
QH*sub(C), or sub(C)*Q, or sub(C)*QT / sub(C)*QH
2278
an illegal value,
then info = -i.
NOTE. The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1,jc:jc+n-1)

must verify some alignment properties, namely the following expressions should be true:
If side = 'L', ( mb_a.eq.mb_c .AND. iroffa.eq.iroffc .AND. iarow.eq.icrow
)
If side = 'R', ( mb_a.eq.nb_c .AND. iroffa.eq.iroffc ).
p?orm2r/p?unm2r
orthogonal/unitary matrix from a QR factorization
determined by p?geqrf (unblocked algorithm).
Syntax
call psorm2r(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdorm2r(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pcunm2r(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunm2r(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
The routine p?orm2r/p?unm2r overwrites the general real/complex m-by-n distributed matrix
sub (C)=C(ic:ic+m-1, jc:jc+n-1) with
2279

where Q is a real orthogonal or complex unitary matrix defined as the product of k elementary
reflectors
Q = H(k)*...*H(2)*H(1) as returned by p?geqrf . Q is of order m if side = 'L' and of
order n if side = 'R'.
Input Parameters
T H
from the left,
T H
from the right.
T
H
m (global) INTEGER.
n (global) INTEGER.
≥ 0.
k (global) INTEGER.
the matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a (local)
REAL for psorm2r
DOUBLE PRECISION for pdorm2r
COMPLEX for pcunm2r
2280
COMPLEX*16 for pzunm2r.
DIMENSION(lld_a, LOCc(ja+k-1).
defines the elementary reflector H(j), ja ≤ j ≤ ja+k-1,
as returned by p?geqrf in the k columns of its distributed
matrix argument A(ia:*,ja:ja+k-1). The argument
A(ia:*,ja:ja+k-1) is modified by the routine but restored
on exit.
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1)),
if side = 'R', lld_a ≥ max(1, LOCr(ia+n-1)).
of sub(A).
column of sub(A).
tau (local)
REAL for psorm2r
COMPLEX for pcunm2r
Array, DIMENSION LOCc(ja+k-1). This array contains the
scalar factors tau(j) of the elementary reflector H(j), as
returned by p?geqrf. This array is tied to the distributed
matrix A.
c (local)
REAL for psorm2r
COMPLEX for pcunm2r
DIMENSION(lld_c, LOCc(jc+n-1)).
On entry, the local pieces of the distributed matrix sub (C).
2281
of sub(C).
column of sub(C).
work (local)
REAL for psorm2r
COMPLEX for pcunm2r
if side = 'L', lwork ≥ mpc0 + max(1, nqc0),
if side = 'R', lwork ≥ nqc0 + max(max(1, mpc0),
numroc(numroc(n+icoffc, nb_a, 0, 0, npcol),
nb_a, 0, 0, lcmq)),
where
lcmq = lcm/npcol ,
Mqc0 = numroc(m+icoffc, nb_c, mycol, icrow,
nprow),
Npc0 = numroc(n+iroffc, mb_c, myrow, iccol,
npcol),
2282
Output Parameters
an illegal value,
then info = -i.
NOTE. The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1, jc:jc+n-1)

If side = 'L', (mb_a.eq.mb_c .AND. iroffa.eq.iroffc .AND. iarow.eq.icrow)
If side = 'R', (mb_a.eq.nb_c .AND. iroffa.eq.iroffc).
2283
p?orml2/p?unml2
orthogonal/unitary matrix from an LQ factorization
determined by p?gelqf (unblocked algorithm).
Syntax
call psorml2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdorml2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pcunml2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunml2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
The routine p?orml2/p?unml2 overwrites the general real/complex m-by-n distributed matrix
Q = H(k)*...*H(2)*H(1) (for real flavors)
Q = (H(k))H*...*(H(2))H*(H(1))H (for complex flavors)
as returned by p?gelqf . Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
2284
T H
from the left,
T H
from the right.
T
H
m (global) INTEGER.
n (global) INTEGER.
≥ 0.
k (global) INTEGER.
the matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a (local)
REAL for psorml2
DOUBLE PRECISION for pdorml2
COMPLEX for pcunml2
COMPLEX*16 for pzunml2.
(lld_a, LOCc(ja+m-1) if side='L',
(lld_a, LOCc(ja+n-1) if side='R',
where lld_a ≥ max (1, LOCr(ia+k-1)).
returned by p?gelqf in the k rows of its distributed matrix
argument A(ia:ia+k-1, ja:*). The argument
A(ia:ia+k-1, ja:*) is modified by the routine but
restored on exit.
2285
of sub(A).
column of sub(A).
tau (local)
REAL for psorml2
COMPLEX for pcunml2
Array, DIMENSION LOCc(ia+k-1). This array contains the
scalar factors tau(i) of the elementary reflector H(i), as
returned by p?gelqf. This array is tied to the distributed
matrix A.
c (local)
REAL for psorml2
COMPLEX for pcunml2
DIMENSION(lld_c, LOCc(jc+n-1)). On entry, the local
of sub(C).
column of sub(C).
work (local)
REAL for psorml2
2286
COMPLEX for pcunml2
if side = 'L', lwork ≥ mqc0 + max(max( 1, npc0),
numroc(numroc(m+icoffc, mb_a, 0, 0, nprow),
mb_a, 0, 0, lcmp)),
if side = 'R', lwork ≥ npc0 + max(1, mqc0),
where
lcmp = lcm / nprow,
Mpc0 = numroc(m+icoffc, mb_c, mycol, icrow,
nprow),
Nqc0 = numroc(n+iroffc, nb_c, myrow, iccol,
npcol),
Output Parameters
2287

an illegal value,
then info = -i.
NOTE. The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1, jc:jc+n-1)

If side = 'L', (nb_a.eq.mb_c .AND. icoffa.eq.iroffc)
If side = 'R', (nb_a.eq.nb_c .AND. icoffa.eq.icoffc .AND. iacol.eq.iccol).
p?ormr2/p?unmr2
orthogonal/unitary matrix from an RQ factorization
determined by p?gerqf (unblocked algorithm).
Syntax
call psormr2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pdormr2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pcunmr2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
call pzunmr2(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc,
work, lwork, info)
Description
The routine p?ormr2/p?unmr2 overwrites the general real/complex m-by-n distributed matrix
2288
Q = H(1)*H(2)*...*H(k) (for real flavors)
Q = (H(1))H*(H(2))H*...*(H(k))H (for complex flavors)
as returned by p?gerqf . Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
T H
from the left,
T H
from the right.
T
H
m (global) INTEGER.
n (global) INTEGER.
≥ 0.
k (global) INTEGER.
the matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a (local)
REAL for psormr2
2289
DOUBLE PRECISION for pdormr2

COMPLEX for pcunmr2
COMPLEX*16 for pzunmr2.
(lld_a, LOCc(ja+m-1) if side='L',
(lld_a, LOCc(ja+n-1) if side='R',
where lld_a ≥ max (1, LOCr(ia+k-1)).
returned by p?gerqf in the k rows of its distributed matrix
argument A(ia:ia+k-1, ja:*).
The argument A(ia:ia+k-1, ja:*) is modified by the
of sub(A).
column of sub(A).
tau (local)
REAL for psormr2
COMPLEX for pcunmr2
Array, DIMENSION LOCc(ia+k-1). This array contains the
scalar factors tau(i) of the elementary reflector H(i), as
returned by p?gerqf. This array is tied to the distributed
matrix A.
c (local)
REAL for psormr2
COMPLEX for pcunmr2
2290
DIMENSION(lld_c, LOCc(jc+n-1)). On entry, the local
of sub(C).
column of sub(C).
work (local)
REAL for psormr2
COMPLEX for pcunmr2
if side = 'L', lwork ≥ mpc0 + max(max(1, nqc0),
numroc(numroc(m+iroffc, mb_a, 0, 0, nprow),
mb_a, 0, 0, lcmp)),
if side = 'R', lwork ≥ nqc0 + max(1, mpc0),
where lcmp = lcm/nprow,
Mpc0 = numroc(m+iroffc, mb_c, myrow, icrow,
nprow),
Nqc0 = numroc(n+icoffc, nb_c, mycol, iccol,
npcol),
2291

Output Parameters
an illegal value,
then info = -i.
NOTE. The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1,jc:jc+n-1)

If side = 'L', ( nb_a.eq.mb_c .AND. icoffa.eq.iroffc )
If side = 'R', ( nb_a.eq.nb_c .AND. icoffa.eq.icoffc .AND. iacol.eq.iccol
).
2292
p?pbtrsv
Solves a single triangular linear system via
frontsolve or backsolve where the triangular matrix
is a factor of a banded matrix computed by p?pb-
trf.
Syntax
call pspbtrsv(uplo, trans, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pdpbtrsv(uplo, trans, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pcpbtrsv(uplo, trans, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pzpbtrsv(uplo, trans, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
Description
The routine p?pbtrsv solves a banded triangular system of linear equations
A(1:n, ja:ja+n-1)*X = B(jb:jb+n-1, 1:nrhs)
or
A(1:n, ja:ja+n-1)T*X = B(jb:jb+n-1, 1:nrhs) for real flavors,
A(1:n, ja:ja+n-1)H*X = B(jb:jb+n-1, 1:nrhs) for complex flavors,
where A(1:n, ja:ja+n-1) is a banded triangular matrix factor produced by the Cholesky
factorization code p?pbtrf and is stored in A(1:n, ja:ja+n-1) and af. The matrix stored in
A(1:n, ja:ja+n-1) is either upper or lower triangular according to uplo.
Routine p?pbtrf must be called first.
Input Parameters
stored;
2293

stored.
trans (global) CHARACTER. Must be 'N' or 'T' or 'C'.
If trans = 'N', solve with A(1:n, ja:ja+n-1);
If trans = 'T' or 'C' for real flavors, solve with A(1:n,
ja:ja+n-1)T.
If trans = 'C' for complex flavors, solve with conjugate
transpose(A(1:n, ja:ja+n-1)H.
n (global) INTEGER.
is, the order of the distributed submatrix A(1:n,
ja:ja+n-1). n ≥ 0.
bw (global) INTEGER.
The number of subdiagonals in 'L' or 'U', 0 ≤ bw ≤ n-1.
nrhs (global) INTEGER.
The number of right hand sides; the number of columns of
the distributed submatrix B(jb:jb+n-1, 1:nrhs); nrhs
≥ 0.
a (local)
REAL for pspbtrsv
DOUBLE PRECISION for pdpbtrsv
COMPLEX for pcpbtrsv
COMPLEX*16 for pzpbtrsv.
Pointer into the local memory to an array with the first
DIMENSION lld_a ≥ (bw+1), stored in desca.
symmetric banded distributed Cholesky factor L or
LT*A(1:n, ja:ja+n-1).
This local portion is stored in the packed banded format
used in LAPACK. See the Application Notes below and the
ScaLAPACK manual for more detail on the format of
distributed matrices.
2294
If 1D type (dtype_a = 501), then dlen ≥ 7;
If 2D type (dtype_a = 1), then dlen ≥ 9.
Contains information on mapping of A to memory. (See
ScaLAPACK manual for full description and options.)
b (local)
REAL for pspbtrsv
DIMENSION lld_b ≥ nb.
hand sides B(jb:jb+n-1, 1:nrhs).
If 1D type (dtype_b = 502), then dlen ≥ 7;
If 2D type (dtype_b = 1), then dlen ≥ 9.
Contains information on mapping of B to memory. Please,
see ScaLAPACK manual for full description and options.
laf (local)
INTEGER. The size of user-input auxiliary Fillin space af.
Must be laf ≥ (nb+2*bw)*bw . If laf is not large enough,
an error code will be returned and the minimum acceptable
size will be returned in af(1).
work (local)
REAL for pspbtrsv
2295
The array work is a temporary workspace array of

DIMENSION lwork. This space may be overwritten in
lwork (local or global) INTEGER. The size of the user-input
workspace work, must be at least lwork ≥ bw*nrhs. If
lwork is too small, the minimal acceptable size will be
returned in work(1) and an error code is returned.
Output Parameters
af (local)
REAL for pspbtrsv
The array af is of DIMENSION laf. It contains auxiliary Fillin
space. Fillin is created during the factorization routine
p?pbtrf and this is stored in af. If a linear system is to be
solved using p?pbtrs after the factorization routine, af
must not be altered after the factorization.
b On exit, this array contains the local piece of the solutions
work(1) On exit, work(1) contains the minimum value of lwork.
an illegal value,
then info = -(i*100+j),
then info = -i.
Application Notes
If the factorization routine and the solve routine are to be called separately to solve various
sets of right-hand sides using the same coefficient matrix, the auxiliary space af must not be
altered between calls to the factorization routine and the solve routine.
2296
The best algorithm for solving banded and tridiagonal linear systems depends on a variety of
parameters, especially the bandwidth. Currently, only algorithms designed for the case N/P
>> bw are implemented. These algorithms go by many names, including Divide and Conquer,
Partitioning, domain decomposition-type, etc.
The Divide and Conquer algorithm assumes the matrix is narrowly banded compared with the
number of equations. In this situation, it is best to distribute the input matrix A
one-dimensionally, with columns atomic and rows divided amongst the processes. The basic
algorithm divides the banded matrix up into P pieces with one stored on each processor, and
then proceeds in 2 phases for the factorization or 3 for the solution of a linear system.
1. Local Phase: The individual pieces are factored independently and in parallel. These factors
are applied to the matrix creating fill-in, which is stored in a non-inspectable way in auxiliary
space af. Mathematically, this is equivalent to reordering the matrix A as PAPT and then
factoring the principal leading submatrix of size equal to the sum of the sizes of the matrices
factored on each processor. The factors of these submatrices overwrite the corresponding
parts of A in memory.
2. Reduced System Phase: A small (bw*(P-1)) system is formed representing interaction
of the larger blocks and is stored (as are its factors) in the space af. A parallel Block Cyclic
Reduction algorithm is used. For a linear system, a parallel front solve followed by an
analogous backsolve, both using the structure of the factored matrix, are performed.
3. Back Subsitution Phase: For a linear system, a local backsubstitution is performed on
each processor in parallel.
2297
p?pttrsv
Solves a single triangular linear system via
frontsolve or backsolve where the triangular matrix
is a factor of a tridiagonal matrix computed by
p?pttrf .
Syntax
call pspttrsv(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pdpttrsv(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pcpttrsv(uplo, trans, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
call pzpttrsv(uplo, trans, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf,
work, lwork, info)
Description
The routine solves a tridiagonal triangular system of linear equations
A(1:n, ja:ja+n-1)*X = B(jb:jb+n-1, 1:nrhs)
or
A(1:n, ja:ja+n-1)T*X = B(jb:jb+n-1, 1:nrhs) for real flavors,
A(1:n, ja:ja+n-1)H*X = B(jb:jb+n-1, 1:nrhs) for complex flavors,
where A(1:n, ja:ja+n-1) is a tridiagonal triangular matrix factor produced by the Cholesky
factorization code p?pttrf and is stored in A(1:n, ja:ja+n-1) and af. The matrix stored in
A(1:n, ja:ja+n-1) is either upper or lower triangular according to uplo.
Routine p?pttrf must be called first.
Input Parameters
stored;
2298
stored.
trans (global) CHARACTER. Must be 'N' or 'C'.
If trans = 'N', solve with A(1:n, ja:ja+n-1);
If trans = 'C' (for complex flavors), solve with conjugate
transpose (A(1:n, ja:ja+n-1))H.
n (global) INTEGER.
is, the order of the distributed submatrix A(1:n,
ja:ja+n-1). n ≥ 0.
nrhs (global) INTEGER.
The number of right hand sides; the number of columns of
the distributed submatrix B(jb:jb+n-1, 1:nrhs); nrhs ≥
0.
d (local)
REAL for pspttrsv
DOUBLE PRECISION for pdpttrsv
COMPLEX for pcpttrsv
COMPLEX*16 for pzpttrsv.
Pointer to the local part of the global vector storing the main
diagonal of the matrix; must be of size ≥ desca(nb_).
e (local)
REAL for pspttrsv
Pointer to the local part of the global vector storing the
upper diagonal of the matrix; must be of size ≥ desca(nb_).
Globally, du(n) is not referenced, and du must be aligned
with d.
2299
If 1D type (dtype_a = 501 or 502), then dlen ≥ 7;

If 2D type (dtype_a = 1), then dlen ≥ 9.
Contains information on mapping of A to memory. See
ScaLAPACK manual for full description and options.
b (local)
REAL for pspttrsv
DIMENSION lld_b ≥ nb.
hand sides B(jb:jb+n-1, 1:nrhs).
If 1D type (dtype_b = 502), then dlen ≥ 7;
If 2D type (dtype_b = 1), then dlen ≥ 9.
Contains information on mapping of B to memory. See
ScaLAPACK manual for full description and options.
laf (local)
INTEGER. The size of user-input auxiliary Fillin space af.
Must be laf ≥ (nb+2*bw)*bw.
work (local)
REAL for pspttrsv
DIMENSION lwork. This space may be overwritten in
2300
lwork (local or global) INTEGER. The size of the user-input
workspace work, must be at least lwork ≥(10+2*min(100,
nrhs))*npcol+4*nrhs. If lwork is too small, the minimal
acceptable size will be returned in work(1) and an error
code is returned.
Output Parameters
d, e (local).
REAL for pspttrsv
On exit, these arrays contain information on the factors of
the matrix.
af (local)
REAL for pspttrsv
The array af is of DIMENSION laf. It contains auxiliary Fillin
space. Fillin is created during the factorization routine
p?pbtrf and this is stored in af. If a linear system is to be
solved using p?pttrs after the factorization routine, af
must not be altered after the factorization.
b On exit, this array contains the local piece of the solutions
work(1) On exit, work(1) contains the minimum value of lwork.
an illegal value,
then info = -i.
2301
p?potf2
symmetric/Hermitian positive definite matrix (local
Syntax
call pspotf2(uplo, n, a, ia, ja, desca, info)
call pdpotf2(uplo, n, a, ia, ja, desca, info)
call pcpotf2(uplo, n, a, ia, ja, desca, info)
call pzpotf2(uplo, n, a, ia, ja, desca, info)
Description
positive definite distributed matrix sub (A)=A(ia:ia+n-1, ja:ja+n-1).
sub(A) = U'*U, if uplo = 'U', or sub(A) = L*L', if uplo = 'L',
where U is an upper triangular matrix, L is lower triangular. X' denotes transpose (conjugate
transpose) of X.
Input Parameters
symmetric/Hermitian matrix A is stored.
= 'U': upper triangle of sub (A) is stored;
= 'L': lower triangle of sub (A) is stored.
n (global) INTEGER.
is, the order of the distributed submatrix sub (A). n ≥ 0.
a (local)
REAL for pspotf2
DOUBLE PRECISION or pdpotf2
COMPLEX for pcpotf2
2302
COMPLEX*16 for pzpotf2.
pieces of the n-by-n symmetric distributed matrix sub(A) to
be factored.
sub(A) contains the upper triangular matrix and the strictly
lower triangular part of this matrix is not referenced.
sub(A) contains the lower triangular matrix and the strictly
the first row and the first column of the sub(A), respectively.
Output Parameters
a (local)
On exit,
if uplo = 'U', the upper triangular part of the distributed
matrix contains the Cholesky factor U;
if uplo = 'L', the lower triangular part of the distributed
matrix contains the Cholesky factor L.
an illegal value,
then info = -i.
> 0: if info = k, the leading minor of order k is not
positive definite, and the factorization could not be
completed.
2303
p?rscl
Multiplies a vector by the reciprocal of a real scalar.
Syntax
call psrscl(n, sa, sx, ix, jx, descx, incx)
call pdrscl(n, sa, sx, ix, jx, descx, incx)
call pcsrscl(n, sa, sx, ix, jx, descx, incx)
call pzdrscl(n, sa, sx, ix, jx, descx, incx)
Description
The routine multiplies an n-element real/complex vector sub(x) by the real scalar 1/a. This
is done without overflow or underflow as long as the final result sub(x)/a does not overflow
or underflow.
sub(x) denotes x(ix:ix+n-1, jx:jx), if incx = 1,
and x(ix:ix, jx:jx+n-1), if incx = m_x.
Input Parameters
n (global) INTEGER.
The number of components of the distributed vector sub(x).
n ≥ 0.
sa REAL for psrscl/pcsrscl
DOUBLE PRECISION for pdrscl/pzdrscl.
The scalar a that is used to divide each component of the
vector x. This parameter must be ≥ 0.
sx REAL forpsrscl
DOUBLE PRECISION for pdrscl
COMPLEX for pcsrscl
COMPLEX*16 for pzdrscl.
DIMENSION of at least ((jx-1)*m_x + ix +
(n-1)*abs(incx)). This array contains the entries of the
distributed vector sub(x).
2304
ix (global) INTEGER.The row index of the submatrix of the
distributed matrix X to operate on.
The column index of the submatrix of the distributed matrix
X to operate on.
descx (global and local). INTEGER.
Array of DIMENSION 8. The array descriptor for the
The increment for the elements of X. This version supports
only two values of incx, namely 1 and m_x.
Output Parameters
sx On exit, the result x/a.
p?sygs2/p?hegs2
Reduces a symmetric/Hermitian definite
generalized eigenproblem to standard form, using
the factorization results obtained from p?potrf
(local unblocked algorithm).
Syntax
call pssygs2(ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, info)
call pdsygs2(ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, info)
call pchegs2(ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, info)
call pzhegs2(ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, info)
Description
The routine p?sygs2/p?hegs2 reduces a real symmetric-definite or a complex Hermitian-definite
generalized eigenproblem to standard form.
Here sub(A) denotes A(ia:ia+n-1, ja:ja+n-1), and sub(B) denotes B(ib:ib+n-1,
jb:jb+n-1).
2305
sub(A)*x = λ*sub(B)*x
and sub(A) is overwritten by

inv(UT)*sub(A)*inv(U) or inv(L)*sub(A)*inv(LT) - for real flavors, and
inv(UH)*sub(A)*inv(U) or inv(L)*sub(A)*inv(LH) - for complex flavors.
sub(A)*sub(B)x = λ*x or sub(B)*sub(A)x =λ*x
and sub(A) is overwritten by

U*sub(A)*UT or L**T*sub(A)*L - for real flavors and
U*sub(A)*UH or L**H*sub(A)*L - for complex flavors.
T
The matrix sub(B) must have been previously factorized as UT*U or L*L (for real flavors), or
H
as UH*U or L*L (for complex flavors) by p?potrf.
Input Parameters
ibtype (global) INTEGER.
= 1:
compute inv(UT)*sub(A)*inv(U), or
inv(L)*sub(A)*inv(LT) for real subroutines,
and inv(UH)*sub(A)*inv(U), or inv(L)*sub(A)*inv(LH)
for complex subroutines;
= 2 or 3:
compute U*sub(A)*UT, or LT*sub(A)*L for real subroutines,
and U*sub(A)*UH or LH*sub(A)*L for complex subroutines.
uplo (global) CHARACTER
symmetric/Hermitian matrix sub(A) is stored, and how
sub(B) is factorized.
= 'U': Upper triangular of sub(A) is stored and sub(B) is
T H
factorized as U *U (for real subroutines) or as U *U (for
complex subroutines).
= 'L': Lower triangular of sub(A) is stored and sub(B) is
T H
factorized as L*L (for real subroutines) or as L*L (for
complex subroutines)
2306
n (global) INTEGER.
The order of the matrices sub(A) and sub(B). n ≥ 0.
a (local)
REAL for pssygs2
DOUBLE PRECISION for pdsygs2
COMPLEX for pchegs2
COMPLEX*16 for pzhegs2.
B (local)
REAL for pssygs2
DOUBLE PRECISION for pdsygs2
COMPLEX for pchegs2
COMPLEX*16 for pzhegs2.
DIMENSION(lld_b, LOCc(jb+n-1)).
triangular factor from the Cholesky factorization of sub(B)
as returned by p?potrf.
ib, jb (global) INTEGER.
The row and column indices in the global array B indicating
the first row and the first column of the sub(B), respectively.
2307

Output Parameters
a (local)
On exit, if info = 0, the transformed matrix is stored in
the same format as sub(A).
info INTEGER.
an illegal value,
then info = - (i*100),
then info = -i.
p?sytd2/p?hetd2
Reduces a symmetric/Hermitian matrix to real
symmetric tridiagonal form by an
orthogonal/unitary similarity transformation (local
Syntax
call pssytd2(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
call pdsytd2(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
call pchetd2(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
call pzhetd2(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
Description
The routine p?sytd2/p?hetd2 reduces a real symmetric/complex Hermitian matrix sub(A) to
symmetric/Hermitian tridiagonal form T by an orthogonal/unitary similarity transformation:
Q'*sub(A)*Q = T, where sub(A) = A(ia:ia+n-1, ja:ja+n-1).
Input Parameters
2308
symmetric/Hermitian matrix sub(A) is stored:
n (global) INTEGER.
is, the order of the distributed submatrix sub(A). n ≥ 0.
a (local)
REAL for pssytd2
DOUBLE PRECISION for pdsytd2
COMPLEX for pchetd2
COMPLEX*16 for pzhetd2.
work (local)
REAL for pssytd2
COMPLEX for pchetd2
DIMENSION lwork.
2309
Output Parameters
of sub(A) are overwritten by the corresponding elements of
superdiagonal, with the array tau, represent the
reflectors;
if uplo = 'L', the diagonal and first subdiagonal of A are
the array tau, represent the orthogonal/unitary matrix Q
as a product of elementary reflectors. See the Application
Notes below.
d (local)
REAL for pssytd2/pchetd2
DOUBLE PRECISION for pdsytd2/pzhetd2.
Array, DIMENSION(LOCc(ja+n-1)). The diagonal elements
d(i) = a(i,i); d is tied to the distributed matrix A.
e (local)
REAL for pssytd2/pchetd2
DOUBLE PRECISION for pdsytd2/pzhetd2.
Array, DIMENSION(LOCc(ja+n-1)),
if uplo = 'U', LOCc(ja+n-2) otherwise.
e(i) = a(i,i+1) if uplo = 'U',
e(i) = a(i+1,i) if uplo = 'L'.
tau (local)
REAL for pssytd2
COMPLEX for pchetd2
Array, DIMENSION(LOCc(ja+n-1)).
The scalar factors of the elementary reflectors. tau is tied
2310
work(1) On exit, work(1) returns the minimal and optimal value of
lwork.
The dimension of the workspace array work.
lwork is local input and must be at least lwork ≥ 3n.
an illegal value,
then info = -(i*100),
then info = -i.
Application Notes
Q = H(n-1)*...*H(2)*H(1)
where tau is a real/complex scalar, and v is a real/complex vector with v(i+1:n) = 0 and
v(i) = 1; v(1:i-1) is stored on exit in A(ia:ia+i-2, ja+i), and tau in TAU(ja+i-1).
Q = H(1)*H(2)*...*H(n-1).
H(i) = I - tau*v*v' ,
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i) = 0 and v(i+1)
= 1; v(i+2:n) is stored on exit in A(ia+i+1:ia+n-1, ja+i-1), and tau in TAU(ja+i-1).
The contents of sub (A) on exit are illustrated by the following examples with n = 5:
2311
where d and e denotes diagonal and off-diagonal elements of T, and vi denotes an element of
NOTE. The distributed submatrix sub(A) must verify some alignment properties, namely
the following expression should be true:
( mb_a.eq.nb_a .AND. iroffa.eq.icoffa ) with iroffa = mod(ia - 1, mb_a)
and icoffa = mod(ja -1, nb_a).
p?trti2
Computes the inverse of a triangular matrix (local
Syntax
call pstrti2(uplo, diag, n, a, ia, ja, desca, info)
call pdtrti2(uplo, diag, n, a, ia, ja, desca, info)
call pctrti2(uplo, diag, n, a, ia, ja, desca, info)
call pztrti2(uplo, diag, n, a, ia, ja, desca, info)
Description
The routine computes the inverse of a real/complex upper or lower triangular block matrix sub
(A) = A(ia:ia+n-1, ja:ja+n-1).
This matrix should be contained in one and only one process memory space (local operation).
2312
Input Parameters
Specifies whether the matrix sub (A) is upper or lower
triangular.
= 'U': sub (A) is upper triangular
= 'L': sub (A) is lower triangular.
diag (global) CHARACTER*1.
= 'N': sub (A) is non-unit triangular
= 'U': sub (A) is unit triangular.
n (global) INTEGER.
The number of rows and columns to be operated on, i.e.,
the order of the distributed submatrix sub(A). n ≥ 0.
a (local)
REAL for pstrti2
DOUBLE PRECISION for pdtrti2
COMPLEX for pctrti2
COMPLEX*16 for pztrti2.
triangular matrix sub(A).
the matrix sub(A) contains the upper triangular part of the
matrix, and the strictly lower triangular part of sub(A) is not
referenced.
the matrix sub(A) contains the lower triangular part of the
matrix, and the strictly upper triangular part of sub(A) is
not referenced. If diag = 'U', the diagonal elements of
sub(A) are not referenced either and are assumed to be 1.
2313
Output Parameters
a On exit, the (triangular) inverse of the original matrix, in
the same storage format.
info INTEGER.
an illegal value,
then info = - (i*100),
then info = -i.
?lamsh
Sends multiple shifts through a small (single node)
matrix to maximize the number of bulges that can
be sent through.
Syntax
call slamsh(s, lds, nbulge, jblk, h, ldh, n, ulp)
call dlamsh(s, lds, nbulge, jblk, h, ldh, n, ulp)
Description
The routine sends multiple shifts through a small (single node) matrix to see how small
consecutive subdiagonal elements are modified by subsequent shifts in an effort to maximize
the number of bulges that can be sent through. The subroutine should only be called when
there are multiple shifts/bulges (nbulge > 1) and the first shift is starting in the middle of
an unreduced Hessenberg matrix because of two or more small consecutive subdiagonal
elements.
Input Parameters
s (local)
INTEGER. REAL for slamsh
DOUBLE PRECISION for dlamsh
Array, DIMENSION (lds,*).
2314
On entry, the matrix of shifts. Only the 2x2 diagonal of s is
referenced. It is assumed that s has jblk double shifts (size
2).
lds (local) INTEGER.
On entry, the leading dimension of S; unchanged on exit.
1<nbulge ≤ jblk ≤ lds/2.
nbulge (local) INTEGER.
On entry, the number of bulges to send through h (>1).
nbulge should be less than the maximum determined
(jblk). 1<nbulge ≤ jblk ≤ lds/2.
jblk (local) INTEGER.
On entry, the leading dimension of S; unchanged on exit.
h (local) INTEGER.
REAL for slamsh
Array, DIMENSION (lds, n).
On entry, the local matrix to apply the shifts on.
h should be aligned so that the starting row is 2.
ldh (local)
INTEGER.
On entry, the leading dimension of H; unchanged on exit.
n (local) INTEGER.
On entry, the size of H. If all the bulges are expected to go
through, n should be at least 4nbulge+2. Otherwise, nbulge
may be reduced by this routine.
ulp (local)
REAL for slamsh
On entry, machine precision. Unchanged on exit.
Output Parameters
s On exit, the data is rearranged in the best order for applying.
nbulge On exit, the maximum number of bulges that can be sent
through.
h On exit, the data is destroyed.
2315
?laref
Applies Householder reflectors to matrices on either
their rows or columns.
Syntax
call slaref(type, a, lda, wantz, z, ldz, block, irow1, icol1, istart, istop,
itmp1, itmp2, liloz, lihiz, vecs, v2, v3, t1, t2, t3)
call dlaref(type, a, lda, wantz, z, ldz, block, irow1, icol1, istart, istop,
itmp1, itmp2, liloz, lihiz, vecs, v2, v3, t1, t2, t3)
Description
The routine applies one or several Householder reflectors of size 3 to one or two matrices (if
column is specified) on either their rows or columns.
Input Parameters
type (global) CHRACTER*1.
If type = 'R', apply reflectors to the rows of the matrix
(apply from left). Otherwise, apply reflectors to the columns
of the matrix. Unchanged on exit.
a (global) REAL for slaref
DOUBLE PRECISION for dlaref
On entry, the matrix to receive the reflections.
lda (local) INTEGER.
On entry, the leading dimension of A; unchanged on exit.
wantz (global) LOGICAL.
If wantz = .TRUE., apply any column reflections to Z as
well.
If wantz = .FALSE., do no additional work on Z.
z (global) REAL for slaref
DOUBLE PRECISION for dlaref
On entry, the second matrix to receive column reflections.
ldz (local) INTEGER.
2316
On entry, the leading dimension of Z; unchanged on exit.
block (global). LOGICAL.
= .TRUE. : apply several reflectors at once and read their
data from the vecs array;
= .FALSE. : apply the single reflector given by v2, v3, t1,
t2, and t3.
irow1 (local) INTEGER.
On entry, the local row element of the matrix A.
icol1 (local) INTEGER.
On entry, the local column element of the matrix A.
istart (global) INTEGER.
Specifies the "number" of the first reflector.
istart is used as an index into vecs if block is set. istart
is ignored if block is .FALSE. .
istop (global) INTEGER.
Specifies the "number" of the last reflector.
istop is used as an index into vecs if block is set. istop
is ignored if block is .FALSE. .
itmp1 (local) INTEGER.
Starting range into A. For rows, this is the local first column.
For columns, this is the local first row.
itmp2 (local) INTEGER.
Ending range into A. For rows, this is the local last column.
For columns, this is the local last row.
liloz, lihiz (local). INTEGER.
Serve the same purpose as itmp1, itmp2 but for Z when
wantz is set.
vecs (global)
REAL for slaref
DOUBLE PRECISION for dlaref.
Array of size 3 *n (matrix size). This array holds the size 3
reflectors one after another and is only accessed when block
is .TRUE.
v2,v3,t1,t2,t3 (global). INTEGER.
REAL for slaref
DOUBLE PRECISION for dlaref.
2317
These parameters hold information on a single size 3

Householder reflector and are read when block is .FALSE.,
and overwritten when block is .TRUE..
Output Parameters
a On exit, the updated matrix.
z Changed only if wantz is set. If wantz is .FALSE. , z is not
referenced.
irow1 Undefined.
icol1 Undefined.
v2,v3,t1,t2,t3 These parameters are read when block is .FALSE., and
overwritten when block is .TRUE..
?lasorte
Sorts eigenpairs by real and complex data types.
Syntax
call slasorte(s, lds, j, out, info)
call dlasorte(s, lds, j, out, info)
Description
The routine sorts eigenpairs so that real eigenpairs are together and complex eigenpairs are
together. This helps to employ 2x2 shifts easily since every second subdiagonal is guaranteed
to be zero. This routine does no parallel work and makes no calls.
Input Parameters
s (local) INTEGER.
REAL for slasorte
DOUBLE PRECISION for dlasorte
Array, DIMENSION (lds).
On entry, a matrix already in Schur form.
lds (local) INTEGER.
2318
On entry, the leading dimension of the array s; unchanged
on exit.
j (local) INTEGER.
On entry, the order of the matrix S; unchanged on exit.
out (local) INTEGER.
REAL for slasorte
DOUBLE PRECISION for dlasorte
Array, DIMENSION (2*j). The work buffer required by the
routine.
Set, if the input matrix had an odd number of real
eigenvalues and things could not be paired or if the input
matrix S was not originally in Schur form. 0 indicates
successful completion.
Output Parameters
s On exit, the diagonal blocks of S have been rewritten to pair
the eigenvalues. The resulting matrix is no longer similar
to the input.
out Work buffer.
?lasrt2
Sorts numbers in increasing or decreasing order.
Syntax
call slasrt2(id, n, d, key, info)
call dlasrt2(id, n, d, key, info)
Description
The routine is modified LAPACK routine ?lasrt, which sorts the numbers in d in increasing
order (if id = 'I') or in decreasing order (if id = 'D' ). It uses Quick Sort, reverting to
32
Insertion Sort on arrays of size ≤ 20. Dimension of STACK limits n to about 2 .
2319
Input Parameters
id CHARACTER*1.
= 'I': sort d in increasing order;
= 'D': sort d in decreasing order.
n INTEGER. The length of the array d.
d REAL for slasrt2
DOUBLE PRECISION for dlasrt2.
On entry, the array to be sorted.
key INTEGER.
On entry, key contains a key to each of the entries in d().
Typically, key(i) = i for all i .
Output Parameters
D On exit, d has been sorted into increasing order
(d(1) ≤ ... ≤ d(n) ) or into decreasing order
(d(1) ≥ ... ≥ d(n) ), depending on id.
info INTEGER.
key On exit, key is permuted in exactly the same manner as d()
was permuted from input to output. Therefore, if key(i)
= i for all i upon input, then d_out(i) = d_in(key(i)).
2320
?stein2
Computes the eigenvectors corresponding to
specified eigenvalues of a real symmetric
tridiagonal matrix, using inverse iteration.
Syntax
call sstein2(n, d, e, m, w, iblock, isplit, orfac, z, ldz, work, iwork, ifail,
info)
call dstein2(n, d, e, m, w, iblock, isplit, orfac, z, ldz, work, iwork, ifail,
info)
Description
The routine is a modified LAPACK routine ?stein. It computes the eigenvectors of a real
symmetric tridiagonal matrix T corresponding to specified eigenvalues, using inverse iteration.
The maximum number of iterations allowed for each eigenvector is specified by an internal
parameter maxits (currently set to 5).
Input Parameters
m INTEGER. The number of eigenvectors to be found (0 ≤ m
≤ n).
d, e, w REAL for single-precision flavors
Arrays: d(*), DIMENSION (n). The n diagonal elements of
the tridiagonal matrix T.
e(*), DIMENSION (n).
The (n-1) subdiagonal elements of the tridiagonal matrix
T, in elements 1 to n-1. e(n) need not be set.
w(*), DIMENSION (n).
2321
The first m elements of w contain the eigenvalues for which

eigenvectors are to be computed. The eigenvalues should
be grouped by split-off block and ordered from smallest to
largest within the block. (The output array w from ?stebz
with ORDER = 'B' is expected here).
The dimension of w must be at least max(1, n).
iblock INTEGER.
The submatrix indices associated with the corresponding
eigenvalues in w ;
iblock(i) = 1, if eigenvalue w(i) belongs to the first
submatrix from the top,
iblock(i) = 2, if eigenvalue w(i) belongs to the second
submatrix, etc. (The output array iblock from ?stebz is
expected here).
isplit INTEGER.
The splitting points, at which T breaks up into submatrices.
The first submatrix consists of rows/columns 1 to isplit(1),
the second submatrix consists of rows/columns
isplit(1)+1 through isplit( 2 ), etc. (The output array
isplit from ?stebz is expected here).
orfac REAL for single-precision flavors
orfac specifies which eigenvectors should be orthogonalized.
orfac*||T|| of each other are to be orthogonalized.
≥ max(1, n).
work REAL for single-precision flavors
iwork INTEGER. Workspace array, DIMENSION (n).
Output Parameters
z REAL for sstein2
2322
DOUBLE PRECISION for dstein2
Array, DIMENSION (ldz, m).
The computed eigenvectors. The eigenvector associated
with the eigenvalue w(i) is stored in the i-th column of z.
Any vector that fails to converge is set to its current iterate
after maxits iterations.
ifail INTEGER.
On normal exit, all elements of ifail are zero. If one or
more eigenvectors fail to converge after maxits iterations,
then their indices are stored in the array ifail.
info INTEGER.
info = 0, the exit is successful.
info < 0: if info = -i, the i-th had an illegal value.
info > 0: if info = i, then i eigenvectors failed to
converge in maxits iterations. Their indices are stored in
the array ifail.
?dbtf2
matrix with no pivoting (local unblocked algorithm).
Syntax
call sdbtf2(m, n, kl, ku, ab, ldab, info)
call ddbtf2(m, n, kl, ku, ab, ldab, info)
call cdbtf2(m, n, kl, ku, ab, ldab, info)
call zdbtf2(m, n, kl, ku, ab, ldab, info)
Description
The routine computes an LU factorization of a general real/complex m-by-n band matrix A without
using partial pivoting with row interchanges.
This is the unblocked version of the algorithm, calling BLAS Routines and Functions.
2323
Input Parameters
m INTEGER. The number of rows of the matrix A(m ≥ 0).
n INTEGER. The number of columns in A(n ≥ 0).
A(kl ≥ 0).
of A(ku ≥ 0).
ab REAL for sdbtf2
DOUBLE PRECISION for ddbtf2
COMPLEX for cdbtf2
COMPLEX*16 for zdbtf2.
The matrix A in band storage, in rows kl+1 to 2kl+ku+1;
rows 1 to kl of the array need not be set. The j-th column
of A is stored in the j-th column of the array ab as follows:
ab(kl+ku+1+i-j,j) = A(i,j) for max(1,j-ku) ≤ i
≤ min(m,j+kl).
Output Parameters
ab On exit, details of the factorization: U is stored as an upper
triangular band matrix with kl+ku superdiagonals in rows
1 to kl+ku+1, and the multipliers used during the
factorization are stored in rows kl+ku+2 to 2*kl+ku+1.
See the Application Notes below for further details.
info INTEGER.
< 0: if info = - i, the i-th argument had an illegal value,
> 0: if info = + i, u(i,i) is 0. The factorization has
been completed, but the factor U is exactly singular. Division
by 0 will occur if you use the factor U for solving a system
2324
Application Notes
The band storage scheme is illustrated by the following example, when m = n = 6, kl = 2,
ku = 1:
The routine does not use array elements marked *; elements marked + need not be set on
entry, but the routine requires them to store elements of U, because of fill-in resulting from
the row interchanges.
?dbtrf
matrix with no pivoting (local blocked algorithm).
Syntax
call sdbtrf(m, n, kl, ku, ab, ldab, info)
call ddbtrf(m, n, kl, ku, ab, ldab, info)
call cdbtrf(m, n, kl, ku, ab, ldab, info)
call zdbtrf(m, n, kl, ku, ab, ldab, info)
Description
This routine computes an LU factorization of a real m-by-n band matrix A without using partial
pivoting or row interchanges.
This is the blocked version of the algorithm, calling BLAS Routines and Functions.
2325
Input Parameters
n INTEGER. The number of columns in A(n ≥ 0).
A(kl ≥ 0).
of A(ku ≥ 0).
ab REAL for sdbtrf
DOUBLE PRECISION for ddbtrf
COMPLEX for cdbtrf
COMPLEX*16 for zdbtrf.
The matrix A in band storage, in rows kl+1 to 2kl+ku+1;
rows 1 to kl of the array need not be set. The j-th column
of A is stored in the j-th column of the array ab as follows:
ab(kl+ku+1+i-j,j) = A(i,j) for max(1,j-ku) ≤ i ≤
min(m,j+kl).
Output Parameters
ab On exit, details of the factorization: U is stored as an upper
triangular band matrix with kl+ku superdiagonals in rows
1 to kl+ku+1, and the multipliers used during the
factorization are stored in rows kl+ku+2 to 2*kl+ku+1. See
the Application Notes below for further details.
info INTEGER.
> 0: if info = + i, u(i,i) is 0. The factorization has
been completed, but the factor U is exactly singular. Division
by 0 will occur if you use the factor U for solving a system
2326
Application Notes
The band storage scheme is illustrated by the following example, when m = n = 6, kl = 2,
ku = 1:
The routine does not use array elements marked *.
?dttrf
Computes an LU factorization of a general
tridiagonal matrix with no pivoting (local blocked
algorithm).
Syntax
call sdttrf(n, dl, d, du, info)
call ddttrf(n, dl, d, du, info)
call cdttrf(n, dl, d, du, info)
call zdttrf(n, dl, d, du, info)
Description
The routine computes an LU factorization of a real or complex tridiagonal matrix A using
elimination without partial pivoting.
The factorization has the form A = L*U, where L is a product of unit lower bidiagonal matrices
and U is upper triangular with nonzeros only in the main diagonal and first superdiagonal.
2327
Input Parameters
n INTEGER. The order of the matrix A(n ≥ 0).
dl, d, du REAL for sdttrf
DOUBLE PRECISION for ddttrf
COMPLEX for cdttrf
COMPLEX*16 for zdttrf.
Arrays containing elements of A.
The array dl of DIMENSION(n - 1) contains the
sub-diagonal elements of A.
The array d of DIMENSION n contains the diagonal elements
of A.
The array du of DIMENSION(n - 1) contains the
super-diagonal elements of A.
Output Parameters
dl Overwritten by the (n-1) multipliers that define the matrix
L from the LU factorization of A.
d Overwritten by the n diagonal elements of the upper
du Overwritten by the (n-1) elements of the first
info INTEGER.
> 0: if info = i, u(i,i) is exactly 0. The factorization
has been completed, but the factor U is exactly singular.
Division by 0 will occur if you use the factor U for solving a
system of linear equations.
2328
?dttrsv
Solves a general tridiagonal system of linear
equations using the LU factorization computed by
?dttrf.
Syntax
call sdttrsv(uplo, trans, n, nrhs, dl, d, du, b, ldb, info)
call ddttrsv(uplo, trans, n, nrhs, dl, d, du, b, ldb, info)
call cdttrsv(uplo, trans, n, nrhs, dl, d, du, b, ldb, info)
call zdttrsv(uplo, trans, n, nrhs, dl, d, du, b, ldb, info)
Description
The routine solves one of the following systems of linear equations:
L*X = B, LT*X = B, or LH*X = B,
U*X = B, UT*X = B, or UH*X = B
with factors of the tridiagonal matrix A from the LU factorization computed by ?dttrf.
Input Parameters
uplo CHARACTER*1.
Specifies whether to solve with L or U.
trans CHARACTER. Must be 'N' or 'T' or 'C'.
If trans = 'N', then A*X=B is solved for X (no transpose).
If trans = 'T', then AT*X = B is solved for X (transpose).
If trans = 'C', then AH*X = B is solved for X (conjugate
transpose).
n INTEGER. The order of the matrix A(n ≥ 0).
nrhs INTEGER. The number of right-hand sides, that is, the
number of columns in the matrix B(nrhs ≥ 0).
dl,d,du,b REAL for sdttrsv
DOUBLE PRECISION for ddttrsv
2329
COMPLEX for cdttrsv

COMPLEX*16 for zdttrsv.
Arrays of DIMENSIONs: dl(n -1 ), d(n), du(n -1 ),
b(ldb,nrhs).
The array dl contains the (n - 1) multipliers that define
the matrix L from the LU factorization of A.
The array d contains n diagonal elements of the upper
The array du contains the (n - 1) elements of the first
On entry, the array b contains the right-hand side matrix
B.
ldb INTEGER. The leading dimension of the array b; ldb ≥
max(1, n).
Output Parameters
?pttrsv
Solves a symmetric (Hermitian) positive-definite
tridiagonal system of linear equations, using the
H
L*D*L factorization computed by ?pttrf.
Syntax
call spttrsv(trans, n, nrhs, d, e, b, ldb, info)
call dpttrsv(trans, n, nrhs, d, e, b, ldb, info)
call cpttrsv(uplo, trans, n, nrhs, d, e, b, ldb, info)
call zpttrsv(uplo, trans, n, nrhs, d, e, b, ldb, info)
Description
The routine solves one of the triangular systems:
2330
LT*X = B, or L*X = B for real flavors,
or
L*X = B, or LH*X = B,
U*X = B, or UH*X = B for complex flavors,
where L (or U for complex flavors) is the Cholesky factor of a Hermitian positive-definite
tridiagonal matrix A such that
A = L*D*LH (computed by spttrf/dpttrf)

or
A = UH*D*U or A = L*D*LH (computed by cpttrf/zpttrf).
Input Parameters
Specifies whether the superdiagonal or the subdiagonal of
the tridiagonal matrix A is stored and the form of the
factorization:
If uplo = 'U', e is the superdiagonal of U, and A = UH*D*U
or A = L*D*LH;
if uplo = 'L', e is the subdiagonal of L, and A = L*D*LH.
The two forms are equivalent, if A is real.
trans CHARACTER.
for real flavors:
if trans = 'N': L*X = B (no transpose)
if trans = 'T': LT*X = B (transpose)
for complex flavors:
if trans = 'N': U*X = B or L*X = B (no transpose)
if trans = 'C': UH*X = B or LH*X = B (conjugate
transpose).
n INTEGER. The order of the tridiagonal matrix A. n ≥ 0.
nrhs INTEGER. The number of right hand sides, that is, the
number of columns of the matrix B. nrhs ≥ 0.
2331
d REAL array, DIMENSION (n). The n diagonal elements of the

diagonal matrix D from the factorization computed by ?pt-
trf.
e COMPLEX array, DIMENSION(n-1). The (n-1) off-diagonal
elements of the unit bidiagonal factor U or L from the
factorization computed by ?pttrf. See uplo.
b COMPLEX array, DIMENSION (ldb, nrhs).
On entry, the right hand side matrix B.
ldb INTEGER.
The leading dimension of the array b.
ldb ≥ max(1, n).
Output Parameters
b On exit, the solution matrix X.
info INTEGER.
?steqr2
eigenvectors of a symmetric tridiagonal matrix
using the implicit QL or QR method.
Syntax
call ssteqr2(compz, n, d, e, z, ldz, nr, work, info)
call dsteqr2(compz, n, d, e, z, ldz, nr, work, info)
Description
The routine is a modified version of LAPACK routine ?steqr. The routine ?steqr2 computes
all eigenvalues and, optionally, eigenvectors of a symmetric tridiagonal matrix using the implicit
QL or QR method. ?steqr2 is modified from ?steqr to allow each ScaLAPACK process running
?steqr2 to perform updates on a distributed matrix Q. Proper usage of ?steqr2 can be gleaned
from examination of ScaLAPACK routine p?syev.
2332
Input Parameters
compz CHARACTER*1. Must be 'N' or 'I'.
If compz = 'N', the routine computes eigenvalues only. If
compz = 'I', the routine computes the eigenvalues and
z must be initialized to the identity matrix by p?laset or
?laset prior to entering this subroutine.
n INTEGER. The order of the matrix T(n ≥ 0).
Arrays: d contains the diagonal elements of T. The dimension
of d must be at least max(1, n).
e contains the (n-1) subdiagonal elements of T. The
dimension of e must be at least max(1, n-1).
work is a workspace array. The dimension of work is max(1,
2*n-2). If compz = 'N', then work is not referenced.
z (local)
REAL for ssteqr2
DOUBLE PRECISION for dsteqr2
Array, global DIMENSION (n, n), local DIMENSION (ldz, nr).
If compz = 'V', then z contains the orthogonal matrix used
in the reduction to tridiagonal form.
ldz INTEGER. The leading dimension of the array z. Constraints:
ldz ≥ 1,
ldz ≥ max(1, n), if eigenvectors are desired.
nr INTEGER. nr = max(1, numroc(n, nb, myprow, 0,
nprocs)).
If compz = 'N', then nr is not referenced.
Output Parameters
d REAL array, DIMENSION (n), for ssteqr2.
DOUBLE PRECISION array, DIMENSION (n), for dsteqr2.
On exit, the eigenvalues in ascending order, if info = 0.
See also info.
2333
e REAL array, DIMENSION (n-1), for ssteqr2.

DOUBLE PRECISION array, DIMENSION (n-1), for dsteqr2.
On exit, e has been destroyed.
z (local)
REAL for ssteqr2
DOUBLE PRECISION for dsteqr2
Array, global DIMENSION (n, n), local DIMENSION (ldz, nr).
On exit, if info = 0, then,
if compz = 'V', z contains the orthonormal eigenvectors
of the original symmetric matrix, and if compz = 'I', z
contains the orthonormal eigenvectors of the symmetric
tridiagonal matrix. If compz = 'N', then z is not referenced.
info INTEGER.
info = 0, the exit is successful.
info < 0: if info = -i, the i-th had an illegal value.
info > 0: the algorithm has failed to find all the
eigenvalues in a total of 30n iterations;
if info = i, then i elements of e have not converged to
zero; on exit, d and e contain the elements of a symmetric
tridiagonal matrix, which is orthogonally similar to the
original matrix.
Utility Functions and Routines

This section describes ScaLAPACK utility functions and routines. Summary information about
these routines is given in the following table:
Table 7-2 ScaLAPACK Utility Functions and Routines
Name
p?labad s,d Returns the square root of the underflow and overflow
thresholds if the exponent-range is very large.
p?lachkieee s,d Performs a simple check for the features of the IEEE standard.
(C interface function).
p?lamch s,d Determines machine parameters for floating-point arithmetic.
2334
Name
p?lasnbt s,d Computes the position of the sign bit of a floating-point

number. (C interface function).
pxerbla Error handling routine called by ScaLAPACK routines.
p?labad
Returns the square root of the underflow and
overflow thresholds if the exponent-range is very
large.
Syntax
call pslabad(ictxt, small, large)
call pdlabad(ictxt, small, large)
Description
The routine takes as input the values computed by p?lamch for underflow and overflow, and
returns the square root of each of these values if the log of large is sufficiently large. This
subroutine is intended to identify machines with a large exponent range, such as the Crays,
and redefine the underflow and overflow limits to be the square roots of the values computed
by p?lamch. This subroutine is needed because p?lamch does not compensate for poor
arithmetic in the upper half of the exponent range, as is found on a Cray.
In addition, this routine performs a global minimization and maximization on these values, to
support heterogeneous computing networks.
Input Parameters
ictxt (global) INTEGER.
The BLACS context handle in which the computation takes
place.
small (local).
REAL PRECISION for pslabad.
DOUBLE PRECISION for pdlabad.
On entry, the underflow threshold as computed by p?lamch.
2335
large (local).
REAL PRECISION for pslabad.
DOUBLE PRECISION for pdlabad.
On entry, the overflow threshold as computed by p?lamch.
Output Parameters
small (local).
On exit, if log10(large) is sufficiently large, the square
root of small, otherwise unchanged.
large (local).
On exit, if log10(large) is sufficiently large, the square
root of large, otherwise unchanged.
p?lachkieee
Performs a simple check for the features of the
IEEE standard. (C interface function).
Syntax
void pslachkieee(int *isieee, float *rmax, float *rmin);
void pdlachkieee(int *isieee, float *rmax, float *rmin);
Description
The routine performs a simple check to make sure that the features of the IEEE standard are
implemented. In some implementations, p?lachkieee may not return.
Note that all arguments are call-by-reference so that this routine can be directly called from
Fortran code.
This is a ScaLAPACK internal subroutine and arguments are not checked for unreasonable
values.
Input Parameters
rmax (local).
REAL for pslachkieee
DOUBLE PRECISION for pdlachkieee
2336
The overflow threshold(= ?lamch ('O')).
rmin (local).
REAL for pslachkieee
DOUBLE PRECISION for pdlachkieee
The underflow threshold(= ?lamch ('U')).
Output Parameters
isieee (local). INTEGER.
On exit, isieee = 1 implies that all the features of the
IEEE standard that we rely on are implemented. On exit,
isieee = 0 implies that some the features of the IEEE
standard that we rely on are missing.
p?lamch
Determines machine parameters for floating-point
arithmetic.
Syntax
val = pslamch(ictxt, cmach)
val = pdlamch(ictxt, cmach)
Description
The routine determines single precision machine parameters.
Input Parameters
ictxt (global). INTEGER.The BLACS context handle in which the
computation takes place.
cmach (global) CHARACTER*1.
Specifies the value to be returned by p?lamch:
= 'E' or 'e', p?lamch := eps
= 'S' or 's' , p?lamch := sfmin
= 'B' or 'b', p?lamch := base
= 'P' or 'p', p?lamch := eps*base
= 'N' or 'n', p?lamch := t
2337
= 'R' or 'r', p?lamch := rnd

= 'M' or 'm', p?lamch := emin
= 'U' or 'u', p?lamch := rmin
= 'L' or 'l', p?lamch := emax
= 'O' or 'o', p?lamch := rmax,
where
eps = relative machine precision
sfmin = safe minimum, such that 1/sfmin does not overflow
base = base of the machine
prec = eps*base
t = number of (base) digits in the mantissa
rnd = 1.0 when rounding occurs in addition, 0.0 otherwise
emin = minimum exponent before (gradual) underflow
(emin-1)
rmin = underflow threshold - base
emax = largest exponent before overflow
rmax = overflow threshold - (baseemax)*(1-eps)
Output Parameters
val Value returned by the routine.
p?lasnbt
Computes the position of the sign bit of a
floating-point number. (C interface function).
Syntax
void pslasnbt(int *ieflag);
void pdlasnbt(int *ieflag);
Description
The routine finds the position of the signbit of a single/double precision floating point number.
This routine assumes IEEE arithmetic, and hence, tests only the 32-nd bit (for single precision)
or 32-nd and 64-th bits (for double precision) as a possibility for the signbit. sizeof(int) is
assumed equal to 4 bytes.
2338
If a compile time flag (NO_IEEE) indicates that the machine does not have IEEE arithmetic,
ieflag = 0 is returned.
Output Parameters
ieflag INTEGER.
This flag indicates the position of the signbit of any
single/double precision floating point number.
ieflag = 0, if the compile time flag NO_IEEE indicates
that the machine does not have IEEE arithmetic, or if
sizeof(int) is different from 4 bytes.
ieflag = 1 indicates that the signbit is the 32-nd bit for
a single precision routine.
In the case of a double precision routine:
ieflag = 1 indicates that the signbit is the 32-nd bit (Big
Endian).
ieflag = 2 indicates that the signbit is the 64-th bit (Little
Endian).
pxerbla
Error handling routine called by ScaLAPACK
routines.
Syntax
call pxerbla(ictxt, srname, info)
Description
This routine is an error handler for the ScaLAPACK routines. It is called by a ScaLAPACK routine
if an input parameter has an invalid value. A message is printed. Program execution is not
terminated. For the ScaLAPACK driver and computational routines, a RETURN statement is
issued following the call to pxerbla.
Control returns to the higher-level calling routine, and it is left to the user to determine how
the program should proceed. However, in the specialized low-level ScaLAPACK routines (auxiliary
routines that are Level 2 equivalents of computational routines), the call to pxerbla() is
immediately followed by a call to BLACS_ABORT() to terminate program execution since recovery
from an error at this level in the computation is not possible.
2339
It is always good practice to check for a nonzero value of info on return from a ScaLAPACK
routine. Installers may consider modifying this routine in order to call system-specific
exception-handling facilities.
Input Parameters
ictxt (global) INTEGER
The BLACS context handle, indicating the global context of
the operation. The context itself is global.
srname (global) CHARACTER*6
The name of the routine which called pxerbla.
The position of the invalid parameter in the parameter list
of the calling routine.
2340
Sparse Solver Routines 8
Intel® Math Kernel Library (Intel® MKL) provides user-callable sparse solver software to solve real or
complex, symmetric, structurally symmetric or non-symmetric, positive definite, indefinite or Hermitian
sparse linear system of equations.
The terms and concepts required to understand the use of the Intel MKL sparse solver routines are
discussed in the Appendix A "Linear Solvers Basics" Appendix "Linear Solvers Basics" . If you are familiar
with linear sparse solvers and sparse matrix storage schemes, you can skip these sections and go directly
to the interface descriptions.
This chapter describes the direct sparse solver PARDISO* and the alternative interface for the direct sparse
solver referred to here as DSS interface; iterative sparse solvers (ISS) based on the reverse communication
interface (RCI); and two preconditioners based on the incomplete LU factorization technique.
PARDISO* - Parallel Direct Sparse Solver Interface

This section describes the interface to the shared-memory multiprocessing parallel direct sparse
solver known as the PARDISO* solver. The interface is Fortran, but can be called from C programs
by observing Fortran parameter passing and naming conventions used by the supported compilers
and operating systems. A discussion of the algorithms used in the PARDISO* software and more
information on the solver can be found at http://www.pardiso-project.org.
The current implementation of the PARDISO* solver additionally supports the out-of-core (OOC)
version.
The PARDISO* package is a high-performance, robust, memory efficient and easy to use software
for solving large sparse symmetric and unsymmetric linear systems of equations on shared memory
multiprocessors. The solver uses a combination of left- and right-looking Level-3 BLAS supernode
techniques [Schenk00-2]. To improve sequential and parallel sparse numerical factorization
performance, the algorithms are based on a Level-3 BLAS update and pipelining parallelism is used
with a combination of left- and right-looking supernode techniques [Schenk00, Schenk01, Schenk02,
Schenk03]. The parallel pivoting methods allow complete supernode pivoting to compromise numerical
stability and scalability during the factorization process. For sufficiently large problem sizes, numerical
experiments demonstrate that the scalability of the parallel algorithm is nearly independent of the
shared-memory multiprocessing architecture and a speedup of up to seven using eight processors
has been observed.
2341
The PARDISO* solver supports a wide range of sparse matrix types (see Figure 8-1 below) and
computes the solution of real or complex, symmetric, structurally symmetric or unsymmetric,
positive definite, indefinite or Hermitian sparse linear system of equations on shared-memory
multiprocessing architectures.
Figure 8-1 Sparse Matrices That Can Be Solved with the PARDISO* Solver
You can find a code example that uses the PARDISO interface routine to solve systems of linear
equations in PARDISO Code Examples section in the Appendix C .
pardiso
Calculates the solution of a set of sparse linear
equations with multiple right-hand sides.
Syntax
Fortran:
call pardiso (pt, maxfct, mnum, type, phase, n, a, ia, ja, perm, nrhs, iparm,
msglvl, b, x, error)
2342
C:
pardiso (pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja, perm, &nrhs,
iparm, &msglvl, b, x, &error);
(An underscore may or may not be required after “pardiso” depending on the OS and compiler
conventions for that OS).
Interface:
SUBROUTINE pardiso (pt, maxfct, mnum, mtype, phase, n, a, ia, ja, perm, nrhs,
iparm, msglvl, b, x, error)
INTEGER *4 pt (64)
INTEGER *4 maxfct, mnum, mtype, phase, n, nrhs, error, ia(*), ja(*), perm(*),
iparm(*)
REAL *8 a(*), b(n,nrhs), x(n,nrhs)
Note that the above interface is given for the 32-bit architectures. For 64-bit architectures, the
argument pt(64) must be defined as INTEGER*8 type.
Description
This routine is declared in mkl_pardiso.f77 for FORTRAN 77 interface, in mkl_pardiso.f90
for Fortran 90 interface, and in mkl_pardiso.h for C interface.
PARDISO calculates the solution of a set of sparse linear equations
A*X = B
T
with multiple right-hand sides, using a parallel LU, LDL or LL factorization, where A is an n-by-n
matrix, and X and B are n-by-nrhs matrices. PARDISO performs the following analysis steps
depending on the structure of the input matrix A.
Symmetric Matrices: The solver first computes a symmetric fill-in reducing permutation P
based on either the minimum degree algorithm [Liu85] or the nested dissection algorithm from
the METIS package [Karypis98] (included with Intel MKL), followed by the parallel left-right
looking numerical Cholesky factorization [Schenk00-2] of PAPT = LLT for symmetric
T T
positive-definite matrices, or PAP = LDL for symmetric indefinite matrices. The solver uses
diagonal pivoting, or 1x1 and 2x2 Bunch and Kaufman pivoting for symmetric indefinite matrices,
and an approximation of X is found by forward and backward substitution and iterative
refinements.
2343
The coefficient matrix is perturbed whenever numerically acceptable 1x1 and 2x2 pivots cannot
be found within the diagonal supernode block. One or two passes of iterative refinements may
be required to correct the effect of the perturbations. This restricting notion of pivoting with
iterative refinements is effective for highly indefinite symmetric systems. Furthermore the
accuracy of this method is for a large set of matrices from different applications areas as
accurate as a direct factorization method that uses complete sparse pivoting techniques
[Schenk04].
Another possibility to improve the pivoting accuracy is to use symmetric weighted matching
algorithms. These methods identify large entries in the coefficient matrix A that, if permuted
close to the diagonal, permit the factorization process to identify more acceptable pivots and
proceed with fewer pivot perturbations. The methods are based on maximum weighted matchings
and improve the quality of the factor in a complementary way to the alternative idea of using
more complete pivoting techniques.
The inertia is also computed for real symmetric indefinite matrices.
Structurally Symmetric Matrices: The solver first computes a symmetric fill-in reducing
T T
permutation P followed by the parallel numerical factorization of PAP = QLU . The solver uses
partial pivoting in the supernodes and an approximation of X is found by forward and backward
substitution and iterative refinements.
Unsymmetric Matrices: The solver first computes a non-symmetric permutation PMPS and
scaling matrices Dr and Dc with the aim to place large entries on the diagonal that enhances
reliability of the numerical factorization process [Duff99]. In the next step the solver computes
a fill-in reducing permutation P based on the matrix PMPSA + (PMPSA)T followed by the parallel
numerical factorization
QLUR = PPMPSDrADcP
with supernode pivoting matrices Q and R. When the factorization algorithm reaches a point
where it cannot factorize the supernodes with this pivoting strategy, it uses a pivoting
perturbation strategy similar to [Li99]. The magnitude of the potential pivot is tested against
a constant threshold of alpha = eps*||A2||inf , where eps is the machine precision, A2 =
P*PMPS*Dr*A*Dc*P, and ||A2||inf is the infinity norm of the scaled and permuted matrix A.
Any tiny pivots encountered during elimination are set to the sign (lII)*eps*||A2||inf, which
trades off some numerical stability for the ability to keep pivots from getting too small. Although
many failures could render the factorization well-defined but essentially useless, in practice the
diagonal elements are rarely modified for a large class of matrices. The result of this pivoting
approach is that the factorization is, in general, not exact and iterative refinement may be
needed.
2344
Direct-Iterative Preconditioning for Unsymmetric Linear Systems. The solver also allows
for a combination of direct and iterative methods [Sonn89] to accelerate the linear solution
process for transient simulation. Most of applications of sparse solvers require solutions of
systems with gradually changing values of the nonzero coefficient matrix, but the same identical
sparsity pattern. In these applications, the analysis phase of the solvers has to be performed
only once and the numerical factorizations are the important time-consuming steps during the
simulation. PARDISO uses a numerical factorization A = LU for the first system and applies the
factors L and U for the next steps in a preconditioned Krylow-Subspace iteration. If the iteration
does not converge, the solver automatically switches back to the numerical factorization. This
method can be applied to unsymmetric matrices in PARDISO and the user can select the method
using only one input parameter. For further details see the parameter description (iparm(4),
iparm(20)).
Separate Forward and Backward Substitution.
The solver execution step ( see parameter phase =33 below) can be divided into two or three
separate substitutions: forward, backward and diagonal (if possible). This separation can be
explained on the examples of solving system with different matrix types.
Real symmetric positive definite matrix A (mtype = 2) is factorized by PARDISO as A = L*LT
. In this case solution of the system A*x=b can be found as sequence of substitutions: L*y=b
(forward substitution, phase =331) and LT*x=y (backward substitution, phase =333).
Real unsymmetric matrix A (mtype = 11) is factorized by PARDISO as A = L*U . In this case
solution of the system A*x=b can be found by the following sequence: L*y=b (forward
substitution, phase =331) and U*x=y (backward substitution, phase =333).
Solving system with a real symmetric indefinite matrix A (mtype = -2) is slightly different
from above cases. PARDISO factorizes this matrix as A=LDLT, and solution of the system A*x=b
can be calculated as the following sequence of substitutions: L*y=b (forward substitution,
phase =331) s: D*v=y (diagonal substitution, phase =332) and, finally LT*x=v (backward
substitution, phase =333). Diagonal substitution makes sense only for indefinite matrices
(mtype = -2, -4, 6). For matrices of other types solution can be found as described in the
first two examples.
NOTE. Number of refinement steps (iparm(8)) must be set to zero if solution is

calculated step by step (phase = 331, 332, 333), otherwise PARDISO produces wrong
result.
2345
NOTE. Different pivoting (iparm(21)) produces different LDLT factorization. Therefore

results of forward, diagonal and backward substitutions with diagonal pivoting can differ
from results of the same steps with Bunch and Kaufman pivoting. Of course the final
results of sequential execution of forward, diagonal and backward substitution are equal
to the results of the full solving step (phase=33) regardless of the pivoting used.
The sparse data storage in PARDISO follows the scheme described in Sparse Matrix Storage
Format section with ja standing for columns, ia for rowIndex, and a for values. The algorithms
in PARDISO require column indices ja to be increasingly ordered per row and the presence of
the diagonal element per row for any symmetric or structurally symmetric matrix. The
unsymmetric matrices need no diagonal elements in the PARDISO solver.
PARDISO performs four tasks:
• analysis and symbolic factorization

• numerical factorization
• forward and backward substitution including iterative refinement
• termination to release all internal solver memory.
When an input data structure is not accessed in a call, a NULL pointer or any valid address can
be passed as a place holder for that argument.
NOTE. This routine supports the Progress Routine feature (for the numeric factorization
phase only). See Progress Function section for details.
Input Parameters
pt INTEGER*4 for 32 -bit operating systems;
INTEGER*8 for 64 -bit operating systems
Array, DIMENSION (64)
On entry, solver internal data address pointer. These
addresses are passed to the solver and all related internal
memory management is organized through this pointer
2346
NOTE. pt is an integer array with 64 entries. It is
very important that the pointer is initialized with zero
at the first call of PARDISO. After that first call you
should never modify the pointer, as a serious memory
leak can occur. The integer length should be 4-byte
on 32-bit operating systems and 8-byte on 64-bit
operating systems.
maxfct INTEGER
Maximal number of factors with identical nonzero sparsity
structure that the user would like to keep at the same time
in memory. It is possible to store several different
factorizations with the same nonzero structure at the same
time in the internal data management of the solver. In most
of the applications this value is equal to 1.
PARDISO can process several matrices with identical matrix
sparsity pattern and store the factors of these matrices at
the same time. Matrices with a different sparsity structure
can be kept in memory with different memory address
pointers pt.
mnum INTEGER
Actual matrix for the solution phase. With this scalar you
can define the matrix that you would like to factorize. The
value must be: 1 ≤ mnum ≤ maxfct.
In most applications this value is equal to 1.
mtype INTEGER
This scalar value defines the matrix type. The PARDISO
solver supports the following matrices:
= 1 real and structurally symmetric matrix
= 2 real and symmetric positive definite
matrix
= -2 real and symmetric indefinite matrix
= 3 complex and structurally symmetric
matrix
= 4 complex and Hermitian positive definite
matrix
2347
= -4 complex and Hermitian indefinite matrix

= 6 complex and symmetric matrix
= 11 real and unsymmetric matrix
= 13 complex and unsymmetric matrix
This parameter influences the pivoting method.
phase INTEGER
Controls the execution of the solver. Usually it is a two-digit
integer ij (10I + j, 1≤I ≤3, I≤j≤3 for normal execution
modes). The I digit indicates the starting phase of execution,
and j indicates the ending phase. PARDISO has the following
phases of execution:
• Phase 1: Fill-reduction analysis and symbolic factorization

• Phase 2: Numerical factorization
• Phase 3: Forward and Backward solve including iterative
refinements
This phase can be divided into two or three separate
substitutions: forward, backward, and diagonal (see
above).
• Termination and Memory Release Phase (phase≤ 0)
If a previous call to the routine has computed information

from previous phases, execution may start at any phase.
The phase parameter can have the following values:
phase Solver Execution Steps
11 Analysis
12 Analysis, numerical factorization
13 Analysis, numerical factorization, solve,
iterative refinement
22 Numerical factorization
23 Numerical factorization, solve, iterative
refinement
33 Solve, iterative refinement
331 like phase=33, but only forward substitution
2348
phase Solver Execution Steps
332 like phase=33, but only diagonal substitution
333 like phase=33, but only backward
substitution
0 Release internal memory for L and U matrix
number mnum
-1 Release all internal memory for all matrices
n INTEGER
Number of equations in the sparse linear systems of
equations A*X = B. Constraint: n > 0.
a REAL/COMPLEX
Array. Contains the non-zero elements of the coefficient
matrix A corresponding to the indices in ja. The size of a is
the same as that of ja and the coefficient matrix can be
either real or complex. The matrix must be stored in
compressed sparse row format with increasing values of ja
for each row. Refer to values array description in Sparse
Matrix Storage Format for more details.
NOTE. The non-zero elements of each row of the

matrix A must be stored in increasing order. For
symmetric or structural symmetric matrices, it is also
important that the diagonal elements are available
and stored in the matrix. If the matrix is symmetric,
the array a is only accessed in the factorization
phase, in the triangular solution and iterative
refinement phase. Unsymmetric matrices are
accessed in all phases of the solution process.
ia INTEGER
Array, dimension (n+1). For i≤n, ia(i) points to the first
column index of row i in the array ja in compressed sparse
row format. That is, ia(i) gives the index of the element
in array a that contains the first non-zero element from row
i of A. The last element ia(n+1) is taken to be equal to
2349
the number of non-zero lements in A, plus one. Refer to

rowIndex array description in Sparse Matrix Storage Format
for more details. The array ia is also accessed in all phases
of the solution process. Note that the row and columns
numbers start from 1.
ja INTEGER
Array. ja(*) contains column indices of the sparse matrix
A stored in compressed sparse row format. The indices in
each row must be sorted in increasing order. The array ja
is also accessed in all phases of the solution process. For
symmetric and structurally symmetric matrices it is assumed
that zero diagonal elements are also stored in the list of
non-zero elements in a and ja. For symmetric matrices, the
solver needs only the upper triangular part of the system
as is shown for columns array in Sparse Matrix Storage
Format.
perm INTEGER
Array, dimension (n). Holds the permutation vector of size
n. The array perm is defined as follows. Let A be the original
matrix and B = P*A*PT be the permuted matrix. Row
(column) i of A is the perm(i) row (column) of B. The
numbering of the array must start with 1 and must describe
a permutation.
On entry, you can apply your own fill-in reducing ordering
to the solver. The permutation vector perm is used by the
solver if iparm(5) = 1. The second purpose of the array
perm is to return the permutation vector calculated during
fill-in reducing ordering stage of PARDISO to the user. The
permutation vector is returned into the perm array if
iparm(5) = 2.
nrhs INTEGER
Number of right-hand sides that need to be solved for.
iparm INTEGER
Array, dimension (64). This array is used to pass various
parameters to PARDISO and to return some useful
information after execution of the solver. If iparm(1) =
0, PARDISO fills iparm(1), and iparm(4) through
2350
iparm(64)with default values and uses them. Note that
there is no default value for iparm(3) and the user should
supply this value, whether iparm(1) is 0 or 1.
Individual components of the iparm array are described
below (some of them are described in the Output Parameters
section).
iparm(1)- use default values.
If iparm(1) = 0, iparm(2) and iparm(4) through
iparm(64) are filled with default values, otherwise the user
has to supply all values in iparm from iparm(2) to
iparm(64).
iparm(2) - fill-in reducing ordering.
iparm(2) controls the fill-in reducing ordering for the input
matrix. If iparm(2) is 0, the minimum degree algorithm is
applied [Li99], if iparm(2) is 2, the solver uses the nested
dissection algorithm from the METIS package [Karypis98].
The default value of iparm(2) is 2.
iparm(3)- currently is not used.
CAUTION. User can control the parallel execution

of the solver by explicitly setting MKL_NUM_THREADS.
If less processors are available than specified, the
execution may slow down instead of speeding up. If
the variable MKL_NUM_THREADS is not defined, then
the solver uses all available processors.
There is no default value for iparm(3).

iparm(4) - preconditioned CGS.
This parameter controls preconditioned CGS [Sonn89] for
unsymmetric or structural symmetric matrices and
Conjugate-Gradients for symmetric matrices. iparm(4) has
the form
iparm(4)= 10*L+K
The K and L values have the meanings as follow.
2351
Value of K Description
0 The factorization is always computed as
required by phase.
1 CGS iteration replaces the computation of
LU. The preconditioner is LU that was
computed at a previous step (the first step
or last step with a failure) in a sequence of
solutions needed for identical sparsity
patterns.
2 CG iteration for symmetric matrices replaces
the computation of LU. The preconditioner
is LU that was computed at a previous step
(the first step or last step with a failure) in
a sequence of solutions needed for identical
sparsity patterns.
Value L:
The value L controls the stopping criterion of the
Krylow-Subspace iteration:
epsCGS = 10-L is used in the stopping criterion
||dxi|| / ||dxi|| < epsCGS
with ||dxi|| = ||inv(L*U)*ri|| and ri is the residuum

at iteration i of the preconditioned Krylow-Subspace
iteration.
Strategy: A maximum number of 150 iterations is fixed by
expecting that the iteration will converge before consuming
half the factorization time. Intermediate convergence rates
and residuum excursions are checked and can terminate
the iteration process. If phase =23, then the factorization
for a given A is automatically recomputed in these cases
where the Krylow-Subspace iteration failed, and the
corresponding direct solution is returned. Otherwise the
solution from the preconditioned Krylow-Subspace iteration
is returned. Using phase =33 results in an error message
(error =4) if the stopping criteria for the Krylow-Subspace
iteration can not be reached. More information on the failure
can be obtained from iparm(20).
2352
The default is iparm(4)=0, and other values are only
recommended for an advanced user. iparm(4) must be
greater or equal to zero.
Examples:
iparm(4) Description
31 LU-preconditioned CGS iteration with a
stopping criterion of 1.0E-3 for
unsymmetric matrices
unsymmetric matrices
symmetric matrices
iparm(5)- user permutation.
This parameter controls whether the user supplied fill-in
reducing permutation is used instead of the integrated
multiple-minimum degree or nested dissection algorithms.
Another possible use of this parameter is to control obtaining
the fill-in reducing permutation vector calculated during the
reordering stage of PARDISO.
This option may be useful for testing reordering algorithms,
adapting the code to special applications problems (for
instance, to move zero diagonal elements to the end
P*A*PT), or if the user wants to use the permutation vector
more than once for equal or similar matrices. For definition
of the permutation, see description of the perm parameter.
If parm(5)=0 (default value), then the array perm is not
used by PARDISO;
if parm(5)=1, then the user supplied fill-in reducing
permutation is used;
if parm(5)=2, then PARDISO returns the permutaion vector
into the array perm.
iparm(6)- write solution on x.
2353
If iparm(6)is 0 (which is the default), then the array x

contains the solution and the value of b is not changed. If
iparm(6) is 1, then the solver will store the solution on the
right hand side b.
Note that the array x is always used. The default value of
iparm(6) is 0.
iparm(8) - iterative refinement step.
On entry to the solve and iterative refinement step,
iparm(8)should be set to the maximum number of iterative
refinement steps that the solver will perform. The solver
will not perform more than the absolute value of
iparm(8)steps of iterative refinement and will stop the
process if a satisfactory level of accuracy of the solution in
terms of backward error has been achieved.
Note that if iparm(8)< 0, the accumulation of the residuum
is using extended precision real and complex data types.
Perturbed pivots result in iterative refinement (independent
of iparm(8)=0) and the iteration number executed is
reported on iparm(7).
The solver will automatically perform two steps of iterative
refinements when perturbed pivots have been obtained
during the numerical factorization and iparm(8) was equal
to zero.
The number of performed iterative refinement steps is
reported on iparm(8).
The default value for iparm(8) is 0.
iparm(9)
This parameter is reserved for future use. Its value must
be set to 0.
iparm(10)- pivoting perturbation.
This parameter instructs PARDISO how to handle small
pivots or zero pivots for unsymmetric matrices (mtype =11
or mtype =13) and symmetric matrices (mtype =-2, mtype
=-4, or mtype =6). For these matrices the solver uses a
complete supernode pivoting approach. When the
factorization algorithm reaches a point where it cannot
factorize the supernodes with this pivoting strategy, it uses
a pivoting perturbation strategy similar to [Li99],
[Schenk04].
2354
The magnitude of the potential pivot is tested against a
constant threshold of
alpha = eps*||A2||inf,
where eps = 10(-iparm(10)), A2 = P*PMPS*Dr*A*Dc*P,

and ||A2||inf is the infinity norm of the scaled and
permuted matrix A. Any tiny pivots encountered during
elimination are set to the sign (lII)*eps*||A2||inf - this
trades off some numerical stability for the ability to keep
pivots from getting too small. Small pivots are therefore
perturbed with eps = 10(-iparm(10)).
The default value of iparm(10) is 13 and therefore eps =
1.0E-13 for unsymmetric matrices (mtype =11 or mtype
=13).
The default value of iparm(10) is 8, and therefore eps =
1.0E-8 for symmetric indefinite matrices (mtype =-2,
mtype =-4, or mtype =6).
iparm(11)- scaling vectors.
PARDISO uses a maximum weight matching algorithm to
permute large elements on the diagonal and to scale the
matrix so that the diagonal elements are equal to 1 and the
absolute values of the off-diagonal entries are less or equal
to 1. This scaling method is applied only to unsymmetric
matrices (mtype =11 or mtype =13). The scaling can also
be used for symmetric indefinite matrices (mtype =-2,
mtype =-4, or mtype =6) when the symmetric weighted
matchings are applied (iparm(13)= 1).
Use iparm(11) = 1 (scaling) and iparm(13) = 1
(matching) for highly indefinite symmetric matrices, for
example, from interior point optimizations or saddle point
problems. Note that in the analysis phase (phase=11) you
must provide the numerical values of the matrix A in case
of scaling and symmetric weighted matching.
The default value of iparm(11) is 1 for unsymmetric
matrices (mtype =11 or mtype =13). The default value of
iparm(11) is 0 for symmetric indefinite matrices (mtype
=-2, mtype =-4, or mtype =6).
iparm(12)
2355
This parameter is reserved for future use. Its value must

be set to 0.
iparm(13) - improved accuracy using (non-)symmetric
weighted matchings.
PARDISO can use a maximum weighted matching algorithm
to permute large elements close the diagonal. This strategy
adds an additional level of reliability to our factorization
methods and can be seen as a complement to the alternative
idea of using more complete pivoting techniques during the
numerical factorization.
It is recommended to use iparm(11)=1 (scalings) and
iparm(13)=1 (matchings) for highly indefinite symmetric
matrices, for example from interior point optimizations or
saddle point problems. It is also very important to note that
in the analysis phase (phase =11) you must provide the
numerical values of the matrix A in the case of scalings and
symmetric weighted matchings.
The default value of iparm(13) is 1 for unsymmetric
matrices (mtype =11 or mtype =13). The default value of
iparm(13) is 0 for symmetric matrices (mtype =-2, mtype
=-4, or mtype =6).
iparm(18)
The solver will report the numbers of non-zero elements on
the factors if iparm(18)< 0 on entry.
The default value of iparm(18) is -1.
iparm(19)- MFlops of factorization.
If iparm(19)< 0 on entry, the solver will report MFlop
(1.0E6) that are necessary to factor the matrix A. This will
increase the reordering time.
iparm(21) - pivoting for symmetric indefinite matrices.
iparm(21)controls the pivoting method for sparse
symmetric indefinite matrices. If iparm(21) is 0, then 1x1
diagonal pivoting is used. If iparm(21) is 1, then 1x1 and
2x2 Bunch and Kaufman pivoting will be used in the
factorization process. If iparm(21) is 2, a very robust
preprocessing method based on symmetric weighted
matchings and 1x1 and 2x2 Bunch and Kaufman pivoting
2356
will be used in the factorization process. The default value
of iparm(21) is 1. Bunch and Kaufman pivoting is available
for matrices: mtype=-2, mtype=-4, or mtype=6.
iparm(27) - matrix checker.
If iparm(27)=0 (default value), the PARDISO doesn't check
the sparse matrix representation.
If iparm(27)=1, then PARDISO check integer arrays ia
and ja. In particular, PARDISO checks whether column
indices are sorted in increasing order within each row.
iparm(60) - version of PARDISO.
iparm(60) controls what version of PARDISO - out-of-core
(OOC) version or in-core version - is used. The OOC
PARDISO can solve very large problems by holding the
matrix factors in files on the disk. Because of that the
amount of main memory required by OOC PARDISO is
significantly reduced.
WARNING. The current OOC version does not use

threading, thus both the parameter iparm(3) and
the variable MKL_NUM_THREADS must be set to 1
when iparm(60) is equal to 1 or 2.
If iparm(60) is set to 0, then the in-core PARDISO is used.

If iparm(60) is set to 2 - the OOC PARDISO is used. If
iparm(60) is set to 1 - the in-core PARDISO is used if the
total memory (in MBytes) needed for storing the matrix
factors is less than the value of the environment variable
MKL_PARDISO_OOC_MAX_CORE_SIZE (its default value is
2000), and OOC PARDISO is used otherwise.
Note that if iparm(60) is equal to 1 or 2, and the total peak
memory needed for strong local arrays is more than
MKL_PARDISO_OOC_MAX_CORE_SIZE, the program stops
with error -9. In this case, increase of
MKL_PARDISO_OOC_MAX_CORE_SIZE is recommended.
2357
OOC parameters can be set in the configuration file. You

can set the path to this file and its name via environmental
variable MKL_PARDISO_OOC_CFG_PATH and
MKL_PARDISO_OOC_CFG_FILE_NAME.
Path and name are as follows:
<MKL_PARDISO_OOC_CFG_PATH >/<
MKL_PARDISO_OOC_CFG_FILE_NAME> for Linux* OS, and
<MKL_PARDISO_OOC_CFG_PATH >\<
MKL_PARDISO_OOC_CFG_FILE_NAME> for Windows* OS.
By default, the name of the file is pardiso_ooc.cfg and
it is placed to the current directory.
All temporary data files can be deleted or stored when the
calculations are completed in accordance with the value of
the environmental variable MKL_PARDISO_OOC_KEEP_FILE.
If it is set to 1 (default value) - all files are deleted, if it is
set to 0 - all files are stored.
By default, the OOC PARDISO uses the current directory for
storing data, and all work arrays associated with the matrix
factors are stored in files named ooc_temp with different
extensions. These default values can be changed by using
the environmental variable MKL_PARDISO_OOC_PATH.
To set the environmental variables
MKL_PARDISO_OOC_MAX_CORE_SIZE,
MKL_PARDISO_OOC_KEEP_FILE, and
MKL_PARDISO_OOC_PATH, the configuration file should be
created with the following lines:
MKL_PARDISO_OOC_PATH = <path>\ooc_file
MKL_PARDISO_OOC_MAX_CORE_SIZE = N
MKL_PARDISO_OOC_KEEP_FILE = 0 (or 1)
where <path> - the directory for storing data, ooc_file -
file name without extension, N - memory size in MBytes, it
is not recommended to set this value greater than the size
of the available RAM (default value - 2000 MBytes).
WARNING. Maximum length of the path lines in

the configuration files is 1000 characters.
2358
Alternatively the environment variables can be set via
command line:
export MKL_PARDISO_OOC_PATH = <path>/ooc_file
export MKL_PARDISO_OOC_MAX_CORE_SIZE = N
export MKL_PARDISO_OOC_KEEP_FILE = 0 (or 1)
for Linux* OS, and
set MKL_PARDISO_OOC_PATH = <path>\ooc_file
set MKL_PARDISO_OOC_MAX_CORE_SIZE = N
set MKL_PARDISO_OOC_KEEP_FILE = 0 (or 1)
for Windows* OS.
NOTE. The values specified in a command line have

higher priorities - it means that if variable is changed
in the configuration file and in the command line,
OOC PARDISO uses only value defined in the
command line. Setting OOC parameters via command
line is recommended.
msglvl INTEGER
Message level information. If msglvl = 0 then PARDISO
generates no output, if msglvl = 1 the solver prints
statistical information to the screen.
b REAL/COMPLEX
Array, dimension (n,nrhs). On entry, contains the right
hand side vector/matrix B. Note that b is only accessed in
the solution phase.
Output Parameters
pt This parameter contains internal address pointers.
iparm On output, some iparm values will report useful information,
for example, numbers of non-zero elements in the factors,
and so on.
iparm(7)- number of performed iterative refinement
steps.
The number of iterative refinement steps that are actually
performed during the solve step.
2359
iparm(14)- number of perturbed pivots.

After factorization, iparm(14) contains the number of
perturbed pivots during the elimination process for mtype
=11, mtype =13, mtype =-2, mtype =-4, or mtype =-6.
iparm(15) - peak memory symbolic factorization.
The parameter iparm(15) provides the user with the total
peak memory in KBytes that the solver needed during the
analysis and symbolic factorization phase. This value is only
computed in phase 1.
iparm(16) - permanent memory symbolic
factorization.
The parameter iparm(16) provides the user with the
permanent memory in KBytes that the solver needed from
the analysis and symbolic factorization phase in the
factorization and solve phases. This value is only computed
in phase 1.
iparm(17) - memory numerical factorization and
solution.
The parameter iparm(17) provides the user with the total
double precision memory consumption (KBytes) of the solver
for the factorization and solve phases. This value is only
computed in phase 2.
Note that the total peak memory solver consumption is
max(iparm(15), iparm(16)+iparm(17))
iparm(18) - number of non-zero elements in factors.
The solver will report the numbers of non-zero elements on
the factors if iparm(18) < 0 on entry.
iparm(19) - MFlops of factorization.
Number of operations in MFlop (1.0E6 operations) that are
necessary to factor the matrix A are returned to the user if
iparm(19) < 0 on entry.
iparm(20) - CG/CGS diagnostics.
The value is used to give CG/CGS diagnostics (for example,
the number of iterations and cause of failure):
If iparm(20)> 0, CGS succeeded, and the number of
iterations executed are reported in iparm(20).
If iparm(20 )< 0, iterations executed, but CG/CGS failed.
The error report details in iparm(20) are of the form:
iparm(20)= - it_cgs*10 - cgs_error.
2360
If phase= 23, then the factors L, U are recomputed for the
matrix A and the error flag error=0 in case of a successful
factorization. If phase =33, then error = -4 signals the
failure.
Description of cgs_error is given in the table below:
cgs_error Description
1 - fluctuations of the residuum are too large
2 - ||dxmax_it_cgs/2|| too large (slow
convergence)
3 - stopping criterion not reached at
max_it_cgs
4 - perturbed pivots caused iterative
refinement
5 - factorization is too fast for this matrix. It
is better to use the factorization method with
iparm(4)=0
iparm(22) - inertia: number of positive eigenvalues.
The parameter iparm(22) reports the number of positive
eigenvalues for symmetric indefinite matrices.
iparm(23) - inertia: number of negative eigenvalues.
The parameter iparm(23) reports the number of negative
eigenvalues for symmetric indefinite matrices.
iparm(24) to iparm(59)
These parameters are reserved for future use. Their values
must be set to 0.
iparm(61) - the total peak memory in MBytes that the
solver used during the analysis and symbolic factorization
phases if the in-core PARDISO is used. iparm(61) is similar
to iparm(15), but iparm(15) returns the value of the total
peak memory if the OOC PARDISO is used.
iparm(62) - the total double precision memory consumption
in MBytes that the solver used during the analysis and
symbolic factorization phase in the factorization and solver
phases if the in-core PARDISO is used. iparm(62) is similar
to iparm(16), but iparm(16) returns the value of the
memory consumption if the OOC PARDISO is used.
2361
iparm(63) - the total double precision memory consumption

in MBytes that the solver used for factorization and solution
phases if the in-core PARDISO is used. Value of iparm(63)
is similar to iparm(17), but iparm(17) returns the value of
the memory consumption if the OOC PARDISO is used.
b On output, the array is replaced with the solution if
iparm(6) = 1.
x REAL/COMPLEX
Array, dimension (n,nrhs). On output, contains solution if
iparm(6)=0. Note that x is only accessed in the solution
phase.
error INTEGER
The error indicator according to the below table:
error Information
0 no error
-1 input inconsistent
-2 not enough memory
-3 reordering problem
-4 zero pivot, numerical factorization or
iterative refinement problem
-5 unclassified (internal) error
-6 preordering failed (matrix types 11, 13 only)
-7 diagonal matrix problem
-8 32-bit integer overflow problem
-9 not enough memory for OOC
-10 problems with opening OOC temporary files
-11 read/write problems with the OOC data file
Direct Sparse Solver (DSS) Interface Routines

Intel MKL supports an alternative to the PARDISO* interface for the direct sparse solver referred
to here as DSS interface. The DSS interface implements a group of user-callable routines that
are used in the step-by-step solving process and exploits the general scheme described in
2362
Linear Solvers Basics for solving sparse systems of linear equations. This interface also includes
one routine for gathering statistics related to the solving process and an auxiliary routine for
passing character strings from Fortran routines to C routines.
The current implementation of the DSS interface additionally supports the out-of-core (OOC)
version.
Table 8-1 lists the names of the routines and describes their general use.
Table 8-1 DSS Interface Routines
Routine Description
dss_create
Initializes the solver and creates the basic data
structures necessary for the solver. This routine
must be called before any other DSS routine.
dss_define_structure
Informs the solver of the locations of the non-zero
dss_reorder
Based on the non-zero structure of the matrix, this
routine computes a permutation vector to reduce
fill-in during the factoring process.
dss_factor_real, T T
Computes the LU, LDT or LL factorization of a real
dss_factor_complex
or complex matrix.
dss_solve_real,
Computes the solution vector for a system of
dss_solve_complex
equations based on the factorization computed in
the previous phase.
dss_delete
Deletes all data structures created during the
solving process.
dss_statistics
Returns statistics about various phases of the
solving process. Gathers the following statistics:
time taken to do reordering, time taken to do
factorization, problem solving duration, determinant
of a matrix, inertia of a matrix, number of floating
point operations taken during factorization, peak
memory symbolic factorization, permanent memory
symbolic factorization, and memory numerical
factorization and solution. This routine can be
invoked in any phase of the solving process after
the “reorder” phase, but before the “delete” phase.
2363
Routine Description
Note that appropriate argument(s) must be
supplied to this routine to match the phase in which
it is invoked.
mkl_cvt_to_null_terminated_str
Passes character strings from Fortran routines to
C routines.
To find a single solution vector for a single system of equations with a single right hand side,
the Intel MKL DSS interface routines are invoked in the order in which they are listed in Table
8-1 , with the exception of dss_statistics, which is invoked as described in the table.
However, in certain applications it is necessary to produce solution vectors for multiple right-hand
sides for a given factorization and/or factor several matrices with the same non-zero structure.
Consequently, it is necessary to invoke the Intel MKL sparse routines in an order other than
listed in the table. The DSS interface provides such option. The solving process is conceptually
divided into six phases, as shown in Figure 8-2 , that indicates the typical order(s) in which
the DSS interface routines can be invoked.
Figure 8-2 Typical order for invoking DSS interface routines
See code examples that uses the DSS interface routines to solve systems of linear equations
in Direct Sparse Solver Examples section in the appendix C.
2364
DSS Interface Description

As noted in Memory Allocation and Handles section, each DSS routine reads or writes an opaque
data object called a handle. It is declared as being of type MKL_DSS_HANDLE in this
documentation. You can refer to Memory Allocation and Handles to determine the correct
method for declaring a handle argument.
All other types in this documentation refer to the common Fortran types, INTEGER, REAL,
COMPLEX, DOUBLE PRECISION, and DOUBLE COMPLEX.
C and C++ programmers should refer to Calling Direct Sparse Solver and Preconditioner Routines
From C/C++ section for information on mapping Fortran types to C/C++ types.
Routine Options
The DSS routines have an integer argument (referred below to as opt) for passing various
options to the routines. The permissible values for opt should be specified using only the symbol
constants defined in the language-specific header files (see Implementation Details). The
routines accept options for setting the message and termination levels as described in Table
8-2 . Additionally, the routines accept the option MKL_DSS_DEFAULTS that establishes the
documented default options for each DSS routine.
Table 8-2 Symbolic Names for the Message and Termination Levels Options
Message Level Termination Level
MKL_DSS_MSG_LVL_SUCCESS MKL_DSS_TERM_LVL_SUCCESS
MKL_DSS_MSG_LVL_INFO MKL_DSS_TERM_LVL_INFO
MKL_DSS_MSG_LVL_WARNING MKL_DSS_TERM_LVL_WARNING
MKL_DSS_MSG_LVL_ERROR MKL_DSS_TERM_LVL_ERROR
MKL_DSS_MSG_LVL_FATAL MKL_DSS_TERM_LVL_FATAL
The settings for message and termination levels can be set on any call to a DSS routine.
However, once set to a particular level, they remain at that level until they are changed in
another call to a DSS routine.
Users can specify multiple options for a DSS routine by adding the options together. For example,
to set the message level to debug and the termination level to error for all the DSS routines,
use the following call:
CALL dss_create( handle, MKL_DSS_MSG_LVL_INFO + MKL_DSS_TERM_LVL_ERROR)
2365
User Data Arrays

Many of the DSS routines take arrays of user data as input. For example, user arrays are passed
to the routine dss_define_structure to describe the location of the non-zero entries in the
matrix. To minimize storage requirements and improve overall run-time efficiency, the Intel
MKL DSS routines do not make copies of the user input arrays.
WARNING. The contents of these arrays cannot be modified after they are passed to
one of the solver routines.
dss_create
Initializes the solver.
Syntax
dss_create(handle, opt)
Description
This routine is declared in mkl_dss.f77 for FORTRAN 77 interface, in mkl_dss.f90 for Fortran
90 interface, and in mkl_dss.h for C interface.
The routine dss_create initializes the solver. After the call to dss_create, all subsequent
invocations of the Intel MKL DSS routines must use the value of handle returned by the routine
dss_create.
WARNING. Do not write the value of handle directly.
The default value of the parameter opt is

MKL_DSS_MSG_LVL_WARNING + MKL_DSS_TERM_LVL_ERROR.
This parameter can control number of refinement steps used on the solution stage by specifying
the two following values:
MKL_DSS_REFINEMENT_OFF - maximum number of refinement steps is set to zero;
MKL_DSS_REFINEMENT_ON (default value) - maximum number of refinement steps is set to 2.
This parameter can contain additionally one of two possible values to launch the out-of-core
version of DSS (OOC DSS): MKL_DSS_OOC_VARIABLE and MKL_DSS_OOC_STRONG.
2366
MKL_DSS_OOC_STRONG - OOC DSS is used.
MKL_DSS_OOC_VARIABLE - OOC DSS uses the in-core kernels of PARDISO* if there is enough
memory for storing work arrays associated with the matrix factors in the main memory.
Specifically, if the memory needed for the matrix factors is less than the value of the environment
variable MKL_PARDISO_OOC_MAX_CORE_SIZE, then the OOC version of PARDISO* uses the
in-core computations, otherwise it uses the OOC computations.
The variable MKL_PARDISO_OOC_MAX_CORE_SIZE defines the maximum size of the main memory
allowed for storing work arrays associated with the matrix factors. It is ignored if
MKL_DSS_OOC_STRONG is set. The default value of MKL_PARDISO_OOC_MAX_CORE_SIZE is 2000
MBytes. This value and default path and file name for storing temporary data can be changed
using the configuration file pardiso_ooc.cfg or command line (See more details in the PARDISO
Parameters description above).
WARNING. Do not change the behavior of the OOC DSS after it is specified in the
routine dss_create.
Input Parameters
opt INTEGER Options passing argument. The default value is
Output Parameters
handle Data object of the MKL_DSS_HANDLE type (see Interface
Description).
Return Values
MKL_DSS_SUCCESS
MKL_DSS_INVALID_OPTION
MKL_DSS_OUT_OF_MEMORY
2367
dss_define_structure
Communicates locations of non-zero elements in
the matrix to the solver.
Syntax
dss_define_structure(handle, opt, rowIndex, nRows, nCols, columns, nNonZeros);
Description
The routine dss_define_structure communicates the locations of the nNonZeros number of
non-zero elements in a matrix of nRows by nCols size to the solver.
Note that currently the Intel MKL DSS software operates only on square matrices, so nRows
must be equal to nCols.
To communicate the locations of non-zeros in the matrix, do the following:
1. Define the general non-zero structure of the matrix by specifying one of the following values
for the options argument opt:
• MKL_DSS_SYMMETRIC_STRUCTURE
• MKL_DSS_SYMMETRIC
• MKL_DSS_NON_SYMMETRIC
2. Provide the actual locations of the non-zeros by means of the arrays rowIndex and columns
(see Sparse Matrix Storage Format).
Input Parameters
opt INTEGER. Options passing argument. The default option for
the matrix structure is MKL_DSS_SYMMETRIC.
rowIndex INTEGER. Array of size min(nRows, nCols)+1. Defines the
location of non-zero entries in the matrix.
nRows INTEGER. Number of rows in the matrix.
nCols INTEGER. Number of columns in the matrix.
2368
columns INTEGER. Array of size nNonZeros. Defines the location of
non-zero entries in the matrix.
nNonZeros INTEGER. Number of non-zero elements in the matrix.
WARNING. Pointers to the rowIndex and columns are stored in the handle. These
arrays are used when the solver runs. Do not free or delete them before calling
dss_solve_real or dss_solver_complex.
Output Parameters
Description).
Return Values
MKL_DSS_SUCCESS
MKL_DSS_STATE_ERR
MKL_DSS_COL_ERR
MKL_DSS_NOT_SQUARE
MKL_DSS_TOO_FEW_VALUES
MKL_DSS_TOO_MANY_VALUES
dss_reorder
Computes permutation vector that minimizes the
fill-in during the factorization phase.
Syntax
dss_reorder(handle, opt, perm)
Description
2369
If opt contains the option MKL_DSS_AUTO_ORDER, then the routine dss_reorder computes a
permutation vector that minimizes the fill-in during the factorization phase. For this option, the
perm array is never accessed.
If opt contains the option MKL_DSS_MY_ORDER, then the array perm is considered to be a
permutation vector supplied by the user. In this case, the array perm is of length nRows, where
nRows is the number of rows in the matrix as defined by the previous call to dss_define_structure.
Input Parameters
opt INTEGER. Option passing argument. The default option for
the permutation type is MKL_DSS_AUTO_ORDER.
perm INTEGER. Array of length nRows. Contains a user-defined
permutation vector (accessed only if opt contains
MKL_DSS_MY_ORDER).
Output Parameters
Description).
Return Values
MKL_DSS_SUCCESS
MKL_DSS_STATE_ERR
dss_factor_real, dss_factor_complex
Compute factorization of the matrix with previously
specified location.
Syntax
dss_factor_real(handle, opt, rValues)
dss_factor_complex(handle, opt, cValues)
2370
Description
These routines compute factorization of the matrix whose non-zero locations were previously
specified by a call to dss_define_structure and whose non-zero values are given in the array
rValues or cValues. The arrays rValues and cValues are assumed to be of length nNonZeros
as defined in a previous call to dss_define_structure.
The opt argument should contain one of the following options:
• MKL_DSS_POSITIVE_DEFINITE,
• MKL_DSS_INDEFINITE,
• MKL_DSS_HERMITIAN_POSITIVE_DEFINITE,
• MKL_DSS_HERMITIAN_INDEFINITE ,
depending on whether the non-zero values in rValues and cValues describe a positive definite,
indefinite, or Hermitian matrix.
for details.
Input Parameters
Description).
opt INTEGER Options passing argument. The default option for
the matrix type is MKL_DSS_POSITIVE_DEFINITE.
rValues DOUBLE PRECISION. Array of size nNonZeros. Contains real
non-zero elements of the matrix.
cValues DOUBLE COMPLEX. Array of size nNonZeros. Contains
complex non-zero elements of the matrix.
Return Values
MKL_DSS_SUCCESS
MKL_DSS_STATE_ERR
2371
MKL_DSS_OPTION_CONFLICT
MKL_DSS_ZERO_PIVOT
dss_solve_real, dss_solve_complex
Compute the corresponding solutions vector and
place it in the output array.
Syntax
dss_solve_real(handle, opt, rRhsValues, nRhs, rSolValues)
dss_solve_complex(handle, opt, cRhsValues, nRhs, cSolValues)
Description
These routines are declared in mkl_dss.f77 for FORTRAN 77 interface, in mkl_dss.f90 for
Fortran 90 interface, and in mkl_dss.h for C interface.
For each right hand side column vector defined in the arrays rRhsValues or cRhsValues, these
routines compute the corresponding solutions vector and place it in the arrays rSolValues or
cSolValues .
The lengths of the right-hand side and solution vectors, nCols and nRows respectively, are
assumed to have been defined in a previous call to dss_define_structure.
By default, both routines perform full solution step (it corresponds to phase =33 in PARDISO).
The parameter opt allows to calculate the final solution step-by-step, calling forward and
backward substitutions. If it is set to MKL_DSS_FORWARD_SOLVE - the forward substitution
(corresponding to phase =331 in PARDISO) is performed, if it is set to
MKL_DSS_DIAGONAL_SOLVE - the diagonal substitution (corresponding to phase =332 in
PARDISO) is performed, and if it is set to MKL_DSS_BACKWARD_SOLVE - the backward substitution
(corresponding to phase =333 in PARDISO ) is performed. For more details about using these
substitutions for different types of matrices, see description of the PARDISO solver.
This parameter also can control number of refinement steps that is used on the solution stage:
if it is set to MKL_DSS_REFINEMENT_OFF, the maximum number of refinement steps equal to
zero, and if it is set to MKL_DSS_REFINEMENT_ON (default value), the maximum number of
refinement steps is equal to 2.
2372
Input Parameters
Description).
opt INTEGER. Options passing argument.
nRhs INTEGER. Number of the right-hand sides in the linear
equation.
rRhsValues DOUBLE PRECISION. Array of size nRows by nRhs. Contains
real right-hand side vectors.
cRhsValues DOUBLE COMPLEX. Array of size nRows by nRhs. Contains
complex right-hand side vectors.
Output Parameters
rSolValues DOUBLE PRECISION. Array of size nCols by nRhs. Contains
real solution vectors.
cSolValues DOUBLE COMPLEX. Array of size nCols by nRhs. Contains
complex solution vectors.
Return Values
MKL_DSS_SUCCESS
MKL_DSS_STATE_ERR
dss_delete
Deletes all of data structures created during the
solutions process.
Syntax
dss_delete(handle, opt)
Description
2373
The routine dss_delete deletes all data structures created during the solving process.
Input Parameters
opt INTEGER. Options passing argument. The default value is
Output Parameters
Description).
Return Values
MKL_DSS_SUCCESS
dss_statistics
Returns statistics about various phases of the
solving process.
Syntax
dss_statistics(handle, opt, statArr, retValues)
Description
The dss_statistics routine returns statistics about various phases of the solving process.
This routine gathers the following statistics:
– time taken to do reordering,

– time taken to do factorization,
– problem solving duration,
– determinant of a matrix,
– inertia of a matrix,
– number of floating point operations taken during factorization,
2374
– total peak memory needed during the analysis and symbolic factorization,
– permanent memory needed from the analysis and symbolic factorization,
– memory consumption for the factorization and solve phases.
Statistics are returned in accordance with the input string specified by the parameter statArr.
The value of the statistics is returned in double precision in a return array allocated by user.
For multiple statistics, multiple string constants separated by commas can be used as input.
Return values are put into the return array in the same order as specified in the input string.
Statistics must be requested only at the appropriate stages of the solving process. For example,
inquiring about FactorTime before a matrix is factored leads to errors.
The following table shows the point at which each individual statistics item can be called:
Table 8-3 Statistics Calling Sequences
Type of Statistics When to Call
ReorderTime After dss_reorder is completed successfully.
FactorTime After dss_factor_real or dss_factor_complex is completed successfully.
SolveTime After dss_solve_real or dss_solve_complex is completed successfully.
Determinant After dss_factor_real or dss_factor_complex is completed successfully.
Inertia After dss_factor_real is completed successfully and the matrix is real,
symmetric, and indefinite.
Flops After dss_factor_real or dss_factor_complex is completed successfully.
Peakmem After dss_reorder is completed successfully.
Factormem After dss_reorder is completed successfully.
Solvemem After dss_factor_real or dss_factor_complex is completed successfully.
Input Parameters
Description).
opt INTEGER. Options passing argument.
statArr STRING. Input string that defines the type of the returned
statistics. The parameter can include one or more of the
following string constants (case of the input string has no
effect):
ReorderTime Amount of time taken to do the
reordering.
2375
FactorTime Amount of time taken to do the

factorization.
SolveTime Amount of time taken to solve the
problem after factorization.
Determinant Determinant of the matrix A.
For real matrices: the determinant is
returned as det_pow, det_base in two
consecutive return array locations, where
1.0 ≤abs(det_base) < 10.0 and
determinant = det_base*10(det_pow).
For complex matrices: the determinant
is returned as det_pow, det_re, det_im
in three consecutive return array
locations, where 1.0 ≤abs(det_re) +
abs(det_im) < 10.0 and determinant
= det_re, det_im*10(det_pow).
Inertia Inertia of a real symmetric matrix is
defined as a triplet of nonnegative
integers (p,n,z), where p is the number
of positive eigenvalues, n is the number
of negative eigenvalues, and z is the
number of zero eigenvalues.
Inertia is returned as three consecutive
return array locations as p, n, z.
Computing Inertia is recommended only
for stable matrices. Unstable matrices can
lead to incorrect results.
Inertia of a k-by-k real symmetric
positive definite matrix is always (k, 0,
0). Therefore Inertia is returned only
in cases of real symmetric indefinite
matrices. For all other matrix types, an
error message is returned.
Flops Number of floating point operations
performed during the factorization.
2376
Peakmem Total peak memory in KBytes that the
solver needs during the analysis and
symbolic factorization phase.
Factormem Permanent memory in KBytes that the
solver needs from the analysis and
symbolic factorization phase in the
factorization and solve phases.
Solvemem Total double precision memory
consumption (KBytes) of the solver for
the factorization and solve phases.
NOTE. To avoid problems in passing strings from Fortran to C, Fortran users must call
the mkl_cvt_to_null_terminated_str routine before calling dss_statistics. Refer
to the description of mkl_cvt_to_null_terminated_str for details.
Output Parameters
retValues DOUBLE PRECISION. Value of the statistics returned.
Finding “time used to reorder” and “inertia” of a matrix.

The example below illustrates the use of the dss_statistics routine.
To find the above values, call dss_statistics(handle, opt, statArr, retValues), where
statArr is “ReorderTime,Inertia”
In this example, retValues has the following values:
Return Values
MKL_DSS_SUCCESS
MKL_DSS_STATISTICS_INVALID_MATRIX
MKL_DSS_STATISTICS_INVALID_STATE
MKL_DSS_STATISTICS_INVALID_STRING
2377
mkl_cvt_to_null_terminated_str
Passes character strings from Fortran routines to
C routines.
Syntax
mkl_cvt_to_null_terminated_str (destStr, destLen, srcStr)
Description
This routine is declared in mkl_dss.f77 for FORTRAN 77 interface and in mkl_dss.f90 for
Fortran 90 interface.
The routine mkl_cvt_to_null_terminated_str passes character strings from Fortran routines
to C routines. The strings are converted into integer arrays before being passed to C. Using
this routine avoids the problems that can occur on some platforms when passing strings from
Fortran to C. The use of this routine is highly recommended.
Input Parameters
destLen INTEGER. Length of the output array destStr.
srcStr STRING. Input string.
Output Parameters
destStr INTEGER. One-dimensional array of integer.
Implementation Details
Several aspects of the Intel MKL DSS interface are platform-specific and language-specific. To
promote portability across platforms and ease of use across different languages, one of the
following Intel MKL DSS language-specific header files can be included:
• mkl_dss.f77 for F77 programs
• mkl_dss.f90 for F90 programs
• mkl_dss.h for C programs
These header files define symbolic constants for error returns, function options, certain defined
data types, and function prototypes.
2378
NOTE. Constants for options, error returns, and message severities must be referred
only by the symbolic names that are defined in these header files. Use of the Intel MKL
DSS software without including one of the above header files is not supported.
Memory Allocation and Handles

To simplify the use of the Intel MKL DSS routines, they do not require the user to allocate any
temporary working storage. The solver itself allocates any required storage. To enable multiple
users to access the solver simultaneously, the solver keeps track of the storage allocated for
a particular application by using an opaque data object called a handle.
Each of the Intel MKL DSS routines creates, uses or deletes a handle. Consequently, user
programs must be able to allocate storage for a handle. The exact syntax for allocating storage
for a handle varies from language to language. To standardize the handle declarations, the
language-specific header files declare constants and defined data types that should be used
when declaring a handle object in the user code.
Fortran 90 programmers should declare a handle as:
INCLUDE "mkl_dss.f90"
TYPE(MKL_DSS_HANDLE) handle
C and C++ programmers should declare a handle as:
#include "mkl_dss.h"
_MKL_DSS_HANDLE_t handle;
FORTRAN 77 programmers using compilers that support eight byte integers, should declare a
handle as:
INCLUDE "mkl_dss.f77"
INTEGER*8 handle
Otherwise they should replace the INTEGER*8 data types with the DOUBLE PRECISION data
type.
In addition to the definition for the correct declaration of a handle, the include file also defines
the following:
• function prototypes for languages that support prototypes

• symbolic constants that are used for the error returns
• user options for the solver routines
• message severity.
2379
Iterative Sparse Solvers based on Reverse Communication

Interface (RCI ISS)
Intel MKL supports the iterative sparse solvers (ISS) based on the reverse communication
interface (RCI), referred to here as . The RCI ISS interface implements a group of user-callable
routines that are used in the step-by-step solving process of a symmetric positive definite
system (RCI Conjugate Gradient Solver, or RCI CG), and of a non-symmetric indefinite
(non-degenerate) system (RCI Flexible Generalized Minimal RESidual Solver, or RCI FGMRES)
of linear algebraic equations and exploits the general RCI scheme described in [Dong95].
See the Linear Solvers Basics for discussion of terms and concepts related to the ISS routines.
RCI means that when the solver needs the results of certain operations (for example,
matrix-vector multiplications), the user himself must perform them and pass the result to the
solver. This gives the great universality to the solver as it is independent of the specific
implementation of the operations like the matrix-vector multiplication. To perform such
operations, the user can use the built-in sparse matrix-vector multiplications and triangular
solvers routines (see Sparse BLAS Level 2 and Level 3).
NOTE. The RCI CG solver is implemented in two versions: for system of equations with
single right hand side, and for system of equations with multiple right hand sides.
The CG method may fail to compute the solution or compute the wrong solution if the
matrix of the system is not symmetric and positive definite.
The FGMRES method may fail if the matrix is degenerate.
Table 8-4 lists the names of the routines, and describes their general use.
Table 8-4 RCI ISS Interface Routines
Routine Description
dcg_init, dcgmrhs_init, dfgm-
res_init
Checks the consistency and correctness of the user
dcg_check, dcgmrhs_check, dfgm- defined data.
res_check
Computes the approximate solution vector.
dcg, dcgmrhs, dfgmres
Retrieves the number of the current iteration.
dcg_get, dcgmrhs_get, dfgmres_get
2380
The Intel MKL RCI ISS interface routines are normally invoked in the order listed in Table8-4,
with the exception of dcg_get, dcgmrhs_get, and dfgmres_get routines that can be invoked
at any place in the code. However, in this case some precautions should be taken to avoid the
wrong results. Advanced users can change that order if they need it. Others should follow the
above order of calls.
The following diagram in Figure 8-3 indicates the typical order in which the RCI ISS interface
routines can be invoked.
2381
Figure 8-3 Typical Order for Invoking RCI ISS interface Routines
The following pseudocode shows the general schemes of using the RCI CG routines.
...
generate matrix A
generate preconditioner C (optional)
call dcg_init(n, x, b, RCI_request, ipar, dpar, tmp)
change parameters in ipar, dpar if necessary
call dcg_check(n, x, b, RCI_request, ipar, dpar, tmp)
1 call dcg(n, x, b, RCI_request, ipar, dpar, tmp)
if (RCI_request.eq.1) then
multiply the matrix A by tmp(1:n,1) and put the result in tmp(1:n,2)
It is possible to use MKL Sparse BLAS Level 2 subroutines for this operation
c proceed with CG iterations
goto 1
2382
endif
if (RCI_request.eq.2)then
do the stopping test
if (test not passed) then
go to 1
else
c stop CG iterations
goto 2
endif
endif
if (RCI_request.eq.3) then (optional)
apply the preconditioner C inverse to tmp(1:n,3) and put the result in tmp(1:n,4)
goto 1
end
2 call dcg_get(n, x, b, RCI_request, ipar, dpar, tmp, itercount)
current iteration number is in itercount
the computed approximation is in the array x
And the following pseudocode shows the general schemes of using the RCI FGMRES routines.
...
generate matrix A
call dfgmres_init(n, x, b, RCI_request, ipar, dpar, tmp)
call dfgmres_check(n, x, b, RCI_request, ipar, dpar, tmp)
1 call dfgmres(n, x, b, RCI_request, ipar, dpar, tmp)
multiply the matrix A by tmp(ipar(22)) and put the result in tmp(ipar(23))
2383
It is possible to use MKL Sparse BLAS Level 2 subroutines for this operation
c proceed with FGMRES iterations
goto 1
endif
go to 1
else
c stop FGMRES iterations
goto 2
endif
endif
if (RCI_request.eq.3) then (optional)
apply the preconditioner C inverse to tmp(ipar(22)) and put the result in tmp(ipar(23))
goto 1
endif
check the norm of the next orthogonal vector, it is contained in dpar(7)
if (the norm is not zero up to rounding/computational errors) then
goto 1
else
goto 2
endif
endif
2384
2 call dfgmres_get(n, x, b, RCI_request, ipar, dpar, tmp, itercount)
Note that for the FGMRES method the array x initially contains the current initial approximation
to the solution that can be updated only by calling the routine dfgmres_get. It updates the
solution in accordance with the computations performed by the routine dfgmres.
The above pseudocodes demonstrate two main differences in the use of RCI CG and RCI FGMRES
interfaces. The first difference relates to RCI_request=3 (different locations in the tmp array,
which is 2-dimensional for CG and 1-dimensional for FGMRES). The second difference relates
to RCI_request=4 (the RCI CG interface never produces RCI_request=4).
Code examples that use the RCI ISS interface routines to solve systems of linear equations can
be found in the Iterative Sparse Solver Code Example section in the Appendix C.
CG Interface Description
All types in this documentation refer to the common Fortran types, INTEGER, and DOUBLE
PRECISION.
C and C++ programmers should refer to the section Calling Sparse Solver and Preconditioner
Routines From C/C++ for information on mapping Fortran types to C/C++ types.
Each routine for the RCI CG solver is implemented in two versions: for a system of equations
with asingle right hand side (SRHS), and for a system of equations with multiple right hand
sides (MRHS). The names of routines for a system with MRHS contain the suffix mrhs.
Routines Options
All of the RCI CG routines have parameters for passing various options to the routines. The
values for these parameters should be specified very carefully (see CG Common Parameters),
and they can be changed during computations according to the user's needs.
NOTE. Provide correct and consistent parameters to the subroutines to avoid fails or
wrong results.
2385
User Data Arrays

Many of the RCI CG routines take arrays of user data as input. For example, user arrays are
passed to the routine dcg to compute the solution of a system of linear algebraic equations.
The Intel MKL RCI CG routines do not make copies of the user input arrays to minimize storage
requirements and improve overall run-time efficiency.
CG Common Parameters
NOTE. The default and initial values listed below are assigned to the parameters by
calling the dcg_init/dcgmrhs_init routine.
n - INTEGER, this parameter sets the size of the problem in the

dcg_init/dcgmrhs_init routine. All the other routines uses the
ipar(1) parameter instead.
x - DOUBLE PRECISION array of size n for SRHS, or matrix of size
n-by-nrhs for MRHS. This parameter contains the current
approximation to the solution. Before the first call to the
dcg/dcgmrhs routine, it contains the initial approximation to the
solution.
nrhs - INTEGER, this parameter sets the number of right-hand sides for
MRHS routines.
b - DOUBLE PRECISION array containing a single right-hand side
vector, or matrix (nrhs, n) containing right-hand side vectors.
RCI_request - INTEGER, this parameter informs about the result of work of the
RCI CG routines. The negative values of the parameter indicate
that the routine is completed with errors or warnings. The 0 value
indicates the successful completion of the task. The positive values
mean that the user must perform certain actions, specifically:
RCI_request= 1 - to multiply the matrix by tmp (1:n,1), put
the result in tmp(1:n,2), and return the
control to the dcg/dcgmrhs routine;
RCI_request= 2 - to perform the stopping tests. If they fail,
return the control to the dcg/dcgmrhs routine.
Otherwise, the solution is found and stored in
the x array;
2386
RCI_request= 3 - for SRHS: to apply the preconditioner to
tmp(1:n,3), put the result in tmp(1:n,4),
and return the control to the dcg routine;
- for MRHS: to apply the preconditioner to
tmp(:,3+ipar(3)), put the result in
tmp(:,3), and return the control to the
dcgmrhs routine.
Note that the dcg_get/dcgmrhs_get routine does not change the
parameter RCI_request. This enables to use this routine inside
the Reverse Communication computations.
ipar - INTEGER array, of size 128 for SRHS, and of size (128+2*nrhs)
for MRHS; this parameter specifies the integer set of data for the
RCI CG computations:
ipar(1) - specifies the size of the problem. The
dcg_init/dcgmrhs_init routine assigns
ipar(1)=n. All the other routines uses this
parameter instead of n. There is no default
value for this parameter.
ipar(2) - specifies the type of output for error and
warning messages generated by the RCI CG
routines. The default value 6 means that all
messages are displayed on the screen.
Otherwise the error and warning messages
are written to the newly created files
dcg_errors.txt and
dcg_check_warnings.txt respectively. Note
that if ipar(6) and ipar(7) parameters are
set to 0, error and warning messages are not
generated at all.
ipar(3) - for SRHS: contains the current stage of the
RCI CG computations, the initial value is 1;
- for MRHS: contains the right-hand side for
which the calculations are currently performed.
WARNING. Avoid altering this variable

during computations.
2387
ipar(4) - contains the current iteration number, the

initial value is 0.
ipar(5) - specifies the maximum number of iterations,
the default value is min{150, n}.
ipar(6) - if the value is not equal to 0, the routines
output error messages in accordance with the
parameter ipar(2). Otherwise, the routines
do not output error messages at all, but return
a negative value of the parameter
RCI_request. The default value is 1.
output warning messages in accordance with
the parameter ipar(2). Otherwise, the
routines do not output warning messages at
all, but they return a negative value of the
parameter RCI_request. The default value is
1.
ipar(8) - if the value is not equal to 0, the dcg/dcgm-
rhs routine performs the stopping test for the
maximum number of iterations, namely,
ipar(4)≤ipar(5). Otherwise, the method is
stopped and the corresponding value is
assigned to the RCI_request. If the value is
0, the routine does not perform this stopping
test. The default value is 1.
rhs routine performs the residual stopping
test, namely,
dpar(5)≤dpar(4)=dpar(1)*dpar(3)+dpar(2).
Otherwise, the method is stopped and
corresponding value is assigned to the
RCI_request. If the value is 0, the routine
does not perform this stopping test. The
default value is 0.
rhs routine requests a user-defined stopping
test by setting the output parameter
2388
RCI_request=2. If the value is 0, the routine
does not perform the user defined stopping
test. The default value is 1.
NOTE. At least one of the parameters

ipar(8)-ipar(10) must be set to 1.
ipar(11) - if the value is equal to 0, the dcg/dcgmrhs

routine runs the non-preconditioned version
of the corresponding Conjugate Gradient
method. Otherwise, the routine runs the
preconditioned version of the Conjugate
Gradient method, and requests the user to
perform the preconditioning step by setting
the output parameter RCI_request=3. The
default value is 0.
ipar(11:128), are reserved and not used in the current RCI
ipar(11:128+2*nrhs) CG SRHS and MRHS routines respectively.
NOTE. Advanced users can define the

array in the code using RCI CG SRHS as
INTEGER ipar(11). However, to
guarantee compatibility with the future
releases of Intel MKL, declaring the array
ipar with length 128 is highly
recommended.
dpar - DOUBLE PRECISION array, for SRHS of size 128, for MRHS of
size (128+2*nrhs); this parameter is used to specify the double
precision set of data for the RCI CG computations, specifically:
dpar(1) - specifies the relative tolerance. The default
value is 1.0D-6.
dpar(2) - specifies the absolute tolerance. The default
value is 0.0D-0.
2389
dpar(3) - specifies the square norm of the initial

residual (if it is computed in the dcg/dcgmrhs
routine). The initial value is 0.
dpar(4) - service variable equal to
dpar(1)*dpar(3)+dpar(2) (if it is computed
in the dcg/dcgmrhs routine). The initial value
is 0.
dpar(5) - specifies the square norm of the current
residual. The initial value is 0.0.
dpar(6) - specifies the square norm of residual from
the previous iteration step (if available). The
initial value is 0.0.
dpar(7) - contains the alpha parameter of the CG
method. The initial value is 0.0.
dpar(8) - contains the beta parameter of the CG
method, it is equal to dpar(5)/dpar(6) The
dpar(9:128), are reserved and not used in the current RCI
dpar(9:128+2*nrhs) CG SRHS and MRHS routines respectively.
NOTE. Advanced users can define this

DOUBLE PRECISION dpar(8). However,
to guarantee compatibility with the future
dpar with length 128 is highly
recommended.
tmp - DOUBLE PRECISION array of size (n,4)for SRHS, and

(n,3+nrhs)for MRHS. This parameter is used to supply the double
precision temporary space for the RCI CG computations,
specifically:
tmp(:,1) - specifies the current search direction. The
2390
tmp(:,2) - contains the matrix multiplied by the current
search direction. The initial value is 0.0.
tmp(:,3) - contains the current residual. The initial value
is 0.0.
tmp(:,4) - contains the inverse of the preconditioner
applied to the current residual. There is no
initial value for this parameter.

DOUBLE PRECISION tmp(n,3) if they
run only non-preconditioned CG
iterations.
FGMRES Interface Description

All types in this documentation refer to the common Fortran types, INTEGER, and DOUBLE
PRECISION.
C and C++ programmers should refer to the section Calling Sparse Solver Routines From C/C++
for information on mapping Fortran types to C/C++ types.
Routines Options
All of the RCI FGMRES routines have parameters for passing various options to the routines.
The values for these parameters should be specified very carefully (see FGMRES Common
Parameters), and they can be changed during computations according to the user's needs.
NOTE. Provide correct and consistent parameters to the subroutines to avoid fails or
wrong results.
2391
User Data Arrays

Many of the RCI FGMRES routines take arrays of user data as input. For example, user arrays
are passed to the routine dfgmres to compute the solution of a system of linear algebraic
equations. To minimize storage requirements and improve overall run-time efficiency, the Intel
MKL RCI FGMRES routines do not make copies of the user input arrays.
FGMRES Common Parameters
NOTE. The default and initial values listed below are assigned to the parameters by
calling the dfgmres_init routine.
n - INTEGER, this parameter sets the size of the problem in the

dfgmres_init routine. All the other routines uses ipar(1)
parameter instead.
x - DOUBLE PRECISION array, this parameter contains the current
approximation to the solution vector. Before the first call to the
dfgmres routine, it contains the initial approximation to the
solution vector.
b - DOUBLE PRECISION array, this parameter contains the right-hand
side vector. Depending on user requests, it may later contain the
approximate solution.
RCI_request - INTEGER, this parameter informs about the result of work of the
RCI FGMRES routines. The negative values of the parameter
indicate that the routine is completed with errors or warnings. The
0 value indicates the successful completion of the task. The positive
values mean that the user must perform certain actions,
specifically:
RCI_request= 1 - multiply the matrix by tmp(ipar(22)), put
the result in tmp(ipar(23)), and return the
control to the dfgmres routine;
RCI_request= 2 - perform the stopping tests. If they fail,
return the control to the dfgres routine.
Otherwise, the solution can be updated by a
subsequent call to dfgmres_get routine;
2392
RCI_request= 3 - apply the preconditioner to tmp(ipar(22)),
put the result in tmp(ipar(23)), and return
the control to the dfgmres routine.
RCI_request= 4 - check if the norm of the current orthogonal
vector is not zero within the
rounding/computational errors. Return the
control to the dfgmres routine if it is not zero,
otherwise complete the solution process by
calling dfgmres_get routine.
ipar(128) - INTEGER array, this parameter specifies the integer set of data
for the RCI FGMRES computations:
ipar(1) - specifies the size of the problem. The dfgm-
res_init routine assigns ipar(1)=n. All the
other routines uses this parameter instead of
n. There is no default value for this parameter.
ipar(2) - specifies the type of output for error and
warning messages that are generated by the
RCI FGMRES routines. The default value 6
means that all messages are displayed on the
screen. Otherwise the error and warning
messages are written to the newly created file
MKL_RCI_FGMRES_Log.txt. Note that if
ipar(6) and ipar(7) parameters are set to
0, error and warning messages are not
generated at all.
ipar(3) - contains the current stage of the RCI
FGMRES computations, the initial value is 1.

ipar(4) - contains the current iteration number. The

initial value is 0.
ipar(5) - specifies the maximum number of iterations.
The default value is min {150,n}.
2393

output error messages in accordance with the
parameter ipar(2). Otherwise, the routines
do not output error messages at all, but return
a negative value of the parameter
RCI_request. The default value is 1.
output warning messages in accordance with
the parameter ipar(2). Otherwise, the
routines do not output warning messages at
all, but they return a negative value of the
parameter RCI_request. The default value is
1.
ipar(8) - if the value is not equal to 0, the dfmres
routine performs the stopping test for the
maximum number of iterations, namely,
ipar(4)≤ipar(5). Otherwise, the method is
stopped and the corresponding value is
assigned to the RCI_request. If the value is
0, the dfgmres routine does not perform this
stopping test. The default value is 1.
ipar(9) - if the value is not equal to 0, the dfgmres
routine performs the residual stopping test,
namely, dpar(5)≤dpar(4) =
dpar(1)·dpar(3)+dpar(2). If the criterion
is met, the method is stopped and
corresponding value is assigned to the output
parameter RCI_request. If the value is 0, the
dfgmres routine does not perform this
stopping test. The default value is 0.
routine requests for the user defined stopping
test by setting RCI_request=2. If the value
is 0, the dfgmres routine does not perform
the user defined stopping test. The default
value is 1.
2394
NOTE. At least one of the parameters
ipar(8)-ipar(10) must be set to 1.
ipar(11) - if the value is equal to 0, the dfgmres

routine runs the non-preconditioned version
of the FGMRES method. Otherwise, the routine
runs the preconditioned version of the FGMRES
method, and requests the user to perform the
preconditioning step by setting the output
parameter RCI_request=3. The default value
is 0.
routine performs the automatic test for zero
norm of the currently generated vector,
namely, dpar(7)≤dpar(8), where dpar(8)
contains the tolerance value. Otherwise, the
routine requests the user to perform this check
by setting the output parameter
RCI_request=4. The default value is 0.
ipar(13) - if the value is equal to 0, the dfgmres_get
routine updates the solution to the vector x
according to the computations done by the
dfgmres routine. If the value is positive, the
routine writes the solution to the right hand
side vector b. If the value is negative, the
routine returns only the number of the current
iteration, and does not update the solution.
The default value is 0
2395
NOTE. Advanced users may use the

dfgmres_get routine at any place in the
code. In this case special attention
should be paid to the parameter
ipar(13). The RCI FGMRES iterations
can be continued after the call to dfgm-
res_get routine only if the parameter
ipar(13) is not equal to zero. If
ipar(13) is positive, then the updated
solution overwrites the right hand side
in the vector b. If the user wants to run
the restarted version of FGMRES with the
same right hand side, then it must be
saved in a different memory location
before the first call to the dfgmres_get
routine with positive ipar(13).
ipar(14) - contains the internal iteration counter that

counts the number of iterations before the
restart takes place. The initial value is 0.

ipar(15) - specifies the length of the non-restarted

FGMRES iterations. To run the restarted
version of the FGMRES method, the user must
assign the number of iterations to ipar(15)
before the restart. The default value is min
{150, n}, that is, by default the non-restarted
version of FGMRES method is used.
ipar(16) - service variable, specifies the location of the
rotated Hessenberg matrix from which the
matrix stored in the packed format (see Matrix
Arguments in the Appendix B for details) is
started in the tmp array.
2396
rotation cosines from which the vector of
cosines is started in the tmp array.
rotation sines from which the vector of sines
is started in the tmp array.
rotated residual vector from which the vector
is started in the tmp array.
least squares solution vector from which the
vector is started in the tmp array.
set of preconditioned vectors from which the
set is started in the tmp array. The memory
locations in the tmp array starting from
ipar(21) are used only for the preconditioned
FGMRES method.
ipar(22) - specifies the memory location from which
the first vector (source) used in operations
requested via RCI_request is started in the
tmp array.
ipar(23) - specifies the memory location from which
the second vector (source) used in operations
requested via RCI_request is started in the
tmp array.
ipar(24:128) are reserved and not used in the current RCI
FGMRES routines.
NOTE. Advanced users can define the

array in the code as INTEGER ipar(23).
However, to guarantee compatibility with
the future releases of Intel MKL,
declaring the array ipar with length 128
is highly recommended.
2397
dpar(128) - DOUBLE PRECISION array, this parameter specifies the double

precision set of data for the RCI CG computations, specifically:
dpar(1) - specifies the relative tolerance. The default
value is 1.0D-6.
dpar(2) - specifies the absolute tolerance. The default
value is 0.0D-0.
dpar(3) - specifies the Euclidean norm of the initial
residual (if it is computed in the dfgmres
routine). The initial value is 0.0.
dpar(4) - service variable equal to
dpar(1)*dpar(3)+dpar(2) (if it is computed
in the dfgmres routine). The initial value is
0.0.
dpar(5) - specifies the Euclidean norm of the current
residual. The initial value is 0.0.
dpar(6) - specifies the Euclidean norm of residual from
the previous iteration step (if available). The
dpar(7) - contains the norm of the generated vector.
The initial value is 0.0.
NOTE. For reference only: in terms of

[Saad03] this parameter is the coefficient
hk+1,k of the Hessenberg matrix.
dpar(8) - contains the tolerance for the zero norm of

the currently generated vector. The default
value is 1.0D-12.
dpar(9:128) are reserved and not used in the current RCI
FGMRES routines.
2398
array in the code as follows: DOUBLE
PRECISION dpar(8). However, to
guarantee compatibility with the future
dpar with length 128 is highly
recommended.
tmp -DOUBLE PRECISION array of size ((2*ipar(15)+1)*n +

ipar(15)*(ipar(15)+9)/2 + 1)), this parameter is used to
supply the double precision temporary space for the RCI FGMRES
computations, specifically:
tmp(1:ipar(16)-1) - contains the sequence of generated by the
FGMRES method vectors. The initial value is
0.0 for the first part of this memory of length
n;
tmp(ipar(16):ipar(17)-1)contains the rotated Hessenberg matrix
generated by the FGMRES method; the
matrix is stored in the packed format.
There is no initial value for this part of
tmp array.
tmp(ipar(17):ipar(18)-1)- contains the rotation cosines vector
generated by the FGMRES method. There is
no initial value for this part of tmp array.
tmp(ipar(18):ipar(19)-1)contains the rotation sines vector generated
by the FGMRES method. There is no initial
value for this part of tmp array.
tmp(ipar(19):ipar(20)-1)contains the rotated residual vector generated
by the FGMRES method. There is no initial
value for this part of tmp array.
tmp(ipar(20):ipar(21)-1)contains the solution vector to the least
squares problem generated by the FGMRES
method. There is no initial value for this part
of tmp array.
2399
tmp(ipar(21):) contains the set of preconditioned vectors

generated for the FGMRES method by the
user. This part of tmp array is not used if
non-preconditioned version of FGMRES method
is called. There is no initial value for this part
of tmp array.

array in the code as DOUBLE PRECISION
tmp((2*ipar(15)+1)*n +
ipar(15)*(ipar(15)+9)/2 + 1)) if
they run only non-preconditioned
FGMRES iterations.
dcg_init
Syntax
dcg_init(n, x, b, RCI_request, ipar, dpar, tmp)
Description
This routine is declared in mkl_rci.fi for Fortran interface and in mkl_rci.h for C interface.
The routine dcg_init initializes the solver. After initialization, all subsequent invocations of
the Intel MKL RCI CG routines use the values of all parameters returned by the routine
dcg_init. Advanced users can skip this step and set these parameters directly in the
corresponding routines.
WARNING. Users can modify the contents of these arrays after they are passed to the
solver routine only if they are sure that the values are correct and consistent. Basic check
for correctness and consistency can be done by calling the dcg_check routine, but it
does not guarantee that the method will work correctly.
2400
Input Parameters
n INTEGER. Contains the size of the problem, and the sizes
of arrays x and b.
x DOUBLE PRECISION array of size n. Contains the initial
approximation to the solution vector. Normally it is equal
to 0 or to b.
b DOUBLE PRECISION array of size n. Contains the right-hand
side vector.
Output Parameters
RCI_request INTEGER. Informs about result of work of the routine.
ipar INTEGER array of size 128. Refer to the CG Common
Parameters.
dpar DOUBLE PRECISION array of size 128. Refer to the CG
Common Parameters.
tmp DOUBLE PRECISION array of size (n,4). Refer to the CG
Common Parameters.
Return Values
RCI_request= 0 The routine completes the task normally.
RCI_request= -10000 The routine fails to complete the task.
dcg_check
Checks consistency and correctness of the user
defined data.
Syntax
dcg_check(n, x, b, RCI_request, ipar, dpar, tmp)
Description
2401
The routine dcg_check checks consistency and correctness of the parameters to be passed to
the solver routine dcg. However this operation does not guarantee that the method gives the
correct result. It only reduces the chance of making a mistake in the parameters of the method.
Advanced users may skip this operation if they are sure that the correct data is specified in the
solver parameters.
for correctness and consistency can be done by calling the dcg_check routine, but it
does not guarantee that the method will work correctly.
Note that lengths of all vectors are assumed to have been defined in a previous call to the
dcg_init routine.
Input Parameters
n INTEGER. Contains the size of the problem, and size of
arrays x and b.
to 0 or to b.
side vector.
Output Parameters
Parameters.
Common Parameters.
Common Parameters.
Return Values
RCI_request= 0 The routine completes task normally.
RCI_request= -1100 The routine is interrupted, errors occur.
2402
RCI_request= -1001 The routine returns some warning messages.
RCI_request= -1010 The routine changes some parameters to make them
consistent or correct.
RCI_request= -1011 The routine returns some warning messages and
changes some parameters.
dcg
Computes the approximate solution vector.
Syntax
dcg(n, x, b, RCI_request, ipar, dpar, tmp)
Description
The routine dcg computes the approximate solution vector using the CG method [Young71].
The routine dcg uses the value that was in the vector x before the first call as an initial
approximation to the solution. The parameter RCI_request informs the user about the task
completion and requests results of certain operations that are required by the solver.
dcg_init routine.
Input Parameters
n INTEGER. Contains the size of the problem, and size of
arrays x and b.
approximation to the solution vector.
side vector.
Common Parameters.
Output Parameters
2403
x DOUBLE PRECISION array of size n. Contains the updated

Parameters.
Common Parameters.
Common Parameters.
Return Values
RCI_request=0 The routine completes task normally, the solution
is found and stored in the vector x. This occurs only
if the stopping tests are fully automatic. For the user
defined stopping tests, see the comments to the
RCI_request= 2.
RCI_request=-1 The routine is interrupted because the maximal
number of iterations is reached, but the relative
stopping criterion is not met - this situation occurs
only if both tests are requested by the user.
RCI_request=-2 The routine is interrupted because attempt to divide
by zero occurs. This situation happens if the matrix
is (almost) non-positive definite.
RCI_request=- 10 The routine is interrupted because the residual norm
is invalid. Probably, the value dpar(6) was altered
outside of the routine, or the dcg_check routine
was not called.
RCI_request=-11 The routine is interrupted because it enters the
infinite cycle. Probably, the values ipar(8),
ipar(9), ipar(10) were altered outside of the
routine, or the dcg_check routine was not called.
RCI_request= 1 Requests the user to multiply the matrix by
tmp(1:n,1), put the result in the tmp(1:n,2), and
return the control back to the routine dcg.
2404
RCI_request= 2 Requests the user to perform the stopping tests. If
they fail, the control must be returned back to the
dcg routine. Otherwise, the solution is found and
stored in the vector x.
RCI_request= 3 Requests the user to apply the preconditioner to
tmp(:, 3), put the result in the tmp(:, 4), and
return the control back to the routine dcg.
dcg_get
Syntax
dcg_get(n, x, b, RCI_request, ipar, dpar, tmp, itercount)
Description
The routine dcg_get retrieves the current iteration number of the solutions process.
Input Parameters
of arrays x and b.
approximation vector to the solution.
side vector.
RCI_request INTEGER. This parameter is not used.
Parameters.
Common Parameters.
Common Parameters.
2405
Output Parameters
itercount INTEGER argument. Contains the value of the current
iteration number.
Return Values
The routine dcg_get does not return any value.
dcgmrhs_init
Initializes the RCI CG solver with MHRS.
Syntax
dcgmrhs_init(n, x, nrhs, b, method, RCI_request, ipar, dpar, tmp)
Description
The routine dcgmrhs_init initializes the solver. After initialization all subsequent invocations
of the Intel MKL RCI CG with multiple right hand sides (MRHS) routines use the values of all
parameters that are returned by dcgmrhs_init. Advanced users may skip this step and set
the values to these parameters directly in the corresponding routines.
for correctness and consistency can be done by calling the dcgmrhs_check routine, but
it does not guarantee that the method will work correctly.
NOTE. To use this routine with the name dcg_init, switch on the compiler preprocessor
and include the files mkl_solver.h for C/C++, or mkl_solver.f77 for FORTRAN.
Input Parameters
of arrays x and b.
2406
x DOUBLE PRECISION matrix of size n-by-nrhs. Contains the
initial approximation to the solution vectors. Normally it is
equal to 0 or to b.
nrhs INTEGER. This parameter sets the number of right-hand
sides.
b DOUBLE PRECISION matrix of size (nrhs, n). Contains
the right-hand side vectors.
method INTEGER. Specifies the method of solution:
1 - CG with multiple right hand sides; default value
2 - Block-CG (not supported now).
Output Parameters
ipar INTEGER array of size (128+2*nrhs). Refer to the CG
Common Parameters.
dpar DOUBLE PRECISION array of size (128+2*nrhs). Refer to
the CG Common Parameters.
tmp DOUBLE PRECISION array of size (n, 3+nrhs). Refer to
Return Values
dcgmrhs_check
defined data.
Syntax
dcgmrhs_check(n, x, nrhs, b, RCI_request, ipar, dpar, tmp)
Description
2407
The routine dcgmrhs_check checks consistency and correctness of the parameters to be passed
to the solver routine dcgmrhs. However this operation does not guarantee that the method
gives the correct result. It only reduces the chance of making a mistake in the parameters of
the method. Advanced users may skip this operation if they are sure that the correct data is
specified in the solver parameters.
for correctness and consistency can be done by calling the dcgmrhs_check routine, but
dcgmrhs_init routine.
NOTE. To use this routine with the name dcg_check, switch on the compiler preprocessor
Input Parameters
of arrays x and b.
x DOUBLE PRECISION matrix of size n by nrhs. Contains the
initial approximation to the solution vectors. Normally it is
equal to 0 or to b.
sides.
b DOUBLE PRECISION matrix of size (nrhs,n). Contains the
right-hand side vectors.
Output Parameters
Common Parameters.
2408
Return Values
dcgmrhs
Computes the approximate solution vectors.
Syntax
dcgmrhs(n, x, nrhs, b, RCI_request, ipar, dpar, tmp)
Description
The routine dcgmrhs computes the approximate solution vectors using the CG with multiple
right hand sides (MRHS) method [Young71]. The routine dcgmrhs uses the value that was in
the x before the first call as an initial approximation to the solution. The parameter RCI_request
informs the user about task completion status and requests results of certain operations that
are required by the solver.
dcgmrhs_init routine.
NOTE. To use this routine with the name dcg, the user must switch on the compiler's
preprocessor and include the files mkl_solver.h for C/C++, or mkl_solver.f77 for
FORTRAN.
2409
Input Parameters
of arrays x and b.
initial approximation to the solution vectors.
sides.
right-hand side vectors.
Output Parameters
updated approximation to the solution vectors.
Common Parameters.
Return Values
RCI_request=0 The routine completes task normally, the solution
is found and stored in the matrix x. This occurs only
if the stopping tests are fully automatic. For the user
defined stopping tests, see the comments to the
RCI_request= 2.
RCI_request=-1 The routine is interrupted because the maximal
stopping criterion is not satisfied. This situation
occurs only if both tests are requested by the user.
2410
RCI_request=-2 The routine is interrupted because attempt to divide
by zero occurs. This happens if the matrix is (almost)
non-positive definite.
RCI_request=- 10 The routine is interrupted because the residual norm
is invalid. Probably, the value dpar(6) was altered
outside of the routine, or the routine dcgm-
rhs_check was not called.
RCI_request=-11 The routine is interrupted because it enters the
routine, or the routine dcgmrhs_check was not
called.
tmp(1:n,1), put the result in the tmp(1:n,2), and
return the control back to the routine dcgmrhs.
they fail, the control must be returned back to the
routine dcgmrhs. Otherwise, the solution is found
and stored in the matrix x.
RCI_request= 3 Requests the user to apply the preconditioner to
tmp(:,3+ipar(3)), put the result in tmp(:,3),
and return the control back to the routine dcgmrhs.
dcgmrhs_get
Syntax
dcgmrhs_get(n, x, b, RCI_request, ipar, dpar, tmp, itercount)
Description
The routine dcgmrhs_get retrieves the current iteration number of the solving process.
NOTE. To use this routine with the name dcg_get, switch on the compiler preprocessor
2411
Input Parameters
of arrays x and b.
initial approximation to the solution vectors.
sides.
right-hand side .
RCI_request INTEGER. This parameter is not used.
Common Parameters.
Output Parameters
iteration number.
Return Values
The routine dcgmrhs_get does not return any value.
dfgmres_init
Syntax
dfgmres_init(n, x, b, RCI_request, ipar, dpar, tmp)
Description
2412
The routine dfgmres_init initializes the solver. After initialization all subsequent invocations
of Intel MKL RCI FGMRES routines use the values of all parameters that are returned by dfgm-
res_init. Advanced users may skip this step and set the values to these parameters directly
in the corresponding routines.
for correctness and consistency can be done by calling the dfgmres_check routine, but
Input Parameters
of arrays x and b.
to 0 or to b.
side vector.
Output Parameters
ipar INTEGER array of size 128. Refer to the FGMRES Common
Parameters.
dpar DOUBLE PRECISION array of size 128. Refer to the FGMRES
Common Parameters.
tmp DOUBLE PRECISION array of size
((2*ipar(15)+1)*n+ipar(15)*ipar(15)+9)/2 + 1).
Refer to the FGMRES Common Parameters.
Return Values
2413
dfgmres_check
defined data.
Syntax
dfgmres_check(n, x, b, RCI_request, ipar, dpar, tmp)
Description
The routine dfgmres_check checks consistency and correctness of the parameters to be passed
to the solver routine dfgmres. However this operation does not guarantee that the method
gives the correct result. It only reduces the chance of making a mistake in the parameters of
the method. Advanced users may skip it if they are sure that the correct data is specified in
the solver parameters.
for correctness and consistency can be done by calling the dfgmres_check routine, but
dfgmres_init routine.
Input Parameters
of arrays x and b.
to 0 or to b.
side vector.
Output Parameters
2414
Parameters.
Common Parameters.
((2*ipar(15)+1)*n+ipar(15)*ipar(15)+9)/2 + 1).
Return Values
dfgmres
Makes the FGMRES iterations.
Syntax
dfgmres(n, x, b, RCI_request, ipar, dpar, tmp)
Description
The routine dfgmres performs the FGMRES iterations [Saad03]. The routine dfgmres uses the
value that was in the vector x before the first call as an initial approximation to the solution.
To update the current approximation to the solution, the dfgmres_get routine must be called.
The RCI FGMRES iterations can be continued after the call to the dfgmres_get routine only if
the value of the parameter ipar(13) is not equal to 0 (default value). Note that the updated
solution overwrites the right hand side in the vector b if the parameter ipar(13) is positive,
and the restarted version of the FGMRES method can not be run. If the right hand side should
be kept, it must be saved in a different memory location before the first call to the dfgmres_get
routine with a positive ipar(13).
2415
The parameter RCI_request informs the user about the task completion and requests results
of certain operations that are needed to the solver.
Note that lengths of all the vectors are assumed to have been defined in a previous call to the
dfgmres_init routine.
Input Parameters
of arrays x and b.
side vector.
((2*ipar(15)+1)*n+ipar(15)*ipar(15)+9)/2 + 1).
Output Parameters
Parameters.
Common Parameters.
((2*ipar(15)+1)*n+ipar(15)*ipar(15)+9)/2 + 1).
Return Values
RCI_request= 0 The routine completes task normally, the solution
is found. This occurs only if the stopping tests are
fully automatic. For the user defined stopping tests,
see the comments to the RCI_request= 2 or 4.
RCI_request= -1 The routine is interrupted because the maximal
stopping criterion is not met. This situation occurs
only if both tests are requested by the user.
2416
RCI_request= -10 The routine is interrupted because attempt to divide
by zero occurs. Normally it happens if the matrix is
(almost) degenerate. However it may happen if the
parameter dpar is altered by mistake, or if the
method is not stopped when the solution is found.
RCI_request= -11 The routine is interrupted because it enters the
routine, or the dfgmres_check routine was not
called.
RCI_request= -12 The routine is interrupted because errors are found
in the method parameters. Normally this happens
if the parameters ipar and dpar are altered by
mistake outside the routine.
tmp(ipar(22)), put the result in the
tmp(ipar(23)), and return the control back to the
routine dfgmres.
they fail, the control back must be returned to the
dfgmres routine. Otherwise, the FGMRES solution
is found, and the fgmres_get routine can be run to
update the computed solution in the vector x.
RCI_request= 3 Requests the user to apply the inverse preconditioner
to ipar(22), put the result in the ipar(23), and
return the control back to the routine dfgmres.
RCI_request= 4 Requests the user to check the norm of the currently
generated vector. If it is not zero within the
computational/rounding errors, the control must be
returned back to the dfgmres routine. Otherwise,
the FGMRES solution is found, and the dfgmres_get
routine can be run to update the computed solution
in the vector x.
2417
dfgmres_get
Retrieves the number of the current iteration and
updates the solution.
Syntax
dfgmres_get(n, x, b, RCI_request, ipar, dpar, tmp, itercount)
Description
The routine dfgmres_get retrieves the current iteration number of the solution process and
updates the solution according to the computations performed by the dfgmres routine. To
retrieve the current iteration number only, set the parameter ipar(13)= -1 beforehand. This
is normally recommended to do to proceed further with the computations. If the intermediate
solution is needed, the method parameters must be set properly, see for details FGMRES
Common Parameters and Iterative Sparse Solver Code Examples section in the Appendix C.
Input Parameters
of the arrays x and b.
Parameters.
Common Parameters.
((2*ipar(15)+1)*n+ipar(15)*ipar(15)+9)/2 + 1).
Output Parameters
x DOUBLE PRECISION array of size n. If ipar(13)= 0, it
contains the updated approximation to the solution according
to the computations done in dfgmres routine. Otherwise,
it is not changed.
2418
b DOUBLE PRECISION array of size n. If ipar(13)> 0, it
contains the updated approximation to the solution according
to the computations done in dfgmres routine. Otherwise,
it is not changed.
iteration number.
Return Values
RCI_request= -12 The routine is interrupted because errors are found
in the method parameters. Normally this happens
if the parameters ipar and dpar are altered by
mistake outside the routine.
Implementation Details
Several aspects of the Intel MKL RCI ISS interface are platform-specific and language-specific.
To promote portability across platforms and ease of use across different languages, include
one of the Intel MKL RCI ISS language-specific header files. Currently, there is one language
specific header file for C programs.
This language-specific header file defines function prototypes and they are the following:
void dcg_init(int *n, double *x, double *b, int *rci_request, int *ipar,
double dpar, double *tmp);
void dcg_check(int *n, double *x, double *b, int *rci_request, int *ipar,
void dcg(int *n, double *x, double *b, int *rci_request, int *ipar, double
dpar, double *tmp);
void dcg_get(int *n, double *x, double *b, int *rci_request, int *ipar,
double dpar, double *tmp, int *itercount);
void dcgmrhs_init(int *n, double *x, int *nRhs, double *b, int *method, int
*rci_request, int *ipar, double dpar, double *tmp);
void dcgmrhs_check(int *n, double *x, int *nRhs, double *b, int *rci_request,
int *ipar, double dpar, double *tmp);
2419
void dcgmrhs(int *n, double *x, int *nRhs, double *b, int *rci_request, int
*ipar, double dpar, double *tmp);
void dcgmrhs_get(int *n, double *x, int *nRhs, double *b, int *rci_request,
int *ipar, double dpar, double *tmp, int *itercount);
void dfgmres_init(int *n, double *x, double *b, int *rci_request, int *ipar,
void dfgmres_check(int *n, double *x, double *b, int *rci_request, int *ipar,
void dfgmres(int *n, double *x, double *b, int *rci_request, int *ipar,
void dfgmres_get(int *n, double *x, double *b, int *rci_request, int *ipar,
double dpar, double *tmp, int *itercount);
NOTE. Intel MKL does not support the RCI ISS interface without the language specific
header file included.
Preconditioners based on Incomplete LU Factorization

Technique
Usually, preconditioners, or accelerators are used to accelerate an iterative solution process.
In some cases, their use can reduce the number of iterations dramatically and thus lead to
better solver performance. Although the terms ‘preconditioner’ and ‘accelerator’ are synonyms,
hereafter only ‘preconditioner’ is used.
Currently, Intel MKL provides two preconditioners, ILU0 and ILUT, for the sparse matrix
presented only in the format accepted in the Intel MKL direct sparse solvers (three-array
variation of the CSR storage format, see Sparse Matrix Storage Format section). The used
algorithms are described in [Saad03].
The ILU0 preconditioner is based on a well-known factorization of the original matrix into a
product of two triangular matrices: low triangular and upper triangular matrices. Usually, such
decomposition leads to some fill-in in the resulting matrix structure in comparison with the
original matrix. The distinctive feature of the ILU0 preconditioner is that it preserves the structure
of the original matrix in the result.
2420
Unlike the ILU0 preconditioner, the ILUT preconditioner preserves some resulting fill-in in the
preconditioner matrix structure. The distinctive feature of the ILUT preconditioner is that it
saves the resulting entry of the preconditioner if the entry satisfies two conditions
simultaneously: the value of the entry is greater than the product of the given tolerance and
matrix row norm, and the entry is in the given bandwidth of the resulting preconditioner matrix.
Both ILU0 and ILUT preconditioners can apply to any non-degenerate matrix. They can be used
alone or together with the Intel MKL RCI FGMRES solver (see Sparse Solver Routines). Avoid
using this preconditioners with MKL RCI CG solver because in general, they produce
non-symmetric resulting matrix even if the original matrix is symmetric. Usually, an inverse of
the preconditioner is required in this case. To do this the Intel MKL triangular solver routine
mkl_dcsrtrsv must be applied twice: for the low triangular part of the preconditioner, and
then for its upper triangular part.
NOTE. Although ILU0 and ILUT preconditioners apply to any non-degenerate matrix, in
some cases the algorithm may fail to ensure successful termination and the required
result. The result of the preconditioner routine can be checked only in practice.
A preconditioner may increase the number of iterations for an arbitrary case of the system
and the initial guess, and even ruin the convergence. It is user's responsibility to carefully
use a suitable preconditioner.
General Scheme of Using ILUT and RCI FGMRES Routines

The scheme is the same for both preconditioners. Some differences exist in the calling
parameters of the preconditioners and in the subsequent call of two triangular solvers. All these
differences can be seen in the example codes for both preconditioners in the Preconditioner
Code Examples in the Appendix C.
The following pseudocode shows the general scheme of using the ILUT preconditioner in the
RCI FGMRES context.
...
generate matrix A
call dfgmres_init(n, x, b, RCI_request, ipar, dpar, tmp)
call dcsrilut(n, a, ia, ja, bilut, ibilut, jbilut, tol, maxfil, ipar, dpar,
ierr)
call dfgmres_check(n, x, b, RCI_request, ipar, dpar, tmp)
2421
1 call dfgmres(n, x, b, RCI_request, ipar, dpar, tmp)

multiply the matrix A by tmp(ipar(22)) and put the result in tmp(ipar(23))
goto 1
endif
go to 1
else
c stop FGMRES iterations.
goto 2
endif
endif
c Below, trvec is an intermediate vector of length at least n
c Here is the recommended use of the result produced by the ILUT routine.
c via standard Intel MKL Sparse Blas solver routine mkl_dcsrtrsv.
call mkl_dcsrtrsv('L','N','U', n, bilut, ibilut, jbilut,
tmp(ipar(22)),trvec)
call mkl_dcsrtrsv('U','N','N', n, bilut, ibilut, jbilut, trvec,
tmp(ipar(23)))
goto 1
endif
2422
check the norm of the next orthogonal vector, it is contained in dpar(7)
if (the norm is not zero up to rounding/computational errors) then
goto 1
else
goto 2
endif
endif
2 call dfgmres_get(n, x, b, RCI_request, ipar, dpar, tmp, itercount)
ILU0 and ILUT Preconditioners Interface Description

The concepts required to understand the use of the Intel MKL preconditioner routines are
discussed in the Linear Solvers Basics.
In this section the FORTRAN style notations are used.
All types in this documentation refer to the standard Fortran types, INTEGER, and DOUBLE
PRECISION.
C and C++ programmers should refer to the section Calling Sparse Solver Routines From C/C++
for information on mapping Fortran types to C/C++ types.
User Data Arrays

The preconditioner routines take arrays of user data as input, for example, user arrays
representing the original matrix. To minimize storage requirements and improve overall run-time
efficiency, the Intel MKL preconditioner routines do not make copies of the user input arrays.
2423
Common Parameters
The preconditioners have parameters for passing various options to the routine. The values for
these parameters should be specified very carefully. These values can be changed during
computations according to the user's needs.
Some parameters are common with the FGMRES Common Parameters. Only the routine
dfgmres_init specifies their default and initial values. However, some parameters can be
redefined with other values. These parameters are listed below.
For the ILU0 preconditioner:

ipar(2) - specifies the destination of error messages generated by the ILU0 routine. The
default value 6 means that all error messages are displayed on the screen. Otherwise the error
messages are written to the newly created file MKL_PREC_log.txt. Note if the parameter
ipar(6) is set to 0, then error messages are not generated at all.
ipar(6) - if its value is not equal to 0, the ILU0 routine returns error messages in accordance
with the parameter ipar(2). Otherwise, the routine does not generate error messages at all,
but returns a negative value of the parameter ierr. The default value is 1.
For the ILUT preconditioner:

ipar(2) - specifies the destination of error messages generated by the ILUT routine. The
default value 6 means that all messages are displayed on the screen. Otherwise the error
messages are written to the newly created file MKL_PREC_log.txt. Note if the parameter
ipar(6) is set to 0, error messages are not generated at all.
ipar(6) - if its value is not equal to 0, the ILUT routine returns error messages in accordance
with the parameter ipar(2). Otherwise, the routine does not generate error messages at all,
but returns a negative value of the parameter ierr. The default value is 1.
ipar(7) - if its value is greater than 0, the ILUT routine generates warning messages in
accordance with the parameter ipar(2)and continues calculations. Otherwise, if its value is
equal to 0, the routine returns a positive value of the parameter ierr, if its value is less than
0, the routine generates a warning message in accordance with the parameter ipar(2)and
returns a positive value of the parameter ierr. The default value is 1.
NOTE. Users must provide correct and consistent parameters to the routine to avoid
fails or wrong results.
2424
dcsrilu0
ILU0 preconditioner based on incomplete LU
factorization of a sparse matrix in the CSR format
(PARDISO variation).
Syntax
Fortran:
call dcsrilu0(n, a, ia, ja, bilu0, ipar, dpar, ierr)
C:
dcsrilu0(&n, a, ia, ja, bilu0, ipar, dpar, &ierr);
Input Parameters
n INTEGER. Size (number of rows or columns) of the original
square n-by-n-matrix A.
a DOUBLE PRECISION. Array containing the set of elements
of the matrix A. Its length is equal to the number of non-zero
elements in the matrix A. Refer to values array description
in the Sparse Matrix Storage Format section for more details.
ia INTEGER. Array of size (n+1) containing begin indices of
rows of matrix A such that ia(I) is the index in the array
A of the first non-zero element from the row I. The value
of the last element ia(n+1) is equal to the number of
non-zeros in the matrix A plus one. Refer to the rowIndex
array description in the Sparse Matrix Storage Format
section for more details.
non-zero element of the matrix A. Its size is equal to the
size of the array a. Refer to the columns array description
in the Sparse Matrix Storage Format section for more details.
NOTE. Column indices should be put in increasing

order for each row of matrix.
2425
ipar INTEGER array of size 128. This parameter is used to specify

the integer set of data for both the ILU0 and RCI FGMRES
computations. Refer to the ipar array description in the
FGMRES Common Parameters for more details on FGMRES
parameter entries. The entries that are specific to ILU0 are
listed below.
ipar(31) - specifies how the routine operates if
zero diagonal element occurs during
calculation. If this parameter is set to 0
(default value set by the routine dfgm-
res_init) then that the calculations are
stopped and the routine returns non-zero
error value. Otherwise the value of the
diagonal element is set to the specified
value and the calculations are continued.
NOTE. Advanced users can define

this array in the code as follows:
INTEGER ipar(31). However, to
guarantee the compatibility with the
future releases of the Intel MKL it is
highly recommended to declare the
array ipar of length 128.
dpar DOUBLE PRECISION array of size 128. This parameter is

used to specify the double precision set of data for both the
ILU0 and RCI FGMRES computations. Refer to the dpar
array description in the FGMRES Common Parameters for
more details on FGMRES parameter entries. The entries that
are specific to ILU0 are listed below.
dpar(31) - specifies the small value that is
compared with the diagonal elements
during calculations; if the value of the
diagonal element is smaller, then it is set
2426
to dpar(32), or the calculations are
stopped, in accordance with ipar(31);
the default value is 1.0D-16
NOTE. This parameter can be set

to the negative value, because its
absolute value is actually used in
calculations.
If this parameter is set to 0, the
comparison with the diagonal
element is not performed.
dpar(32) - specifies the value that is assigned to

the diagonal element if its value is less
than dpar(31) (see above); the default
value is 1.0D-10
NOTE. Advanced users can define

this array in the code as follows:
DOUBLE PRECOSIONdpar(32).
However, to guarantee the
compatibility with the future releases
of the Intel MKL it is highly
recommended to declare the array
dpar of length 128.
Output Parameters
bilu0 DOUBLE PRECISION. Array B stored in the PARDISO CSR
format containing non-zero elements of the resulting
preconditioning matrix. Its size is equal to the number of
non-zero elements in the matrix A. Refer to the values
2427
ierr INTEGER. Error flag, informs about the routine completion

status.
NOTE. To present the resulting preconditioning matrix in the CSR format the arrays ia
(row indices) and ja (column indices) of the input matrix must be used.
Description
The routine dcsrilu0 computes a preconditioner B [Saad03] of a given sparse matrix A stored
in the CSR format accepted in PARDISO:
A~B=L*U , where L is a low triangular matrix with unit diagonal, U is an upper triangular matrix
with non-unit diagonal, and the portrait of the original matrix A is used to store the incomplete
factors L and U.
Return Values
ierr=0 The routine completed task normally.
ierr=-101 The routine is interrupted, the error occurs: at least
one diagonal element is omitted from the matrix in
CSR format (see Sparse Matrix Storage Format).
ierr=-102 The routine is interrupted because the matrix
contains zero diagonal element, routine can not
perform operations.
ierr=-103 The routine is interrupted as the matrix contains too
small diagonal element, and an overflow may occur
because of the division by its value required to
complete the task, or a bad approximation to ILU0
with use of this element will be computed.
ierr=-104 The routine is interrupted because the memory is
insufficient for the internal work array.
ierr=-105 The routine is interrupted because the input matrix
size n isless than or equal to 0.
ierr=-106 The routine is interrupted because the the column
indices ja are placed in not increasing order.
2428
Interfaces
FORTRAN 77 and Fortran 95:

SUBROUTINE dcsrilu0 (n, a, ia, ja, bilu0, ipar, dpar, ierr)
INTEGER n, ierr, ipar(128)
DOUBLE PRECISION a(*), bilu0(*), dpar(128)
C:
void dcsrilu0 (int *n, double *a, int *ia, int *ja, double *bilu0, int *ipar, double *dpar,
int *ierr);
dcsrilut
ILUT preconditioner based on the incomplete LU
factorization with a threshold of a sparse matrix.
Syntax
Fortran:
call dcsrilut(n, a, ia, ja, bilut, bilut, ibilut, jbilut, tol, maxfil, ipar,
dpar, ierr)
C:
dcsrilut(&n, a, ia, ja, bilut, bilut, ibilut, jbilut, &tol, &maxfil, ipar,
dpar, &ierr);
Description
The routine dcsrilut computes a preconditioner B [Saad03] of a given sparse matrix A stored
in the format accepted in the direct sparse solvers:
A~B=L*U , where L is a low triangular matrix with unit diagonal, U is an upper triangular matrix
with non-unit diagonal.
The following threshold criteria are used to generate the incomplete factors L and U:
2429
1) the resulting entry must be greater than the matrix current row norm multiplied by the
parameter tol, and
2) the number of the non-zero elements in each row of the resulting L and U factors must not
be greater than the value of the parameter maxfil.
Input Parameters
n INTEGER. Size (number of rows or columns) of the original
square n-by-n matrix A.
a DOUBLE PRECISION. Array containing all non-zero elements
of the matrix A. The length of the array is equal to their
number. Refer to values array description in the Sparse
Matrix Storage Format section for more details.
ia INTEGER. Array of size (n+1) containing indices of non-zero
elements in the array A. ia(I)is the index of the first
non-zero element from the row I. The value of the last
element ia(n+1) is equal to the number of non-zeros in
the matrix A plus one. Refer to the rowIndex array
description in the Sparse Matrix Storage Format section for
more details.
ja INTEGER. Array of size equal to the size of the array a. This
array contains the column numbers for each non-zero
element of the matrix A. Refer to the columns array
more details.
NOTE. Column numbers must be in increasing order

for each row of matrix.
tol DOUBLE PRECISION. Tolerance for threshold criterion for

the resulting entries of the preconditioner.
maxfil INTEGER. Maximum fill-in, half of the preconditioner
bandwidth. The number of non-zero elements in the rows
of the preconditioner can not exceed (2*maxfil+1).
2430
ipar INTEGER array of size 128. This parameter is used to specify
the integer set of data for both the ILUT and RCI FGMRES
computations. Refer to the ipar array description in the
FGMRES Common Parameters for more details on FGMRES
parameter entries. The entries specific to ILUT are listed
below.
ipar(31) - specifies how the routine operates if the
value of the computed diagonal element
is less than the current matrix row norm
multiplied by the value of the parameter
tol. If ipar(31) = 0, then the
calculation is stopped and the routine
returns non-zero error value. Otherwise,
the value of the diagonal element is set
to the certain value (see description of
the parameter dpar(31) below), and the
calculation continues.
NOTE. There is no default value for

ipar(31) entry even if the
preconditioner is used within the RCI
ISS context. Always set the value of
this entry.
NOTE. Advanced users can define this array in the

code as INTEGER ipar(31). However, to guarantee
the compatibility with the future releases of the Intel
MKL, declare the array ipar of length 128.
dpar DOUBLE PRECISION array of size 128. This parameter

specifies the double precision set of data for both ILUT and
RCI FGMRES computations. Refer to the dpar array
description in the FGMRES Common Parameters for more
details on FGMRES parameter entries. The entries that are
specific to ILUT are listed below.
2431
dpar(31) - specifies the value that being multiplied

by the matrix row norm is assigned to
those diagonal elements whose value is
less than the matrix row norm multiplied
by the value of the parameter tol, if the
calculation is not stopped in accordance
with ipar(31).
NOTE. There is no default value for

dpar(31) entry even if the
preconditioner is used within RCI ISS
context. Always set the value of this
entry.
NOTE. Advanced users can define this array in the

code as DOUBLE PRECISION dpar(31). However,
to guarantee the compatibility with the future
releases of the Intel MKL, declare the array dpar of
length 128.
Output Parameters
bilut DOUBLE PRECISION. Array B containing non-zero elements
of the resulting preconditioning matrix, stored format
accepted in the direct sparse solvers. Refer to the values
section for more details. The size of the array is equal to
(2*maxfil+1)*n-maxfil*(maxfil+1)+1.
NOTE. Provide enough memory for this array before

calling the routine. Otherwise, the routine may fail
to complete successfully with a correct result.
2432
ibilut INTEGER. Array of size (n+1) containing indices of non-zero
elements in the array B. ibilut(I) is the index of the first
non-zero element from the row I. The value of the last
element ibilut(n+1) is equal to the number of non-zeros
in the matrix B plus one. Refer to the rowIndex array
more details..
jbilut INTEGER. Array, its size is equal to the size of the array
bilut. This array contains the column numbers for each
non-zero element of the matrix B. Refer to the columns
ierr INTEGER. Error flag, informs about the routine completion.
Return Values
ierr=0 The routine completes the task normally.
ierr=-101 The routine is interrupted, the error occurs: the
number of elements in some matrix row specified
in the sparse format is equal to or less than 0.
ierr=-102 The routine is interrupted because the value of the
computed diagonal element is less than the product
of the given tolerance and the current matrix row
norm, and it cannot be replaced as ipar(31)=0.
ierr=-103 The routine is interrupted, the error occurs: the
element ia(I+1) is less than or equal to the
element ia(I) (see Sparse Matrix Storage Format
section).
ierr=-104 The routine is interrupted because the memory is
insufficient for the internal work arrays.
ierr=-105 The routine is interrupted because the input value
of maxfil is less than 0.
ierr=-106 The routine is interrupted because the size n of the
input matrix is less than 0.
ierr=-107 The routine is interrupted, the error occurs: an
element of the array ja is less than 0, or greater
than n (see Sparse Matrix Storage Format section).
2433
ierr=101 The value of maxfil is greater than or equal to n.

The calculation is performed with the value of
maxfil set to (n-1).
ierr=102 The value of tol is less than 0. The calculation is
performed with the value of the parameter set to
(-tol)
ierr=103 The absolute value of tol is greater than value of
dpar(31); it can result in instability of the
calculation.
ierr=104 The value of dpar(31) is equal to 0. It can cause
the fail of the calculations.
Interfaces
FORTRAN 77 and Fortran 95:

SUBROUTINE dcsrilut (n, a, ia, ja, bilut, ibilut, jbilut, tol, maxfil, ipar, dpar, ierr)
INTEGER n, ierr, ipar(*), maxfil
INTEGER ia(*), ja(*), ibilut(*), jbilut(*)
DOUBLE PRECISION a(*), bilut(*), dpar(*), tol
C:
void dcsrilut (int *n, double *a, int *ia, int *ja, double *bilut, int *ibilut, int *jbilut,
double *tol, int *maxfil, int *ipar, double *dpar, int *ierr);
Calling Sparse Solver and Preconditioner Routines from

C/C++
The calling interface for all of the Intel MKL sparse solver and preconditioner routines is designed
to be used easily from FORTRAN 77 or Fortran 90. However, any of these routines can be
invoked directly from C or C++ if users are familiar with the inter-language calling conventions
of their platforms. These conventions include, but are not limited to, the argument passing
mechanisms for the language, the data type mappings from Fortran to C/C++, and the platform
specific method of decoration for Fortran external names.
2434
To promote portability, the C header files provide a set of macros and type definitions intended
to hide the inter-language calling conventions and provide an interface to the Intel MKL sparse
solver routines that appears natural for C/C++.
For example, consider a hypothetical library routine foo that takes a real vector of length n,
and returns an integer status. Fortran users would access such a function as:
INTEGER n, status, foo
REAL x(*)
status = foo(x, n)
As noted above, to invoke foo, C users need to know what C data types correspond to Fortran
types INTEGER and REAL; what argument passing mechanism the Fortran compiler uses; and
what, if any, name decoration the Fortran compiler performs when generating the external
symbol foo.
However, by using the C specific header file, for example mkl_solver.h, the invocation of
foo, within a C program would look as follows:
#include "mkl_solver.h"
_INTEGER_t i, status;
_REAL_t x[];
status = foo( x, i );
Note that in the above example, the header file mkl_solver.h provides definitions for the
types _INTEGER_t and _REAL_t that correspond to the Fortran types INTEGER and REAL.
To simplify calling of the Intel MKL sparse solver routines from C and C++, the following
approach of providing C definitions of Fortran types is used: if an argument or a result from a
sparse solver is documented as having the Fortran language specific type XXX, then the C and
C++ header files provide an appropriate C language type definitions for the name _XXX_t.
Caveat for C Users

One of the key differences between C/C++ and Fortran is the argument passing mechanisms
for the languages: Fortran programs use pass-by-reference semantics and C/C++ programs
use pass-by-value semantics. In the above example, the header file mkl_solver.h attempts
to hide this difference, by defining a macro foo, which takes the address of the appropriate
arguments. For example, on the Tru64 UNIX* operating system mkl_solver.h defines the
macro as follows:
#define foo(a,b) foo_((a), &(b))
2435
Note how constants are treated when using the macro form of foo. foo( x, 10 ) is converted
into foo_( x, &10 ). In a strictly ANSI compliant C compiler, taking the address of a constant
is not permitted, so a strictly conforming program would look like:
INTEGER_t iTen = 10;
_REAL_t * x;
status = foo( x, iTen );
However, some C compilers in a non-ANSI compliant mode enable taking the address of a
constant for ease of use with Fortran programs. The form foo( x, 10 ) is acceptable for such
compilers.
2436
Vector Mathematical Functions 9
This chapter describes Intel® MKL Vector Mathematical Functions Library (VML), which is designed to
compute mathematical functions on vector arguments. VML is an integral part of Intel MKL and the VML
terminology is used here for simplicity in discussing this group of functions.
VML includes a set of highly optimized implementations of certain computationally expensive core
mathematical functions (power, trigonometric, exponential, hyperbolic etc.) that operate on vectors of
real and complex numbers.
Application programs that might significantly improve performance with VML include nonlinear programming
software, integrals computation, and many others.
VML functions are divided into the following groups according to the operations they perform:
• VML Mathematical Functions compute values of mathematical functions (such as sine, cosine,
exponential, logarithm and so on) on vectors with unit increment indexing.
• VML Pack/Unpack Functions convert to and from vectors with positive increment indexing, vector
indexing and mask indexing (see Appendix B for details on vector indexing methods).
• VML Service Functions allow the user to set/get the accuracy mode, and set/get the error code.
VML mathematical functions take an input vector as argument, compute values of the respective function
element-wise, and return the results in an output vector.
All the Intel MKL VML functions are declared in mkl_vml.f77 for FORTRAN 77 interface, in mkl_vml.fi
for Fortran 90 interface, and in mkl_vml_functions.h for C interface.
Examples that demonstrate usage of the Vector Mathematical functions are available in the following
directories:
${MKL}/examples/vmlc/source
${MKL}/examples/vmlf/source.
Data Types and Accuracy Modes

Mathematical and pack/unpack vector functions in VML have been implemented for vector arguments
of single and double precision real data. Both Fortran- and C-interfaces to all functions, including
VML service functions, are provided in the library. The differences in naming and calling the functions
for Fortran- and C-interfaces are detailed in the Function Naming Conventions section below.
Each vector function from VML (for each data format) can work in three modes: High Accuracy (HA),
Low Accuracy (LA), and Enhanced Performance (EP). For many functions, using the LA version
improves performance at the cost of slight reduction in accuracy (1-2 least significant bits). In
2437
contrast to the LA accuracy flavor, the EP flavor further enhances the performance at the cost
of significant reduction in accuracy. In both single and double precision, about half bits of
floating-point mantissa are correct. Moreover, subtle argument paths for certain functions (for
example, large arguments in trigonometric functions) may be calculated with even less accuracy.
Despite the fact that default accuracy is HA, LA is more than sufficient in most cases. For certain
applications that are not very demanding for accuracy (for example, media applications, some
Monte Carlo simulations, etc.) you may find the EP accuracy flavor sufficient.
In addition, special value behavior may differ between the HA and LA versions of the functions.
Any information on accuracy behavior can be found in the Intel MKL Release Notes.
Switching between the modes (HA, LA, EP) is accomplished by using vmlSetMode(mode)(see
Table 9-14). The function vmlGetMode() returns the currently used mode. The High Accuracy
mode is used by default.
Function Naming Conventions

Full names of all VML functions include only lowercase letters for Fortran interface, whereas for
C interface names the lowercase letters are mixed with uppercase.
VML mathematical and pack/unpack function full names have the following structure:
v<?><name><mod>
The initial letter v is a prefix indicating that a function belongs to VML.
The <?> field is a precision prefix that indicates the data type:
s REAL for Fortran interface, or float for C interface
d DOUBLE PRECISION for Fortran interface, or double for C interface.
c COMPLEX for Fortran interface, or MKL_Complex8 for C interface.
z DOUBLE COMPLEX for Fortran interface, or MKL_Complex16 for C
interface.
The <name> field indicates the function short name, with some of its letters in uppercase for C
interface (see for example Table 9-2 or Table 9-13).
The <mod> field (written in uppercase for C interface) is present in pack/unpack functions only;
it indicates the indexing method used:
i indexing with positive increment
v indexing with index vector
m indexing with mask vector.
VML service function full names have the following structure:
2438
vml<name>
where vml is a prefix indicating that a function belongs to VML, and <name> is the function
short name, which includes some uppercase letters for C interface (see Table 9-13). To call
VML functions from an application program, use conventional function calls. For example, the
VML exponential function for single precision real data can be called as
call vsexp ( n, a, y ) for Fortran interface, or
vsExp ( n, a, y ); for C interface.
Functions Interface
The interface to VML functions includes function full names and the arguments list. The Fortran-
and C-interface descriptions for different groups of VML functions are given below. Note that
some functions (Div, Pow, and Atan2) have two input vectors a and b as their arguments,
while SinCos function has two output vectors y and z.
VML Mathematical Functions

Fortran:
call v<name>( n, a, y )
call v<name>( n, a, b, y )
call v<name>( n, a, y, z )
C:
v<name>( n, a, y );
v<name>( n, a, b, y );
v<name>( n, a, y, z );
Pack Functions
Fortran:
call vpacki( n, a, inca, y )

call vpackv( n, a, ia, y )
call vpackm( n, a, ma, y )
C:
vPackI( n, a, inca, y );
2439
vPackV( n, a, ia, y );
vPackM( n, a, ma, y );
Unpack Functions
Fortran:
call vunpacki( n, a, y, incy )

call vunpackv( n, a, y, iy )
call vunpackm( n, a, y, my )
C:
vUnpackI( n, a, y, incy );
vUnpackV( n, a, y, iy );
vUnpackM( n, a, y, my );
Service Functions
Fortran:
oldmode = vmlsetmode( mode )

mode = vmlgetmode( )
olderr = vmlseterrstatus ( err )
err = vmlgeterrstatus( )
olderr = vmlclearerrstatus( )
oldcallback = vmlseterrorcallback( callback )
callback = vmlgeterrorcallback( )
oldcallback = vmlclearerrorcallback( )
C:
oldmode = vmlSetMode( mode );

mode = vmlGetMode( void );
olderr = vmlSetErrStatus ( err );
err = vmlGetErrStatus( void );
olderr = vmlClearErrStatus( void );
oldcallback = vmlSetErrorCallBack( callback );
callback = vmlGetErrorCallBack( void );
oldcallback = vmlClearErrorCallBack( void );
2440
Input Parameters
n number of elements to be calculated
a first input vector
b second input vector
inca vector increment for the input vector a
ia index vector for the input vector a
ma mask vector for the input vector a
incy vector increment for the output vector y
iy index vector for the output vector y
my mask vector for the output vector y
err error code
mode VML mode
callback address of the callback function
Output Parameters
y first output vector
z second output vector
err error code
mode VML mode
olderr former error code
oldmode former VML mode
callback address of the callback function
oldcallback address of the former callback function
The data types of the parameters used in each function are specified in the respective function
description section. All VML mathematical functions can perform in-place operations, which
means that the same vector can be used as both input and output parameter. This holds true
for functions with two input vectors as well, in which case one of them may be overwritten with
the output vector. For functions with two output vectors, one of them may coincide with the
input vector. But partially overlapping input and output vectors could lead to unpredictable
results.
2441
Vector Indexing Methods

Current VML mathematical functions work only with unit increment. Arrays with other increments,
or more complicated indexing, can be accommodated by gathering the elements into a contiguous
vector and then scattering them after the computation is complete.
Three following indexing methods are used to gather/scatter the vector elements in VML:
• positive increment
• index vector
• mask vector.
The indexing method used in a particular function is indicated by the indexing modifier (see
the description of the <mod> field in Function Naming Conventions). For more information on
indexing methods see Vector Arguments in VML in Appendix B.
Error Diagnostics
The VML library has its own error handler. The only difference for C- and Fortran- interfaces is
that the Intel MKL error reporting routine XERBLA can be called after the Fortran- interface VML
function encounters an error, and this routine gets information on VML_STATUS_BADSIZE and
VML_STATUS_BADMEM input errors (see Table 9-16).
The VML error handler has the following properties:
• The Error Status (vmlErrStatus) global variable is set after each VML function call. The
possible values of this variable are shown in the Table 9-16.
• Depending on the VML mode, the error handler function invokes:
– errno variable setting. The possible values are shown in the Table 9-1.
– writing error text information to the stderr stream
– raising the appropriate exception on error, if necessary
– calling the additional error handler callback function.
Table 9-1. Set Values of the errno Variable

Value of errno Description
0 No errors are detected.
EINVAL The array dimension is not positive.
EACCES NULL pointer is passed.
2442
Value of errno Description
EDOM At least one of array values is out of a range of definition.
ERANGE At least one of array values caused a singularity, overflow or
underflow.
VML Mathematical Functions

This section describes VML functions which compute values of mathematical functions on real
and complex vector arguments with unit increment.
Each function group is introduced by its short name, a brief description of its purpose, and the
calling sequence for each type of data both for Fortran- and C-interfaces, as well as a description
of the input/output arguments.
For all VML mathematical functions, the input range of parameters is equal to the mathematical
range of definition in the set of defined values for the respective data type. Several VML
functions, specifically Div, Exp, Sinh, Cosh, and Pow, can result in an overflow. For these
functions, the respective input threshold values that mark off the precision overflow are specified
in the function description section. Note that in these specifications, FLT_MAX denotes the
maximum number representable in single precision real data type, while DBL_MAX denotes the
maximum number representable in double precision real data type.
Table 9-2 lists available mathematical functions and data types associated with them.
Table 9-2. VML Mathematical Functions
Function Data Types Description
Arithmetic Functions
v?Add s, d, c, z Addition of the vector elements
v?Sub s, d, c, z Subtraction of the vector elements
v?Sqr s, d Squaring of the vector elements
v?Mul s, d, c, z Multiplication of the vector elements
v?MulByConj c, z Multiplication of elements of one vector by conjugated
elements of the second vector
v?Conj c, z Conjugation of the vector elements
v?Abs s, d, c, z Absolute value of the vector elements
Power and Root Functions
v?Inv s, d Inversion of the vector elements
v?Div s, d, c, z Divide elements of one vector by elements of second vector
v?Sqrt s, d, c, z Square root of vector elements
v?InvSqrt s, d Inverse square root of vector elements
v?Cbrt s, d Cube root of vector elements
2443

v?InvCbrt s, d Inverse cube root of vector elements
v?Pow2o3 s, d Each vector element raised to 2/3
v?Pow3o2 s, d Each vector element raised to 3/2
v?Pow s, d, c, z Each vector element raised to the specified power
v?Powx s, d, c, z Each vector element raised to the constant power
v?Hypot s, d Square root of sum of squares
Exponential and Logarithmic Functions
v?Exp s, d, c, z Exponential of vector elements
v?Expm1 s, d Exponential of vector elements decreased by 1
v?Ln s, d, c, z Natural logarithm of vector elements
v?Log10 s, d, c, z Denary logarithm of vector elements
v?Log1p s, d Natural logarithm of vector elements that are increased by
1
Trigonometric Functions
v?Cos s, d, c, z Cosine of vector elements
v?Sin s, d, c, z Sine of vector elements
v?SinCos s, d Sine and cosine of vector elements
v?CIS c, z Complex exponent of vector elements (cosine and sine
combined to complex value)
v?Tan s, d, c, z Tangent of vector elements
v?Acos s, d, c, z Inverse cosine of vector elements
v?Asin s, d, c, z Inverse sine of vector elements
v?Atan s, d, c, z Inverse tangent of vector elements
v?Atan2 s, d Four-quadrant inverse tangent of elements of two vectors
Hyperbolic Functions
v?Cosh s, d, c, z Hyperbolic cosine of vector elements
v?Sinh s, d, c, z Hyperbolic sine of vector elements
v?Tanh s, d, c, z Hyperbolic tangent of vector elements
v?Acosh s, d, c, z Inverse hyperbolic cosine of vector elements
v?Asinh s, d, c, z Inverse hyperbolic sine of vector elements
v?Atanh s, d, c, z Inverse hyperbolic tangent of vector elements.
Special Functions
v?Erf s, d Error function value of vector elements
v?Erfc s, d Complementary error function value of vector elements
v?CdfNorm s, d Cumulative normal distribution function value of vector
elements
v?ErfInv s, d Inverse error function value of vector elements
2444
v?ErfcInv s, d Inverse complementary error function value of vector
elements
v?CdfNormInv s, d Inverse cumulative normal distribution function value of
vector elements
Rounding Functions
v?Floor s, d Rounding towards minus infinity
v?Ceil s, d Rounding towards plus infinity
v?Trunc s, d Rounding towards zero infinity
v?Round s, d Rounding to nearest integer
v?NearbyInt s, d Rounding according to current mode
v?Rint s, d Rounding according to current mode and raising inexact
result exception
v?Modf s, d Integer and fraction parts
Arithmetic Functions
v?Add
Performs element by element addition of vector a
and vector b.
Syntax
Fortran:
call vsadd( n, a, b, y )
call vdadd( n, a, b, y )
call vcadd( n, a, b, y )
call vzadd( n, a, b, y )
C:
vsAdd( n, a, b, y );
vdAdd( n, a, b, y );
vcAdd( n, a, b, y );
vzAdd( n, a, b, y );
2445
Input Parameters
Name Type Description
n FORTRAN 77: INTEGER Specifies the number of elements to be

calculated.
Fortran 90: INTEGER,
INTENT(IN)
C: const int
a, b FORTRAN 77: REAL for vsadd FORTRAN: Arrays, specify the input
vectors a and b.
DOUBLE PRECISION for vdadd
C: Pointers to arrays that contain the input
COMPLEX for vcadd
vectors a and b.
DOUBLE COMPLEX for vzadd
Fortran 90: REAL, INTENT(IN)
for vsadd
DOUBLE PRECISION, INTENT(IN)
for vdadd
COMPLEX, INTENT(IN) for vcadd
DOUBLE COMPLEX, INTENT(IN)
for vzadd
C: const float* for vsAdd
const double* for vdAdd
const MKL_Complex8* for vcAdd
const MKL_Complex16* for
vzAdd
Output Parameters
y FORTRAN 77: REAL for vsadd FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdadd
2446
COMPLEX, for vcadd C: Pointer to an array that contains the
output vector y.
DOUBLE COMPLEX for vzadd
Fortran 90: REAL, INTENT(OUT)
for vsadd
DOUBLE PRECISION,
INTENT(OUT) for vdadd
COMPLEX, INTENT(OUT) for vcadd
DOUBLE COMPLEX, INTENT(OUT)
for vzadd
C: float* for vsAdd
double* for vdAdd
MKL_Complex8* for vcAdd
MKL_Complex16* for vzAdd
v?Sub
Performs element by element subtraction of vector
b from vector a.
Syntax
Fortran:
call vssub( n, a, b, y )
call vdsub( n, a, b, y )
call vcsub( n, a, b, y )
call vzsub( n, a, b, y )
2447
C:
vsSub( n, a, b, y );
vdSub( n, a, b, y );
vcSub( n, a, b, y );
vzSub( n, a, b, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a, b FORTRAN 77: REAL for vssub FORTRAN: Arrays, specify the input
vectors a and b.
DOUBLE PRECISION for vdsub
COMPLEX for vcsub
vectors a and b.
DOUBLE COMPLEX for vzsub
for vssub
for vdsub
COMPLEX, INTENT(IN) for vcsub
for vzsub
C: const float* for vsSub
const double* for vdSub
const MKL_Complex8* for vcSub
const MKL_Complex16* for vz-
Sub
2448
Output Parameters
y FORTRAN 77: REAL for vssub FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdsub
C: Pointer to an array that contains the
COMPLEX for vcsub
output vector y.
DOUBLE COMPLEX for vzsub
Fortran 90: REALINTENT(OUT)
for vssub
DOUBLE PRECISION,
INTENT(OUT) for vdsub
COMPLEX, INTENT(OUT) for vcsub
for vzsub
C: float* for vsSub
double* for vdSub
MKL_Complex8* for vcSub
MKL_Complex16* for vzSub
v?Sqr
Performs element by element squaring of the
vector.
Syntax
Fortran:
call vssqr( n, a, y )
call vdsqr( n, a, y )
2449
C:
vsSqr( n, a, y );
vdSqr( n, a, y );
Input Parameters
n FORTRAN 77: INTEGER Specifies the number of elements to be calculated.

INTENT(IN)
C: const int
a FORTRAN 77: REAL for FORTRAN: Array, specifies the input vector a.
vssqr
C: Pointer to an array that contains the input vector
DOUBLE PRECISION for vd- a.
sqr
Fortran 90: REAL,
INTENT(IN) for vssqr
DOUBLE PRECISION,
INTENT(IN) for vdsqr
C: const float* for vsSqr
const double* for vdSqr
Output Parameters
y FORTRAN 77: REAL for vssqr FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdsqr
output vector y.
for vssqr
DOUBLE PRECISION,
INTENT(OUT) for vdsqr
2450
C: float* for vsSqr
double* for vdSqr
v?Mul
Performs element by element multiplication of
vector a and vector b.
Syntax
Fortran:
call vsmul( n, a, b, y )
call vdmul( n, a, b, y )
call vcmul( n, a, b, y )
call vzmul( n, a, b, y )
C:
vsMul( n, a, b, y );
vdMul( n, a, b, y );
vcMul( n, a, b, y );
vzMul( n, a, b, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a, b FORTRAN 77: REAL for vsmul FORTRAN: Arrays, specify the input
vectors a and b.
DOUBLE PRECISION for vdmul
2451

COMPLEX for vcmul C: Pointers to arrays that contain the input
vectors a and b.
DOUBLE COMPLEX for vzmul
for vsmul
for vdmul
COMPLEX, INTENT(IN) for vcmul
for vzmul
C: const float* for vsMul
const double* for vdMul
const MKL_Complex8* for vcMul
Mul
Output Parameters
y FORTRAN 77: REAL for vsmul FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdmul
COMPLEX, for vcmul
output vector y.
DOUBLE COMPLEX for vzmul
for vsmul
DOUBLE PRECISION,
INTENT(OUT) for vdmul
COMPLEX, INTENT(OUT) for vcmul
for vzmul
2452
C: float* for vsMul
double* for vdMul
MKL_Complex8* for vcMul
MKL_Complex16* for vzMul
v?MulByConj
Performs element by element multiplication of
vector a element and conjugated vector b element.
Syntax
Fortran:
call vcmulbyconj( n, a, b, y )
call vzmulbyconj( n, a, b, y )
C:
vcMulByConj( n, a, b, y );
vzMulByConj( n, a, b, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a, b FORTRAN 77: COMPLEX for vcmul- FORTRAN: Arrays, specify the input
byconj vectors a and b.
DOUBLE COMPLEX for vzmulby- C: Pointers to arrays that contain the input
conj vectors a and b.
2453

Fortran 90: COMPLEX,
INTENT(IN) for vcmulbyconj
for vzmulbyconj
C: const MKL_Complex8* for
vcMulByConj
MulByConj
Output Parameters
y FORTRAN 77: COMPLEX for vcmul- FORTRAN: Array, specifies the output
byconj vector y.
DOUBLE COMPLEX for vzmulby- C: Pointer to an array that contains the

conj output vector y.

INTENT(OUT) for vcmulbyconj
for vzmulbyconj
C: MKL_Complex8* for vcMulBy-
Conj
MKL_Complex16* for vzMulBy-
Conj
2454
v?Conj
Performs element by element conjugation of the
vector.
Syntax
Fortran:
call vcconj( n, a, y )
call vzconj( n, a, y )
C:
vcConj( n, a, y );
vzConj( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: COMPLEX, FORTRAN: Array, specifies the input vector

INTENT(IN) for vcconj a.
DOUBLE COMPLEX, INTENT(IN) C: Pointer to an array that contains the
for vzconj input vector a.

INTENT(IN) for vcconj
for vzconj
C: const MKL_Complex8* for
vcConj
2455

Conj
Output Parameters
y FORTRAN 77: COMPLEX, for vc- FORTRAN: Array, specifies the output
conj vector y.
DOUBLE COMPLEX for vzconj C: Pointer to an array that contains the

output vector y.
INTENT(OUT) for vcconj
for vzconj
C: MKL_Complex8* for vcConj
MKL_Complex16* for vzConj
v?Abs
Computes absolute value of vector elements.
Syntax
Fortran:
call vsabs( n, a, y )
call vdabs( n, a, y )
call vcabs( n, a, y )
call vzabs( n, a, y )
2456
C:
vsAbs( n, a, y );
vdAbs( n, a, y );
vcAbs( n, a, y );
vzAbs( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsabs FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdabs
COMPLEX for vcabs
input vector a.
DOUBLE COMPLEX for vzabs
for vsabs
for vdabs
COMPLEX, INTENT(IN) for vcabs
for vzabs
C: const float* for vsAbs
const double* for vdAbs
const MKL_Complex8* for vcAbs
Abs
2457
Output Parameters
y FORTRAN 77: REAL for vsabs, FORTRAN: Array, specifies the output
vcabs vector y.
DOUBLE PRECISION for vdabs, C: Pointer to an array that contains the

vzabs output vector y.

for vsabs, vcabs
DOUBLE PRECISION,
INTENT(OUT) for vdabs, vzabs
C: float* for vsAbs, vcAbs
double* for vdAbs, vzAbs
Power and Root Functions
v?Inv
Performs element by element inversion of the
vector.
Syntax
Fortran:
call vsinv( n, a, y )
call vdinv( n, a, y )
C:
vsInv( n, a, y );
vdInv( n, a, y );
2458
Input Parameters
n FORTRAN 77: INTEGER Specifies the number of elements to be calculated.

INTENT(IN)
C: const int
a FORTRAN 77: REAL for vs- FORTRAN: Array, specifies the input vector a.
inv
C: Pointer to an array that contains the input vector
DOUBLE PRECISION for vd- a.
inv
Fortran 90: REAL,
INTENT(IN) for vsinv
DOUBLE PRECISION,
INTENT(IN) for vdinv
C: const float* for vsInv
const double* for vdInv
Output Parameters
y FORTRAN 77: REAL for vsinv FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdinv
output vector y.
for vsinv
DOUBLE PRECISION,
INTENT(OUT) for vdinv
C: float* for vsInv
double* for vdInv
2459
v?Div
Performs element by element division of vector a
by vector b
Syntax
Fortran:
call vsdiv( n, a, b, y )
call vddiv( n, a, b, y )
call vcdiv( n, a, b, y )
call vzdiv( n, a, b, y )
C:
vsDiv( n, a, b, y );
vdDiv( n, a, b, y );
vcDiv( n, a, b, y );
vzDiv( n, a, b, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a, b FORTRAN 77: REAL for vsdiv FORTRAN: Arrays, specify the input
vectors a and b.
DOUBLE PRECISION for vddiv
COMPLEX for vcdiv
vectors a and b.
DOUBLE COMPLEX for vzdiv
for vsdiv
2460
for vddiv
COMPLEX, INTENT(IN) for vcdiv
for vzdiv
C: const float* for vsDiv
const double* for vdDiv
const MKL_Complex8* for vcDiv
Div
Table 9-3 Precision Overflow Thresholds for Div Real Function

Data Type Threshold Limitations on Input Parameters
single precision abs(a[i]) < abs(b[i]) * FLT_MAX
double precision abs(a[i]) < abs(b[i]) * DBL_MAX
NOTE. Overflow can occur also in Div complex function, but the exact formula is beyond
the scope of this document.
Output Parameters
y FORTRAN 77: REAL for vsdiv FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vddiv
COMPLEX for vcdiv
output vector y.
DOUBLE COMPLEX for vzdiv
for vsdiv
DOUBLE PRECISION,
INTENT(OUT) for vddiv
2461

COMPLEX, INTENT(OUT) for vcdiv
for vzdiv
C: float* for vsDiv
double* for vdDiv
MKL_Complex8* for vcDiv
MKL_Complex16* for vzDiv
v?Sqrt
Computes a square root of vector elements.
Syntax
Fortran:
call vssqrt( n, a, y )
call vdsqrt( n, a, y )
call vcsqrt( n, a, y )
call vzsqrt( n, a, y )
C:
vsSqrt( n, a, y );
vdSqrt( n, a, y );
vcSqrt( n, a, y );
vzSqrt( n, a, y );
2462
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vssqrt FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdsqrt
COMPLEX for vcsqrt
input vector a.
DOUBLE COMPLEX for vzsqrt
for vssqrt
for vdsqrt
COMPLEX, INTENT(IN) for vcsqrt
for vzsqrt
C: const float* for vsSqrt
const double* for vdSqrt
const MKL_Complex8* for vc-
Sqrt
vzSqrt
Output Parameters
y FORTRAN: REAL for vssqrt FORTRAN: Array, specifies the output

vector y.
DOUBLE PRECISION for vdsqrt
2463

COMPLEX for vcsqrt C: Pointer to an array that contains the
output vector y.
DOUBLE COMPLEX for vzsqrt
C: float* for vsSqrt
double* for vdSqrt
MKL_Complex8* for vcSqrt
MKL_Complex16* for vzSqrt
v?InvSqrt
Computes an inverse square root of vector
elements.
Syntax
Fortran:
call vsinvsqrt( n, a, y )
call vdinvsqrt( n, a, y )
C:
vsInvSqrt( n, a, y );
vdInvSqrt( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsin- FORTRAN: Array, specifies the input vector
vsqrt a.
2464
DOUBLE PRECISION for vdin- C: Pointer to an array that contains the
vsqrt input vector a.

for vsinvsqrt
for vdinvsqrt
C: const float* for vsInvSqrt
const double* for vdInvSqrt
Output Parameters
y FORTRAN 77: REAL for vsin- FORTRAN: Array, specifies the output
vsqrt vector y.

vsqrt output vector y.

for vsinvsqrt
DOUBLE PRECISION,
INTENT(OUT) for vdinvsqrt
C: float* for vsInvSqrt
double* for vdInvSqrt
2465
v?Cbrt
Computes a cube root of vector elements.
Syntax
Fortran:
call vscbrt( n, a, y )
call vdcbrt( n, a, y )
C:
vsCbrt( n, a, y );
vdCbrt( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vscbrt FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdcbrt
input vector a.
for vscbrt
for vdcbrt
C: const float* for vsCbrt
const double* for vdCbrt
2466
Output Parameters
y FORTRAN 77: REAL for vscbrt FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdcbrt
output vector y.
for vscbrt
DOUBLE PRECISION,
INTENT(OUT) for vdcbrt
C: float* for vsCbrt
double* for vdCbrt
v?InvCbrt
Computes an inverse cube root of vector elements.
Syntax
Fortran:
call vsinvcbrt( n, a, y )
call vdinvcbrt( n, a, y )
C:
vsInvCbrt( n, a, y );
vdInvCbrt( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
2467
a FORTRAN 77: REAL for vsin- FORTRAN: Array, specifies the input vector
vcbrt a.
vcbrt input vector a.

for vsinvcbrt
for vdinvcbrt
C: const float* for vsInvCbrt
const double* for vdInvCbrt
Output Parameters
y FORTRAN 77: REAL for vsin- FORTRAN: Array, specifies the output
vcbrt vector y.

vcbrt output vector y.

for vsinvcbrt
DOUBLE PRECISION,
INTENT(OUT) for vdinvcbrt
C: float* for vsInvCbrt
double* for vdInvCbrt
2468
v?Pow2o3
Raises each element of a vector to the constant
power 2/3.
Syntax
Fortran:
call vspow2o3( n, a, y )
call vdpow2o3( n, a, y )
C:
vsPow2o3( n, a, y );
vdPow2o3( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vspow2o3 FORTRAN: Arrays, specify the input vector
a.
DOUBLE PRECISION for vdpow2o3
vector a.
for vspow2o3
for vdpow2o3
C: const float* for vsPow2o3
const double* for vdPow2o3
2469
Output Parameters
y FORTRAN 77: REAL for vspow2o3 FORTRAN: Array, specifies the output
vector y.
output vector y.
for vspow2o3
DOUBLE PRECISION,
INTENT(OUT) for vdpow2o3
C: float* for vsPow2o3
double* for vdPow2o3
v?Pow3o2
power 3/2.
Syntax
Fortran:
call vspow3o2( n, a, y )
call vdpow3o2( n, a, y )
C:
vsPow3o2( n, a, y );
vdPow3o2( n, a, y );
Input Parameters

calculated.
INTENT(IN)
2470
C: const int
a FORTRAN 77: REAL for vspow3o2 FORTRAN: Arrays, specify the input vector
a.
vector a.
for vspow3o2
for vdpow3o2
C: const float* for vsPow3o2
const double* for vdPow3o2
Table 9-4 Precision Overflow Thresholds for Pow3o2 Function

single precision abs(a[i]) < ( FLT_MAX )2/3
double precision abs(a[i]) < ( DBL_MAX )2/3
Output Parameters
y FORTRAN 77: REAL for vspow3o2 FORTRAN: Array, specifies the output
vector y.
output vector y.
for vspow3o2
DOUBLE PRECISION,
INTENT(OUT) for vdpow3o2
C: float* for vsPow3o2
double* for vdPow3o2
2471
v?Pow
Computes a to the power b for elements of two
vectors.
Syntax
Fortran:
call vspow( n, a, b, y )
call vdpow( n, a, b, y )
call vcpow( n, a, b, y )
call vzpow( n, a, b, y )
C:
vsPow( n, a, b, y );
vdPow( n, a, b, y );
vcPow( n, a, b, y );
vzPow( n, a, b, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a, b FORTRAN 77: REAL for vspow FORTRAN: Arrays, specify the input
vectors a and b.
DOUBLE PRECISION for vdpow
COMPLEX for vcpow
vectors a and b.
DOUBLE COMPLEX for vzpow
for vspow
2472
for vdpow
COMPLEX, INTENT(IN) for vcpow
for vzpow
C: const float* for vsPow
const double* for vdPow
const MKL_Complex8* for vcPow
vzPow
Table 9-5 Precision Overflow Thresholds for Pow Real Function

single precision abs(a[i]) < ( FLT_MAX )1/b[i]
double precision abs(a[i]) < ( DBL_MAX )1/b[i]
NOTE. Overflow can occur also in Pow complex function, but the exact formula is beyond
Output Parameters
y FORTRAN 77: REAL for vspow FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdpow
COMPLEX for vcpow
output vector y.
DOUBLE COMPLEX for vzpow
for vspow
DOUBLE PRECISION,
INTENT(OUT) for vdpow
2473

COMPLEX, INTENT(OUT) for vcpow
for vzpow
C: float* for vsPow
double* for vdPow
MKL_Complex8* for vcPow
MKL_Complex16* for vzPow
Description
This function is declared in mkl_vml.f77 for FORTRAN 77 interface, in mkl_vml.fi for Fortran
90 interface, and in mkl_vml_functions.h for C interface.
The real function Pow has certain limitations on the input range of a and b parameters.
Specifically, if a[i] is positive, then b[i] may be arbitrary. For negative a[i], the value of
b[i] must be integer (either positive or negative).
The complex function Pow has no such input range limitations.
v?Powx
power.
Syntax
Fortran:
call vspowx( n, a, b, y )
call vdpowx( n, a, b, y )
call vcpowx( n, a, b, y )
call vzpowx( n, a, b, y )
2474
C:
vsPowx( n, a, b, y );
vdPowx( n, a, b, y );
vcPowx( n, a, b, y );
vzPowx( n, a, b, y );
Input Parameters
n FORTRAN 77: INTEGER Number of elements to be calculated.

INTENT(IN)
C: const int
a FORTRAN 77: REAL for vspowx FORTRAN: Array a that specifies the input
vector
DOUBLE PRECISION for vdpowx
COMPLEX for vcpowx
input vector a.
DOUBLE COMPLEX for vzpowx
for vspowx
for vdpowx
COMPLEX, INTENT(IN) for vcpowx
for vzpowx
C: const float* for vsPowx
const double* for vdPowx
vcPowx
vzPowx
2475

b
FORTRAN 77: REAL for vspowx FORTRAN: Scalar value b that is the
constant power.
C: Constant value for power b.
COMPLEX for vcpowx
for vspowx
for vdpowx
COMPLEX, INTENT(IN) for vcpowx
for vzpowx
C: const float for vsPowx
const double for vdPowx
vcPowx
vzPowx
Table 9-6 Precision Overflow Thresholds for Powx Real Function

single precision abs(a[i]) < ( FLT_MAX )1/b
double precision abs(a[i]) < ( DBL_MAX )1/b
NOTE. Overflow can occur also in Powx complex function, but the exact formula is
beyond the scope of this document.
2476
Output Parameters
y FORTRAN 77: REAL for vspowx FORTRAN: Array, specifies the output
vector y.
COMPLEX for vcpowx
output vector y.
for vspowx
DOUBLE PRECISION,
INTENT(OUT) for vdpowx
COMPLEX, INTENT(OUT) for
vcpowx
for vzpowx
C: float* for vsPowx
double* for vdPowx
MKL_Complex8* for vcPowx
MKL_Complex16* for vzPowx
Description
The real function Powx has certain limitations on the input range of a and b parameters.
Specifically, if a[i] is positive, then b may be arbitrary. For negative a[i], the value of b must
be integer (either positive or negative).
The complex function Powx has no such input range limitations.
2477
v?Hypot
Computes a square root of sum of two squared
elements.
Syntax
Fortran:
call vshypot( n, a, b, y )
call vdhypot( n, a, b, y )
C:
vsHypot( n, a, b, y );
vdHypot( n, a, b, y );
Input Parameters
n FORTRAN 77: INTEGER Number of elements to be calculated.

INTENT(IN)
C: const int
a,b FORTRAN 77: REAL for vshypot FORTRAN: Arrays that specify the input
vectors a and b
DOUBLE PRECISION for vdhypot
vectors a and b.
for vshypot
for vdhypot
C: const float* for vsHypot
const double* for vdHypot
2478
Table 9-7 Precision Overflow Thresholds for Hypot Function
single precision
abs(a[i]) < sqrt(FLT_MAX)
abs(b[i]) < sqrt(FLT_MAX)
double precision
abs(a[i]) < sqrt(DBL_MAX)
abs(b[i]) < sqrt(DBL_MAX)
Output Parameters
y FORTRAN 77: REAL for vshypot FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdhypot
output vector y.
for vshypot
DOUBLE PRECISION,
INTENT(OUT) for vdhypot
C: float* for vsHypot
double* for vdHypot
2479
Exponential and Logarithmic Functions
v?Exp
Computes an exponential of vector elements.
Syntax
Fortran:
call vsexp( n, a, y )
call vdexp( n, a, y )
call vcexp( n, a, y )
call vzexp( n, a, y )
C:
vsExp( n, a, y );
vdExp( n, a, y );
vcExp( n, a, y );
vzExp( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsexp FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdexp
COMPLEX for vcexp
input vector a.
DOUBLE COMPLEX for vzexp
2480
for vsexp
for vdexp
COMPLEX, INTENT(IN) for vcexp
for vzexp
C: const float* for vsExp
const double* for vdExp
const MKL_Complex8* for vcExp
Exp
Table 9-8 Precision Overflow Thresholds for Exp Real Function

single precision a[i] < Ln( FLT_MAX )
double precision a[i] < Ln( DBL_MAX )
NOTE. Overflow can occur also in Exp complex function, but the exact formula is beyond
Output Parameters
y FORTRAN 77: REAL for vsexp FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdexp
COMPLEX for vcexp
output vector y.
DOUBLE COMPLEX for vzexp
for vsexp
2481

DOUBLE PRECISION,
INTENT(OUT) for vdexp
COMPLEX, INTENT(OUT) for vcexp
for vzexp
C: float* for vsExp
double* for vdExp
MKL_Complex8* for vcExp
MKL_Complex16* for vzExp
v?Expm1
Computes an exponential of vector elements
decreased by 1.
Syntax
Fortran:
call vsexpm1( n, a, y )
call vdexpm1( n, a, y )
C:
vsExpm1( n, a, y );
vdExpm1( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
2482
a FORTRAN 77: REAL for vsexpm1 FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdexpm1
input vector a.
for vsexpm1
for vdexpm1
C: const float* for vsExpm1
const double* for vdExpm1
Table 9-9 Precision Overflow Thresholds for Expm1 Function

single precision a[i] < Ln( FLT_MAX )
double precision a[i] < Ln( DBL_MAX )
Output Parameters
y FORTRAN 77: REAL for vsexpm1 FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdexpm1
output vector y.
for vsexpm1
DOUBLE PRECISION,
INTENT(OUT) for vdexpm1
C: float* for vsExpm1
double* for vdExpm1
2483
v?Ln
Computes natural logarithm of vector elements.
Syntax
Fortran:
call vsln( n, a, y )
call vdln( n, a, y )
call vcln( n, a, y )
call vzln( n, a, y )
C:
vsLn( n, a, y );
vdLn( n, a, y );
vcLn( n, a, y );
vzLn( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsln FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdln
COMPLEX for vcln
input vector a.
DOUBLE COMPLEX for vzln
for vsln
2484
for vdln
COMPLEX, INTENT(IN) for vcln
for vzln
C: const float* for vsLn
const double* for vdLn
const MKL_Complex8* for vcLn
const MKL_Complex16* for vzLn
Output Parameters
y FORTRAN 77: REAL for vsln FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdln
COMPLEX for vcln
output vector y.
DOUBLE COMPLEX for vzln
for vsln
DOUBLE PRECISION,
INTENT(OUT) for vdln
COMPLEX, INTENT(OUT) for vcln
for vzln
C: float* for vsLn
double* for vdLn
MKL_Complex8* for vcLn
MKL_Complex16* for vzLn
2485
v?Log10
Computes denary logarithm of vector elements.
Syntax
Fortran:
call vslog10( n, a, y )
call vdlog10( n, a, y )
call vclog10( n, a, y )
call vzlog10( n, a, y )
C:
vsLog10( n, a, y );
vdLog10( n, a, y );
vcLog10( n, a, y );
vzLog10( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vslog10 FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdlog10
COMPLEX for vclog10
input vector a.
DOUBLE COMPLEX for vzlog10
for vslog10
2486
for vdlog10
COMPLEX, INTENT(IN) for
vclog10
for vzlog10
C: const float* for vsLog10
const double* for vdLog10
vcLog10
vzLog10
Output Parameters
y FORTRAN 77: REAL for vslog10 FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdlog10
COMPLEX for vclog10
output vector y.
DOUBLE COMPLEX for vzlog10
for vslog10
DOUBLE PRECISION,
INTENT(OUT) for vdlog10
vclog10
for vzlog10
C: float* for vsLog10
2487

double* for vdLog10
MKL_Complex8* for vcLog10
MKL_Complex16* for vzLog10
v?Log1p
Computes natural logarithm of vector elements
that are increased by 1.
Syntax
Fortran:
call vslog1p( n, a, y )
call vdlog1p( n, a, y )
C:
vsLog1p( n, a, y );
vdLog1p( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vslog1p FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdlog1p
input vector a.
for vslog1p
2488
for vdlog1p
C: const float* for vsLog1p
const double* for vdLog1p
Output Parameters
y FORTRAN 77: REAL for vslog1p FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdlog1p
output vector y.
for vslog1p
DOUBLE PRECISION,
INTENT(OUT) for vdlog1p
C: float* for vsLog1p
double* for vdLog1p
Trigonometric Functions
v?Cos
Computes cosine of vector elements.
Syntax
Fortran:
call vscos( n, a, y )
call vdcos( n, a, y )
call vccos( n, a, y )
call vzcos( n, a, y )
2489
C:
vsCos( n, a, y );
vdCos( n, a, y );
vcCos( n, a, y );
vzCos( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vscos FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdcos
COMPLEX for vccos
input vector a.
DOUBLE PRECISION for vzcos
for vscos
for vdcos
COMPLEX, INTENT(IN) for vccos
for vzcos
C: const float* for vsCos
const double* for vdCos
const MKL_Complex8* for vcCos
Cos
2490
Output Parameters
y FORTRAN 77: REAL for vscos FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdcos
COMPLEX for vccos
output vector y.
DOUBLE COMPLEX for vzcos
for vscos
DOUBLE PRECISION,
INTENT(OUT) for vdcos
COMPLEX, INTENT(OUT) for vccos
for vzcos
C: float* for vsCos
double* for vdCos
MKL_Complex8* for vcCos
MKL_Complex16* for vzCos
Description
The function computes sine of vector elements.
Note that arguments abs(a[i]) ≤ 213 and abs(a[i]) ≤ 216 for single and double precisions
respectively are called fast computational path. These are trig function arguments for which
VML provides the best possible performance. For performance reasons, avoid arguments that
do not belong to the fast computational path in the VML High Accuracy (HA) and Low Accuracy
(LA) functions. Alternatively, you can use VML Enhanced Performance (EP) functions that are
fast on the entire function domain at the cost of accuracy.
2491
v?Sin
Computes sine of vector elements.
Syntax
Fortran:
call vssin( n, a, y )
call vdsin( n, a, y )
call vcsin( n, a, y )
call vzsin( n, a, y )
C:
vsSin( n, a, y );
vdSin( n, a, y );
vcSin( n, a, y );
vzSin( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vssin FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdsin
COMPLEX for vcsin
input vector a.
DOUBLE PRECISION for vzsin
for vssin
2492
for vdsin
COMPLEX, INTENT(IN) for vcsin
for vzsin
C: const float* for vsSin
const double* for vdSin
const MKL_Complex8* for vcSin
vzSin
Output Parameters
y FORTRAN 77: REAL for vssin FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdsin
COMPLEX for vcsin
output vector y.
DOUBLE COMPLEX for vzsin
for vssin
DOUBLE PRECISION,
INTENT(OUT) for vdsin
COMPLEX, INTENT(OUT) for vcsin
for vzsin
C: float* for vsSin
double* for vdSin
MKL_Complex8* for vcSin
2493

MKL_Complex16* for vzSin
Description
v?SinCos
Computes sine and cosine of vector elements.
Syntax
Fortran:
call vssincos( n, a, y, z )
call vdsincos( n, a, y, z )
C:
vsSinCos( n, a, y, z );
vdSinCos( n, a, y, z );
Input Parameters

calculated.
INTENT(IN)
2494
C: const int
a FORTRAN 77: REAL for vssincos FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdsincos
input vector a.
for vssincos
for vdsincos
C: const float* for vsSinCos
const double* for vdSinCos
Output Parameters
y, z FORTRAN 77: REAL for vssincos FORTRAN: Arrays, specify the output
vectors y (for sine values) and z (for cosine
DOUBLE PRECISION for vdsincos
values).
C: Pointers to arrays that contain the
for vssincos
output vectors y (for sinevalues) and z(for
DOUBLE PRECISION, cosine values).
INTENT(OUT) for vdsincos
C: float* for vsSinCos
double* for vdSinCos
Description
2495
v?CIS
Computes complex exponent of real vector
elements (cosine and sine of real vector elements
combined to complex value).
Syntax
Fortran:
call vccis( n, a, y )
call vzcis( n, a, y )
C:
vcCIS( n, a, y );
vzCIS( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vccis FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vzcis
input vector a.
for vccis
for vzcis
C: const float* for vcCIS
2496
const double* for vzCIS
Output Parameters
y FORTRAN 77: COMPLEX for vccis FORTRAN: Array, specifies the output
vector y.
DOUBLE COMPLEX for vzcis
output vector y.
INTENT(OUT) for vccis
for vzcis
C: MKL_Complex8* for vcCIS
MKL_Complex16* for vzCIS
v?Tan
Computes tangent of vector elements.
Syntax
Fortran:
call vstan( n, a, y )
call vdtan( n, a, y )
call vctan( n, a, y )
call vztan( n, a, y )
C:
vsTan( n, a, y );
vdTan( n, a, y );
vcTan( n, a, y );
vzTan( n, a, y );
2497
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL, INTENT(IN) FORTRAN: Array, specifies the input vector
for vstan a.
DOUBLE PRECISION, INTENT(IN) C: Pointer to an array that contains the
for vdtan input vector a.
COMPLEX, INTENT(IN) for vctan

for vztan
for vstan
for vdtan
COMPLEX, INTENT(IN) for vctan
for vztan
C: const float* for vsTan
const double* for vdTan
const MKL_Complex8* for vcTan
Tan
2498
Output Parameters
y FORTRAN 77: REAL for vstan FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdtan
COMPLEX for vctan
output vector y.
DOUBLE COMPLEX for vztan
for vstan
DOUBLE PRECISION,
INTENT(OUT) for vdtan
COMPLEX, INTENT(OUT) for vctan
for vztan
C: float* for vsTan
double* for vdTan
MKL_Complex8* for vcTan
MKL_Complex16* for vzTan
Description
2499
v?Acos
Computes inverse cosine of vector elements.
Syntax
Fortran:
call vsacos( n, a, y )
call vdacos( n, a, y )
call vcacos( n, a, y )
call vzacos( n, a, y )
C:
vsAcos( n, a, y );
vdAcos( n, a, y );
vcAcos( n, a, y );
vzAcos( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsacos FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdacos
COMPLEX for vcacos
input vector a.
DOUBLE COMPLEX for vzacos
for vsacos
2500
for vdacos
COMPLEX, INTENT(IN) for vcacos
for vzacos
C: const float* for vsAcos
const double* for vdAcos
const MKL_Complex8* for vcA-
cos
const MKL_Complex16* for vzA-
cos
Output Parameters
y FORTRAN 77: REAL for vsacos FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdacos
COMPLEX for vcacos
output vector y.
DOUBLE COMPLEX for vzacos
for vsacos
DOUBLE PRECISION,
INTENT(OUT) for vdacos
COMPLEX, INTENT(OUT) for vca-
cos
for vzacos
C: float* for vsAcos
double* for vdAcos
2501

MKL_Complex8* for vcAcos
MKL_Complex16* for vzAcos
v?Asin
Computes inverse sine of vector elements.
Syntax
Fortran:
call vsasin( n, a, y )
call vdasin( n, a, y )
call vcasin( n, a, y )
call vzasin( n, a, y )
C:
vsAsin( n, a, y );
vdAsin( n, a, y );
vcAsin( n, a, y );
vzAsin( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsasin FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdasin
2502
COMPLEX for vcasin C: Pointer to an array that contains the
input vector a.
DOUBLE COMPLEX for vzasin
for vsasin
for vdasin
COMPLEX, INTENT(IN) for vcasin
for vzasin
C: const float* for vsAsin
const double* for vdAsin
vcAsin
vzAsin
Output Parameters
y FORTRAN 77: REAL for vsasin FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdasin
COMPLEX for vcasin
output vector y.
DOUBLE COMPLEX for vzasin
for vsasin
DOUBLE PRECISION,
INTENT(OUT) for vdasin
vcasin
2503

for vzasin
C: float* for vsAsin
double* for vdAsin
MKL_Complex8* for vcAsin
MKL_Complex16* for vzAsin
v?Atan
Computes inverse tangent of vector elements.
Syntax
Fortran:
call vsatan( n, a, y )
call vdatan( n, a, y )
call vcatan( n, a, y )
call vzatan( n, a, y )
C:
vsAtan( n, a, y );
vdAtan( n, a, y );
vcAtan( n, a, y );
vzAtan( n, a, y );
Input Parameters

calculated.
INTENT(IN)
2504
C: const int
a FORTRAN 77: REAL for vsatan FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdatan
COMPLEX for vcatan
input vector a.
DOUBLE COMPLEX for vzatan
for vsatan
for vdatan
COMPLEX, INTENT(IN) for vcatan
for vzatan
C: const float* for vsAtan
const double* for vdAsin
vcAtan
vzAsin
Output Parameters
y FORTRAN 77: REAL for vsatan FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdatan
COMPLEX for vcatan
output vector y.
DOUBLE COMPLEX for vzatan
for vsatan
2505

DOUBLE PRECISION,
INTENT(OUT) for vdatan
vcatan
for vzatan
C: float* for vsAtan
double* for vdAtan
MKL_Complex8* for vcAtan
MKL_Complex16* for vzAtan
v?Atan2
Computes four-quadrant inverse tangent of
elements of two vectors.
Syntax
Fortran:
call vsatan2( n, a, b, y )
call vdatan2( n, a, b, y )
C:
vsAtan2( n, a, b, y );
vdAtan2( n, a, b, y );
Input Parameters

calculated.
INTENT(IN)
2506
C: const int
a, b FORTRAN 77: REAL for vsatan2 FORTRAN: Arrays, specify the input
vectors a and b.
DOUBLE PRECISION for vdatan2
vectors a and b.
for vsatan2
for vdatan2
C: const float* for vsAtan2
const double* for vdAtan2
Output Parameters
y FORTRAN 77: REAL for vsatan2 FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdatan2
output vector y.
for vsatan2
DOUBLE PRECISION,
INTENT(OUT) for vdatan2
C: float* for vsAtan2
double* for vdAtan2
Description
The function computes four-quadrant inverse tangent of elements of two vectors.
The elements of the output vectory are computed as the four-quadrant arctangent of a[i] /
b[i].
2507
Hyperbolic Functions
v?Cosh
Computes hyperbolic cosine of vector elements.
Syntax
Fortran:
call vscosh( n, a, y )
call vdcosh( n, a, y )
call vccosh( n, a, y )
call vzcosh( n, a, y )
C:
vsCosh( n, a, y );
vdCosh( n, a, y );
vcCosh( n, a, y );
vzCosh( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vscosh FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdcosh
COMPLEX for vccosh
input vector a.
DOUBLE COMPLEX for vzcosh
2508
for vscosh
for vdcosh
COMPLEX, INTENT(IN) for vccosh
for vzcosh
C: const float* for vsCosh
const double* for vdCosh
Cosh
Cosh
Table 9-10 Precision Overflow Thresholds for Cosh Real Function

single precision -Ln(FLT_MAX)-Ln2 <a[i] < Ln(FLT_MAX)+Ln2
double precision -Ln(DBL_MAX)-Ln2 <a[i] < Ln(DBL_MAX)+Ln2
NOTE. Overflow can occur also in Cosh complex function, but the exact formula is
Output Parameters
y FORTRAN 77: REAL for vscosh FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdcosh
COMPLEX for vccosh
output vector y.
DOUBLE COMPLEX for vzcosh
2509

for vscosh
DOUBLE PRECISION,
INTENT(OUT) for vdcosh
COMPLEX, INTENT(OUT) for vc-
cosh
for vzcosh
C: float* for vsCosh
double* for vdCosh
MKL_Complex8* for vcCosh
MKL_Complex16* for vzCosh
v?Sinh
Computes hyperbolic sine of vector elements.
Syntax
Fortran:
call vssinh( n, a, y )
call vdsinh( n, a, y )
call vcsinh( n, a, y )
call vzsinh( n, a, y )
C:
vsSinh( n, a, y );
vdSinh( n, a, y );
vcSinh( n, a, y );
vzSinh( n, a, y );
2510
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vssinh FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdsinh
COMPLEX for vcsinh
input vector a.
DOUBLE COMPLEX for vzsinh
for vssinh
for vdsinh
COMPLEX, INTENT(IN) for vcsinh
for vzsinh
C: const float* for vsSinh
const double* for vdSinh
Sinh
const MKL_Complex16* for vzS-
inh
Table 9-11 Precision Overflow Thresholds for Sinh Real Function

single precision -Ln(FLT_MAX)-Ln2 <a[i] < Ln(FLT_MAX)+Ln2
double precision -Ln(DBL_MAX)-Ln2 <a[i] < Ln(DBL_MAX)+Ln2
2511
NOTE. Overflow can occur also in Sinh complex function, but the exact formula is
Output Parameters
y FORTRAN 77: REAL for vssinh FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdsinh
COMPLEX for vcsinh
output vector y.
DOUBLE COMPLEX for vzsinh
for vssinh
DOUBLE PRECISION,
INTENT(OUT) for vdsinh
COMPLEX, INTENT(OUT) for vc-
sinh
for vzsinh
C: float* for vsSinh
double* for vdSinh
MKL_Complex8* for vcSinh
MKL_Complex16* for vzSinh
2512
v?Tanh
Computes hyperbolic tangent of vector elements.
Syntax
Fortran:
call vstanh( n, a, y )
call vdtanh( n, a, y )
call vctanh( n, a, y )
call vztanh( n, a, y )
C:
vsTanh( n, a, y );
vdTanh( n, a, y );
vcTanh( n, a, y );
vzTanh( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vstanh FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdtanh
COMPLEX for vctanh
input vector a.
DOUBLE COMPLEX for vztanh
for vstanh
2513

for vdtanh
COMPLEX, INTENT(IN) for vctanh
for vztanh
C: const float* for vsTanh
const double* for vdTanh
Tanh
Tanh
Output Parameters
y FORTRAN 77: REAL for vstanh FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdtanh
COMPLEX for cstanh
output vector y.
DOUBLE COMPLEX for zdtanh
for vstanh
DOUBLE PRECISION,
INTENT(OUT) for vdtanh
cstanh
for zdtanh
C: float* for vsTanh
double* for vdTanh
2514
MKL_Complex8* for vcTanh
MKL_Complex16* for vzTanh
v?Acosh
Computes inverse hyperbolic cosine (nonnegative)
of vector elements.
Syntax
Fortran:
call vsacosh( n, a, y )
call vdacosh( n, a, y )
call vcacosh( n, a, y )
call vzacosh( n, a, y )
C:
vsAcosh( n, a, y );
vdAcosh( n, a, y );
vcAcosh( n, a, y );
vzAcosh( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsacosh FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdacosh
2515

COMPLEX for vcacosh C: Pointer to an array that contains the
input vector a.
DOUBLE COMPLEX for vzacosh
for vsacosh
for vdacosh
COMPLEX, INTENT(IN) for vca-
cosh
for vzacosh
C: const float* for vsAcosh
const double* for vdAcosh
const MKL_Complex8* for vcA-
cosh
const MKL_Complex16* for vzA-
cosh
Output Parameters
y FORTRAN 77: REAL for vsacosh FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdacosh
COMPLEX for vcacosh
output vector y.
DOUBLE COMPLEX for vzacosh
for vsacosh
DOUBLE PRECISION,
INTENT(OUT) for vdacosh
2516
COMPLEX, INTENT(OUT) for vca-
cosh
for vzacosh
C: float* for vsAcosh
double* for vdAcosh
MKL_Complex8* for vcAcosh
MKL_Complex16* for vzAcosh
v?Asinh
Computes inverse hyperbolic sine of vector
elements.
Syntax
Fortran:
call vsasinh( n, a, y )
call vdasinh( n, a, y )
call vcasinh( n, a, y )
call vzasinh( n, a, y )
C:
vsAsinh( n, a, y );
vdAsinh( n, a, y );
vcAsinh( n, a, y );
vzAsinh( n, a, y );
2517
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsasinh FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdasinh
COMPLEX for vcasinh
input vector a.
DOUBLE COMPLEX for vzasinh
for vsasinh
for vdasinh
COMPLEX, INTENT(IN) for vcas-
inh
for vzasinh
C: const float* for vsAsinh
const double* for vdAsinh
const MKL_Complex8* for vcAs-
inh
vzAsinh
2518
Output Parameters
y FORTRAN 77: REAL for vsasinh FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdasinh
COMPLEX for vcasinh
output vector y.
DOUBLE COMPLEX for vzasinh
for vsasinh
DOUBLE PRECISION,
INTENT(OUT) for vdasinh
COMPLEX, INTENT(OUT) for vcas-
inh
for vzasinh
C: float* for vsAsinh
double* for vdAsinh
MKL_Complex8* for vcAsinh
MKL_Complex16* for vzAsinh
2519
v?Atanh
Computes inverse hyperbolic tangent of vector
elements.
Syntax
Fortran:
call vsatanh( n, a, y )
call vdatanh( n, a, y )
call vcatanh( n, a, y )
call vzatanh( n, a, y )
C:
vsAtanh( n, a, y );
vdAtanh( n, a, y );
vcAtanh( n, a, y );
vzAtanh( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsatanh FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdatanh
COMPLEX for vcatanh
input vector a.
DOUBLE COMPLEX for vzatanh
for vsatanh
2520
for vdatanh
COMPLEX, INTENT(IN) for
vcatanh
for vzatanh
C: const float* for vsAtanh
const double* for vdAtanh
vcAtanh
vzAtanh
Output Parameters
y FORTRAN 77: REAL for vsatanh FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdatanh
COMPLEX for vcatanh
output vector y.
DOUBLE COMPLEX for vzatanh
for vsatanh
DOUBLE PRECISION,
INTENT(OUT) for vdatanh
vcatanh
for vzatanh
C: float* for vsAtanh
2521

double* for vdAtanh
MKL_Complex8* for vcAtanh
MKL_Complex16* for vzAtanh
Special Functions
v?Erf
Computes the error function value of vector
elements.
Syntax
Fortran:
call vserf( n, a, y )
call vderf( n, a, y )
C:
vsErf( n, a, y );
vdErf( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vserf FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vderf
2522
Fortran 90: REAL, INTENT(IN) C: Pointer to an array that contains the
for vserf input vector a.

for vderf
C: const float* for vsErf
const double* for vdErf
Output Parameters
y FORTRAN 77: REAL for vserf FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vderf
output vector y.
for vserf
DOUBLE PRECISION,
INTENT(OUT) for vderf
C: float* for vsErf
double* for vdErf
Description
The function Erf computes the error function values for elements of the input vector a and
writes them to the output vector y.
The error function is defined as given by:
2523
Useful relations:
where erfc is the complementary error function.
where
is the cumulative normal distribution function.
where Φ-1(x) and erf-1(x) are the inverses to Φ(x) and erf(x) respectively.
Figure 9-1 illustrates the relationships among Erf family functions (Erf, Erfc, CdfNorm).
2524
Figure 9-1 Erf Family Functions Relationship
Useful relations for these functions:
See Also
• Special Functions
2525
• ErfcComputes the complementary error function value of vector elements.

• CdfNormComputes the cumulative normal distribution function values of vector
elements.
v?Erfc
Computes the complementary error function value
of vector elements.
Syntax
Fortran:
call vserfc( n, a, y )
call vderfc( n, a, y )
C:
vsErfc( n, a, y );
vdErfc( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vserfc FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vderfc
input vector a.
for vserfc
for vderfc
C: const float* for vsErfc
2526
const double* for vdErfc
Output Parameters
y FORTRAN 77: REAL for vserfc FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vderfc
output vector y.
for vserfc
DOUBLE PRECISION,
INTENT(OUT) for vderfc
C: float* for vsErfc
double* for vdErfc
Description
The function Erfc computes the complementary error function values for elements of the input
vector a and writes them to the output vector y.
The complementary error function is defined as given by:
Useful relations:
2527
where
See alsoFigure 9-1 in Erf function description for Erfc function relationship with the other
functions of Erf family.
v?CdfNorm
Computes the cumulative normal distribution
function values of vector elements.
Syntax
Fortran:
call vscdfnorm( n, a, y )
call vdcdfnorm( n, a, y )
C:
vsCdfNorm( n, a, y );
vdCdfNorm( n, a, y );
2528
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for FORTRAN: Array, specifies the input vector
vscdfnorm a.
DOUBLE PRECISION for vd- C: Pointer to an array that contains the
cdfnorm input vector a.

for vscdfnorm
for vdcdfnorm
C: const float* for vsCdfNorm
const double* for vdCdfNorm
Output Parameters
y FORTRAN 77: REAL for FORTRAN: Array, specifies the output

vscdfnorm vector y.

cdfnorm output vector y.

for vscdfnorm
DOUBLE PRECISION,
INTENT(OUT) for vdcdfnorm
C: float* for vsCdfNorm
double* for vdCdfNorm
2529
Description
The function CdfNorm computes the cumulative normal distribution function values for elements
of the input vector a and writes them to the output vector y.
The cumulative normal distribution function is defined as given by:
Useful relations:
where Erf and Erfc are the error and complementary error functions.
See also Figure 9-1 in Erf function description for CdfNorm function relationship with the other
functions of Erf family.
v?ErfInv
Computes inverse error function value of vector
elements.
Syntax
Fortran:
call vserfinv( n, a, y )
call vderfinv( n, a, y )
2530
C:
vsErfInv( n, a, y );
vdErfInv( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vserfinv FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vderfinv
input vector a.
for vserfinv
for vderfinv
C: const float* for vsErfInv
const double* for vdErfInv
Output Parameters
y FORTRAN 77: REAL for vserfinv FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vderfinv
output vector y.
for vserfinv
DOUBLE PRECISION,
INTENT(OUT) for vderfinv
C: float* for vsErfInv
2531

double* for vdErfInv
Description
The function ErfInv computes the inverse error function values for elements of the input vector
a and writes them to the output vector y
y = erf-1(a),
where erf(x) is the error function defined as given by:
Useful relations:
where erfc is the complementary error function.
where
2532
Figure 9-2 illustrates the relationships among ErfInv family functions (ErfInv, ErfcInv,
CdfNormInv).
2533
Figure 9-2 ErfInv Family Functions Relationship
Useful relations for these functions:
See Also
• Special Functions
2534
• ErfcInvComputes the inverse complementary error function value of vector
elements.
• CdfNormInvComputes the inverse cumulative normal distribution function values
of vector elements.
v?ErfcInv
Computes the inverse complementary error
function value of vector elements.
Syntax
Fortran:
call vserfcinv( n, a, y )
call vderfcinv( n, a, y )
C:
vsErfcInv( n, a, y );
vdErfcInv( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vser- FORTRAN: Array, specifies the input vector
fcinv a.
DOUBLE PRECISION for vder- C: Pointer to an array that contains the
fcinv input vector a.

for vserfcinv
2535

for vderfcinv
C: const float* for vsErfcInv
const double* for vdErfcInv
Output Parameters
y FORTRAN 77: REAL for vser- FORTRAN: Array, specifies the output
fcinv vector y.
DOUBLE PRECISION for vder- C: Pointer to an array that contains the

fcinv output vector y.

for vserfcinv
DOUBLE PRECISION,
INTENT(OUT) for vderfcinv
C: float* for vsErfcInv
double* for vdErfcInv
Description
The function ErfcInv computes the inverse complimentary error function values for elements
of the input vector a and writes them to the output vector y.
The inverse complementary error function is defined as given by:
2536
where erf(x)denotes the error function and erfinv(x) denotes the inverse error function.
See alsoFigure 9-2 in ErfInv function description for ErfcInv function relationship with the
other functions of ErfInv family.
v?CdfNormInv
Computes the inverse cumulative normal
distribution function values of vector elements.
Syntax
Fortran:
call vscdfnorminv( n, a, y )
call vdcdfnorminv( n, a, y )
C:
vsCdfNormInv( n, a, y );
vdCdfNormInv( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for FORTRAN: Array, specifies the input vector
vscdfnorminv a.
cdfnorminv input vector a.
2537

for vscdfnorminv
for vdcdfnorminv
C: const float* for
vsCdfNormInv
const double* for vd-
CdfNormInv
Output Parameters
y FORTRAN 77: REAL for FORTRAN: Array, specifies the output

vscdfnorm vector y.

cdfnorminv output vector y.

for vscdfnorm
DOUBLE PRECISION,
INTENT(OUT) for vdcdfnorminv
C: float* for vsCdfNormInv
double* for vdCdfNormInv
Description
The function CdfNormInv computes the inverse cumulative normal distribution function values
for elements of the input vector a and writes them to the output vector y.
The inverse cumulative normal distribution function is defined as given by:
2538
where CdfNorm(x) denotes the cumulative normal distribution function.

Useful relations:
where erfinv(x) denotes the inverse error function and erfcinv(x) denotes the inverse
complementary error functions.
See also Figure 9-2 in ErfInv function description for CdfNormInv function relationship with
the other functions of ErfInv family.
Rounding Functions
v?Floor
Computes a rounded towards minus infinity integer
value for each vector element.
Syntax
Fortran:
call vsfloor( n, a, y )
call vdfloor( n, a, y )
C:
vsFloor( n, a, y );
vdFloor( n, a, y );
2539
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsfloor FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdfloor
input vector a.
for vsfloor
for vdfloor
C: const float* for vsFloor
const double* for vdFloor
Output Parameters
y FORTRAN 77: REAL for vsfloor FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdfloor
output vector y.
for vsfloor
DOUBLE PRECISION,
INTENT(OUT) for vdfloor
C: float* for vsFloor
double* for vdFloor
2540
v?Ceil
Computes a rounded towards plus infinity integer
value for each vector element.
Syntax
Fortran:
call vsceil( n, a, y )
call vdceil( n, a, y )
C:
vsCeil( n, a, y );
vdCeil( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsceil FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdceil
input vector a.
for vsceil
for vdceil
C: const float* for vsCeil
const double* for vdCeil
2541
Output Parameters
y FORTRAN 77: REAL for vsceil FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdceil
output vector y.
for vsceil
DOUBLE PRECISION,
INTENT(OUT) for vdceil
C: float* for vsCeil
double* for vdCeil
v?Trunc
Computes a rounded towards zero integer value
for each vector element.
Syntax
Fortran:
call vstrunc( n, a, y )
call vdtrunc( n, a, y )
C:
vsTrunc( n, a, y );
vdTrunc( n, a, y );
Input Parameters

calculated.
INTENT(IN)
2542
C: const int
a FORTRAN 77: REAL for vstrunc FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdtrunc
input vector a.
for vstrunc
for vdtrunc
C: const float* for vsTrunc
const double* for vdTrunc
Output Parameters
y FORTRAN 77: REAL for vstrunc FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdtrunc
output vector y.
for vstrunc
DOUBLE PRECISION,
INTENT(OUT) for vdtrunc
C: float* for vsTrunc
double* for vdTrunc
2543
v?Round
Computes a rounded to nearest integer value for
each vector element.
Syntax
Fortran:
call vsround( n, a, y )
call vdround( n, a, y )
C:
vsRound( n, a, y );
vdRound( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsround FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdround
input vector a.
for vsround
for vdround
C: const float* for vsRound
const double* for vdRound
2544
Output Parameters
y FORTRAN 77: REAL for vsround FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdround
output vector y.
for vsround
DOUBLE PRECISION,
INTENT(OUT) for vdround
C: float* for vsRound
double* for vdRound
Description
Halfway values, that is, 0.5, -1.5, and the like, are rounded off away from zero. That is, 0.5
-> 1, -1.5 -> -2, etc.
v?NearbyInt
Computes a rounded integer value in a current
rounding mode for each vector element.
Syntax
Fortran:
call vsnearbyint( n, a, y )
call vdnearbyint( n, a, y )
C:
vsNearbyInt( n, a, y );
vdNearbyInt( n, a, y );
2545
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsnear- FORTRAN: Array, specifies the input vector
byint a.
DOUBLE PRECISION for vdnear- C: Pointer to an array that contains the
byint input vector a.

for vsnearbyint
for vdnearbyint
C: const float* for vsNear-
byInt
const double* for vdNearbyInt
Output Parameters
y FORTRAN 77: REAL for vsnear- FORTRAN: Array, specifies the output
byint vector y.
DOUBLE PRECISION for vdnear- C: Pointer to an array that contains the

byint output vector y.

for vsnearbyint
DOUBLE PRECISION,
INTENT(OUT) for vdnearbyint
C: float* for vsNearbyInt
2546
double* for vdNearbyInt
Description
Halfway values, that is, 0.5, -1.5, and the like, are rounded off towards even values.
v?Rint
Computes a rounded integer value in a current
rounding mode for each vector element with
inexact result exception raised for each changed
value.
Syntax
Fortran:
call vsrint( n, a, y )
call vdrint( n, a, y )
C:
vsRint( n, a, y );
vdRint( n, a, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsrint FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdrint
input vector a.
for vsrint
2547

for vdrint
C: const float* for vsRint
const double* for vdRint
Output Parameters
y FORTRAN 77: REAL for vsrint FORTRAN: Array, specifies the output
vector y.
DOUBLE PRECISION for vdrint
output vector y.
for vsrint
DOUBLE PRECISION,
INTENT(OUT) for vdrint
C: float* for vsRint
double* for vdRint
Description
Halfway values, that is, 0.5, -1.5, and the like, are rounded off towards even values. For each
changed value, inexact result exception is raised.
v?Modf
Computes a truncated integer value and remaining
fraction part for each vector element.
Syntax
Fortran:
call vsmodf( n, a, y, z )
call vdmodf( n, a, y, z )
2548
C:
vsModf( n, a, y, z );
vdModf( n, a, y, z );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsmodf FORTRAN: Array, specifies the input vector
a.
DOUBLE PRECISION for vdmodf
input vector a.
for vsmodf
for vdmodf
C: const float* for vsModf
const double* for vdModf
Output Parameters
y, z FORTRAN 77: REAL for vsmodf FORTRAN: Array, specifies the output
vector y and z.
DOUBLE PRECISION for vdmodf
output vector y and z.
for vsmodf
DOUBLE PRECISION,
INTENT(OUT) for vdmodf
C: float* for vsModf
2549

double* for vdModf
VML Pack/Unpack Functions

This section describes VML functions which convert vectors with unit increment to and from
vectors with positive increment indexing, vector indexing and mask indexing (see Appendix B
for details on vector indexing methods).
Table 9-12 lists available VML Pack/Unpack functions, together with data types and indexing
methods associated with them.
Table 9-12 VML Pack/Unpack Functions
Function Short Data Indexing Description
Name Types Methods
v?Pack s, d I,V,M Gathers elements of arrays, indexed by different
methods.
v?Unpack s, d I,V,M Scatters vector elements to arrays with different
indexing.
v?Pack
Copies elements of an array with specified indexing
to a vector with unit increment.
Syntax
Fortran:
call vspacki( n, a, inca, y )
call vspackv( n, a, ia, y )
call vspackm( n, a, ma, y )
call vdpacki( n, a, inca, y )
call vdpackv( n, a, ia, y )
call vdpackm( n, a, ma, y )
2550
C:
vsPackI( n, a, inca, y );
vsPackV( n, a, ia, y );
vsPackM( n, a, ma, y );
vdPackI( n, a, inca, y );
vdPackV( n, a, ia, y );
vdPackM( n, a, ma, y );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vspacki, FORTRAN: Array, DIMENSION at least(1

vspackv, vspackm + (n-1)*inca) for vspacki/vdpacki,
DOUBLE PRECISION for vdpacki, Array, DIMENSION at least max(
vdpackv, vdpackm n,max(ia[j]) ), j=0, …, n-1 for vs-
packv/vdpackv ,
for vspacki, vspackv, vspackm Array, DIMENSION at least n for vs-
packm/vdpackm.
for vdpacki, vdpackv, vdpackm Specifies the input vector a.
C: const float* for vsPackI, C: Specifies pointer to an array that
vsPackV, vsPackM contains the input vector a. Size of the
array must be:
const double* for vdPackI, vd-
PackV, vdPackM at least(1 + (n-1)*inca) for vsPac-
kI/vdPackI,
at least max( n,max(ia[j]) ), j=0,
…, n-1 for vsPackV/vdPackV ,
at least n for vsPackM/vdPackM.
2551
inca FORTRAN 77: INTEGER for vspac- Specifies the increment for the elements
ki, vdpacki of a.

INTENT(IN) for vspacki, vdpac-
ki
C: const int for vsPackI, vdPac-
kI
ia FORTRAN 77: INTEGER for vs- FORTRAN: Array, DIMENSION at least n.

packv, vdpackv
Specifies the index vector for the elements
Fortran 90: INTEGER, of a.
INTENT(IN) for vspackv, vd-
C: Specifies the pointer to an array of size
packv at least n that contains the index vector
C: const int* for vsPackV, vd- for the elements of a.
PackV
ma FORTRAN 77: INTEGER for vs- FORTRAN: Array, DIMENSION at least n,

packm, vdpackm
Specifies the mask vector for the elements
Fortran 90: INTEGER, of a.
INTENT(IN) for vspackm, vd-
packm at least n that contains the mask vector for
C: const int* for vsPackM, vd- the elements of a.
PackM
Output Parameters
y FORTRAN 77: REAL for vspacki, FORTRAN: Array, DIMENSION at least n.

vspackv, vspackm Specifies the output vector y.
DOUBLE PRECISION for vdpacki, C: Pointer to an array of size at least n that
vdpackv, vdpackm contains the output vector y.
2552
for vspacki, vspackv, vspackm
DOUBLE PRECISION,
INTENT(OUT) for vdpacki, vd-
packv, vdpackm
C: float* for vsPackI, vsPackV,
vsPackM
double* for vdPackI, vdPackV,
vdPackM
v?Unpack
Copies elements of a vector with unit increment to
an array with specified indexing.
Syntax
Fortran:
call vsunpacki( n, a, y, incy )
call vsunpackv( n, a, y, iy )
call vsunpackm( n, a, y, my )
call vdunpacki( n, a, y, incy )
call vdunpackv( n, a, y, iy )
call vdunpackm( n, a, y, my )
2553
C:
vsUnpackI( n, a, y, incy );
vsUnpackV( n, a, y, iy );
vsUnpackM( n, a, y, my );
vdUnpackI( n, a, y, incy );
vdUnpackV( n, a, y, iy );
vdUnpackM( n, a, y, my );
Input Parameters

calculated.
INTENT(IN)
C: const int
a FORTRAN 77: REAL for vsunpac- FORTRAN: Array, DIMENSION at least n.

ki, vsunpackv, vsunpackm
Specifies the input vector a.
DOUBLE PRECISION for vdunpac-
ki, vdunpackv, vdunpackm.
at least n that contains the input vector a.
for vsunpacki, vsunpackv,
vsunpackm
for vdunpacki, vdunpackv,
vdunpackm.
C: const float* for vsUnpackI,
vsUnpackV, vsUnpackM
const double* for vdUnpackI,
vdUnpackV, vdUnpackM
2554
incy FORTRAN 77: INTEGER for Specifies the increment for the elements
vsunpacki, vdunpacki. of y.

INTENT(IN) for vsunpacki,
vdunpacki.
C: const int for vsUnpackI,
vdUnpackI.
iy FORTRAN 77: INTEGER for FORTRAN: Array, DIMENSION at least n.

vsunpackv, vdunpackv
Specifies the index vector for the elements
Fortran 90: INTEGER, of y.
INTENT(IN) for vsunpackv,
vdunpackv at least n that contains the index vector
C: const int* for vsUnpackV, for the elements of a.
vdUnpackV
my FORTRAN 77: INTEGER for FORTRAN: Array, DIMENSION at least n,

vsunpackm, vdunpackm
Specifies the mask vector for the elements
Fortran 90: INTEGER, of y.
INTENT(IN) for vsunpackm,
vdunpackm at least n that contains the mask vector for
C: const int* for vsUnpackM, the elements of a.
vdUnpackM
Output Parameters
y FORTRAN 77: REAL for vsunpac- FORTRAN: Array, DIMENSION

ki, vsunpackv, vsunpackm
at least (1 + (n-1)*incy) for vsunpac-
DOUBLE PRECISION for vdunpacki,
ki, vdunpackv, vdunpackm
at least max( n,max(iy[j]) ),j=0,...,
n-1, for vsunpackv,
2555

Fortran 90: REAL, INTENT(OUT) at least n for vsunpackm/vdunpackm
for vsunpacki, vsunpackv,
C: Specifies the pointer to an array that
vsunpackm
contains the output vector y.
DOUBLE PRECISION,
Size of the array must be:
INTENT(OUT) for vdunpacki,
vdunpackv, vdunpackm at least (1 + (n-1)*incy) for vsUn-
PackI,
C: float* for vsUnpackI, vsUn-
at least max( n,max(ia[j])
packV, vsUnpackM
),j=0,..., n-1, for vsUnPackV,
double* for vdUnpackI, vdUn- at least n for vsUnPackM.
packV, vdUnpackM
VML Service Functions

VML Service functions allow the user to set /get the accuracy mode, and set/get the error code.
All these functions are available both in Fortran- and C- interfaces. Table 9-13 lists available
VML Service functions and their short description.
Table 9-13 VML Service Functions
Function Short Name Description
SetMode Sets the VML mode
GetMode Gets the VML mode
SetErrStatus Sets the VML error status
GetErrStatus Gets the VML error status
ClearErrStatus Clears the VML error status
SetErrorCallBack Sets the additional error handler callback function
GetErrorCallBack Gets the additional error handler callback function
ClearErrorCallBack Deletes the additional error handler callback function
2556
SetMode
Sets a new mode for VML functions according to
mode parameter and stores the previous VML mode
to oldmode.
Syntax
Fortran:
oldmode = vmlsetmode( mode )
C:
oldmode = vmlSetMode( mode );
Input Parameters
mode FORTRAN 77: INTEGER Specifies the VML mode to be set.

INTENT(IN)
C: const unsigned int
Output Parameters
oldmode FORTRAN: INTEGER Specifies the former VML mode.

C: int
Description
The mode parameter is designed to control accuracy, FPU, error handling and threading options.
Table 9-14 lists values of the mode parameter. All other possible values of the mode parameter
may be obtained from these values by using bitwise OR ( | ) operation to combine one value
2557
for accuracy, one for FPU, and one for error control options. The default value of the mode
parameter is VML_HA | VML_ERRMODE_DEFAULT | VML_NUM_THREADES_OMP_AUTO. Thus, the
current FPU control word (FPU precision and the rounding method) is used by default.
If any VML mathematical function requires different FPU precision, or rounding method, it
changes these options automatically and then restores the former values. The mode parameter
enables you to minimize switching the internal FPU mode inside each VML mathematical function
that works with similar precision and accuracy settings. To accomplish this, set the mode
parameter to VML_FLOAT_CONSISTENT for single precision real and complex functions, or to
VML_DOUBLE_CONSISTENT for double precision real and complex functions. These values of the
mode parameter are the optimal choice for the respective function groups, as they are required
for most of the VML mathematical functions. After the execution is over, set the mode to
VML_RESTORE if you need to restore the previous FPU mode.
Table 9-14 Values of the mode Parameter
Value of mode Description
Accuracy Control
VML_HA High accuracy versions of VML functions will be used.
VML_LA Low accuracy versions of VML functions will be used.
VML_EP Enhanced Performance accuracy versions of VML functions
will be used.
Additional FPU Mode Control
VML_FLOAT_CONSISTENT The optimal FPU mode (control word) for single precision
functions is set, and the previous FPU mode is saved.
VML_DOUBLE_CONSISTENT The optimal FPU mode (control word) for double precision
functions is set, and the previous FPU mode is saved.
VML_RESTORE The previously saved FPU mode is restored.
Error Mode Control
VML_ERRMODE_IGNORE No action is set for computation errors.
VML_ERRMODE_ERRNO On error, the errno variable is set.
VML_ERRMODE_STDERR On error, the error text information is written to stderr.
VML_ERRMODE_EXCEPT On error, an exception is raised.
VML_ERRMODE_CALLBACK On error, an additional error handler function is called.
VML_ERRMODE_DEFAULT On error, the errno variable is set, an exception is raised,
and an additional error handler function is called.
Treading Mode Control
2558
Value of mode Description
VML_NUM_THREADS_OMP_AUTO This is default behavior. Maximum number of threads is
determined by environmental variable OMP_NUM_THREADS
and can be overridden by OpenMP* function
omp_set_num_threads(). For performance reasons VML
threading logic can use fewer number of threads.
VML_NUM_THREADS_OMP_FIXED Number of threads is determined by environmental variable
OMP_NUM_THREADS and can be overridden by OpenMP*
function omp_set_num_threads(). Use this mode to
disable VML threading logic.
Examples
Several examples of calling the function vmlSetMode() with different values of the mode
parameter are given below:
Fortran:
oldmode = vmlsetmode( VML_LA )
call vmlsetmode( IOR(VML_LA, IOR(VML_FLOAT_CONSISTENT,
VML_ERRMODE_IGNORE )))
call vmlsetmode( VML_RESTORE)
call vmlsetmode( VML_NUM_THREADS_OMP_FIXED)
C:
vmlSetMode( VML_LA );
vmlSetMode( VML_LA | VML_FLOAT_CONSISTENT | VML_ERRMODE_IGNORE
);
vmlSetMode( VML_RESTORE);
vmlSetMode( VML_NUM_THREADS_OMP_FIXED);
2559
GetMode
Gets the VML mode.
Syntax
Fortran:
mod = vmlgetmode()
C:
mod = vmlGetMode( void );
Output Parameters
mod FORTRAN: INTEGER Specifies the packed mode parameter.

C: int
Description
The function vmlGetMode() returns the VML mode parameter that controls accuracy, FPU, and
error handling options. The mod variable value is some combination of the values listed in the
Table 9-14. You can obtain some of these values using the respective mask from the Table
9-15.
Table 9-15 Values of Mask for the mode Parameter
Value of mask Description
VML_ACCURACY_MASK Specifies mask for accuracy mode selection.
VML_FPUMODE_MASK Specifies mask for FPU mode selection.
VML_ERRMODE_MASK Specifies mask for error mode selection.
See example below:
2560
Examples
Fortran:
mod = vmlgetmode()
accm = IAND(mod, VML_ACCURACY_MASK)
fpum = IAND(mod, VML_FPUMODE_MASK)
errm = IAND(mod, VML_ERRMODE_MASK)
C:
accm = vmlGetMode(void )& VML_ACCURACY_MASK;
fpum = vmlGetMode(void )& VML_FPUMODE_MASK;
errm = vmlGetMode(void )& VML_ERRMODE_MASK;
SetErrStatus
Sets the new VML error status according to err
and stores the previous VML error status to olderr.
Syntax
Fortran:
olderr = vmlseterrstatus( err )
C:
olderr = vmlSetErrStatus( err );
Input Parameters
err FORTRAN 77: INTEGER Specifies the VML error status to be set.
INTENT(IN)
C: const int
2561
Output Parameters
olderr FORTRAN: INTEGER Specifies the former VML error status.

C: int
Description
Table 9-16 lists possible values of the err parameter.
Table 9-16 Values of the VML Error Status
Error Status Description
VML_STATUS_OK The execution was completed successfully.
VML_STATUS_BADSIZE The array dimension is not positive.
VML_STATUS_BADMEM NULL pointer is passed.
VML_STATUS_ERRDOM At least one of array values is out of a range of
definition.
VML_STATUS_SING At least one of array values caused a singularity.
VML_STATUS_OVERFLOW An overflow has happened during the calculation
process.
VML_STATUS_UNDERFLOW An underflow has happened during the
calculation process.
Examples
vmlSetErrStatus( VML_STATUS_OK );
vmlSetErrStatus( VML_STATUS_ERRDOM );
vmlSetErrStatus( VML_STATUS_UNDERFLOW );
2562
GetErrStatus
Gets the VML error status.
Syntax
Fortran:
err = vmlgeterrstatus( )
C:
err = vmlGetErrStatus( void );
Output Parameters
err FORTRAN: INTEGER Specifies the VML error status.

C: int
ClearErrStatus
Sets the VML error status to VML_STATUS_OK and
stores the previous VML error status to olderr.
Syntax
Fortran:
olderr = vmlclearerrstatus( )
C:
olderr = vmlClearErrStatus( void );
Output Parameters
olderr FORTRAN: INTEGER Specifies the former VML error status.

C: int
2563
SetErrorCallBack
Sets the additional error handler callback function
and gets the old callback function.
Syntax
Fortran:
oldcallback = vmlseterrorcallback( callback )
C:
oldcallback = vmlSetErrorCallBack( callback );
Input Parameters
Name Description
FORTRAN: Address of the callback function. The callback function has the following
callback format:
INTEGER FUNCTION ERRFUNC(par)
TYPE (ERROR_STRUCTURE) par
! ...
! user error processing
! ...
ERRFUNC = 0
! if ERRFUNC= 0 - standard VML error
handler
! is called after the callback
! if ERRFUNC != 0 - standard VML error
handler
! is not called
END
2564
Name Description
The passed error structure is defined as
follows:
TYPE ERROR_STRUCTURE SEQUENCE
INTEGER*4 ICODE
INTEGER*4 IINDEX
REAL*8 DBA1
REAL*8 DBA2
REAL*8 DBR1
REAL*8 DBR2
CHARACTER(64) CFUNCNAME
INTEGER*4 IFUNCNAMELEN
END TYPE ERROR_STRUCTURE
C: callback Pointer to the callback function. The callback function has the following
format:
static int __stdcall
MyHandler(DefVmlErrorContext*
pContext)
{
/* Handler body */
};
2565
Name Description
The passed error structure is defined as
follows:
typedef struct _DefVmlErrorContext
{
int iCode;/* Error status value */
int iIndex;/* Index for bad array
element, or bad array
dimension, or bad
array pointer */
double dbA1; /* Error argument 1 */
double dbA2; /* Error argument 2 */
double dbR1; /* Error result 1 */
double dbR2; /* Error result 2 */
char cFuncName[64]; /* Function name */
int iFuncNameLen; /* Length of
functionname*/
} DefVmlErrorContext;
Output Parameters
oldcallback Fortran 90: INTEGER FORTRAN: Address of the former callback

function.
C: int
C: Pointer to the former callback function.
NOTE. FORTRAN 77 support is unavailable for this function.
2566
Description
This function is declared in mkl_vml.fi for Fortran 90 interface and in mkl_vml_functions.h
for C interface.
The callback function is called on each VML mathematical function error if
VML_ERRMODE_CALLBACK error mode is set (see Table 9-14).
Use the vmlSetErrorCallBack() function if you need to define your own callback function
instead of default empty callback function.
The input structure for a callback function contains the following information about the error
encountered:
• the input value that caused an error

• location (array index) of this value
• the computed result value
• error code
• name of the function in which the error occurred.
You can insert your own error processing into the callback function. This may include correcting
the passed result values in order to pass them back and resume computation. The standard
error handler is called after the callback function only if it returns 0.
GetErrorCallBack
Gets the additional error handler callback function.
Syntax
Fortran:
callback = vmlgeterrorcallback( )
C:
callback = vmlGetErrorCallBack( void );
2567
Output Parameters
Name Description
callback Fortran 90: Address of the callback

function
C: Pointer to the callback function
ClearErrorCallBack
Deletes the additional error handler callback
function and retrieves the former callback function.
Syntax
Fortran:
oldcallback = vmlclearerrorcallback( )
C:
oldcallback = vmlClearErrorCallBack( void );
Output Parameters
oldcallback Fortran 90: INTEGER FORTRAN: Address of the former callback

function
C: int
C: Pointer to the former callback function
2568
Statistical Functions 10
Statistical functions in Intel® MKL are known as Vector Statistical Library (VSL) that is designed for the
purpose of
• generating vectors of pseudorandom and quasi-random numbers
• performing mathematical operations of convolution and correlation.
The corresponding functionality is described in the respective Random Number Generators and Convolution
and Correlation sections.
Random Number Generators

VSL provides a set of routines implementing commonly used pseudo- or quasi-random number
generators with continuous and discrete distribution. To improve performance, all these routines
were developed using the calls to the highly optimized Basic Random Number Generators (BRNGs)
and the library of vector mathematical functions (VML, see Chapter 9, “Vector Mathematical
Functions”).
VSL provides interfaces both for FORTRAN and C languages. For users of the C and C++ languages
the mkl_vsl.h header file is provided. For users of the Fortran 90 or Fortran 95 language the
mkl_vsl.fi header file is provided. For users of the FORTRAN 77 language the mkl_vsl.f77 header
file is provided. All header files are found in the following directory:
${MKL}/include
The mkl_vsl.fi header is intended for using via the Fortran include clause and is compatible with
both standard forms of F90/F95 sources — the free and 72-columns fixed forms. If you need to use
the VSL interface with 80- or 132-columns fixed form sources, you may add a new file to your project.
That file is formatted as a 72-columns fixed-form source and consists of a single include clause as
follows:
include ‘mkl_vsl.fi’
This include clause causes the compiler to generate the module files mkl_vsl.mod and
mkl_vsl_type.mod, which are used to process the Fortran use clauses referencing to the VSL
interface:
use mkl_vsl_type
use mkl_vsl
2569
Because of this specific feature, you do not need to include the mkl_vsl.fi header into each
source of your project. You only need to include the header into some of the sources. In any
case, make sure that the sources that depend on the VSL interface are compiled after those
that include the header so that the module files mkl_vsl.mod and mkl_vsl_type.mod are
generated prior to using them.
The mkl_vsl.f77 header is intended for using via the Fortran include clause as follows:
include ‘mkl_vsl.f77’
NOTE. For Fortran 90 interface, VSL provides both subroutine-style interface and
function-style interface. Default interface in this case is a function-style interface.
Function-style interface, unlike subroutine-style interface, allows the user to get error
status of each routine. Subroutine-style interface is provided for backward compatibility
only. To use subroutine-style interface, manually include mkl_vsl_subroutine.fi file
instead of mkl_vsl.fi by changing the line include ‘mkl_vsl.fi’ in include\mkl.fi
with the line include ‘mkl_vsl_subroutine.fi’.
For FORTRAN 77 interface, VSL provides only function-style interface.
All VSL routines can be classified into three major categories:
• Transformation routines for different types of statistical distributions, for example, uniform,
normal (Gaussian), binomial, etc. These routines indirectly call basic random number
generators, which are either pseudorandom number generators or quasi-random number
generators. Detailed description of the generators can be found in Distribution Generators
section.
• Service routines to handle random number streams: create, initialize, delete, copy, save to
a binary file, load from a binary file, get the index of a basic generator. The description of
these routines can be found in Service Routines section.
• Registration routines for basic pseudorandom generators and routines that obtain properties
of the registered generators (see Advanced Service Routines section ).
The last two categories are referred to as service routines.
Conventions
This document makes no specific differentiation between random, pseudorandom, and
quasi-random numbers, nor between random, pseudorandom, and quasi-random number
generators unless the context requires otherwise. For details, refer to ‘Random Numbers’ section
in VSL Notes document provided with Intel® MKL.
2570
All generators of nonuniform distributions, both discrete and continuous, are built on the basis
of the uniform distribution generators, called Basic Random Number Generators (BRNGs). The
pseudorandom numbers with nonuniform distribution are obtained through an appropriate
transformation of the uniformly distributed pseudorandom numbers. Such transformations are
referred to as generation methods. For a given distribution, several generation methods can
be used. See VSL Notes for the description of methods available for each generator.
The stream descriptor specifies which BRNG should be used in a given transformation method.
See ‘Random Streams and RNGs in Parallel Computation’ section of VSL Notes.
The term computational node means a logical or physical unit that can process data in parallel.
The following notation is used throughout the text:
N The set of natural numbers N = {1, 2, 3 ...}.
Z The set of integers Z = {... -3, -2, -1, 0, 1, 2, 3 ...}.
R The set of real numbers.
The floor of a (the largest integer less than or equal to a).
⊕ or xor Bitwise exclusive OR.
Binomial coefficient or combination (α∈R, α≥ 0; k∈N ∪{0}).
For α≥k binomial coefficient is defined as
If α < k, then
Φ(x) Cumulative Gaussian distribution function
2571
defined over - ∞ < x < + ∞.

Φ(-∞) = 0, Φ(+∞) = 1.
Γ(α) The complete gamma function
where α > 0.
B(p, q) The complete beta function
where p>0 and q>0.

LCG(a,c, m) Linear Congruential Generator xn+1 = (axn + c) mod m, where a is
called the multiplier, c is called the increment, and m is called the
modulus of the generator.
MCG(a,m) Multiplicative Congruential Generator xn+1 = (axn) mod m is a special
case of Linear Congruential Generator, where the increment c is taken
to be 0.
GFSR(p, q) Generalized Feedback Shift Register Generator
xn = xn-p ⊕xn-q.
Naming Conventions
The names of all VSL functions in FORTRAN are lowercase; names in C may contain both
lowercase and uppercase letters.
The names of generator routines have the following structure:
2572
v<type of result>rng<distribution> for FORTRAN-interface
v<type of result>Rng<distribution> for C-interface,
where v is the prefix of a VSL vector function, and the field <type of result> is either s, d,
or i and specifies one of the following types:
s REAL for FORTRAN-interface
float for C-interface
d DOUBLE PRECISION for FORTRAN-interface
double for C-interface
i INTEGER for FORTRAN-interface
int for C-interface
Prefixes s and d apply to continuous distributions only, prefix i applies only to discrete case.
The prefix rng indicates that the routine is a random generator, and the <distribution> field
specifies the type of statistical distribution.
Names of service routines follow the template below:
vsl<name>,
where vsl is the prefix of a VSL service function. The field <name> contains a short function
name. For a more detailed description of service routines refer to Service Routines and Advanced
Service Routines sections.
Prototype of each generator routine corresponding to a given probability distribution fits the
following structure:
status = <function name>( method, stream, n, r, [<distribution parameters>] ),
where
• method defines the method of generation. A detailed description of this parameter can be
found in Table 10-1. See the next page, where the structure of the method parameter name
is explained.
• stream defines the descriptor of the random stream and must have a non-zero value.
Random streams, descriptors, and their usage are discussed further in Random Streams
and Service Routines.
• n defines the number of random values to be generated. If n is less than or equal to zero,
no values are generated. Furthermore, if n is negative, an error condition is set.
• r defines the destination array for the generated numbers. The dimension of the array must
be large enough to store at least n random numbers.
• status defines the error status of a VSL routine. See Error Reporting section for a detailed
description of error status values.
2573
Additional parameters included into <distribution parameters> field are individual for each
generator routine and are described in detail in Distribution Generators section.
To invoke a distribution generator, use a call to the respective VSL routine. For example, to
obtain a vector r, composed of n independent and identically distributed random numbers with
normal (Gaussian) distribution, that have the mean value a and standard deviation sigma, write
the following:
for FORTRAN-interface
status = vsrnggaussian( method, stream, n, r, a, sigma )
for C-interface
status = vsRngGaussian( method,
stream, n, r, a, sigma )
The name of a method parameter has the following structure:
VSL_METHOD_<precision><distribution>_<method>,
VSL_METHOD_<precision><distribution>_<method>_ACCURATE,
where
<precision> S
for single precision continuous distribution
D for double precision continuous distribution
I for discrete distribution
<distribution> probability distribution
<method> method name.
Type of name structure for method parameter corresponds to fast and accurate modes of
random number generation (see “Distribution Generators” section and VSL Notes for details).
Method names VSL_METHOD_<precision><distribution>_<method> and
VSL_METHOD_<precision><distribution>_<method>_ACCURATE should be used with
vsl<precision>Rng<distribution> function only, where
<precision> s
for single precision continuous distribution
d for double precision continuous distribution
i for discrete distribution
<distribution> probability distribution.
Table 10-1 provides specific predefined values of the method name. The third column contains
names of the functions that use the given method.
2574
Table 10-1 Values of <method> in method parameter
Method Short Description Functions
STD Standard method. Currently there is only one method for Uniform
these functions. (continu-
ous), Uni-
form (dis-
crete), Uni-
formBits
BOXMULLER BOXMULLER generates normally distributed random number Gaussian,

x thru the pair of uniformly distributed numbers u1 and u2 GaussianMV
according to the formula:
BOXMULLER2 BOXMULLER2 generates normally distributed random Gaussian,

numbers x1 and x2 thru the pair of uniformly distributed GaussianMV,
numbers u1 and u2 according to the formulas: Lognormal
ICDF Inverse cumulative distribution function method. Exponen-

tial,
Laplace,
Weibull,
Cauchy,
Rayleigh,
Gumbel,
Bernoulli,
Geometric,
Gaussian,
GaussianMV
2575
GNORM For α > 1, a gamma distributed random number is Gamma

generated as a cube of properly scaled normal random
number; for 0.6 ≤ α < 1, a gamma distributed random
number is generated using rejection from Weibull
distribution; for α < 0.6, a gamma distributed random
number is obtained using transformation of exponential
power distribution; for α = 1, gamma distribution is reduced
to exponential distribution.
CJA For min(p, q) > 1, Cheng method is used; for min(p, Beta
q) < 1, Jöhnk method is used, if q + K·p2+ C ≤ 0 (K=
0.852..., C=-0.956...) otherwise, Atkinson switching
algorithm is used; for max(p, q) < 1, method of Jöhnk
is used; for min(p, q) < 1, max(p, q)> 1, Atkinson
switching algorithm is used (CJA stands for the first letters
of Cheng, Jöhnk, Atkinson); for p = 1 or q = 1, inverse
cumulative distribution function method is used;for p = 1
and q = 1, beta distribution is reduced to uniform
distribution.
BTPE Acceptance/rejection method for ntrial·min(p,1 - p)≥ Binomial

30 with decomposition into 4 regions:
– 2 parallelograms
– triangle
– left exponential tail
– right exponential tail
H2PE Acceptance/rejection method for large mode of distribution Hypergeomet-

with decomposition into 3 regions: ric
– rectangular
2576
PTPE Acceptance/rejection method for λ ≥ 27 with decomposition Poisson

into 4 regions:
– 2 parallelograms
– triangle
– right exponential tail;
otherwise, table lookup method is used.
POISNORM for λ ≥ 1, method based on Poisson inverse CDF Poisson,

approximation by Gaussian inverse CDF; for λ < 1, table PoissonV
lookup method is used.
NBAR Acceptance/rejection method for , NegBinomial
with decomposition into 5 regions:

– rectangular
– 2 trapezoid
Basic Generators
VSL provides the following BRNGs, which differ in speed and other properties:
• the 32-bit multiplicative congruential pseudorandom number generator MCG(1132489760,

231 -1) [L’Ecuyer99]
• the 32-bit generalized feedback shift register pseudorandom number generator
GFSR(250,103) [Kirkpatrick81]
2577
• the combined multiple recursive pseudorandom number generator MRG-32k3a [L’Ecuyer99a]

• the 59-bit multiplicative congruential pseudorandom number generator MCG(1313, 259)
from NAG Numerical Libraries [NAG]
• Wichmann-Hill pseudorandom number generator (a set of 273 basic generators) from NAG
Numerical Libraries [NAG]
• Mersenne Twister pseudorandom number generator MT19937 [Matsumoto98] with period
19937
length 2 -1 of the produced sequence
• Set of 1024 Mersenne Twister pseudorandom number generators MT2203 [Matsumoto98],
2203
[Matsumoto00]. Each of them generates a sequence of period length equal to 2 -1.
Parameters of the generators provide mutual independence of the corresponding sequences.
Besides these pseudorandom number generators, VSL provides two basic quasi-random number
generators:
• Sobol quasi-random number generator [Sobol76], [Bratley88], which works in arbitrary

dimension. For dimensions greater than 40 the user should supply initialization parameters
(initial direction numbers and primitive polynomials or direction numbers) by using
vslNewStreamEx function. See additional details on interface for registration of the
parameters in the library in VSL Notes.
• Niederreiter quasi-random number generator [Bratley92], which works in arbitrary dimension.
For dimensions greater than 318 the user should supply initialization parameters (irreducible
polynomials or direction numbers) by using vslNewStreamEx function. See additional details
on interface for registration of the parameters in the library in VSL Notes.
See some testing results for the generators in VSL Notes and comparative performance data
at http://www.intel.com/software/products/mkl/data/vsl/vsl_performance_data.htm.
VSL provides means of registration of such user-designed generators through the steps described
in Advanced Service Routines section.
For some basic generators, VSL provides two methods of creating independent random streams
in multiprocessor computations, which are the leapfrog method and the block-splitting method.
These sequence splitting methods are also useful in sequential Monte Carlo.
In addition, MT2203 pseudorandom number generator is a set of 1024 generators designed to
create up to 1024 independent random sequences, which might be used in parallel Monte Carlo
simulations. Another generator that has the same feature is Wichmann-Hill. It allows creating
up to 273 independent random streams. The properties of the generators designed for parallel
computations are discussed in detail in [Coddington94].
You may want to design and use your own basic generators. VSL provides means of registration
of such user-designed generators through the steps described in Advanced Service Routines
section.
2578
There is also an option to utilize externally generated random numbers in VSL distribution
generator routines. For this purpose VSL provides three additional basic random number
generators:
– for external random data packed in 32-bit integer array

– for external random data stored in double precision floating-point array; data is supposed
to be uniformly distributed over (a,b) interval
– for external random data stored in single precision floating-point array; data is supposed
to be uniformly distributed over (a,b) interval.
Such basic generators are called the abstract basic random number generators.
See VSL Notes for a more detailed description of the generator properties.
BRNG Parameter Definition

Predefined values for the brng input parameter are as follows:
Table 10-2 Values of brng parameter
Value Short Description
VSL_BRNG_MCG31 A 31-bit multiplicative congruential generator.
VSL_BRNG_R250 A generalized feedback shift register generator.
VSL_BRNG_MRG32K3A A combined multiple recursive generator with two

components of order 3.
VSL_BRNG_MCG59 A 59-bit multiplicative congruential generator.
VSL_BRNG_WH A set of 273 Wichmann-Hill combined multiplicative

congruential generators.
VSL_BRNG_MT19937 A Mersenne Twister pseudorandom number generator.
VSL_BRNG_MT2203 A set of 1024 Mersenne Twister pseudorandom number

generators.
VSL_BRNG_SOBOL A 32-bit Gray code-based generator producing

low-discrepancy sequences for dimensions 1 ≤ s ≤
40; user-defined dimensions are also available.
2579
Value Short Description
VSL_BRNG_NIEDERR A 32-bit Gray code-based generator producing

low-discrepancy sequences for dimensions 1 ≤ s ≤
318; user-defined dimensions are also available.
VSL_BRNG_IABSTRACT An abstract random number generator for integer

arrays.
VSL_BRNG_DABSTRACT An abstract random number generator for double

precision floating-point arrays.
VSL_BRNG_SABSTRACT An abstract random number generator for single

precision floating-point arrays.
See VSL Notes for detailed description.
Random Streams
Random stream (or stream) is an abstract source of pseudo- and quasi-random sequences
of uniform distribution. Users have no direct access to these sequences and operate with stream
state descriptors only. A stream state descriptor, which holds state descriptive information for
a particular BRNG, is a necessary parameter in each routine of a distribution generator. Only
the distribution generator routines operate with random streams directly. See VSL Notes for
details.
NOTE. Random streams associated with abstract basic random number generator are
called the abstract random streams. See VSL Notes for detailed description of abstract
streams and their use.
User can create unlimited number of random streams by VSL Service Routines like NewStream
and utilize them in any distribution generator to get the sequence of numbers of given probability
distribution. When they are no longer needed, the streams should be deleted calling service
routine DeleteStream.
VSL provides service functions SaveStreamF and LoadStreamF to save random stream
descriptive data to a binary file and to read this data from a binary file respectively. See VSL
Notes for detailed description.
2580
Data Types
FORTRAN 77:
INTEGER*4 vslstreamstate(2)
Fortran 90:
TYPE
VSL_STREAM_STATE
INTEGER*4 descriptor1
INTEGER*4 descriptor2
END
TYPE VSL_STREAM_STATE
C:
typedef (void*) VSLStreamStatePtr;
See Advanced Service Routines for the format of the stream state structure for user-designed
generators.
Error Reporting
VSL routines return status codes of the performed operation to report errors and warnings to
the calling program. Thus, it is up to the application to perform error-related actions and/or
recover from the error. The status codes are of integer type and have the following format:
VSL_ERROR_<ERROR_NAME> - indicates VSL errors
VSL_WARNING_<WARNING_NAME> - indicates VSL warnings.
VSL errors are of negative values while warnings are of positive values. The status code of zero
value indicates that the operation is completed successfully: VSL_ERROR_OK (or synonymic
VSL_STATUS_OK).
Table 10-3 Status Codes and Messages
Status Code Message
VSL_ERROR_OK, VSL_STATUS_OK Indicates no error, execution is successful.
VSL_ERROR_BAD_ARG Input argument value is not valid.
VSL_ERROR_NULL_PTR Input pointer argument is NULL.
VSL_ERROR_MEM_FAILURE System cannot allocate memory.
2581
Status Code Message
VSL_ERROR_INVALID_BRNG_INDEX BRNG index is not valid.
VSL_ERROR_BRNGS_INCOMPATIBLE Two BRNGs are not compatible for the operation.
VSL_ERROR_LEAPFROG_UNSUPPORTED BRNG does not support Leapfrog method.
VSL_ERROR_SKIPAHEAD_UNSUPPORTED BRNG does not support Skip-Ahead method.
VSL_ERROR_BAD_STREAM The random stream is invalid.
VSL_ERROR_FILE_OPEN Indicates an error in opening the file.
VSL_ERROR_FILE_READ Indicates an error in reading the file.
VSL_ERROR_FILE_WRITE Indicates an error in writing the file.
VSL_ERROR_FILE_CLOSE Indicates an error in closing the file.
VSL_ERROR_BAD_FILE_FORMAT File format is unknown.
VSL_ERROR_UNSUPPORTED_FILE_VER File format version is not supported.
VSL_ERROR_BRNG_TABLE_FULL Registration cannot be completed due to lack of

free entries in the table of registered BRNGs.
VSL_ERROR_BAD_STREAM_STATE_SIZE The value in StreamStateSize field is bad.
VSL_ERROR_BAD_WORD_SIZE The value in WordSize field is bad.
VSL_ERROR_BAD_NSEEDS The value in NSeeds field is bad.
VSL_ERROR_BAD_NBITS The value in NBits field is bad.
VSL_ERROR_BAD_UPDATE Callback function for an abstract BRNG returns an

invalid number of updated entries in a buffer, that
is, < 0 or >nmax.
VSL_ERROR_NO_NUMBERS Callback function for an abstract BRNG returns

zero as the number of updated entries in a buffer.
VSL_ERROR_INVALID_ABSTRACT_STREAM The abstract random stream is invalid.
2582
VSL Usage Model

A typical algorithm for VSL random number generators is as follows:
1. Create and initialize stream/streams. Functions vslNewStream, vslNewStreamEx,

vslCopyStream, vslCopyStreamState, vslLeapfrogStream, vslSkipAheadStream.
2. Call one or more RNGs.
3. Process the output.
4. Delete the stream/streams. Function vslDeleteStream.
NOTE. You may reiterate steps 2-3. Random number streams may be generated for
different threads.
The following C example demonstrates generation of a random stream that is output of basic
generator MT19937. The seed is equal to 777. The stream is used to generate 10,000 normally
distributed random numbers in blocks of 1,000 random numbers with parameters a = 5 and
sigma = 2. Delete the streams after completing the generation. The purpose of the example
is to calculate the sample mean for normal distribution with the given parameters.
C Example of VSL Usage

#include <stdio.h>
#include “mkl_vsl.h”
int main();
{
double r[1000]; /* buffer for random numbers */
double s; /* average */
VSLStreamStatePtr stream;
int i, j;
/* Initializing */
s = 0.0;
vslNewStream( &stream, VSL_BRNG_MT19937, 777 );
/* Generating */
for ( i=0; i<10; i++ );
{
vdRngGaussian( VSL_METHOD_DGAUSSIAN_ICDF, stream, 1000, r, 5.0, 2.0 );
for ( j=0; j<1000; j++ );
{
s += r[j];
}
2583
}
s /= 10000.0;
/* Deleting the stream */
vslDeleteStream( &stream );
/* Printing results */
printf( “Sample mean of normal distribution = %f\n”, s );
return 0;
}
The Fortran version of the same example is below:
Fortran Example of VSL Usage
include 'mkl_vsl.fi'
program MKL_VSL_GAUSSIAN
USE MKL_VSL_TYPE
USE MKL_VSL
real(kind=8) r(1000) ! buffer for random numbers
real(kind=8) s ! average
real(kind=8) a, sigma ! parameters of normal distribution
TYPE (VSL_STREAM_STATE) :: stream
integer(kind=4) errcode
integer(kind=4) i,j
integer brng,method,seed,n
n = 1000
s = 0.0
a = 5.0
sigma = 2.0
brng=VSL_BRNG_MT19937
method=VSL_METHOD_DGAUSSIAN_ICDF
seed=777
! ***** Initializing *****
errcode=vslnewstream( stream, brng, seed )
! ***** Generating *****
do i = 1,10
errcode=vdrnggaussian( method, stream, n, r, a, sigma )
do j = 1, 1000
s = s + r(j)
end do
2584
end do
s = s / 10000.0
! ***** Deinitialize *****
errcode=vsldeletestream( stream )
! ***** Printing results *****
print *,"Sample mean of normal distribution = ", s
end
Additionally, examples that demonstrate usage of VSL random number generators are available
in the following directories:
${MKL}/examples/vslc/source
${MKL}/examples/vslf/source.
Service Routines
Stream handling comprises routines for creating, deleting, or copying the streams and getting
the index of a basic generator. A random stream can also be saved to and then read from a
binary file. Table 10-4 lists all available service routines
Table 10-4 Service Routines
Routine Short Description
NewStream Creates and initializes a random stream.
NewStreamEx Creates and initializes a random stream for the

generators with multiple initial conditions.
iNewAbstractStream Creates and initializes an abstract random stream for

integer arrays.
dNewAbstractStream Creates and initializes an abstract random stream for

double precision floating-point arrays.
sNewAbstractStream Creates and initializes an abstract random stream for

single precision floating-point arrays.
DeleteStream Deletes previously created stream.
CopyStream Copies a stream to another stream.
2585
Routine Short Description
CopyStreamState Creates a copy of a random stream state.
SaveStreamF Writes a stream to a binary file.
LoadStreamF Reads a stream from a binary file.
LeapfrogStream Initializes the stream by the leapfrog method to

generate a subsequence of the original sequence.
SkipAheadStream Initializes the stream by the skip-ahead method.
GetStreamStateBrng Obtains the index of the basic generator responsible

for the generation of a given random stream.
GetNumRegBrngs Obtains the number of currently registered basic

generators.
NOTE. In the above table, the vsl prefix in the function names is omitted. In the function
reference this prefix is always used in function prototypes and code examples.
Most of the generator-based work comprises three basic steps:
1. Creating and initializing a stream (NewStream, NewStreamEx, CopyStream, CopyStream-

State, LeapfrogStream, SkipAheadStream).
2. Generating random numbers with given distribution, see Distribution Generators.
3. Deleting the stream (DeleteStream).
Note that you can concurrently create multiple streams and obtain random data from one or
several generators by using the stream state. You must use the DeleteStream function to
delete all the streams afterwards.
2586
NewStream
Creates and initializes a random stream.
Syntax
Fortran:
status = vslnewstream( stream, brng, seed )
C:
status = vslNewStream( &stream, brng, seed );
Input Parameters
brng FORTRAN 77: INTEGER Index of the basic generator to initialize the stream.
See Table 10-2 for specific value.
INTENT(IN)
C: const int
seed FORTRAN 77: INTEGER Initial condition of the stream. In the case of a
quasi-random number generator seed parameter is
used to set the dimension. If the dimension is
INTENT(IN)
greater than the dimension that brng can support
C: const unsigned int or is less than 1, then the dimension is assumed to
be equal to 1.
Output Parameters
stream FORTRAN 77: INTEGER*4 Stream state descriptor

stream(2)
Fortran 90:
TYPE(VSL_STREAM_STATE),
INTENT(OUT)
C: VSLStreamStatePtr*
2587
Description
This function is declared in mkl_vsl.f77 for FORTRAN 77 interface, in mkl_vsl.fi for Fortran
90 interface, and in mkl_vsl_functions.h for C interface.
For a basic generator with number brng, this function creates a new stream and initializes it
with a 32-bit seed. The seed is an initial value used to select a particular sequence generated
by the basic generator brng. The function is also applicable for generators with multiple initial
conditions. See VSL Notes for a more detailed description of stream initialization for different
basic generators.
NOTE. This function is not applicable for abstract basic random number generators.
Please use vsliNewAbstractStream, vslsNewAbstractStream or vsldNewAbstract-
Stream to utilize integer, single-precision or double-precision external random data
respectively.
Return Values
VSL_ERROR_INVALID_BRNG_INDEX BRNG index is invalid.
VSL_ERROR_MEM_FAILURE System cannot allocate memory for stream.
NewStreamEx
Creates and initializes a random stream for
generators with multiple initial conditions.
Syntax
Fortran:
status = vslnewstreamex( stream, brng, n, params )
C:
status = vslNewStreamEx( &stream, brng, n, params );
2588
Input Parameters
brng FORTRAN 77: INTEGER Index of the basic generator to initialize the stream.
See Table 10-2 for specific value.
INTENT(IN)
C: const int
n FORTRAN 77: INTEGER Number of initial conditions contained in params

INTENT(IN)
C: const int
params FORTRAN 77: INTEGER Array of initial conditions necessary for the basic
generator brng to initialize the stream. In the case
of a quasi-random number generator only the first
INTENT(IN)
element in params parameter is used to set the
C: const unsigned int dimension. If the dimension is greater than the
dimension that brng can support or is less than 1,
then the dimension is assumed to be equal to 1.
Output Parameters
stream FORTRAN 77: INTEGER*4 Stream state descriptor

stream(2)
Fortran 90:
INTENT(OUT)
Description
2589
The function provides an advanced tool to set the initial conditions for a basic generator if its
input arguments imply several initialization parameters. Initial values are used to select a
particular sequence generated by the basic generator brng. Whenever possible, use NewStream,
which is analogous to vslNewStreamEx except that it takes only one 32-bit initial condition.
In particular, vslNewStreamEx may be used to initialize the state table in Generalized Feedback
Shift Register Generators (GFSRs). A more detailed description of this issue can be found in
VSL Notes.
This function is also used to pass user-defined initialization parameters of quasi-random number
generators into the library. See VSL Notes for the format for their passing and registration in
VSL.
NOTE. This function is not applicable for abstract basic random number generators.
Please use vsliNewAbstractStream, vslsNewAbstractStream or vsldNewAbstract-
Stream to utilize integer, single-precision or double-precision external random data
respectively.
Return Values
VSL_ERROR_INVALID_BRNG_INDEX BRNG index is invalid.
iNewAbstractStream
Creates and initializes an abstract random stream
for integer arrays.
Syntax
Fortran:
status = vslinewabstractstream( stream, n, ibuf, icallback )
C:
status = vsliNewAbstractStream( &stream, n, ibuf, icallback );
2590
Input Parameters
n FORTRAN 77: INTEGER Size of the array ibuf

INTENT(IN)
C: const int
ibuf FORTRAN 77: INTEGER Array of n 32-bit integers

INTENT(IN)
C: const unsigned int
icallback See Note below FORTRAN: Address of the callback function used
for ibuf update
C: Pointer to the callback function used for ibuf

update
NOTE. Format of the callback function in FORTRAN 77:

INTEGER FUNCTION IUPDATEFUNC( stream, n, ibuf, nmin, nmax, idx )
INTEGER*4 stream(2)
INTEGER n
INTEGER*4 ibuf(n)
INTEGER nmin
INTEGER nmax
INTEGER idx
2591
Format of the callback function in Fortran 90:

INTEGER FUNCTION IUPDATEFUNC[C]( stream, n, ibuf, nmin, nmax, idx )
TYPE(VSL_STREAM_STATE),POINTER :: stream[reference]
INTEGER(KIND=4),INTENT(IN) :: n[reference]
INTEGER(KIND=4),INTENT(OUT) :: ibuf[reference](0:n-1)
INTEGER(KIND=4),INTENT(IN) :: nmin[reference]
INTEGER(KIND=4),INTENT(IN) :: nmax[reference]
INTEGER(KIND=4),INTENT(IN) :: idx[reference]
Format of the callback function in C:
int iUpdateFunc( VSLStreamStatePtr stream, int* n, unsigned int ibuf[], int* nmin,
int* nmax, int* idx );
The callback function returns the number of elements in the array actually updated by the
function. Table 10-5 gives the description of the callback function parameters.
Table 10-5 icallback Callback Function Parameters
Parameters Short Description
stream Abstract stream descriptor
n Size of ibuf
ibuf Array of random numbers associated with the stream

stream
nmin Minimal quantity of numbers to update
nmax Maximal quantity of numbers that can be updated
idx Position in cyclic buffer ibuf to start update 0≤idx<n.
2592
Output Parameters
stream FORTRAN 77: INTEGER*4 Descriptor of the stream state structure

stream(2)
Fortran 90:
TINTENT(OUT)
Description
The function creates a new abstract stream and associates it with an integer array ibuf and
user's callback function icallback that is intended for updating of ibuf content.
Return Values
VSL_ERROR_BAD_ARG Parameter n is not positive.
VSL_ERROR_NULL_PTR Either buffer or callback function parameter is a NULL
pointer.
dNewAbstractStream
for double precision floating-point arrays.
Syntax
Fortran:
status = vsldnewabstractstream( stream, n, dbuf, a, b, dcallback )
C:
status = vsldNewAbstractStream( &stream, n, dbuf, a, b, dcallback );
2593
Input Parameters
n FORTRAN 77: INTEGER Size of the array dbuf

INTENT(IN)
C: const int
dbuf FORTRAN 77: DOUBLE Array of n double precision floating-point random

PRECISION numbers with uniform distribution over interval (a,b)
Fortran 90: DOUBLE
PRECISION, INTENT(IN)
C: const double
a FORTRAN 77: DOUBLE Left boundary a

PRECISION
Fortran 90: DOUBLE
C: const double
b FORTRAN 77: DOUBLE Right boundary b

PRECISION
Fortran 90: DOUBLE
C: const double
dcallback See Note below FORTRAN: Address of the callback function used
for update of the array dbuf
C: Pointer to the callback function used for update

of the array dbuf
2594
Output Parameters

stream(2)
Fortran 90:
INTENT(OUT)

INTEGER FUNCTION DUPDATEFUNC( stream, n, dbuf, nmin, nmax, idx )
INTEGER*4 stream(2)
INTEGER n
DOUBLE PRECISION dbuf(n)
INTEGER nmin
INTEGER nmax
INTEGER idx
INTEGER FUNCTION DUPDATEFUNC[C]( stream, n, dbuf, nmin, nmax, idx )
REAL(KIND=8),INTENT(OUT) :: dbuf[reference](0:n-1)
int dUpdateFunc( VSLStreamStatePtr stream, int* n, double dbuf[], int* nmin, int*
nmax, int* idx );
2595
dcallback Callback Function Parameters
n Size of dbuf
dbuf Array of random numbers associated with the stream

stream
idx Position in cyclic buffer dbuf to start update 0≤idx<n.
Description
The function creates a new abstract stream for double precision floating-point arrays with
random numbers of the uniform distribution over interval (a,b). The function associates the
stream with a double precision array dbuf and user's callback function dcallback that is
intended for updating of dbuf content.
Return Values
pointer.
2596
sNewAbstractStream
for single precision floating-point arrays.
Syntax
Fortran:
status = vslsnewabstractstream( stream, n, sbuf, a, b, scallback )
C:
status = vslsNewAbstractStream( &stream, n, sbuf, a, b, scallback );
Input Parameters
n FORTRAN 77: INTEGER Size of the array sbuf

INTENT(IN)
C: const int
sbuf FORTRAN 77: REAL Array of n single precision floating-point random

numbers with uniform distribution over interval (a,b)
Fortran 90: REAL,
INTENT(IN)
C: const float
a FORTRAN 77: REAL Left boundary a

Fortran 90: REAL,
INTENT(IN)
C: const float
b FORTRAN 77: REAL Right boundary b

Fortran 90: REAL,
INTENT(IN)
C: const float
2597
scallback See Note below FORTRAN: Address of the callback function used
for update of the array sbuf
C: Pointer to the callback function used for update

of the array sbuf
Output Parameters

stream(2)
Fortran 90:
INTENT(OUT)

INTEGER FUNCTION SUPDATEFUNC( stream, n, ibuf, nmin, nmax, idx )
INTEGER*4 stream(2)
INTEGER n
REAL sbuf(n)
INTEGER nmin
INTEGER nmax
INTEGER idx
2598
INTEGER FUNCTION SUPDATEFUNC[C]( stream, n, sbuf, nmin, nmax, idx )
REAL(KIND=4), INTENT(OUT) :: sbuf[reference](0:n-1)
int sUpdateFunc( VSLStreamStatePtr stream, int* n, float sbuf[], int* nmin, int*
nmax, int* idx );
Table 10-7 scallback Callback Function Parameters
n Size of sbuf
sbuf Array of random numbers associated with the stream

stream
idx Position in cyclic buffer sbuf to start update 0≤idx<n.
Description
2599
The function creates a new abstract stream for single precision floating-point arrays with random
numbers of the uniform distribution over interval (a,b). The function associates the stream with
a single precision array sbuf and user’s callback function scallback that is intended for updating
of sbuf content.
Return Values
pointer.
DeleteStream
Deletes a random stream.
Syntax
Fortran:
status = vsldeletestream( stream )
C:
status = vslDeleteStream( &stream );
Input/Output Parameters
stream FORTRAN 77: INTEGER*4 FORTRAN: Stream state descriptor. Must have
stream(2) non-zero value. After the stream is successfully
deleted, the descriptor becomes invalid.
Fortran 90:
TYPE(VSL_STREAM_STATE), C: Stream state descriptor. Must have non-zero
INTENT(OUT) value. After the stream is successfully deleted, the
pointer is set to NULL.
2600
Description
The function deletes the random stream created by one of the initialization functions.
Return Values
VSL_ERROR_NULL_PTR srcstream parameter is a NULL pointer.
VSL_ERROR_BAD_STREAM srcstream is not a valid random stream.
VSL_ERROR_MEM_FAILURE System cannot allocate memory for newstream.
CopyStream
Creates a copy of a random stream.
Syntax
Fortran:
status = vslcopystream( newstream, srcstream )
C:
status = vslCopyStream( &newstream, srcstream );
Input Parameters
srcstream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the stream to be copied

srcstream(2)
C: Pointer to the stream state structure to be copied
Fortran 90:
INTENT(IN)
C: const
VSLStreamStatePtr
2601
Output Parameters
newstream FORTRAN 77: INTEGER*4 Copied stream descriptor

newstream(2)
Fortran 90:
INTENT(OUT)
Description
The function creates an exact copy of srcstream and stores its descriptor to newstream.
Return Values
VSL_ERROR_NULL_PTR srcstream parameter is a NULL pointer.
VSL_ERROR_BAD_STREAM srcstream is not a valid random stream.
VSL_ERROR_MEM_FAILURE System cannot allocate memory for newstream.
CopyStreamState
Creates a copy of a random stream state.
Syntax
Fortran:
status = vslcopystreamstate( deststream, srcstream )
C:
status = vslCopyStreamState( deststream, srcstream );
2602
Input Parameters
srcstream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the destination stream

srcstream(2) where the state of scrstream stream is copied
Fortran 90: C: Pointer to the stream state structure, from which

TYPE(VSL_STREAM_STATE), the state structure is copied
INTENT(IN)
C: const
VSLStreamStatePtr
Output Parameters
deststream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the stream with the state
deststream(2) to be copied
Fortran 90: C: Pointer to the stream state structure where the

TYPE(VSL_STREAM_STATE), stream state is copied
INTENT(OUT)
C: VSLStreamStatePtr
Description
The function copies a stream state from srcstream to the existing deststream stream. Both
the streams should be generated by the same basic generator. An error message is generated
when the index of the BRNG that produced deststream stream differs from the index of the
BRNG that generated srcstream stream.
Unlike CopyStream function, which creates a new stream and copies both the stream state
and other data from srcstream, the function CopyStreamState copies only srcstream stream
state data to the generated deststream stream.
2603
Return Values
VSL_ERROR_NULL_PTR Either srcstream or deststream is a NULL pointer.
VSL_ERROR_BAD_STREAM Either srcstream or deststream is not a valid
random stream.
VSL_ERROR_BRNGS_INCOMPATIBLE BRNG associated with srcstream is not compatible
with BRNG associated with deststream.
SaveStreamF
Writes random stream descriptive data to binary
file.
Syntax
Fortran:
errstatus = vslsavestreamf( stream, fname )
C:
errstatus = vslSaveStreamF( stream, fname );
Input Parameters
stream FORTRAN 77: INTEGER*4 Random stream to be written to the file

stream(2)
Fortran 90:
INTENT(IN)
C: const
VSLStreamStatePtr
fname FORTRAN 77: FORTRAN: File name specified as a C-style

CHARACTER(*) null-terminated string
Fortran 90: CHARACTER(*), C: File name specified as a Fortran-style character
INTENT(IN) string
2604
C: const char*
Output Parameters
errstatus FORTRAN: INTEGER Error status of the operation

C: int
Description
The function writes the random stream descriptive data to the binary file. Random stream
descriptive data is saved to the binary file with the name fname. Random stream stream must
be a valid stream created by NewStream-like or CopyStream-like service routines. If the stream
cannot be saved to the file, errstatus has a non-zero value. Random stream can be read from
the binary file using LoadStreamF function.
Return Values
VSL_ERROR_NULL_PTR Either fname or stream is a NULL pointer.
VSL_ERROR_BAD_STREAM stream is not a valid random stream.
VSL_ERROR_MEM_FAILURE System cannot allocate memory for internal needs.
2605
LoadStreamF
Creates new stream and reads stream descriptive
data from binary file.
Syntax
Fortran:
errstatus = vslloadstreamf( stream, fname )
C:
errstatus = vslLoadStreamF( &stream, fname );
Input Parameters
fname FORTRAN 77: FORTRAN: File name specified as a C-style

CHARACTER(*) null-terminated string
Fortran 90: CHARACTER(*), C: File name specified as a Fortran-style character
INTENT(IN) string
C: const char*
Output Parameters
stream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of a new random stream

stream(2)
C: Pointer to a new random stream
Fortran 90:
INTENT(OUT)
errstatus FORTRAN: INTEGER Error status of the operation

C: int
2606
Description
The function creates a new stream and reads stream descriptive data from the binary file. A
new random stream is created using the stream descriptive data from the binary file with the
name fname. If the stream cannot be read (for example, an I/O error occurs or the file format
is invalid), errstatus has a non-zero value. To save random stream to the file, use SaveStreamF
function.
Return Values
VSL_ERROR_NULL_PTR fname is a NULL pointer.
VSL_ERROR_MEM_FAILURE System cannot allocate memory for internal needs.
VSL_ERROR_BAD_FILE_FORMAT Unknown file format.
VSL_ERROR_UNSUPPORTED_FILE_VER File format version is unsupported.
LeapfrogStream
Initializes a stream using the leapfrog method.
Syntax
Fortran:
status = vslleapfrogstream( stream, k, nstreams )
C:
status = vslLeapfrogStream( stream, k, nstreams );
2607
Input Parameters
stream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the stream to which

stream(2) leapfrog method is applied
Fortran 90: C: Pointer to the stream state structure to which

TYPE(VSL_STREAM_STATE), leapfrog method is applied
INTENT(IN)
k FORTRAN 77: INTEGER Index of the computational node, or stream number

INTENT(IN)
C: const int
nstreams FORTRAN 77: INTEGER Largest number of computational nodes, or stride

INTENT(IN)
C: const int
Description
The function function allows generating random numbers in a random stream with non-unit
stride. This feature is particularly useful in distributing random numbers from original stream
across nstreams buffers without generating the original random sequence with subsequent
manual distribution. One of the important applications of the leapfrog method is splitting the
original sequence into non-overlapping subsequences across nstreams computational nodes.
2608
The function initializes the original random stream (see Figure 10-1) to generate random
numbers for the computational node k, 0 ≤ k < nstreams, where nstreams is the largest
number of computational nodes used.
Figure 10-1 Leapfrog Method
The leapfrog method is supported only for those basic generators that allow splitting elements
by the leapfrog method, which is more efficient than simply generating them by a generator
with subsequent manual distribution across computational nodes. See VSL Notes for details.
For quasi-random basic generators the leapfrog method allows generating individual components
of quasi-random vectors instead of whole quasi-random vectors. In this case nstreams
parameter should be equal to the dimension of the quasi-random vector while k parameter
should be the index of a component to be generated (0 ≤ k < nstreams). Other parameters
values are not allowed.
The following code examples illustrate the initialization of three independent streams using the
leapfrog method:
2609
Example 10-1 Fortran 90 Code for Leapfrog Method

...
TYPE(VSL_STREAM_STATE) ::stream1
! Creating 3 identical streams
status = vslnewstream(stream1, VSL_BRNG_MCG31, 174)
status = vslcopystream(stream2, stream1)
status = vslcopystream(stream3, stream1)
! Leapfrogging the streams
status = vslleapfrogstream(stream1, 0, 3)
! Generating random numbers
...
! Deleting the streams
status = vsldeletestream(stream1)
...
2610
Example 10-2 C Code for Leapfrog Method
...
VSLStreamStatePtr stream1;
/* Creating 3 identical streams */
status = vslNewStream(&stream1, VSL_BRNG_MCG31, 174);
status = vslCopyStream(&stream2, stream1);
/* Leapfrogging the streams
*/
status = vslLeapfrogStream(stream1, 0, 3);
/* Generating random numbers
*/
...
/* Deleting the streams
*/
status = vslDeleteStream(&stream1);
...
Return Values
VSL_ERROR_NULL_PTR stream is a NULL pointer.
VSL_ERROR_LEAPFROG_UNSUPPORTED BRNG does not support Leapfrog method.
2611
SkipAheadStream
Initializes a stream using the block-splitting
method.
Syntax
Fortran:
status = vslskipaheadstream( stream, nskip )
C:
status = vslSkipAheadStream( stream, nskip);
Input Parameters
stream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the stream to which

stream(2) block-splitting method is applied
Fortran 90: C: Pointer to the stream state structure to which

TYPE(VSL_STREAM_STATE), block-splitting method is applied
INTENT(IN)
nskip FORTRAN 77: INTEGER*4 Number of skipped elements

nskip(2)
Fortran 90:
INTEGER(KIND=8),
INTENT(IN)
C: const long long int
Description
The function skips a given number of elements in a random stream. This feature is particularly
useful in distributing random numbers from original random stream across different
computational nodes. If the largest number of random numbers used by a computational node
2612
is nskip, then the original random sequence may be split by SkipAheadStream into
non-overlapping blocks of nskip size so that each block corresponds to the respective
computational node. The number of computational nodes is unlimited. This method is known
as the block-splitting method or as the skip-ahead method. (see Figure 10-2).
Figure 10-2 Block-Splitting Method
The skip-ahead method is supported only for those basic generators that allow skipping elements
by the skip-ahead method, which is more efficient than simply generating them by generator
with subsequent manual skipping. See VSL Notes for details.
Please note that for quasi-random basic generators the skip-ahead method works with
components of quasi-random vectors rather than with whole quasi-random vectors. Thus to
skip NS quasi-random vectors, set nskip parameter equal to the NS*DIMEN, where DIMEN is
the dimension of quasi-random vector.
The following code examples illustrate how to initialize three independent streams using
SkipAheadStream function:
2613
Example 10-3 Fortran 90 Code for Block-Splitting Method

...
type(VSL_STREAM_STATE) ::stream1
! Creating the 1st stream
status = vslnewstream(stream1, VSL_BRNG_MCG31, 174)
! Skipping ahead by 7 elements the 2nd stream
status = vslcopystream(stream2, stream1);

status = vslskipaheadstream(stream2, 7);
! Skipping ahead by 7 elements the 3rd stream
status = vslcopystream(stream3, stream2);

status = vslskipaheadstream(stream3, 7);
! Generating random numbers
...
! Deleting the streams
...
2614
Example 10-4 C Code for Block-Splitting Method
/* Creating the 1st stream
*/
status = vslNewStream(&stream1, VSL_BRNG_MCG31, 174);
/* Skipping ahead by 7 elements the 2nd stream */
status = vslSkipAheadStream(stream2, 7);
/* Skipping ahead by 7 elements the 3rd stream */
status = vslSkipAheadStream(stream3, 7);
/* Generating random numbers
*/
...
/* Deleting the streams
*/
...
Return Values
VSL_ERROR_SKIPAHEAD_UNSUPPORTED BRNG does not support Skip-Ahead method.
2615
GetStreamStateBrng
Returns index of a basic generator used for
generation of a given random stream.
Syntax
Fortran:
brng = vslgetstreamstatebrng( stream )
C:
brng = vslGetStreamStateBrng( stream );
Input Parameters
stream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the stream state

stream(2)
C: Pointer to the stream state structure
Fortran 90:
TINTENT(IN)
C: const
VSLStreamStatePtr
Output Parameters
brng FORTRAN: INTEGER Index of the basic generator assigned for the
generation of stream ; negative in case of an error
C: int
Description
The function retrieves the index of a basic generator used for generation of a given random
stream.
2616
Return Values
GetNumRegBrngs
Obtains the number of currently registered basic
generators.
Syntax
Fortran:
nregbrngs = vslgetnumregbrngs( )
C:
nregbrngs = vslGetNumRegBrngs( void );
Output Parameters
nregbrngs FORTRAN: INTEGER Number of basic generators registered at the

moment of the function call
C: int
Description
The function obtains the number of currently registered basic generators. Whenever user
registers a user-designed basic generator, the number of registered basic generators is
incremented. The maximum number of basic generators that can be registered is determined
by VSL_MAX_REG_BRNGS parameter.
Distribution Generators
VSL routines are used to generate random numbers with different types of distribution. Each
function group is introduced below by the type of underlying distribution and contains a short
description of its functionality, as well as specifications of the call sequence for both FORTRAN
2617
and C-interface and the explanation of input and output parameters. Table 10-8 and Table
10-9 list the random number generator routines, together with used data types, output
distributions, and sets correspondence between data types of the generator routines and called
basic random number generators.
Table 10-8 Continuous Distribution Generators
Type of Data BRNG Description
Distribution Types Data Type
Uniform s, d s, d Uniform continuous distribution on the interval

[a,b].
Gaussian s, d s, d Normal (Gaussian) distribution.
GaussianMV s, d s, d Multivariate normal (Gaussian) distribution.
Exponential s, d s, d Exponential distribution.
Laplace s, d s, d Laplace distribution (double exponential

distribution).
Weibull s, d s, d Weibull distribution.
Cauchy s, d s, d Cauchy distribution.
Rayleigh s, d s, d Rayleigh distribution.
Lognormal s, d s, d Lognormal distribution.
Gumbel s, d s, d Gumbel (extreme value) distribution.
Gamma s, d s, d Gamma distribution.
Beta s, d s, d Beta distribution.
Table 10-9 Discrete Distribution Generators

Type of Data BRNG Data Type Description
Distribution Types
Uniform i d Uniform discrete

distribution on the
interval [a,b).
2618
Type of Data BRNG Data Type Description
Distribution Types
UniformBits i i Generator of integer

random values with
uniform bit
distribution.
Bernoulli i s Bernoulli distribution.
Geometric i s Geometric
distribution.
Binomial i d Binomial distribution.
Hypergeomet- i d Hypergeometric
ric distribution.
Poisson i s (for Poisson distribution.

VSL_METHOD_IPOISSON_POISNORM)
s (for distribution parameter λ≥ 27) and

d (for λ < 27) (for
VSL_METHOD_IPOISSON_PTPE)
PoissonV i s Poisson distribution

with varying mean.
NegBinomial i d Negative binomial

distribution, or Pascal
distribution.
The library provides two modes of random number generation, accurate and fast. Accurate
generation mode is intended for the applications that are highly demanding to accuracy of
calculations. When used in this mode, the generators produce random numbers lying completely
within definitional domain for all values of the distribution parameters. For example, random
numbers obtained from the generator of continuous distribution that is uniform on interval
[a,b] belong to this interval irrespective of what a and b values may be. Fast mode provides
high performance of generation and also guaranties that generated random numbers belong
to the definitional domain except for some specific values of distribution parameters. The
generation mode is set by specifying relevant value of the method parameter in generator
routines. List of distributions that support accurate mode of generation is given in the table
below.
2619
Table 10-10 Distribution Generators Supporting Accurate Mode

Type of Distribution Data Types
Uniform s, d
Exponential s, d
Weibull s, d
Rayleigh s, d
Lognormal s, d
Gamma s, d
Beta s, d
See additional details about accurate and fast mode of random number generation in VSL Notes.
Continuous Distributions
This section describes routines for generating random numbers with continuous distribution.
Uniform
Generates random numbers with uniform
distribution.
Syntax
Fortran:
status = vsrnguniform( method, stream, n, r, a, b )
status = vdrnguniform( method, stream, n, r, a, b )
C:
status = vsRngUniform( method, stream, n, r, a, b );
status = vdRngUniform( method, stream, n, r, a, b );
2620
Input Parameters
method FORTRAN 77: INTEGER Generation method; the specific values are as
follows:
INTENT(IN) VSL_METHOD_SUNIFORM_STD
VSL_METHOD_DUNIFORM_STD
C: const int
VSL_METHOD_SUNIFORM_STD_ACCURATE
VSL_METHOD_DUNIFORM_STD_ACCURATE
Standard method.
stream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the stream state structure.
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)
n FORTRAN 77: INTEGER Number of random values to be generated

INTENT(IN)
C: const int
a FORTRAN 77: REAL for Left bound a

vsrnguniform
DOUBLE PRECISION for
vdrnguniform
Fortran 90: REAL,
INTENT(IN) for
vsrnguniform
DOUBLE PRECISION,
INTENT(IN) for
vdrnguniform
2621

C: const float for
vsRngUniform
const double for
vdRngUniform
b FORTRAN 77: REAL for Right bound b

vsrnguniform
vdrnguniform
Fortran 90: REAL,
INTENT(IN) for
vsrnguniform
DOUBLE PRECISION,
INTENT(IN) for
vdrnguniform
C: const float for
vsRngUniform
const double for
vdRngUniform
Output Parameters
r FORTRAN 77: REAL for Vector of n random numbers uniformly distributed

vsrnguniform over the interval [a,b]
vdrnguniform
Fortran 90: REAL,
INTENT(OUT) for
vsrnguniform
2622
DOUBLE PRECISION,
INTENT(OUT) for
vdrnguniform
C: float* for
vsRngUniform
double* for vdRngUniform
Description
The function generates random numbers uniformly distributed over the interval [a, b], where
a, b are the left and right bounds of the interval, respectively, and a, b∈R ; a > b.
The probability density function is given by:
The cumulative distribution function is as follows:
2623
Return Values
is, < 0 or > nmax.
VSL_ERROR_NO_NUMBERS Callback function for an abstract BRNG returns 0 as
the number of updated entries in a buffer.
Gaussian
Generates normally distributed random numbers.
Syntax
Fortran:
status = vsrnggaussian( method, stream, n, r, a, sigma )
status = vdrnggaussian( method, stream, n, r, a, sigma )
C:
status = vsRngGaussian( method, stream, n, r, a, sigma );
status = vdRngGaussian( method, stream, n, r, a, sigma );
2624
Input Parameters
method FORTRAN 77: INTEGER Generation method. The specific values are as
follows:
INTENT(IN) VSL_METHOD_SGAUSSIAN_BOXMULLER
VSL_METHOD_SGAUSSIAN_BOXMULLER2
C: const int
VSL_METHOD_SGAUSSIAN_ICDF
VSL_METHOD_DGAUSSIAN_BOXMULLER
VSL_METHOD_DGAUSSIAN_BOXMULLER2
VSL_METHOD_DGAUSSIAN_ICDF
See brief description of the methods BOXMULLER,
BOXMULLER2, and ICDF in Table 10-1
stream FORTRAN 77: INTEGER*4 FORTRAN: Descriptor of the stream state structure
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN )

INTENT(IN)
C: const int
a FORTRAN 77: REAL for Mean value a.

vsrnggaussian
vdrnggaussian
Fortran 90: REAL,
INTENT(IN) for
vsrnggaussian
2625

DOUBLE PRECISION,
INTENT(IN) for
vdrnggaussian
C: const float for
vsRngGaussian
const double for
vdRngGaussian
sigma FORTRAN 77: REAL for Standard deviation σ.

vsrnggaussian
vdrnggaussian
Fortran 90: REAL,
INTENT(IN) for
vsrnggaussian
DOUBLE PRECISION,
INTENT(IN) for
vdrnggaussian
C: const float for
vsRngGaussian
const double for
vdRngGaussian
Output Parameters
r FORTRAN 77: REAL for Vector of n normally distributed random numbers

vsrnggaussian
vdrnggaussian
2626
Fortran 90: REAL,
INTENT(OUT) for
vsrnggaussian
DOUBLE PRECISION,
INTENT(OUT) for
vdrnggaussian
C: float* for
vsRngGaussian
double* for vdRngGaussian
Description
The function generates random numbers with normal (Gaussian) distribution with mean value
a and standard deviation σ, where
a, σ∈R ; σ > 0.
2627
The cumulative distribution function Fa,σ(x) can be expressed in terms of standard normal
distribution Φ(x) as
Fa,σ(x) = Φ((x - a)/σ)
Return Values
is, < 0 or > nmax.
GaussianMV
Generates random numbers from multivariate
Syntax
Fortran:
status = vsrnggaussianmv( method, stream, n, r, dimen, mstorage, a, t )
status = vdrnggaussianmv( method, stream, n, r, dimen, mstorage, a, t )
C:
status = vsRngGaussianMV( method, stream, n, r, dimen, mstorage, a, t );
status = vdRngGaussianMV( method, stream, n, r, dimen, mstorage, a, t );
2628
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SGAUSSIANMV_BOXMULLER
VSL_METHOD_SGAUSSIANMV_BOXMULLER2
C: const int
VSL_METHOD_SGAUSSIANMV_ICDF
VSL_METHOD_DGAUSSIANMV_BOXMULLER
VSL_METHOD_DGAUSSIANMV_BOXMULLER2
VSL_METHOD_DGAUSSIANMV_ICDF
See brief description of the methods BOXMULLER,
BOXMULLER2, and ICDF in Table 10-1
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
dimen FORTRAN 77: INTEGER Dimension d ( d ≥ 1) of output random vectors

INTENT(IN)
C: const int
2629
mstorage FORTRAN 77: INTEGER FORTRAN: Matrix storage scheme for upper
triangular matrix TT. The routine supports three
matrix storage schemes:
INTENT(IN)
C: const int • VSL_MATRIX_STORAGE_FULL— all d x d
elements of the matrix TT are passed, however,
only the upper triangle part is actually used in
the routine.
• VSL_MATRIX_STORAGE_PACKED— upper triangle
elements of TT are packed by rows into a
• VSL_MATRIX_STORAGE_DIAGONAL— only diagonal
elements of TT are passed.
C: Matrix storage scheme for lower triangular matrix

T. The routine supports three matrix storage
schemes:
• VSL_MATRIX_STORAGE_FULL— all d x d
elements of the matrix T are passed, however,
only the lower triangle part is actually used in the
routine.
• VSL_MATRIX_STORAGE_PACKED— lower triangle
elements of T are packed by rows into a
• VSL_MATRIX_STORAGE_DIAGONAL— only diagonal
elements of T are passed.
a FORTRAN 77: REAL for Mean vector a of dimension d

vsrnggaussianmv
vdrnggaussianmv
Fortran 90: REAL,
INTENT(IN) for
vsrnggaussianmv
2630
DOUBLE PRECISION,
INTENT(IN) for
vdrnggaussianmv
C: const float* for
vsRngGaussianMV
const double* for
vdRngGaussianMV
t FORTRAN 77: REAL for FORTRAN: Elements of the upper triangular matrix
vsrnggaussianmv passed according to the matrix TT storage scheme
mstorage.
vdrnggaussianmv C: Elements of the lower triangular matrix passed
according to the matrix T storage scheme mstorage.
Fortran 90: REAL,
INTENT(IN) for
vsrnggaussianmv
DOUBLE PRECISION,
INTENT(IN) for
vdrnggaussianmv
C: const float* for
vsRngGaussianMV
const double* for
vdRngGaussianMV
Output Parameters
r FORTRAN 77: REAL for Array of n random vectors of dimension dimen

vsrnggaussianmv
vdrnggaussianmv
2631

Fortran 90: REAL,
INTENT(OUT) for
vsrnggaussianmv
DOUBLE PRECISION,
INTENT(OUT) for
vdrnggaussianmv
C: float* for
vsRngGaussianMV
double* for
vdRngGaussianMV
Description
The function generates random numbers with d-variate normal (Gaussian) distribution with
mean value a and variance-covariance matrix C, where a∈Rd; C is a d×d symmetric
where x∈Rd .
Matrix C can be represented as C = TTT, where T is a lower triangular matrix - Cholesky factor
of C.
Instead of variance-covariance matrix C the generation routines require Cholesky factor of C

in input. To compute Cholesky factor of matrix C, the user may call MKL LAPACK routines for
matrix factorization: ?potrf or ?pptrf for v?RngGaussianMV/v?rnggaussianmv routines (?
means either s or d for single and double precision respectively). See Application Notes for
more details.
2632
Application Notes
Since matrices are stored in Fortran by columns, while in C they are stored by rows, the usage
of MKL factorization routines (assuming Fortran matrices storage) in combination with
multivariate normal RNG (assuming C matrix storage) is slightly different in C and Fortran. The
following tables help in using these routines in C and Fortran. For further information please
refer to the appropriate VSL example file.
Table 10-11 Using Cholesky Factorization Routines in Fortran
Matrix Storage Scheme Variance-Covariance Factorization UPLO Result of
Matrix Argument Routine Parameter Factorization
in as Input
Factorization Argument
Routine for RNG
VSL_MATRIX_STORAGE_FULL C in Fortran spotrf for ‘U’ Upper

two-dimensional vsrnggaussianmv triangle of
array TT. Lower
dpotrf for
triangle is
vdrnggaussianmv
not used.
VSL_MATRIX_STORAGE_PACKED Lower triangle of spptrf for ‘L’ Upper

C packed by vsrnggaussianmv triangle of
columns into TT packed
dpptrf for
one-dimensional by rows
vdrnggaussianmv
array into
one-dimensional
array.
Table 10-12 Using Cholesky Factorization Routines in C

in as Input
Routine for RNG
VSL_MATRIX_STORAGE_FULL C in C spotrf for ‘U’ Upper

two-dimensional vsRngGaussianMV triangle of
array TT. Lower
dpotrf for
triangle is
vdRngGaussianMV
not used.
2633

in as Input
Routine for RNG
VSL_MATRIX_STORAGE_PACKED Lower triangle of spptrf for ‘L’ Upper

C packed by vsRngGaussianMV triangle of
columns into TT packed
dpptrf for
one-dimensional by rows
vdRngGaussianMV
array into
one-dimensional
array.
Return Values
is, < 0 or > nmax.
Exponential
Generates exponentially distributed random
numbers.
Syntax
Fortran:
status = vsrngexponential( method, stream, n, r, a, beta )
status = vdrngexponential( method, stream, n, r, a, beta )
2634
C:
status = vsRngExponential( method, stream, n, r, a, beta );
status = vdRngExponential( method, stream, n, r, a, beta );
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SEXPONENTIAL_ICDF
VSL_METHOD_DEXPONENTIAL_ICDF
C: const int
VSL_METHOD_SEXPONENTIAL_ICDF_ACCURATE
VSL_METHOD_DEXPONENTIAL_ICDF_ACCURATE
Inverse cumulative distribution function method
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
a FORTRAN 77: REAL for Displacement a

vsrngexponential
vdrngexponential
2635

Fortran 90: REAL,
INTENT(IN) for
vsrngexponential
DOUBLE PRECISION,
INTENT(IN) for
vdrngexponential
C: const float for
vsRngExponential
C: const double for
vdRngExponential
beta FORTRAN 77: REAL for Scalefactor β.

vsrngexponential
vdrngexponential
Fortran 90: REAL,
INTENT(IN) for
vsrngexponential
DOUBLE PRECISION,
INTENT(IN) for
vdrngexponential
C: const float for
vsRngExponential
const double for
vdRngExponential
2636
Output Parameters
r FORTRAN 77: REAL for Vector of n exponentially distributed random

vsrngexponential numbers
vdrngexponential
Fortran 90: REAL,
INTENT(OUT) for
vsrngexponential
DOUBLE PRECISION,
INTENT(OUT) for
vdrngexponential
C: float* for
vsRngExponential
double* for
vdRngExponential
Description
The function generates random numbers with exponential distribution that has displacement
a and scalefactor β, where a, β∈R ; β > 0.
2637
Return Values
is, < 0 or > nmax.
Laplace
Generates random numbers with Laplace
distribution.
Syntax
Fortran:
status = vsrnglaplace( method, stream, n, r, a, beta )
status = vdrnglaplace( method, stream, n, r, a, beta )
C:
status = vsRngLaplace( method, stream, n, r, a, beta );
status = vdRngLaplace( method, stream, n, r, a, beta );
2638
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SLAPLACE_ICDF
VSL_METHOD_DLAPLACE_ICDF
C: const int
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
a FORTRAN 77: REAL for Mean value a

vsrnglaplace
vdrnglaplace
Fortran 90: REAL,
INTENT(IN) for
vsrnglaplace
DOUBLE PRECISION,
INTENT(IN) for
vdrnglaplace
C: const float for
vsRngLaplace
2639

const double for
vdRngLaplace

vsrnglaplace
vdrnglaplace
Fortran 90: REAL,
INTENT(IN) for
vsrnglaplace
DOUBLE PRECISION,
INTENT(IN) for
vdrnglaplace
C: const float for
vsRngLaplace
const double for
vdRngLaplace
Output Parameters
r FORTRAN 77: REAL for Vector of n Laplace distributed random numbers

vsrnglaplace
vdrnglaplace
Fortran 90: REAL,
INTENT(OUT) for
vsrnglaplace
DOUBLE PRECISION,
INTENT(OUT) for
vdrnglaplace
2640
C: float* for
vsRngLaplace
double* for vdRngLaplace
Description
The function generates random numbers with Laplace distribution with mean value (or average)
a and scalefactor β, where a, β∈R ; β > 0. The scalefactor value determines the standard
deviation as
2641
Return Values
is, < 0 or > nmax.
Weibull
Generates Weibull distributed random numbers.
Syntax
Fortran:
status = vsrngweibull( method, stream, n, r, alpha, a, beta )
status = vdrngweibull( method, stream, n, r, alpha, a, beta )
C:
status = vsRngWeibull( method, stream, n, r, alpha, a, beta );
status = vdRngWeibull( method, stream, n, r, alpha, a, beta );
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SWEIBULL_ICDF
VSL_METHOD_DWEIBULL_ICDF
C: const int
VSL_METHOD_SWEIBULL_ICDF_ACCURATE
VSL_METHOD_DWEIBULL_ICDF_ACCURATE
2642
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
alpha FORTRAN 77: REAL for Shape α.

vsrngweibull
vdrngweibull
Fortran 90: REAL,
INTENT(IN) for
vsrngweibull
DOUBLE PRECISION,
INTENT(IN) for
vdrngweibull
C: const float for
vsRngWeibull
const double for
vdRngWeibull

vsrngweibull
vdrngweibull
2643

Fortran 90: REAL,
INTENT(IN) for
vsrngweibull
DOUBLE PRECISION,
INTENT(IN) for
vdrngweibull
C: const float for
vsRngWeibull
const double for
vdRngWeibull

vsrngweibull
vdrngweibull
Fortran 90: REAL,
INTENT(IN) for
vsrngweibull
DOUBLE PRECISION,
INTENT(IN) for
vdrngweibull
C: const float for
vsRngWeibull
const double for
vdRngWeibull
Output Parameters
r FORTRAN 77: REAL for Vector of n Weibull distributed random numbers

vsrngweibull
2644
vdrngweibull
Fortran 90: REAL,
INTENT(OUT) for
vsrngweibull
DOUBLE PRECISION,
INTENT(OUT) for
vdrngweibull
C: float* for
vsRngWeibull
double* for vdRngWeibull
Description
The function generates Weibull distributed random numbers with displacement a, scalefactor
β, and shape α, where α, β, a∈R ; α > 0, β > 0.
2645
Return Values
is, < 0 or > nmax.
Cauchy
Generates Cauchy distributed random values.
Syntax
Fortran:
status = vsrngcauchy( method, stream, n, r, a, beta )
status = vdrngcauchy( method, stream, n, r, a, beta )
C:
status = vsRngCauchy( method, stream, n, r, a, beta );
status = vdRngCauchy( method, stream, n, r, a, beta );
2646
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SCAUCHY_ICDF
VSL_METHOD_DCAUCHY_ICDF
C: const int
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
a FORTRAN 77: REAL for Displacement a.

vsrngcauchy
vdrngcauchy
Fortran 90: REAL,
INTENT(IN) for
vsrngcauchy
DOUBLE PRECISION,
INTENT(IN) for
vdrngcauchy
C: const float for
vsRngCauchy
2647

const double for
vdRngCauchy

vsrngcauchy
vdrngcauchy
Fortran 90: REAL,
INTENT(IN) for
vsrngcauchy
DOUBLE PRECISION,
INTENT(IN) for
vdrngcauchy
C: const float for
vsRngCauchy
const double for
vdRngCauchy
Output Parameters
r FORTRAN 77: REAL for Vector of n Cauchy distributed random numbers

vsrngcauchy
vdrngcauchy
Fortran 90: REAL,
INTENT(OUT) for
vsrngcauchy
DOUBLE PRECISION,
INTENT(OUT) for
vdrngcauchy
C: float* for vsRngCauchy
2648
double* for vdRngCauchy
Description
This function is declared in mkl_vsl.f77 for FORTRAN 77 interface, in mkl_vsl.fi
The function generates Cauchy distributed random numbers with displacement a and scalefactor
β, where a, β∈R ; β > 0.
Return Values
is, < 0 or > nmax.
2649
Rayleigh
Generates Rayleigh distributed random values.
Syntax
Fortran:
status = vsrngrayleigh( method, stream, n, r, a, beta )
status = vdrngrayleigh( method, stream, n, r, a, beta )
C:
status = vsRngRayleigh( method, stream, n, r, a, beta );
status = vdRngRayleigh( method, stream, n, r, a, beta );
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SRAYLEIGH_ICDF
VSL_METHOD_DRAYLEIGH_ICDF
C: const int
VSL_METHOD_SRAYLEIGH_ICDF_ACCURATE
VSL_METHOD_DRAYLEIGH_ICDF_ACCURATE
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
2650
C: const int

vsrngrayleigh
vdrngrayleigh
Fortran 90: REAL,
INTENT(IN) for
vsrngrayleigh
DOUBLE PRECISION,
INTENT(IN) for
vdrngrayleigh
C: const float for
vsRngRayleigh
const double for
vdRngRayleigh

vsrngrayleigh
vdrngrayleigh
Fortran 90: REAL,
INTENT(IN) for
vsrngrayleigh
DOUBLE PRECISION,
INTENT(IN) for
vdrngrayleigh
C: const float for
vsRngRayleigh
const double for
vdRngRayleigh
2651
Output Parameters
r FORTRAN 77: REAL for Vector of n Rayleigh distributed random numbers

vsrngrayleigh
vdrngrayleigh
Fortran 90: REAL,
INTENT(OUT) for
vsrngrayleigh
DOUBLE PRECISION,
INTENT(OUT) for
vdrngrayleigh
C: float* for
vsRngRayleigh
double* for vdRngRayleigh
Description
The function generates Rayleigh distributed random numbers with displacement a and scalefactor
Rayleigh distribution is a special case of Weibull distribution, where the shape parameter α
= 2.
2652
Return Values
is, < 0 or > nmax.
Lognormal
Generates lognormally distributed random
numbers.
Syntax
Fortran:
status = vsrnglognormal( method, stream, n, r, a, sigma, b, beta )
status = vdrnglognormal( method, stream, n, r, a, sigma, b, beta )
C:
status = vsRngLognormal( method, stream, n, r, a, sigma, b, beta );
status = vdRngLognormal( method, stream, n, r, a, sigma, b, beta );
2653
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SLOGNORMAL_BOXMULLER2
VSL_METHOD_DLOGNORMAL_BOXMULLER2
C: const int
VSL_METHOD_SLOGNORMAL_BOXMULLER2_ACCURATE
VSL_METHOD_DLOGNORMAL_BOXMULLER2_ACCURATE
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
a FORTRAN 77: REAL for Average a of the subject normal distribution

vsrnglognormal
vdrnglognormal
Fortran 90: REAL,
INTENT(IN) for
vsrnglognormal
DOUBLE PRECISION,
INTENT(IN) for
vdrnglognormal
2654
C: const float for
vsRngLognormal
const double for
vdRngLognormal
sigma FORTRAN 77: REAL for Standard deviation σ of the subject normal
vsrnglognormal distribution
vdrnglognormal
Fortran 90: REAL,
INTENT(IN) for
vsrnglognormal
DOUBLE PRECISION,
INTENT(IN) for
vdrnglognormal
C: const float for
vsRngLognormal
const double for
vdRngLognormal
b FORTRAN 77: REAL for Displacement b

vsrnglognormal
vdrnglognormal
Fortran 90: REAL,
INTENT(IN) for
vsrnglognormal
DOUBLE PRECISION,
INTENT(IN) for
vdrnglognormal
C: const float for
vsRngLognormal
2655

const double for
vdRngLognormal

vsrnglognormal
vdrnglognormal
Fortran 90: REAL,
INTENT(IN) for
vsrnglognormal
DOUBLE PRECISION,
INTENT(IN) for
vdrnglognormal
C: const float for
vsRngLognormal
const double for
vdRngLognormal
Output Parameters
r FORTRAN 77: REAL for Vector of n lognormally distributed random numbers

vsrnglognormal
vdrnglognormal
Fortran 90: REAL,
INTENT(OUT) for
vsrnglognormal
DOUBLE PRECISION,
INTENT(OUT) for
vdrnglognormal
2656
C: float* for
vsRngLognormal
double* for
vdRngLognormal
Description
The function generates lognormally distributed random numbers with average of distribution
a and standard deviation σ of subject normal distribution, displacement b, and scalefactor β,
where a, σ, b, β∈R ; σ > 0 , β > 0.
Return Values
2657

is, < 0 or > nmax.
Gumbel
Generates Gumbel distributed random values.
Syntax
Fortran:
status = vsrnggumbel( method, stream, n, r, a, beta )
status = vdrnggumbel( method, stream, n, r, a, beta )
C:
status = vsRngGumbel( method, stream, n, r, a, beta );
status = vdRngGumbel( method, stream, n, r, a, beta );
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SGUMBEL_ICDF
VSL_METHOD_DGUMBEL_ICDF
C: const int
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)
2658

INTENT(IN)
C: const int

vsrnggumbel
vdrnggumbel
Fortran 90: REAL,
INTENT(IN) for
vsrnggumbel
DOUBLE PRECISION,
INTENT(IN) for
vdrnggumbel
C: const float for
vsRngGumbel
const double for
vdRngGumbel

vsrnggumbel
vdrnggumbel
Fortran 90: REAL,
INTENT(IN) for
vsrnggumbel
DOUBLE PRECISION,
INTENT(IN) for
vdrnggumbel
C: const float for
vsRngGumbel
2659

const double for
vdRngGumbel
Output Parameters
r FORTRAN 77: REAL for Vector of n random numbers with Gumbel

vsrnggumbel distribution
vdrnggumbel
Fortran 90: REAL,
INTENT(OUT) for
vsrnggumbel
DOUBLE PRECISION,
INTENT(OUT) for
vdrnggumbel
C: float* for vsRngGumbel
double* for vdRngGumbel
Description
The function generates Gumbel distributed random numbers with displacement a and scalefactor
2660
Return Values
is, < 0 or > nmax.
Gamma
Generates gamma distributed random values.
Syntax
Fortran:
status = vsrnggamma( method, stream, n, r, alpha, a, beta )
status = vdrnggamma( method, stream, n, r, alpha, a, beta )
C:
status = vsRngGamma( method, stream, n, r, alpha, a, beta );
status = vdRngGamma( method, stream, n, r, alpha, a, beta );
2661
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SGAMMA_GNORM
VSL_METHOD_DGAMMA_GNORM
C: const int
VSL_METHOD_SGAMMA_GNORM_ACCURATE
VSL_METHOD_DGAMMA_GNORM_ACCURATE
Acceptance/rejection method using random numbers
with Gaussian distribution. See brief description of
the method GNORM in Table 10-1
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
alpha FORTRAN 77: REAL for Shape α.

vsrnggamma
vdrnggamma
Fortran 90: REAL,
INTENT(IN) for vsrnggamma
DOUBLE PRECISION,
INTENT(IN) for vdrnggamma
2662
C: const float for
vsRngGamma
const double for
vdRngGamma

vsrnggamma
vdrnggamma
Fortran 90: REAL,
DOUBLE PRECISION,
C: const float for
vsRngGamma
const double for
vdRngGamma

vsrnggamma
vdrnggamma
Fortran 90: REAL,
DOUBLE PRECISION,
C: const float for
vsRngGamma
const double for
vdRngGamma
2663
Output Parameters
r FORTRAN 77: REAL for Vector of n random numbers with gamma distribution
vsrnggamma
vdrnggamma
Fortran 90: REAL,
INTENT(OUT) for
vsrnggamma
DOUBLE PRECISION,
INTENT(OUT) for
vdrnggamma
C: float* for vsRngGamma
double* for vdRngGamma
Description
The function generates random numbers with gamma distribution that has shape parameter
α, displacement a, and scale parameter β, where α, β, and a∈R ; α > 0, β > 0.
where Γ(α) is the complete gamma function.
2664
Return Values
is, < 0 or > nmax.
Beta
Generates beta distributed random values.
Syntax
Fortran:
status = vsrngbeta( method, stream, n, r, p, q, a, beta )
status = vdrngbeta( method, stream, n, r, p, q, a, beta )
C:
status = vsRngBeta( method, stream, n, r, p, q, a, beta );
status = vdRngBeta( method, stream, n, r, p, q, a, beta );
2665
Input Parameters
follows:
INTENT(IN) VSL_METHOD_SBETA_CJA
VSL_METHOD_DBETA_CJA
C: const int
VSL_METHOD_SBETA_CJA_ACCURATE
VSL_METHOD_DBETA_CJA_ACCURATE
See brief description of the method CJA in Table
10-1
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
p FORTRAN 77: REAL for Shape p

vsrngbeta
vdrngbeta
Fortran 90: REAL,
INTENT(IN) for vsrngbeta
DOUBLE PRECISION,
INTENT(IN) for vdrngbeta
2666
C: const float for
vsRngBeta
const double for
vdRngBeta
q FORTRAN 77: REAL for Shape q

vsrngbeta
vdrngbeta
Fortran 90: REAL,
DOUBLE PRECISION,
C: const float for
vsRngBeta
const double for
vdRngBeta

vsrngbeta
vdrngbeta
Fortran 90: REAL,
DOUBLE PRECISION,
C: const float for
vsRngBeta
const double for
vdRngBeta
2667

vsrngbeta
vdrngbeta
Fortran 90: REAL,
DOUBLE PRECISION,
C: const float for
vsRngBeta
const double for
vdRngBeta
Output Parameters
r FORTRAN 77: REAL for Vector of n random numbers with beta distribution
vsrngbeta
vdrngbeta
Fortran 90: REAL,
INTENT(OUT) for vsrngbeta
DOUBLE PRECISION,
INTENT(OUT) for vdrngbeta
C: float* for vsRngBeta
double* for vdRngBeta
Description
This function is declared in mkl_vsl.f77 for FORTRAN 77 interface, in mkl_vsl.fi
2668
The function generates random numbers with beta distribution that has shape parameters p
and q, displacement a, and scale parameter β, where p, q, a, and β∈R ; p > 0, q > 0, β
> 0.
where B(p, q) is the complete beta function.
Return Values
is, < 0 or > nmax.
Discrete Distributions
This section describes routines for generating random numbers with discrete distribution.
2669
Uniform
Generates random numbers uniformly distributed
over the interval [a, b).
Syntax
Fortran:
status = virnguniform( method, stream, n, r, a, b )
C:
status = viRngUniform( method, stream, n, r, a, b );
Input Parameters
method FORTRAN 77: INTEGER Generation method; dummy and set to 0 in case of
uniform distribution. The specific value is as follows:
INTENT(IN) VSL_METHOD_IUNIFORM_STD
C: const int Standard method. Currently there is only one method

for this distribution generator.
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
2670
a FORTRAN 77: INTEGER Left interval bound a

INTENT(IN)
C: const int
b FORTRAN 77: INTEGER Right interval bound b

INTENT(IN)
C: const int
Output Parameters
r FORTRAN 77: INTEGER Vector of n random numbers uniformly distributed

over the interval [a,b)
INTENT(OUT)
C: int*
Description
The function generates random numbers uniformly distributed over the interval [a, b), where
a, b are the left and right bounds of the interval respectively, and a, b∈Z; a < b.
The probability distribution is given by:
2671
Return Values
is, < 0 or > nmax.
UniformBits
Generates integer random values with uniform bit
distribution.
Syntax
Fortran:
status = virnguniformbits( method, stream, n, r )
C:
status = viRngUniformBits( method, stream, n, r );
2672
Input Parameters
method FORTRAN 77: INTEGER Generation method; dummy and set to 0. The
specific value is as follows:
INTENT(IN) VSL_METHOD_IUNIFORMBITS_STD
C: const int Standard method. Currently there is only one method

for this distribution generator.
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
Output Parameters
r FORTRAN 77: INTEGER FORTRAN: Vector of n random integer numbers. If

the stream was generated by a 64 or a 128-bit
generator, each integer value is represented by two
INTENT(OUT)
or four elements of r respectively. The number of
C: unsigned int* bytes occupied by each integer is contained in the
field wordsize of the structure
VSL_BRNG_PROPERTIES. The total number of bits
that are actually used to store the value are
contained in the field nbits of the same structure.
See Advanced Service Routines for a more detailed
discussion of VSLBRngProperties.
2673

C: Vector of n random integer numbers. If the
stream was generated by a 64 or a 128-bit
generator, each integer value is represented by two
or four elements of r respectively. The number of
bytes occupied by each integer is contained in the
field WordSize of the structure VSLBRngProperties.
The total number of bits that are actually used to
store the value are contained in the field NBits of
the same structure.See Advanced Service Routines
for a more detailed discussion of
VSLBRngProperties.
Description
The function generates integer random values with uniform bit distribution.The generators of
uniformly distributed numbers can be represented as recurrence relations over integer values
in modular arithmetic. Apparently, each integer can be treated as a vector of several bits. In
a truly random generator, these bits are random, while in pseudorandom generators this
randomness can be violated. For example, a well known drawback of linear congruential
generators is that lower bits are less random than higher bits (for example, see [Knuth81]).
For this reason, care should be taken when using this function. Typically, in a 32-bit LCG only
24 higher bits of an integer value can be considered random. See VSL Notes for details.
Return Values
is, < 0 or > nmax.
2674
Bernoulli
Generates Bernoulli distributed random values.
Syntax
Fortran:
status = virngbernoulli( method, stream, n, r, p )
C:
status = viRngBernoulli( method, stream, n, r, p );
Input Parameters
method FORTRAN 77: INTEGER Generation method. The specific value is as follows:
Fortran 90: INTEGER, VSL_METHOD_IBERNOULLI_ICDF
INTENT(IN) Inverse cumulative distribution function method.
C: const int
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
p FORTRAN 77: DOUBLE Success probability p of a trial

PRECISION
Fortran 90: DOUBLE
2675

C: const double
Output Parameters
r FORTRAN 77: INTEGER Vector of n Bernoulli distributed random values

INTENT(OUT)
C: int*
Description
The function generates Bernoulli distributed random numbers with probability p of a single trial
success, where
p∈R; 0 ≤ p ≤ 1.
A variate is called Bernoulli distributed, if after a trial it is equal to 1 with probability of success
p, and to 0 with probability 1 — p.

P(X = 1) = p
P(X = 0) = 1 - p
2676
Return Values
is, < 0 or > nmax.
Geometric
Generates geometrically distributed random values.
Syntax
Fortran:
status = virnggeometric( method, stream, n, r, p )
C:
status = viRngGeometric( method, stream, n, r, p );
Input Parameters
Fortran 90: INTEGER, VSL_METHOD_IGEOMETRIC_ICDF
INTENT(IN) Inverse cumulative distribution function method.
C: const int
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)
2677


INTENT(IN)
C: const int
p FORTRAN 77: DOUBLE Success probability p of a trial

PRECISION
Fortran 90: DOUBLE
C: const double
Output Parameters
r FORTRAN 77: INTEGER Vector of n geometrically distributed random values

INTENT(OUT)
C: int*
Description
The function generates geometrically distributed random numbers with probability p of a single
trial success, where p∈R; 0 < p < 1.
A geometrically distributed variate represents the number of independent Bernoulli trials

preceding the first success. The probability of a single Bernoulli trial success is p.

k
P(X = k) = p·(1 - p) , k∈ {0,1,2, ... }.
2678
Return Values
is, < 0 or > nmax.
Binomial
Generates binomially distributed random numbers.
Syntax
Fortran:
status = virngbinomial( method, stream, n, r, ntrial, p )
C:
status = viRngBinomial( method, stream, n, r, ntrial, p );
Input Parameters
Fortran 90: INTEGER, VSL_METHOD_IBINOMIAL_BTPE
INTENT(IN)
2679

C: const int See brief description of the BTPE method in Table
10-1 .
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
ntrials FORTRAN 77: INTEGER Number of independent trials m

INTENT(IN)
C: const int
p FORTRAN 77: DOUBLE Success probability p of a single trial

PRECISION
Fortran 90: DOUBLE
C: const double
Output Parameters
r FORTRAN 77: INTEGER Vector of n binomially distributed random values

INTENT(OUT)
2680
C: int*
Description
The function generates binomially distributed random numbers with number of independent
Bernoulli trials m, and with probability p of a single trial success, where p∈R; 0 ≤ p ≤ 1,
m∈N.
A binomially distributed variate represents the number of successes in m independent Bernoulli
trials with probability of a single trial success p.
Return Values
2681

is, < 0 or > nmax.
Hypergeometric
Generates hypergeometrically distributed random
values.
Syntax
Fortran:
status = virnghypergeometric( method, stream, n, r, l, s, m )
C:
status = viRngHypergeometric( method, stream, n, r, l, s, m );
Input Parameters
Fortran 90: INTEGER, VSL_METHOD_IHYPERGEOMETRIC_H2PE
INTENT(IN) See brief description of the H2PE method in Table
C: const int 10-1
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)
2682

INTENT(IN)
C: const int
l FORTRAN 77: INTEGER Lot size l

INTENT(IN)
C: const int
s FORTRAN 77: INTEGER Size of sampling without replacement s

INTENT(IN)
C: const int
m FORTRAN 77: INTEGER Number of marked elements m

INTENT(IN)
C: const int
Output Parameters
r FORTRAN 77: INTEGER Vector of n hypergeometrically distributed random

values
INTENT(OUT)
C: int*
Description
2683
The function generates hypergeometrically distributed random values with lot size l, size of
sampling s, and number of marked elements in the lot m, where l, m, s∈N∪{0}; l ≥ max(s,
m).
Consider a lot of l elements comprising m “marked” and l-m “unmarked“ elements. A trial
sampling without replacement of exactly s elements from this lot helps to define the
hypergeometric distribution, which is the probability that the group of s elements contains
exactly k marked elements.
The probability distribution is given by:)
, k∈ {max(0, s + m - l), ..., min(s, m)}
Return Values
is, < 0 or > nmax.
2684
Poisson
Generates Poisson distributed random values.
Syntax
Fortran:
status = virngpoisson( method, stream, n, r, lambda )
C:
status = viRngPoisson( method, stream, n, r, lambda );
Input Parameters
follows:
INTENT(IN) VSL_METHOD_IPOISSON_PTPE
VSL_METHOD_IPOISSON_POISNORM
C: const int
See brief description of the PTPE and POISNORM
methods in Table 10-1 .
stream(2)
Fortran 90: TYPE
(VSL_STREAM_STATE),
INTENT(IN)

INTENT(IN)
C: const int
2685
lambda FORTRAN 77: DOUBLE Distribution parameter λ.

PRECISION
Fortran 90: DOUBLE
C: const double
Output Parameters
r FORTRAN 77: INTEGER Vector of n Poisson distributed random values

INTENT(OUT)
C: int*
Description
The function generates Poisson distributed random numbers with distribution parameter λ,
where λ∈R; λ > 0.
k∈ {0, 1, 2, ...}.
2686
Return Values
is, < 0 or > nmax.
PoissonV
Generates Poisson distributed random values with
varying mean.
Syntax
Fortran:
status = virngpoissonv( method, stream, n, r, lambda )
C:
status = viRngPoissonV( method, stream, n, r, lambda );
Input Parameters
Fortran 90: INTEGER, VSL_METHOD_IPOISSONV_POISNORM
INTENT(IN)
2687

C: const int See brief description of the POISNORM method in
Table 10-1
stream FORTRAN 77: INTEGER*4

Mklman PDF

Загружено:

Сведения о документе

Оригинальное название

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

Mklman PDF

Загружено:

Авторское право:

Доступные форматы

Intel® Math Kernel Library

Disclaimer and Legal Information

Document Number: 630813-029US

-001 Original Issue. 11/94

-6004 Documents Intel Math Kernel Library release 6.1. 07/03

Version Version Information Date

Audience for This Manual.....................................................59

Chapter 2: BLAS and Sparse BLAS Routines

Chapter 3: LAPACK Routines: Linear Equations

Chapter 4: LAPACK Routines: Least Squares and Eigenvalue

Linear Least Squares (LLS) Problems.................................1107

Chapter 5: LAPACK Auxiliary and Utility Routines

Chapter 6: ScaLAPACK Routines

Chapter 7: ScaLAPACK Auxiliary and Utility Routines

Chapter 8: Sparse Solver Routines

Chapter 9: Vector Mathematical Functions

Chapter 10: Statistical Functions

Chapter 11: Fourier Transform Functions

Chapter 12: PBLAS Routines

Chapter 13: Partial Differential Equations Support

Chapter 14: Optimization Solver Routines

Chapter 15: Support Functions

Chapter 16: BLACS Routines

BLACS Support Routines..........................................................3142

Appendix A: Linear Solvers Basics

Appendix B: Routine and Function Arguments

Appendix C: Code Examples

Appendix D: CBLAS Interface to the BLAS

Appendix E: Specific Features of Fortran 95 Interfaces for

Modified Netlib Interfaces.........................................................3375

Appendix F: Optimization Solvers Basics

Appendix G: FFTW to Intel® Math Kernel Library Wrappers

Basic Interface Wrappers.........................................3397

About This Software

• Basic Linear Algebra Subprograms (BLAS):

• Submit issues and review their status.

Sparse BLAS Routines

Sparse Solver Routines

Fourier Transform Functions

Partial Differential Equations Support

Optimization Solver Routines

GMP Arithmetic Functions

• Loop unrolling to minimize loop management costs

About This Manual

Audience for This Manual

Chapter 4 LAPACK Routines: Least Squares and Eigenvalue Problems. Provides

• Routine name shorthand (for example, ?ungqr instead of cungqr/zungqr).

Routine Name Shorthand

Routine Naming Conventions

The examples below illustrate how to interpret BLAS routine names:

Fortran 95 Interface Conventions

The main conventions used in Fortran 95 interface are as follows:

Optional parameters are given in square brackets in Fortran 95 call syntax.

Matrix Storage Schemes

BLAS Level 1 Routines

?asum s, d, sc, dz Sum of vector magnitudes (functions)

?axpy s, d, c, z Scalar-vector product (routines)

?copy s, d, c, z Copy vector (routines)

?dot s, d Dot product (functions)

?sdot sd, d Dot product with extended precision (functions)

?dotc c, z Dot product conjugated (functions)

?dotu c, z Dot product unconjugated (functions)

?nrm2 s, d, sc, dz Vector 2-norm (Euclidean norm) (functions)