LAPACK 3.7.0
============

Release date: 12/24/16.

This material is based upon work supported by the National Science Foundation and the Department of Energy under Grant No. NSF ACI 1339797, NSF-OCI-1032861, NSF-CCF-00444486, NSF-CNS 0325873, NSF-EIA 0122599, NSF-ACI-0090127, DOE-DE-FC02-01ER25478, DOE-DE-FC02-06ER25768.

LAPACK is a software package provided by Univ. of Tennessee, Univ. of California, Berkeley, Univ. of Colorado Denver and NAG Ltd.

1. Support and questions:
-------------------------

http://icl.cs.utk.edu/lapack-forum/

https://github.com/Reference-LAPACK/lapack/

2. LAPACK 3.7.0: What's new
---------------------------

2.1. Linear Least Squares / Minimum Norm solution
-------------------------------------------------

A contribution from Syd Hashemi Ghermezi (UC Berkeley) and Jim Demmel (UC Berkeley), with some help from Eugene Chereshnev (Intel) and Konstantin Arturov (Intel).

 - New TP (triangle on top of trapezoid) kernels for LQ factorization
 - Short and Wide LQ (SWLQ) and Tall and Skinny QR (TSQR) factorization
 - New interfaces for QR and LQ factorization: GEQR and GELQ
 - Corresponding Linear Least Squares / Minimum Norm solution driver (GETSLS)

Added:

 SRC/cgelq.f     SRC/dgelq.f     SRC/sgelq.f     SRC/zgelq.f
 SRC/cgelqt.f    SRC/dgelqt.f    SRC/sgelqt.f    SRC/zgelqt.f
 SRC/cgelqt3.f   SRC/dgelqt3.f   SRC/sgelqt3.f   SRC/zgelqt3.f
 SRC/cgemlq.f    SRC/dgemlq.f    SRC/sgemlq.f    SRC/zgemlq.f
 SRC/cgemlqt.f   SRC/dgemlqt.f   SRC/sgemlqt.f   SRC/zgemlqt.f
 SRC/cgemqr.f    SRC/dgemqr.f    SRC/sgemqr.f    SRC/zgemqr.f
 SRC/cgeqr.f     SRC/dgeqr.f     SRC/sgeqr.f     SRC/zgeqr.f
 SRC/cgetsls.f   SRC/dgetsls.f   SRC/sgetsls.f   SRC/zgetsls.f
 SRC/clamswlq.f  SRC/dlamswlq.f  SRC/slamswlq.f  SRC/zlamswlq.f
 SRC/clamtsqr.f  SRC/dlamtsqr.f  SRC/slamtsqr.f  SRC/zlamtsqr.f
 SRC/claswlq.f   SRC/dlaswlq.f   SRC/slaswlq.f   SRC/zlaswlq.f
 SRC/clatsqr.f   SRC/dlatsqr.f   SRC/slatsqr.f   SRC/zlatsqr.f
 SRC/ctplqt.f    SRC/dtplqt.f    SRC/stplqt.f    SRC/ztplqt.f
 SRC/ctplqt2.f   SRC/dtplqt2.f   SRC/stplqt2.f   SRC/ztplqt2.f
 SRC/ctpmlqt.f   SRC/dtpmlqt.f   SRC/stpmlqt.f   SRC/ztpmlqt.f

References:

 * Brian C. Gunter and Robert A. Van De Geijn, Parallel out-of-core computation and updating of the QR factorization, ACM Transactions on Mathematical Software, 31(1):60-78, 2005.
 * James Demmel, Laura Grigori, Mark Hoemmen, and Julien Langou, Communication-Optimal Parallel and Sequential QR and LU Factorizations, SIAM J. Scientific Computing, 34(1):A206-A239, 2012.

2.2. Symmetric-indefinite Factorization: Aasen's tridiagonalization
-------------------------------------------------------------------

A contribution from Ichitaro Yamazaki (University of Tennessee).

This is Aasen's algorithm for the factorization of symmetric-indefinite matrices.

Added:

 SRC/chesv_aa.f   SRC/dsysv_aa.f   SRC/ssysv_aa.f   SRC/zhesv_aa.f
 SRC/chetrf_aa.f  SRC/dsytrf_aa.f  SRC/ssytrf_aa.f  SRC/zhetrf_aa.f
 SRC/chetrs_aa.f  SRC/dsytrs_aa.f  SRC/ssytrs_aa.f  SRC/zhetrs_aa.f
 SRC/clahef_aa.f  SRC/dlasyf_aa.f  SRC/slasyf_aa.f  SRC/zlahef_aa.f

References:

 * Miroslav Rozloznik, Gil Shklarski, and Sivan Toledo, Partitioned triangular tridiagonalization, ACM Transactions on Mathematical Software, 37(4), article 38, 16 pages, 2011.
 * Jan Ole Aasen, On the reduction of a symmetric matrix to tridiagonal form, BIT, 11 (1971), pp. 233–242.

2.3. Symmetric-indefinite Factorization: New storage format for L factor in Rook Pivoting and Bunch-Kaufman of LDLT
-------------------------------------------------------------------------------------------------------------------

A contribution from Igor Kozachenko and Jim Demmel (UC Berkeley).

This storage format is akin to that of LU and enables Level 3 BLAS solve (TRS) and inversion (TRI) routines.

New routines factor symmetric indefinite (or Hermitian indefinite) matrices using the bounded Bunch-Kaufman (rook) pivoting algorithm.
The new, more efficient storage format covers the factor U (or L), the block-diagonal matrix D, and the pivoting information in IPIV:

 - factor L is stored explicitly in the lower triangle of A;
 - the diagonal of D is stored on the diagonal of A;
 - the subdiagonal elements of D are stored in array E;
 - the IPIV format is the same as in the *_ROOK routines, but differs from the SY Bunch-Kaufman routines (e.g. *SYTRF).

The factorization output of these new rook _RK routines is not compatible with the existing _ROOK routines, and vice versa.

The new factorization format is designed so that future Bunch-Kaufman routines can be written to conform to it. Those routines could then share the solver *TRS_3, the inversion *TRI_3 and the condition estimator *CON_3.

Added:

 SRC/dsytf2_rk.f  SRC/ssytf2_rk.f  SRC/zsytf2_rk.f  SRC/zhetf2_rk.f  SRC/csytf2_rk.f  SRC/chetf2_rk.f
 SRC/dlasyf_rk.f  SRC/slasyf_rk.f  SRC/zlasyf_rk.f  SRC/zlahef_rk.f  SRC/clasyf_rk.f  SRC/clahef_rk.f
 SRC/dsytrf_rk.f  SRC/ssytrf_rk.f  SRC/zsytrf_rk.f  SRC/zhetrf_rk.f  SRC/csytrf_rk.f  SRC/chetrf_rk.f
 SRC/dsytrs_3.f   SRC/ssytrs_3.f   SRC/zsytrs_3.f   SRC/zhetrs_3.f   SRC/csytrs_3.f   SRC/chetrs_3.f
 SRC/dsycon_3.f   SRC/ssycon_3.f   SRC/zsycon_3.f   SRC/zhecon_3.f   SRC/csycon_3.f   SRC/checon_3.f
 SRC/dsytri_3.f   SRC/ssytri_3.f   SRC/zsytri_3.f   SRC/zhetri_3.f   SRC/csytri_3.f   SRC/chetri_3.f
 SRC/dsytri_3x.f  SRC/ssytri_3x.f  SRC/zsytri_3x.f  SRC/zhetri_3x.f  SRC/csytri_3x.f  SRC/chetri_3x.f
 SRC/dsysv_rk.f   SRC/ssysv_rk.f   SRC/zsysv_rk.f   SRC/zhesv_rk.f   SRC/csysv_rk.f   SRC/chesv_rk.f

2.4. Symmetric eigenvalue problem: Two-stage algorithm for reduction to tridiagonal form
----------------------------------------------------------------------------------------

A contribution from Azzam Haidar (University of Tennessee).

This is the two-stage algorithm for reduction to tridiagonal form, based on an algorithm from Lang.
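The two-stage structure can be illustrated with a small NumPy sketch. This is not the code of the `*_2stage` routines: the function names `dense_to_band` and `band_to_tridiagonal` and the bandwidth parameter `b` are hypothetical, the first stage uses a plain QR factorization of each panel, and the second stage is an ordinary dense Householder sweep that ignores the band structure a production kernel would exploit. It only shows that both stages are orthogonal similarity transformations, so the eigenvalues are preserved.

```python
import numpy as np

def dense_to_band(A, b):
    """Stage 1 (sketch): reduce symmetric A to bandwidth b by similarity."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for j in range(0, n - b - 1, b):
        # QR of the panel that lies below the band in columns j..j+b-1.
        Q, _ = np.linalg.qr(A[j + b:, j:j + b], mode="complete")
        # Apply the same orthogonal factor from both sides (similarity).
        A[j + b:, :] = Q.T @ A[j + b:, :]
        A[:, j + b:] = A[:, j + b:] @ Q
    return A

def band_to_tridiagonal(B):
    """Stage 2 (stand-in): dense Householder sweep, band structure not exploited."""
    T = np.array(B, dtype=float)
    n = T.shape[0]
    for k in range(n - 2):
        v = T[k + 1:, k].copy()
        alpha = -np.copysign(np.linalg.norm(v), v[0])
        v[0] -= alpha
        nv = np.linalg.norm(v)
        if nv == 0.0:
            continue  # column is already in tridiagonal form
        v /= nv
        # Reflector H = I - 2 v v^T applied from the left and from the right.
        T[k + 1:, :] -= 2.0 * np.outer(v, v @ T[k + 1:, :])
        T[:, k + 1:] -= 2.0 * np.outer(T[:, k + 1:] @ v, v)
    return T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((8, 8))
    A = (M + M.T) / 2
    B = dense_to_band(A, 2)
    T = band_to_tridiagonal(B)
    print(np.max(np.abs(np.tril(B, -3))), np.max(np.abs(np.tril(T, -2))))
```

The first stage is rich in matrix-matrix products, which is the usual motivation for splitting the reduction in two; the sketch makes no attempt to reproduce that performance behaviour.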
Added:

 SRC/chb2st_kernels.f  SRC/dsb2st_kernels.f  SRC/ssb2st_kernels.f  SRC/zhb2st_kernels.f
 SRC/chbev_2stage.f    SRC/dsbev_2stage.f    SRC/ssbev_2stage.f    SRC/zhbev_2stage.f
 SRC/chbevd_2stage.f   SRC/dsbevd_2stage.f   SRC/ssbevd_2stage.f   SRC/zhbevd_2stage.f
 SRC/chbevx_2stage.f   SRC/dsbevx_2stage.f   SRC/ssbevx_2stage.f   SRC/zhbevx_2stage.f
 SRC/cheev_2stage.f    SRC/dsyev_2stage.f    SRC/ssyev_2stage.f    SRC/zheev_2stage.f
 SRC/cheevd_2stage.f   SRC/dsyevd_2stage.f   SRC/ssyevd_2stage.f   SRC/zheevd_2stage.f
 SRC/cheevr_2stage.f   SRC/dsyevr_2stage.f   SRC/ssyevr_2stage.f   SRC/zheevr_2stage.f
 SRC/cheevx_2stage.f   SRC/dsyevx_2stage.f   SRC/ssyevx_2stage.f   SRC/zheevx_2stage.f
 SRC/chegv_2stage.f    SRC/dsygv_2stage.f    SRC/ssygv_2stage.f    SRC/zhegv_2stage.f
 SRC/chetrd_2stage.f   SRC/dsytrd_2stage.f   SRC/ssytrd_2stage.f   SRC/zhetrd_2stage.f
 SRC/chetrd_hb2st.F    SRC/dsytrd_sb2st.F    SRC/ssytrd_sb2st.F    SRC/zhetrd_hb2st.F
 SRC/chetrd_he2hb.f    SRC/dsytrd_sy2sb.f    SRC/ssytrd_sy2sb.f    SRC/zhetrd_he2hb.f
 SRC/clarfy.f          SRC/dlarfy.f          SRC/slarfy.f          SRC/zlarfy.f

References:

 * Christian H. Bischof, Bruno Lang, and Xiaobai Sun, A framework for symmetric band reduction, ACM Transactions on Mathematical Software, 26(4):581-601, 2000.

2.5. Improved Complex Jacobi SVD
--------------------------------

A contribution from Zlatko Drmac (University of Zagreb).

 (1) LWORK query added;
 (2) a few modifications in the pure one-sided Jacobi SVD (XGESVJ) to remove a possible error in really extreme cases (sigma_max close to overflow and sigma_min close to underflow). Note that XGESVJ is designed to compute the singular values in the full range; it was used (in double complex) to compute the SVD of certain factored Hankel matrices with condition number 1.0e616;
 (3) the same modifications in the preconditioned Jacobi SVD (XGEJSV); the idea is again to extend the computational range, which relies on assumptions about how other LAPACK routines behave under those extreme conditions.
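For readers unfamiliar with the one-sided Jacobi method named in item (2), the following is a minimal textbook (Hestenes) sketch in NumPy. It is not the XGESVJ implementation: it handles real matrices only, the function name `one_sided_jacobi_svd` and the `tol` and `max_sweeps` parameters are hypothetical, full column rank is assumed, and none of the scaling safeguards for near-overflow or near-underflow singular values discussed above are included.

```python
import numpy as np

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=60):
    """Hestenes one-sided Jacobi SVD of a real m-by-n matrix, m >= n.

    Assumes full column rank (illustrative sketch, no scaling safeguards).
    Returns U (m-by-n), sigma (descending), V (n-by-n) with A = U diag(sigma) V^T.
    """
    U = np.array(A, dtype=float)
    n = U.shape[1]
    V = np.eye(n)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = U[:, p] @ U[:, p]
                beta = U[:, q] @ U[:, q]
                gamma = U[:, p] @ U[:, q]
                # Skip pairs of columns that are already numerically orthogonal.
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that makes columns p and q orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = np.copysign(1.0, zeta) / (abs(zeta) + np.hypot(1.0, zeta))
                c = 1.0 / np.hypot(1.0, t)
                s = c * t
                # Apply the same plane rotation to the columns of U and V.
                for M in (U, V):
                    mp = M[:, p].copy()
                    M[:, p] = c * mp - s * M[:, q]
                    M[:, q] = s * mp + c * M[:, q]
        if not rotated:
            break
    # Column norms are the singular values; normalising gives the left vectors.
    sigma = np.linalg.norm(U, axis=0)
    order = np.argsort(sigma)[::-1]
    sigma = sigma[order]
    return U[:, order] / sigma, sigma, V[:, order]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((6, 4))
    U, s, V = one_sided_jacobi_svd(A)
    print(np.max(np.abs((U * s) @ V.T - A)))
```

Because the rotations act only on columns of the working matrix, the singular values appear as column norms at the end, which is why overflow and underflow of those norms is the concern in the extreme cases mentioned in item (2).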
Modified:

 SRC/cgejsv.f  SRC/zgejsv.f
 SRC/cgesvj.f  SRC/zgesvj.f
 SRC/cgsvj0.f  SRC/zgsvj0.f
 SRC/cgsvj1.f  SRC/zgsvj1.f

2.6. LAPACKE interfaces
-----------------------

A contribution from Julie Langou (University of Tennessee).

3. Developer list
-----------------

.External Contributors
 * Konstantin Arturov, Intel
 * Eugene Chereshnev, Intel
 * Zlatko Drmac, University of Zagreb

.LAPACK developers involved in this release
 * Mark Gates (University of Tennessee)
 * Syd Hashemi Ghermezi (UC Berkeley)
 * Azzam Haidar (University of Tennessee)
 * Igor Kozachenko (University of California, Berkeley, USA)
 * Julie Langou (University of Tennessee, USA)
 * Osni Marques (University of California, Berkeley, USA)
 * Ichitaro Yamazaki (University of Tennessee)

.Principal Investigators
 * Jim Demmel (University of California, Berkeley, USA)
 * Jack Dongarra (University of Tennessee and ORNL, USA)
 * Julien Langou (University of Colorado Denver, USA)

4. Thanks
---------

 * MathWorks: Penny Anderson, Amanda Barry, Mary Ann Freeman, Bobby Cheng, Duncan Po, Pat Quillen, Christine Tobler.
 * Intel: Konstantin Arturov, Eugene Chereshnev
 * Gonum: Vladimír Chalupecký
 * ORACLE: Elena Ivanov.
 * IBM: Peng HongBo, Joan McComb, and Yi LB Peng.
 * Cygwin: Marco Atzeri
 * GitHub Users: turboencabulator@github, cmoha@github, banzaiman@github, advanpix@github, reeuwijk-altium@github, brandimarte@github, zerothi@github
 * Christoph Conrads
 * J. Kay Dewhurst (Max Planck Institute of Microstructure Physics)
 * Kyle Guinn
 * Berend Hasselman
 * Pavel Holoborodko
 * Hans Johnson (University of Iowa)
 * Nick Papior
 * Antonio Rojas
 * Julien Schueller (Phimeca)

5. Bug Fix
----------

 * link:https://github.com/Reference-LAPACK/lapack/milestone/1?closed=1[Closed bugs]
 * link:https://github.com/Reference-LAPACK/lapack/issues?utf8=✓&q=is%3Aissue%20is%3Aopen%20label%3A%22Type%3A%20Bug%22%20[Open bugs]
 * link:https://github.com/Reference-LAPACK/lapack/issues[All open issues]

// vim: set syntax=asciidoc: