Intel® Math Kernel Library 10.1 Update 2 for Linux*
Release Notes

Overview
New in Intel® MKL
System Requirements
Installation Notes
Documentation
Known Limitations
Technical Support and Feedback
Related Products and Services
Disclaimer and Legal Information

Overview

The Intel® Math Kernel Library (Intel® MKL) provides developers of scientific, engineering and financial software with a set of linear algebra routines, fast Fourier transforms, and vectorized math and random number generation functions, all optimized for the latest Intel® Pentium® 4 processors, Intel® Xeon® processors with Streaming SIMD Extensions 3 (SSE3) and Intel® Extended Memory 64 Technology (Intel® EM64T), and Intel® Itanium® 2 processors. This software also performs well on non-Intel (x86) processors.

Intel® MKL provides linear algebra functionality with LAPACK (solvers and eigensolvers) plus level 1, 2, and 3 BLAS offering the vector, vector-matrix, and matrix-matrix operations needed for complex mathematical software computations. Users who prefer the FORTRAN 90/95 programming language may call LAPACK driver and computational subroutines via special interfaces with reduced numbers of arguments. Intel® MKL provides ScaLAPACK (Scalable LAPACK) and support functionality including the Parallel Basic Linear Algebra Subprograms (PBLAS). For solving sparse systems of equations, Intel® MKL provides direct and iterative sparse solvers as well as a supporting set of sparse BLAS (levels 1, 2, and 3).
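As a rough illustration of what the three BLAS levels compute, the sketch below spells out one representative operation per level in plain Python. The function names borrow the BLAS stems (axpy, gemv, gemm), but these are unoptimized reference loops written for this note, not Intel® MKL calls, and they omit the stride, transpose, and leading-dimension arguments of the real routines.

```python
def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(alpha, a, x, beta, y):
    """Level 2 (matrix-vector): return alpha*A*x + beta*y.

    `a` is a list of rows.
    """
    return [alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
            for row, yi in zip(a, y)]


def gemm(alpha, a, b, beta, c):
    """Level 3 (matrix-matrix): return alpha*A*B + beta*C.

    `a`, `b`, and `c` are lists of rows.
    """
    cols_b = list(zip(*b))  # columns of B, so each entry is a dot product
    return [[alpha * sum(aik * bkj for aik, bkj in zip(row, col)) + beta * cij
             for col, cij in zip(cols_b, crow)]
            for row, crow in zip(a, c)]
```

Level 3 operations perform on the order of n^3 arithmetic operations on n^2 data, which is why routines such as DGEMM are the usual target of the performance work described later in these notes.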

Intel® MKL offers multidimensional fast Fourier transforms (1D, 2D, 3D) with mixed radix support (not limited to sizes of powers of 2). Intel® MKL also provides distributed versions of these functions for use on clusters. For the solution of partial differential equations (PDE), Intel® MKL provides a few preconditioners to help with the convergence of our iterative solvers. Optimization [Trust Region] solvers provide efficient routines for solving nonlinear least squares problems with and without boundary constraints.
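To make "mixed radix" concrete, the following sketch shows a textbook decimation-in-time FFT that splits a transform of length n by its smallest prime factor, so lengths such as 12 = 2*2*3 or 77 = 7*11 are handled without padding to a power of 2. It is a teaching example only and says nothing about how the Intel® MKL kernels are implemented; all function names are invented for this sketch.

```python
import cmath


def smallest_factor(n):
    """Return the smallest prime factor of n (n itself if n is prime or < 4)."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n


def naive_dft(x):
    """Direct O(n^2) discrete Fourier transform, used for prime lengths."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def mixed_radix_fft(x):
    """Forward DFT of a sequence of any length using radix-p splitting."""
    n = len(x)
    p = smallest_factor(n)
    if p == n:
        # Prime (or trivial) length: no further splitting is possible.
        return naive_dft(x)
    m = n // p
    # Transform the p interleaved sub-sequences, each of length m.
    subs = [mixed_radix_fft(x[r::p]) for r in range(p)]
    # Recombine with twiddle factors exp(-2*pi*i*r*k/n).
    return [sum(subs[r][k % m] * cmath.exp(-2j * cmath.pi * r * k / n)
                for r in range(p))
            for k in range(n)]
```

For example, `mixed_radix_fft(list(range(77)))` splits into 7 sub-transforms of length 11, each of which falls back to the direct formula because 11 is prime.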

Intel® MKL also includes a set of vectorized transcendental functions (called the Vector Math Library (VML)) offering both greater performance and excellent accuracy compared to the libm (scalar) functions. The Vector Statistical Library (VSL) offers high performance vectorized random number generators for a number of probability distributions as well as convolution and correlation routines.

The BLAS, LAPACK, direct sparse solver (DSS/PARDISO), FFT, VML library functions, and optimization solvers in Intel® MKL are threaded using OpenMP*. All of Intel® MKL is thread-safe (with the exception of the deprecated ?lacon, ?lasq3, and ?lasq4 LAPACK routines; see the reference manual for more information).

New in Intel® MKL

New in Intel® MKL 10.1 Update 2

Performance improvements
- Improved performance of DGEMM on Intel® Xeon 5500 series processors with 4 MB of last-level cache
  - DGEMM floating-point computational efficiencies up to 96% have been measured
  - Intel® Optimized LINPACK Benchmark efficiencies up to 91.5% have been measured, up from 88% in Intel® MKL 10.1
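For readers who want to compute an efficiency figure of the kind quoted above for their own system, the sketch below shows the usual arithmetic: the achieved floating-point rate divided by the theoretical peak of the machine. The operation counts are the conventional ones (2*m*n*k for a matrix-matrix multiply, 2/3*n^3 + 2*n^2 for a LINPACK solve). The timing and peak values in the usage note are hypothetical and are not measurements of any processor named in these notes.

```python
def gemm_flops(m, n, k):
    """Conventional operation count for C = A*B with A m-by-k and B k-by-n."""
    return 2.0 * m * n * k


def linpack_flops(n):
    """Conventional operation count for solving a dense n-by-n system."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2


def efficiency(flops, seconds, peak_gflops):
    """Fraction of theoretical peak achieved (1.0 means 100%)."""
    achieved_gflops = flops / seconds / 1e9
    return achieved_gflops / peak_gflops
```

With hypothetical inputs, `efficiency(gemm_flops(10000, 10000, 10000), 20.0, 125.0)` corresponds to 100 GFLOPS achieved on a machine with a 125 GFLOPS peak, that is, an efficiency of 80%.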
Bug fixes
- Fixed a problem with DGELSD which caused programs to continue indefinitely
- Fixed a problem with accepting lower case character parameters in LAPACK functions
- Fixed a bug in the LAPACK function ZHEGST (affecting ZHEGV) which was noted on some systems using glibc 2.4
- Fixed a problem with the ?ladiv group of functions which occurred for certain data sets
- Removed performance degradation in DSTEVR for cases that caused operations on NaNs
- Fixed error reporting on the VML function vdLn to properly report error codes on large vectors
- Fixed a bug which introduced an infinite loop in the Hypergeometric random number generator
- Fixed a load-balancing problem in the pdtran ScaLAPACK function by improving the communications scheme, which improves performance by up to 3 times
- Fixed an unaligned memory access error message in the SGI MPT BLACS library
- Fixed a bug in scripts used for building single-precision FFTW2.x wrappers
- Fixed custom DLL builder to link the proper threading runtime when integrated with the Intel® compilers
- Improved error messages in cases where Intel MKL is unable to load processor specific DLLs
- Fixed error reporting from MKL_malloc (now returns NULL) when there is insufficient memory to proceed

New in Intel® MKL 10.1 Update 1

Performance Improvements
- Optimized DGEMM/SGEMM 64-bit kernels for the Intel® Core™ i7 processor provide a 2% performance improvement on average
- Improved DGEMM/SGEMM 32-bit kernels for the Intel® Core™ i7 processor and the Quad-Core Intel® Xeon® processor 5400 series
  - 2-8% performance improvement on the Quad-Core Intel® Xeon® processor 5400 series
  - 2-3% improvement of multithreaded DGEMM on the Intel® Core™ i7 processor
- Better SMP Linpack performance on the Intel® Core™ i7 processor and the Quad-Core Intel® Xeon® processor 5400 series
  - 2-3% performance improvement for 32-bits
  - 1.5% improvement for 64-bits running a 20K problem on 8 threads
- Added new optimized radix 7 and 11 FFT kernels which provide performance benefits for transform lengths that are multiples of 7 or 11
- Threaded sparse matrix vector functions such as mkl_dcoogemv to provide better multicore performance
Bug fixes and other improvements
- Extended the FFTW interfaces to include cluster FFT functions to further support real-to-complex transforms and the use of the transposed order configuration (see the “Configuration Settings” section of the “Fast Fourier Transforms” chapter in the reference manual for more information on transposition)
- Eliminated memory leaks where DFT descriptors were created on one thread and freed on another
- Resolved a few LAPACK issues
  - Addressed an error in the size of the TAU array used by the DSYTRD function
  - Corrected inaccurate results returned by the ZHETRD function when using the GNU threading layer (libmkl_gnu_thread) on Linux*
- Reduced memory use in the PARDISO out-of-core (OOC) solver by 30 - 50% and in-core by 10 - 50%
- Added a pre-built MP LINPACK executable that links dynamically to an Intel® MPI Library on Linux*

New in Intel® MKL 10.1

Performance Improvements in the BLAS:
- Performance improvements on Quad-Core Intel® Xeon® processor 5400 series systems with 64-bit OS's:
  - SGEMM: 2% on 1 thread and 6% on 8 threads
  - DGEMM: 7% on 8 threads
  - CGEMM: 2% on 1 thread and 10% on 8 threads
  - ZGEMM: 7% on 1 thread and 11% on 8 threads
- Performance improvements on Quad-Core Intel® Xeon® processor 5400 series systems with 32-bit OS's:
  - SGEMM: 7-15% on 8 threads
  - DGEMM: 7-15% on 8 threads
- Performance improvement on Intel® Core™ i7 processors with 64-bit OS's:
  - SGEMM: 50% on 1 thread and 50% on 8 threads
  - DGEMM: 11% on 1 thread and 12% on 8 threads
  - CGEMM: 2-3% on 1 thread and 2-3% on 8 threads
  - ZGEMM: 2% on 1 thread
  - DTRSM: 20% on 1 thread and 20% on 8 threads for some cases.
Improvements to the direct sparse solver (DSS/PARDISO):
- The performance of out-of-core PARDISO was improved by 35% on average.
- Support of separate backward/forward substitution for DSS/PARDISO has been added.
- A new parameter for turning off iterative refinement for DSS interface has been introduced.
- A new parameter for checking sparse matrix structure has been introduced for PARDISO interface.
- The sparse solver functionality has now been integrated into the core math library and it is no longer necessary to link a separate solver library. See the user guide for more information.
- The sparse solver functionality can now be linked dynamically.
The capability to track and/or interrupt the progress of lengthy LAPACK computations has been added via a callback function mechanism. A function called mkl_progress can be defined in a user application, which will be called regularly from a subset of the MKL LAPACK routines. See the LAPACK Auxiliary and Utility Routines chapter in the reference manual for more information. Refer to the specific function descriptions to see which LAPACK functions support the feature.
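The general shape of such a callback mechanism can be sketched as follows. This is a language-neutral illustration written in Python, not the Intel® MKL interface: the routine `factorize_in_steps`, the helper `stop_after`, and the callback arguments are all invented for the example, and the convention that a nonzero return value requests a stop is an assumption of this sketch. The actual mkl_progress signature and return-value semantics are given in the reference manual.

```python
def factorize_in_steps(total_steps, progress=None):
    """Stand-in for a lengthy computation.

    Calls `progress(step, total_steps)` after every step, if a callback is
    supplied, and stops early when the callback returns a nonzero value.
    Returns the number of steps actually completed.
    """
    done = 0
    for step in range(1, total_steps + 1):
        done = step  # the real work for this step would happen here
        if progress is not None and progress(step, total_steps) != 0:
            break  # the callback asked for the computation to be interrupted
    return done


def stop_after(limit):
    """Build a callback that records each step and requests a stop at `limit`."""
    seen = []

    def progress(step, total):
        seen.append(step)
        return 1 if step >= limit else 0

    return progress, seen
```

A caller that only wants to track progress returns 0 from the callback every time; a caller that wants to interrupt a long run, for example in response to a user request, returns a nonzero value.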
Transposition functions have been added to Intel MKL. See the "BLAS-like Extensions" section of chapter 2 in the reference manual for further details.
The C++ std::complex type can now be used instead of MKL-specific complex types.
Improvements to the Discrete Fourier Transform Interface (DFTI)
- Addition of the DftiCopyDescriptor convenience function
- Reduction in the size of statically linked executables calling DFTI functions
- Support for DFTI_REAL_REAL storage (i.e., real and imaginary parts in separate arrays) in complex-to-complex transforms
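The difference between the two complex storage layouts can be pictured with a small sketch. In the interleaved layout each real part is immediately followed by its imaginary part in a single array; in the split layout described for DFTI_REAL_REAL, the real and imaginary parts live in two separate arrays. The helper names below are invented for illustration and are not part of the DFTI interface.

```python
def split_to_interleaved(re, im):
    """Pack separate real/imaginary arrays into one [re0, im0, re1, im1, ...] array."""
    if len(re) != len(im):
        raise ValueError("real and imaginary arrays must have the same length")
    out = []
    for r, i in zip(re, im):
        out.extend((r, i))
    return out


def interleaved_to_split(buf):
    """Unpack [re0, im0, re1, im1, ...] into separate real and imaginary arrays."""
    if len(buf) % 2 != 0:
        raise ValueError("interleaved array must have an even number of entries")
    return list(buf[0::2]), list(buf[1::2])
```

Applications that already keep real and imaginary parts in separate arrays can pass them to a transform that supports split storage without first performing this kind of repacking.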
An implementation of the Boost uBLAS matrix-matrix multiplication routine is now provided which will make use of the highly optimized version of DGEMM in the Intel MKL BLAS. See the User guide for more information.
Improvements to the sparse BLAS:
- Support for all data types (single precision, complex and double complex) has been added.
- Routines for computing the sum and product of two sparse matrices, both stored in the compressed sparse row format, have been added.
- Routines for converting between different supported sparse matrix formats have been added.
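For readers unfamiliar with the storage schemes mentioned above, the sketch below shows the coordinate (COO) and compressed sparse row (CSR) formats in plain Python: a COO-to-CSR conversion, a CSR matrix-vector product, and a CSR matrix sum. It assumes zero-based indexing and the three-array CSR variant (values, column indices, row pointers); the function names are invented for this example and do not correspond to Intel® MKL routine names or argument lists.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triples to CSR arrays (values, col_idx, row_ptr).

    Duplicate (row, col) entries are kept as separate entries, not merged.
    """
    order = sorted(range(len(vals)), key=lambda t: (rows[t], cols[t]))
    row_ptr = [0] * (n_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1          # count the entries in each row
    for r in range(n_rows):
        row_ptr[r + 1] += row_ptr[r]  # prefix sum gives the row start offsets
    values = [vals[t] for t in order]
    col_idx = [cols[t] for t in order]
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A*x for a CSR matrix A."""
    return [sum(values[p] * x[col_idx[p]]
                for p in range(row_ptr[r], row_ptr[r + 1]))
            for r in range(len(row_ptr) - 1)]


def csr_add(a, b):
    """Return A + B for two CSR matrices of the same shape.

    `a` and `b` are (values, col_idx, row_ptr) tuples.
    """
    av, ac, ap = a
    bv, bc, bp = b
    values, col_idx, row_ptr = [], [], [0]
    for r in range(len(ap) - 1):
        row = {}
        for p in range(ap[r], ap[r + 1]):
            row[ac[p]] = row.get(ac[p], 0.0) + av[p]
        for p in range(bp[r], bp[r + 1]):
            row[bc[p]] = row.get(bc[p], 0.0) + bv[p]
        for c in sorted(row):         # keep column indices ordered within a row
            col_idx.append(c)
            values.append(row[c])
        row_ptr.append(len(values))
    return values, col_idx, row_ptr
```

The row pointer array has one more entry than the matrix has rows; entries `row_ptr[r]` through `row_ptr[r + 1] - 1` of the value and column arrays belong to row r.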
ScaLAPACK functionality can now be dynamically linked.
Optimized versions of the Cumulative Normal Distribution (CdfNorm), its inverse (CdfNormInv), and the inverse complementary error function (ErfcInv) have been added to the Vector Math Library.
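The mathematical relationships between these three functions can be written down in a few lines. The sketch below is a scalar reference built on the standard library's `math.erfc`, with the inverse found by bisection; it is meant only to show what the functions compute and how they relate, and is neither vectorized nor representative of the VML implementation or its accuracy. The Python function names are invented for this example.

```python
import math


def cdf_norm(x):
    """Cumulative normal distribution: 0.5 * erfc(-x / sqrt(2))."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def erfc_inv(y, tol=1e-13):
    """Inverse complementary error function for 0 < y < 2, by bisection.

    erfc is strictly decreasing, so the root of erfc(t) = y is bracketed.
    The bracket [-6, 6] limits accuracy for y extremely close to 0 or 2.
    """
    if not 0.0 < y < 2.0:
        raise ValueError("erfc_inv is defined only for 0 < y < 2")
    lo, hi = -6.0, 6.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erfc(mid) > y:
            lo = mid   # erfc(mid) is too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cdf_norm_inv(p):
    """Inverse cumulative normal distribution: -sqrt(2) * erfc_inv(2*p)."""
    return -math.sqrt(2.0) * erfc_inv(2.0 * p)
```

The identity used in `cdf_norm_inv` follows directly from the definition in `cdf_norm`: if p = 0.5 * erfc(-x / sqrt(2)), then -x / sqrt(2) = erfc_inv(2*p).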
Performance improvements on Intel® Core™ i7 processors:
- 3-17% improvement for the following VML functions: Asin, Asinh, Acos, Acosh, Atan, Atan2, Atanh, Cbrt, CIS, Cos, Cosh, Conj, Div, ErfInv, Exp, Hypot, Inv, InvCbrt, InvSqrt, Ln, Log10, MulByConj, Sin, SinCos, Sinh, Sqrt, Tanh.
- 7-67% improvement for uniform random number generation.
- 3-10% improvement for VSL distribution generators based on Wichmann-Hill, Sobol, and Niederreiter BRNGs (64-bit only).

The configuration file functionality has been removed. See the user guide for alternative means to configure the behavior of Intel MKL.
When functions in Intel MKL are called from an MPI program, they run on 1 thread by default (that is, when the number of threads is not explicitly set) if the single-threaded MPI is linked. This differs from the default Intel MKL behavior in programs that do not use MPI. See chapter 6 of the User's Guide for more information on controlling parallelism.
Documentation updates:
- The FFTW Wrappers for MKL Notes have been removed from the product package and their content was integrated into the Intel MKL Reference Manual (Appendix G).
- The parallel BLAS (PBLAS) which support ScaLAPACK are now documented in the Intel MKL reference manual.
- Added FORTRAN 77 support info to the description of VML and VSL functions in the Intel MKL reference manual.
- Added Eclipse IDE infopop support for VML functions and VSL service functions. With infopop support, brief information on a function appears in a pop-up window when the cursor is placed over the function/routine name in the Eclipse Editor panel. This Eclipse feature is implemented in the CDT 5.0 version.
Support for new compilers including the new Intel® compilers 11.0 and PGI* compilers.
The default OpenMP runtime library for Intel MKL has been changed from libguide to libiomp. See the User Guide in the doc directory for more information.
The optimized code paths for the Intel® Pentium® III processor have been removed from Intel MKL along with the associated processor specific dynamic link libraries. We continue to support the use of Intel MKL on this processor, but the default code path will be used and as a result performance may be reduced.
The interval linear solver functions have been removed from MKL.
Support for Intel MPI 1.x has ended.

System Requirements

Hardware

To install and use Intel® MKL you will need a system with a supported processor and 1.1 GB of free hard disk space plus an additional 400 MB during installation for download and temporary files (host system only).

Supported processors:

Intel® Core™ processor family
Intel® Xeon® processor family
Intel® Itanium® processor family
Intel® Pentium® 4 processor family
Intel® Pentium® III processor
Intel® Pentium® processor (300 MHz or faster)
Intel® Celeron® processor
AMD Athlon* and Opteron* processors

Software

To use Intel® MKL you will need a supported compiler and MPI implementation.

Following is the list of supported operating systems:

Red Hat* Enterprise Linux* 3, 4, 5 (IA-32 / Intel® 64 / IA-64)
SUSE LINUX Enterprise Server* 9, 10 (IA-32 / Intel® 64 / IA-64)
SGI ProPack* for Linux 4, 5 (Intel® 64 / IA-64)
Red Hat* Fedora* 9 (IA-32 / Intel® 64)
Debian* GNU/Linux 4.0 (IA-32 / Intel® 64 / IA-64)
Ubuntu* 8.04 (IA-32 / Intel® 64)
Asianux* Server 3 (IA-32 / Intel® 64 / IA-64)
Turbolinux* 11 (IA-32 / Intel® 64 / IA-64)

Note: These Linux* distributions are supported, and Intel® MKL should work on many more. If you have trouble with your distribution, do let us know.

Following is the list of supported C/C++ and Fortran compilers:

Intel® Fortran Compiler 11.0 for Linux*
Intel® Fortran Compiler 10.1 for Linux*
Intel® C++ Compiler 11.0 for Linux*
Intel® C++ Compiler 10.1 for Linux*
GNU Compiler Collection (gcc, g77, GNU Fortran 4.2.0 and later)
Absoft* Pro Fortran v10.1 for Linux*
PGI* Workstation Complete version 7.1.6

Following is the list of MPI implementations that Intel® MKL has been validated against:

Intel® MPI Library Version 2.0, 3.0, 3.1, and 3.2.x (http://www.intel.com/go/mpi)
MPICH2 version 1.0.x (http://www-unix.mcs.anl.gov/mpi/mpich)
MPICH version 1.2.x (http://www-unix.mcs.anl.gov/mpi/mpich)
Open MPI 1.2.x (http://www.open-mpi.org)
SGI* MPT on Intel® 64 and IA-64 (http://www.sgi.com/products/software/mpt/)

Note: Usage of MPI linking instructions can be found in the User's Guide in the doc directory.

Note:

Parts of Intel® MKL have Fortran interfaces and data structures, while other parts have C interfaces and C data structures. The User Guide in the doc directory contains advice on how to link to Intel® MKL with different compilers.

Installation Notes

Guidance on the installation of Intel® MKL is provided at install time. Links will be provided to a file with step-by-step instructions (filename: Install.txt). This file can also be found in the doc directory.

Documentation

The Documentation Index (mkl_documentation.htm in the doc directory) has a list of the principal Intel® MKL documents. For a complete list, see chapter 3 of the User's Guide.

NOTICES

The following change is planned for future versions of Intel MKL. Please contact customer support if you have concerns:

The compatibility or dummy libraries documented in the linking chapter of the user guide will be removed. These were provided to smooth the transition to the new naming scheme for Intel MKL libraries introduced in version 10.0.

Known Limitations

A full list of known limitations in this version of Intel® MKL can be found on the support site (http://www.intel.com/software/products/support/mkl).

Technical Support and Feedback

Self Help and User Forums

A rich repository of self-help product information such as tutorials, getting started tips, known product issues, product errata, compatibility information and answers to frequently asked questions can be found at the Intel® Software Development Products Technical Support site (http://www.intel.com/software/products/support/index.htm). It's a great place to find answers quickly or to gain insight in using our products effectively.

The Intel® MKL User Forum (http://software.intel.com/en-us/forums/intel-math-kernel-library/) is the place to ask questions of and share information with other users of Intel® MKL.

Submitting Issues

Your feedback is very important to us. To receive technical support and product updates for the tools provided in this product you need to register at the Intel® Registration Center (https://registrationcenter.intel.com/).

If you have questions or problems getting started with the Intel® Math Kernel Library please contact support at https://registrationcenter.intel.com/support/.

Note: Please notify your support representative prior to submitting source code where access needs to be restricted to certain countries to determine if this request can be accommodated.

To submit an issue via the Intel® Premier Support website, please perform the following steps:

Ensure that Java* and JavaScript* are enabled in your browser
Go to http://premier.intel.com
Type in your Login and Password. Both are case-sensitive
Click the "Submit Issues" button
Click on the "Development Environment" button next to the "Product Type" drop-down list
Click on the " Intel(R) MKL for Linux*" button next to the "Product Name" drop-down list
Enter the info to the required fields, and Click on the "Submit Issue" link in the left navigation bar
Choose "Development Environment (tools,SDV,EAP)" from the "Product Type" drop-down list
If this is a software or license-related issue choose " Intel(R) MKL for Linux*" from the "Product Name" drop-down list
Enter your question and complete the fields in the windows that follow to successfully submit the issue

Please follow these guidelines when forming your problem report or product suggestion:

Describe your difficulty or suggestion
For problem reports please be as specific as possible (e.g., including compiler and link command line options), so that we may reproduce the problem. Please include a small test case if possible
Describe your system configuration information
Be sure to include specific information that may be applicable to your setup: operating system, name and version number of installed applications, and anything else that may be relevant to helping us address your concern

Related Products and Services

Information on Intel® software development products is available at http://www.intel.com/software/products. Some of the related products include:

The Intel® Software College provides interactive tutorials, documentation, and code samples that teach Intel® architecture and software optimization techniques
The VTune™ Performance Analyzer allows you to evaluate how your application is utilizing the CPU and helps you determine if there are modifications you can make to improve your application's performance
The Intel® C++ and Fortran Compilers are an important part of making software run at top speeds and fully support the latest Intel IA-32 and Itanium® processors
The Intel® Performance Library Suite provides a set of routines optimized for various Intel® processors. The Intel® Math Kernel Library provides developers of scientific and engineering software with a set of linear algebra, fast Fourier transform, and vector math functions optimized for the latest Intel Pentium and Intel Itanium® processors. The Intel® Integrated Performance Primitives consists of cross platform tools to build high performance software for several Intel architectures and several operating systems

Attribution

As referenced in the End User License Agreement, attribution requires, at a minimum, prominently displaying the full Intel product name (e.g. "Intel® Math Kernel Library") and providing a link/URL to the Intel® MKL homepage (www.intel.com/software/products/mkl) in both the product documentation and website.

The original versions of the BLAS from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/blas/index.html.

The original versions of LAPACK from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. Our FORTRAN 90/95 interfaces to LAPACK are similar to those in the LAPACK95 package at http://www.netlib.org/lapack95/index.html. All interfaces are provided for pure procedures.

The original versions of ScaLAPACK from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley.

PARDISO in Intel® MKL is compliant with the 3.2 release of PARDISO freely distributed by the University of Basel. It can be obtained at http://www.pardiso-project.org.

Some FFT functions in this release of Intel® MKL have been generated by the SPIRAL software generation system (http://www.spiral.net/) under license from Carnegie Mellon University. Some FFT functions in this release of the Intel® MKL DFTI have been generated by the UHFFT software generation system under license from University of Houston. The Authors of SPIRAL are Markus Puschel, Jose Moura, Jeremy Johnson, David Padua, Manuela Veloso, Bryan Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko, Kang Chen, Robert W. Johnson, and Nick Rizzolo.

Disclaimer and Legal Information

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL(R) PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details.

This document contains information on products in the design phase of development.

BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Core Inside, FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.

* Other names and brands may be claimed as the property of others.
