Intel® Math Kernel Library 10.1 Update 2 for Linux*
Release Notes
Contents
Overview
New in Intel® MKL
System Requirements
Installation Notes
Documentation
Known Limitations
Technical Support and Feedback
Related Products and Services
Disclaimer and Legal Information
Overview
The Intel® Math Kernel Library (Intel® MKL) provides developers of
scientific, engineering and financial software with a set of linear
algebra routines, fast Fourier transforms, and vectorized math and
random number generation functions, all optimized for the latest Intel®
Pentium® 4 processors, Intel® Xeon® processors with Streaming SIMD
Extensions 3 (SSE3) and Intel® Extended Memory 64 Technology (Intel®
EM64T), and Intel® Itanium® 2 processors. This software also performs
well on non-Intel (x86) processors.
Intel® MKL provides linear algebra functionality with LAPACK
(solvers and eigensolvers) plus level 1, 2, and 3 BLAS offering the
vector, vector-matrix, and matrix-matrix operations needed for complex
mathematical software computations. Users who prefer the Fortran 90/95
programming language may call LAPACK driver and computational
subroutines via special interfaces with reduced numbers of arguments.
Intel® MKL provides ScaLAPACK (Scalable LAPACK) and support
functionality including the Parallel Basic Linear Algebra Subprograms
(PBLAS). For solving sparse systems of equations, Intel® MKL provides
direct and iterative sparse solvers as well as a supporting set of
sparse BLAS (levels 1, 2, and 3).
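As an illustration of what a level 2 sparse BLAS operation computes, the following sketch multiplies a sparse matrix by a dense vector. It is plain Python, not Intel® MKL code. The function name csr_matvec and the zero-based three-array compressed sparse row (CSR) layout (values, col_idx, row_ptr) are assumptions made for this example only.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A * x for a matrix A held in zero-based CSR arrays.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Only the stored (nonzero) entries of row i contribute to y[i].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


# Hypothetical 3x3 example matrix:
#   [[4, 0, 1],
#    [0, 3, 0],
#    [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]

y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

The work is proportional to the number of stored entries rather than to the full matrix size, which is the reason sparse formats are used.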
Intel® MKL offers multidimensional fast Fourier transforms (1D, 2D,
3D) with mixed radix support (not limited to sizes of powers of 2).
Intel® MKL also provides distributed versions of these functions for
use on clusters.
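The idea behind mixed radix support can be sketched as follows: a transform whose length has small prime factors is split recursively by those factors rather than only by 2. This is a teaching sketch in plain Python, not the algorithm Intel® MKL implements. The function names are hypothetical, and prime lengths simply fall back to a direct DFT.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform, used as the base case."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def smallest_factor(n):
    """Return the smallest prime factor of n (n itself if n is prime)."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


def mixed_radix_fft(x):
    """Decimation-in-time FFT that splits by the smallest prime factor."""
    n = len(x)
    if n <= 1:
        return dft(x)
    r = smallest_factor(n)
    if r == n:                      # prime length: no further split possible
        return dft(x)
    m = n // r
    # r interleaved sub-sequences, each transformed at length m.
    subs = [mixed_radix_fft(x[s::r]) for s in range(r)]
    out = [0j] * n
    for k in range(n):
        # Combine the sub-transforms with twiddle factors exp(-2*pi*i*s*k/n).
        out[k] = sum(
            cmath.exp(-2j * cmath.pi * s * k / n) * subs[s][k % m]
            for s in range(r)
        )
    return out


# Length 12 = 2 * 2 * 3 is not a power of 2.
signal = [complex(i, 0) for i in range(12)]
spectrum = mixed_radix_fft(signal)
```

The same recursion handles a length such as 77 = 7 x 11, where the two factors play the role of the radices.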
For the solution of partial differential equations (PDEs), Intel® MKL
provides a few preconditioners to help with the convergence of the
iterative solvers. Optimization (trust region) solvers provide
efficient routines for solving nonlinear least squares problems with and
without boundary constraints.
Intel® MKL also includes a set of vectorized transcendental
functions, the Vector Math Library (VML), offering both greater
performance and excellent accuracy compared to the scalar libm
functions. The Vector Statistical Library (VSL) offers high-performance
vectorized random number generators for a number of probability
distributions as well as convolution and correlation routines.
The BLAS, LAPACK, direct sparse solver (DSS/PARDISO), FFT, VML
library functions, and optimization solvers in Intel® MKL are threaded
using OpenMP*. All of Intel® MKL is thread-safe (with the exception of
the deprecated ?lacon, ?lasq3, and ?lasq4 LAPACK routines; see the
reference manual for more information).
New in Intel® MKL 10.1 Update 2
- Performance improvements
- Improved performance of DGEMM on Intel® Xeon® processor 5500 series systems with 4 MB of last-level cache
- DGEMM floating-point computational efficiencies up to 96% have been measured
- Intel® Optimized LINPACK Benchmark efficiencies up to 91.5% have been measured, up from 88% in Intel® MKL 10.1
- Bug fixes
- Fixed a problem with DGELSD which caused programs to run indefinitely
- Fixed a problem with accepting lowercase character parameters in LAPACK functions
- Fixed a bug in the LAPACK function ZHEGST (affecting ZHEGV) which was noted on some systems using glibc 2.4
- Fixed a problem with the ?ladiv group of functions which occurred for certain data sets
- Removed performance degradation in DSTEVR for cases that caused operations on NaNs
- Fixed error reporting on the VML function vdLn to properly report error codes on large vectors
- Fixed a bug which introduced an infinite loop in the Hypergeometric random number generator
- Fixed a load-balancing problem in the pdtran ScaLAPACK function by improving
the communications scheme, which improves performance by up to 3 times
- Fixed an unaligned memory access error message in the SGI MPT BLACS library
- Fixed a bug in scripts used for building single-precision FFTW2.x wrappers
- Fixed custom DLL builder to link the proper threading runtime when integrated with the Intel® compilers
- Improved error messages in cases where Intel® MKL is unable to load processor-specific DLLs
- Fixed error reporting from MKL_malloc (now returns NULL) when there is insufficient memory to proceed
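The efficiency percentages quoted in the performance items above are commonly read as measured floating-point rate divided by theoretical peak rate. The sketch below works through that arithmetic under that assumption. The machine parameters and the measured rate are hypothetical values chosen to keep the numbers simple; they are not measurements from these release notes.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak rate: cores x clock (GHz) x flops per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


def efficiency_percent(measured_gflops, peak):
    """Measured rate expressed as a percentage of the theoretical peak."""
    return 100.0 * measured_gflops / peak


# Hypothetical machine and hypothetical measurement (illustration only).
peak = peak_gflops(cores=8, clock_ghz=2.5, flops_per_cycle=4)
eff = efficiency_percent(measured_gflops=72.0, peak=peak)
print(f"peak = {peak} GFLOPS, efficiency = {eff}%")
```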
New in Intel® MKL 10.1 Update 1
- Performance Improvements
- Optimized DGEMM/SGEMM 64-bit kernels for the Intel® Core™ i7 processor provide a 2% performance improvement on average
- Improved DGEMM/SGEMM 32-bit kernels for the Intel® Core™ i7 processor and the Quad-Core Intel® Xeon® processor 5400 series
- 2-8% performance improvement on the Quad-Core Intel® Xeon® processor 5400 series
- 2-3% improvement of multithreaded DGEMM on the Intel® Core™ i7 processor
- Better SMP Linpack performance on the Intel® Core™ i7 processor and the Quad-Core Intel® Xeon® processor 5400 series
- 2-3% performance improvement for 32-bit
- 1.5% improvement for 64-bit running a 20K problem on 8 threads
- Added new optimized radix 7 and 11 FFT kernels which provide performance
benefits for transform lengths that are multiples of 7 or 11
- Threaded sparse matrix-vector functions such as mkl_dcoogemv to provide better multicore performance
- Bug fixes and other improvements
- Extended the FFTW interfaces to include cluster FFT functions to further support
real-to-complex transforms and the use of the transposed order
configuration (see the "Configuration Settings" section of the "Fast
Fourier Transforms" chapter in the reference manual for more
information on transposition)
- Eliminated memory leaks where DFT descriptors were created on one thread and freed on another
- Resolved a few LAPACK issues
- Addressed an error in the size of the TAU array used by the DSYTRD function
- Corrected inaccurate results returned by the ZHETRD function when using the GNU threading layer (libmkl_gnu_thread) on Linux*
- Reduced memory use in the PARDISO out-of-core (OOC) solver by 30-50% and in-core by 10-50%
- Added a pre-built MP LINPACK executable that links dynamically to an Intel® MPI Library on Linux*
New in Intel® MKL 10.1
- Performance improvements in the BLAS:
- Performance improvements on Quad-Core Intel® Xeon® processor 5400 series systems with 64-bit operating systems:
- SGEMM: 2% on 1 thread and 6% on 8 threads
- DGEMM: 7% on 8 threads
- CGEMM: 2% on 1 thread and 10% on 8 threads
- ZGEMM: 7% on 1 thread and 11% on 8 threads
- Performance improvements on Quad-Core Intel® Xeon® processor 5400 series systems with 32-bit operating systems:
- SGEMM: 7-15% on 8 threads
- DGEMM: 7-15% on 8 threads
- Performance improvements on Intel® Core™ i7 processors with 64-bit operating systems:
- SGEMM: 50% on 1 thread and 50% on 8 threads
- DGEMM: 11% on 1 thread and 12% on 8 threads
- CGEMM: 2-3% on 1 thread and 2-3% on 8 threads
- ZGEMM: 2% on 1 thread
- DTRSM: 20% on 1 thread and 20% on 8 threads for some cases
- Improvements to the direct sparse solver (DSS/PARDISO):
- The performance of out-of-core PARDISO was improved by 35% on average.
- Support for separate backward/forward substitution in DSS/PARDISO has been added.
- A new parameter for turning off iterative refinement has been introduced for the DSS interface.
- A new parameter for checking sparse matrix structure has been introduced for the PARDISO interface.
- The sparse solver functionality has now been integrated into the core
math library, and it is no longer necessary to link a separate solver
library. See the user guide for more information.
- The sparse solver functionality can now be linked dynamically.
- The capability to track and/or interrupt the progress of lengthy LAPACK
computations has been added via a callback function mechanism. A
function called mkl_progress can be defined in a user application; it
will be called regularly from a subset of the Intel MKL LAPACK
routines. See the LAPACK Auxiliary and Utility Routines chapter in the
reference manual for more information. Refer to the specific function
descriptions to see which LAPACK functions support the feature.
- Transposition functions have been added to Intel MKL. See
the "BLAS-like Extensions" section of chapter 2 in the reference manual for
further detail.
- The C++ std::complex type can now be used instead of MKL-specific complex types.
- Improvements to the Discrete Fourier Transform Interface (DFTI):
- Addition of the DftiCopyDescriptor convenience function
- Reduction in the size of statically linked executables calling DFTI functions
- Support for DFTI_REAL_REAL storage (i.e., real and imaginary parts in separate arrays) in complex-to-complex transforms
- An implementation of the Boost uBLAS matrix-matrix multiplication
routine is now provided which makes use of the highly optimized
version of DGEMM in the Intel MKL BLAS. See the user guide for more
information.
- Improvements to the sparse BLAS:
- Support for all data types (single precision, complex, and double complex) has been added.
- Routines for computing the sum and product of two sparse matrices,
both stored in the compressed sparse row format, have been added.
- Routines for converting between the different supported sparse matrix formats have been added.
- ScaLAPACK functionality can now be dynamically linked.
- Optimized versions of the Cumulative Normal Distribution (CdfNorm), its
inverse (CdfNormInv), and the inverse complementary error function
(ErfcInv) have been added to the Vector Math Library.
- Performance improvements on Intel® Core™ i7 processors:
- 3-17% improvement for the following VML functions: Asin, Asinh, Acos,
Acosh, Atan, Atan2, Atanh, Cbrt, CIS, Cos, Cosh, Conj, Div, ErfInv,
Exp, Hypot, Inv, InvCbrt, InvSqrt, Ln, Log10, MulByConj, Sin, SinCos,
Sinh, Sqrt, Tanh.
- 7-67% improvement for uniform random number generation.
- 3-10% improvement for VSL distribution generators based on Wichmann-Hill, Sobol, and Niederreiter BRNGs (64-bit only).
- The configuration file functionality has been removed. See the user
guide for alternative means to configure the behavior of Intel MKL.
- When functions in Intel MKL are called from an MPI
program, they run on 1 thread by default (that is, in cases where
the number of threads is not explicitly set) if the single-threaded
MPI is linked. This differs from default Intel MKL behavior in programs
that do not use MPI. See chapter 6 of the User's Guide for more
information on controlling parallelism.
- Documentation updates:
- The FFTW Wrappers for MKL Notes have been removed from the product package and
their content has been integrated into the Intel MKL Reference Manual
(Appendix G).
- The parallel BLAS (PBLAS), which support ScaLAPACK, are now documented in the Intel MKL reference manual.
- Added FORTRAN 77 support information to the description of VML and VSL functions in the Intel MKL reference manual.
- Added Eclipse IDE Infopop support for VML functions and VSL service
functions. Infopop support means that brief information on a function
appears in a pop-up window when the cursor is placed over the
function/routine name in the Eclipse editor panel. This Eclipse feature
is implemented in CDT version 5.0.
- Support for new compilers, including the new Intel® compilers 11.0 and PGI* compilers.
- The default OpenMP runtime library for Intel MKL has been changed from
libguide to libiomp. See the User Guide in the doc directory for more
information.
- The optimized code paths for the Intel® Pentium® III
processor have been removed from Intel MKL along with the associated
processor-specific dynamic link libraries. We continue to support the
use of Intel MKL on this processor, but the default code path will be
used and, as a result, performance may be reduced.
- The interval linear solver functions have been removed from Intel MKL.
- Support for Intel MPI 1.x has ended.
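The mkl_progress callback mechanism described in the list above can be pictured with a small model. The sketch below is plain Python and does not call Intel® MKL: run_with_progress is a hypothetical stand-in for a long-running routine, and the convention that a non-zero return value requests interruption is an assumption of this model. The actual mkl_progress signature and behavior are defined in the reference manual.

```python
def run_with_progress(total_steps, progress):
    """Hypothetical stand-in for a long computation that reports progress.

    Calls progress(step, stage) after each step. A non-zero return value
    asks the computation to stop early. Returns the number of steps done.
    """
    done = 0
    for step in range(1, total_steps + 1):
        done = step                          # one unit of pretend work
        if progress(step, "factorize") != 0:
            break                            # callback requested a stop
    return done


log = []


def my_progress(step, stage):
    """User-supplied callback: record progress, interrupt at step 3."""
    log.append((step, stage))
    return 1 if step >= 3 else 0


completed = run_with_progress(10, my_progress)
print(f"stopped after {completed} of 10 steps")
```

A callback of this kind lets an application update a progress display or abandon a computation that has run too long, without changes to the calling routine itself.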
System Requirements
Hardware
To install and use Intel® MKL you will need a system with a
supported processor and 1.1 GB of free hard disk space plus an
additional 400 MB during installation for download and temporary files
(host system only).
Supported processors:
- Intel® Core™ processor family
- Intel® Xeon® processor family
- Intel® Itanium® processor family
- Intel® Pentium® 4 processor family
- Intel® Pentium® III processor
- Intel® Pentium® processor (300 MHz or faster)
- Intel® Celeron® processor
- AMD Athlon* and Opteron* processors
Software
To use Intel® MKL you will need a supported compiler and MPI implementation.
Following is the list of supported operating systems:
- Red Hat* Enterprise Linux* 3, 4, 5 (IA-32 / Intel® 64 / IA-64)
- SUSE LINUX Enterprise Server* 9, 10 (IA-32 / Intel® 64 / IA-64)
- SGI ProPack* for Linux 4, 5 (Intel® 64 / IA-64)
- Red Hat* Fedora* 9 (IA-32 / Intel® 64)
- Debian* GNU/Linux 4.0 (IA-32 / Intel® 64 / IA-64)
- Ubuntu* 8.04 (IA-32 / Intel® 64)
- Asianux* Server 3 (IA-32 / Intel® 64 / IA-64)
- Turbolinux* 11 (IA-32 / Intel® 64 / IA-64)
Note: These Linux* distributions are supported, and Intel® MKL should work on
many more. If you have trouble with your distribution, do let us know.
Following is the list of supported C/C++ and Fortran compilers:
- Intel® Fortran Compiler 11.0 for Linux*
- Intel® Fortran Compiler 10.1 for Linux*
- Intel® C++ Compiler 11.0 for Linux*
- Intel® C++ Compiler 10.1 for Linux*
- GNU Compiler Collection (gcc, g77, GNU Fortran 4.2.0 and later)
- Absoft* Pro Fortran v10.1 for Linux*
- PGI* Workstation Complete version 7.1.6
Following is the list of MPI implementations that Intel® MKL has
been validated against:
Note: Parts of Intel® MKL have Fortran interfaces and data
structures, while other parts have C interfaces and C data
structures. The User Guide in the doc directory contains advice on how
to link to Intel® MKL with different compilers.
Installation Notes
Guidance on the installation of Intel® MKL is provided at install
time. Links will be provided to a file with step-by-step instructions
(filename: Install.txt). This file can also be found in the doc
directory.
Documentation
The Documentation Index (mkl_documentation.htm in the doc directory)
has a list of the principal Intel® MKL documents. For a complete list,
see chapter 3 of the User's Guide.
Known Limitations
The following change is planned for future versions of Intel MKL. Please contact customer support if you have concerns:
- The compatibility or dummy libraries documented in the
linking chapter of the user guide will be removed. These were provided
to smooth the transition to the new naming scheme for Intel MKL
libraries introduced in version 10.0.
A full list of known limitations in this version of Intel® MKL can be found on the support site (http://www.intel.com/software/products/support/mkl).
Technical Support and Feedback
Self Help and User Forums
A rich repository of self-help product information such as
tutorials, getting started tips, known product issues, product errata,
compatibility information and answers to frequently asked questions can
be found at the Intel® Software Development Products Technical Support
site (http://www.intel.com/software/products/support/index.htm). It's a great place to find answers quickly or to gain insight in using our products effectively.
The Intel® MKL User Forum (http://software.intel.com/en-us/forums/intel-math-kernel-library/) is the place to ask questions of and share information with other users of Intel® MKL.
Submitting Issues
Your feedback is very important to us. To receive technical support
and product updates for the tools provided in this product you need to
register at the Intel® Registration Center (https://registrationcenter.intel.com/).
If you have questions or problems getting started with the Intel® Math Kernel Library please contact support at https://registrationcenter.intel.com/support/.
Note: Please notify your support representative prior to
submitting source code where access needs to be restricted to certain
countries to determine if this request can be accommodated.
To submit an issue via the Intel® Premier Support website, please perform the following steps:
- Ensure that Java* and JavaScript* are enabled in your browser
- Go to http://premier.intel.com
- Type in your Login and Password. Both are case-sensitive
- Click the "Submit Issues" button
- Click on the "Development Environment" button next to the "Product Type" drop-down list
- Click on the "Intel(R) MKL for Linux*" button next to the "Product Name" drop-down list
- Enter the information in the required fields, and click on the "Submit Issue" link in the left navigation bar
- Choose "Development Environment (tools,SDV,EAP)" from the "Product Type" drop-down list
- If this is a software or license-related issue, choose "Intel(R) MKL for Linux*" from the "Product Name" drop-down list
- Enter your question and complete the fields in the windows that follow to successfully submit the issue
Please follow these guidelines when forming your problem report or product suggestion:
- Describe your difficulty or suggestion.
For problem reports please be as specific as possible (e.g., including
compiler and link command line options), so that we may reproduce the
problem. Please include a small test case if possible.
- Describe your system configuration information.
Be sure to include specific information that may be applicable to your
setup: operating system, name and version number of installed applications,
and anything else that may be relevant to helping us address your concern.
Related Products and Services
Information on Intel® software development products is
available at http://www.intel.com/software/products.
Some of the related products include:
- The Intel® Software College
provides interactive tutorials, documentation, and code samples that
teach Intel® architecture and software optimization techniques
- The VTune™ Performance Analyzer
allows you to evaluate how your application is utilizing the CPU and
helps you determine if there are modifications you can make to improve
your application's performance
- The Intel® C++ and Fortran Compilers are an important part of making software run at top speeds and fully support the latest Intel IA-32 and Itanium® processors
- The Intel® Performance Library Suite
provides a set of routines optimized for various Intel® processors. The
Intel® Math Kernel Library provides developers of scientific and
engineering software with a set of linear algebra, fast Fourier
transform, and vector math functions optimized for the latest Intel
Pentium and Intel Itanium® processors. The Intel® Integrated
Performance Primitives consist of cross-platform tools to build high
performance software for several Intel architectures and several
operating systems
As referenced in the End User License Agreement, attribution
requires, at a minimum, prominently displaying the full Intel product
name (e.g. "Intel® Math Kernel Library") and providing a link/URL to
the Intel® MKL homepage (www.intel.com/software/products/mkl) in both the product documentation and website.
The original versions of the BLAS from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/blas/index.html.
The original versions of LAPACK from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/lapack/index.html.
The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S.
Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S.
Hammarling, A. McKenney, and D. Sorensen. Our Fortran 90/95 interfaces
to LAPACK are similar to those in the LAPACK95 package at http://www.netlib.org/lapack95/index.html. All interfaces are provided for pure procedures.
The original versions of ScaLAPACK from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/scalapack/index.html.
The authors of ScaLAPACK are L. S. Blackford, J. Choi, A. Cleary, E.
D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry,
A. Petitet, K. Stanley, D. Walker, and R. C. Whaley.
PARDISO in Intel® MKL is compliant with the 3.2 release of PARDISO
freely distributed by the University of Basel. It can be obtained at http://www.pardiso-project.org.
Some FFT functions in this release of Intel® MKL have been generated by the SPIRAL software generation system (http://www.spiral.net/)
under license from Carnegie Mellon University. Some FFT functions in
this release of the Intel® MKL DFTI have been generated by the UHFFT
software generation system under license from University of Houston.
The authors of SPIRAL are Markus Puschel, Jose Moura, Jeremy Johnson,
David Padua, Manuela Veloso, Bryan Singer, Jianxin Xiong, Franz
Franchetti, Aca Gacic, Yevgen Voronenko, Kang Chen, Robert W. Johnson,
and Nick Rizzolo.
Disclaimer and Legal Information
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL(R)
PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO
ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS
PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS,
INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS
OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS
INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR
PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR
OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT
DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE
INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH
MAY OCCUR.
Intel may make changes to specifications and product descriptions at
any time, without notice. Designers must not rely on the absence or
characteristics of any features or instructions marked "reserved" or
"undefined." Intel reserves these for future definition and shall have
no responsibility whatsoever for conflicts or incompatibilities arising
from future changes to them. The information here is subject to change
without notice. Do not finalize a design with this information.
The products described in this document may contain design defects
or errors known as errata which may cause the product to deviate from
published specifications. Current characterized errata are available on
request.
Contact your local Intel sales office or your distributor to obtain
the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in
this document, or other Intel literature, may be obtained by calling
1-800-548-4725, or by visiting Intel's Web Site.
Intel processor numbers are not a measure of performance. Processor
numbers differentiate features within each processor family, not across
different processor families. See
http://www.intel.com/products/processor_number for details.
This document contains information on products in the design phase of development.
BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Core
Inside, FlashFile, i960, InstantIP, Intel, Intel logo, Intel386,
Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core, Intel
Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo,
Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver,
Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel
XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive,
PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey
Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel
Corporation in the U.S. and other countries.
* Other names and brands may be claimed as the property of others.
Copyright (C) 2000-2009, Intel Corporation. All rights reserved.