CM FORTRAN PROGRAMMING GUIDE
Version 2.1, January 1994
Copyright (c) 1994 Thinking Machines Corporation.


CHAPTER 1:   A LANGUAGE FOR ARRAY PROCESSING
********************************************

The CM Fortran language is an implementation of Fortran 77
supplemented with array-processing extensions from the ANSI and ISO
standard Fortran 90. The array-processing features map naturally onto
the data parallel architecture of the Connection Machine (CM) system,
which is designed for computations on large data sets. CM Fortran thus
combines

  o  the familiarity of Fortran 77, often the language of choice for
     scientific computing

  o  the expressive power of Fortran 90, which offers a rich selection
     of operations and intrinsic functions for manipulating arrays of
     data

  o  the computational power of the CM system, which brings multiple
     processors to bear on large arrays in either a data parallel
     fashion or a message-passing fashion


1.1  ARRAY OPERATIONS IN CM FORTRAN
-----------------------------------

The essence of the Fortran 90 array-processing features is that they
treat arrays as first-class objects. An array object can be referenced
by name in an expression or assignment or passed as an argument to a
procedure, and the operation is performed on every element of the
array.


1.1.1  Compared with Fortran 77
-------------------------------

In Fortran 77, operations are defined only on individual scalars.
Operating on an array requires stepping through its elements,
explicitly performing the operation on each one. With Fortran 90
constructions, it is not necessary to reference array elements
separately by means of subscripts, and it is not necessary to write DO
loops or other such control constructs to have the operation repeated
for each element. It is sufficient simply to name the array as an
operand.


1.1.2  Example in Fortran 77 and CM Fortran
-------------------------------------------

Consider a 4-element array A, initialized to [1,2,3,4]:

         INTEGER A(4)
         DATA A  / 1, 2, 3, 4 /


Suppose you want to increment each of the values by 1, so that A
contains [2,3,4,5]. The familiar method in Fortran 77 is to reference
the elements by subscript and, through a looping construct, explicitly
increment each value:

         DO 30 I=1,4
             A(I) = A(I) + 1
     30    CONTINUE


If the array is multidimensional, then the control sequence is nested
to operate on all the elements:

         INTEGER A(4,4)
         DO 30 I=1,4
             DO 40 J=1,4
                 A(I,J) = A(I,J) + 1
     40        CONTINUE
     30    CONTINUE


CM Fortran dispenses with the subscript references and the DO loops.
Both the above operations are expressed simply as:

         A = A + 1


These code fragments perform the same set of operations, but their
semantics are slightly different. The Fortran 77 statements are
evaluated in the order specified by the nested loops, whereas the
Fortran 90 construction allows the elements of A to be evaluated in
any order. The CM system takes advantage of this feature to process
the elements simultaneously.
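
The difference in semantics does not change the result of A = A + 1,
but it becomes visible as soon as one element depends on another. The
following is a minimal sketch in standard Fortran 90; the program
name and the second array B are illustrative and do not come from
this manual.

```fortran
      PROGRAM ORDER
      INTEGER A(4), B(4), I
      DATA A / 1, 2, 3, 4 /
      DATA B / 1, 2, 3, 4 /

! Fortran 77 loop: each iteration sees the result of the previous
! one, so the value of A(1) is carried down the whole array.
      DO 10 I = 2, 4
          A(I) = A(I-1)
   10 CONTINUE

! Fortran 90 array assignment: the right-hand side is evaluated
! in full before any element of B is changed.
      B(2:4) = B(1:3)

      PRINT *, A
      PRINT *, B
      END
```

The loop leaves A containing [1,1,1,1], whereas the array assignment
leaves B containing [1,1,2,3]. Only the second form is free of an
ordering requirement among its elements.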

A Fortran 90 array reference can be used for any size or shape array
and for any array operation defined in CM Fortran. The array could,
for example, be 4-dimensional, and the operation could be any Fortran
operator or intrinsic function:

      REAL B( 512, 64, 8, 4 )

      B = 8.0          ! Set all 1,048,576 elements to 8.0.
      B = B * 2.0      ! All 1,048,576 elements contain 16.0.
      B = SQRT( B )    ! All 1,048,576 elements contain 4.0.


          --------------------------------------------------

                                      NOTE


          The simple array reference A or B is the default form of a
          Fortran 90 triplet subscript. A triplet subscript, such as
          A(1:4:1), contains the information that Fortran 77 expresses
          in the control specification of a DO loop: the first and
          last elements and the increment. Fortran 90 thus replaces DO
          loops with a form of array reference that indicates all the
          elements of interest. See Chapter  for more information
          about Fortran 90 array references.


          --------------------------------------------------


This manual uses the term array object or the CM Fortran term CM array
to mean any array that is referenced in the Fortran 90 manner. That
is, an array object is one for which the array reference contains an
explicit or implicit triplet subscript that indicates all the elements
that are to be operated upon.
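
The correspondence between a DO loop's control specification and a
triplet subscript can be sketched as follows. This is an illustrative
fragment in standard Fortran 90; the program name and the arrays are
assumptions made for the example.

```fortran
      PROGRAM TRIPLT
      INTEGER A(8), B(8), I
      DATA A / 8*1 /
      DATA B / 8*1 /

! Fortran 77: first element, last element, and increment appear
! in the DO statement.
      DO 10 I = 2, 8, 2
          A(I) = 0
   10 CONTINUE

! Fortran 90: the same three values appear in the subscript.
      B(2:8:2) = 0

! Omitted parts take defaults, so these three statements all
! refer to the whole array.
      B = B + 1
      B(:) = B(:) + 1
      B(1:8:1) = B(1:8:1) + 1

      PRINT *, A
      PRINT *, B
      END
```

Here B is an array object in the sense defined above, since every
reference to it carries an explicit or implicit triplet. A is
referenced only by scalar subscript inside a loop.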


1.2  DATA PARALLEL PROCESSING
-----------------------------

Fortran 90 array operations map naturally onto the data parallel CM
system. From the software perspective, an array object refers to all
the data elements of the array simultaneously. From the hardware
perspective, operations on the array's elements are performed
simultaneously.


1.2.1  Compared with Serial Processing
--------------------------------------

A serial implementation of Fortran 90 would have the syntactical
convenience of referencing arrays as objects, but the compiler would
necessarily generate serial loops to process the elements
sequentially. However, if the operations on individual data elements
are independent of one another--that is, there is no loop-carried data
dependence--there is no inherent need for them to be sequential. The
sequence of operations is an artifact of the processor, not of the
program.

A data parallel computer--one that can store the data elements in the
memories of several processors--is free to operate on more than one
element at a time. For example, given a 10 x 10 x 10 array A, consider
the expression

     SQRT(A)/2


To evaluate this expression, a serial computer would need to perform
2000 arithmetic computations one after another. A 1024-processor CM,
in contrast, could process all the elements of A at once, with each
processor performing only two computations.
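
The two ways of writing this computation can be set side by side. The
sketch below is standard Fortran 90; the result array R, the initial
value, and the program name are assumptions made for the example.

```fortran
      PROGRAM HALFRT
      REAL A(10,10,10), R(10,10,10)
      INTEGER I, J, K

      A = 16.0

! Serial form: 1000 iterations, each with one SQRT and one divide,
! performed in the order the loops dictate.
      DO 30 K = 1, 10
          DO 20 J = 1, 10
              DO 10 I = 1, 10
                  R(I,J,K) = SQRT(A(I,J,K)) / 2
   10         CONTINUE
   20     CONTINUE
   30 CONTINUE

! Array form: the same 2000 operations, with no order imposed.
      R = SQRT(A) / 2

      PRINT *, R(1,1,1)
      END
```

Because no element of R depends on any other, the array form leaves
the system free to assign one element to each processor.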


1.2.2  The Compiler's Machine Model
-----------------------------------

The CM Fortran compiler assumes that its target machine is a
collection of parallel processors, each with its own memory, all
acting under the direction of a serial control processor.

                          [ Figure Omitted ]

          Figure 1. The CM Fortran compiler's machine model.

The compiler decomposes the input program into serial and parallel
components:

  o  Arrays used in Fortran 90 constructions are distributed across
     the memory of the parallel processors. For historical reasons,
     these Fortran 90 array objects are called CM arrays in CM
     Fortran; some of the CM libraries also refer to them as parallel
     arrays.

  o  Scalar data and arrays used only in Fortran 77 loop constructions
     are stored in the memory of the control processor. These arrays
     are called front-end arrays in CM Fortran and serial arrays in
     some of the CM libraries.

  o  Arrays that are used in both ways are distributed across the
     parallel processors and moved to the control processor for serial
     operations.
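
Applying these rules to a small program gives a feel for where each
variable would live. The fragment below is a sketch in standard
Fortran 90; the names PAR, SER, and S are hypothetical, and the
comments state what the rules above imply rather than what any
particular compiler release reports.

```fortran
      PROGRAM HOMES
      REAL PAR(1000), SER(1000), S
      INTEGER I

! PAR appears only in array operations, so it would be a CM array
! distributed across the parallel processors.
      PAR = 1.0
      PAR = PAR * 2.0

! SER is touched only element by element inside a DO loop, so it
! would be a front-end array held by the control processor.
      DO 10 I = 1, 1000
          SER(I) = REAL(I)
   10 CONTINUE

! S is scalar data and stays on the control processor.
      S = SER(1000)
      PRINT *, S, SUM(PAR)
      END
```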


The compiler generates serial instructions for the control processor
to execute on scalar data stored in its own memory. It also generates
blocks of parallel computations, meaning element-wise operations and
certain other operations that involve no interprocessor communication.

The control processor sends these parallel code blocks to the parallel
processors, and each executes them on its own portion of the data. The
control processor also executes control-flow statements and calls
run-time library routines whenever the parallel processors need to
communicate with each other or with the control processor.

                          [ Figure Omitted ]

           Figure 2. The compiler's "two machines" output.

Notice that no new data structure is needed to express parallelism in
Fortran, and the programmer need not specify where arrays are to be
allocated (although compiler directives and switches do provide this
capability).

Nor does the programmer need to specify when or how processors are to
communicate, since the compiler generates several kinds of
communications instructions as needed:

  o  Nearest-neighbor, or NEWS, communication, whereby each processor
     gets a value from its neighbor on an n-dimensional grid, all at
     the same time

  o  General-purpose, or "send," communication, whereby each processor
     sends a value to any arbitrary processor, all at the same time

  o  Global communication, which includes cumulative computations
     along grid axes and reduction of an array to a single value
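
The following sketch shows one Fortran 90 statement that has the data
movement pattern of each kind of communication. Which run-time
routine the compiler actually selects for a given statement is an
assumption here, not something this chapter states. The program name
and data are illustrative. Keyword arguments are used with CSHIFT so
that the example does not depend on positional argument order.

```fortran
      PROGRAM COMMS
      INTEGER A(8), B(8), P(8), TOTAL
      DATA A / 10, 20, 30, 40, 50, 60, 70, 80 /
      DATA P / 8, 7, 6, 5, 4, 3, 2, 1 /

! Grid pattern: every element takes its neighbor's value, with
! wraparound at the ends.
      B = CSHIFT(A, SHIFT=1, DIM=1)

! General pattern: element I of A is sent to the arbitrary
! position P(I) of B; here P reverses the array.
      B(P) = A

! Global pattern: reduce the whole array to a single value.
      TOTAL = SUM(A)

      PRINT *, B
      PRINT *, TOTAL
      END
```

In each case the statement describes only the result. The program
contains no explicit instruction about when or how processors
exchange data.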


Because there is only one instruction stream for the parallel
processors, they are naturally synchronized. Race conditions cannot
develop because no processor proceeds to the next code block or
communication instruction until all processors have finished the
current one.


1.2.3  Data Parallel Processing on the CM System
------------------------------------------------

The compiler's "two-machine" model reflects the data parallel
architecture of CM systems. These systems provide a control processor,
called the partition manager on a CM-5 or the front end of a CM-2 or
CM-200. The CM also provides a set of parallel processors, or
processing nodes.

See the Technical Summary for the CM-5, CM-200, or CM-2 for detailed
information on system architecture.

                          [ Figure Omitted ]

      Figure 3. Division of labor between CM system components.

In the straightforward global model of program execution, the
partition manager (or front end) executes the code that the compiler
generates for the control processor, and the nodes store and operate
upon CM array objects. In a CM-5 with vector units, the vector units
serve as the parallel processors instead.

On the CM-5, the three-level hierarchy of processors--partition
manager, node microprocessors, and vector units--permits other
mappings to hardware, including asynchronous message-passing programs
that run on the individual nodes, each using the associated four
vector units as its set of parallel processors.


Chapter 3 of this manual describes the CM Fortran execution models.


     --------------------------------------------------

                           NOTE ON TERMINOLOGY

     CM Fortran typically refers to the control processor and the
     parallel processors as the front end and the CM, respectively.
     Hence the terms front-end array and CM array. On the CM-5, and
     particularly in the message-passing execution models, it is
     important to recognize that front end means whatever system
     component is serving as the control processor (it may be the
     partition manager or a node), and that CM refers to whatever set
     of processors is executing CM Fortran array operations.


     --------------------------------------------------


1.2.4  CM Fortran as a Superset of Fortran 77
---------------------------------------------

CM Fortran is a superset of Fortran 77. The differences between the
two languages reflect the compiler's "two-machine model," that is, the
assumption that a CM Fortran program is directing two system
components with different memory organizations.

Any Fortran 77 construction can be used with scalar data and front-end
("serial") arrays, since such data is stored and processed in the
conventional serial manner.

Most Fortran 77 features are extended for use with array objects, as
the + and = operators are used above in the array operation A = A + 1.
However, certain features with storage-order dependences--most
notably, the EQUIVALENCE statement--are not supported for use with CM
arrays.

CM Fortran implements Fortran 90's array-processing features for use
only with CM arrays. It also offers a subset of Fortran 90's
precision-control syntax and certain other Fortran 90 features that
are commonly found as extensions of Fortran 77 (see Section 1.3). CM
Fortran does not implement Fortran 90's derived types ("structures"),
modules, or pointer assignments.


1.3  THE FEATURES OF CM FORTRAN
-------------------------------

This section summarizes the major extensions of CM Fortran over
Fortran 77. See the CM Fortran Language Reference Manual and the CM
Fortran Libraries Reference Manual for complete information.


1.3.1  Fortran 90 Array Processing
----------------------------------

The array-processing features that CM Fortran draws from Fortran 90
include:

  o  expanded semantics for Fortran 77 operators and intrinsic
     functions, such that they can take an array object and operate on
     its elements

  o  array sections and vector-valued subscripts, new syntax for
     selecting subarrays from array objects

  o  the WHERE statement and construct, which operate conditionally on
     an array's elements depending on the elements' values

  o  new intrinsic functions for permuting and transforming arrays, as
     well as for constructing arrays and inquiring about their
     properties

  o  dynamic memory management for ALLOCATABLE arrays and array
     pointers by means of the statements ALLOCATE and DEALLOCATE

  o  attributed type declarations, an alternative to Fortran 77 type
     declarations and the DIMENSION statement for declaring arrays
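
Several of these features can be seen together in a short program.
The sketch below uses the standard Fortran 90 spellings; the exact
forms accepted by CM Fortran should be checked in the CM Fortran
Language Reference Manual. The program name, array names, and data
are illustrative.

```fortran
      PROGRAM F90ARR
! Attributed type declarations
      REAL, DIMENSION(6) :: X
      INTEGER, DIMENSION(3) :: IDX
      REAL, ALLOCATABLE, DIMENSION(:) :: W
      DATA X / -4.0, 9.0, -16.0, 25.0, -36.0, 49.0 /
      DATA IDX / 6, 1, 3 /

! WHERE construct: operate on each element according to its value.
! Afterward X contains 0, 3, 0, 5, 0, 7.
      WHERE (X .GE. 0.0)
          X = SQRT(X)
      ELSEWHERE
          X = 0.0
      END WHERE

! Dynamic allocation, a vector-valued subscript, and a section.
! W receives elements 6, 1, and 3 of X, in that order.
      ALLOCATE (W(3))
      W = X(IDX)
      X(1:3) = W
      DEALLOCATE (W)

      PRINT *, X
      END
```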


1.3.2  Other Array Extensions
-----------------------------

CM Fortran includes some extensions over Fortran 90 that are
particularly useful for data parallel programming. Several of these
features have been adopted by the emerging industry standard High
Performance Fortran.

  o  The FORALL statement, a powerful facility for initializing
     arrays, for selecting subarrays, and for specifying data movement
     in terms of array indices.

  o  The intrinsic functions FIRSTLOC, LASTLOC, and PROJECT, which
     return the locations of certain array elements (such as the first
     true element); the array transformation functions DIAGONAL and
     REPLICATE; and the inquiry function RANK.

  o  Compiler directives LAYOUT and ALIGN, which control the layout of
     arrays in distributed memory. The choice of array layout can have
     major effects on program performance.
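
As an illustration of the FORALL statement, the sketch below
initializes an array from its own indices and then selects its main
diagonal. The program name, the arrays H and D, and the size N are
assumptions made for the example. Layout directives are not shown.

```fortran
      PROGRAM FALL
      INTEGER N
      PARAMETER (N = 4)
      REAL H(N,N), D(N)
      INTEGER I, J

! Initialize every element from its own indices.
      FORALL (I = 1:N, J = 1:N) H(I,J) = 1.0 / REAL(I + J - 1)

! Select a subarray by index arithmetic: the main diagonal.
      FORALL (I = 1:N) D(I) = H(I,I)

      PRINT *, D
      END
```

As with other array operations, the FORALL statement names the
elements of interest without imposing an order on them.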


1.3.3  Precision Control
------------------------

CM Fortran provides Fortran 90 syntax for specifying the kind of
numeric data types (for both scalars and arrays). A type's kind
indicates either 32-bit or 64-bit precision.

  o  KIND keyword and predefined kind type parameters for declaring
     the precision of numeric variables, plus the KIND intrinsic
     function for inquiring about precision

  o  syntax for specifying the kind type of literal constants
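
A short sketch of this syntax in standard Fortran 90 follows. The
parameter names SP and DP are hypothetical. They are derived with the
KIND intrinsic so that the example does not assume particular kind
values or the names of CM Fortran's predefined kind parameters.

```fortran
      PROGRAM KINDS
      INTEGER, PARAMETER :: SP = KIND(1.0E0)
      INTEGER, PARAMETER :: DP = KIND(1.0D0)
      REAL(KIND=SP) XS(4)
      REAL(KIND=DP) XD(4)

! Literal constants carry their kind as a suffix.
      XS = 1.0_SP / 3.0_SP
      XD = 1.0_DP / 3.0_DP

! The KIND intrinsic reports the kind of its argument.
      PRINT *, KIND(XS), KIND(XD)
      PRINT *, XS(1)
      PRINT *, XD(1)
      END
```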


1.3.4  Other Fortran 90 Features
--------------------------------

CM Fortran also includes some Fortran 90 features that are not
specifically related to array processing, but are commonly found in
implementations of Fortran 77. Examples of these are:

  o  control-flow statements CASE, DO TIMES, DO WHILE, and END DO

  o  NAMELIST I/O

  o  DOUBLE COMPLEX data type

  o  INCLUDE lines

  o  IMPLICIT NONE statement
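
Three of these features appear in the sketch below, written with the
standard Fortran 90 spellings (SELECT CASE for the CASE construct).
The program name and the computation are illustrative only.

```fortran
      PROGRAM FLOW
      IMPLICIT NONE
      INTEGER N, STEPS

      N = 10
      STEPS = 0

! DO WHILE and END DO replace a labeled loop with a GO TO test.
      DO WHILE (N .NE. 1)
          SELECT CASE (MOD(N, 2))
          CASE (0)
              N = N / 2
          CASE DEFAULT
              N = 3 * N + 1
          END SELECT
          STEPS = STEPS + 1
      END DO

      PRINT *, STEPS
      END
```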


1.3.5  CM Fortran Libraries
---------------------------

CM Fortran provides three libraries: the utility library, the cmf77
library, and the global/local library.

The utility library provides:

  o  some language-level operations that are not available in CM
     Fortran, such as generating random numbers in an array and
     ranking or sorting array elements by value

  o  some performance-enhancing capabilities beyond what the language
     delivers, such as fast data transfers between the CM's serial and
     parallel processors


In earlier versions, the utility library served mainly as a stopgap
for language features that were inefficient or were not yet
implemented. In this version, many of the utility procedures are
redundant, or nearly so, with language features. The CM Fortran
Libraries Reference Manual compares each set of procedures with the
corresponding language features, if any, and points out any
significant differences in behavior or performance.

The library libcmf77 is similar to Sun Microsystems' library libF77.
It provides interfaces to operating system calls, such as TIME and
IARGC. Two particularly useful routines are FMALLOC and FFREE, which
permit dynamic allocation of front-end arrays, a feature not otherwise
available in CM Fortran. (This operation is illustrated below in
Chapter 7.)

The third CM Fortran library is the global/local library, CMGL. Its
procedures are used only in the global/local model of programming,
described below in Chapter 13.


1.3.6  CM System Libraries
--------------------------

Besides the CM Fortran libraries described above, CM systems offer the
following libraries callable from CM Fortran:

  o  CM Scientific Software library, CMSSL: highly optimized routines
     for performing common scientific and mathematical tasks on the
     CM-5 and CM-2/200

  o  CM Message Passing library, CMMD: routines for handling
     interprocessor communication, synchronization, and I/O from
     within CM Fortran programs running locally on the CM-5 nodes

  o  CM data visualization libraries, such as CM/AVS for the CM-5 and
     *Render for the CM-2/200

  o  CM low-level I/O libraries: SFS for parallel I/O between the CM-5
     and the Scalable Disk Array, and CMFS for parallel I/O between
     any CM and the DataVault or devices such as CM-HIPPI


See the documentation for the individual libraries for information on
calling them from CM Fortran programs.
*****************************************************************

  The information in this document is subject to change without
  notice and should not be construed as a commitment by Thinking
  Machines Corporation. Thinking Machines reserves the right to
  make changes to any product described herein.

  Although the information in this document has been reviewed
  and is believed to be reliable, Thinking Machines Corporation
  assumes no liability for errors in this document. Thinking
  Machines does not assume any liability arising from the
  application or use of any information or product described
  herein.

*****************************************************************

Connection Machine (r)
is a registered trademark of Thinking Machines Corporation.
CM, CM-2, CM-200, CM-5, CM-5 Scale 3, and DataVault
are trademarks of Thinking Machines Corporation.
CMOST, CMAX, and Prism are trademarks of Thinking Machines Corporation.
C* (r) is a registered trademark of Thinking Machines Corporation.
Paris, *Lisp, and CM Fortran are trademarks of Thinking Machines Corporation.
CMMD, CMSSL, and CMX11 are trademarks of Thinking Machines Corporation.
CMview is a trademark of Thinking Machines Corporation.
Scalable Computing (SC) is a trademark of Thinking Machines Corporation.
Scalable Disk Array (SDA) is a trademark of Thinking Machines Corporation.
Thinking Machines (r)
is a registered trademark of Thinking Machines Corporation.
SPARC and SPARCstation are trademarks of SPARC International, Inc.
Sun, Sun-4, SunOS, Sun FORTRAN, and Sun Workstation 
are trademarks of Sun Microsystems, Inc.
UNIX is a trademark of UNIX System Laboratories, Inc.
The X Window System
is a trademark of the Massachusetts Institute of Technology.

Copyright (c) 1989-1994 by Thinking Machines Corporation.  All rights reserved.
This file contains documentation produced by Thinking Machines Corporation.
Unauthorized duplication of this documentation is prohibited.

Thinking Machines Corporation
245 First Street
Cambridge, Massachusetts 02142-1264
(617) 234-1000