CM FORTRAN USER'S GUIDE
Version 2.1, January 1994
Copyright (c) 1994 Thinking Machines Corporation.


CHAPTER 2: COMPILING AND LINKING CM FORTRAN PROGRAMS
*****************************************************

This chapter explains how to use the cmf command to compile and link CM Fortran programs.

The cmf command operates much like commands for other compilers running under the UNIX operating system. It can be used to compile source files of different types and link them with object files. The command invokes one or more of the following: the CM Fortran compiler, the C language preprocessor cpp (if the filename has the extension .FCM or .F), the compilers f77, cc, or cs, the Sun assembler as, or the vector-units assembler dpas (on the CM-5 only), as appropriate. Next, it invokes either the linker cmld (on the CM-5), the linker ld (on the CM-2 or CM-200), or the linker cmmd-ld (for nodal CM Fortran programs) to combine the resulting object files with other object files or libraries to form an executable load module.

Note that cmf invokes other compilers only as a convenience. cmf does not know all switches available for these compilers. In cases where unusual switches are desired, it may be more convenient to call the desired compiler directly, producing an object file that is then given to cmf to link.


2.1 THE COMPILER COMMAND
-------------------------

The cmf command invokes the CM Fortran compiler. Its syntax is:

    % cmf [ switch-or-filename ]...

The optional switch-or-filename is either a switch or a filename. Each filename specifies a source or object file to be compiled or linked, and each switch is a command switch name preceded by a dash, in the usual UNIX style. Some switches require an argument; the form of these arguments is described in Section 2.3. Switches and filenames should be separated by one or more blank spaces; the order of switches and filenames does not matter.
For example:

    % cmf my-program.fcm -o my-program


2.1.1 Filename Extensions
--------------------------

The cmf command determines the type of a file from its filename extension and processes the file accordingly. The command accepts filenames with the following extensions:

Object files may be from a previous invocation of cmf. They may also be the output of the Sun FORTRAN compiler f77 or the Sun C compiler cc.


2.1.2 Linking with CM Libraries
--------------------------------

CM Fortran Libraries

The CM Fortran Utility Library is linked automatically when needed. Program units that call procedures from this library need to include its header file CMF_defs.h.

      INCLUDE '/usr/include/cm/CMF_defs.h'

The pathname may be different at a particular site if the system administrator has revised the CM directory structure.

The CM Fortran global/local library is linked explicitly. See Section 3.3 for information about linking for the global/local programming model and about which header files to include.

The CM Fortran library cmf77 (libcmf77.a) requires no special header (include) file or explicit linking. This library provides procedures comparable to those in the Sun library libF77.a. If the Sun library is installed on a CM system, the cmf command links with it as well as with libcmf77.a so that object files compiled with Sun's f77 command will link successfully.

All three of these libraries are described in the CM Fortran Libraries Reference Manual.

CM System Libraries

Some of the CM libraries require you to include header files and link explicitly when processing a CM Fortran program that uses them.

The following libraries are supported on the CM-5:

o  Prism
o  CMSSL
o  CMMD
o  CMX11
o  CM/AVS
o  CMFS
o  SFS

The following libraries are supported on the CM-2 and CM-200:

o  Prism
o  CMSSL
o  CMFS

The CM simulator supports no libraries except for Prism.

CMFS and SFS are linked in automatically.
As of Version 2.1, cmf is able to link with the appropriate version of the following libraries, based on either the default compiler mode or the use of the -vu, -sparc, and -node switches:

o  -lcmssl (for the CMSSL library)
o  -lcmx (for the CMX11 library)
o  -lcmavsflow, -lcmavssim (for the CM/AVS libraries)
o  -lcmsr (for the *Render library)
o  -lcs (for C* codes that call CM Fortran subroutines, which are linked using cmf)
o  -lcmfs_cs (for calling CMFS from C* code that is linked to a CM Fortran program)

For instance, you can now link with CMSSL by specifying -lcmssl on the link line instead of a form that indicates which execution model you are using, such as -lcmsslcm5 or -lcmsslcm5vu-node. The old form is still supported, however, so there is no need to change your makefiles.

NOTE: For CMX11 and CM/AVS, you must still specify the non-CM libraries, such as -lX11 and -lflow_f.

The libraries available may differ among the CM execution environments. See the documentation for the current versions of the libraries for details.

INCLUDE Files

The compiler accepts up to 252 INCLUDE files per source file, nested to a maximum depth of 19. The total refers to distinct files referenced in all the INCLUDE lines in all the compilation units in a source file. If, in the same file, SUB1 includes inc1.fcm, inc2.fcm, and inc3.fcm, and SUB2 includes inc3.fcm and inc4.fcm, the total charged against the 252 is 4.


2.1.3 Compiler and Linker Output
---------------------------------

The compiler generates code that is specific to a CM hardware platform (as shown below in Section 2.2.1). There are also some differences in linking behavior and in the intermediate files generated, as detailed in this section.

Code Compiled for the CM-2 or CM-200 System

The cmf compiler and the ld linker generate a single output file, a.out. The intermediate files generated with the -S or -c switches are the assembler file myfile.s and the object file myfile.o, respectively.
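
The naming of intermediate files can be modelled in a few lines. The Python sketch below is only an illustration of the naming rules in this section: it covers the single-file CM-2/CM-200 case above and the two-file CM-5 and simulator case described in the next subsection. The helper intermediate_files and the platform strings are hypothetical names chosen for the example; they are not part of cmf.

```python
import os

# Hypothetical helper, not part of the cmf toolchain. It only models the
# file-naming rules described in Section 2.1.3.
ONE_FILE_PLATFORMS = {"cm2", "cm200"}   # one intermediate file per source
TWO_FILE_PLATFORMS = {"cm5", "cmsim"}   # scalar file plus parallel ".pe" file


def intermediate_files(source, switch, platform):
    """Return the intermediate file names expected for one source file.

    source   -- source file name, e.g. "myfile.fcm"
    switch   -- "-S" (assembler output) or "-c" (object output)
    platform -- "cm2", "cm200", "cm5", or "cmsim" (labels assumed here)
    """
    suffix = {"-S": ".s", "-c": ".o"}[switch]
    stem = os.path.splitext(source)[0]
    if platform in ONE_FILE_PLATFORMS:
        return [stem + suffix]
    if platform in TWO_FILE_PLATFORMS:
        return [stem + suffix, stem + ".pe" + suffix]
    raise ValueError("unknown platform: " + platform)


print(intermediate_files("myfile.fcm", "-c", "cm200"))  # ['myfile.o']
print(intermediate_files("myfile.fcm", "-S", "cm5"))    # ['myfile.s', 'myfile.pe.s']
```

A script that copies or moves intermediate files could use such a list to make sure that a scalar file is never separated from its parallel companion.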

Code Compiled for the CM-5 or CM Fortran Simulator

The cmf compiler and the cmld linker generate a single output file that combines a scalar executable for the partition manager and a parallel executable for the parallel processors. As intermediate output, however, the compiler generates separate files for the two target components. For example:

o  With the -S switch, the compiler generates two assembler files: myfile.s and myfile.pe.s.

o  With the -c switch, the compiler generates two object files: myfile.o and myfile.pe.o.

o  The linker generates only one executable file: a.out. There is no file a.out.pe corresponding to the parallel intermediate files.

If you work with intermediate files (explicitly linking object files, for instance), you need only specify the scalar file. The corresponding parallel file is linked automatically.

    % cmf -cm5 file_1.o file_2.o

This command line links by invoking the compiler, which in turn invokes the linker. It is also possible to invoke the cmld linker directly. (A CM-5 CMOST man page is provided.) For more information about the output of the cmf compiler, see the CM-5 CM Fortran Performance Guide.

If you wish to disable the automatic processing of the parallel (".pe") intermediate file when the corresponding scalar file is specified, set the environment variable CMF_AUTO_PE_FILES to 0. Any positive value for this variable leaves the feature enabled.

In either case, recall that the separate intermediate files exist. If you have occasion to copy or move intermediate files to another directory, be sure to take both the scalar and the parallel files.


2.1.4 Command Line Arguments
-----------------------------

You can write a routine to get command-line arguments by using operating system interfaces from the CM Fortran library cmf77 (see the CM Fortran Libraries Reference Manual). This library is similar to Sun Microsystems' F77 library. The interfaces are a function IARGC() and a subroutine GETARG(I, NAME).
You might use them as follows:

      PROGRAM LOOP
      IMPLICIT NONE
      INTEGER N1, N2, N3, N4, IARGC
      CHARACTER*80 ARGUMENT
C     Require exactly four command-line arguments.
      IF (4 .NE. IARGC()) THEN
         PRINT *, 'Usage : LOOP N1 N2 N3 N4'
         STOP
      END IF
C     Fetch each argument as text, then convert it with an internal READ.
      CALL GETARG(1, ARGUMENT)
      READ (UNIT=ARGUMENT, FMT='(I6)') N1
      CALL GETARG(2, ARGUMENT)
      READ (UNIT=ARGUMENT, FMT='(I6)') N2
      CALL GETARG(3, ARGUMENT)
      READ (UNIT=ARGUMENT, FMT='(I6)') N3
      CALL GETARG(4, ARGUMENT)
      READ (UNIT=ARGUMENT, FMT='(I6)') N4
      PRINT *, IARGC(), N1, N2, N3, N4
      STOP
      END


2.1.5 Sun Microsystems' Libraries
----------------------------------

CM Fortran automatically links with Sun Microsystems' libraries if they are installed on the system. This behavior will be discontinued in future versions.

When a program that was linked with Sun Microsystems' dynamically bound libraries does not find them at run time, an error message like the following is displayed:

    ld.so: libF77.so.1: not found

You can prevent this problem by linking statically by means of the -Bstatic switch, which cmf passes on to the linker:

    % cmf myfile.fcm -Bstatic


2.1.6 Custom Link Libraries
----------------------------

The cmf compiler generates separate object files for scalar and parallel code, with the extensions .o and .pe.o, respectively. To build a custom link library, create two separate libraries for the scalar and parallel code:

o  Take all the .o files and put them into libfoo.a, using the UNIX ar command.

       % ar rv libfoo.a sub1.o

o  Take all the .pe.o files and put them into libfoo_pe.a, again using the ar command. The underscore in the name is required; a filename of the form libfoo.pe.a does not work.

       % ar rv libfoo_pe.a sub1.pe.o

o  Add the location of libfoo.a to the library search path.

When linking a program, you need only specify libfoo.a to the link command (cmf or cmld). The parallel library libfoo_pe.a is then linked in automatically.

    % cmf -lfoo myfile.fcm


2.2 SOME IMPORTANT SWITCHES
----------------------------

Section 2.3 describes all the switches accepted by the cmf command.
This section draws your attention to some particularly important switches.


2.2.1 Specifying Execution Environment
---------------------------------------

Some switches have to do with specifying hardware platform and/or execution model. This section provides a brief overview of these switches. For more detailed information about using them to compile programs, see Chapter 3.

The CM Fortran execution model refers to the way the compiler makes use of the hardware. On CM-5 systems, the compiler can treat either the nodes or the vector units as the parallel processing elements; on CM-2/200 systems, it uses the processing nodes. In the simulator model, a single Sun-4 computer is the processor.

The following table shows the compiler switches that control the target CM platform and execution model. It also shows the symbolic names of the execution models and the system component that serves as the processing element.

Table 1. CM Fortran hardware platforms and execution models.

Hardware                Compiler             CMF_ARCHITECTURE      CMF_NUMBER_OF_
platform                options              returns               PROCESSORS returns
-----------------------------------------------------------------------------------------
CM-5
  Vector units          -cm5 -vu             CMF_CM5_VU            number of vector units
  SPARC nodes           -cm5 -sparc          CMF_CM5_SPARC         number of nodes
CM-200
  Slicewise             -cm200 -slicewise    CMF_CM200_SLICEWISE   number of nodes
CM-2
  Slicewise             -cm2 -slicewise      CMF_CM2_SLICEWISE     number of nodes
CM Fortran Simulator    -cmsim               CMF_CMSIM             number of processors (1)

See Chapter 3 for information about the nodal model and the local part of the global/local execution model (specified by -cm5 -vu -node and -cm5 -vu ... -local, respectively). For both of these execution models, CMF_ARCHITECTURE returns the same value as for the vector-units model.

--------------------------------------------------
NOTE

Object files compiled for different execution models are not compatible. If you link as a separate step, be sure that all the CM Fortran files were compiled for the same execution model.
--------------------------------------------------

Hardware Requirements

o  The vector-units model executes only on a CM-5 with vector units.
   The (SPARC) nodes model executes on any CM-5 system.

o  The slicewise model executes only on a CM-2 with the 64-bit floating-point accelerator or on any CM-200 (where this hardware is standard).

o  The simulator model executes on a Sun-4 computer, which may be a CM-5 partition manager or a front end to a CM-200 or CM-2.

The compiler signals an error if the options specifying hardware platform and execution model are incompatible. Notice that the option -cmsim is not compatible with any other option on the list.

Defaults

For systems with more than one CM hardware platform installed, the default platform is determined locally at installation time. For each platform (other than simulator), there is also a system-wide default execution model.

o  If you specify hardware platform but not execution model on the cmf command line, the compiler uses the site default for the missing value.

o  Use the environment variable CMF_DEFAULT_MACHINE to change the default hardware platform for your shell environment. Its value is one of the following (case is ignored):

       % setenv CMF_DEFAULT_MACHINE [ CM5 | CM200 | CM2 | CMSIM ]

o  Unsetting this environment variable causes the shell environment to revert to the site's default hardware platform:

       % unsetenv CMF_DEFAULT_MACHINE

o  Use one of the following environment variables to change the default execution model for your shell environment (case is ignored):

       % setenv CMF_CM5_DEFAULT_MODE [ SPARC | VU ]
       % setenv CMF_CM200_DEFAULT_MODE [ SLICE ]
       % setenv CMF_CM2_DEFAULT_MODE [ SLICE ]

   Note that CM Fortran Version 2.1 supports only the slicewise execution model on the CM-2 and CM-200.

o  Unsetting these environment variables causes the shell environment to revert to the site's default for each of the hardware platforms:

       % unsetenv CMF_platform_DEFAULT_MODE


2.2.2 Listing Array Homes
--------------------------

Mistaking the homes of arrays is a frequent source of user errors and poor program performance.
If you inadvertently perform a front-end operation on a CM array, performance degrades as the system moves the array, one element at a time, from its CM home to the front end where the operation executes. If you mismatch array homes across subroutine boundaries (passing a CM array to a procedure that expects a front-end array, or vice versa), the result is a run-time error or incorrect results.

The compiler switch -list allows you to check the location of particular arrays allocated by the compiler. This switch causes the compiler to produce a file named filename.lis, which reports the home and other information about names used in the program being compiled.

For example, consider this code fragment:

      LOGICAL, ARRAY(10) :: CML, FEL
      INTEGER, ARRAY(10) :: CMI, FEI
      REAL, ARRAY(10)    :: CMR, FER
      COMPLEX, ARRAY(10) :: CMC, FEC
      ...

In this code fragment, names beginning with the letters CM are assumed to be CM arrays, and those beginning with FE are assumed to be front-end arrays. The "ARRAYS" section of the listing produced for this program fragment indicates the home of each array under the column labeled "Home," as shown below:

    ARRAYS

    Offset   Size   Type   Block/Class   Home   Name
    ---        80   C*8    local         CM     CMC
    ---        40   I*4    local         CM     CMI
    ---        40   L*4    local         CM     CML
    ---        40   R*4    local         CM     CMR
    0          80   C*8    local         FE     FEC
    120        40   I*4    local         FE     FEI
    160        40   L*4    local         FE     FEL
    80         40   R*4    local         FE     FER

The sizes and offsets are given in units of bytes. No offset is listed for CM arrays, as they are allocated at run time.


2.2.3 Identifying Communication Routines Generated
---------------------------------------------------

The listing file produced when the -list switch is specified identifies the communication routine references generated by a program unit, and the source code line numbers at which each reference occurs.
For example, the source lines of a (somewhat contrived) program xref.fcm would appear in the listing file xref.lis as:

    Source Listing                  File: /users/user-name/xref.fcm

     1   1  1          program xref
     2   2  1          parameter (m = 10)
     3   3  1          real a(m), b(m)
     4   4  1          integer v(m)
     5   5  1          a = [1:m]*17.0
     6   6  1          v = [1,4,3,2,7,6,9,8,10,5]
     7   7  1          a(v) = a*3
     8   8  1          print 10, a
     9   9  1    10    format( " A:", 10F9.3 )
    10  10  1    loop: do 100 i=1, m
    11  11  1             b(i) = log(real(i*i*i))
    12  12  1             a(i) = a(i)*b(v(i))
    13  13  1             if (i==9) exit loop
    14  14  1    100   continue
    15  15  1          print 10, a
    16  16  1    200   end

The listing file reports the communication routine references as:

    COMMUNICATION ROUTINES

    Name                          Line Number (number of times)
    READ VALUE FROM PROCESSOR     12(2)
    VECTOR SEND                   7
    FE TO CM ARRAY TRANSFER       6

The example code generates references to three different communication routines: READ VALUE FROM PROCESSOR on line 12, VECTOR SEND on line 7, and FE TO CM ARRAY TRANSFER on line 6. (VECTOR SEND is a general communication routine to handle vector-valued subscripting.) If more than one reference to a communication routine appears on a single line, that number is indicated in parentheses following the line number.

Many of the communication routines support the intrinsic functions directly, and references to them use the name of the intrinsic function itself (possibly qualified), such as CSHIFT, MAXLOC, SUM (into scalar), and SUM (into vector). Others refer to common CM communication patterns: SEND, GET, VPMOVE, NEWS, and NEWS (power of two). Still others refer to data transfers between the CM and the front end: READ VALUE FROM PROCESSOR, FE TO CM ARRAY TRANSFER, and so on. The listing also reports uses of SUBROUTINE ARGUMENT COPYOUT.


2.2.4 Locating Line Labels and Names
-------------------------------------

When used together with the -list switch, the -cross_reference switch causes the listing file to include information that relates line labels and names (symbols) to source code lines. (Add the switch -show_include if you want the contents of include files to be listed also.)
The -cross_reference switch is ignored if the -list switch is not specified. The default is -nocross_reference.

The symbol and label cross reference listings generated for the program listed in the previous section are shown below.

    Symbol Cross Reference          File: /users/usr-name/xref.fcm

    Symbol     Line Number(s)
    A          3 5 7 7 8 12 12 15
    B          3 11 12
    I          10 11 11 11 11 12 12 12 13
    LOG        11
    LOOP       10 13
    M          2 3 3 4 5 10
    REAL       11
    V          4 6 7 12
    XREF       1

    Label Cross Reference           File: /users/usr-name/xref.fcm

    Label      Defined    Reference(s)
    10         9          8 15
    100        14         10
    200        16


2.2.5 Run-Time Safety Checking
-------------------------------

To catch array-home errors and other run-time errors, you can direct the compiler to generate code that checks at run time for home mismatches, other mismatches between actual and dummy array arguments, and the use of uninitialized CM arrays of floating-point types.

The switch is -safety=level, where level is an integer value indicating the level of run-time safety desired. The key levels are:

0     No safety checking.

1     Checks that:

      o  the number of actual arguments equals the number of dummy arguments
      o  the home of each actual array argument is the same as the home of the corresponding dummy argument
      o  the rank of each CM actual array argument is the same as the corresponding dummy argument

      Note that the generated code does not check the shape of CM array arguments (only their rank), nor does it check the types of arguments. Any argument from 1 through 9 provides these checks.

10    Provides the checks above plus checking for the use of uninitialized CM arrays of types real or complex (single- or double-precision).

      This argument causes all bits in program memory to be initialized to one. Read as a two's complement integer, this is interpreted as -1; read as a floating-point number, it is interpreted as a NaN. The system prints a warning when a NaN is detected at run time.

      Under -safety=10, only uses of NaNs in elemental CM computations are detected. For instance, suppose CM array A is uninitialized.
      The compiler will detect an error in B = A + C but not in B(i) = A(i) + C(i). The latter computation takes place on the front end, where NaN checking is not performed.

      Any argument of 10 or greater provides these checks.

NOTE: Do not use -safety when compiling a code in which one file has the extension .fcm and another has the extension .f or is an object file compiled by the f77 compiler. In this case, -safety reports the following:

    *** RTS-FATAL-NUMARGS: The numbers of actual and dummy arguments differ.


2.2.6 Debugging
----------------

There are two compiler switches that produce information for the Prism debugger, -g and -cmdebug. The -g switch is recommended for most situations. (NOTE: Either of these switches or the -cmprofile switch links in the Prism library.)

The difference between the two switches is in their ability to suppress certain optimizations the compiler ordinarily performs. The compiler usually fuses multiple statements together, which increases execution speed but makes it difficult to relate debugger output to program source. The -g switch suppresses the fusing of source code statements, which makes debugging easier, although it degrades program performance significantly. The -cmdebug switch permits statements to be fused; this switch should be used only when you want to debug fully optimized code (a difficult task).

If you suspect a program error that the Prism debugger is unable to locate, try compiling with both -g and the run-time safety switch, -safety=10. Since run-time safety checking degrades performance even further, programmers usually avoid using it on the unoptimized code produced by -g. However, the combination of switches is sometimes useful in finding especially subtle bugs.

Using -cmdebug causes all front-end scalars to have memory homes. This can result in a slight degradation in front-end code performance. It also inhibits some dead code removal which may, in some circumstances, increase the number of code blocks. Overall, the overhead is around 1-2% on heavily parallel code.
The -g switch, on the other hand, works by putting each statement in its own parallel code block. This can result in a factor of 2 (or more) decrease in performance.


2.2.7 Performance Analysis
---------------------------

To produce information for performance analysis under the Prism development environment, use the -cmprofile switch. If you use this switch, include it during both compilation and linking.

By default, the -cmprofile switch activates the -cmdebug switch, causing Prism to analyze performance on a block-by-block basis. See Section 4.3 for more information about using Prism.


2.2.8 Controlling Axis Sequence (CM-5 only)
--------------------------------------------

The compiler transforms all array section code into one or more DO loops (collapsing the loops on more than one array dimension where possible). The dimension whose element indices vary fastest in local memory becomes the innermost loop. This loop is vectorized on a CM-5 with vector units; the vector units process eight "data objects" for each iteration of a loop on the vector axis.

CM Fortran provides the compiler switch -[no]axisreorder (available on the CM-5 only), which allows you to control which array dimension varies fastest in memory. The negative form -noaxisreorder causes the compiler to lay out the dimensions in the order you declare them, with the leftmost axis the fastest varying.

Consider arrays with two serial axes:

      REAL A(10,20,100), B(10,20,100)
CMF$  LAYOUT A(:SERIAL,:SERIAL,), B(:SERIAL,:SERIAL,)
      ...
      A(1,:,:) = B(2,:,:)

With -noaxisreorder, the middle axis has stride 10 in local memory. The double loop that implements this assignment vectorizes along the middle axis.

In previous releases, the compiler always laid out axes in memory in the following way, reordering them from the declared order if necessary:

o  All :NEWS axes varied faster than any :SERIAL axis. (This guaranteed that a section formed by scalar indexing into all serial axes would be a contiguous block of memory.)

o  The :NEWS axis declared rightmost varied faster than other :NEWS axes, the next rightmost :NEWS axis varied next fastest, and so on.

o  The :SERIAL axis declared rightmost varied faster than other :SERIAL axes, the next rightmost :SERIAL axis varied next fastest, and so on.

For back-compatibility, this axis reordering is still the default. That is, the switch defaults to the positive form -axisreorder, and therefore the vector axis is always a :NEWS axis.

The current release allows you to choose the axis order in memory. When you supply the negative form -noaxisreorder, array dimensions are laid out in column-major order, with the leftmost declared axis the fastest-varying.

--------------------------------------------------
NOTE

All CM Fortran program units of a program must be compiled with the same setting of -[no]axisreorder.
--------------------------------------------------

Note that -axisreorder does not affect purely serial arrays, that is, arrays that do not reside on the CM at all. For those arrays, axes are ordered from left to right, regardless of whether -[no]axisreorder is used.


2.2.9 Controlling Vector-Length Padding (CM-5 with VUs only)
-------------------------------------------------------------

The -[no]padding switch determines whether subgrids (the on-processor portion of arrays) are padded to a multiple of vector length under the CM-5 vector units execution model. Removing the padding constraint gives the user added control over which data elements are co-resident in local memory. This switch is supported on the CM-5 only.

When allocating memory for CM arrays, the CM run-time system always allocates the same memory locations in every processing element (node or vector unit, depending on the execution model). As a result, the size of a virtual grid (the block of distributed memory onto which a user array is mapped) is always an integer multiple of CMF_NUMBER_OF_PROCESSORS().
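
To make this sizing rule concrete, the Python sketch below rounds an element count up to the next multiple of the processor count. It is an arithmetic illustration only: the function name round_up_to_multiple and the processor count of 32 are hypothetical, and the sketch assumes that being an integer multiple of CMF_NUMBER_OF_PROCESSORS() is the only constraint on grid size, which the run-time system may not guarantee.

```python
def round_up_to_multiple(n_elements, multiple):
    """Smallest integer >= n_elements that is a multiple of `multiple`."""
    if n_elements < 0 or multiple <= 0:
        raise ValueError("need n_elements >= 0 and multiple > 0")
    # Ceiling division written with integer arithmetic only.
    return -(-n_elements // multiple) * multiple


# Hypothetical partition with 32 processing elements.
n_procs = 32

# A 100 x 30 array has 3000 parallel elements; 3000 is not a multiple of 32.
elements = 100 * 30
grid = round_up_to_multiple(elements, n_procs)

# Grid size, then the number of unused (padded) locations.
print(grid, grid - elements)   # 3008 8
```

Where the vector-length constraint discussed in the rest of this section applies, the same helper would be called with a larger multiple (the processor count times the vector length) under the same assumption.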

If the product of an array's non-serial dimensions is not a legal size for a virtual grid, the system allocates the next legal size (and later masks out the unused memory locations). The system then lays out the array's parallel dimensions within the virtual grid so as to minimize the amount of this global padding.

In previous versions (and by default in the current version), the run-time system in two execution models met a second constraint in determining the grid size required for parallel dimensions: the on-processor subgrid had to be a multiple of the length of a vector register. As a result, machine geometry sizes (excluding serial dimensions) were integer multiples of:

    number-of-processors x 8     CM-5 vector units model
    number-of-processors x 4     CM-2 or CM-200 slicewise model

The CM-5 SPARC nodes model, lacking vector registers, does not generate this vector-length padding.

After the system determines the grid size and layout needed to accommodate the array's parallel dimensions, it extends the per-processor subgrid by the product of the array's serial dimensions. Since serial dimensions are added after the padding has been determined, they are never padded.

Version 2.1 provides a switch, -nopadding, by which you can specify that the run-time system not pad virtual grids to a multiple of vector length. (Global padding is not affected by the switch; you can avoid it by declaring arrays that are evenly divisible by machine size.)

For back-compatibility, padding is enabled by default. Supply -nopadding explicitly to avoid vector-length padding.

--------------------------------------------------
NOTE

All CM Fortran program units of a program must be compiled with the same setting of -[no]padding.
--------------------------------------------------


2.2.10 Optimization
--------------------

The global optimizer (-O switch) optimizes intermediate code, such as by performing dead-code elimination and copy propagation (along with common subexpression elimination and some code motion).
Copy propagation is one pass; useless assignment elimination is a second pass.

Dead-Code Elimination

As an example of dead-code elimination, consider this subroutine:

      SUBROUTINE FOO(C,A,B,N)
      IMPLICIT NONE
      INTEGER N
      REAL*8 C(N), A(N), B(N)
      REAL*8 TMP(N)
      TMP = A + B
      C = TMP
      RETURN
      END

The compiler generates 2 loads and 1 store for this subroutine: the loads of A and B and the store to C. The assignment to the temporary TMP is eliminated.

Copy Propagation

As an example of copy propagation, consider these assignments:

      A = A + 1
      A = A + 1
      A = A + 1

If we copy propagate this, we get

      A = A + 1
      A = (A + 1) + 1            ! one copy prop
      A = ((A+1) + 1) + 1        ! two copy props

If we now evaluate this, assuming A is 0 on entry, we get 3 from the first set of statements and 6 from the second, so this propagation is not valid. If, however, we have

      B = A + 1
      B = B + 1
      A = B + 1

we can copy propagate to

      B = A + 1
      B = (A + 1) + 1
      A = ((A + 1) + 1) + 1

and now the two stores to B can be killed, and we get A = 3 when we execute it.

In essence, you can't copy past the point at which any quantity on the right-hand side is modified. In the second example, the statement B = B + 1 satisfies this condition because we can replace B with A + 1 and still have a correct program with no dead code removed.


2.3 COMPILER SWITCHES
----------------------

This section describes the switches accepted by the CM Fortran compiler. Switches are applied from left to right; if the same switch is specified several times on a command line, the last setting or value is used. Abbreviations are permitted provided that sufficient characters are used to name the switch unambiguously.

The following switches are accepted by the cmf command and passed to the CM Fortran compiler, to another compiler, or to the linker. The cmf command may conceivably fail if given options that only a foreign compiler accepts. To prevent this happening, use the -Z switch to pass the desired switches.

-c
    Suppresses the linker. Produces an object file for each source file.

-dryrun
    Shows the commands that cmf builds, but does not execute them.

-Dname | -Dname=def
    Defines the symbol name for use by the C language preprocessor cpp. The first form sets the value of name to 1; the second form sets its value to def. (This option has the same effect as the C preprocessor directive #define.)

-g
    Produces additional symbol table information for use by the Prism debugger and suppresses the optimization of fusing source code lines into code blocks. This allows the debugger to step through source code line by line. This switch may drastically slow performance. Please refer to -cmdebug for more information.

-h
    Displays a summary of the options on standard output. If this option is used, other options and arguments are ignored.

-Ldirectory
    Adds directory to the list of directories containing object-library routines (for linking using ld, cmld, or cmmd-ld).

-llib
    Adds lib to the list of libraries to be searched during linking. All user libraries are searched before any system libraries; user libraries are searched in the order in which they appear on the command line. The arguments are passed to the cmld linker on the CM-5, the ld linker on the CM-2 or CM-200, or the cmmd-ld linker (for nodal CM Fortran programs), which searches the standard path (/usr, /usr/lib, /usr/local/lib) for a library with the name lib to add to the list (the linker adds a lib prefix and a .a suffix).

    Using a library filename libraryname.a on the cmf command line adds the library file to the list to be searched. This is used for libraries that are not in the standard search path.

-O
    Performs optimization during compilation. (This takes effect even if the -g switch is also specified.)

-o output
    Names the executable file produced by the linker output rather than the default name a.out. Note the space between -o and output.

-pg
    Produces profiling code. A run-time routine accumulates counts and a gmon.out file is produced. The linker searches profiling libraries instead of the standard libraries.
    An execution profile can then be generated by using gprof.

-q
    Suppresses messages describing the progress of the compilation or link.

-S
    Produces an assembly language representation of each source file.

-v
    Shows each command built by cmf before executing it.

-version
    Prints compiler and driver versions.

-Zcmp "switches"
    Passes a list of switches to the compiler or linker specified by cmp. The name cmp must immediately follow the -Z switch, but must be separated by a space from the switches. The switches should be enclosed in quotes (") and should appear as they would if passed directly.

-cm2
    Compiles for a CM-2 system. This switch may not be necessary if the compiler driver is configured so that the CM-2 is the default target.

-cm200
    Compiles for a CM-200 system. This switch may not be necessary if the compiler driver is configured so that the CM-200 is the default target.

-cm5
    Compiles for a CM-5 system. This switch may not be necessary if the compiler driver is configured so that the CM-5 is the default target.

-cmsim
    Compiles for the CM Fortran simulator. This switch may not be necessary if the compiler driver is configured so that the simulator is the default target.

-[no]axisreorder        default: -axisreorder
    The positive form specifies that axes are reordered such that all NEWS axes vary faster than all serial axes, the rightmost NEWS axis varies faster than other NEWS axes, and the rightmost serial axis varies faster than other serial axes. The negative form specifies that array axes are laid out in memory in the order declared, with the leftmost axis varying fastest. The default positive form is back-compatible with previous releases.

    This switch is available only on CM-5 systems. All CM Fortran program units of a program must be compiled with the same setting of -[no]axisreorder.

-[no]cmdebug        default: -nocmdebug
    Produces additional symbol table information for use by the Prism debugger.
    When used together with the -slicewise switch but without the -g switch, several parallel assignment statements may happen during one single step command issued to the debugger. This switch should be used in conjunction with the -g switch if single stepping on each parallel assignment is desired.

-[no]cmprofile    default: -nocmprofile
    Produces information needed for performance analysis under the Prism development environment. If used, this switch must be used during both compilation and linking. By default, the -cmprofile switch activates the -cmdebug switch, causing Prism to analyze performance on a block-by-block basis (with source code lines fused together). To analyze a program on a line-by-line basis, relating performance to individual source code lines, specify the -g switch along with -cmprofile. The switch -g suppresses the optimization of fusing source code lines into code blocks; note that this action makes the code execute artificially slowly.

-comphost compiler
    Requests that the linker treat the host program (identified by the -host switch) as if it had been compiled by compiler. Currently, compiler can be either cc or f77. This switch is available only on CM-5 systems with vector units, and the -node switch must also be specified.

-continuations=number    default: -continuations=19
    Specifies the maximum number of continuation lines in a Fortran statement. The number must be a non-negative integer value; by default, a Fortran statement can be continued on up to 19 lines.

-[no]common_initialized    default: -nocommon_initialized
    Allocates a copy of a CM array in common on the front end. The copy is necessary only when a CM array in common is initialized via block data. (A compiler directive is also provided. Refer to the CM Fortran Language Reference Manual for information on directives.)

-[no]cross_reference    default: -nocross_reference
    When used together with the -list switch, causes the listing file to include information that relates line labels and names (symbols) to source code lines.
    Add the switch -show_include to cause this information to be generated for include files also. If the -list switch is not specified, the -cross_reference switch is ignored.

-[no]d_lines    default: -nod_lines
    Specifies whether lines beginning with a D in column 1 are to be compiled as ordinary statements, as if the D were not present, or treated as comments. This option allows "debug" statements to be incorporated into a program unit for testing; the statements can in effect be removed without changing the source code.

-[no]directive    default: -directive
    Indicates whether comment lines that have the syntax of compiler directives should be processed as directives, or regarded as comments. (Refer to the CM Fortran Language Reference Manual for information on directives.)

-[no]double_precision    default: -nodouble_precision
    Indicates whether to interpret the type specification REAL as double-precision data or single-precision data.

-[no]extend_source    default: -noextend_source
    Extends the source line length to 132 characters (from 72 characters).

-[no]fecommon    default: -nofecommon
    Specifies the default allocation of common arrays. Ordinarily, the CM Fortran compiler allocates common arrays (other than character arrays) in CM memory. This switch changes the default allocation of common arrays to be in front-end memory. (Compiler directives may force arrays to be allocated in front-end memory or CM memory. Refer to the CM Fortran Language Reference Manual.) Programs that use only Fortran 77 language features are required to set this switch for correct execution. A common array that has been allocated in front-end memory cannot be used in an array operation. Thus, some programs that compile correctly in the absence of -fecommon may fail if the option is set.

-host filename
    Identifies filename as being part of the host program, for the host/node programming model. The linker links the host program object files as specified by the switch -comphost. The -host switch must be repeated for every host file.
    This switch is available only on CM-5 systems with vector units, and the -node switch must also be specified.

-[no]implicit_none    default: -noimplicit_none
    Inhibits implicit typing of variable, constant, and function names appearing in the program. Otherwise, undeclared names beginning with the letters I through N are assumed to be INTEGER, and other names are assumed to be REAL. The effect of this option is to require explicit declaration of all names. This option overrides IMPLICIT statements appearing in the program unit.

-[no]list    default: -nolist
    Produces a source file listing named filename.lis. The listing contains the source, information about variables used, and information about the communication routines generated from the source.

-local filename
    Links filename as a local subprogram to a global CM Fortran program. The -local switch must be repeated for every local file. See the CM Fortran Libraries Reference Manual for full information about compiling and linking global/local programs. This switch requires -vu; it is incompatible with -sparc.

-node
    Links for the CM-5 nodal execution model. Use this switch only for programs that run separately on each node and communicate via CMMD calls. See Section 3.2 for information about compiling and linking nodal programs. This switch requires -vu; it is incompatible with -sparc.

-[no]padding    default: -padding
    On CM-5 systems with vector units, specifies whether an array's per-processor subgrid (excluding serial dimensions) needs to be an integer multiple of 8 (the vector length). If so, the subgrid may be padded with garbage elements to reach a legal size. The switch has no effect under the CM-5 SPARC nodes model (which does not add vector-length padding); it is not supported on the CM-2 or CM-200. The default positive form is back-compatible with previous releases. All CM Fortran program units of a program must be compiled with the same setting of -[no]padding.
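
The size rule described for -padding amounts to rounding a subgrid size up to the next integer multiple of the vector length. The following is only an illustrative sketch in Python, not part of the compiler: the function name padded_subgrid_size is hypothetical, and the sketch assumes the rule applies to the total element count of the per-processor subgrid. How the compiler actually distributes padding across axes is not described in this chapter.

```python
VECTOR_LENGTH = 8  # vector length stated in the -padding description


def padded_subgrid_size(elements, padding=True):
    """Return a subgrid element count after optional vector-length padding.

    Hypothetical helper: models -padding (padding=True) and -nopadding
    (padding=False) under the assumption stated in the lead-in.
    """
    if elements < 0:
        raise ValueError("subgrid size must be non-negative")
    if not padding:
        return elements
    # Round up to the next integer multiple of the vector length.
    return -(-elements // VECTOR_LENGTH) * VECTOR_LENGTH


if __name__ == "__main__":
    for n in (5, 8, 13):
        padded = padded_subgrid_size(n)
        # The difference is the number of garbage elements added.
        print(n, padded, padded - n)
```

Under this assumption, a subgrid whose size is already a multiple of 8 is left unchanged, and any other size gains between 1 and 7 garbage elements.
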
-[no]pecode    default: -nopecode
    Produces an assembly language representation of the parallel part of each CM-2 or CM-200 source file. The output file will have an extension _peac.peac.

-safety=number    default: -safety=0
    Causes code to be generated that performs some run-time checking. At level safety=0 (zero), no run-time checking is performed. At level safety=1, the generated code detects some mismatches between actual and dummy arguments at run time. In particular, it checks that the number of actual arguments equals the number of dummy arguments, the home of each actual argument is the same as the home of the dummy argument, and the rank of each CM actual array argument is the same as the dummy argument. The generated code currently does not check the shape or layout of CM arrays. At level safety=10, all run-time safety checks are enabled. Currently these are: argument checking, use of uninitialized CM arrays, and out-of-bounds vector-valued subscripts. This switch is implemented under CM-5 and under CM-2/200 slicewise. All CM Fortran program units of a program must be compiled with the same setting of -safety.

-[no]show_include    default: -noshow_include
    Causes the listing file to contain text from INCLUDE statements in the source code. This option has no effect if the -list option is not specified.

-[no]slicewise
    Compiles and links for the slicewise execution model. The output module can execute only on CM-2 or CM-200 systems with the 64-bit floating point accelerator.

-sparc
    Compiles and links for the CM-5 nodes only (SPARC) execution model. The output module can execute on any CM-5 system.

-ucode microcode-version
    Links with CM-2 or CM-200 microcode-version libraries instead of the default. Note the space between the switch and microcode-version.

-[no]veccode    default: -noveccode
    Produces an assembly language representation of the parallel part of each source file for the CM-5 (vector unit model only). The output file will have an extension .pe.dp.

-[no]vu
    Compiles and links for the CM-5 vector unit execution model.
    The output module can execute only on CM-5 systems with vector units.

-[no]warning    default: -warning
    The negative form suppresses warning messages during compilation. When enabled (the default), these warnings are written to the listing file if the -list option is also specified; otherwise, they are written to standard output.

*****************************************************************

The information in this document is subject to change without notice and should not be construed as a commitment by Thinking Machines Corporation. Thinking Machines reserves the right to make changes to any product described herein.

Although the information in this document has been reviewed and is believed to be reliable, Thinking Machines Corporation assumes no liability for errors in this document. Thinking Machines does not assume any liability arising from the application or use of any information or product described herein.

*****************************************************************

Connection Machine (r) is a registered trademark of Thinking Machines Corporation.
CM, CM-2, CM-200, CM-5, CM-5 Scale 3, and DataVault are trademarks of Thinking Machines Corporation.
CMOST, CMAX, and Prism are trademarks of Thinking Machines Corporation.
C* (r) is a registered trademark of Thinking Machines Corporation.
Paris, *Lisp, and CM Fortran are trademarks of Thinking Machines Corporation.
CMMD, CMSSL, and CMX11 are trademarks of Thinking Machines Corporation.
CMview is a trademark of Thinking Machines Corporation.
Scalable Computing (SC) is a trademark of Thinking Machines Corporation.
Scalable Disk Array (SDA) is a trademark of Thinking Machines Corporation.
Thinking Machines (r) is a registered trademark of Thinking Machines Corporation.
SPARC and SPARCstation are trademarks of SPARC International, Inc.
Sun, Sun-4, SunOS, Sun FORTRAN, and Sun Workstation are trademarks of Sun Microsystems, Inc.
UNIX is a trademark of UNIX System Laboratories, Inc.
The X Window System is a trademark of the Massachusetts Institute of Technology. Copyright (c) 1991-1994 by Thinking Machines Corporation. All rights reserved. This file contains documentation produced by Thinking Machines Corporation. Unauthorized duplication of this documentation is prohibited. Thinking Machines Corporation 245 First Street Cambridge, Massachusetts 02142-1264 (617) 234-1000