C* PROGRAMMING GUIDE
May 1993
Copyright (c) 1990-1993 Thinking Machines Corporation.

CHAPTER 14: GENERAL COMMUNICATION
*********************************

The C* communication functions we have discussed so far have required that the source and destination parallel variables be of the current shape (except for global, where the destination is a scalar variable), and that the communication be in regular patterns--that is, all elements transfer their values the same number of positions in the same direction. In this chapter, we introduce functions that allow communication in which:

o  One of the parallel variables need not be of the current shape, and

o  The communication need not be in a regular pattern.

The get and send functions described in this chapter provide communication comparable to that offered by parallel left indexing; see Chapter 10. The read_from_position function described in this chapter provides communication comparable to that offered by assigning a scalar-indexed parallel variable to a scalar variable; write_to_position is comparable to assigning a scalar variable to a scalar-indexed parallel variable. The read_from_pvar function reads data from a parallel variable into a scalar array; write_to_pvar writes data from a scalar array to a parallel variable.

Include the header file <cscomm.h> when calling any of the functions discussed in this chapter.

14.1 THE MAKE_SEND_ADDRESS FUNCTION
------------------------------------

Grid communication requires knowing the coordinates of parallel variable elements in the shape. General communication requires more information. Specifically, you must supply a send address for a parallel variable element's position. This send address, along with a position's shape, uniquely identifies a position among all positions in all shapes; thus, you can use this address when an element of the current shape is communicating with an element that is of a different shape.
Use the make_send_address function to obtain a send address for one or more positions. make_send_address is an overloaded function that has different versions depending on these conditions:

o  Whether you want to return a single address or multiple addresses. Multiple addresses are returned as a parallel variable of the current shape.

o  Whether you specify axis coordinates for the position in a stdargs list or in an array. The choice is the same as that for the allocate_shape function, which we discussed in Section 9.3. If you know the rank of the position's shape, it is easier to use the stdargs version. If the rank will not be known until run time, you must use an array.

14.1.1 Obtaining a Single Send Address
---------------------------------------

To obtain a send address for a single position, use make_send_address with one of these formats:

     CMC_sendaddr_t make_send_address (
          shape s,
          int axis_0_coord, ...);

or:

     CMC_sendaddr_t make_send_address (
          shape s,
          int axes[]);

where:

s is the shape to which the position whose address you are obtaining belongs.

axis_0_coord (in the first version) specifies the position's coordinate along axis 0. Specify as many coordinates as there are axes in the shape.

axes[] (in the second version) is an array that contains the position's coordinates.

The function returns a scalar value (of type CMC_sendaddr_t) that is the send address of the position. This address is returned even if the position is inactive. Note that the shape you specify in the parameter list need not be the current shape.
An Example
----------

The code below calculates the send address of position [77][44] in shape image and assigns this address to the scalar variable addr:

     CMC_sendaddr_t addr;

     addr = make_send_address(image, 77, 44);

14.1.2 Obtaining Multiple Send Addresses
-----------------------------------------

To obtain send addresses for more than one position, use make_send_address with one of these formats:

     CMC_sendaddr_t:current make_send_address (
          shape s,
          int:current axis_0_coord, ...);

or:

     CMC_sendaddr_t:current make_send_address (
          shape s,
          int:current axes[]);

These formats are the same as the ones shown in Section 14.1.1, except that the axis_n_coord arguments take parallel ints of the current shape, and the function returns a parallel variable of the current shape.

The value in each element of the parallel variable you specify for an axis of shape s represents a coordinate along that axis. The corresponding elements of the parallel variables that represent all the axes of the shape therefore fully specify a position in shape s. The function returns the send address for each position specified in this way. These send addresses are returned as the values of elements of a parallel variable that is of the current shape. For example, if you specify p1 as the axis argument for a 1-dimensional shape s, and [0]p1 contains the value 4, then the send address of position [4] of shape s is returned in element [0] of a parallel variable of the current shape.

You cannot mix scalar values and parallel values in the argument list. If you want to use a scalar value (for example, because you want only the send addresses of positions whose coordinate for axis 1 is 3), either:

o  Use a separate assignment statement to assign 3 to a parallel variable; or

o  Use a cast in the argument list to explicitly promote 3 to a parallel value.

When Positions Are Inactive
---------------------------

If a position in the current shape is inactive, that position does not participate in the operation.
In other words, the function does not return the send address specified by that position's parallel variable elements. If elements specify a position in shape s that is inactive, however, the send address for that position is returned.

An Example
----------

Figure 90 shows an example of make_send_address, using parallel variables of the 1-dimensional shape t to map parallel variables of the 2-dimensional shape s.

[ Figure Omitted ]

Figure 90. An example of the make_send_address function.

Note these points in Figure 90:

o  Two elements contain the same send address; this is legal.

o  Position [2] is inactive; therefore, element [2] of address does not obtain the send address specified by the values in [2]axis_0 and [2]axis_1.

The values of the elements that specify coordinates for an axis must be within the range of these coordinates. If, for example, shape s has 256 positions along axis 0, an element of axis_0 cannot have a value greater than 255.

14.2 GETTING PARALLEL DATA: THE GET FUNCTION
---------------------------------------------

Use the get function to get values from a parallel variable when grid communication is not possible--that is, when communicating between shapes, or when the communication is not in a regular pattern. The get function is overloaded for both arithmetic and aggregate types.

14.2.1 Getting Parallel Variables
----------------------------------

The get function has this definition when used with arithmetic types:

     type:current get (
          CMC_sendaddr_t:current send_address,
          type:void *sourcep,
          CMC_collision_mode_t collision_mode);

where:

send_address is a parallel variable of the current shape. The parallel variable contains send addresses for positions in a shape that need not be the current shape; see Section 14.1. The positions must, however, be in the same shape as the parallel variable pointed to by sourcep.

sourcep is a scalar pointer to a parallel variable (of any shape) from which values are to be returned.
The parallel variable pointed to by send_address specifies which values are to be returned and where they are to be assigned.

collision_mode specifies the behavior if more than one destination parallel variable element tries to get from the same element of the source parallel variable. Possible values are CMC_collisions, CMC_no_collisions, CMC_few_collisions, and CMC_many_collisions. See "Collisions in Get Operations," below.

The get function returns a parallel variable of the current shape. It has the same arithmetic type as the parallel variable pointed to by sourcep, and it contains the values of the parallel variable pointed to by sourcep in the positions specified by send_address.

The get function works like a get operation using a parallel left index; see Chapter 10. A destination parallel variable obtains values of the source parallel variable, using the parallel variable send_address as an index. Thus, given this code:

     #include <cscomm.h>

     shape [65536]ShapeA;
     shape [512][128]ShapeB;
     int:ShapeA axis_0, axis_1, dest;
     int:ShapeB source;

these two code fragments have the same results:

     with (ShapeA) {
          CMC_sendaddr_t:ShapeA address;
          address = make_send_address(ShapeB, axis_0, axis_1);
          dest = get(address, &source, CMC_collisions);
     }

and:

     with (ShapeA)
          dest = [axis_0][axis_1]source;

The get function is more general, however:

o  You can use get even if the rank of the shape from which you want to get values is not known until run time. Parallel left indexing requires that you know the rank of the shape when you write the program.

o  The get function lets you control how collisions are handled; see below.

o  The get function also lets you get parallel arrays. See Section 14.2.2, below.

If there are inactive positions in ShapeA in the first example above, elements of dest at these positions do not get values from source.
The status of the positions in ShapeB does not matter; the active elements of dest get the values from the positions for which address has send addresses, whether or not these positions are active. Once again, this behavior is the same as that for get operations with parallel left indexing.

Collisions in Get Operations
----------------------------

The collisions we have talked about previously occur when two elements try to send to the same element at the same time. Get operations also have collisions, however; these occur when more than one parallel variable element tries to get a value from the same element at the same time. Unlike send collisions, get collisions are permitted in C*; they are handled automatically by get operations in the language. The get function's collision_mode argument, however, gives you some control over how collisions are handled.

We recommend using the CMC_collisions option of collision_mode for most applications. This is the method used by get operations in the language itself. The other options may be useful in special circumstances:

o  If there is no possibility of collisions, you can specify CMC_no_collisions; currently, this option uses the same code as CMC_collisions. However, future implementations of the get function may increase the performance of CMC_no_collisions.

o  CMC_many_collisions and CMC_few_collisions can be useful if your application is memory-intensive and risks running out of storage. (You can determine this if, for example, your program doesn't run with a certain number of physical processors, but does run with a larger number of processors.) CMC_collisions requires memory for two aspects of its operation: to store the paths it takes in doing gets for each position, and to store colliding addresses. If it runs out of memory, it switches over and tries the algorithm used by CMC_many_collisions, which is slower but requires less memory.
Under these circumstances, the operation would be faster if you specified CMC_many_collisions to begin with, thus avoiding the time spent trying the CMC_collisions algorithm. If CMC_collisions takes a long time due to memory limitations and the get has few collisions, CMC_few_collisions may be faster. In this case, the get operation iterates separately over each collision, saving the memory required to store the colliding addresses.

14.2.2 Getting Parallel Data of Any Length
-------------------------------------------

You can also use the get function to obtain values from parallel locations of any length--typically, parallel structures or parallel arrays. This version of the get function has this definition:

     void get (
          void:current *destp,
          CMC_sendaddr_t:current *send_addressp,
          void:void *sourcep,
          CMC_collision_mode_t collision_mode,
          int length);

where:

destp is a scalar pointer to a parallel location of the current shape. This location obtains values from sourcep, based on the index in the parallel variable pointed to by send_addressp.

send_addressp is a scalar pointer to a parallel variable of the current shape. The parallel variable contains send addresses for positions in a shape that need not be the current shape; see Section 14.1.

sourcep is a scalar pointer to a parallel location; it need not be of the current shape. The parallel variable pointed to by send_addressp specifies the positions of this location from which data is to be gotten.

collision_mode specifies what to do if more than one destination parallel variable element tries to get from the same element of the source parallel variable. Possible values are CMC_collisions, CMC_no_collisions, CMC_few_collisions, and CMC_many_collisions. See "Collisions in Get Operations," above.

length specifies the length in bools of the parallel location pointed to by sourcep.
This version of the get function lets you obtain data that is larger than the standard data types; typically, this data would be in a parallel structure or parallel array. For example:

     #include <cscomm.h>

     shape [65536]ShapeA;
     shape [512][128]ShapeB;
     struct S {
          int a;
          int b;
     };
     int:ShapeA axis_0, axis_1;
     struct S:ShapeA dest_struct;
     struct S:ShapeB source_struct;

     main()
     {
          with (ShapeA) {
               CMC_sendaddr_t:ShapeA address;
               address = make_send_address(ShapeB, axis_0, axis_1);
               get(&dest_struct, &address, &source_struct,
                    CMC_collisions, boolsizeof(source_struct));
          }
     }

dest_struct, of shape ShapeA, gets data from individual positions of the structure source_struct, of shape ShapeB, based on the send addresses stored in address. Note the use of the intrinsic function boolsizeof to obtain the length, in bools, of source_struct.

14.3 SENDING PARALLEL DATA: THE SEND FUNCTION
----------------------------------------------

Use the send function to send parallel data when grid communication is not possible--that is, when communicating between shapes, or when the communication is not in a regular pattern. The send function is overloaded for both arithmetic and aggregate types.

14.3.1 Sending Parallel Variables
----------------------------------

The send function has this definition when used with arithmetic types:

     type:current send (
          type:void *destp,
          CMC_sendaddr_t:current send_address,
          type:current source,
          CMC_combiner_t combiner,
          bool:void *notifyp);

where:

destp is a scalar pointer to a parallel variable to which values are to be sent. It can be of any arithmetic type and any shape.

send_address is a parallel variable of the current shape. The parallel variable contains send addresses for positions in the shape of the parallel variable pointed to by destp. This shape need not be the current shape; see Section 14.1.

source is a parallel variable from which values are to be sent. It must be of the current shape, and it must have the same type as the parallel variable pointed to by destp.
combiner specifies how send is to handle collisions. Possible values are CMC_combiner_max, CMC_combiner_min, CMC_combiner_add, CMC_combiner_logior, CMC_combiner_logxor, CMC_combiner_logand, and CMC_combiner_overwrite. All of these are defined in Section 13.1 except CMC_combiner_overwrite. If you specify CMC_combiner_overwrite and more than one value is sent to a parallel variable element, one of the values is chosen arbitrarily and stored in the element, and the rest of the values are discarded.

notifyp is a scalar pointer to a bool-sized parallel variable of the same shape as the parallel variable pointed to by destp. When an element of the destp parallel variable receives a value, the corresponding element of the parallel variable pointed to by notifyp is set to 1; other elements are set to 0. If you do not want to use a notify bit, specify CMC_no_field for this argument.

send returns the source.

Using the send function is roughly equivalent to performing a send operation with parallel left indexing; see Chapter 10. The source parallel variable sends values to the destp parallel variable, using send_address as an index. The combiners are equivalent to reduction assignment operators. CMC_combiner_overwrite has the same effect as the = operator, when the parallel right-hand side is cast to the type of the scalar left-hand side. There are some differences, however, between the send function and send operations with parallel left indexing:

o  The send function can be used when the rank of the shape of the destination parallel variable is not known until run time.

o  The send function lets you include a notify bit, which provides notification that a value has been received by an element of the destination parallel variable.

o  There is not a complete correspondence between the combiners and the reduction assignment operators. For example, there is no combiner that is equivalent to the -= reduction assignment operator.
o  The send function has an overloaded version that lets you send parallel arrays; see Section 14.3.2, below.

Inactive Positions
------------------

Inactive positions are treated in the same way they are treated by send operations with parallel left indexes:

o  An element in an inactive position in the current shape does not send a value.

o  Destination parallel variable elements receive values even if they are in inactive positions. In addition, the notify bit can be set even in an inactive position.

An Example
----------

This code sends values from elements of source to elements of dest:

     #include <cscomm.h>

     shape [16384]ShapeA;
     shape [2][16384]ShapeB;
     int:ShapeA axis_0, axis_1, source;
     int:ShapeB dest;
     bool:ShapeB notify_bit;

     /* Code to initialize parallel variables omitted. */

     main()
     {
          with (ShapeA) {
               CMC_sendaddr_t:ShapeA address;
               address = make_send_address(ShapeB, axis_0, axis_1);
               where (source < 9)
                    send(&dest, address, source, CMC_combiner_min,
                         &notify_bit);
          }
     }

Some sample results are shown in Figure 91. The arrows show what happens to the value at [3]source, based on the send address in [3]address. Note these points in the results:

o  Position [2] of ShapeA is inactive; therefore, [2]source does not send its value.

o  The CMC_combiner_min combiner causes the 3 from [0]source, rather than the 5 from [1]source, to be sent to [1][0]dest.

o  The notify bit is set in the two positions that receive values.

[ Figure Omitted ]

Figure 91. An example of the send function.

14.3.2 Sending Parallel Data of Any Length
-------------------------------------------

You can also use the send function to send parallel data of any length--typically a parallel structure or parallel array. This version of the send function is defined as follows:

     void:current * send (
          void:void *destp,
          CMC_sendaddr_t:current *send_addressp,
          void:current *sourcep,
          int length,
          bool:void *notifyp);

where:

destp is a scalar pointer to a parallel location to which data is to be sent.
void:void specifies that destp points to a location that can be of any type and of any shape.

send_addressp is a scalar pointer to a parallel variable of the current shape. The parallel variable contains send addresses for positions in the shape of the parallel variable pointed to by destp.

sourcep is a scalar pointer to a parallel location from which data is to be sent. It must be of the current shape.

length specifies the length in bools of the location whose beginning is pointed to by sourcep.

notifyp is a scalar pointer to a bool-sized parallel variable of the same shape as the location pointed to by destp. When data is written to a position pointed to by destp, the corresponding element of the parallel variable pointed to by notifyp is set to 1. If you do not want to use a notify bit, specify CMC_no_field for this argument.

send returns a pointer to the source.

This version of the send function lets you send data that is larger than the standard data types; typically, this data would be in a parallel structure or parallel array. The data is sent from the source location to the destination location, using the parallel variable pointed to by send_addressp as an index to determine the destination.

Note that this version of send does not include a combiner argument. It uses the CMC_combiner_overwrite option and, if there would otherwise be a collision, arbitrarily chooses one of the colliding arrays or structures. For example:

     #include <cscomm.h>

     shape [65536]ShapeA;
     shape [512][128]ShapeB;
     struct S {
          int a;
          int b;
     };
     int:ShapeA axis_0, axis_1;
     struct S:ShapeA source_struct;
     struct S:ShapeB dest_struct;
     bool:ShapeB notify_bit;

     main()
     {
          with (ShapeA) {
               CMC_sendaddr_t:ShapeA address;
               address = make_send_address(ShapeB, axis_0, axis_1);
               send(&dest_struct, &address, &source_struct,
                    boolsizeof(source_struct), &notify_bit);
          }
     }

The values of individual positions of the parallel structure source_struct, of shape ShapeA, are sent to dest_struct, of shape ShapeB, based on the send addresses stored in address.
Note the use of the intrinsic function boolsizeof to obtain the length, in bools, of source_struct.

14.3.3 Sorting Elements by Their Ranks
---------------------------------------

You can use send, along with the make_send_address and rank functions, to reorder the elements of a parallel variable by the ranks of their values. Note that this is also possible with parallel left indexing, as described in Section 13.7.1. In the example below, we rearrange salary data for employees:

     #include <cscomm.h>

     shape [5]employees;
     struct employee {
          int id;
          int salary;
     };
     struct employee:employees staff;

     main()
     {
          /* Code to initialize salaries and ids omitted. */

          with (employees) {
               int:employees order;
               CMC_sendaddr_t:employees address;

               /* Determine ranks of salary values. */
               order = rank(staff.salary, 0, CMC_upward, CMC_none,
                    CMC_no_field);

               /* Create send addresses, using salary ranks as the
                  index. */
               address = make_send_address(employees, order);

               /* Send employee data for each employee to new
                  positions, based on the salary ranks. */
               send(&staff, &address, &staff, boolsizeof(staff),
                    CMC_no_field);
          }
     }

The code proceeds as follows:

1. It declares the shape, and declares and initializes the parallel structure. (The initialization of staff.salary and staff.id is omitted.)

2. It calls rank to return the ranks of the elements of staff.salary. The results are shown in Figure 92.

3. It calls make_send_address to return send addresses, using the salary ranks as the index. Upon return, [0]address contains the send address of position [1] of shape employees, [1]address contains the send address of position [0] of employees, and so on.

4. It then calls send to send the variables in the parallel structure to new positions, based on the send addresses. The result is that the values are rearranged as shown in Figure 93.

[ Figure Omitted ]

Figure 92. Using the rank function to rank elements of a parallel variable.

[ Figure Omitted ]

Figure 93.
Using make_send_address and send to reorder the elements of parallel variables by rank.

14.4 COMMUNICATING BETWEEN SCALAR AND PARALLEL VARIABLES
---------------------------------------------------------

This section discusses C* communication functions that provide general communication between scalar and parallel variables.

14.4.1 From a Parallel Variable to a Scalar Variable
-----------------------------------------------------

The read_from_position Function
-------------------------------

Use the read_from_position function to read a value from a parallel variable element (not necessarily of the current shape) and assign it to a scalar variable. This function is overloaded for any arithmetic type. The read_from_position function has this definition:

     type read_from_position (
          CMC_sendaddr_t send_address,
          type:void *sourcep);

where:

send_address is the send address of the position from which a value is to be read.

sourcep is a scalar pointer to the parallel variable from which a value is to be read; the parallel variable can be of any shape and any arithmetic type.

Before calling read_from_position (or as part of the read_from_position call), you must use the single-address version of make_send_address to obtain a send address; see Section 14.1. The read_from_position function uses this send address to specify the position, and it uses sourcep to specify the parallel variable. It returns the value obtained from the parallel variable element at that position. The value is returned even if the position is inactive.

Since read_from_position deals with a scalar value, it does not have to be called within the scope of a with statement, and the source parallel variable does not have to be of the current shape.

This function, in combination with make_send_address, produces the same result as assigning a scalar-indexed parallel variable to a scalar variable.
For example:

     scalar = [7]p1;

You can use read_from_position even when the rank of the shape is not known until run time, however.

The example below reads the value from element [16][4] of parallel variable p1, which is of shape image. It assigns the value to the scalar variable s1.

     #include <cscomm.h>

     shape [256][256]image;
     float:image p1;
     CMC_sendaddr_t address;
     float s1;

     main()
     {
          address = make_send_address(image, 16, 4);
          s1 = read_from_position(address, &p1);
     }

Note that the call to make_send_address can also be made from within read_from_position's argument list:

     s1 = read_from_position(make_send_address(image, 16, 4), &p1);

The read_from_pvar Function
---------------------------

Use the read_from_pvar function to read the values of the active elements of a parallel variable and assign them to a scalar array. This function is overloaded for any arithmetic type. It has this definition:

     void read_from_pvar (
          type *destp,
          type:current source);

where:

destp is a pointer to the buffer to which values are to be written.

source is a parallel variable of the current shape from which values are to be read.

Both source and the array pointed to by destp must have the same arithmetic type. The values in source are written into the specified scalar array. Values in inactive elements are not copied; array elements that correspond to inactive positions receive undefined values.

Typically, the scalar array will have the same number of elements and dimensions as the source parallel variable. It cannot have fewer elements than the source parallel variable.

This example copies the values in p1 to the scalar array scalar_array:

     #include <cscomm.h>

     shape [16384]ShapeA;
     int:ShapeA p1;
     int scalar_array[16384];

     main()
     {
          /* Initialization of p1 omitted. */

          with (ShapeA)
               read_from_pvar(scalar_array, p1);
     }

Note, however, that if the scalar array has more than one dimension, you must cast it to be a pointer to the element type of the array, so that the function knows where to put the data.
For example:

     #include <cscomm.h>

     shape [128][256]ShapeB;
     float:ShapeB q1;
     float scalar_array2[128][256];

     main()
     {
          /* Initialization of q1 omitted. */

          with (ShapeB)
               read_from_pvar((float *)scalar_array2, q1);
     }

Also, when there is more than one dimension involved, the data is transferred so that the highest-numbered parallel dimension is contiguous in scalar memory. In other words, the left indexes of the parallel variable match up with the right indexes of the scalar array.

Note for users of CM-5 C*: The CM-5 implementation also has a version of this function for parallel data of any length. It has this definition:

     void read_from_pvar (
          void *destp,
          void:current *sourcep,
          int length);

where destp is a pointer to the scalar array to which the values are to be written, sourcep is a pointer to the parallel data, and length is the length, in units of bools, of each data element pointed to by sourcep. Note that using this version of read_from_pvar with aggregate data may improve performance, but it will also make your program nonportable (because of its reliance on size, alignment, and structure field padding).

14.4.2 From a Scalar Variable to a Parallel Variable
-----------------------------------------------------

The write_to_position Function
-------------------------------

Use the write_to_position function to write a value from a scalar variable to a parallel variable element (not necessarily of the current shape). The write_to_position function has this definition:

     type write_to_position (
          CMC_sendaddr_t send_address,
          type:void *destp,
          type source);

where:

send_address is the send address of the position to which a value is to be written.

destp is a scalar pointer to the parallel variable to which a value is to be written; the parallel variable can be of any shape and any arithmetic type.

source is the scalar variable whose value is to be sent to the destination parallel variable element.
Both source and the parallel variable pointed to by destp must have the same arithmetic type. The function returns the value of source.

As with read_from_position, you must use the single-address version of make_send_address to obtain a send address; see Section 14.1. write_to_position uses this send address to specify the position, and it uses destp to specify the parallel variable. It sends the value in source to the element specified by these arguments. The value is written into this element even if the element's position is inactive.

write_to_position does not have to be called within the scope of a with statement, and the destination parallel variable does not have to be of the current shape.

This function, when used along with make_send_address, produces the same result as assigning a scalar variable to a scalar-indexed parallel variable. For example:

     [7]p1 = scalar;

You can use write_to_position even when the rank of the shape is not known until run time, however.

The example below reverses the example for read_from_position in the previous section. It assigns the value of the scalar variable s1 to element [16][4] of parallel variable p1, which is of shape image.

     #include <cscomm.h>

     shape [256][256]image;
     float:image p1;
     CMC_sendaddr_t address;
     float s1;

     main()
     {
          address = make_send_address(image, 16, 4);
          write_to_position(address, &p1, s1);
     }

The write_to_pvar Function
--------------------------

Use the write_to_pvar function to write data from a scalar array to a parallel variable of the current shape. The function is overloaded for any arithmetic type. It has this definition:

     type:current write_to_pvar (
          type *sourcep);

where sourcep is a pointer to a scalar array from which data is to be written.

The function returns a parallel variable of the current shape containing the values in the scalar array. If there are inactive positions in the shape at the time the function is called, the values in these inactive positions are not overwritten.
The scalar array typically has the same number of elements and dimensions as the current shape; it cannot have fewer elements.

The example below reverses the example for read_from_pvar shown in the previous section. The array scalar_array writes its values to the parallel variable p1:

     #include <cscomm.h>

     shape [16384]ShapeA;
     int:ShapeA p1;
     int scalar_array[16384];

     main()
     {
          /* Initialization of scalar_array omitted. */

          with (ShapeA)
               p1 = write_to_pvar(scalar_array);
     }

Note once again, however, that if the scalar array has more than one dimension, you must cast it to be a pointer to the element type of the array, so that the function knows where to get the data. For example:

     #include <cscomm.h>

     shape [128][256]ShapeB;
     float:ShapeB q1;
     float scalar_array2[128][256];

     main()
     {
          /* Initialization of scalar_array2 omitted. */

          with (ShapeB)
               q1 = write_to_pvar((float *)scalar_array2);
     }

Also, when there is more than one dimension involved, the data is transferred so that values that are contiguous in scalar memory become the highest-numbered dimension of the parallel variable. In other words, the right indexes of the scalar array match up with the left indexes of the parallel variable.

Note for users of CM-5 C*: The CM-5 implementation also has a version of this function for parallel data of any length. It has this definition:

     void write_to_pvar (
          void:current *destp,
          void *sourcep,
          int length);

where destp is a pointer to the parallel data to which the values are to be written, sourcep is a pointer to the scalar array, and length is the length, in units of bools, of the data pointed to by destp. Note that using this version of write_to_pvar with aggregate data may improve performance, but it will make your program nonportable (because of its reliance on size, alignment, and structure field padding).
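The CM-5 any-length version of write_to_pvar can be illustrated with a sketch. This is only an illustration, not code from the guide: the shape, structure, and variable names here are our own, and it assumes (as the portability note warns) that the scalar array's element layout matches the parallel structure's size, alignment, and padding.

     #include <cscomm.h>

     shape [8192]ShapeA;
     struct point {
          int x;
          int y;
     };
     struct point:ShapeA points;        /* parallel structure */
     struct point scalar_points[8192];  /* matching scalar array */

     main()
     {
          /* Initialization of scalar_points omitted. */

          with (ShapeA) {
               /* Copy one whole structure per position from the
                  scalar array into the parallel structure; as with
                  the aggregate get and send, the length argument is
                  given in bools via boolsizeof. */
               write_to_pvar(&points, scalar_points,
                    boolsizeof(points));
          }
     }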
14.5 THE MAKE_MULTI_COORD AND COPY_MULTISPREAD FUNCTIONS
---------------------------------------------------------

As we mentioned in Section 13.8, the copy_multispread function is comparable to the copy_spread function, except that you use it on hyperplanes instead of scan classes. copy_multispread takes as one of its arguments a multicoordinate. The multicoordinate specifies which position of the parallel variable is to be spread through each hyperplane.

For example, in the discussion of multispread in Section 13.8, we saw that, if we allowed positions to differ along axes 0 and 1 while keeping axis 2 fixed, we created these two hyperplanes (for a 2-by-2-by-2 shape):

     [0][0][0]   [1][0][0]   [0][1][0]   [1][1][0]

and:

     [0][0][1]   [1][0][1]   [0][1][1]   [1][1][1]

Choosing an individual element in these hyperplanes requires that you specify only two of the three coordinates, since the third (the coordinate for axis 2) is fixed (it is [0] in the first hyperplane, [1] in the second). The multicoordinate specifies what the coordinates are along the axes that are not fixed. If the multicoordinate specifies [0] for axis 0 and [0] for axis 1, for example, then position [0][0][0] is chosen for the first hyperplane, and [0][0][1] is chosen for the second hyperplane.

To obtain this multicoordinate for a position, use the make_multi_coord function. You can then use the multicoordinate in the call to copy_multispread. The multicoordinate specifies the desired position in each hyperplane.

make_multi_coord is an overloaded function. It provides three different ways of specifying a position:

o By including the position's coordinates as arguments to the function.

o By specifying an array that contains these coordinates. Use this version if the shape's rank will not be known until run time.

o By specifying the position's send address.

The three versions of make_multi_coord have these definitions:

     CMC_multicoord_t make_multi_coord (
          shape s,
          unsigned int axis_mask,
          int axis_0_coord, ...
     );

or:

     CMC_multicoord_t make_multi_coord (
          shape s,
          unsigned int axis_mask,
          int axes[]);

or:

     CMC_multicoord_t make_multi_coord (
          shape s,
          unsigned int axis_mask,
          CMC_sendaddr_t send_address);

where:

s specifies the shape for which the multicoordinate is to be obtained.

axis_mask is a bit mask that specifies the axis or axes along which positions in a hyperplane are allowed to differ. The bit with value 1 corresponds to axis 0, the bit with value 2 to axis 1, and so on. For example, use a bit mask of 3 to specify axes 0 and 1; use 6 to specify axes 1 and 2; use 5 to specify axes 0 and 2.

axis_0_coord (in the first version) specifies the coordinates of a position in shape s along axis 0. Specify as many coordinates as there are axes in the shape.

axes[] (in the second version) is an array that contains the position's coordinates. Specify as many coordinates as there are axes in the shape.

send_address (in the third version) is the send address for a position in shape s. Any position will do.

In all versions, the function returns the multicoordinate for the specified position with the specified axis mask.

The definition of copy_multispread is:

     type:current copy_multispread (
          type:current *sourcep,
          unsigned int axis_mask,
          CMC_multicoord_t multi_coord);

where:

sourcep is a scalar pointer to a parallel variable from which values are to be copied. The parallel variable can be of any arithmetic type; it must be of the current shape.

axis_mask is a bit mask that specifies the axis or axes along which positions in a hyperplane are allowed to differ.

multi_coord specifies the coordinates that determine the elements of the source parallel variable from which values are to be copied.

The function copies the value from each specified element to each active position in that element's hyperplane. It returns a parallel variable containing these values; the parallel variable is of the current shape and has the same arithmetic type as source. Values are copied from the specified source elements even if those elements are inactive.
14.5.1 An Example
------------------

For example, given these declarations:

     #include <cscomm.h>

     CMC_sendaddr_t address;
     CMC_multicoord_t multi_coord;
     shape [128][128][128]ShapeA;
     int:ShapeA source, dest;

then:

     address = make_send_address(ShapeA, 0, 0, 1);

obtains the send address for position [0][0][1] in shape ShapeA and assigns it to the scalar variable address.

     multi_coord = make_multi_coord(ShapeA, 3, address);

obtains the multicoordinate for this position along axes 0 and 1 (specified by the value 3 for the axis_mask argument) and assigns it to multi_coord.

     with (ShapeA)
          dest = copy_multispread(&source, 3, multi_coord);

takes each element of parallel variable source specified by the axis mask (3) and the multicoordinate (multi_coord) and copies its value into the elements of parallel variable dest in the same hyperplane. In other words (for a 2-by-2-by-2 shape):

o The value in [0][0][0]source is assigned to [0][0][0]dest, [1][0][0]dest, [0][1][0]dest, and [1][1][0]dest.

o The value in [0][0][1]source is assigned to [0][0][1]dest, [1][0][1]dest, [0][1][1]dest, and [1][1][1]dest.

-----------------------------------------------------------------
Contents copyright (C) 1990-1993 by Thinking Machines Corporation.
All rights reserved.

This file contains documentation produced by Thinking Machines
Corporation. Unauthorized duplication of this documentation is
prohibited.

*****************************************************************

The information in this document is subject to change without
notice and should not be construed as a commitment by Thinking
Machines Corporation. Thinking Machines reserves the right to
make changes to any product described herein.

Although the information in this document has been reviewed and
is believed to be reliable, Thinking Machines Corporation assumes
no liability for errors in this document. Thinking Machines does
not assume any liability arising from the application or use of
any information or product described herein.
*****************************************************************

Connection Machine (r) is a registered trademark of Thinking
Machines Corporation. CM, CM-2, CM-200, and CM-5 are trademarks
of Thinking Machines Corporation. C* (r) is a registered
trademark of Thinking Machines Corporation. Thinking Machines (r)
is a registered trademark of Thinking Machines Corporation. UNIX
is a registered trademark of UNIX System Laboratories, Inc.

Copyright (c) 1990-1993 by Thinking Machines Corporation.
All rights reserved.

Thinking Machines Corporation
245 First Street
Cambridge, Massachusetts 02142-1264
(617) 234-1000