Dynamic Instrumentation API (proposed)
Revision 0.1
Jeffrey K. Hollingsworth
Computer Science Department
University of Maryland
College Park, MD 20742
hollings@cs.umd.edu
Barton P. Miller
Computer Sciences Department
University of Wisconsin
Madison, WI 53706-1685
bart@cs.wisc.edu
Note that this is a draft document.
There are comments about issues that have not yet been settled and, in a
few places, interfaces that are not completely defined.
We are releasing this draft version to encourage comments and suggestions.
1. Introduction
The normal cycle of developing a program is to edit source code, compile
it, and then execute the resulting binary. However, sometimes this cycle
can be too restrictive. We may wish to change the program while it is
executing, and not have to re-compile, re-link, or even re-execute the
program to change the binary. At first thought, this may seem like a
bizarre goal; however, there are several practical reasons we may wish to
have such a system. For example, if we are measuring the performance of
a program and discover a performance problem, it might be necessary to
insert additional instrumentation into the program to understand the
problem. Another application is performance steering; for large
simulations, computational scientists often find it advantageous to be
able to make modifications to the code and data while the simulation is
executing.
This document describes an Application Program Interface (API)
to permit the insertion of code into a running program. Runtime code
changes are useful to support a variety of applications including
debugging, performance monitoring, and composing applications
out of existing packages. The goal of this API is to provide a machine
independent interface to permit the creation of tools and applications
that use runtime code patching. This API is based on the idea of Dynamic
Instrumentation described in [2].
The unique feature of this interface is that it makes it
possible to insert and change instrumentation in a running program.
This differs from other post-linker instrumentation tools [3] that
permit code to be inserted into a binary before it starts to execute.
The goal of this API is to keep the interface small and easy to
understand. At the same time it needs to be sufficiently expressive to
be useful for a variety of applications. The way we have done this is by
providing a simple set of abstractions and a simple way to specify the
code to insert into the application.
To generate more complex code, extra (initially un-called) subroutines
can be linked into the application program, and calls to these subroutines
can be inserted at runtime.
We are in the process of producing a code-release that conforms to this
API.
The Dynamic Instrumentation in the Paradyn system will be the basis for
this, so the API will support AIX, SunOS, Solaris (SPARC and x86), and
HP-UX.
As Paradyn is ported to other platforms, the Dyninst API will also
support those platforms.
2. Abstractions
The API is based on abstractions of a program and its state while in
execution. The two primary abstractions are points and snippets.
A point is a location in a program where instrumentation can be inserted.
A snippet is a representation of a bit of executable code to be inserted
into a program at a point. For example, if we wished to record the number
of times a procedure was invoked, the point would be the first
instruction in the procedure, and the snippet would be a statement to
increment a counter. Snippets can include conditionals, function calls,
and loops.
The API is designed so that a single instrumentation process can
insert snippets into multiple processes executing on a single machine. To
support multiple processes, two additional abstractions, threads and images,
are included in the API.
A thread refers to a thread of execution.
Depending on the programming model, a thread can correspond to either a
normal process or a lightweight thread. An image refers to the static
representation of a program on disk.
The relationship between these four abstractions is shown in
Figure 1. Images contain points where their code can be modified. Each
thread is associated with exactly one image.
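The relationships among the four abstractions can be sketched with a small, self-contained model. This is only an illustration under stated assumptions: the Toy names, the use of std::function as a stand-in for a snippet, and the enter() call that simulates reaching a point are all hypothetical and are not part of the proposed API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the abstractions; these are NOT the BPatch classes.
using ToySnippet = std::function<void()>;   // a bit of code to run at a point

struct ToyPoint {                            // a location where code can be inserted
    std::string procedure;
};

struct ToyImage {                            // static representation of a program
    std::vector<ToyPoint> points;

    const ToyPoint *findEntry(const std::string &name) const {
        for (const auto &p : points)
            if (p.procedure == name) return &p;
        return nullptr;
    }
};

struct ToyThread {                           // a thread of execution
    const ToyImage *image;                   // each thread has exactly one image
    std::map<std::string, std::vector<ToySnippet>> inserted;

    void insert(const ToyPoint &point, ToySnippet snippet) {
        inserted[point.procedure].push_back(std::move(snippet));
    }

    // Simulate the thread reaching the entry of a procedure: every snippet
    // inserted at that point runs.
    void enter(const std::string &procedure) {
        auto it = inserted.find(procedure);
        if (it == inserted.end()) return;
        for (auto &snippet : it->second) snippet();
    }
};

// Count how many times a procedure is entered, as in the example in the text.
int countInvocations(int calls) {
    ToyImage image;
    image.points.push_back(ToyPoint{"InterestingProcedure"});

    ToyThread thread;
    thread.image = &image;

    int counter = 0;
    const ToyPoint *entry = image.findEntry("InterestingProcedure");
    assert(entry != nullptr);
    thread.insert(*entry, [&counter] { counter = counter + 1; });

    for (int i = 0; i < calls; ++i) thread.enter("InterestingProcedure");
    return counter;
}
```

In this model the point belongs to the image while the inserted snippet belongs to the thread, which mirrors the statement that images contain points and each thread is associated with exactly one image.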
3. Simple Example
To illustrate the ideas of the API, we present several short examples
that demonstrate how the API can be used. The full details of the
interface are presented in the next section. To prevent confusion, we
refer to the process we are modifying as the application, and the program
that uses the API to modify the application as the mutator. A mutator is
a separate process that modifies an application process.
The first thing a mutator needs to do is identify the
application process to be modified. If the process is already in
execution, this can be done by specifying the process id of the
application as an argument to create an instance of a thread object:
appThread = new BPatch_thread(processId);
This creates a new instance of the BPatch_thread class that refers to the
existing process.
It has no effect on the state of the process (i.e., running or stopped).
If the process has not been started, the mutator specifies the
command line used to execute the process:
appThread = new BPatch_thread(argc, argv);
Once the application thread has been created, the mutator defines the snippet of code to be
inserted and the points where it should be inserted. For example, if we wanted to count the
number of times a procedure called InterestingProcedure executes, the mutator might look like
this:
BPatch_image appImage;
BPatch_Vector<BPatch_point&> points;
// Open the program image associated with the thread and return a
// handle to it.
appImage = appThread->getImage();
// Find and return the entry point to "InterestingProcedure".
points = appImage.findProcedurePoint("InterestingProcedure",
                                     BPatch_entry);
// Create a counter variable (but first get a handle to the correct
// type).
BPatch_variableExpr intCounter =
    appThread->malloc(appImage.findType("int"));
// Create a code block to increment the integer by one.
//     intCounter = intCounter + 1
//
BPatch_arithExpr addOne(BPatch_assign,
                        intCounter,
                        BPatch_arithExpr(BPatch_plus,
                                         intCounter,
                                         BPatch_constExpr(1)));
// Insert the snippet of code into the application.
appThread->insertSnippet(addOne, points);
4. Interface
This section describes functions in the API.
The API is organized as a collection of C++ classes.
The primary classes are
BPatch_thread, BPatch_image, BPatch_point, and BPatch_snippet.
The API also uses a template class called BPatch_Vector.
This class is modelled after the Standard Template
Library (STL) vector class.
4.1 class BPatch_thread
BPatch_thread is the primary class to operate on (and to create) code in execution.
-
BPatch_thread(int pid)
BPatch_thread(int pid, int tid)
BPatch_thread(int argc, char *argv[])
BPatch_thread(BPatch_Vector threads)
-
Each of these constructors creates a new instance of the BPatch_thread class.
The first constructor associates a BPatch_thread with an existing process.
The second constructor associates a new BPatch_thread with an existing thread
within a process. The meaning of thread and process is implementation
specific.
The ability to use the first two constructors to create a
BPatch_thread object for an existing process depends on support from the
underlying operating system and may not be implemented on all platforms.
The running state of the process is
not affected by these two constructors.
The third constructor creates a new process and a new
BPatch_thread for it.
The process is created, but it is put into a stopped state before
executing any code.
The fourth constructor creates a new "virtual" thread from a list of
threads.
This permits operations to be performed on several threads as a group.
This can (potentially) increase the efficiency of the requests because
they can be processed in parallel.
-
const BPatch_image &getImage()
-
Open the executable file associated with this BPatch_thread object and return a
handle to it.
Depending on the implementation this might also parse the application's
symbol table.
-
void stopExecution()
void continueExecution()
-
These two functions change the running state of the thread.
stopExecution puts the thread into a stopped state.
Depending on the operating system, stopping one thread may stop all threads
associated with a process. continueExecution continues execution of the thread
(or group of threads if they have to be stopped atomically).
-
bool isStopped()
int stopSignal()
bool isTerminated()
-
These three functions query the status of a thread.
isStopped returns true if the thread is currently stopped.
If the process is stopped (as indicated by isStopped),
then stopSignal can be called to find out what signal caused the process
to stop.
isTerminated returns true if the thread has exited.
Any of these functions may be called multiple times and calling them will
not affect the state of the thread.
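The stopped, stop-signal, and terminated states described above correspond closely to what a POSIX parent process can observe about a child. The following sketch shows one way such states could be observed on a POSIX system. It is an assumption for illustration, not the mechanism this API prescribes, and the names ObservedStates and exerciseChild are hypothetical.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical record of what the parent observed about the child.
struct ObservedStates {
    bool stopped;      // analogous to isStopped()
    int  stopSignal;   // analogous to stopSignal()
    bool terminated;   // analogous to isTerminated()
};

// Fork a child, stop it, observe the stop, continue it, then terminate it.
ObservedStates exerciseChild() {
    ObservedStates seen = {false, 0, false};

    pid_t pid = fork();
    if (pid < 0) return seen;                    // fork failed; nothing observed
    if (pid == 0) {                              // child: idle until signalled
        for (;;) pause();
    }
    assert(pid > 0);                             // only the parent gets here

    int status = 0;
    kill(pid, SIGSTOP);                          // comparable to stopExecution()
    if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status)) {
        seen.stopped = true;
        seen.stopSignal = WSTOPSIG(status);      // which signal caused the stop
    }

    kill(pid, SIGCONT);                          // comparable to continueExecution()
    kill(pid, SIGKILL);                          // end the demonstration
    if (waitpid(pid, &status, 0) == pid &&
        (WIFSIGNALED(status) || WIFEXITED(status))) {
        seen.terminated = true;
    }
    return seen;
}
```

Unlike the query functions in the API, waitpid consumes a status change when it reports it, so an implementation built this way would need to cache the last reported state to allow repeated queries.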
-
void catchSignal(int signum);
void ignoreSignal(int signum);
-
These two functions indicate that the process should be stopped (catchSignal)
or not stopped (ignoreSignal) when it receives the named signal.
-
int dumpCore(const String &file, const bool terminate)
-
This function causes the thread to dump its state to the passed file argument.
If the terminate flag is true, the thread is also terminated.
The ability to use this function depends on support
from the underlying operating system and may not be implemented on all
platforms.
-
BPatch_variableExpr malloc(int n)
BPatch_variableExpr malloc(const BPatch_type&)
-
These two functions allocate memory.
Memory allocation is from a heap.
The heap is not (necessarily) the same heap used by the application.
The available space in the heap may be
limited depending on the implementation.
The first function, malloc(int n), allocates n bytes
of memory from the heap.
The second function, malloc(const BPatch_type&t), allocates
enough memory to hold an object of the specified type.
Using the second version is strongly encouraged
because it provides additional information to permit better type
checking of the passed code.
The returned memory is from a global heap,
and may be used in different snippets.
-
void free(const BPatch_variableExpr &ptr)
-
Free the memory in the passed ptr.
The programmer is responsible for verifying that all code that
could reference this memory will not execute again
(either by removing all snippets that refer
to it, or by analysis of the program).
-
InferiorPC(const BPatch_snippet &expr)
-
Cause the snippet to be executed once.
This interface has several applications, including causing
initialization functions to be
called in the application.
The application process must be stopped when this is called.
This call will use the application stack for saving local state.
-
insertSnippet(const BPatch_snippet &expr, const BPatch_point&)
insertSnippet(const BPatch_snippet &expr, const BPatch_Vector&)
-
Insert a snippet of code at the specified point.
If a list of points is supplied, insert the code
snippet at each point in the list.
What about wild cards for all threads in a process?
-
setTypeChecking(bool state)
-
Turn on or off type-checking of snippets.
By default type-checking is turned on, and an
attempt to create a snippet that contains type conflicts will fail.
Any snippet expressions created with type-checking off have the type of
their left operand.
Turning type-checking off, creating a snippet, and then turning type-checking
back on is similar to a type cast operation in the C
programming language.
-
setMutationsActive(bool)
-
Enable or disable the execution of snippets for the thread.
This provides a way to temporarily disable all of the dynamic code patches
that have been inserted without having to delete them one by one.
All allocated memory will remain unchanged while the patches are disabled.
When the mutations are not active, the process control functions
(i.e., stopExecution and continueExecution) can still be used.
Requests to insert snippets (including oneShots) may not be
made while mutations are disabled.
One additional convenience (non-member) function is provided to test if
the status of any of the threads managed by the mutator has changed.
-
bool pollForStatusChange();
-
This is useful for a mutator that needs to periodically check on the
status of its managed
threads and does not want to have to check each process individually.
4.2 class BPatch_image
This class defines a program image (the executable associated with a thread).
-
const BPatch_Vector<BPatch_point&> &getProcedures()
-
Return a table of the procedures in the image.
-
const BPatch_Vector<BPatch_point&> &findProcedurePoint(const String &name,
const BPatch_procedureLocation&)
-
Return the list of BPatch_points associated with the requested procedure.
The BPatch_procedureLocation argument is one of BPatch_entry,
BPatch_exit, BPatch_subroutine, BPatch_longJump, or BPatch_allLocations.
It is used to select which type of points associated with the
procedure will be returned.
BPatch_entry and BPatch_exit request respectively the entry and exit
points of a subroutine.
BPatch_subroutine returns the list of points where the procedure calls
other procedures.
BPatch_longJump returns any long jumps made by the procedure.
If the lookup fails to locate any points of the requested type,
a list with zero elements is returned.
The function can fail either because the procedure does not exist or
because there are no such points.
-
const BPatch_point &findLinePoint(const String &fileName, int line)
-
Return the handle to the instrumentation point nearest to the requested
fileName and line number.
The nearest point to a requested line is the last executable instruction
before the line.
(Note that this function can have strange interactions with optimized code.)
-
const BPatch_variableExpr &findVariable(const String &name)
-
Look up the passed variable name as a global variable.
The lookup is done in the scope of global variables defined in the original
(un-instrumented) application program.
The returned BPatch_variableExpr can be used to create references to
the variable in subsequent snippets.
If the image was not compiled with debugging symbols,
this function will fail even if the global
variable is defined.
-
const BPatch_variableExpr &findVariable(const BPatch_point &scope, const
String &name)
-
Look up and return a handle to the named variable using the passed
BPatch_point as the scope of the variable.
The returned BPatch_variableExpr can be used to create references (uses)
of the variable in subsequent snippets.
The scoping rules used will be those of the source language.
If the image was not compiled with debugging symbols,
this function will fail even if
the variable is defined in the passed scope.
-
const BPatch_type &findType(const String &name)
-
Look up and return a handle to the named type.
The handle can be used as an argument to malloc to create new variables
of the corresponding type.
4.3 class BPatch_snippet
A snippet is an abstract representation of code to insert into a program.
Snippets are defined by
creating a new instance of the correct subclass of a snippet.
For example, to create a snippet to call a function, you create a
new instance of the class
BPatch_funcCallExpr.
Creating a snippet does not result in code being inserted into an application.
Code is generated when a request is made to
insert a snippet at a specific point in a program.
Sub-snippets may be shared by different snippets
(i.e. a handle to a snippet may be passed as an argument to create two
different snippets), but whether the generated code is shared
(or replicated) between two snippets is implementation
dependent.
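The separation between building a snippet and generating code for it, and the sharing of sub-snippets, can be illustrated with a miniature expression tree. This is a sketch under stated assumptions: the Toy classes, the insertAt function, and the use of rendered text in place of machine code are all hypothetical and do not describe how an implementation of the API generates code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical miniature snippet tree; these are NOT the BPatch classes.
struct ToyExpr {
    virtual ~ToyExpr() {}
    // "Code generation" here just renders text. Nothing is produced until
    // this function is called.
    virtual std::string generate() const = 0;
};
using ToyExprPtr = std::shared_ptr<const ToyExpr>;

struct ToyConst : ToyExpr {
    int value;
    explicit ToyConst(int v) : value(v) {}
    std::string generate() const override { return std::to_string(value); }
};

struct ToyVar : ToyExpr {
    std::string name;
    explicit ToyVar(const std::string &n) : name(n) {}
    std::string generate() const override { return name; }
};

struct ToyBinary : ToyExpr {
    std::string op;
    ToyExprPtr lhs, rhs;   // the left operand is always rendered first
    ToyBinary(const std::string &o, ToyExprPtr l, ToyExprPtr r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);
    }
    std::string generate() const override {
        return "(" + lhs->generate() + " " + op + " " + rhs->generate() + ")";
    }
};

// Stand-in for inserting a snippet at a point: only now is text generated.
std::string insertAt(const std::string &point, const ToyExpr &snippet) {
    return point + ": " + snippet.generate();
}

std::string demoSharedSubSnippet() {
    ToyExprPtr counter = std::make_shared<ToyVar>("counter");
    ToyExprPtr one = std::make_shared<ToyConst>(1);
    ToyExprPtr zero = std::make_shared<ToyConst>(0);

    // Building the trees generates nothing; `counter` is shared by both.
    ToyExprPtr sum = std::make_shared<ToyBinary>("+", counter, one);
    ToyBinary increment("=", counter, sum);
    ToyBinary isPositive(">", counter, zero);

    // Text is produced only when each snippet is "inserted" at a point.
    return insertAt("entry", increment) + "\n" + insertAt("exit", isPositive);
}
```

In this toy the shared sub-snippet is rendered separately for each insertion, which corresponds to the "replicated" choice mentioned above; an implementation is free to share the generated code instead.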
-
const BPatch_type &getType()
-
Return the type of the snippet.
-
float getCost()
-
Return the estimated cost of the snippet in seconds.
The problems with accurately estimating
the cost of code are numerous and out of the scope of this
document [1].
But, it is important to realize that the returned cost value is (at best)
an estimate.
The rest of the classes are derived classes of the class BPatch_snippet.
-
BPatch_sequence(const BPatch_Vector &items)
-
Define a sequence of snippets.
The passed snippets will be executed in the order in which
they appear in the list.
-
BPatch_funcCallExpr(const BPatch_function& func, const
BPatch_Vector &args)
-
Define a call to a function. The passed function must be valid for the
current code region.
Args is a list of arguments to pass to the function.
If type checking is enabled, the types of the
passed arguments are checked against the function to be called.
(Availability of type checking depends on the source language of the
application and on the program having been compiled for debugging.)
-
BPatch_boolExpr(BPatch_relOp op, const BPatch_snippet &lOperand, const
BPatch_snippet &rOperand)
-
Define a relational snippet. The available operators are:
-
Operator | Description
---|---
BPatch_lt | Return lOperand < rOperand
BPatch_eq | Return lOperand == rOperand
BPatch_gt | Return lOperand > rOperand
BPatch_le | Return lOperand <= rOperand
BPatch_ne | Return lOperand != rOperand
BPatch_ge | Return lOperand >= rOperand
BPatch_and | Return lOperand && rOperand (Boolean and)
BPatch_or | Return lOperand || rOperand (Boolean or)
-
The type of the returned snippet is boolean, and the operands are type checked.
-
BPatch_ifExpr(const BPatch_boolExpr &conditional,
const BPatch_snippet &tClause,
const BPatch_snippet &fClause)
BPatch_ifExpr(const BPatch_boolExpr &conditional,
const BPatch_snippet &tClause)
-
These constructors create an if statement.
The first argument, conditional, should be a Boolean
expression that will be evaluated to decide which clause should be executed.
The second argument, tClause, is the snippet to execute if the conditional
evaluates to true. The third argument, fClause,
is the snippet to execute if the conditional evaluates to false.
This third argument is optional.
Else-if statements can be constructed by making the fClause of an if
statement another if statement.
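The else-if construction described above can be sketched with a toy if node whose false clause is itself another if node. The names ToyIf and classify are hypothetical, and std::function stands in for the conditional and clause snippets; this is an illustration of the nesting only, not of the API's classes.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for an if snippet; NOT the BPatch_ifExpr class.
struct ToyIf {
    std::function<bool()> conditional;
    std::function<int()>  tClause;
    std::function<int()>  fClause;   // may be empty: the third argument is optional

    int run() const {
        assert(conditional && tClause);
        if (conditional()) return tClause();
        return fClause ? fClause() : 0;   // no false clause: do nothing
    }
};

// if (x < 0) -1; else if (x == 0) 0; else 1;
int classify(int x) {
    ToyIf inner{[x] { return x == 0; }, [] { return 0; }, [] { return 1; }};
    // The false clause of the outer if is the inner if, forming an else-if.
    ToyIf outer{[x] { return x < 0; }, [] { return -1; },
                [inner] { return inner.run(); }};
    return outer.run();
}
```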
-
BPatch_constExpr(int value)
BPatch_constExpr(float value)
BPatch_constExpr(const String &value)
-
Define a constant snippet of the appropriate type.
-
BPatch_arithExpr(BPatch_binOp op, const BPatch_snippet &lOperand, const
BPatch_snippet &rOperand)
-
Perform the required binary operation. The available binary operators are:
-
Operator | Description
---|---
BPatch_assign | Assign the value of rOperand to lOperand
BPatch_plus | Add lOperand and rOperand
BPatch_minus | Subtract rOperand from lOperand
BPatch_divide | Divide lOperand by rOperand
BPatch_times | Multiply lOperand by rOperand
BPatch_mod | Compute the remainder of dividing lOperand by rOperand
BPatch_ref | Array reference of the form lOperand[rOperand]
BPatch_seq | Define a sequence of two expressions (similar to comma in C)
-
(Should we add min, max, and mean? Ruth Aydt suggested this. jkh 8/10/95)
-
BPatch_arithExpr(BPatch_unOp, const BPatch_snippet &operand)
-
Define a snippet consisting of a unary operator.
The available unary operators are
BPatch_negate and BPatch_address.
BPatch_negate takes an integer snippet and returns
the negation of the snippet.
BPatch_address takes a variable reference snippet and
returns a pointer to it.
This is equivalent to the C operator (&) and is useful for
call-by-reference parameters.
-
BPatch_gotoExpr(const BPatch_gotoExpr &target)
-
Branch to the passed snippet.
-
nullExpr()
-
Define a null snippet. This snippet contains no executable statements;
however it is a useful place holder for the destination of a goto.
4.4 class BPatch_Vector
BPatch_Vector is the primary container class used by the API.
It is styled after the Standard Template Library (STL) Vector container class.
At the time of the writing of this document, STL has
been adopted as part of the ANSI C++ standardization,
but implementations are not yet widely
available.
As a result, the initial version of the API uses its own compatible subset
of the Vector class.
Any implementation of the API will have (at least) the following member
functions available.
-
BPatch_Vector();
-
Create a new empty vector.
-
int size();
-
Return the number of elements in the container instance.
-
void push_back(const T& x);
-
Add x to the end of the Vector.
-
const T& operator[](int n) const;
-
Return the nth element of the Vector.
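The four member functions listed above are enough to build and walk a list. A minimal sketch of a container with exactly that interface follows. It is backed by std::vector purely for brevity, and the name ToyVector is hypothetical, chosen to avoid implying that this is how the API's own class is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A sketch of the minimal interface listed above; NOT the API's own class.
template <class T>
class ToyVector {
public:
    // Create a new empty vector.
    ToyVector() {}

    // Return the number of elements in the container instance.
    int size() const { return static_cast<int>(items_.size()); }

    // Add x to the end of the vector.
    void push_back(const T &x) { items_.push_back(x); }

    // Return the nth element of the vector (n must be in range).
    const T &operator[](int n) const {
        assert(n >= 0 && n < size());
        return items_[static_cast<std::size_t>(n)];
    }

private:
    std::vector<T> items_;
};

// Usage: build a list with push_back and walk it with size() and operator[].
int sumOfThree() {
    ToyVector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);

    int total = 0;
    for (int i = 0; i < v.size(); ++i) total += v[i];
    return total;
}
```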
5. Other Examples
In this section we show a complete program to demonstrate the use of the API.
The example is a program called "re-pipe".
It takes two arguments, a process id and a file name, and changes the output
file descriptor of the specified process to be the named file.
The motivation for the example program is that you run a program,
it starts to print copious lines of output to your screen,
and you wish to direct that output to a file without having to re-run
the program.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include "BPatch.h"

int main(int argc, char *argv[])
{
    int pid;
    BPatch_thread *tr;
    BPatch_image appImage;
    BPatch_Vector<BPatch_snippet&> args, dupArgs;
    BPatch_Vector<BPatch_point&> openFunc;
    BPatch_Vector<BPatch_point&> dup2Func;
    BPatch_Vector<BPatch_snippet&> codeBlock;

    // check for the correct arguments (program name, pid, file name)
    if (argc != 3) {
        fprintf(stderr, "usage: repipe <pid> <file name>\n");
        exit(-1);
    }

    // Get the process ID from the command line and create a BPatch_thread
    // instance for that process.
    pid = atoi(argv[1]);
    tr = new BPatch_thread(pid);
    if (!tr) exit(-1);

    // Open the application image (binary) and return a handle to it.
    appImage = tr->getImage();

    // The next part of the program generates the following code snippet:
    //
    // {
    //     int tempFd;
    //
    //     tempFd = open(argv[2], O_WRONLY | O_CREAT, 0644);
    //     if (tempFd >= 0) {
    //         (void) dup2(tempFd, 1);
    //     }
    // }
    //
    // NOTE: argv[2] refers to the second argument to the mutator not
    // the application.

    // Create the code to open the new file.
    // open(argv[2], O_WRONLY | O_CREAT, 0644)
    openFunc = appImage.findProcedurePoint("open", BPatch_entry);
    if (openFunc.size() != 1) {
        fprintf(stderr, "unable to find function open\n");
        exit(-1);
    }
    args.push_back(BPatch_constExpr(argv[2]));
    args.push_back(BPatch_constExpr(O_WRONLY | O_CREAT));
    args.push_back(BPatch_constExpr(0644));  // permissions if the file is created
    BPatch_funcCallExpr openCall(openFunc[0], args);

    // generate assignment statement to a temp variable
    // tempFd = open(...)
    BPatch_variableExpr tempFd = tr->malloc(appImage.findType("int"));
    BPatch_arithExpr assgn(BPatch_assign, tempFd, openCall);

    // dup2(tempFd, 1): make descriptor 1 (standard output) refer to the file
    dup2Func = appImage.findProcedurePoint("dup2", BPatch_entry);
    if (dup2Func.size() != 1) {
        fprintf(stderr, "unable to find procedure dup2\n");
        exit(-1);
    }
    dupArgs.push_back(tempFd);
    dupArgs.push_back(BPatch_constExpr(1));
    BPatch_funcCallExpr dup2Call(dup2Func[0], dupArgs);

    // Generate if to test return statement
    // if (tempFd >= 0) dup2(...)
    BPatch_boolExpr compareExpr(BPatch_ge, tempFd, BPatch_constExpr(0));
    BPatch_ifExpr ifExpr(compareExpr, dup2Call);

    // build statement list of the assignment (which calls open) and the if.
    codeBlock.push_back(assgn);
    codeBlock.push_back(ifExpr);
    BPatch_sequence block(codeBlock);

    // now arrange for the code to be executed.
    tr->stopExecution();
    tr->oneShotCode(block);
    tr->continueExecution();

    // Code to cleanup omitted.
    return 0;
}
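For comparison, the descriptor redirection that the generated snippet is meant to perform inside the application can be written directly with POSIX calls. The sketch below runs in its own process rather than through the API. The helper names redirectToFile and demoRedirect, the scratch file path, and the use of the write end of a pipe as the descriptor being redirected are assumptions made so that the example can be run without disturbing standard output.

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: make descriptor `target` refer to the named file.
// Returns true on success.
bool redirectToFile(int target, const char *path) {
    assert(path != nullptr);
    int tempFd = open(path, O_WRONLY | O_CREAT, 0644);
    if (tempFd < 0) return false;
    bool ok = dup2(tempFd, target) >= 0;   // target now refers to the file
    close(tempFd);                          // the duplicate keeps the file open
    return ok;
}

// Redirect a spare descriptor to a scratch file, write through it, and
// return what ended up in the file (empty string on any failure).
std::string demoRedirect() {
    char path[] = "/tmp/repipe-sketch-XXXXXX";
    int scratch = mkstemp(path);            // unique, initially empty file
    if (scratch < 0) return std::string();
    close(scratch);

    int fds[2];
    if (pipe(fds) != 0) { unlink(path); return std::string(); }
    close(fds[0]);                          // only the write end is needed

    std::string result;
    if (redirectToFile(fds[1], path)) {
        const char msg[] = "hello";
        ssize_t written = write(fds[1], msg, sizeof msg - 1);
        char buf[16];
        int in = open(path, O_RDONLY);
        ssize_t got = (in >= 0) ? read(in, buf, sizeof buf) : -1;
        if (in >= 0) close(in);
        if (written == 5 && got == 5) result.assign(buf, 5);
    }
    close(fds[1]);
    unlink(path);
    return result;
}
```

To redirect standard output in the same way, the target descriptor would be 1. The order of the dup2 arguments matters: the first is the descriptor to copy from and the second is the descriptor that is replaced.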
6. References
[1] Jeffrey K. Hollingsworth and Barton P. Miller,
"An Adaptive Cost System for Parallel Programs",
Euro-Par '96, Lyon, France, August 1996.
[2] Jeffrey K. Hollingsworth, Barton P. Miller, and Jon Cargille,
"Dynamic Program Instrumentation for Scalable Performance Tools",
1994 Scalable High-Performance Computing Conf.,
Knoxville, Tenn., 1994.
[3] James R. Larus and Eric Schnarr,
"EEL: Machine-Independent Executable Editing",
SIGPLAN Conference on Programming Language Design and Implementation,
June 1995.
[4] Barton P. Miller, Mark D. Callaghan, Jonathan M. Cargille,
Jeffrey K. Hollingsworth, R. Bruce Irvin, Karen L. Karavanic,
Krishna Kunchithapadam, and Tia Newhall,
"The Paradyn Parallel Performance Measurement Tools",
IEEE Computer 28(11), November 1995.
Last modified: Wed Oct 2 11:33:39 CDT 1996 by bart