Guide  Reference  Contents  Previous  Next  Index
A persistent collection is an aggregate persistent object that can contain a variable number of elements. The elements of a collection can be either persistent objects or key-value pairs. In the latter case, the values are persistent objects; the keys may be either strings or persistent objects. You can create persistent collections to organize persistent objects.
Collections are classified according to whether the order of the elements is relevant:
Collections can also be classified according to their implementation. Scalable collections can increase in size with minimal performance degradation; nonscalable collections cannot. Objectivity for Java collections are implemented using three different mechanisms:
Objectivity for Java provides the persistence-capable collection classes listed in the following table.
Class | Used for | Description |
---|---|---|
ooTreeList | List | Scalable ordered collection of objects that can contain duplicates and null elements. |
ooHashSet | Set | Scalable unordered collection of objects with no duplicates and no null elements. |
ooTreeSet | Sorted set | Scalable sorted collection of objects with no duplicates and no null elements. |
ooMap | Name map | Nonscalable unordered collection of key-value pairs in which the key is a string and the value is a persistent object or null. Maintains referential integrity. |
ooHashMap | Object map | Scalable unordered collection of key-value pairs in which the key is a persistent object and the value is a persistent object or null. |
ooTreeMap | Sorted object map | Scalable sorted collection of key-value pairs in which the key is a persistent object and the value is a persistent object or null. |
Referential integrity is a characteristic of a persistent collection that ensures that the collection has references only to objects that actually exist. Maintaining referential integrity requires that, when an object is deleted, any reference from a persistent collection to the deleted object is removed. Name maps can maintain referential integrity automatically; scalable collections cannot.
By default, a name map maintains referential integrity of its elements. That is, the name map ensures that each object in the name map is a valid persistent object. When a persistent object in a name map is deleted, Objectivity/DB automatically removes the corresponding key-value pair from the name map.
After you create a name map and before you add any elements, you can call its setIntegrityMaintained method to disable the automatic maintenance of its referential integrity. When you do so, you reduce the overhead in adding and deleting elements; however, you become responsible for ensuring that the name map does not contain any dangling references to deleted objects.
Scalable collections do not maintain referential integrity. Before you delete an object from the database, you are responsible for removing it from any persistent collection to which it belongs. You can restore the referential integrity of any of these collections by calling its removeAllDeleted method. If the collection contains any persistent objects that have been deleted, that method removes the deleted objects from the collection.
A collection has properties that affect its growth, the storage of any auxiliary objects it may use, and concurrency of access to its objects. The particular properties supported by each collection class depend on how collections of that class are implemented. Before you create a persistent collection of a given class, you should be familiar with the relevant properties of that class.
Objectivity for Java supports one type of nonscalable unordered collection, namely name maps. Name maps are implemented with a traditional hashing mechanism:
The hash table of a name map can grow dynamically; however, increasing its size requires rehashing the entire hash table.
When you create a name map, you can specify the following growth characteristics of its hash table.
totalElements >= numberBins * maximumAverageDensity
Scalable ordered collections (lists, sorted sets, and sorted object maps) are implemented as B-trees. The B-tree organization supports efficient binary search and reduces the runtime overhead of inserting elements into the middle of the collection.
An element's position within the ordered collection is given by a zero-based index.
A B-tree is composed of nodes, each with a corresponding array. The array for a non-leaf node contains references to the first leaf-node descendants along each branch from the node; the array for a leaf node contains references to elements of the collection whose indexes are within a particular range. B-Tree nodes and their arrays are internal objects that the collection creates as needed; you never create them or work with them directly.
A newly instantiated ordered collection consists of the root node and its array. As the collection grows, additional nodes are created as necessary. When each node is created, its corresponding array is also created. Most existing nodes do not need to be modified; in fact, those nodes can be accessed for read or write while a new node is being added.
Every ordered collection has a node size property that determines the maximum size of a node in the collection's B-tree; that is, the maximum number of references in its array. As elements are added to a newly created ordered collection, they are assigned to the root B-tree node until the collection's node size is reached. At that point, a new node must be added to the tree.
The default node size allows each array to contain as many references as will fit on a single page in the federated database. When you create an ordered collection, a parameter to the constructor allows you to specify a different node size.
The B-tree nodes and their arrays are all persistent objects and, as such, they are stored in containers. To access an element of an ordered collection, an application must be able to obtain a lock on the B-tree node and array corresponding to the element's index. As is the case with all persistent objects, locking a B-tree node or array locks its container, effectively locking any other objects stored in the same container. As a consequence, the distribution of nodes and arrays in containers affects concurrent access to the collection.
You can increase concurrent access to the collection by making sure that the collection's nodes and arrays are distributed in different containers. Of course, the more containers used for internal objects, the larger the federated database will be. "Assigning Basic Objects to Containers" describes how the use of containers can affect concurrency, storage requirements, and runtime performance.
At any given time, an ordered collection uses a particular container, called its current node container, to store its newly created B-tree nodes. The collection's container is its initial node container.
An ordered collection serves as the root node of its B-tree; that is, no additional B-tree node object is required if all elements can fit in the root node. If the number of elements exceeds the capacity of a single node, the collection creates additional nodes, as necessary, to accommodate the elements.
When the collection creates a new node, it clusters the B-tree node object in its current node container. When the current node container is full, the collection creates a new container, which becomes the current node container. Each new node container is created in the same database as the previous node container. When the database contains at least 30,000 containers, a new database is created automatically for the next node container.
If an ordered collection is clustered in a garbage-collectible container, all its node containers are garbage-collectible. If the collection is clustered in a non-garbage-collectible container, all its node containers are non-garbage-collectible.
An ordered collection stores only B-tree nodes in the node containers it creates. An application typically does not access those containers directly.
At any given time, an ordered collection uses a particular container, called its current array container, to store the arrays for its new B-tree nodes.
When you create an ordered collection, Objectivity for Java creates the array for the collection's root B-tree node. By default, Objectivity for Java also creates the collection's initial array container and stores the array in that container. The new container is created in the same database as the ordered collection. If an ordered collection is clustered in a garbage-collectible container, its initial array container is garbage-collectible; if the collection is clustered in a non-garbage-collectible container, its initial array container is non-garbage-collectible.
If you prefer, you can specify an ordered collection's initial array container as a parameter to the constructor that creates the collection. For example, you might want to minimize the number of containers used by a collection by specifying the collection's container as its initial array container. Alternatively, you might use a container in a different database as the collection's initial array container. In that case, the collection's arrays would be stored in a different database from its nodes.
As the collection creates a new array for each new node, the arrays are added to the initial array container until that container is full. Then, a new container is created and used as the current array container.
As more nodes are needed, the ordered collection stores each new node's array in its current array container until that container is full; it then creates a new current array container. Each new array container is created in the same database as the previous array container. When the database contains at least 30,000 containers, a new database is created automatically for the next array container.
All array containers for a given ordered collection are of the same type. If the initial array container is garbage-collectible, all subsequent array containers will be garbage-collectible; if the initial array container is non-garbage collectible, all subsequent array containers will be non-garbage collectible.
An ordered collection stores only arrays in the array containers it creates. An application typically does not access those containers directly.
Every ordered collection has an internal object, called a tree administrator, to manage the containers for the collection's nodes and arrays. The collection's tree administrator is created when the collection itself is created. By default, the tree administrator is stored in a new container in the same database as the ordered collection itself. If you want a collection's tree administrator to be stored in an existing container, however, you can specify that container as a parameter to the constructor that creates the ordered collection. For example, instead of having Objectivity for Java create a new container just for the tree administrator, you might choose to store the tree administrator in the same container as the ordered collection itself.
A collection's tree administrator is created when the collection itself is created. By default, the tree administrator is stored in a new container in the same database as the ordered collection itself. If you want a container's tree administrator to be stored in an existing container, however, you can specify that container as a parameter to the constructor that creates the ordered collection. For example, instead of having Objectivity for Java create a new container just for the tree administrator, you might choose to store the tree administrator in the same container as the ordered collection itself.
The tree administrator uses two properties of an ordered collection to control when the current node container and the current array container are considered "full."
pageSize / 47
To use a different value for this property, call the ordered collection's maxNodesPerContainer method.
Changing the maximum nodes per container affects only the collection's current node container and any node containers created in the future. If you reduce the number of nodes per container, existing node containers are left with more nodes than the new maximum; if you increase the number, existing node containers are left with fewer nodes than the new maximum.
Changing the maximum arrays per container affects only the collection's current array container and any array containers created in the future. It does not affect existing array containers that are already full.
Every sorted collection has an associated comparator that controls how elements are sorted. The comparator defines a total ordering to be used by the underlying B-tree.
You can implement an alternative sorting criteria with an application-defined comparator class. See "Defining a Comparator Class for Sorted Collections".
To use your own sorting criteria, assign an instance of your comparator class to a sorted collection when you create the collection. If you do not assign a comparator explicitly, the collection uses the default comparator. For additional information, see "Using a Comparator".
Scalable unordered collections (unordered sets and unordered object maps) are implemented with an extendible hashing mechanism that uses a two-level directory structure to locate elements. You can think of the elements in the unordered collection as being divided into disjoint groups, each with its own directory. The top-level directory identifies a hash bucket, which acts as the directory for one of the disjoint groups. A hash bucket locates elements whose hash values are within a certain range. Adding elements may cause individual hash buckets to be rehashed, but the entire collection never needs to be rehashed.
The two-level directory structure allows the unordered collection to increase in size with minimal performance degradation. Regardless of the size of the collection, accessing an element requires one look-up in the top-level directory and one look-up in the appropriate hash bucket.
Hash buckets are persistent objects that the collection creates as needed and uses internally; you never create them or work with them directly. By default, one initial hash bucket is created for the collection; if you prefer, you can specify a different number of initial hash buckets as a parameter to the constructor that creates the unordered collection object. The number hash buckets created initially is a power of two; if you do not specify a power or two, the next higher power of two is used. For example, if you specify 5 initial hash buckets, 8 initial hash buckets are actually created. If the collection has N hash buckets, the first N high-order bits of an object's hash value are used to determine which hash bucket it belongs to.
Preallocating multiple hash buckets increases the speed of adding and finding map elements. If each hash bucket is stored in a separate container (the default behavior), preallocating hash buckets also reduces the chance of lock conflicts. However, an unordered collection with a large number of initial hash buckets requires more disk space, more memory for the directory, and more time to create.
As an unordered scalable collection grows past the capacity of its existing hash buckets, new hash buckets are added.
Every scalable unordered collection has a bucket size property that determines the size of a hash bucket in its hash table. The size of a hash bucket is the number of elements that can be hashed into each bucket. The default hash-bucket size is 30011. If you want to use a different bucket size, you can specify the desired size as a parameter to the constructor that creates the unordered collection object.
For optimal performance, the hash-bucket size should be a prime number. If you specify a number that is not prime, the next higher prime number is computed and used as the actual hash-bucket size.
The hash buckets of a scalable unordered collection are persistent objects; as such, they are stored in containers. To access an element of a scalable unordered collection, an application must be able to obtain a lock on the hash bucket corresponding to the element's hash value. As is the case with all persistent objects, locking a hash bucket locks its container, effectively locking any other objects stored in the same container. As a consequence, the distribution of hash buckets in containers affects concurrent access to the collection.
By default, a separate hash-bucket container is created for each of the collection's initial hash buckets--that is, for each of the hash buckets that are created by the constructor. As the collection grows, and additional hash buckets are created, a new hash-bucket container is created by default for each new hash bucket; this default behavior optimizes concurrent access to the collection. However, the more containers used for hash buckets, the larger the federated database will be; for a discussion of the trade-offs between concurrency and storage requirements, see "Assigning Basic Objects to Containers".
If you prefer to store more than one hash bucket in a container, you can specify an existing container in which to store the all the initial hash buckets. In addition, you can change the number of hash buckets that are clustered in the same container using the collection's hash administrator; see "Hash Administrator".
By default, the first hash-bucket container is created in the same database as the unordered collection. Additional hash-bucket containers are created in the same database. If you specify an existing container for the initial hash buckets, additional hash-bucket containers will be created in the same database as that container. New hash-bucket containers are added to a given database until it contains at least 30,000 containers. Then a new database is created automatically for subsequent hash-bucket container.
All hash-bucket containers for a given unordered collection are of the same type. If an unordered collection is clustered in a garbage-collectible container, all its hash-bucket containers are garbage-collectible; if the collection is clustered in a non-garbage-collectible container, all its hash-bucket containers are non-garbage collectible.
An unordered collection stores only hash buckets in the hash-bucket containers it creates. An application typically does not access those containers directly.
Every scalable unordered collection has an internal object, called a hash administrator, to manage the containers for the collection's hash buckets.
A collection's hash administrator is created when the collection itself is created; the hash administrator is stored in a new container in the same database as the unordered collection itself. The collection's hash administrator is created when the collection itself is created; by default, the hash administrator is stored in a new container in the same database as the unordered collection itself. If you prefer, you can specify an existing container in which to store the hash administrator.
The hash administrator uses a property of the unordered collection to control when the current hash-bucket container is considered "full". The maximum buckets per container property specifies how many hash buckets can be clustered together in the same container. It is typical for a hash bucket to be updated frequently. The default value for this property is 1, which minimizes lock conflicts. If you know that a particular collection will be accessed only by a single user, locking is not an issue. In that case, a larger value may be appropriate. To use a different value for this property, call the collection's maxBucketsPerContainer method.
Changing the maximum buckets per container affects only the clustering of hash buckets that are created after the call. After you change the value from the default of 1 to a larger number, newly created hash buckets will be clustered in the collection's most recently created hash-bucket container until the maximum number is reached. New hash-bucket containers will created as needed, and the maximum number of has buckets will be clustered in each.
Every scalable unordered collection has an associated comparator that controls how an element's hash value is computed.
You can implement an alternative hashing algorithm with an application-defined comparator class. See "Defining a Comparator Class for Unordered Collections".
To use your own hashing algorithm, assign an instance of your comparator class to a scalable unordered collection when you create the collection. If you do not assign a comparator explicitly, the collection uses the default comparator. For additional information, see "Using a Comparator".
A comparator is an object of a concrete descendant class of ooCompare. It provides a comparison function for ordering elements of sorted collections, and a hashing function for computing the hash values for elements of unordered collections.
You can implement your own sorting or hashing behavior in an application-defined comparator class; to do so, you define your own subclass of ooCompare and override the compare and/or hash method as appropriate. An application-defined comparator also allows you to identify elements of a collection based on persistent data in the element or its key.
If your application uses sorted collections with elements (or keys) of some particular class, you may want to sort the elements based on the data in some persistent field(s) of each element (or key). You might additionally want to use the same data as a unique identifier for an element (or key) of a sorted collection so that you can look up the element with particular value(s) in the relevant field(s).
Elements in a sorted collection are ordered based on the sorting criteria embodied in the compare method of its comparator. That method compares a persistent object in the collection with another object and indicates the second object's position in the collection relative to the persistent object.
When you define a comparator class to be used with sorted collections, you can override the compare method to compare two persistent objects based on whatever sorting criteria you choose. Typically the comparison uses some data that uniquely identifies an object of its class. Uniqueness is necessary because the compare method must impose a total ordering on the elements.
For example, suppose the elements of a sorted set are objects of the Person class, and you know that no two elements of the set will have the same name. You might choose to order the elements based on the string in each element's name field. You could define the compare method of your comparator class to verify that its parameters are Person objects and, if so, to compare the name strings of the two objects. A comparator of this class could also be used for a sorted object map whose keys are Person objects.
A comparator class for sorted collections can optionally provide the ability to identify an element (or key) based on its sorting criteria. This ability allows you to use the data that identifies a particular element to:
In the preceding example, the application-defined comparator class could support the ability to use a name to uniquely identify an element in a sorted set of Person objects, or in a sorted object map in which the keys are Person objects.
If you want your comparator class to be able to identify an element (or key) of a sorted collection based on its sorting criteria, you should implement the compare method to be able to compare an element (or key) either with another persistent object or with data that identifies a persistent object. In our example, the compare method would be able to compare a Person object either with another Person object or with a string containing a person's name.
If your application uses scalable unordered collections with elements (or keys) of some particular class, you may want to hash elements based on the data in some persistent field(s) of each element (or key). You might additionally want to use the same data as a unique identifier for an element (or key) of an unordered collection so that you can look up the element with particular value(s) in the relevant field(s).
When you define a comparator class to be used with unordered collections, you can override the hash method to compute hash values for persistent objects using whatever criteria or algorithm you choose.
For example, suppose the elements of an unordered set are objects of the Employee class, representing people employed by a particular company in the United States; the SSN field of this class is a string representation of an employee's Social Security Number (SSN). You might define a comparator class whose hash method computes hash values from SSNs. The hash method would verify that its parameter is a an Employee object and, if so, convert the SSN string to a 32-bit integer to be used as the hash value. A comparator of this class could also be used for an unordered object map whose keys are Employee objects.
A comparator class for unordered collections can optionally provide the ability to identify an element (or key) based on the data from which its hash value is computed. This ability allows you to use the data that identifies a particular element to:
In the preceding example, the application-defined comparator class could support the ability to use a Social Security Number to uniquely identify an element in an unordered set of Employee objects or in an unordered object map in which the keys are Employee objects.
If you want your comparator class to be able to identify an element (or key) of an unordered collection based on class-specific data, you must override both the hash method and the compare method.
In our example, the hash method would be able to compute a hash value either from an Employee object or from a Social Security Number. The method might allow the SSN to be specified in a variety of different forms.
In our example, the compare method would be able to compare an Employee object either with another Employee object or with a SSN specified in any form that your hash method supports.
To use a comparator of an application-defined comparator class, you create it, make it persistent, and assign it to one or more collections. Special care may be required when modifying objects in the collection .
To create a comparator, instantiate your comparator class and make it persistent by clustering it in the desired container. A comparator is locked whenever you access its associated collection. To avoid locking conflicts, you typically cluster the comparator in a separate container. If the comparator is stored in the same container as the collection, applications may fail to get the necessary read lock on the comparator when another process is updating the collection.
The persistent data for a persistent collection references its comparator. Your application should not explicitly save any comparator. For example, you should not add a comparator to a collection or save a comparator in an instance variable of any persistent object. Typically, an application uses only comparators that it creates dynamically; it does not explicitly retrieve comparators from the database.
After creating the comparator and making it persistent, you can assign it to any collections that need to use the comparator's particular comparison and hashing methods. You assign a comparator to a collection by passing the comparator as a parameter to the constructor that creates the collection.
A collection's comparator may affect how an application modifies objects in the collection.
If a persistent collection uses a comparator of an application-defined class, the data for the collection in the federated database includes a reference to the comparator. Any application that retrieves the collection will also retrieve its comparator. As a consequence, any application that retrieves the comparator must include a comparator class with the same schema class name as the comparator's class.
Objectivity/DB provides persistent storage for data only, not for methods, so the federated database does not store compare and hash methods of the comparator. The comparator class in the retrieving application must include implementations for these methods; furthermore, those methods must use the same sorting criteria and the same hashing algorithm as the application that stored the collection.
If the applications that use a given persistent collection are all implemented in the same language (for example, Java), they can all share the definition of the comparator class. If the applications are written in different languages (for example, some in Java and some in C++), their comparator classes must have equivalent comparison and hashing methods.
You must create a persistent collection during a transaction; the new collection object belongs to that transaction's session and all persistent objects that you add to the collection must belong to the same session. You create a persistent collection in the same way as you would a typical Java object with the new operator, and you make it persistent as you would for an object of any persistence-capable class. For example, you might name the collection, reference it in a persistent field of a named root (or other persistent object), or use it as an element of another persistent collection.
You can retrieve a persistent collection from the database just as you retrieve any persistent object. See Chapter 11, "Retrieving Persistent Objects".
After you create or retrieve a persistent collection, you can call its methods to:
Any object you add to a collection must be an instance of a persistence-capable class. If the object is transient, it is made persistent when it is added to the collection; the default clustering strategy assigns it to the collection's container.
For a complete description of the available methods, see the descriptions of the various persistent-collection classes.
Examples in Chapter 6, "Defining Persistence-Capable Classes," Chapter 10, "Naming Persistent Objects," and Chapter 11, "Retrieving Persistent Objects," illustrate the creation and use of a name map.
Guide  Reference  Contents  Previous  Next  Index