Chapter 3 - Objects
The Part-Whole Relationship
In ThingLab, an object is composed of named parts, each of which is in turn another object. The parts are thus composed of subparts, and so on. The recursion stops with primitive objects such as integers and strings. Consider a line:
Point1: a point
Point2: a point
The line is composed of two parts that are its endpoints. Each endpoint is in turn composed of an x and y value; these are primitive objects (integers). An object will sometimes be referred to as the owner of its parts. For example, the above line owns its endpoints.
A primitive object has no parts of its own. In ThingLab, integers, real numbers, and text are taken to be primitive.
Part Descriptions A PartDescription is an object that describes the common properties of the corresponding parts of all instances of a class. Every class has a list of part descriptions, one for each part owned by its instance. The following things are associated with each part description:
Name - an identifier,
Constraints - the set of constraints that apply to the corresponding part of each instance.
Merges - the set of merges that apply to the corresponding part of each instance
Class - the class of the corresponding part of each instance. This is more restrictive than Smalltalk, in which the class of the contents of an
instance field is not declared. Imposing this restriction makes the job of constraint satisfaction easier [see Chapter 5].
For example, the class Line has two part descriptions that describe the parts of each instance of Line. The first part description has the name point1. It has no constraints or merges, and specifies that the point1 part of each line be an instance of class point. The other part description is defined analogously. For a class specifies some constraints, for example the class HorizontalLine, the point1 part description would also indicate that there was a constraint that applied to the point1 part of each of its instances.
When a part description is added to a class, it automatically compiles messages in the class’s message dictionary to read and write the part.
A subclass of PartDescription is SuperclassDescription. Instances of this subclass are used to describe part that represent superclasses. [See the section on Inheritance later in this chapter for more details.]
Insides and Outsides
One of the important features of Smalltalk is its sharp distinction between the inside and the outside of an object. The internal aspects of an object are its class, and its instance fields and their contents; its external aspects are the messages that it understands and its responses. Since other parts of the system and the user interact with the object by sending and receiving messages, they need not know about its internal representation. This makes it easier to construct modular systems. For example, the class Rectangle defines the message center. It makes no difference to the user of this message whether a rectangle actually has a center stored as one of its instance fields, or whether the center is computed on demand – in fact, it is computed on demand. [This is also related to work on data abstraction mechanisms in languages such as CLU and Alphard (Liskov et al. 1977; Wulf, London & Shaw 1976.)]
In ThingLab, the notion of having a part has implications for both the internal and external aspects of the object that owns the part. In the current implementation, internally the object must have an instance field in which the part is stored, as well as a corresponding part description in its class; externally, the object should understand messages to read and write the part. However, these internal and external aspects are separate. A virtual part, as proposed in Chapter 6, is an example of the use of this separation. Such a part has all the external manifestations of a part, i.e., messages to read and write it. Internally, however, there is no corresponding field; rather, the part is computed as needed. [Smalltalk already has virtual parts; the proposed mechanism would add the necessary declarative superstructure so that the constraint satisfaction mechanism could know about them.]
A path is a ThingLab object that represents a symbolic reference to a subpart. Each path consists of a list of part names that indicate a way to get from some object to one of its subparts. The path itself doesn’t own a pointer to the object to which it is applied; this must be supplied by the user of the path. Thus the same path can be used to refer to the corresponding subpart of many different objects. For example, point1 x is a path to get to the x value of the first endpoint of any line. [To get to this path first sends the message point1 to the line, and then sends the message x to the result.] A full path is a path that starts from the global symbol table, while a relative path is one that starts from some intermediate owner (not necessarily the global symbol table). In the latter case, the starting point is provided by the context in which the path is used; see Chapter 4 and 5 for examples.
While the definition of a path is simple, the idea behind it has proven quite powerful. As mentioned above, Smalltalk draws a distinction between the inside and the outside of an object. The notion of a path helps strengthen this distinction by providing a protected way for an object to provide external references to its parts and subparts. For example, if a triangle wishes to allow another object to refer to one of its vertices, it does so by handing back a path such as side2 point1, rather than by providing a direct pointer to the vertex. If this other object wants to change the location of the vertex, it must do so by routing the request through the triangle, rather than by simply making the change itself. This allows the triangle to decide whether or not to accept the change; if it does accept it, it knows what has been altered, so that it can update its other parts as necessary to satisfy all its constraints.
In addition to these semantic considerations, a major pragmatic benefit of this discipline is that no backpointers are needed. [If the triangle did hand out a direct pointer to its vertex, the vertex would need a pointer back to the triangle so that it could inform the triangle when it changed.] Access to parts is somewhat slower using this technique, since each access involves following a path. However, an access via a path can often be moved out of the inner loops by the constraint compiler. Another pragmatic consideration is that constraints and merges can be represented symbolically using paths, so that they apply to all instance of a class, rather than to a particular instance. This allows the system to compile constraint satisfaction plans in the form of standard Smalltalk methods.
The constraint satisfaction techniques to be described in Chapter 5 all depend on noticing when one constraint interferes with another. Paths are used to specify which parts or subparts of an object are affected by constraint. Two paths overlap if one can be produced from the other by adding 0 or more names to the end of the other’s list. The following paths overlap the path
Side1 point1 X
(The empty path)
The following paths do not overlap side1 point1:
To test if two constraints interfere, the system checks if any of their paths overlap.
The class PartDescription is actually a subclass of FieldDescription, which is an abstract class intended to capture the notion of a description of instance fields in general. Two other subclass of FieldDescription have been defined for use in describing external references, that is, references from an object to something other than one of its own parts. External references come in two flavors: absolute and relative. The behavior of these two kinds of references is different with respect to copying. If a copy of an object is made and used as a part of another object, all absolute references will remain unchanged in the copy. On the other hand, all relative references will be changed to refer to the corresponding part of the new owner of the copy. For example, an absolute reference might be used t represent a footnote in a document A that refers to a section in another document B. If A is copied, the footnote in the copy will still refer to the same section in B. On the other hand, a relative reference should be used to represent a reference from one section of A to another section of the same document. If A were copied, the relative reference would indicate the corresponding section of the copy.
For pragmatic reasons, external references are represented as full paths, rather than direct pointers. This avoids circular structures, and makes it easy to find all the constraints that affect the referenced object.
As described in Chapter 1, a new class may be defined as a subclass of one or more existing classes. The subclass inherits the part descriptions, constraints, merges, and message protocol of its superclasses. It may add new information of its own, and it may override inherited responses to messages. Every class (except class Object) must be a subclass of at least one other class.
The superclasses of an object are represented by including an instance of each superclass as a part of the object. The field descriptions for such parts will be instances of SuperclassDescription, rather than PartDescription. These parts may have constraints and merges applied to them in the usual way. The only difference between these instances of superclasses and ordinary parts is that messages will be forwarded to them automatically (see below). [The actual implementation is a bit more arcane – see the section Implementation of Superclasses. However, the effect is as described, and the reader should think of it in this way.]
The most general class in both Smalltalk and ThingLab is class Object. As part of the ThingLab kernel, a large number of methods have been added to this class. These methods provide defaults for adding or deleting parts, merging parts, satisfying constraints, showing in a ThingLab window, and so on. In general, these methods treat an object as the sum of its parts. For example, to show itself, an object asks each of its parts to show; to move itself by some increment, the object asks each of its parts to move by that increment. This strict hierarchy is, however, modified by the object’s constraints and merges. Thus, when an object decides exactly how to move, it must watch out for overlap between its parts due to merges, and must also keep all its constraints satisfied. [See Chapter 5 for an example.]
Inheritance is relatively simple for a class with only one superclass. However, the situation is more complex when multiple superclasses are present: how should the system act in the presence of conflicting inherited information? To help choose among conflicting information, one of the superclasses of a class must be designated as its primary superclass.
When an object receives a message, it first checks the message dictionary of its own class. If a corresponding method is found, that method is used. If not, it checks the dictionary of its primary superclass, and then the primary superclass of that superclass, and so on. If the message is not understood by any of the primary superclasses , the object then checks all of its other superclasses to see if any of them understand the message. If there is no method, or if there are several conflicting inherited methods, the user is notified. If there is a single method defined for that message, then that method is used. To avoid this search the next time the message is received, the class automatically compiles a message forwarder that will intercept that message in the future and sent it directly to the appropriate superclass part. If the message protocol of a class is changed, its subclass should be notified so that they can delete obsolete message forwarders; this is not done in the current implementation.
If the user wants to choose among conflicting messages, or to combine them somehow, an appropriate method for doing this should be defined in the class or one of its primary supperclasses. An example of the use of this technique is in the default method for showing an object’s picture. This default method (defined in class Object) yields a picture that combines all inherited pictures and the pictures of the object’s own parts. It simply specifies that each of the object’s parts should be asked to show itself (irrespective of whether or not the parts represent multiple superclasses).
In previous versions of ThingLab, the problem of conflicting inherited methods was handled in the following way. Methods were themselves objects that understood a variety of messages. If several conflicting methods were found for a given message, the inherited methods were asked to negotiate among themselves to come up with the method to be used to receive the message. These negotiations were relatively unsophisticated. If the methods were all the same, then any one of them would be used. Some methods, called default methods, would always defer to others, while other methods, such as those for depicting an object, would make a new method that invoked all the inherited methods. If negotiation failed, the user would be asked to supply the desired method. Once a method had been constructed or entered, it was saved in case it was required again. This technique worked reasonably well; the only reason that it is not in the current system is that the author never got around to re-implementing it.
As an example of the use of multiple superclasses, suppose one has available the class Triangle, and two subclasses of it: a class IsoscelesTriangle; and another class HorizontalTriangle, the class of triangles with horizontal bases.
Side1: a line
Side2: a line
Side3: a line
Side1 point1 = side3 point1
Side1 point2 = side2 point1
Side2 point2 = side3 point2
Side1 length = side2 length
Side3 point1 y = side3 point2 y
Given the above classes of traingles, suppose one wishes to define a new class of isosceles horizontal triangles. This can be done as follows:
T: Triangle (primary superclass)
T = IT = HT
If a different correspondence between the parts of the superclasses is desired, it can be represented by an expanded set of merges, e.g.:
T side1 point1 = IT side1 point1 = HT side2 point1
T side2 point1 = IT side2 point1 = HT side3 point2
T side3 point2 = IT side2 point2 = HT side1 point1
The use of this scheme yields a reasonable structure and behavior for the new class. The new class has all the constraints and merges of each of its superclasses. The inherited parts are all available in the new class; overlaps are represented by merges. The use of a primary superclass need not introduce an unwanted asymetry: rather than designating either IsosclesTriangle or HorizontalTriangle as the primary superclass, Triangle itself was so used. Note that having several inherited parts with the same name is not a problem, as one can always indicate which part is meant by prefixing the message with the name of the appropriate superclass part.
Semantically, the close relationship in this scheme between the class-subclass relation and the part-whole relation is interesting. While the scheme is fairly simple, it was arrived at only after much discussion, head-scratching, and implementation of silly ideas, Also given the author’s past experience, it probably has some as yet undiscovered deficiencies.
Use of Multiple Superclasses for multiple Representation
Multiple superclasses also provide a way of implementing multiple representation of objects. For example, suppose the user desires to represent a point in both Cartesian polar forms. This may be done as follows:
X: a Real
Y: a Real
R: a Real
Theta: a Real
G: GeometricObject (primary superclass)
C = P asCartesian
C ¬ P asCartesian
P ¬ C asCartesian
The constriant makes use of two auxiliary to CartesianPoint and PolarPoint.
[Ý CartesianPoint new X:r* theta cos Y:R*theta sin]
asPolar ½ dist angle
[dist ¬ ((x*x) + (y*y)) sqrt.
Angle ¬ [x¹ 0.0Þ [(y/x) arctan]
Y=0.0Þ [0.0] y <0.0Þ [-pi/2.0] pi/2.0].
Ý PolarPoint new r: dist theta: angle]
These methods in turn make use of the message sin, cos, arctan, and sqrt to real numbers (methods not listed).
Implementation of Multiple Superclasses
For pragmatic reasons, multiple superclasses have been implemented so as to take advantage of the efficient Smalltalk subclassing mechanism. Superclasses other than the primary one are implemented as described. However, the primary superclass is represented as a standard Smalltalk superclass. Thus, for this supperclass , the structure has been flattened. Rather than having an instance of its primary superclass as a part, the subclass has the parts of its primary superclass as its own parts. For example, the instance fields of the class IsoscelesHorizontalTriangle would actually be as follows:
Field 1 – a Line (the side1 from Triangle)
Field 2 – a Line (the side2 from Triangle)
Field 3 – a Line (the side3 from Triangle)
Field 4 – an IsoscelesTriangle
Field 5 – a HorizontalTriangle
For a given class, a prototype is a distinguished instance that owns default of typical parts. All classes understand the message prototype, and respond by returning their prototypical instance. If the user doesn’t specify otherwise, the prototype has nil in each of its instance fields. However, if the user has defined the class by example, the prototype will hold the particular values from the example. These values may also be set by writing an initialization message.
Prototypes provide a convenient mechanism for specifying default instance values. Thus, in the introductory example, when a new line was being inserted into the quadrilateral, its initial length and orientation were copied from the prototype Line. Such defaults are essential in graphical editing, since every object needs some appearance.
More importantly, a prototype serves as a representative of its class. ThingLab distinguishes between messages that have no side effects for the receiver (read-only messages), messages that alter the values stored in the receiver, and messages that alter the receiver’s structure. Any instance will accept read-only or value-altering messages, but only prototypes will accept structure-altering messages. This is because this latter type of message affects the class. The prototype is in charge of its class, and is willing to alter it, but for instances other than the prototypical one, the class is read-only. Requests to move a side of a polygon, or even turn it inside out, are examples of value-altering messages. On the other hand, requests to add or delete a side, edit a constraint, or merge two point are structure-altering messages.
On occasion, the user might want to alter the structure of an instance that is not a prototype. To do this, the user should first send it the message asPrototype, and then send the structure-altering message to the result. When an object receives the message asPrototype, if it isn’t already a prototype, it makes a new subclass of its current class, moves its instance values into the fields of the prototype for this subclass, and returns that prototype.
Defining Classes by Example
When the user defines a class by example, the editing messages are always sent to the prototype, rather than sometimes to the class and sometimes to one of its instances. The prototype takes care of separating the generic information that applies to all instances of its class from the specific information that applies only to the default values that it holds in its fields. With its class it associates the number and class of the parts, the constraints, and the merges. With its own instance fields it associates the default values for its parts.
It is not possible to define all classes by example; some, such as classes for new constraint types and abstract classes like GeometricObject, must be entered writing an appropriate Smalltalk class definition. One can also use a combination of definition by example and hand coding. In general, there are many possible classes that could be abstracted from a given example; which one should be abstracted will depend on the user’s purposes. The ThingLab facility for definition by example provides a reasonable default, but is not a general solution to this problem. If the user wants some other sort of class, he or she should write an appropriate definition.
This facility could, however, be generalized to allow several kinds of choices as to how the class definition should be abstracted. First, the prototype could decide whether some aspect of itself, e.g. a constraint, should be a property of it alone, or of its class in general. This kind of choice does not arise in the current representation, since e.g. constraints can only be associated with classes and not particular instances; but if the representation were extended such decisions would need to be made. Also, the prototype could recognize some configuration of its parts as being an instance of an already defined class. In the current system, this is done only for merges – in the user interface, if the user positions one object near another, an instance of a merge will be constructed automatically. However, the system could be extended so that various classes were waiting to recognize instances of themselves. For example, the class HorizontalLine might notice that the line just drawn by the user was in fact horizontal, or the class SeriesResistors might notice that a pair of resistors formed an instance of itself.
If a class is edited, this change will affect all of its instances. After the edit has been made, all these objects should be notified of the change so that they can update themselves if necessary. This problem is not handled completely in the current implementation: presently, only the prototype is notified; the other affected instances first notice the change the next time they reference the class.
Dynamic updating is a knotty problem in general. One such problem arises in connection with the mixture of the generic and the particular. For example, suppose that the prototype for the class Widget is edited by the insertion of an additional line. Each other instance of Widget should also have a line added. Sometimes Widget’s constraints will completely determine the new line’s location; but this will not always be the case. If not, what should be the line’s location? The particular location used for the prototype may not be appropriate, since the parts of the other instances may be positioned quite differently.
Limitations of the ThingLab Hierarchies
One of the main deficiencies of both the part-whole and inheritance hierarchies as presently implemented is that they both are useful for describing additive sorts of knowledge only; with the exception of overriding inherited message responses, the hierarchies do not have any mechanisms for masking off, modifying or replacing information. For example, an object ought to be able to mask off an inherited constraint, or to replace an inherited constraint with another. More generally, there should be a way of defining all sorts of "parasitic" objects, that can be merged with a given object to change it in some specified way. For example, one should be able to define a class LineDotter. Any class that was a subclass of LineDotter would have all its inherited lines made into dotted lines. Thus, a subclass of Triangle an LineDotter would have a dotted lines for its sides.
A Comparison with KRL
The representation language KRL [Bobrow & Winograd 1977a, Bobrow & Winograd 1977b] is another system that uses an object-oriented representation scheme, and it is interesting to compare the approach used in KRL with that used in ThingLab in the areas of overlap. In this section, KRL definitions for the various classes of triangles described above will be presented as a framework for making some comparisons. The intended domain of KRL is much broader than that of ThingLab, and there are hooks for many more kinds of features built into it. Also KRL is much less "automatic" than ThingLab. For example KRL has no constraint mechanism. A programmer who wants some relation to be satisfied must write procedures to satisfy it, and must make sure that the procedures are invoked at the proper times.
In KRL, knowledge is organized around entities called units. Each unit has unique name, and has one or more named slots containing descriptions of other units associated with the given unit. Every unit has a distinguished slot named self; the descriptions in this slot apply to the unit as a whole. The KRL analog of a class is a unit with the particular values left undefined. Such a unit is called a prototype in KRL – note that this is not the same as a ThingLab prototype. Inheritance relations may be described using the self slot. For example, to describe the prototype EquilateralTriangle as being a kind of Triangle and also a kind of RegularPolygon, its self slot would include the descriptions "a Triangle" and "a RegularPolygon". To provide for simple inheritance of slots and attached procedures from one parent, a prototype optionally may be described as a further specif-cation of a single other prototype. This is similar to the primary superclass notion in ThingLab.
The KRL analog of an instance is a unit that typically has inherited slots that are filled in with specific information but no additional slots of its own. For example, a line instance would have the description "a Line" in its self slot, perhaps with specific values for its endpoints, but would have no slots beyond those inherited from Line.
An important difference between a KRL prototype and a ThingLab of Smalltalk class is that a KRL prototype and an instance of it are basically the same kind of object: the only difference built into the language is that one has more information filled in than the other. [Of course, a KRL user may choose to treat them differently.] On the other hand, a Smalltalk class and an instance of it are fundamentally different: a class has the message protocol an local storage defined by class Class; while an instance has the message protocol an local storage defined by its own class. For example, suppose one has a KRL prototype for Line that corresponds to the Smalltalk class Line. In KRL, one could fine the description of an endpoint of a Line instance (this might be an explicitly stored point); in the same way,one could find the description of an endpoint of the prototype Line (this would be simply "a point"). In Smalltalk, an instance of class Line will respond to the message point1 by returning its first endpoint; but class Line itself won’t understand this message at all. On the other hand, the Smalltalk class Line understands messages like howMany (how many instances of Line are there?), while this would be a somewhat odd thing to ask a KRL prototype, since it is asking for information about the set of all lines rather than the prototypical line.
There seem to be both advantages and disadvantages for each of these representations for generic concepts, but other language features reduce their importance. ThingLab prototypes can represent a typical instance of a class in the same way that a KRL prototype can represent a stereotypical individual. On the other hand, the KRL meta-description facility gives it the power to talk about all instances of a prototype, just as a Smalltalk class talk about all its instances rather than being a typical instance. KRL does have the significant advantage that an instance may be represented as a further description of several other prototypes. This generality is not cheap.
Turning to the example, we might describe a triangle prototype in KRL as follows:
side1: A Line with
point1 = The point1 from a line thatIs My side3
point2 = the point 1 form a line thatIs My side2
side2: A Line with
point1 = The point2 from a line thatIs My side1
point2 = the point 2 form a line thatIs My side3
side3: A Line with
point1 = The point1 from a line thatIs My side1
point2 = the point 2 form a line thatIs My side2
The triangle is a kind of geometric object, as indicated by the description in its self slot. The part-whole hierarchy used in ThingLab can be represented in KRL using other named slots. Note that the descriptions of the line’s endpoints are redundant. Such redundancy is allowed, even encouraged, in KRL. There are also other ways in which the triangle could be described e.g. by including additional slots for its vertices. The contents of a KRL slot are a set of descriptions of another entity, not necessarily a direct pointer to that entity. This allows partial knowledge about an entity to be represented, which is not generally possible in ThingLab. Given a Triangle unit, units for IsoscelesTriangle and HorizontalTriangle may be defined.
#IsoscelesTriangle 1 1: FurtherSpecified (\Triangle)
Self: A Triangle
A ConstrainedObject with
ConstraintRule = …
#HorizontalTriangle 1 1: FurtherSpecified (\Triangle)
Self: A Triangle
A ConstrainedObject with
ConstraintRule = …
Since there is no constraint mechanism built into KRL, a unit for ConstrainedObject has been posited. Not that these triangles have been described in terms of two other units (c.f. multiple superclasses). As a result of being declared as further specifications of Triangle, they will automatically inherit slots and procedures from that unit.
Finally, IsoscelesHorizontalTriangle can be described using the two preceding definitions.
self: An IsoscelesTriangle
The multiple description facility of KRL provides a convenient way of handling the multiple superclass problem. In this definition, no correspondence between the sides of the IsoscelesTriangle and the HorizontalTriangle has been declared. In this case, the KRL matcher will from this correspondence when the program tries to access one of the sides. However, one could declare the correspondence in advance with suitable additional descriptions. This is in contrast to ThingLab, in which the correspondence is always part of the static instance structure. Again, the flexibility of KRL is expensive, in that a fairly complex reasoning process will be invoked for every access to a part of the horizontal isosceles triangle. In ThingLab, on the other hand, such run-time access is quite cheap.
Other Representations of ThingLab Objects
The current representation scheme is not the only one that has been used in ThingLab. In earlier versions, ThingLab objects were simulated in Smalltalk, rather than being represented directly as Smalltalk objects as at present. In these earlier versions, there was a single Smalltalk class Thing; all ThingLab objects were represented as instance of this class. ThingLab message sending and inheritance were simulated as well. [These versions were written first in Smalltalk-72 and then in Fastalk, neither of which supported subclasses.] The simulated instances were somwhat more complex than Smalltalk instances, in that each had its own sets of constraints, merges, backpointers to owners, an so on. Also, the results of constraint analysis were held in a data structure that had to be interpreted each time the constraints needed to be satisfied. This approach was quite useful, in that it was easy to try new representation schemes. Its speed was glacial.
With the advent of Smalltalk-76, subclassing and a Smalltalk compiler became available. At that time the author decided to try to use the Smalltalk representation directly (including the use of existing Smalltalk classes such as Integer an Point), and also to compile ordinary Smalltalk methods to hold the results of constraints satisfaction. To make this possible, ways had to be found to move all the extra ThingLab information that was stored in the simulated instances into Smalltalk classes instead. This was accomplished with the use of paths, part descriptions, the elimination of backpointers, and so on. Constraints and merges could be associated with an entire class only, rather than a particular instance. [To do otherwise could require that methods be associated with an instance, which is difficult in the current Smalltalk. This restriction is not always what one would like. For example, in the current system one cannot represent the class Bridge in such a way that the connectivity of a particular bridge is a property of the instance alone. Rather, a separate subclass is needed for each kind of bridge structure.]
One result of all this is that the system is much faster – once a method for satisfying the constraints has been planned and compiled, the response time is usually as good as if a suitable method has been hand-coded. [However, the time for planning a new method is still slow.] A much more important result is that this switch has forced the restructuring and refinement of the constraint mechanism so that is can deal with Smalltalk directly, making it realistic to consider organizing an entire programming language around constraints. See Chapter 6, Directions for Future Research