SECTION 2: TYPING

There are many definitions of type (and class and related concepts). Many authors define the terms as applied by their particular approach or language, however we shall proceed in the face of this diversity.

References:

[Blair 89] Some Typing Topics.
[Booch 91] Small Section on Typing.
[Cardelli 85] Discussion on Object-Oriented Typing.
[Gunter 94] Theoretical Aspects of Object-Oriented Programming.
[Kim 89, ch1] Discussion on Some Research Topics.

2.1) What Is Polymorphism?

Polymorphism is a ubiquitous concept in object-oriented programming and is defined in many ways, so many definitions are presented from: Websters', Author, Strachey, Cardelli and Wegner, Booch, Meyer, Stroustrup, and Rumbaugh. Polymorphism is often considered the most powerful facility of an OOPL.

Webster's New World Dictionary

Polymorphism 1. State or condition of being polymorphous. 2. Cryall. crystallization into 2 or more chemically identical but crystallographically distinct forms. 3. Zool., Bot. existence of an animal or plant in several forms or color varieties.

polymorphous adj. having, assuming, or passing through many or various forms, stages, or the like. Also, polymorphic. [Gk polymorphous multiform]

Author's Definition

Polymorphism is the ability of an object (or reference) to assume (be replaced by) or become many different forms of object. Inheritance (or delegation) specifies slightly different or additional structure or behavior for an object, and these more specific or additional attributes of an object of a base class (or type) when assuming or becoming an object of a derived class characterizes object-oriented polymorphism. This is a special case of parametric polymorphism, which allows an object (or reference) to assume or become any object (possibly satisfying some implicit or explicit type constraints (parametric type), or a common structure), with this common structure being provided by base classes or types (subclass and subtype polymorphism, respectively).

"Poly" means "many" and "morph" means "form". The homograph polymorphism has many uses in the sciences, all referring to objects that can take on or assume many different forms. Computer Science refers to Strachey's original definitions of polymorphism, as divided into two major forms, parametric and ad-hoc. Cardelli and Wegner followup with another classification scheme, adding inclusion polymorphism for subtyping and inheritance.

Strachey's Original Definition [Strachey 67]:

"Parametric polymorphism is obtained when a function works uniformly on a range of types; these types normally exhibit some common structure. Ad-hoc polymorphism is obtained when a function works, or appears to work, on several different types (which may not exhibit a common structure) and may behave in unrelated ways for each type."

Parametric polymorphism is also referred to as "true" polymorphism, whereas ad-hoc polymorphism isn't (apparent polymorphism).

Cardelli and Wegner's Definition [Cardelli 85]

C+W refine Strachey's definition by adding "inclusion polymorphism" to model subtypes and subclasses (inheritance). Strachey's parametric polymorphism is divided into parametric and inclusion polymorphism, which are closely related, but separated to draw a clear distinction between the two forms, which are then joined as specializations of the new "Universal" polymorphism.
                                 |-- parametric
                 |-- universal --|
                 |               |-- inclusion
  polymorphism --|
                 |               |-- overloading
                 |-- ad hoc    --|
                                 |-- coercion

Polymorphic Languages: some values and variables may have more than one type.

Polymorphic Functions: functions whose operands (actual parameters) can have more than one type. [...] If we consider a generic function to be a value, it has many functional types and is therefore polymorphic.

Polymorphic Types: types whose operations are applicable to operands of more than one type.

Parametric Polymorphism: a polymorphic function has an implicit or explicit type parameter which determines the type of the argument for each application of that function.

Inclusion Polymorphism: an object can be viewed as belonging to many different classes that need not be disjoint; that is, there may be inclusion of classes.

The two forms of "Universal Polymorphism", parametric and inclusion are closely related, but are distinct enough in implementation to justify separate classifications.

Parametric polymorphism is referred to as generics. Generics can be syntactic, where each instantiation creates a specialized version of the code allowing fast running execution, but in a "true polymorphic system", only a single implementation is used.

On inheritance is subtype polymorphism:

"Subtyping on record types corresponds to the concept of inheritance (subclass) in languages, especially if records are allowed to have functional components."

Author's Notes

Implicit parametric polymorphism can be implemented with type inferencing schemes [Aho 85]. ML is prototypical in providing this facility.

Inclusion polymorphism is common and is found in languages such as Simula, Ada95, C++, CLOS, Eiffel and etc. (subclass polymorphism). Smalltalk also uses inclusion polymorphism; its used in declaring classes, and subclass polymorphism is used in practice but not enforced. For inheritance, inclusion polymorphism specifies an instance of a subclass can appear wherever an instance of a superclass is required. For subtyping (subtype polymorphism), the same applies because all operations required by the supertype are present in the subtype (subtype is subset of supertype). Cardelli and Wegner view classes as sets of objects (resulting in subtype objects are a subset of supertype objects, or an extensional view), as contrasted with a feature based (intensional) approach (where subtypes are supersets of (contain) supertypes). MI provides an interesting example here, as it is set intersection with an extensional view and set union with an intensional view. Details are left as an exercise for the reader.

Ada generics and C++ templates provide explicit syntactic generics. While Ada may infer some actual generic parameters (operations) and C++ doesn't require explicit instantiation of its template functions, formal generic parameters must still be declared and many bodies are generated.

Inclusion polymorphism can refer to subtyping, or having at least as much or more than required. Since derived classes can inherit structure and behavior from base classes, such inheritance is an example of inclusion polymorphism with respect to representation (subclassing). An example of inclusion polymorphism with respect to assignment (and initialization, or replacement if viewed in an almost symbolic way) occurs when object types may be specified and assignment is based on actual object membership in that type (often of the CLOS is-a-member-of form in OO). Emerald provides another example of an object- oriented language using inclusion polymorphism with respect to replacement; however, inclusion is with respect to subtyping only with abstract types ("bounded quantification" by C+W. C+W's parameters are subtype polymorphic but lose the inherent type). Any object possessing all required operations is acceptable and no inheritance relation is required (subtype polymorphism). They refer to this as "best-fitting" types [Black 86]. The original Trellis/ Owl also had such a facility but with two separate inheritance hierarchies, although it was abandoned in favor of a single class-based approach for simplicity. See also section 2.7.

[As inclusion polymorphism covers both subtype and subclass polymorphism, perhaps IP could be further divided in C+W's above classification.]

Booch's Definition [Booch 91, p. 517]

polymorphism: A concept in type theory, according to which a name (such as a variable declaration) may denote objects of many different classes that are related by some common superclass; thus, any object denoted by this name is able to respond to some common set of operations in different ways.

Booch also has several sections devoted to polymorphism.

[The author notes Booch's definition above is clearly in the context of conventional, classical OO and subclass polymorphism.]

Meyer's Definition [Meyer 88, sect. 10.1.5 Polymorphism]

"Polymorphism" means the ability to take several forms. In object-oriented programming, this refers to the ability of an entity to refer at run-time to instances of various classes. In a typed environment such as Eiffel, this is constrained by inheritance: ...

[The Author notes Meyer has a following section 10.1.7 on Static Type, dynamic type, which is relevant, but claims "... there is no way the type of an object can ever change. Only a reference can be polymorphic: ...". Meyer is clear between the concept and the Eiffel realization in his polymorphism definition above, but here neglects the "becomes" facility as found in several dynamically typed OO languages such as Actors, CLOS, Self and Smalltalk, which allows an object (and not just a reference) to change its class.]

Stroustrup's Definition [Stroustrup 90, p. 209]

The use of derived classes and virtual functions is often called "object- oriented programming". Furthermore, the ability to call a variety of functions using exactly the same interface - as is provided by virtual functions - is sometimes called "polymorphism".

[The Author notes this is a functional view of polymorphism (as provided in C++). [Stroustrup 91, p. 136] has an example of polymorphism with void *'s, but a newer template function is incomparably preferable, as implied in [Stroustrup 90, ch 14]]

Rumbaugh's Definition [Rumbaugh 91, p. 2]

"Polymorphism" means that the same operation may behave differently on different classes.

2.2) What Does Polymorphism Boil Down To In OO Programming Languages?

In C++, virtual functions provide polymorphism. This is because a polymorphic object (pointer or reference (or such parameter)) is assignment compatible with any object of a derived class. Is this polymorphism in itself? Objects can take on objects of different forms (the derived classes), but of what use is it? To make any difference, the differing forms must have some effect. In dynamically typed languages, polymorphic objects are passed messages and will respond in whatever way the object has defined (usually starting from its most derived class and working its way up). But for static objects, a virtual function is invoked. This is the stored method from the derived class that overrode the virtual method from its base class, providing specialized behavior for the polymorphic object; and hence, polymorphism. This common pure statically typed example is, of course, an example of inclusion polymorphism, subclass polymorphism to be more specific (see section 2.1). Pure statically typed subtype polymorphism, as provided in Emerald, can be implemented similarly [Black 86].

2.3) What Is Dynamic Binding?

Dynamic binding has two forms, static and dynamic. Statically-typed dynamic binding is found in languages such as C++ (virtual functions) and Eiffel (redefinition). It is not known which function will be called for a virtual function at run-time because a derived class may override the function, in which case the overriding function must be called. Statically determining all possibilities of usage is undecidable. When the complete program is compiled, all such functions are resolved (statically) for actual objects. Formal object usage must have a consistent way of accessing these functions, as achieved thru vtables of function pointers in the actual objects (C++) or equivalent, providing statically-typed dynamic binding (this is really just defining simple function pointers with static typechecking in the base class, and filling them in in the derived class, along with offsets to reset the receiver).

The run-time selection of methods is another case of dynamic binding, meaning lookup is performed (bound) at run-time (dynamically). This is often desired and even required in many applications including databases, distributed programming and user interaction (e.g. GUIs). Examples can be found in [Garfinkel 93, p80] and [Cox 91, pp 64-67]. To extend Garfinkels example with multiple-polymorphism, a cut operation in an Edit submenu may pass the cut operation (along with parameters) to any object on the desktop, each of which handles the message in its own way (OO). If an (application) object can cut many kinds of objects such as text and graphical objects, multiple-polymorphism comes into play, as many overloaded cut methods, one per type of object to be cut, are available in the receiving object, the particular method being selected based on the actual type of object being cut (which in the GUI case is not available until run-time).

Again, various optimizations exist for dynamic lookup to increase efficiency (such as found in [Agrawal 91] and [Chambers 92]).

Dynamic binding allows new objects and code to be interfaced with or added to a system without affecting existing code and eliminates switch statements. This removes the spread of knowledge of specific classes throughout a system, as each object knows what operation to support. It also allows a reduction in program complexity by replacing a nested construct (switch statement) with a simple call. It also allows small packages of behavior, improving coherence and loose coupling. Another benefit is that code complexity increases not linearly but exponentially with lines of code, so that packaging code into methods reduces program complexity considerably, even further that removing the nested switch statement! [Martin 92] covers some of these issues.

2.4) Is There A Difference Between Being A Member Or Instance Of A Class?

Yes (but be careful of context). To use C++ terminology, an object (not a reference) is defined to be an instance of exactly one class (in classical OO), called its most derived class. An object not directly contained in any other is called the complete object [Stroustrup 90]. An object is a member of several classes, including all of the classes its declared (or most derived) class inherits from. With static typing and inclusion polymorphism based on class, if a polymorphic object (or reference) is made to refer to an object, that object must be a member of the polymorphic object's class.

This also provides a good example of differing definitions among object- oriented languages, since a member is defined as above in CLOS, but a member of a class is one of its instance variables in C++.

2.5) What Is The Difference Between Static And Dynamic Typing?

Static typing refers to types declared in a program at compile-time, so no type information is available on objects at run-time. Dynamic typing uses the inherent types of polymorphic objects, keeping track of the types of objects at run-time. Statically typed dynamic binding is a compromise (usually implemented with tables of function pointers and offsets), and is how statically-typed OO languages provide polymorphism. Some approaches provide both static and dynamic typing, sometimes with static typing providing type- safe programs and dynamic typing providing multiple-polymorphism [Agrawal 91], [Mugridge 91]. See also section 2.3.

Static typing is more efficient and reliable, but loses power. Typical restrictions include only allowing a common set of base class functions (or any common functions for the more general subtyping or parametric polymorphic cases) to be available on formal objects and a lack of multiple-polymorphism (see section 1.19), both of which are overcome with dynamic typing.

Many languages provide dynamic typing: Smalltalk, Self, Objective-C, and etc. A limited dynamic typing scheme, called RTTI (Run Time Type Identification), is even being considered for the C++ standard. A similar facility to safe downcasting (historically known as type narrowing), the thrust of RTTI, can also be found in recent versions of Eiffel.

See section 3.4 for a categorization of common OO languages by type system.

2.6) What Is This I Hear About ML And Functional Programming Languages?

ML, Metalanguage, is a functional programming language with a strongly typed polymorphic type system [Wikstrom 87]. Russell (see Appendix E) is a more recent functional language and Haskell [Hudak 92] provides a more modern and "pure" example. Section 2.5 discusses why static typing has less power/ flexibility than dynamic typing and the same applies to ML (although see the appendixes for an experimental dynamic extension to ML, Alcool-90 and [Cardelli 85] for a proper placement of ML's type system). ML doesn't use inheritance for polymorphism; unlike OO languages, but provides the prototypical example of parametric polymorphism, so no inheritance is required. This is "true" or "pure" statically (or strongly) checked parametric polymorphism, by Strachey's (and Cardelli and Wegner's) definitions.

Smalltalk is an example of a dynamically-typed language which does not check types during assignment (and hence for parameters) and therefore provides parametric polymorphism without static constraints (by Strachey's definition). However, Smalltalk's style uses inclusion polymorphism in practise and inheritance for subclassing (representation).

2.7) What Is A Separation Between Type And Class (Representation)?

For a short answer:
Subtype Polymorphism, as opposed to Subclass Polymorphism, is the best answer in OO. Parametric polymorphism is a related concept where this is also true, but is of a different flavor (and usually requires object attributes by use. See also section 2.1).
A type can be considered a set of values and a set of operations on those values. This can insure type-safe programming. However, the representation of types (classes in OO) can be separated from the notion of type allowing many representations per type while still maintaining reasonable type-safety.

In many languages, a type has a single representation insuring all operations performed on that type are well defined (statically bound) and providing for efficiency by taking advantage of that representation wherever used. In many OO languages, subclassing and dynamic binding provides for greater flexibility by providing object specialization. However, in many OO languages classes are used for assignment compatibility forcing an assigned object to inherit (transitively) from any polymorphic object's class (inclusion polymorphism based on class, or subclass polymorphism). This insures all operations to be performed on any polymorphic object are satisfied by any replacing objects. This also insures all types share a common representation, or at least a common base interface specification.

The Java interface provides prototypical interface inheritance, which is similar to C++'s abstract classes, but does not allow the definition of any method (only method specifications, or abstract methods) and any data must be static and final, or constant and non-overridable. Java interfaces are types, from which classes inherit to belong to that type. Java classes must provide representation for interface methods.

By separating type from class, or representation (or perhaps separating class from type, by the aforementioned definition of type), a replacing object must satisfy the operations or type constraints of a polymorphic object (subtype polymorphism) but are not required to do to do so by an inheritance relation (subclass polymorphism), as is typical in most OOPLs. Java requires an object of a class being assigned to a variable of an interface type to have that class inherit from that interface, providing linkage between type and class. Dropping this restriction, or linkage, is somewhat less type-safe, because accidental matches of method signatures can occur, calling for greater care in use. [Black 86] discusses this issue in Emerald. The same issue arises in parametric polymorphism (generics/templates), as any method matching a required signature is accepted, calling for careful matching of actual and formal generic parameters. The difference between static and dynamic binding in OO and dynamic binding and subtyping seems similar. A possible loss of semantic integrity/similarity is contrasted with greater power.

It is possible to specify desired abstract properties of type specifications with mechanisms similar to Eiffel's pre-, post-, and invariant conditions. This helps to insure the semantic integrity of replacing objects and their behavior. [Liskov 93] provides a recent exposition.

Abstract classes ([Stroustrup 91] and [Meyer 88]) in typing provide a facility similar to subtype polymorphism; however, ACs require type compatible classes to inherit from them, providing a subclass polymorphism facility, and ACs can also specify representation. Subtyping is therefore most useful to avoid spreading knowledge of classes throughout a system, which is a high priority for loosely coupled modules and in distributed programming [Black 87].

The formal type system found in [Cardelli 85], Emerald/Jade [Black 86] and [Raj 89], original trellis/Owl, an experimental C++ extension (See Appendix E, Signatures), Sather (originally Eiffel-based), and an Eiffel superset [Jones 92] are all examples of OO systems providing subtype polymorphism. Functional languages such as ML, Russell, and Haskell provide a separation with pure parametric polymorphism (as is also commonly found in OO languages in addition to inclusion polymorphism). See also [Cook 90], "Inheritance Is Not Subtyping", for a formal approach.

2.8) What Are Generics And Templates?

For a short Answer:
Parametric Polymorphism (although various implementations provide various subsets).
Generics (or Templates in C++) refer to the ability to parameterize types and functions with types. This is useful for parameterized classes and polymorphic functions as found in languages such as Ada, C++, Eiffel, and etc., although these are "syntactic" or restricted forms [Cardelli 85]. Generics are orthogonal to inheritance, since types (and classes) may be generically parameterized. Generics provide for reusability in programming languages. An example is a Stack with a generically parameterized base type. This allows a single Stack class to provide many instantiations such as a Stack of ints, a Stack of any fundamental or user defined type, or even a Stack of Stacks of ... Another example is a polymorphic sort function taking a base type with a comparison operator. The function can be called with any type (containing a comparison operator). See [Booch 87b] for several examples in Ada and [Stroustrup xx] and [Murray 93] for examples in C++

While generics have many advantages, typical limitations include a static nature, which is an advantage for strong typechecking but a potential disadvantage when causing dynamic compilation (leading to a time/space efficiency tradeoff), and sources can cause inlining and create source code dependencies and expand code size (unlike a single-body or "true" parametrically polymorphic implementation. Generics can also be viewed as a special case of type variables.

Functions are typically generic in statically-typed parametrically-polymorphic languages. One such popular functional language is ML, in which all functions are generic. Russell and Haskel are more modern variants (references are forthcoming, however see APPENDIX E).