What is abstraction?

This article discusses abstraction within the field of computer science.

A layer of abstraction is a point where indirection can occur. Behavior does not need to be specified up front in specific detail. If A() relies on the result of B(), and B() has no implementation, then B() is abstract.

In this case, the behavior of A() depends on the behavior of B(), which has not been specified yet. For A() to provide any functionality at all, B() must eventually be defined or provided in some way. Whether this indirection is resolved at compile time, link time or run time is irrelevant at this point.

// For example, A and B could look like this:
int B (int); // implementation not provided here
 
int A (int n)
{
  return B(n) * B(n-1);
}
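
As one possible illustration of run-time resolution, the sketch below passes B into A as a parameter instead of declaring it separately. The use of std::function, the alias IntFunction and the function square are assumptions made for this example; they are not part of the original A and B.

```cpp
#include <functional>

// Hypothetical alias: any callable taking an int and returning an int can act as B.
using IntFunction = std::function<int(int)>;

// A no longer names a specific B; the caller supplies one at run time.
int A(const IntFunction& B, int n)
{
  return B(n) * B(n - 1);
}

// One possible B, chosen purely for illustration.
int square(int x)
{
  return x * x;
}
```

Calling A(square, 3) evaluates square(3) * square(2). Passing a different callable changes the behavior of A without changing its definition.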

Why is abstraction important?

Abstraction is important because it allows functionality to be defined once, and used in conjunction with a variety of different data types. For example:

int a, b = 1, c = 2;
float d, e = 1, f = 2;
 
a = b + c;
d = e + f;

In this case the = and + operators are used the same way, even though the data types are different. The results might be different, but each operator still performs the same useful function (assignment and addition respectively). In fact, the compiler and language provide this abstraction to us at compile time: the compiler keeps track of types and functions (i.e. operators), and can then generate the appropriate integer or floating point instructions for the operations specified.

This is in contrast to something which is less abstract, for example:

void assignInt(int & result, int value);
int addInts(int lhs, int rhs);
 
int a, b, c;
assignInt(b, 1);
assignInt(c, 2);
 
assignInt(a, addInts(b, c));

This is more like what compiled code might look like, since CPUs have instructions that explicitly deal with particular data types. The important concept is that we can provide similar functionality with a consistent interface across different data types. This is why abstraction is important: it makes things more convenient, and we don't need to understand every detail in order to manipulate data structures.
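
The same consistent interface can be extended to user-defined types. The following is a minimal sketch, assuming a hypothetical Vec2 type that does not appear in the examples above; it overloads + so that the familiar a = b + c form keeps working.

```cpp
// Hypothetical two-component vector, introduced only for this example.
struct Vec2 {
  float x;
  float y;
};

// Overloading + lets Vec2 use the same syntax as int and float.
Vec2 operator+(const Vec2& lhs, const Vec2& rhs)
{
  return Vec2{lhs.x + rhs.x, lhs.y + rhs.y};
}
```

With this in place, code such as Vec2 a = b + c; reads exactly like the int and float versions, and the compiler selects the Vec2 implementation of + from the operand types.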

Different kinds of abstraction

In the above examples, we considered abstractions that are resolved by the compiler. This is where we use a type system to figure out what actual operations need to be performed at compile time.

int process(int, float);
int process(std::string);
 
//...
process("Apples and Oranges");

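The compiler can resolve this call because only one of the declarations, int process(std::string), can accept a single argument: the string literal is implicitly converted to std::string. The other overload requires two arguments and is therefore not a candidate.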
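
To make the selection observable, the sketch below gives both overloads a body. The return values 1 and 2 are arbitrary markers assumed for this example, used only to show which overload the compiler picked.

```cpp
#include <string>

// Two overloads sharing one name. The bodies are placeholders:
// each returns a marker identifying which overload was called.
int process(int, float)
{
  return 1;
}

int process(std::string)
{
  return 2;
}
```

Here process("Apples and Oranges") selects the std::string overload, while process(1, 2.0f) selects the (int, float) overload. Both decisions are made at compile time from the argument types alone.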

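class Animal {
public:
  // Virtual destructor so derived objects can be destroyed through an Animal pointer.
  virtual ~Animal() = default;
  // Pure virtual: every concrete animal must provide its own implementation.
  virtual std::string speciesName () = 0;
};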

At run time, we may have an object whose class is derived from Animal. Even though we don't know exactly what kind of object it is, because we know it is an Animal we can ask it what species it is. This is another form of abstraction, and is resolved at run time.
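
A minimal sketch of this run-time resolution follows. The derived classes Dog and Cat and the helper describe are hypothetical and were added for illustration; the Animal class is repeated so the example is self-contained.

```cpp
#include <string>

class Animal {
public:
  virtual ~Animal() = default;
  virtual std::string speciesName() = 0;
};

// Hypothetical derived classes, added for illustration.
class Dog : public Animal {
public:
  std::string speciesName() override { return "Canis familiaris"; }
};

class Cat : public Animal {
public:
  std::string speciesName() override { return "Felis catus"; }
};

// Only knows about Animal; the call to speciesName is dispatched at run time.
std::string describe(Animal& animal)
{
  return "This animal is a " + animal.speciesName();
}
```

The function describe never mentions Dog or Cat, yet it returns a different result for each, because the implementation of speciesName is chosen from the actual type of the object passed in.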

Abstraction is everywhere

At this point, it may be good to step back a moment and look at compilers. Generally, one could consider a compiler a layer of indirection, in that given a sequence of commands, it can produce an executable binary for different CPUs. So, given the same source code, we can produce an executable that runs on completely different underlying hardware, and still produces similar or (more typically) identical results. This is also facilitated by the operating system, which provides abstract system calls for dealing with files, threads, networking, etc. The general theory is that as long as the operating system provides the required interfaces, your application should behave the same way as it does on another operating system.

Polymorphism

Polymorphism (poly = many, morph = form) is a name used in computer science to mean one thing can have multiple meanings depending on the context in which it is used. In other words, polymorphism is the idea that there can be one interface provided to control a general set of functionality. Polymorphism is a specific set of cases where abstract concepts are used to make programming more concise and expressive.

Run Time Polymorphism, or object oriented polymorphism, is used at run time to provide dynamic functionality based on object type. This is done using dynamic dispatch (virtual functions) and inheritance. Wikipedia: Polymorphism in object-oriented programming.

Compile Time Polymorphism is based on types and allows for functionality to be determined at compile time. In C++ this takes two forms: ad hoc polymorphism, which is done using function and operator overloading, and parametric polymorphism, which is done using templates. Wikipedia: Type Polymorphism.
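
C++ templates are a further compile-time mechanism: one definition is written against a type parameter, and the compiler instantiates it for each type it is used with. A minimal sketch, where the function name largest is a hypothetical chosen for this example:

```cpp
#include <string>

// A single definition, written once, for any type T that supports operator<.
template <typename T>
T largest(const T& a, const T& b)
{
  return (a < b) ? b : a;
}
```

The same definition serves int, double and std::string arguments. Unlike overloading, where each type needs its own function body, there is only one body here, and the type is filled in by the compiler.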