Contents (hide)
 1 Processes and Scheduling
 2 Threads
  2.1 Applications of Concurrency
 3 Abstract Object Models
  3.1 Static properties of the system
   3.1.1 Object identity
   3.1.2 Encapsulation
   3.1.3 Communication between objects
   3.1.4 Connections wrt. communication
 4 Computation Models
  4.1 Sequential mappings
  4.2 Active objects
  4.3 Mixed models - Multithreading
 5 Java Support for Threads
In this lecture, we introduce the notion of concurrent computation.

Processes and Scheduling

Process vs. Program

In the previous lecture, we defined processes as the basic unit of execution managed by Operating Systems. An Operating System, as any RTE, executes processes. The Operating System creates processes from programs, and manages their resources.

Let us first elaborate what is meant by a process resource. To keep running, a process requires at least the following resources:
Register
A register is a small cell of very fast memory located physically inside the CPU. Registers are the fastest type of memory, in contract to the slower RAM main memory, located outside of the CPU.
  • The program (from which the process is created)
  • Two blocks of memory: the stack and the heap (we will explain the difference in Chapter 2)
  • Tables for resources of specific types: File handles, socket handles, IO handles, window handles.
  • Control attributes: Execution state, Process relationship
  • Processor state information: contents of registers, Program Counter (PC), Stack Pointer

The program context is defined as the Program Counter, the Registers and all the memory associated with the process. Practically, the program context contains all the information the OS needs to keep track of the state of a process during its execution.

Scheduling

Operating Systems have the capability to run more than one process simultaneously on a single computer. In most cases, an OS runs more processes than there are physical processors in hardware. The OS is, therefore, responsible to interleave the execution of the active processes – that is, the OS runs a process a little while, interrupts it and keeps track of its context in memory, then runs another process for a little while, and it keeps switching this way from process to process. The component of the OS responsible for sharing the CPU between processes is called the scheduler.

On a single CPU system (called UP, for uni-processor), only a single process may be executing at any given time. On an SMP system (SMP - symmetric multi processor), where there may be several CPUs (or several cores inside a single CPU), a number of processes may be concurrently executed on different CPUs.

To control the scheduling activities, the scheduler maintains a list of all active processes in the system, and at some point decides to switch from one executing process to another. There are two paradigms with which the scheduler may interact with the currently active processes:
  • Non-Preemptive Scheduling
  • Preemptive Scheduling

In non-preemptive systems, the process signals to the operating system (in fact, to the scheduler) that the process is ready to relinquish the CPU. The process does so with a special system call or when the process waits for an external event such as I/O. The scheduler will then select a different process to execute according to a scheduling policy (a method to pick the highest priority process among all processes that would like to run). Some examples for non-preemptive operating systems are Dos, Windows 3.1 and older versions of Mac OS.

Time Slice
The time duration between two consecutive timer interrupts (which is the longest time a process may execute un-interrupted).
In preemptive systems, the scheduler may interrupt the executing process, and select a different process to execute. The interrupt is issued with the help of special timer hardware, which sends an interrupt at regular intervals, suspends the currently executing process and starts executing the scheduler. Another scenario where the scheduler may be executed is when the process passes control to the OS, which happens each time the process invokes a system call. Once the scheduler is running, it selects a new process to execute, according to the scheduling policy. All modern operating systems support preemptive scheduling.

Context Switch

A context switch is the operation that occurs when the OS switches between one executing process and the next one the scheduler chose, and may consume several milliseconds of processing time (that is, a context switch can take the time of several tens of thousands of simple CPU operations). The context switch is transparent to all processes, meaning a process is not able to tell it was preempted.
Cache flushing
One of the most expensive operation when executing a process, is accessing the memory. The reason is that accessing a cell of memory in RAM is much slower than executing a computation step on the CPU. To reduce this cost, the CPU maintains a cache of frequently used memory cells in a small very-fast memory section, located on the CPU. When a memory cell is copied in the CPU cache, it can be read and written as fast as executing a single computation step on the CPU. When there is a context switch, all the memory cells stored in the CPU cache become invalid - since they refer to values from the old process. Therefore, the whole cache must be ``flushed'' (that is, all the cells must be written back to actual RAM and cleared).

Actually, the CPU cache is not flushed, only a part of it which is called the TLB. We will not discuss the TLB during this course, as you will discuss it next year in the Operating System course. More info for interested parties can be found here.
A context switch includes at least the following sub-steps:
  1. Timer interrupt - suspend the currently executing process (the old process) and start the execution of the scheduler
  2. Save the context of the process, so that it may be resumed later
  3. Select the next process to execute (use a scheduling policy)
  4. Retrieve the context of the next process to execute
  5. Restore the state of the new process: restore the registers and the program counter.
  6. Flush the CPU cache (as the new process has a new memory map, it can not use the cache which is filled with the data of the old process)
  7. Resume the new process (start executing the code of the new process from the instruction that was interrupted)

The most costly operation (amortized over the lifetime of a process) is flushing the CPU cache, as it will degrade the performance of the new process. As a result we would like to lower the number of context switches our application requires. To our rescue comes the notion of the Thread.

Threads

Scheduling and threads
As noted above, only one execution unit can be executed by a given CPU at any given time. As a result, different threads in the same process cannot in effect be executed concurrently. We also need to review scheduling in a more fine-grained level; the scheduler selects not only the new process to be executed, but also the thread inside this process as well.
Let us start by trying to design a video player, which should be able to play a compressed video stream.

The video player example

The player should take the following steps in playing the video:

  1. Reading the video from disk
  2. decompressing the video
  3. decode the video and display on the screen

Now, if the video is very small, say several MegaBytes, we have no problem. We can read the entire video into memory, decompress it and then display it on the screen. However, consider what will happen when the video is larger than the actual memory your computer has? and what if the video is downloaded from the Internet, and you want to start watching it before the download completes (e.g., YouTube)?

The Interleaving solution

Consider the following solution: read some of the video data (from disk or the network), decompress it, decode it and display on the screen. Repeat until the video ends. Let us call this the "Interleaving" solution. There is a very fundamental difficulty when trying to implement the Interleaving solution; namely, it is very difficult to program, and is very error prone.

The multi-process solution

So, why not try a different approach? Since the task of playing the movie is easily decomposed into several independent tasks, why not have one process read the movie, another to decompress it, the next one to decode it and the final one which displays the movie on the screen. This makes our life as programmers much more easier, as we do not need to control the interleaving of the tasks by ourselves, as we had to in the Interleaving solution. However, another difficulty arises; how will all of these processes communicate with each other? oh boy.

The multi-threading solution

If we could have only one process, which would be able to do all of these task simultaneously, our life would be much more tolerable.

Context Switch and Threads
When performing a context switch between threads in the same process, not all steps of a regular context switch need to be taken. For example, there is no need to restore any context of the process (but just to switch stacks) and no need to flush the CPU cache, as threads in the same process share their memory
To our rescue comes the concept of Threads. In a single process, several threads my be executing concurrently. They share all the resources allocated to the process, and may communicate with each other using constructs which are built in most modern languages, thus making the programmer happier.

In contrast to different processes, threads inside the same process share the same memory space (however, each thread has its own stack), the same opened files and access rights. Moreover, the cost of a context switch between threads in the same process is much lower than between processes or threads of different processes, as the CPU cache need not be flushed.

To conclude our pet problem of the video player, we can now design a single process with several threads. One thread will read the video stream and place chunks of it in the chunk queue. Another thread will read the video data chunks from the chunk queue, decompress them and place them in the decompressed queue. The next thread takes decompressed chunks from the decompressed queue, decodes them into frames and queues them in the frame queue. The final thread takes frame after frame from the frame queue and displays them on screen. Viola.

Applications of Concurrency

Advantages

Concurrency opens up design possibilities that would be impractical otherwise. Threads liberate you from the basic limitation of invoking a method and blocking, doing nothing, while waiting for reply. There are many reasons for using threads, the most prominent are listed here:

  • Reactive Programing: Some programs are required to do more than one thing at a time, performing each as a reactive response to some input. Our example of the video player falls here. Another large group of programs falls in this category: GUI applications. It seems impossible to program a GUI without concurrent programming.
  • Availability: One of the most common design patterns for programs that are service providers (we will review it thoroughly when discussing networking) is to have one thread as a gateway for incoming service consumers and a different thread for handling the requests of each one. For example, an FTP server will have a gateway thread to handle new clients connecting and a separate thread (per client) to deal with long file transfers.
  • Controllability: A thread can be suspended, resumed or stopped by another thread. This gives the designer and programmer a freedom that he would not be able to gain otherwise.
  • Simplified design: Software objects usually model real objects. In real life objects in many cases are acting independently and in parallel. Even if not modeling real objects, designing autonomous behavior is in many cases easier than designing sequential (interleaving between objects) one. For example recall the video player discussion.
  • Simplified implementation: With threads it is simple to implement cases where a process needs to wait for external blocking event but perform some other vital processing in the meanwhile.
  • Parallelization: On multiprocessor machines, the operating system can execute threads truly concurrently with one another by allowing one thread to run on each CPU. Even if only one CPU exists, interleaving the execution paths avoid delays when it comes to part of the program that is busy doing some heavy computation or waiting for events.
  • Even if you choose not to use threads some services provided by modern RTE operate in a concurrent manner, hence leaving you no choice rather than mastering concurrent programing.

Limitations

Benefits of concurrency should be weighed against its cost in resource consumption, efficiency and program complexity:

  • Safety: when multiple threads share resources (this is the most common case) they need to use some synchronizations mechanisms to ensure that they maintain consistent state (we will elaborate on that in the next lecture). Failing to do so may lead to random looking, hard to debug inconsistencies.
  • Liveness: In concurrent programming a thread may fail to be alive - an activity performed by a thread may stop for many reasons we will explore later.
  • Non determinism: No two executions of a concurrent program need be identical. This makes multi-threaded program harder to understand, predict and debug.
  • Context switching overhead: when a job performed in a thread is small enough the overhead of thread creation and context switching overrides the benefits.
  • Request/Reply programing: Threads are not a good choice when an object actually needs to wait for a reply from another object in order to continue. In such cases synchronizing the activities costs extra time and introduce more complexity.
  • Synchronization overhead: even with good design where multi-threading is necessary , synchronization can not be avoided. Synchronization constructs consume execution time and add complexity.
  • In some (rare) cases where activity is self contained and sufficiently heavy it may be easier to encapsulate it in a different standalone program. A standalone program can still be accessed via system level services.

Abstract Object Models

In order to discuss programming issues related to threads, we introduce a formal model of concurrent objects. This model can be implemented in different manners in different RTEs or programming languages. We first discuss the abstract model, then we study its implementation in Java.

Static properties of the system

Object Messages
In Java, sending a Message to an object is just calling one of the object's public methods.
The structure of each object is described (normally via a class) in terms of internal attributes (state), connections to other objects, local (internal) methods, and methods for accepting messages from other objects (interface).

Object identity

New objects can be constructed at any time (subject to system resource constraints) by any other object (subject to access control). Once constructed, each object maintains a unique identity that persists over its lifetime.

Encapsulation

Changing internal state in Java
Assuming all internal members of the object are declared private (as is proper), only the object itself may alter their state. This can happen, for instance, when the object receives a message (e.g., a public method has been invoked).
Objects have separation between their inside and outside parts. The internal state can be directly modified only by the object itself.

Communication between objects

Objects communicate only via message passing. Objects issue messages that trigger actions in other objects. The forms of these messages may range from simple procedural calls to those transported via arbitrary communication protocols.

Connections wrt. communication

One object can send messages to others if it knows their identities. Some models rely on the identities of the communication channel rather than or in addition to object identities. Two objects that share a communication channel may pass messages to each other through that channel without knowing each other's identity.

Objects can perform only the following operations:

  • Accept a message
  • Update their internal state
  • Send a message
  • Create a new object

Computation Models

We now discuss three different computational models for our abstract object model.

Sequential mappings

An ordinary general-purpose computer can be exploited so that this computer can pretend it is any object. This can be done by loading a description of the corresponding class into a virtual machine (VM). The VM can then construct a passive representation of an instance and then interpret the associated operations. It also extends to programs involving many objects of different classes, each loaded and instantiated as needed, by having the VM at all times record the identity (this) of the object it is currently simulating. In other words, the VM is itself an object, although a very special one that can pretend it is any other object.

seqcm.png

We know this model from single threaded applications.

Active objects

In the active objects model, every object is autonomous and possibly as powerful as a sequential VM. Such a model is natural when different objects may reside on different machines, hence all message passing is performed via remote communication. This model also serves as an object-oriented view of most operating-system-level processes. Each process is independent of, and shares as few as possible resources with other processes.

activeobjects.png

A popular programing paradigm, so called agents, is an example of this model. more

Mixed models - Multithreading

This model fall between the two extremes of passive and active models. A concurrent VM may be composed of multiple threads of execution.

  • Each thread acts in about the same way as a single sequential VM, but unlike pure active objects - all threads share access to the same set of passive representations.
  • This model can simulate the first two models, but not vice versa.

Thread-based concurrent object oriented models conceptually separate "normal" passive objects from active objects (threads). However, the passive objects typically show thread-awareness not seen in sequential programming, e.g., by protecting themselves via locks.

mixedcm.png

Java Support for Threads

Java (in fact, the JVM) implements the mixed object model. Until now, you have been using only passive object; consider the following class, MessagePrinter, which has a simple method run(). When this method is invoked, the object prints a message which it received in its constructor.

  1. /* this class abstracts a an objects which receives a message and prints it.
  2.    The Runnable interface requires a single void method, run()*/
  3. class MessagePrinter implements Runnable {
  4.  
  5.   /* the message the object received */
  6.   protected String msg_;    
  7.   /* where to print to */
  8.   protected PrintStream out_;  // The place to print
  9.  
  10.   /* the constructor.
  11.    * @param out the stream to which we print 
  12.    * @param msg the message
  13.    */
  14.   MessagePrinter(PrintStream out,String msg) {
  15.     out_ =  out;
  16.     msg_ = msg
  17.   }
  18.  
  19.   /* print the message */
  20.   public void run() { 
  21.     out_.print(msg_)// display the message 
  22.   }
  23. }
You would usually use this class in the following fashion:
  1. /* a simple class, which instantiates a passive MessagePrinter object */
  2. class SequentialPrinter {
  3.  
  4.   /* the main() function, the point of entry */
  5.   public static void main(String[] args) {
  6.     MessagePrinter mpHello = new MessagePrinter(System.out"Hello\n");
  7.     MessagePrinter mpGoodbye = new MessagePrinter(System.out"Goodbye\n");
  8.  
  9.     mpHello.run()
  10.     mpGoodbye.run();
  11.   }
  12. }
In this example, the JVM first loads the SequentialPrinter class and calls the main() function. The JVM now impersonates the class SequentialPrinter. Next, two new objects are created, both of the MessagePrinter class and each of them receives a unique id (not the names of the variables in the main() method!!). Next, the MessagePrinter object reachable via mpHello is sent a message to run(). Last, the MessagePrinter object reachable via mpGoodbye is sent a message also.

The result is two words printed to the screen, one after the other. First, "Hello" is printed, then, "Goodbye".

Let us now employ the full power the JVM provides us by implementing a mixed mode object model; consider a Multithreaded version, using a different thread for each of our MessagePrinters:

  1. class ConcurrentPrinter {
  2.  
  3.   public static void main(String[] args) {
  4.  
  5.     MessagePrinter mpHello = new MessagePrinter(System.out"Hello\n");
  6.     MessagePrinter mpGoodbye = new MessagePrinter(System.out"Goodbye\n")
  7.  
  8.     Thread tHello = new Thread(mpHello);
  9.     Thread tGoodbye = new Thread(mpGoodbye);
  10.  
  11.     tHello.start()
  12.     tGoodbye.start();
  13.   }
  14. }
In this example, the JVM first loades the ConcurrentPrinter class and calls the main() function. The JVM now impersonates the class ConcurrentPrinter. Next, two new objects are created, both of the MessagePrinter class and each of them receives a unique id. Next, two new special, active objects are instantiated (threads…). Each thread receives as an argument a Runnable object, r. When each thread is given the message to start, it will then send a message to r. Namely, each thread impersonates r and invokes r's run() method. In our example, the last thing the JVM does as the ConcurrentPrinter class is send both of the threads the start() message.

Now, since we have two separate threads, each invoking the run() method of a different MessagePrinter object (and possibly at different times), we cannot know ahead of time which message will be printed first.