Labs 11-12: Command Interpreters
Motivation
Perhaps the most important system program is the command interpreter,
that is, the program that gets user commands and executes them.
The command interpreter is thus the major interface between the user and
the operating system services. There are two main types of command interpreter:
- Text-based command interpreter: receives the user commands in text form
and executes them. This type is the command-line interpreter (also called
a shell in UNIX-like systems).
- Menu-based interpreter: the user issues commands by selecting them from
menus displayed by the interpreter. At the most basic level, menus are
text driven. At the most extreme end, everything uses nifty graphical
displays, with sexy icons, etc., such as the Windows or KDE command
interpreters with which you are most familiar. (Note that, in fact,
early versions of the Microsoft Windows operating system were not really
operating systems, but just graphical command interpreters with a lot of
sales hype. In fact, most unsophisticated users today are still confused
between the actual operating system, and the command interpreter
front end!).
Lab Goals
In this sequence of labs, you will be implementing a simple shell (command-line
interpreter). Like traditional UNIX shells, your shell program will
also be a user level process (just like all your programs
to-date), that will rely very heavily on operating system services
to do its job, which is:
- To receive commands from the user
- Interpret the commands, and use the operating system to help start up
the programs and processes requested by the user.
- Manage those processes (run them in the background, kill them, etc.),
again using operating system services.
The complicated tasks of actually starting up the processes, mapping their
memory, files, etc. are strictly the responsibility of the operating
system, and as such you will study these issues in the Operating Systems course.
But telling the operating system which processes to run, which files
to set up as standard I/O, etc. is the topic of the next two
lab sessions.
Starting and maintaining a process involves many technicalities
(all kinds of structures, loading, etc.), and like any other command
interpreter we will get assistance from system calls, such
as the exec family, fork, and wait (in other words: see
man on how to use these system calls).
How are we going to study the topics? You are going to write a mini-shell:
your program will invoke (with much help from the OS) another program,
giving the invoked program its parameters, waiting (or not as determined
by the user) for it to
complete its task, managing its input/output, and so on. A complete
"industrial grade" shell
provides a few additional services, but the few we'll implement will
demonstrate the idea nicely. Later on, in your free time,
you may wish to add features, or even graphical menus with Windows-like
icons (taking care not to make them too similar, so as not to get sued
by Microsoft).
Lab tasks
First week: tasks 1-6; second week: tasks 7-8.
First, download a little file and its header with some parsing
functions in it. When writing your program, your compilation line should
look something like this:
gcc -o myprog myprog.c parse.c
Task 1
-
Write a (very) basic shell: a program that loops infinitely,
and at each pass in the loop does the following:
- Displays a prompt. Let it be composed of whatever you want, as long as it
includes the current working directory (use getcwd).
- Reads a single input line from the keyboard. Your shell
assumes that the words in the input do not contain any special characters
(which will be discussed shortly), that the first word is a program
name, and that the other words are parameters. Use fgets to read.
- Invokes the other program (the one whose name is the first
word). For simplicity, use the execv system call.
- Checks the return code from execv and prints an error message
in case of a failure (no such file, permissions, etc.). Use perror
for the error message.
Place a little printf command before, and another one after,
the invocation. What happens and when? Test your program by using
it to run the test program, and another program, such as
/bin/ls. What happens if you type ls? Why?
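One pass of the loop described above might be sketched as follows. This is only an illustration, not the intended solution: split_words and shell_pass are hypothetical names, and split_words is a stand-in tokenizer that you would probably replace with the functions from the downloaded parsing file.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_ARGS 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper (not taken from parse.c): split line in place on
   whitespace. Fills words[] with pointers into line, terminates the
   array with NULL, and returns the number of words found. */
int split_words(char *line, char *words[], int max_words)
{
    int count = 0;
    char *word = strtok(line, " \t\n");
    while (word != NULL && count < max_words - 1) {
        words[count++] = word;
        word = strtok(NULL, " \t\n");
    }
    words[count] = NULL;
    return count;
}

/* One pass of the Task 1 loop: prompt, read, invoke.
   Returns 0 when there is no more input, 1 otherwise. */
int shell_pass(FILE *in)
{
    char cwd[4096];
    char line[1024];
    char *words[MAX_ARGS];

    if (getcwd(cwd, sizeof cwd) != NULL)
        printf("%s> ", cwd);
    fflush(stdout);

    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    if (split_words(line, words, MAX_ARGS) == 0)
        return 1;                     /* nothing was typed */

    printf("before execv\n");
    execv(words[0], words);           /* comes back only if it failed */
    perror(words[0]);
    printf("after execv\n");
    return 1;
}
```

A main function would then simply call shell_pass(stdin) in a loop until it returns 0.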
Task 2
-
We would like a shell which can stay active after invoking
another program. The fork system call is the key: it 'duplicates'
our process, creating an almost identical copy (child) of the
issuing (parent) process. Repeat the previous task, this time
performing fork before execv. Do we get an explosion of
processes since the child process also has to perform the loop? What
happened?
(Actually, there are two problems here).
Task 3
-
Just using fork did not solve our problem, so let us work
like this: As before, the shell will loop forever. Inside the
loop, it will do the following:
- Read a command line,
- Duplicate itself,
- And here comes the difference: invoke the program using the
child copy.
How can we tell which process is the parent and which is the
child? The return value of fork is different in each: the
child gets 0, the parent gets the process id of the
child (or -1 if no child could be created).
==> Now our shell can work 'forever', so let us agree that an end of file,
or an end-of-file (control-D) character in an otherwise empty command line,
means that the shell should exit. Implement the shell such
that the child process invokes the required program, while the parent goes
on to read the next command line and exits if it receives ^D.
Task 4
-
Download the program long and use your shell to invoke it.
Do you see a little annoying phenomenon? Let us fix it. This time your
shell will either wait for the completion of the
child process (this is the default), or be ready to receive the next
command right away, if we request it to do so. The common way of requesting
that in UNIX is the & character. That is: if the command line looks
like this:
prog_name p1 p2 p3 &
with the
obvious meanings of: prog_name is a program name, p1 stands for the first
parameter for the program, etc., and the '&' is really the ampersand
character, then your shell should NOT WAIT for the completion of the
process invoked, whereas whenever this symbol does not appear, the default
(to wait) is applied. When implementing, do not send the '&' symbol to the
issued program!
There are several versions of functions that wait for the termination
of another process. We will use the simplest, called
wait.
Implement this feature, in which the shell WAITS if the & was not
given, and skips waiting if that character was given.
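A possible way to detect the '&' and to wait with plain wait is sketched below. Both function names are invented for this example, and the sketch assumes the '&' arrives as a separate, last word of the command line.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: if the last word is "&", remove it (so it is not
   passed to the invoked program) and return 1; otherwise return 0. */
int strip_background(char *words[], int *count)
{
    if (*count > 0 && strcmp(words[*count - 1], "&") == 0) {
        words[--*count] = NULL;
        return 1;
    }
    return 0;
}

/* Wait for one particular foreground child using wait().
   wait() reports whichever child ends first, so an older background
   child may show up here; keep waiting until ours is reported or
   wait() fails (returns -1). */
void wait_for_child(pid_t child)
{
    int status;
    pid_t done;

    do {
        done = wait(&status);
    } while (done != child && done != -1);
}
```

The parent would call wait_for_child only when strip_background returned 0.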
Task 5
-
Current working directory: changing it is one of many commands that
change the shell's own environment and as such are built-in - implemented by
the shell, not by a file. Try to perform
cd. So how does the shell do it? It calls a function named
chdir, which changes the current working directory of the calling process.
(That is why cd cannot be a separate program: a child calling chdir would
change only its own directory, not the shell's.)
Implement cd.
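A built-in cd could be sketched like this. The name try_builtin_cd and the error text for a missing argument are choices made for this example only; real shells usually go to the home directory when cd is given no argument.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the command was "cd" and was handled
   inside the shell itself, 0 if it is some other command. */
int try_builtin_cd(char *const words[])
{
    if (words[0] == NULL || strcmp(words[0], "cd") != 0)
        return 0;

    if (words[1] == NULL)
        fprintf(stderr, "cd: missing directory\n");
    else if (chdir(words[1]) == -1)
        perror("cd");

    return 1;
}
```

The shell would call this before fork, and skip the fork entirely when it returns 1.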
Task 6
-
Scripting: Sometimes a user may have several consecutive commands
which he/she would like to issue on a regular basis. Any shell allows
writing them in a file and executing them one by one (and in fact much
more: the user can actually write a program: loops, conditions, etc.
within the framework of a script). You are going to implement the basic
part: reading commands from a file and executing them one by one. How will
your shell know whether it is required to perform in batch mode (doing the
work from the script) or interactively (getting commands from the
keyboard)? If it receives a file name as a parameter when invoked - it
should treat it as a script, if no parameters - it should work as before.
Implement this script option.
* Also add the following features to your shell:
- Ignore empty lines (when only Enter was pressed)
- Ignore lines starting with the '#' character
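Choosing the input source and filtering lines might be sketched as follows. The names open_input and should_skip are hypothetical, and the sketch makes one assumption beyond the task text: blanks before a '#' or before the end of the line are tolerated.

```c
#include <ctype.h>
#include <stdio.h>

/* Returns 1 for lines the shell should ignore: empty lines and lines
   whose first non-blank character is '#'. Returns 0 otherwise. */
int should_skip(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;
    return *line == '\0' || *line == '#';
}

/* Pick the command source: the script file if a name was given when the
   shell was invoked, the keyboard otherwise. Returns NULL if the script
   cannot be opened. */
FILE *open_input(int argc, char *argv[])
{
    FILE *in;

    if (argc < 2)
        return stdin;            /* interactive mode */

    in = fopen(argv[1], "r");    /* batch mode */
    if (in == NULL)
        perror(argv[1]);
    return in;
}
```

The rest of the loop can then read with fgets from whatever open_input returned, without caring which mode it is in (you may want to suppress the prompt in batch mode).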
Tasks for lab 12
Task 7
-
Redirection: Next we would like to add some functionality to
the shell. Suppose the program you want to invoke prints its result to the
standard output, but sometimes you want the output to be kept in a file.
One way is to modify your program to open a file. Another is to ask the
shell to do it for you, only when you request it.
- What types of redirection exist? The main ones are input redirection
and output redirection.
- How does a user tell the shell that he/she wants redirection? The
symbols are: '>' : output which is supposed to go to the standard output
(the screen), should be sent to the file whose name appears to the right
of the symbol. Input redirection is typed similarly, using '<'.
- What should the shell do? Suppose the shell receives: myprog > result . Note that execv replaces
the loaded image of the calling process with the image of another program, but some 'managing
tables' are not affected. One such table is the open files table.
Initially, it holds the defaults: channel 0 is standard input, channels 1
and 2 are standard output and standard error, respectively. Output redirection is
achieved by closing the standard output (close(1)) and opening the
requested file instead. The opened file gets the lowest available fd (which
is 1, since we just closed - freed - that descriptor, and 0 is still in use).
Similarly, input redirection is implemented by first closing the standard input
(close(0)), then opening the requested file. Note: the user may ask for both of them in the same
input line!
- Let us summarize the activities of the shell and the child process
(we refer only to the steps inside the loop):
- The shell should create the child process,
- Each one of the processes has to recognize who it is,
- The shell should decide whether it is expected to wait,
- The child should look for redirection symbols, and open the requested file(s) such that they
will obtain file descriptors 0 and/or 1 as appropriate, and then perform execv.
- None of the special characters ('<', '>', '&' - if given), nor the
redirection file names, should be passed on as parameters.
- Implement your shell so that it will handle both redirections on the same command line.
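The close-then-open trick summarized above could be sketched as below, to be run in the child just before execv. The function names are invented for this example; the sketch assumes that '<' and '>' appear as separate words, and the file mode 0644 is an arbitrary choice.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Close descriptor fd and open path in its place. Works because open()
   hands out the lowest free descriptor. Returns 0 on success, -1 if the
   file could not be opened or landed on a different descriptor. */
static int reopen_fd(int fd, const char *path, int flags)
{
    int got;

    close(fd);
    got = open(path, flags, 0644);
    if (got == -1) {
        perror(path);
        return -1;
    }
    return got == fd ? 0 : -1;
}

/* Hypothetical helper for the child: performs every "< file" and
   "> file" found in words[], and removes those words so the invoked
   program never sees them. Returns 0 on success, -1 on failure. */
int apply_redirections(char *words[])
{
    int in, out = 0;

    for (in = 0; words[in] != NULL; in++) {
        int is_out = strcmp(words[in], ">") == 0;
        int is_in = strcmp(words[in], "<") == 0;
        const char *path;

        if (!is_out && !is_in) {
            words[out++] = words[in];      /* ordinary parameter: keep */
            continue;
        }
        if (words[in + 1] == NULL) {
            fprintf(stderr, "missing file name after %s\n", words[in]);
            return -1;
        }
        path = words[++in];                /* skip over the file name */
        if (is_out ? reopen_fd(1, path, O_WRONLY | O_CREAT | O_TRUNC)
                   : reopen_fd(0, path, O_RDONLY))
            return -1;
    }
    words[out] = NULL;
    return 0;
}
```

Because this runs only in the child, the shell's own standard input and output stay untouched.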
Task 8
-
PIPE:
- Sometimes we can combine two simpler programs to do a
more complicated task just by connecting the output of the first to the
input of the second. A nice example is when you want your file listing to
be sorted by some criterion which is not supported by ls. Try
/bin/ls -l | /bin/sort +4 and you'll see that the file listing is sorted
by the 4'th (starting from 0) column, which is the file size (the '|' is
the symbol for pipe). If your version of sort no longer accepts the old
+4 form, -k 5 selects the same column.
- What should the shell do? Mainly, invoke two
processes. In addition, a common buffer should be established (that is
the pipe), and appropriate redirections performed. The order is very
important. The pipe should be established first (so that the two child
processes will know where it is), then the copies of the shell are created, then the
redirections are performed, then the invocations. You ask for a pipe with the
pipe function. However, this function returns its two file descriptors, as
does open, in free slots in the file table. In order for the
(double) redirection to work well, we need these fd's to
replace the appropriate ones of the two processes (0 and 1). How do we
solve this issue? The easiest way is to perform fork, then to
duplicate the pipe's fd's (man dup, dup2) onto the fd's
where we need them.
- Let us summarize the steps:
- The shell creates a pipe,
- Then it creates the 'left' child, which duplicates the pipe's
write-end file descriptor to take the place of standard output, then issues
the execv system call with the first program name (the first word),
- Then it (the shell, not the left child) creates the 'right' child,
which duplicates the pipe's read-end file descriptor to take the place
of the standard input, then issues the execv system call with the
second program name (the first word after the '|' symbol),
- Parameters: As before, no special symbols should be given as
parameters. This time, however, also the real parameters should be dealt
with: the parameters up to the '|' are the left program's parameters, and
likewise for the parameters to the right of the '|'.
- Waiting: This time the shell has to wait for two processes. This is a good
place to start using waitpid. DO NOT WAIT for the first
one BEFORE invoking the second. However, once both are
on their way, your shell has to wait for the two of them,
so pay attention to the possibility that while waiting for the
first, the second has already completed its task (man waitpid
answers this).
- Cleanup: Each of the invoked processes starts with some redundant
file descriptors (those which it got from the shell), and it is good
practice to close them (before execv). For the shell, closing is not
optional - this process goes on running our commands, and unless we clean up
- close the unnecessary files - we'll fill the open file table. Moreover, as long as
any process still holds the pipe's write end open, the right program
never sees end of file on its input.
- Test your shell by calling /bin/ls -l | /bin/sort +4 > filename
- Note: Implement your shell so that it will handle any legitimate mix of redirections: for example, the
left process may also have an input redirection, and the right process may also have an output
redirection.
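The order of steps summarized above (pipe, fork, duplicate, exec, close, wait) might be sketched as follows. The names run_pipeline and spawn_piped are hypothetical, file redirections are left out, and the sketch assumes the command line has already been split into the left and right parameter lists.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that makes pipe descriptor use_fd take the place of
   as_fd (0 or 1), closes both original pipe descriptors, and runs
   words[0]. Returns the child's pid to the shell, or -1. */
static pid_t spawn_piped(char *const words[], int use_fd, int as_fd,
                         const int pipefd[2])
{
    pid_t pid = fork();

    if (pid == 0) {
        if (dup2(use_fd, as_fd) == -1) {
            perror("dup2");
            _exit(127);
        }
        close(pipefd[0]);            /* cleanup before execv */
        close(pipefd[1]);
        execv(words[0], words);
        perror(words[0]);
        _exit(127);
    }
    return pid;
}

/* Run "left | right" and wait for both children. Returns 0 if both
   were started and waited for, -1 otherwise. */
int run_pipeline(char *const left[], char *const right[])
{
    int pipefd[2];
    int result = 0;
    pid_t l, r;

    if (pipe(pipefd) == -1) {
        perror("pipe");
        return -1;
    }
    /* pipefd[1] is the write end, pipefd[0] the read end. */
    l = spawn_piped(left, pipefd[1], STDOUT_FILENO, pipefd);
    r = spawn_piped(right, pipefd[0], STDIN_FILENO, pipefd);

    /* The shell must close its copies, or the right child never sees EOF. */
    close(pipefd[0]);
    close(pipefd[1]);

    /* waitpid also works if the child has already finished. */
    if (l == -1 || waitpid(l, NULL, 0) == -1)
        result = -1;
    if (r == -1 || waitpid(r, NULL, 0) == -1)
        result = -1;
    return result;
}
```

Note that both children are created before the shell waits for either of them, as the Waiting step requires.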