Montgomery College Rockville Campus Shell Program Execution Command
Parsing command lines
At a high level, our shell does two things: parse command lines, and execute those command lines. This piece of the lab is about parsing; the next (and longer) piece is about executing.
In our shell, parsing has two aspects (which are interleaved): converting a typed command line into a sequence of tokens, and converting that sequence of tokens into an internal representation (the execution piece acts on this internal representation). Tokens can be command names, command arguments, or special shell tokens. For instance, the command ls -l "foo bar" > zot consists of five tokens: “ls”, “-l”, “foo bar” (the quotes are not part of the token), “>”, and “zot”.
To motivate the internal representation, notice that command lines can contain multiple commands, separated by various operators: ;, &, |, &&, ||, (, or ). Our internal representation is a list of commands, with information about the operator that separates each command from the next (along with other information, such as what kinds of redirection the user specified for each command). The data structure that implements this representation is (mostly) a linked list of commands, with each node in the list containing information about the command (we said “mostly” because, owing to subshells, the data structure can be thought of as a hybrid of a list and a tree).
The key structure in the code is the struct command (in cmdparse.h). A good way to get familiar with a code base is to look at the key data structures and interfaces (which are usually in header files). To this end: Read the code and comments in cmdparse.h.
The implementation of parsing is in cmdparse.c, and is driven from main.c. You will need to have a solid understanding of this code in order to do the remaining exercises. Therefore: Familiarize yourself with cmdparse.c and main.c.
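To make the token example above concrete, here is a minimal tokenizer sketch in C. It is not the lab's parser: the function name tokenize and the MAX_TOKEN_LEN limit are assumptions made up for illustration, and the sketch only understands whitespace and double quotes, so operators must be surrounded by spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_TOKEN_LEN 64   /* hypothetical limit, for illustration only */

/* Split `line` into tokens and store at most `max_tokens` of them.
 * Double quotes group words into one token and are not copied.
 * Returns the number of tokens stored. */
int tokenize(const char *line, char tokens[][MAX_TOKEN_LEN], int max_tokens)
{
    assert(line != NULL && tokens != NULL);
    int count = 0;
    const char *p = line;

    while (count < max_tokens) {
        /* Skip whitespace between tokens. */
        while (isspace((unsigned char) *p))
            p++;
        if (*p == '\0')
            break;

        size_t len = 0;
        int in_quotes = 0;
        /* A token ends at unquoted whitespace or at the end of the line. */
        while (*p != '\0' && (in_quotes || !isspace((unsigned char) *p))) {
            if (*p == '"')
                in_quotes = !in_quotes;        /* drop the quote itself */
            else if (len < MAX_TOKEN_LEN - 1)
                tokens[count][len++] = *p;     /* overlong tokens are truncated */
            p++;
        }
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

Calling tokenize() on the line ls -l "foo bar" > zot with room for eight tokens should store five tokens, the third being foo bar without its quotes. The real parser also has to recognize operators such as && when they are not separated by spaces.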
It may help to play around with the parser for the shell that you have been given:
$ make # if you have not already built the code
$ ./cs202sh -p # -p means “parse only”
cs202$ ls -l "foo bar" > zot
Inspect the output. You can and should try typing in some of the commands from the warm-up section.
There is missing functionality in cmdparse.c. It does not handle parentheses properly, and it does not free the memory that it uses. Your job is to fill these gaps.
Exercise 1. In cmdparse.c, complete cmd_parse() so that it handles parentheses and subshells. If you did this correctly, then make test should now pass the first five tests (by default it will also pass some of the subsequent tests).
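The “hybrid of a list and a tree” idea mentioned earlier can be illustrated with a toy structure. The sketch below is an assumption for illustration only: struct node, enum sep, and count_commands are hypothetical names, and the lab's real struct command in cmdparse.h may differ in every field.

```c
#include <stddef.h>

/* Hypothetical operator tags; the lab's own constants may differ. */
enum sep { SEP_END, SEP_SEMI, SEP_AND, SEP_OR, SEP_PIPE, SEP_BACKGROUND };

/* Hypothetical node type, NOT the lab's struct command. */
struct node {
    const char *name;       /* command name, or NULL for a subshell node */
    enum sep op;            /* operator separating this node from the next */
    struct node *next;      /* next command at the same level (list part) */
    struct node *subshell;  /* commands inside ( ... ), if any (tree part) */
};

/* Count simple commands, descending into subshells recursively. */
int count_commands(const struct node *n)
{
    int total = 0;
    for (; n != NULL; n = n->next) {
        if (n->subshell != NULL)
            total += count_commands(n->subshell);
        else
            total++;
    }
    return total;
}
```

Under these assumptions, a line such as a ; ( b && c ) | d would be a top-level list of three nodes (a, the subshell, d), with the subshell node pointing at its own two-node list (b, c).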
Exercise 2. In cmdparse.c, complete cmd_free(). We do not provide testing code for this. One hint: what does strdup() do? Type man strdup to find out. A helpful resource may be gdb.
Speaking of resources: in this and the next part of the lab, you will need some comfort with C. We have been building this comfort in class, but you may still feel shaky. Here are some potentially helpful resources:
C for Java Programmers (Nagarajan and Dobra, Cornell)
C for Java Programmers (Henning Schulzrinne, Columbia)
A Tutorial on Pointers and Arrays in C (Ted Jensen)
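Returning to the strdup() hint in Exercise 2: strdup() returns a freshly allocated copy of a string, so each copy needs a matching free(). The sketch below shows that pairing on a plain array of strings. The helper names copy_args and free_args are hypothetical, and this is not the layout of the lab's struct command.

```c
#define _POSIX_C_SOURCE 200809L  /* make strdup() visible under strict -std= modes */
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated array of heap strings, then the array itself. */
void free_args(char **args)
{
    if (args == NULL)
        return;
    for (char **p = args; *p != NULL; p++)
        free(*p);            /* each string was allocated by strdup() */
    free(args);
}

/* Deep-copy n strings into a new NULL-terminated array.
 * Returns NULL if any allocation fails. */
char **copy_args(const char *const *args, int n)
{
    char **copy = malloc(((size_t) n + 1) * sizeof *copy);
    if (copy == NULL)
        return NULL;
    for (int i = 0; i <= n; i++)
        copy[i] = NULL;      /* keeps the array safe to free at any point */
    for (int i = 0; i < n; i++) {
        copy[i] = strdup(args[i]);
        if (copy[i] == NULL) {
            free_args(copy); /* release whatever was copied so far */
            return NULL;
        }
    }
    return copy;
}
```

Freeing only the outer array would leak every string inside it, which is the kind of mistake a tool such as gdb or a memory checker can help you track down.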
Executing command lines
In this part of the lab, you will write the code to execute the parsed command line. Now would be a good time to take another look at main.c to see the high-level loop and logic for the entire shell. Although we have given you the necessary skeleton, most of the “work” for executing command lines is unimplemented. Your job is to fill it in:
Exercise 3. In cmdrun.c, implement cmd_line_exec() and cmd_exec(). The comments in cmdrun.c give specifications and hints.
It may be best to implement cmd_line_exec() first, and then to fill out cmd_exec() in pieces, testing the functionality as you go (using make test and your own testing).
Your functions will need to make heavy use of system calls. To learn more about the exact use of these system calls, you will want to consult the manual pages. The interface to these pages is a program called man. Systems programmers need to type man a lot; you should too.
For example, man waitpid will give you documentation on the parameters and return values to waitpid(). Note that sometimes you will need to specify a section of the manual. For example, man open may not give you documentation on the system call open(); you would want man 2 open. Here is a list of some man commands that will be useful in this lab; you may need others:
# the numbers may be optional, depending on your system
$ man 2 waitpid # note WEXITSTATUS!
$ man 2 fork
$ man 2 open
$ man 2 close
$ man 2 dup2
$ man 3 strdup
$ man 3 strcmp
$ man 2 pipe
$ man 2 execve # or “man execvp” or “man 3 exec”
$ man 2 chdir
$ man 3 getcwd
$ man 3 perror
$ man 3 strtol
Interestingly, you don’t need to invoke read() and write() in this lab. (Why not?) You can learn more about the man program by typing man man (but avoid possible confusion).
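As one illustration of the calls listed above, here is a small sketch of the fork() and waitpid() pairing, including the WEXITSTATUS macro that the list calls out. The helper name child_exit_status is hypothetical, and the child does nothing but exit, so this is not how cmd_exec() should be structured.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, wait for it, and return the exit
 * status the parent observes. Returns -1 on any failure. */
int child_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                    /* fork failed */
    if (pid == 0)
        _exit(code);                  /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);   /* low 8 bits of the child's exit code */
    return -1;                        /* child did not exit normally */
}
```

The value that waitpid() stores in status is an encoded word, not the exit code itself, which is why the macros are needed to decode it.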
A hint, concerning the exec variants: note that the argv array includes the command name itself as the “zeroth” argument. For example, in the command echo foo, the initial element in the array is argv[0], which is the address of memory that holds the string echo. And argv[1] is the address of memory that holds the string foo. This is not necessarily intuitive.
A word about testing: if a particular test is giving you trouble, it may be helpful to type it into the standard shell (bash, as covered in the warmup earlier), and see what happens. Then play around with (or modify) the test, to try to understand what bash expects. That may give you insight about what your code is supposed to do.
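The argv convention described above can be seen in a short sketch that runs a command with its output redirected to a file, roughly what a shell does for echo foo > zot. The helper name run_redirected is hypothetical, the error handling is deliberately simple, and the specification in cmdrun.c takes precedence over anything shown here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with arguments argv[1..] in a child whose standard output
 * is redirected to `path`. argv must end with a NULL pointer.
 * Returns the child's exit status, or -1 on failure. */
int run_redirected(char *const argv[], const char *path)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: point stdout at the file, then replace the process image. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0) {
            perror("open");
            _exit(1);
        }
        if (dup2(fd, STDOUT_FILENO) < 0) {
            perror("dup2");
            _exit(1);
        }
        if (fd != STDOUT_FILENO)
            close(fd);               /* the duplicate is all we need */
        execvp(argv[0], argv);       /* argv[0] is the command name itself */
        perror("execvp");            /* reached only if execvp failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For echo foo, the array passed in would hold the three entries "echo", "foo", and a terminating NULL pointer. The redirection happens in the child, after fork() and before execvp(), so the shell's own standard output is left untouched.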