The Unix Time-sharing System
This paper was written by Ritchie and Thompson, presented for the ACM, detailing v3 unix.
The Unix Time-Sharing system is a general-purpose, multi-user, interactive operating system for the DEC PDP-11.
The kicker? It runs on a PDP-11, costing as little as $40,000.
Major programs include: assembler, ed, a linker, debugger, a C compiler, a Basic interpreter, a formatter, a Fortran compiler, a Snobol interpreter, Top-down compiler compiler (TMG), a bottom-up compiler-compiler (YACC), form letter generator, macro processor (M6) and permuted index program.
Hardware and Software Environment
The PDP-11 has a 16-bit word computer, with 144kb of core memory, which Unix occupies 42kb, with device drivers, I/O buffers, and system tables occupying another few 5kb.
The hard disk is a 1MB fixed-head disk, used for file system storage and swapping, with four moving head 2.5MB removable disks, and a single 40MB disk pack.
The File System
The most important job of Unix is to provide a file system. It provides:
- Ordinary disk files
- Directories
- Special Files
Ordinary Files
Ordinary files are just characters demarcated by the newline character.
Directories
Directories introduce a structure on the file system as a whole. Each user in the OS has a home directory of their own files, and they can make subdirectories, which are all access controlled.
The system mains several directories for its own use, like system administration.
Search starting with ’/’ starts at the root, while search without it starts in the user’s current directory.
A file may be linked to another location, where it is simply a pointer to its location (this is a symlink).
Special Files
Each I/O device supported by unix is associated with one file — they are read and written to like ordinary disk files, but requests to read or write result in activation of the associated device.
For example, special devices reside in /dev
. For example, punch paper
tape (the printer) would be at /dev/ppt
. You could then write to it
“echo hi > /dev/ppt” and that would write the bytes corresponding to ‘hi’
to that device.
This allows three advantages:
- file and device I/O are as similar as possible.
- a program can be passed a file or a device and handle it the same.
- special files (devices) are subject to the same protection as files.
Removable File Systems
Using the mount
system call, a user can turn a leaf (a file) into a
subtree that uses that device as storage.
The user can then choose to cd
into that subtree, and use it as
normal, except that the leaf file turns into the root directory of the
filesystem, and that it cannot reference files on other mounted
filesystems.
In the root directory of all file systems, ’..’ refers to the directory itself, instead of its parent.
Protection
Each user of the system is assigned a unique user identification number. When a file is created, it is marked with that user id. Also, files are given seven protection bits — six of these specify read, write, and execute permission for the owner for the file and other users.
There is a concept of a “super user” which isn’t bound to the permissions setup of the normal users.
I/O Calls
I/O calls entail open
, create
, write
to modify files, which are
represented by file handles. Files have
no limit on size nor any distinction on random or sequential I/O. The
file system also has no user-visible locks on the file system, due to
their overhead and the fact that users don’t normally update other users
files.
The read
system call returns the bytes transmitted, which returns 0
when it hits the end of file marker.
To do random access I/O, you can take a file handle, a pointer,
and the seek
call to seek to a location in the file based on the
pointer, either positive or negative.
Implementation of the File System
A directory entry only contains a name and a pointer to the file. The
pointer is an integer, called an i-number
, short for index number
of
the file. The i-node contains a description of the file as follows:
- Its owner.
- Its protection bits.
- The physical disk or tape addresses for the file contents
- Its size.
- Time of last modification
- The number of links to the file
- A bit whether the file is a directory
- A bit whether the file is a special file
- A bit indicating whether the file is large or small.
Calls to open or create return a file pointer. When a file is created, an i-node is allocated for it, and a directory entry is made, which contains the name of the file and the i-node number. Making a link involves creating a directory entry with the new name, copying the i-number, and incrementing the link-count field of the i-node.
Removing a file involves decrementing the link-count of the i-node, and erasing the directory entry. If the link-count drops to 0, it is freed.
The filesystem is divided into 512-bit blocks, which are logically addressable.
Efficiency of the File System
The filesystem is fairly efficient — running an assembler program, the time was 35.9 seconds, with 63.5% overhead from assembler execution, 16.5% system overhead, and 20% disk wait time.
Processes and Images
Each process gets three logical segments: the text segment begins at location 0 in the virtual address space. This is for putting compile-time data. This is write-protected, and a single copy of it is shared among all processes executing the same data.
The next 8kb are allocated for the stack.
The next segment is a non-shared, writable data segment, which can be expanded with a system call (this is the heap).
Processes
Processes are created with the fork
system call. This splits the
process into two paths. Control returns from the fork to the parent,
while in the child, control is passed to the location label.
Pipes
Processes may communicate through each other with the pipe
system
call, which allows the output of one program to be send as input to
another process.
Execution of Programs
the execute
system call (execute(file, arg1, arg2,…argn)) can be
used to execute a file in the current process.
Process Synchronization
Process synchronization is done with the wait
system call. Wait
returns the processid of the terminated process. An error return is
generated if the calling process has no descendants.
Termination
exit
terminates a process, destroys its image, and closes its open
files.
The Shell
Standard I/O
The shell opens up two file descriptors, 0
for reading user input, and
1
, for writing to. These two files can be read and written to by
programs in the shells lifecycle.
The shell can change the assignments of these file descriptors. For
example, >
changes the file descriptor 1
to the file specified after
>
. Likewise, <
redirects input to a process from the file specified
after <
. So ed < script
means that ed
will read from the text of
script
.
Filters
The |
character is a filter, which is used to direct output from one
command to the input of another.
Command Separators: Multitasking
Commands may be separated by semicolons. The &
character returns the
process id to the shell, but executes the job in the background (is non-
blocking).
The Implementation of the Shell
The shell mainly waits for a user to type a command.
When the new-line character ending the line is typed, the shell’s read call returns.
The shell analyzes the command line, putting the arguments in a form appropriate for execute.
Then fork is called, which generates a child process. That child process
is given the appropriate arguments. The parent process then waits
for
the child process to die. Then, it returns back to the user.
If the &
character is provided, then it simply doesn’t wait.
Since the child inherits from the parent process, this all works as expected.
Finally, when an argument with <
or >
is given, it is only necessary
to close file 0
or 1
, and then open the output file. This is because
the file descriptor opened, by convention, is the lowest available
number.
Filters are straightforward extensions of standard I/O redirection with pipes used, instead of files.
Initialization
The shell itself is a child process of another process, called init
.
init
is used to set up devices with a process. Init then waits for
logins, where it reads a password file and then changes to the user’s
home directory, and executes the shell, allowing the user to access
commands.
This then loops.
Other Programs as Shell
The Unix System can also set up init to drop a specific user into a
different program than the shell, like ed
for text editing. There are
also games like chess, blackjack, and 3D tic-tac-toe.
Traps
The PDP-11 hardware detects a number of hardware faults, like reference
to nonexistent memory. Such faults cause the processor to trap to a
system routine. Normally, this results in the process being terminated
and the memory of the process written to a core
file in the current
directory for debugging.
You can also use the quit signal to exit in the case that something goes wrong.
Perspective
The UNIX OS is successful because of its low hardware requirements, and simple abstraction over devices as files. As well, since UNIX was dog fooded, most of its design mistakes were fixed as soon as they appeared.
The shell, the file system, and access system have been extremely useful for programmers to use.