Lecture 16

You have enough of a background in C that reading the remaining chapters of Kernighan and Pike is profitable. Read chapters 6 & 7, ignoring their discussion of signals (the calls they use are obsolete).

Block IO

In the beginning...

The original I/O layer of Unix is based on file-descriptors. These are small integer indexes to kernel-level per-process I/O structures. It is sometimes said that in Unix, everything is a file-descriptor—at the very least, this is how a Unix program interacts with the OS in order to perform IO, i.e., to read and write files, pipes, and sockets (network connections). When a program starts up, file descriptors 0, 1, and 2 are associated with stdin, stdout, and stderr respectively.

There are finite limits to how many file descriptors a given process can have open at any given time, and indeed, how many files may be open system-wide. This is not ordinarily a problem, but programs that have to deal with a lot of files have to pay attention to this limit.
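As a quick sketch (not something the notes dwell on), the per-process limit can be queried with getrlimit(2); the actual numbers vary from system to system:

#include <stdio.h>
#include <sys/resource.h>

/* Print the soft and hard limits on open file descriptors for this process. */
int main(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
        perror("getrlimit");
        return 1;
    }
    printf("soft limit: %llu\n", (unsigned long long) rl.rlim_cur);
    printf("hard limit: %llu\n", (unsigned long long) rl.rlim_max);
    return 0;
}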

The system routines that are used are typically

open, close, read, and write

Note that creat(2) is deprecated, in favor of using the O_CREAT flag with open.
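The manual page gives the equivalence explicitly; a call like creat(path, 0644) can be written instead as:

int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);  /* same effect as creat(path, 0644) */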

This is block-style IO. Every IO device is assigned (at least one) path in the file system, so the open call always takes a path as an argument. These low-level calls are documented in section 2 (not 3) of the manual.

The pipe(2) and socket(2) commands are used to create file descriptors associated with pipes and networks respectively.
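As a small sketch of the first of these, pipe(2) fills in an array of two descriptors, the read end in slot 0 and the write end in slot 1; a real program would normally fork between creating the pipe and using it:

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Create a pipe, write a message into one end, and read it back from the other. */
int main(void) {
    int fds[2];                         /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1) {
        perror("pipe");
        return 1;
    }
    const char *msg = "hello through a pipe\n";
    write(fds[1], msg, strlen(msg));
    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof(buf));
    if (n > 0)
        write(1, buf, n);               /* echo to stdout (descriptor 1) */
    close(fds[0]);
    close(fds[1]);
    return 0;
}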

Here is a simple example of a file-descriptor based copy program:

#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <libgen.h>
#include <string.h>
#include <unistd.h>

#define BUFSIZE 4096

char *progname = "";

void usage();

int main(int argc, char **argv) {
    progname = strdup(basename(argv[0]));
    if (argc != 3) usage();
    int input = open(argv[1], O_RDONLY, 0);
    if (input == -1) {
        fprintf(stderr, "%s: failed to open input file %s.\n", progname, argv[1]);
        exit(-1);
    }
    // The third argument gives the permissions of a newly created file;
    // 0666 (as modified by the umask) is the usual choice.  O_TRUNC
    // discards any previous contents of an existing output file.
    int output = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (output == -1) {
        fprintf(stderr, "%s: failed to open output file %s.\n", progname, argv[2]);
        exit(-1);
    }
    char buf[BUFSIZE];
    int bytes_read;
    // read() returns the number of bytes read, 0 at end of file, -1 on error.
    while ((bytes_read = read(input, buf, BUFSIZE)) > 0) {
        int bytes_written = 0;
        // write() may write fewer bytes than requested, so loop until done.
        while (bytes_written != bytes_read) {
            bytes_written += write(output, buf + bytes_written, bytes_read - bytes_written);
        }
    }
    close(input);
    close(output);
    exit(0);
}

void usage() {
    fprintf(stderr, "usage: %s source dest\n", progname);
    exit(-1);
}

The functions at this layer remain the most common way of dealing with sockets (i.e., network connections).
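For example, here is a rough sketch of a tiny TCP client (the host example.com and port 80 are just stand-ins): once socket(2) and connect(2) have produced a connected descriptor, the familiar read, write, and close calls do the rest.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Connect to a web server and copy its reply to stdout. */
int main(void) {
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;    /* TCP */
    if (getaddrinfo("example.com", "80", &hints, &res) != 0) return 1;

    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd == -1 || connect(fd, res->ai_addr, res->ai_addrlen) == -1) return 1;
    freeaddrinfo(res);

    const char *req = "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n";
    write(fd, req, strlen(req));        /* send the request */
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        write(1, buf, n);               /* copy the reply to stdout */
    close(fd);
    return 0;
}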

Standard I/O

Standard I/O is the layer at which programmers usually work. This is a stream (rather than block) abstraction.

The basic routines are these: getchar, putchar, printf, and scanf, together with the line-oriented gets and puts (gets is unsafe and was removed in C11; use fgets instead).

And the file open/close routines: fopen, fclose, and freopen.

And the “f” versions of the first four routines: fgetc, fputc, fprintf, and fscanf, each of which takes an explicit FILE * argument.
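To make the parallel with the block-level copy program concrete, here is a sketch of the same copy done at the stdio layer (error handling kept minimal):

#include <stdio.h>
#include <stdlib.h>

/* Copy source to dest one character at a time using stdio streams. */
int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s source dest\n", argv[0]);
        exit(1);
    }
    FILE *in = fopen(argv[1], "r");
    FILE *out = fopen(argv[2], "w");
    if (in == NULL || out == NULL) {
        perror("fopen");
        exit(1);
    }
    int c;
    while ((c = fgetc(in)) != EOF)
        fputc(c, out);
    fclose(in);
    fclose(out);
    return 0;
}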

There are also the string-based analogues: sprintf, sscanf, and asprintf.
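A small sketch of the string-based routines in action; note that asprintf, which allocates its output buffer for you, is not part of the C standard (on glibc it requires a feature-test macro such as _GNU_SOURCE), so the example sticks to snprintf and sscanf:

#include <stdio.h>

int main(void) {
    char buf[64];
    /* "print" into a string rather than a stream */
    snprintf(buf, sizeof(buf), "x = %d, y = %f", 17, 3.5);

    /* "scan" the values back out of the string */
    int x;
    double y;
    if (sscanf(buf, "x = %d, y = %lf", &x, &y) == 2)
        printf("recovered x = %d, y = %f\n", x, y);
    return 0;
}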

Standard constants include EOF (the end-of-file return value), NULL, and BUFSIZ, along with the predefined streams stdin, stdout, and stderr.

Note: even when the standard streams have been redirected, a program can reopen its controlling terminal via the device file /dev/tty.
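For example (a sketch only), a program whose input and output have been redirected can still prompt the user on the terminal; this is roughly how programs that read passwords behave:

#include <stdio.h>

/* Prompt on the terminal even if stdin/stdout are redirected. */
int main(void) {
    FILE *tty = fopen("/dev/tty", "r+");
    if (tty == NULL) {
        perror("/dev/tty");             /* no controlling terminal */
        return 1;
    }
    fprintf(tty, "Proceed? [y/n] ");
    fflush(tty);
    char answer[16];
    if (fgets(answer, sizeof(answer), tty) != NULL && answer[0] == 'y')
        fprintf(tty, "proceeding\n");
    fclose(tty);
    return 0;
}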

Memory Mapped IO

Unix did not begin life as a virtual memory system. Thus, memory was a contiguous, zero-based array of bytes, and by golly, when you addressed byte 17, you got byte 17.

Today's systems are more complicated. Processes work within a virtual memory space, in which process memory may or may not be backed by actual memory at any given moment, and usually is not associated byte-for-byte with hardware memory. To a first approximation, a process's memory is associated with a file (secondary storage is typically much larger, albeit slower, than main memory). Portions of this file are associated with hardware memory, which functions as a cache; changes made in hardware memory are written back out when the corresponding page is discarded. These systems are based on memory mapping, and the basic facility makes it possible for a user-level program to map a portion of a file into a portion of its address space.

These days, the “low level I/O” facilities are themselves built on top of memory mapping, and using the memory-mapping facilities directly can be more efficient (because less buffering and copying is involved). The key functions are mmap and munmap. Here is a short example that does rot13 “encoding” on each file specified on the command line:

// rotfile.c
// Apply rot13 coding to the files given on the command line.

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef Darwin
#define BSD
#endif

#ifndef BSD
#include <sys/file.h>
#endif

char rmap[256];

// open and immediately lock a file
int lopen(char *path, int oflags) {
#ifdef BSD
    return open(path, oflags | O_EXLOCK, 0);
#else
    int filedes = open(path, oflags, 0);
    if (filedes == -1) return filedes;
    if (flock(filedes, LOCK_EX | LOCK_NB) != 0) {
        fprintf(stderr, "Could not obtain lock on %s.\n", path);
        close(filedes);
        return -1;
    }
    return filedes;
#endif
}

int main(int argc, char **argv) {
    // set up the rmap array
    for (int i = 0; i < 256; ++i) {
        if (islower(i)) {
            rmap[i] = (i - 'a' + 13) % 26 + 'a';
        } else if (isupper(i)) {
            rmap[i] = (i - 'A' + 13) % 26 + 'A';
        } else {
            rmap[i] = i;
        }
    }
    // process input
    for (int i = 1; i < argc; ++i) {
        int filedes = lopen(argv[i], O_RDWR);
        if (filedes == -1) {
            fprintf(stderr, "Could not open locked file %s.\n", argv[i]);
            continue;
        }
        struct stat statbuf;
        fstat(filedes, &statbuf);
        char *buf = (char *) mmap(NULL, statbuf.st_size, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, filedes, 0);
        if (buf == MAP_FAILED) {
            fprintf(stderr, "Memory mapping failed on %s\n", argv[i]);
            perror(NULL);
            close(filedes);
            continue;
        }
        char *cp = buf;
        char *endp = buf + statbuf.st_size;
        while (cp != endp) {
            // index with an unsigned value: plain char may be signed
            *cp = rmap[(unsigned char) *cp];
            ++cp;
        }
        munmap(buf, statbuf.st_size);
        close(filedes);
    }
}

with the following Makefile:

CFLAGS= -std=c11 -O3

rotfile: rotfile.o

clean:
	rm -f rotfile.o rotfile

For those who care about such things, this program processed a file that consisted of 10 copies of the kjv10.txt file in 0.22s, for a throughput of about 200MB/sec on my laptop, which gives you some idea of the level of performance possible. This involves both a read and a write cycle, and does not compare badly with the rated 540MB/sec transfer rate of the drive itself.

Exercise 16.1 The rotfile program above works, but it makes a dangerous assumption: that the file being processed is small enough that it can be memory-mapped efficiently. This assumption can be removed by processing the file in reasonably sized chunks. Note that each chunk has to be a multiple of the page size (cf. getpagesize(2/3)). Refactor the code above so that the per-file code is broken out into a separate routine, and so that the files are mapped in reasonably sized (say, roughly 1MB) chunks.

Note here that your program may have difficulties if the chunk sizes (and hence the mmap offsets) you use are not multiples of the page size of your system, which is probably 4096.
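Here, as a sketch, is just the chunk-size arithmetic (this is not a solution to the exercise): round the chunk size up to a multiple of the page size, and keep mmap offsets page-aligned.

#include <stdio.h>
#include <unistd.h>

/* Round a target chunk size of roughly 1MB up to a multiple of the page size. */
int main(void) {
    long pagesize = sysconf(_SC_PAGESIZE);   /* same value as getpagesize() */
    long target = 1000000;                   /* roughly 1MB */
    long chunk = ((target + pagesize - 1) / pagesize) * pagesize;
    printf("page size %ld, chunk size %ld\n", pagesize, chunk);
    return 0;
}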