Lecture 1

Overview

Lecturers

Section 01

Stuart Kurtz
Ryerson 166
Office hours: 10:30-11:30 MWF

Section 02

Ariel Feldman
Ryerson 161A
Office hours: 1:30-3:00 TuTh

Make sure that the string “[16200]” appears in the subject line of any class-based correspondence.

TAs

The Lab TA is Mark Stoehr.

Grades

There will be four graded elements: homework, labs, participation (a.k.a. wiki-work), and final presentation. The grading will be 40% homework, 10% lab, 40% wiki-work, and 10% final presentation.

Homework assignments will typically be given daily, and will be due the following class period unless otherwise specified. Your aggregate homework grade will be the average of your average and median grades. (It's a sweet spot between making the penalty of missing an occasional homework negligible, and having something that is spreadsheet-friendly to compute.) No late homework is accepted.
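To see how the average-of-average-and-median rule plays out, here is a minimal shell sketch. The scores are made up, and we are assuming (the policy above doesn't spell this out) that a missed assignment simply counts as 0.

```shell
# Hypothetical homework scores; the 0 stands for one missed assignment (an assumption).
scores="90 85 0 100 95"

# One score per line, sorted numerically, then let awk compute mean and median.
aggregate=$(echo "$scores" | tr ' ' '\n' | sort -n | awk '
    { s[NR] = $1; sum += $1 }
    END {
        mean = sum / NR
        if (NR % 2)
            median = s[(NR + 1) / 2]
        else
            median = (s[NR / 2] + s[NR / 2 + 1]) / 2
        printf "%.1f\n", (mean + median) / 2
    }')

echo "aggregate homework grade: $aggregate"
```

With these numbers the plain average is 74 and the median is 90, so the single missed assignment costs far less than it would under averaging alone.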

Participation. Our goal is to create a community of intelligent, computationally aggressive undergraduates who will help drag this university into the 21st century. You are honors students. Of course you are smart. We also expect that you will be largely self-directed intellectually, and that you’re taking this course because computation has something to do with the life you want to create for yourself. If you’ve not already begun to live that life, now is the time to start. A rule of thumb is that it takes 10,000 hours to become an expert in anything. This is five years of full-time effort. If you spend 2 hours/day, 5 days/week programming for the next 10 weeks, that will get you 1% of the way. You probably want to get further than that, and you should adjust your effort accordingly.

A big part of this class will be student-initiated, student-solved programming projects. Please note that the student who initiates a programming project need not be the same as the student(s) who complete it, and that there may be more than one completion (it’s not unusual to see puzzle-like problems solved in a half-dozen different languages). We will run this through a wiki. You should maintain a portfolio of your contributions on the wiki.

Students will give reports to the class during the final exam week on one or more projects that they’ve worked on. The reports, of unfortunate necessity, will be brief: 5 minutes is typical. Our experience is that students enjoy both the giving and the listening a lot. Actually scheduling these final presentations is going to be tricky. We're probably going to have to schedule at least one extra presentation session, and probably two. There will be more information on this later in the quarter.

The Wiki

The course website is located at http://cmsc-16200.cs.uchicago.edu/2015/, which isn't linked, because you are here.

You should read the Course Overview and Policies pages. There is a link there, and on all the lecture pages, to the wiki.

The wiki is intended for class use only, and to avoid drive-by vandalism on the internet, we’ve password protected it with a highly secure password, which you’ll get in class. There’s a lot that we’re going to do through the wiki, so you should go there, log in, then work your way through the BasicEditing page on the sidebar. Pay attention to how to create links, and the structure of WikiGroups and how they interact with link formation.

The level of security on the wiki is intentionally quite light—collaboration is a big class emphasis, and we want to avoid setting up any more walls than are absolutely necessary. That said, we do expect students to follow a few simple rules

  1. The wiki is organized into WikiGroups, e.g., “Instructor”, “Student”, etc. You can look at any page on the wiki, but generally speaking you should only edit pages (a) in the Student group, or (b) your personal page in the Profiles group. You should only edit text that you've written, and your contributions should be marked off and be identifiable as such.
  2. We ask each of you to set up a personal page on the wiki, which should be in the Profiles group. You should never edit anyone else’s personal page. The primary purpose of your personal page is to serve as a portfolio for this class: you should include links to the problems you’ve posed, and the problems you’ve contributed to. You can include whatever other information you want to.

Exercise 1.1 Set up a “Home Page” for yourself on the Wiki. There should be a link from the [[Profiles/Profiles]] page (just click on “Profiles” in the navigation bar to get there) to your page. Your page can contain any content at all, but it should not be empty. When you edit a web page, you should fill out the author field with your profile page’s trailing component, e.g., if Professor Kurtz's profile page is Profiles.StuartKurtz, he fills in “StuartKurtz” as author, and the Changes page will then list authorship properly.

Exercise 1.2 (Due at the beginning of Lecture 3). Come up with at least one computing problem/project that you’d like to solve/work on/see solved over the course of the quarter. Add your problem (make sure you include your name) to the class wiki. I’ve included a sample problem for you to use (or not!) as a model: [[Instructor/Sample Problem]]. Make sure that your profile page links to the problem you’ve posed! A nice model (but we’ll ask you not to mine it for your initial problem) is Project Euler.

You should add additional programming projects as you wish, but you’re required to add at least one. Note that from time to time, your wiki work will be assessed, so keep working, and keep your personal page up-to-date.

Nongraded task: If you do not already have a CS computing account, acquire one. To do this, visit http://www.cs.uchicago.edu/info/services/account_request. You'll need this for the lab.

The Unix Command Line

We’re going to learn the (new) Unix way. This is a mix of old-school ideas and modern languages. One problem here is a lack of good source material. There are some great books from the old days that have broad coverage, and some great new books that deal with specific technologies, but we don’t know any great new books of broad scope. We’re going to have to mix and match.

Is it better to be the fox or the hedgehog? Is it better to have a large toolkit, or to be a master of just one tool? While UC alum Nate Silver has his opinion, ours is that both have value. If last quarter was a hedgehog quarter, this is a fox quarter. We're going to throw a lot of different technologies and ideas at you. Our goal this quarter is breadth. We value depth, but after you've learned one language, it's usually easy to pick up additional languages. So I’m going to introduce you to a number of languages, focusing on what makes each distinctive, and we expect that you’re smart enough and motivated enough to work on the rest. As always, our goal is to teach you enough to be a danger to yourself and your community, because otherwise you can't accomplish anything of utility or note. Of course, we're happy to help dig you out if you get stuck.

We’re going to start with a quick read through a great old book, Kernighan and Pike’s “The Unix Programming Environment.” We assume that you can and will read; our lectures are intended (a) to get you into the material quickly, and (b) to supplement what Kernighan and Pike do. We don’t try for completeness in this class.

The first old-school idea is the command line. You are probably more familiar with GUIs—Graphical User Interfaces—where you interact with a program through user-centric visual elements like windows, buttons, fields, menus, etc. GUIs have some huge advantages—they’re easy to learn, easy to use, and they can facilitate discovery. At the same time, they have some real disadvantages, principally in a lack of flexibility and composability.

Command-lines complement this. They’re an expert interface. If you know what you’re doing, they can be amazingly productive. But there’s a stiff learning curve, especially when it comes to taking full advantage of what they offer. When you’re first getting started, that curve can look like a cliff. Let's start climbing.

How you get to a command line depends on the system you’re using. If you’re using MacOS X, you can get a command line by running /Applications/Utilities/Terminal.app. You’re going to use this a lot—so put it on your dock. If you’re running Linux, you’ll want an xterm. Exactly how you start an xterm will depend on the particular window manager you’re using, but it shouldn’t be hard to figure out. It's usually at or near the top of the main applications menu, or accessible via a toolbar. Finally, if you’re running Windows, you have our sympathy, because you’re in for some extra work. Download Cygwin, follow the instructions, and get used to pain. Of course, if you’re a Windows user, you’re already used to pain, so this will be a familiar, even reassuring, experience.

Once you’ve gotten started, you’ll get a command prompt. The particular prompt you get will depend both on the particular command interpreter (shell) you’re using, and any customization that may have been done either by you or your distribution.

The default command line prompt for Bourne-like shells is $ for ordinary users, and # for superusers. You should not be working as the superuser unless you’re a system administrator in the act of doing system administration. If you get %, you probably have the csh or tcsh, and will need to figure out how to change shells.

Kernighan and Pike assume that your command interpreter is the Bourne Shell, /bin/sh. I’m going to assume that you’re using one of the Bourne Shell’s modern successors, most likely bash.

Figuring out your shell isn’t as easy as looking at the prompts. We’ve shown the default prompts here, but these prompts are easily (and often!) customized. For example, Professor Kurtz has the following lines in his .bashrc file (a file that is used for customizing interactive shells):

export PS1="\h \$ "

This sets his command prompt on his research machine to

faith $

Some people find it more helpful to include the current working directory in their prompt. You'll probably change yours more than a couple times to suit your current needs.
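For instance, a prompt that shows both the host and the working directory might be set as follows. This is only a sketch for bash; \w is bash's prompt escape for the current working directory, and the particular layout (host, colon, directory) is just one possibility, not anyone's actual configuration.

```shell
# \h = short hostname, \w = current working directory (bash prompt escapes).
# Single quotes keep the backslashes literal, so bash expands the escapes
# each time it prints the prompt.
PS1='\h:\w \$ '
export PS1
```

On a machine named faith, with the shell sitting in /tmp, this would display a prompt along the lines of faith:/tmp $.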

One approach to figuring out what shell you're running is to invoke

$ echo $0

which will report the name of the shell as it was provided to exec. Login shells will have the shell name prefixed by a hyphen, e.g.,

$ echo $0
-bash
$

The hyphen will be missing for recursively invoked shells, e.g.,

$ bash
$ echo $0
bash
$ ^-D
$ /bin/bash
$ echo $0
/bin/bash
$ ^-D
$
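A second approach, sketched here, is to ask ps about the shell's own process. We're assuming a ps that accepts the POSIX -p and -o options; on a stripped-down system without one, the sketch falls back to the word unknown.

```shell
# $0 is the name the shell was started with (a leading hyphen marks a login shell).
echo "$0"

# $$ is the process ID of the current shell; "comm=" asks ps for just the
# command name, with no header line.
current_shell=$(ps -p $$ -o comm= 2>/dev/null || echo unknown)
echo "$current_shell"
```

Note that the SHELL environment variable is a different thing: it names your login shell, which need not be the shell you are typing into right now.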

Here’s a fairly typical, if brief, interaction with the command line:

$ date
Wed Jan 4 09:11:20 CST 2012
$

We’ve invoked the date command, which quickly reports the day, date, time, and local timezone. After the command has executed, you’ll be prompted for another command. One thing you’ll grow to appreciate is the “snappiness” of the command-line interface, although GUIs are getting better in this regard as the underlying hardware gets faster (and more multi-threaded).

The date command is like a lot of old-school UNIX commands. It is useful and terse, it has a lot of functionality and well-chosen defaults, and it is well documented.

For example:

$ date -u
Wed Jan 4 15:11:37 UTC 2012

The -u flag to date specifies UTC (Coordinated Universal Time), what used to be called Greenwich Mean Time. Here we have the basic command, along with a command line argument. The argument is an old-school “flag,” a hyphen followed by a single letter, which indicates a variation on the basic functionality. Note also the 24 hour clock, rather than am/pm.

Or, if we wanted the date formatted in standard North American slash (month/day/year) format, we could provide a user format:

$ date +%m/%d/%y
01/04/12

This is a bit of a pain for ordinary command-line execution, but it is useful in scripts.

Here the structure of the command line argument is different. The use of + to indicate a format is idiosyncratic to date, wherein it indicates a user-defined format string. The format string here is %m/%d/%y. The % symbol is used to introduce a formatting command, which is usually a single character.

Here,

    %m -- month (mm)
    %d -- day   (dd)
    %y -- year  (yy)

are just a few of the format commands.

Non-formatting characters stand for themselves, like /. If we want a %, we have to use %%.
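Here are a few more format strings, including the kind of use in a script mentioned above. This is a sketch: %Y, %H, and %M are standard date conversions (four-digit year, hour, minute), while the backup- file name is a made-up example.

```shell
# ISO-style date: %Y is the four-digit year.
today=$(date +%Y-%m-%d)
echo "$today"

# A format containing spaces must be quoted so it reaches date as one argument.
date "+%H:%M on %m/%d/%y"

# %% produces a single literal percent sign.
date "+100%% done at %H:%M"

# In a script: build a dated file name (backup- is a hypothetical prefix).
archive="backup-$(date +%Y%m%d).tar"
echo "$archive"
```

The $(...) construct runs a command and substitutes its output, which is what makes date's formats so handy inside scripts.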

This is all well and good, but how could we have discovered this for ourselves? The man(1) command gives us access to the manual pages for the standard system commands. In this case, we could find out a lot by doing

$ man date

This will produce the Unix manual page for the date command. The manual page is displayed through a pager, a simple program that lets you view text one screen at a time in a terminal window. The standard pager back in the old days was more, but these days it is often replaced by the more capable, backward-compatible pager less, and on some systems more is just another name for less.

The basics of less:

    “ ” -- (a space), forward one screen full
    b -- back one screen full
    < -- return to the beginning
    > -- go to the end
    q -- quit
    / -- search forward
    ? -- search backward

There’s a lot more to less (besides bad puns)—which you can figure out from the man pages.

One of the things that might not occur to you (but should!) is that man itself is quite flexible, and you can learn more about it by

$ man man

And of course, we found out about the short-hostname flag in the command prompt via

$ man bash

Of particular interest is man's -k flag, which searches the summary strings of the commands for a particular pattern. In particular, this addresses the problem of “how do I find a command in the first place?”

$ man -k date

There is a signal-to-noise ratio problem here. As Unix has evolved, the number of commands has increased, and this means that the output from man -k can sometimes be a bit much. In particular, perl, tcl, and the X Window System are responsible for a lot of “man pollution.”
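One way to cut the noise, sketched here, is to filter the output of man -k down to a single manual section, since ordinary user commands live in section 1. The helper name only_section is our own invention, and we're assuming each summary line shows its section in parentheses, as in date (1).

```shell
# Keep only the lines that mention the given manual section, e.g. "(1)".
# only_section is a hypothetical helper; it filters whatever arrives on stdin.
only_section() {
    grep -F "($1)"
}

# Guarded so the sketch is harmless on a machine with no man database.
if command -v man >/dev/null 2>&1; then
    man -k date 2>/dev/null | only_section 1 || true
fi
```

This is also a first taste of the Unix habit of piping one command's output through another to refine it.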

The basic objects of UNIX land are files and processes. A Unix file system is organized as a singly rooted tree, with directories (a special kind of file) and (non-directory) files. We name directories and files via specially formatted strings called paths. The root directory is /. A typical home directory might be /home/stuart or /Users/stuart. GUIs make it easy to create file names that contain spaces; this is a bad idea if you’re working through CLIs. Use hyphens, underscores, periods, or CamelCase instead.

Every process is associated with a directory (the working directory), which may change. The shell (command line) is just an ordinary program. Relative paths don’t begin with /, and they’re computed relative to the current process’s working directory.

You can list the files and directories in the current directory using the ls command. Learn the -l variant, which is extremely useful. Note that ls does not show “dot” files by default—you need to use -a to see them. Your home directory is often littered with dot files.
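A quick way to see the difference is to try it in a scratch directory. This is a sketch: the file names are made up, and mktemp -d (which creates a fresh temporary directory) is widely available but is an assumption about your system.

```shell
# Work in a scratch directory so the listing is predictable.
scratch=$(mktemp -d)
cd "$scratch"
touch notes.txt .hidden

plain=$(ls)       # dot files are omitted
all=$(ls -a)      # includes ., .., and .hidden
echo "$plain"
echo "$all"
ls -l             # long format: permissions, owner, size, modification time

# Return to where we were and clean up.
cd - >/dev/null
rm -r "$scratch"
```

The two flags combine, so ls -la gives the long listing with the dot files included.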

You can create a new directory using mkdir, and delete an empty directory using rmdir. You can change the shell’s current working directory by using cd. The relevance of the current working directory (for most purposes) comes in parsing relative paths. It’s a lot easier to type

$ cat foo.c

than

$ cat /Users/stuart/Sources/MyGreatProject/foo.c

The pwd command will print the current working directory:

$ pwd
/Volumes/Home/stuart
$
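Putting mkdir, cd, pwd, and rmdir together, a short session might look like the sketch below. The directory name project is made up, and mktemp -d is again an assumption about your system.

```shell
start=$(pwd)          # remember where we began
base=$(mktemp -d)     # scratch directory to play in
cd "$base"

mkdir project         # relative path: created inside the working directory
cd project
here=$(pwd)           # an absolute path ending in /project
echo "$here"

cd ..                 # .. names the parent directory
rmdir project         # works only because project is empty

cd "$start"
rmdir "$base"
```

Every path typed here except $base and $start is relative, which is exactly why the working directory matters.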

You can delete a file by using rm. Note that rm is not safe: in Unix land, the assumption is that you mean what you say. There is no un-rm. Back when Professor Kurtz first learned the Unix environment (System V on a PDP-11/40), we referred to it as "User hostile software," because of its terseness and its "do what I say, not what I mean" attitude. We learned, but it took more time when there was no one to guide us.
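If you'd like a seat belt while you're learning, rm has a standard -i flag that asks for confirmation before each removal. In the sketch below, the file name is made up and the echo n stands in for you typing n at the prompt.

```shell
scratch=$(mktemp -d)
touch "$scratch/draft.txt"

# -i asks before removing; answering "n" keeps the file.
echo n | rm -i "$scratch/draft.txt"
[ -e "$scratch/draft.txt" ] && kept=yes || kept=no
echo "still there after answering n: $kept"

# Plain rm removes it immediately, with no prompt and no undo.
rm "$scratch/draft.txt"
rmdir "$scratch"
```

Don't lean on this too hard: scripts and other people's machines won't have your safety net.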

Files are created using a text editor. Note that MS Word is not a text editor. You should learn at least two text editors: first, ed (which is documented in Chapter 1), and either vi or emacs (your choice). You’ll want to know ed because the stream editor sed uses ed commands, and sed is very useful. You’ll want to know vi or emacs because they are much more powerful than ed, and are almost universally available in Unix-like environments. You may learn one or more other GUI-based text editors: Professor Kurtz uses BBEdit when editing locally on one of his Macs (note that TextWrangler is a free version), and emacs on remote machines. Professor Feldman has been known to use TextMate and even Eclipse as well as vim, the modern version of vi. There are many choices depending on your operating system; YMMV.
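To make the ed/sed connection concrete, here is a small sketch. The s/old/new/ substitute command and line addresses like 1d are ed commands; sed applies them to a stream of text instead of to a file you're editing interactively. The sample text is made up.

```shell
# ed's substitute command, applied by sed to every line of its input.
result=$(printf 'hello world\nhello again\n' | sed 's/hello/goodbye/')
echo "$result"

# ed-style addressing works too: delete line 1, pass the rest through.
rest=$(printf 'first\nsecond\nthird\n' | sed '1d')
echo "$rest"
```

Once the ed commands are in your fingers, one-liners like these come for free.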

Text editing is very important!!

Beyond this, study Chapter 1 of K&P, and do a quick read of Chapter 2. Learn the commands in table 1.1. Redirection and pipes are important—pay particular attention to these sections. Ignore the discussion of stty, unless you're amused by technological archaeology. stty was life vs. death in the modem days, but they’re long gone. These days, ssh rules.
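As a preview of redirection and pipes, here is a minimal sketch. The file name and its contents are made up, and mktemp -d is an assumption about your system.

```shell
scratch=$(mktemp -d)

# > sends a command's output to a file (overwriting it); >> appends.
printf 'pear\napple\n' > "$scratch/fruit.txt"
printf 'apple\n' >> "$scratch/fruit.txt"

# < feeds a file to a command's standard input.
lines=$(wc -l < "$scratch/fruit.txt")

# | connects one command's output to the next command's input:
# sort the lines, count duplicates, and keep the most frequent one.
counts=$(sort "$scratch/fruit.txt" | uniq -c | sort -rn | head -n 1)

echo "$lines lines; most common: $counts"
rm -r "$scratch"
```

Each command in the pipeline does one small job, and the shell does the plumbing.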