Unix

1 Software for Remote Access

To learn about the software you'll need to access the SCF UNIX machines remotely, see Accessing the SCF remotely. To see a list of all the SCF computers, go to the Computer Grid page.

2 Basics of Unix

On a UNIX system, most commands are entered into a shell. There is a large amount (sometimes too much) of online documentation for all UNIX commands, available through the man command. For example, to find out more about the ls command, which lists the files in a directory, type

man ls

at the UNIX prompt.

Another way to use the man command is to search by keyword; type either

man -k keyword

or

apropos keyword

at the UNIX prompt to find a list of commands whose names or short descriptions contain "keyword".

One attractive feature of UNIX shells is tab completion; if you type only the first few letters of a command or file name, and then hit the tab key, the shell will complete the name you started to type provided that there is only one match. If there's more than one match, hitting tab twice will list all the names that match.

A properly configured UNIX file system has permission controls to prevent unauthorized access to files, and to make sure that users do not accidentally remove or modify key files. If you want to adjust the permissions on the files you own, take a look at the man page for the chmod command.
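As a small illustration of chmod, the sketch below tightens and then loosens the permissions on a throwaway file. The scratch directory and the file name notes.txt are made up for this example, and the permission strings are simply the first ten characters of the ls -l output.

```shell
# Work in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)
touch "$scratch/notes.txt"

# Owner may read and write; group and others get no access at all.
chmod u=rw,go= "$scratch/notes.txt"
private=$(ls -l "$scratch/notes.txt" | cut -c1-10)

# Additionally allow everyone to read the file.
chmod a+r "$scratch/notes.txt"
shared=$(ls -l "$scratch/notes.txt" | cut -c1-10)

echo "$private"   # -rw-------
echo "$shared"    # -rw-r--r--
```

The three groups of letters after the leading dash are the permissions for the owner, the group, and everyone else, in that order.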

I'm not going to make a big distinction between UNIX, Linux and the UNIX core of Mac OS X, so the commands we're going to look at here should work under any of these operating systems.

3 Command Path

When you type a command into the shell, it will search through a number of directories looking for a command that matches the name you typed. The directories it looks through are called the search path. You can display your search path by typing

echo $PATH

To see the complete list of commands that are available on a given computer, you could look at all the files (commands) in all of the directories on your search path. (There are well over 2000 commands on most UNIX systems.)
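The search path is easier to read with one directory per line. The sketch below does that, and then asks the shell which file it would actually run for ls. The variable names are only for illustration; command -v is a standard shell builtin that reports what a command name resolves to.

```shell
# Split the colon-separated search path into one directory per line.
path_dirs=$(echo "$PATH" | tr ':' '\n')
echo "$path_dirs"

# Report the file the shell would run when you type ls.
ls_location=$(command -v ls)
echo "$ls_location"
```

The directories are searched in the order shown, so if two directories contain a command with the same name, the one listed first wins.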

4 Basic Commands

The table below shows some of the most commonly used UNIX commands:

Command	Description	Examples
`ls`	Lists files in a given directory	`ls /some/directory`
		`ls` # with no args, lists current dir
`cd`	Change Working Directory	`cd /some/directory`
		`cd` # with no args, cd to home dir
`pwd`	Print Working Directory	`pwd`
`mkdir`	Create New Directory	`mkdir subdirectory`
`less`	Display file one screen at a time	`less filename`
`cp`	Copy files	`cp file1 newfile1`
		`cp file1 file2 file3 somedirectory`
`mv`	Move or rename a file	`mv oldfile newfile`
		`mv file1 file2 file3 somedirectory`
`rm`	Remove a file	`rm file1 file2`
		`rm -r dir` # removes dir and everything inside it
`rmdir`	Remove an (empty) directory	`rmdir mydir`
`history`	Display previously typed commands	`history`
`grep`	Find strings in files	`grep Error file.out`
`head`	Show the first few lines of a file	`head myfile`
		`head -20 myfile`
`tail`	Show the last few lines of a file	`tail myfile`
		`tail -20 myfile`
`file`	Identify the type of a file	`file myfile`

Each of these commands has many options which you can learn about by viewing their man page. For example, to get more information about the ls command, type

man ls
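To see how several of these commands fit together, here is a short session run in a throwaway directory. All of the file and directory names are invented for the example, and touch (which is not in the table) is used only to create an empty file to work with.

```shell
scratch=$(mktemp -d)            # a temporary directory; mktemp picks its name
cd "$scratch"
pwd                             # confirm where we are

mkdir project                   # create a subdirectory
touch draft.txt                 # create an empty file
cp draft.txt project/copy.txt   # copy it into the subdirectory
mv draft.txt final.txt          # rename the original

top_listing=$(ls)               # final.txt and project
sub_listing=$(ls project)       # copy.txt
echo "$top_listing"
echo "$sub_listing"

rm project/copy.txt             # remove the copy
rmdir project                   # succeeds because the directory is now empty
after_cleanup=$(ls)
echo "$after_cleanup"           # final.txt
```

Note that rmdir would have refused to remove project while copy.txt was still inside it, which makes it a safer habit than rm -r.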

5 Command History

Another useful feature of most UNIX shells is the ability to retrieve and review the commands you've previously typed into that instance of the shell. The arrow keys can be used to scroll up or down through previous commands. Once a command appears on the command line (where you would normally type a command), you can use the arrow and/or backspace keys to edit the command. A number of control key combinations can also be used to navigate your command history and are displayed in the table below; these same control sequences will work in the Emacs editor.

Command	Meaning	Command	Meaning
control-p	Previous line	control-n	Next line
control-f	One character forward	control-b	One character backward
control-a	Beginning of line	control-e	End of line
control-d	Delete one character	control-k	Delete to end of line

6 Editors

Most programming tasks involve a program stored in a file, often along with another file that contains the data it needs. On a UNIX system, it's therefore very important to be able to view and modify files, and the program that does that is known as an editor. The kind of editor you use to work with files on a UNIX system is very different from a word processor or other document handling program: it's not concerned with formatting, fonts, or other issues of appearance, and simply stores the characters that make up your program or your data. There are many editors available on UNIX systems, but the two most popular are emacs and vi. Some of the other editors you might encounter on a UNIX system are pico, nano, vim, xemacs, gedit, and kate.

7 Wildcards

Along with tab completion, UNIX shells offer another way to save typing when entering filenames. Certain characters (known as wildcards) have special meaning when included in a filename, and the shell will expand these characters to represent multiple filenames. The most commonly used wildcard is the asterisk (*), which will match anything in a file name; other possibilities are in the table below. To see what will be matched for any string containing wildcards, use the UNIX echo command.

Wildcard	Meaning
`*`	Zero or more of any character
`?`	Any single character
`[...]`	Any of the characters between the brackets
`[^...]`	Any single character except those between the brackets
`[x-y]`	Any character in the range x to y `[0-9]` `[a-z]`
`{string-1,string-2,string-3}`	Each of the strings in turn

In addition, the shell will recognize the tilde (~) as representing your home directory, and ~user as user's home directory.
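The echo trick mentioned above can be tried safely in an empty directory. In this sketch the file names are invented so that each pattern matches a different subset of them.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch run1.log run2.log run10.log notes.txt

all_logs=$(echo *.log)             # every name ending in .log
single=$(echo run?.log)            # exactly one character after "run"
ranged=$(echo run[0-9][0-9].log)   # exactly two digits after "run"

echo "$all_logs"
echo "$single"    # run1.log run2.log
echo "$ranged"    # run10.log
```

Because the shell does the expansion before the command runs, echo shows you exactly the arguments that a command like rm or cp would receive for the same pattern.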

8 Redirection

Unix systems manage their input and output through three so-called streams, known as standard input, standard output and standard error. Normally, you don't have to do anything - standard input will be whatever you type, and standard output and error will be your computer screen. However, sometimes it's useful to redirect input from some other source, or to redirect output to some other destination. For example, suppose you wanted a list of files in your current directory. You could run the ls command and redirect the output to a file, say myfiles, as follows

ls > myfiles

Such a command will overwrite the file myfiles if it already existed. To redirect output to the end of a file, leaving any content intact, use

ls >> myfiles

To have a command read its input from a file instead of from the keyboard, use the less-than sign (<).
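The three forms of redirection can be combined in a short experiment. This is only a sketch: the three .txt file names are invented, and wc -l (which counts lines) stands in for any command that reads standard input.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch a.txt b.txt c.txt

# The shell creates myfiles before ls starts, so myfiles appears in its own listing.
ls > myfiles

# Append a second listing instead of overwriting the first.
ls >> myfiles

# wc reads the file through standard input; the arithmetic strips any padding.
line_count=$(( $(wc -l < myfiles) ))
echo "$line_count"   # 8: four names, listed twice
```

Had the second command used > instead of >>, the count would have been 4, because the first listing would have been overwritten.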

Another useful form of redirection is known as a pipe. In this case, the standard output of one program is used as the standard input to another program. A very common use of pipes is to view the output of a command through a pager, like less. This allows you to view a screen at a time by pressing the space bar, to move up in the file by using control-u, and to move down using control-d. For example, suppose that you are listing the files in a directory that contains many, many files. If you simply type ls, the output will go streaming by too fast to read. But by typing

ls | less

the output will be displayed one screen at a time, and you can navigate using the commands described above. As another example, suppose we want to find which files we've modified recently. The -lt option of the ls command will provide a long display of files with the most recently modified first. The first line of a long display is normally a "total" summary rather than a file, so to display, say, the five most recently modified files we ask for six lines:

ls -lt | head -6
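Pipes are not limited to pagers; any command that reads standard input can go on the right-hand side. As a rough sketch with invented file names, the pipelines below count the .csv files in a directory and pick out the first name in the listing.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch report.csv data.csv notes.txt

# grep -c counts the lines of ls output that end in .csv
csv_count=$(( $(ls | grep -c '\.csv$') ))

# head keeps only the first line of the sorted listing
first_name=$(ls | head -n 1)

echo "$csv_count"    # 2
echo "$first_name"   # data.csv
```

When its output goes into a pipe rather than to the screen, ls writes one name per line, which is what makes line-oriented tools like grep and head convenient here.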

9 Job Control

Normally, you type a command into the shell and, while the command is running, you can't type any more commands into that shell. This is known as running in the foreground. When a job is running in the foreground, you can signal it to stop with control-C, which sends an interrupt signal. Some programs, like R, will catch this signal and do something useful: R, for example, will abandon the current computation and return to its normal prompt. Other programs (like ls) will simply terminate when they receive that signal. Another signal you can send to a foreground program is control-\, which sends a quit signal. Programs can catch this signal too, but few do, so control-\ will usually terminate a foreground program that does not respond to control-C, possibly leaving a core dump file behind. A signal that no program can catch is the kill signal, which is sent with kill -9 as described at the end of this section.

When you are sitting in front of the computer, it's not too much of a burden to open a second window, but if you're computing remotely it's nice to be able to run a command and get back the shell prompt to continue working. In addition, when you run jobs in the foreground, they will always terminate when you log out of the computer. The alternative to running in the foreground is running in the background. You can run any command in the background by putting an ampersand (&) at the end of your command. If a job is running in the background and you log off from the computer, it will usually continue to run (the details depend on how your shell is configured), but any output that is not redirected will be lost.

If you've got a job running in the foreground, and you'd like to put it in the background, you need to carry out two separate steps. First, signal the program with control-Z which will suspend the program. Once the program is suspended, and the shell prompt returns, type

bg

to put it in the background. Notice that jobs that are suspended with control-Z continue to use resources even if they are not running, so when you really want to stop a job you should use control-C or control-\. If you want to put a suspended job in the foreground, use the fg command.
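The interactive control-Z, bg and fg steps can't be shown in a script, but the ampersand form can. In this sketch a made-up slow job (a one second sleep followed by an echo) runs in the background with its output redirected to a file; $! and wait are standard shell features for finding and waiting on the most recent background job.

```shell
scratch=$(mktemp -d)

# Start the slow job in the background, sending its output to a file.
( sleep 1; echo "finished" ) > "$scratch/job.out" &
job_pid=$!                 # process id of the job just started
echo "started background job $job_pid"

# The prompt would be available here for other commands.

wait "$job_pid"            # pause until the background job ends
result=$(cat "$scratch/job.out")
echo "$result"             # finished
```

Redirecting the output when the job is started is what lets you read the result later, even from a different login session.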

We've seen how to manage jobs when you still have access to the shell from which you started them, but sometimes you need to find out about jobs whose originating shells you can no longer reach. For example, you may remotely log in to a computer, start a job, log out and then want to find out if it's running, or to stop it. The ps command can be used to find out about all the programs that are running on a computer. Remember that for a networked system of computers, the ps command will only display information about commands that are running on the particular computer you're using, so if you put a job in the background and log off, it's a very good idea to remember the name of the computer you were using so that you can check on its progress later. To find all the commands you're running on a particular computer type:

ps -ef | grep username

where username is the account name you logged in with. A single line of ps output looks like this:

spector   1325 10380  0 09:48 pts/10   00:00:00 /bin/bash /usr/local/linux/bin/R

Notice that the name of the program that's running is shown at the end of the line. You can identify a particular program by looking for its name there, or by using a second grep command when invoking ps. The second column of the output contains the process id or PID of the process; this number uniquely identifies your job on a particular computer. If you want to terminate a job whose PID you know, type

kill pid

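This sends a terminate signal which, like the interrupt from control-C, a program can catch in order to shut down cleanly. To send a kill signal, which programs cannot catch or ignore, type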

kill -9 pid

where pid is the number from the ps output.
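The whole cycle of finding a job and stopping it can be rehearsed on a harmless process. In the sketch below, sleep 300 plays the part of a long-running job; ps -p limits the output to a single process, and kill -0 sends no signal at all but reports whether the process still exists. The variable names are illustrative only.

```shell
# Start a stand-in for a long-running job.
sleep 300 &
target_pid=$!

# Show the ps entry for just that process.
ps -p "$target_pid"

# Ask the job to terminate, then wait for it to go away.
kill "$target_pid"
wait "$target_pid" 2>/dev/null

# kill -0 succeeds only if the process is still there.
if kill -0 "$target_pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "$still_running"   # no
```

If a real job were still running after a plain kill, that would be the moment to fall back on kill -9.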

File translated from TeX by TtH, version 3.67, on 31 Jan 2011, 15:28.