When a command is sent to the background, the Unix system automatically displays two numbers. The first is called the command's job number and the second the process id. In the preceding example, 1 was the job number and the number that followed it was the process id. The job number is used by some shell commands that you'll learn more about in a later chapter. The process id uniquely identifies the command that you sent to the background and can be used to obtain status information about the command.
This is done with the ps command. The ps command gives you information about the processes running on the system. If you type in ps at your terminal, you'll get a few lines back describing the processes you have running. The sh process in the preceding example is the shell that was started when you logged in, and it has used 21 seconds of computer time. Until the command is finished, it shows up in the output of the ps command as a running process.
One process in the preceding example is the ps command that was typed in, and another is the sort from the earlier example. When used with the -f option, ps prints out more information about your processes, including the parent process id (PPID), the time the processes started (STIME), and the command arguments. Table 2. In this table, file refers to a file, file(s) to one or more files, dir to a directory, and dir(s) to one or more directories.
What Is the Shell?

In this chapter you'll learn what the shell is and what it does.

The Kernel and the Utilities

The Unix system is itself logically divided into two pieces: the kernel and the utilities (see Figure 3).
The kernel is the heart of the Unix system and resides in the computer's memory from the time the computer is turned on and booted until the time it is shut down. The utilities, on the other hand, reside on the computer's disk and are only brought into memory as requested.
Virtually every command you know under the Unix system is classified as a utility; therefore, the program resides on the disk and is brought into memory only when you request that the command be executed.
So, for example, when you execute the date command, the Unix system loads the program called date from the computer's disk into memory and initiates its execution. The shell, too, is a utility program. It is loaded into memory for execution whenever you log in to the system.
In fact, it's worth learning the precise sequence of events that occurs when the first shell on a terminal or window starts up.

The Login Shell

A terminal is connected to a Unix system through a direct wire, modem, or network. In the first case, as soon as you turn on the terminal (and press the Enter key a couple of times if necessary), you should get a login: message on your screen.
In the second case, you must first dial the computer's number and get connected before the login: message appears. In the last case, you may connect over the network via a program such as ssh, telnet, or rlogin, or you may use some kind of networked windowing system (for example, X Window System) to start up a terminal emulation program (for example, xterm). For each physical terminal port on a system, a program called getty will be active.
This is depicted in Figure 3 (The getty process). The Unix system (more precisely, a program called init) automatically starts up a getty program on each terminal port whenever the system is allowing users to log in.
As soon as someone types in some characters followed by Enter, the getty program disappears; but before it goes away, it starts up a program called login to finish the process of logging in (see Figure 3). It also gives login the characters you typed in at the terminal, characters that presumably represent your login name. When login begins execution, it displays the string Password: at the terminal and then waits for you to type your password.
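After you have typed your password, login verifies your login name and password against the corresponding entry in the file /etc/passwd. This file contains one line for each user of the system. That line specifies, among other things, the login name, home directory, and program to start up when that user logs in. The last bit of information (the program to start up) is stored after the last colon of each line. Often that program is the standard shell. In other cases, it may be a special custom-designed program.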
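As a rough illustration of the line format just described, the sketch below splits one colon-separated entry into its fields. The entry itself (the account details for sue, the numeric ids, and the paths) is made up for the example and is not taken from a real system.

```shell
# A made-up /etc/passwd-style line; every value here is hypothetical.
entry='sue:x:1001:100:Sue Example:/users/sue:/bin/sh'

# Split the line on colons. The last field is the program that login starts.
IFS=: read -r name passwd uid gid comment home program <<EOF
$entry
EOF

echo "login name:       $name"
echo "home directory:   $home"
echo "program to start: $program"
```

Reading the fields with IFS set to a colon mirrors how the file is laid out: the position of a field, not its width, tells you what it means.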
The main point here is that you can set up a login account to automatically run any program whatsoever whenever someone logs in to it. The shell just happens to be the program most often selected. So login initiates execution of the standard shell on sue's terminal after validating her password (see Figure 3). Three users logged in.
The init program starts up other programs similar to getty for networked connections. For example, sshd, telnetd, and rlogind are started to service logins via ssh, telnet, and rlogin, respectively. Instead of being tied directly to a specific, physical terminal or modem line, these programs connect users' shells to pseudo ttys.
These are devices that emulate terminals over network connections. You can see this whether you're logged in to your system over a network or on an X Windows screen. Each time you type in a command and press the Enter key (Step 3), the shell analyzes the line you typed and then proceeds to carry out your request (Step 4).
If you ask it to execute a particular program, the shell searches the disk until it finds the named program.
When found, the shell asks the kernel to initiate the program's execution and then the shell "goes to sleep" until the program has finished (Step 5). The kernel copies the specified program into memory and begins its execution. This copied program is called a process; in this way, the distinction is made between a program that is kept in a file on the disk and a process that is in memory doing things.
If the program writes output to standard output, it will appear at your terminal unless redirected or piped into another command. Similarly, if the program reads input from standard input, it will wait for you to type in input unless redirected from a file or piped from another command (Step 6).
When the command finishes execution, control once again returns to the shell, which awaits your next command (Steps 7 and 8). Note that this cycle continues as long as you're logged in.
When you log off the system, execution of the shell then terminates and the Unix system starts up a new getty (or rlogind, and so on) at the terminal and waits for someone else to log in. This cycle is illustrated in Figure 3 (Login cycle). It's important for you to recognize that the shell is just a program. It has no special privileges on the system, meaning that anyone with the capability and devotion can create his own shell program. This is in fact the reason why various flavors of the shell exist today, including the older Bourne shell, developed by Stephen Bourne; the Korn shell, developed by David Korn; the "Bourne again shell," mainly used on Linux systems; and the C shell, developed by Bill Joy.
The Shell's Responsibilities

Now you know that the shell analyzes each line you type in and initiates execution of the selected program. But the shell also has other responsibilities, as outlined in Figure 3. Each time you type in a line to the shell, the shell analyzes the line and then determines what to do. As far as the shell is concerned, each line follows the same basic format: the name of a program followed by its arguments.
The line that is typed to the shell is known more formally as the command line. The shell scans this command line and determines the name of the program to be executed and what arguments to pass to the program. The shell uses special characters to determine where the program name starts and ends, and where each argument starts and ends. These characters are collectively called whitespace characters, and are the space character, the horizontal tab character, and the end-of-line character, known more formally as the newline character.
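Multiple occurrences of whitespace characters are simply ignored by the shell. When you type an mv command with two arguments, the shell scans the line one word at a time: the first word is the name of the program, and each set of characters up to the next whitespace character (known as a word to the shell) is an argument. The last word, which in this case ends at the newline, is the second argument to mv: games.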
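The word splitting described above can be sketched with a small helper that reports exactly what it receives. The function name show_args and the sample words are hypothetical and exist only for this illustration.

```shell
# show_args is a hypothetical helper: it prints how many arguments
# the shell handed to it, then each argument in brackets.
show_args() {
    echo "argument count: $#"
    for arg in "$@"; do
        echo "  [$arg]"
    done
}

# Extra blanks between words do not create extra or longer arguments.
show_args old_name      new_name
show_args    when   do      we   eat
```

However many blanks separate the words, the helper sees two arguments in the first call and four in the second.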
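As mentioned, multiple occurrences of whitespace characters are ignored by the shell. This means that when the shell processes a command line in which echo is followed by four words separated by runs of blanks, it passes exactly four arguments to the echo program.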
Because echo takes its arguments and simply displays them at the terminal, separating each by a space character, the output of such a command becomes easy to understand: the words appear with a single space between them. The fact is that the echo command never sees those blank spaces; they have been "gobbled up" by the shell. We mentioned earlier that the shell searches the disk until it finds the program you want to execute and then asks the Unix kernel to initiate its execution.
This is true most of the time. However, there are some commands that the shell knows how to execute itself. These built-in commands include cd, pwd, and echo. So before the shell goes searching the disk for a command, the shell first determines whether it's a built-in command, and if it is, the shell executes the command directly.
Like any other programming language, the shell lets you assign values to variables. Whenever you specify one of these variables on the command line, preceded by a dollar sign, the shell substitutes the value assigned to the variable at that point.
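A minimal sketch of variable substitution follows. The variable names greeting, target, and message are chosen only for the example.

```shell
# Assign values to two variables (no spaces around the equals sign).
greeting=hello
target=world

# The shell replaces $greeting and $target with their values
# before echo ever starts running.
echo $greeting $target

# Substitution also happens inside double quotes.
message="$greeting, $target"
echo "$message"
```

In both cases echo only ever sees the substituted text; it has no idea that variables were involved.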
The shell also performs filename substitution on the command line. Suppose that your current directory contains four files, and that you type echo followed by a single asterisk. How many arguments do you think were passed to the echo program, one or four?
Because we said that the shell is the one that performs the filename substitution, the answer is four. When the shell analyzes the line, it recognizes the special character * and substitutes on the command line the names of all files in the current directory. Then the shell determines the arguments to be passed to the command. So echo never sees the asterisk. As far as it's concerned, four arguments were typed on the command line (see Figure 3).
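The following sketch reproduces the situation in a scratch directory so that it can be run safely anywhere. The file names a through d and the helper count_args are assumptions made for the example, and mktemp is assumed to be available on your system.

```shell
# count_args is a hypothetical helper that prints its argument count.
count_args() { echo $#; }

# Build a scratch directory holding exactly four files.
workdir=$(mktemp -d)
touch "$workdir/a" "$workdir/b" "$workdir/c" "$workdir/d"

# The shell expands * into the four file names before the command runs.
listing=$(cd "$workdir" && echo *)
matched=$(cd "$workdir" && count_args *)

echo "echo * printed:   $listing"
echo "arguments passed: $matched"

rm -rf "$workdir"
```

The count comes out as four because the command receives the expanded names, never the asterisk itself.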
When the shell sees the output redirection character > on the command line, it takes the word that follows as the name of the file to which the output is to be redirected. In this case, the file is reminder. If reminder already exists and you have write access to it, the previous contents are lost (if you don't have write access to it, the shell gives you an error message). Before the shell starts execution of the desired program, it redirects the standard output of the program to the indicated file. As far as the program is concerned, it never knows that its output is being redirected.
It just goes about its merry way writing to standard output (which is normally your terminal, you'll recall), unaware that the shell has redirected it to a file. Now consider two nearly identical commands: wc -l users and wc -l < users. In the first case, the shell analyzes the command line and determines that the name of the program to execute is wc and it is to be passed two arguments: -l and users (see Figure 3).
When wc begins execution, it sees that it was passed two arguments. The first argument, -l, tells it to count the number of lines. So wc opens the file users, counts its lines, and then prints the count together with the filename at the terminal. Operation of wc in the second case is slightly different.
The shell spots the input redirection character < when it scans the command line. The word that follows on the command line is the name of the file input is to be redirected from. When wc begins execution this time, it sees that it was passed the single argument -l. Because no filename was specified, wc takes this as an indication that the number of lines appearing on standard input is to be counted.
So wc counts the number of lines on standard input, unaware that it's actually counting the number of lines in the file users. The final tally is displayed at the terminal—without the name of a file because wc wasn't given one. The difference in execution of the two commands is important for you to understand. If you're still unclear on this point, review the preceding section.
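The difference between the two forms can be sketched with a small scratch file. The three names written into users are invented for the example, and mktemp is assumed to be available.

```shell
# Create a scratch copy of a "users" file with three lines.
workdir=$(mktemp -d)
printf '%s\n' pat ruth steve > "$workdir/users"   # > sends output to the file

# Case 1: wc is given the filename as an argument, so it can print it.
with_name=$(cd "$workdir" && wc -l users)

# Case 2: the shell redirects standard input; wc never learns the name.
from_stdin=$(cd "$workdir" && wc -l < users)

echo "wc -l users   gives: $with_name"
echo "wc -l < users gives: $from_stdin"

rm -rf "$workdir"
```

Only the first result includes the word users, because only in that case did wc receive the filename.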
Just as the shell scans the command line looking for redirection characters, it also looks for the pipe character |. For each such character that it finds, it connects the standard output from the command preceding the | to the standard input of the one following the |.
It then initiates execution of both programs. So when you type who | wc -l, the shell finds the pipe symbol separating the commands who and wc. It connects the standard output of the former command to the standard input of the latter, and then initiates execution of both commands.
When the who command executes, it makes a list of who's logged in and writes the results to standard output, unaware that this is not going to the terminal but to another command instead. When the wc command executes, it recognizes that no filename was specified and counts the lines on standard input, unaware that standard input is not coming from the terminal but from the output of the who command.
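Because the output of who differs from system to system, the sketch below substitutes a stand-in function so the result is predictable. The function fake_who and the three sessions it prints are assumptions made for illustration only.

```shell
# fake_who stands in for who; its three lines are invented.
fake_who() {
    printf '%s\n' 'pat    tty29' 'ruth   tty37' 'steve  tty25'
}

# The shell connects fake_who's standard output to wc's standard input.
fake_who | wc -l

# The same pipeline, captured so the count can be reused.
logged_in=$(fake_who | wc -l)
echo "sessions counted: $logged_in"
```

Neither side of the pipe knows about the other: the left side writes to standard output, and wc reads standard input.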
Environment Control

The shell provides certain commands that let you customize your environment. Your environment includes your home directory, the characters that the shell displays to prompt you to type in a command, and a list of the directories to be searched whenever you request that a program be executed.
You'll learn more about this in Chapter 11, "Your Environment." The shell has its own built-in programming language. This language is interpreted, meaning that the shell analyzes each statement in the language one line at a time and then executes it.
This differs from programming languages such as C and FORTRAN, in which the programming statements are typically compiled into a machine-executable form before they are executed. Programs developed in interpreted programming languages are typically easier to debug and modify than compiled ones.
However, they usually take much longer to execute than their compiled equivalents. The shell programming language provides features you'd find in most other programming languages. It has looping constructs, decision-making statements, variables, and functions, and is procedure-oriented.

Chapter 4.

This chapter provides detailed descriptions of some commonly used shell programming tools. Covered are cut, paste, sed, tr, grep, uniq, and sort. The more proficient you become at using these tools, the easier it will be to write shell programs to solve your problems.
In fact, that goes for all the tools provided by the Unix system.

Regular Expressions

Before getting into the tools, you need to learn about regular expressions. Regular expressions are used by several different Unix commands, including ed, sed, awk, grep, and, to a more limited extent, vi. They provide a convenient and consistent way of specifying patterns to be matched. The shell recognizes a limited form of regular expressions when you use filename substitution.
The regular expressions recognized by the aforementioned programs are far more sophisticated than those recognized by the shell. Also be advised that the asterisk and the question mark are treated differently by these programs than by the shell. Throughout this section, we assume familiarity with a line-based editor such as ex or ed.
A period in a regular expression matches any single character, no matter what it is. So a regular expression such as p.o matches a p, followed by any single character, followed by an o. In the first search, ed started searching from the beginning of the file and found the characters " was " in the first line that matched the indicated pattern. The substitute command that followed specified that all occurrences of the character p, followed by any single character, followed by the character o were to be replaced by the characters XXX.
The caret ^ matches the beginning of the line and the dollar sign $ matches the end of the line. What do you think would be matched by the regular expression .$? Would this match a period character that ends a line? No. This matches any single character at the end of the line (including a period), recalling that the period matches any character. So how do you match a period? In general, to remove the special meaning of a character in a regular expression, precede it with a backslash, so \.$ matches a line that ends in a period. Suppose that you are editing a file and want to search for the first occurrence of the characters the. In ed, this is easy: you simply type the command /the/.
This causes ed to search forward in its buffer until it finds a line containing the indicated string of characters. The first line that matches will be displayed by ed. Notice that the first line of the file also contains the word the, except it starts a sentence and so begins with a capital T. You can tell ed to search for the first occurrence of the or The by using a regular expression. Just as in filename substitution, the characters [ and ] can be used in a regular expression to specify that one of the enclosed characters is to be matched.
So, the regular expression [tT]he matches a lowercase or uppercase t followed immediately by the characters he. A range of characters can be specified inside the brackets. This can be done by separating the starting and ending characters of the range by a dash (-). So, to match any digit character 0 through 9, you could use the regular expression [0-9]. As you'll learn shortly, the asterisk is a special character in regular expressions. However, you don't need to put a backslash before the asterisk in the replacement string of the substitute command.
You know that the asterisk is used by the shell in filename substitution to match zero or more characters. In forming regular expressions, the asterisk is used to match zero or more occurrences of the preceding character in the regular expression (which may itself be another regular expression).
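The "zero or more" rule is easy to misread, so here is a small sketch using sed, which uses the same kind of regular expressions. The sample strings are invented for the example.

```shell
# X* can match zero X's, so it matches the empty string at the very
# start of the line and the replacement lands in front of the a.
echo 'aXXXb' | sed 's/X*/-/'

# XX* requires at least one X, so the whole run of X's is replaced.
echo 'aXXXb' | sed 's/XX*/-/'

# A blank followed by "zero or more blanks" matches a run of one or
# more blanks, which is then collapsed to a single blank.
echo 'hello     world' | sed 's/  */ /'
```

The first command is the surprising one: because zero occurrences count as a match, the pattern succeeds before sed ever reaches the first X.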
A similar type of pattern is frequently used to match the occurrence of one or more blank spaces. Bear in mind that a regular expression matches the longest string of characters that match the pattern. Therefore, used by itself, the regular expression .* always matches the entire line of text. As another example, consider the regular expression [A-Za-z][A-Za-z]*. What do you think it matches?
That's right, this matches any alphabetic character followed by zero or more alphabetic characters. This is pretty close to a regular expression that matches words. The only thing it didn't match in this example was a number. You can change the regular expression to also consider a sequence of digits as a word by adding the range 0-9 inside both sets of brackets. We could expand on this somewhat to consider hyphenated words and contracted words (for example, don't), but we'll leave that as an exercise for you.
In the preceding examples, you saw how to use the asterisk to specify that one or more occurrences of the preceding regular expression are to be matched. For instance, the regular expression XX* means to match at least one consecutive X. There is a more general way to specify a precise number of characters to be matched: by using the construct \{min,max\}, where min specifies the minimum number of occurrences of the preceding regular expression to be matched, and max specifies the maximum. As stated before, whenever there is a choice, the largest pattern is matched; so if the input text contains eight consecutive X's at the beginning of the line, and the limits allow that many, that is how many will be matched by the regular expression.
A few special cases of this special construct are worth noting. If only one number is enclosed between the braces, as in \{n\}, that number specifies that the preceding regular expression must be matched exactly that many times. Note that the last line of the file didn't have five characters when the last substitute command was executed; therefore, the match failed on that line and thus was left alone (recall that we specified that exactly five characters were to be deleted).
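A short sketch of the brace construct, again using sed. The input lines and the particular limits (5, and 1 through 5) are chosen only for illustration.

```shell
# Delete exactly the first five characters of each line. The second
# line has only three characters, so the match fails and it is untouched.
printf '%s\n' 'abcdefgh' 'abc' | sed 's/^.\{5\}//'

# Match between one and five X's at the start of the line. Eight are
# available, so the largest allowed number (five) is taken.
echo 'XXXXXXXXab' | sed 's/^X\{1,5\}/-/'
```

The second command leaves three X's behind, which shows both the "longest match" rule and the upper limit at work.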
If a single number is enclosed in the braces, followed immediately by a comma, then at least that many occurrences of the previous regular expression must be matched.
Once again, if more than five exist, the largest number is matched. It is possible to capture the characters matched within a regular expression by enclosing the characters inside backslashed parentheses. These captured characters are stored in "registers" numbered 1 through 9. To retrieve the characters stored in a particular register, the construct \n is used, where n is from 1 to 9. The net effect of a regular expression such as ^\(.\)\1 is to match the first two characters on a line if they are both the same character.
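The register mechanism can be tried out with grep and sed, which accept the same backslashed parentheses. The word list and the phone number below are invented for the example; the tab character is stored in a variable so that it stays visible in the listing.

```shell
# Print only the lines whose first two characters are identical.
printf '%s\n' 'aardvark' 'banana' 'llama' | grep '^\(.\)\1'

# Swap two tab-separated fields: register 1 holds the text before the
# tab, register 2 the text after it.
tab=$(printf '\t')
swapped=$(printf 'Alice Chebba\t555-0100\n' | sed "s/\(.*\)$tab\(.*\)/\2 \1/")
echo "$swapped"
```

The replacement string writes register 2 first, then a space, then register 1, so the two fields trade places.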
Go over this example if it doesn't seem clear. As a further example, consider swapping the two fields on each line of the phonebook file. The names and the phone numbers are separated from each other in the phonebook file by a single tab character.
The regular expression captures everything up to the tab in register 1 and everything after the tab in register 2. The replacement string \2 \1 then substitutes the characters that were matched (the entire line) with the contents of register 2, followed by a space, followed by the contents of register 1 (Alice Chebba). As you can see, regular expressions are powerful tools that enable you to match complex patterns. Table 4.

This section teaches you about a useful command known as cut.
This command comes in handy when you need to extract (that is, "cut out") various fields of data from a data file or the output of a command. The general format of the cut command is cut -cchars file, where chars specifies what characters you want to extract from each line of file. This can consist of a single number, as in -c5 to extract character 5; a comma-separated list of numbers, as in -c1,13,50 to extract characters 1, 13, and 50; or a dash-separated range of numbers, as in -c20-50 to extract characters 20 through 50, inclusive.
To extract characters to the end of the line, you can omit the second number of the range; so a range such as -c5- extracts characters 5 through the end of the line. If file is not specified, cut reads its input from standard input, meaning that you can use cut as a filter in a pipeline. Suppose that the who command shows that four people are currently logged in. Suppose also that you just want to know the names of the logged-in users and don't care about what terminals they are on or when they logged in.
You can use the cut command to cut out just the usernames from the who command's output by piping who into cut -c1-8. The -c1-8 option to cut specifies that characters 1 through 8 are to be extracted from each line of input and written to standard output.
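Since real who output varies, the sketch below uses a stand-in that prints two lines in the fixed-column layout the text describes (username in columns 1 through 8, terminal in 10 through 16, login time from 18 on). The function fake_who and its contents are assumptions for the example.

```shell
# fake_who imitates the fixed-width layout; the sessions are invented.
fake_who() {
    printf '%-8s %-7s %s\n' \
        steve tty02   'Feb 24 12:55' \
        root  console 'Feb 24 08:54'
}

# Extract characters 1 through 8 (the username column) from each line.
fake_who | cut -c1-8

# Tack a sort onto the end of the pipeline for an ordered list.
fake_who | cut -c1-8 | sort
```

Note that cut keeps the padding blanks, because it extracts character positions rather than words.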
You can tack a sort onto the end of the preceding pipeline (who | cut -c1-8 | sort) to get a sorted list of the logged-in users. If you wanted to see what terminals were currently being used, you could cut out just the tty numbers field from the who command's output with cut -c10-16.
How did you know that who displays the terminal identification in character positions 10 through 16? You executed the who command at your terminal and counted out the appropriate character positions. You can use cut to extract as many different characters from a line as you want. Here, cut is used to display just the username and login time of all logged-in users. The option -c1-8,18- says "extract characters 1 through 8 (the username) and also characters 18 through the end of the line (the login time)."
The cut command as described previously is useful when you need to extract data from a file or command provided that file or command has a fixed format.
For example, you could use cut on the who command because you know that the usernames are always displayed in character positions 1-8, the terminal in 10-16, and the login time from position 18 onward. Unfortunately, not all your data will be so well organized!
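For instance, take a look at the file /etc/passwd, which contains the usernames of all users on your system. It also contains other information such as your user id number, your home directory, and the name of the program to start up when you log in.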
Getting back to the cut command, you can see that the data in this file does not align itself the same way who's output does.
So getting a list of all the possible users of your system cannot be done using the -c option to cut. One nice thing about the format of /etc/passwd, however, is that its fields are delimited by a colon character. So although each field may not be the same length from one line to the next, you know that you can "count colons" to get the same field from each line. The -d and -f options are used with cut when you have data that is delimited by a particular character.
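The format of the cut command in this case becomes cut -ddchar -ffields file, where dchar is the character that delimits each field of the data, and fields specifies the fields to be extracted from file. Field numbers start at 1, and the same type of formats can be used to specify field numbers as was used to specify character positions before (for example, -f1,2,8 for a list, with ranges written using a dash in the same way). Given that the home directory of each user is in field 6, you can associate each user of the system with his or her home directory by extracting fields 1 and 6.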
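To keep the result predictable, this sketch runs cut against two made-up lines in /etc/passwd format rather than the real file. The function sample_passwd and every value in it are hypothetical.

```shell
# sample_passwd prints two invented entries in /etc/passwd format.
sample_passwd() {
    printf '%s\n' \
        'root:x:0:0:Super-User:/:/bin/sh' \
        'steve:x:203:100:Steve Example:/users/steve:/bin/sh'
}

# Field 1 alone: the usernames.
sample_passwd | cut -d: -f1

# Fields 1 and 6: each username with its home directory.
sample_passwd | cut -d: -f1,6
```

When several fields are selected, cut joins them with the same delimiter it was told to split on, which is why a colon still appears in the second result.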
If the cut command is used to extract fields from a file and the -d option is not supplied, cut uses the tab character as the default field delimiter. The following depicts a common pitfall when using the cut command. Suppose that you have a file called phonebook that holds a name and a phone number on each line. If you just want to get the names of the people in your phone book, your first impulse would be to use cut with the -c option, as in cut -c1-15 phonebook. Not quite what you want! Part of each phone number comes out along with the name. This happened because the name is separated from the phone number by a tab character and not blank spaces in the phonebook file.
And as far as cut is concerned, tabs count as a single character when using the -c option. So cut extracts the first 15 characters from each line in the previous example, giving the results as shown.
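The pitfall and its cure can be seen side by side in the sketch below. The function phonebook stands in for the file, and the second name and both phone numbers are invented for the example.

```shell
# phonebook prints two tab-separated name/number lines (made-up data).
phonebook() {
    printf '%s\t%s\n' 'Alice Chebba' '555-0100' 'Tony Example' '555-0101'
}

# Counting characters: the tab is one character, so digits leak through.
phonebook | cut -c1-15

# Counting fields: tab is cut's default delimiter, so -f1 is the name.
phonebook | cut -f1
```

With -f1 no delimiter needs to be named, and the length of each name no longer matters.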
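Given that the fields are separated by tabs, you should use the -f option to cut instead, as in cut -f1 phonebook, which prints just the names. Much better! Recall that you don't have to specify the delimiter character with the -d option because cut assumes that a tab character is the delimiter by default. But how do you know in advance whether fields are delimited by blanks or tabs?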
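One way to find out is by trial and error as shown previously. Another way is to type the command sed -n l file at your terminal; if a tab character separates the fields, \t is displayed in place of the tab. The output verifies that each name is separated from each phone number by a tab character.

paste

The paste command is sort of the inverse of cut: instead of breaking lines apart, it puts them together. The general format of the paste command is paste files, where corresponding lines from each of the specified files are pasted together to form single lines that are then written to standard output. The dash character - can be used in files to specify that input is from standard input.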
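Suppose that you have a file called names that contains a list of names, and that you also have a file called numbers that contains corresponding phone numbers for each name in names. You can use paste to print the names and numbers side by side, separated by a tab. If you don't want the fields separated by tab characters, you can specify the -d option with the format -dchars, where chars is one or more characters that will be used to separate the lines pasted together. That is, the first character listed in chars will be used to separate lines from the first file that are pasted with lines from the second file; the second character listed in chars will be used to separate lines from the second file from lines from the third, and so on.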
If there are more files than there are characters listed in chars, paste "wraps around" the list of characters and starts again at the beginning. In the simplest form of the -d option, specifying just a single delimiter character causes that character to be used to separate all pasted fields. It's always safest to enclose the delimiter characters in single quotes. The reason why will be explained shortly. The -s option tells paste to paste together lines from the same file, not from alternate files.
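The three forms of paste discussed here can be sketched with two small scratch files. The names and phone numbers are invented, and mktemp is assumed to be available.

```shell
# Build scratch copies of the names and numbers files (made-up data).
workdir=$(mktemp -d)
printf '%s\n' Tony Emanuel Lucy > "$workdir/names"
printf '%s\n' 555-0101 555-0102 555-0103 > "$workdir/numbers"

# Default: corresponding lines joined by a tab.
paste "$workdir/names" "$workdir/numbers"

# -d: use a plus sign instead of a tab (quoted to protect it from the shell).
plus=$(paste -d'+' "$workdir/names" "$workdir/numbers")
echo "$plus"

# -s: join the lines of one file together, here separated by a space.
joined=$(paste -s -d' ' "$workdir/names")
echo "$joined"

rm -rf "$workdir"
```

The -s form turns a column of lines into a single row, which is often handy for building a list on one line.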
If just one file is specified, the effect is to merge all the lines from the file together, separated by tabs, or by the delimiter characters specified with the -d option.

sed

sed is a program used for editing data. It stands for stream editor. Unlike ed, sed cannot be used interactively. However, its commands are similar. The general form of the sed command is sed command file, where command is an ed-style command applied to each line of the specified file. If no file is specified, standard input is assumed.
As sed applies the indicated command to each line of the input, it writes the results to standard output. For now, get into the habit of enclosing your sed command in a pair of single quotes. Later, you'll know when the quotes are necessary and when to use double quotes instead. Whether or not the line gets changed by the command, it gets written to standard output all the same. Note that sed makes no changes to the original input file. To make the changes permanent, you must redirect the output from sed into a temporary file (for example, temp) and then move the file back to the old one with mv.
Always make sure that the correct changes were made to the file before you overwrite the original; a cat of temp could have been included between the two commands shown previously to ensure that the sed succeeded as planned. If your text included more than one occurrence of "Unix" on a line, the preceding sed would have changed just the first occurrence on each line to "UNIX." By appending the global option g to the end of the substitute command, you can ensure that multiple occurrences on a line are changed. In this case, the sed command would read s/Unix/UNIX/g.
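Suppose that you wanted to extract just the usernames from the output of who. You already know how to do that with the cut command. Alternatively, you can use sed to delete all the characters from the first blank space (which marks the end of the username) through the end of the line. The sed command 's/ .*$//' says to substitute a blank space followed by any characters up to the end of the line with nothing.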
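The substitute commands discussed above can be tried on single lines of input. The sample sentences and the who-style line are invented for the example.

```shell
# Without g, only the first occurrence on the line is changed.
echo 'Unix and Unix again' | sed 's/Unix/UNIX/'

# With g, every occurrence on the line is changed.
echo 'Unix and Unix again' | sed 's/Unix/UNIX/g'

# Delete everything from the first blank to the end of the line,
# leaving just the first word (the username in who-style output).
echo 'steve    tty02   Feb 24 12:55' | sed 's/ .*$//'
```

In each case the input itself is untouched; sed only writes the edited copy to standard output.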
We pointed out that sed always writes each line of input to standard output, whether or not it gets changed. Sometimes, however, you'll want to use sed just to extract some lines from a file. For such purposes, use the -n option. This option tells sed that you don't want it to print any lines unless explicitly told to do so. This is done with the p command.
By specifying a line number or range of line numbers, you can use sed to selectively print lines of text. So, for example, to print just the first two lines from a file, the command sed -n '1,2p' followed by the filename could be used.
If, instead of line numbers, you precede the p command with a string of characters enclosed in slashes, sed prints just those lines from standard input that contain those characters. For example, sed -n '/UNIX/p' intro displays just the lines of intro that contain the string UNIX. By specifying a line number or range of numbers, you can also delete specific lines from the input with the d command. For example, sed '1,2d' intro deletes the first two lines of text from intro.
Remembering that by default sed writes all lines of the input to standard output, the remaining lines in text (that is, lines 3 through the end) simply get written to standard output. By preceding the d command with a string of text, you can use sed to delete all lines that contain that text.
For example, sed '/UNIX/d' intro deletes all lines of text containing the word UNIX. The power and flexibility of sed goes far beyond what we've shown here.
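The printing and deleting commands from this section can be compared on one small input. The function sample and its four lines are made up for the example and stand in for a file such as intro.

```shell
# sample prints four invented lines of text.
sample() {
    printf '%s\n' 'line one' 'the UNIX system' 'line three' 'more UNIX text'
}

sample | sed -n '1,2p'      # print only lines 1 and 2
sample | sed -n '/UNIX/p'   # print only lines containing UNIX
sample | sed '1,2d'         # delete lines 1 and 2, print the rest
sample | sed '/UNIX/d'      # delete lines containing UNIX, print the rest
```

Notice that -n appears only with p: the d command relies on sed's default behavior of printing every line that survives.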
tr

The tr filter is used to translate characters from standard input. The general form of the command is tr from-chars to-chars, where from-chars and to-chars are one or more single characters. Any character in from-chars encountered on the input will be translated into the corresponding character in to-chars. The result of the translation is written to standard output. In its simplest form, tr can be used to translate one character into another, for example every e in the file intro into an x.
The results of the translation are written to standard output, leaving the original file untouched. Showing a more practical example, recall the pipeline that you used to extract the usernames and home directories of everyone on the system. You can translate the colons into tab characters to produce a more readable output simply by tacking an appropriate tr command to the end of the pipeline. In that tr command, a tab character is enclosed between the single quotes (even though you can't see it; just take our word for it).
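It must be enclosed in quotes to keep it from the shell and give tr a chance to see it. The octal representation of a character can also be given to tr in the format \nnn, where nnn is the octal value of the character. For example, the octal value of the tab character is 11. If you are going to use this format, be sure to enclose the character in quotes.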
In the following example, tr takes the output from date and translates all spaces into newline characters. The net result is that each field of output from date appears on a different line. tr can also take ranges of characters. For example, tr '[a-z]' '[A-Z]' < intro translates all lowercase letters in intro to their uppercase equivalents. The character ranges [a-z] and [A-Z] are enclosed in quotes to keep the shell from replacing the first range with all the files in your directory named a through z, and the second range with all the files in your directory named A through Z.
What do you think happens if no such files exist? By reversing the two arguments to tr, you can use it to translate all uppercase letters to lowercase. You can use the -s option to tr to "squeeze" out multiple occurrences of characters in to-chars.
In other words, if more than one consecutive occurrence of a character specified in to-chars occurs after the translation is made, the characters will be replaced by a single character. For example, tr -s ':' '\11' translates all colons into tab characters, replacing multiple tabs with single tabs. So one colon or several consecutive colons on the input will be replaced by a single tab character on the output.
You can use tr to squeeze out runs of multiple spaces by using the -s option and by specifying a single space character as the first and second argument. tr can also be used to delete single characters from a stream of input. The general format of tr in this case is tr -d from-chars, where any single character listed in from-chars is deleted from standard input. For example, tr -d ' ' < intro deletes all spaces from the file intro. The same result could be obtained with sed. Either approach is satisfactory (that is, tr or sed); however, tr is probably a better choice in this case because it is a much smaller program and likely to execute a bit faster.
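The main uses of tr covered in this section are collected in the sketch below. All input strings are invented for the example; the tab is written in octal as '\11', as described above.

```shell
# Translate one character into another.
echo 'hello there' | tr e x

# Translate lowercase to uppercase using quoted ranges.
echo 'hello there' | tr '[a-z]' '[A-Z]'

# Translate colons to tabs, squeezing runs of tabs into one.
echo 'a::b:::c' | tr -s ':' '\11'

# Squeeze runs of spaces down to a single space.
echo 'too    many   spaces' | tr -s ' ' ' '

# Delete every space.
echo 'no spaces here' | tr -d ' '
```

Each command works on single characters only, which is why none of them could replace a whole word.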
Bear in mind that tr works only on single characters. So if you need to translate anything longer than a single character (say all occurrences of unix to UNIX), you have to use a different program such as sed instead.

grep

grep allows you to search one or more files for particular character patterns. The general format of this command is grep pattern files. Every line of each file that contains pattern is displayed at the terminal. If more than one file is specified to grep, each line is also immediately preceded by the name of the file, thus enabling you to identify the particular file that the pattern was found in.
If the pattern does not exist in the specified file(s), the grep command simply displays nothing. You saw in the section on sed how you could print all lines containing the string UNIX from the file intro with the command sed -n '/UNIX/p' intro. The command grep UNIX intro achieves the same result. The grep command is useful when you have a lot of files and you want to find out which ones contain certain words or phrases.
For example, the command grep shell * searches for the word shell in all files in the current directory. It's generally a good idea to enclose your grep pattern inside a pair of single quotes to "protect" it from the shell. For instance, if you want to find all the lines containing asterisks inside the file stars, typing grep * stars does not work as you'd expect. In this case, the shell took the asterisk and substituted the list of files in your current directory.
Then it started execution of grep, which took the first argument circles and tried to find it in the files specified by the remaining arguments, as shown in Figure 4.
Enclosing the asterisk in quotes, however, removes its special meaning from the shell. The quotes told the shell to leave the enclosed characters alone.
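The grep behavior described in this section can be checked against two scratch files. The file names notes and memo and their contents are assumptions made for the example, and mktemp is assumed to be available. The asterisk is both quoted and preceded by a backslash so that neither the shell nor grep treats it specially.

```shell
# Build two scratch files (invented contents).
workdir=$(mktemp -d)
printf '%s\n' 'the shell is a program' 'stars: ***' > "$workdir/notes"
printf '%s\n' 'no match here' 'another shell line' > "$workdir/memo"

# One file: matching lines are printed as they are.
single=$(cd "$workdir" && grep shell notes)
echo "$single"

# Several files: each matching line is preceded by its file name.
multi=$(cd "$workdir" && grep shell notes memo)
echo "$multi"

# A quoted pattern reaches grep intact instead of being expanded.
stars=$(cd "$workdir" && grep '\*' notes)
echo "$stars"

rm -rf "$workdir"
```

Without the quotes, the shell would have replaced the asterisk with file names before grep ever started.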
The whole topic of how quotes are handled by the shell is fascinating; an entire chapter (Chapter 6, "Can I Quote You on That?") is devoted to it. grep takes its input from standard input if no filename is specified. So you can use grep on the other side of a pipe to scan through the output of a command for something. For example, suppose that you want to find out whether the user jim is logged in. You can pipe the output of who into grep and search for jim.