Tuesday, 18 August 2009

How To Look Like A UNIX Guru

UNIX is an extremely popular platform for deploying server software partly because of its security and stability, but also because it has a rich set of command line and scripting tools. Programmers use these tools for manipulating the file system, processing log files, and generally automating as much as possible.

If you want to be a serious server developer, you will need to have a certain facility with a number of UNIX tools; about 15. You will start to see similarities among them, particularly regular expressions, and soon you will feel very comfortable. Combining the simple commands, you can build very powerful tools very quickly--much faster than you could build the equivalent functionality in C or Java, for example.

This lecture takes you through the basic commands and then shows you how to combine them in simple patterns or idioms to provide sophisticated functionality like histogramming. This lecture assumes you know what a shell is and that you have some basic familiarity with UNIX.

[By the way, this page gets a lot of attention on the net and unfortunately I get mail from lots of people that have better solutions or stuff I should add. I'm only showing what I've learned from watching good UNIX people so I am not saying these tips are the optimal solutions. I'd make a pretty ignorant sys admin.]

Everything is a stream

The first thing you need to know is that UNIX is based upon the idea of a stream. Everything is a stream, or appears to be. Device drivers look like streams, terminals look like streams, processes communicate via streams, etc... The input and output of a program are streams that you can redirect into a device, a file, or another program.

Here is an example device, the null device, that lets you throw output away. For example, you might want to run a program but ignore the output.

$ ls > /dev/null # ignore output of ls

where "# ignore output of ls" is a comment.

Most of the commands covered in this lecture process stdin and send results to stdout. In this manner, you can incrementally process a data stream by hooking the output of one tool to the input of another via a pipe. For example, the following piped sequence prints the number of files in the current directory modified in August.

$ ls -l | grep Aug | wc -l

Imagine how long it would take you to write the equivalent C or Java program. You can become an extremely productive UNIX programmer if you learn to combine the simple command-line tools. Even when programming on a PC, I use MKS's UNIX shell and command library to make it look like a UNIX box. Worth the cash.

Getting help

If you need to know about a command, ask for the "man" page. For example, to find out about the ls command, type

$ man ls
LS(1) System General Commands Manual LS(1)

NAME
ls - list directory contents

SYNOPSIS
ls [-ACFLRSTWacdfgiklnoqrstux1] [file ...]

DESCRIPTION
For each operand that names a file of a type other than directory, ls
...

You will get a summary of the command and any arguments.

If you cannot remember the command's name, try using apropos which finds commands and library routines related to that word. For example, to find out how to do checksums, type

$ apropos checksum
cksum(1), sum(1) - display file checksums and block counts
md5(1) - calculate a message-digest fingerprint (checksum) for a file

Special Directories and files

A shortcut for your home directory, /home/username, is ~username. For example, ~parrt is my home directory, /home/parrt.

When you are using the shell, there is the notion of current directory. The dot '.' character is a shorthand for the current directory and '..' is a shorthand for the directory above the current. So to access file test in the current directory, ./test is the same as plain test. If test is in the directory above, use ../test.

/ is the root directory; there is no drive specification in UNIX.

The .bash_profile file is very important because it initializes your shell session, including your ever-important CLASSPATH environment variable. Your bash shell initialization file is ~username/.bash_profile and contains setup code like the following:

PATH=$PATH:$HOME/bin

Typically, you will go in and set your CLASSPATH so that you don't have to set it all the time.

export CLASSPATH=".:/home/public/cs601/junit.jar"

The export means that the assignment to CLASSPATH is visible to all child processes (that is, visible to all programs you run from the shell).
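To see what export changes, here is a minimal sketch that compares what a child shell sees for an exported and a non-exported variable. The names DEMO_LOCAL and DEMO_EXPORTED are made up for illustration.

```shell
# DEMO_LOCAL and DEMO_EXPORTED are hypothetical names used only for this demo.
unset DEMO_LOCAL DEMO_EXPORTED
DEMO_LOCAL="shell only"                      # plain assignment: stays in this shell
export DEMO_EXPORTED="visible to children"   # exported: copied into child environments

# sh -c starts a child process; the single quotes delay expansion until the child runs.
child_view=$(sh -c 'echo "local=[$DEMO_LOCAL] exported=[$DEMO_EXPORTED]"')
echo "$child_view"
```

The child prints an empty value for the variable that was not exported, which is why CLASSPATH needs the export.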

The basics

cd

Changing a directory is done with cd dir where dir can be "." or ".." to move to current directory (do nothing) or go up a directory.

ls

Display files in a directory with ls. The -l option is used to display details of the files:

total 9592
-rw-r--r-- 1 parrt staff 5600 Aug 19 2005 C-Java-relationship.html
...
drwxr-xr-x 13 parrt staff 442 Oct 19 2005 sessions
-rw-r--r-- 1 parrt staff 2488 Oct 19 2005 sessions.html
...

"staff" is parrt's group.

If you want to see hidden files (those starting with "."), use "-a".

Combinations are possible: use "ls -la" to see details of all files including hidden ones.

displaying files

There are 4 useful ways to display the contents or portions of a file. The first is the very commonly used command cat. For example, to display my list of object-oriented keywords used in this course, type:

$ cat /home/public/cs601/oo.keywords.txt

If a file is really big, you will probably want to use more, which spits the file out in screen-size chunks.

$ more /var/log/mail.log

If you only want to see the first few lines of a file or the last few lines use head and tail.

$ head /var/log/mail.log
$ tail /var/log/mail.log

You can specify a number as an argument to get a specific number of lines:

$ head -30 /var/log/mail.log

The most useful incantation of tail prints the last few lines of a file and then waits, printing new lines as they are appended to the file. This is great for watching a log file:

$ tail -f /var/log/mail.log

If you need to know how many characters, words, or lines are in a file, use wc:

$ wc /var/log/mail.log
164 2916 37896 /var/log/mail.log

The numbers are, in order, lines, words, and characters. For clarity, you can use wc -l to print just the number of lines.

pushd, popd

Instead of cd you can use pushd to save the current dir and then automatically cd to the specified directory. For example,

$ pwd
/Users/parrt
$ pushd /tmp
/tmp ~
$ pwd
/tmp
$ popd
~
$ pwd
/Users/parrt

top

To watch a dynamic display of the processes on your box in action, use top.

ps

To print out (wide display) all processes running on a box, use ps auxwww.

chmod

To change the privileges of a file or directory, use chmod. The privileges are written as three octal digits with 3 bits per digit, rwxrwxrwx, where the first digit is for the file owner, the second for the group, and the third for everybody else. 644 is a common value for files; it means 110100100 or rw-r--r--. When you do ls -l you will see these bits. 755 is a common value for directories: rwxr-xr-x, since a directory needs to be executable for cd to be able to enter it. 755 is a shorthand for the more readable argument u=rwx,go=rx, where u is user, g is group, and o is other.

Use chmod -R for recursively applying to all the dirs below the argument as well.
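As a quick check of the octal and symbolic forms, the following sketch applies both to a throwaway file created with mktemp and reads the permission bits back from ls -l. The variable names are only for illustration.

```shell
# Work on a throwaway file so nothing real is touched.
f=$(mktemp)

chmod 644 "$f"
mode_644=$(ls -l "$f" | cut -c1-10)   # first 10 characters: file type plus the 9 permission bits

chmod u=rwx,go=rx "$f"                # symbolic spelling of 755
mode_755=$(ls -l "$f" | cut -c1-10)

echo "$mode_644 $mode_755"
rm -f "$f"
```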

Searching streams

One of the most useful tools available on UNIX and the one you may use the most is grep. This tool matches regular expressions (which include simple words) and prints matching lines to stdout.

The simplest incantation looks for a particular character sequence in a set of files. Here is an example that looks for any reference to System in the java files in the current directory.

grep System *.java

You may find the dot '.' regular expression useful. It matches any single character but is typically combined with the star, which matches zero or more of the preceding item. Be careful to enclose the expression in single quotes so the command-line expansion doesn't modify the argument. The following example looks for references to any forum page in a server log file:

$ grep '/forum/.*' /home/public/cs601/unix/access.log

or equivalently:

$ cat /home/public/cs601/unix/access.log | grep '/forum/.*'

The second form is useful when you want to process a collection of files as a single stream as in:

cat /home/public/cs601/unix/access*.log | grep '/forum/.*'

If you need to look for a string at the beginning of a line, use caret '^':

$ grep '^195.77.105.200' /home/public/cs601/unix/access*.log

This finds all lines in all access logs that begin with IP address 195.77.105.200.

If you would like to invert the pattern matching to find lines that do not match a pattern, use -v. Here is an example that finds references to non image GETs in a log file:

$ cat /home/public/cs601/unix/access.log | grep -v '/images'

Now imagine that you have an http log file and you would like to filter out page requests made by nonhuman spiders. If you have a file called spider.IPs, you can find all nonspider page views via:

$ cat /home/public/cs601/unix/access.log | grep -v -f /tmp/spider.IPs

Finally, to ignore the case of the input stream, use -i.
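The options above can be tried on a small made-up log. This sketch, whose sample lines are invented for illustration, also shows one subtlety: an unescaped dot in an IP address matches any character, so escaping it with a backslash is stricter.

```shell
# Four invented log lines written to a temporary file.
log=$(mktemp)
printf '%s\n' \
  '195.77.105.200 GET /forum/index.jsp' \
  '195.77.105.200 GET /images/logo.gif' \
  '10.0.0.7 GET /Forum/view.jsp' \
  '195x77y105z200 GET /faq/index.jsp' > "$log"

loose=$(grep -c '^195.77.105.200' "$log")        # '.' matches any character, so line 4 counts too
strict=$(grep -c '^195\.77\.105\.200' "$log")    # '\.' matches only a literal dot
non_image=$(grep -v -c '/images' "$log")         # lines NOT containing /images
any_case=$(grep -i -c '/forum/' "$log")          # matches /forum/ and /Forum/

echo "$loose $strict $non_image $any_case"
rm -f "$log"
```

Here -c prints the number of matching lines instead of the lines themselves, which keeps the demo output short.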

Translating streams

Morphing a text stream is a fundamental UNIX operation. PERL is a good tool for this, but since I don't like PERL I stick with three tools: tr, sed, and awk. PERL and these tools are line-by-line tools in that they operate well only on patterns fully contained within a single line. If you need to process more complicated patterns like XML or you need to parse a programming language, use a context-free grammar tool like ANTLR.

tr

For manipulating whitespace, you will find tr very useful.

If you have columns of data separated by spaces and you would like the columns to collapse so there is a single column of data, tell tr to replace space with newline: tr ' ' '\n'. Consider input file /home/public/cs601/unix/names:

jim scott mike
bill randy tom

To get all those names in a column, use

$ cat /home/public/cs601/unix/names | tr ' ' '\n'

If you would like to collapse all sequences of spaces into one single space, use tr -s ' '.

To convert a PC file to UNIX, you have to get rid of the '\r' characters. Use tr -d '\r'.
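The three tr uses above can be sketched on inline data, so no files are needed. The variable names are illustrative only.

```shell
names='jim scott mike
bill randy tom'

# Replace every space with a newline: one name per line.
column=$(printf '%s\n' "$names" | tr ' ' '\n')

# Squeeze runs of spaces down to a single space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

# Delete carriage returns from PC-style (CRLF) line endings.
unix_text=$(printf 'line one\r\nline two\r\n' | tr -d '\r')

echo "$column"
echo "$squeezed"
echo "$unix_text"
```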

sed

If dropping or translating single characters is not enough, you can use sed (stream editor) to replace or delete text chunks matched by regular expressions. For example, to delete all references to word scott in the names file from above, use

$ cat /home/public/cs601/unix/names | sed 's/scott//'

which substitutes nothing for scott. If there are multiple references to scott on a single line, use the g suffix to indicate "global" on that line; otherwise only the first occurrence will be removed:

$ ... | sed 's/scott//g'

If you would like to replace references to view.jsp with index.jsp, use

$ ... | sed 's/view\.jsp/index.jsp/'

The backslash makes the dot match a literal '.' rather than any character. If you want any .asp file converted to .jsp, you must match the file name with a regular expression and refer to it via \1:

$ ... | sed 's/\(.*\)\.asp/\1.jsp/'

The \(...\) grouping collects text that you can refer to with \1.

If you want to kill everything from the ',' character to end of line, use the end-of-line marker $:

$ ... | sed 's/,.*$//' # kill from comma to end of line
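Here is a small sketch of these substitutions on made-up input. It also compares an unescaped dot with an escaped one in the .asp pattern, since the unescaped form can rewrite text that merely contains "asp".

```shell
# Rename an .asp reference to .jsp; \1 is whatever \(.*\) captured.
renamed=$(printf 'GET /faq/view.asp\n' | sed 's/\(.*\)\.asp/\1.jsp/')

# With an unescaped dot, "rasp" in "grasp" satisfies ".asp" and gets rewritten.
loose=$(printf 'grasp.txt\n' | sed 's/\(.*\).asp/\1.jsp/')
# With the dot escaped, nothing matches and the line passes through unchanged.
strict=$(printf 'grasp.txt\n' | sed 's/\(.*\)\.asp/\1.jsp/')

# Kill everything from the first comma to end of line.
trimmed=$(printf 'Chen, Wei\n' | sed 's/,.*$//')

echo "$renamed | $loose | $strict | $trimmed"
```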

awk

When you need to work with columns of data or execute a little bit of code for each line matching a pattern, use awk. awk programs are pattern-action pairs. While some awk programs are complicated enough to require a separate file containing the program, you can do some amazing things using an argument on the command-line.

awk thinks input lines are broken up into fields (i.e., columns) separated by whitespace. Fields are referenced in an action via $1, $2, ... while $0 refers to the entire input line.

A pattern-action pair looks like:

pattern {action}

If you omit the pattern, the action is executed for each input line. Omitting the action means print the line. You can separate the pairs by newline or semicolon.

Consider input

aasghar Asghar, Ali
wchen Chen, Wei
zchen Chen, Zhen-Jian

If you want a list of login names, ask awk to print the first column:

$ cat /home/public/cs601/unix/emails.txt | awk '{print $1;}'

If you want to convert the login names to email addresses, use the printf C-lookalike function:

$ cat /home/public/cs601/unix/emails.txt | awk '{printf("%s@cs.usfca.edu,",$1);}'

Because of the missing \n in the printf string, you'll see the output all on one line ready for pasting into a mail program (note the trailing comma after the last address):

aasghar@cs.usfca.edu,wchen@cs.usfca.edu,zchen@cs.usfca.edu,

You might also want to reorder columns of data. To print firstname, lastname, you might try:

$ cat /home/public/cs601/unix/emails.txt | awk '{printf("%s %s\n", $3, $2);}'

but you'll notice that the comma is still there as it is part of the column:

Ali Asghar,
Wei Chen,
Zhen-Jian Chen,

You need to pipe the output thru tr (or sed) to strip the comma:

$ cat /home/public/cs601/unix/emails.txt | \
awk '{printf("%s %s\n", $3, $2);}' | \
tr -d ','

Then you will see:

Ali Asghar
Wei Chen
Zhen-Jian Chen

You can also use awk to examine the value of content. To sum up the first column of the following data (in file /home/public/cs601/unix/coffee):

3 parrt
2 jcoker
8 tombu

use the following simple command:

$ awk '{n+=$1;} ; END {print n;}' < /home/public/cs601/unix/coffee

where END is a special pattern that means "after processing the stream."

If you want to filter or sum all values less than or equal to, say 3, use an if statement:

$ awk '{if ($1<=3) n+=$1;} END {print n;}' < /home/public/cs601/unix/coffee

In this case, you will see output 5 (3+2).
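Since awk programs are pattern-action pairs, the condition can also be written as the pattern instead of an if statement. This sketch runs both sums on the coffee data, supplied inline rather than from the course file.

```shell
coffee='3 parrt
2 jcoker
8 tombu'

# No pattern: the action runs for every line.
total=$(printf '%s\n' "$coffee" | awk '{n += $1} END {print n}')

# Pattern $1 <= 3: the action runs only for lines whose first column is at most 3.
small=$(printf '%s\n' "$coffee" | awk '$1 <= 3 {n += $1} END {print n}')

echo "$total $small"
```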

Using awk to grab a particular column is very common when processing log files. Consider a http://www.jguru.com page view log file, /home/public/cs601/unix/pageview-20021022.log, whose lines are of the form:

date-stamp(thread-name): userID-or-IPaddr URL site-section

So, the data looks like this:

20021022_00.00.04(tcpConnection-80-3019): 203.6.152.30 /faq/subtopic.jsp?topicID=472&page=2 FAQs
20021022_00.00.07(tcpConnection-80-2981): 995134 /index.jsp Home
20021022_00.00.08(tcpConnection-80-2901): 66.67.34.44 /faq/subtopic.jsp?topicID=364 FAQs
20021022_00.00.12(tcpConnection-80-3003): 217.65.96.13 /faq/view.jsp?EID=736437 FAQs
20021022_00.00.13(tcpConnection-80-3019): 203.124.210.98 /faq/topicindex.jsp?topic=JSP FAQs/JSP
20021022_00.00.15(tcpConnection-80-2988): 202.56.231.154 /faq/index.jsp FAQs
20021022_00.00.19(tcpConnection-80-2976): 66.67.34.44 /faq/view.jsp?EID=225150 FAQs
20021022_00.00.21(tcpConnection-80-2974): 143.89.192.5 /forums/most_active.jsp?topic=EJB Forums/EJB
20021022_00.00.21(tcpConnection-80-2996): 193.108.239.34 /guru/edit_account.jsp Guru
20021022_00.00.21(tcpConnection-80-2996): 193.108.239.34 /misc/login.jsp Misc
...

When a user is logged in, the log file has their user ID rather than their IP address.

Here is how you get a list of URLs that people view on say October 22, 2002:

$ awk '{print $3;}' < /home/public/cs601/unix/pageview-20021022.log
/faq/subtopic.jsp?topicID=472&page=2
/index.jsp
/faq/subtopic.jsp?topicID=364
/faq/view.jsp?EID=736437
/faq/topicindex.jsp?topic=JSP
/faq/index.jsp
/faq/view.jsp?EID=225150
/forums/most_active.jsp?topic=EJB
/guru/edit_account.jsp
/misc/login.jsp
...

If you want to count how many page views there were that day that were not processing pages (my processing pages are all of the form process_xxx), pipe the results through grep and wc:

$ awk '{print $3;}' < /home/public/cs601/unix/pageview-20021022.log | \
grep -v process | \
wc -l
67850

If you want a unique list of URLs, you can sort the output and then use uniq:

$ awk '{print $3;}' < /home/public/cs601/unix/pageview-20021022.log | \
sort | \
uniq

uniq just collapses all repeated lines into a single line--that is why you must sort the output first. You'll get output like:

/article/index.jsp
/article/index.jsp?page=1
/article/index.jsp?page=10
/article/index.jsp?page=2
...
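To see why the sort matters, this sketch counts the lines uniq leaves behind with and without sorting first, on three made-up URLs where the duplicates are not adjacent. It also tries sort -u, which combines the two steps.

```shell
urls='/index.jsp
/faq/index.jsp
/index.jsp'

# uniq only merges ADJACENT duplicates, so nothing collapses here.
unsorted=$(printf '%s\n' "$urls" | uniq | wc -l | tr -d ' ')

# After sorting, the two /index.jsp lines sit next to each other and collapse.
sorted=$(printf '%s\n' "$urls" | sort | uniq | wc -l | tr -d ' ')

# sort -u sorts and removes duplicates in one step.
via_u=$(printf '%s\n' "$urls" | sort -u | wc -l | tr -d ' ')

echo "$unsorted $sorted $via_u"
```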

Tarballs

Note: The name comes from a similar word, hairball (stuff that cats throw up), I'm pretty sure.

To collect a bunch of files and directories together, use tar. For example, to tar up your entire home directory and put the tarball into /tmp, do this

$ cd ~parrt
$ cd .. # go one dir above dir you want to tar
$ tar cvf /tmp/parrt.backup.tar parrt

By convention, use .tar as the extension. To untar this file use

$ cd /tmp
$ tar xvf parrt.backup.tar

tar untars things in the current directory!

After running the untar, you will find a new directory, /tmp/parrt, that is a copy of your home directory. Note that the way you tar things up dictates the directory structure when untarred. The fact that I mentioned parrt in the tar creation means that I'll have that dir when untarred. In contrast, the following will also make a copy of my home directory, but without having a parrt root dir:

$ cd ~parrt
$ tar cvf /tmp/parrt.backup.tar *

It is a good idea to tar things up with a root directory so that when you untar you don't generate a million files in the current directory. To see what's in a tarball, use

$ tar tvf /tmp/parrt.backup.tar
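The difference between the two ways of tarring can be seen by listing the entries of each tarball. This sketch builds a tiny directory named project (a made-up name) under a temporary directory and tars it both ways.

```shell
work=$(mktemp -d)
mkdir "$work/project"
echo one > "$work/project/a.txt"
echo two > "$work/project/b.txt"

# Tar from the parent directory: every entry is prefixed with project/.
(cd "$work" && tar cf with_root.tar project)
# Tar from inside the directory: entries have no common root directory.
(cd "$work/project" && tar cf ../no_root.tar *)

with_root=$(tar tf "$work/with_root.tar" | sort)
no_root=$(tar tf "$work/no_root.tar" | sort)

echo "$with_root"
echo "$no_root"
rm -rf "$work"
```

Untarring the second tarball would drop a.txt and b.txt straight into whatever directory you happen to be in.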

Most of the time you can save space by using the z argument. The tarball will then be gzip'd and you should use file extension .tar.gz:

$ cd ~parrt
$ cd .. # go one dir above dir you want to tar
$ tar cvfz /tmp/parrt.backup.tar.gz parrt

Unzipping requires the z argument also:

$ cd /tmp
$ tar xvfz parrt.backup.tar.gz

If you have a big file to compress, use gzip:

$ gzip bigfile

After execution, your file will have been renamed bigfile.gz. To uncompress, use

$ gzip -d bigfile.gz

To display a text file that is currently gzip'd, use zcat:

$ zcat bigfile.gz

Moving files between machines

rsync

When you need to have a directory on one machine mirrored on another machine, use rsync. It compares all the files in a directory subtree and copies over any that have changed to the mirrored directory on the other machine. For example, here is how you could "pull" all log files from livebox.jguru.com to the box from which you execute the rsync command:

$ hostname
jazz.jguru.com
$ rsync -rabz -e ssh -v 'parrt@livebox.jguru.com:/var/log/jguru/*' \
/backup/web/logs

rsync will delete or truncate files to ensure the files stay the same. This is bad if you erase a file by mistake--it will wipe out your backup file. Add an argument called --suffix to tell rsync to make a copy of any existing file before it overwrites it:

$ hostname
jazz.jguru.com
$ rsync -rabz -e ssh -v --suffix .rsync_`date '+%Y%m%d'` \
'parrt@livebox.jguru.com:/var/log/jguru/*' /backup/web/logs

where `date '+%Y%m%d'` (in backquotes) means "execute this date command and substitute its output here".
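As a small illustration of that substitution, this sketch builds the same kind of dated suffix on its own, using the $(...) form, which behaves like backquotes but nests more easily. The variable names are made up.

```shell
# $(...) runs the command and substitutes its output, like backquotes do.
stamp=$(date '+%Y%m%d')      # year, month, day as eight digits
suffix=".rsync_$stamp"
echo "$suffix"
```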

To exclude certain patterns from the sync, use --exclude:

$ rsync -rabz --exclude=entitymanager/ --suffix .rsync_`date '+%Y%m%d'` \
-e ssh -v 'parrt@livebox.jguru.com:/var/log/jguru/*' /backup/web/logs

scp

To copy a file or directory manually, use scp:

$ scp lecture.html parrt@nexus.cs.usfca.edu:~parrt/lectures

Just like cp, use -r to copy a directory recursively.

Miscellaneous

find

Most GUIs for Linux or PCs have a search facility, but from the command-line you can use find. To find all files named .p4 starting in directory ~/antlr/depot/projects, use:

$ find  ~/antlr/depot/projects -name '.p4'

The default "action" is to -print.

You can specify a shell-style wildcard pattern (not a full regular expression) for -name to match. For example, to look under your home directory for any xml files, use:

$ find ~ -name '*.xml' -print

Note the use of the single quotes to prevent command-line expansion--you want the '*' to go to the find command.

You can execute a command for every file or directory found that matches a name. For example, to delete all xml files, do this:

$ find ~ -name '*.xml' -exec rm {} \;

where "{}" stands for "current file that matches". The -exec command must be terminated with ';', but because the shell treats ';' as its own command separator, you'll need to escape it.
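To try -name and -exec safely, this sketch works entirely inside a temporary directory with a few empty, made-up files and counts the matches before and after the delete.

```shell
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/a.xml" "$dir/sub/b.xml" "$dir/keep.txt"

# The quotes stop the shell from expanding *.xml before find sees it.
before=$(find "$dir" -name '*.xml' | wc -l | tr -d ' ')

# Run rm once per match; \; ends the -exec command.
find "$dir" -name '*.xml' -exec rm {} \;

after=$(find "$dir" -name '*.xml' | wc -l | tr -d ' ')
kept=$(find "$dir" -type f | wc -l | tr -d ' ')   # only keep.txt should remain

echo "$before $after $kept"
rm -rf "$dir"
```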

You can also specify time information in your query. Here is a shell script that uses find to delete all backup directories whose status last changed more than 14 days ago.

#!/bin/sh

BACKUP_DIR=/var/data/backup

# number of days to keep backups
AGE=14 # days
AGE_MINS=$(( AGE * 60 * 24 ))

# delete dirs older than AGE_MINS minutes (-type d restricts the match to directories)
find "$BACKUP_DIR"/* -cmin +"$AGE_MINS" -type d -exec rm -rf {} \;
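The script above tests the status-change time with -cmin, which is hard to fake for a demo. As a sketch of the same idea, this example uses -mtime (modification time, in days) instead, on the assumption that modification time is an acceptable measure of age. It back-dates one made-up file with touch -t and lists, rather than deletes, what is older than 14 days.

```shell
dir=$(mktemp -d)
touch "$dir/new.log"
touch -t 200001010000 "$dir/old.log"    # back-date modification time to 1 Jan 2000

# -mtime +14 selects files last modified more than 14 days ago.
stale=$(find "$dir" -type f -mtime +14)
stale_name=$(basename "$stale")

echo "$stale_name"
rm -rf "$dir"
```

Listing first and adding -exec rm only once the list looks right is a cautious way to build up a cleanup command.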

fuser

If you want to know who is using a port such as HTTP (80), use fuser. You must be root to use this:

$ sudo /sbin/fuser -n tcp 80
80/tcp: 13476 13477 13478 13479 13480
13481 13482 13483 13484 13486 13487 13489 13490 13491
13492 13493 13495 13496 13497 13498 13499 13500 13501 13608

The output indicates the list of processes associated with that port.

whereis

Sometimes you want to use a command but it's not in your PATH and you can't remember where it is. Use whereis to look in standard unix locations for the command.

$ whereis fuser
fuser: /sbin/fuser /usr/man/man1/fuser.1 /usr/man/man1/fuser.1.gz
$ whereis ls
ls: /bin/ls /usr/man/man1/ls.1 /usr/man/man1/ls.1.gz

whereis also shows man pages.

which

Sometimes you might be executing the wrong version of a command and you want to know which version of the command your PATH indicates should be run. Use which to ask:

$ which ls
alias ls='ls --color=tty'
/bin/ls
$ which java
/usr/local/java/bin/java

If nothing is found in your path, you'll see:

$ which fuser
/usr/bin/which: no fuser in (/usr/local/bin:/usr/local/java/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/X11R6/bin:/home/parrt/bin)

kill

To send a signal to a process, use kill. Typically you'll want to just say kill pid where pid can be found from ps or top (see above).

Use kill -9 pid when you can't get the process to die; this means kill it with "extreme prejudice".

traceroute

If you are having trouble getting to a site, use traceroute to watch the sequence of hops used to get to a site:

$ /usr/sbin/traceroute www.cnn.com
1 65.219.20.145 (65.219.20.145) 2.348 ms 1.87 ms 1.814 ms
2 loopback0.gw5.sfo4.alter.net (137.39.11.23) 3.667 ms 3.741 ms 3.695 ms
3 160.atm3-0.xr1.sfo4.alter.net (152.63.51.190) 3.855 ms 3.825 ms 3.993 ms
...

What is my IP address?

$ /sbin/ifconfig

Under the eth0 interface, you'll see the inet addr:

eth0      Link encap:Ethernet  HWaddr 00:10:DC:58:B1:F0 
inet addr:138.202.170.4 Bcast:138.202.170.255 Mask:255.255.255.0
...

Useful combinations

How to kill a set of processes

If you want to kill all java processes running for parrt, you can either run killall java if you are parrt or generate a "kill" script via:

$ ps auxwww|grep java|grep parrt|awk '{print "kill -9 ",$2;}' > /tmp/killparrt
$ bash /tmp/killparrt # run resulting script

The /tmp/killparrt file would look something like:

kill -9 1021
kill -9 1023
kill -9 1024

Note: you can also do this common task with:

$ killall java

Please be aware that this is Linux-specific; I'm told that killall will kill all processes on UNIXen like Solaris!
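One wrinkle with the grep pipeline is that the grep java process itself can show up in the ps listing. A possible variation, sketched here on canned ps-like text rather than live output, lets awk do all the filtering: match the user in column 1, require java on the line, and skip the grep line. The sample rows are invented.

```shell
# Invented ps-like rows: user, pid, cpu, mem, command.
ps_sample='parrt 1021 0.0 1.2 java -jar server.jar
root 1022 0.0 0.1 java -jar other.jar
parrt 1023 0.0 1.0 java Main
parrt 1030 0.0 0.0 grep java'

# Only parrt's java processes, excluding the grep that is searching for them.
script=$(printf '%s\n' "$ps_sample" | \
  awk '$1 == "parrt" && /java/ && !/grep/ {print "kill -9", $2}')

echo "$script"
```

Against a real system you would feed ps auxwww into the same awk command and inspect the generated script before running it.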

How to make a histogram

A histogram is a set of count, value pairs indicating how often the value occurs. The basic operation will be to sort, then count how many values occur in a row and then reverse sort so that the value with the highest count is at the top of the report.

$ ... | sort |uniq -c|sort -r -n

Note that sort sorts on the whole line, but the first column is obviously significant just as the first letter in someone's last name significantly positions their name in a sorted list.

uniq -c collapses all repeated sequences of values but prints the number of occurrences in front of the value. Recall the previous sorting:

$ awk '{print $3;}' < /home/public/cs601/unix/pageview-20021022.log | \
sort | \
uniq
/article/index.jsp
/article/index.jsp?page=1
/article/index.jsp?page=10
/article/index.jsp?page=2
...

Now add -c to uniq:

$ awk '{print $3;}' < /home/public/cs601/unix/pageview-20021022.log | \
sort | \
uniq -c
623 /article/index.jsp
6 /article/index.jsp?page=1
10 /article/index.jsp?page=10
109 /article/index.jsp?page=2
...

Now all you have to do is reverse sort the lines according to the first column numerically.

$ awk '{print $3;}' < /home/public/cs601/unix/pageview-20021022.log | \
sort | \
uniq -c | \
sort -r -n
6170 /index.jsp
2916 /search/results.jsp
1397 /faq/index.jsp
1018 /forums/index.jsp
884 /faq/home.jsp?topic=Tomcat
...

In practice, you might want to get a histogram that has been "despidered" and only has faq related views. You can filter out all page view lines associated with spider IPs and filter in only faq lines:

$ grep -v -f /tmp/spider.IPs /home/public/cs601/unix/pageview-20021022.log | \
awk '{print $3;}'| \
grep '/faq' | \
sort | \
uniq -c | \
sort -r -n
1397 /faq/index.jsp
884 /faq/home.jsp?topic=Tomcat
525 /faq/home.jsp?topic=Struts
501 /faq/home.jsp?topic=JSP
423 /faq/home.jsp?topic=EJB
...

If you want to only see despidered faq pages that were referenced more than 500 times, add an awk command to the end.

$ grep -v -f /tmp/spider.IPs /home/public/cs601/unix/pageview-20021022.log | \
awk '{print $3;}'| \
grep '/faq' | \
sort | \
uniq -c | \
sort -r -n | \
awk '{if ($1>500) print $0;}'
1397 /faq/index.jsp
884 /faq/home.jsp?topic=Tomcat
525 /faq/home.jsp?topic=Struts
501 /faq/home.jsp?topic=JSP
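The whole histogram idiom can be tried on a handful of made-up URLs without the course log file. The final awk in this sketch only normalizes the leading spaces that uniq -c puts in front of each count.

```shell
urls='/index.jsp
/faq/index.jsp
/index.jsp
/forums/index.jsp
/index.jsp
/faq/index.jsp'

# sort groups equal lines, uniq -c counts each group, sort -r -n puts the biggest count first.
histogram=$(printf '%s\n' "$urls" | sort | uniq -c | sort -r -n | awk '{print $1, $2}')

echo "$histogram"
```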

Generating Java class hierarchy diagrams

A student asked if I knew of a program that generated class hierarchy diagrams. I said "no", but then realized we don't need one. Here's the one liner to do it:

# pulls out superclass and class as $5 and $3:
# public class A extends B ...
# only works for public classes and usual formatting
cat *.java | grep 'public class' | \
awk 'BEGIN {print "digraph foo {";} {print $5 "->" $3;} END {print "}"}'

It generates DOT format graph files. Try it. It's amazing. Works for most cases. Output looks like:

digraph foo {
antlr.CharScanner->JavaLexer
antlr.LLkParser->Mantra
->TestLexer
}
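Notice the "->TestLexer" line in that output: a class with no extends clause leaves $5 empty. One possible refinement, sketched here on three made-up class declarations instead of real .java files, is to print an edge only when the fourth word is extends. It also wraps the names in double quotes, which I believe DOT needs for names containing dots.

```shell
src='public class JavaLexer extends antlr.CharScanner {
public class Mantra extends antlr.LLkParser {
public class TestLexer {'

# Emit an edge only for lines of the form: public class A extends B ...
dot=$(printf '%s\n' "$src" | grep 'public class' | \
  awk 'BEGIN {print "digraph foo {"}
       $4 == "extends" {print "\"" $5 "\" -> \"" $3 "\""}
       END {print "}"}')

echo "$dot"
```

Like the original one-liner, this only handles public classes with the usual formatting.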

Generating scripts and programs

I like to automate as much as possible. Sometimes that means writing a program that generates another program or script.

Processing mail files

I wanted to get a sequence of SQL commands that would update our database whenever someone's email bounced. Processing the mail file is pretty easy since you can look for the error code followed by the email address. A bounced email looks like:

From MAILER-DAEMON@localhost.localdomain  Wed Jan  9 17:32:33 2002
Return-Path: <>
Received: from web.jguru.com (web.jguru.com [64.49.216.133])
by localhost.localdomain (8.9.3/8.9.3) with ESMTP id RAA18767
for ; Wed, 9 Jan 2002 17:32:32 -0800
Received: from localhost (localhost)
by web.jguru.com (8.11.6/8.11.6) id g0A1W2o02285;
Wed, 9 Jan 2002 17:32:02 -0800
Date: Wed, 9 Jan 2002 17:32:02 -0800
From: Mail Delivery Subsystem
Message-Id: <200201100132.g0a1w2o02285@web.jguru.com>
To:
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
boundary="g0A1W2o02285.1010626322/web.jguru.com"
Subject: Returned mail: see transcript for details
Auto-Submitted: auto-generated (failure)

This is a MIME-encapsulated message

--g0A1W2o02285.1010626322/web.jguru.com

The original message was received at Wed, 9 Jan 2002 17:32:02 -0800
from localhost [127.0.0.1]

----- The following addresses had permanent fatal errors -----

(reason: 550 Host unknown)

----- Transcript of session follows -----
550 5.1.2 ... Host unknown (Name server: intheneck.com: host not found)
...

Notice the SMTP 550 error message. Look for that at the start of a line then kill the angle brackets, remove the ... and use awk to print out the SQL:

# This script works on one email or a file full of other emails
# since it just looks for the SMTP 550 or 554 results and then
# converts them to SQL commands.
grep -E '^(550|554)' | \
sed 's/[<>]//g' | \
sed 's/\.\.\.//' | \
awk "{printf(\"UPDATE PERSON SET bounce=1 WHERE email='%s';\n\",\$3);}" >> bounces.sql

I have to escape the $3 because it means something to the surrounding bash shell script and I want awk to see the dollar sign.
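An alternative that avoids the escaping, sketched below, is to put the awk program in single quotes and write each single quote in the SQL as the octal escape \047. The bounce line and the address someone@example.com are made up for the demo, since the sample message above does not show one.

```shell
# A made-up SMTP error line with a hypothetical address.
line='550 5.1.2 <someone@example.com>... Host unknown'

# Inside single quotes the shell leaves $3 alone; \047 is awk's octal escape for a single quote.
sql=$(printf '%s\n' "$line" | \
  grep -E '^(550|554)' | \
  sed 's/[<>]//g' | \
  sed 's/\.\.\.//' | \
  awk '{printf("UPDATE PERSON SET bounce=1 WHERE email=\047%s\047;\n", $3)}')

echo "$sql"
```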

Generating getter/setters

#!/bin/bash
# From a type and name (plus firstlettercap version),
# generate a Java getter and setter
#
# Example: getter.setter String name Name
#

TYPE=$1
NAME=$2
UPPER_NAME=$3

echo "public $TYPE get$UPPER_NAME() {"
echo " return $NAME;"
echo "}"
echo
echo "public void set$UPPER_NAME($TYPE $NAME) {"
echo " this.$NAME = $NAME;"
echo "}"
echo

Have I been hacked?

Failed logins: /var/log/messages

last, w, uptime

/etc/passwd changed?

fuser for ports

portscans in server report

weird processes hogging CPU?

Sunday, 16 August 2009

Apache 2 with SSL/TLS: Part 3

Introducing part three

This article concludes our three part series dedicated to configuring Apache 2.0 with SSL/TLS support -- for maximum security and optimal performance of SSL based e-commerce transactions.

Part one introduced key aspects of SSL/TLS and then showed how to compile, install and configure Apache 2.0. The second part discussed the configuration of mod_ssl and authentication issues, and then showed how to create a web server's SSL certificate.

Now, in the third and final article, we will take a look at client authentication using client certificates, show how to chroot a secure Apache, discuss common attack vectors, and then describe some typical configuration mistakes made by administrators that will decrease the security level of SSL communications.

Client authentication

One of the most popular methods for authenticating users in web applications is a password, passphrase or PIN, in other words, "something you know." The greatest advantage of such a method is its simplicity. For an administrator, it is enough to add a few directives to httpd.conf and create a passwd file to implement such a schema.

Unfortunately, because of their simplicity, passwords are vulnerable to a number of attacks. They can be guessed, sniffed over the wire, brute-forced, stolen (such as when a user writes them down on sticky notes) or coaxed out (through social engineering or some kind of "phishing" method). This is why standard password authentication is considered to be weaker than using one-time passwords, hardware tokens or other forms of authentication.

Few people realize that when using an SSL web server there is a stronger method of authenticating users: client SSL certificates, or "personal" certificates for each user. In this method, we can authenticate web users based on "something you have," using a certificate and a private key corresponding to the client certificate, as well as "something you know," which would be a passphrase to the private key. Thus, using certificates is more secure than using standard password solutions, mainly because an intruder would need to get both pieces of authentication -- the private key that corresponds to the user's certificate, as well as the passphrase -- to gain access. Moreover, unlike a standard password, the certificate's passphrase is not sent over the network at all; it is used only locally to decrypt the private key.

As will be shown, the implementation of this method of authentication is not complicated and it can be performed in a few steps, which administrators will find to be almost as easy as the more popular Basic Authentication password method.

Configuring Apache to use client certificates

In order to configure Apache to support client authentication via X.509v3 certificates, we need to perform four actions:

  1. Enable client authentication in the Apache web server

    To enable the use of client certificates, we need to add the following directives to httpd.conf:


    SSLVerifyClient require
    SSLVerifyDepth 1

    Thanks to the SSLVerifyClient directive, access to the web server will now be limited to web browsers that present a valid certificate, one which is signed by our local CA. Note that the process of creating a local CA was described in the previous article. The "SSLVerifyDepth" value specifies the maximum depth of the intermediate certificate issuers in the chain of certificates. In our case, we will set this value to "1," because all client certificates must be signed by our local CA -- we are not using intermediate CAs.

  2. Install the local CA's certificate into the Apache directory structure.


    install -m 644 -o root -g sys ca.crt /usr/local/apache2/conf/ssl.crt/ 
  3. Set the SSLCACertificateFile directive (in httpd.conf) to point to the CA certificate we just installed.


    SSLCACertificateFile /usr/local/apache2/conf/ssl.crt/ca.crt
  4. Now restart Apache.


    /usr/local/apache2/bin/apachectl stop
    /usr/local/apache2/bin/apachectl startssl

From now on, access to the web server via SSL will be granted only to web browsers that present a valid client certificate, signed by our local CA. To test it, we can try to access the URL of the website. After establishing an SSL connection, MS Internet Explorer will ask us to choose the client certificate we want to use, as shown below in Figure 1.


Figure 1. Internet Explorer asking for a client certificate.

Since we do not yet have any client certificate installed, access to the web server will simply be denied.

Creating a client certificate

In general, creating an individual client certificate is very similar to creating a web server certificate. The only difference is that we will use different X.509v3 extensions (the "ssl_client" section from openssl.cnf) and we will store both the private key and certificate in PKCS#12 format (also referred to as PFX).

For the sake of simplicity, please follow the steps below using OpenSSL to produce the client certificate. Note that any actions performed by users (numbers 1, 2, 7, 8 in the steps below) should be as automated and simplified as possible, to minimize user interaction and user error. To that end, additional technology such as Java Applets could be used. Alternatively, a dedicated host can be used for creating client certificates. In this latter case, users would need to visit the server in person and enter their passphrase on the dedicated host to encrypt their own private key. Although this option might seem a bit inconvenient, it is the most secure method, as a user's identity can be verified and both the certificate and the private key can be passed to the user without sending them over the network.

The steps to create and install a client certificate are exactly as follows:

  1. Create a private/public key pair for the user, together with a certificate request. If a dedicated host is not being used to serve the certificate, this should be executed on the user's host:


    openssl req \
    -new \
    -sha1 \
    -newkey rsa:1024 \
    -nodes \
    -keyout client.key \
    -out request.pem \
    -subj '/O=Seccure/OU=Seccure Labs/CN=Frodo Baggins'
  2. The user sends the certificate request (request.pem) to the local CA, to be signed.
  3. The local CA's task is to verify that the information from the client certificate request is indeed valid and correct.
  4. After verifying, the certificate request (request.pem) should be copied into the $SSLDIR/requests directory on the local CA host using removable media, such as a USB drive.
  5. The local CA should sign the certificate request as follows. This should be executed on the CA's host.


    openssl ca \
    -config $SSLDIR/openssl.cnf \
    -policy policy_anything \
    -extensions ssl_client \
    -out $SSLDIR/requests/signed.pem \
    -infiles $SSLDIR/requests/request.pem
  6. The local CA should send the certificate (signed.pem) to the user.
  7. After receiving the signed certificate, users need to store their private key together with their certificate into PKCS#12 format.


    openssl pkcs12 \
    -export \
    -clcerts \
    -in signed.pem \
    -inkey client.key \
    -out client.p12

    The newly created client.p12 file should be protected with a hard-to-guess passphrase. All other files (including the unencrypted private key, the signed certificate and the certificate request) should be securely erased from the user's disk space using a wipe utility.


    wipe client.key signed.pem request.pem
  8. The client certificate, together with the private key, should be installed in the user's web browser. An example of this for Microsoft Internet Explorer is shown below in Figure 2.


    Figure 2. Installing a certificate in Internet Explorer.

    To protect the private key against accidental or unauthorized use, the option "Enable strong private key protection" should be checked. Also, to protect the certificate from being stolen, it should not be possible to export it -- the option "Mark this key as exportable" should be disabled. Both of these browser configuration options are shown below in Figure 3.


    Figure 3. Protecting the client certificate in Internet Explorer.

    In addition, the security level of the browser should be changed to "High." We see this during the next step of the Import Wizard, as illustrated below in Figure 4. Thanks to this option, a user will be asked to enter his password every time the web browser wants to use that client certificate.


    Figure 4. Security level should be set to "High" in IE.

That's all there is to it. The certificate can now be found under the "Personal" tab in the certificate view (in MS Internet Explorer's menu -> tab "Content" -> "Certificates"). If we double click the certificate we should see some properties similar to Figure 5, below.


Figure 5. Client certificate details in IE.

Using the client certificate

At this point we should try to access the URL of the website again. If the above steps have been completed successfully, we should be able to see and choose the installed certificate from the list when the site requests it, as shown in Figure 6.


Figure 6. Choosing the client certificate when prompted.

After choosing the certificate, the user must enter the required passphrase that decrypts the corresponding private key, as shown in Figure 7.


Figure 7. Entering the passphrase for the certificate.

Now we will have access to the secure website, illustrated in Figure 8.


Figure 8. Secure access, using certificates, has been granted.

Customizing access control

With additional server-side directives, we can control which parts of the website particular users or groups of users can access. For example, when an organization must deal securely with many different companies, we can restrict access to the website to just one particular company (O=Seccure), by adding the following directives to httpd.conf:



SSLRequire %{SSL_CLIENT_S_DN_O} eq "Seccure"

Another example shows how to allow access only to a certain department (OU="Seccure Labs") within the company (O="Seccure"):


   SSLRequire %{SSL_CLIENT_S_DN_O}  eq "Seccure" and \
              %{SSL_CLIENT_S_DN_OU} eq "Seccure Labs"

Or alternatively, we can provide access to just a few departments (OU="Seccure Labs" or OU="Development") within the same company (O="Seccure"):


   SSLRequire %{SSL_CLIENT_S_DN_O}  eq "Seccure" and \
              %{SSL_CLIENT_S_DN_OU} in {"Seccure Labs", "Development"}

Finally, we can even provide access to just one specific user (CN="Frodo Baggins") from a specific company (O="Seccure"):


   SSLRequire %{SSL_CLIENT_S_DN_O}  eq "Seccure" and \
              %{SSL_CLIENT_S_DN_CN} in {"Frodo Baggins"}

Note that we can also provide the above environment variables to CGI scripts (including PHP and others), by adding the "+StdEnvVars" parameter to the SSLOptions directive. This feature allows us to use DN names inside web applications (PHP and others), to provide more detailed authorization and access control.


SSLOptions +StdEnvVars 
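
As a rough illustration of how a web application might use these variables, the sketch below shows a small helper for a CGI script written in sh. The function name authorize and the list of allowed departments are assumptions made for this example; only the SSL_CLIENT_S_DN_* variable names come from mod_ssl.

```shell
#!/bin/sh
# Hypothetical sketch: decide access from the DN fields that mod_ssl exports
# to CGI scripts when "SSLOptions +StdEnvVars" is set.
# authorize() is an illustrative name, not part of Apache or mod_ssl.
authorize() {
    # Only certificates issued to our organization are considered at all.
    if [ "${SSL_CLIENT_S_DN_O:-}" != "Seccure" ]; then
        echo "deny"
        return 0
    fi
    # Within the organization, allow two departments (assumed policy).
    case "${SSL_CLIENT_S_DN_OU:-}" in
        "Seccure Labs"|"Development") echo "allow" ;;
        *) echo "deny" ;;
    esac
}
```

A CGI script could call authorize before producing any output and return an error page when the result is "deny". This complements the SSLRequire rules above rather than replacing them.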

Revoking a client certificate

If a client certificate becomes compromised or lost, what do you do? In this case we need to revoke the certificate, as was already described in the previous article. Then, we must copy the CRL file into the Apache directory, as follows:


install -m 644 -o root -g sys crl.pem /usr/local/apache2/conf/ssl.crl/ 

We also need to make sure that the "SSLCARevocationFile" in httpd.conf points to the above file:


SSLCARevocationFile /usr/local/apache2/conf/ssl.crl/crl.pem 

Then, we need to restart Apache for the changes to take effect. Revoked certificates will no longer be allowed access to the website, as shown below in Figure 9.


Figure 9. Access attempted with a revoked certificate.
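
Before restarting Apache, it can be useful to confirm that the CRL really lists the certificate in question. The following is a minimal sketch, assuming the serial number is written the way OpenSSL prints it in the CRL (for example 01); the helper name is_revoked is made up for this example.

```shell
# Hypothetical helper: succeed if the given serial number (as printed by
# OpenSSL, e.g. 01) appears among the revoked entries of the CRL file.
is_revoked() {
    serial=$1
    crl=$2
    # "openssl crl -text" prints one "Serial Number: ..." line per entry.
    openssl crl -in "$crl" -noout -text |
        grep -q "Serial Number: ${serial}\$"
}
# Example (file name taken from the text above):
#   is_revoked 01 /usr/local/apache2/conf/ssl.crl/crl.pem && echo "revoked"
```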

Chrooting the server

To improve our web server's security and make Apache less vulnerable to buffer overflow attacks, it is recommended that one run Apache in a chrooted environment. Chrooting the web server isolates the process to a new root directory so that it cannot see the rest of the server's files.

The process of chrooting Apache has already been described in "Securing Apache 2.0: Step-by-Step," so readers are encouraged to follow the steps shown there. With the added support for SSL/TLS, we will need to install some additional libraries and create a few new subdirectories. In the case of FreeBSD 5.1, the list of these libraries and directories is as follows:


cp /usr/lib/libssl.so.3 /chroot/httpd/usr/lib/
cp /usr/lib/libcrypto.so.3 /chroot/httpd/usr/lib/
cp -R /usr/local/apache2/conf/ssl.key /chroot/httpd/usr/local/apache2/conf/
cp -R /usr/local/apache2/conf/ssl.crt /chroot/httpd/usr/local/apache2/conf/
cp -R /usr/local/apache2/conf/ssl.crl /chroot/httpd/usr/local/apache2/conf/

We also need to add the urandom device, as follows. Once again, the example below is taken from FreeBSD 5.1:


ls -al /dev/*random
crw-rw-rw- 1 root wheel 2, 3 Jan 4 12:10 /dev/random
lrwxr-xr-x 1 root wheel 7 Jan 4 12:10 /dev/urandom -> random
cd /chroot/httpd/dev
mknod ./random c 2 3
ln -s ./random ./urandom
chown root:sys ./random
chmod 666 ./random

In the case of other operating systems, readers can create the list of required files by using commands like truss, strace, ktrace etc., as was described in detail in the section "Chrooting the server," from the SecurityFocus article, "Securing Apache: Step-by-Step."
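
On systems that provide ldd (an assumption; it is not one of the tools named above), a first approximation of the library list can be taken from its output. The sketch below only parses ldd-style text; the helper name ldd_paths is hypothetical. It shows link-time dependencies only, so files opened at run time still have to be found with truss, strace or ktrace.

```shell
# Hypothetical helper: print the absolute paths of the shared libraries
# found in ldd-style output read from stdin.
ldd_paths() {
    # Keep only lines of the form "name => /absolute/path (address)".
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}
# Example: ldd /usr/local/apache2/bin/httpd | ldd_paths
```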

Once all these steps have been completed, we can run Apache in the chrooted environment, as follows:


chroot /chroot/httpd /usr/local/apache2/bin/httpd 

Known attacks on SSL/TLS

Although the SSL/TLS protocols offer a high level of security in theory, their actual implementations may be vulnerable to several types of attacks. Of the many attack vectors, two are worthy of special attention:

  • Man in the middle (MITM) attacks

    In this type of attack, an intruder intercepts the traffic that is being sent between a client and server, such as by forging DNS replies or by performing ARP redirection. Then, it impersonates the client to the server, and vice-versa. During this attack, the user's web browser does not connect directly to the destination server, but instead to the intruder's host, which impersonates the web browser and essentially acts as a proxy.

    There is good and bad news for an administrator who wishes to defend against such attacks. The good news is that web browsers warn users when the web server's identity cannot be verified, which may indicate a possible man-in-the-middle attack, by displaying a message window with a warning. The bad news is that in real life, users very often ignore such warnings. Hence, if the user's web browser accepts connections to SSL web sites whose identities cannot be checked, we can only rely on user education and trust that users will not press the "proceed" button when such a warning message is displayed.

  • Brute-force attack on the session key

    This attack can be performed when the intruder knows or can assume part of the clear text that was sent during the SSL/TLS session (such as "GET / HTTP/1.0"), and can eavesdrop on this session (such as by using tcpdump, Ethereal or other tools). The intruder can then encrypt the assumed part of the text with every possible key, trying to find its occurrence in the originally encrypted SSL/TLS traffic. Once such an occurrence has been found, the key that was used to encrypt this part of the message can be used to decrypt the rest of the encrypted SSL/TLS traffic.

    The good news is that the maximum number of keys that must be checked is 2^128 when 128-bit symmetric cryptography is used. Today this is believed to be strong enough to protect the session for literally dozens of years. However, since CPUs grow in strength every year, we cannot really predict how long 128-bit symmetric keys will be considered secure, particularly against attackers with access to large supercomputers.

    In the case of export-class cipher suites (40-bit and, to some extent, 56-bit ciphers) such brute-force attacks can be performed in a reasonable amount of time -- sometimes even in a few days, depending on the number of CPUs available. If export regulations in your country allow the use of strong cryptography, you should definitely use it instead of export-class cipher suites.
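
To put these key lengths in perspective, the sketch below computes the worst-case time to try every key. The search rate of one million keys per second is purely an assumption chosen for illustration, not a measurement, and the function name is made up.

```shell
# Days needed to try every key of the given length at an ASSUMED search rate.
# Both the function name and the rate are illustrative only.
exhaustive_search_days() {
    bits=$1
    keys_per_second=$2
    # 2^bits keys, divided by the rate, converted from seconds to days.
    awk -v b="$bits" -v r="$keys_per_second" \
        'BEGIN { printf "%.1f\n", (2 ^ b) / r / 86400 }'
}

exhaustive_search_days 40 1000000    # prints 12.7
```

At the same assumed rate, a 128-bit key space works out to a figure on the order of 10^27 days, which is why the choice of cipher strength matters far more than the attacker's hardware.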

In addition to the two types of attacks listed above, there are some other potential vulnerabilities, including algorithm rollback attacks, timing attacks, traffic analysis, Bleichenbacher's attacks, and others. Those interested in understanding them can find more information in Bruce Schneier and David Wagner's document, "Analysis of the SSL 3.0 protocol," (PDF document) as well as in many other documents that can be found on the public Internet.

Typical SSL/TLS implementation mistakes

  • Certificates signed by a CA that is not known to the web browser

    The biggest mistake that can be made when implementing SSL/TLS is having the web server's certificate signed by a CA that is not trusted by the web browser. In other words, the CA's certificate is not installed in the web browser. This makes man-in-the-middle attacks very easy to perform, because users have no way of verifying the identity of the web server.

    To avoid this problem, it is important to make sure that the signer's certificate (usually, that of a trusted CA) is installed in the user's web browser. If a local CA was used for signing the web server's certificate, then we must make sure that all web browsers on all clients requiring access have the local CA's certificate installed. The same rule applies to self-signed certificates.

  • Expired certificates

    The web server's certificate should be renewed before the previous one expires. Otherwise it will result in the same problem as above, whereby web clients will not be able to authenticate the web server -- which once again makes SSL/TLS connections vulnerable to man-in-the-middle attacks. In this case, users may get used to seeing a warning message saying the certificate has expired, and then will probably not notice if it is a bogus certificate.

  • Vulnerable versions of OpenSSL and Apache

    It is very important to always run the latest versions of OpenSSL and Apache. Writing large pieces of code such as these without bugs is virtually impossible, so we should always use the latest stable versions of the above software. The latest versions should theoretically contain fewer security vulnerabilities (both discovered and not yet discovered) than previous versions.

  • Acceptance of SSL v2.0, anonymous authentication (aNULL), and cryptography (NULL) by the web server

    As was previously discussed, cipher suites that support anonymous authentication or require no encryption should be disabled. Otherwise, there is a risk that the client can be tricked into negotiating parameters that dramatically lower the security level of the connection. For the same reason we should disable the SSLv2.0 protocol and use TLSv1.0 or SSLv3.0 instead.

  • Use of weak encryption

    Early implementations of SSL were only able to use 40-bit keys for symmetric encryption, due to US government restrictions. Unfortunately, data encrypted with 40-bit symmetric keys can now be decrypted in a relatively short period of time, and for this reason 40-bit and 56-bit keys should no longer be used. Most modern web browsers support 128-bit keys for symmetric encryption, and this is now the minimum recommended key length for use with SSL/TLS.

  • Improper use of a Network Intrusion Detection System (NIDS)

    It is important to stress that unless a NIDS is capable of decrypting the SSL traffic (such as through the use of the web server's private key), it is simply unable to detect attacks on the web application. To detect possible break-ins, we must either use a HIDS (Host-based Intrusion Detection System), or put the NIDS in a segment where SSL/TLS traffic is sent in clear text, such as between an SSL reverse proxy and the web server farm. Otherwise we may not be able to detect any attacks, except denial of service, performed against the web server.

  • Allowing access not only via SSL, but also via non-encrypted protocols (HTTP)

    Setting up SSL/TLS and opening port 443/tcp to the web server means nothing, by itself, if users can still access the website via non-encrypted HTTP on port 80/tcp. Thus, one must double-check that the protected content cannot be accessed via non-encrypted HTTP or other protocols (including FTP, Samba, NFS, and so on).

  • Vulnerable client machines

    When we focus on securing Apache web servers, we can easily forget about the security of the client machines. If they are compromised, the security of SSL/TLS is compromised as well. In that case, intruders could do as they like with client hosts, such as replace certificates in web browsers, install keyloggers, change /etc/hosts to redirect web requests to bogus web servers, or even steal client certificates if they have been marked as "exportable."

    Therefore, if the client machines are under our administrative domain, we need to take great care with their security. Enabling a personal firewall, antivirus/antispyware software, and turning on automatic Windows updates should be the absolute minimum. Most importantly, web browser versions should be always up-to-date.

    It is also recommended that one apply the following options (related to SSL/TLS) to the web browser configuration:

    • checking for a publisher's certificate revocation should be enabled
    • checking for any server certificate revocation should be enabled
    • encrypted server pages should not be stored in the cache
    • the use of SSLv2.0 should be disabled, and only TLSv1.0 and SSLv3.0 should remain enabled
    • the displaying of warnings about invalid web servers certificates should be enabled
  • Using the same Session IDs / cookies for SSL and HTTP

    It is possible to have an Apache configuration that accepts both HTTP and HTTPS requests on the same server, such as an information website accessible via HTTP, and a transaction part accessible via HTTPS. In this case, we must be very careful in our web application not to use the same session IDs / cookies for both protocols. Otherwise, an intruder can sniff the HTTP traffic between the web server and the victim, and can try to use the session IDs to get access to the authorized part of the web application via SSL.

Conclusion

This article closes the series of articles devoted to configuring Apache 2.0 with SSL/TLS support. It has shown how to set up the SSL/TLS protocol to achieve maximum security and optimal performance, how to create and revoke certificates, and how to use them in practice.

Although the series has covered the most important aspects of Public Key Infrastructure for use with SSL web servers, including creating, installing, using and revoking certificates, it has not exhausted the subject of PKI. On the contrary, only the real basics of PKI have been presented. Readers interested in further reading on this topic are encouraged to take a look at the documents created by the PKIX Working Group, or at the OpenCA PKI Development Project, the latter being a robust, open-source Certification Authority.

Apache 2 with SSL/TLS: Part 2

In the first article of this three-part series, the reader was shown how to install, configure, and troubleshoot Apache 2.0 with SSL/TLS support. Part two now discusses the recommended settings for the mod_ssl module that let us achieve maximum security and optimal performance. The reader will also see how to create a local Certification Authority and an SSL certificate based on the free and open-source OpenSSL library.

Recommended settings for mod_ssl

In Apache 2.0.52, there are more than 30 directives that can be used to configure mod_ssl. The detailed description of all of them can be found in Apache's mod_ssl documentation. This section focuses only on the recommended settings which can improve the security or performance of SSL/TLS connections.

The list of these mod_ssl settings is shown below in Table 1.

Directive(s)

Recommended setting or comment

SSLEngine

Must be enabled, otherwise the main server (or virtual host) will not be using SSL/TLS

SSLRequireSSL

Must be enabled, otherwise users may be able to access the web content via regular HTTP requests, without using SSL/TLS at all.

SSLProtocol

SSLProxyProtocol

Should be set to use only TLS v1.0 and SSL v3.0. Most current web browsers support both of them, so we can safely disable SSL v2.0.

SSLCipherSuite

SSLProxyCipherSuite

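To provide strong cryptography, this parameter should be set to use HIGH (>168 bits) and MEDIUM (128 bits) cipher suites. LOW (<56 bits) cipher suites and suites that allow anonymous authentication (aNULL) should be disabled. SHA1-based suites are preferred over MD5-based ones, because MD5 has known problems with collisions (http://eprint.iacr.org/2004/199.pdf). In summary, the recommended settings could be as follows: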

HIGH:MEDIUM:!aNULL:+SHA1:+MD5:+HIGH:+MEDIUM

Note that it is possible to see which cipher suites the proposed settings support, as follows:

openssl ciphers -v 'HIGH:MEDIUM:\!aNULL:+SHA1:+MD5:+HIGH:+MEDIUM'

SSLOptions

The "+StrictRequire" option should be set; otherwise the "Satisfy Any" directive may force mod_ssl to allow access to the web content even if SSL/TLS is not used.

SSLRandomSeed

For Apache startup, this should be set to use the pseudo-random device (/dev/urandom) and/or EGD (Entropy Gathering Daemon). For the seeding done before every new SSL connection is established, it should be configured to use the built-in source, /dev/urandom, or EGD. Using /dev/random is not recommended in either case, because /dev/random can provide only as much entropy as it has available at a given moment.

SSLSessionCache

To avoid repeating SSL handshakes for parallel HTTP requests (e.g. when a web browser downloads several images at one time), SSL session caching should be enabled. It should be set to use shared memory (SHM) or DBM. If it is set to "none", the performance of the web server may decrease significantly.

SSLSessionCacheTimeout

This value specifies the number of seconds after which an entry in the SSL session cache expires. It should be set to at least 300-600 seconds. However, the actual value should depend on the average time users spend visiting the web server. E.g., if the average visit lasts around 15 minutes, the value should be set to at least 900 (15 minutes * 60 seconds).

SSLVerifyClient

SSLProxyVerify

When not using client or proxy authentication, these options should be set to "none". They should never be set to "optional_no_ca", because that defeats the idea of PKI authentication, in which a client must present a valid certificate in order to be authenticated. "optional" may occasionally be used (depending on needs), but it may not work with all web browsers.

SSLVerifyDepth

SSLProxyVerifyDepth

Should contain the maximum number of intermediate CAs. E.g., to accept only self-signed certificates it should be set to zero; for client certificates that are signed by the root CA it should be 1; and so on.

SSLProxyEngine

Should be disabled if the SSL/TLS proxy mechanism is not used.

Table 1. Recommended mod_ssl settings.

Our sample settings according to the above recommendations can be shown in httpd.conf as follows:


SSLEngine on

SSLOptions +StrictRequire


SSLRequireSSL


SSLProtocol -all +TLSv1 +SSLv3

# Support only for strong cryptography
SSLCipherSuite HIGH:MEDIUM:!aNULL:+SHA1:+MD5:+HIGH:+MEDIUM
# Support for strong and export cryptography
# SSLCipherSuite HIGH:MEDIUM:EXP:!aNULL:+SHA1:+MD5:+HIGH:+MEDIUM:+EXP

SSLRandomSeed startup file:/dev/urandom 1024
SSLRandomSeed connect file:/dev/urandom 1024

SSLSessionCache shm:/usr/local/apache2/logs/ssl_cache_shm
SSLSessionCacheTimeout 600

SSLVerifyClient none
SSLProxyEngine off
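
As a quick sanity check, the sketch below reports which of the recommended directives do not appear, uncommented, in a configuration file. It is only a rough text search, not a substitute for Apache's own syntax check; the helper name and the choice of directives to look for are assumptions made for this example.

```shell
# Hypothetical helper: print each recommended mod_ssl directive that is
# missing (or only present in a comment) in the given configuration file.
missing_ssl_directives() {
    conf=$1
    for directive in SSLEngine SSLRequireSSL SSLProtocol SSLCipherSuite \
                     SSLRandomSeed SSLSessionCache SSLSessionCacheTimeout; do
        # Match the directive name at the start of a line, as a whole word,
        # so that SSLSessionCacheTimeout does not count as SSLSessionCache.
        grep -Eq "^[[:space:]]*${directive}([[:space:]]|\$)" "$conf" ||
            echo "$directive"
    done
}
# Example (path assumed): missing_ssl_directives /usr/local/apache2/conf/httpd.conf
```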

In addition to the above mod_ssl directives, there are also two important directives from other Apache modules (mod_log_config and mod_setenvif) that need to be set up, as shown below in Table 2.

Directive(s)

Recommended setting / comment

CustomLog

To log information about SSL parameters (recommended minimum: the protocol version and chosen cipher suites) we should use the following value:


CustomLog logs/ssl_request_log \
"%t %h %{HTTPS}x %{SSL_PROTOCOL}x %{SSL_CIPHER}x \
%{SSL_CIPHER_USEKEYSIZE}x %{SSL_CLIENT_VERIFY}x \
\"%r\" %b"

SetEnvIf

To provide compatibility with older versions of MS Internet Explorer, which have known bugs in their SSL implementation (e.g. problems with keep-alive functionality, HTTP/1.1 over SSL, and SSL close notify alerts on socket connection close), the following option should be set:


SetEnvIf User-Agent ".*MSIE.*" \
nokeepalive ssl-unclean-shutdown \
downgrade-1.0 force-response-1.0

With the above option, the web server will use neither HTTP/1.1 nor keep-alive connections, and will not send SSL close notify alerts, when the web browser is MS Internet Explorer.

Table 2. Recommended mod_log_config and mod_setenvif settings.
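
Once such a log exists, it can be checked for clients that negotiated weak encryption. The following is a minimal sketch, assuming the CustomLog format from Table 2 (where %t expands to two whitespace-separated fields); the helper name and the 128-bit threshold are choices made for this example.

```shell
# Hypothetical helper: list requests negotiated with fewer than 128 key bits.
# Assumed field positions for the CustomLog format above:
#   $1-$2 timestamp, $3 host, $4 HTTPS flag, $5 protocol,
#   $6 cipher, $7 key bits actually used.
weak_ssl_requests() {
    awk '$4 == "on" && $7 + 0 < 128 { print $3, $5, $6, $7 }' "$@"
}
# Example (log path assumed): weak_ssl_requests /usr/local/apache2/logs/ssl_request_log
```

With the strong-cryptography SSLCipherSuite setting shown earlier, this should normally print nothing; any output suggests that export-class suites are still being accepted.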

The sample configuration file (httpd.conf) presented in the previous article already includes the above settings, for the reader's convenience.

Web server authentication

Thus far we were able to configure and test SSL/TLS, but our web browser was not able to check the web server's identity. In the first article we were using a web server certificate that had been created only for testing purposes, and did not contain the information required for real authentication purposes and commerce transactions.

In order for the web browser to successfully authenticate the web server, we need to create a valid web server certificate, which should contain:

  • the public key of the web server
  • validity dates (start and expiration)
  • supported cipher algorithms
  • the distinguished name (DN), which must contain the fully qualified domain name of the web server as the Common Name (CN). Optionally it may also contain other attributes, such as Country (C), State (S), Location (L), the Organization's name (O), the Organization Unit's name (OU), and more.
  • the serial number of the certificate
  • X.509v3 attributes that will tell web browsers about the type and usage of the certificate
  • URI of the CRL distribution point (if one exists)
  • URI of the X.509v3 Certificate Policy (if one exists)
  • name and signature of trusted Certification Authority (CA)

It is important to note that the Common Name (CN) attribute must be the fully qualified domain name (FQDN) of the web server. Otherwise, web browsers will not be able to verify that the certificate belongs to the web server that is presenting it.
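
The CN of an existing certificate can be compared with the server's host name before the certificate is installed. The sketch below only extracts the CN from a subject line as printed by OpenSSL; the helper name subject_cn is hypothetical, and the pattern assumes the CN value contains no comma or slash.

```shell
# Hypothetical helper: print the CN from an OpenSSL subject line on stdin.
# Handles both the "/O=.../CN=name" and the "O = ..., CN = name" renderings.
subject_cn() {
    sed -n 's|.*CN *= *\([^,/]*\).*|\1|p'
}
# Example: openssl x509 -in server.crt -noout -subject | subject_cn
```

If the printed name differs from the name users type into their browsers, the browser will warn that the certificate does not match the site.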

A sample web server certificate (as a text representation) is presented below.


Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: O=Seccure, OU=Seccure Root CA
        Validity
            Not Before: Nov 28 01:00:20 2004 GMT
            Not After : Nov 28 01:00:20 2005 GMT
        Subject: O=Seccure, OU=Seccure Labs, CN=www.seccure.lab
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:c1:19:c7:38:f4:89:91:27:a2:1b:1d:b6:8d:91:
                    48:63:0e:3d:0d:2e:f8:65:45:56:db:98:4d:11:21:
                    01:ac:81:8e:3f:64:4a:8a:3f:21:15:ca:49:6e:64:
                    5c:5d:a2:ab:5a:48:cb:2a:9f:0c:02:b9:ff:52:f6:
                    d9:39:6d:a3:4a:94:41:f9:e9:ab:f0:42:fb:68:9a:
                    4b:53:41:e7:4f:b0:2b:02:d7:92:a2:2b:02:a2:f9:
                    f1:2d:68:fa:50:01:2f:49:c1:28:2f:a8:c6:6d:6d:
                    ab:1d:b9:bd:c9:80:63:f1:d6:22:19:de:2d:4a:43:
                    50:76:79:7e:a5:5a:75:af:19
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints:
                CA:FALSE
            Netscape Cert Type:
                SSL Server
            X509v3 Key Usage:
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, Netscape Server Gated Crypto,
                Microsoft Server Gated Crypto
            Netscape Comment:
                OpenSSL Certificate for SSL Web Server
    Signature Algorithm: sha1WithRSAEncryption
        45:30:9d:04:0e:b7:86:9e:61:a1:b0:68:2b:44:93:1c:57:2a:
        99:42:bb:16:b1:ab:f5:c0:d2:33:12:c8:d3:1d:2b:bb:6b:9a:
        4c:c7:53:bc:e4:88:ef:1e:c3:37:ed:53:2c:15:cf:b8:90:df:
        df:4b:34:b8:db:cc:23:77:46:06:72:9d:43:60:a8:a2:ed:0a:
        bb:1a:a4:e8:4e:ba:66:93:63:74:87:fd:43:48:b6:93:a2:e3:
        3d:da:1b:64:46:35:88:b4:4b:22:e6:3c:84:70:5d:88:dd:64:
        c2:51:c2:d6:59:80:87:bc:bd:7f:e3:c1:45:7e:c0:5f:9c:ca:
        e1:a1

The examples presented in the subsequent sections of this article are based on the following values, as shown in Table 3. In order to create valid certificates, readers will need to replace these values with the names of their own company or organization.

Attribute's description      Attribute   Sample value

Country code (two letters)   C           C = PL
State or Province            S           S = mazowieckie
Location                     L           L = Warsaw
Organization Name            O           O = Seccure
Organization Unit            OU          OU = Seccure Labs
Common Name                  CN          CN = www.seccure.lab

Table 3. Sample values for a valid certificate.

The passphrase dilemma

Before creating certificates, it is important to understand the implications of a passphrase for the certificate. Should the web server's private key be encrypted or not? There are many opinions, but it is recommended that one not protect the web server's private key with a passphrase. Doing so is not only inconvenient, but also gives a false sense of security. Why? Consider the points below.

  1. One is required to enter the passphrase after every restart of the web server, which can be quite annoying if the system needs to be restarted often (such as due to a kernel update, a power failure, a configuration change, and so on).
  2. If an intruder manages to get the private key on the web server, it means that the web server is compromised and the intruder has had access to the web server's operating system at root level. If this is the case, the intruder could obtain the passphrase by installing a keylogger, and either crash or restart the system to force the administrator to enter the passphrase. Alternatively, an intruder could dump Apache's memory and find the web server's private key stored there as clear text. While this is a little bit difficult, for those skilled in the art of hacking Unix it should not pose a big problem (hint: look at the pcat utility from The Coroner's Toolkit).

Therefore, the only advantage of encrypting the web server's private key is that the passphrase will help protect it against script kiddies, but not against professionals who are able to compromise the server.

Creating the web server certificate

At this point we can create our web server certificate. In general, there are three types of certificates that we can use:

  1. A self-signed certificate.
  2. A certificate signed by a trusted CA (most recommended).
  3. A certificate signed by a local CA.

The sections below describe in detail the methods of creating the above certificates. The final result of any method used will be just two files:

  • server.key - the private key of the web server
  • server.crt - the PEM encoded certificate that includes our web server's public key

Method 1: Self-signed certificate (for testing purposes only)

This method is recommended only for continuing our testing, or for use in small, closed environments (such as at home or in small Intranets). In order for web browsers to be able to authenticate the web server, self-signed certificates must be installed in every web browser that needs to access the web server. This can be quite inconvenient.

The web server's private/public key pair and the self-signed PEM-encoded certificate now can be created as follows:

openssl req \
-new \
-x509 \
-days 365 \
-sha1 \
-newkey rsa:1024 \
-nodes \
-keyout server.key \
-out server.crt \
-subj '/O=Seccure/OU=Seccure Labs/CN=www.seccure.lab'

The above command will create a new (-new) certificate (-x509) that will be valid for one year (-days 365) and will be signed using the SHA1 algorithm (-sha1). The RSA private key will be 1024 bits long (-newkey rsa:1024), and will not be protected by a passphrase (-nodes). The certificate and the private/public key pair will be created in the "server.crt" and "server.key" files (-out server.crt -keyout server.key). The "-subj" parameter says that the company's name is "Seccure" (O=Seccure), the department's name is "Seccure Labs" (OU=Seccure Labs), and the web server's fully qualified domain name is "www.seccure.lab" (CN=www.seccure.lab).
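One way to confirm that the options did what the paragraph above describes is to read the fields back out of the finished certificate. The following is only a sketch: it works in a scratch directory, and it assumes a current OpenSSL build, so it substitutes a 2048-bit key and SHA-256 for the 1024-bit key and SHA1 used in the article (some newer builds refuse the older values). The "$work" directory and the variable names are made up for the illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)   # scratch directory, so nothing real is overwritten

# Same shape as the command above; key size and digest are assumptions
# chosen to suit current OpenSSL builds.
openssl req -new -x509 -days 365 -sha256 -newkey rsa:2048 -nodes \
  -keyout "$work/server.key" -out "$work/server.crt" \
  -subj '/O=Seccure/OU=Seccure Labs/CN=www.seccure.lab' 2>/dev/null

# Print the fields controlled by -subj and -days.
openssl x509 -in "$work/server.crt" -noout -subject -issuer -dates

# In a self-signed certificate the subject and the issuer are the same name.
subject=$(openssl x509 -in "$work/server.crt" -noout -subject | sed 's/^subject= *//')
issuer=$(openssl x509 -in "$work/server.crt" -noout -issuer | sed 's/^issuer= *//')
if [ "$subject" = "$issuer" ]; then
  echo "self-signed: subject and issuer are identical"
fi
```

The identical subject and issuer are exactly why browsers cannot verify such a certificate on their own: no third party vouches for it, so it has to be installed by hand.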

After creating the above certificate, we need to distribute and install it in every web browser that may connect to the web server. Otherwise, web browsers requiring a connection will not be able to verify the web server's identity. For Windows environments, this is shown below in Figure 1.


Figure 1. Installing a self-signed certificate onto a client machine.

Method 2: Certificate signed by a trusted CA (recommended method)

Creating a certificate request and having it signed by a trusted CA (such as Verisign, Thawte, RSA, or others) is the most recommended way to proceed if the SSL web server is to be exposed to the Internet. Using this approach, there is no need to install certificates in each web browser, since most of them already have a number of trusted CA certificates pre-installed out-of-the-box.

Please note that each Certificate Authority has different restrictions for the Distinguished Name's attributes, to accommodate certain key lengths or international characters. Therefore, prior to creating certificate requests, readers need to make sure that the certificate request is compliant with their particular CA's requirements. It is also recommended that one choose a CA whose signing certificate is already installed in most web browsers (including Thawte, Verisign, and a number of others). Otherwise, the user's web browser may have problems authenticating the web server.

The process of obtaining a signed certificate from a trusted CA consists of the following steps:

  1. In the first step, we should create our web server's private/public key pair (server.key), and certificate request (request.pem), as follows:

    openssl req \
    -new \
    -sha1 \
    -newkey rsa:1024 \
    -nodes \
    -keyout server.key \
    -out request.pem \
    -subj '/O=Seccure/OU=Seccure Labs/CN=www.seccure.lab'
  2. Now we must send the certificate request (request.pem) to the CA, and then wait until it is signed and sent back to us in the form of a certificate.
  3. After receiving the certificate back from our trusted CA, we must make sure that it is encoded in the PEM format, and not in TXT or DER format. If the received certificate is not PEM-encoded, then we will need to convert it from whatever format we have received.

    The easiest way to check the format of the certificate is to view the certificate with a text editor. Depending on how the certificate looks, it can be in one of the following formats (the typical filename extensions are given in parentheses):

    • PEM, Base64 encoded X.509 format (*.crt, *.pem, *.cer)

      -----BEGIN CERTIFICATE-----
      MIICdzCCAeCgAwIBAgIBATANBgkqhkiG9w0BAQUFADAsMRAwDgYDVQQKEwdTZWNj
      dXJlMRgwFgYDVQQLEw9TZWNjdXJlIFJvb3QgQ0EwHhcNMDQxMTI4MDEwMDIwWhcN
      ...
      ou0Kuxqk6E66ZpNjdIf9Q0i2k6LjPdobZEY1iLRLIuY8hHBdiN1kwlHC1lmAh7y9
      f+PBRX7AX5zK4aE=
      -----END CERTIFICATE-----
    • TXT + PEM format (*.crt, *.cer, *.pem, *.txt)

      Certificate:
      Data:
      Version: 3 (0x2)
      Serial Number: 1 (0x1)
      Signature Algorithm: sha1WithRSAEncryption
      Issuer: O=Seccure, OU=Seccure Root CA
      ...
      RSA Public Key: (1024 bit)
      Modulus (1024 bit):
      00:c1:19:c7:38:f4:89:91:27:a2:1b:1d:b6:8d:91:
      ...
      X509v3 extensions:
      X509v3 Basic Constraints:
      CA:FALSE
      ...
      -----BEGIN CERTIFICATE-----
      MIICdzCCAeCgAwIBAgIBATANBgkqhkiG9w0BAQUFADAsMRAwDgYDVQQKEwdTZWNj
      dXJlMRgwFgYDVQQLEw9TZWNjdXJlIFJvb3QgQ0EwHhcNMDQxMTI4MDEwMDIwWhcN
      ...
      ou0Kuxqk6E66ZpNjdIf9Q0i2k6LjPdobZEY1iLRLIuY8hHBdiN1kwlHC1lmAh7y9
      f+PBRX7AX5zK4aE=
      -----END CERTIFICATE-----

      If your certificate was received in TXT + PEM format, here is the command to convert it to PEM:

      openssl x509 -in signed_cert.pem -out server.crt

    • DER, binary encoded X.509 (*.der, *.crt, *.cer)


      [ non-text, binary representation ]

      If your certificate was received in DER format, here is the command to convert it to PEM:

      openssl x509 -in signed_cert.der -inform DER -out server.crt
  4. Verify and test the certificate

    Before installing the certificate we should check if the received certificate is indeed valid and can be used for web server authentication purposes:

    openssl verify -CAfile /path/to/trusted_ca.crt -purpose sslserver server.crt

    Also, it is good to make sure that the certificate corresponds to our previously created web server's private key (the results of both commands below should be identical):

    openssl x509 -noout -modulus -in server.crt | openssl sha1
    openssl rsa -noout -modulus -in server.key | openssl sha1
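Steps 3 and 4 above can be sketched as two small shell functions. This is an illustration rather than a definitive tool: the function names (to_pem, key_matches) and the file names reply.pem and reply.der are made up, and the script generates a throwaway certificate to stand in for the CA's reply so that it can be run anywhere. It assumes a current OpenSSL build and therefore uses a 2048-bit key with SHA-256.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Throwaway key and certificate standing in for the CA's reply (illustrative only).
openssl req -new -x509 -days 1 -sha256 -newkey rsa:2048 -nodes \
  -keyout server.key -out reply.pem \
  -subj '/O=Seccure/OU=Seccure Labs/CN=www.seccure.lab' 2>/dev/null
openssl x509 -in reply.pem -outform DER -out reply.der

# to_pem IN OUT: write a pure PEM copy of IN, whatever encoding it arrived in.
to_pem() {
  if grep -q -e '-----BEGIN CERTIFICATE-----' "$1"; then
    # PEM or TXT + PEM: re-emitting drops any human-readable preamble.
    openssl x509 -in "$1" -out "$2"
  else
    # No PEM header found, so assume binary DER.
    openssl x509 -in "$1" -inform DER -out "$2"
  fi
}

# key_matches KEY CERT: succeed when both carry the same RSA modulus.
key_matches() {
  k=$(openssl rsa -noout -modulus -in "$1" | openssl sha1)
  c=$(openssl x509 -noout -modulus -in "$2" | openssl sha1)
  [ "$k" = "$c" ]
}

to_pem reply.der server.crt
if key_matches server.key server.crt; then
  echo "certificate matches private key"
else
  echo "certificate does NOT match private key" >&2
fi
```

Looking for the BEGIN CERTIFICATE line is the scripted equivalent of opening the file in a text editor; anything without that line is treated as DER, which is a simplification.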

Method 3: Certificate signed by a local CA

This third method of signing a certificate can be used in Intranets, as well as in any organization that uses, or plans to use, its own Certification Authority. In this case, the local CA's certificate must be installed in all web browsers that connect to the secure web server.

To be able to use this method, we need to create our local CA's private/public key, as well as the CA's certificate and repository for the new keys.

Note: The local CA should be created on a separate server that is not connected to the network at all. The operating system should allow access only to authorized people, and the machine itself should be physically secured. The CA's private key is the most precious element of the entire PKI system - if this key is compromised, then all other certificates signed by this CA are considered compromised as well!

We will use the OpenSSL library to set up the environment step by step, as listed below. Of course, if we already have a local CA, we can skip this section and proceed with creating the certificate request for the web server.

  1. Prepare the directory structure for the new CA (the $SSLDIR environment variable should be added to applicable startup scripts, such as /etc/profile or /etc/rc.local):

    export SSLDIR=$HOME/ca
    mkdir $SSLDIR
    mkdir $SSLDIR/certs
    mkdir $SSLDIR/crl
    mkdir $SSLDIR/newcerts
    mkdir $SSLDIR/private
    mkdir $SSLDIR/requests
    touch $SSLDIR/index.txt
    echo "01" > $SSLDIR/serial
    chmod 700 $SSLDIR
  2. Create the main OpenSSL configuration file, $SSLDIR/openssl.cnf, with the following content (optimized for use with SSL web servers):

    # =================================================
    # OpenSSL configuration file
    # =================================================

    RANDFILE = $ENV::SSLDIR/.rnd

    [ ca ]
    default_ca = CA_default

    [ CA_default ]
    dir = $ENV::SSLDIR
    certs = $dir/certs
    new_certs_dir = $dir/newcerts
    crl_dir = $dir/crl
    database = $dir/index.txt
    private_key = $dir/private/ca.key
    certificate = $dir/ca.crt
    serial = $dir/serial
    crl = $dir/crl.pem
    RANDFILE = $dir/private/.rand
    default_days = 365
    default_crl_days = 30
    default_md = sha1
    preserve = no
    policy = policy_anything
    name_opt = ca_default
    cert_opt = ca_default

    [ policy_anything ]
    countryName = optional
    stateOrProvinceName = optional
    localityName = optional
    organizationName = optional
    organizationalUnitName = optional
    commonName = supplied
    emailAddress = optional

    [ req ]
    default_bits = 1024
    default_md = sha1
    default_keyfile = privkey.pem
    distinguished_name = req_distinguished_name
    x509_extensions = v3_ca
    string_mask = nombstr

    [ req_distinguished_name ]
    countryName = Country Name (2 letter code)
    countryName_min = 2
    countryName_max = 2
    stateOrProvinceName = State or Province Name (full name)
    localityName = Locality Name (eg, city)
    0.organizationName = Organization Name (eg, company)
    organizationalUnitName = Organizational Unit Name (eg, section)
    commonName = Common Name (eg, YOUR name)
    commonName_max = 64
    emailAddress = Email Address
    emailAddress_max = 64

    [ usr_cert ]
    basicConstraints = CA:FALSE
    # nsCaRevocationUrl = https://url-to-exposed-clr-list/crl.pem

    [ ssl_server ]
    basicConstraints = CA:FALSE
    nsCertType = server
    keyUsage = digitalSignature, keyEncipherment
    extendedKeyUsage = serverAuth, nsSGC, msSGC
    nsComment = "OpenSSL Certificate for SSL Web Server"

    [ ssl_client ]
    basicConstraints = CA:FALSE
    nsCertType = client
    keyUsage = digitalSignature, keyEncipherment
    extendedKeyUsage = clientAuth
    nsComment = "OpenSSL Certificate for SSL Client"

    [ v3_req ]
    basicConstraints = CA:FALSE
    keyUsage = nonRepudiation, digitalSignature, keyEncipherment

    [ v3_ca ]
    basicConstraints = critical, CA:true, pathlen:0
    nsCertType = sslCA
    keyUsage = cRLSign, keyCertSign
    extendedKeyUsage = serverAuth, clientAuth
    nsComment = "OpenSSL CA Certificate"

    [ crl_ext ]
    basicConstraints = CA:FALSE
    keyUsage = digitalSignature, keyEncipherment
    nsComment = "OpenSSL generated CRL"
  3. Now create the CA's private/public key pair, and the self-signed CA's certificate:

    openssl req \
    -config $SSLDIR/openssl.cnf \
    -new \
    -x509 \
    -days 3652 \
    -sha1 \
    -newkey rsa:1024 \
    -keyout $SSLDIR/private/ca.key \
    -out $SSLDIR/ca.crt \
    -subj '/O=Seccure/OU=Seccure Root CA'

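    It should be emphasized that the CA's private key (ca.key) should be protected by a hard-to-guess passphrase, and that the CA's certificate should be valid for a much longer period of time than regular certificates (typically 10-30 years, or more). The command above omits -nodes, so OpenSSL prompts for the passphrase, and -days 3652 gives the certificate a lifetime of about ten years.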
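Both properties stressed above, an encrypted CA key and a long-lived CA certificate, can be checked after the fact. The sketch below is only an illustration under a few assumptions: it runs in a scratch directory, relies on the system's default OpenSSL configuration instead of $SSLDIR/openssl.cnf, uses a 2048-bit key with SHA-256 to suit current OpenSSL builds, and passes a placeholder passphrase on the command line purely to stay non-interactive (a real CA key should get its passphrase from the prompt).

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)

# "pass:..." keeps the demo non-interactive; the passphrase is a placeholder.
openssl req -new -x509 -days 3652 -sha256 -newkey rsa:2048 \
  -passout pass:example-only-passphrase \
  -keyout "$work/ca.key" -out "$work/ca.crt" \
  -subj '/O=Seccure/OU=Seccure Root CA' 2>/dev/null

# An encrypted PEM key announces itself in its header lines.
if grep -q 'ENCRYPTED' "$work/ca.key"; then
  key_state=encrypted
else
  key_state=plaintext
fi

# -checkend N succeeds if the certificate is still valid N seconds from now.
nine_years=$((9 * 365 * 86400))
if openssl x509 -in "$work/ca.crt" -noout -checkend "$nine_years" >/dev/null; then
  lifetime=long
else
  lifetime=short
fi

echo "CA key: $key_state, CA certificate lifetime: $lifetime"
```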

The CA's certificate "ca.crt" should be published on Intranet web pages and installed in every web browser that may possibly need to use it. A sample root CA certificate installed in Internet Explorer is shown below in Figure 2.


Figure 2. Sample root CA certificate installed in Internet Explorer.

From this point on we can use our local CA for signing and revoking certificates. In order to create the web server certificate, we should follow the steps below:

  1. Create the web server's private/public key pair (server.key), and the certificate request (request.pem). This instruction needs to be executed on the web server.

    openssl req \
    -new \
    -sha1 \
    -newkey rsa:1024 \
    -nodes \
    -keyout server.key \
    -out request.pem \
    -subj '/O=Seccure/OU=Seccure Labs/CN=www.seccure.lab'
  2. Copy the above certificate request (request.pem) into the $SSLDIR/requests directory on the CA host (using removable media, such as a USB drive).
  3. Sign the certificate request as follows (to be executed on the CA host only):

    openssl ca \
    -config $SSLDIR/openssl.cnf \
    -policy policy_anything \
    -extensions ssl_server \
    -out $SSLDIR/requests/signed.pem \
    -infiles $SSLDIR/requests/request.pem

    The result of the above command is a signed certificate that is written both to the $SSLDIR/newcerts directory (in a file named after the certificate's serial number) and to the file $SSLDIR/requests/signed.pem. It consists of both a TXT and a PEM representation of the certificate. Because Apache expects a pure PEM format, we need to convert it, as follows:

    openssl x509 \
    -in $SSLDIR/requests/signed.pem \
    -out $SSLDIR/requests/server.crt
  4. Copy the signed, PEM-encoded certificate (server.crt) back to the web server machine.

At this point the web server's certificate is ready to use.

For local Certificate Authorities, if the web server's certificate is compromised, it is the CA's responsibility to revoke the certificate and to inform users and applications that this certificate is no longer valid.

To revoke a certificate, we need to find the serial number of the certificate we want to revoke in the $SSLDIR/index.txt file. Then, we can revoke the certificate as follows:

openssl ca \
-config $SSLDIR/openssl.cnf \
-revoke $SSLDIR/newcerts/.pem
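Looking up the serial number can be scripted. The sketch below assumes the usual layout of index.txt as written by "openssl ca": one certificate per line, tab-separated, with the status flag (V for valid, R for revoked) in the first field, the serial number in the fourth, and the subject name in the sixth. The sample entries, including the mail.seccure.lab host, and the serial_for helper are invented for the illustration; on a real CA the function would be pointed at $SSLDIR/index.txt.

```shell
#!/bin/sh
set -eu
index=$(mktemp)

# Invented sample entries in the tab-separated layout described above:
# status, expiry date, revocation date (empty), serial, file name, subject.
printf 'V\t251128010020Z\t\t01\tunknown\t/O=Seccure/OU=Seccure Labs/CN=www.seccure.lab\n' > "$index"
printf 'V\t251128010020Z\t\t02\tunknown\t/O=Seccure/OU=Seccure Labs/CN=mail.seccure.lab\n' >> "$index"

# serial_for CN FILE: print the serial of each valid entry whose subject ends in /CN=<CN>.
serial_for() {
  awk -F'\t' -v suffix="/CN=$1" '
    $1 == "V" && substr($6, length($6) - length(suffix) + 1) == suffix { print $4 }
  ' "$2"
}

serial=$(serial_for www.seccure.lab "$index")
echo "serial to revoke: $serial"
```

Since "openssl ca" names the copies in the newcerts directory after their serial numbers, the value printed here identifies the file to hand to -revoke.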

To create a CRL (Certificate Revocation List) file, we can use the following command:

openssl ca -config $SSLDIR/openssl.cnf -gencrl -crlexts crl_ext -md sha1 -out $SSLDIR/crl.pem

The above file should be published on the CA's website, and/or distributed to users. When distributing CRLs it is also recommended that one use the Online Certificate Status Protocol (OCSP). More information about OCSP can be found in RFC 2560.

Note that some browsers (including Firefox) accept only DER-encoded CRLs, so prior to installing crl.pem in such browsers, the file must be converted as follows:

openssl crl \
-in $SSLDIR/crl.pem \
-out $SSLDIR/revoke_certs.crl \
-outform DER
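Before touching a production CA, the whole sequence (create the CA, sign a request, revoke the certificate, publish a CRL) can be rehearsed in a scratch directory. The following is a sketch under stated assumptions, not the article's exact setup: the configuration is a trimmed-down stand-in for the openssl.cnf shown earlier, it uses 2048-bit keys and SHA-256 to suit current OpenSSL builds, the CA key is left unencrypted only because it is thrown away, "-batch" is added to suppress the confirmation prompts, and the final check relies on "openssl verify" returning a non-zero status for a revoked certificate (true of OpenSSL 1.1.0 and later).

```shell
#!/bin/sh
set -e
SSLDIR=$(mktemp -d); export SSLDIR
mkdir "$SSLDIR/newcerts" "$SSLDIR/private" "$SSLDIR/requests"
: > "$SSLDIR/index.txt"
echo "01" > "$SSLDIR/serial"
log="$SSLDIR/run.log"

# Trimmed-down stand-in for the article's openssl.cnf (an assumption of this sketch).
cat > "$SSLDIR/openssl.cnf" <<'EOF'
[ ca ]
default_ca = CA_default
[ CA_default ]
dir = $ENV::SSLDIR
new_certs_dir = $dir/newcerts
database = $dir/index.txt
private_key = $dir/private/ca.key
certificate = $dir/ca.crt
serial = $dir/serial
default_days = 365
default_crl_days = 30
default_md = sha256
policy = policy_anything
[ policy_anything ]
organizationName = optional
organizationalUnitName = optional
commonName = supplied
[ req ]
distinguished_name = req_distinguished_name
x509_extensions = v3_ca
[ req_distinguished_name ]
commonName = Common Name
[ v3_ca ]
basicConstraints = critical, CA:true, pathlen:0
keyUsage = cRLSign, keyCertSign
[ ssl_server ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
EOF

# CA key pair and self-signed CA certificate (unencrypted: throwaway demo only).
openssl req -config "$SSLDIR/openssl.cnf" -new -x509 -days 3652 -sha256 \
  -newkey rsa:2048 -nodes -keyout "$SSLDIR/private/ca.key" -out "$SSLDIR/ca.crt" \
  -subj '/O=Seccure/OU=Seccure Root CA' 2>>"$log"

# Web server key pair and certificate request.
openssl req -config "$SSLDIR/openssl.cnf" -new -sha256 -newkey rsa:2048 -nodes \
  -keyout "$SSLDIR/requests/server.key" -out "$SSLDIR/requests/request.pem" \
  -subj '/O=Seccure/OU=Seccure Labs/CN=www.seccure.lab' 2>>"$log"

# Sign the request, then strip the text preamble to get pure PEM.
openssl ca -config "$SSLDIR/openssl.cnf" -batch -policy policy_anything \
  -extensions ssl_server -out "$SSLDIR/requests/signed.pem" \
  -infiles "$SSLDIR/requests/request.pem" 2>>"$log"
openssl x509 -in "$SSLDIR/requests/signed.pem" -out "$SSLDIR/requests/server.crt"

# Before revocation the certificate should verify against the CA.
if openssl verify -CAfile "$SSLDIR/ca.crt" -purpose sslserver \
     "$SSLDIR/requests/server.crt" >>"$log" 2>&1
then before=valid; else before=rejected; fi

# Revoke it (serial 01, the first certificate issued here), then build the CRL.
openssl ca -config "$SSLDIR/openssl.cnf" -batch \
  -revoke "$SSLDIR/newcerts/01.pem" 2>>"$log"
openssl ca -config "$SSLDIR/openssl.cnf" -gencrl -out "$SSLDIR/crl.pem" 2>>"$log"
openssl crl -in "$SSLDIR/crl.pem" -out "$SSLDIR/revoke_certs.crl" -outform DER

# With CRL checking switched on, the same certificate should now be rejected.
if openssl verify -crl_check -CAfile "$SSLDIR/ca.crt" -CRLfile "$SSLDIR/crl.pem" \
     "$SSLDIR/requests/server.crt" >>"$log" 2>&1
then after=valid; else after=rejected; fi

echo "before revocation: $before, after revocation: $after"
```

Note that the certificate only fails verification when the verifier is told to consult the CRL, which is the same reason the browser option described next has to be enabled.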

Also note that in order for the web browser to check if the web server's certificate is revoked, the option "Check for server certificate revocation" should be checked in MS Internet Explorer's Advanced Settings. This is shown below in Figures 3 and 4.


Figure 3. Configuring Internet Explorer to check for certificate revocation.


Figure 4. Internet Explorer's response to a revoked certificate.

Installing the certificate

At this point we can proceed with installing the web server's private key (server.key) and certificate (server.crt) into the Apache environment:

install -m 600 -o root -g sys server.key /usr/local/apache2/conf/ssl.key/
install -m 644 -o root -g sys server.crt /usr/local/apache2/conf/ssl.crt/

We should also make sure that the directives in Apache's configuration file are pointing to the above files (in httpd.conf):

SSLCertificateFile /usr/local/apache2/conf/ssl.crt/server.crt
SSLCertificateKeyFile /usr/local/apache2/conf/ssl.key/server.key

The final step is to restart Apache for the changes to take effect:

/usr/local/apache2/bin/apachectl stop
/usr/local/apache2/bin/apachectl startssl

At this point we can check to see if the SSL website is accessible from the web browsers, and if the web browsers can successfully authenticate with the web server. This time, there should be no warning messages displayed, as shown below in Figure 5.


Figure 5. Secure connection with a valid certificate.

Concluding part two

It has been shown how to configure mod_ssl, and how to create and use a web server's X.509v3 certificates. Next, in the third and final part of this article series, we will discuss client authentication via certificates, as well as common mistakes and known attacks that can threaten the security of SSL communication.