Using ack to search the source code: going beyond grep

ack (or ack-grep on Ubuntu) is a Perl-based search tool that replicates most of the functionality of the grep command, but goes a step further to position itself as an effective tool for searching source code files. Its main features include:

  1. Skipping CVS, SVN and Git directories, and .bak files
  2. Searching only files of a specific language with the --type=TYPE option. ack supports a predefined list of languages and their file extensions. This can be overridden with the -a option to search all file types.
  3. Descending into directories to search the files, ignoring version control directories. More directories can be included or ignored using --[no]ignore-dir.
  4. Supporting grep options like -w, -A, -c etc.

Commonly used options include

-f: print the names of the files that would be searched, without searching them

-w: match only complete words

-G REGEX: only files whose path matches REGEX are included in the search

-H: print the filename with each match

-h: suppress the filename prefix on each match

-i: ignore case

--match REGEX: Specifies the pattern explicitly. This is useful for performing multiple searches on the same files. For example:

 # search for foo and bar in the given files
  ack-grep file1 t/file* --match foo
  ack-grep file1 t/file* --match bar

-n: do not descend into directories.

--sort-files: sorts the found files lexically.

--type=TYPE, --type=noTYPE: specify the file types to include or exclude.

-v: invert match

Moreover, ack provides options to add file types, if required. These can be made permanent by adding them to the .ackrc file. The location of this file is specified by the ACKRC environment variable. If this file doesn’t exist, ack looks in the default location.
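As a rough sketch of what such a customization might look like, the snippet below writes a one-line .ackrc and points ACKRC at it. The eiffel type, its extensions and the temporary file location are purely illustrative, and the --type-set=TYPE=.ext form shown is assumed to be the ack 1.x syntax (the version that also has the -a and -G options described above); newer releases of ack use a different filter syntax, so check the help output of your version first.

```shell
# Hypothetical example: define an "eiffel" file type for ack 1.x.
# The file location and the type itself are assumptions made for illustration.
ackrc_file="${TMPDIR:-/tmp}/ackrc.example"

# Each line of the file is an option that ack prepends to its command line.
printf '%s\n' '--type-set=eiffel=.e,.eiffel' > "$ackrc_file"

# Tell ack to read this file instead of the default location.
export ACKRC="$ackrc_file"

cat "$ACKRC"
```

With that in place, a search such as ack-grep --type=eiffel PATTERN would be limited to files ending in .e or .eiffel.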

More help can be found with ‘ack --help’ or ‘ack-grep --help’, or by referring to the manual pages.

Understanding init in Linux/Unix : With examples.

Init is the parent of all processes running in user space in Linux/Unix. At startup, init is responsible for starting all the non-operating-system services, creating the user environment, and presenting the user with the login screen. Likewise, at shutdown, it is responsible for terminating all processes in a controlled manner before the kernel executes its own shutdown.

The init process is located at /sbin/init on Linux and has process ID (PID) 1. Processes managed by init are known as jobs. The configuration files of these jobs are usually located in /etc/init, unless overridden.

The init process has a run level associated with it, which determines which processes are executed at system startup. The run levels are listed below:

0	Halt
1	Single-user mode
2	Local Multiuser with Networking but without network service (like NFS)
3	Full Multiuser with Networking
4	Not Used
5	Full Multiuser with Networking and X Windows(GUI)
6	Reboot

For example, to reboot the system, simply run ‘init 6’ (or ‘telinit 6’) as root. Run a sync before that, as shown below:

$ sync
$ sudo telinit 6
Connection to 10.0.0.4 closed by remote host.
Connection to 10.0.0.4 closed.

There is a special run level S, which is not really meant to be used directly, but rather by the scripts that are executed when entering run level 1. We can switch from one run level to another using telinit. Services that do not belong to the new run level are stopped, and the ones it requires are started. This is performed by the /etc/init.d/rc script, which is executed on a change of run level. This script examines the symlinks in the /etc/rc?.d directories: symlinks beginning with K are services to be stopped, and symlinks beginning with S are services to be started. For example:

$ ls -lrt /etc/rc5.d/
total 4
-rw-r--r-- 1 root root 677 2011-03-28 22:10 README
lrwxrwxrwx 1 root root  18 2011-04-28 14:23 S70pppd-dns -> ../init.d/pppd-dns
lrwxrwxrwx 1 root root  19 2011-04-28 14:23 S70dns-clean -> ../init.d/dns-clean
lrwxrwxrwx 1 root root  15 2011-04-28 14:23 S50saned -> ../init.d/saned
lrwxrwxrwx 1 root root  15 2011-04-28 14:23 S50rsync -> ../init.d/rsync
lrwxrwxrwx 1 root root  20 2011-04-28 14:23 S50pulseaudio -> ../init.d/pulseaudio
lrwxrwxrwx 1 root root  19 2011-04-28 14:23 S25bluetooth -> ../init.d/bluetooth
lrwxrwxrwx 1 root root  27 2011-04-28 14:23 S20speech-dispatcher -> ../init.d/speech-dispatcher
lrwxrwxrwx 1 root root  20 2011-04-28 14:23 S20kerneloops -> ../init.d/kerneloops
lrwxrwxrwx 1 root root  18 2011-04-28 14:23 S99rc.local -> ../init.d/rc.local
lrwxrwxrwx 1 root root  18 2011-04-28 14:23 S99ondemand -> ../init.d/ondemand
lrwxrwxrwx 1 root root  21 2011-04-28 14:23 S99grub-common -> ../init.d/grub-common
lrwxrwxrwx 1 root root  22 2011-04-28 14:23 S99acpi-support -> ../init.d/acpi-support
lrwxrwxrwx 1 root root  24 2011-04-28 14:23 S90binfmt-support -> ../init.d/binfmt-support
lrwxrwxrwx 1 root root  14 2011-08-14 16:21 S75sudo -> ../init.d/sudo
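
The naming convention can be sketched in a few lines of shell. The snippet below only illustrates how names such as S20kerneloops are interpreted; it is not the actual /etc/init.d/rc script, and the classify_rc_link function and the K20example name are made up for this example.

```shell
# Illustrative only: interpret an rc?.d entry name the way the text describes.
# classify_rc_link is a hypothetical helper, not part of any init system.
classify_rc_link() {
    case "$1" in
        S*) echo "start" ;;   # S links: service is started in this run level
        K*) echo "stop" ;;    # K links: service is stopped in this run level
        *)  echo "ignore" ;;  # anything else (e.g. README) is not a service link
    esac
}

for name in S20kerneloops S99rc.local K20example README; do
    printf '%-16s %s\n' "$name" "$(classify_rc_link "$name")"
done
```

The two digits after the letter set the order in which the scripts are run within a run level.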

telinit: To change the run level, we use the telinit command:

Usage:

telinit [OPTION]...  RUNLEVEL

For example:

$ sudo telinit 5

telinit may also be used to send basic commands to init, such as Q or q to request that init reload its configuration.

To Shell Scripts and beyond

Shell scripts are good and very handy when it comes to Linux system management. A huge level of automation can be achieved with small shell scripts. One big advantage that I feel while using such scripts is that they are interpreted and not compiled (like Java), so a small change is quickly reflected in the running system. Moreover, I do not have to write 10 lines of code to read a file line by line and perform some operation. Just do a cat, pipe the output to another command and we are good. Or better, to search for something in a file, use grep. These small advantages, when combined, make shell scripts very fast to code and deploy. But that is not the complete story.

The main problem comes when we try to scale the scripts up to do complicated tasks. The scripts tend to become slower and slower as the number of commands involved increases. This happens because each external command is a process in itself, each with its own startup time and execution time. Besides, shell scripts are prone to errors, with huge costs if we do something like rm -rf *. I have, in a short span of time, been excited by scripts, committed costly mistakes, and waited 30 minutes or more for a script to complete (which took just 5 minutes when ported to Java).
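
To make the process-per-command cost concrete, here is a small sketch that adds the numbers 1 to 200 twice: once by launching the external expr command on every iteration, and once with the shell's built-in arithmetic. The loop bound and variable names are arbitrary. Both loops compute the same sum; timing each of them on your own machine should show the gap, which grows with the iteration count.

```shell
# Version 1: one external process (expr) per iteration.
sum_external=0
i=1
while [ "$i" -le 200 ]; do
    sum_external=$(expr "$sum_external" + "$i")   # starts a new expr process each time
    i=$((i + 1))
done

# Version 2: built-in arithmetic, no extra processes.
sum_builtin=0
i=1
while [ "$i" -le 200 ]; do
    sum_builtin=$((sum_builtin + i))
    i=$((i + 1))
done

echo "external: $sum_external  builtin: $sum_builtin"
```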

These things have forced me to look beyond shell scripts. I think such times come in the life of every Linux administrator, when shell scripts start looking more like a problem than a solution. I have experimented with Python, PHP, Groovy, Scala and Perl.

Being from a Java background, I personally found Groovy and Scala much easier to learn, and I continue to use them. For Java developers, picking up one of these languages should not be a hassle. But for absolute beginners, I would suggest PHP, because learning PHP also gives an extra edge due to its extensive use in web development. Once you are into PHP (or decide to skip it altogether), please take time to be amazed by the power of Python and Perl. These languages are well suited to replacing those non-scalable shell scripts with programs that are fast and easily manageable.

Finally, what it comes down to is this: pick one and dive deep. Each of the languages mentioned above can perform these tasks as well as the others. Also, I am yet to try my hand at Ruby!

Linux Desktop Environments

Linux and other Unix-like systems come with the flexibility of using different types of desktop environments. These desktop environments are graphical user interfaces (GUIs) which help users with a variety of tasks such as accessing and operating software, configuration, and system management. Normally, the piece that interacts with the hardware and the underlying machine-level interfaces is the X Window System. X is an intermediate hardware abstraction layer that entrusts another program, known as a window manager, with visualizing its elements. Desktop environments build on top of the window manager and provide extra utilities and applications for seamless user interaction.

A Linux-based system can support multiple desktop environments, the most famous ones being KDE and Gnome. These are heavy, fully fledged environments with a large number of utilities. There are several lightweight alternatives to KDE and Gnome, such as LXDE, Xfce, Motif, FVWM etc.

KDE is a free software community best known for the Plasma Desktop. This desktop environment is provided by default in distributions such as Mandriva, Kubuntu and PCLinuxOS. It is based on the Qt framework. Gnome is another desktop environment, based on GTK+, a cross-platform widget toolkit for creating graphical user interfaces. Gnome is the default desktop environment for distributions such as Ubuntu. Gnome and KDE have different UIs, menu navigation, file browsers, default email clients etc., and hence a different look and feel. But both are free and easy to use.

Gnome

KDE

While both KDE and Gnome are large desktop environments which require decent system resources, Linux users also have a choice of lightweight environments like Xfce, LXDE and ROX Desktop.

Xfce

LXDE

Basics of Common Internet Protocols

In computing, a protocol is a set of well-defined rules which govern the way devices communicate. On the Internet, communication is mostly, but not always, carried over TCP (Transmission Control Protocol) or UDP (User Datagram Protocol). This article covers only the basics of these Internet protocols.

TCP: Transmission Control Protocol, or the TCP, is one of the major protocols involved in internet communication. Websites, email, file transfer etc. rely on TCP for data transfer. This protocol is designed for reliability of communication.

UDP: User Datagram Protocol, or UDP, is designed for reduced time delay between communicating parties, at the cost of reliability and of the order in which messages are delivered. Torrents, online gaming applications, DNS etc. use this protocol to provide better real-time response.

HTTP: Hypertext Transfer Protocol, extensively used for websites, is a protocol for communication between a server and a client. While TCP, UDP etc. manage the transportation of data over the network, HTTP works closely with applications to provide them a communication element. HTTP uses another protocol to transmit data (mostly TCP).

FTP: File Transfer Protocol is meant for transferring files over TCP from one host to another. FTP supports both clear-text sign-in authentication and anonymous sign-in. There are various FTP clients available for download, both free and paid.

DNS: Domain Name System is used to translate human-readable names (like web addresses) into their corresponding IP addresses. These mappings are maintained in a distributed database on systems known as DNS servers.

DHCP: Dynamic Host Configuration Protocol is used by computer systems to automatically obtain an IP address over a network. The DHCP server does this by maintaining a small database of connected systems.

SMTP: Simple Mail Transfer Protocol is used for sending email from one host to another. It operates on port 25 for email delivery, and port 587 for new submissions from applications. Generally, port 25 is blocked for direct access from applications to prevent abuse.

POP: While SMTP is for sending emails, POP (Post Office Protocol) is used to retrieve emails from the server to the local email client. This is useful when using email clients like Outlook or Thunderbird, and not when using your web browser to check email from one of the email sites, like gmail.com.

IMAP: Internet Message Access Protocol is a relatively newer protocol when compared to POP, and is used to retrieve emails from mail server to local client. IMAP offers several advantages over POP, for example, downloading only header and not the full message unless required, thus saving bandwidth and time, multiple client connection to the same mailbox, server side searches etc.

SOCKS: Finally in this list is SOCKS, which relays traffic between a client and a server when proxies are involved. SOCKS5 is the version used when authentication is required while connecting to the proxy. One use of SOCKS is to allow connections to be made through a firewall. You would generally find this protocol extensively used in schools, colleges and offices. The name is derived from SOCKetS.

Screen in Linux

Screen is yet another way to keep processes running in the background (besides nohup), and comes in very handy when running processes that take a long time. Let us work through an example of how to use the screen command. Install screen if required using apt-get, yum or your default package manager, and start it by typing screen at the shell prompt.

Once it has started, you can open multiple screen windows in the same terminal. For example, I can run a script which sleeps for 1 hour before printing something:

#!/bin/bash

sleep 3600
echo "done"

I run it inside the screen session.

The script waits and waits. In the same terminal, I can create a new window, leaving this process running in the background, using the shortcut key ‘CTRL-a c’ (first press ctrl-a, and then c; this is called a key binding, explained later). I am presented with another window to work in, and I can run ‘top’ in it.

Once I am done using the top command, I can close the window with the exit command, or switch to another window with ctrl-a,n (next) or ctrl-a,p (previous). But let’s assume that while my script was running on a remote server, my connection broke and I lost my session. This is where the screen command-line options come in handy. We can use ‘screen -ls’ to see which sessions are available to reattach, and then reattach to the disconnected one with ‘screen -r’ (from another terminal).

I return to my old ‘top’ command, and by pressing ctrl-a,n I can see the script running as well, patiently waiting for the sleep to end.

Screen comes to the rescue when we are on an unreliable connection, or when we are using more than one computer and need to share the same terminal.

Screen has a lot of options and shortcuts. A few of the most used options are:

-d -r :   Reattach a session and if necessary detach it first.

-ls and -list : does not start screen, but prints a list of pid.tty.host strings and creation timestamps identifying your screen sessions. Sessions marked ‘detached’ can be resumed with “screen -r”. Those marked ‘attached’ are running and have a controlling terminal. If the session runs in multiuser mode, it is marked ‘multi’. Sessions marked as ‘unreachable’ either live on a different host or are ‘dead’. An unreachable session is considered dead when its name matches either the name of the local host, or the specified parameter, if any. See the -r flag for a description of how to construct matches. Sessions marked as ‘dead’ should be thoroughly checked and removed. Ask your system administrator if you are not sure. Remove sessions with the -wipe option.

-x :  Attach to a not detached screen session. (Multi display mode).  Screen refuses to attach from within itself.  But when cascading multiple screens, loops are not detected.

As mentioned above, screen also accepts shortcut keys for its commands (this is called key binding). Each one starts with ctrl-a, followed by a command key. To see screen’s help listing the key bindings, press ‘ctrl-a ?’.

history: Linux Command History with Examples

The history command, when used without any options, shows the history of commands run on the terminal. But that is not all that history and its associated libraries are known for. They give a big speed advantage to those who know how to use them well. You can search for a previously run command and rerun it without the effort of typing it again. History expansion, as it is called, introduces words from the history list into the input stream, making it easy to repeat commands, insert the arguments of a previous command into the current input line, or fix errors in previous commands quickly. The paragraph below, from the man page, best describes how this works.

History expansion is usually performed immediately after a complete line is read. It takes place in two parts. The first is to determine which line from the history list to use during substitution. The second is to select portions of that line for inclusion into the current one. The line selected from the history is the event, and the portions of that line that are acted upon are words. Various modifiers are available to manipulate the selected words. The line is broken into words in the same fashion as bash does when reading input, so that several words that would otherwise be separated are considered one word when surrounded by quotes. History expansions are introduced by the appearance of the history expansion character, which is ! by default. Only backslash (\) and single quotes can quote the history expansion character.

Let’s start with a few examples.

1. ! instructs the shell to start history substitution. For example, I can do

$ echo "how are you"
how are you

But once I have typed that, why do I need to type again and again? I can simply do :

!echo

and I get the same output. The thing to remember here is that !string substitutes the most recently run command that starts with string. So it is important to use this expansion only once you are sure what the substitution will be. !r can expand to ‘readlink file.txt’ or ‘rm *’, depending on which was run last. Be VERY sure what you ran last before using this.

2. !n : When you type history, you see numbers associated with the command in the order in which they were run. For example:

$ history
    1  dmesg
    2  ifconfig
    3  exit
    4  ifconfig
    5  exit
    6  ifconfig
    7  exit
    8  apt-get update
    9  sudo apt-get update
   10  sudo apt-get install ssh
   11  tcpdump -a
   12  sudo apt-get install tcpdump
   13  sudo tcpdump -a
   14  tcpdump
   15  sudo tcpdump
   16  top
  ....
  ....
  ....

If you want to run a particular command in the list, you can use !n to run the command associated with line n. For example, if I want to run command number 11 from the above list (tcpdump -a), I can simply write !11. You can also count in the reverse order if you want, starting from 1. For example, to run the last command again, you can easily use !-1. You can also use !! to rerun the last command.

3. The power of ctrl-r: Press ctrl-r before you start typing and see the magic. As you type, the commands start appearing from the history. The moment your desired command appears, simply hit enter. It is a very useful tool to run very long commands. Below is an example, where I pressed ctrl-r, and typed only ‘ech’. It gave me the entire command. Please note how it explicitly says ‘reverse-i-search’.

(reverse-i-search)`ech': echo "how are you"

4. ^string1^string2^: If you want to replace string1 with string2 in the last command, use this quick substitution. For example,

$ echo abc.txt
abc.txt

$ ^echo^cat^
cat abc.txt
how are you?
i am fine
how about you?

5. Control the history to be maintained: You can edit your .bash_profile file to set the size of the history and the file in which the commands are saved.

HISTSIZE=1000
HISTFILESIZE=1000
HISTFILE=~/.history_list

To avoid duplication of commands which occur together in the history, use

export HISTCONTROL=ignoredups

To remove duplicates from the entire history, use the following:

export HISTCONTROL=erasedups

You can also ask the history command not to remember a particular set of commands. For example, you may never want history to remember ‘rm -r *’. The trick is to export the following variables, and start your commands with a space, like ‘ rm -r *’ (there is a space before rm).

export HISTCONTROL=ignorespace

For finer control, you can use HISTIGNORE, as in the example below, to keep certain commands out of the history altogether.

export HISTIGNORE="rm:rm -r *"
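
One caveat worth noting: HISTCONTROL is a single variable holding a colon-separated list, so each of the HISTCONTROL export lines above would replace the value set by the previous one. A minimal sketch of a combined setup, as it might appear in .bash_profile, follows; the sizes and the particular mix of options are just an example.

```shell
# Example history settings; the values are illustrative, adjust to taste.
HISTSIZE=1000                          # commands kept in memory
HISTFILESIZE=1000                      # lines kept in the history file
# ignoreboth = ignoredups + ignorespace; erasedups removes older duplicates.
export HISTCONTROL=ignoreboth:erasedups
# Colon-separated patterns; a command line that matches one in full is not saved.
export HISTIGNORE="rm:rm -r *"

echo "HISTCONTROL=$HISTCONTROL"
```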

To clear all your previous history, use

history -c

6. Word designators: To choose the n-th word from a history entry, use the separator ‘:’. A : separates the event specification from the word designator. It may be omitted if the word designator begins with a ^, $, *, -, or %. Words are numbered from the beginning of the line, with the first word being denoted by 0 (zero). Words are inserted into the current line separated by single spaces. Let’s take some examples:

To select the nth word from a history entry, use:

$ ls -ltr a*
-rw-r--r-- 1 ubuntu ubuntu 38 2011-07-31 09:16 abc.txt

$ echo !ls:2
echo a*
abc.txt

To select a range of words, use the following (this assumes the most recent ls command was ‘ls -ltr a* b*’):

$ echo !ls:2-3
echo a* b*
abc.txt bcd.txt

To select all the words except the command name (word 0), you can also use

$ ls a* b*
abc.txt  bcd.txt
$ echo !ls:*
echo a* b*
abc.txt bcd.txt

7. Substitution: After you have chosen words from the history, you can replace old text with new text using s/old/new/. The substitution is separated from the word designator, again, with ‘:’. For example:

$ ls a*
abc.txt
$ echo !ls:1:s/a/b
echo b*
bcd.txt

A :& can be used to repeat the previous substitution:

$ ls a*
abc.txt
$ echo !ls:1:s/a/b
echo b*
bcd.txt
$ cat !ls:1:&
cat b*
how are you?
i am fine
how about you?

Basic Linux Commands a newbie should know – Part II

We are continuing from part I.

kill and pkill: kill is used to send signals to a process; these include HUP, INT, KILL, STOP, CONT, and 0. kill can also accept the numeric code of the signal. For example, to forcibly terminate a process, we can use kill -9 or kill -KILL. It is a pretty dangerous command to run on live servers and should be used responsibly. pkill, on the other hand, searches the list of current processes and signals those which match a particular expression. You can reverse the selection with the -v option. Note that kill -9 -1, shown below, sends SIGKILL to every process you are allowed to signal, so do not try it on a machine you care about. Example:

kill -9 -1
pkill -9 tinyproxy
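
A safer way to experiment with signals is to start a throwaway process and signal only that one. The sketch below uses a background sleep as a stand-in for a real job; the 60-second duration and the variable names are arbitrary. It uses signal 0 to check that the process exists, then sends TERM and collects the exit status.

```shell
# Start a disposable background process and remember its PID.
sleep 60 &
pid=$!

# Signal 0 sends nothing; it only checks that the process can be signalled.
if kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is running"
fi

# Ask the process to terminate (TERM is also the default signal for kill).
kill -TERM "$pid"

# wait returns 128 + signal number when the child was killed by a signal.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```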

wc and nl: wc is used for counting words. You can pass it a list of file names, and it will report the count of lines, words and bytes in each of the files, with a total at the end. It can also take its input from a stream. It accepts parameters such as -w to report only the word count, -l to report only the line count, etc. Example:

$ wc abc1.txt abc.txt
  78464  392320 2510848 abc1.txt
 537941  537942 3227648 abc.txt
 616405  930262 5738496 total
$ wc -l abc1.txt abc.txt
  78464 abc1.txt
 537941 abc.txt
 616405 total
$ cat abc.txt | wc
 537941  537942 3227648

nl is used to print the lines of a file prefixed with their line numbers. Example:

$ nl abc.txt  | head
     1  hahah
     2  hahah
     3  hahah
     4  hahah
     5  hahah
     6  hahah
     7  hahah
     8  hahah
     9  hahah
    10  hahah
$ nl abc.txt  | tail
	537933  hahah
	537934  hahah
	537935  hahah
	537936  hahah
	537937  hahah
	537938  hahah
	537939  hahah
	537940  hahah
	537941  hahah
	537942  ha

nohup: nohup runs a command so that it ignores the hangup signal. This is generally used to run background processes, server processes and tasks which take a long time. After starting it in the background, you can type fg to bring the process to the foreground, as long as you are connected to the same terminal, and kill it with ctrl-c if you want. Once you leave the terminal, you have to find its PID (for example with ps) and stop it with kill. By default, when run from a terminal, nohup appends stdout and stderr to the nohup.out file, but that can be overridden by redirecting the output to a file using >. Example:

$ nohup tcpdump -a > tcpdump.out &
nohup: ignoring input and redirecting stderr to stdout

pwd: Print the full filename of the current working directory. Example:

$ pwd
/tmp

cat and tac: cat concatenates one or more files and prints them to stdout (the screen, another file or another command). It can be used with the -n option to print line numbers. tac also prints the file, but with the lines in reverse order. Example:

$ cat abc.txt
how are you?
i am fine
how about you?
$ tac abc.txt
how about you?
i am fine
how are you?

alias: Tired of typing the same commands over and over again? The alias command comes to your rescue. With alias, you can associate a short keyword with a command, which the shell will expand when executing it. For example, ls -ltr can be aliased to a short l. Type l at the command prompt, and you execute ls -ltr. Commonly used aliases are put in the .bashrc file located in your home directory. Example:

$ alias l='ls -ltr'
$ l
total 40
-rw-r--r-- 1 ubuntu ubuntu  179 2011-04-28 14:29 examples.desktop
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-04-28 14:42 Videos
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-04-28 14:42 Templates
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-04-28 14:42 Public
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-04-28 14:42 Pictures
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-04-28 14:42 Music
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-04-28 14:42 Downloads
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-04-28 14:42 Documents
drwxr-xr-x 2 ubuntu ubuntu 4096 2011-07-27 17:22 Desktop
-rw-r--r-- 1 ubuntu ubuntu   38 2011-07-31 09:16 abc.txt

Basic Linux Commands a newbie should know – Part I

I assume you know a few very basic commands like ls, rm, cd, mkdir etc. Below is the next set of newbie commands that you will find useful on your journey through the Linux world.

which: This command shows the full path of the executable that would be run if its name were typed directly at the shell. For example:

$ which ls
/bin/ls

The full path to the ls command is /bin/ls.

If you pass the -a option, it shows all the matching executables in the path, not just the first one.
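
Conceptually, this kind of lookup amounts to walking the directories listed in PATH in order. The following is a rough sketch of that idea, not the actual implementation of which; find_in_path is a made-up name.

```shell
# Hypothetical helper: print the first executable called $1 found in PATH.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    set -- $PATH          # split PATH on colons into the positional parameters
    IFS=$old_ifs
    for dir in "$@"; do
        candidate="${dir:-.}/$name"   # an empty PATH entry means the current directory
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1              # not found in any PATH directory
}

find_in_path sh
```

Removing the early return and printing every match would mimic the behaviour of the -a option.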

w, who and whoami: w shows the users who are currently logged in, along with what they are running. who also displays information about the logged-in users, but comes with more options. whoami shows the current user’s username. Examples:

$ w
 17:11:07 up  4:06,  4 users,  load average: 0.12, 0.21, 0.14
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
ubuntu   tty7     :0               Wed11    3days  2:12   0.85s gnome-session --session=ubuntu
ubuntu   pts/0    :0.0             Wed17    7:50   2.14s 21.78s gnome-terminal
ubuntu   pts/1    10.0.0.2         Wed19    2days  2.60s  2.60s -bash
ubuntu   pts/2    10.0.0.2         17:03    0.00s  3.06s  0.08s w
$ who -a
           system boot  2011-07-27 11:12
           run-level 2  2011-07-27 11:12
LOGIN      tty4         2011-07-27 11:12               605 id=4
LOGIN      tty5         2011-07-27 11:12               612 id=5
LOGIN      tty2         2011-07-27 11:12               624 id=2
LOGIN      tty3         2011-07-27 11:12               626 id=3
LOGIN      tty6         2011-07-27 11:12               629 id=6
LOGIN      tty1         2011-07-27 11:12               886 id=1
ubuntu   + tty7         2011-07-27 11:13  old         1147 (:0)
ubuntu   + pts/0        2011-07-27 17:10 00:08        2260 (:0.0)
ubuntu   + pts/1        2011-07-27 19:50  old         5755 (10.0.0.2)
           pts/2        2011-07-27 17:33                 0 id=/2    term=0 exit=0
           pts/3        2011-07-27 20:19              4282 id=ts/3  term=0 exit=0
ubuntu   + pts/2        2011-07-30 17:03   .          6721 (10.0.0.2)
$ whoami
ubuntu

finger and pinky: finger displays a user’s system information such as real name, terminal name, write status, idle time, login time, office location and office phone number. pinky is a lightweight finger, with a lot of options to exclude unwanted information. Examples:

$ finger -s
Login     Name       Tty      Idle  Login Time   Office     Office Phone
ubuntu    Ubuntu     tty7       3d  Jul 27 11:13 (:0)
ubuntu    Ubuntu     pts/0      14  Jul 27 17:10 (:0.0)
ubuntu    Ubuntu     pts/1      2d  Jul 27 19:50 (10.0.0.2)
ubuntu    Ubuntu     pts/2          Jul 30 17:03 (10.0.0.2)
$ pinky
Login    Name                 TTY      Idle   When             Where
ubuntu   Ubuntu               tty7     3d     2011-07-27 11:13 :0
ubuntu   Ubuntu               pts/0    00:15  2011-07-27 17:10 :0.0
ubuntu   Ubuntu               pts/1    2d     2011-07-27 19:50 10.0.0.2
ubuntu   Ubuntu               pts/2           2011-07-30 17:03 10.0.0.2

‘pinky -q’ omits the user’s full name, remote host and idle time in the short format:

$ pinky -q
Login     TTY      When
ubuntu    tty7     2011-07-27 11:13
ubuntu    pts/0    2011-07-27 17:10
ubuntu    pts/1    2011-07-27 19:50
ubuntu    pts/2    2011-07-30 17:03

whatis and apropos: whatis shows a one-line description of a particular command. apropos searches the names and short descriptions of the manual pages for the given keyword and prints the matches. Example:

$ whatis who
who (1)              - show who is logged on
$ apropos who
at.allow (5)         - determine who can submit jobs via at or batch
at.deny (5)          - determine who can submit jobs via at or batch
bsd-from (1)         - print names of those who have sent mail
from (1)             - print names of those who have sent mail
w (1)                - Show who is logged on and what they are doing.
w.procps (1)         - Show who is logged on and what they are doing.
who (1)              - show who is logged on
whoami (1)           - print effective userid
whois (1)            - client for the whois directory service

bc: bc is a calculator with arbitrary precision. Example:

$ bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
10+20
30

df and du: Both are used to show disk space. df displays the amount of disk space available on the file system containing each file name argument. If no file name is given, the space available on all currently mounted file systems is shown. du, on the other hand, summarizes the disk usage of each file, recursively for directories. Example:

$ df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             18062108   2567244  14577360  15% /
none                    182684       632    182052   1% /dev
none                    189296       252    189044   1% /dev/shm
none                    189296       108    189188   1% /var/run
none                    189296         0    189296   0% /var/lock
/dev/sr0                 43904     43904         0 100% /media/VBOXADDITIONS_4.1.0_73009
$ du -kh
148K    ./.pulse
16K     ./.gnome2/accels
8.0K    ./.gnome2/gedit
..
..
..
260K    ./.cache
8.0K    ./.dbus/session-bus
12K     ./.dbus
56M     .


tcpdump : A Step Further

tcpdump is a packet analyzer that prints out the headers of packets on a network interface that match a boolean expression. We have the option of saving packets to a file with -w, or reading from a saved packet file with -r.

In Linux, tcpdump requires you to be root to run it, or it to be installed setuid to root. Let’s take a few examples to better understand tcpdump.

tcpdump -A: This prints packets in ASCII, which makes analyzing web traffic a bit easier. For example, the headers of a request to google.co.in can be captured with tcpdump -A.

tcpdump -i INTERFACE: This option allows you to listen on a specified interface.

tcpdump -vA: -v enables tcpdump to print slightly more verbose output.

Using expressions in tcpdump: tcpdump can filter the output based on a boolean expression supplied in the options. For example, to capture all the packets between local machine and destination 74.125.236.52 port 80, we can use:

# tcpdump dst 74.125.236.52 and port 80

listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

17:38:00.515784 IP ubuntu-VirtualBox.local.51548 > 74.125.236.52.www: Flags [P.], seq 2679244043:2679244658, ack 39769353, win 2003, options [nop,nop,TS val 551982 ecr 108580187], length 615

17:38:00.627547 IP ubuntu-VirtualBox.local.51548 > 74.125.236.52.www: Flags [.], ack 1419, win 2003, options [nop,nop,TS val 552010 ecr 108587658], length 0

17:38:00.627598 IP ubuntu-VirtualBox.local.51548 > 74.125.236.52.www: Flags [.], ack 2837, win 1964, options [nop,nop,TS val 552010 ecr 108587658], length 0

tcpdump -tttt can be used for more readable timestamps.

# tcpdump -tttt dst 74.125.236.52 and port 80

listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes 

2011-07-27 17:40:04.909035 IP ubuntu-VirtualBox.local.51554 > 74.125.236.52.www: Flags [S], seq 1108702963, win 14600, options [mss 1460,sackOK,TS val 583081 ecr 0,nop,wscale 5], length 0

2011-07-27 17:40:04.945123 IP ubuntu-VirtualBox.local.51554 > 74.125.236.52.www: Flags [.], ack 2766379160, win 457, options [nop,nop,TS val 583090 ecr 108712048], length 0

You can also match on the source in a tcpdump boolean expression using src, on the gateway using gateway, and on a range of ports using portrange port1-port2. More expressions are described below:

less length : True if the packet has a length less than or equal to length.  This is equivalent to:
len <= length.

greater length: True if the packet has a length greater than or equal to length.  This is equivalent to:
len >= length.

host host: True if either the IPv4/v6 source or destination of the packet is host.

The expressions can be combined using the following operators:

‘!’ or ‘not’
‘&&’ or ‘and’
‘||’ or ‘or’

If you use parentheses in the expressions, they must be escaped or quoted, because they are special to the shell.
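
As a sketch of what that means in practice, the snippet below keeps a filter containing parentheses in a single-quoted string so that the shell passes it to tcpdump untouched. The function names, and the eth0 interface and capture file path in the usage note, are assumptions for illustration. The functions are only defined here, not called, since actually running tcpdump needs root.

```shell
# Filter with grouping; single quotes stop the shell from interpreting ( and ).
filter='dst 74.125.236.52 and (port 80 or port 443)'

# Hypothetical helper: save matching packets from an interface to a file (-w).
capture_to_file() {
    # $1 = interface, $2 = output file
    tcpdump -i "$1" -w "$2" "$filter"
}

# Hypothetical helper: read a saved capture back (-r) with readable timestamps.
read_capture() {
    # $1 = capture file
    tcpdump -tttt -r "$1"
}

echo "filter: $filter"
```

As root, something like capture_to_file eth0 /tmp/http.pcap, followed later by read_capture /tmp/http.pcap, would exercise both helpers.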

More examples from man page:

To print the start and end packets (the SYN and FIN packets) of each TCP conversation that involves a non-local host:

tcpdump 'tcp[tcpflags] & (tcp-syn|tcp-fin) != 0 and not src and dst net localnet'

To print all IPv4 HTTP packets to and from port 80, i.e. print only packets that contain data, not, for example, SYN and FIN packets and ACK-only packets:

tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
