Mohammedz.com

For Linux and Shell scripting.



Reading inputs from a file (line by line or by fields)

Always use a while loop to read input lines from a text file. A for loop is an alternative that some people use, but it’s not always reliable. We will come to those points after discussing the usage of while loops.

Note: I have purposely left awk out here, as it’s a different tool. I will add another post exclusively for that. In the meantime, you can find some examples here in this post.

while read aline
do
    echo Input Line: "$aline"
done < input.txt

Here is how you can write the above script in a single line.

while read aline; do echo Input Line : "$aline"; done < input.txt
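One detail about read is worth knowing: without options it strips leading and trailing whitespace and treats backslashes as escape characters. Here is a minimal sketch of a stricter variant, assuming you want every line exactly as it is stored in the file. The function name print_lines_verbatim is made up for this illustration.

```shell
#!/bin/bash

# Hypothetical helper (not from the original post): print every line of
# the file given as $1 exactly as stored, wrapped in [ ] so that the
# whitespace is visible.
print_lines_verbatim() {
    # IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
    # The '|| [ -n "$aline" ]' part also handles a last line that has
    # no trailing newline.
    while IFS= read -r aline || [ -n "$aline" ]; do
        printf '[%s]\n' "$aline"
    done < "$1"
}

# Demo with a throwaway file
demo=$(mktemp)
printf '  indented line\nback\\slash\n' > "$demo"
print_lines_verbatim "$demo"
rm -f "$demo"
```

The demo prints [  indented line] and [back\slash]; with a plain read the indentation and the backslash would both be lost.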


Now, what if you have multiple fields in the input file and you want to work with those fields?

Here is an example where the input file has multiple fields with different data and you want to find the largest value of a specific field. This sample file has 4 fields and you want to find the largest value in the 3rd field, then report the matching entry.

Notes: The input file doesn’t contain headings; I included them below only for your understanding. If your input file does have them, you have to remove them before you start the actual processing (using sed, head, tail, etc.). Also, the input file is a regular text file, so the fields are separated by spaces or tabs.

Name Matches Runs Wickets
Sachin 450 18000 150
Ganguly 300 11000 100
Azharuddin 334 9300 12
Dravid 344 10880 4

#!/bin/bash

largest=0

while read name matches runs wickets; do
    if [ "$largest" -lt "$runs" ]; then
        largest=$runs
        player=$name
    fi
done < file.txt

echo highest runs = $largest
echo player name = $player
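If your file does carry the heading row, one way to skip it without a separate sed or tail step is to read it once before the loop starts. This is only a sketch under that assumption; the function name top_scorer is invented for the example.

```shell
#!/bin/bash

# Hypothetical helper: print "name runs" for the highest value in the
# 3rd field of the file given as $1, whose first line is a heading row.
top_scorer() {
    largest=0
    player=
    {
        read -r heading          # consume and discard the heading line
        while read -r name matches runs wickets; do
            if [ "$runs" -gt "$largest" ]; then
                largest=$runs
                player=$name
            fi
        done
    } < "$1"
    echo "$player $largest"
}

demo=$(mktemp)
cat > "$demo" <<'EOF'
Name Matches Runs Wickets
Sachin 450 18000 150
Ganguly 300 11000 100
Azharuddin 334 9300 12
Dravid 344 10880 4
EOF
top_scorer "$demo"    # prints: Sachin 18000
rm -f "$demo"
```

The braces group both read commands under one redirection, so the loop continues from the second line of the same file.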

Why you shouldn’t read lines using a for loop.

Here are a few drawbacks of using for loops.

  1. A for loop always ignores blank lines.
  2. When using the simple method (without setting IFS), the input lines are split into words.
  3. The shell might expand glob characters if your input file contains them, which results in unexpected output. You can avoid this with the “set -f” option.
  4. The while loop reads one line at a time from the input stream, but with $(<input.txt) the for loop reads the entire file into memory, so performance will be poor when working with huge files.

Here is an example usage and output:

bash$ IFS=$'\n'; set -f; for i in $(<input.txt); do echo "$i"; done; set +f; unset IFS
sample input line
*
#$@
bash$
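The first drawback in the list is easy to verify. The sketch below counts how many items each kind of loop sees for the same file. The function names are made up for the illustration, and cat is used in place of $(<file) only to keep the sketch portable.

```shell
#!/bin/bash

# Count the items a for loop sees in file $1 (hypothetical helper).
count_with_for() {
    n=0
    old_ifs=$IFS
    # Split on newlines only (a literal newline between the quotes).
    IFS='
'
    set -f              # do not expand * and friends
    for i in $(cat "$1"); do
        n=$((n + 1))
    done
    set +f
    IFS=$old_ifs
    echo "$n"
}

# Count the lines a while loop sees in file $1 (hypothetical helper).
count_with_while() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
    done < "$1"
    echo "$n"
}

demo=$(mktemp)
printf 'one\n\ntwo\n*\n' > "$demo"
echo "for loop:   $(count_with_for "$demo") lines"     # 3, the blank line is lost
echo "while loop: $(count_with_while "$demo") lines"   # 4
rm -f "$demo"
```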



Find the largest number from a given file

Here is a shell script to find the highest/largest number from a given input file.

Notes: The input file, numbers.txt, contains exactly one number on each line. The numbers can be positive or negative.

awk '$0>x{x=$0}; END {print "largest number="x}' numbers.txt

Here is a different version using an explicit if statement.

awk '{if($0>x)x=$0}; END {print "largest number="x}' numbers.txt

The following awk one-liner gives you the position (line number) of the largest number as well. This might be useful at times.

awk '$0>x{x=$0; y=NR}; END {print "largest number="x"\nPosition(line number)="y}' numbers.txt
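A caveat for files that contain only negative numbers: the one-liners above compare against an unset x, which awk treats as 0, so no line ever qualifies as larger and the result comes out empty. Here is a sketch of one way around this, seeding x from the first line. The wrapper function name is made up.

```shell
#!/bin/bash

# Hypothetical wrapper: print the largest number in the file $1 and the
# line it is on, even when every number is negative.
largest_with_position() {
    # NR==1 takes the first line unconditionally; "+0" forces a
    # numeric comparison on both sides.
    awk 'NR==1 || $0+0 > x+0 {x=$0; y=NR}
         END {print "largest number=" x "\nPosition(line number)=" y}' "$1"
}

demo=$(mktemp)
printf '%s\n' -5 -2 -9 > "$demo"
largest_with_position "$demo"
rm -f "$demo"
```

For the demo file this reports -2 on line 2.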

Yet another version using explicit loops (a while loop and if statements):

#!/bin/bash

largest=0

while read number; do
    if [ "$largest" -lt "$number" ]; then
        largest=$number
    fi
done < numbers.txt

echo largest number = $largest

This version gives you the largest number along with its position (using a while loop and if statements):

#!/bin/bash

largest=0
count=1

while read number; do
    if [ "$largest" -lt "$number" ]; then
        largest=$number
        position=$count
    fi
    count=$((count+1))
done < numbers.txt

echo largest number = $largest
echo position \(line number\) = $position
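As a quick cross-check of the scripts above, the same answer can be obtained with sort. This is a different technique from the loops in this post, shown only as a sanity check; the function name is made up.

```shell
#!/bin/bash

# Hypothetical helper: largest number in file $1 via a numeric sort.
largest_by_sort() {
    sort -n "$1" | tail -n 1
}

demo=$(mktemp)
printf '%s\n' 12 -40 7 305 98 > "$demo"
largest_by_sort "$demo"     # prints 305
rm -f "$demo"
```

Keep in mind that sort has to process the whole file before it can print anything, so for very large inputs the single-pass loops are the lighter choice.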

Here is another scenario where the input file has multiple fields with different data and you want to find the largest value of a specific field. Here is a sample file which has 4 fields; you want to find the largest value in the 3rd field and then print the matching line.

Notes: The input file doesn’t contain headings; I included them below only for your understanding. If your input file does have them, you have to remove them before you start the actual processing (using sed, head, tail, etc.). Also, the input file is a regular text file, so the fields are separated by spaces or tabs.

Name Matches Runs Wickets
Sachin 450 18000 150
Ganguly 300 11000 100
Azharuddin 334 9300 12
Dravid 344 10880 4

awk '$3>x{x=$3; line=$0}; END {print line}' input.txt

If you want to print only the player name and runs (matched field), here it is.

awk '$3>x{x=$3; y=$1}; END {print y " " x}' input.txt
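If the heading row is present in the file, awk can skip it itself instead of relying on a separate sed, head or tail step. Here is a sketch, assuming the headings sit on line 1 only; the function name is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: print the line with the largest 3rd field from
# file $1, ignoring the heading row on line 1.
top_by_runs() {
    awk 'NR>1 && $3+0 > x {x=$3+0; line=$0} END {print line}' "$1"
}

demo=$(mktemp)
cat > "$demo" <<'EOF'
Name Matches Runs Wickets
Sachin 450 18000 150
Ganguly 300 11000 100
Azharuddin 334 9300 12
Dravid 344 10880 4
EOF
top_by_runs "$demo"    # prints: Sachin 450 18000 150
rm -f "$demo"
```

Without the NR>1 guard, the word in the heading row would be compared as text rather than as a number, which may give a misleading result.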

Here is a script for the same purpose using a while loop and if statements.

#!/bin/bash

largest=0

while read name matches runs wickets; do
    if [ "$largest" -lt "$runs" ]; then
        largest=$runs
        player=$name
    fi
done < file.txt

echo highest runs = $largest
echo player name = $player



What is a process?

A process is an instance of execution that runs on a processor. The process uses any resources that the Linux kernel can handle to complete its task.

All processes running on the Linux operating system are managed by the task_struct structure, which is also called a process descriptor. A process descriptor contains all the information necessary for a single process to run, such as process identification, attributes of the process, and the resources that make up the process. If you know the structure of the process, you can understand what is important for process execution and performance.

[Figure: the task_struct structure, which is also called the process descriptor]

Reference: IBM RedBooks – Linux Performance and Tuning Guidelines.

Sha-Bang (#!/bin/bash) line in Scripts

The sha-bang line at the beginning of the script tells your system that this file is a set of commands to be fed to the command interpreter indicated. Immediately following the sha-bang is a path name to the program that interprets the commands in the script.

The #! line in a shell script will be the first thing the command interpreter (sh or bash) sees. Since this line begins with a #, it will be correctly interpreted as a comment when the command interpreter finally executes the script. The line has already served its purpose – calling the command interpreter. If the script includes an additional #! line, then shell will interpret it as a comment.

Of course, the sha-bang (#!) line must be the very first line in the script. Even a blank line above it changes the meaning: the line is then treated as an ordinary comment. However, a space between #! and the path to the interpreter should still work.

How does it work?

Note: These are my assumptions after a series of tests. I haven’t seen it explained this way on any other site, and I couldn’t collect any proof from strace or similar tools.

As soon as the system finds the magic sha-bang line, it passes your script file as an argument to the command interpreter named there. (On Linux it is the kernel, not the shell, that reads this line when the file is executed.) Here is a simple test script which uses /bin/rm as its sha-bang. When you execute the script, it runs successfully (but the commands within the script won’t be executed). The return value of the execution will be 0 (success), but you won’t find your script afterwards: /bin/rm has already deleted it.

$ cat test2.sh
#!/bin/rm
# self destructing script
echo hello world
$
$ ls -l test2.sh
-rwxr--r-- 1 root root 27 May 29 20:47 test2.sh
$ ./test2.sh
$ echo $?
0
$ ls -l test2.sh
ls: test2.sh: No such file or directory
$

You can repeat this test with /bin/more as the sha-bang instead of /bin/rm. It will show the script contents (just as running more on the file from the command line would) instead of executing the script.
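A less destructive way to watch this mechanism is to use /bin/echo as the interpreter: echo simply prints its arguments, so the output shows what the interpreter was handed. This is a sketch that assumes /bin/echo exists on your system and that the temporary directory allows execution; the file name test3.sh is made up.

```shell
#!/bin/bash

# Build a throwaway script whose "interpreter" is /bin/echo.
dir=$(mktemp -d)
script="$dir/test3.sh"
printf '%s\n' '#!/bin/echo' 'echo hello world' > "$script"
chmod +x "$script"

# Running it prints the script's own path, not "hello world":
# the interpreter received the file name as its argument.
output=$("$script")
echo "$output"

rm -rf "$dir"
```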

Reference:

http://tldp.org/LDP/abs/html/sha-bang.html



SED: change/insert/append lines after matching a pattern

Do you want to change/insert/append lines after matching a pattern from a file? If yes, you can use sed to do that.

-----------------------------
I’m pasting the relevant parts from the sed manpage, followed by some examples.

a \
text - Append text, which has each embedded newline preceded by a backslash.

i \
text - Insert text, which has each embedded newline preceded by a backslash.

c \
text - Replace the selected lines with text, which has each embedded newline preceded by a backslash.
-----------------------------

Here is an example to show you the usage. You can either use it from command line or from within shell scripts.

Description of the example: The filename.txt contains 3 lines as shown below and I’m gonna do all manipulations by matching the pattern “second line”.

# cat > filename.txt
first line
second line
third line
#

Match the “second line” pattern and append “append line” after the matched line.
# sed '/second line/a\
append line
' filename.txt

Output of the above command:
first line
second line
append line
third line

Match the “second line” pattern and insert “insert line” before the matched line.
# sed '/second line/i\
insert line
' filename.txt

Output of the above command:
first line
insert line
second line
third line

Match the “second line” pattern and replace that line with “change line”.
# sed '/second line/c\
change line
' filename.txt

Output of the above command:
first line
change line
third line
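The manpage phrase about each embedded newline being preceded by a backslash matters as soon as you add more than one line. Here is a small sketch; the function name and the two appended lines are made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: append two lines after every line of file $1
# that matches "second line". Every line of the new text except the
# last one ends with a backslash.
append_two_lines() {
    sed '/second line/a\
append line 1\
append line 2
' "$1"
}

demo=$(mktemp)
printf 'first line\nsecond line\nthird line\n' > "$demo"
append_two_lines "$demo"
rm -f "$demo"
```

The output has five lines: the two appended lines appear between "second line" and "third line". The text inside the sed script is deliberately not indented, because any indentation could end up in the output.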

Hope this helps 🙂

~mohammed



Linux commands: cut and paste

cut and paste can be handy sometimes, especially when you have to manipulate
files based on rows and columns.

Simple usage of cut and paste:

file1
******
first second
first second
first second

file2
******
3 4
3 4
3 4

Let’s say I want to cut the first column of file2 and paste it as the last column
of file1. Here you go:

# cut -d" " -f1 file2 | paste -d" " file1 -
first second 3
first second 3
first second 3

What if the result from cut should be pasted at the beginning of file1?

# cut -d" " -f1 file2 | paste -d" " - file1
3 first second
3 first second
3 first second
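The same two commands can also place the column in the middle. Here is a sketch, assuming both files are space separated as in the examples above; the function name and the temporary files are made up for the illustration.

```shell
#!/bin/bash

# Hypothetical helper: put column 1 of file $2 between columns 1 and 2
# of file $1 (both space separated).
insert_in_middle() {
    left=$(mktemp)
    right=$(mktemp)
    cut -d' ' -f1 "$1" > "$left"
    cut -d' ' -f2 "$1" > "$right"
    # "-" stands for standard input, i.e. the output of the last cut.
    cut -d' ' -f1 "$2" | paste -d' ' "$left" - "$right"
    rm -f "$left" "$right"
}

f1=$(mktemp)
f2=$(mktemp)
printf 'first second\nfirst second\nfirst second\n' > "$f1"
printf '3 4\n3 4\n3 4\n' > "$f2"
insert_in_middle "$f1" "$f2"    # prints "first 3 second" three times
rm -f "$f1" "$f2"
```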

~mohammed



SNMP packages for Debian Machine

If you want to configure SNMP on a Debian machine, you should install the following packages. You can use apt-get to install all these packages.

-----------------------
libsensors3_1%3a2.10.1-3_i386.deb
libsnmp9_5.2.3-7etch2_i386.deb
libsnmp-base_5.2.3-7etch2_all.deb
libsysfs2_2.1.0-1_i386.deb
snmp_5.2.3-7etch2_i386.deb
snmpd_5.2.3-7etch2_i386.deb
-----------------------
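apt-get is normally given package names rather than .deb file names, and it picks the version and dependencies that your release provides. As a sketch, the names can be derived from the list above like this. The helper name is made up, and the apt-get command is only printed, not run.

```shell
#!/bin/bash

# Hypothetical helper: turn .deb file names (read from stdin, one per
# line) into plain package names.
deb_names() {
    while IFS= read -r deb; do
        # A .deb file is named name_version_arch.deb, so the package
        # name is everything before the first underscore.
        printf '%s\n' "${deb%%_*}"
    done
}

pkgs=$(printf '%s\n' \
    libsensors3_1%3a2.10.1-3_i386.deb \
    libsnmp9_5.2.3-7etch2_i386.deb \
    libsnmp-base_5.2.3-7etch2_all.deb \
    libsysfs2_2.1.0-1_i386.deb \
    snmp_5.2.3-7etch2_i386.deb \
    snmpd_5.2.3-7etch2_i386.deb | deb_names)

# $pkgs is unquoted on purpose, so the names are joined by single spaces.
echo apt-get install $pkgs
```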

~mohammed



SED: newline and embedded newline characters

The multiline Next (N) command creates a multiline pattern space by reading a new line of input and appending it to the contents of the pattern space. The original contents of the pattern space and the new input line are separated by a newline. The embedded newline character can be matched in patterns by the escape sequence “\n”. In a multiline pattern space, the metacharacter “^” matches the very first character of the pattern space, and not the character(s) following any embedded newline(s). Similarly, “$” matches only the final newline in the pattern space, and not any embedded newline(s). After the Next command is executed, control is then passed to subsequent commands in the script.

The Next (N) command differs from the next (n) command, which outputs the contents of the pattern space and then reads a new line of input. The next command doesn’t create a multiline pattern space.

This is taken from O’Reilly’s sed & awk.
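To see the difference in practice, here is a short sketch of both commands. The function names are made up, and the demo uses an even number of input lines because sed implementations differ in what N does when there is no next line.

```shell
#!/bin/bash

# N: append the next input line to the pattern space, then replace the
# embedded newline with a space, so pairs of lines are joined.
join_pairs() {
    sed 'N;s/\n/ /'
}

# n: with -n nothing is printed automatically; n swaps in the next
# line and p prints it, so only every second line comes out.
even_lines() {
    sed -n 'n;p'
}

# ^ and $ anchor to the whole two-line pattern space, not to each line.
mark_ends() {
    sed 'N;s/^/>/;s/$/</'
}

printf 'a\nb\nc\nd\n' | join_pairs    # prints "a b" and "c d"
printf 'a\nb\nc\nd\n' | even_lines    # prints "b" and "d"
printf 'a\nb\n' | mark_ends           # prints ">a" and "b<"
```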

~mohammed



How to remove non-printable/control characters from a file?

You may run into trouble with non-printable characters in your files. You can see such characters if you open the files in editors like vi. Even though commands like “cat” won’t display such non-printable characters on the console by default, you can’t remove them by redirecting “cat” output to a different file.

Here is a way to remove non-printable characters with a combination of sed and tr commands.

Step 1:
Use sed with the “l” (lower case L) command to print the file in a “visually unambiguous” form. From the sed output, find the notation of the character that needs to be removed.

# sed -n 'l' filename.txt

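Step 2:
Remove the control characters that you found in the sed output using “tr” or “sed”.

Suppose you want to remove the lines containing the form feed character “\f” from filename.txt. Use the command given below (“\f” inside a sed pattern is a GNU sed extension).

# sed '/\f/d' filename.txt

You can use any of the commands given below if you want to remove only the control characters themselves, not the entire lines containing them. Note that tr reads only standard input, so the file has to be redirected into it.

# tr -d '\f' < filename.txt
or
# sed 's/\f//g' filename.txt
or, to replace each form feed with a space instead of deleting it:
# tr '\f' ' ' < filename.txt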
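If the file contains several different control characters, removing them one by one gets tedious. Here is a sketch that deletes everything that is neither printable nor a newline or tab in one pass. The function name is made up, and since tr works on bytes this is only a safe choice for plain ASCII text; non-ASCII characters are likely to be stripped too.

```shell
#!/bin/bash

# Hypothetical helper: drop every byte that is neither printable nor a
# newline or tab. Reads stdin, writes stdout.
strip_control_chars() {
    # -c complements the set, -d deletes what is left in it.
    tr -cd '[:print:]\n\t'
}

# Show a sample line in sed's unambiguous form, then cleaned up.
printf 'page one\fpage two\007\n' | sed -n 'l'
printf 'page one\fpage two\007\n' | strip_control_chars
```

The first command makes the form feed and the bell visible as \f and \a; the second prints the line with both removed.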

The control characters in ASCII still in common use include:

* 0 (null, \0, ^@), originally intended to be an ignored character, but now used by many programming languages to mark the end of a string.
* 7 (bell, \a, ^G), which may cause the device receiving it to emit a warning of some kind (usually audible).
* 8 (backspace, \b, ^H), used either to erase the last character printed or to overprint it.
* 9 (horizontal tab, \t, ^I), moves the printing position some spaces to the right.
* 10 (line feed, \n, ^J), used as the end_of_line marker in most UNIX systems and variants.
* 12 (form feed, \f, ^L), to cause a printer to eject paper to the top of the next page, or a video terminal to clear the screen.
* 13 (carriage return, \r, ^M), used as the end_of_line marker in Mac OS, OS-9, FLEX (and variants). A carriage return/line feed pair is used by CP/M-80 and its derivatives including DOS and Windows, and by Application Layer protocols such as HTTP.
* 27 (escape, \e [GCC only], ^[).
* 127 (delete, ^?), originally intended to be an ignored character, but now used to erase a character (especially the one to the right of the cursor).

Read more about control characters at Wikipedia.

~mohammed