Mohammedz.com

For Linux and Shell scripting.



SED: change/insert/append lines after matching a pattern

Do you want to change/insert/append lines after matching a pattern from a file? If yes, you can use sed to do that.

—————————–
I’m pasting the relevant parts from sed manpage followed by some examples.
a \
text – Append text, which has each embedded newline preceded by a backslash.

i \
text – Insert text, which has each embedded newline preceded by a backslash.

c \
text – Replace the selected lines with text, which has each embedded newline preceded by a backslash.
—————————–
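To make the "embedded newline preceded by a backslash" rule concrete, here is a minimal sketch that appends two lines in one go. The sample input and the appended text are made up for illustration; they are not from the manpage.

```shell
# Build a three-line sample input (hypothetical data).
input=$(printf 'first line\nsecond line\nthird line\n')

# Append two lines after the match. Every line of the appended text
# except the last one ends with a backslash, which keeps the text going.
multi_append=$(printf '%s\n' "$input" | sed '/second line/a\
append line 1\
append line 2
')

printf '%s\n' "$multi_append"
```

Both new lines should show up between "second line" and "third line". The same backslash rule applies to the i and c commands.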

Here is an example to show you the usage. You can either use it from command line or from within shell scripts.

Description of the example: The filename.txt contains 3 lines as shown below and I’m gonna do all manipulations by matching the pattern “second line”.

# cat > filename.txt
first line
second line
third line
#

Match the “second line” pattern and append “append line” after the matched line.
# sed '/second line/a\
append line
' filename.txt

Output of the above command:
first line
second line
append line
third line

Match the “second line” pattern and insert “insert line” before the matched line.
# sed '/second line/i\
insert line
' filename.txt

Output of the above command:
first line
insert line
second line
third line

Match the “second line” pattern and replace that line with “change line”.
# sed '/second line/c\
change line
' filename.txt

Output of the above command:
first line
change line
third line
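Since the same commands also work from within shell scripts, here is a rough sketch of how that might look with the pattern and the text held in variables. The names work_file, pattern and new_text are my own placeholders, and the sketch assumes the pattern contains no "/" and the text contains no backslashes.

```shell
# Hypothetical names: work_file, pattern, new_text.
work_file=$(mktemp)
printf 'first line\nsecond line\nthird line\n' > "$work_file"

pattern='second line'
new_text='line from script'

# Inside double quotes, \\ becomes a single backslash and the literal
# newline after it is kept, so sed receives:
#   /second line/a\
#   line from script
appended=$(sed "/$pattern/a\\
$new_text" "$work_file")

# Same idea with i (insert before) and c (replace the matched line).
inserted=$(sed "/$pattern/i\\
$new_text" "$work_file")

changed=$(sed "/$pattern/c\\
$new_text" "$work_file")

printf '%s\n' "$appended"
rm -f "$work_file"
```

Double quotes are needed here so the shell expands the variables; with single quotes sed would see the literal text $pattern.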

Hope this helps 🙂

~mohammed



SED: newline and embedded newline characters

The multiline Next (N) command creates a multiline pattern space by reading a new line of input and appending it to the contents of the pattern space. The original contents of pattern space and the new input line are separated by a newline. The embedded newline character can be matched in patterns by the escape sequence “\n”. In a multiline pattern space, the metacharacter “^” matches the very first character of the pattern space, and not the character(s) following any embedded newline(s). Similarly, “$” matches only the final newline in the pattern space, and not any embedded newline(s). After the Next command is executed, control is then passed to subsequent commands in the script.

The Next (N) command differs from the next (n) command, which outputs the contents of the pattern space and then reads a new line of input. The next command doesn’t create a multiline pattern space.
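A small sketch of the difference, using made-up one-letter input lines. The variable names are just for illustration.

```shell
# N: append the next input line to the pattern space, then join the
# pair by replacing the embedded newline with a space.
joined=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/ /')

# n: print the pattern space (suppressed here by -n) and replace it
# with the next input line. Together with p, this prints only every
# second line.
every_second=$(printf 'a\nb\nc\nd\n' | sed -n 'n;p')

# ^ and $ anchor to the two ends of the whole pattern space, not to
# the embedded newline in the middle.
anchored=$(printf 'a\nb\n' | sed 'N;s/^/>/;s/$/</')

printf '%s\n---\n%s\n---\n%s\n' "$joined" "$every_second" "$anchored"
```

The first command gives "a b" and "c d", the second gives "b" and "d", and the third marks only the start of "a" and the end of "b".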

This is taken from O’Reilly’s sed & awk.

~mohammed



How to remove non-printable/control characters from a file?

You may run into trouble with non-printable characters in your files. You can see such characters if you open your files in editors like vi. Commands like “cat” don’t show such characters visibly on the console by default, but they still pass them through unchanged, so you can’t remove them by redirecting “cat” output to a different file.

Here is a way to remove non-printable characters with a combination of sed and tr commands.

Step 1:
Use sed with the “l” (lower case L) command to print the file/line in a “visually unambiguous” form. From the sed output, find the character notation that needs to be removed.

# sed -n 'l' filename.txt
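As a quick illustration of what this step shows, here is a sketch that feeds sed a made-up line containing a form feed and a tab.

```shell
# A sample line containing a form feed and a tab (made-up data).
visible=$(printf 'page one\fpage\ttwo\n' | sed -n 'l')
printf '%s\n' "$visible"
```

This prints page one\fpage\ttwo$ where the control characters appear as escape sequences and the trailing $ marks the end of the line.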

Step 2:
Remove control characters that you found from sed output using “tr” or “sed”.

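Suppose you want to remove every line that contains the form feed character “\f” from filename.txt. Use sed for that (GNU sed understands “\f” in patterns):

# sed '/\f/d' filename.txt

Use any of the commands given below if you want to remove only the control characters, but not the entire lines containing them. Note that tr reads from standard input only, so the file has to be redirected into it.

# tr -d '\f' < filename.txt
or
# sed 's/\f//g' filename.txt

To replace each form feed with a space instead of deleting it:

# tr '\f' ' ' < filename.txt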
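If the file holds several kinds of control characters, something along these lines may be handier than removing them one by one. This is only a sketch: the sample data is invented, tr is fed through a redirect because it reads standard input only, and under LC_ALL=C the second command also strips non-ASCII bytes such as UTF-8 text.

```shell
sample=$(mktemp)
printf 'keep\fthis\nbell\007here\ttab\n' > "$sample"

# Delete only form feeds.
no_ff=$(tr -d '\f' < "$sample")

# Delete every byte that is not printable, a newline, or a tab.
# LC_ALL=C makes [:print:] mean printable ASCII only.
clean=$(LC_ALL=C tr -cd '[:print:]\n\t' < "$sample")

printf '%s\n' "$clean"
rm -f "$sample"
```

The first command leaves the bell character in place; the second removes both the form feed and the bell but keeps the tab and the line breaks.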

The control characters in ASCII still in common use include:

* 0 (null, \0, ^@), originally intended to be an ignored character, but now used by many programming languages to mark the end of a string.
* 7 (bell, \a, ^G), which may cause the device receiving it to emit a warning of some kind (usually audible).
* 8 (backspace, \b, ^H), used either to erase the last character printed or to overprint it.
* 9 (horizontal tab, \t, ^I), moves the printing position some spaces to the right.
* 10 (line feed, \n, ^J), used as the end-of-line marker in most UNIX systems and variants.
* 12 (form feed, \f, ^L), to cause a printer to eject paper to the top of the next page, or a video terminal to clear the screen.
* 13 (carriage return, \r, ^M), used as the end-of-line marker in Mac OS, OS-9, FLEX (and variants). A carriage return/line feed pair is used by CP/M-80 and its derivatives including DOS and Windows, and by Application Layer protocols such as HTTP.
* 27 (escape, \e [GCC only], ^[).
* 127 (delete, ^?), originally intended to be an ignored character, but now used to erase a character (especially the one to the right of the cursor).
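The decimal codes in this list can be checked from the shell. A minimal sketch, assuming the printf in your shell understands these escape sequences:

```shell
# Emit bell, backspace, tab, line feed, form feed and carriage return,
# then show each byte as an unsigned decimal number.
# xargs with no command just echoes its input, squeezing the whitespace.
codes=$(printf '\a\b\t\n\f\r' | od -An -tu1 | xargs)
printf '%s\n' "$codes"
```

This prints 7 8 9 10 12 13, matching the numbers given above.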

Read more about control characters at Wikipedia.

~mohammed



How can I remove all lines starting with a hash (#)?

To remove all lines starting with a hash sign (#) from a file, try this:
sed '/^#/d' filename

The above command prints the file without the lines starting with #; the file itself is not changed. If you want to write the output to another file, use output redirection (>):
sed '/^#/d' filename > newfile

But if you want to overwrite the file with the new output, use the -i option of sed (supported by GNU sed):
sed -i '/^#/d' filename
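Here is a small self-contained sketch of the three variants on a throwaway file. The file contents are invented, the pattern for indented comments is my own addition, and -i.bak (in-place edit that keeps a backup copy) is used instead of plain -i as a safer assumption.

```shell
conf=$(mktemp)
printf '# comment\nkey=value\n  # indented comment\nname=x # trailing\n' > "$conf"

# Print the file without the lines whose first character is #.
stripped=$(sed '/^#/d' "$conf")

# Variant (an addition, not from the post): also drop comment lines
# that are indented with spaces or tabs.
stripped_indented=$(sed '/^[[:space:]]*#/d' "$conf")

# Edit the file in place, keeping the original as "$conf.bak".
sed -i.bak '/^#/d' "$conf"
in_place=$(cat "$conf")

printf '%s\n' "$stripped_indented"
rm -f "$conf" "$conf.bak"
```

Note that none of these touch a # that appears after other text on a line, so "name=x # trailing" is kept as it is.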

Regards,

Mohammed.