Sed Regular Expression

Using the file u3, do the following using sed, displaying the result on the screen
1. Output only the lines that contain cow
Answer: sed -n '/cow/p' u3
2. Delete any line that contains cow
Answer:
sed '/cow/d' u3
3. Change the first instance of * on each line to !
Answer:
sed 's/*/!/' u3
4. Change all occurrences of * on each line to !
Answer:
sed 's/*/!/g' u3
5. Output only the lines that contain either cow or calf
Answer:
sed -n -e '/cow/p' -e '/calf/p' u3
(A line that contains both words is printed twice, once by each p command.)
6. Output the file after changing cow to COW on lines 10-20
Answer:
sed '10,20s/cow/COW/g' u3
7. Output the entire file except lines 1-20
Answer:
sed '1,20d' u3
8. Delete any lines containing the string "news"
Answer:
sed '/news/d' u3
Examples using a three-line input file named file:
line 1 (one)
line 2 (two)
line 3 (three)
Command:
sed -e '1,2s/line/LINE/' file
Output:
LINE 1 (one)
LINE 2 (two)
line 3 (three)
9. Command:
sed -e '1,2d' file
Output:
line 3 (three)
10. Command:
sed -e '3d' file
Output:
line 1 (one)
line 2 (two)
11. Write a script to insert

12. Write a script to change
Sed from a file

If your sed script is getting long, you can put it into a file, like so:
# This file is named "sample.sed"
# comments can only appear in a block at the beginning
s/color/colour/g
s/flavor/flavour/g
s/theater/theatre/g
Then call sed with the "-f" flag:
sed -f sample.sed filename
Or, you can make an executable sed script:

#!/usr/bin/sed -f
# This file is named "sample2.sed"
s/color/colour/g
s/flavor/flavour/g
s/theater/theatre/g
then give it execute permissions:
chmod u+x sample2.sed
and then call it like so:
./sample2.sed filename
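A minimal sketch of the -f workflow described above. It assumes a throwaway directory created with mktemp; the names spelling.sed and input.txt are hypothetical stand-ins for sample.sed and filename.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# Hypothetical script file: one substitution per line, as in sample.sed.
cat > "$workdir/spelling.sed" <<'EOF'
s/color/colour/g
s/flavor/flavour/g
s/theater/theatre/g
EOF

# Hypothetical input file.
printf '%s\n' 'the color of the theater' 'a new flavor' > "$workdir/input.txt"

# Run every command in the script file against the input.
result=$(sed -f "$workdir/spelling.sed" "$workdir/input.txt")
printf '%s\n' "$result"

rm -r "$workdir"
```

The script file keeps the substitutions out of the shell's quoting rules, which is the main reason to prefer it once a script grows past a line or two.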
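Note that in sed's basic regular expressions several operators have to be escaped with backslashes:
curly braces \{ \},
round brackets \( \),
plus \+,
question mark \?
(\+ and \? are GNU extensions). The star is the exception: a bare * is the repetition operator, and \* matches a literal asterisk.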
Regular expression in sed

Special characters   Usage
^                    Matches the beginning of the line
$                    Matches the end of the line
.                    Matches any single character
*                    Matches zero or more occurrences of the preceding character (\* matches a literal asterisk)
\+                   Matches one or more occurrences of the preceding character
\?                   Matches zero or one occurrence of the preceding character
[ ]                  Matches any character enclosed in [ ]
[^ ]                 Matches any character not enclosed in [ ]
(character)\{m,n\}   Match m-n repetitions of (character)
(character)\{m,\}    Match m or more repetitions of (character)
(character)\{,n\}    Match n or less (possibly 0) repetitions of (character)
(character)\{n\}     Match exactly n repetitions of (character)
\(expression\)       Group operator. Also memorizes into numbered variables - use for backreference as \1 \2 .. \9
\n                   Backreference - matches nth group
&                    In the replacement part of s, stands for the whole matched text
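The group, backreference, & and interval entries in the table can be tried on one-line inputs. This is only a sketch; the sample strings and variable names are made up for illustration.

```shell
# \(...\) captures text; \1 and \2 in the replacement re-insert the captures.
swapped=$(printf '%s\n' 'hello world' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')
printf '%s\n' "$swapped"      # world hello

# & in the replacement stands for the whole matched text.
bracketed=$(printf '%s\n' 'id 42 and 7' | sed 's/[0-9][0-9]*/[&]/g')
printf '%s\n' "$bracketed"    # id [42] and [7]

# \{2,3\} allows two or three repetitions of the preceding digit class.
bounded=$(printf '%s\n' '1 22 333 4444' | sed 's/[0-9]\{2,3\}/X/g')
printf '%s\n' "$bounded"      # 1 X X X4
```

In the last line the four-digit run is consumed three digits at a time, which is why a single 4 is left over.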
Regular Expressions (character classes)

The following character classes are short-hand for matching sets of characters. They are used inside a bracket expression, for example [[:digit:]].
[:alnum:] Alphanumeric characters
[:alpha:] Alphabetic characters
[:blank:] Space and tab characters
[:cntrl:] Control characters
[:digit:] Numeric characters
[:graph:] Printable and visible (non-space) characters
[:lower:] Lowercase characters
[:print:] Printable characters (includes white space)
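A short sketch of two of these classes in use. Note the doubled brackets: the class name sits inside a bracket expression. The input strings are invented for the example.

```shell
# Print only the lines that contain at least one digit.
with_digits=$(printf '%s\n' 'abc 123' 'no numbers' 'x9' | sed -n '/[[:digit:]]/p')
printf '%s\n' "$with_digits"

# Squeeze each run of spaces and tabs into a single space.
squeezed=$(printf 'a \t  b\n' | sed 's/[[:blank:]][[:blank:]]*/ /g')
printf '%s\n' "$squeezed"     # a b
```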
The '^' character means the beginning of the line.
Example:
sed 's/^Thu /Thursday /' filename
will turn "Thu " into "Thursday ", but only at the beginning of the line.
Example:
sed -e '/^#/d' filename
deletes every line that begins with #.
Examples:
/[Uu]nix/!d        deletes lines that do not contain the word unix or Unix
6d                 deletes line 6
/^$/d              deletes all blank lines
1,10d              deletes lines 1 through 10
1,/^$/d            deletes from line 1 through the first blank line
/^$/,$d            deletes from the first blank line through the last line of the file
/^$/,10d           deletes from the first blank line through line 10
/^Co*t/,/[0-9]$/d  deletes from the first line that begins with Ct, Cot, Coot, Cooot, etc. through the first line that ends with a digit
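The address forms above can be checked against a tiny input. The following sketch assumes a six-line sample produced by a hypothetical helper function named sample; the second and fifth lines are empty.

```shell
# Hypothetical six-line input: header, blank, two body lines, blank, footer.
sample() { printf '%s\n' 'header' '' 'body one' 'body two' '' 'footer'; }

# Delete every blank line.
no_blanks=$(sample | sed '/^$/d')

# Delete from line 1 through the first blank line.
after_header=$(sample | sed '1,/^$/d')

# Delete from the first blank line through the end of the file.
header_only=$(sample | sed '/^$/,$d')

printf '%s\n' "$header_only"    # header
```

The last command uses $ as a line address (the last line), not as the end-of-line regular expression.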
`[a-zA-Z0-9]'
This matches any letter or digit.
`[a-zA-Z]'
This matches any letter.
Repetition using *
* means 0 or more of the previous single character pattern.
[abc]*     matches "aaaaa" or "acbca"
Hi Dave.*  matches "Hi Dave" or "Hi Daveisgoofy"
0*10       matches "010" or "0000010" or "10"
Repetition using +
+ means 1 or more of the previous single character pattern. In sed's basic regular expressions it is written \+ (a GNU extension); a bare + works only in extended mode (sed -E).
[abc]+     matches "aaaaa" or "acbca"
Hi Dave.+  matches "Hi Dave." but not "Hi Dave"
0+10       matches "010" or "0000010", does not match "10"
a\+b\+     matches one or more `a's followed by one or more `b's: `ab' is the shortest possible match, but other examples are `aaaab' or `abbbbb'
? Repetition Operator
? means 0 or 1 of the previous single character pattern. In sed's basic regular expressions it is written \? (a GNU extension).
x[abc]?x  matches "xax" or "xx"
A[0-9]?B  matches "A1B" or "AB", does not match "a1b" or "A123B"
`.\{9\}A$'
This matches an A that is the last character on a line, with at least nine preceding characters.
`^.\{15\}A'
This matches an A that is the 16th character on a line.
sed G myfile.txt > newfile.txt
In the above example, the G command double spaces the file myfile.txt and writes the result to newfile.txt.
sed = myfile.txt | sed 'N;s/\n/. /'
The above example outputs each line of myfile.txt preceded by its line number, a period and a space. As with the first example, the output could be redirected to another file using > and the file name.
sed 's/test/example/g' myfile.txt > newfile.txt
Reads the file myfile.txt, replaces every occurrence of "test" with "example", and writes the result to newfile.txt.
sed -n '$=' myfile.txt
This command counts the number of lines in myfile.txt and outputs the result.
Regular Expressions (cont)
/^M.*/                Line begins with capital M, 0 or more chars follow
/..*/                 At least 1 character long (/.+/ means the same thing in extended mode)
/^$/                  The empty line
ab|cd                 Either ab or cd (extended mode)
a(b*|c*)d             matches any string beginning with the letter a, followed by either zero or more of the letter b, or zero or more of the letter c, followed by the letter d (extended mode)
[[:space:][:alnum:]]  Matches any character that is either a white space character or alphanumeric.
Note:
Sed always tries to find the longest matching pattern in the
input. How would you match a tag in an HTML document?
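One possible answer to the question in the note, as a sketch: because .* is greedy, a pattern such as <.*> runs from the first < to the last > on the line, so a bracket expression that excludes > is used instead. The sample line is invented.

```shell
line='<b>bold</b> text'

# Greedy: <.*> swallows everything from the first < to the last >.
greedy=$(printf '%s\n' "$line" | sed 's/<.*>//')
printf '[%s]\n' "$greedy"       # [ text]

# [^>]* cannot cross a >, so each tag is matched on its own.
tags_removed=$(printf '%s\n' "$line" | sed 's/<[^>]*>//g')
printf '[%s]\n' "$tags_removed" # [bold text]
```

This only handles tags that open and close on the same line, which is all a line-oriented pattern can see.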
Grouping with parens

If you put a subpattern inside parens, you can apply + * and ? to the entire subpattern. In sed's basic regular expressions the parens are written \( and \).
a(bc)*d matches "ad" and "abcbcd", does not match "abcxd" or "bcbcd"
9. Append three exclamation points to the end of each line in u3 that contains student
10. Repeat the previous command, but only output the lines that you change.
11. If you wanted to actually change the original file for questions #3, 4, 6, 7, and 9, how would you do it?
9. sed '/student/s/$/!!!/' u3
10. sed -n '/student/s/$/!!!/p' u3
11. Save the output of the sed command in a temporary file and then use the mv command to rename it to the original. Never redirect output to the same file you are using for input within the same command or pipeline! Example (#9):
sed '/student/s/$/!!!/' u3 > xxx   # <-- the shell overwrites xxx BEFORE it starts sed
mv xxx u3
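A sketch of the temporary-file approach, run in a throwaway directory so no real u3 is touched. The two-line file contents are invented, and chaining mv with && is an added precaution so that the original is replaced only if sed succeeded.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for u3.
printf '%s\n' 'a student' 'a teacher' > "$workdir/u3"

# Write to a separate file first; rename over the original only on success.
sed '/student/s/$/!!!/' "$workdir/u3" > "$workdir/u3.tmp" &&
    mv "$workdir/u3.tmp" "$workdir/u3"

updated=$(cat "$workdir/u3")
printf '%s\n' "$updated"

rm -r "$workdir"
```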
6. Change all occurrences of cow to "cows and cows" using the parenthesis operators and \1 substitution
Answer:
sed 's/\(cow\)/\1s and \1s/g' u3
Using the file u3, do the following using sed, displaying the result on the screen
1. Output only the lines that contain MCIS
Answer: sed -n '/MCIS/p' u3
2. Delete any line that contains mcis
Answer:
sed '/mcis/d' u3
3. Change the first instance of * on each line to !
Answer:
sed 's/*/!/' u3
4. Change all occurrences of * on each line to !
Answer:
sed 's/*/!/g' u3
5. Output only the lines that contain either MCIS or VLSI
Answer:
sed -n -e '/MCIS/p' -e '/VLSI/p' u3
6. Output the file after changing mcis to MCIS on lines 10-20
Answer:
sed '10,20s/mcis/MCIS/g' u3
7. Output the entire file except lines 1-20

Answer:
sed '1,20d' u3
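11. Write a sed script that will take two words and a file name as input from the user.
Let the inputs be word1, word2, and filename. Write a script to insert word2 at every place word1 is present in the file.
Answer:
#!/bin/sh
# Prompt for the two strings and the file name (printf is used because echo -n is not portable).
printf 'Enter the string to which the new string is to be appended: '
read -r string1
printf 'Enter the string which is used to append: '
read -r string2
printf 'Enter the filename: '
read -r filename
# Assumed final step (missing from the original): append string2 after every match of string1.
sed "s/$string1/&$string2/g" "$filename"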
Print only lines of 65 characters or longer
sed -n '/^.\{65\}/p'
Print only lines of less than 65 characters
sed -n '/^.\{65\}/!p'   # method 1, corresponds to above
Print line number 52
sed -n '52p'   # method 1
sed '52!d'     # method 2
Print section of file between two regular expressions (inclusive)
sed -n '/Iowa/,/Montana/p'   # case sensitive
Print all of file EXCEPT section between 2 regular expressions
sed '/Iowa/,/Montana/d'
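The same one-liners can be tried on small inputs. In this sketch the length threshold is 5 instead of 65 so the sample stays readable; the input lines are invented.

```shell
# Lines of 5 characters or longer, and their complement.
long_lines=$(printf '%s\n' 'abc' 'abcdef' 'abcde' | sed -n '/^.\{5\}/p')
short_lines=$(printf '%s\n' 'abc' 'abcdef' 'abcde' | sed -n '/^.\{5\}/!p')

# Hypothetical list of states, one per line.
states() { printf '%s\n' 'Idaho' 'Iowa' 'Kansas' 'Montana' 'Nevada'; }

# The range includes both the starting and the ending line.
between=$(states | sed -n '/Iowa/,/Montana/p')
outside=$(states | sed '/Iowa/,/Montana/d')

printf '%s\n' "$between"
```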
The q or quit command

There is one more simple command that can restrict the changes to a set of lines. It is the "q" command: quit. The third way to duplicate the head command is:
sed '10 q'
which quits after the tenth line has been printed.
This command is most useful when you wish to abort the editing after some condition is reached.
The "q" command is the one command that does not take a range of addresses.
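A small sketch of q with a line-number address and with a pattern address. The line that triggers q is still printed, because printing happens before sed exits. The inputs are invented.

```shell
# Quit after line 3: behaves like "head -3".
first_three=$(printf '%s\n' 1 2 3 4 5 | sed '3q')
printf '%s\n' "$first_three"

# Quit at the first line that matches a pattern; that line is still printed.
until_stop=$(printf '%s\n' 'a' 'b' 'STOP' 'c' | sed '/STOP/q')
printf '%s\n' "$until_stop"
```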
Relationships between d, p, and !

As you may have noticed, there are often several ways to solve the same problem with sed. This is because print and delete are opposite functions, and it appears that "!p" is similar to "d," while "!d" is similar to "p." I wanted to test this, so I created a 20 line file, and tried every different combination. The following table, which shows the results, demonstrates the difference:
Relations between d, p, and !
Sed      Range   Command   Results
--------------------------------------------------------
sed -n   1,10    p         Print first 10 lines
sed -n   11,$    !p        Print first 10 lines
sed      1,10    !d        Print first 10 lines
sed      11,$    d         Print first 10 lines
--------------------------------------------------------
sed -n   1,10    !p        Print last 10 lines
sed -n   11,$    p         Print last 10 lines
sed      1,10    d         Print last 10 lines
sed      11,$    !d        Print last 10 lines
--------------------------------------------------------
sed -n   1,10    d         Nothing printed
sed -n   1,10    !d        Nothing printed
sed -n   11,$    d         Nothing printed
sed -n   11,$    !d        Nothing printed
--------------------------------------------------------
sed      1,10    p         Print first 10 lines twice, then next 10 lines once
sed      11,$    !p        Print first 10 lines twice, then last 10 lines once
--------------------------------------------------------
sed      1,10    !p        Print first 10 lines once, then last 10 lines twice
sed      11,$    p         Print first 10 lines once, then last 10 lines twice
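The experiment behind the table can be repeated in a few lines. This sketch generates the 20-line input with a hypothetical helper named lines and checks the first block of the table plus one of the "printed twice" rows.

```shell
# Hypothetical 20-line input: the numbers 1 to 20, one per line.
lines() { i=1; while [ "$i" -le 20 ]; do echo "$i"; i=$((i + 1)); done; }

# Four ways to print the first 10 lines.
a=$(lines | sed -n '1,10p')
b=$(lines | sed -n '11,$!p')
c=$(lines | sed '1,10!d')
d=$(lines | sed '11,$d')

if [ "$a" = "$b" ] && [ "$b" = "$c" ] && [ "$c" = "$d" ]; then
    echo "all four commands print the same 10 lines"
fi

# Without -n, p prints lines 1-10 a second time: 10 + 20 = 30 lines of output.
twice_count=$(lines | sed '1,10p' | wc -l | tr -d ' ')
echo "$twice_count"
```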
Obviously the command
sed '1,10 q'
cannot quit 10 times. Instead
sed '1 q'
or
sed '10 q'
is correct.
1. Delete lines that contain "O" at the beginning of the line.
Answer:
sed '/^O/d' list.txt
2. Translate capital C, R, O into small c, r, o
Answer:
sed 'y/CRO/cro/' list.txt
3. Delete empty lines
Answer:
sed '/^$/d' list.txt
4. Remove lines containing anything other than letters, numbers, or spaces
Answer:
sed '/[^0-9a-zA-Z ]/d' list.txt
Specifying a Range of Characters with [...]

If you want to match specific characters, you can use the square brackets to identify the exact characters you are searching for.
The pattern that will match any line of text that contains exactly one digit is
^[0123456789]$
This is verbose. You can use the hyphen between two characters to specify a range:
^[0-9]$
You can intermix explicit characters with character ranges. This pattern will match a single character that is a letter, number, or underscore:
[A-Za-z0-9_]
If you wanted to search for a word that
started with a capital letter "T",
was the first word on a line,
had a lower case letter as its second letter,
and had a vowel as its third letter,
the regular expression would be "^T[a-z][aeiou] " (the trailing space requires a space after the third letter).
Delete all lines NOT beginning with a, e, E or I

Answer:
sed '/^[^aeEI]/d' list.txt
You can easily search for all characters except those in square brackets by putting a "^" as the first character after the "[". To match all characters except vowels use "[^aeiou]".
Another example of repetition using *: /a*bc[e-g]*[0-9]*/

Matches:
aaaaabcfgh19919234
bc
abcefg123456789
abc45
Aabcggg87310
d*avid
Will match avid, david, ddavid, dddavid and any other word with repeated ds followed by avid
Compress all consecutive sequences of zeroes into a single zero.
Answer:
s/00*/0/g
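A quick check of this answer on an invented input line. The pattern is written 00* rather than 0* because 0* also matches the empty string.

```shell
# Each run of one or more zeroes collapses to a single zero.
squeezed=$(printf '%s\n' '1000 and 20003 and 0' | sed 's/00*/0/g')
printf '%s\n' "$squeezed"     # 10 and 203 and 0
```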
`a\?b'
Matches `b' or `ab' (\? means zero or one of the preceding character).
Match any character with .

The character "." is one of those special metacharacters. By itself it will match any character, except the end-of-line character.
The pattern that will match a line with a single character is ^.$
Any character (except a metacharacter!) matches itself.
"F." Matches an 'F' followed by any character.
"a.b" Matches 'a' followed by any one character followed by 'b'.
If you really want to match '.', you can use "\."
a\.b matches "a.b" but not "axb"
Matching a specified number of the pattern using the curly brackets {}

Using \{n\}, we match exactly that number of the previous expression.
If we want to match 'aaaa' then we could use: a\{4\}
This would match exactly four a's.
If we want to match the pattern 1999 in our file bazaar.txt, then we would do: sed -n '/19\{3\}/p' bazaar.txt
This should print all lines containing the pattern 1999 in the bazaar.txt file.
The following expression would match a minimum of four a's but a maximum of 10 a's in a particular pattern: a\{4,10\}
Let's say we wanted to match any character a minimum of 3 times, but a maximum of 7 times, then we could use a regular expression like: .\{3,7\}
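A sketch of the interval operator on invented input, including what happens when the backslashes are left out: in basic mode unescaped braces are ordinary characters, so the pattern then looks for the literal text 19{3}.

```shell
# 19\{3\} means a 1 followed by exactly three 9s (it also matches inside longer runs).
years=$(printf '%s\n' 'built in 1999' 'built in 199' 'year 19999' | sed -n '/19\{3\}/p')
printf '%s\n' "$years"

# Between four and ten consecutive a's.
four_to_ten=$(printf '%s\n' 'aaa' 'aaaa' | sed -n '/a\{4,10\}/p')
printf '%s\n' "$four_to_ten"  # aaaa

# Without the backslashes the braces are matched literally.
literal=$(printf '%s\n' '1999' '19{3}' | sed -n '/19{3}/p')
printf '%s\n' "$literal"      # 19{3}
```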
`\{I\}'
As `*', but matches exactly I sequences (I is a decimal integer; for portability, keep it between 0 and 255 inclusive).
`\{I,J\}'
Matches between I and J, inclusive, sequences.
`\{I,\}'
Matches more than or equal to I sequences.
`.\{9\}A$'
This matches nine characters followed by an `A'.
`^.\{15\}A'
This matches the start of a string that contains 16 characters, the last of which is an `A'.
`\(REGEXP\)'
Groups the inner REGEXP as a whole, this is used to:
* Apply postfix operators, like `\(abcd\)*': this will search for zero or more whole sequences of `abcd', while `abcd*' would search for `abc' followed by zero or more occurrences of `d'. Note that support for `\(abcd\)*' is required by POSIX 1003.1-2001, but many non-GNU implementations do not support it and hence it is not universally portable.
`REGEXP1\|REGEXP2'
Matches either REGEXP1 or REGEXP2.
Use parentheses to use complex alternative regular
expressions.
The matching process tries each alternative in turn,
from left to right, and the first one that succeeds is
used.
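A sketch of alternation applied to the "either word" kind of filter. It assumes a sed that accepts -E (extended regular expressions, where the operator is a bare |), which both GNU and BSD sed do; the input lines are invented.

```shell
# Hypothetical input: the third line contains both words.
animals() { printf '%s\n' 'one cow' 'a calf' 'cow and calf' 'a horse'; }

# One pattern with alternation: each matching line is printed once.
either=$(animals | sed -E -n '/cow|calf/p')
printf '%s\n' "$either"

# Two separate p commands print the line containing both words twice.
two_p_count=$(animals | sed -n -e '/cow/p' -e '/calf/p' | wc -l | tr -d ' ')
echo "$two_p_count"           # 4
```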
`N'
Add a newline to the pattern space, then append
the next line of input to the pattern space.
If there is no more input then SED exits without
processing any more commands.
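A sketch of N joining pairs of lines, and of the line-numbering pipeline that relies on it. The inputs are invented and deliberately have an even number of lines, since the behaviour of N on a final unpaired line differs between sed implementations.

```shell
# N appends the next line; the embedded newline is then replaced by a space.
joined=$(printf '%s\n' 'a' 'b' 'c' 'd' | sed 'N;s/\n/ /')
printf '%s\n' "$joined"

# "sed =" prints each line number on its own line; N then merges number and text.
numbered=$(printf '%s\n' 'x' 'y' | sed = | sed 'N;s/\n/. /')
printf '%s\n' "$numbered"
```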
File spacing:
double space a file
sed G filename
insert a blank line below every line which matches "regex"
sed '/regex/G'
count lines (emulates "wc -l")
sed -n '$='
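The three one-liners can be checked on a three-line input. This is a sketch; the helper name sample and its contents are made up.

```shell
# Hypothetical three-line input.
sample() { printf '%s\n' 'alpha' 'beta' 'gamma'; }

# G appends an empty line after every input line: 3 lines become 6.
spaced_count=$(sample | sed G | wc -l | tr -d ' ')

# With an address, only the matching line gets a blank line after it.
after_beta=$(sample | sed '/beta/G')

# $= prints the number of the last line, which is the line count.
line_count=$(sample | sed -n '$=')

printf '%s\n' "$after_beta"
```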
