Some configuration and output text formats contain sections like the following:
foo:
    value 1
    value 2
bar:
    value 3
This article presents two scripts that print all consecutive indented lines following a non-indented line that matches a given regular expression.
That is, given the single argument foo and the standard input above, the scripts should
- determine the line that matches foo and
- print the following two lines, but no other lines.
Also, the indentation of the printed lines should be removed.
Extracting Values from an „Indented Subsection“ Format
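The following script ./section.sh extracts the indented lines that follow a non-indented line matching the search pattern:
#!/bin/sh
# Print the indented lines that follow the first line matching $1.
awk -v s="$1" '{
    if($0 ~ s){
        # (getline) > 0 also ends the loop on a read error (-1)
        while((getline) > 0){
            # the first non-indented line ends the section
            if(!/^[ \t]/)
                exit;
            # remove the complete indentation of this line
            gsub(/^[ \t]+/,"")
            print
        }
    }
}'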
Executing this script as follows:
./section.sh << EOF
foo:
    value 1
    value 2
bar:
    value 3
EOF
results in the following output:
value 1
value 2
However, the script does not handle nested indentation in a useful manner, because it strips all leading white space from every printed line:
./section.sh << EOF
foo:
    bar:
        value 1
EOF
This renders:
bar:
value 1
The nested indentation is lost.
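The loss comes from the gsub call, which removes the entire leading white space of each line independently of the other lines. A minimal sketch that isolates this step; the function name strip_all is made up for this illustration and is not part of section.sh:

```shell
#!/bin/sh
# strip_all: hypothetical helper that mimics the gsub step of section.sh.
# It removes all leading blanks and tabs from every line on standard input.
strip_all() {
    awk '{ gsub(/^[ \t]+/, ""); print }'
}

# Two lines with different indentation depths end up at the same depth.
printf '    bar:\n        value 1\n' | strip_all
```

Both lines come out flush left, so the information that "value 1" belongs to "bar:" is gone.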
The following Perl script ./section.pl reads lines from STDIN until a line matches the first script argument $ARGV[0]. It then reads all consecutive indented lines into an array @lines, and, in the process, it determines the shortest common white-space prefix of those lines, $prefix. Finally, it left-strips $prefix from each element of @lines, printing the results line by line.
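#!/usr/bin/perl
use strict;
use warnings;
my @lines;
my $prefix = undef;
READ: while(<STDIN>) {
    # skip lines until one matches the search pattern
    next READ unless $_ =~ qr/$ARGV[0]/;
    while(<STDIN>) {
        # the first non-indented line ends the section
        last READ unless /^(\s+)/;
        push @lines, $_;
        # keep the shortest indentation seen so far
        $prefix = $1 unless defined $prefix;
        $prefix = $1 unless length($1)>length($prefix);
    }
}
# nothing to print if no indented line was found
exit unless @lines;
my $reg = qr/^$prefix/;
for(@lines) {
    s/$reg//;
    print
}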
This script preserves nested indentation; thus it can be used to form pipelines that extract more deeply nested values:
cat > /tmp/input << EOF
foo:
    bar:
        value 1
EOF
cat /tmp/input | ./section.pl foo | ./section.pl bar
The output is:
value 1
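The same idea, buffering the section and then cutting only the shortest indentation from every line, can also be sketched in sh and awk. This is an illustration under assumptions, not a replacement for section.pl: the function name section_awk is hypothetical, and unlike the Perl script the sketch only accepts a non-indented line as the start of a section and measures indentation in blanks and tabs only.

```shell
#!/bin/sh
# section_awk: hypothetical awk counterpart to section.pl (illustration only).
# Usage: section_awk PATTERN < input
section_awk() {
    awk -v s="$1" '
        # Still searching: wait for a non-indented line that matches s.
        !found { if ($0 !~ /^[ \t]/ && $0 ~ s) found = 1; next }
        # The first non-indented line after the match ends the section.
        !/^[ \t]/ { exit }
        {
            # Remember the line and track the shortest indentation width.
            match($0, /^[ \t]+/)
            if (n == 0 || RLENGTH < min) min = RLENGTH
            lines[++n] = $0
        }
        END {
            # Cut the same number of leading characters from every line.
            for (i = 1; i <= n; i++) print substr(lines[i], min + 1)
        }
    '
}

# Nested extraction, analogous to the pipeline above.
printf 'foo:\n    bar:\n        value 1\n' | section_awk foo | section_awk bar
```

Because only the common indentation width is removed, the output of the first stage still starts with a non-indented "bar:" line, which is what allows the second stage to find it.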
Extracting Values from a „Stanza“ Format
A variation of the configuration syntax presented so far is the „stanza format“. In this format, a line prefix marks the beginning of a section, and the section consists of the remainder of that first line and all subsequent indented lines:
foo: key1=value1, key2=value2, key3=value3,
    key4=value4, key5=value5
bar: key6=value6, key7=value7, key8=value8,
    key9=value9, key10=value10
Sections can be extracted, for example, using the following Perl script, stanza.pl:
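#!/usr/bin/perl
use strict;
use warnings;
my $reg = qr/^$ARGV[0]: (.*)/;
while(<STDIN>) {
    # skip lines until the requested section starts
    next unless $_ =~ $reg;
    # print the remainder of the first line, without its newline
    print $1;
    while(<STDIN>) {
        # the first non-indented line ends the section
        last unless /^\s/;
        print;
    }
}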
This script could be used as follows:
cat > /tmp/input << EOF
foo: key1=value1, key2=value2, key3=value3,
    key4=value4, key5=value5
bar: key6=value6, key7=value7, key8=value8,
    key9=value9, key10=value10
EOF
cat /tmp/input | ./stanza.pl foo
The output is a single line, because the remainder of the first line is printed without its newline, while the continuation line keeps its indentation:
key1=value1, key2=value2, key3=value3,    key4=value4, key5=value5
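For further processing it can be handier to have one key=value pair per line. The following sh and awk sketch shows one way to do that. It rests on assumptions that the text above does not state: pairs are separated by commas, values contain no commas, and the section name is compared literally rather than as a regular expression. The function name stanza_pairs is hypothetical.

```shell
#!/bin/sh
# stanza_pairs: hypothetical helper, not part of stanza.pl.
# Prints each key=value pair of the stanza named $1 on its own line.
# Assumes that pairs are comma-separated and that values contain no commas.
stanza_pairs() {
    awk -v name="$1" '
        # Header line of the wanted stanza: keep the text after "name: ".
        index($0, name ": ") == 1 {
            insec = 1
            emit(substr($0, length(name) + 3))
            next
        }
        # Any other non-indented line closes the current stanza.
        !/^[ \t]/ { insec = 0 }
        # Indented continuation lines belong to the stanza being printed.
        insec { emit($0) }
        # Split on commas, trim blanks and tabs, skip empty fragments.
        function emit(text,    parts, count, i) {
            count = split(text, parts, ",")
            for (i = 1; i <= count; i++) {
                gsub(/^[ \t]+|[ \t]+$/, "", parts[i])
                if (parts[i] != "") print parts[i]
            }
        }
    '
}
```

With the file /tmp/input from above, stanza_pairs bar < /tmp/input would print the five pairs key6=value6 through key10=value10, one per line.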