5.3 The sed function

The sed function allows you to transform a string by replacing parts of it that match a regular expression with another string. This function is somewhat similar to the sed command-line utility (hence its name) and to analogous functions in other programming languages (e.g. sub in awk or the s/// operator in Perl).

Built-in Function: string sed (string subject, expr, …)

The expr argument is an s-expression of the form:

s/regexp/replacement/[flags]

where regexp is a regular expression, and replacement is a replacement string for each part of the subject that matches regexp. When sed is invoked, it attempts to match subject against the regexp. If the match succeeds, the portion of subject which was matched is replaced with replacement. Depending on the value of flags (see global replace), this process may continue until the entire subject has been scanned.

The resulting output serves as input for the next argument, if one is supplied. The process continues until all arguments have been applied.

The function returns the output of the last s-expression.

Both regexp and replacement are described in detail in The ‘s’ Command in GNU sed.

Supported flags are:

g

Apply the replacement to all matches of the regexp, not just the first.

i

Use case-insensitive matching. In the absence of this flag, the value set by the most recent #pragma regex icase is used (see icase).

x

regexp is an extended regular expression (see Extended regular expressions in GNU sed). In the absence of this flag, the value set by the most recent #pragma regex extended (if any) is used (see extended).

number

Only replace the numberth match of the regexp.

Note: the POSIX standard does not specify what should happen when you mix the ‘g’ and number modifiers. Mailfromd follows the GNU sed implementation in this regard, so the interaction is defined to be: ignore matches before the numberth, and then match and replace all matches from the numberth on.
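The interaction between ‘g’ and number can be illustrated with a short sketch. The input string and the variable names r1 to r4 are hypothetical; the values in the comments are what the rule stated above implies, not captured output.

```mfl
/* Hypothetical input: four fields separated by '-'. */
set r1 sed("a-b-c-d", 's/-/+/')     /* first match only:       a+b-c-d */
set r2 sed("a-b-c-d", 's/-/+/2')    /* second match only:      a-b+c-d */
set r3 sed("a-b-c-d", 's/-/+/2g')   /* second match and later: a-b+c+d */
set r4 sed("a-b-c-d", 's/-/+/g')    /* every match:            a+b+c+d */
```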

Any delimiter can be used in lieu of ‘/’, the only requirement being that it be used consistently throughout the expression. For example, the following two expressions are equivalent:

s/one/two/
s,one,two,

Changing delimiters is often useful when the regex contains slashes. For instance, it is more convenient to write s,/,-, than s/\//-/.
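As a minimal sketch, the two spellings can be compared in MFL. The path and the variable names are hypothetical; both calls are expected to produce the same string.

```mfl
/* Comma as delimiter: the slash needs no escaping. */
set flat1 sed("/var/spool/mail", 's,/,-,g')    /* -var-spool-mail */
/* Slash as delimiter: the slash in the regexp must be escaped. */
set flat2 sed("/var/spool/mail", 's/\//-/g')   /* -var-spool-mail */
```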

Here is an example of sed usage:

  set email sed(input, 's/^<(.*)>$/\1/x')

It removes the enclosing angle brackets from the value of the ‘input’ variable and assigns the result to ‘email’.

To apply several s-expressions to the same input, you can either give them as multiple arguments to the sed function:

  set email sed(input, 's/^<(.*)>$/\1/x', 's/(.+@)(.+)/\1\L\2\E/x')

or give them in a single argument separated with semicolons:

  set email sed(input, 's/^<(.*)>$/\1/x;s/(.+@)(.+)/\1\L\2\E/x')

Both examples above remove the enclosing angle brackets, if present, and convert the domain part to lower case.
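To make the chaining concrete, here is a trace for one hypothetical input value. The intermediate results in the comments follow from the two s-expressions as described in this section; they are a worked illustration, not captured output.

```mfl
/* Hypothetical input value. */
set input "<User@EXAMPLE.ORG>"
/* After 's/^<(.*)>$/\1/x':          User@EXAMPLE.ORG */
/* After 's/(.+@)(.+)/\1\L\2\E/x':   User@example.org */
set email sed(input, 's/^<(.*)>$/\1/x', 's/(.+@)(.+)/\1\L\2\E/x')
```

The second expression receives the output of the first, so the order of the arguments matters.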

Regular expressions used in sed arguments are controlled by #pragma regex, like all other regular expressions used in the MFL source file. To avoid using the ‘x’ modifier in the above example, one can write:

  #pragma regex +extended
  set email sed(input, 's/^<(.*)>$/\1/', 's/(.+@)(.+)/\1\L\2\E/')

See regex, for details about that #pragma.

So far all examples have used constant s-expressions. However, this is not a requirement. If necessary, the expression can be stored in a variable or even constructed on the fly before being passed as an argument to sed. For example, assume that you wish to remove the domain part from the value, but only if that part matches one of a set of predefined domains. Let a regular expression that matches these domains be stored in the variable domain_rx. Then this can be done as follows:

  set email sed(input, "s/(.+)(@%domain_rx)/\1/")
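The same kind of expression can also be assembled with the MFL concatenation operator ‘.’, which keeps the constant pieces in single quotes. The sketch below is an illustration under assumptions: the value given to domain_rx is hypothetical, and the ‘x’ flag is added so that the parentheses and ‘|’ are treated as extended syntax.

```mfl
/* Hypothetical value: matches example.org or example.net. */
set domain_rx 'example\.(org|net)'
/* Build the s-expression by concatenation. */
set email sed(input, 's/(.+)(@' . domain_rx . ')/\1/x')
```

With these assumptions, user@example.org should be reduced to user, while user@example.com should be left unchanged because the regexp does not match it.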

If the constructed regular expression uses variables whose value should be matched exactly, such variables must be quoted before being used as part of the regexp. Mailfromd provides a convenience function for this:

Built-in Function: string qr (string str[; string delim])

Quote the string str as a regular expression. This function selects the characters to be escaped using the currently selected regular expression flavor (see regex). At most two additional characters that must be escaped can be supplied in the optional delim parameter. For example, to quote the variable ‘x’ for use in a double-quoted s-expression:

  qr(x, '/"')
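A possible way to combine qr with sed is sketched below. The variable domain and its value are hypothetical, and the expression is built by concatenation from single-quoted pieces, so only the ‘/’ delimiter is passed in delim.

```mfl
#pragma regex +extended
/* Hypothetical variable holding a literal domain name. */
set domain "example.org"
/* qr escapes the dot so that it matches only a literal '.';
   '/' is passed in delim because it delimits the s-expression. */
set email sed(input, 's/(.+)@' . qr(domain, '/') . '$/\1/')
```

Without the quoting, the unescaped dot would match any character, so an address such as user@exampleXorg would also have its domain part removed.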
