Configuration (GNU Dico Manual)

4.3 Configuration

Upon startup, dicod reads its settings and database definitions from the configuration file dicod.conf. By default it is located in $sysconfdir (i.e., in most cases /usr/local/etc, or /etc), but an alternative location may be specified using the --config command line option (see --config).

If any errors are encountered in the configuration file, the program reports them on the standard error and exits with a non-zero status.

To test the configuration file without starting the server, use the --lint (-t) command line option. It causes dicod to check its configuration file and exit with status 0 if no errors were detected, and with status 1 otherwise.

Before parsing, the configuration file is preprocessed using m4 (see Preprocessor). To examine the preprocessed configuration without actually parsing it, use the -E command line option. To avoid preprocessing it, use the --no-preprocessor option.

The rest of this section describes configuration file syntax in detail. You can obtain a concise summary of all configuration directives at any time by running dicod --config-help.

4.3.1 Configuration File Syntax

A dicod configuration consists of statements and comments.

There are three classes of lexical tokens: keywords, values, and separators. Blanks, tabs, newlines and comments, collectively called white space, are ignored except as they serve to separate tokens. Some white space is required to separate otherwise adjacent keywords and values.

4.3.1.1 Comments

Comments may appear anywhere white space may appear in the configuration file. There are two kinds of comments: single-line and multi-line comments. Single-line comments start with ‘#’ or ‘//’ and continue to the end of the line:

# This is a comment
// This too is a comment

Multi-line or C-style comments start with the two characters ‘/*’ (slash, star) and continue until the first occurrence of ‘*/’ (star, slash).

Multi-line comments cannot be nested.

4.3.1.2 Pragmatic Comments

Pragmatic comments are similar to usual comments, except that they cause some changes in the way the configuration is parsed. Pragmatic comments begin with a ‘#’ sign and end with the next physical newline character. As of GNU Dico version 2.10, the following pragmatic comments are understood:

#include <file>

#include file

Include the contents of the file. If file is an absolute file name, both forms are equivalent. Otherwise, the form with angle brackets searches for the file in the include search path, while the second one looks for it in the current working directory first, and, if not found there, in the include search path.

The default include search path is:

prefix/share/dico/2.10/include
prefix/share/dico/include

where prefix is the installation prefix.

New directories can be prepended to it using the -I (--include-dir) command line option (see --include-dir).

#include_once <file>

#include_once file

Same as #include, except that, if the file has already been included, it will not be included again.

#line num

#line num "file"

This line causes dicod to believe, for purposes of error diagnostics, that the line number of the next source line is given by num and the current input file is named by file. If the latter is absent, the remembered file name does not change.

# num "file"

This is a special form of #line statement, understood for compatibility with the C preprocessor.

In fact, these statements provide rudimentary preprocessing features. For more sophisticated ways to modify the configuration before parsing, see Preprocessor.
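The lookup order described for the two forms of #include can be sketched in Python as follows. This is only an illustration of the documented search order, not dicod's actual code: the function name resolve_include and the exists callback are hypothetical, and the directories used in the usage note are examples.

```python
import posixpath

def resolve_include(name, angle, cwd, search_path, exists):
    """Return the file an #include would read, or None if it is not found.

    angle=True models '#include <file>', angle=False models '#include file'.
    'exists' is a callable that tells whether a path exists (hypothetical
    helper, injected so that the sketch does not touch the file system).
    """
    if posixpath.isabs(name):
        # For absolute file names both forms are equivalent.
        return name if exists(name) else None
    # The form without angle brackets looks in the current working
    # directory first, then falls back to the include search path.
    dirs = ([] if angle else [cwd]) + list(search_path)
    for directory in dirs:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

For instance, if local.conf exists both in /work and in /usr/local/share/dico/include, the call resolve_include("local.conf", False, "/work", path, exists) picks the copy in /work, while the same call with angle=True picks the one from the search path.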

4.3.1.3 Statements

A simple statement consists of a keyword and a value separated by any amount of white space. It is terminated with a semicolon (‘;’), unless the value is a here-document (see below), in which case the semicolon is optional.

Examples of simple statements:

timing yes;
access-log-file /var/log/access_log;

A keyword begins with a letter and may contain letters, decimal digits, underscores (‘_’) and dashes (‘-’). Examples of keywords are: ‘group’, ‘identity-check’.

A value can be one of the following:

number

A number is a sequence of decimal digits.

boolean

A boolean value is one of the following: ‘yes’, ‘true’, ‘t’ or ‘1’, meaning true, and ‘no’, ‘false’, ‘nil’, ‘0’ meaning false.

unquoted string

An unquoted string may contain letters, digits, and any of the following characters: ‘_’, ‘-’, ‘.’, ‘/’, ‘@’, ‘*’, ‘:’.

quoted string

A quoted string is any sequence of characters enclosed in double-quotes (‘"’). A backslash appearing within a quoted string introduces an escape sequence, which is replaced with a single character according to the following rules:

Sequence	Replaced with
\a	Audible bell character (ASCII 7)
\b	Backspace character (ASCII 8)
\f	Form-feed character (ASCII 12)
\n	Newline character (ASCII 10)
\r	Carriage return character (ASCII 13)
\t	Horizontal tabulation character (ASCII 9)
\v	Vertical tabulation character (ASCII 11)
\\	A single backslash (‘\’)
\"	A double-quote.

Table 4.1: Backslash escapes

In addition, the sequence ‘\newline’ is removed from the string. This allows you to split long strings over several physical lines, e.g.:

"a long string may be\
 split over several lines"

If the character following a backslash is not one of those specified above, the backslash is ignored and a warning is issued.

Two or more adjacent quoted strings are concatenated, which gives another way to split long strings over several lines to improve readability. For instance, the following fragment produces the same result as the example above:

"a long string may be"
" split over several lines"

here-document

A here-document is a special construct that allows the user to introduce strings of text containing embedded newlines.

The <<word construct instructs the parser to read all the following lines up to the line containing only word, with possible trailing blanks. Any lines thus read are concatenated together into a single string. For example:

<<EOT
A multiline
string
EOT

The body of a here-document is interpreted the same way as a double-quoted string, unless word is preceded by a backslash (e.g. ‘<<\EOT’) or enclosed in double-quotes, in which case the text is read as is, without interpretation of escape sequences.

If word is prefixed with - (a dash), then all leading tab characters are stripped from input lines and the line containing word. Furthermore, if - is followed by a single space, all leading whitespace is stripped from them. This allows for indenting here-documents in a natural fashion. For example:

<<- TEXT
    All leading whitespace will be
    ignored when reading these lines.
TEXT

It is important that the terminating delimiter be the only token on its line. The only exception to this rule is allowed if a here-document appears as the last element of a statement. In this case a semicolon can be placed on the same line with its terminating delimiter, as in:

help-text <<-EOT
        A sample help text.
EOT;

list

A list is a comma-separated sequence of values. Lists are delimited by parentheses. The following example shows a statement whose value is a list of strings:

capability (mime,auth);

In any case where a list is appropriate, a single value is allowed without being a member of a list: it is equivalent to a list whose only member is that value. This means that, e.g. ‘capability mime;’ is equivalent to ‘capability (mime);’.

A block statement introduces a logical group of other statements. It consists of a keyword, followed by an optional value, and a sequence of statements enclosed in curly braces, as shown in the example below:

load-module outline {
        command "outline";
}

The closing curly brace may be followed by a semicolon, although this is not required.
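The backslash escape rules for quoted strings (Table 4.1, the ‘\newline’ rule, and the handling of unknown escapes) can be sketched in Python as follows. This is an illustration of the documented rules under simplifying assumptions, not the parser used by dicod; the name decode_quoted is hypothetical, and the function expects the text between the double quotes.

```python
import warnings

# Escape sequences from Table 4.1, keyed by the character after the backslash.
ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r",
    "t": "\t", "v": "\v", "\\": "\\", '"': '"',
}

def decode_quoted(body):
    """Decode the text found between the double quotes of a quoted string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt == "\n":
                pass                       # backslash-newline is removed
            elif nxt in ESCAPES:
                out.append(ESCAPES[nxt])   # known escape: one character
            else:
                # Unknown escape: the backslash is ignored, with a warning.
                warnings.warn("unknown escape sequence \\" + nxt)
                out.append(nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applied to the two-line example above, the sketch returns the single string "a long string may be split over several lines".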

4.3.2 Server Settings

Server settings control how dicod is executed on the server machine.

Configuration: user string

Run with the privileges of this user. Dicod does not require root privileges, so it is recommended to always use this statement when running dicod in daemon mode (see Daemon Mode). The argument is either a user name, or UID prefixed with a plus sign.

Example:

user nobody;

Configuration: group list

If user is given, dicod will drop all supplementary groups and switch to the principal group of that user. Sometimes, however, it may be necessary to retain one or more supplementary groups. For example, this might be necessary to access dictionary databases. The group statement retains the supplementary groups listed in list. Each group can be specified either by its name or by its GID number, prefixed with ‘+’, e.g.:

user nobody;
group (man, dict, +88);

This statement is ignored if the user statement is not present or if dicod is running in inetd mode. See Inetd Mode.

Configuration: mode enum

Sets server operation mode. The argument is one of:

daemon: Run in daemon mode. See Daemon Mode, for a detailed description.
inetd: Run in inetd mode. See Inetd Mode, for a detailed description.

This statement is overridden by the --inetd command line option. See --inetd.

Configuration: listen list

Specify the IP addresses and ports to listen on in daemon mode. By default, dicod will listen on port 2628 on all existing interfaces. Use the listen statement to abridge the list of interfaces to listen on, or to change the port number.

Elements of list can have the following forms:

host:port

Specifies an IP (version 4 or 6) socket to listen on. The host part is either an IPv4 address in “dotted-quad” notation, or an IPv6 address in square brackets, or a host name. In the latter case, dicod will listen on all IP addresses corresponding to its ‘A’ or ‘AAAA’ DNS records.

The port part is either a numeric port number or a symbolic service name which is found in /etc/services file.

Either of the two parts may be omitted. If host is omitted, dicod will listen on all interfaces. If port is omitted, it defaults to 2628. In this case the colon may be omitted, too.

Examples:

listen dict.example.org:2628;
listen 198.51.100.10;
listen [2001:DB8::11];
listen :2628;

inet://host:port

inet4://host:port

Listen on IPv4 socket. The host is either an IP address or a host name. In the latter case, dicod will start listening on all IP addresses from the ‘A’ records for this host.

Either host or port (but not both) can be omitted. Missing host defaults to IPv4 addresses on all available network interfaces, and missing port defaults to 2628.

Example:

listen inet4://198.51.100.10;

inet6://host:port

Listen on IPv6 socket. The host is either an IPv6 address in square brackets, or a host name. In the latter case, dicod will start listening on all IP addresses from the ‘AAAA’ records for this host.

Either host or port (but not both) can be omitted. Missing host defaults to IPv6 addresses on all available network interfaces, and missing port defaults to 2628.

Example:

listen inet6://[2001:DB8::11];

filename

unix://filename

Specifies the name of a UNIX socket to listen on. Filename must be an absolute file name of the socket.
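The defaulting rules for the plain host:port form can be sketched in Python as follows. This is a rough illustration only: the name split_listen is hypothetical, the sketch does not handle the inet://, inet4://, inet6:// or unix:// forms, and it does not resolve host names or service names.

```python
DEFAULT_PORT = "2628"

def split_listen(spec):
    """Split a 'host:port' listen element into a (host, port) pair.

    host is None when it is omitted (listen on all interfaces).  The port
    is kept as a string, because it may be a symbolic service name.
    """
    if spec.startswith("["):
        # IPv6 address in square brackets, optionally followed by ':port'.
        end = spec.index("]")
        host, rest = spec[1:end], spec[end + 1:]
        port = rest[1:] if rest.startswith(":") else ""
    elif ":" in spec:
        host, port = spec.rsplit(":", 1)
    else:
        # No colon at all: only the host part is given.
        host, port = spec, ""
    return (host or None, port or DEFAULT_PORT)
```

With the examples given above, split_listen(":2628") yields (None, "2628") and split_listen("[2001:DB8::11]") yields ("2001:DB8::11", "2628").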

Configuration: pidfile string

Store PID of the master process in this file. Default is localstatedir/run/dicod.pid. Notice that the access bits of this default directory may be insufficient for dicod to write there after dropping root privileges (see user statement). One solution to this is to create a subdirectory with the same owner as given by user statement and to point the PID file there:

pidfile /var/run/dict/dicod.pid;

Another solution is to make PID directory group-writable and to add the owner group to the group statement (see group statement).

Configuration: max-children number

Sets maximum number of sub-processes that can run simultaneously. This is equivalent to the number of clients that can simultaneously use the server. The default is 64 sub-processes.

Configuration: inactivity-timeout number

Set inactivity timeout to the number of seconds. The server disconnects automatically if the remote client has not sent any command within this number of seconds. Setting timeout to 0 disables inactivity timeout (the default).

This statement along with max-children allows you to control the server load.

Configuration: shutdown-timeout number

When the master server is shutting down, wait this number of seconds for all children to terminate. Default is 5 seconds.

Configuration: identity-check boolean

Enable identification check using AUTH protocol (RFC 1413). The received user name or UID can be shown in access log using the %l conversion (see Access Log).

Configuration: ident-keyfile string

Use encryption keys from the named file to decrypt AUTH replies encrypted using DES.

Configuration: ident-timeout number

Set timeout for AUTH input/output operation to number of seconds. Default timeout is 3 seconds.

4.3.3 Authentication

The server may be configured to request authentication in order to make some databases or some additional information available to the user. Another possible use of authentication is to minimize resource utilization on the server machine.

GNU Dico supports two types of authentication: the traditional APOP-style authentication (see AUTH) and a more advanced SASL authentication. The latter is described separately, see SASL.

Authentication setup is simple: first, you define a user authentication database, then you enable it by declaring auth server capability (see Capabilities):

capability auth;

The user authentication database keeps, for each user name, the corresponding plain text password, and, optionally, the names of groups this user belongs to. Notice that, due to the specifics of the DICT authentication scheme (see AUTH), user passwords are stored in plain text; therefore special care must be taken to protect the contents of your authentication database from compromise.

The database is defined using the user-db block statement:

Configuration: user-db url: Declare user authentication database.

Dico’s authentication is designed so that various authentication database formats can easily be added. A database is identified by its URL, or Uniform Resource Locator. It consists of the following parts (square brackets denoting optional ones):

type://[[user[:password]@]host]/path[params]

type: A database type, or format. See below for a list of available database formats.
user: User name necessary to access the database.
password: User password necessary to access the database.
host: Domain name or IP address of a machine running the database.
path: A path to the database. The exact meaning of this element depends on the database protocol. It is described in detail when discussing the particular database protocols.
params: A list of protocol-dependent parameters. Each parameter is of the form keyword=name; multiple parameters are separated with semicolons.

If the underlying mechanism requires some additional configuration data that cannot be supplied in a URL, these are passed to it using the following statement:

user-db conf: options string: The argument is treated as an opaque string and passed to the authentication ‘open’ procedure verbatim. Its exact meaning depends on the type of the database.

The URL defines how the database is accessed. Another important point is where to get the user data from. This is specified by the following two sub-statements:

user-db conf: password-resource arg: A database resource which returns the user’s password.

user-db conf: group-resource arg: A database resource which returns the list of groups this user is member of.

The exact semantics of the database resource depends on the type of database being used. For flat text databases, it is the name of a text file that contains these data; for SQL databases, it is an SQL query, and so on. Below we will discuss URLs and resources used by each database type.

To summarize, the authentication database is defined as:

# Define user database for authentication.
user-db url {
  # Additional configuration options.
  options string;
  
  # Name of a password resource.
  password-resource resource;

  # Name of the resource returning user group information.
  group-resource resource;
}

4.3.3.1 Text Authentication Database

A text authentication database consists of one or two flat text files — a password file, which contains user passwords, and a group file, which contains user groups. The latter is optional. Both files have the same format:

Empty lines are ignored.
Any text from ‘#’ to the end of the line is ignored.
Non-empty lines consist of two fields, separated by any amount of white space. The first field is the user name. It serves as a search key in the database. The second field is the requested resource.

Record keys in a password file must be unique, i.e. no two records may contain the same first field. The group file may contain multiple records with the same key. For example:

$ grep smith pass
smith guessme
$ grep smith group
smith user
smith timing
smith tester

This means that user ‘smith’ has password ‘guessme’ and is a member of three groups: ‘user’, ‘timing’ and ‘tester’.
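The file format rules above can be sketched in Python as follows. This is an illustrative reading of the documented format, not the code used by dicod; the function names are hypothetical.

```python
def parse_text_db(text):
    """Return the (user, resource) records of a flat text database."""
    records = []
    for line in text.splitlines():
        # Any text from '#' to the end of the line is ignored.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue                      # empty lines are ignored
        fields = line.split(None, 1)      # two fields, any amount of white space
        if len(fields) == 2:
            records.append((fields[0], fields[1]))
    return records

def lookup_password(records, user):
    """Keys in a password file are unique, so the first match is the only one."""
    for key, resource in records:
        if key == user:
            return resource
    return None

def lookup_groups(records, user):
    """A group file may hold several records with the same key."""
    return [resource for key, resource in records if key == user]
```

Parsing the pass and group files shown above and looking up ‘smith’ gives the password ‘guessme’ and the group list ‘user’, ‘timing’, ‘tester’.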

A URL of a text database begins with ‘text’ and contains only the path element, which gives the name of the directory where the database files reside. The name of a password file is given by the password-resource statement. The name of a group file is given by the group-resource statement.

For example, if user passwords are kept in the file passwd, user groups are kept in the file group, and both files reside in the /var/db/dico directory, then the appropriate database configuration will be:

user-db text:///var/db/dico {
  password-resource passwd;
  group-resource group;
}

4.3.3.2 LDAP Databases

To configure an LDAP user database, you first need to load the ‘ldap’ module (see LDAP module):

load-module ldap;

The URL of the database is: ‘ldap://host[:port]’, where host is the host name or IP address of the LDAP server, and optional port specifies the port number it is listening on (by default, port 389 is assumed).

The password-resource statement specifies the name of an attribute containing the password, and the group-resource supplies the name of the attribute with the group name.

Additional configuration data are supplied in the options statement, whose argument is a whitespace-separated list of assignments:

base=base

Sets base DN.

binddn=dn

Sets the DN to bind as.

passwd=string

Sets the password.

tls=bool

When set to ‘yes’, enables the use of TLS encryption.

debug=number

Sets OpenLDAP debug level.

user-filter=filter

An LDAP filter to select the objects describing this user. Any occurrence of ‘$user’ in filter is replaced with the actual user name, as obtained during the authentication. This variable expansion occurs much the same way as in shell. In particular, the variable is expanded only if it is not immediately followed by an alphanumeric character. For example, it occurs in:

(uid=$user)

and

(uid=$user.1)

But it does not occur in

(uid=$users)

If it is necessary to expand the variable in such a context, enclose its name in curly braces:

(uid=${user}s)

group-filter=filter

An LDAP filter that selects the user groups. The filter is expanded as in user-filter.
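The expansion of ‘$user’ described for user-filter and group-filter can be sketched in Python as follows. This is only an illustration of the documented rule: the name expand_user is hypothetical, and the sketch inserts the user name verbatim, without any escaping of characters that are special in LDAP filters.

```python
import re

# '${user}' is always expanded; '$user' is expanded only when it is not
# immediately followed by an alphanumeric character.
_USER_REF = re.compile(r"\$\{user\}|\$user(?![A-Za-z0-9])")

def expand_user(filter_text, user):
    """Replace references to the 'user' variable in an LDAP filter."""
    # A function is used as the replacement so that backslashes in the
    # user name are not treated as regular expression escapes.
    return _USER_REF.sub(lambda match: user, filter_text)
```

For the user name smith, this turns (uid=$user) into (uid=smith) and (uid=${user}s) into (uid=smiths), while (uid=$users) is left unchanged.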

The following example shows an LDAP user database configured for base DN ‘example.com’ which uses ‘posixAccount’ and ‘posixGroup’ objects from ‘nis.schema’:

user-db "ldap://localhost" {
  password-resource userPassword;
  group-resource cn;
  options "user-filter=(uid=$user) "
          "group-filter=(&(objectClass=posixGroup)"
                       "(memberuid=$user)) "
          "base=dc=example,dc=com";
}

A note on password usage is in order here. Most authentication methods require the passwords to be stored in the database in plain text form. The use of encrypted passwords (e.g. MD5 or SHA1) is possible only with ‘LOGIN’ and ‘PLAIN’ GSASL authentication methods.

4.3.4 SASL Authentication

The SASL authentication is available if the server was compiled with GNU SASL.

Configuration: sasl { statements }

This block statement configures SASL authentication. The following is a short summary of its syntax and the available substatements:

sasl {
  # Disable SASL mechanisms listed in mech.
  disable-mechanism mech;
  # Enable SASL mechanisms listed in mech.
  enable-mechanism mech;
  # Set service name for GSSAPI and Kerberos.
  service name;
  # Set realm name for GSSAPI and Kerberos.
  realm name;
  # Define groups for anonymous users.
  anon-group group-list;
}

The list of available authentication mechanisms is configured using two statements:

sasl: disable-mechanism mech: Disables SASL mechanisms listed in mech, which is a list of names.

sasl: enable-mechanism mech: Enables SASL mechanisms listed in mech, which is a list of names.

The server builds a list of available mechanisms using the following algorithm. First, a list of implemented mechanisms is retrieved from the SASL library. If the enable-mechanism statement is defined, the resulting list is filtered so that only mechanisms listed in enable-mechanism remain. Further, if the disable-mechanism statement is defined, the names listed there are removed from the list.
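The algorithm above can be sketched in Python as follows. This is a minimal illustration, not the server's code: the function name is hypothetical, names are compared exactly as written (an assumption), and the mechanism names in the usage note are sample values only.

```python
def available_mechanisms(implemented, enabled=None, disabled=None):
    """Build the list of SASL mechanisms offered to clients.

    implemented: mechanisms reported by the SASL library.
    enabled:     names from enable-mechanism, or None if it is not defined.
    disabled:    names from disable-mechanism, or None if it is not defined.
    """
    result = list(implemented)
    if enabled is not None:
        # Keep only the mechanisms listed in enable-mechanism.
        result = [m for m in result if m in enabled]
    if disabled is not None:
        # Then remove the mechanisms listed in disable-mechanism.
        result = [m for m in result if m not in disabled]
    return result
```

For example, if the library implements ANONYMOUS, PLAIN, LOGIN, CRAM-MD5 and DIGEST-MD5, enabling PLAIN, LOGIN and DIGEST-MD5 and then disabling LOGIN leaves PLAIN and DIGEST-MD5.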

sasl: service name: Sets the service name for GSSAPI and Kerberos mechanisms.

sasl: realm name: Sets the realm name.

sasl: anon-group list: Sets the list of user groups considered anonymous.

The database of user credentials depends on the authentication mechanism used. For GSSAPI or Kerberos it is managed by appropriate servers. Other mechanisms use the standard dicod user database configuration (see Authentication).

4.3.5 Access Control Lists

Access control lists, or ACLs for short, are lists of permissions that can be applied to certain dicod objects. They can be used to control who can connect to the dictionary server and what resources are offered to whom.

An ACL is defined using the acl block statement:

acl name {
  definitions
}

The parameter name specifies a unique name for that ACL. This name will be used by other configuration statements to refer to that ACL (see Security Settings, and see Database Visibility).

The part between the curly braces (denoted by definitions above) is a list of access statements. There are two types of such statements:

ACL: allow user-group sub-acl host-list: Allow access to resource.

ACL: deny user-group sub-acl host-list: Deny access to resource.

All parts of an access statement are optional, but at least one of them must be present.

The user-group part specifies which users match this entry. Allowed values are the following:

all: All users.
authenticated: Only authenticated users.
group group-list: Authenticated users which are members of at least one of the groups listed in group-list.

The sub-acl part, if present, branches to another ACL. The syntax of this group is:

acl name

where name is the name of a previously defined ACL.

Finally, the host-list group matches client IP addresses. It consists of a from keyword followed by a list of address specifiers. Allowed address specifiers are:

any: Matches any client address.
addr: Matches if the client IP equals addr. The latter may be given either as an IP address or as a host name, in which case it will be resolved and the first of its IP addresses will be used.
addr/netlen: Matches if the first netlen bits of the client IP address are equal to addr. The network mask length, netlen, must be an integer in the range from 0 to 32 for IPv4, and from 0 to 128 for IPv6. The address part, addr, is as described above.
addr/netmask: Matches if the result of a logical AND between the client IP address and netmask equals addr. The network mask must be specified in IP address notation (either IPv4 or IPv6).
filename: Matches if connection was received from a UNIX socket filename, which must be given as an absolute file name.

To summarize, the syntax of an access statement is:

allow|deny [all|authenticated|group group-list]
           [acl name] [from addr-list]

where square brackets denote optional parts and vertical bar means ‘one of’.

When an ACL is applied to a particular object, its entries are tried in turn until one of them matches, or the end of the list is reached. If a matched entry is found, its command verb, allow or deny, defines the result of ACL match. If the end of list is reached, the result is ‘allow’, unless explicitly specified otherwise.

For example, the following statement defines an ACL named ‘common’, that allows access for any user connected via local UNIX socket /tmp/dicod.sock or coming from a local network ‘192.168.10.0/24’. Any authenticated users are allowed, provided that they are allowed by another ACL ‘my-nets’ (which should have been defined before this definition). Users coming from the network ‘10.10.0.0/24’ are allowed if they authenticate themselves and are members of groups ‘dicod’ or ‘users’. Anybody else is denied access:

acl common {
    allow all from ("/tmp/dicod.sock", "192.168.10.0/24");
    allow authenticated acl "my-nets";
    allow group ("dicod", "users") from "10.10.0.0/24";
    deny all;
}
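The first-match evaluation described above can be sketched in Python as follows. This is a simplified model, not dicod's implementation: the function names and the tuple representation of entries are hypothetical, and the sketch leaves out sub-ACLs, host names and UNIX socket names, so it handles only the user-group part and numeric addr or addr/netlen specifiers.

```python
import ipaddress

def user_matches(who, authenticated, groups):
    """who is None (part omitted), 'all', 'authenticated' or a list of groups."""
    if who is None or who == "all":
        return True
    if who == "authenticated":
        return authenticated
    # 'group group-list': an authenticated member of at least one group.
    return authenticated and any(g in groups for g in who)

def host_matches(nets, client_ip):
    """nets is None (no 'from' part) or a list of 'any' / addr / addr/netlen."""
    if nets is None:
        return True
    ip = ipaddress.ip_address(client_ip)
    return any(n == "any" or ip in ipaddress.ip_network(n) for n in nets)

def check_acl(entries, client_ip, authenticated=False, groups=()):
    """Return 'allow' or 'deny'; entries are (verb, who, nets) tuples."""
    for verb, who, nets in entries:
        if user_matches(who, authenticated, groups) and host_matches(nets, client_ip):
            return verb        # the first matching entry decides
    return "allow"             # end of list reached: the default result

# A reduced version of the 'common' ACL shown above.
common = [
    ("allow", "all", ["192.168.10.0/24"]),
    ("allow", ["dicod", "users"], ["10.10.0.0/24"]),
    ("deny", "all", None),
]
```

With this list, an anonymous client at 10.10.0.5 is denied, while the same client is allowed once it has authenticated as a member of ‘users’.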

See Security Settings, for information on how to control daemon security settings.

See Database Visibility, for a detailed description on how to use ACLs to control access to databases.

4.3.6 Security Settings

This subsection describes configuration settings that control access to various resources served by dicod.

Configuration: connection-acl acl-name

Use ACL acl-name to control incoming connections. The ACL itself must be defined before this statement. Using user-group (see previous subsection) in this ACL makes no sense, because the authentication itself is performed only after the connection has been established.

acl incoming-conn {
   allow from 213.130.0.0/19;
   deny from any;
}

connection-acl incoming-conn;

Configuration: show-sys-info acl-name

This statement controls whether to show system information in reply to SHOW SERVER command (see SHOW SERVER). The information will be shown only if ACL acl-name allows it.

The system information shown includes the following data: name of the package and its version, name of the system where it was built and the kernel version thereof, host name, total operational time of the daemon, number of subprocesses executed so far and average usage frequency. For example:

dicod (dico 2.10) on Linux 2.6.32,
dict.example.net up 99+04:42:58, 19647 forks (686.9/hour)

4.3.7 Logging and Debugging

The directives described in this subsection provide basic logging capabilities.

Configuration: log-tag string

Prefix syslog messages with this string. By default, the program name is used.

Configuration: log-facility string

Sets the syslog facility to use. Allowed values are: ‘user’, ‘daemon’, ‘auth’, ‘authpriv’, ‘mail’, ‘cron’, ‘local0’ through ‘local7’ (case-insensitive), or a facility number.

Configuration: log-print-severity boolean

Prefix diagnostics messages with a string identifying their severity.

Configuration: transcript boolean

Controls the transcript of user sessions. If boolean is ‘true’, the transcript will be output to the logging channel. In the transcript, the lines received from client are prefixed with ‘C:’, while those sent in reply are marked with ‘S:’. Here is an excerpt from the transcript output:

S: 220 example.net dicod (dico 2.10) <mime.xversion>
  <1645.1212874507@example.net>
C: client "Kdict"
S: 250 ok
C: show db
S: 110 16 databases present
S: afr-deu "Afrikaans-German Freedict dictionary"
S: afr-eng "Afrikaans-English FreeDict Dictionary"
[...]
S: .
S: 250 ok

(The first line is split in two to fit in the printed page width.) This option produces lots of output and can significantly slow down the server. Use it only if you are debugging dicod or some remote client. Never use it in a production environment.

4.3.8 Access Log

GNU Dico provides a feature similar to Apache’s CustomLog, which keeps a log of MATCH and DEFINE requests. To enable this feature, specify the name of the log file using the following directive:

Configuration: access-log-file string

Sets access log file name.

access-log-file /var/log/dico/access.log;

The format of log file entries is defined via the access-log-format directive:

Configuration: access-log-format string

Sets format string for access log file.

Its argument can contain literal characters, which are copied into the log file verbatim, and format specifiers, i.e. special sequences which begin with ‘%’ and are replaced in the log file as shown in the table below.

%%

The percent sign.

%a

Remote IP-address.

%A

Local IP-address.

%B

Size of response in bytes.

%b

Size of response in bytes in CLF format, i.e. a ‘-’ rather than a ‘0’ when no bytes are sent.

%C

Remote client (from the CLIENT command, see CLIENT).

%D

The time taken to serve the request, in microseconds.

%d

Request command verb in abbreviated form, suitable for use in URLs, i.e. ‘d’ for DEFINE, and ‘m’ for MATCH. See urls.

%h

Remote host.

%H

Request command verb (DEFINE or MATCH).

%l

Remote logname (from identd, if supplied). This will return a dash unless identity-check is set to true. See identity-check.

%m

The search strategy.

%p

The canonical port of the server serving the request.

%P

The PID of the child that served the request.

%q

The database from the request.

%r

Full request.

%{n}R

The nth token from the request (n is 0-based).

%s

Reply status. For multiple replies, the form ‘%s’ returns the status of the first reply, while ‘%>s’ returns that of the last reply.

%t

Time the request was received in the standard Apache format, e.g.:

[04/Jun/2008:11:05:22 +0300]

%{format}t

The time, in the form given by format, which should be a valid strftime format. See Time and Date Formats, for a detailed description.

The standard ‘%t’ format is equivalent to

[%d/%b/%Y:%H:%M:%S %z]

%T

The time taken to serve the request, in seconds.

%u

Remote user from AUTH command.

%v

The host name of the server serving the request. See hostname directive.

%V

Actual host name of the server (in case it was overridden in configuration).

%W

The word from the request.

For reference, here is the list of format specifiers that have a different meaning than in Apache: ‘%C’, ‘%H’, ‘%m’, ‘%q’. The following format specifiers are unique to dicod: ‘%d’, ‘%{n}R’, ‘%V’, ‘%W’.

The absence of access-log-format directive is equivalent to the following statement:

access-log-format "%h %l %u %t \"%r\" %>s %b";

It was chosen so as to be compatible with Apache access logs and be easily parsable by existing log analyzing tools, such as webalizer.

Extending this format string with the client name produces a log format similar to Apache ‘combined log’:

access-log-format "%h %l %u %t \"%r\" %>s %b \"\" \"%C\"";
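To illustrate how the default format string expands, here is a Python sketch that handles only the specifiers used in it, plus ‘%%’, ‘%s’ and ‘%B’. It is not the formatter used by dicod: the function name and the layout of the rec dictionary are hypothetical, and printing ‘-’ for a missing ‘%u’ value is an assumption borrowed from the common log format.

```python
import re
from datetime import datetime, timedelta, timezone

def format_access_log(fmt, rec):
    """Expand a small subset of access-log-format specifiers."""
    values = {
        "%": "%",
        "h": rec["host"],
        "l": rec.get("logname") or "-",   # dash unless identity-check is on
        "u": rec.get("user") or "-",      # assumption: dash if not authenticated
        "t": rec["time"].strftime("[%d/%b/%Y:%H:%M:%S %z]"),
        "r": rec["request"],
        "s": str(rec["statuses"][0]),     # status of the first reply
        ">s": str(rec["statuses"][-1]),   # status of the last reply
        "B": str(rec["bytes"]),
        "b": "-" if rec["bytes"] == 0 else str(rec["bytes"]),  # CLF size
    }
    # Specifiers not listed above are left in the output unchanged.
    return re.sub(r"%(>s|[%hlutrsBb])", lambda m: values[m.group(1)], fmt)

# Sample record; all values are made up for the illustration.
sample = {
    "host": "198.51.100.7",
    "time": datetime(2008, 6, 4, 11, 5, 22, tzinfo=timezone(timedelta(hours=3))),
    "request": "DEFINE * hello",
    "statuses": [150, 250],
    "bytes": 0,
}
line = format_access_log('%h %l %u %t "%r" %>s %b', sample)
```

For the sample record, line is: 198.51.100.7 - - [04/Jun/2008:11:05:22 +0300] "DEFINE * hello" 250 -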

4.3.9 General Settings

Settings described in this subsection configure the basic behavior of the DICT daemon.

Configuration: initial-banner-text string

Display the string in the textual part of the initial server reply.

When a connection is established, the server sends an initial reply to the client, which looks like the example below:

220 example.org <auth.mime> <520.1212912026@example.org>

See Initial Reply, for a detailed description of its parts.

The part of this reply after the host name is modifiable and can contain arbitrary text. You can use initial-banner-text to append any additional information there. Note that string may not contain newlines or angle brackets. For example:

initial-banner-text "Please authenticate yourself,";

This statement produces the following initial reply (split over two lines for readability):

220 example.org Please authenticate yourself,
  <auth.mime> <520.1212912026@example.org>

Configuration: hostname string

Sets the hostname. By default, the server determines it automatically. If, however, it makes a wrong guess, you can fix it using this directive.

The server hostname is used, among others, in the initial reply after ‘220’ code (see above) and may also be displayed in the access log file using the ‘%v’ escape (see Access Log).

Configuration: server-info string

Sets the server description to be shown in reply to the SHOW SERVER command (see SHOW SERVER).

The first line of the reply, after the usual ‘114’ response line, shows the name of the host where the server is running. If the settings of show-sys-info (see show-sys-info) permit, some additional information about the system is printed.

The lines that follow are taken from the server-info directive. It is common to specify string using “here-document” syntax (see here-document), e.g.:

server-info <<EOT
Welcome to the FOO dictionary service.

Contact <dict@foo.example.org> if you have questions or
suggestions.
EOT;

Configuration: help-text string

Sets the text to be displayed in reply to the HELP command.

The default reply to the HELP command displays a list of commands understood by the server with a short description of each.

If the string begins with a plus sign, it will be appended to the default reply:

help-text <<-EOT
  +
  The commands beginning with an X are extensions.
EOT;

If the string begins with any character other than ‘+’, it will replace the default help output. For example:

help-text <<-EOT
  There is no help.
  See RFC 2229 for detailed information.
EOT;
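The append-or-replace rule can be summarized by the sketch below. It is an illustration rather than the actual dicod code. The function name is hypothetical, and the exact way the two texts are joined (here, plain concatenation after dropping the ‘+’) is an assumption.

```python
def effective_help(default_help, help_text=None):
    """Return the text sent in reply to HELP.

    default_help -- the built-in list of commands
    help_text    -- the value of the help-text directive, or None if unset
    """
    if help_text is None:
        return default_help
    if help_text.startswith("+"):
        # Leading plus sign: append to the default reply.
        return default_help + help_text[1:]
    # Any other first character: replace the default reply.
    return help_text
```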

Configuration: default-strategy string

Sets the name of the default matching strategy (see MATCH). By default, Levenshtein matching is used, which is equivalent to

default-strategy lev;

4.3.10 Server Capabilities

Capabilities are certain server features that can be enabled or disabled at the system administrator’s will.

Configuration: capability list: Requests additional capabilities from the list.

The argument to capability directive must contain names of existing dicod capabilities. These are listed in the following table:

auth: The AUTH command is supported. See Authentication.
mime: The OPTION MIME command is supported. Notice that RFC 2229 requires all servers to support that command, so you should always specify this capability.
xversion: The XVERSION command is supported. It is a GNU extension that displays the dicod implementation and version number. See XVERSION.
xlev: The XLEV command is supported. This command allows the remote party to set and query maximal Levenshtein distance for lev matching strategy. See strategy. See XLEV.

The capabilities set using this directive are displayed in the initial server reply (see initial reply), and their descriptions are added to the HELP command output (unless specified otherwise by the help-text statement).

4.3.11 Database Modules and Handlers

A database module is an external piece of software designed to handle a particular format of dictionary databases. This piece of software is built as a shared library that dicod loads at run time.

A handler is an instance of the database module loaded by dicod and configured for a specific database or a set of databases.

Database handlers are defined using the following block statement:

Configuration: load-module string { … }

Create an instance of a database module. The argument specifies a unique name which will be used by subsequent parts of the configuration to refer to this handler. The ellipsis in the description above represents sub-statements. As of Dico version 2.10 only one sub-statement is defined:

load-module config: command string: Sets the command line for this handler. It is similar to the shell’s command line in that it consists of the name of a database module, optionally followed by a whitespace-separated list of its arguments. The name of the module specifies the disk file to load (see below for a detailed description of the loading sequence). Both the command name and the arguments are passed to the module initialization function (see dico_init).

For example:

load-module dict {
  command "dictorg dbdir=/var/dicodb";
}

This statement defines a handler named ‘dict’, which loads the module dictorg and passes its initialization function a single argument, ‘dbdir=/var/dicodb’. If the module name is not an absolute file name, as in this example, the loadable module will be searched for in the module load path.

A common case is when the module does not require initialization arguments and its command string is the same as its name, e.g.:

load-module outline {
  command "outline";
}

The configuration syntax provides a shortcut for such usage:

load-module outline;

If load-module is used this way, it accepts a single string or a list of strings as its argument. In the latter case, it loads all modules listed in the argument. For example:

load-module (stratall,substr,word);

A module load path is an internal list of directories which dicod scans in order to find a loadable file name specified in the command statement. By default the search order is as follows:

Optional prefix search directories specified by the prepend-load-path directive (see below) and the --load-dir (-L) command line option.
GNU Dico module directory: $prefix/lib/dico.
Additional search directories specified by the module-load-path directive (see below).
The value of the environment variable LTDL_LIBRARY_PATH.
The system dependent library search path (e.g. on GNU/Linux it is defined by the file /etc/ld.so.conf and the environment variable LD_LIBRARY_PATH).

The values of LTDL_LIBRARY_PATH and LD_LIBRARY_PATH must be colon-separated lists of absolute directory names, for example ‘/usr/lib/mypkg:/lib/foo’.

In any of these directories, dicod first attempts to find and load the given filename. If this fails, it tries to append the following suffixes to it:

the libtool archive suffix ‘.la’
the suffix used for native dynamic libraries on the host platform, e.g., ‘.so’, ‘.sl’, etc.
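The search order described above can be illustrated by the following sketch, which lists the file names that would be tried for a non-absolute module name. It is not the actual loader. The function and parameter names are hypothetical, and the native suffix is assumed to be ‘.so’.

```python
import posixpath

def candidate_files(name, prepend_dirs, dico_module_dir, module_load_path,
                    ltdl_library_path="", system_dirs=(), native_suffix=".so"):
    """List candidate file names in the order they would be tried."""
    dirs = list(prepend_dirs)          # prepend-load-path and --load-dir
    dirs.append(dico_module_dir)       # e.g. $prefix/lib/dico
    dirs.extend(module_load_path)      # module-load-path
    # LTDL_LIBRARY_PATH is a colon-separated list of directories.
    dirs.extend(d for d in ltdl_library_path.split(":") if d)
    dirs.extend(system_dirs)           # system dependent search path
    candidates = []
    for d in dirs:
        # In each directory: the bare name first, then the two suffixes.
        for suffix in ("", ".la", native_suffix):
            candidates.append(posixpath.join(d, name + suffix))
    return candidates

files = candidate_files("dictorg", ["/opt/mods"], "/usr/local/lib/dico",
                        ["/usr/lib/dico"], "/usr/lib/mypkg:/lib/foo")
```

The first existing file in the resulting list is the one that gets loaded.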

Configuration: module-load-path list

This directive adds the directories listed in its argument to the module load path. Example:

module-load-path (/usr/lib/dico,/usr/local/dico/lib);

Configuration: prepend-load-path list: Same as module-load-path, but adds directories to the beginning of the module load path.

4.3.12 Databases

Dictionary databases are defined using the database block statement.

Configuration: database { statements }: Defines a dictionary database. At least two sub-statements must be defined for each database: name and handler.

Database: visible bool: Defines whether this database is visible or not. By default, all databases are visible. You will need this statement if you want to temporarily hide the database without removing it from the configuration. Another common use case is to hide a database that is used as a member of a virtual database, so that its contents are available only by querying the parent database (see Virtual Databases).

Database: name string: Sets the name of this database (a single word). This name will be used to identify this database in DICT commands.

Database: handler string: Specifies the handler name for this database and any arguments for it. This handler must be previously defined using the load-module statement (see Handlers).

For example, the following fragment defines a database named ‘en-de’, which is handled by the ‘dictorg’ handler. The handler is passed one argument, database=en-de:

database {
        name "en-de";
        handler "dictorg database=en-de";
}

More directives are available to fine-tune the database.

Database: description string

Supplies a short description, to be shown in reply to the SHOW DB command. The string may not contain newlines.

Use this statement if the database itself does not supply a description, or if its description is malformed.

In any case, if the description directive is specified, its value takes precedence over the description string retrieved from the database itself.

See SHOW DB, for a description of SHOW DB command.

Database: info string

Supplies a full description of the database. This description is shown in reply to SHOW INFO (see SHOW INFO) command. The string is usually a multi-line text, so it is common to use here-document syntax (see here-document), e.g.:

info <<- EOT
   This is a foo-bar dictionary.
   Copyright (C) 2008 foo-bar dict group.
   Distributed under the terms of GNU Free
   Documentation license.
EOT;

Use this statement if the database itself does not supply a full description, or if its full description is malformed.

As with description, the value of info takes precedence over info strings retrieved from the database.

The following two directives control the content type and transfer encoding used when formatting replies from this database if OPTION MIME (see OPTION MIME) is in effect:

Database: mime-headers multiline-string

Defines the headers to be sent with the replies from this database. Argument is a here-document (see here-document), containing the headers to be sent with each dictionary entry, if the client sent the ‘OPTION MIME’ command. By default dicod uses MIME headers defined in the database itself. Use this statement if these are not defined, or if you want to override them. In this case you would want to include at least the ‘Content-Type’ and ‘Content-Transfer-Encoding’ headers, as shown in the example below:

database {
   name "foo";
   handler "dictorg";
   mime-headers <<- EOT
     Content-Type: text/html; charset=utf-8
     Content-Transfer-Encoding: 8bit
   EOT;     
   ...
}

Valid values for the ‘Content-Transfer-Encoding’ header are:

8bit: The content will be transferred as is.
quoted-printable: Non-printable characters will be encoded using the ‘quoted-printable’ encoding.
base64: Non-printable characters will be encoded using the ‘base64’ encoding.

4.3.12.1 Database Visibility

A property called database visibility is associated with each dictionary database. It determines whether the database appears in the output of the SHOW DB command and takes part in dictionary searches.

By default, all databases are defined as publicly visible. You can hide a database permanently by using the ‘visible no’ statement in its definition. You can also limit its visibility on a global as well as on a per-database basis. This can be achieved using visibility ACLs.

In general, the visibility of a database is controlled by two access control lists: a global visibility ACL and a database visibility ACL. The latter takes precedence over the former.

Both ACLs are defined using the visibility-acl statement:

Configuration: visibility-acl acl-name: Sets name of the ACL that controls the database visibility. When used in global scope, this statement sets the global visibility ACL. If used within a database block, it sets the visibility ACL for that particular database.

Consider the following example:

acl glob-vis {
  allow authenticated;
  deny all;
}  

acl local-nets {
  allow from (192.168.10.0/24, /tmp/dicod.sock);
}

visibility-acl glob-vis;

database {
  name "terms";
  visibility-acl local-nets;
}

In this configuration, the ‘terms’ database is visible to everybody coming from the ‘192.168.10.0/24’ network and from the UNIX socket /tmp/dicod.sock, without authentication. It is not visible to users coming from elsewhere, unless they authenticate themselves.
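The interplay of the two ACLs in this example can be modeled by the sketch below. It is a simplified illustration, not the dicod ACL engine: an ACL is represented as a list of (action, predicate) pairs, the first matching rule decides, and an ACL in which no rule matches is assumed to pass the decision on to the next one. All names are hypothetical.

```python
import ipaddress

def evaluate(acl, client):
    """Return True/False for the first matching rule, None if none matches."""
    for action, predicate in acl:
        if predicate(client):
            return action == "allow"
    return None

def is_visible(client, global_acl=None, database_acl=None):
    # The database ACL is consulted first: it takes precedence.
    for acl in (database_acl, global_acl):
        if acl is not None:
            verdict = evaluate(acl, client)
            if verdict is not None:
                return verdict
    return True  # databases are visible by default

LOCAL_NET = ipaddress.ip_network("192.168.10.0/24")

def from_local(client):
    """Match the 192.168.10.0/24 network or the UNIX socket."""
    addr = client["addr"]
    if addr == "/tmp/dicod.sock":
        return True
    try:
        return ipaddress.ip_address(addr) in LOCAL_NET
    except ValueError:
        return False

# Counterparts of the glob-vis and local-nets ACLs from the example.
glob_vis = [("allow", lambda c: c["authenticated"]), ("deny", lambda c: True)]
local_nets = [("allow", from_local)]
```

Under these assumptions, an unauthenticated client at 192.168.10.5 sees the ‘terms’ database, whereas an unauthenticated client at 10.0.0.1 does not.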

4.3.12.2 Virtual Databases

A virtual database is a collection of several regular databases. When a search is performed on a virtual database, it returns matches from the constituent databases.

Virtual databases can be used for grouping. For example a virtual database may include all dictionaries translating from English to Norwegian. Another one may include thesauri for English.

Yet another common use for virtual databases is to select different output markup depending on whether ‘OPTION MIME’ was requested by the user.

Technically, a virtual database is defined by specifying

  handler "virtual";

in the database definition. This is a built-in module, so you must not use the load-module statement.

The names of the member databases (the databases to be included in this one) are supplied using the database statements:

Database: database name [mime | nomime]

Specifies the database to be included as a member of this virtual database. The name argument supplies the name of the database (as set by the name statement in its definition).

The optional second argument may be used to restrict the use of this database to the given state of the ‘MIME’ option. Databases marked with ‘mime’ will be used only if the OPTION MIME command has been given for the current session. Databases marked with ‘nomime’ will be used only if this command has not been issued.

The following example defines a virtual database for translations from English to several other languages:

database {
  name "English Translating Database";
  info "Translations from English to other languages";
  handler "virtual";
  database "en-sw";
  database "en-no";
  database "en-pl";
}

It is assumed that the databases ‘en-sw’, ‘en-no’, and ‘en-pl’ are defined elsewhere in the configuration.

Another example illustrates how to define a database that will select the format of the articles depending on whether the client requests MIME output. Suppose that the configuration defines two dictionaries: ‘thes_plain’, with a thesaurus formatted in plaintext, and ‘thes_html’, with the same thesaurus, but formatted in HTML. The following database will return plaintext responses by default and HTML responses after the OPTION MIME command:

database {
  name "thesaurus";
  handler "virtual";
  database thes_plain nomime;
  database thes_html mime;
}
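The way the ‘mime’ and ‘nomime’ markers select members can be sketched as follows. This is an illustration only. The function name and the data layout (a list of name and marker pairs) are hypothetical.

```python
def active_members(members, mime_enabled):
    """Return names of the member databases used in the current session.

    members      -- list of (name, marker); marker is "mime", "nomime" or None
    mime_enabled -- True if OPTION MIME has been issued in this session
    """
    wanted = "mime" if mime_enabled else "nomime"
    return [name for name, marker in members
            if marker is None or marker == wanted]

# Members of the "thesaurus" database defined above.
thesaurus = [("thes_plain", "nomime"), ("thes_html", "mime")]
```

Members declared without a marker are used in both states.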

Notice that in this case it makes sense to define the member databases as invisible, to avoid duplicate matches. E.g.:

database {
  name "thes_plain";
  visible no;
  ...
}  
database {
  name "thes_html";
  visible no;
  ...
}

To determine the description (whether short or long) of a virtual database, the following algorithm is used. If the ‘description’ (or, for the long description, ‘info’) statement is present in the ‘database’ block, its value is used. Otherwise, the server obtains the descriptions of each member database that is visible in the current ‘OPTION MIME’ state. If all databases return the same value, it is used. Otherwise, an empty string is used.

Practically, that means that when defining a collection virtual database (as in the first example above), you are better off supplying both ‘description’ and ‘info’ statements.

On the other hand, when defining a mime-switching virtual database with two members (as in the second example), you can safely omit both statements: dicod will pick the value from the currently active member database.
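The algorithm for choosing the description can be written down as the following sketch. The function name is hypothetical. The second argument is assumed to hold the descriptions of only those member databases that are active in the current ‘OPTION MIME’ state.

```python
def virtual_description(explicit, member_descriptions):
    """Pick the description (or info) text for a virtual database.

    explicit            -- value of the description/info statement, or None
    member_descriptions -- descriptions of the currently active members
    """
    if explicit is not None:
        return explicit            # the statement in the block wins
    distinct = set(member_descriptions)
    if len(distinct) == 1:
        return distinct.pop()      # all members agree
    return ""                      # members disagree, or there are none
```

With a single active member, as in the mime-switching case, the result is always that member's description. With several members carrying different descriptions, the result is empty unless the statement is given explicitly.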

4.3.13 Strategies and Default Searches

A default search is a MATCH request with ‘*’ or ‘!’ as the database argument (see MATCH). The former means search in all available databases, the latter means search in all databases until a match is found.

Default searches may be quite expensive and may cause considerable strain on the server. For example, the command MATCH * prefix "" returns all entries from all available databases, which would consume a lot of resources both on the server and on the client side.

To minimize harmful effects from such potentially dangerous requests, it is possible to limit the use of certain strategies in default searches.

Configuration: strategy name { statements }: Restricts the use of the strategy name in default searches.

The statements define conditions under which the request is denied. They are tested against the search word, i.e. the fourth token of a MATCH command, counting the command name itself. The following statements are defined:

Configuration: deny-all bool: Unconditionally deny the use of this strategy in default searches.

Configuration: deny-word list: Deny this strategy if the search word matches one of the words from list.

Configuration: deny-length-lt number: Deny if length of the search word is less than number.

Configuration: deny-length-le number: Deny if length of the search word is less than or equal to number.

Configuration: deny-length-gt number: Deny if length of the search word is greater than number.

Configuration: deny-length-ge number: Deny if length of the search word is greater than or equal to number.

Configuration: deny-length-eq number: Deny if length of the search word is equal to number.

Configuration: deny-length-ne number: Deny if length of the search word is not equal to number.

For example, the following statement denies the use of ‘prefix’ strategy in default searches if its argument is an empty string:

strategy prefix {
  deny-length-eq 0;
}

If the dicod daemon is configured this way, it will always return a ‘552’ reply on commands MATCH * prefix "" or MATCH ! prefix "". However, the use of empty prefix on a concrete database, as in MATCH eng-deu prefix "", will still be allowed.
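The deny conditions can be modeled by the sketch below. It is an illustration, not the dicod implementation. The restrictions are represented as a dictionary keyed by statement name, the word length is counted in characters (an assumption), and the function names are hypothetical.

```python
import operator

# Comparison applied as: test(len(word), number)
LENGTH_TESTS = {
    "deny-length-lt": operator.lt,
    "deny-length-le": operator.le,
    "deny-length-gt": operator.gt,
    "deny-length-ge": operator.ge,
    "deny-length-eq": operator.eq,
    "deny-length-ne": operator.ne,
}

def is_denied(word, restrictions):
    """True if any condition configured for the strategy matches the word."""
    if restrictions.get("deny-all"):
        return True
    if word in restrictions.get("deny-word", ()):
        return True
    return any(test(len(word), restrictions[stmt])
               for stmt, test in LENGTH_TESTS.items()
               if stmt in restrictions)

def match_allowed(database, word, restrictions):
    """Restrictions apply only to default searches ('*' or '!')."""
    if database in ("*", "!"):
        return not is_denied(word, restrictions)
    return True

prefix_rules = {"deny-length-eq": 0}  # the 'strategy prefix' block above
```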

4.3.14 Tuning

While tuning your server, it is often necessary to get timing information which shows how much time is spent serving certain requests. This can be achieved using the timing configuration directive:

Configuration: timing boolean

Provide timing information after successful completion of an operation. This information is displayed after the following requests: MATCH, DEFINE, and QUIT. It consists of the following parts:

[d/m/c = nd/nm/nc RTr UTu STs]

where:

nd: Number of processed define requests. It is ‘0’ after a MATCH request.
nm: Number of processed match requests. It is ‘0’ after a DEFINE request.
nc: Number of comparisons made. This value may be inaccurate if the underlying database module is not able to count comparisons.
RT: Real time spent serving the request.
UT: Time in user space spent serving the request.
ST: Time in kernel space spent serving the request.

An example of a server reply with timing information follows:

250 Done [d/m/c = 0/63/107265 2.293r 1.120u 0.010s]

You can also add timing information to your access log files, see %T.
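When collecting statistics from a test client, the timing suffix can be extracted from the reply with a regular expression. The following sketch relies only on the layout shown above. The function name is hypothetical.

```python
import re

# [d/m/c = nd/nm/nc RTr UTu STs]
TIMING = re.compile(
    r"\[d/m/c = (\d+)/(\d+)/(\d+) ([\d.]+)r ([\d.]+)u ([\d.]+)s\]"
)

def parse_timing(reply):
    """Extract timing data from a server reply, or return None."""
    m = TIMING.search(reply)
    if m is None:
        return None
    nd, nm, nc = (int(x) for x in m.group(1, 2, 3))
    real, user, system = (float(x) for x in m.group(4, 5, 6))
    return {"defines": nd, "matches": nm, "comparisons": nc,
            "real": real, "user": user, "system": system}

info = parse_timing("250 Done [d/m/c = 0/63/107265 2.293r 1.120u 0.010s]")
```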

4.3.15 Command Aliases

Aliases allow a string to be substituted for a word when it is used as the first word of a command. The daemon maintains a list of aliases that are created using the alias configuration file statement:

Configuration: alias word command: Creates a new alias.

Aliases facilitate manual interaction with the server, as they allow the administrator to create abbreviations for frequently typed commands. For example, the following alias creates a new command, d, which is equivalent to DEFINE *:

alias d DEFINE "*";

Aliases may be recursive, i.e. the first word of command may refer to another alias. For example:

alias d DEFINE;
alias da d "*";

This configuration will produce the following expansion:

da word ⇒ DEFINE * word

To prevent endless loops, recursive expansion is stopped if the first word of the replacement text is identical to an alias expanded earlier.
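The expansion rule, including the loop check, can be sketched as follows. This is an illustration rather than the actual dicod code. Aliases are kept in a dictionary mapping each alias to its replacement words, matching is assumed to be case-sensitive, and the function name is hypothetical.

```python
def expand_aliases(line, aliases):
    """Expand the first word of a command line, following nested aliases."""
    words = line.split()
    expanded = set()  # aliases already expanded for this command
    while words and words[0] in aliases and words[0] not in expanded:
        alias = words[0]
        expanded.add(alias)
        # Substitute the replacement text for the first word only.
        words = aliases[alias] + words[1:]
    return words

# Counterpart of: alias d DEFINE;  alias da d "*";
aliases = {"d": ["DEFINE"], "da": ["d", "*"]}
result = expand_aliases("da word", aliases)
```

The set of already expanded aliases guarantees termination: each alias is expanded at most once per command, so two aliases that refer to each other cannot loop forever.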

4.3.16 Using Preprocessor to Improve the Configuration.

Before parsing its configuration file, dicod preprocesses it. The built-in preprocessor handles only file inclusion and #line statements (see Pragmatic Comments), while the rest of traditional preprocessing facilities, such as macro expansion, is supported via m4, which is used as an external preprocessor.

A detailed description of m4 facilities lies far beyond the scope of this document. You will find a complete user manual at http://www.gnu.org/software/m4/manual. For the rest of this subsection we assume the reader is sufficiently acquainted with the m4 macro processor.

The external preprocessor is invoked with the -s flag, instructing it to include line synchronization information in its output. This information is then used by the parser to display meaningful diagnostics. An initial set of macro definitions is supplied by the pp-setup file, located in the $prefix/share/dico/version/include directory (where version means the version of the GNU Dico package).

The default pp-setup file changes quote characters to ‘[’ and ‘]’, and renames all m4 built-in macros so they all start with the prefix ‘m4_’. The latter has an effect similar to the GNU m4 --prefix-builtins option, but has the advantage that it works with non-GNU m4 implementations as well.

As an example of how the preprocessor may improve a dicod configuration, consider the following fragment taken from one of the installations of GNU Dico. This installation offers quite a few Freedict dictionaries. The database definition for each of them is almost the same, except for the dictionary name and an optional description entry for the databases that lack one. To avoid repeating the same text over and over, we define the following macro:

# defdb(NAME[, DESCR])
# Produce a standard definition for a database NAME.
# If DESCR is given, use it as a description.
m4_define([defdb], [
database {
        name "$1";
        handler "dictorg database=$1";m4_dnl
m4_ifelse([$2],,,[
        description "$2";])
}
])

It takes two arguments. The first one, NAME, defines the dictionary name visible in the output of the SHOW DB command. The optional second argument may be used to supply a description string for databases that lack one.

Given this macro, the database definitions look like:

defdb(eng-swa)
defdb(swa-eng)
defdb(afr-eng, Afrikaans-English Dictionary)
defdb(eng-afr, English-Afrikaans Dictionary)

This document was generated on September 4, 2020 using makeinfo.

Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.
