Dico
GNU Dictionary Server
Sergey Poznyakoff
4.3 Configuration
Upon startup, dicod reads its settings and database definitions from the configuration file dicod.conf. By default it is located in $sysconfdir (in most cases /usr/local/etc or /etc), but an alternative location may be specified using the --config command line option (see --config).
If any errors are encountered in the configuration file, the program reports them on the standard error and exits with a non-zero status.
To test the configuration file without starting the server, use the --lint (-t) command line option. It causes dicod to check its configuration file and exit with status 0 if no errors were detected, and with status 1 otherwise.
Before parsing, the configuration file is preprocessed using m4 (see Preprocessor). To examine the preprocessed configuration without actually parsing it, use the -E command line option. To avoid preprocessing it, use the --no-preprocessor option.
The rest of this section describes configuration file syntax in detail. You can get a concise summary of all configuration directives at any time by running dicod --config-help.
4.3.1 Configuration File Syntax
A dicod configuration consists of statements and comments.
There are three classes of lexical tokens: keywords, values, and separators. Blanks, tabs, newlines and comments, collectively called white space, are ignored except as they serve to separate tokens. Some white space is required to separate otherwise adjacent keywords and values.
4.3.1.1 Comments
Comments may appear anywhere where white space may appear in the configuration file. There are two kinds of comments: single-line and multi-line comments. Single-line comments start with ‘#’ or ‘//’ and continue to the end of the line:
# This is a comment
// This too is a comment
Multi-line or C-style comments start with the two characters ‘/*’ (slash, star) and continue until the first occurrence of ‘*/’ (star, slash).
Multi-line comments cannot be nested.
4.3.1.2 Pragmatic Comments
Pragmatic comments are similar to usual comments, except that they cause some changes in the way the configuration is parsed. Pragmatic comments begin with a ‘#’ sign and end with the next physical newline character. As of GNU Dico version 2.10, the following pragmatic comments are understood:
#include <file>
#include file
Include the contents of the file. If file is an absolute file name, both forms are equivalent. Otherwise, the form with angle brackets searches for the file in the include search path, while the second one looks for it in the current working directory first, and, if not found there, in the include search path.
The default include search path is:
- prefix/share/dico/2.10/include
- prefix/share/dico/include
where prefix is the installation prefix.
New directories can be added in front of it using the -I (--include-dir) command line option (see --include-dir).
#include_once <file>
#include_once file
Same as #include, except that, if the file has already been included, it will not be included again.
#line num
#line num "file"
This line causes dicod to believe, for purposes of error diagnostics, that the line number of the next source line is given by num and the current input file is named by file. If the latter is absent, the remembered file name does not change.
# num "file"
This is a special form of the #line statement, understood for compatibility with the C preprocessor.
In fact, these statements provide rudimentary preprocessing features. For more sophisticated ways to modify the configuration before parsing, see Preprocessor.
4.3.1.3 Statements
A simple statement consists of a keyword and a value separated by any amount of whitespace. It is terminated with a semicolon (‘;’), unless the value is a here-document (see below), in which case semicolon is optional.
Examples of simple statements:
timing yes;
access-log-file /var/log/access_log;
A keyword begins with a letter and may contain letters, decimal digits, underscores (‘_’) and dashes (‘-’). Examples of keywords are: ‘group’, ‘identity-check’.
A value can be one of the following:
- number
A number is a sequence of decimal digits.
- boolean
A boolean value is one of the following: ‘yes’, ‘true’, ‘t’ or ‘1’, meaning true, and ‘no’, ‘false’, ‘nil’, ‘0’ meaning false.
- unquoted string
An unquoted string may contain letters, digits, and any of the following characters: ‘_’, ‘-’, ‘.’, ‘/’, ‘@’, ‘*’, ‘:’.
- quoted string
A quoted string is any sequence of characters enclosed in double-quotes (‘"’). A backslash appearing within a quoted string introduces an escape sequence, which is replaced with a single character according to the following rules:
Sequence    Replaced with
\a          Audible bell character (ASCII 7)
\b          Backspace character (ASCII 8)
\f          Form-feed character (ASCII 12)
\n          Newline character (ASCII 10)
\r          Carriage return character (ASCII 13)
\t          Horizontal tabulation character (ASCII 9)
\v          Vertical tabulation character (ASCII 11)
\\          A single backslash (‘\’)
\"          A double-quote.
Table 4.1: Backslash escapes
In addition, the sequence ‘\newline’ is removed from the string. This allows you to split long strings over several physical lines, e.g.:
"a long string may be\
 split over several lines"
If the character following a backslash is not one of those specified above, the backslash is ignored and a warning is issued.
Two or more adjacent quoted strings are concatenated, which gives another way to split long strings over several lines to improve readability. For instance, the following fragment produces the same result as the example above:
"a long string may be"
" split over several lines"
- Here-document
A here-document is a special construct that allows the user to introduce strings of text containing embedded newlines.
The <<word construct instructs the parser to read all the following lines up to the line containing only word, with possible trailing blanks. Any lines thus read are concatenated together into a single string. For example:
<<EOT
A multiline
string
EOT
The body of a here-document is interpreted the same way as a double-quoted string, unless word is preceded by a backslash (e.g. ‘<<\EOT’) or enclosed in double-quotes, in which case the text is read as is, without interpretation of escape sequences.
If word is prefixed with - (a dash), then all leading tab characters are stripped from input lines and the line containing word. Furthermore, if - is followed by a single space, all leading whitespace is stripped from them. This allows for indenting here-documents in a natural fashion. For example:
<<- TEXT
    All leading whitespace will be
    ignored when reading these lines.
    TEXT
It is important that the terminating delimiter be the only token on its line. The only exception to this rule is allowed if a here-document appears as the last element of a statement. In this case a semicolon can be placed on the same line with its terminating delimiter, as in:
help-text <<-EOT
A sample help text.
EOT;
- list
A list is a comma-separated sequence of values. Lists are delimited by parentheses. The following example shows a statement whose value is a list of strings:
capability (mime,auth);
In any case where a list is appropriate, a single value is allowed without being a member of a list: it is equivalent to a list whose only member is that value. This means that, e.g. ‘capability mime;’ is equivalent to ‘capability (mime);’.
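The backslash rules for quoted strings (Table 4.1) can be illustrated with a small sketch. This is not dicod's parser: the names `ESCAPES` and `unescape` are hypothetical, and the warning that dicod issues for an unknown escape is only noted in a comment.

```python
# Hypothetical sketch of the quoted-string escape rules from Table 4.1.
ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r",
    "t": "\t", "v": "\v", "\\": "\\", '"': '"',
}


def unescape(body):
    """Decode the text found between the double quotes of a quoted string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt == "\n":
                pass  # backslash-newline is removed from the string
            elif nxt in ESCAPES:
                out.append(ESCAPES[nxt])
            else:
                # Unknown escape: the backslash is ignored
                # (dicod additionally issues a warning here).
                out.append(nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For instance, `unescape("a long string may be\\\n split over several lines")` yields the single-line string from the example above.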
A block statement introduces a logical group of other statements. It consists of a keyword, followed by an optional value, and a sequence of statements enclosed in curly braces, as shown in the example below:
load-module outline {
    command "outline";
}
The closing curly brace may be followed by a semicolon, although this is not required.
4.3.2 Server Settings
Server settings control how dicod is executed on the server machine.
- Configuration: user string
Run with the privileges of this user. Dicod does not require root privileges, so it is recommended to always use this statement when running dicod in daemon mode (see Daemon Mode). The argument is either a user name, or UID prefixed with a plus sign.
Example:
user nobody;
- Configuration: group list
If user is given, dicod will drop all supplementary groups and switch to the principal group of that user. Sometimes, however, it may be necessary to retain one or more supplementary groups. For example, this might be necessary to access dictionary databases. The group statement retains the supplementary groups listed in list. Each group can be specified either by its name or by its GID number, prefixed with ‘+’, e.g.:
user nobody;
group (man, dict, +88);
This statement is ignored if the user statement is not present or if dicod is running in inetd mode. See Inetd Mode.
- Configuration: mode enum
Sets server operation mode. The argument is one of:
- daemon
Run in daemon mode. See Daemon Mode, for a detailed description.
- inetd
Run in inetd mode. See Inetd Mode, for a detailed description.
This statement is overridden by the --inetd command line option. See --inetd.
- Configuration: listen list;
Specify the IP addresses and ports to listen on in daemon mode. By default, dicod will listen on port 2628 on all existing interfaces. Use the listen statement to restrict the list of interfaces to listen on, or to change the port number.
Elements of list can have the following forms:
- host:port
Specifies an IP (version 4 or 6) socket to listen on. The host part is either an IPv4 address in “dotted-quad” notation, or an IPv6 address in square brackets, or a host name. In the latter case, dicod will listen on all IP addresses corresponding to its ‘A’ or ‘AAAA’ DNS records.
The port part is either a numeric port number or a symbolic service name which is found in the /etc/services file.
Either of the two parts may be omitted. If host is omitted, dicod will listen on all interfaces. If port is omitted, it defaults to 2628. In this case the colon may be omitted, too.
Examples:
listen dict.example.org:2628;
listen 198.51.100.10;
listen [2001:DB8::11];
listen :2628;
- inet://host:port
- inet4://host:port
Listen on IPv4 socket. The host is either an IP address or a host name. In the latter case, dicod will start listening on all IP addresses from the ‘A’ records for this host.
Either host or port (but not both) can be omitted. Missing host defaults to IPv4 addresses on all available network interfaces, and missing port defaults to 2628.
Example:
listen inet4://198.51.100.10;
- inet6://host:port
Listen on IPv6 socket. The host is either an IPv6 address in square brackets, or a host name. In the latter case, dicod will start listening on all IP addresses from the ‘AAAA’ records for this host.
Either host or port (but not both) can be omitted. Missing host defaults to IPv6 addresses on all available network interfaces, and missing port defaults to 2628.
Example:
listen inet6://[2001:DB8::11];
- filename
- unix://filename
Specifies the name of a UNIX socket to listen on. Filename must be an absolute file name of the socket.
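The defaulting rules for the plain host:port form can be sketched as follows. This is an illustration only, not dicod's parser: `parse_listen` and `DEFAULT_PORT` are hypothetical names, only numeric ports are handled (symbolic service names are not looked up), and the inet://, inet4://, inet6:// and UNIX socket forms are deliberately left out.

```python
DEFAULT_PORT = 2628  # default DICT port, as stated above


def parse_listen(spec):
    """Split a plain host:port listen argument into (host, port).

    A host of None stands for "all interfaces". Hypothetical helper that
    covers only the host:port form with a numeric port.
    """
    if spec.startswith("["):
        # IPv6 address in square brackets, optionally followed by :port
        end = spec.index("]")
        host = spec[1:end]
        rest = spec[end + 1:]
        port = rest[1:] if rest.startswith(":") else ""
    elif ":" in spec:
        host, _, port = spec.partition(":")
    else:
        # No colon at all: the port is omitted
        host, port = spec, ""
    return (host or None, int(port) if port else DEFAULT_PORT)
```

Applied to the four host:port examples above, the sketch returns the host (or None) paired with port 2628 in every case.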
- Configuration: pidfile string
Store PID of the master process in this file. Default is localstatedir/run/dicod.pid. Notice that the access bits of this default directory may be insufficient for dicod to write there after dropping root privileges (see user statement). One solution to this is to create a subdirectory with the same owner as given by the user statement and to point the PID file there:
pidfile /var/run/dict/dicod.pid;
Another solution is to make the PID directory group-writable and to add the owner group to the group statement (see group statement).
- Configuration: max-children number
Sets maximum number of sub-processes that can run simultaneously. This is equivalent to the number of clients that can simultaneously use the server. The default is 64 sub-processes.
- Configuration: inactivity-timeout number
Set inactivity timeout to the number of seconds. The server disconnects automatically if the remote client has not sent any command within this number of seconds. Setting timeout to 0 disables inactivity timeout (the default).
This statement, along with max-children, allows you to control the server load.
- Configuration: shutdown-timeout number
When the master server is shutting down, wait this number of seconds for all children to terminate. Default is 5 seconds.
- Configuration: identity-check boolean
Enable identification check using AUTH protocol (RFC 1413). The received user name or UID can be shown in access log using the %l conversion (see Access Log).
- Configuration: ident-keyfile string
Use encryption keys from the named file to decrypt AUTH replies encrypted using DES.
- Configuration: ident-timeout number
Set timeout for AUTH input/output operation to number of seconds. Default timeout is 3 seconds.
4.3.3 Authentication
The server may be configured to request authentication in order to make some databases or some additional information available to the user. Another possible use of authentication is to minimize resource utilization on the server machine.
GNU Dico supports two types of authentication: the traditional APOP-style authentication (see AUTH) and a more advanced SASL authentication. The latter is described separately, see SASL.
Authentication setup is simple: first, you define a user authentication database, then you enable it by declaring the auth server capability (see Capabilities):
capability auth;
The user authentication database keeps, for each user name, the corresponding plain text password and, optionally, the names of the groups this user belongs to. Notice that due to the specifics of the DICT authentication scheme (see AUTH), user passwords are stored in plain text, therefore special care must be taken to protect the contents of your authentication database from compromise.
The database is defined using the user-db block statement:
- Configuration: user-db url
Declare user authentication database.
Dico’s authentication is designed so that various authentication database formats can easily be added. A database is identified by its URL, or Uniform Resource Locator. It consists of the following parts (square brackets denoting optional ones):
type://[[user[:password]@]host]/path[params]
- type
A database type, or format. See below for a list of available database formats.
- user
User name necessary to access the database.
- password
User password necessary to access the database.
- host
Domain name or IP address of a machine running the database.
- path
A path to the database. The exact meaning of this element depends on the database protocol. It is described in detail when discussing the particular database protocols.
- params
A list of protocol-dependent parameters. Each parameter is of the form keyword=name; multiple parameters are separated with semicolons.
If the underlying mechanism requires some additional configuration data that cannot be supplied in a URL, these are passed to it using the following statement:
- user-db conf: options string
The argument is treated as an opaque string and passed to the authentication ‘open’ procedure verbatim. Its exact meaning depends on the type of the database.
The URL defines how the database is accessed. Another important point is where to get the user data from. This is specified by the following two sub-statements:
- user-db conf: password-resource arg
A database resource which returns the user’s password.
- user-db conf: group-resource arg
A database resource which returns the list of groups this user is a member of.
The exact semantics of the database resource depends on the type of database being used. For flat text databases, it means the name of a text file that contains these data, for SQL databases, the resource is an SQL query, etc. Below we will discuss URLs and resources used by each database type.
To summarize, the authentication database is defined as:
# Define user database for authentication.
user-db url {
    # Additional configuration options.
    options string;
    # Name of a password resource.
    password-resource resource;
    # Name of the resource returning user group information.
    group-resource resource;
}
4.3.3.1 Text Authentication Database
A text authentication database consists of one or two flat text files — a password file, which contains user passwords, and a group file, which contains user groups. The latter is optional. Both files have the same format:
- Empty lines are ignored.
- Any text from ‘#’ to the end of the line is ignored.
- Non-empty lines consist of two fields, separated by any amount of white space. The first field is the user name. It serves as a search key in the database. The second field is the requested resource.
Record keys in a password file must be unique, i.e. no two records may contain the same first field. The group file may contain multiple records with the same key. For example:
$ grep smith pass
smith guessme
$ grep smith group
smith user
smith timing
smith tester
This means that user ‘smith’ has password ‘guessme’ and is a member of three groups: ‘user’, ‘timing’ and ‘tester’.
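The file format described above is simple enough to sketch a reader for it. The function name `parse_text_db` is hypothetical and the code is only an illustration of the three format rules, not the way dicod itself reads these files; in particular, skipping malformed lines silently is an assumption of this sketch.

```python
def parse_text_db(text):
    """Parse the contents of a password or group file.

    Returns a dict mapping each user name to the list of resources
    (passwords or group names) recorded for it. Hypothetical helper.
    """
    records = {}
    for line in text.splitlines():
        # Any text from '#' to the end of the line is ignored.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # empty lines are ignored
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # assumption: malformed lines are skipped
        key, value = fields
        records.setdefault(key, []).append(value)
    return records
```

For a password file every list has exactly one element, since keys must be unique there; for the group file shown above the entry for ‘smith’ holds the three group names in file order.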
A URL of a text database begins with ‘text’ and contains only the path element, which gives the name of the directory where the database files reside. The name of a password file is given by the password-resource statement. The name of a group file is given by the group-resource statement.
For example, if user passwords are kept in the file passwd, user groups are kept in the file group, and both files reside in the /var/db/dico directory, then the appropriate database configuration will be:
user-db text:///var/db/dico {
    password-resource passwd;
    group-resource group;
}
4.3.3.2 LDAP Databases
To configure an LDAP user database, you first need to load the ‘ldap’ module (see LDAP module):
load-module ldap;
The URL of the database is: ‘ldap://host[:port]’, where host is the host name or IP address of the LDAP server, and optional port specifies the port number it is listening on (by default, port 389 is assumed).
The password-resource statement specifies the name of an attribute containing the password, and the group-resource statement supplies the name of the attribute with the group name.
Additional configuration data are supplied in the options statement, whose argument is a whitespace-separated list of assignments:
- base=base
Sets base DN.
- binddn=dn
Sets the DN to bind as.
- passwd=string
Sets the password.
- tls=bool
When set to ‘yes’, enables the use of TLS encryption.
- debug=number
Sets OpenLDAP debug level.
- user-filter=filter
An LDAP filter to select the objects describing this user. Any occurrence of ‘$user’ in filter is replaced with the actual user name, as obtained during the authentication. This variable expansion occurs much the same way as in shell. In particular, the variable is expanded only if it is not immediately followed by an alphanumeric character. For example, it occurs in:
(uid=$user)
and
(uid=$user.1)
But it does not occur in
(uid=$users)
If it is necessary to expand the variable in such a context, enclose its name in curly braces:
(uid=${user}s)
- group-filter=filter
An LDAP filter that selects the user groups. The filter is expanded as in user-filter.
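The ‘$user’ expansion rule described under user-filter can be sketched with a regular expression. The function name `expand_user` is hypothetical, and the sketch follows only the rule as stated in the text (no expansion when an alphanumeric character follows, braces to force it); it is not taken from the ldap module's source.

```python
import re

# ${user} always expands; $user expands only when the next character
# is not alphanumeric (or when it ends the string).
_USER_REF = re.compile(r"\$\{user\}|\$user(?![A-Za-z0-9])")


def expand_user(filter_text, user):
    """Replace $user and ${user} in an LDAP filter template."""
    # A function is used as the replacement so that backslashes in the
    # user name are not interpreted by re.sub.
    return _USER_REF.sub(lambda match: user, filter_text)
```

With the user name ‘smith’, this turns ‘(uid=$user)’ into ‘(uid=smith)’ and ‘(uid=${user}s)’ into ‘(uid=smiths)’, while ‘(uid=$users)’ is left unchanged.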
The following example shows an LDAP user database configured for base DN ‘example.com’ which uses ‘posixAccount’ and ‘posixGroup’ objects from ‘nis.schema’:
user-db "ldap://localhost" {
    password-resource userPassword;
    group-resource cn;
    options "user-filter=(uid=$user) "
            "group-filter=(&(objectClass=posixGroup)"
            "(memberuid=$user)) "
            "base=dc=example,dc=com";
}
A note on password usage is in order here. Most authentication methods require the passwords to be stored in the database in plain text form. The use of encrypted passwords (e.g. MD5 or SHA1) is possible only with ‘LOGIN’ and ‘PLAIN’ GSASL authentication methods.
4.3.4 SASL Authentication
The SASL authentication is available if the server was compiled with GNU SASL.
- Configuration: sasl { statements }
This block statement configures SASL authentication. The following is a short summary of its syntax and the available substatements:
sasl {
    # Disable SASL mechanisms listed in mech.
    disable-mechanism mech;
    # Enable SASL mechanisms listed in mech.
    enable-mechanism mech;
    # Set service name for GSSAPI and Kerberos.
    service name;
    # Set realm name for GSSAPI and Kerberos.
    realm name;
    # Define groups for anonymous users.
    anon-group group-list;
}
The list of available authentication mechanisms is configured using two statements:
- sasl: disable-mechanism mech
Disables SASL mechanisms listed in mech, which is a list of names.
- sasl: enable-mechanism mech
Enables SASL mechanisms listed in mech, which is a list of names.
The server builds a list of available mechanisms using the following algorithm. First, a list of implemented mechanisms is retrieved from the SASL library. If the enable-mechanism statement is defined, the resulting list is filtered so that only mechanisms listed in enable-mechanism remain. Further, if the disable-mechanism statement is defined, the names listed there are removed from the list.
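The algorithm above amounts to two successive filters. The following sketch restates it; the function name `available_mechanisms` and the exact (case-sensitive) name comparison are assumptions made for illustration, not details taken from dicod.

```python
def available_mechanisms(implemented, enable=None, disable=None):
    """Return the mechanisms offered to clients.

    implemented: names reported by the SASL library.
    enable/disable: values of enable-mechanism and disable-mechanism,
    or None when the corresponding statement is not defined.
    """
    result = list(implemented)
    if enable is not None:
        # Keep only the mechanisms listed in enable-mechanism.
        result = [m for m in result if m in enable]
    if disable is not None:
        # Remove the mechanisms listed in disable-mechanism.
        result = [m for m in result if m not in disable]
    return result
```

Note that a mechanism named in enable-mechanism but not implemented by the library never appears in the result, and that disable-mechanism is applied last, so it takes precedence over enable-mechanism.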
- sasl: service name
Sets the service name for GSSAPI and Kerberos mechanisms.
- sasl: realm name
Sets the realm name.
- sasl: anon-group list
Sets the list of user groups considered anonymous.
The database of user credentials depends on the authentication mechanism used. For GSSAPI or Kerberos it is managed by appropriate servers. Other mechanisms use the standard dicod user database configuration (see Authentication).
4.3.5 Access Control Lists
Access control lists, or ACLs for short, are lists of permissions that can be applied to certain dicod objects. They can be used to control who can connect to the dictionary server and what resources are offered to whom.
An ACL is defined using the acl block statement:
acl name { definitions }
The parameter name specifies a unique name for that ACL. This name will be used by other configuration statements to refer to that ACL (see Security Settings and Database Visibility).
The part between the curly braces (denoted by definitions above) is a list of access statements. There are two types of such statements:
- ACL: allow user-group sub-acl host-list
Allow access to resource.
- ACL: deny user-group sub-acl host-list
Deny access to resource.
All parts of an access statement are optional, but at least one of them must be present.
The user-group part specifies which users match this entry. Allowed values are the following:
all
All users.
authenticated
Only authenticated users.
group group-list
Authenticated users which are members of at least one of the groups listed in group-list.
The sub-acl part, if present, branches to another ACL. The syntax of this group is:
acl name
where name is the name of a previously defined ACL.
Finally, the host-list group matches client IP addresses. It consists of a from keyword followed by a list of address specifiers. Allowed address specifiers are:
any
Matches any client address.
- addr
Matches if the client IP equals addr. The latter may be given either as an IP address or as a host name, in which case it will be resolved and the first of its IP addresses will be used.
- addr/netlen
Matches if the first netlen bits of the client IP address equal addr. The network mask length, netlen, must be an integer in the range from 0 to 32 for IPv4, and from 0 to 128 for IPv6. The address part, addr, is as described above.
- addr/netmask
The specifier matches if the result of a logical AND between the client IP address and netmask equals addr. The network mask must be specified in IP address (either IPv4 or IPv6) notation.
- filename
Matches if connection was received from a UNIX socket filename, which must be given as an absolute file name.
To summarize, the syntax of an access statement is:
allow|deny [all|authenticated|group group-list] [acl name] [from addr-list]
where square brackets denote optional parts and vertical bar means ‘one of’.
When an ACL is applied to a particular object, its entries are tried in turn until one of them matches, or the end of the list is reached. If a matched entry is found, its command verb, allow or deny, defines the result of ACL match. If the end of list is reached, the result is ‘allow’, unless explicitly specified otherwise.
For example, the following statement defines an ACL named ‘common’, that allows access for any user connected via local UNIX socket /tmp/dicod.sock or coming from a local network ‘192.168.10.0/24’. Any authenticated users are allowed, provided that they are allowed by another ACL ‘my-nets’ (which should have been defined before this definition). Users coming from the network ‘10.10.0.0/24’ are allowed if they authenticate themselves and are members of groups ‘dicod’ or ‘users’. Anybody else is denied access:
acl common {
    allow all from ("/tmp/dicod.sock", "192.168.10.0/24");
    allow authenticated acl "my-nets";
    allow group ("dicod", "users") from "10.10.0.0/24";
    deny all;
}
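The first-match evaluation described above can be modelled in a few lines. This is a deliberately reduced sketch, not dicod's implementation: entries are plain dictionaries, the names `entry_matches`, `check_acl` and `COMMON` are hypothetical, and sub-ACLs, UNIX socket names and host name resolution are left out. Address matching relies on the standard `ipaddress` module, which accepts both the addr/netlen and the addr/netmask notations.

```python
import ipaddress


def entry_matches(entry, client_ip, authenticated, groups):
    """Check the user-group and host-list parts of one access statement."""
    who = entry.get("who", "all")
    if who == "authenticated" and not authenticated:
        return False
    if isinstance(who, (list, tuple)):  # models: group group-list
        if not authenticated or not set(who) & set(groups):
            return False
    nets = entry.get("from")
    if nets is not None:
        addr = ipaddress.ip_address(client_ip)
        if not any(addr in ipaddress.ip_network(net) for net in nets):
            return False
    return True


def check_acl(entries, client_ip, authenticated=False, groups=()):
    """Return True if access is allowed, False if it is denied."""
    for entry in entries:
        if entry_matches(entry, client_ip, authenticated, groups):
            return entry["verb"] == "allow"  # first matching entry decides
    return True  # end of list reached: the result is 'allow'


# Reduced model of the 'common' ACL above (socket and sub-ACL omitted).
COMMON = [
    {"verb": "allow", "who": "all", "from": ["192.168.10.0/24"]},
    {"verb": "allow", "who": ["dicod", "users"], "from": ["10.10.0.0/24"]},
    {"verb": "deny", "who": "all"},
]
```

Because the final entry is ‘deny all’, the default ‘allow’ at the end of the list is never reached for this ACL; dropping that entry would let every unmatched client in.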
See Security Settings, for information on how to control daemon security settings.
See Database Visibility, for a detailed description on how to use ACLs to control access to databases.
4.3.6 Security Settings
This subsection describes configuration settings that control access to various resources served by dicod.
- Configuration: connection-acl acl-name
Use ACL acl-name to control incoming connections. The ACL itself must be defined before this statement. Using user-group (see previous subsection) in this ACL makes no sense, because the authentication itself is performed only after the connection has been established.
acl incoming-conn {
    allow from 213.130.0.0/19;
    deny any;
}
connection-acl incoming-conn;
- Configuration: show-sys-info acl-name
This statement controls whether to show system information in reply to the SHOW SERVER command (see SHOW SERVER). The information will be shown only if ACL acl-name allows it.
The system information shown includes the following data: name of the package and its version, name of the system where it was built and the kernel version thereof, host name, total operational time of the daemon, number of subprocesses executed so far and average usage frequency. For example:
dicod (dico 2.10) on Linux 2.6.32, dict.example.net up 99+04:42:58, 19647 forks (686.9/hour)
4.3.7 Logging and Debugging
The directives described in this subsection provide basic logging capabilities.
- Configuration: log-tag string
Prefix syslog messages with this string. By default, the program name is used.
- Configuration: log-facility string
Sets the syslog facility to use. Allowed values are: ‘user’, ‘daemon’, ‘auth’, ‘authpriv’, ‘mail’, ‘cron’, ‘local0’ through ‘local7’ (case-insensitive), or a facility number.
- Configuration: log-print-severity boolean
Prefix diagnostics messages with a string identifying their severity.
- Configuration: transcript boolean
Controls the transcript of user sessions. If boolean is ‘true’, the transcript will be output to the logging channel. In the transcript, the lines received from client are prefixed with ‘C:’, while those sent in reply are marked with ‘S:’. Here is an excerpt from the transcript output:
S: 220 example.net dicod (dico 2.10) <mime.xversion>
<1645.1212874507@example.net>
C: client "Kdict"
S: 250 ok
C: show db
S: 110 16 databases present
S: afr-deu "Afrikaans-German Freedict dictionary"
S: afr-eng "Afrikaans-English FreeDict Dictionary"
[...]
S: .
S: 250 ok
(The first line is split in two to fit in the printed page width.) This option produces lots of output and can significantly slow down the server. Use it only if you are debugging dicod or some remote client. Never use it in a production environment.
4.3.8 Access Log
GNU Dico provides a feature similar to Apache’s CustomLog, which keeps a log of MATCH and DEFINE requests. To enable this feature, specify the name of the log file using the following directive:
- Configuration: access-log-file string
Sets access log file name.
access-log-file /var/log/dico/access.log;
The format of log file entries is defined via the access-log-format directive:
- Configuration: access-log-format string
Sets format string for access log file.
Its argument can contain literal characters, which are copied into the log file verbatim, and format specifiers, i.e. special sequences which begin with ‘%’ and are replaced in the log file as shown in the table below.
%%
The percent sign.
%a
Remote IP-address.
%A
Local IP-address.
%B
Size of response in bytes.
%b
Size of response in bytes in CLF format, i.e. a ‘-’ rather than a ‘0’ when no bytes are sent.
%C
Remote client (from the CLIENT command, see CLIENT).
%D
The time taken to serve the request, in microseconds.
%d
Request command verb in abbreviated form, suitable for use in URLs, i.e. ‘d’ for DEFINE, and ‘m’ for MATCH. See urls.
%h
Remote host.
%H
Request command verb (DEFINE or MATCH).
%l
Remote logname (from identd, if supplied). This will return a dash unless identity-check is set to true. See identity-check.
%m
The search strategy.
%p
The canonical port of the server serving the request.
%P
The PID of the child that served the request.
%q
The database from the request.
%r
Full request.
%{n}R
The nth token from the request (n is 0-based).
%s
Reply status. For multiple replies, the form ‘%s’ returns the status of the first reply, while ‘%>s’ returns that of the last reply.
%t
Time the request was received in the standard Apache format, e.g.:
[04/Jun/2008:11:05:22 +0300]
%{format}t
The time, in the form given by format, which should be a valid strftime format. See Time and Date Formats, for a detailed description.
The standard ‘%t’ format is equivalent to
[%d/%b/%Y:%H:%M:%S %z]
%T
The time taken to serve the request, in seconds.
%u
Remote user from AUTH command.
%v
The host name of the server serving the request. See hostname directive.
%V
Actual host name of the server (in case it was overridden in configuration).
%W
The word from the request.
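The stated equivalence between ‘%t’ and the strftime format ‘[%d/%b/%Y:%H:%M:%S %z]’ can be checked against the sample timestamp given for ‘%t’. The sketch below uses Python's `strftime`, which follows the same conversion specifiers; the constant name `APACHE_TIME_FORMAT` is hypothetical, and the month abbreviation assumes the default C locale.

```python
from datetime import datetime, timedelta, timezone

# The strftime format that the text gives as the equivalent of '%t'.
APACHE_TIME_FORMAT = "[%d/%b/%Y:%H:%M:%S %z]"

# 4 June 2008, 11:05:22, in a zone three hours ahead of UTC.
stamp = datetime(2008, 6, 4, 11, 5, 22, tzinfo=timezone(timedelta(hours=3)))
formatted = stamp.strftime(APACHE_TIME_FORMAT)
print(formatted)  # [04/Jun/2008:11:05:22 +0300] in the C locale
```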
For reference, here is the list of format specifiers that have a different meaning than in Apache: ‘%C’, ‘%H’, ‘%m’, ‘%q’. The following format specifiers are unique to dicod: ‘%d’, ‘%{n}R’, ‘%V’, ‘%W’.
The absence of the access-log-format directive is equivalent to the following statement:
access-log-format "%h %l %u %t \"%r\" %>s %b";
It was chosen so as to be compatible with Apache access logs and be easily parsable by existing log analyzing tools, such as webalizer.
Extending this format string with the client name produces a log format similar to Apache ‘combined log’:
access-log-format "%h %l %u %t \"%r\" %>s %b \"\" \"%C\"";
4.3.9 General Settings
Settings described in this subsection configure the basic behavior of the DICT daemon.
- Configuration: initial-banner-text string
Display the string in the textual part of the initial server reply.
When a connection is established, the server sends an initial reply to the client that looks like the example below:
220 example.org <auth.mime> <520.1212912026@example.org>
See Initial Reply, for a detailed description of its parts.
The part of this reply after the host name is modifiable and can contain arbitrary text. You can use initial-banner-text to append any additional information there. Note that string may not contain newlines or angle brackets. For example:
initial-banner-text "Please authenticate yourself,";
This statement produces the following initial reply (split over two lines for readability):
220 example.org Please authenticate yourself,
<auth.mime> <520.1212912026@example.org>
- Configuration: hostname string
Sets the hostname. By default, the server determines it automatically. If, however, it makes a wrong guess, you can fix it using this directive.
The server hostname is used, among others, in the initial reply after ‘220’ code (see above) and may also be displayed in the access log file using the ‘%v’ escape (see Access Log).
- Configuration: server-info string
Sets the server description to be shown in reply to the SHOW SERVER command (see SHOW SERVER).
The first line of the reply, after the usual ‘114’ response line, shows the name of the host where the server is running. If the settings of show-sys-info (see show-sys-info) permit, some additional information about the system is printed.
The lines that follow are taken from the server-info directive. It is common to specify string using “here-document” syntax (see here-document), e.g.:
server-info <<EOT
Welcome to the FOO dictionary service.
Contact <dict@foo.example.org> if you have questions or suggestions.
EOT;
- Configuration: help-text string
Sets the text to be displayed in reply to the HELP command.
The default reply to the HELP command displays a list of commands understood by the server with a short description of each.
If the string begins with a plus sign, it will be appended to the default reply:
help-text <<-EOT
    + The commands beginning with an X are extensions.
    EOT;
If the string begins with any character other than ‘+’, it will replace the default help output. For example:
help-text <<-EOT
    There is no help. See RFC 2229 for detailed information.
    EOT;
- Configuration: default-strategy string
Sets the name of the default matching strategy (see MATCH). By default, Levenshtein matching is used, which is equivalent to
default-strategy lev;
4.3.10 Server Capabilities
Capabilities are certain server features that can be enabled or disabled at the system administrator’s will.
- Configuration: capability list
Requests additional capabilities from the list.
The argument to the capability directive must contain names of existing dicod capabilities. These are listed in the following table:
- auth
The AUTH command is supported. See Authentication.
- mime
The OPTION MIME command is supported. Notice that RFC 2229 requires all servers to support that command, so you should always specify this capability.
- xversion
The XVERSION command is supported. It is a GNU extension that displays the dicod implementation and version number. See XVERSION.
- xlev
The XLEV command is supported. This command allows the remote party to set and query the maximal Levenshtein distance for the lev matching strategy. See strategy. See XLEV.
The capabilities set using this directive are displayed in the initial server reply (see initial reply), and their descriptions are added to the HELP command output (unless specified otherwise by the help-text statement).
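For illustration, a configuration that enables every capability from the table might look as follows. The parenthesized list form is the one this manual uses for other list-valued directives; whether all four capabilities are appropriate is a site decision:

```
# Enable all four capabilities described in the table.
capability (auth,mime,xversion,xlev);
```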
4.3.11 Database Modules and Handlers
A database module is an external piece of software designed to handle a particular format of dictionary databases. This piece of software is built as a shared library that dicod loads at run time.
A handler is an instance of the database module loaded by dicod and configured for a specific database or a set of databases.
Database handlers are defined using the following block statement:
- Configuration: load-module string { … }
Create an instance of a database module. The argument specifies a unique name which will be used by subsequent parts of the configuration to refer to this handler. The ellipsis in the description above represents sub-statements. As of Dico version 2.10 only one sub-statement is defined:
- load-module config: command string
Sets the command line for this handler. It is similar to a shell command line in that it consists of the name of a database module, optionally followed by a whitespace-separated list of its arguments. The name of the module specifies the disk file to load (see below for a detailed description of the loading sequence). Both the command name and the arguments are passed to the module initialization function (see dico_init).
For example:
load-module dict { command "dictorg dbdir=/var/dicodb"; }
This statement defines a handler named ‘dict’, which loads the module dictorg and passes its initialization function a single argument, ‘dbdir=/var/dicodb’. If the module name is not an absolute file name, as in this example, the loadable module will be searched for in the module load path.
A common case is when the module does not require initialization arguments and its command string is the same as its name, e.g.:
load-module outline { command "outline"; }
The configuration syntax provides a shortcut for such usage:
load-module outline;
If load-module is used this way, it accepts a single string or a list of strings as its argument. In the latter case, it loads all modules listed in the argument. For example:
load-module (stratall,substr,word);
A module load path is an internal list of directories which dicod scans in order to find a loadable file name specified in the command statement. By default the search order is as follows:
- Optional prefix search directories specified by the prepend-load-path directive (see below) and the --load-dir (-L) command line option.
- GNU Dico module directory: $prefix/lib/dico.
- Additional search directories specified by the module-load-path directive (see below).
- The value of the environment variable LTDL_LIBRARY_PATH.
- The system dependent library search path (e.g. on GNU/Linux it is defined by the file /etc/ld.so.conf and the environment variable LD_LIBRARY_PATH).
The value of LTDL_LIBRARY_PATH and LD_LIBRARY_PATH must be a colon-separated list of absolute directory names, for example ‘/usr/lib/mypkg:/lib/foo’.
In any of these directories, dicod first attempts to find and load the given filename. If this fails, it tries to append the following suffixes to it:
- the libtool archive suffix ‘.la’
- the suffix used for native dynamic libraries on the host platform, e.g., ‘.so’, ‘.sl’, etc.
- Configuration: module-load-path list
This directive adds the directories listed in its argument to the module load path. Example:
module-load-path (/usr/lib/dico,/usr/local/dico/lib);
- Configuration: prepend-load-path list
Same as module-load-path, but adds directories to the beginning of the module load path.
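The search sequence described above can be sketched with a small fragment. The directory /opt/dico/modules is hypothetical, the native library suffix is assumed to be ‘.so’, and the order in which the two suffixes are tried is assumed to follow the order of the list above:

```
# Hypothetical directory holding locally built modules.
prepend-load-path (/opt/dico/modules);

# Handler using the dictorg module, as in the earlier example.
load-module dict {
    command "dictorg dbdir=/var/dicodb";
}

# File names dicod would try in the first directory of the load path:
#   /opt/dico/modules/dictorg
#   /opt/dico/modules/dictorg.la
#   /opt/dico/modules/dictorg.so
# If none of them can be loaded, the search continues in
# $prefix/lib/dico, then in the module-load-path directories, and so on.
```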
4.3.12 Databases
Dictionary databases are defined using the database block statement.
- Configuration: database { statements }
Defines a dictionary database. At least two sub-statements must be defined for each database: name and handler.
- Database: visible bool
Defines whether this database is visible or not. By default, all databases are visible. You will need this statement if you want to temporarily hide the database without removing it from the configuration. Another common use case is to hide a database that is used as a member of a virtual database, so that its contents are available only by querying the parent database (see Virtual Databases).
- Database: name string
Sets the name of this database (a single word). This name will be used to identify this database in DICT commands.
- Database: handler string
Specifies the handler name for this database and any arguments for it. This handler must be previously defined using the load-module statement (see Handlers).
For example, the following fragment defines a database named ‘en-de’, which is handled by the ‘dictorg’ handler. The handler is passed one argument, database=en-de:
database {
    name "en-de";
    handler "dictorg database=en-de";
}
More directives are available to fine-tune the database.
- Database: description string
Supplies a short description, to be shown in reply to the SHOW DB command. The string may not contain newlines.
Use this statement if the database itself does not supply a description, or if its description is malformed.
In any case, if the description directive is specified, its value takes precedence over the description string retrieved from the database itself.
See SHOW DB, for a description of the SHOW DB command.
- Database: info string
Supplies a full description of the database. This description is shown in reply to the SHOW INFO command (see SHOW INFO). The string is usually a multi-line text, so it is common to use here-document syntax (see here-document), e.g.:
info <<- EOT
    This is a foo-bar dictionary.
    Copyright (C) 2008 foo-bar dict group.
    Distributed under the terms of GNU Free Documentation license.
    EOT;
Use this statement if the database itself does not supply a full description, or if its full description is malformed.
As with description, the value of info takes precedence over info strings retrieved from the database.
The following two directives control the content type and transfer encoding used when formatting replies from this database if OPTION MIME (see OPTION MIME) is in effect:
- Database: mime-headers multiline-string
Defines the headers to be sent with the replies from this database. The argument is a here-document (see here-document) containing the headers to be sent with each dictionary entry if the client sent the ‘OPTION MIME’ command. By default, dicod uses the MIME headers defined in the database itself. Use this statement if these are not defined, or if you want to override them. In this case you would want to include at least the ‘Content-Type’ and ‘Content-Transfer-Encoding’ headers, as shown in the example below:
database {
    name "foo";
    handler "dictorg";
    mime-headers <<- EOT
        Content-Type: text/html; charset=utf-8
        Content-Transfer-Encoding: 8bit
        EOT;
    ...
}
Valid values for the ‘Content-Transfer-Encoding’ header are:
- 8bit
The content will be transferred as is.
- quoted-printable
Non-printable characters will be encoded using the ‘quoted-printable’ encoding.
- base64
Non-printable characters will be encoded using the ‘base64’ encoding.
4.3.12.1 Database Visibility
A property called database visibility is associated with each dictionary database. It determines whether the database appears in the output of the SHOW DB command and takes part in dictionary searches.
By default, all databases are defined as publicly visible. You can hide a database permanently by using the ‘visible no’ statement in its definition. You can also limit its visibility globally as well as on a per-database basis. This can be achieved using visibility ACLs.
In general, the visibility of a database is controlled by two access control lists: a global visibility ACL and a database visibility ACL. The latter takes precedence over the former.
Both ACLs are defined using the visibility-acl statement:
- Configuration: visibility-acl acl-name
Sets the name of the ACL that controls the database visibility. When used in global scope, this statement sets the global visibility ACL. If used within a database block, it sets the visibility ACL for that particular database.
Consider the following example:
acl glob-vis {
    allow authenticated;
    deny all;
}
acl local-nets {
    allow from (192.168.10.0/24, /tmp/dicod.sock);
}
visibility-acl glob-vis;
database {
    name "terms";
    visibility-acl local-nets;
}
In this configuration, the ‘terms’ database is visible to everybody coming from the ‘192.168.10.0/24’ network and from the UNIX socket /tmp/dicod.sock, without authentication. It is not visible to users coming from elsewhere, unless they authenticate themselves.
4.3.12.2 Virtual Databases
A virtual database is a collection of several regular databases. When a search is performed on a virtual database, it returns matches from the constituent databases.
Virtual databases can be used for grouping. For example a virtual database may include all dictionaries translating from English to Norwegian. Another one may include thesauri for English.
Yet another common use for virtual databases is to select different output markup depending on whether ‘OPTION MIME’ was requested by the user.
Technically, a virtual database is defined by specifying
handler "virtual";
in the database definition. This is a built-in module, so you must not use the load-module statement.
The names of the member databases (the databases to be included in this one) are supplied using the database statements:
- Database: database name [mime | nomime]
Specifies the database to be included as a member of this virtual database. The name argument supplies the name of the database (as set by the name statement in its definition).
The optional second argument may be used to restrict the use of this database to the given state of the ‘MIME’ option. Databases marked with ‘mime’ will be used only if the OPTION MIME command has been given for the current session. Databases marked with ‘nomime’ will be used only if this command has not been issued.
The following example defines a virtual database for translations from English to several other languages:
database {
    name "English Translating Database";
    info "Translations from English to other languages";
    handler "virtual";
    database "en-sw";
    database "en-no";
    database "en-pl";
}
It is assumed that the databases ‘en-sw’, ‘en-no’, and ‘en-pl’ are defined elsewhere in the configuration.
Another example illustrates how to define a database that will select the format of the articles depending on whether the client requests MIME output. Suppose that the configuration defines two dictionaries: ‘thes_plain’, with a thesaurus formatted in plaintext, and ‘thes_html’, with the same thesaurus, but formatted in HTML. The following database will return plaintext responses by default and HTML responses after the OPTION MIME command:
database {
    name "thesaurus";
    handler "virtual";
    database thes_plain nomime;
    database thes_html mime;
}
Notice that in this case it makes sense to define member databases as invisible, to avoid duplicate matches. E.g.:
database {
    name "thes_plain";
    visible no;
    ...
}
database {
    name "thes_html";
    visible no;
    ...
}
To determine the description (whether short or long) for a virtual database, the following algorithm is used. If the ‘description’ (or, for the long description, ‘info’) statement is present in the ‘database’ block, its value is used. Otherwise, the server obtains the descriptions of each member database that is visible in the current ‘OPTION MIME’ state. If all databases return the same value, it is used. Otherwise, an empty string is used.
Practically, that means that when defining a collection virtual database (as in the first example above), you are better off supplying both ‘description’ and ‘info’ statements.
On the other hand, when defining a mime-switching virtual database with two members (as in the second example), you can safely omit both statements: dicod will pick the value from the currently active member database.
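For the collection case, a sketch of a virtual database that supplies both statements itself might look like this. The database name ‘en-all’ and the wording of both texts are made up for illustration; the member names are those of the first example:

```
database {
    # Hypothetical name for the collection.
    name "en-all";
    handler "virtual";
    # Short description, shown by SHOW DB.
    description "Translations from English";
    # Long description, shown by SHOW INFO.
    info <<-EOT
        Translations from English to other languages.
        EOT;
    database "en-sw";
    database "en-no";
    database "en-pl";
}
```

Because both statements are present in the block, the algorithm above never has to compare the descriptions of the member databases.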
4.3.13 Strategies and Default Searches
A default search is a MATCH request with ‘*’ or ‘!’ as the database argument (see MATCH). The former means search in all available databases, the latter means search in all databases until a match is found.
Default searches may be quite expensive and may cause considerable strain on the server. For example, the command MATCH * prefix "" returns all entries from all available databases, which would consume a lot of resources both on the server and on the client side.
To minimize harmful effects from such potentially dangerous requests, it is possible to limit the use of certain strategies in default searches.
- Configuration: strategy name { statements }
Restricts the use of the strategy name in default searches.
The statements define conditions the 4th argument of a MATCH command must match in order to deny the request. The following statements are defined:
- Configuration: deny-all bool
Unconditionally deny the use of this strategy in default searches.
- Configuration: deny-word list
Deny this strategy if the search word matches one of the words from list.
- Configuration: deny-length-lt number
Deny if length of the search word is less than number.
- Configuration: deny-length-le number
Deny if length of the search word is less than or equal to number.
- Configuration: deny-length-gt number
Deny if length of the search word is greater than number.
- Configuration: deny-length-ge number
Deny if length of the search word is greater than or equal to number.
- Configuration: deny-length-eq number
Deny if length of the search word is equal to number.
- Configuration: deny-length-ne number
Deny if length of the search word is not equal to number.
For example, the following statement denies the use of ‘prefix’ strategy in default searches if its argument is an empty string:
strategy prefix { deny-length-eq 0; }
If the dicod daemon is configured this way, it will always return a ‘552’ reply on commands MATCH * prefix "" or MATCH ! prefix "". However, the use of an empty prefix on a concrete database, as in MATCH eng-deu prefix "", will still be allowed.
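Other conditions combine in the same way. The following sketch is an assumption-laden illustration: the threshold of 3 is arbitrary, and ‘all’ stands for a hypothetical strategy that is too costly to allow in default searches at all:

```
# Refuse prefix matching in default searches for words
# shorter than three characters (this covers the empty string too).
strategy prefix {
    deny-length-lt 3;
}

# Hypothetical strategy name: never allow it in default searches.
strategy all {
    deny-all yes;
}
```

Under these settings, MATCH * prefix "ab" would be refused, whereas MATCH * prefix "abc" and any prefix search on a concrete database would still be served.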
4.3.14 Tuning
While tuning your server, it is often necessary to get timing information which shows how much time is spent serving certain requests. This can be achieved using the timing configuration directive:
- Configuration: timing boolean
Provide timing information after successful completion of an operation. This information is displayed after the following requests: MATCH, DEFINE, and QUIT. It consists of the following parts:
[d/m/c = nd/nm/nc RTr UTu STs]
where:
- nd
Number of processed define requests. It is ‘0’ after a MATCH request.
- nm
Number of processed match requests. It is ‘0’ after a DEFINE request.
- nc
Number of comparisons made. This value may be inaccurate if the underlying database module is not able to count comparisons.
- RT
Real time spent serving the request.
- UT
Time in user space spent serving the request.
- ST
Time in kernel space spent serving the request.
An example of a server reply with timing information follows:
250 Done [d/m/c = 0/63/107265 2.293r 1.120u 0.010s]
You can also add timing information to your access log files, see %T.
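Enabling the feature takes a single statement. The value ‘yes’ is assumed here to be an accepted boolean spelling, by analogy with the ‘visible no’ statement used elsewhere in this manual:

```
# Append timing information to replies to MATCH, DEFINE and QUIT.
timing yes;
```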
4.3.15 Command Aliases
Aliases allow a string to be substituted for a word when it is used as the first word of a command. The daemon maintains a list of aliases that are created using the alias configuration file statement:
- Configuration: alias word command
Creates a new alias.
Aliases are useful to facilitate manual interaction with the server, as they allow the administrator to create abbreviations for some frequently typed commands. For example, the following alias creates a new command d which is equivalent to DEFINE *:
alias d DEFINE "*";
Aliases may be recursive, i.e. the first word of command may refer to another alias. For example:
alias d DEFINE;
alias da d "*";
This configuration will produce the following expansion:
da word ⇒ DEFINE * word
To prevent endless loops, recursive expansion is stopped if the first word of the replacement text is identical to an alias expanded earlier.
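A minimal sketch of a configuration that would trigger this safeguard is shown below. The alias names are made up, and the fragment is not something a real configuration would want; it only illustrates the rule:

```
# Two aliases that refer to each other.
alias a b;
alias b a;
```

Following the rule above, a command beginning with a is first rewritten to begin with b, and b is then rewritten to a. Since a is an alias that was already expanded, expansion stops at that point instead of looping.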
4.3.16 Using Preprocessor to Improve the Configuration.
Before parsing its configuration file, dicod preprocesses it. The built-in preprocessor handles only file inclusion and #line statements (see Pragmatic Comments), while the rest of traditional preprocessing facilities, such as macro expansion, is supported via m4, which is used as an external preprocessor.
The detailed description of m4 facilities lies far beyond the scope of this document. You will find a complete user manual in http://www.gnu.org/software/m4/manual. For the rest of this subsection we assume the reader is sufficiently acquainted with the m4 macro processor.
The external preprocessor is invoked with the -s flag, which instructs it to include line synchronization information in its output. The parser then uses this information to display meaningful diagnostics. An initial set of macro definitions is supplied by the pp-setup file, located in the $prefix/share/dico/version/include directory (where version stands for the version of the GNU Dico package).
The default pp-setup file changes quote characters to ‘[’ and ‘]’, and renames all m4 built-in macros so they all start with the prefix ‘m4_’. The latter has an effect similar to the --prefix-builtins option of GNU m4, but has the advantage that it works with non-GNU m4 implementations as well.
As an example of how the use of the preprocessor may improve a dicod configuration, consider the following fragment taken from one of the installations of GNU Dico. This installation offers quite a few Freedict dictionaries. The database definition for each of them is almost the same, except for the dictionary name and an optional description entry for the few databases that lack one. To avoid repeating the same text over and over again, we define the following macro:
# defdb(NAME[, DESCR])
# Produce a standard definition for a database NAME.
# If DESCR is given, use it as a description.
m4_define([defdb], [
database {
    name "$1";
    handler "dictorg database=$1";m4_dnl
m4_ifelse([$2],,,[
    description "$2";])
}
])
The macro takes two arguments. The first one, NAME, defines the dictionary name visible in the output of the SHOW DB command. The optional second argument may be used to supply a description string for databases that lack one.
Given this macro, the database definitions look like:
defdb(eng-swa)
defdb(swa-eng)
defdb(afr-eng, Afrikaans-English Dictionary)
defdb(eng-afr, English-Afrikaans Dictionary)
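To make the effect of the macro concrete, the sketch below shows what two of these calls would expand to, worked out by hand from the macro definition. Exact whitespace and blank lines in real m4 output may differ, and line synchronization information is omitted:

```
database {
    name "eng-swa";
    handler "dictorg database=eng-swa";
}

database {
    name "afr-eng";
    handler "dictorg database=afr-eng";
    description "Afrikaans-English Dictionary";
}
```

The first call passes no second argument, so no description statement is produced; the second call supplies one.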
This document was generated on September 4, 2020 using makeinfo.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.