-rw-r--r-- | site/spec.html | 180 |
1 files changed, 117 insertions, 63 deletions
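Note for readers of this patch: the Parsing paragraph reflowed in the diff describes a preprocessing pass (ignore a leading UTF-8 byte order mark, delete every carriage return that immediately precedes a line feed, then treat any remaining control character other than line feed and horizontal tab as an error). A minimal Python sketch of that pass, under stated assumptions, could look like the block below. The names `PomError` and `preprocess` are hypothetical and do not come from the spec or from any POM library. Rejection of invalid UTF-8 is delegated to Python's strict built-in decoder, which refuses overlong sequences and surrogate halves.

```python
class PomError(ValueError):
    """Hypothetical error type; the spec only says that 'an error occurs'."""


UTF8_BOM = b"\xef\xbb\xbf"


def preprocess(data: bytes) -> str:
    """Sketch of the preprocessing pass described in the Parsing section."""
    # A byte order mark at the very start of the file is ignored.
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]

    # Strict decoding: invalid UTF-8 (including overlong sequences and
    # surrogate halves) raises UnicodeDecodeError, reported here as PomError.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise PomError(f"invalid UTF-8: {exc}") from exc

    # Delete every carriage return that immediately precedes a line feed.
    text = text.replace("\r\n", "\n")

    # Any remaining control character in U+0000..U+001F other than
    # line feed and horizontal tab is an error.
    for line_number, line in enumerate(text.split("\n"), start=1):
        for ch in line:
            if ch < " " and ch != "\t":
                raise PomError(
                    f"line {line_number}: control character U+{ord(ch):04X}"
                )
    return text
```

With this sketch, `preprocess(b"\xef\xbb\xbfa = 1\r\nb = 2\n")` yields the text with LF line endings only, while a lone carriage return, a NUL byte, or an overlong byte sequence raises `PomError`. Line-by-line parsing of sections and key/value pairs is deliberately left out.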
diff --git a/site/spec.html b/site/spec.html
index 423ff68..01d42e3 100644
--- a/site/spec.html
+++ b/site/spec.html
@@ -34,7 +34,8 @@
  Every file describes a <i>configuration</i>, which is a mapping from keys to values.
  A <i>key</i> is a string consisting of <i>components</i> separated by dots (<code>.</code>).
  A <i>value</i> is a string whose interpretation is entirely decided by the application.
- Notably there is no distinction in POM’s syntax between, say, the number <code>5</code> and the string <code>5</code>.
+ Notably there is no distinction in POM’s syntax between, say,
+ the number <code>5</code> and the string <code>5</code>.
  A configuration, such as the one obtained from the POM file
  </p>
  <pre><code>[ingredients.sugar]
@@ -69,7 +70,10 @@ time = 35 min
  or a tree of keys, with a value associated to each leaf node:
  </p>
  <div style="text-align:center;">
- <svg viewBox="-281 0 500 200" style="width:80%;max-width:600px;" xmlns="http://www.w3.org/2000/svg">
+ <svg
+ viewBox="-281 0 500 200"
+ style="width:80%;max-width:600px;"
+ xmlns="http://www.w3.org/2000/svg">
  <style>
  text { text-anchor: middle; }
  rect { stroke: black; fill: none; }
@@ -137,17 +141,19 @@ time = 35 min
  </div>
  <h2>Error handling</h2>
  <p>
- All error conditions are described in this specification. A general-purpose POM parser should not
- reject a file in any other case, outside of exceptional circumstances such as running out of memory.
- When an error occurs, it should be reported, ideally with information about the file name and line number,
- and the file must be entirely rejected (i.e. parsers must not attempt to preserve only the correct parts of an erroneous file).
+ All error conditions are described in this specification.
+ A general-purpose POM parser should not reject a file in any other case,
+ outside of exceptional circumstances such as running out of memory.
+ When an error occurs, it should be reported, ideally with information
+ about the file name and line number, and the file must be entirely rejected
+ (i.e. parsers must not attempt to preserve only the correct parts of an erroneous file).
  Warnings may also be issued according to the judgment of the parser author.
  </p>
  <h2>Text encoding</h2>
  <p>
  All POM files are encoded using UTF-8. Both LF and CRLF line endings may be used (see below).
- If invalid UTF-8 is encountered, including overlong sequences and UTF-16 surrogate halves (U+D800-DFFF),
- an error occurs.
+ If invalid UTF-8 is encountered, including overlong sequences and UTF-16
+ surrogate halves (U+D800-DFFF), an error occurs.
  </p>
  <h2>Valid keys/values</h2>
  <p>
@@ -155,7 +161,8 @@ time = 35 min
  </p>
  <ul>
  <li>
- The ASCII characters <code>a</code>–<code>z</code>, <code>A</code>–<code>Z</code>, <code>0</code>–<code>9</code>, as well as each of
+ The ASCII characters <code>a</code>–<code>z</code>, <code>A</code>–<code>Z</code>,
+ <code>0</code>–<code>9</code>, as well as each of
  <code>./-*_</code>.
  </li>
  <li>
@@ -163,29 +170,34 @@ time = 35 min
  </li>
  </ul>
  <p>
- A non-empty string containing only these characters is a valid key if and only if it does not start or end with a dot
+ A non-empty string containing only these characters is a valid key
+ if and only if it does not start or end with a dot
  and does not contain two dots in a row (<code>..</code>).
  </p>
  <p>
- Any string of non-zero Unicode scalar values (U+0001–10FFFF, but not U+D800–U+DFFF) is a valid value.
+ Any string of non-zero Unicode scalar values (U+0001–10FFFF, but not U+D800–U+DFFF)
+ is a valid value.
  </p>
  <h2>Parsing</h2>
  <p>
  If a “byte order mark” of <code>EF BB BF</code> appears at the start of the file, it is ignored.
- Every carriage return character (U+000D) which immediately precedes a line feed (U+000A) is deleted.
- Then, if any control characters in the range U+0000 to U+001F other than the line feed and horizontal
- tab (U+0009) are present in the file, an error occurs.
+ Every carriage return character (U+000D) which immediately
+ precedes a line feed (U+000A) is deleted.
+ Then, if any control characters in the range U+0000 to U+001F
+ other than the line feed and horizontal tab (U+0009) are
+ present in the file, an error occurs.
  </p>
  <p>
- The <i>current-section</i> is a string variable which should be maintained during parsing. It is initally
- equal to the empty string.
+ The <i>current-section</i> is a string variable which should be maintained during parsing.
+ It is initally equal to the empty string.
  </p>
  <p>
  An <i>accepted-space</i> is either a space (U+0020) or horizontal tab (U+0009) character.
  </p>
  <p>
- Parsing now proceeds line-by-line, with lines being delimited by line feed characters. For each line:
+ Parsing now proceeds line-by-line, with lines being delimited by line feed characters.
+ For each line:
  </p>
  <ol>
  <li>Any accepted-spaces that appear at the start of the line are removed.</li>
@@ -198,12 +210,18 @@ time = 35 min
  parsing proceeds to the next line.
  </li>
  <li>
- If the line begins with <code>[</code>, it is interpreted as a <i>section header</i>. In this case:
+ If the line begins with <code>[</code>, it is interpreted as a <i>section header</i>.
+ In this case:
  <ol>
- <li>If the line does not end with <code>]</code> optionally succeeded by any number of accepted-spaces, an error occurs.</li>
  <li>
- The current-section is set to the text in between the initial <code>[</code> and final <code>]</code>
- (white space after the <code>[</code> and before the <code>]</code> is <em>not</em> trimmed).
+ If the line does not end with <code>]</code> optionally succeeded
+ by any number of accepted-spaces, an error occurs.
+ </li>
+ <li>
+ The current-section is set to the text in between
+ the initial <code>[</code> and final <code>]</code>
+ (white space after the <code>[</code> and before the
+ <code>]</code> is <em>not</em> trimmed).
  </li>
  <li>
  If the new current-section is not empty and not a valid key (see above), an error occurs.
@@ -215,7 +233,8 @@ time = 35 min
  <ol>
  <li>If the line does not contain an equal sign (<code>=</code>), an error occurs.</li>
  <li>
- The <i>relative-key</i> is the text preceding the <code>=</code>, not including any space or horizontal tab characters
+ The <i>relative-key</i> is the text preceding the <code>=</code>,
+ not including any space or horizontal tab characters
  immediately before the <code>=</code>.
  </li>
  <li>
@@ -226,8 +245,9 @@ time = 35 min
  </li>
  <li>
  If <i>c</i> is <code>"</code> (U+0022 QUOTATION MARK) or <code>`</code> (U+0060 GRAVE ACCENT),
- the value is <i>quoted</i>, and spans from the first character after <i>c</i> to the next unescaped
- instance of <i>c</i> in the file (which may be on a different line). In this case,
+ the value is <i>quoted</i>, and spans from the first character after <i>c</i>
+ to the next unescaped instance of <i>c</i> in the file (which may be on a different line).
+ In this case,
  <ol>
  <li>
  Escape sequences are processed as described below.
@@ -304,7 +324,8 @@ time = 35 min
  Although POM does not have a way of specially designating a value as being a list, there is a
  recommended syntax for encoding them. Specifically, a value can be treated as a list by first
  splitting it into comma-delimited parts, treating <code>\,</code> as a literal comma
- in a list entry and <code>\\</code> as a literal backslash, then removing any accepted-spaces surrounding list entries.
+ in a list entry and <code>\\</code> as a literal backslash,
+ then removing any accepted-spaces surrounding list entries.
  </p>
  <p>
  List entries may be empty, but if the last entry in a list is empty, it is removed
@@ -361,8 +382,8 @@ time = 35 min
  <h2>Merging configurations</h2>
  <p>
- A configuration <i>B</i> can be <i>merged into</i> another configuration <i>A</i> by parsing both of them
- and setting the value associated with a key <i>k</i> to be
+ A configuration <i>B</i> can be <i>merged into</i> another configuration <i>A</i>
+ by parsing both of them and setting the value associated with a key <i>k</i> to be
  </p>
  <ol>
  <li>The value associated with <i>k</i> in <i>B</i>, if any.</li>
@@ -370,28 +391,30 @@ time = 35 min
  </ol>
  <p>
  (Likewise, an ordered series of configurations <i>A<sub>1</sub></i>, …, <i>A<sub>n</sub></i>
- can be merged by merging <i>A<sub>2</sub></i> into <i>A<sub>1</sub></i>, then <i>A<sub>3</sub></i> into
- the resulting configuration, etc.)
+ can be merged by merging <i>A<sub>2</sub></i> into <i>A<sub>1</sub></i>,
+ then <i>A<sub>3</sub></i> into the resulting configuration, etc.)
  </p>
  <p>
- This is useful, for example, when you want to have a global configuration for a piece of software
- installed on a multi-user machine where individual settings can be overriden by each user (in this case,
- the user configuration would be merged into the global configuration).
+ This is useful, for example, when you want to have a global configuration for
+ a piece of software installed on a multi-user machine where individual settings
+ can be overriden by each user (in this case, the user configuration would
+ be merged into the global configuration).
  </p>
  <h2>Schemas</h2>
  <p>
- A <i>schema</i> is a POM file that describes how other POM files should be formatted (i.e. what keys they should
- include, and what values they can be associated with). A configuration can be said to <i>follow</i> a schema when it obeys
- all of the schema’s rules.
+ A <i>schema</i> is a POM file that describes how other POM files should be formatted
+ (i.e. what keys they should include, and what values they can be associated with).
+ A configuration can be said to <i>follow</i> a schema
+ when it obeys all of the schema’s rules.
  </p>
  <p>
  POM’s schema format is not powerful to enforce all possible restrictions; some will have to be
  enforced by the application.
  </p>
  <p>
- Every schema key must be of the form <i>k</i><code>.</code><i>rule</i>, where <i>k</i> is a valid key,
- and <i>rule</i> is one of the rule names listed below.
+ Every schema key must be of the form <i>k</i><code>.</code><i>rule</i>, where
+ <i>k</i> is a valid key, and <i>rule</i> is one of the rule names listed below.
  </p>
  <p>
  For any valid key <i>k</i> the value of the rule <i>rule</i> for <i>k</i> is determined as follows:
@@ -409,10 +432,11 @@ time = 35 min
  <li>If the candidate-list is empty, use the default value for the rule.</li>
  <li>If the candidate-list has exactly one schema key, use the value of that schema key.</li>
  <li>
- Otherwise select the schema key in the candidate-list with the latest first <code>*</code>-component (in terms of component index),
+ Otherwise select the schema key in the candidate-list with the
+ latest first <code>*</code>-component (in terms of component index),
  preferring keys with no <code>*</code>,
- breaking ties by the latest second <code>*</code>-component, preferring keys with only one <code>*</code>, etc.
- Use its value.
+ breaking ties by the latest second <code>*</code>-component,
+ preferring keys with only one <code>*</code>, etc. Use its value.
  </li>
  </ul>
@@ -448,21 +472,34 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St
  <li><code>Empty</code> — accepts an empty value</li>
  <li><code>Bool</code> — accepts <code>true</code>, <code>on</code>, <code>yes</code>,
  <code>false</code>, <code>off</code>, <code>no</code> (case-sensitive).</li>
- <li><code>UInt</code> — accepts any unsigned
- integer that fits in 63 [sic] bits, written in decimal or <code>0x</code> or <code>0X</code>-prefixed hexadecimal.
+ <li><code>UInt</code> — accepts any unsigned integer that fits in 63 [sic] bits,
+ written in decimal or <code>0x</code> or <code>0X</code>-prefixed hexadecimal.
  A leading <code>+</code> is permitted, but <code>-0</code> is not. Only 63 bits are allowed to support languages (Java)
  with no 64-bit unsigned integers.</li>
  <li><code>Int</code> — accepts any (two’s complement) signed integer that fits in 64 bits,
  written in decimal or <code>0x</code> or <code>0X</code>-prefixed hexadecimal. A leading <code>+</code> (or, of course, <code>-</code>) is permitted.</li>
  <li><code>Float</code> —
- A floating-point number, written in ordinary decimal (e.g. <code>-1.234</code>, <code>7.</code>, <code>265</code>) or in scientific notation
+ A floating-point number, written in ordinary decimal
+ (e.g. <code>-1.234</code>, <code>7.</code>, <code>265</code>) or in scientific notation
  (e.g. <code>3e5</code>, <code>3.E-5</code>, <code>-3.7e+5</code>).
  A leading <code>+</code> (or, of course, <code>-</code>) is permitted.
- <li><code>'</code><i>value</i><code>'</code> — accepts the literal value <i>value</i>. <i>value</i> cannot contain a literal apostrophe <code>'</code>.
  </li>
- <li><i>T</i><code> | </code><i>U</i>, where <i>T</i>, <i>U</i> are types — accepts a value of type <i>T</i> or <i>U</i>.</li>
- <li><code>Optional[</code><i>T</i><code>]</code>, where <i>T</i> is a type — equivalent to <i>T</i><code> | None</code>.</li>
- <li><code>List[</code><i>T</i><code>]</code>, where <i>T</i> is a type — accepts a list of entries of type <i>T</i> (see description
+ <li>
+ <code>'</code><i>value</i><code>'</code> —
+ accepts the literal value <i>value</i>.
+ <i>value</i> cannot contain a literal apostrophe <code>'</code>.
+ </li>
+ <li>
+ <i>T</i><code> | </code><i>U</i>, where <i>T</i>, <i>U</i> are types
+ — accepts a value of type <i>T</i> or <i>U</i>.
+ </li>
+ <li>
+ <code>Optional[</code><i>T</i><code>]</code>, where <i>T</i> is a type
+ — equivalent to <i>T</i><code> | None</code>.
+ </li>
+ <li>
+ <code>List[</code><i>T</i><code>]</code>, where <i>T</i> is a type
+ — accepts a list of entries of type <i>T</i> (see description
  of lists above). Nested lists are not permitted.
  </li>
  </ul>
@@ -473,7 +510,8 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St
  <h3><code>allow_unknown</code> rule</h3>
  <p>
  Default: inherited from parent (i.e. if <i>k</i> = <i>j</i><code>.</code><i>component</i>,
- look up the <code>allow_unknown</code> rule for <i>j</i>), or <code>yes</code> if <i>k</i> has no parent (does not contain a dot).
+ look up the <code>allow_unknown</code> rule for <i>j</i>),
+ or <code>yes</code> if <i>k</i> has no parent (does not contain a dot).
  </p>
  <p>
  This describes whether or the key <i>k</i> is allowed if it is not described in the schema.
@@ -487,8 +525,9 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St
  </p>
  <h3><code>min</code>, <code>max</code> rules</h3>
  <p>
- This schema key’s value sets the minimum/maximum value for the key’s value. This may only be set if <code>type</code>
- explicitly allows numeric values (i.e. it contains a type <code>Int</code>/<code>UInt</code>/<code>Float</code>).
+ This schema key’s value sets the minimum/maximum value for the key’s value.
+ This may only be set if <code>type</code> explicitly allows numeric values
+ (i.e. it contains a type <code>Int</code>/<code>UInt</code>/<code>Float</code>).
  </p>
  <h3><code>maxlength</code> rule</h3>
  <p>
@@ -502,14 +541,18 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St
  </p>
  <h3>Missing values</h3>
  <p>
-If there is a schema key <i>k</i><code>.type</code>, where <i>k</i> does not contain any <code>*</code>-components,
-and the type does not allow unset values (<code>None</code>), and there is no schema key <i>k</i><code>.default</code>,
-then a configuration must contain the key <i>k</i> to follow the schema.
+ If there is a schema key <i>k</i><code>.type</code>, where <i>k</i>
+ does not contain any <code>*</code>-components,
+ and the type does not allow unset values (<code>None</code>),
+ and there is no schema key <i>k</i><code>.default</code>,
+ then a configuration must contain the key <i>k</i> to follow the schema.
  </p>
  <p>
-Additionally, if there is a schema key <i>j</i><code>.*.</code><i>k</i><code>.type</code> that does not allow unset values
-and no correspoding <code>default</code> schema key, where <i>k</i> does not contain any <code>*</code>-components, then a configuration
-containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain the key <i>x</i><code>.</code><i>k</i>.
+ Additionally, if there is a schema key <i>j</i><code>.*.</code><i>k</i><code>.type</code>
+ that does not allow unset values and no correspoding <code>default</code> schema key,
+ where <i>k</i> does not contain any <code>*</code>-components, then a configuration
+ containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain
+ the key <i>x</i><code>.</code><i>k</i>.
  </p>
  <h2>Extensions</h2>
@@ -533,7 +576,8 @@ containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain the
  </li>
  <li>
  <code>load_string(string: String) -> Configuration</code><br>
- Load a configuration from a string (may be overloaded with <code>load</code> if language supports it).
+ Load a configuration from a string
+ (may be overloaded with <code>load</code> if language supports it).
  </li>
  <li>
  <code>load_path(path: String) -> Configuration</code><br>
@@ -555,9 +599,14 @@ containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain the
  in an arbitrary order.
  </li>
  <li>
+ <code>subkeys(conf: Configuration, key: String) -> List<String></code><br>
+ Equivalent to <code>keys(section(conf, key))</code> but may be implemented more efficiently.
+ </li>
+ <li>
  <code>location(conf: Configuration, key: String) -> Optional<Location></code><br>
  Location of the definition of <code>key</code> in the configuration (file and line number).
- Useful for reporting invalid values when the format of valid values can’t be described by a schema type.
+ Useful for reporting invalid values when
+ the format of valid values can’t be described by a schema type.
  </li>
  <li>
  <code>get(conf: Configuration, key: String) -> Optional<String></code><br>
@@ -570,7 +619,8 @@ containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain the
  <li>
  <code>get_int(conf: Configuration, key: String) -> Optional<Int></code><br>
  <code>get_int_or_default(conf: Configuration, key: String, default: Int) -> Int</code><br>
- Get value associated with <code>key</code>, if any exists, and parse it as a signed 64-bit integer,
+ Get value associated with <code>key</code>,
+ if any exists, and parse it as a signed 64-bit integer,
  following the <code>Int</code> schema type described above (returning <code>default</code>
  if the key doesn’t exist).
  Returns an error if the key exists but its value is not a valid signed 64-bit integer.
@@ -578,7 +628,8 @@ containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain the
  <li>
  <code>get_uint(conf: Configuration, key: String) -> Optional<UInt></code><br>
  <code>get_uint_or_default(conf: Configuration, key: String, default: UInt) -> UInt</code><br>
- Get value associated with <code>key</code>, if any exists, and parse it as an unsigned 63-bit [sic] integer,
+ Get value associated with <code>key</code>, if any exists,
+ and parse it as an unsigned 63-bit [sic] integer,
  following the <code>UInt</code> schema type described above (returning <code>default</code>
  if the key doesn’t exist).
  Returns an error if the key exists but its value is not a valid unsigned 63-bit integer.
@@ -586,7 +637,8 @@ containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain the
  <li>
  <code>get_float(conf: Configuration, key: String) -> Optional<Float></code><br>
  <code>get_float_or_default(conf: Configuration, key: String, default: Float) -> Float</code><br>
- Get value associated with <code>key</code>, if any exists, and parse it as a 64-bit IEEE-754 double precision
+ Get value associated with <code>key</code>, if any exists,
+ and parse it as a 64-bit IEEE-754 double precision
  floating-point number, following the <code>Float</code> schema type described above
  (returning <code>default</code> if the key doesn’t exist).
  Returns an error if the key exists but its value is not a valid floating-point number.
@@ -607,7 +659,8 @@ containing a key <i>x</i> matching <i>j</i><code>.*</code> must also contain the
  </li>
  <li>
  <code>section(conf: Configuration, key: String) -> Configuration</code><br>
- Get the sub-configuration consisting of the descendants of <code>key</code> (i.e. keys starting with <code>key.</code>),
+ Get the sub-configuration consisting of the descendants of <code>key</code>
+ (i.e. keys starting with <code>key.</code>),
  with the initial <code>key.</code> removed, and their corresponding values.
  Returns an empty configuration if there are no descendants of <code>key</code> defined.
  </li>
@@ -729,7 +782,8 @@ This configuration has the following mapping of keys to values:
  <h3>Schema for text editor’s configuration</h3>
  <p>
- Here is a schema which a text editor might use. The example text editor configuration above follows it.
+ Here is a schema which a text editor might use.
+ The example text editor configuration above follows it.
  </p>
  <pre><code>
  # don't allow unknown keys by default |