From c5365646ea2b6a6790eb8f8e70d15d25b5942a79 Mon Sep 17 00:00:00 2001
From: pommicket
Date: Mon, 8 Sep 2025 17:17:29 -0400
Subject: Formatting, add subkeys to spec

---
 site/spec.html | 180 +++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 117 insertions(+), 63 deletions(-)

diff --git a/site/spec.html b/site/spec.html
index 423ff68..01d42e3 100644
--- a/site/spec.html
+++ b/site/spec.html
@@ -34,7 +34,8 @@
 Every file describes a configuration, which is a mapping from keys to values.
 A key is a string consisting of components separated by dots (.).
 A value is a string whose interpretation is entirely decided by the application.
- Notably there is no distinction in POM’s syntax between, say, the number 5 and the string 5.
+ Notably there is no distinction in POM’s syntax between, say,
+ the number 5 and the string 5.
 A configuration, such as the one obtained from the POM file

[ingredients.sugar]
@@ -69,7 +70,10 @@ time = 35 min
 	or a tree of keys, with a value associated to each leaf node:
 

- +

Error handling

- All error conditions are described in this specification. A general-purpose POM parser should not
- reject a file in any other case, outside of exceptional circumstances such as running out of memory.
- When an error occurs, it should be reported, ideally with information about the file name and line number,
- and the file must be entirely rejected (i.e. parsers must not attempt to preserve only the correct parts of an erroneous file).
+ All error conditions are described in this specification.
+ A general-purpose POM parser should not reject a file in any other case,
+ outside of exceptional circumstances such as running out of memory.
+ When an error occurs, it should be reported, ideally with information
+ about the file name and line number, and the file must be entirely rejected
+ (i.e. parsers must not attempt to preserve only the correct parts of an erroneous file).
 Warnings may also be issued according to the judgment of the parser author.

Text encoding

 All POM files are encoded using UTF-8. Both LF and CRLF line endings may be used (see below).
- If invalid UTF-8 is encountered, including overlong sequences and UTF-16 surrogate halves (U+D800-DFFF),
- an error occurs.
+ If invalid UTF-8 is encountered, including overlong sequences and UTF-16
+ surrogate halves (U+D800-DFFF), an error occurs.
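The encoding rule above can be illustrated with a minimal Python sketch. The function name `decode_pom_bytes` is hypothetical and not part of the spec; the sketch relies on Python's strict UTF-8 decoder, which rejects both overlong sequences and encoded surrogate halves.

```python
def decode_pom_bytes(data: bytes) -> str:
    """Decode raw file bytes as strict UTF-8 (hypothetical helper, not from the spec)."""
    try:
        # The strict decoder rejects overlong sequences and encoded
        # surrogate halves (U+D800-DFFF), as the spec requires.
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as e:
        # Report the byte offset; a real parser would also report the file name.
        raise ValueError(f"invalid UTF-8 at byte offset {e.start}") from e
```

A caller would treat the raised `ValueError` as a reason to reject the whole file.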

Valid keys/values

@@ -155,7 +161,8 @@ time = 35 min

  • - The ASCII characters a–z, A–Z, 0–9, as well as each of
+ The ASCII characters a–z, A–Z,
+ 0–9, as well as each of
 ./-*_.
  • @@ -163,29 +170,34 @@ time = 35 min

- A non-empty string containing only these characters is a valid key if and only if it does not start or end with a dot + A non-empty string containing only these characters is a valid key + if and only if it does not start or end with a dot and does not contain two dots in a row (..).

- Any string of non-zero Unicode scalar values (U+0001–10FFFF, but not U+D800–U+DFFF) is a valid value. + Any string of non-zero Unicode scalar values (U+0001–10FFFF, but not U+D800–U+DFFF) + is a valid value.

Parsing

 If a “byte order mark” of EF BB BF appears at the start of the file, it is ignored.
- Every carriage return character (U+000D) which immediately precedes a line feed (U+000A) is deleted.
- Then, if any control characters in the range U+0000 to U+001F other than the line feed and horizontal
- tab (U+0009) are present in the file, an error occurs.
+ Every carriage return character (U+000D) which immediately
+ precedes a line feed (U+000A) is deleted.
+ Then, if any control characters in the range U+0000 to U+001F
+ other than the line feed and horizontal tab (U+0009) are
+ present in the file, an error occurs.
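The preprocessing steps described above (byte order mark, CRLF handling, control-character check) might be sketched as follows in Python. The function name `preprocess` is hypothetical, and the sketch assumes the input has already been decoded from UTF-8, so the byte order mark EF BB BF appears as the single character U+FEFF.

```python
def preprocess(text: str) -> str:
    """Normalize decoded POM text before line-by-line parsing (hypothetical helper)."""
    # Ignore a byte order mark at the very start of the file.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Delete every carriage return that immediately precedes a line feed.
    text = text.replace("\r\n", "\n")
    # Any remaining control character other than LF and tab is an error.
    for index, ch in enumerate(text):
        if ord(ch) < 0x20 and ch not in "\n\t":
            line = text.count("\n", 0, index) + 1
            raise ValueError(f"line {line}: control character U+{ord(ch):04X}")
    return text
```

Note that a lone carriage return (one not followed by a line feed) survives the replacement step and is then rejected by the control-character check.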

- The current-section is a string variable which should be maintained during parsing. It is initially
- equal to the empty string.
+ The current-section is a string variable which should be maintained during parsing.
+ It is initially equal to the empty string.

An accepted-space is either a space (U+0020) or horizontal tab (U+0009) character.

- Parsing now proceeds line-by-line, with lines being delimited by line feed characters. For each line: + Parsing now proceeds line-by-line, with lines being delimited by line feed characters. + For each line:

  1. Any accepted-spaces that appear at the start of the line are removed.
  2. @@ -198,12 +210,18 @@ time = 35 min parsing proceeds to the next line.
  3. - If the line begins with [, it is interpreted as a section header. In this case: + If the line begins with [, it is interpreted as a section header. + In this case:
      -
    1. If the line does not end with ] optionally succeeded by any number of accepted-spaces, an error occurs.
    2. - The current-section is set to the text in between the initial [ and final ] - (white space after the [ and before the ] is not trimmed). + If the line does not end with ] optionally succeeded + by any number of accepted-spaces, an error occurs. +
    3. +
    4. + The current-section is set to the text in between + the initial [ and final ] + (white space after the [ and before the + ] is not trimmed).
    5. If the new current-section is not empty and not a valid key (see above), an error occurs. @@ -215,7 +233,8 @@ time = 35 min
      1. If the line does not contain an equal sign (=), an error occurs.
      2. - The relative-key is the text preceding the =, not including any space or horizontal tab characters + The relative-key is the text preceding the =, + not including any space or horizontal tab characters immediately before the =.
      3. @@ -226,8 +245,9 @@ time = 35 min
      4. If c is " (U+0022 QUOTATION MARK) or ` (U+0060 GRAVE ACCENT), - the value is quoted, and spans from the first character after c to the next unescaped - instance of c in the file (which may be on a different line). In this case, + the value is quoted, and spans from the first character after c + to the next unescaped instance of c in the file (which may be on a different line). + In this case,
        1. Escape sequences are processed as described below. @@ -304,7 +324,8 @@ time = 35 min Although POM does not have a way of specially designating a value as being a list, there is a recommended syntax for encoding them. Specifically, a value can be treated as a list by first splitting it into comma-delimited parts, treating \, as a literal comma - in a list entry and \\ as a literal backslash, then removing any accepted-spaces surrounding list entries. + in a list entry and \\ as a literal backslash, + then removing any accepted-spaces surrounding list entries.

          List entries may be empty, but if the last entry in a list is empty, it is removed @@ -361,8 +382,8 @@ time = 35 min
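The recommended list encoding described above could be implemented along these lines. This is a sketch, not the spec's reference algorithm: the name `parse_list` is hypothetical, and it assumes that a backslash followed by any character other than `,` or `\` is kept verbatim, which the visible text does not state.

```python
from typing import List

def parse_list(value: str) -> List[str]:
    """Split a value into list entries (hypothetical helper)."""
    entries, current = [], []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value) and value[i + 1] in ",\\":
            # \, is a literal comma and \\ is a literal backslash.
            current.append(value[i + 1])
            i += 2
        elif ch == ",":
            # An unescaped comma ends the current entry.
            entries.append("".join(current))
            current = []
            i += 1
        else:
            # Assumption: any other character, including a lone backslash, is kept as-is.
            current.append(ch)
            i += 1
    entries.append("".join(current))
    # Remove accepted-spaces (space and tab) surrounding each entry.
    entries = [e.strip(" \t") for e in entries]
    # An empty last entry is removed, so a trailing comma is harmless.
    if entries[-1] == "":
        entries.pop()
    return entries
```

Under these assumptions an empty value yields an empty list, while an empty entry in the middle of a list is preserved.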

          Merging configurations

          - A configuration B can be merged into another configuration A by parsing both of them - and setting the value associated with a key k to be + A configuration B can be merged into another configuration A + by parsing both of them and setting the value associated with a key k to be

          1. The value associated with k in B, if any.
          2. @@ -370,28 +391,30 @@ time = 35 min

          (Likewise, an ordered series of configurations A1, …, An - can be merged by merging A2 into A1, then A3 into - the resulting configuration, etc.) + can be merged by merging A2 into A1, + then A3 into the resulting configuration, etc.)

          - This is useful, for example, when you want to have a global configuration for a piece of software
          - installed on a multi-user machine where individual settings can be overridden by each user (in this case,
          - the user configuration would be merged into the global configuration).
          + This is useful, for example, when you want to have a global configuration for
          + a piece of software installed on a multi-user machine where individual settings
          + can be overridden by each user (in this case, the user configuration would
          + be merged into the global configuration).
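As a rough illustration of the merge rule, here is a Python sketch that models a parsed configuration as a plain `dict` from keys to values. That representation, the function names `merge` and `merge_all`, and the example keys are assumptions made for the sketch.

```python
from functools import reduce
from typing import Dict, Iterable

Configuration = Dict[str, str]  # assumed representation: key -> value

def merge(a: Configuration, b: Configuration) -> Configuration:
    """Merge b into a: values from b win, values only in a are kept."""
    merged = dict(a)
    merged.update(b)
    return merged

def merge_all(configs: Iterable[Configuration]) -> Configuration:
    """Merge an ordered series A1, ..., An from left to right."""
    return reduce(merge, configs, {})

# Hypothetical global and per-user settings.
global_conf = {"editor.tab-size": "4", "editor.theme": "light"}
user_conf = {"editor.theme": "dark"}
effective = merge(global_conf, user_conf)
```

Here `effective` keeps the global tab size and takes the theme from the user configuration.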

          Schemas

          - A schema is a POM file that describes how other POM files should be formatted (i.e. what keys they should - include, and what values they can be associated with). A configuration can be said to follow a schema when it obeys - all of the schema’s rules. + A schema is a POM file that describes how other POM files should be formatted + (i.e. what keys they should include, and what values they can be associated with). + A configuration can be said to follow a schema + when it obeys all of the schema’s rules.

          POM’s schema format is not powerful enough to enforce all possible restrictions; some will have to be enforced by the application.

          - Every schema key must be of the form k.rule, where k is a valid key, - and rule is one of the rule names listed below. + Every schema key must be of the form k.rule, where + k is a valid key, and rule is one of the rule names listed below.

          For any valid key k the value of the rule rule for k is determined as follows: @@ -409,10 +432,11 @@ time = 35 min

        2. If the candidate-list is empty, use the default value for the rule.
        3. If the candidate-list has exactly one schema key, use the value of that schema key.
        4. - Otherwise select the schema key in the candidate-list with the latest first *-component (in terms of component index), + Otherwise select the schema key in the candidate-list with the + latest first *-component (in terms of component index), preferring keys with no *, - breaking ties by the latest second *-component, preferring keys with only one *, etc. - Use its value. + breaking ties by the latest second *-component, + preferring keys with only one *, etc. Use its value.
        5. @@ -448,21 +472,34 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St
        6. Empty — accepts an empty value
        7. Bool — accepts true, on, yes, false, off, no (case-sensitive).
        8. -
        9. UInt — accepts any unsigned - integer that fits in 63 [sic] bits, written in decimal or 0x or 0X-prefixed hexadecimal. +
        10. UInt — accepts any unsigned integer that fits in 63 [sic] bits, + written in decimal or 0x or 0X-prefixed hexadecimal. A leading + is permitted, but -0 is not. Only 63 bits are allowed to support languages (Java) with no 64-bit unsigned integers.
        11. Int — accepts any (two’s complement) signed integer that fits in 64 bits, written in decimal or 0x or 0X-prefixed hexadecimal. A leading + (or, of course, -) is permitted.
        12. Float — - A floating-point number, written in ordinary decimal (e.g. -1.234, 7., 265) or in scientific notation + A floating-point number, written in ordinary decimal + (e.g. -1.234, 7., 265) or in scientific notation (e.g. 3e5, 3.E-5, -3.7e+5). A leading + (or, of course, -) is permitted. -
        13. 'value' — accepts the literal value value. value cannot contain a literal apostrophe '.
        14. -
        15. T | U, where T, U are types — accepts a value of type T or U.
        16. -
        17. Optional[T], where T is a type — equivalent to T | None.
        18. -
        19. List[T], where T is a type — accepts a list of entries of type T (see description +
        20. + 'value' — + accepts the literal value value. + value cannot contain a literal apostrophe '. +
        21. +
        22. + T | U, where T, U are types + — accepts a value of type T or U. +
        23. +
        24. + Optional[T], where T is a type + — equivalent to T | None. +
        25. +
        26. + List[T], where T is a type + — accepts a list of entries of type T (see description of lists above). Nested lists are not permitted.
        27. @@ -473,7 +510,8 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St
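The UInt and Int types described above might be checked as in the following Python sketch. The function names are hypothetical. The sketch assumes that a sign may precede a 0x prefix and that leading zeros are accepted; neither point is settled by the text visible in this patch. A regular expression is used instead of Python's `int()` alone because `int()` also accepts underscores and surrounding whitespace.

```python
import re
from typing import Optional

# Optional sign, then either 0x/0X-prefixed hexadecimal or plain decimal digits.
_INT_RE = re.compile(r"([+-]?)(?:0[xX]([0-9a-fA-F]+)|([0-9]+))")

def parse_int(value: str) -> Optional[int]:
    """Parse a signed 64-bit integer, or return None if value is not a valid Int."""
    m = _INT_RE.fullmatch(value)
    if m is None:
        return None
    sign, hex_digits, dec_digits = m.groups()
    magnitude = int(hex_digits, 16) if hex_digits is not None else int(dec_digits, 10)
    n = -magnitude if sign == "-" else magnitude
    # Two's complement 64-bit range.
    return n if -(1 << 63) <= n < (1 << 63) else None

def parse_uint(value: str) -> Optional[int]:
    """Parse an unsigned 63-bit integer; a leading + is fine, but -0 is rejected."""
    if value.startswith("-"):
        return None
    # With no minus sign, the Int range check already limits the result to 63 bits.
    return parse_int(value)
```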

          allow_unknown rule

          Default: inherited from parent (i.e. if k = j.component, - look up the allow_unknown rule for j), or yes if k has no parent (does not contain a dot). + look up the allow_unknown rule for j), + or yes if k has no parent (does not contain a dot).

          This describes whether or not the key k is allowed if it is not described in the schema.
@@ -487,8 +525,9 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St

          min, max rules

          - This schema key’s value sets the minimum/maximum value for the key’s value. This may only be set if type - explicitly allows numeric values (i.e. it contains a type Int/UInt/Float). + This schema key’s value sets the minimum/maximum value for the key’s value. + This may only be set if type explicitly allows numeric values + (i.e. it contains a type Int/UInt/Float).

          maxlength rule

          @@ -502,14 +541,18 @@ and for my.nephews.car.id is String (no schema key matches, so the default of St

          Missing values

          -If there is a schema key k.type, where k does not contain any *-components, -and the type does not allow unset values (None), and there is no schema key k.default, -then a configuration must contain the key k to follow the schema. + If there is a schema key k.type, where k + does not contain any *-components, + and the type does not allow unset values (None), + and there is no schema key k.default, + then a configuration must contain the key k to follow the schema.

          -Additionally, if there is a schema key j.*.k.type that does not allow unset values
          -and no corresponding default schema key, where k does not contain any *-components, then a configuration
          -containing a key x matching j.* must also contain the key x.k.
          + Additionally, if there is a schema key j.*.k.type
          + that does not allow unset values and no corresponding default schema key,
          + where k does not contain any *-components, then a configuration
          + containing a key x matching j.* must also contain
          + the key x.k.

          Extensions

          @@ -533,7 +576,8 @@ containing a key x matching j.* must also contain the
        28. load_string(string: String) -> Configuration
          - Load a configuration from a string (may be overloaded with load if language supports it). + Load a configuration from a string + (may be overloaded with load if language supports it).
        29. load_path(path: String) -> Configuration
          @@ -554,10 +598,15 @@ containing a key x matching j.* must also contain the Returns a list of all unique first components of keys in the configuration, in an arbitrary order.
        30. +
        31. + subkeys(conf: Configuration, key: String) -> List<String>
          + Equivalent to keys(section(conf, key)) but may be implemented more efficiently. +
        32. location(conf: Configuration, key: String) -> Optional<Location>
          Location of the definition of key in the configuration (file and line number). - Useful for reporting invalid values when the format of valid values can’t be described by a schema type. + Useful for reporting invalid values when + the format of valid values can’t be described by a schema type.
        33. get(conf: Configuration, key: String) -> Optional<String>
          @@ -570,7 +619,8 @@ containing a key x matching j.* must also contain the
        34. get_int(conf: Configuration, key: String) -> Optional<Int>
          get_int_or_default(conf: Configuration, key: String, default: Int) -> Int
          - Get value associated with key, if any exists, and parse it as a signed 64-bit integer, + Get value associated with key, + if any exists, and parse it as a signed 64-bit integer, following the Int schema type described above (returning default if the key doesn’t exist). Returns an error if the key exists but its value is not a valid signed 64-bit integer. @@ -578,7 +628,8 @@ containing a key x matching j.* must also contain the
        35. get_uint(conf: Configuration, key: String) -> Optional<UInt>
          get_uint_or_default(conf: Configuration, key: String, default: UInt) -> UInt
          - Get value associated with key, if any exists, and parse it as an unsigned 63-bit [sic] integer, + Get value associated with key, if any exists, + and parse it as an unsigned 63-bit [sic] integer, following the UInt schema type described above (returning default if the key doesn’t exist). Returns an error if the key exists but its value is not a valid unsigned 63-bit integer. @@ -586,7 +637,8 @@ containing a key x matching j.* must also contain the
        36. get_float(conf: Configuration, key: String) -> Optional<Float>
          get_float_or_default(conf: Configuration, key: String, default: Float) -> Float
          - Get value associated with key, if any exists, and parse it as a 64-bit IEEE-754 double precision + Get value associated with key, if any exists, + and parse it as a 64-bit IEEE-754 double precision floating-point number, following the Float schema type described above (returning default if the key doesn’t exist). Returns an error if the key exists but its value is not a valid floating-point number. @@ -607,7 +659,8 @@ containing a key x matching j.* must also contain the
        37. section(conf: Configuration, key: String) -> Configuration
          - Get the sub-configuration consisting of the descendants of key (i.e. keys starting with key.), + Get the sub-configuration consisting of the descendants of key + (i.e. keys starting with key.), with the initial key. removed, and their corresponding values. Returns an empty configuration if there are no descendants of key defined.
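To make the relationship between keys, section, and the newly added subkeys concrete, here is a Python sketch that models a configuration as a `dict` from full keys to values. The representation, the `subkeys_direct` variant, and the example data are assumptions; a real library would likely use its own Configuration type.

```python
from typing import Dict, List

Configuration = Dict[str, str]  # assumed representation: full key -> value

def keys(conf: Configuration) -> List[str]:
    """Unique first components of all keys, in arbitrary order."""
    return list({k.split(".", 1)[0] for k in conf})

def section(conf: Configuration, key: str) -> Configuration:
    """Descendants of key, with the initial 'key.' removed."""
    prefix = key + "."
    return {k[len(prefix):]: v for k, v in conf.items() if k.startswith(prefix)}

def subkeys(conf: Configuration, key: str) -> List[str]:
    """Defined exactly as in the spec: keys(section(conf, key))."""
    return keys(section(conf, key))

def subkeys_direct(conf: Configuration, key: str) -> List[str]:
    """Same result in one pass, without building the intermediate section."""
    prefix = key + "."
    return list({k[len(prefix):].split(".", 1)[0] for k in conf if k.startswith(prefix)})

# Hypothetical example data.
conf = {
    "ingredients.sugar.amount": "50 g",
    "ingredients.sugar.type": "brown",
    "ingredients.flour.amount": "1 cup",
    "time": "35 min",
}
```

With this data, `subkeys(conf, "ingredients")` contains `sugar` and `flour` once each, in no particular order, and `subkeys(conf, "time")` is empty because `time` has no descendants.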
        38. @@ -729,7 +782,8 @@ This configuration has the following mapping of keys to values:

          Schema for text editor’s configuration

          - Here is a schema which a text editor might use. The example text editor configuration above follows it. + Here is a schema which a text editor might use. + The example text editor configuration above follows it.

          
           # don't allow unknown keys by default