Kv (pronounced: cave) Format's driving idea is radical simplicity:
to use a regular grammar
(simpler than the context-free grammars of JSON and TOML).
It builds upon .env file conventions while imposing constraints
necessary for regular language properties.
Kv Format 1.0 is thus a standardized minimal common denominator
of the .env files used in practice,
intended to provide a foundation for extensions in 1.x versions.
The goal of Kv Format is to provide:
This design trades expressiveness (escaping, nesting, interpolation, multiline values) for single-pass parsing, deterministic processing, and straightforward implementation.
To avoid confusion with incompatible .env dialects,
Kv Format files should use the .kv extension.
This specification defines:
This specification does not define:
In Kv Format terminology, the parser processes Kv text to emit entries. The parser's responsibility ends with entry emission; application logic that consumes entries is outside this specification's scope. Thus, a parser in Kv Format terminology has a narrower meaning than typical: it is a token producer that yields entries for subsequent processing, without constructing concrete or abstract syntax trees.
The entry stream abstraction separates parsing (syntax recognition) from processing (semantic interpretation). This enables flexible integration into different application architectures while maintaining a clear boundary between the parser's responsibilities and the application's domain logic.
This specification is structured for stable evolution across 1.x versions:
The Technicalia provide the authoritative interpretation of the Normative Requirements. Implementations that conform to the Technicalia necessarily satisfy the Normative Requirements, and in case of ambiguity, the Technicalia take precedence.
The all-caps key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [rfc2119].
Unicode code points in the U+XXXX form
are defined in the Unicode Standard [unicode].
A Kv text is composed of different types of lines:
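- data lines, which assign a value to a key;
- comment lines, starting with # and extending to the end of the line;
- blank lines, which are empty or contain only whitespace.
Data lines and comment lines are collectively termed content lines.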
The parser emits an output of parsed entries, which come in several varieties:
In this specification, whitespace refers exclusively to:
- U+0020 SPACE
- U+0009 CHARACTER TABULATION
In Kv Format syntax, whitespace characters are generally treated like any other characters, except that they are tolerated as indentation and as optional spacing between keys and assignment operators.
EOL (End of Line) is a line terminator, either LF (POSIX-style line feed) or CRLF (Windows-style line ending). Kv Format tolerates intermixing of LF and CRLF within the same Kv text.
In this specification, a line refers to the sequence of characters from the start of the line up to and including the line terminator (EOL).
A parser is a process that reads Kv text and emits an entry stream according to this specification.
This section represents a descriptive definition of the Kv Format syntax. The formal grammar presented in Appendix A is the normative definition, and has precedence in case of any ambiguity.
A key-value data line has the basic form (BNF):
key OP value EOL
where:
- key is a string identifier,
- OP is an assignment operator; for simple string values this is
  %x3D ; EQUALS SIGN (=)
- value is the remainder of the line, extending from the assignment operator
  to the end of the line, excluding the EOL
- EOL is the line terminator (LF or CRLF).
For example:
host=kvformat.org
Keys and assignment operators may be indented for presentation clarity. For example:
database =db1
port     =5432
user     =admin
Note that any whitespace characters after the assignment operator =
would not be considered optional indentation, but part of the value.
The general ABNF form of a data line is therefore
*WSP key *WSP OP value EOL
where *WSP represents zero or more whitespace characters.
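As an illustration of the general form, the following Python sketch splits a single data line (with its EOL already removed) into key and value. The function name split_data_line is hypothetical and key validation is deliberately left out; the sketch only shows that whitespace around the key is dropped while everything after the first = is kept verbatim.

```python
def split_data_line(line: str) -> tuple[str, str]:
    """Split a data line (without its EOL) at the first '='.

    Hypothetical helper: whitespace around the key is removed,
    the value is returned exactly as written.
    """
    key_part, sep, value = line.partition("=")
    if not sep:
        raise ValueError("MISSING_OPERATOR_ERROR")
    # Only U+0020 and U+0009 count as whitespace in Kv Format.
    key = key_part.strip(" \t")
    return key, value
```

For instance, split_data_line("key = abc") yields the key "key" and the value " abc", with the leading space retained in the value.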
A comment line has the basic form (BNF):
"#" comment-text EOL
where # is the NUMBER SIGN character (%x23).
For example:
# This is a comment
As with data lines, comments may be indented. The following lines are all syntactically equivalent:
# comment
  # comment
    # comment
This does not hold for whitespace characters after the # character,
which are considered part of the comment text.
Thus, the following lines are all different:
#comment
# comment
#  comment
General ABNF form of a comment line:
*WSP "#" comment-text EOL
A blank line is an empty line or a line containing only whitespace characters:
*WSP EOL
Blank lines are used for visual separation without semantic meaning.
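The three line types described in this section can be told apart by looking at the first non-whitespace character. The sketch below is a minimal Python illustration; the function name classify_line and the returned labels are assumptions made for this example.

```python
def classify_line(line: str) -> str:
    """Return 'blank', 'comment', or 'data' for a line without its EOL."""
    stripped = line.lstrip(" \t")  # indentation is not significant
    if stripped == "":
        return "blank"
    if stripped.startswith("#"):
        return "comment"
    return "data"
```

Note that a line such as #key=value is classified as a comment, because the NUMBER SIGN is checked before any assignment operator is looked for.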
EOLs are line terminators.
EOL = LF / CRLF
where:
LF = %x0A ; LINE FEED
CR = %x0D ; CARRIAGE RETURN
CRLF = CR LF
Every line in a Kv text MUST be terminated with an EOL (either LF or CRLF), including the final line. An EOL is considered part of the line (see Section 2.6).
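One possible way to split a Kv text into lines while tolerating mixed LF and CRLF terminators is sketched below in Python. The generator name iter_lines is hypothetical, and raising an exception for a missing final EOL is just one of several conceivable error-reporting choices.

```python
def iter_lines(text: str):
    """Yield (line, eol) pairs; raise if the final line lacks an EOL."""
    pos = 0
    while pos < len(text):
        lf = text.find("\n", pos)
        if lf == -1:
            raise ValueError("MISSING_FINAL_EOL_ERROR")
        # A CR directly before the LF belongs to a CRLF terminator.
        if lf > pos and text[lf - 1] == "\r":
            yield text[pos:lf - 1], "\r\n"
        else:
            yield text[pos:lf], "\n"
        pos = lf + 1
```

Any CR that remains inside a yielded line is a standalone CR and would still have to be rejected by a later character check.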
Shebang lines [execveman] are file-level directives, not part of Kv text syntax. Shebangs MUST NOT be treated as comments. See Section 6.4 for details.
A parser processes a Kv text line by line, emitting a stream of entries.
Parsers MUST keep track of line numbers.
Each line is parsed into an entry that contains:
Entry payloads contain the line's semantic content; the EOL terminator is consumed during parsing and is not part of any entry payload.
The method of representing entries (tuples, structs, objects, etc.) is implementation-defined.
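As one possible representation, an implementation in Python might model entries as an immutable dataclass. The class name Entry and all field names below are assumptions made for illustration; the entry type labels mirror those used in the scanning pseudocode later in this specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    kind: str                  # e.g. "KV_ENTRY" or "COMMENT_ENTRY"
    line_num: int              # line number of the originating line
    key: Optional[str] = None  # set for KV entries only
    text: str = ""             # value (KV entry) or comment text


kv = Entry(kind="KV_ENTRY", line_num=1, key="host", text="kvformat.org")
comment = Entry(kind="COMMENT_ENTRY", line_num=2, text=" This is a comment")
```

Neither payload contains the EOL terminator, in line with the rule above.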
Keys in KV entries MUST NOT contain any whitespace characters.
Any leading or trailing whitespace in a key MUST be removed by the parser,
while any whitespace inside the key represents an INVALID_KEY_ERROR
(Appendix B.6).
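The distinction between surrounding whitespace (removed) and inner whitespace (an error) can be sketched in Python as follows. The names KEY_RE and normalize_key are hypothetical; the pattern follows the key rule of the grammar in Appendix A.

```python
import re

# Key syntax per Appendix A: a letter or underscore, then letters,
# digits, or underscores.
KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def normalize_key(raw: str) -> str:
    """Trim surrounding whitespace, then validate what remains."""
    key = raw.strip(" \t")
    if key == "":
        raise ValueError("EMPTY_KEY_ERROR")
    if KEY_RE.fullmatch(key) is None:
        raise ValueError("INVALID_KEY_ERROR")
    return key
```

With this sketch, "  database " normalizes to "database", while "foo bar" is rejected because the space sits inside the key.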
Parsers emit an entry stream — a unidirectional flow of entries derived from lines in the Kv text.
Shebang handling:
When parsing a Kv file that begins with a shebang line (see Section 6.4),
the parser MAY emit a shebang entry for the shebang line as line number zero
(Section 4.1).
Shebang entries MUST preserve all characters of the
shebang line verbatim except the EOL terminator,
including the #! prefix.
Shebang entries MUST NOT be emitted for any line beyond the first.
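A minimal Python sketch of this handling is given below. The function name split_shebang and the "SHEBANG_ENTRY" label are assumptions made for this example; the sketch emits the optional shebang entry and returns the remaining Kv text separately.

```python
def split_shebang(file_text: str):
    """Return (shebang_entry_or_None, remaining_kv_text)."""
    if not file_text.startswith("#!"):
        return None, file_text
    line, _, rest = file_text.partition("\n")
    if line.endswith("\r"):  # the shebang line was terminated by CRLF
        line = line[:-1]
    # The line is kept verbatim, including "#!", and reported as line zero.
    return ("SHEBANG_ENTRY", line, 0), rest
```

Because only the start of the file is inspected, a later line beginning with #! is left in the Kv text and is parsed as an ordinary comment line.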
Parsers MUST treat all Kv text values as strings, and MUST preserve values verbatim. Parsers MUST NOT interpret any characters (including but not limited to quotes, backslashes, dollar signs, percent signs, etc.) as having special meaning or initiating escape sequences. Specifically:
Values MUST NOT contain EOL characters. All other characters, including any leading or trailing whitespace characters (see Section 4.5), MUST be preserved verbatim.
Empty value parsing: When an assignment operator is immediately followed by EOL, parsers MUST emit the value as an empty string.
Parsers MUST ensure valid UTF-8 encoding. Validation may occur:
BOM handling:
Kv Format does not support the UTF-8 Byte Order Mark (BOM).
If a Kv text begins with the bytes EF BB BF, parsers MUST
raise a BOM_ERROR (Appendix B.1).
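A sketch of up-front validation in Python is shown below, assuming the whole input is available as bytes and that the implementation stops at the first error. The function name decode_kv_bytes is hypothetical; bytes.decode with strict error handling is the standard library call that performs the UTF-8 validation.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def decode_kv_bytes(data: bytes) -> str:
    """Reject a leading BOM and invalid UTF-8, then return the text."""
    if data.startswith(UTF8_BOM):
        raise ValueError("BOM_ERROR")
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("INVALID_UTF8_ERROR") from exc
```

An implementation that validates line by line instead would apply the same decoding step to each line as it is read.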
Parsers implement one of two models:
Parsers MUST document which model they implement and their behavior when an error occurs:
Lines causing errors SHOULD NOT emit entries.
A comprehensive list of parsing errors is given in Appendix B.
This specification does not mandate maximum limits for:
Implementations MAY impose reasonable limits based on available resources, but SHOULD document these limits and provide clear error messages when limits are exceeded.
By design, Kv Format relegates most semantic choices to applications consuming the entry stream. This includes decisions such as value whitespace trimming, duplicate key handling, and comment preservation. This section is therefore limited to reiterating some basic principles.
Kv Format values preserve all characters exactly as written (see Section 4.6). No characters have special meaning: quotes, backslashes, dollar signs, percent signs, etc., are all literal characters.
This design gives applications flexibility to interpret values according to their needs. For example, a configuration loader might trim trailing whitespace, while a round-trip serializer would preserve it exactly.
Consistent with .env file precedent, Kv Format texts may contain duplicate keys. This follows from the regular grammar: parsers emit entries without maintaining key state.
The entry stream thus contains all instances of repeated keys. Applications consuming the entry stream determine how to handle duplicate keys according to their needs. Common strategies include:
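For illustration, three possible strategies (first occurrence wins, last occurrence wins, collect all values) might be sketched in Python as follows. The strategy and function names are assumptions of this example rather than terms defined by this specification, and entries are modeled simply as (key, value) pairs.

```python
from collections import defaultdict


def first_wins(entries):
    """Keep the value of the first occurrence of each key."""
    result = {}
    for key, value in entries:
        result.setdefault(key, value)
    return result


def last_wins(entries):
    """Keep the value of the last occurrence of each key."""
    return dict(entries)  # later pairs overwrite earlier ones


def collect_all(entries):
    """Keep every value, in order of appearance."""
    result = defaultdict(list)
    for key, value in entries:
        result[key].append(value)
    return dict(result)
```

All three functions live on the application side: the parser still emits both entries, and the choice between strategies is made by the consumer of the entry stream.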
Kv Format files SHOULD use the .kv file extension.
When transmitting Kv Format content over protocols that use media types [rfc2046]
(such as HTTP [rfc9110] or email [rfc5322]),
implementations SHOULD use text/org.kvformat as the Content-Type,
with the following optional parameters (following RFC 2045 syntax [rfc2045]):
- version: OPTIONAL. If present, MUST be a string (quoted or unquoted) containing the
  Kv Format specification version number (e.g., 1.0). This parameter indicates which version
  of the Kv Format syntax the content conforms to.
- charset: OPTIONAL. If present, the value MUST be utf-8 (quoted or unquoted).
  Since Kv Format requires UTF-8 encoding (Section 4.7),
  this parameter is redundant but MAY be included for compatibility with systems
  that require explicit charset declaration.
Example valid HTTP Content-Type headers:
Content-Type: text/org.kvformat
Content-Type: text/org.kvformat; version=1.0
Content-Type: text/org.kvformat; charset=utf-8
Content-Type: text/org.kvformat; version="1.0"; charset="utf-8"
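A receiver could check such a header value with Python's standard email.message.Message class, which parses media type parameters and removes surrounding quotes. The function name parse_kv_content_type is hypothetical, and comparing the charset case-insensitively is an assumption of this sketch, not a requirement stated here.

```python
from email.message import Message


def parse_kv_content_type(header_value: str):
    """Return (version, charset) from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    if msg.get_content_type() != "text/org.kvformat":
        raise ValueError("not a Kv Format media type")
    version = msg.get_param("version")  # None when the parameter is absent
    charset = msg.get_param("charset")
    # Assumption: the charset is compared case-insensitively.
    if charset is not None and charset.lower() != "utf-8":
        raise ValueError("charset must be utf-8")
    return version, charset
```

Quoted and unquoted parameter values are returned in the same form, so version="1.0" and version=1.0 both yield the string 1.0.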
If Kv Format gains sufficient adoption, text/kv may be registered with IANA in the future.
Kv files MUST be encoded as UTF-8 without a Byte Order Mark (BOM).
A shebang line [execveman] is an optional first line in a Kv file that specifies how to execute the file as a script. Shebang lines typically have the form:
#!/path/to/interpreter [optional argument]
Simple examples:
APP_NAME=My Application
API_KEY=sk_live_1234567890abcdef
DEBUG=true
PATH=/usr/local/bin:/usr/bin
Empty value (value is an empty string):
EMPTY=
Leading and trailing spaces in values are preserved
(values are «foo·» and «·bar», where · denotes a space)
trailing=foo·
leading=·bar
No inline comments (value is «1 # default value»)
NOT_AS_INTENDED=1 # default value
Quotes are characters like any other (value is «"hello"»)
quoted="hello"
Backslashes are literals, never used as escapes (value is «\n»
— two characters: a backslash and the letter "n")
BACKSLASH=\n
No interpolation (values are «/usr/bin» and «$PATH:/usr/var/bin»)
PATH=/usr/bin
PATH=$PATH:/usr/var/bin
Empty comment (comment text is an empty string)
#
Comment consisting of a space (comment text is «·»)
#·
Comment with leading and trailing spaces (comment text is «··comment···»)
#··comment···
Duplicate keys are valid in the parser context (see Section 5.2); this results in two emitted entries:
KEY=1
KEY=2
Invalid key: dashes not allowed (INVALID_KEY_ERROR)
KEY-NAME=value
Invalid key: cannot start with a digit (INVALID_KEY_ERROR)
123KEY=value
Invalid key: cannot contain period (INVALID_KEY_ERROR)
KEY.NAME=value
Missing assignment operator = (MISSING_OPERATOR_ERROR)
KEY value
Empty key (EMPTY_KEY_ERROR)
=value
Comments, not data lines
#=foo
#key=value
Valid data line with value «=foo».
key==foo
Valid data line with an empty string for a value
key=
Valid data line with a leading space in the value: «·abc»
key = abc
Error — space is not an operator (MISSING_OPERATOR_ERROR)
key value
Error — colon is not an operator (MISSING_OPERATOR_ERROR)
key:value
Error — colon is not a valid key character (INVALID_KEY_ERROR)
key:=value
Error — space is not a valid key character (INVALID_KEY_ERROR)
foo bar=value
Kv Format 1.x versions are not sequential updates in the traditional sense. All 1.x format definitions were designed and specified concurrently, then organized into a layered specification collection with progressive feature inclusion. This approach reflects the fundamental trade-off between feature richness and implementation size.
The Kv Format 1.x versions provide increasing functionality:
Each version includes all features of previous versions. The complete 1.4 specification is not an "update" to 1.0 but a superset specification that adds new syntactic constructs while maintaining backward compatibility.
As a regular language, Kv Format implementations are expected to outperform context-free alternatives (JSON, TOML, YAML) while keeping the compiled parser footprint significantly smaller. The layered version approach directly addresses resource-constrained systems by enabling implementers to choose the minimal feature set their application requires:
Each successive version adds parsing complexity and increases the minimum implementation size. The version numbers thus serve as size/complexity indicators rather than "latest and greatest" markers.
To help users select optimal implementations for their constraints, Kv Format mandates an unorthodox implementation versioning scheme:
Examples:
- 10.0.0
- 10.0.1
- 10.1.0
- 13.0.0
- 14.2.0
This scheme provides immediate visual identification of format support: the major version 10-14 directly maps to the supported feature set, while preserving semantic versioning semantics for the implementation itself.
This versioning scheme applies to the Kv Format 1.x specification series, which at the time of Kv Format 1.0 publication includes versions 1.0 - 1.4. Any possible future version mapping will be addressed in those versions' specifications.
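Assuming, as the examples above suggest, that an implementation's major version equals 10 plus the minor number of the supported Kv Format version, a tool could derive the supported format version as sketched below. The function name supported_format_version is hypothetical, and the mapping itself is this stated assumption rather than a quotation of the scheme.

```python
def supported_format_version(impl_version: str) -> str:
    """Map an implementation version such as '13.0.0' to '1.3'."""
    major = int(impl_version.split(".")[0])
    if not 10 <= major <= 14:
        raise ValueError("major version outside the 1.x mapping (10-14)")
    # Assumption: major 10 corresponds to format 1.0, major 14 to 1.4.
    return f"1.{major - 10}"
```

Under this reading, implementation versions 10.0.0, 10.0.1, and 10.1.0 all target Kv Format 1.0 and differ only in their own patch and minor releases.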
The following grammar is the normative definition of Kv Format syntax in ABNF form [rfc5234]. In case of any ambiguities in the descriptive main text of this specification, this grammar has precedence.
kv-text = *kv-line
kv-line = data-line / comment-line / blank-line
data-line = *WSP key *WSP "=" value EOL
comment-line = *WSP "#" comment-text EOL
blank-line = *WSP EOL
key = ( ALPHA / "_" ) *( ALPHA / DIGIT / "_" )
value = *valid-char
comment-text = *valid-char
; Terminals
EOL = LF / CRLF
valid-char = %x01-09 / %x0B-0C / %x0E-10FFFF
The following rules from RFC 5234 [rfc5234] are used:
ALPHA = %x41-5A / %x61-7A (A-Z / a-z)
DIGIT = %x30-39 (0-9)
LF = %x0A (line feed)
CR = %x0D (carriage return)
CRLF = CR LF
WSP = SP / HTAB = %x20 / %x09 (space or tab)
Note on valid-char:
The valid-char rule excludes NUL (U+0000), LF (U+000A),
and CR (U+000D) for parsing reasons.
Implementations MUST perform UTF-8 validation separately
(Section 4.7).
As valid-char includes only valid code points,
the range U+D800 to U+DFFF (UTF-16 surrogates)
is therefore implicitly excluded.
Note on EOF:
This grammar requires every line to be terminated by EOL.
Input where the final line lacks an EOL terminator does not
match this grammar; parsers MUST raise a
MISSING_FINAL_EOL_ERROR
(Appendix B.7).
This appendix classifies parse failures. Where feasible in the implementation environment, parsers SHOULD report each parse failure as one of the errors below. If possible, error reports SHOULD include line numbers (Section 4.1).
The errors below are listed in order of priority. In case of multiple errors occurring on any line, parsers MUST report the highest priority error, and MAY report further errors from the same line.
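One way to encode this priority order in Python is an IntEnum whose numeric values follow the order of the subsections below, so that the smallest value has the highest priority. The names KvError and primary_error are hypothetical.

```python
from enum import IntEnum


class KvError(IntEnum):
    """Parse errors; a lower value means a higher priority."""
    BOM_ERROR = 1
    INVALID_UTF8_ERROR = 2
    INVALID_CHARACTER_ERROR = 3
    EMPTY_KEY_ERROR = 4
    MISSING_OPERATOR_ERROR = 5
    INVALID_KEY_ERROR = 6
    MISSING_FINAL_EOL_ERROR = 7


def primary_error(errors):
    """Return the highest-priority error among those found on one line."""
    return min(errors)
```

A parser that detects several problems on the same line would pass them all to primary_error and report at least the returned error.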
BOM_ERROR
A Kv text MUST NOT begin with a UTF-8 BOM sequence (EF BB BF) [unicode].
Parsers MUST detect texts starting with the BOM sequence and raise a BOM_ERROR.
Whether parsing proceeds after this error detection follows the implementation's error handling strategy.
INVALID_UTF8_ERROR
Parsers MUST process input as UTF-8. Upon encountering invalid UTF-8 sequences,
parsers MUST raise an INVALID_UTF8_ERROR.
INVALID_CHARACTER_ERROR
Kv text MUST NOT contain standalone CR characters (U+000D) or NUL characters (U+0000).
Parsers MUST raise an INVALID_CHARACTER_ERROR when encountering:
- a CR character (U+000D) that is not part of a CRLF sequence
- a NUL character (U+0000)
EMPTY_KEY_ERROR
A key MUST NOT be an empty string. If a data line starts with the assignment operator,
parsers MUST raise an EMPTY_KEY_ERROR.
MISSING_OPERATOR_ERROR
If a data line (i.e., a line that is neither blank nor a comment line) does not contain
an assignment operator, parsers MUST raise a MISSING_OPERATOR_ERROR.
INVALID_KEY_ERROR
All keys in a Kv text MUST:
Otherwise, parsers MUST raise an INVALID_KEY_ERROR.
MISSING_FINAL_EOL_ERROR
If the final line of a Kv text is missing an EOL terminator, parsers MUST consider the line invalid,
and MUST raise a MISSING_FINAL_EOL_ERROR.
This error MAY be reported without a line number.
This section provides informative guidance for implementing a parser using a simple scanning approach. The normative requirements are specified in the preceding Sections and Appendices. Implementations MAY use different parsing strategies as long as they conform to the grammar specified in Appendix A, detect all errors defined in Appendix B and produce equivalent results.
A. initialize pos = 0, error_count = 0
B. File-level artifacts
   1. if file starts with UTF-8 BOM (EF BB BF):
      - echo "BOM_ERROR at start of file" > stderr
      - error_count += 1
      - pos += 3 # skip BOM
   2. if file[pos..] starts with "#!":
      - shebang_line, eol = extract first line starting at pos
      - if shebang_line contains invalid UTF-8:
        * echo "INVALID_UTF8_ERROR in line 0 (shebang)" > stderr
        * error_count += 1
      - pos += length(shebang_line) + length(eol) # skip shebang line
C. Extract Kv text
   - text = file[pos..]
   - initialize i = 0 (byte-level index), line_num = 0
D. while i < length(text):
   1. (string eol, index i_eol) = find next LF or CRLF starting from i
   2. if not found:
      - echo "MISSING_FINAL_EOL_ERROR at end of file" > stderr
      - error_count += 1
      - -> break # exit loop and go to step E
   3. line = text[i..i_eol]
   4. line_num += 1
   5. parse line:
      a. if line starts with U+0020 or U+0009: # trim leading whitespace
         - i_nonws = find first non-whitespace character index
         - if not found (all whitespaces): line = ""
         - else: line = line[i_nonws..]
      b. if line == "": # ignore blank line
         - -> continue to step 6
      c. if line contains invalid UTF-8:
         - echo "INVALID_UTF8_ERROR in $line_num" > stderr
         - error_count += 1
         - -> continue to step 6
      d. if line contains NUL or CR:
         - echo "INVALID_CHARACTER_ERROR in $line_num" > stderr
         - error_count += 1
         - -> continue to step 6
      e. if line starts with '#':
         - yield ("COMMENT_ENTRY", line[1..], line_num)
         - -> continue to step 6
      f. else: parse as data line:
         - i_op = index of first '=' in line
         - if '=' is not found:
           * echo "MISSING_OPERATOR_ERROR in $line_num" > stderr
           * error_count += 1
           * -> continue to step 6
         - key = trim_trailing_whitespace(line[0..i_op])
         - if key == "":
           * echo "EMPTY_KEY_ERROR in $line_num" > stderr
           * error_count += 1
           * -> continue to step 6
         - if key does not match /^[_a-zA-Z][_a-zA-Z0-9]*$/:
           * echo "INVALID_KEY_ERROR in $line_num" > stderr
           * error_count += 1
           * -> continue to step 6
         - value = line[i_op+1..]
         - yield ("KV_ENTRY", key, value, line_num)
   6. i = i_eol + length(eol) # skip LF or CRLF
E. return (error_count > 0 ? 1 : 0)
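The scanning steps above might translate into Python roughly as follows. This is an informal sketch, not a reference implementation: the function name parse_kv is hypothetical, entries and errors are collected into lists instead of being streamed, and the shebang line is skipped without UTF-8 validation for brevity.

```python
import re

KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_kv(data: bytes):
    """Return (entries, errors) for the raw bytes of a Kv file."""
    entries, errors = [], []
    pos = 0
    if data.startswith(b"\xef\xbb\xbf"):        # step B.1: BOM
        errors.append(("BOM_ERROR", 0))
        pos = 3
    if data.startswith(b"#!", pos):             # step B.2: skip shebang
        nl = data.find(b"\n", pos)
        pos = len(data) if nl == -1 else nl + 1
    line_num = 0
    while pos < len(data):                      # step D
        nl = data.find(b"\n", pos)
        if nl == -1:
            errors.append(("MISSING_FINAL_EOL_ERROR", None))
            break
        line = data[pos:nl]
        pos = nl + 1
        if line.endswith(b"\r"):                # CR of a CRLF terminator
            line = line[:-1]
        line_num += 1
        line = line.lstrip(b" \t")              # optional indentation
        if not line:
            continue                            # blank line
        try:
            text = line.decode("utf-8")
        except UnicodeDecodeError:
            errors.append(("INVALID_UTF8_ERROR", line_num))
            continue
        if "\x00" in text or "\r" in text:
            errors.append(("INVALID_CHARACTER_ERROR", line_num))
            continue
        if text.startswith("#"):
            entries.append(("COMMENT_ENTRY", text[1:], line_num))
            continue
        key_part, sep, value = text.partition("=")
        if not sep:
            errors.append(("MISSING_OPERATOR_ERROR", line_num))
            continue
        key = key_part.rstrip(" \t")
        if key == "":
            errors.append(("EMPTY_KEY_ERROR", line_num))
            continue
        if KEY_RE.fullmatch(key) is None:
            errors.append(("INVALID_KEY_ERROR", line_num))
            continue
        entries.append(("KV_ENTRY", key, value, line_num))
    return entries, errors
```

A caller would inspect the returned error list to decide whether to use the entries; an empty list corresponds to the return value 0 in step E.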
Notes:
This specification is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). A human-readable summary of the license terms is provided below, but the full legal code is the authoritative version.
You are free to:
Under the following terms:
Notices:
When attributing this specification, please include:
https://kvformat.org/spec/1.0-rc1.html
Example attribution:
Kv Format 1.0 Specification RC1. © 2026 Bojan Đuričković.
https://kvformat.org/spec/1.0-rc1.html (CC BY 4.0)
This specification is provided "AS IS" without warranty of any kind. The authors and copyright holders make no representations or warranties, express or implied, regarding the accuracy, completeness, or suitability of this specification for any particular purpose.
In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with this specification or the use or other dealings in this specification.
This license applies to the specification document only. Reference implementations, example code, and other associated materials are licensed separately as indicated in their respective repositories.