One of the hardest problems around digital signatures is generating a digest or hash of the source data consistently, despite any structural changes that may have happened in transit. The objective of canonicalization is to ensure that the data is logically equivalent at the source and destination so that the digest can be calculated reliably on both sides, and thus be used in digital signatures.

In the world of XML, the Canonical XML Version 1.1 W3C recommendation sets out rules for creating consistent documents. Anyone who's worked with XML signatures (XML-DSIG), however, knows that despite the best intentions and libraries, it can still be difficult to get the expected results, especially when using different languages at the source and destination.

JSON, conversely, lacks a clearly defined or active industry standard for canonicalization, despite having a much simpler syntax. Indeed, the JSON Web Token specification gets around canonicalization issues by including the actual signed payload data as a Base64-encoded string inside the signature.

One of the objectives of GOBL is to create a document that could potentially be stored in any key-value format as an alternative to JSON, such as YAML, Protobuf, or maybe even XML. Perhaps GOBL documents need to be persisted to a document database like CouchDB or a JSONB field in PostgreSQL. It should not matter what the underlying format or persistence engine is, as long as the logical contents are exactly the same. Thus, when signing documents, it's essential that we have a reliable canonical version of JSON, even if the data is stored somewhere else.

The `c14n` package included in GOBL is inspired by the work of others and aims to define a simple, standardized approach to canonical JSON that could be implemented easily in other languages.
GOBL JSON C14n
GOBL considers the following JSON values as explicit types:

- a string
- a number, which extends the JSON spec and is split into:
  - an integer
  - a float
- an object
- an array
- a boolean
- null

JSON in canonical form:

- Must be encoded in valid UTF-8. A document with invalid character encoding will be rejected.
- Must not include superfluous or non-semantic whitespace.
- Must order the attributes of objects lexicographically by the code points of their names.
- Must remove attributes from objects whose value is `null`.
- Must not remove `null` values from arrays.
- Must represent numbers that are mathematically integers (i.e., those with a zero-valued fractional part) using the canonical JSON integer form. These numbers must not be represented with:
  - a leading minus sign when the value is zero (i.e., use `0`, not `-0`);
  - a decimal point (e.g., `3`, not `3.0`);
  - exponent notation (e.g., `1000`, not `1e3`);
  - leading zeroes (e.g., `42`, not `042`), as already prohibited by the JSON specification.
- Must represent floating-point numbers in exponential notation, adhering to the following format:
  - A nonzero single-digit integer part to the left of the decimal point (e.g., `1.23E3`, not `12.3E2`);
  - A nonempty fractional part to the right of the decimal point (e.g., `1.2E3`, not `1.E3`);
  - No trailing zeroes in the fractional part, unless required to satisfy the condition above;
  - A capital `E` as the exponent separator (not lowercase `e`);
  - No plus sign (`+`) in either the mantissa or the exponent;
  - No leading zeroes in the exponent (e.g., `1.2E3`, not `1.2E003`).
- Must represent all strings, including object attribute keys, in their minimal-length UTF-8 encoding:
  - using two-character escape sequences where possible for characters that require escaping, specifically:

    | Character | Escape Sequence | Unicode |
    | --- | --- | --- |
    | `"` Quotation Mark | `\"` | U+0022 |
    | `\` Reverse Solidus (backslash) | `\\` | U+005C |
    | ⌫ Backspace | `\b` | U+0008 |
    | ⇥ Character Tabulation (tab) | `\t` | U+0009 |
    | ␊ Line Feed (newline) | `\n` | U+000A |
    | ␌ Form Feed | `\f` | U+000C |
    | ↵ Carriage Return | `\r` | U+000D |

  - using six-character `\u00XX` uppercase hexadecimal escape sequences for control characters that require escaping but lack a two-character sequence described previously, and
  - rejecting any string containing invalid encoding.
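As a rough illustration of the number and string rules above, the following Go sketch formats a single number and a single string. It is not the `c14n` package's code: `canonicalNumber`, `canonicalString`, and `shortEscapes` are hypothetical names, the number is assumed to be a syntactically valid JSON number token, and non-integers are assumed to fit in a `float64`.

```go
package main

import (
	"fmt"
	"math/big"
	"strconv"
	"strings"
	"unicode/utf8"
)

// shortEscapes holds the two-character escape sequences (hypothetical name).
var shortEscapes = map[rune]string{
	'"': `\"`, '\\': `\\`, '\b': `\b`, '\t': `\t`, '\n': `\n`, '\f': `\f`, '\r': `\r`,
}

// canonicalString quotes s with the shortest permitted escapes and rejects
// input that is not valid UTF-8.
func canonicalString(s string) (string, error) {
	if !utf8.ValidString(s) {
		return "", fmt.Errorf("invalid UTF-8 in string")
	}
	var b strings.Builder
	b.WriteByte('"')
	for _, r := range s {
		if esc, ok := shortEscapes[r]; ok {
			b.WriteString(esc)
		} else if r < 0x20 {
			fmt.Fprintf(&b, `\u%04X`, r) // control character without a short escape
		} else {
			b.WriteRune(r) // everything else is written as plain UTF-8
		}
	}
	b.WriteByte('"')
	return b.String(), nil
}

// canonicalNumber renders the text of a JSON number in canonical form.
// Assumption: text is already a valid JSON number token.
func canonicalNumber(text string) (string, error) {
	// Exact arithmetic decides whether the fractional part is zero.
	if r, ok := new(big.Rat).SetString(text); ok && r.IsInt() {
		return r.Num().String(), nil // plain digits: no "-0", point, or exponent
	}
	// Assumption: float64 precision is acceptable for non-integers.
	f, err := strconv.ParseFloat(text, 64)
	if err != nil {
		return "", err
	}
	// FormatFloat yields e.g. "1.5E-03": shortest mantissa, signed padded exponent.
	mant, exp, _ := strings.Cut(strconv.FormatFloat(f, 'E', -1, 64), "E")
	if !strings.Contains(mant, ".") {
		mant += ".0" // the fractional part must not be empty
	}
	e, err := strconv.Atoi(exp) // drops the "+" sign and leading zeroes
	if err != nil {
		return "", err
	}
	return mant + "E" + strconv.Itoa(e), nil
}

func main() {
	for _, in := range []string{"-0", "3.0", "1e3", "0.0015", "0.1"} {
		out, err := canonicalNumber(in)
		fmt.Println(in, "->", out, err)
	}
	fmt.Println(canonicalString("tab\t \"quoted\" \x01"))
}
```

Under these assumptions, `0.0015` becomes `1.5E-3` and `1e3` becomes `1000`. Input from untrusted sources would also need a limit on exponent size, since the exact integer check expands the full value.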
The `c14n` package uses the Go `encoding/json` library's streaming methods to parse and recreate a document in memory. A simplified object model is used to map JSON structures ready to be converted into canonical JSON.
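One way to picture the streaming approach is with `json.Decoder.Token`, which returns a document one token at a time. The sketch below is an illustration under assumptions rather than the package's implementation: the `Attribute`, `Object`, and `Array` types and the `Parse` function are hypothetical, and sorting keys and dropping null attributes while the model is built is a choice made here for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Attribute, Object and Array form a hypothetical simplified model; the
// names are illustrative and not taken from the c14n package.
type Attribute struct {
	Key   string
	Value interface{}
}
type Object []Attribute
type Array []interface{}

// parseValue reads one complete JSON value from the token stream.
func parseValue(dec *json.Decoder) (interface{}, error) {
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	delim, ok := tok.(json.Delim)
	if !ok {
		return tok, nil // leaf: string, json.Number, bool or nil (null)
	}
	switch delim {
	case '[':
		arr := Array{}
		for dec.More() {
			v, err := parseValue(dec)
			if err != nil {
				return nil, err
			}
			arr = append(arr, v) // nulls inside arrays are kept
		}
		_, err = dec.Token() // consume the closing ']'
		return arr, err
	case '{':
		obj := Object{}
		for dec.More() {
			key, err := dec.Token() // object keys always arrive as strings
			if err != nil {
				return nil, err
			}
			v, err := parseValue(dec)
			if err != nil {
				return nil, err
			}
			if v != nil { // attributes with a null value are dropped
				obj = append(obj, Attribute{Key: key.(string), Value: v})
			}
		}
		// Go compares strings bytewise, which matches code point order for UTF-8.
		sort.Slice(obj, func(i, j int) bool { return obj[i].Key < obj[j].Key })
		_, err = dec.Token() // consume the closing '}'
		return obj, err
	}
	return nil, fmt.Errorf("unexpected delimiter %q", delim)
}

// Parse builds the model for a document held in a string.
func Parse(src string) (interface{}, error) {
	dec := json.NewDecoder(strings.NewReader(src))
	dec.UseNumber() // keep number text intact instead of converting to float64
	return parseValue(dec)
}

func main() {
	doc, err := Parse(`{"b": [1, null], "a": {"x": null}, "c": null}`)
	fmt.Printf("%v %v\n", doc, err)
}
```

Keeping attributes in a sorted slice rather than a Go map means the later encoding step can walk the model in order without sorting again.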