JSON for Humans.
JSON is great. Until you miss that trailing comma... or want to use comments. What about multiline strings? JSONH provides a much more elegant way to write JSON that's designed for humans rather than machines.
Since JSONH is compatible with JSON, any JSONH syntax can be represented with equivalent JSON.
{
// use #, // or /**/ comments
// quotes are optional
keys: without quotes,
// commas are optional
isn\'t: {
that: cool? # yes
}
// use multiline strings
haiku: '''
Let me die in spring
beneath the cherry blossoms
while the moon is full.
'''
// compatible with JSON5
key: 0xDEADCAFE
// or use JSON
"old school": 1337
}
JSONH is a format inspired by and closely related to HJSON. It aims to fix HJSON's pitfalls while keeping its elegant charm.
Unlike HJSON, JSONH is also backwards compatible with JSON5, a superset of JSON that adds features like hexadecimal numbers and escaped newlines.
Usability is at the forefront of JSONH, and this is evident in the features borrowed from all three formats.
JSON
|______
| |
JSON5 HJSON
|
JSONH
Name another format that features an anime mascot!
While JSON is designed to be readable by humans, readability is secondary to communication between machines. As such, its syntax is strict, inflexible and basic.
{
"name": "John Doe",
"age": 20
}
No comments, no trailing commas, no multiline strings, no hexadecimal numbers. It's as raw as it gets, and leaves something to be desired.
HJSON makes a number of adventurous improvements to JSON. Among quality-of-life changes like trailing commas are more zany ideas like quoteless strings.
{
name: John Doe
age: 20 # last we heard
}
However, HJSON's elegance is undermined by a number of design pitfalls that are too late to change. For example:
- Commas at the end of quoteless strings are parsed as part of the string.
- Numbers cannot be separated with underscores (e.g. `100_000`).
- Not backwards-compatible with JSON5.
JSONH should be considered as "HJSON v2".
JSON5 sticks much closer to JSON than other formats. It mainly adds things like trailing commas and quoteless property names.
{
name: "John Doe",
age: 20 // last we heard
}
Since its primary purpose is compatibility with ECMAScript, it's missing desirable features. For example:
- No multi-quoted strings.
- Commas cannot be omitted in arrays and objects.
- The root braces cannot be omitted.
YAML is a format that introduces more confusion than improvements to JSON.
name: "John Doe"
age: 20 # last we heard
Instead of building upon the JSON syntax, YAML provides a huge number of features, each one more error-prone than the last.
- Indentation-based arrays and objects, with confusion on when or how much indentation is necessary.
- Arbitrary dashes to signify the beginning of an object.
- Multiline string indicators like `>`, `|`, `>-`, `>+`, `|+` (is this readable??)
- YAML is not even a superset of JSON!
Safe to say, YAML is not easy to understand. JSONH is much more straightforward and still has all the features you need to express yourself.
TOML is based on INI rather than JSON, making it a format used strictly for configuration files. However, it adds support for JSON objects and other syntax.
[person]
name = "John Doe"
age = 20
Whereas JSON is hierarchical and unambiguous, it's not immediately clear what the attributes in TOML refer to. Additionally, if you want values that are objects, you end up using JSON anyway, making the TOML syntax inconsistent.
You might be thinking: new programming languages and formats get created all the time and never see the light of day due to a lack of usage. Basically, it's hard to get people to switch to new things.
However, in the case of JSONH, this is not a problem. Programming languages are most useful when they have widespread adoption and an ecosystem of packages, whereas configuration/data formats like JSONH are useful in personal projects regardless of how widely they are used. Use the format that's right for you.
Since JSONH is a superset of JSON and JSON5, all valid JSON and JSON5 is valid JSONH.
Objects contain an ordered sequence of properties (`key: value`).
They are wrapped in braces (`{}`).
Properties are separated with `,` or a newline. A single trailing comma is allowed.
If two properties have the same key, the first property is replaced.
{
a: b
c: d
}
{
"a": "b",
"c": "d"
}
A braceless object can be created at the root level. It terminates at the end of the document.
meal: pizza
drink: cola
snacks: [
"biscuit",
"chocolate"
]
{
"meal": "pizza",
"drink": "cola",
"snacks": [
"biscuit",
"chocolate"
]
}
Arrays contain an ordered sequence of items.
They are wrapped in brackets (`[]`).
Items are separated with `,` or a newline. A single trailing comma is allowed.
[
a
b
]
[
"a",
"b"
]
There are three named literals, just like JSON.
- `null` - no value (any type)
- `true` - true (boolean)
- `false` - false (boolean)
Other named literals (such as `Infinity` and `NaN`) are left as quoteless strings.
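As a rough illustration of the rule above, the following Python sketch maps an unquoted token to a value. The helper name `resolve_quoteless` is hypothetical, the match is assumed to be case-sensitive, and numbers are deliberately ignored here.

```python
# Hypothetical helper, not part of any JSONH implementation.
NAMED_LITERALS = {"null": None, "true": True, "false": False}


def resolve_quoteless(token: str):
    """Return the named literal for `token`, or the token itself as a string."""
    # Assumption: literals are matched exactly (case-sensitive).
    if token in NAMED_LITERALS:
        return NAMED_LITERALS[token]
    # Anything else, including Infinity and NaN, stays a quoteless string.
    return token
```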
Strings contain an ordered sequence of characters.
All strings can contain escape sequences starting with `\`:
- `\b` - backspace
- `\f` - form feed
- `\n` - newline
- `\r` - carriage return
- `\t` - tab
- `\v` - vertical tab
- `\0` - null
- `\a` - alert
- `\e` - escape (`\e` = `\u001b`)
- `\u0000` - UTF-16 escape sequence (`\u00E7` = `ç`)
- `\x00` - short UTF-16 escape sequence (`\xE7` = `ç`)
- `\U00000000` - UTF-32 escape sequence (`\U0001F47D` = `👽`)
- `\(newline)` - escaped newline (`\(newline)` = `(empty)`)
- `\(rune)` - literal character (`\q` = `q`)
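The escape table above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the name `unescape` is hypothetical, UTF-16 surrogate pairs written as two `\u` escapes are not combined, and a `\u`, `\x` or `\U` with too few hex digits is assumed to fall back to the literal-character rule.

```python
import re

# Single-character escapes from the list above.
SIMPLE_ESCAPES = {
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
    "v": "\v", "0": "\0", "a": "\a", "e": "\x1b",
}

# Order matters: fixed-width hex escapes first, then an escaped newline,
# then any other single character.
_ESCAPE = re.compile(
    r"\\(?:u([0-9A-Fa-f]{4})|x([0-9A-Fa-f]{2})|U([0-9A-Fa-f]{8})"
    r"|(\r\n|[\n\r\u2028\u2029])|(.))",
    re.DOTALL,
)


def unescape(text: str) -> str:
    """Resolve escape sequences in the raw contents of a string."""
    def replace(match):
        u4, x2, u8, newline, rune = match.groups()
        if u4 or x2 or u8:
            return chr(int(u4 or x2 or u8, 16))
        if newline is not None:
            return ""  # an escaped newline produces nothing
        return SIMPLE_ESCAPES.get(rune, rune)  # unknown escapes are literal
    return _ESCAPE.sub(replace, text)
```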
Double-quoted/single-quoted strings are wrapped in double-quotes (`"`) or single-quotes (`'`).
They can contain newlines.
"hello
world\n"
"hello\nworld\n"
Multi-quoted strings are wrapped in three or more double-quotes (`"""`) or single-quotes (`'''`).
The first whitespace -> newline run and the last newline -> whitespace run are stripped. If either is not present, nothing is stripped.
"""
 hello world """
"\n hello world "
""" hello world
 """
" hello world\n "
Otherwise, the whitespace after the last newline is stripped from the beginning of each line.
"""
  hello
   world
  """
"hello\n world"
Note
The recommended way for implementations to parse multi-quoted strings is with several forward-passes:
- Pass 0: read string
- Condition: skip remaining steps unless started with multiple quotes
- Pass 1: count leading whitespace -> newline
- Condition: skip remaining steps if pass 1 failed
- Pass 2: count trailing newline -> whitespace
- Condition: skip remaining steps if pass 2 failed
- Pass 3: strip trailing newline -> whitespace
- Pass 4: strip leading whitespace -> newline
- Condition: skip remaining steps if no trailing whitespace
- Pass 5: strip line-leading whitespace
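The passes above could be sketched in Python along these lines. The function name `strip_multi_quoted` is hypothetical and it receives only the text between the opening and closing quotes. Two details are assumptions rather than anything stated here: the leading and trailing newlines must be distinct characters, and pass 5 removes up to as many whitespace characters per line as follow the last newline.

```python
import re

LINE_TERMINATORS = "\n\r\u2028\u2029"
# Whitespace that is not a line terminator.
INLINE_SPACE = set(" \t\u000B\u000C\u0085\u00A0\u1680\u202F\u205F\u3000") | {
    chr(code) for code in range(0x2000, 0x200B)
}


def strip_multi_quoted(body: str) -> str:
    # Pass 1: count leading whitespace -> newline.
    lead = 0
    while lead < len(body) and body[lead] in INLINE_SPACE:
        lead += 1
    if lead == len(body) or body[lead] not in LINE_TERMINATORS:
        return body
    lead_end = lead + (2 if body.startswith("\r\n", lead) else 1)

    # Pass 2: count trailing newline -> whitespace.
    trail = len(body)
    while trail > lead_end and body[trail - 1] in INLINE_SPACE:
        trail -= 1
    if trail <= lead_end or body[trail - 1] not in LINE_TERMINATORS:
        return body

    # Passes 3 and 4: strip both runs, remembering the closing indent.
    indent = body[trail:]
    newline_start = trail - (2 if body.endswith("\r\n", lead_end, trail) else 1)
    core = body[lead_end:newline_start]
    if not indent:
        return core

    # Pass 5: strip line-leading whitespace, at most len(indent) per line.
    parts = re.split(r"(\r\n|[\n\r\u2028\u2029])", core)
    for i in range(0, len(parts), 2):  # even indexes are lines
        line = parts[i]
        cut = 0
        while cut < min(len(indent), len(line)) and line[cut] in INLINE_SPACE:
            cut += 1
        parts[i] = line[cut:]
    return "".join(parts)
```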
Quoteless strings are terminated by a newline or a symbol.
{ text: hello world, }
{ "text": "hello world" }
Unlike other types of strings, reserved symbols (`\`, `,`, `:`, `[`, `]`, `{`, `}`, `/`, `#`, `"`, `'`) and newlines must be escaped.
this \, is a comma. this\
\n is a newline.
"this , is a comma. this\n is a newline."
Leading and trailing whitespace is always stripped, including escaped whitespace.
a: b c \n,
{
"a": "b c"
}
Numbers represent a numeric value.
They consist of the following components:
- A sign (`+` or `-`)
  - Optional
- A base specifier (`0x` or `0X` or `0b` or `0B` or `0o` or `0O`)
  - Optional
- An integer
  - Optional if followed by dot integer
- A dot (`.`)
  - Optional
- An integer
  - Optional if preceded by integer dot
- An exponent (`e` or `E`)
  - Optional
  - A sign (`+` or `-`)
    - Optional if not hexadecimal
  - An integer
    - Optional if followed by dot integer
  - A dot (`.`)
    - Optional
  - An integer
    - Optional if preceded by integer dot
By default, every digit is decimal (base-10). If a base is specified, every digit is in that base:
- `0x` or `0X` -> hexadecimal (base-16)
- `0b` or `0B` -> binary (base-2)
- `0o` or `0O` -> octal (base-8)
Digits can be separated by one-or-more underscores (`_`).
- Underscores must be between digits or base specifiers or underscores.
Note
Exponents in hexadecimal numbers must be followed by `+` or `-` (e.g. `0x5e+3`), since otherwise the exponent is parsed as a hexadecimal digit.
[
1.0
.5e3
+64e-1.0
354_246.1_2_3
0xa1b.5e2
]
[
1,
500,
6.4,
354246.123,
258750.0
]
Note
Since numbers with fractional exponents (e.g. `1e3.4`) are often irrational, implementations may choose an arbitrary precision.
As such, fractional exponents should be avoided.
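To make the number components more concrete, here is a minimal Python sketch. The name `parse_number` is hypothetical, and several points are assumptions: the exponent always scales by a power of ten, exponent digits are read in the number's own base, the result is a Python `float`, and underscore placement is not validated.

```python
import re

BASES = {"0x": 16, "0b": 2, "0o": 8}


def _digits_value(digits: str, base: int) -> float:
    """Value of 'integer.fraction' digits in `base`; either side may be empty."""
    whole, _, fraction = digits.partition(".")
    if not whole and not fraction:
        raise ValueError("expected at least one digit")
    value = float(int(whole, base)) if whole else 0.0
    for position, digit in enumerate(fraction, start=1):
        value += int(digit, base) / base ** position
    return value


def _split_sign(text: str):
    """Return (sign, rest) for an optional leading + or -."""
    if text and text[0] in "+-":
        return (-1.0 if text[0] == "-" else 1.0), text[1:]
    return 1.0, text


def parse_number(text: str) -> float:
    # Assumption: underscores are simply dropped without checking placement.
    sign, s = _split_sign(text.replace("_", ""))
    base = BASES.get(s[:2].lower(), 10)
    if base != 10:
        s = s[2:]
    # In hexadecimal, "e" only starts an exponent when a sign follows it.
    marker = r"[eE](?=[+-])" if base == 16 else r"[eE]"
    mantissa, *rest = re.split(marker, s, maxsplit=1)
    exponent = 0.0
    if rest:
        exp_sign, exp_digits = _split_sign(rest[0])
        exponent = exp_sign * _digits_value(exp_digits, base)
    # Assumption: the exponent is a power of ten in every base.
    return sign * _digits_value(mantissa, base) * 10.0 ** exponent
```

For instance, `parse_number("0x5e+3")` treats `e+3` as an exponent, while `parse_number("0x1e3")` reads `e` as a hexadecimal digit.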
Comments are allowed in the space of any whitespace and do not affect the resulting JSONH.
Line comments start with a hash (`#`) or a double-slash (`//`) and are terminated by a newline.
# Numbers
3.14 // pi approximation
3.14
Block comments start with a slash-asterisk (`/*`) and are terminated by an asterisk-slash (`*/`).
[ /*
Example
*/ ]
[ ]
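One way a parser might treat comments as whitespace is to skip both in a single loop. The Python sketch below is only an illustration: `skip_trivia` is a hypothetical name, the whitespace set follows the characters this document lists as valid whitespace, and raising an error on an unterminated block comment is an assumption.

```python
LINE_TERMINATORS = "\n\r\u2028\u2029"
WHITESPACE = set(
    " \t\n\r\u000B\u000C\u0085\u00A0\u1680\u2028\u2029\u202F\u205F\u3000"
) | {chr(code) for code in range(0x2000, 0x200B)}


def skip_trivia(text: str, pos: int = 0) -> int:
    """Return the index of the next character that is not whitespace or comment."""
    while pos < len(text):
        if text[pos] in WHITESPACE:
            pos += 1
        elif text[pos] == "#" or text.startswith("//", pos):
            # Line comment: runs up to (not including) the line terminator.
            while pos < len(text) and text[pos] not in LINE_TERMINATORS:
                pos += 1
        elif text.startswith("/*", pos):
            close = text.find("*/", pos + 2)
            if close == -1:
                # Assumption: an unterminated block comment is an error.
                raise ValueError("unterminated block comment")
            pos = close + 2
        else:
            break
    return pos
```

A tokenizer would call this before reading each value, key or symbol.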
The following characters are valid whitespace:
- `\u0020` (space)
- `\u00A0` (non-breaking space)
- `\u1680` (Ogham space mark)
- `\u2000` (en quad)
- `\u2001` (em quad)
- `\u2002` (en space)
- `\u2003` (em space)
- `\u2004` (three-per-em space)
- `\u2005` (four-per-em space)
- `\u2006` (six-per-em space)
- `\u2007` (figure space)
- `\u2008` (punctuation space)
- `\u2009` (thin space)
- `\u200A` (hair space)
- `\u202F` (narrow no-break space)
- `\u205F` (medium mathematical space)
- `\u3000` (ideographic space)
- `\u2028` (line separator)
- `\u2029` (paragraph separator)
- `\u0009` (character tabulation / horizontal tab)
- `\u000A` (line feed)
- `\u000B` (line tabulation / vertical tab)
- `\u000C` (form feed)
- `\u000D` (carriage return)
- `\u0085` (next line)
This corresponds to `char.IsWhiteSpace` in .NET.
The following characters are valid line terminators:
- `\n` (line feed)
- `\r` (carriage return)
- `\r\n` (carriage return + line feed)
- `\u2028` (line separator)
- `\u2029` (paragraph separator)
This corresponds to line terminators in JSON5.
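A small Python sketch of splitting text on exactly these line terminators is shown below; `split_lines` is a hypothetical helper. Python's built-in `str.splitlines()` is avoided because it also breaks on characters such as `\u000B`, `\u000C` and `\u0085`, which are whitespace here but not line terminators.

```python
import re

# \r\n is listed first so it counts as a single terminator.
_LINE_TERMINATOR = re.compile(r"\r\n|[\n\r\u2028\u2029]")


def split_lines(text: str) -> list:
    """Split text on the line terminators listed above."""
    return _LINE_TERMINATOR.split(text)
```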
JSONH can be written in any Unicode encoding (UTF-8, UTF-16, UTF-32).
JSONH files should always end with `.jsonh`.
JSONH uses a versioning system to ensure significant changes to the syntax are properly documented.
Implementations may support one or more versions of JSONH.
The JSONH format is designed to be flexible, so you are free to ignore any usage recommendations.
Nevertheless, the following practices are recommended:
- Use UTF-8 encoding (see UTF-8 Everywhere).
- Use UNIX-style line endings (`\n`).
Contributions are welcome.
Please raise an issue. Note that the format shouldn't stray too far from HJSON or JSON5 and should be directly translatable to JSON.
You are welcome to write a JSONH parser or highlighter in a language of your choice. Raise an issue so it can be added to the list.
Type | Name | Package | Description |
---|---|---|---|
VSCode | JsonhVscode | | Syntax highlighter |
C# | JsonhCs | | Parser |
C++ | JsonhCpp | | Parser |
GDExtension | JsonhGdextension | | Parser using JsonhCpp |
CLI | JsonhCli | | Command line interface using JsonhCs |