SJOT: Schemas for JSON Objects
by Robert van Engelen, September 28, 2016.
Updated November 15, 2017.
The JSON schema draft was an important move forward to make JSON more useful with APIs and other systems that require JSON content validation.
However, working with JSON schema can be daunting and defeats the simplicity of JSON.
We created a simpler alternative to JSON schema that is more compact and easier to use. We call it Schemas for JSON Objects or simply SJOT.
SJOT aims at fast JSON validation and type-checking with lightweight schemas and compact validators.
SJOT schemas are valid JSON, just like JSON schema. But SJOT schemas are faster, more compact, and more intuitive. A SJOT schema of an object can be as simple as a JSON object template. Because SJOT schemas have the look and feel of a template, SJOT is easy to use.
Not convinced? Try a live demo of SJOT and snapSJOT in action.
SJOT by example
As a first example, say we have a JSON representation of a company product, similar to the json-schema.org example of a JSON schema (which is over 40 lines long!) that describes a company product API. An example product in this API is:
{
"id": 1,
"name": "A green door",
"price": 12.50
}
The product properties id, name, and price are considered the bare minimum properties of a product and should therefore be required. Other products may contain optional tags, dimensions, and a warehouseLocation:
{
"id": 2,
"name": "An ice sculpture",
"price": 12.50,
"tags": ["cold", "ice"],
"dimensions": {
"length": 7.0,
"width": 12.0,
"height": 9.5
},
"warehouseLocation": {
"latitude": -78.75,
"longitude": 20.4
}
}
Let's give this a SJOT, pun intended (comments are added for clarity and are not part of SJOT):
{
"@id": "http://example.com/product.json", ← identify this schema (this is optional!)
"@note": "A company product", ← describe what is defined
"product": { ← define a product object that has...
"id": "number", ← a required id number
"name": "string", ← a required name string
"price": "<0.0..", ← a required price in decimal greater than 0.0
"tags?": "string{1,}", ← an optional tags array of unique strings (a non-empty set)
"dimensions?": { ← optional dimensions, when provided has..
"length": "number", ← a required length numeric dimension
"width": "number", ← a required width numeric dimension
"height": "number" ← a required height numeric dimension
},
"warehouseLocation?": "http://example.com/geo.json#location"
← an optional warehouseLocation with a type defined by another SJOT
}
}
It's easy to see that property names ending in ? are optional. Property types are either named types (such as "number"), {}-objects, []-arrays (see later), or references to other SJOT schemas, such as "http://example.com/geo.json#location".
Similar to the json-schema.org example, the location type of warehouseLocation is defined in a separate SJOT schema:
{
"@id": "http://example.com/geo.json",
"location": { ← define location that has...
"latitude": "float", ← a required latitude single precision float
"longitude": "float" ← a required longitude single precision float
}
}
As a side note, if you don't want to write schemas at all, then consider using the JS snapSJOT.convert(data) function defined in the snapSJOT module of the npm package snapsjot. This converts JSON data and JS values to a SJOT schema, see npm package snapsjot.
SJOT types include the basic JSON types "string", "number", "boolean", "null", "object", and "array" but also more specific types, such as "char[0,6]", "float", integer ranges "0..10" and float ranges "0.0..10.0", and arrays of these, such as "string[1,10]" for an array of 1 to 10 strings and "1..10[3][4]" for an array of 4 arrays of 3 integers between 1 and 10.
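To make the range notation concrete, here is a minimal Python sketch of how a numeric range type such as "0..10", "0.0..10.0", or the "<0.0.." used for price could be checked. It also handles comma-separated enumerations such as "4,6,8..10", which SJOT permits as well. The function names are hypothetical and not part of the SJOT API, and the rule that bounds written without a decimal point admit only integers is an assumption inferred from the "integer ranges" and "float ranges" wording above.

```python
def _parse_bound(text):
    """Return (value, is_integer); an empty bound is open and gives (None, True)."""
    if text == "":
        return None, True
    if "." in text or "e" in text.lower():
        return float(text), False
    return int(text), True


def _matches_item(item, value):
    """Check one constant or range such as "6", "8..10", "<0.0.." or "<0..10>"."""
    lower_exclusive = item.startswith("<")
    upper_exclusive = item.endswith(">")
    body = item[1:] if lower_exclusive else item
    body = body[:-1] if upper_exclusive else body
    if ".." in body:
        low_text, high_text = body.split("..", 1)
    else:
        low_text = high_text = body  # a single constant
    low, low_is_int = _parse_bound(low_text)
    high, high_is_int = _parse_bound(high_text)
    # Assumption: bounds written without a decimal point admit integers only.
    if low_is_int and high_is_int and not float(value).is_integer():
        return False
    if low is not None and (value <= low if lower_exclusive else value < low):
        return False
    if high is not None and (value >= high if upper_exclusive else value > high):
        return False
    return True


def matches_numeric_type(spec, value):
    """True if value satisfies a range or enumeration such as "4,6,8..10"."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return any(_matches_item(item.strip(), value) for item in spec.split(","))


print(matches_numeric_type("<0.0..", 12.5))   # price from the product example
print(matches_numeric_type("1..10", 2.5))     # integer range, decimal value
```

A real validator would also have to reject malformed range strings; this sketch assumes the type string is well formed.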
Object types are just the {}-brackets with members, as an "inline" style.
To create an array using the "inline style" (without requiring named types in strings), simply use a pair of [ ] brackets to enclose the type. For example, [{"id":"number"}] is an array of objects with numeric id properties.
The json-schema.org example schema actually defines an array of products. In SJOT, the product array type is referred to by a SJOT type reference "http://example.com/product.json#product[]". This reference uses an array annotation and this suffices to describe and validate a JSON array of products.
Type referencing of the form URI#name is used to refer to a named type in a schema, such as http://example.com/geo.json#location that references the location object type defined in the http://example.com/geo.json schema.
As you can see, a SJOT type reference is very simple and clean. A type reference string contains a # reference to a global type in a schema without requiring deeper multi-hop paths (no JSON pointers or paths).
A reference to a type in the current schema (e.g. one that has no @id attribute property) is simply written as #name with an empty URI. A reference to the root type in a schema is simply written as URI#, and as # for the root type of the current schema.
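Because a reference is just a URI, a #, and a name, taking one apart needs no path traversal. The following Python sketch illustrates this; split_type_reference is a hypothetical helper, not a function of the SJOT library.

```python
def split_type_reference(ref):
    """Split a SJOT type reference into (uri, name).

    An empty uri stands for the current schema and an empty name for the
    root type, following the URI#name, #name, URI# and # forms.
    """
    uri, separator, name = ref.partition("#")
    if not separator:
        raise ValueError("not a type reference (no '#'): %r" % ref)
    return uri, name


print(split_type_reference("http://example.com/geo.json#location"))
print(split_type_reference("#Person"))
print(split_type_reference("#"))
```

Note that an array annotation such as the [] in "http://example.com/product.json#product[]" stays attached to the name in this sketch and would have to be stripped separately.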
Multiple schemas can be combined in a list of schemas, each schema with a unique @id. Types can be referenced between these schemas and the schemas in the array are used to validate JSON data. See the examples in this article and check out our live demo of SJOT in action.
SJOT can be translated to JSON schema draft v4 without loss of details. See the SJOT to JSON schema converter.
SJOT schema basics
A SJOT schema is a dictionary with named types and a @root type:
{
"@root": type,
"SomeType": type,
"AnotherType": type,
...
}
Each type is either atomic (i.e. a primitive type), an object type, an array type, a reference to a named type, or unions thereof to define alternate choices of types. Types are explained in the next section.
The @root property indicates the root type of the JSON document to validate.
For example:
{ "@root": "string[0,999]" }
This schema validates JSON arrays of strings. This array can contain up to 999 items.
If the schema has only one type, then @root can be replaced by any name of your choosing:
{ "mystrings": "string[0,999]" }
However, if the schema has multiple named types, then a @root is mandatory to avoid ambiguity.
The following example defines a @root document type that refers to a Person type using the #Person type reference, a Name string type, and a Person object with firstname and lastname properties:
{
"@id": "http://example.com/sjot.json",
"@root": "#Person",
"Name": "string",
"Person": {
"firstname": "#Name",
"lastname": "#Name"
}
}
A SJOT schema may optionally include an @id property to declare a namespace URI to identify the schema. Using a URL to identify a schema can be useful when external schemas must be loaded by a validator.
Note that the firstname and lastname of a Person object refer to a Name instead of just a string. This is useful, because if we decide later to restrict the string content of names then we only have to do this once, for example by changing the Name type as follows:
{
...
"Name": "(\\w(\\w|\\s)*)",
...
}
where \\w matches a letter, digit, or underscore and \\s matches a whitespace character.
SJOT schema types
SJOT has a list of built-in primitive types that are commonly used, besides "boolean", "number", "string", and "null". Objects, arrays, sets, tuples, and unions are simply defined in a SJOT schema using an inline style.
It only takes two tables to list all SJOT schema constructs. A SJOT type is one of:
"any"                    any type (wildcard)
"atom"                   any non-null primitive type (boolean, number, or string)
"boolean"                Boolean with value true or false
"true"                   fixed value true
"false"                  fixed value false
"byte"                   8-bit integer
"short"                  16-bit integer
"int"                    32-bit integer
"long"                   64-bit integer
"ubyte"                  8-bit unsigned integer
"ushort"                 16-bit unsigned integer
"uint"                   32-bit unsigned integer
"ulong"                  64-bit unsigned integer
"integer"                integer (unconstrained)
"float"                  single precision decimal
"double"                 double precision decimal
"number"                 decimal number (unconstrained)
"n..m"                   inclusive numeric range (n, m are optional integer/decimal values)
"<n..m>"                 exclusive numeric range (n, m are optional integer/decimal values)
"n,m..k,l"               numeric enumeration with ranges (choice of integer/decimal values)
"string"                 string
"base64"                 string with base64 content
"hex"                    string with hexadecimal content
"uuid"                   string with UUID content, optionally starting with urn:uuid:
"date"                   string with RFC 3339 date YYYY-MM-DD
"time"                   string with RFC 3339 time and optional time zone HH:MM:SS[.s][[+|-]HH:MM|Z]
"datetime"               string with RFC 3339 datetime and optional time zone
"duration"               string with ISO-8601 duration PnYnMnDTnHnMnS
"char"                   string with a single character (ASCII, Unicode, UTF-8, etc.)
"char[n,m]"              string of n to m characters (n, m are optional)
"(regex)"                string that matches the regex
"type[]"                 array of typed values, shorthand for [ type ]
"type[n,m]"              array of n to m typed values, shorthand for [ n, type, m ]
"type{}"                 set of atoms (array of unique atoms)
"type{n,m}"              set of n to m atoms (n, m are optional)
"#name"                  reference to a named type in the current schema
"URI#name"               reference to a named type in schema "@id": "URI"
"object"                 object, same as {}
"array"                  array, same as []
"null"                   fixed value null
[ type ]                 array of typed values
[ n, type, m ]           array of n to m typed values (n, m, type are optional)
[ type, ..., type ]      tuple of typed values
[[ type, ..., type ]]    union (choice) of types
{ "name": type, ... }    object with typed properties
The property names of object types can be annotated to make them optional or match a pattern:
"name"                   property is required
"name?"                  property is optional
"name?value"             property with a default value (primitive types only!)
"(regex)"                property name(s) that match the regex
If the character ? is to be part of a property name, then we write it as a regex (who\\?), with a double backslash to escape the ? (a single backslash will be removed by most JSON parsers). Likewise, if a property name starts with a ( then we write it as a regex.
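The four property name forms can be told apart by inspecting the key alone. A small Python sketch of that classification follows; parse_property_name is a hypothetical helper, and it leaves the default value as raw text because how SJOT converts that text to a typed value is not specified here.

```python
def parse_property_name(key):
    """Return (kind, name, default_text) for a SJOT object property key."""
    if key.startswith("("):
        # Regex keys such as "(who\\?)": keep the pattern, drop the parentheses.
        return ("pattern", key[1:-1], None)
    name, question_mark, default_text = key.partition("?")
    if not question_mark:
        return ("required", name, None)
    if default_text == "":
        return ("optional", name, None)
    return ("default", name, default_text)


for key in ("id", "tags?", "counter?1", "(who\\?)"):
    print(key, "->", parse_property_name(key))
```

Attribute properties such as @final start with @ and are not handled by this sketch.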
Objects with required, optional, and default properties
An example object type with a required, optional, and default property is:
{
"Widget": { ← a widget has...
"id": "string", ← a required id
"tags?": "string{1,}", ← an optional non-empty array of unique string tags
"counter?1": "ulong" ← an optional counter with default value 1
}
}
To disallow additional properties, add the "@final": true attribute property.
To permit optional properties to occur depending on other optional properties, see the SJOT dependencies described further below.
An object with any properties is "object" or just {}. An empty object that does not permit any properties is { "@final": true }.
Regex properties and values
Regex anchoring with ^ and $ is unnecessary (JSON and SJOT are language and regex library neutral: regex patterns match entire strings). For example, this dictionary object maps words to words:
{ "(\\w+)": "(\\w+)" }
To match strings partially, simply use a .* at the ends of the regex.
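The whole-string rule can be illustrated with Python's re.fullmatch, which behaves as if the pattern were anchored at both ends. This is only a sketch of the matching rule: Python's regex dialect is an assumption here, and a SJOT validator may use a different regex library.

```python
import re


def matches(pattern, text):
    """Whole-string match, mirroring the rule that patterns match entire strings."""
    return re.fullmatch(pattern, text) is not None


print(matches(r"\w+", "hello"))               # a single word
print(matches(r"\w+", "hello world"))         # the space is not matched by \w
print(matches(r".*world.*", "hello world!"))  # partial match via .* at the ends
```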
Additional types with constraints can be easily added to a SJOT schema, for example the ISO 6709 Annex H latitude and longitude type values (see the Google JSON Style Guide):
{
"@id": "http://example.com/iso-6709.json",
"@note": "ISO 6709 Annex H latitude and longitude location",
"LatLon": "([+- ]\\d{2}(.\\d+)?[+- ]\\d{3}(.\\d+)?)"
}
Special string types such as ID, URI, email, hostname, and so on can be easily defined with a regex and put in a schema for reuse.
Tuples
A tuple is a fixed-length list of values, such as [ "point", true ], which is defined by the tuple type:
[ "string", "boolean" ]
Arrays and sets
Arrays of named types are simply defined by "type[]" without bounds and "type[n,m]" with bounds. The lower and upper bounds are optional, so "type[n,]" and "type[,m]" can be used. Use "type[n]" for a fixed-size array.
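As an illustration of the named array notation, the Python sketch below extracts the item type and bounds from a type string. parse_array_type is a hypothetical helper, not part of SJOT. It treats a missing lower bound as 0, and it does not special-case "char[n,m]", which is a bounded string rather than an array.

```python
import re

# item type, then "[", optional n, optional ",m", then "]"
_ARRAY_SUFFIX = re.compile(r"(.+)\[(\d*)(?:(,)(\d*))?\]")


def parse_array_type(type_string):
    """Return (item_type, min_items, max_items), or None without an array suffix.

    max_items is None when the array has no upper bound.
    """
    match = _ARRAY_SUFFIX.fullmatch(type_string)
    if match is None:
        return None
    item_type, first, comma, second = match.groups()
    if comma is None:
        if first == "":
            return (item_type, 0, None)              # "type[]"
        return (item_type, int(first), int(first))   # "type[n]"
    min_items = int(first) if first else 0
    max_items = int(second) if second else None
    return (item_type, min_items, max_items)


print(parse_array_type("string[1,10]"))
print(parse_array_type("1..10[3][4]"))  # outer dimension is the last suffix
```

For a nested type such as "1..10[3][4]" the sketch peels off the outermost dimension first, which matches the reading "an array of 4 arrays of 3 integers"; calling it again on the returned item type yields the inner dimension.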
The inline style for arrays is [type] without bounds and [n, type, m] with bounds, where n and m are non-negative integers. The lower and upper bounds are optional, so [n, type] and [type, m] can be used. The type is also optional and is "any" when omitted. Thus, [] is an array of any type with any length, [0] is an empty array, [2] is an array with two items, and [1,3] is an array of one to three items of any type.
For example, extending a Widget object type example to include an array of quantity-price objects:
{
"Widget": { ← a widget has...
"id": "string", ← a required id
"tags?": "string{1,}", ← an optional non-empty array of unique string tags
"counter?1": "ulong", ← an optional counter with default value 1
"pricing?": [ ← an optional array of quantity-price objects
{
"quantity": "1..", ← quantity
"price": "<0.0.." ← price per quantity
}
]
}
}
Sets of named types "type{}" without bounds and "type{n,m}" with bounds are essentially arrays of atomic values that are unique. The lower and upper bounds are optional.
Uniqueness of atomic values is well defined. By contrast, object equality is often semantic instead of structural. That is, two objects may still be considered equivalent when structurally different, such as when extra properties are to be ignored. Therefore, SJOT does not admit sets of non-atomic values. This requirement makes sorting stable and validation of sets (with sorting) fast.
Enumerations
To enumerate numbers for a numeric type, use constants and ranges:
"Composite": "4,6,8..10,12,14..16"
To enumerate strings, use regex alternations:
"Color": "(RED|GREEN|YELLOW|BLUE)"
Enumerations of mixed types are modeled with a union:
"TrueOrColorOrByte": [[ "true", "(RED|GREEN|YELLOW|BLUE)", "byte" ]]
Unions
A union of types describes the range of possible types that a value may have. For example, this union represents a string or a number value:
[[ "string", "number" ]]
Array types and object types in the union must be distinct. Objects are distinct if they do not share properties. For example, the following union has two distinct object types:
[[ { "a": "number" }, { "b": "string" } ]]
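The distinctness rule for objects in a union amounts to a disjointness test on property names. A minimal Python sketch, assuming that the ? annotation and any default value are ignored when comparing names and that regex property names are compared literally (a simplification):

```python
def property_names(object_type):
    """Collect the property names of an inline object type, without annotations."""
    names = set()
    for key in object_type:
        if key.startswith("@"):
            continue  # attribute properties such as @final are not data properties
        names.add(key if key.startswith("(") else key.partition("?")[0])
    return names


def are_distinct(first, second):
    """Two object types are distinct if they share no property names."""
    return property_names(first).isdisjoint(property_names(second))


print(are_distinct({"a": "number"}, {"b": "string"}))
print(are_distinct({"a": "string", "b": "number"}, {"b": "string"}))
```

Both function names are hypothetical and only serve to illustrate the rule.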
To combine objects that are not distinct in a union, you should define new objects that use a new outer property name that acts as a unique tag:
[[
{ "t1": { "a": "string", "b": "number" } },
{ "t2": { "b": "string" } }
]]
Why is this recommended? The goal of SJOT is to make validation fast and scalable with predictable validation times, similar to XML schema validators for XML data bindings. Therefore, the SJOT validator must be able to determine the type of the value efficiently among the choices in the union, using constant algorithmic complexity. By contrast, JSON schema's "oneOf" and "anyOf" are not always efficient because the validator may have to revisit the data multiple times.
This recommendation also enhances readability of the JSON data by design. Consider a counterexample where we have a choice of two objects that are not distinct:
{ "data": [ a long array of objects ], "id": 456 }
and
{ "data": [ a long array of objects ], "date": "01-01-2017" }
Since both objects have a data array, they overlap. By just looking at the JSON text, one has to search after the array to find the potentially distinguishing properties. This is not acceptable from a performance point of view. A compounding problem is that JSON does not require properties to be ordered in any way, so a fast object identification check cannot be guaranteed.
A tag is needed to distinguish these objects properly, making them immediately recognizable and distinct:
{ "locations": { "data": [ a long array of objects ], "id": 456 } }
and
{ "invoices": { "data": [ a long array of objects ], "date": "01-01-2017" } }
Arrays in a union are distinct if the item types of the arrays are distinct. This takes care of notorious problems with JSON schema when using "oneOf" instead of "anyOf" for type choices. A "oneOf" over M arrays of length N may require M x N time to validate while SJOT takes at most M+N time. Worse, validation with this JSON schema "oneOf" fails for an empty array because it matches all arrays in the "oneOf" (surprise!).
You may have guessed by now that a union is a smart combination of "oneOf" and "anyOf". The validator applies "anyOf" semantics for efficiency, but the restriction on distinct types essentially forces "oneOf" semantics by avoiding ambiguity.
Finally, unions should not be nested, either directly or indirectly via a type reference to another union or array of unions.
Type references
To refer to a named type we use a SJOT type reference of the form URI#name or #name. The first form refers to the named type in the schema identified by its @id and URI value and the second form refers to the current schema. If the reference is to the @root type then we use URI# and just #, respectively.
For example, a linked list of numbers can be very compactly defined as:
{ "@root": { "value": "number", "next?": "#" } }
Spaghetti references are not allowed: a type reference must refer to a type and that type cannot directly be another referenced type.
SJOT in JSON
A SJOT schema can be embedded within a JSON object by using the @sjot property. The embedded schema describes and validates that object. For example:
{
"@sjot": {
"Person": {
"@note": "Person with a first name and a last name",
"firstname": "string",
"lastname": "string"
}
},
"firstname": "Jason",
"lastname": "Bourne"
}
When embedded, the SJOT schema should have only one type or define a @root object type (if several types are defined) that defines the JSON document content. In this example the Person object type describes the content. The JSON content is valid because it includes the required firstname and lastname properties of a Person object type.
An embedded SJOT may refer to an external schema's root using URL#. For example, the same object above with a schema reference:
{
"@sjot": "http://example.com/sjot.json#",
"firstname": "Jason",
"lastname": "Bourne"
}
The @sjot URL points to a SJOT schema that has a Person object type as the root, such as the SJOT schema that we described earlier in this article.
An embedded SJOT may refer to a specific type in a schema:
{
"@sjot": "http://example.com/sjot.json#Person",
"firstname": "Jason",
"lastname": "Bourne"
}
When you invoke the validator with a specific type and schema, then only that type and schema are used to validate the data. Use null as a type when invoking the validator to permit an embedded @sjot to override the type.
A @sjot in a JSON object may occur anywhere in the JSON, not just in the root-level object.
A @sjot may contain an array of schemas, each identified with a unique @id.
SJOT attribute properties
A @sjot attribute property of an object in JSON contains an embedded SJOT that defines the JSON object. An embedded @sjot value can be a type reference to a SJOT schema. If multiple types are defined in the embedded SJOT schema, the type that defines the JSON object should be named @root.
An @id attribute property in a SJOT schema identifies the schema by a URI namespace string.
A @note attribute property can be added to a SJOT schema and to the object types that the schema defines. The @note value should be a string.
A @root attribute property refers to the root type of the schema. An embedded SJOT should have a @root attribute property or the schema should define only one type.
A @one, @any, @all, or @dep attribute property of an object type in a SJOT schema restricts the use of optional object properties. See the SJOT dependencies described further below.
An @extends attribute property of an object type in a SJOT schema introduces a derived object type. A derived object type includes the properties of a base object type. We will discuss the use of base and derived object types below.
A @final attribute property declares an object type final, so it cannot be extended. Also, extra properties for this object in JSON are not permitted.
SJOT base and derived object types
You can extend a base object by adding properties to define a derived object. The @extends attribute property in an object type refers to a base object type that is extended. For example:
{
"@id": "http://www.example.com/sjot.json",
"@note": "Schema to store personal information",
"Person": {
"@note": "Person with a first name and a last name",
"firstname": "string",
"lastname": "string"
},
"PersonDetails": {
"@note": "Person with optional age and gender",
"@extends": "http://www.example.com/sjot.json#Person",
"age?": "0..",
"gender?": "(MALE|FEMALE)"
}
}
The age? property is optional and has a non-negative integer value. The gender? property is optional and has one of the two string values MALE or FEMALE.
When creating derived object types, it is not permitted to override the base properties. Only new properties can be added that are not already in the base object type to create a derived object type.
This ensures that a derived object can be used in place of a base object in JSON and will pass validation by ignoring the extra properties in the derived object. This permits upgrading of a JSON API with backward compatibility to a base API.
A derived object type can change a base property from optional to required by using a @one singleton propset with that property name.
SJOT final object types
A @final object cannot have any extra properties that are not defined in the schema. Consider the PersonDetails example from the previous section but now declared @final:
{
"PersonDetails": {
"@note": "Person with optional age and gender",
"@extends": "http://www.example.com/sjot.json#Person",
"@final": true,
"age?": "0..",
"gender?": "(MALE|FEMALE)"
}
}
Additional properties that are used in a JSON PersonDetails object will cause the validator to reject this JSON content.
SJOT any, one, and all dependencies
When object type properties are optional, you can make their use dependent on the presence of other properties in the object. You can enforce that exactly one property of a set is present, that at least one property of a set is present, or that the properties of a group are either all present or all absent. More specific property dependencies can be enforced as well.
SJOT one
The SJOT @one attribute property of an object type is a list of sets of object property names. Each property set defines the properties that should be exclusive, meaning only one of the properties may be present.
For example, the choices object type defined below has one of the properties a, b, or c, and one of the properties x or y:
{
"choices": {
"a?": "int",
"b?": "int",
"c?": "int",
"x?": "float",
"y?": "float",
"@one": [
[ "a", "b", "c" ],
[ "x", "y" ]
]
}
}
The property sets in the @one list should be mutually disjoint and only refer to properties that are optional (without default values) in the schema.
SJOT any
The SJOT @any attribute property of an object type is a list of sets of object property names. Each property set defines the properties of which one or more should be used in this object.
For example, the anyabc object type defined below must have at least one of the properties a, b, and c and therefore cannot be empty:
{
"anyabc": {
"a?": "int",
"b?": "int",
"c?": "int",
"@any": [
[ "a", "b", "c" ]
]
}
}
The property sets in the @any list should be mutually disjoint and only refer to properties that are optional (without default values) in the schema.
SJOT all
The SJOT @all attribute property of an object type is a list of sets of object property names. Each property set defines which properties should all be included when at least one of them is used, meaning that all properties should be present or none of them at all.
For example, the allornone object type defined below must have both of the properties x and y or none of them:
{
"allornone": {
"x?": "int",
"y?": "int",
"@all": [
[ "x", "y" ]
]
}
}
The property sets in the @all list should be mutually disjoint and only refer to properties that are optional (without default values) in the schema.
SJOT dep
The SJOT @dep attribute property of an object type enforces properties to be present when a specific property is present.
For example, the ifxthenyz object type defined below must have properties y and z if property x is present:
{
"ifxthenyz": {
"x?": "int",
"y?": "int",
"z?": "int",
"@dep": {
"x": [ "y", "z" ]
}
}
}
To simplify this notation, if a property list has only one property, the property name can be directly used instead of the singleton list.
The property sets in each @dep list should only refer to properties that are optional (without default values) in the schema.
Note that the @all attribute property enforces the N dependencies for a group of N properties that are all dependent on each other.
SJOT validation
Validation proceeds recursively over objects, arrays, and tuples.
Primitive values (atoms) are verified against the value type constraints that are imposed on a value by using the type information in the SJOT schema.
The property names of an object are matched against the property names of a SJOT object type. For each matching property name the value is recursively validated.
If a property is required but is absent, validation fails.
If a property is optional and is absent or its value is null, validation succeeds, meaning that null is equivalent to absent for optional properties. In this case the null property can be deleted by the validator.
If an optional property has a default value and is absent or its value is null, the default value is assumed and the default value can be assigned to this property by the validator.
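The two rules for optional properties can be sketched in Python as a single pass over the object type. apply_optional_rules is a hypothetical helper, and parsing the default text with json.loads (falling back to a plain string) is an assumption, since the exact conversion of default values is not described here.

```python
import json


def apply_optional_rules(object_type, obj):
    """Return a copy of obj with null optionals removed and defaults filled in."""
    result = dict(obj)
    for key in object_type:
        if key.startswith(("@", "(")) or "?" not in key:
            continue  # attribute property, regex key, or required property
        name, _, default_text = key.partition("?")
        if result.get(name) is not None:
            continue  # a real value is present, leave it alone
        if default_text == "":
            result.pop(name, None)  # null is equivalent to absent
        else:
            try:
                result[name] = json.loads(default_text)  # e.g. "1" becomes 1
            except ValueError:
                result[name] = default_text  # assumed fallback: plain string
    return result


widget_type = {"id": "string", "tags?": "string{1,}", "counter?1": "ulong"}
print(apply_optional_rules(widget_type, {"id": "w1", "tags": None}))
```

The sketch returns a new object instead of modifying its input, which keeps the original JSON data available for error reporting.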
The @one, @any, @all, and @dep constraints on object properties are enforced. For the @one constraints, exactly one property must occur for each property set specified. For the @any constraints, at least one of the properties must occur for each property set specified. For the @all constraints, all or none of the properties must occur for each property set specified. For the @dep constraints, if an optional property is present then the properties in the specified property set must all be present.
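These four checks are simple counting rules over the properties that are present. The Python sketch below is one way to express them; check_dependencies is a hypothetical function, and it treats a null value as absent, following the rule for optional properties stated earlier.

```python
def _present(obj, name):
    """A property counts as present when it exists and is not null."""
    return obj.get(name) is not None


def check_dependencies(object_type, obj):
    """Return a list of violated @one, @any, @all and @dep constraints."""
    errors = []
    for props in object_type.get("@one", []):
        if sum(_present(obj, p) for p in props) != 1:
            errors.append("@one: exactly one of %s must be present" % props)
    for props in object_type.get("@any", []):
        if not any(_present(obj, p) for p in props):
            errors.append("@any: at least one of %s must be present" % props)
    for props in object_type.get("@all", []):
        if sum(_present(obj, p) for p in props) not in (0, len(props)):
            errors.append("@all: all or none of %s must be present" % props)
    for name, needed in object_type.get("@dep", {}).items():
        if isinstance(needed, str):
            needed = [needed]  # a single name may replace a singleton list
        if _present(obj, name):
            missing = [p for p in needed if not _present(obj, p)]
            if missing:
                errors.append("@dep: %s requires %s" % (name, missing))
    return errors


vehicle_type = {"color?": "(WHITE|GRAY|BLACK)", "rgb?": "([0-9a-fA-F]{6})",
                "make": "string", "@one": [["color", "rgb"]]}
print(check_dependencies(vehicle_type, {"rgb": "D71E1E", "make": "Honda"}))
```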
Extra properties of an object are ignored unless the object type is @final. Validation fails when extra properties are present in a final object.
An array is validated by checking constraints on its length and the uniqueness of atomic items in case of a set. In case of a set of atoms atom{}, it is assumed that integers and floating point values are compared based on their mathematical value, not their type. So a set cannot contain both 0 and 0.0.
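A uniqueness check with this numeric rule can be sketched in Python as follows. atoms_are_unique is a hypothetical helper, and it uses a hash set rather than sorting. Booleans are kept apart from numbers on purpose, because Python would otherwise treat True and 1 as equal although they are different JSON values.

```python
def atoms_are_unique(items):
    """True if no two atoms in the list are equal by JSON value."""
    seen = set()
    for item in items:
        if isinstance(item, bool):
            key = ("boolean", item)
        elif isinstance(item, (int, float)):
            key = ("number", item)  # 0 and 0.0 compare and hash as equal
        else:
            key = ("string", item)
        if key in seen:
            return False
        seen.add(key)
    return True


print(atoms_are_unique([0, 0.0]))     # same mathematical value
print(atoms_are_unique([1, True]))    # a number and a Boolean
```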
A null value in an array is converted when validated against a primitive type. The result is false for Boolean, 0 for numeric types, and "" for string types. An array of objects, arrays, or tuples cannot contain null values; a null value in such an array triggers a validation error.
A tuple is validated by validating its members, with the same validation rule for null as for arrays stated above. Tuple sizes are fixed. Validation fails when tuples are not of the correct size.
An object that is validated against the types any or object is validated using its embedded @sjot schema, when present.
SJOT examples
Vehicle data with embedded schema
{
"@sjot": {
"vehicle": {
"color?": "(WHITE|GRAY|BLACK)",
"rgb?": "([0-9a-fA-F]{6})",
"make": "string",
"year?": "1970..",
"@one": [
[ "color", "rgb" ]
]
}
},
"rgb": "D71E1E",
"make": "Honda",
"year": 2006
}
Product catalog with embedded schemas
{
"@sjot": [
{
"@id": "http://example.com/product.json",
"@note": "Company product catalog",
"@root": {
"products": "http://example.com/product.json#product[]"
},
"product": {
"@note": "A company product",
"id": "number",
"name": "string",
"price": "<0.0..",
"tags?": "string{1,}",
"dimensions?": {
"length": "number",
"width": "number",
"height": "number"
},
"warehouseLocation?": "http://example.com/geo.json#location"
}
},
{
"@id": "http://example.com/geo.json",
"location": {
"latitude": "float",
"longitude": "float"
}
}
],
"products": [
{
"id": 1,
"name": "A green door",
"price": 12.50
},
{
"id": 2,
"name": "An ice sculpture",
"price": 12.50,
"tags": ["cold", "ice"],
"dimensions": {
"length": 7.0,
"width": 12.0,
"height": 9.5
},
"warehouseLocation": {
"latitude": -78.75,
"longitude": 20.4
}
}
]
}
SJOT chameleon objects: trick or treat?
A tricky situation arises when a derived object type extends a base object type that is defined in another schema.
Assuming that one or more of the base object properties refer to a type in the current base schema by using a local #type reference, then the scope of these type references changes as the base object properties are literally imported into the derived object.
We call this type of base object a chameleon object. A chameleon object (ab)uses local type references and tricks its properties into changing shape!
An example chameleon object is the Base object type in the top SJOT schema of the following two SJOT schemas:
[
{
"@id": "http://example.com/base.json",
"Base": {
"id": "#ID"
},
"ID": "any"
},
{
"@id": "http://example.com/derived.json",
"Derived": {
"@extends": "http://example.com/base.json#Base"
},
"ID": "string"
}
]
The Base object id property changes type, from "any" to "string" when imported into Derived with the SJOT @extends attribute property. To see why, consider the derived object that results after the import and after substituting the #ID type reference:
{
"@id": "http://example.com/derived.json",
"Derived": {
"id": "#ID"
},
"ID": "string"
}
Chameleons allow us to define type generics that change shape via local type references. A real treat to the expressiveness of SJOT.
However, danger lurks here! When a JSON API relies on a base object with fixed property types and this base is a chameleon, then the use of a derived object in place of the base object may cause validation failures.
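The change of shape can be reproduced with a few lines of Python that copy the base properties into the derived type and then resolve the local #ID reference in each schema. This is only a sketch of the scoping effect described above: import_base and resolve_local are hypothetical helpers, not SJOT functions.

```python
BASE = {
    "@id": "http://example.com/base.json",
    "Base": {"id": "#ID"},
    "ID": "any",
}
DERIVED = {
    "@id": "http://example.com/derived.json",
    "Derived": {"@extends": "http://example.com/base.json#Base"},
    "ID": "string",
}
SCHEMAS = {schema["@id"]: schema for schema in (BASE, DERIVED)}


def import_base(schema, type_name):
    """Return the object type with the properties of its @extends base copied in."""
    object_type = dict(schema[type_name])
    base_ref = object_type.pop("@extends", None)
    if base_ref is None:
        return object_type
    uri, _, base_name = base_ref.partition("#")
    base_schema = SCHEMAS[uri] if uri else schema
    return {**import_base(base_schema, base_name), **object_type}


def resolve_local(schema, reference):
    """Look up a local "#name" reference in the given schema."""
    return schema[reference[1:]]


derived = import_base(DERIVED, "Derived")
print(resolve_local(BASE, BASE["Base"]["id"]))  # id as seen from base.json
print(resolve_local(DERIVED, derived["id"]))    # id as seen from derived.json
```

The same "#ID" string resolves to "any" in the base schema and to "string" in the derived schema, which is exactly the chameleon behavior.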
A local #type reference should only be used when the current schema has no @id so this schema cannot be referenced. If an @id is used and the resulting chameleon type generics are extended, then it makes sense that local type references should be generic types, such as any, atom, or object.
SJOT versus JSON schema
- JSON schema is verbose, doubling the nesting level compared to the JSON content it describes. By contrast, SJOT schema levels are one-on-one with JSON data.
- JSON schema validation performance is not scalable, because validation cost may exceed linear time processing cost (meaning linear in the size of the input), in the worst case taking exponential time or memory to validate constraints, see the exploding JSON Schema states examples. By contrast, SJOT validators are very fast and scalable. The asymptotic running time of JSON validity checking is linear in the size of the given JSON data.
- JSON schema permits constraining primitive type value ranges, but offers few predeclared primitive types to choose from when almost all programming languages offer byte, short, int, float and double precision types. You can use minimum, maximum and multipleOf to constrain the decimal representation in JSON Schema, but we have to keep in mind that floating point values are typically stored in IEEE 754 format and decimals are rounded, therefore values such as 1234567890123.0099 also validate when multipleOf is 0.01. Therefore, fractional constraints are not reliable. By contrast, SJOT offers a wide choice of pre-defined types and value range constraints work fine and are very simple to use in SJOT.
- JSON schema is non-strict by default, meaning that all object properties are optional and any additional properties are permitted by default, that is, schemas accept almost anything by default. For example, JSON with typos in property names will not be rejected by a JSON Schema validator by default. By contrast, SJOT is strict by default.
- JSON schemas are not extensible; you can only add more constraints when combining schemas. There is no easy way to achieve object inheritance. Worse, combining schemas may lead to a schema that rejects too much or even rejects everything. By contrast, SJOT objects are extensible or final.
- JSON schema violates the encapsulation principle because it permits JSON Pointer references to local schema types such as nested objects, which means that you cannot update local types without breaking all the schemas that point to the updated local type structures. By contrast, SJOT groups all types at the top level in the schema as a simple dictionary of named types.
- JSON schema design violates the orthogonality principle for several constructs. For example, [ and ] can sometimes be used to indicate choices, but in other cases they cannot (perhaps oneOf should be used, but that has its own problems).
- Checking if a JSON schema's constraints reject everything is an NP-complete problem. Worse, constraints may depend on property values in the JSON data, not just property occurrences. By contrast, the SJOT schema checker verifies your schemas and detects blocking constraints.
- The principle of least surprise does not apply to JSON schema: a construct may work well in one case while the same construct causes problems elsewhere. For example, using oneOf to select among primitive types, say "string" and "number", makes sense, but using oneOf to select among schemas may not always work and leads to surprising rejections. Consider the simple case of an empty JSON array, which matches both the "array of strings" and "array of numbers" schemas!
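Two of the points in the list above, the multipleOf rounding problem and the oneOf surprise with an empty array, can be reproduced in a few lines of Python. This is a minimal sketch and not the code of any particular JSON Schema validator: the helper names are hypothetical, the multipleOf check assumes a validator that divides IEEE 754 doubles and tests whether the quotient is integral, and the oneOf check models only the "array of strings" and "array of numbers" branches.

```python
from decimal import Decimal

def float_multiple_of(value: float, step: float) -> bool:
    """Assumed float-based multipleOf check: the quotient must be integral."""
    return (value / step).is_integer()

def decimal_multiple_of(text: str, step: str) -> bool:
    """Exact check on the decimal text as written in the JSON document."""
    return Decimal(text) % Decimal(step) == 0

def is_array_of(value, item_type) -> bool:
    """True when value is a list whose items all have the given type (bools excluded)."""
    return isinstance(value, list) and all(
        isinstance(item, item_type) and not isinstance(item, bool) for item in value
    )

def one_of_arrays(value) -> bool:
    """oneOf semantics: exactly one of the two branches may match."""
    matches = [is_array_of(value, str), is_array_of(value, (int, float))]
    return matches.count(True) == 1

print(float_multiple_of(1234567890123.0099, 0.01))        # True: rounding hides the 0.0099
print(decimal_multiple_of("1234567890123.0099", "0.01"))  # False: exact decimal arithmetic
print(one_of_arrays(["a", "b"]))                          # True: only the string branch matches
print(one_of_arrays([]))                                  # False: both branches match
```

The empty array is rejected because it satisfies both branches, and oneOf accepts only values that satisfy exactly one.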
Converting SJOT to JSON schema is easy and automatic with the tools included with SJOT. Try our live demo to convert SJOT to JSON schema and vice versa.
Want to give it a SJOT?
SJOT for JS is licensed under the BSD3 license and is available for download from GitHub SJOT and npm package sjot.
In addition, the snapSJOT converter that creates SJOT schemas for JSON data is available for download from GitHub snapSJOT and npm package snapsjot.
Try a live demo of SJOT and snapSJOT in action.
APPENDIX A: Exploding JSON Schema states
The first "ping-pong" JSON schema example randomly alternates between a "ping" and a "pong" schema for nested objects x until we find a boolean y that is a final "pong":
{"x":{"x":{"x":{"x":{"x":{"x":{"y":true}}}}}}}
If the nesting level exceeds 16 then JSON schema validators can take minutes (or crash) using the following schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "$ref": "#/definitions/ping",
  "definitions": {
    "ping": {
      "type": "object",
      "properties": {
        "x": {
          "anyOf": [
            { "$ref": "#/definitions/ping" },
            { "$ref": "#/definitions/pong" }
          ]
        }
      },
      "additionalProperties": false
    },
    "pong": {
      "type": "object",
      "properties": {
        "x": {
          "anyOf": [
            { "$ref": "#/definitions/ping" },
            { "$ref": "#/definitions/pong" }
          ]
        },
        "y": { "type": "boolean" }
      },
      "additionalProperties": false
    }
  }
}
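To see one way such a blow-up can arise, the following Python sketch hard-codes the "ping" and "pong" definitions in a deliberately naive backtracking validator and counts its recursive calls. It is an illustration under stated assumptions, not the algorithm of any real JSON Schema validator: the function names are hypothetical, anyOf stops at the first matching branch, and nothing is memoized. Real validators may evaluate branches differently.

```python
def validate(schema_name, value, calls):
    """Naively validate value against the "ping" or "pong" definition.

    calls is a one-element list used as a mutable counter of validate() calls.
    """
    calls[0] += 1
    if not isinstance(value, dict):
        return False
    allowed = {"x"} if schema_name == "ping" else {"x", "y"}
    if set(value) - allowed:                      # additionalProperties: false
        return False
    if "y" in value and not isinstance(value["y"], bool):
        return False
    if "x" in value:                              # anyOf: try "ping", then "pong"
        return any(validate(name, value["x"], calls) for name in ("ping", "pong"))
    return True

def nested(depth, leaf):
    """Wrap leaf in depth levels of {"x": ...}."""
    for _ in range(depth):
        leaf = {"x": leaf}
    return leaf

def count_calls(document):
    """Return (is_valid, number_of_validate_calls) starting from "ping"."""
    calls = [0]
    ok = validate("ping", document, calls)
    return ok, calls[0]

print(count_calls(nested(6, {"y": True})))   # (True, 8): the first branch succeeds early
print(count_calls(nested(10, {"y": 1})))     # (False, 2047): every branch is explored
```

In this sketch a document that fails only at the innermost level forces both branches to be tried at every level, so the call count is 2^(d+1) - 1 for nesting depth d.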
For the second example, let's implement a finite state machine in a JSON schema. The JSON Schema has N definitions.
The "words" we validate with the schema are defined by the regular expression
(a{N}|a(a|b+){0,N-1}b)*x
that describes a sequence of a and b ending in x. The word abbx is represented by the JSON pointer a/b/b/x, which is {"a":{"b":{"b":{"x":true}}}}.
The first definition for "0" has the following schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "$ref": "#/definitions/0",
  "definitions": {
    "0": {
      "type": "object",
      "properties": {
        "a": { "$ref": "#/definitions/1" },
        "x": { "type": "boolean" }
      },
      "additionalProperties": false
    },
Then we add N-1 definitions <DEF> to the schema enumerated "1", "2", "3", ... "N-1":
    "<DEF>": {
      "type": "object",
      "properties": {
        "a": { "$ref": "#/definitions/<DEF>+1" },
        "b": {
          "anyOf": [
            { "$ref": "#/definitions/0" },
            { "$ref": "#/definitions/<DEF>" }
          ]
        }
      },
      "additionalProperties": false
    },
where "<DEF>+1" wraps back to "0" when <DEF> is equal to N-1.
This "NFA" on a two-letter alphabet has N states, only one initial and one final state. Its equivalent minimal DFA has 2^N (2 to the power N) states.
In the worst case, a validator that uses this JSON schema either takes 2^N time or uses 2^N memory "cells" to validate the input.
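The construction above can be sketched as a small Python generator that assembles the schema for a given N and encodes a word as nested JSON objects. The function names build_nfa_schema and word_to_json are hypothetical, and the sketch assumes N is at least 2 because definition "0" refers to definition "1".

```python
import json

def build_nfa_schema(n):
    """Build the N-definition schema described above (assumes n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    definitions = {
        "0": {
            "type": "object",
            "properties": {
                "a": {"$ref": "#/definitions/1"},
                "x": {"type": "boolean"},
            },
            "additionalProperties": False,
        }
    }
    for k in range(1, n):
        definitions[str(k)] = {
            "type": "object",
            "properties": {
                # "<DEF>+1" wraps back to "0" when <DEF> equals N-1
                "a": {"$ref": f"#/definitions/{(k + 1) % n}"},
                "b": {
                    "anyOf": [
                        {"$ref": "#/definitions/0"},
                        {"$ref": f"#/definitions/{k}"},
                    ]
                },
            },
            "additionalProperties": False,
        }
    return {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "$ref": "#/definitions/0",
        "definitions": definitions,
    }

def word_to_json(word):
    """Encode a word such as "abbx" as nested objects ending in true."""
    value = True
    for letter in reversed(word):
        value = {letter: value}
    return value

schema = build_nfa_schema(3)
print(json.dumps(schema["definitions"]["2"], indent=2))
print(json.dumps(word_to_json("abbx")))
```

The generated schema and encoded words can then be passed to a JSON Schema validator of your choice to observe how validation time grows with N.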
APPENDIX B: Tips and tricks
What does SJOT stand for?
Schemas for JSON Objects. It is also "To JS" spelled backwards.
How to define a schema for JSON when the JSON content may have alternate types
If the alternate types are distinguishable and you must use the same schema for validation, then use a union as the schema root:
{
  "@root": [[ type1, type2, type3, ... ]]
}
How to define a property with a ? in the name
Use a regex:
"(PropWithA\\?InItsName)": "string",
This regex property is optional. To make the property required, see below.
Use the same approach when a property name starts with a "(".
How to make regex properties required instead of optional
Regex properties are optional by design. If the property is required, add an @any attribute property to force its presence:
"(PropWithA\\?InItsName)": "string",
"@any": [ ["PropWithA?InItsName"], ... ]
How to define a property with a default empty string value
Because null is converted to an empty string when used as a string type, use null as the default value for a property that needs an empty string default value:
"name?null": "string"
By contrast, "name?" is an optional property without a default value.
How to define a singleton tuple
Use unit lower and upper bounds:
[1, type, 1]
By contrast, [type] denotes an array of any length, not a singleton tuple.
How to define an array of tuples
Use an array lower bound and/or upper bound:
[0, [type1, type2] ]
By contrast, [[ type1, type2 ]] denotes a union.
How to define an object that rejects additional properties
Use the @final attribute property to restrict the object type:
{
  "@final": true,
  "name": "string"
}
This validates objects with a required "name" property that is a string and rejects all objects that include other properties. An object type may have regex properties, which means that additional properties are permitted when they match the regex:
{
  "@final": true,
  "name": "string",
  "(extra.*)": "any"
}
This permits additional properties with names that start with "extra".
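The acceptance rule described above can be modelled in a few lines of Python. This is only a sketch of the behavior as stated in this section, not the SJOT implementation: the function accepts and its parameters are hypothetical, only required string properties and "any"-typed regex properties are modelled, and it assumes that a regex property must match the whole property name.

```python
import re

def accepts(obj, required, regex_props):
    """Model a final object type with required string and regex properties.

    required is a set of names that must be present with string values.
    regex_props is a list of patterns; a name that fully matches one of them
    is permitted with any value. All other property names are rejected.
    """
    if not isinstance(obj, dict):
        return False
    for name in required:
        if not isinstance(obj.get(name), str):
            return False
    for name in obj:
        if name in required:
            continue
        # Assumption: the pattern must match the entire property name.
        if not any(re.fullmatch(pattern, name) for pattern in regex_props):
            return False
    return True

door = {"name": "A green door"}
door_extra = {"name": "A green door", "extraColor": "green"}
door_other = {"name": "A green door", "color": "green"}

print(accepts(door, {"name"}, []))                    # True
print(accepts(door_extra, {"name"}, []))              # False: no other properties allowed
print(accepts(door_extra, {"name"}, ["(extra.*)"]))   # True: name matches the regex
print(accepts(door_other, {"name"}, ["(extra.*)"]))   # False: "color" does not match
```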
How to define an empty object
Use the following:
{ "@final": true }
By contrast, "object" and {} denote extensible object types.
How to define an empty array
Use the following:
[0]
By contrast, "array" and [] denote arrays of any type and of any length.
Copyright (c) 2016, Robert van Engelen, Genivia Inc. All rights reserved.