JSON Schema is a powerful tool for validating JSON data structures. I read through the official JSON Schema documentation, took detailed notes, and am sharing them here.
Later on, we can use JSON Schema in API testing to check data values, data types, and JSON data structures in detail.
JSON Schema is used to annotate and validate JSON documents.
Official documentation: jsonschema
Simple example
Given some simple JSON data, we write a JSON Schema that matches its format, then check that every field in the data has the specified type.
import jsonschema

json_data = [
    {'pm10': 24, 'city': 'Zhuhai', 'time': '2016-10-23 13:00:00'},
    {'pm10': 24, 'city': 'Shenzhen', 'time': '2016-10-21 13:00:00'},
    # This record fails validation: pm10 is a string and the city is not in the enum.
    {'pm10': '21', 'city': 'Guangzhou', 'time': '2016-10-23 13:00:00'},
]

json_schema = {
    'type': 'array',
    'items': {
        'type': 'object',
        'properties': {
            'pm10': {'type': 'number'},
            'city': {'type': 'string', 'enum': ['Zhuhai', 'Shenzhen']},
            'time': {'type': 'string'},
        },
    },
}

try:
    # validate() raises ValidationError when the data does not match the schema.
    jsonschema.validate(json_data, json_schema)
except jsonschema.ValidationError as ex:
    msg = ex.message
    print(msg)
type keyword
The type keyword is the foundation of JSON Schema: it specifies the data type for a schema. The core of JSON Schema defines the following basic types.
- string
- number (numeric types)
- object
- array
- boolean
- null
The following table maps the names of the JavaScript types to their Python equivalents.
JavaScript | Python |
string | str |
number | int/float |
object | dict |
array | list |
boolean | bool |
null | None |
The type keyword can be a string or an array.
- If it is a string, it is the name of one of the basic types above.
- If it is an array, it must be an array of strings, where each string is the name of one of the basic types, and each element is unique. In this case, if the json snippet matches any of the given types, the snippet is valid.
Here's a simple example of the type keyword.
{ "type": "number" }
This defines a field of type number: 40 and 43.0 pass, while "43" (a string containing digits) does not.
{ "type": ["number", "string"] }
This defines a field whose type is either number or string: 43 and "me and you" pass, while [43, "me and you"] fails because an array is neither a number nor a string.
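The two type examples above can be checked directly in Python. This is a minimal sketch, assuming the jsonschema package is installed; Draft7Validator is picked here only so the result does not depend on which draft a given library version uses by default, and the variable names are illustrative.

```python
from jsonschema import Draft7Validator

# is_valid() returns True or False instead of raising an exception.
number_only = Draft7Validator({"type": "number"})
number_or_string = Draft7Validator({"type": ["number", "string"]})

print(number_only.is_valid(40))        # an integer counts as a number
print(number_only.is_valid(43.0))      # so does a float
print(number_only.is_valid("43"))      # a string of digits does not

print(number_or_string.is_valid(43))
print(number_or_string.is_valid("me and you"))
print(number_or_string.is_valid([43, "me and you"]))  # an array matches neither type
```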
object keyword
In Python, the corresponding type of an object is the dict type.
properties
Use the properties keyword to define properties (key-value pairs) on an object. For example, we want to define a simple schema for an address consisting of a number, a street name, and a street type.
{
  "type": "object",
  "properties": {
    "number": { "type": "number" },
    "street_name": { "type": "string" },
    "street_type": {
      "type": "string",
      "enum": ["Street", "Avenue", "Boulevard"]
    }
  }
}
Required properties
By default, the properties defined by the properties keyword are not required. However, a list of required properties can be provided using the required keyword.
The required keyword accepts an array of one or more strings. Each string must be unique.
In the following sample schema for defining user records, we require each user to have a name and email address, but we don't mind if they provide their address or phone number:
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "email": { "type": "string" },
    "address": { "type": "string" },
    "telephone": { "type": "string" }
  },
  "required": ["name", "email"]
}
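As a rough illustration of how required behaves, the sketch below validates two user records against the schema above. It assumes the jsonschema package is installed; the sample records and variable names are made up for the example, and the exact wording of the error message may differ between library versions.

```python
from jsonschema import Draft7Validator

user_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "address": {"type": "string"},
        "telephone": {"type": "string"},
    },
    "required": ["name", "email"],
}
user_validator = Draft7Validator(user_schema)

# Made-up sample records: the second one leaves out the required email.
complete_user = {"name": "Alice", "email": "alice@example.com"}
missing_email = {"name": "Bob", "address": "1 Example Road"}

print(user_validator.is_valid(complete_user))

# iter_errors() yields every validation error instead of stopping at the first.
required_errors = [error.message for error in user_validator.iter_errors(missing_email)]
print(required_errors)
```

Optional properties such as address and telephone can be omitted without producing an error; only the names listed under required are enforced.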
Size
You can limit the number of properties on an object using the minProperties and maxProperties keywords. Each of these must be a non-negative integer.
{ "type": "object", "minProperties": 2, "maxProperties": 3 }
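A short sketch of the size limits above, assuming the jsonschema package is installed (the helper name build_object is hypothetical): objects with two or three properties pass, anything smaller or larger fails.

```python
from jsonschema import Draft7Validator

size_validator = Draft7Validator(
    {"type": "object", "minProperties": 2, "maxProperties": 3}
)

def build_object(count):
    """Build a dict with `count` dummy properties (hypothetical helper)."""
    return {f"key{i}": i for i in range(count)}

# Map each property count to whether an object of that size is valid.
size_results = {count: size_validator.is_valid(build_object(count)) for count in range(5)}
print(size_results)
```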
Array Properties
Arrays are used for ordered elements.
In Python, an array is similar to a list or tuple, depending on the usage.
Example: [1, 2, 3, 4, 5]
[2, "dd"]
items
The elements of an array can be anything, but it is often useful to validate the items of an array against a schema. This is done using the items and additionalItems keywords.
In JSON, arrays are generally used in two ways:
- List validation: a sequence of any length, where each element matches the same schema.
- Tuple validation: a fixed-length sequence in which each item may have a different schema. In this usage, the index (position) of each item is meaningful to how the value is interpreted, as with a Python tuple.
List validation
List validation is useful for arrays of arbitrary length, where each item matches the same schema. For such arrays, set the items keyword to a single schema, which is then used to validate every item in the array.
Note: when items is a single schema, the additionalItems keyword is meaningless and should not be used.
In the following example, we define each item in the array to be a number:
{ "type": "array", "items": { "type": "number" } }
If [1, 2, 3, 4, 5], pass
If [1, 2, 3, "5", 6], fail
If [], pass
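The three cases above can be reproduced with a few lines of Python. This is a sketch that assumes the jsonschema package is installed; it also shows how the path attribute of an error points at the offending index.

```python
from jsonschema import Draft7Validator

list_validator = Draft7Validator({"type": "array", "items": {"type": "number"}})

print(list_validator.is_valid([1, 2, 3, 4, 5]))
print(list_validator.is_valid([1, 2, 3, "5", 6]))
print(list_validator.is_valid([]))  # an empty array has no items to fail

# Each error records where in the instance it occurred.
bad_indexes = [list(error.path) for error in list_validator.iter_errors([1, 2, 3, "5", 6])]
print(bad_indexes)
```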
Tuple validation
Tuple validation is useful when the array is a collection of items where each item has a different schema and the ordinal index of each item is meaningful.
Example: a street address such as
1600 Pennsylvania Avenue NW has four fields of the form [number, street_name, street_type, direction].
Each field has a different schema:
- number: address number, must be numeric
- street_name: name of the street, must be a string
- street_type: the type of the street, should be a string from a fixed set of values.
- direction: the city quadrant of the address, should be a string from a different set of values
To do this, we set the items keyword to an array in which each element is a schema that corresponds to the same index of the document's array: the first schema validates the first element of the input array, the second schema validates the second element, and so on.
Example.
{
  "type": "array",
  "items": [
    { "type": "number" },
    { "type": "string" },
    { "type": "string", "enum": ["Street", "Avenue", "Boulevard"] },
    { "type": "string", "enum": ["NW", "NE", "SW", "SE"] }
  ]
}
If [1600, "Pennsylvania", "Street", "NW"], pass
If [10, "etc.", "etc."], fail
By default, extra items beyond those listed are also allowed, so [1600, "Pennsylvania", "Street", "NW", "Washington"] passes.
The additionalItems keyword controls whether additional items are valid beyond the items defined in the schema; if it is set to false, additional items in the array are not allowed.
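The sketch below tries tuple validation with and without additionalItems. It assumes the jsonschema package is installed and deliberately uses Draft7Validator, because the array form of items described here belongs to Draft 7 and earlier; newer drafts express the same idea with a different keyword (prefixItems). The variable names are illustrative.

```python
from jsonschema import Draft7Validator

address_schema = {
    "type": "array",
    "items": [
        {"type": "number"},
        {"type": "string"},
        {"type": "string", "enum": ["Street", "Avenue", "Boulevard"]},
        {"type": "string", "enum": ["NW", "NE", "SW", "SE"]},
    ],
}
# Same schema, but extra trailing items are rejected.
strict_schema = dict(address_schema, additionalItems=False)

loose = Draft7Validator(address_schema)
strict = Draft7Validator(strict_schema)

four_fields = [1600, "Pennsylvania", "Avenue", "NW"]
five_fields = [1600, "Pennsylvania", "Avenue", "NW", "Washington"]

print(loose.is_valid(four_fields))
print(loose.is_valid([24, "Sussex", "Drive"]))     # "Drive" is not in the enum
print(loose.is_valid([10, "Downing", "Street"]))   # missing trailing items are fine
print(loose.is_valid(five_fields))                 # extra items allowed by default
print(strict.is_valid(five_fields))                # rejected when additionalItems is false
```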
Length
The length of the array can be specified using the minItems and maxItems keywords. The value of each keyword must be a non-negative integer. These keywords work for both list and tuple validation. Example:
{ "type": "array", "minItems": 2, "maxItems": 3 }
Uniqueness
When the uniqueItems keyword is set to true, every item in the array must be unique.
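Length and uniqueness constraints can be combined in one schema. The following is a minimal sketch, assuming the jsonschema package is installed; the combined schema is an illustration and does not appear in the notes above.

```python
from jsonschema import Draft7Validator

bounded_unique = Draft7Validator(
    {"type": "array", "minItems": 2, "maxItems": 3, "uniqueItems": True}
)

print(bounded_unique.is_valid([1]))           # too short
print(bounded_unique.is_valid([1, 2]))        # within bounds, all unique
print(bounded_unique.is_valid([1, 2, 3, 4]))  # too long
print(bounded_unique.is_valid([1, 2, 2]))     # duplicate item
```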
Common keywords
metadata
JSON Schema includes a few keywords, title, description, and default, that are not strictly used for validation but are used to describe parts of a schema. The title and description keywords must be strings.
enumerated value
The enum keyword is used to restrict a value to a fixed set of values. It must be an array with at least one element, where each element is unique.
{ "type": "string", "enum": ["red", "green"] }
If the value of the field being checked is in the enumeration, validation passes; otherwise it fails.
Combining schemas
JSON Schema contains some keywords for combining schemas together. This does not mean combining schemas from multiple files or JSON trees, although these keywords help with that; that topic is covered under structuring complex schemas.
For example, in the following schema, the anyOf keyword is used to say that a given value may be valid against any of the given subschemas. The first subschema requires a string with a maximum length of 5. The second subschema requires a number with a minimum value of 0. As long as a value validates against either of these subschemas, it is considered valid against the entire combined schema.
{ "anyOf": [ { "type": "string", "maxLength": 5 }, { "type": "number", "minimum": 0 } ] }
The keywords used for combining schemas are:
- allOf: must be valid against all of the subschemas
- anyOf: must be valid against any (one or more) of the subschemas
- oneOf: must be valid against exactly one of the subschemas
anyOf
To validate against anyOf, the given data must be valid against any (one or more) of the given subschemas.
{ "anyOf": [ { "type": "string" }, { "type": "number" } ] }
If "Hello", pass
If 33, pass
If ["ddd", 33], fail
oneOf
To validate against oneOf, the given data must be valid against exactly one of the given subschemas.
{ "oneOf": [ { "type": "number", "multipleOf": 5 }, { "type": "number", "multipleOf": 3 } ] }
If a multiple of 5, pass
If a multiple of 3, pass
If a multiple of both 5 and 3, fail
allOf
To validate against allOf, the given data must be valid against all of the given subschemas.
{ "allOf": [ { "type": "string" }, { "maxLength": 5 } ] }
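To see the three combining keywords side by side, here is a small sketch, assuming the jsonschema package is installed. The test values are chosen for illustration only.

```python
from jsonschema import Draft7Validator

any_of = Draft7Validator({
    "anyOf": [
        {"type": "string", "maxLength": 5},
        {"type": "number", "minimum": 0},
    ]
})
one_of = Draft7Validator({
    "oneOf": [
        {"type": "number", "multipleOf": 5},
        {"type": "number", "multipleOf": 3},
    ]
})
all_of = Draft7Validator({"allOf": [{"type": "string"}, {"maxLength": 5}]})

# anyOf: at least one subschema must match.
print(any_of.is_valid("short"), any_of.is_valid("too long"))
print(any_of.is_valid(12), any_of.is_valid(-5))

# oneOf: exactly one subschema must match, so 15 (a multiple of both) fails.
print(one_of.is_valid(10), one_of.is_valid(9), one_of.is_valid(15), one_of.is_valid(2))

# allOf: every subschema must match.
print(all_of.is_valid("short"), all_of.is_valid("too long"))
```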
$schema keyword
The $schema keyword is used to declare that a JSON fragment is actually a piece of JSON Schema. It also declares which version of the JSON Schema standard the schema was written against.
It is recommended that all JSON schemas have a $schema entry, which must be at the root of the schema. Therefore, in most cases, you will want this at the root of your schema:
"$schema": "/schema#"
Regular expressions
The pattern and patternProperties keywords use regular expressions to express constraints. The regular expression syntax used is from JavaScript (specifically ECMA 262). However, the full syntax is not widely supported, so it is recommended that you stick to the subset of that syntax described below.
- A single unicode character (except the special characters below) matches itself.
- ^: matches only the beginning of the string.
- $: matches only the end of the string.
- (...): group a series of regular expressions into a single regular expression.
- |: matches either the regular expression before or the regular expression after the | symbol.
- [abc]: matches any character inside square brackets.
- [a-z]: matches a range of characters.
- [^abc]: matches any character not listed.
- [^a-z]: matches any character outside the range.
- +: matches one or more repetitions of the preceding regular expression.
- *: matches zero or more repetitions of the preceding regular expression.
- ?: matches zero or one repetition of the preceding regular expression.
- +?, *?, ??: the *, +, and ? qualifiers are all greedy; they match as much text as possible. Sometimes this behavior is undesirable and you want to match as few characters as possible, which is what these non-greedy forms do.
- {x}: matches exactly x occurrences of the preceding regular expression.
- {x,y}: matches at least x and at most y occurrences of the preceding regular expression.
- {x,}: matches x or more occurrences of the preceding regular expression.
- {x}?, {x,y}?, {x,}?: lazy (non-greedy) versions of the above expressions.
Example.
{ "type": "string", "pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$" }
If "555-1212", pass
If "(888)555-1212", pass
If "(888)555-1212 ext. 532", fail
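The phone-number pattern can be tried out as follows. This is a sketch that assumes the jsonschema package is installed. Note that the Python library evaluates pattern with Python's own re module rather than an ECMA 262 engine, which is one more reason to stay within the common subset listed above. In Python source a raw string avoids the doubled backslashes needed in JSON.

```python
from jsonschema import Draft7Validator

phone_validator = Draft7Validator({
    "type": "string",
    # Optional "(ddd)" area code, then ddd-dddd, anchored at both ends.
    "pattern": r"^(\([0-9]{3}\))?[0-9]{3}-[0-9]{4}$",
})

candidates = [
    "555-1212",
    "(888)555-1212",
    "(888)555-1212 ext. 532",  # trailing text breaks the $ anchor
    "(800)FLOWERS",            # letters are not digits
]
phone_results = {text: phone_validator.is_valid(text) for text in candidates}
print(phone_results)
```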
Structuring a complex schema
Reuse
Some schema fragments may be needed in several places. Rewriting them each time makes the schema longer and harder to update later, so we reuse them instead. For example, when defining a customer record, each customer may have both a shipping address and a billing address. Addresses always have the same structure: street address, city, and state.
Define the address schema:
{
  "type": "object",
  "properties": {
    "street_address": { "type": "string" },
    "city": { "type": "string" },
    "state": { "type": "string" }
  },
  "required": ["street_address", "city", "state"]
}
To reuse the schema above, we put it under a key named definitions in the parent schema.
{
  "definitions": {
    "address": {
      "type": "object",
      "properties": {
        "street_address": { "type": "string" },
        "city": { "type": "string" },
        "state": { "type": "string" }
      },
      "required": ["street_address", "city", "state"]
    }
  }
}
We then use the $ref keyword to refer to this schema fragment from elsewhere, pointing to its location:
{ "$ref": "#/definitions/address" }
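Putting the pieces together, the sketch below builds a customer schema whose billing and shipping addresses both point at the shared address definition. It assumes the jsonschema package is installed; the property names billing_address and shipping_address and the sample customer data are assumptions made for this example.

```python
from jsonschema import Draft7Validator

customer_schema = {
    "definitions": {
        "address": {
            "type": "object",
            "properties": {
                "street_address": {"type": "string"},
                "city": {"type": "string"},
                "state": {"type": "string"},
            },
            "required": ["street_address", "city", "state"],
        }
    },
    "type": "object",
    "properties": {
        # Both properties reuse the single definition above.
        "billing_address": {"$ref": "#/definitions/address"},
        "shipping_address": {"$ref": "#/definitions/address"},
    },
}
customer_validator = Draft7Validator(customer_schema)

home = {"street_address": "1600 Pennsylvania Avenue NW", "city": "Washington", "state": "DC"}
good_customer = {"billing_address": home, "shipping_address": home}
# The shipping address here lacks the required "state" key.
bad_customer = {
    "billing_address": home,
    "shipping_address": {"street_address": "1st Street SE", "city": "Washington"},
}

print(customer_validator.is_valid(good_customer))
print(customer_validator.is_valid(bad_customer))
```

Because the address rules live in one place, tightening them later (for example, adding a postal code) changes both address fields at once.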
The value of $ref is a string in a format called JSON Pointer.
'#' refers to the current document, and each '/' descends into a key of the objects in the document.
"#/definitions/address" means:
- Go to the root of the document
- Find the value of the key "definitions".
- In this object, find the value of the key "address".
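The lookup steps above can be sketched in plain Python without any library. The function name resolve_pointer is hypothetical, and this version only handles references into the same document; it is meant to illustrate the traversal, not to replace a real resolver.

```python
def resolve_pointer(document, ref):
    """Follow a '#/a/b' style reference through nested dicts and lists."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled here")
    node = document  # '#' alone means the root of the document
    for token in ref[1:].split("/")[1:]:
        # JSON Pointer escapes: "~1" stands for "/" and "~0" stands for "~".
        token = token.replace("~1", "/").replace("~0", "~")
        # Lists are indexed by position, objects by key.
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


pointer_doc = {
    "definitions": {
        "address": {"type": "object", "required": ["street_address", "city", "state"]}
    }
}

print(resolve_pointer(pointer_doc, "#/definitions/address"))
print(resolve_pointer(pointer_doc, "#/definitions/address/required/0"))
```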
A $ref can also be a relative or absolute URI, for example:
{ "$ref": "#/address" }
That covers the details of using the Python jsonschema library to validate JSON data structures.