JSON Schema validation for columns

Extending a traditional database schema with NoSQL JSON columns can make the schema more understandable. When making use of it, the database will guarantee that all JSON data stored in a column is valid JSON. But can you also ensure some properties are available in the JSON document?

Now that your database has JSON column, it's tempting to store data that's complicated to model in a JSON column. You may not want to give up the safety of a defined schema, which is simplifying your application code so much. Indeed, the JSON column will be checked for structural validity whether it's conforming to the SQL standard. But you can also gain the security of a schema-definition for JSON columns by validating against JSON schema rules.

Usage

MySQL

ALTER TABLE products ADD CONSTRAINT CHECK(
  JSON_SCHEMA_VALID(
    '{
      "$schema": "http://json-schema.org/draft-04/schema#",
      "type": "object",
      "properties": {
        "tags": {
          "type": "array",
          "items": { "type": "string" }
        }
      },
      "additionalProperties": false
    }',
    attributes
  )
);

INSERT INTO products (..., attributes) VALUES (..., '{}');
-- Result: OK
INSERT INTO products (..., attributes) VALUES (..., '{ "tags":[] }');
-- Result: OK
INSERT INTO products (..., attributes) VALUES (..., '{ "tags":["test"] }');
-- Result: OK
INSERT INTO products (..., attributes) VALUES (..., '{ "tags":[2] }');
-- ERROR: Check constraint 'products_chk_1' is violated.

PostgreSQL (requires postgres-json-schema)

ALTER TABLE products ADD CONSTRAINT data_is_valid CHECK(
  validate_json_schema(
    '{
      "type": "object",
      "properties": {
        "tags": {
          "type": "array",
          "items": { "type": "string" }
        }
      },
      "additionalProperties": false
    }',
    attributes
  )
);

INSERT INTO products (..., attributes) VALUES (..., '{}');
-- Result: OK
INSERT INTO products (..., attributes) VALUES (..., '{ "tags":[] }');
-- Result: OK
INSERT INTO products (..., attributes) VALUES (..., '{ "tags":["test"] }');
-- Result: OK
INSERT INTO products (..., attributes) VALUES (..., '{ "tags":[2] }');
-- ERROR: new row for relation "products" violates check constraint "data_is_valid"
-- DETAIL: Failing row contains ({"tags": [2]}).

Detailed Explanation

JSON columns are available on all modern databases with automatic checking of structural JSON validity and saving in a storage-efficient way. But as data that is complicated to store in a relational model is now saved in JSON columns, a new set of problems emerges. It's been known for a long time that schema-less NoSQL database will just move the schema to the application. The schema and many versions of it need to be managed at numerous place throughout an application. But we want to avoid repeating that mistake when using JSON columns.

The problem of validating a (complex) JSON document to conform to a specific model is solved for a long time. The JSON schema standard enables to describe properties and data types, resulting in an effective solution to validate a JSON document. This process has been done in the application in the past, but there's no reason you couldn't or shouldn't do it in the database. Because whenever you need to change anything in the database manually, your application's validation logic will not be used. Adding JSON schema validation to your JSON columns combine's the power of a NoSQL and relational database:

  • NoSQL benefit: You can model some data in JSON columns which would be complicated to do in a relational database.
  • Relational database benefit: You get schema safety for your data and don't have to handle different data versions in your application.

Additional Resources

Author Tobias Petry

Tobias Petry

SQL for Devs Founder


Are you interested to learn more?

Be notified on future content. Never spam.