JSON support was introduced in Oracle Database Release 2 (12.1.0.2) allowing you to store and retrieve data between the applications and the database engine. This post expands upon a previous post about basic JSON support by detailing the performance effects of different JSON notations. In production scenarios, large JSON documents or large numbers of JSON documents can challenge a database's ability to perform. Let's create a sample table and populate it with some data as follows:
SQL> CREATE TABLE product_catalog (
2 ident number generated always as identity,
3 json_document CLOB
4 CONSTRAINT validate_json CHECK (json_document IS JSON));
Table created.
SQL> INSERT INTO product_catalog (json_document)
2 VALUES ('{
3 "id": 14,
4 "name": "Table",
5 "price": 22.95,
6 "details": {"color" : "brown",
7 "stock" : "yes" }
8 }');
1 row created.
SQL> INSERT INTO product_catalog (json_document)
2 VALUES ('{
3 "id": 22,
4 "name": "Door",
5 "price": 14.99,
6 "details": {"color" : "white",
7 "stock" : "yes" }
8 }');
1 row created.
SQL> INSERT INTO product_catalog (json_document)
2 VALUES ('{
3 "id": 99,
4 "name": "Chair",
5 "price": 5.25,
6 "details": {"color" : "black",
7 "stock" : "no" }
8 }');
1 row created.
Now execute and trace this simple query using "dot notation" as predicate:
SQL> SET AUTOTRACE TRACE EXPLAIN
SQL> SELECT ident,
2 json.json_document.id as id,
3 json.json_document.name as name,
4 json.json_document.price as price
5 FROM product_catalog json
6 WHERE json.json_document.name = 'Chair';
Let's take a look at the output:
Execution Plan
----------------------------------------------------------
Plan hash value: 283946387
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2015 | 2 (0)|
|* 1 | TABLE ACCESS FULL| PRODUCT_CATALOG | 1 | 2015 | 2 (0)|
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_QUERY("JSON"."JSON_DOCUMENT" FORMAT JSON , '$.name'
RETURNING VARCHAR2(4000) ASIS
WITHOUT ARRAY WRAPPER NULL ON ERROR)='Chair')
The same query can be performed using the "direct value" as predicate, as follows:
SQL> SELECT ident,
2 json.json_document.id as id,
3 json.json_document.name as name,
4 json.json_document.price as price
5 FROM product_catalog json
6 WHERE json_value(json_document,
7 '$.name' returning varchar2 error on error) = 'Chair';
And now take a look at the ouput:
Execution Plan
----------------------------------------------------------
Plan hash value: 283946387
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2015 | 2 (0)|
|* 1 | TABLE ACCESS FULL| PRODUCT_CATALOG | 1 | 2015 | 2 (0)|
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_VALUE("JSON_DOCUMENT" FORMAT JSON , '$.name' RETURNING
VARCHAR2(4000) ERROR ON ERROR)='Chair')
As first glance, both predicates give the same explain plan, and we can verify a table access full operation to return the result set as expected. PRODUCT_CATALOG table has no indexes created yet, so a full table scan is the only option.
A traditional approach would be to create an index using the "dot notation," but you must be aware that this new index will be used only when querying using the dot notation. If you query using the direct value notation, the index will not be used.
Let's create the new index as follows:
SQL> CREATE INDEX json_name
2 ON product_catalog json (json.json_document.name);
Index created.
Now execute both queries again, and compare their corresponding explain plans:
SQL> SELECT ident,
2 json.json_document.id as id,
3 json.json_document.name as name,
4 json.json_document.price as price
5 FROM product_catalog json
6 WHERE json.json_document.name = 'Chair';
Execution Plan
----------------------------------------------------------
Plan hash value: 552513142
-----------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes|Cost(%CPU)|
-----------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 4017| 2 (0)|
| 1| TABLE ACCESS BY INDEX ROWID| PRODUCT_CATALOG| 1 | 4017| 2 (0)|
|* 2| INDEX RANGE SCAN | JSON_NAME | 1 | | 1 (0)|
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(JSON_QUERY("JSON_DOCUMENT" FORMAT JSON ,
'$.name' RETURNING VARCHAR2(4000) ASIS
WITHOUT ARRAY WRAPPER NULL ON ERROR)='Chair')
SQL> SELECT ident,
2 json.json_document.id as id,
3 json.json_document.name as name,
4 json.json_document.price as price
5 FROM product_catalog json
6 WHERE json_value(json_document,
7 '$.name' returning varchar2 error on error) = 'Chair';
Execution Plan
----------------------------------------------------------
Plan hash value: 283946387
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4017 | 2 (0)|
|* 1 | TABLE ACCESS FULL| PRODUCT_CATALOG | 1 | 4017 | 2 (0)|
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_VALUE("JSON_DOCUMENT" FORMAT JSON , '$.name' RETURNING
VARCHAR2(4000) ERROR ON ERROR)='Chair')
As described previously, this new index is only used during the "dot notation" SQL.
Now, let's populate PRODUCT_TABLE with additional data to show the performance difference clearly.
SQL> DECLARE
2 recins NUMBER := 0;
3 limit NUMBER := 3334;
4 BEGIN
5 WHILE recins < limit LOOP
6
7 INSERT INTO product_catalog (json_document)
8 VALUES ('{
9 "id": 14,
10 "name": "Table",
11 "price": 22.95,
12 "details": {"color" : "brown",
13 "stock" : "yes" }
14 }');
15
16 INSERT INTO product_catalog (json_document)
17 VALUES ('{
18 "id": 22,
19 "name": "Door",
20 "price": 14.99,
21 "details": {"color" : "white",
22 "stock" : "yes" }
23 }');
24
25 INSERT INTO product_catalog (json_document)
26 VALUES ('{
27 "id": 99,
28 "name": "Chair",
29 "price": 5.25,
30 "details": {"color" : "black",
31 "stock" : "no" }
32 }');
33
34 recins := recins + 1;
35 END LOOP;
36 END;
37 /
PL/SQL procedure successfully completed.
SQL> COL name FORMAT a10
SQL> COL total FORMAT 99999
SQL> SELECT json.json_document.name as name,
2 count(*) as total
3 FROM product_catalog json
4 GROUP BY json.json_document.name;
NAME TOTAL
---------- ------
Chair 3335
Door 3335
Table 3335
Now execute both queries again, and compare their corresponding explain plans.
SQL> SELECT ident,
2 json.json_document.id as id,
3 json.json_document.name as name,
4 json.json_document.price as price
5 FROM product_catalog json
6 WHERE json.json_document.name = 'Chair';
Execution Plan
----------------------------------------------------------
Plan hash value: 552513142
-----------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes|Cost(%CPU)|
-----------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 3231| 24000| 13 (0)|
| 1| TABLE ACCESS BY INDEX ROWID| PRODUCT_CATALOG| 3231| 24000| 13 (0)|
|* 2| INDEX RANGE SCAN | JSON_NAME | 3231| | 12 (0)|
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(JSON_QUERY("JSON_DOCUMENT" FORMAT JSON ,
'$.name' RETURNING VARCHAR2(4000) ASIS
WITHOUT ARRAY WRAPPER NULL ON ERROR)='Chair')
SQL> SELECT ident,
2 json.json_document.id as id,
3 json.json_document.name as name,
4 json.json_document.price as price
5 FROM product_catalog json
6 WHERE json_value(json_document,
7 '$.name' returning varchar2 error on error) = 'Chair';
Execution Plan
----------------------------------------------------------
Plan hash value: 283946387
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3231 | 12M| 201 (0)|
|* 1 | TABLE ACCESS FULL| PRODUCT_CATALOG | 3231 | 12M| 201 (0)|
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_VALUE("JSON_DOCUMENT" FORMAT JSON , '$.name' RETURNING
VARCHAR2(4000) ERROR ON ERROR)='Chair')
Note how "Rows" and "Bytes" values are significantly different in both explain plans.
Conclusion
Both developers and database administrators must define a standard approach to using the JSON functionality in order to take advantage of the indexes created. Be careful about SQL predicates and verify if they are really using created indexes as expected.