Indexes
Aidbox index management is in draft stage. API will change.
Database indexes are essential for performance. In particular you will need indexes to speed up search requests.
Aidbox provides mechanisms to
- manage indexes
- suggest indexes
- generate indexes automatically
Background
Aidbox uses PostgreSQL database for storage. Most of resource data is contained in Database schema column with _jsonb_ type.
Consider simple example: active search parameter for Patient resource.
Let's try the search query
GET /Patient?active=true
Use \_explain to find out SQL query generated by this request
GET /Patient?active=true&_explain=analyze
Possible response is
query:
- >-
SELECT "patient".* FROM "patient" WHERE "patient".resource @> ? LIMIT ?
OFFSET ?
- '{"active":true}'
- 100
- 0
plan: >-
Limit (cost=0.00..1.01 rows=1 width=124) (actual time=0.015..0.015 rows=0
loops=1)
-> Seq Scan on patient (cost=0.00..1.01 rows=1 width=124) (actual time=0.014..0.014 rows=0 loops=1)
Filter: (resource @> '{"active": true}'::jsonb)
Rows Removed by Filter: 1
Planning Time: 0.729 ms
Execution Time: 0.050 ms
Corresponding SQL is
SELECT "patient".*
FROM "patient"
WHERE "patient".resource @> '{"active": "true"}'::jsonb
LIMIT 100
OFFSET 0
Here @>
is containment operator. It tests whether jsonb value on the right-hand side is contained in the jsonb value on the left-hand side.
Without indexes Postgres has to check this condition for every Patient resource stored in the database.
However, _GIN indexes_ can speed up these kind of queries.
We can create GIN index for the resource
column
CREATE INDEX patient_resource_gin_idx
ON Patient
USING GIN (resource)
Now Postgres can use this index to make search much faster.
Functional indexes
Consider more complex example: name
search parameter for Patient
resource.
Request
GET /Patient?name=abc
Generates SQL like
SELECT * FROM Patient WHERE aidbox_text_search( knife_extract_text( resource, '[["name","family"],["name","given"],["name","middle"],["name","text"]]' ) ) ILIKE unaccent('% abc%') LIMIT 100 OFFSET 0
Postgres' pg_trgm
module supports index searches for ILIKE
queries.
You can create functional index to speed up this query:
CREATE INDEX patient_name_trgm_idx
ON Patient
USING GIN (
aidbox_text_search(
knife_extract_text(
resource,
'[["name","family"],["name","given"],["name","middle"],["name","text"]]'
)
) gin_trgm_ops
)
Index management
Managing indexes manually is quite tedious and error-prone.
Aidbox provides a mechanism to manage indexes automatically using Aidbox configuration project.
Consider this example
{ns main
import #{aidbox.index.v1
aidbox
aidbox.repository.v1}
my-index
{:zen/tags #{aidbox.index.v1/index}
:table :patient
:expression "(jsonb_path_query_array(\"patient\".resource, ( '($.\"name\"[*]).** ? (@.type() == \"string\")')::jsonpath)::text) gin_trgm_ops"
:type :gin}
patient-repository
{:zen/tags #{aidbox.repository.v1/repository}
:resource-type :patient
:indexes #{my-index3 my-index1}}
repositories
{:zen/tags #{aidbox/service}
:engine aidbox.repository.v1/engine
:repositories #{patient-repository}}
box {:zen/tags #{aidbox/system}
:services
{:repositories repositories}}}
Here
box
is the Aidbox project entrypointrepositories
is an Aidbox servicepatientrepository
is an Aidbox repository. It configures resource behavior. Currently it can only add indexes and search parameters.myindex
is index definition
Use aidbox.index.v1/sync-indexes
RPC to update indexes.
POST /rpc
Content-Type: application/json
Accept: application/json
{"method": "aidbox.index.v1/sync-indexes"}
This RPC runs Workflow that creates indexes requested by your configuration. And removes indexes not requested.
Aidbox managed indexes start with aidbox_mng_idx
prefix. So your custom indexes which do not start with this prefix will not be affected by sync-indexes
RPC.
Auto-generated indexes
You can make index by your own with Get suggested indexes. However, Aidbox can be configured to make indexes for desired SearchParameters at start automatically.
Import aidbox.index.v1
in the example above and add :indexes
into patient-repository.
patient-repository
{:zen/tags #{aidbox.repository.v1/repository}
:resourceType "Patient"
:indexes #{my-index1}
:extra-parameter-sources :all
:search-parameters #{my-parameter}}
Add new symbol my-index1
with tag aidbox.index.v1/auto-index
to make index on start, based on Index Suggestion API for Patient.brthd SearchParameter.
my-index1
{:zen/tags #{aidbox.index.v1/auto-index}
:for my-parameter}
After restart new index will be added.
select * from pg_indexes where tablename = 'patient';
How to make my index explicitly with SQL?
Use aidbox.index.v1/index
tag with :expression
and PostgreSQL index:type
fields:
my-index2
{:zen/tags #{aidbox.index.v1/index}
:table "patient"
:expression "(jsonb_path_query_array(\"patient\".resource, ( '($.\"name\"[*]).** ? (@.type() == \"string\")')::jsonpath)::text) gin_trgm_ops"
:type :gin}
After restart new index will be added:
CREATE INDEX aidbox_mng_idx_main_my_index2
ON public.patient
USING gin (((jsonb_path_query_array(resource, '$.\"name\"[*].**?(@.type() == \"string\")'::jsonpath))::text) gin_trgm_ops)
As in previous section, actuall creation/deletion of indexes is triggered with aidbox.index.v1/sync-indexes
RPC.
Formal description of Zen Indexes:
Index suggestion
Aidbox provides two RPCs which can suggest you indexes
Suggest indexes for parameter
Use aidbox.index/suggest-index
RPC to get index suggestion for specific search parameter
POST /rpc
Content-Type: text/yaml
Accept: text/yaml
method: aidbox.index/suggest-index
params:
resource-type: <resourceType>
search-param: <searchParameter>
Suggest indexes for query
Use aidbox.index/suggest-index-query
RPC to get index suggestions based on query
POST /rpc
Content-Type: text/yaml
Accept: text/yaml
method: aidbox.index/suggest-index-query
params:
resource-type: Observation
query: date=gt2022-01-01&_id=myid