$export

The FHIR Bulk Data Export feature lets you export FHIR resources in ndjson format.

Aidbox supports patient-level, group-level, and system-level export. When an export request is submitted, the server returns a URL for checking the export status. When the export is finished, the status endpoint returns URLs for downloading the resources.

Only one export process can run at a time. If you submit an export request while another export is active, you get a 429 Too Many Requests error.
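
Because a second request is rejected with 429 while an export is running, a client may want to wait and try again. The following is a minimal sketch, not part of Aidbox: `submit` is a hypothetical caller-supplied function that sends the $export request and returns the HTTP status code, and the attempt count and delay are illustrative values.

```python
import time
from typing import Callable


def submit_export_with_retry(
    submit: Callable[[], int],
    max_attempts: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `submit` until it returns something other than 429.

    `submit` is a hypothetical callable that sends the $export request
    and returns the HTTP status code. Returns the last status code seen.
    """
    status = 429
    for attempt in range(max_attempts):
        status = submit()
        if status != 429:
            return status
        # Another export is still active; wait before trying again,
        # but do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    return status
```

The `sleep` argument is injectable so the function can be exercised without real waiting.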

Setup storage

Aidbox can export data to GCP, Azure, or AWS cloud storage. Export results will be in _ folder on the bucket.

GCP

Create a bucket and a service account that has read and write access to the bucket.

Create a GcpServiceAccount resource in Aidbox. Example:

yaml
private-key: |
  -----BEGIN PRIVATE KEY-----
  your-key-here
  -----END PRIVATE KEY-----
service-account-email: service-account@email
id: gcp-service-account
resourceType: GcpServiceAccount

Set the following environment variables:

  • box_bulk__storage_backend=gcp — backend for export
  • box_bulk__storage_gcp_service__account — id of the GcpServiceAccount resource
  • box_bulk__storage_gcp_bucket — bucket name

Azure

Create an Azure storage account and a storage container.

Create an AzureAccount resource in Aidbox.

yaml
resourceType: AzureAccount
id: azureaccount                  # your storage account id
key: 7x..LA                       # your storage account key

Create an AzureContainer resource in Aidbox.

yaml
resourceType: AzureContainer
id: smartboxexporttestcontainer
account:
  resourceType: AzureAccount
  id: azureaccount
storage: azureaccount             # your storage account
container: azureaccountcontainer  # your account container

Set the following environment variables:

  • box_bulk__storage_backend=azure — backend for export
  • box_bulk__storage_azure_container=smartboxexporttestcontainer — id of the AzureContainer resource

AWS

Create an S3 bucket and an IAM user that has read and write access to the bucket.

Create an AwsAccount resource in Aidbox. Example:

yaml
region: us-east-1
access-key-id: your-key-id
secret-access-key: key
id: aws-account
resourceType: AwsAccount

Set the following environment variables:

  • box_bulk__storage_backend=aws — backend for export
  • box_bulk__storage_aws_account — id of the AwsAccount resource
  • box_bulk__storage_aws_bucket — bucket name
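
The variables listed for each backend can be checked before starting Aidbox. The sketch below is an illustration only: the function name and the mapping are hypothetical, while the variable names are the ones listed in the sections above.

```python
from typing import List, Mapping

# Variables required per backend, as listed in the sections above.
REQUIRED_VARS = {
    "gcp": ["box_bulk__storage_gcp_service__account", "box_bulk__storage_gcp_bucket"],
    "azure": ["box_bulk__storage_azure_container"],
    "aws": ["box_bulk__storage_aws_account", "box_bulk__storage_aws_bucket"],
}


def missing_storage_vars(env: Mapping[str, str]) -> List[str]:
    """Return the names of bulk-storage variables that are unset or empty."""
    backend = env.get("box_bulk__storage_backend", "")
    if backend not in REQUIRED_VARS:
        # The backend itself is missing or is not one of gcp, azure, aws.
        return ["box_bulk__storage_backend"]
    return [name for name in REQUIRED_VARS[backend] if not env.get(name)]
```

Calling `missing_storage_vars(os.environ)` (after `import os`) returns an empty list when everything needed for the selected backend is present.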

Parameters

Parameter        Description
_outputFormat    Specifies the format in which the server generates files. The following formats are supported:
                   • application/fhir+ndjson: .ndjson files will be saved
                   • application/fhir+ndjson+gzip: .ndjson.gz files will be saved
_type            Includes only the specified resource types. The list is comma-separated.
_since           Includes only resources changed after the specified time.
patient          Exports only data that belongs to the listed patients. Format: comma-separated list of patient ids. Available only for patient-level export.
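
The parameters above are passed as query parameters on the $export request. A minimal sketch of building such a URL follows; `build_export_url` is a hypothetical helper, not an Aidbox API, and it only assembles the string without sending anything.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_export_url(
    path: str,
    output_format: Optional[str] = None,
    types: Sequence[str] = (),
    since: Optional[str] = None,
    patients: Sequence[str] = (),
) -> str:
    """Append the export parameters described above to an $export path."""
    params = {}
    if output_format:
        params["_outputFormat"] = output_format
    if types:
        params["_type"] = ",".join(types)
    if since:
        params["_since"] = since
    if patients:
        # Only meaningful for patient-level export.
        params["patient"] = ",".join(patients)
    # urlencode percent-encodes "+", "/" and "," in the values.
    query = urlencode(params)
    return f"{path}?{query}" if query else path
```

Percent-encoding matters here because a literal "+" in a query string is commonly decoded as a space, which would corrupt a value such as application/fhir+ndjson.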

Patient-level export

Patient-level export exports all Patient resources and resources associated with them. This association is defined by FHIR Compartments.

To start an export, make a request to /fhir/Patient/$export:

Rest console

http
GET /fhir/Patient/$export
Accept: application/fhir+json
Prefer: respond-async

Status

202 Accepted

Headers

  • Content-Location: link to check export status (e.g. /fhir/$export-status/<id>)

Make a request to the export status endpoint to check the status:

Rest console

GET /fhir/$export-status/<id>

Status

200 OK

Body

jsonp
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "request": "[base]/fhir/Patient/$export",
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
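
Checking the status is usually done in a loop. The sketch below shows one way to do it under stated assumptions: `fetch_status` is a hypothetical callable that performs GET /fhir/$export-status/<id> and returns the status code with the parsed body, and the endpoint is assumed to answer 202 while the export is still running, as the FHIR Bulk Data specification describes. The polling interval and limit are illustrative.

```python
import time
from typing import Callable, Optional, Tuple


def wait_for_export(
    fetch_status: Callable[[], Tuple[int, Optional[dict]]],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the export status endpoint until it returns the result body.

    `fetch_status` is a hypothetical callable returning
    (status_code, parsed_json_body_or_None).
    """
    for _ in range(max_polls):
        code, body = fetch_status()
        if code == 200 and body is not None:
            return body  # export finished; body lists the output files
        if code != 202:
            raise RuntimeError(f"unexpected status {code} from export status endpoint")
        # Assumed: 202 means the export is still in progress.
        sleep(poll_seconds)
    raise TimeoutError("export did not finish within the polling budget")
```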

A DELETE request to the export status endpoint cancels the export.

Rest console

DELETE /fhir/$export-status/<id>

Status

202 Accepted

Group-level export

Group-level export exports all Patient resources that belong to the specified group, together with the resources associated with them. This association is defined by FHIR Compartments. Characteristics of the group are not exported.

To start an export, make a request to /fhir/Group/<group-id>/$export:

Rest console

http
GET /fhir/Group/<group-id>/$export
Accept: application/fhir+json
Prefer: respond-async

Status

202 Accepted

Headers

  • Content-Location: link to check export status (e.g. /fhir/$export-status/<id>)

Make a request to the export status endpoint to check the status:

Rest console

GET /fhir/$export-status/<id>

Status

200 OK

Body

jsonp
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
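
The output array in this body can be turned into a per-type summary before downloading. This is a sketch only: `group_output_by_type` is a hypothetical helper, and it assumes a resource type may appear in more than one output entry.

```python
from collections import defaultdict
from typing import Dict


def group_output_by_type(manifest: dict) -> Dict[str, dict]:
    """Collect download URLs and total resource counts per resource type."""
    summary: Dict[str, dict] = defaultdict(lambda: {"urls": [], "count": 0})
    for entry in manifest.get("output", []):
        bucket = summary[entry["type"]]
        bucket["urls"].append(entry["url"])
        # "count" is treated as optional here and defaults to 0.
        bucket["count"] += entry.get("count", 0)
    return dict(summary)
```

Applied to the body above, the result has one key for Patient and one for Person, each with a single URL.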

A DELETE request to the export status endpoint cancels the export.

Rest console

http
DELETE /fhir/$export-status/<id>

Status

202 Accepted

System-level export

System-level export exports data from a FHIR server, whether or not it is associated with a patient. You may restrict the resources returned using the _type parameter.

Limitation: export operation will work for standard FHIR resources only, not for custom resources.

http
GET /fhir/$export
Accept: application/fhir+json
Prefer: respond-async

Status

200 OK

Body

jsonp
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
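
Each url in output points to an ndjson file, that is, one JSON resource per line. A minimal sketch of reading such a file once it has been downloaded is shown below; `iter_ndjson` is a hypothetical helper, and the `gzipped` flag is meant for files produced with the application/fhir+ndjson+gzip output format.

```python
import gzip
import json
from typing import Iterator


def iter_ndjson(payload: bytes, gzipped: bool = False) -> Iterator[dict]:
    """Yield one parsed FHIR resource per non-empty NDJSON line."""
    if gzipped:
        payload = gzip.decompress(payload)
    # Split on "\n" only: other separators may legally occur inside JSON strings.
    for line in payload.decode("utf-8").split("\n"):
        if line.strip():
            yield json.loads(line)
```

For large exports it would be preferable to stream the file line by line instead of holding the whole payload in memory; the sketch keeps everything in memory for brevity.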

A DELETE request to the export status endpoint cancels the export.

Rest console

http
DELETE /fhir/$export-status/<id>

Status

202 Accepted

Troubleshooting guide

The $export operation expects you to set up the external storage that Aidbox exports data into. In most cases, issues with $export are the consequence of incorrect Aidbox configuration. To rule this out, run the following RPC:

POST /rpc
Content-Type: text/yaml

method: aidbox.bulk/storage-healthcheck

Normally, you should see something like this in the response body:

result:
  message: ok
  storage:
    type: gcp
    bucket: my_bucket
    account:
      id: gcp-acc
      resourceType: GcpServiceAccount

This means that the integration between Aidbox and your storage is set up correctly.

Other responses you may see

Storage-type not specified

The Storage-type not specified error means that the box_bulk__storage_backend environment variable is not set. Valid values are aws and gcp.

Unsupported storage-type

The unsupported storage-type error means that the box_bulk__storage_backend environment variable has an invalid value. Valid values are aws and gcp.

bulk-storage account not specified

This error means that the account is not specified. Set the variable for your backend:

  • box_bulk__storage_gcp_service__account for GCP
  • box_bulk__storage_aws_account for AWS

Account not found

This means that the configured account resource for aws or gcp does not exist.

Create an AwsAccount or GcpServiceAccount resource, depending on your configuration.

Bucket is not specified

This error means that the bucket is not specified.

Specify box_bulk__storage_gcp_bucket for GCP.

Specify box_bulk__storage_aws_bucket for AWS.
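
The responses above can be mapped to the settings to check. The lookup below is a sketch under an assumption: the exact message strings returned by the healthcheck are taken from the headings in this guide and may differ in practice. `settings_to_check` is a hypothetical helper.

```python
from typing import List

# Lower-cased error text mapped to the settings to check.
# The message strings are an assumption based on the headings above.
REMEDIES = {
    "storage-type not specified": ["box_bulk__storage_backend"],
    "unsupported storage-type": ["box_bulk__storage_backend"],
    "bulk-storage account not specified": [
        "box_bulk__storage_gcp_service__account",
        "box_bulk__storage_aws_account",
    ],
    "account not found": ["GcpServiceAccount or AwsAccount resource"],
    "bucket is not specified": [
        "box_bulk__storage_gcp_bucket",
        "box_bulk__storage_aws_bucket",
    ],
}


def settings_to_check(message: str) -> List[str]:
    """Return the settings related to a healthcheck message, if it is known."""
    normalized = message.strip().lower()
    for known, settings in REMEDIES.items():
        if known in normalized:
            return settings
    return []
```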