$export
The FHIR Bulk Data Export feature lets you export FHIR resources in ndjson format.
Aidbox supports patient-level, group-level, and system-level export. When an export request is submitted, the server returns a URL for checking the export status. When the export is finished, the status endpoint returns URLs for downloading the resources.
Only one export process can run at a time. If you submit an export request while another export is active, you get a 429 Too Many Requests error.
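A client has to tell the two kick-off outcomes apart: an accepted export that comes with a status URL, and a rejection because another export is running. The following is a minimal sketch of that check, assuming the response is available as a status code plus a dict of headers and that the status URL arrives in the standard Content-Location header. The names `status_url_from_kickoff` and `ExportBusy` are hypothetical and not part of Aidbox.

```python
class ExportBusy(Exception):
    """Raised when another export is already running (HTTP 429)."""


def status_url_from_kickoff(status_code, headers):
    """Return the status URL from a kick-off response.

    `headers` is a plain dict; header names are matched case-insensitively.
    """
    if status_code == 429:
        raise ExportBusy("another export is active; retry later")
    if status_code != 202:
        raise RuntimeError(f"unexpected kick-off status: {status_code}")
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return lowered["content-location"]
    except KeyError:
        raise RuntimeError("202 response without Content-Location header") from None
```

Catching `ExportBusy` separately lets the caller wait and retry instead of treating the 429 as a hard failure.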
Setup storage
Aidbox can export data to GCP, Azure, or AWS cloud storage. Export results are written to a folder in the bucket.
GCP
Create a bucket and a service account that has read and write access to the bucket.
Create a GcpServiceAccount resource in Aidbox. Example:
private-key: |
  -----BEGIN PRIVATE KEY-----
  your-key-here
  -----END PRIVATE KEY-----
service-account-email: service-account@email
id: gcp-service-account
resourceType: GcpServiceAccount
Set the following environment variables:
box_bulk__storage_backend=gcp: backend for export
box_bulk__storage_gcp_service__account: id of the GcpServiceAccount resource
box_bulk__storage_gcp_bucket: bucket name
Azure
Create an Azure storage account and a storage container.
Create an AzureAccount resource in Aidbox.
resourceType: AzureAccount
id: azureaccount # your storage account id
key: 7x..LA # your storage account key
Create an AzureContainer resource in Aidbox.
resourceType: AzureContainer
id: smartboxexporttestcontainer
account:
  resourceType: AzureAccount
  id: azureaccount
storage: azureaccount # your storage account
container: azureaccountcontainer # your account container
Set the following environment variables:
box_bulk__storage_backend=azure: backend for export
box_bulk__storage_azure_container=smartboxexporttestcontainer: id of the AzureContainer resource
AWS
Create an S3 bucket and an IAM user that has read and write access to the bucket.
Create an AwsAccount resource in Aidbox. Example:
region: us-east-1
access-key-id: your-key-id
secret-access-key: key
id: aws-account
resourceType: AwsAccount
Set the following environment variables:
box_bulk__storage_backend=aws: backend for export
box_bulk__storage_aws_account: id of the AwsAccount resource
box_bulk__storage_aws_bucket: bucket name
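Before starting Aidbox it can help to confirm that the variables for the chosen backend are all present. The sketch below collects the variable names listed in the GCP, Azure, and AWS sections and reports what is missing. The function name `check_bulk_env` and the wording of the messages are hypothetical; this is a local pre-flight check, not something Aidbox provides.

```python
# Variables each backend needs, as listed in the sections above.
REQUIRED_ENV = {
    "gcp": ["box_bulk__storage_gcp_service__account", "box_bulk__storage_gcp_bucket"],
    "azure": ["box_bulk__storage_azure_container"],
    "aws": ["box_bulk__storage_aws_account", "box_bulk__storage_aws_bucket"],
}


def check_bulk_env(env):
    """Return a list of human-readable problems; an empty list means OK."""
    backend = env.get("box_bulk__storage_backend")
    if not backend:
        return ["box_bulk__storage_backend is not set"]
    if backend not in REQUIRED_ENV:
        return [f"unsupported backend: {backend}"]
    return [f"{name} is not set" for name in REQUIRED_ENV[backend] if not env.get(name)]
```

Calling `check_bulk_env(dict(os.environ))` after `import os` runs the check against the current process environment.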
Parameters
Parameter | Description |
---|---|
_outputFormat | Specifies the format in which the server generates files. |
_type | Includes only the specified types. This list is comma-separated. |
_since | Includes only resources changed after the specified time. |
patient | Exports only data that belongs to the listed patients. Format: comma-separated list of patient ids. Available only for patient-level export. |
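The parameters are passed as query parameters on the kick-off request. As an illustration, the sketch below assembles the request path for each export level and rejects `patient` outside patient-level export, as the table describes. The function name `build_export_path` and its arguments are hypothetical; `_outputFormat` is left out because its accepted values are not listed here.

```python
from urllib.parse import urlencode


def build_export_path(level="patient", group_id=None, types=None, since=None, patient_ids=None):
    """Build the kick-off path and query string for an export request."""
    if level == "patient":
        path = "/fhir/Patient/$export"
    elif level == "group":
        if not group_id:
            raise ValueError("group-level export needs a group id")
        path = f"/fhir/Group/{group_id}/$export"
    elif level == "system":
        path = "/fhir/$export"
    else:
        raise ValueError(f"unknown export level: {level}")

    params = {}
    if types:
        params["_type"] = ",".join(types)
    if since:
        params["_since"] = since
    if patient_ids:
        if level != "patient":
            raise ValueError("the patient parameter is only for patient-level export")
        params["patient"] = ",".join(patient_ids)

    # Keep commas and colons readable; other reserved characters are percent-encoded.
    query = urlencode(params, safe=",:")
    return f"{path}?{query}" if query else path
```

For example, `build_export_path(types=["Patient", "Observation"])` yields a patient-level request restricted to those two resource types.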
Patient-level export
Patient-level export exports all Patient resources and resources associated with them. This association is defined by FHIR Compartments.
To start the export, make a request to /fhir/Patient/$export:
Rest console
GET /fhir/Patient/$export
Accept: application/fhir+json
Prefer: respond-async
Status
202 Accepted
Headers
Content-Location: link to check export status (e.g. /fhir/$export-status/<id>)
Make a request to the export status endpoint to check the status:
Rest console
GET /fhir/$export-status/<id>
Status
200 OK
Body
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "request": "[base]/fhir/Patient/$export",
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
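Once the status is completed, a client typically walks the `output` array to find the files to download. The following sketch groups the URLs by resource type and totals the counts, assuming the body has the shape of the example response. The function name `summarize_output` is hypothetical.

```python
import json


def summarize_output(status_body):
    """Group download URLs by resource type and total the resource counts."""
    doc = json.loads(status_body)
    if doc.get("status") != "completed":
        raise ValueError(f"export not completed: {doc.get('status')!r}")
    urls_by_type = {}
    total = 0
    for item in doc.get("output", []):
        # One resource type may be split across several files.
        urls_by_type.setdefault(item["type"], []).append(item["url"])
        total += item.get("count", 0)
    return urls_by_type, total
```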
A DELETE request to the export status endpoint cancels the export.
Rest console
DELETE /fhir/$export-status/<id>
Status
202 Accepted
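Because the export runs asynchronously, the status endpoint is usually checked in a loop. The sketch below shows one way to structure that loop. It assumes the caller supplies a `fetch_status` function that performs the GET and returns the parsed body as a dict, and it treats any `status` value other than `completed` as "not finished yet"; what Aidbox returns while an export is still running is not described here, so that part is an assumption. The function name `wait_for_export` is hypothetical.

```python
import time


def wait_for_export(fetch_status, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until it reports a completed export.

    fetch_status: caller-supplied function returning the parsed status body (a dict).
    sleep: injectable for testing; defaults to time.sleep.
    """
    for attempt in range(max_attempts):
        body = fetch_status()
        if body.get("status") == "completed":
            return body
        # Assumption: anything else means the export is still running.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"export not completed after {max_attempts} checks")
```

Bounding the number of attempts keeps a stuck export from blocking the client forever; the caller can then cancel it with the DELETE request.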
Group-level export
Group-level export exports all Patient resources that belong to the specified group and the resources associated with them. This association is defined by FHIR Compartments. Characteristics of the group are not exported.
To start the export, make a request to /fhir/Group/<group-id>/$export:
Rest console
GET /fhir/Group/<group-id>/$export
Accept: application/fhir+json
Prefer: respond-async
Status
202 Accepted
Headers
Content-Location: link to check export status (e.g. /fhir/$export-status/<id>)
Make a request to the export status endpoint to check the status:
Rest console
GET /fhir/$export-status/<id>
Status
200 OK
Body
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
A DELETE request to the export status endpoint cancels the export.
Rest console
DELETE /fhir/$export-status/<id>
Status
202 Accepted
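Each file listed in `output` is in ndjson format: one JSON resource per line. As an illustration of consuming a downloaded file, the sketch below parses ndjson text and counts resources by type. The function names are hypothetical, and the text is assumed to be already downloaded and decoded.

```python
import json
from collections import Counter


def parse_ndjson(text):
    """Parse ndjson text into a list of resources, one per non-empty line."""
    # Split on "\n" only; strip() also removes a trailing "\r" if present.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def count_by_type(resources):
    """Count resources by their resourceType."""
    return Counter(resource.get("resourceType") for resource in resources)
```

Comparing the result of `count_by_type` with the `count` values in the status response is a simple way to check that a download is complete.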
System-level export
System-level export exports data from a FHIR server, whether or not it is associated with a patient. You can restrict the resources returned using the _type parameter.
Limitation: the export operation works for standard FHIR resources only, not for custom resources.
GET /fhir/$export
Accept: application/fhir+json
Prefer: respond-async
Status
200 OK
Body
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
A DELETE request to the export status endpoint cancels the export.
Rest console
DELETE /fhir/$export-status/<id>
Status
202 Accepted
Troubleshooting guide
The $export operation expects you to set up external storage that Aidbox exports data into. In most cases, issues with $export are the consequence of incorrect Aidbox configuration. To rule this out, run the following RPC:
POST /rpc
Content-Type: text/yaml
method: aidbox.bulk/storage-healthcheck
Normally, you should see something like this in the response body:
result:
  message: ok
  storage:
    type: gcp
    bucket: my_bucket
    account:
      id: gcp-acc
      resourceType: GcpServiceAccount
This means that the integration between Aidbox and your storage is set up correctly.
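The healthcheck can also be run from a script. The sketch below only builds the POST /rpc request with the YAML body shown above and does not send it. The base URL in the usage note is a placeholder, and any authentication header has to be supplied by the caller, since how your box is secured is outside the scope of this page. The function name `healthcheck_request` is hypothetical.

```python
from urllib.request import Request


def healthcheck_request(base_url, extra_headers=None):
    """Build (but do not send) the storage-healthcheck RPC request."""
    headers = {"Content-Type": "text/yaml"}
    headers.update(extra_headers or {})  # e.g. an Authorization header
    body = b"method: aidbox.bulk/storage-healthcheck\n"
    return Request(base_url.rstrip("/") + "/rpc", data=body, headers=headers, method="POST")
```

To send it, pass the result to `urllib.request.urlopen`, for example `urlopen(healthcheck_request("http://localhost:8080"))`, where the URL stands in for your own Aidbox address.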
Other responses you may see
Storage-type not specified
The Storage-type not specified error means that the box_bulk__storage_backend env variable wasn't set. Valid values are aws and gcp.
Unsupported storage-type
The Unsupported storage-type error means that the box_bulk__storage_backend env variable has an invalid value. Valid values are aws and gcp.
bulk-storage account not specified
This error means the account is not specified. Set:
box_bulk__storage_gcp_service__account for GCP
box_bulk__storage_aws_account for AWS
Account not found
This means there is no account resource for AWS or GCP.
Create an AwsAccount or GcpServiceAccount resource, depending on your configuration.
Bucket is not specified
This error means the bucket is not specified.
Specify box_bulk__storage_gcp_bucket for GCP.
Specify box_bulk__storage_aws_bucket for AWS.