Distribution Task
Purpose
A Distribution Task is a standardised request for summary statistics of an OMOP dataset, run nightly. The task includes metadata such as the collection ID and task ID, along with the type (analysis) to be run.
The result will help populate the user interface of RQUEST, enabling a researcher to build their query on the data that exists.
Upon retrieving the task, Bunny securely queries the OMOP dataset within the Data Custodian’s environment. The result is a list of summary statistics of the dataset, subject to disclosure controls.
No raw or identifiable data leaves the system; only summary statistics are returned.
Example Input Schema
An example Distribution task:
{
"code": "DEMOGRAPHICS",
"analysis": "DISTRIBUTION",
"uuid": "unique_id",
"collection": "collection_id",
"owner": "user_id"
}
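As a minimal sketch of how a client might check such a payload before acting on it, the Python below parses the example into a typed record. The DistributionTask class and parse_task function are hypothetical names used for illustration only, not part of Bunny, and the check assumes that a Distribution task always carries "analysis": "DISTRIBUTION" as in the example above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DistributionTask:
    """Hypothetical typed view of the example task payload."""
    code: str
    analysis: str
    uuid: str
    collection: str
    owner: str


def parse_task(raw: str) -> DistributionTask:
    """Parse a JSON task and check that it is a Distribution task."""
    payload = json.loads(raw)
    # Raises TypeError if a field is missing or an unexpected field is present.
    task = DistributionTask(**payload)
    # Assumption: Distribution tasks are identified by this analysis value.
    if task.analysis != "DISTRIBUTION":
        raise ValueError(f"not a Distribution task: {task.analysis!r}")
    return task


example = """
{
  "code": "DEMOGRAPHICS",
  "analysis": "DISTRIBUTION",
  "uuid": "unique_id",
  "collection": "collection_id",
  "owner": "user_id"
}
"""

task = parse_task(example)
print(task.code, task.collection)
```

Using a frozen dataclass keeps the sketch strict: a payload with a missing or extra field fails immediately instead of being silently accepted.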
Response Schema
Bunny returns a structured JSON response upon completing a Distribution task.
The key field is queryResult.files.file_data, a base64-encoded summary statistics file.
The queryResult.count field describes the number of rows in the file.
{
"status": "ok",
"protocolVersion": "v2",
"uuid": "unique_id",
"queryResult": {
"count": 8,
"datasetCount": 1,
"files": [
{
"file_name": "code.distribution",
"file_data": "QklPQkFOSwlDT0RFCUNPVU5UCURFU0NSSVBUSU9OCU1JTglRMQlNRURJQU4JTUVBTglRMwlNQVgJQUxURVJOQVRJVkVTCURBVEFTRVQJT01PUAlPTU9QX0RFU0NSCUNBVEVHT1JZCmNvbGxlY3Rpb25faWQJT01PUDo0MzIzNjg4CTIwCQkJCQkJCQkJCTQzMjM2ODgJQ291Z2ggYXQgcmVzdAlDb25kaXRpb24KY29sbGVjdGlvbl9pZAlPTU9QOjM1NjI2MDYxCTEwCQkJCQkJCQkJCTM1NjI2MDYxCU5vIGNvdWdoIHN0cmVuZ3RoCUNvbmRpdGlvbgpjb2xsZWN0aW9uX2lkCU9NT1A6MzgwMDM1NjMJNjAJCQkJCQkJCQkJMzgwMDM1NjMJSGlzcGFuaWMgb3IgTGF0aW5vCUV0aG5pY2l0eQpjb2xsZWN0aW9uX2lkCU9NT1A6MzgwMDM1NjQJNDAJCQkJCQkJCQkJMzgwMDM1NjQJTm90IEhpc3BhbmljIG9yIExhdGlubwlFdGhuaWNpdHkKY29sbGVjdGlvbl9pZAlPTU9QOjg1MDcJNDAJCQkJCQkJCQkJODUwNwlNQUxFCUdlbmRlcgpjb2xsZWN0aW9uX2lkCU9NT1A6ODUzMgk2MAkJCQkJCQkJCQk4NTMyCUZFTUFMRQlHZW5kZXIKY29sbGVjdGlvbl9pZAlPTU9QOjIxNDkwNzQyCTEwCQkJCQkJCQkJCTIxNDkwNzQyCUFpcndheSByZXNpc3RhbmNlIC0tZHVyaW5nIGluc3BpcmF0aW9uCU1lYXN1cmVtZW50CmNvbGxlY3Rpb25faWQJT01PUDo0MjY2MDA5CTEwCQkJCQkJCQkJCTQyNjYwMDkJQWJpbGl0eSB0byBzbWVsbAlPYnNlcnZhdGlvbg==",
"file_description": "Result of code.distribution analysis",
"file_size": 0.928,
"file_type": "BCOS",
"file_sensitive": true,
"file_reference": ""
}
]
},
"message": "",
"collection_id": "collection_id"
}
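The sketch below shows one way a consumer might decode file_data and compare the number of rows against queryResult.count. The function names are hypothetical, the sample response is a shortened stand-in built inside the code rather than real output, and the comparison assumes that count excludes the header row, which is what the example response above suggests (8 data rows, "count": 8).

```python
import base64


def decode_distribution_file(response: dict) -> str:
    """Return the decoded text of the first file in a Distribution response."""
    file_entry = response["queryResult"]["files"][0]
    return base64.b64decode(file_entry["file_data"]).decode("utf-8")


def count_matches(response: dict) -> bool:
    """Check that queryResult.count equals the number of data rows.

    Assumption: the first non-empty line is a header and is not counted.
    """
    text = decode_distribution_file(response)
    lines = [line for line in text.splitlines() if line.strip()]
    data_rows = len(lines) - 1
    return data_rows == response["queryResult"]["count"]


# Shortened, made-up sample with three columns and two data rows.
sample_tsv = (
    "BIOBANK\tCODE\tCOUNT\n"
    "collection_id\tOMOP:8507\t40\n"
    "collection_id\tOMOP:8532\t60"
)
sample_response = {
    "status": "ok",
    "queryResult": {
        "count": 2,
        "files": [
            {
                "file_name": "code.distribution",
                "file_data": base64.b64encode(sample_tsv.encode("utf-8")).decode("ascii"),
            }
        ],
    },
}

print(decode_distribution_file(sample_response))
print(count_matches(sample_response))
```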
The decoded file_data will contain summary statistics in a .tsv format:
BIOBANK CODE COUNT DESCRIPTION MIN Q1 MEDIAN MEAN Q3 MAX ALTERNATIVES DATASET OMOP OMOP_DESCR CATEGORY
collection_id OMOP:4323688 20 4323688 Cough at rest Condition
collection_id OMOP:35626061 10 35626061 No cough strength Condition
collection_id OMOP:38003563 60 38003563 Hispanic or Latino Ethnicity
collection_id OMOP:38003564 40 38003564 Not Hispanic or Latino Ethnicity
collection_id OMOP:8507 40 8507 MALE Gender
collection_id OMOP:8532 60 8532 FEMALE Gender
collection_id OMOP:21490742 10 21490742 Airway resistance --during inspiration Measurement
collection_id OMOP:4266009 10 4266009 Ability to smell Observation
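Each row summarises the frequency and distribution of a particular OMOP concept (e.g., a condition, gender, ethnicity, observation, or measurement). In this example, the DESCRIPTION through DATASET columns are empty, so the values following COUNT belong to the OMOP, OMOP_DESCR, and CATEGORY columns.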
For example, FEMALE (concept ID: 8532) appears 60 times, and the Observation Ability to smell (concept ID: 4266009) appears 10 times.
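To illustrate how the rows can be read programmatically, the sketch below parses tab-separated text with Python's csv module and totals COUNT per CATEGORY. The helper names (make_row, parse_distribution, counts_by_category) are hypothetical, and the sample rows are rebuilt by hand from three rows of the example, with the DESCRIPTION through DATASET columns left empty.

```python
import csv
import io
from collections import defaultdict

HEADER = [
    "BIOBANK", "CODE", "COUNT", "DESCRIPTION", "MIN", "Q1", "MEDIAN", "MEAN",
    "Q3", "MAX", "ALTERNATIVES", "DATASET", "OMOP", "OMOP_DESCR", "CATEGORY",
]


def make_row(concept_id: int, count: int, descr: str, category: str) -> str:
    """Build one sample row; the nine columns after COUNT are left empty."""
    fields = ["collection_id", f"OMOP:{concept_id}", str(count)]
    fields += [""] * 9
    fields += [str(concept_id), descr, category]
    return "\t".join(fields)


def parse_distribution(tsv_text: str) -> list:
    """Parse the TSV into a list of dicts, converting COUNT to int."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    rows = []
    for row in reader:
        row["COUNT"] = int(row["COUNT"])
        rows.append(row)
    return rows


def counts_by_category(rows: list) -> dict:
    """Sum COUNT over all rows that share a CATEGORY."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["CATEGORY"]] += row["COUNT"]
    return dict(totals)


sample = "\n".join([
    "\t".join(HEADER),
    make_row(8507, 40, "MALE", "Gender"),
    make_row(8532, 60, "FEMALE", "Gender"),
    make_row(4266009, 10, "Ability to smell", "Observation"),
])

rows = parse_distribution(sample)
print(rows[1]["OMOP_DESCR"], rows[1]["COUNT"])
print(counts_by_category(rows))
```

Reading by column name rather than by position avoids misassigning values when several columns are empty, as they are in the example.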