Configuration
Bunny can be configured using environment variables.
If you’re running Bunny as part of a composed stack of containers, these are set in the compose.yml.
```yml
services:
  bunny:
    image: ghcr.io/health-informatics-uon/hutch/bunny:edge
    environment:
      - DATASOURCE_DB_USERNAME=postgres
      - DATASOURCE_DB_PASSWORD=postgres
      - DATASOURCE_DB_DATABASE=postgres
      - DATASOURCE_DB_DRIVERNAME=postgresql
      - DATASOURCE_DB_SCHEMA=public
      - DATASOURCE_DB_PORT=5432
      - DATASOURCE_DB_HOST=db
      - TASK_API_BASE_URL=http://relay:8080/link_connector_api
      - TASK_API_USERNAME=username
      - TASK_API_PASSWORD=password
      - TASK_API_TYPE=a
      - COLLECTION_ID=collection_id
      - LOW_NUMBER_SUPPRESSION_THRESHOLD=10
      - ROUNDING_TARGET=10
```

Task API
| Name | Type | Description | Default |
|---|---|---|---|
| TASK_API_BASE_URL* | string | RQUEST API URL | |
| TASK_API_USERNAME* | string | API username | |
| TASK_API_PASSWORD* | string | API password | |
| COLLECTION_ID* | string | API collection_id | |
| TASK_API_TYPE | "a,b,c" | Specifies the task type to query for; required when connecting to RQUEST | |
| TASK_API_ENFORCE_HTTPS | boolean | Specifies whether to enforce an HTTPS TASK_API_BASE_URL | true |
To configure Bunny to communicate with a Task API, such as Relay or RQuest, the TASK_API variables must be set.
Host examples:
- If Bunny and Relay are running in the same Docker Compose stack, the host should be the Relay service name in the stack, e.g. `http://relay`
- If Relay and Bunny are both running locally, the host should be `http://localhost`
- If the Task API is running elsewhere, use its hostname, e.g. `https://my-task-api.com`
- The base URL should end with the path `/link_connector_api`
- Requests from Bunny are authenticated with `TASK_API_USERNAME` and `TASK_API_PASSWORD`; put your Task API credentials in these variables
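Putting these together, here is a hedged sketch of the Task API variables for an externally hosted Task API over HTTPS. The hostname and credentials below are placeholders, not real values:

```yml
environment:
  - TASK_API_BASE_URL=https://my-task-api.com/link_connector_api
  - TASK_API_USERNAME=username   # your Task API credentials
  - TASK_API_PASSWORD=password
  - COLLECTION_ID=collection_id
  - TASK_API_ENFORCE_HTTPS=true  # the default; requires an HTTPS base URL
```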
Task Type
If you would like Bunny to connect directly to RQUEST, you need to set TASK_API_TYPE.
The value must be one of a (Availability), b (Distribution and PHEWAS), or c (AnalyticsGwas, AnalyticsGwasQuantitiveTrait, AnalyticsBurdenTest).
Further guidance is available in the Connect to RQuest How To Guide.
Database
The database connection is configured by the variables prefixed with `DATASOURCE_`.
| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_DB_USERNAME | string | Database username | trino-user |
| DATASOURCE_DB_PASSWORD | string | Database password | |
| DATASOURCE_DB_DATABASE* | string | Database name | |
| DATASOURCE_DB_SCHEMA* | string | Database schema | |
| DATASOURCE_DB_PORT* | string | Database port | |
| DATASOURCE_DB_HOST* | string | Database host | |
| DATASOURCE_DB_DRIVERNAME | "postgresql,mssql,duckdb" | Database dialect | postgresql |
Managed Identity
| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_USE_AZURE_MANAGED_IDENTITY | boolean | Use Azure Managed Identity | False |
| DATASOURCE_AZURE_MANAGED_IDENTITY_CLIENT_ID | string | Azure Identity Client ID | |
When DATASOURCE_USE_AZURE_MANAGED_IDENTITY is enabled, the application will connect to Azure SQL Server using Azure Managed Identity instead of traditional credentials.
This allows the service to authenticate securely without requiring a stored username or password.
If you are using a user-assigned identity, you can provide its client ID through DATASOURCE_AZURE_MANAGED_IDENTITY_CLIENT_ID.
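As an illustration, enabling managed identity in a Compose environment might look like the fragment below; the client ID is a placeholder and is only needed for a user-assigned identity:

```yml
environment:
  - DATASOURCE_USE_AZURE_MANAGED_IDENTITY=true
  # Placeholder client ID; only required for a user-assigned identity
  - DATASOURCE_AZURE_MANAGED_IDENTITY_CLIENT_ID=00000000-0000-0000-0000-000000000000
```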
DuckDB
| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_DUCKDB_PATH_TO_DB | string | The path to the DuckDB database file | /data/file.db |
| DATASOURCE_DUCKDB_MEMORY_LIMIT | string | The memory limit for DuckDB (e.g. '1000mb', '2gb') | 1000mb |
| DATASOURCE_DUCKDB_TEMP_DIRECTORY | string | The temporary directory for DuckDB, used as swap for larger-than-memory processing | /tmp |
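A sketch of a DuckDB-backed configuration, assuming the database file is mounted into the container at the default path (other `DATASOURCE_DB_*` variables may still be required depending on your setup):

```yml
environment:
  - DATASOURCE_DB_DRIVERNAME=duckdb
  - DATASOURCE_DUCKDB_PATH_TO_DB=/data/file.db
  - DATASOURCE_DUCKDB_MEMORY_LIMIT=2gb
  - DATASOURCE_DUCKDB_TEMP_DIRECTORY=/tmp
```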
Trino
Bunny can connect to databases via Trino, though this functionality is untested.
| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_DB_CATALOG | string | Trino catalog name | |
| DATASOURCE_USE_TRINO | boolean | Use Trino to connect | False |
Disclosure Control
| Name | Type | Description | Default |
|---|---|---|---|
| LOW_NUMBER_SUPPRESSION_THRESHOLD | integer | Specifies the minimum number of individuals required for a result to be returned; counts below this threshold are suppressed | 10 |
| ROUNDING_TARGET | integer | Specifies the value to which all returned counts are rounded | 10 |
- Low number suppression defaults to ON with a value of 10 such that any result with a count below 10 will return 0.
- Rounding defaults to ON with a value of 10 such that results are returned to the nearest 10.
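To make the combined effect concrete, here is a minimal Python sketch of the two controls. This is an illustration, not Bunny's actual implementation, and it assumes suppression is applied before rounding:

```python
def apply_disclosure_controls(count: int, threshold: int = 10, rounding: int = 10) -> int:
    """Suppress low counts, then round to the nearest rounding target."""
    if count < threshold:
        return 0  # low number suppression: below-threshold counts return 0
    # Round to the nearest multiple of the rounding target
    return round(count / rounding) * rounding

print(apply_disclosure_controls(7))   # 7 < 10, suppressed to 0
print(apply_disclosure_controls(23))  # rounded to 20
```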
Read more about disclosure controls.
Polling
Bunny continually polls a Task API for new tasks; these settings configure that behaviour.
| Name | Type | Description | Default |
|---|---|---|---|
| POLLING_INTERVAL | integer | Specifies how often (seconds) the Task API is polled for new tasks | 5 |
| INITIAL_BACKOFF | integer | Specifies the starting wait time (seconds) before retrying after a network error | 5 |
| MAX_BACKOFF | integer | Sets the maximum wait time (seconds) that exponential backoff can reach after repeated failures | 60 |
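As a sketch of how INITIAL_BACKOFF and MAX_BACKOFF interact, assuming the delay doubles after each failure (the usual exponential backoff scheme; Bunny's exact multiplier may differ):

```python
from itertools import islice

def backoff_delays(initial: int = 5, maximum: int = 60):
    """Yield successive retry delays: doubling each time, capped at `maximum`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, maximum)

# First six delays with the default INITIAL_BACKOFF=5 and MAX_BACKOFF=60
print(list(islice(backoff_delays(), 6)))  # [5, 10, 20, 40, 60, 60]
```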
Distribution Tasks Cache
Bunny can cache the results of a distribution task when it starts, and respond to those queries with the cached result. This is useful for large OMOP databases, where distribution tasks can take longer.
These settings can configure the specifics of this functionality.
| Name | Type | Description | Default |
|---|---|---|---|
| CACHE_ENABLED | boolean | Enable caching of distribution query results | False |
| CACHE_DIR | string | Directory to store cached distribution results | /app/cache |
| CACHE_TTL_HOURS | float | Cache validity (time-to-live) period in hours (0 = never expires) | 24.0 |
| CACHE_REFRESH_ON_STARTUP | boolean | Refresh cache when Bunny starts | True |
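For instance, a fragment enabling the cache with the defaults made explicit (assuming you may also want to persist the cache directory with a volume mount):

```yml
environment:
  - CACHE_ENABLED=true
  - CACHE_DIR=/app/cache        # mount a volume here to persist across restarts
  - CACHE_TTL_HOURS=24.0        # 0 would mean the cache never expires
  - CACHE_REFRESH_ON_STARTUP=true
```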
Logging
| Name | Type | Description | Default |
|---|---|---|---|
| BUNNY_LOGGER_LEVEL | "DEBUG,INFO,WARNING,ERROR,CRITICAL" | Specifies the logging level | INFO |
OpenTelemetry
Bunny can use OpenTelemetry to export traces and metrics to your chosen collector. You will need either your own observability backend set up (for traces, metrics, and logs), or to use or modify our minimal implementation (Loki, Tempo, Prometheus, Grafana) specified in the observability.compose.yml.
| Name | Type | Description | Default |
|---|---|---|---|
| OTEL_ENABLED | boolean | Whether or not telemetry data is exported through OpenTelemetry to the observability backend(s) | False |
| OTEL_SERVICE_NAME | string | Service identification for OpenTelemetry | hutch-bunny-daemon |
| OTEL_EXPORTER_OTLP_ENDPOINT | string | OpenTelemetry collector endpoint for sending data | http://otel-collector:4317 |
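For example, enabling export to a collector running as an `otel-collector` service in the same stack (the service name and endpoint below are just the defaults from the table above):

```yml
environment:
  - OTEL_ENABLED=true
  - OTEL_SERVICE_NAME=hutch-bunny-daemon
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
```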