Configuration

Bunny can be configured using environment variables. If you’re running Bunny as part of a composed stack of containers, these are set in the compose.yml.

services:
  bunny:
    image: ghcr.io/health-informatics-uon/hutch/bunny:edge
    environment:
      - DATASOURCE_DB_USERNAME=postgres
      - DATASOURCE_DB_PASSWORD=postgres
      - DATASOURCE_DB_DATABASE=postgres
      - DATASOURCE_DB_DRIVERNAME=postgresql
      - DATASOURCE_DB_SCHEMA=public
      - DATASOURCE_DB_PORT=5432
      - DATASOURCE_DB_HOST=db
      - TASK_API_BASE_URL=http://relay:8080/link_connector_api
      - TASK_API_USERNAME=username
      - TASK_API_PASSWORD=password
      - TASK_API_TYPE=a
      - COLLECTION_ID=collection_id
      - LOW_NUMBER_SUPPRESSION_THRESHOLD=10
      - ROUNDING_TARGET=10

Task API

| Name | Type | Description | Default |
|---|---|---|---|
| TASK_API_BASE_URL* | string | RQUEST API URL | |
| TASK_API_USERNAME* | string | API username | |
| TASK_API_PASSWORD* | string | API Password | |
| COLLECTION_ID* | string | API collection_id | |
| TASK_API_TYPE | "a,b,c" | Specifies the task type to query for, needed if connecting to RQUEST | |
| TASK_API_ENFORCE_HTTPS | boolean | Specifies whether to enforce a HTTPS TASK_API_BASE_URL | true |

To configure Bunny to communicate with a Task API, such as Relay or RQuest, the TASK_API variables must be set.

Host examples:

  • If Bunny and Relay are running in the same Docker Compose stack, the host should be the Relay service name in the stack, e.g. http://relay
  • If you have Relay running locally, and Bunny running locally, the host should be http://localhost
  • If you have any Task API running elsewhere, the host should be e.g. https://my-task-api.com with the correct hostname
  • The Base URL should include /link_connector_api at the end of the path
  • Requests from Bunny are authenticated with TASK_API_USERNAME and TASK_API_PASSWORD; set these to your Task API credentials
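
The rules above can be checked before starting Bunny. The following is a minimal sketch, not Bunny's own validation: the helper name check_task_api_base_url is hypothetical, and what Bunny itself does when a URL fails the HTTPS check is not covered here.

```python
from urllib.parse import urlsplit

# Path suffix the Base URL is expected to end with (from the notes above).
REQUIRED_SUFFIX = "/link_connector_api"


def check_task_api_base_url(url: str, enforce_https: bool = True) -> list[str]:
    """Return a list of problems with a TASK_API_BASE_URL value.

    Hypothetical helper for illustration; enforce_https mirrors
    TASK_API_ENFORCE_HTTPS, which defaults to true.
    """
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        problems.append("URL must start with http:// or https://")
    elif enforce_https and parts.scheme != "https":
        problems.append("TASK_API_ENFORCE_HTTPS is true but the URL is not HTTPS")
    if not parts.hostname:
        problems.append("URL has no host")
    # Tolerate a trailing slash when checking the path suffix.
    if not parts.path.rstrip("/").endswith(REQUIRED_SUFFIX):
        problems.append(f"path should end with {REQUIRED_SUFFIX}")
    return problems
```

An empty list means the value passed every check in this sketch. Under these assumptions, a plain http:// URL such as a Relay service inside a Compose stack only passes when enforce_https is False.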

Task Type

If you would like Bunny to connect directly to RQUEST, you need to set TASK_API_TYPE.

The value must be one of a (Availability), b (Distribution and PHEWAS) or c (AnalyticsGwas, AnalyticsGwasQuantitiveTrait, AnalyticsBurdenTest).

Further guidance is available in the Connect to RQuest How To Guide.

Database

The database connection is configured by the variables prefixed with DATASOURCE_.

| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_DB_USERNAME | string | Database username | trino-user |
| DATASOURCE_DB_PASSWORD | string | Database password | |
| DATASOURCE_DB_DATABASE* | string | Database name | |
| DATASOURCE_DB_SCHEMA* | string | Database schema | |
| DATASOURCE_DB_PORT* | string | Database port | |
| DATASOURCE_DB_HOST* | string | Database host | |
| DATASOURCE_DB_DRIVERNAME | "postgresql,mssql,duckdb" | Database dialect | postgresql |
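
To show how these variables relate to one another, here is a rough sketch that assembles them into a conventional dialect://user:password@host:port/database connection URL. The function build_db_url is hypothetical and the URL layout is an assumption; Bunny's actual connection logic may differ, and the schema is not part of the URL in this sketch.

```python
from urllib.parse import quote_plus


def build_db_url(env: dict[str, str]) -> str:
    """Assemble a connection URL from DATASOURCE_ variables (illustrative only)."""
    # Optional variables fall back to the defaults listed in the table.
    driver = env.get("DATASOURCE_DB_DRIVERNAME", "postgresql")
    # Percent-encode credentials so characters such as '@' cannot break the URL.
    user = quote_plus(env.get("DATASOURCE_DB_USERNAME", "trino-user"))
    password = quote_plus(env.get("DATASOURCE_DB_PASSWORD", ""))
    # Variables marked with * have no default, so a missing one raises KeyError.
    host = env["DATASOURCE_DB_HOST"]
    port = env["DATASOURCE_DB_PORT"]
    database = env["DATASOURCE_DB_DATABASE"]
    return f"{driver}://{user}:{password}@{host}:{port}/{database}"
```

Passing os.environ (or a plain dict of the same variables) gives a quick way to spot a missing or malformed value before starting the container.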

Managed Identity

| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_USE_AZURE_MANAGED_IDENTITY | boolean | Use Azure Managed Identity | False |
| DATASOURCE_AZURE_MANAGED_IDENTITY_CLIENT_ID | string | Azure Identity Client ID | |

When DATASOURCE_USE_AZURE_MANAGED_IDENTITY is enabled, the application will connect to Azure SQL Server using Azure Managed Identity instead of traditional credentials. This allows the service to authenticate securely without requiring a stored username or password. If you are using a user-assigned identity, you can provide its client ID through DATASOURCE_AZURE_MANAGED_IDENTITY_CLIENT_ID.

DuckDB

| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_DUCKDB_PATH_TO_DB | string | The path to the DuckDB database file | /data/file.db |
| DATASOURCE_DUCKDB_MEMORY_LIMIT | string | The memory limit for DuckDB (e.g. '1000mb', '2gb') | 1000mb |
| DATASOURCE_DUCKDB_TEMP_DIRECTORY | string | The temporary directory for DuckDB, used as swap space for larger-than-memory processing | /tmp |

Trino

Bunny can connect to databases through Trino, though this functionality is untested.

| Name | Type | Description | Default |
|---|---|---|---|
| DATASOURCE_DB_CATALOG | string | Trino catalog name | |
| DATASOURCE_USE_TRINO | boolean | Use Trino to connect | False |

Disclosure Control

| Name | Type | Description | Default |
|---|---|---|---|
| LOW_NUMBER_SUPPRESSION_THRESHOLD | integer | Specifies the minimum number of individuals required for a result to be returned; counts below this threshold are suppressed. | 10 |
| ROUNDING_TARGET | integer | Specifies the value to which all returned counts are rounded | 10 |

  • Low number suppression defaults to ON with a value of 10 such that any result with a count below 10 will return 0.
  • Rounding defaults to ON with a value of 10 such that results are returned to the nearest 10.
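
As a worked illustration of the two defaults, the sketch below applies suppression and then rounding to a single count. It is not Bunny's implementation: the function name is hypothetical, the order of the two steps is an assumption, and so is the use of Python's built-in round (which sends exact ties such as 25 to the even multiple).

```python
def apply_disclosure_controls(
    count: int, threshold: int = 10, rounding_target: int = 10
) -> int:
    """Suppress low counts, then round to the nearest rounding_target (sketch)."""
    # Low number suppression: counts below the threshold are returned as 0.
    if count < threshold:
        return 0
    # Assumption for this sketch: a non-positive target means "no rounding".
    if rounding_target <= 0:
        return count
    # Rounding: nearest multiple of rounding_target.
    return rounding_target * round(count / rounding_target)


for raw in (9, 10, 14, 16, 123):
    print(raw, "->", apply_disclosure_controls(raw))
# 9 -> 0, 10 -> 10, 14 -> 10, 16 -> 20, 123 -> 120
```

Suppressing before rounding matters in this sketch: rounding first would turn a count of 9 into 10 and let it through.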

Read more about disclosure controls.

Polling

Bunny continually polls an upstream Task API for new tasks. These settings configure that behaviour.

| Name | Type | Description | Default |
|---|---|---|---|
| POLLING_INTERVAL | integer | Specifies how often (seconds) the Task API is polled for new tasks | 5 |
| INITIAL_BACKOFF | integer | Specifies the starting wait time (seconds) before retrying after a network error | 5 |
| MAX_BACKOFF | integer | Sets the maximum wait time (seconds) that exponential backoff can reach after repeated failures | 60 |
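
To make the interaction between INITIAL_BACKOFF and MAX_BACKOFF concrete, here is a small sketch of a capped exponential backoff schedule. The doubling factor is an assumption (the table only says the backoff is exponential), and backoff_delays is a hypothetical name, not part of Bunny.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(initial: int = 5, maximum: int = 60, factor: int = 2) -> Iterator[int]:
    """Yield successive retry delays in seconds, growing by `factor` up to `maximum`."""
    delay = initial
    while True:
        # Never wait longer than the configured maximum.
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


# First six delays with the default settings.
print(list(islice(backoff_delays(), 6)))  # [5, 10, 20, 40, 60, 60]
```

With the defaults and a doubling factor, the wait reaches the 60 second ceiling on the fifth consecutive failure and stays there.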

Distribution Tasks Cache

Bunny can cache the results of a distribution task when it starts, and respond to those queries with the cached result. This is useful for large OMOP databases, where distribution tasks can take longer.

These settings can configure the specifics of this functionality.

| Name | Type | Description | Default |
|---|---|---|---|
| CACHE_ENABLED | boolean | Enable caching of distribution query results | False |
| CACHE_DIR | string | Directory to store cached distribution results | /app/cache |
| CACHE_TTL_HOURS | float | Cache validity (time-to-live) period in hours (0 = never expires) | 24.0 |
| CACHE_REFRESH_ON_STARTUP | boolean | Refresh cache when Bunny starts | True |
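
The CACHE_TTL_HOURS semantics, including the special value 0, can be sketched as a simple freshness check. This is an illustration only: cache_is_valid is a hypothetical name, and treating an entry that is exactly ttl_hours old as expired is an assumption.

```python
from datetime import datetime, timedelta


def cache_is_valid(written_at: datetime, now: datetime, ttl_hours: float = 24.0) -> bool:
    """Return True if a cached result written at `written_at` is still usable."""
    # A TTL of 0 means the cached result never expires.
    if ttl_hours == 0:
        return True
    # Otherwise the entry is valid while it is younger than the TTL.
    return now - written_at < timedelta(hours=ttl_hours)
```

Because the setting is a float, fractional values such as 0.5 (thirty minutes) work the same way in this sketch.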

Logging

| Name | Type | Description | Default |
|---|---|---|---|
| BUNNY_LOGGER_LEVEL | "DEBUG,INFO,WARNING,ERROR,CRITICAL" | Specifies the logging level | INFO |

OpenTelemetry

Bunny can use OpenTelemetry to export traces and metrics to your chosen collector. You will need either to have your own observability backend set up (for traces, metrics, and logs) or to use or modify our minimal implementation (Loki, Tempo, Prometheus, Grafana) specified in the observability.compose.yml.

| Name | Type | Description | Default |
|---|---|---|---|
| OTEL_ENABLED | boolean | Whether or not telemetry data is exported through OpenTelemetry to the observability backend(s) | False |
| OTEL_SERVICE_NAME | string | Service identification for OpenTelemetry | hutch-bunny-daemon |
| OTEL_EXPORTER_OTLP_ENDPOINT | string | OpenTelemetry collector endpoint for sending data | http://otel-collector:4317 |