Improvement: Align environment variable names across Harvester and API Connector / Storage API

@gabriel.preuss

Summary

The environment variable names in the Harvester's .env-example file do not match the names that the API Connector's config.py reads. Because config.py falls back to a default whenever a variable is not set under the expected name, values set under the .env-example names are silently ignored, and the mismatch goes unnoticed.

Observed Inconsistencies

Harvester .env-example:

DEBUG=0
STORAGE_API=localhost/api/v1/entities
MAX_RETRIES_API=2
PACKAGE_SIZE_API_SEND=100
MAX_THREADS=10

API Connector config.py:

MAX_HARVESTING_THREADS = int(os.environ.get("MAX_THREADS", 1))
API_URL = os.environ.get("STORAGE_API_URL", "http://localhost")
API_PACKAGE_SIZE_SEND = int(os.environ.get("PACKAGE_SIZE_API_SEND", 1))
MAX_API_THREADS = int(os.environ.get("MAX_API_THREADS", 1))
API_MAX_RETRIES = int(os.environ.get("API_MAX_RETRIES", 1))

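This results in the following mismatches (Harvester .env-example vs. API Connector config.py):

  • MAX_RETRIES_API vs. API_MAX_RETRIES: config.py never reads MAX_RETRIES_API, so the default of 1 is used.
  • MAX_THREADS vs. MAX_HARVESTING_THREADS: the environment variable is read, but under a different name than the Python variable it sets.
  • STORAGE_API vs. STORAGE_API_URL: config.py never reads STORAGE_API, so the default "http://localhost" is used.
  • MAX_API_THREADS is read by config.py but is not listed in .env-example.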
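The effect can be reproduced with a minimal sketch. It applies the lookups from config.py to a plain dictionary holding the .env-example values instead of the real process environment. `load_config` is a hypothetical helper introduced only for this illustration and is not part of either project.

```python
# Values copied from the Harvester .env-example shown above.
harvester_env = {
    "DEBUG": "0",
    "STORAGE_API": "localhost/api/v1/entities",
    "MAX_RETRIES_API": "2",
    "PACKAGE_SIZE_API_SEND": "100",
    "MAX_THREADS": "10",
}


def load_config(env):
    """Hypothetical helper: repeat the config.py lookups against a mapping."""
    return {
        "MAX_HARVESTING_THREADS": int(env.get("MAX_THREADS", 1)),
        "API_URL": env.get("STORAGE_API_URL", "http://localhost"),
        "API_PACKAGE_SIZE_SEND": int(env.get("PACKAGE_SIZE_API_SEND", 1)),
        "MAX_API_THREADS": int(env.get("MAX_API_THREADS", 1)),
        "API_MAX_RETRIES": int(env.get("API_MAX_RETRIES", 1)),
    }


config = load_config(harvester_env)

# Names that match are picked up as configured.
print(config["MAX_HARVESTING_THREADS"])  # 10
print(config["API_PACKAGE_SIZE_SEND"])   # 100

# Names that differ fall back to the defaults without any error.
print(config["API_URL"])                 # http://localhost
print(config["API_MAX_RETRIES"])         # 1
```

No exception is raised at any point, which is why the misconfiguration is easy to miss.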

Suggested Improvements

  1. Unify naming conventions across Harvester and API Connector / Storage API (e.g. always prefix with API_ or STORAGE_).

  2. Match environment variable names and Python variable names in config.py to reduce confusion and errors. For example:

MAX_HARVESTING_THREADS = int(os.environ.get("MAX_HARVESTING_THREADS", 1))

instead of

MAX_HARVESTING_THREADS = int(os.environ.get("MAX_THREADS", 1))

  3. Update all .env-example files accordingly to reflect the expected names.
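To keep the files from drifting apart again, a small consistency check could compare the names defined in an .env-example with the names read via os.environ.get in a config module. The following is only a sketch under assumptions: the function names are hypothetical, the inputs are the snippets quoted in this issue rather than the real files, and the regular expression only recognizes literal os.environ.get("NAME", ...) calls.

```python
import re

# Matches the variable name in literal calls such as os.environ.get("NAME", ...).
ENV_GET = re.compile(r"os\.environ\.get\(\s*['\"]([A-Za-z_][A-Za-z0-9_]*)['\"]")


def env_example_names(text):
    """Return the variable names defined in KEY=VALUE lines of an env file."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        names.add(line.split("=", 1)[0].strip())
    return names


def config_env_names(source):
    """Return the variable names a config module reads via os.environ.get."""
    return set(ENV_GET.findall(source))


def compare(env_text, config_source):
    """Return (names read but not defined, names defined but not read)."""
    defined = env_example_names(env_text)
    expected = config_env_names(config_source)
    return sorted(expected - defined), sorted(defined - expected)


# Inputs copied from the snippets in this issue, not read from the real files.
ENV_EXAMPLE = """\
DEBUG=0
STORAGE_API=localhost/api/v1/entities
MAX_RETRIES_API=2
PACKAGE_SIZE_API_SEND=100
MAX_THREADS=10
"""

CONFIG_PY = """\
MAX_HARVESTING_THREADS = int(os.environ.get("MAX_THREADS", 1))
API_URL = os.environ.get("STORAGE_API_URL", "http://localhost")
API_PACKAGE_SIZE_SEND = int(os.environ.get("PACKAGE_SIZE_API_SEND", 1))
MAX_API_THREADS = int(os.environ.get("MAX_API_THREADS", 1))
API_MAX_RETRIES = int(os.environ.get("API_MAX_RETRIES", 1))
"""

missing, unused = compare(ENV_EXAMPLE, CONFIG_PY)
print("read by config.py but missing from .env-example:", missing)
print("in .env-example but not read by config.py:", unused)
```

A name reported as unused is not necessarily wrong. DEBUG, for example, may be read elsewhere in the Harvester, so the output is a starting point for review rather than a definitive list of errors.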

Related Files

Edited by Lucas Lamparter