
Soft vs. Strict Validation (Open vs. Closed world policy)

For TF-DP and for long-term purposes alike, we need to decide how strictly we validate data. The decision might affect our "dumps" folder structure.

@j.broeder suggested going back to the once-envisioned model of "hubs_dumps" folders that feed into a "unified" folder once the data conforms to the schema (I agree).

  • the hubs_dumps folders allow for "soft" validation and hub-specific data cleaning
  • the unified folder allows for strict schema validation (necessary for interoperability between versions and systems)

The reason for a softer validation pipeline (e.g. inside the hubs_dumps folders) is that most current files deviate from the current schema for various reasons (typos, custom properties, schemas that have changed in the meantime, etc.).
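
To make the distinction concrete, here is a minimal sketch of how one validator could serve both folders. Everything in it is an assumption for illustration: the toy `SCHEMA`, the `REQUIRED` set, the `validate` function and the `Report` class are hypothetical and are not our actual schema or tooling. It also assumes that "soft" means unknown properties are only reported as warnings (open world), while "strict" treats them as errors (closed world).

```python
from dataclasses import dataclass, field

# Hypothetical, simplified schema: property name -> expected Python type.
SCHEMA = {"name": str, "identifier": str, "keywords": list}
REQUIRED = {"name", "identifier"}


@dataclass
class Report:
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # A record passes if it produced no errors; warnings do not block it.
        return not self.errors


def validate(record: dict, strict: bool) -> Report:
    """Validate a record against SCHEMA in soft or strict mode."""
    report = Report()
    # Missing required properties are errors in both modes (an assumption).
    for key in sorted(REQUIRED - record.keys()):
        report.errors.append(f"missing required property: {key}")
    for key, value in record.items():
        if key not in SCHEMA:
            # Closed world (strict): unknown properties are errors.
            # Open world (soft): unknown properties are only warnings.
            target = report.errors if strict else report.warnings
            target.append(f"unknown property: {key}")
        elif not isinstance(value, SCHEMA[key]):
            expected = SCHEMA[key].__name__
            report.errors.append(f"wrong type for {key}: expected {expected}")
    return report


if __name__ == "__main__":
    record = {"name": "Example", "identifier": "id-1", "custom_note": "hub-specific"}
    print("soft:", validate(record, strict=False))
    print("strict:", validate(record, strict=True))
```

Under these assumptions, a file with a custom property could stay in a hubs_dumps folder with a warning attached, and would only be moved to the unified folder once `validate(record, strict=True).ok` holds.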