How to deal with faulty/incomplete data? indexer, uplifting, ...
Sometimes the data we receive is valid schema.org but contains entries that are clearly wrong and cause problems. For example, an entry that has only a type, like:
'licence': {@type: 'Dataset'}
or
'citation': [{@type: 'CreativeWork'}]
This happens because schema.org has no required keys for an entry, so a type-only object is valid even though the type alone makes no sense.
This causes all sorts of problems down the line.
Solution:
- During uplifting these entries should be deleted.
- During indexing we might want to ignore these.
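
A minimal sketch of what the deletion step during uplifting could look like, assuming the records are available as plain Python dicts and lists parsed from JSON-LD. The function names `is_type_only` and `prune_type_only` are hypothetical and not part of any existing uplifting or indexing code.

```python
from typing import Any


def is_type_only(node: Any) -> bool:
    """True for a dict whose only key is '@type', e.g. {'@type': 'Dataset'}."""
    return isinstance(node, dict) and set(node.keys()) == {"@type"}


def prune_type_only(node: Any) -> Any:
    """Return a copy of `node` with type-only entries removed (hypothetical helper).

    The top-level node itself is never dropped; only its descendants are checked.
    """
    if isinstance(node, dict):
        cleaned = {}
        for key, value in node.items():
            pruned = prune_type_only(value)
            if is_type_only(pruned):
                continue  # e.g. 'licence': {'@type': 'Dataset'}
            if isinstance(value, list) and value and not pruned:
                continue  # e.g. 'citation': [{'@type': 'CreativeWork'}]
            cleaned[key] = pruned
        return cleaned
    if isinstance(node, list):
        items = [prune_type_only(item) for item in node]
        return [item for item in items if not is_type_only(item)]
    return node


record = {
    "@type": "Dataset",
    "name": "Example",
    "licence": {"@type": "Dataset"},
    "citation": [{"@type": "CreativeWork"}],
}
cleaned = prune_type_only(record)
```

Note that pruning cascades in this sketch: an entry whose only other content was itself type-only becomes type-only and is removed as well. Whether that is desired is an open question. For indexing, the same `is_type_only` check could be used to skip such entries instead of deleting them.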