Expose skip existing of harvesters through run method and pipeline

Currently, most of the harvesters have a skip-existing behaviour: they skip processing files that are already on disk. This is enabled by default, so that a rerun only reharvests the failures.

But there are cases where one wants to force an overwrite (e.g. because the data has changed). Currently, the files then have to be deleted by hand.

If skip_existing is exposed in run, one has to make sure that all harvesters accept this keyword argument.
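
A rough sketch of how this could look, to make the proposal concrete. All names here (BaseHarvester, harvest, out_dir, run_pipeline) are hypothetical stand-ins and not the project's actual classes; the only thing taken from this issue is the skip_existing flag and its default of True.

```python
from pathlib import Path


class BaseHarvester:
    """Hypothetical base class; the real harvesters may be structured differently."""

    def __init__(self, out_dir):
        self.out_dir = Path(out_dir)

    def harvest(self, name):
        # Stand-in for the real per-file harvesting work.
        return f"data for {name}"

    def run(self, names, skip_existing=True, **kwargs):
        """Harvest each name and return the names that were actually written.

        skip_existing=True keeps the current default: files already on disk
        are left alone, so a rerun only picks up earlier failures.
        Extra keyword arguments are accepted and ignored, so a pipeline can
        pass the same options to every harvester.
        """
        written = []
        for name in names:
            target = self.out_dir / f"{name}.txt"
            if skip_existing and target.exists():
                continue  # already harvested, nothing to do
            target.write_text(self.harvest(name))
            written.append(name)
        return written


def run_pipeline(harvesters, names, skip_existing=True):
    """Hypothetical pipeline entry point that forwards the flag to every harvester."""
    return [h.run(names, skip_existing=skip_existing) for h in harvesters]
```

With something like this, run_pipeline(harvesters, names, skip_existing=False) would overwrite existing files instead of requiring them to be deleted by hand. Putting the argument on a shared base class (or accepting **kwargs in each run) is one way to make sure every harvester can take it.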