    Add possibility to skip harvesting · e68d4a9b
    Paul Millar authored
    Motivation:
    
    Some OAI-PMH endpoints are broken; moreover, they're broken in such a
    way that harvesting from them wastes a lot of time without producing
    useful information.
    
    The specific example is the ISIS endpoint, which is both very slow (~10
    seconds per request) and, after ~9 hours of harvesting, returns a
    resumptionToken that results in failures in a subsequent ListIdentifiers
    request.
    
    The ESS endpoint is also broken.  While also annoying, the impact is
    less because of special handling when a server hasn't provided useful
    information.
    
    The goal is to allow selective disabling of harvesting while continuing
    to update high-level OAI-PMH information based on the Identify request.
    
    Modification:
    
    Add a `skip-harvesting` boolean option.  If set to true, harvesting is
    skipped for this endpoint.
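
The behaviour described above could be sketched roughly as follows. This is an illustration only, not the code from this commit: the `Endpoint` class, the `options` mapping, and the `identify`/`harvest` callables are all hypothetical names, and the option is assumed to arrive as a string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Endpoint:
    """Hypothetical description of one configured OAI-PMH endpoint."""
    name: str
    url: str
    options: Dict[str, str] = field(default_factory=dict)


def skip_harvesting(endpoint: Endpoint) -> bool:
    """Return True only if `skip-harvesting` is present and set to true."""
    value = endpoint.options.get("skip-harvesting", "false")
    return value.strip().lower() == "true"


def update_endpoint(endpoint: Endpoint,
                    identify: Callable[[Endpoint], None],
                    harvest: Callable[[Endpoint], None]) -> List[str]:
    """Refresh one endpoint and return the steps that were carried out."""
    steps: List[str] = []

    # The Identify request is always made, so high-level
    # information stays up to date even for broken endpoints.
    identify(endpoint)
    steps.append("identify")

    # Harvesting is the slow part; skip it when asked to.
    if skip_harvesting(endpoint):
        steps.append("harvest-skipped")
    else:
        harvest(endpoint)
        steps.append("harvest")
    return steps
```

With this shape, an endpoint without the option behaves exactly as before, so only the endpoints known to be broken need to be marked.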
    
    Result:
    
    It's now possible to update all endpoints without spending a very long
    and fruitless time harvesting from broken endpoints.