  1. Dec 31, 2024
    • Add possibility to skip harvesting · e68d4a9b
      Paul Millar authored
      Motivation:
      
      Some OAI-PMH endpoints are broken; moreover, they're broken in such a
      way that harvesting from them wastes a lot of time without producing
      useful information.
      
      The specific example is the ISIS endpoint, which is both very slow (~10
      seconds per request) and, after ~9 hours of harvesting, returns a
      resumptionToken that causes the subsequent ListIdentifiers request to
      fail.
      
      The ESS endpoint is also broken.  While also annoying, the impact is
      less because of special handling when a server hasn't provided useful
      information.
      
      The goal is to allow selective disabling of harvesting while continuing
      to update high-level OAI-PMH information based on the Identify call.
      
      Modification:
      
      Add a `skip-harvesting` boolean option.  If set with the value true then
      harvesting is skipped for this endpoint.
      
      Result:
      
      It's possible to update all endpoints without a very long and fruitless
      time spent harvesting from broken endpoints.
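A minimal sketch of how such an option might be interpreted, assuming each endpoint is described by a dictionary of options. The function name `plan_update` and the dictionary layout are hypothetical, and the script's actual language and structure are not shown in this log; only the `skip-harvesting` option name and the Identify and ListIdentifiers requests come from the commit message.

```python
def plan_update(endpoint: dict) -> list:
    """Return the OAI-PMH requests to issue for one endpoint (hypothetical helper)."""
    # High-level information is always refreshed via Identify.
    steps = ["Identify"]
    # Harvesting is skipped only when the option is set to the boolean true;
    # a missing option or any other value leaves harvesting enabled.
    if endpoint.get("skip-harvesting") is not True:
        steps.append("ListIdentifiers")
    return steps
```

With this shape, an endpoint that omits the option behaves exactly as before, so only known-broken endpoints need to be annotated.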
    • Update facility data OAI-PMH metadata to record information as items · ca76d30d
      Paul Millar authored
      Motivation:
      
      The current OAI-PMH information is recorded as `datasets`.  However,
      this assumes that the items underlying the harvested OAI-PMH records
      correspond to datasets.  This is not guaranteed, and there are
      counter-examples.
      
      OAI-PMH describes three concepts: resource, item and record.  The
      OAI-PMH responses provide records (descriptive metadata of some item) or
      identifiers thereof.  However, since OAI-PMH requires repositories to
      support Dublin Core records, it seems a reasonable assumption for there
      to be a 1:1 relationship between each Dublin Core record and some
      corresponding item.
      
      Therefore, we can use the metadata-agnostic `item` concept when
      describing the information about the endpoint.
      
      Modification:
      
      Update script to record information under `items` node in the facilities
      YAML file.
      
      Update the Jekyll site to consume this information when rendering the
      corresponding HTML.
      
      Result:
      
      No observable change, but the facilities YAML file now uses the more
      neutral 'items' instead of 'datasets'.
    • update_oai-pmh robust against metadata prefix lookup failures · 2a6ce96e
      Paul Millar authored
      Motivation:
      
      The ListMetadataFormats call can fail.  Currently, this causes the entire
      script to fail.
      
      Modification:
      
      Catch the exception and report a failure.
      
      Result:
      
      A metadata prefix lookup failure is now limited to a single
      OAI-PMH endpoint.
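The containment described above could look roughly like the following sketch. It assumes the lookup is a callable that raises on failure; `update_all`, `lookup_prefixes` and the result layout are hypothetical names chosen for illustration, not taken from the script.

```python
def update_all(endpoints, lookup_prefixes):
    """Look up metadata prefixes per endpoint, isolating failures (hypothetical helper)."""
    results = {}
    for name in endpoints:
        try:
            results[name] = {"ok": True, "prefixes": lookup_prefixes(name)}
        except Exception as exc:
            # Report the failure for this endpoint and carry on with the rest.
            results[name] = {"ok": False, "error": str(exc)}
    return results
```

Catching the exception inside the loop, rather than around it, is what keeps one failing endpoint from aborting the whole run.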
    • update_oai-pmh Add HTTP request timing statistics · f2fef214
      Paul Millar authored
      Motivation:
      
      Different OAI-PMH endpoints provide different performance
      characteristics.  It would be helpful to categorise them.
      
      Modification:
      
      Introduce a Stats class to capture statistics.
      
      Monkey-patch float to support printing numbers to a given number of
      significant figures.
      
      Produce request stats per round (40 requests) and overall as output.
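As an illustration only, a statistics collector along these lines might be sketched as below. The method names and the summary fields are assumptions; only the `Stats` class name and the round size of 40 come from the commit message. Python's built-in `float` cannot have methods added to it, so this sketch swaps the monkey-patch for a plain helper function, `to_sig_figs`, which is also a hypothetical name.

```python
import math


def to_sig_figs(value: float, digits: int = 3) -> str:
    """Format value rounded to the given number of significant figures."""
    if value == 0:
        return "0"
    exponent = math.floor(math.log10(abs(value)))
    places = digits - 1 - exponent  # decimal places to keep; may be negative
    return f"{round(value, places):.{max(places, 0)}f}"


class Stats:
    """Collects request durations and summarises them per round and overall."""

    def __init__(self, round_size: int = 40):
        self.round_size = round_size
        self.durations = []

    def record(self, seconds: float) -> bool:
        """Store one duration; return True when this completes a round."""
        self.durations.append(seconds)
        return len(self.durations) % self.round_size == 0

    def last_round(self) -> dict:
        """Summarise the most recent round; meant to be called when record() returns True."""
        return self._summarise(self.durations[-self.round_size:])

    def overall(self) -> dict:
        """Summarise every duration recorded so far."""
        return self._summarise(self.durations)

    @staticmethod
    def _summarise(values) -> dict:
        if not values:
            return {"count": 0}
        return {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
```

A caller would time each HTTP request, pass the elapsed seconds to `record`, print `last_round()` whenever it returns true, and print `overall()` at the end.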
    • update_oai-pmh: make code DRY-er · 15618eaf
      Paul Millar authored
    • update_oai-pmh record adminEmail address · 6039af5d
      Paul Millar authored
      Motivation:
      
      OAI-PMH provides admin contact details as email addresses.  This
      information could prove useful.  One such situation is when the OAI-PMH
      endpoint is not working.  When this happens, the admin contact details
      are no longer available from the endpoint, so caching the values would
      prove useful.
      
      Modification:
      
      Update code to capture the admin contact information and record it
      against the facility-specific information.
      
      If, when updating the OAI-PMH details, the admin contact details are
      discovered (from the OAI-PMH endpoint) then any existing contact details
      are replaced with the discovered information; otherwise, any existing
      admin contact details are left unmodified.
      
      Result:
      
      We collect and cache OAI-PMH admin contact details in the facility
      metadata.
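The replace-or-keep rule described in the Modification section can be sketched as follows. The function name `merge_admin_emails` and the use of an `adminEmail` key in the facility record are assumptions made for illustration; the real layout of the facility metadata is not shown in this log.

```python
from typing import Optional


def merge_admin_emails(facility: dict, discovered: Optional[list]) -> dict:
    """Return a copy of the facility record with admin contacts updated (hypothetical helper)."""
    updated = dict(facility)
    if discovered:
        # Freshly discovered contact details replace whatever was cached.
        updated["adminEmail"] = list(discovered)
    # Otherwise any previously cached details are left untouched.
    return updated
```

Leaving the cached value in place when nothing is discovered is what makes the contact details available while an endpoint is down.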
  2. Dec 29, 2024
  3. Dec 26, 2024
  4. Dec 22, 2024
  5. Dec 21, 2024
  6. Dec 20, 2024
  7. Dec 12, 2024
  8. Dec 11, 2024
  9. Dec 09, 2024
    • Fix broken link to PaN search API · 46483c99
      Paul Millar authored
      Commit fe68e8d5 fixed the script for checking whether the PaN search
      API is working.  Unfortunately it failed to fix the broken links on the
      webpage.
      
      This patch fixes those links on the webpage.
  10. Dec 06, 2024
  11. Dec 05, 2024
  12. Dec 04, 2024