  1. Jan 02, 2025
    • api-pmh Add connection caching · 6f6b4c75
      Paul Millar authored
      Motivation:
      
      Harvesting from the OAI-PMH endpoints requires the client to make a
      large number of requests, each returning a relatively small amount of
      data.
      
      When requests are processed by the OAI-PMH server quickly, the overhead
      for establishing the TCP and TLS connections can be very significant.
      
      Connection caching (sometimes called HTTP Keep Alive) involves sending
      multiple HTTP requests over a single TCP connection, allowing us to
      ameliorate the connection overhead by (effectively) spreading the cost
      over all OAI-PMH requests.
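
      As a rough illustration of the idea (not the change made in this
      commit, which uses the `persistent_httparty` adapter), the sketch
      below uses Ruby's standard `Net::HTTP` to send several requests over
      one connection.  The helper name `fetch_all` and its parameters are
      hypothetical, chosen only for this example.

      ```ruby
      require 'net/http'

      # Hypothetical helper: fetch each path from one host over a single
      # persistent connection.  Net::HTTP.start with a block opens the TCP
      # (and, if use_ssl is true, TLS) connection once, keeps it open for
      # the requests made inside the block (provided the server allows
      # keep-alive), and closes it when the block returns.
      def fetch_all(host, port, paths, use_ssl: false)
        Net::HTTP.start(host, port, use_ssl: use_ssl) do |http|
          paths.map { |path| http.get(path).body }
        end
      end
      ```

      With this shape, the connection set-up cost is paid once per call to
      `fetch_all` rather than once per path.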
      
      Modification:
      
      Update the client to use the `persistent_http` connection pool, via
      the `persistent_httparty` adapter.
      
      A bug was discovered, where the host entity is cached between successive
      requests.
      
      Result:
      
      OAI-PMH requests are now faster.  The observed time savings per
      request are (0.12 +/- 0.02) s, (0.16 +/- 0.01) s and (0.16 +/- 0.03) s
      for ESRF, HZB and HZDR respectively (measured with ListIdentifiers
      requests for Dublin Core, following the resumptionToken).
      
      The overall impact of this improvement depends on how long the OAI-PMH
      endpoint takes to process a request.  For the above endpoints, the
      percentage improvements (per request) are 12%, 42% and 70% respectively.
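
      To make the relationship between the absolute and percentage figures
      concrete, the sketch below back-calculates the per-request time
      before caching.  It assumes the percentage is the saving divided by
      the pre-caching request time, which is not stated explicitly above,
      and it ignores the quoted uncertainties.  The names `OBSERVED` and
      `implied_baseline` are hypothetical.

      ```ruby
      # Central values quoted above: saving in seconds, improvement in percent.
      OBSERVED = {
        'ESRF' => { saving: 0.12, percent: 12 },
        'HZB'  => { saving: 0.16, percent: 42 },
        'HZDR' => { saving: 0.16, percent: 70 }
      }.freeze

      # Assumption: percent = 100 * saving / time_before_caching,
      # so time_before_caching = saving / (percent / 100).
      def implied_baseline(saving, percent)
        saving / (percent / 100.0)
      end

      OBSERVED.each do |site, m|
        baseline = implied_baseline(m[:saving], m[:percent])
        puts format('%-5s ~%.2f s per request before caching', site, baseline)
      end
      ```

      Under that assumption, the slower an endpoint is to process a
      request, the smaller the relative gain from the same absolute saving.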
  2. Feb 19, 2024
  3. Jul 05, 2023
  4. Apr 17, 2023