Commit 6f6b4c75 authored by Paul Millar

api-pmh Add connection caching

Motivation:

The OAI-PMH endpoints being queried require the client to make a large
number of requests, each returning a relatively small amount of data.

When requests are processed by the OAI-PMH server quickly, the overhead
for establishing the TCP and TLS connections can be very significant.

Connection caching (sometimes called HTTP Keep Alive) involves sending
multiple HTTP requests over a single TCP connection, allowing us to
ameliorate the connection overhead by (effectively) spreading the cost
over all OAI-PMH requests.
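
The effect of reusing a connection can be sketched with Ruby's standard
library alone.  This is only an illustration, not the code in this
commit: `CountingServer`, `fetch_without_reuse` and `fetch_with_reuse`
are hypothetical names, and a throwaway local server stands in for an
OAI-PMH endpoint so that the number of TCP connections can be counted.

```ruby
require 'net/http'
require 'socket'

# Hypothetical stand-in for an endpoint: a minimal local HTTP/1.1 server
# that counts how many TCP connections it has accepted.
class CountingServer
  attr_reader :connections, :port

  def initialize
    @server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS choose
    @port = @server.addr[1]
    @connections = 0
    @thread = Thread.new { accept_loop }
  end

  def stop
    @thread.kill
    @server.close
  end

  private

  def accept_loop
    loop do
      client = @server.accept
      @connections += 1
      Thread.new(client) { |sock| serve(sock) }
    end
  end

  # Answer every request on this socket until the client closes it.
  def serve(sock)
    loop do
      break if sock.gets.nil? # request line; nil means the client hung up
      loop do                 # skip request headers up to the blank line
        line = sock.gets
        break if line.nil? || line == "\r\n"
      end
      body = 'ok'
      sock.write("HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n" \
                 "Connection: keep-alive\r\n\r\n#{body}")
    end
  rescue IOError, SystemCallError
    # The client went away mid-request; nothing to clean up beyond the close.
  ensure
    sock.close
  end
end

# The third argument (nil) bypasses any proxy configured in the environment.
def new_client(port)
  Net::HTTP.new('127.0.0.1', port, nil)
end

# One TCP connection per request: Net::HTTP opens and closes it each time.
def fetch_without_reuse(port, count)
  count.times { new_client(port).get('/') }
end

# A single TCP connection shared by all requests inside the block.
def fetch_with_reuse(port, count)
  new_client(port).start { |http| count.times { http.get('/') } }
end

if __FILE__ == $PROGRAM_NAME
  server = CountingServer.new
  fetch_without_reuse(server.port, 5)
  without = server.connections
  fetch_with_reuse(server.port, 5)
  puts "without reuse: #{without} connections"
  puts "with reuse:    #{server.connections - without} connections"
  server.stop
end
```

With TLS the per-connection cost is higher still, since every new
connection also pays for a handshake.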

Modification:

Update client to use `persistent_http` connection pool, via the
`persistent_httparty` adapter.

A bug was discovered in which the host entity is cached between
successive requests.

Result:

OAI-PMH requests are now faster.  The observed time savings per request
are (0.12 +/- 0.02) s, (0.16 +/- 0.01) s and (0.16 +/- 0.03) s for ESRF,
HZB and HZDR respectively (measured with ListIdentifiers requests on
Dublin Core, following the resumptionToken).

The overall impact of this improvement depends on how long the OAI-PMH
endpoint takes to process a request.  For the above endpoints, the
percentage improvements (per request) are 12%, 42% and 70% respectively.
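
The two sets of figures can be combined to estimate the per-request
times themselves.  The sketch below assumes that each percentage is the
time saving divided by the request time before the change; that
definition is an assumption, and `implied_times` is a hypothetical
helper.  Uncertainties are ignored, so the results are rough.

```ruby
# Central values quoted above: time saving per request (seconds) and
# fractional improvement per request.
MEASUREMENTS = {
  'ESRF' => { saving_s: 0.12, improvement: 0.12 },
  'HZB'  => { saving_s: 0.16, improvement: 0.42 },
  'HZDR' => { saving_s: 0.16, improvement: 0.70 }
}.freeze

# Assumption: improvement = saving / time_before, so
# time_before = saving / improvement and time_after = time_before - saving.
def implied_times(saving_s, improvement)
  before = saving_s / improvement
  { before_s: before, after_s: before - saving_s }
end

MEASUREMENTS.each do |site, m|
  times = implied_times(m[:saving_s], m[:improvement])
  puts format('%-5s before: %.2f s  after: %.2f s',
              site, times[:before_s], times[:after_s])
end
```

Under that assumption the requests took roughly 1.0 s, 0.38 s and
0.23 s before the change for ESRF, HZB and HZDR respectively, which is
why a similar absolute saving gives very different percentages.
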
parent aee25f9f
1 merge request: !66 oai-pmh: add custom HTTParty client
Pipeline #477132 passed