Gemfile · 55d155cb1b940eb20c74fadcab059c8cb8ca1f85 · LEAPS WG3 / Webpage · GitLab

api-pmh Add connection caching · 6f6b4c75

Paul Millar authored 2 months ago

Motivation:

The OAI-PMH endpoints being queried require the client to make a
large number of requests, each returning a relatively small amount of
data.

When requests are processed by the OAI-PMH server quickly, the overhead
for establishing the TCP and TLS connections can be very significant.

Connection caching (sometimes called HTTP Keep Alive) involves sending
multiple HTTP requests over a single TCP connection, allowing us to
ameliorate the connection overhead by (effectively) spreading the cost
over all OAI-PMH requests.
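
The effect described above can be sketched with Ruby's standard
library alone. This is only an illustration, not the client code from
this commit: it uses `Net::HTTP` rather than the `persistent_http`
pool, and the helper names (`start_counting_server`,
`fetch_without_reuse`, `fetch_with_reuse`) and the tiny local server
are hypothetical, introduced here so the number of TCP connections can
be counted.

```ruby
require 'net/http'
require 'socket'

HOST = '127.0.0.1'

# Hypothetical helper: a minimal local HTTP/1.1 server that counts how
# many TCP connections it accepts. It stands in for an OAI-PMH endpoint.
def start_counting_server
  server = TCPServer.new(HOST, 0) # port 0: let the OS pick a free port
  counter = { connections: 0 }
  Thread.new do
    loop do
      client = server.accept
      counter[:connections] += 1
      Thread.new(client) do |sock|
        begin
          # Serve requests on this socket until the client closes it.
          while sock.gets # request line
            loop do       # skip the request headers
              line = sock.gets
              break if line.nil? || line == "\r\n"
            end
            body = 'ok'
            sock.write("HTTP/1.1 200 OK\r\n" \
                       "Content-Length: #{body.bytesize}\r\n" \
                       "Connection: keep-alive\r\n\r\n#{body}")
          end
        rescue IOError, SystemCallError
          # The client went away; nothing more to do for this socket.
        ensure
          sock.close
        end
      end
    end
  end
  [server.addr[1], counter]
end

# One TCP connection per request: the connection setup cost is paid
# every time. The explicit nil disables any proxy taken from ENV.
def fetch_without_reuse(port, count)
  Array.new(count) do
    Net::HTTP.start(HOST, port, nil) { |http| http.get('/').body }
  end
end

# One TCP connection shared by all requests (HTTP keep-alive): the
# connection setup cost is paid once and spread over all requests.
def fetch_with_reuse(port, count)
  Net::HTTP.start(HOST, port, nil) do |http|
    Array.new(count) { http.get('/').body }
  end
end
```

Calling `fetch_without_reuse(port, 3)` against the counting server
opens three connections, whereas `fetch_with_reuse(port, 3)` opens
one. With TLS the per-connection cost is higher still, since each new
connection also needs a handshake.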

Modification:

Update client to use `persistent_http` connection pool, via the
`persistent_httparty` adapter.

A bug was discovered in which the host entity is cached between
successive requests.

Result:

OAI-PMH requests are now faster.  Some observed speedups per request are
(0.12 +/- 0.02) s, (0.16 +/- 0.01) s and (0.16 +/- 0.03) s for ESRF, HZB
and HZDR respectively (measured with ListIdentifiers request on Dublin
Core, following the resumptionToken).

The overall impact of this improvement depends on how long the OAI-PMH
endpoint takes to process a request.  For the above endpoints, the
percentage improvements (per request) are 12%, 42% and 70% respectively.
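
As a rough cross-check, the absolute and percentage figures can be
combined to estimate the per-request time before and after the change.
The sketch below assumes that each percentage is the speedup divided
by the original per-request time. That definition is an assumption,
not something stated in the measurements, and the quoted uncertainties
are ignored, so the results are only indicative. The names `OBSERVED`
and `implied_timings` are hypothetical.

```ruby
# Figures quoted in the Result section: absolute speedup per request
# (seconds) and relative improvement per request (as a fraction).
OBSERVED = {
  'ESRF' => { speedup_s: 0.12, improvement: 0.12 },
  'HZB'  => { speedup_s: 0.16, improvement: 0.42 },
  'HZDR' => { speedup_s: 0.16, improvement: 0.70 }
}.freeze

# ASSUMPTION: improvement = speedup / time_before, hence
# time_before = speedup / improvement and
# time_after = time_before - speedup.
def implied_timings(speedup_s, improvement)
  before = speedup_s / improvement
  { before_s: before, after_s: before - speedup_s }
end

OBSERVED.each do |site, obs|
  t = implied_timings(obs[:speedup_s], obs[:improvement])
  puts format('%-5s before ~%.2f s, after ~%.2f s',
              site, t[:before_s], t[:after_s])
end
```

Under that assumption a similar absolute saving is a small share of a
request to a slower endpoint and a large share of a request to a
faster one, which is consistent with the spread from 12% to 70%.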

This project manages its dependencies using Bundler.