Commit 6039af5d authored by Paul Millar

update_oai-pmh record adminEmail address

Motivation:

OAI-PMH provides admin contact details as email addresses. This
information could prove useful, for example when the OAI-PMH endpoint
is not working. When this happens, the admin contact details are no
longer available from the endpoint, so caching the values would prove
useful.

Modification:

Update code to capture the admin contact information and record it
against the facility-specific information.
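
As a rough illustration of the capture step, the sketch below pulls every
adminEmail value out of an Identify response. It is not the code in this
commit: the commit uses Nokogiri, whereas this sketch assumes REXML so
that it runs without extra gems. The helper name extract_admin_emails and
the example.org addresses are hypothetical.

```ruby
require 'rexml/document'

# Hypothetical helper: collect the text of every adminEmail element in an
# OAI-PMH Identify response body. Element#name is the local name, so the
# default OAI-PMH namespace needs no special handling here.
def extract_admin_emails(xml_body)
  doc = REXML::Document.new(xml_body)
  emails = []
  doc.each_recursive do |element|
    next unless element.name == 'adminEmail'
    address = element.text.to_s.strip
    emails << address unless address.empty?
  end
  emails
end

# Made-up Identify response with two admin contacts.
identify_xml = <<~XML
  <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
    <Identify>
      <repositoryName>Example repository</repositoryName>
      <adminEmail>first@example.org</adminEmail>
      <adminEmail>second@example.org</adminEmail>
    </Identify>
  </OAI-PMH>
XML

addresses = extract_admin_emails(identify_xml)
```

An Identify response may carry more than one adminEmail element, which is
why the result is a list rather than a single string.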

If, when updating the OAI-PMH details, the admin contact details are
discovered (from the OAI-PMH endpoint) then any existing contact details
are replaced with the discovered information; otherwise, any existing
admin contact details are left unmodified.
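
The replace-or-keep rule above can be sketched as follows. This is a
minimal illustration that assumes the facility's OAI-PMH record is a plain
Hash; the helper name apply_admin_addresses and the example.org addresses
are hypothetical and do not appear in the commit.

```ruby
# Hypothetical helper: apply the replace-or-keep rule to one facility's
# OAI-PMH record (a Hash, as loaded from the facility metadata).
def apply_admin_addresses(oai_pmh, discovered)
  # Only overwrite the cached value when the endpoint reported addresses.
  oai_pmh['adminAddress'] = discovered unless discovered.empty?
  oai_pmh
end

record = { 'status' => 'Error', 'adminAddress' => ['cached@example.org'] }

# Endpoint unavailable: nothing was discovered, so the cached value survives.
apply_admin_addresses(record, [])
kept = record['adminAddress'].dup

# Endpoint available: the discovered addresses replace the cached ones.
apply_admin_addresses(record, ['new@example.org'])
replaced = record['adminAddress'].dup
```

Keeping the old value on an empty result is what makes the cache useful
when the endpoint is down.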

Result:

We collect and cache OAI-PMH admin contact details in the facility
metadata.
parent 5c7f3cb6
Branches joss
1 merge request: !64 update_oai-pmh Add a back-off strategy, to allow a service to recover
@@ -29,6 +29,8 @@
 link: https://data.cells.es/iws/icat_plus/oaipmh/request
 last-check: 2024-12-22
 status: Active
+adminAddress:
+- mis@cells.es
 datasets:
 count: 20
 sets:
@@ -148,6 +150,8 @@
 link: https://icatplus.esrf.fr/oaipmh/request
 last-check: 2024-12-22
 status: Active
+adminAddress:
+- datapolicy@esrf.fr
 datasets:
 count: 7202
 pan-search-api:
@@ -177,6 +181,8 @@
 status: Active
 datasets:
 count: 0
+adminAddress:
+- max.novelli@ess.eu
 pan-search-api:
 link: https://search.panosc.ess.eu/api
 status: Error
@@ -207,6 +213,9 @@
 link: https://in.xfel.eu/metadata/oai-pmh/oai2
 last-check: 2024-12-22
 status: Active
+adminAddress:
+- luis.maia@xfel.eu
+- krzysztof.wrona@xfel.eu
 datasets:
 count: 5
 sets:
@@ -246,6 +255,8 @@
 link: https://data.helmholtz-berlin.de/oaipmh/request
 last-check: 2024-12-22
 status: Active
+adminAddress:
+- icatmaster@helmholtz-berlin.de
 datasets:
 count: 28952
 sets:
@@ -286,6 +297,8 @@
 link: https://rodare.hzdr.de/oai2d
 status: Active
 last-check: 2024-12-22
+adminAddress:
+- rodare-admin@hzdr.de
 datasets:
 count: 1020
 sets:
@@ -438,6 +451,8 @@
 link: https://fairdata.ill.fr/openaire/oai
 status: Active
 last-check: 2024-12-22
+adminAddress:
+- data@ill.fr
 datasets:
 count: 0
 pan-search-api:
@@ -465,6 +480,8 @@
 link: https://icatisis-prod.esc.rl.ac.uk/oaipmh/request
 status: Error
 last-check: 2024-12-22
+adminAddress:
+- isisdata@stfc.ac.uk
 pan-search-api:
 link: https://data.isis.stfc.ac.uk/datagateway-api/search-api/
 status: Active
@@ -522,6 +539,8 @@
 link: https://doi.psi.ch/oaipmh/oai
 status: Active
 last-check: 2024-12-22
+adminAddress:
+- carlo.minotti@psi.ch
 datasets:
 count: 0
 pan-search-api:
......
@@ -17,26 +17,33 @@ def check_oai_pmh_endpoint(endpoint_url)
     if !response.success?
       print_with_time("Error: Identify response has HTTP status code #{response.code}.")
-      return "Error"
+      return "Error", []
     end
     response = HTTParty.get(queryIdentify_url)
     if response.body.nil? || response.body.empty?
       print_with_time("Error: Identify response is empty.")
-      return "Error"
+      return "Error", []
     end
     xml_response = Nokogiri::XML(response.body)
     oai_pmh_tag = xml_response.at_xpath('//*[name()="OAI-PMH"]')
     if !oai_pmh_tag
       print_with_time("Error: Identify response has no OAI-PMH tag.")
-      return "Error"
+      return "Error", []
     end
-    return "Active"
+    addresses = []
+    adminEmails = xml_response.xpath('/xmlns:OAI-PMH/xmlns:Identify/xmlns:adminEmail')
+    adminEmails.each do |adminEmail|
+      address = adminEmail.content
+      addresses.append(address)
+    end
+    return "Active", addresses
   rescue StandardError => e
     print_with_time("Error: Identify request failed: #{e.message}")
-    return "Error"
+    return "Error", []
   end
 end
@@ -229,9 +236,9 @@ end
 end
 def query_oai_pmh_endpoint(endpoint)
-  status = check_oai_pmh_endpoint(endpoint)
+  status, adminAddress = check_oai_pmh_endpoint(endpoint)
   if status == "Error"
-    return status, {}, 0, {}
+    return status, [], {}, 0, {}
   end
   dc_prefix = metadata_prefix_of(endpoint, 'http://www.openarchives.org/OAI/2.0/oai_dc/')
@@ -246,7 +253,7 @@ def query_oai_pmh_endpoint(endpoint)
     set_counts = {}
   end
-  return status, set_names, total_count, set_counts
+  return status, adminAddress, set_names, total_count, set_counts
 end
@@ -268,10 +275,14 @@ facilities.each do |facility|
   name = facility['short-name']
   puts "Checking OAI-PMH endpoint for #{name}: #{oai_pmh_endpoint}"
-  status, set_names, total_count, set_count = query_oai_pmh_endpoint(oai_pmh_endpoint)
+  status, adminAddress, set_names, total_count, set_count = query_oai_pmh_endpoint(oai_pmh_endpoint)
   oai_pmh['status'] = status
+  if !adminAddress.empty?
+    oai_pmh['adminAddress'] = adminAddress
+  end
   oai_pmh.delete('datasets')
   if status == "Active"
......