⚙️  Backend Issue — Extend /sample/create_with_file_upload to Support Additional File Formats
🗣️  User Feedback
"XML may not be the best format to use going forward. Other formats (e.g., JSON) can provide more accessible and complete information from additional repositories."
🎯  Goal
Enhance the backend endpoint
POST /sample/create_with_file_upload
to support additional file formats beyond XML, using the existing Connexion + OpenAPI setup.
Currently, it handles only DataCite XML uploads. The goal is to extend it to accept and process:
- JSON
- CSV
- Excel (.xlsx)
- YAML
- TOML
 
📄  Endpoint Specification (Updated)
/sample/create_with_file_upload:
  post:
    operationId: "sample.create_with_upload_file"
    tags:
      - Sample
    summary: "Create a sample record in SEPIA by file upload"
    parameters:
      - $ref: "#/components/parameters/session_id_qp"
    requestBody:
      x-body-name: data
      description: "Sample to create by file upload (XML, JSON, CSV, Excel, YAML, or TOML)"
      required: true
      content:
        multipart/form-data:
          schema:
            $ref: "#/components/schemas/Sample_with_file_upload"
    responses:
      '201':
        description: Sample created successfully in SEPIA by file upload
      '400':
        description: Bad request — invalid or unsupported file format
      '403':
        description: Invalid credentials or insufficient access
      '500':
        description: Server-side error (please inform the administrator)
🧩  Implementation Details
Tasks
- Update the existing /sample/create_with_file_upload endpoint (replacing /sample/create_with_xml_upload).
- Update the OpenAPI spec to include the new accepted formats in the request body description.
- Implement file type detection (by extension or MIME type).
- Implement parsing and validation logic for each supported format.
- Map parsed data to the existing internal Sample schema.
- Return consistent error messages for unsupported or malformed input files.
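
The detection and error-handling tasks above could be sketched roughly as follows. This is a minimal illustration, not the existing SEPIA code: the names `detect_format`, `UnsupportedFormatError`, and `error_response`, the extension and MIME mappings, and the shape of the error body are all assumptions. The extension is checked first and the client-supplied MIME type is used only as a fallback, since browsers often send a generic type for uploads.

```python
from pathlib import Path

# Hypothetical mappings: the exact extensions and MIME types to accept
# are assumptions and should be agreed on before implementation.
EXTENSION_FORMATS = {
    ".xml": "xml",
    ".json": "json",
    ".csv": "csv",
    ".xlsx": "excel",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".toml": "toml",
}

MIME_FORMATS = {
    "application/xml": "xml",
    "text/xml": "xml",
    "application/json": "json",
    "text/csv": "csv",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "excel",
}


class UnsupportedFormatError(ValueError):
    """Raised when an upload's format cannot be determined or is not supported."""


def detect_format(filename, mimetype=None):
    """Return a format key such as 'json', or raise UnsupportedFormatError."""
    # Prefer the file extension (case-insensitive).
    suffix = Path(filename or "").suffix.lower()
    if suffix in EXTENSION_FORMATS:
        return EXTENSION_FORMATS[suffix]
    # Fall back to the MIME type, ignoring parameters such as "; charset=utf-8".
    base_type = (mimetype or "").split(";")[0].strip().lower()
    if base_type in MIME_FORMATS:
        return MIME_FORMATS[base_type]
    supported = ", ".join(sorted(set(EXTENSION_FORMATS.values())))
    raise UnsupportedFormatError(
        f"Unsupported file format for '{filename}'. Supported formats: {supported}"
    )


def error_response(exc):
    """Build one consistent 400 body (hypothetical shape) for any format error."""
    return {"status": 400, "title": "Bad Request", "detail": str(exc)}, 400
```

A handler could call `detect_format(upload.filename, upload.mimetype)` and return `error_response(exc)` on failure, so every unsupported upload yields the same 400 structure described in the endpoint specification.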
Format Parsing Suggestions
| Format | Library | Notes | 
|---|---|---|
| XML | lxml | Already supported |
| JSON | json | Direct structure mapping |
| CSV | pandas.read_csv or csv | Convert to dicts |
| Excel (.xlsx) | openpyxl | Tabular data |
| YAML | PyYAML | Configuration-style input |
| TOML | tomllib (Python ≥3.11) / toml | Structured metadata |
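
One possible way to organize the parsers is a dispatch table keyed by format, so each rollout phase only registers new entries. The sketch below covers JSON and CSV using the standard library only. The names `parse_upload`, `PARSERS`, and `MalformedFileError` are hypothetical, the field names in the usage example are made up, and mapping the resulting dicts onto the internal Sample schema is left out.

```python
import csv
import io
import json


class MalformedFileError(ValueError):
    """Raised when an upload cannot be parsed in its declared format."""


def parse_json(raw):
    """Parse JSON bytes into a list of dicts (a single object becomes one record)."""
    data = json.loads(raw.decode("utf-8"))
    return data if isinstance(data, list) else [data]


def parse_csv(raw):
    """Parse CSV bytes into one dict per row, keyed by the header line."""
    # "utf-8-sig" drops a leading byte-order mark, which spreadsheet exports often add.
    text = raw.decode("utf-8-sig")
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [dict(row) for row in reader]


# Later phases (Excel, YAML, TOML) would add entries here.
PARSERS = {
    "json": parse_json,
    "csv": parse_csv,
}


def parse_upload(fmt, raw):
    """Dispatch to the parser for `fmt` and normalize all parse failures."""
    parser = PARSERS.get(fmt)
    if parser is None:
        raise MalformedFileError(f"No parser registered for format '{fmt}'")
    try:
        return parser(raw)
    except (UnicodeDecodeError, json.JSONDecodeError, csv.Error) as exc:
        raise MalformedFileError(f"Could not parse {fmt} upload: {exc}") from exc
```

For example, `parse_upload("csv", b"title,year\nA,2020\n")` yields one dict per data row with string values, which a later step would validate and convert before creating the Sample record.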
📅  Proposed Rollout Plan
| Phase | Format(s) | Notes | 
|---|---|---|
| Phase 1 | JSON | Highest user demand, easiest integration | 
| Phase 2 | CSV, Excel | Common tabular formats | 
| Phase 3 | YAML, TOML | Developer/config-friendly formats | 
🏷️  Labels
backend feature enhancement data-import priority::high
📘  References
- Frontend Issue: Frontend Issue — Support Additional File Formats for “Create Sample via File”
- Framework: Connexion (Python)
- API Spec: OpenAPI 3.x
 
Edited  by Mojeeb Rahman Sedeqi