Dataverse Publication

The curation system publishes curated catalog records and their associated files to a Dataverse repository via the Dataverse API. This process maps catalog record metadata to Dataverse dataset metadata blocks and uploads eligible files.

Configuration

The Dataverse publisher is configured with the following parameters:

  • Dataverse URL - Base URL of the Dataverse API instance

  • Dataverse collection name - Name of the dataverse collection where datasets are published

  • API token - Authentication token passed in the X-Dataverse-key HTTP header

  • Published data directory - Root directory path where published data files are stored

  • Debug directory - Optional directory for writing JSON payloads before sending to the API
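The parameters above could be grouped into a single settings object. The following is a minimal sketch, not the system's actual implementation: the class name, attribute names, and example values are hypothetical, and only the X-Dataverse-key header name comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataversePublisherConfig:
    """Hypothetical container for the publisher settings listed above."""

    dataverse_url: str               # base URL of the Dataverse API instance
    collection_name: str             # collection where datasets are published
    api_token: str                   # sent in the X-Dataverse-key HTTP header
    published_data_dir: str          # root directory of published data files
    debug_dir: Optional[str] = None  # optional directory for JSON payload dumps

    def auth_headers(self) -> dict:
        # Header name taken from the configuration description.
        return {"X-Dataverse-key": self.api_token}


# Placeholder values for illustration only.
config = DataversePublisherConfig(
    dataverse_url="https://dataverse.example.org",
    collection_name="example-collection",
    api_token="00000000-0000-0000-0000-000000000000",
    published_data_dir="/data/published",
)
```

Leaving debug_dir unset models the case where payloads are not written to disk.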

Publishing a Record

When a curation record is published, the system performs the following steps:

  1. Searches Dataverse for an existing dataset matching the catalog record number

  2. If a matching dataset is found, updates the existing draft version; otherwise, creates a new dataset in the configured collection

  3. Sets the citation date to the ISPS archive date

  4. Retrieves the dataset’s persistent identifier (DOI) from Dataverse
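The control flow of these steps can be sketched as follows. This is an illustration under stated assumptions: InMemoryClient and all of its method names are hypothetical stand-ins rather than the Dataverse API, and the returned identifier is a placeholder, not a real DOI.

```python
from typing import Optional


class InMemoryClient:
    """Hypothetical stand-in for a Dataverse client; method names are illustrative."""

    def __init__(self) -> None:
        self.datasets: dict[str, dict] = {}      # dataset id -> metadata
        self.ids_by_number: dict[str, str] = {}  # record number -> dataset id
        self.citation_date_fields: dict[str, str] = {}

    def find_dataset(self, record_number: str) -> Optional[str]:
        return self.ids_by_number.get(record_number)

    def create_dataset(self, collection: str, record_number: str, metadata: dict) -> str:
        dataset_id = f"{collection}-{len(self.datasets) + 1}"
        self.datasets[dataset_id] = dict(metadata)
        self.ids_by_number[record_number] = dataset_id
        return dataset_id

    def update_draft(self, dataset_id: str, metadata: dict) -> None:
        self.datasets[dataset_id] = dict(metadata)

    def set_citation_date(self, dataset_id: str, field_name: str) -> None:
        self.citation_date_fields[dataset_id] = field_name

    def get_persistent_id(self, dataset_id: str) -> str:
        return f"doi:10.0000/EXAMPLE/{dataset_id}"  # placeholder identifier


def publish_record(client, collection: str, record_number: str, metadata: dict) -> str:
    # Step 1: look for an existing dataset by catalog record number.
    dataset_id = client.find_dataset(record_number)
    # Step 2: update the draft if one exists, otherwise create a new dataset.
    if dataset_id is None:
        dataset_id = client.create_dataset(collection, record_number, metadata)
    else:
        client.update_draft(dataset_id, metadata)
    # Step 3: use the ISPS archive date as the citation date.
    client.set_citation_date(dataset_id, "ispsArchiveDate")
    # Step 4: return the persistent identifier reported for the dataset.
    return client.get_persistent_id(dataset_id)


client = InMemoryClient()
first = publish_record(client, "example", "D001", {"title": "Draft title"})
second = publish_record(client, "example", "D001", {"title": "Final title"})
```

Publishing the same record number twice reuses the existing dataset, so both calls return the same identifier and only one dataset exists afterwards.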

Metadata Mapping

Catalog record fields are mapped to Dataverse metadata blocks as described in the tables below.

Citation Metadata

| Catalog Record Field | Dataverse Field | Notes |
| --- | --- | --- |
| Title | title | |
| Authors | author | Includes name, affiliation, and ORCID |
| Description | dsDescription | |
| Keywords | keyword | Split by comma |
| Number | otherId | Agency set to ISPS |
| CreatedDate | dateOfDeposit | Formatted as yyyy-MM-dd |
| StudyTimePeriod | timePeriodCovered | |
| Funding | grantNumber | Split by semicolon or newline |
| Organization.Name | distributor | Includes name, affiliation, abbreviation, URL |
| Organization.ContactInformation | datasetContact | Fixed email: isps@yale.edu |
| RelatedDatabase | relatedDatasets | |
| RelatedPublications | relatedMaterial | |
| Subject | subject | Fixed value: Social Sciences |
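The splitting and date-formatting rules in the notes above can be illustrated with a few small helpers. This is a sketch only: the function names are hypothetical, and trimming whitespace and dropping empty entries are assumptions that the notes do not state.

```python
import re
from datetime import date


def split_keywords(value: str) -> list[str]:
    # Keywords are split by comma; trimming and skipping blanks is assumed.
    return [part.strip() for part in value.split(",") if part.strip()]


def split_funding(value: str) -> list[str]:
    # Funding entries are split by semicolon or newline.
    return [part.strip() for part in re.split(r"[;\r\n]+", value) if part.strip()]


def format_date(value: date) -> str:
    # The yyyy-MM-dd pattern corresponds to %Y-%m-%d in Python's strftime.
    return value.strftime("%Y-%m-%d")


keywords = split_keywords("voting, turnout ,, field experiment")
grants = split_funding("Grant A; Grant B\nGrant C")
deposit_date = format_date(date(2021, 3, 5))
```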

ISPS Custom Metadata

| Catalog Record Field | Dataverse Field | Notes |
| --- | --- | --- |
| ArchiveDate | ispsArchiveDate | Formatted as yyyy-MM-dd |
| CertifiedDate | ispsCertifiedDate | Formatted as yyyy-MM-dd |
| OutcomeMeasures | ispsOutcomeMeasures | Split by comma |
| RandomizationProcedure | ispsRandomizationProcedure | |
| ModeOfDataCollection | ispsModeOfDataCollection | Controlled vocabulary mapping |
| ResearchDesign | ispsResearchDesign | Controlled vocabulary; split by comma |
| ReviewType | ispsReviewType | Controlled vocabulary |
| Treatment | ispsTreatment | |
| TreatmentAdministration | ispsTreatmentAdministration | Controlled vocabulary; split by comma |
| UnitOfObservation | ispsUnitOfObservation | Controlled vocabulary; split by comma |
| UnitOfRandomization | ispsUnitOfRandomization | Controlled vocabulary; split by comma |
| Version | ispsVersion | |

Note: The review type field maps values as follows: Full becomes Full - YARD data and code review, Partial becomes Partial - YARD data or code review, and None remains None.
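The review type mapping is a fixed lookup, which could be expressed as below. The dictionary contents come from the note above; the function name and the choice to raise an error for unrecognized values are assumptions.

```python
REVIEW_TYPE_LABELS = {
    "Full": "Full - YARD data and code review",
    "Partial": "Partial - YARD data or code review",
    "None": "None",
}


def map_review_type(value: str) -> str:
    # Rejecting unknown values is an assumption; the source does not say
    # how values outside the three listed ones are handled.
    try:
        return REVIEW_TYPE_LABELS[value]
    except KeyError:
        raise ValueError(f"Unknown review type: {value!r}") from None


full_label = map_review_type("Full")
```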

Geospatial Metadata

| Catalog Record Field | Dataverse Field | Notes |
| --- | --- | --- |
| Location | geographicCoverage | United States uses country field; other values use other geographic coverage |
| LocationDetails | geographicUnit | |
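The Location rule can be sketched as a two-way branch. The subfield names country and otherGeographicCoverage are assumed here to be the child fields of geographicCoverage, and the exact-match comparison and function name are likewise assumptions.

```python
def map_location(location: str) -> dict:
    value = location.strip()
    # "United States" goes to the country subfield (assumed name: country).
    if value == "United States":
        return {"country": value}
    # Anything else goes to the free-text subfield
    # (assumed name: otherGeographicCoverage).
    return {"otherGeographicCoverage": value}


us_coverage = map_location("United States")
other_coverage = map_location("Nairobi, Kenya")
```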

Social Science Metadata

| Catalog Record Field | Dataverse Field | Notes |
| --- | --- | --- |
| InclusionExclusionCriteria | samplingProcedure | |
| SampleSize | targetSampleSize | Numeric values use actual size; non-numeric values use formula |
| FieldDates | dateOfCollection | Parsed from JSON; start and optional end |
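The SampleSize and FieldDates rules might look like the sketch below. Several details are assumptions: the subfield names (targetSampleActualSize, targetSampleSizeFormula, dateOfCollectionStart, dateOfCollectionEnd), the definition of "numeric" as a string of ASCII digits, and the JSON shape with start and end keys, which the source does not specify.

```python
import json


def map_sample_size(value: str) -> dict:
    text = value.strip()
    # "Numeric" is assumed to mean digits only; anything else is a formula.
    if text.isascii() and text.isdigit():
        return {"targetSampleActualSize": text}
    return {"targetSampleSizeFormula": text}


def map_field_dates(raw_json: str) -> dict:
    # Assumed input shape: {"start": "...", "end": "..."} with end optional.
    data = json.loads(raw_json)
    result = {"dateOfCollectionStart": data["start"]}
    if data.get("end"):
        result["dateOfCollectionEnd"] = data["end"]
    return result


numeric_size = map_sample_size("1200")
formula_size = map_sample_size("approximately 40 per village")
open_ended = map_field_dates('{"start": "2020-01-15"}')
closed_range = map_field_dates('{"start": "2020-01-15", "end": "2020-03-01"}')
```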

License

The license is set based on the catalog record’s terms of use:

  • CC0 1.0 - Creative Commons CC0 1.0 Universal public domain dedication

  • CC BY 4.0 - Creative Commons Attribution 4.0 International

  • Custom Dataset Terms - Custom terms with a reference to the original source from the related database field

Publishing Files

File Eligibility

Only files that meet both of the following criteria are published:

  • The file has public access (IsPublicAccess is true)

  • The file has an accepted status (Status is Accepted)
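The two criteria combine into a simple filter. In this sketch the ManagedFile class and its attribute names are hypothetical; the conditions mirror IsPublicAccess and Status as described above.

```python
from dataclasses import dataclass


@dataclass
class ManagedFile:
    """Hypothetical file record with only the properties needed here."""

    name: str
    is_public_access: bool  # mirrors IsPublicAccess
    status: str             # mirrors Status


def is_eligible(file: ManagedFile) -> bool:
    # Both conditions must hold for the file to be published.
    return file.is_public_access and file.status == "Accepted"


files = [
    ManagedFile("data.csv", True, "Accepted"),
    ManagedFile("notes.txt", False, "Accepted"),
    ManagedFile("draft.do", True, "Pending"),
]
eligible_names = [f.name for f in files if is_eligible(f)]
```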

File Create and Update

  • Existing files - If a file with a matching label already exists in the Dataverse dataset, only its metadata is updated.

  • New files - The file content is uploaded along with its metadata. Before uploading, the system checks the dataset lock status and waits until any locks are released.

Files that are present in Dataverse but no longer marked for publication are deleted from the Dataverse dataset.
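The create, update, and delete rules amount to a set comparison on file labels, and the lock check is a polling loop. The sketch below uses hypothetical function names; the polling interval, the timeout, and the injected is_locked callable are assumptions, since the source only says that the system waits until locks are released.

```python
import time
from typing import Callable, Iterable


def plan_file_sync(eligible_labels: Iterable[str], existing_labels: Iterable[str]) -> dict:
    eligible, existing = set(eligible_labels), set(existing_labels)
    return {
        "update_metadata": sorted(eligible & existing),  # label already in Dataverse
        "upload": sorted(eligible - existing),           # new file content and metadata
        "delete": sorted(existing - eligible),           # no longer marked for publication
    }


def wait_for_unlock(
    is_locked: Callable[[], bool],
    poll_seconds: float = 1.0,
    timeout_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    # Poll until the dataset reports no locks; the timeout is an assumption.
    deadline = time.monotonic() + timeout_seconds
    while is_locked():
        if time.monotonic() >= deadline:
            raise TimeoutError("dataset is still locked")
        sleep(poll_seconds)


plan = plan_file_sync(
    eligible_labels=["data.csv", "codebook.pdf"],
    existing_labels=["data.csv", "old_analysis.R"],
)
```

In a real upload loop, wait_for_unlock would be called before each new file is uploaded, with is_locked backed by a query for the dataset's lock status.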

File Metadata

| File Property | Dataverse Field | Notes |
| --- | --- | --- |
| Title | title | |
| Name | label | Filename for display |
| Description | description | Composite of title, ISPS number, publication date, source, and software |
| Source | source | |
| IsPublicAccess | restricted | Inverted (restricted is true when not public) |
| CertifiedDate | publicationDate | |
| KindOfData | categories | Added for data-type files if a valid value is set |
| Type | categories | File type added as a category |
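A sketch of how a file's metadata might be assembled from these properties follows. The function name and parameters are hypothetical, the test for a "data-type" file (a Type equal to "Data") and for a "valid" KindOfData (any non-empty value) are assumptions, and the composite description is omitted because its exact format is not given.

```python
from datetime import date
from typing import Optional


def build_file_metadata(
    *,
    title: str,
    name: str,
    source: str,
    is_public_access: bool,
    file_type: str,
    certified_date: Optional[date] = None,
    kind_of_data: Optional[str] = None,
) -> dict:
    # The file type is always added as a category.
    categories = [file_type]
    # KindOfData is added only for data files with a value set (assumed rule).
    if file_type == "Data" and kind_of_data:
        categories.append(kind_of_data)

    metadata = {
        "title": title,
        "label": name,
        "source": source,
        "restricted": not is_public_access,  # inverted from IsPublicAccess
        "categories": categories,
    }
    if certified_date is not None:
        metadata["publicationDate"] = certified_date.isoformat()
    return metadata


example = build_file_metadata(
    title="Survey responses",
    name="survey.csv",
    source="Original data collection",
    is_public_access=True,
    file_type="Data",
    certified_date=date(2022, 6, 1),
    kind_of_data="Survey",
)
```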