Policies and Procedures

Eric Lopatin edited this page Jun 22, 2021 · 10 revisions

The California Digital Library exists to support the University of California community’s pursuit of scholarship and extend the University’s public service mission. The Merritt curation repository and its companion research data publication portal, Dryad, are core CDL services available for use by all members of the UC community for managing, preserving, publishing, and sharing the University’s valuable digital content. While Dryad gives the appearance of being a standalone repository, it is in fact directly integrated with Merritt. Dryad provides a self-service interface and scholar-focused views of selected research data collections, and all data managed in Dryad is automatically preserved in Merritt.

Merritt and Dryad comply with the general CDL terms of service as well as the following policy terms.

Contact

Merritt and Dryad administrators may be contacted at [email protected], which automatically opens a new issue in CDL's internal ticketing system. To report an urgent problem, call the CDL Help Line at (510) 987-0555.

Availability

Merritt and Dryad are available on a nominal 24x7x52 basis.  The current status of Merritt and Dryad availability can be found on the CDL system status page.

Whenever possible, major service outages for purposes of preventative maintenance and periodic enhancement are scheduled outside of normal business hours, Monday - Friday, 8:00 AM - 5:00 PM PT, and announced two weeks before the scheduled outage. In some cases unanticipated conditions may require immediate intervention without prior announcement in order to prevent damage or loss to managed content. However, Merritt's architecture has been carefully designed for robust fault-tolerance to minimize this necessity. Most diagnostic and maintenance activities can take place without any service interruption.

Privacy

Merritt and Dryad comply with the CDL's privacy policy, under which the privacy of all users will be respected and protected in compliance with federal and state laws and University of California Policies.

Accessibility

Merritt and Dryad comply with UC's accessibility policy, which promotes an accessible IT environment at the University of California to help ensure that as broad a population as possible may access, benefit from, and contribute to the University’s electronic programs and services.

Contributor Responsibilities

By contributing to Merritt or Dryad, content owners and curators are acknowledging that they have followed all applicable laws, regulations, policies, ethical concerns, and disciplinary best practices regarding the creation and acquisition of that content, including obligations regarding intellectual property rights, privacy, IRB review, and accepted norms of scholarly discourse, and that they assign to CDL the non-exclusive, perpetual, revocable right to save, copy, enhance, federate, create derivatives for purposes of long-term preservation, and provide access to contributed content, subject to curatorially-designated access controls. Contributors exhibiting inappropriate behavior will be subject to loss of user privileges.

Merritt and Dryad are not appropriate repositories for managing content including clinical or personally identifiable information (PII) whose disclosure would constitute a violation of HIPAA/HITECH, FERPA, or other similar statutory, regulatory, or ethical regimes. Content containing PII must be redacted or anonymized prior to submission to Merritt or Dryad.

Merritt and Dryad are operated on a partial cost-recovery basis, as described in the Pricing section below. Content that is not paid for on a timely basis will be considered abandoned and may be subject to being de-accessioned.

Contributors may request a bulk export of their content, for which CDL may impose a one-time fee to cover the reasonable costs of the export.

User Responsibilities

By using Merritt and Dryad to search for, find, or retrieve managed content, users are acknowledging that they will follow all applicable laws, regulations, policies, ethical concerns, and disciplinary best practices regarding the use of that content, including obligations regarding intellectual property rights, privacy, and accepted norms of scholarly discourse. The latter includes an obligation to provide complete citation to Merritt or Dryad in any redistribution of the content or publications and presentations incorporating an analysis of or substantively based upon the content. Users exhibiting inappropriate behavior will be subject to loss of user privileges.

CDL Responsibilities

The CDL accepts, manages, and provides access to digital content in order to support the University’s research, teaching, learning, and public service mission. The CDL will not exploit managed content in profit-generating activity without express permission of its legal owners.

The CDL makes reasonable efforts to provide managed content with the highest level of preservation assurance that is consistent with the form, structure, and packaging of the content, the degree to which it is accompanied by authoritative and comprehensive metadata, the availability of appropriate tools, and other organizational priorities. Note that this implies a continuum of preservation outcomes dependent upon the nature of the content. At a minimum, however, CDL is committed to providing bit-level preservation of all content. CDL offers consultation and guidance on ways to acquire or create digital content in a manner that is most amenable to the highest level of future preservation service.

Merritt maintains a complete change history of managed content as it may evolve over time. The repository relies upon a primary preservation strategy of replication of content to geographically-dispersed sites and technological heterogeneity. Merritt incorporates a process of continual verification of cryptographic message digests of all content replicas to detect and correct any bit-level damage. The design, implementation, and operation of Merritt are consistent with the community-accepted standard ISO 14721 Open Archival Information System (OAIS) reference model.
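The digest verification described above can be illustrated with a minimal sketch. This is not Merritt's implementation: the choice of SHA-256, the replica names, the in-memory dictionary standing in for storage providers, and the function names `sha256_digest`, `audit_replicas`, and `repair_replicas` are all assumptions made for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string (algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def audit_replicas(expected_digest: str, replicas: dict) -> list:
    """Return the names of replicas whose digest differs from the recorded one."""
    return [
        name
        for name, data in replicas.items()
        if sha256_digest(data) != expected_digest
    ]


def repair_replicas(expected_digest: str, replicas: dict) -> dict:
    """Return a copy of the replica set with damaged copies replaced by an intact one."""
    damaged = audit_replicas(expected_digest, replicas)
    intact = [data for name, data in replicas.items() if name not in damaged]
    if not intact:
        raise ValueError("no intact replica available for repair")
    return {
        name: (intact[0] if name in damaged else data)
        for name, data in replicas.items()
    }


# Hypothetical replica set: the secondary copy has suffered bit-level damage.
recorded = sha256_digest(b"dataset bytes")
copies = {
    "primary": b"dataset bytes",
    "secondary": b"dataset bytez",
    "tertiary": b"dataset bytes",
}
damaged_names = audit_replicas(recorded, copies)
repaired = repair_replicas(recorded, copies)
```

In this sketch a damaged copy is detected by comparing its digest with the digest recorded at deposit, and is then overwritten from a copy that still verifies.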

Any changes to the Merritt fee structure will be provided to content owners at least 60 days prior to the effective date of the change.

Merritt and Dryad rely on browser-based cookies to maintain online session information for streamlining the user experience of those systems. All access log information and other personally-identifying evidence of use is collected and dispositioned in a manner consistent with the CDL privacy policy.

In the event that CDL is unable or unwilling to continue operation of Merritt, it will make reasonable efforts to find another curatorial organization, within or outside the UC system, willing to take on custodial responsibility for all managed content. If that is not possible, CDL will return all content to its contributors at no added expense.

Format Guidelines

Merritt and Dryad will accept submissions in any genre, format, and package. CDL believes that the most significant impediment to the future use of managed content is not insufficiently-complete curation, but the lack of collection and management under an appropriate and proactive stewardship regime. Consequently, Merritt and Dryad have been designed and are operated so as to maximize opportunities for self-service deposit of digital content. Once under secure management, this content is susceptible to ongoing review and enrichment by campus-based curators, collection managers, and RDM specialists to maintain and increase its curatorial value and provide a higher level of assurance of its ongoing availability and usability.

Dryad data contributors are encouraged to follow the UK Data Service recommendations on formats. CDL provides general guidelines for material contributed to Merritt.

Persistent Identification and Citation

All objects managed in Merritt are assigned unique, persistent Archival Resource Key (ARK) identifiers using CDL’s EZID service. All Dryad datasets and Merritt content in curatorially-designated collections also receive Digital Object Identifiers (DOIs) from DataCite. All Merritt and Dryad object landing pages prominently display the object’s actionable persistent identifier(s) for use in citations. Landing pages in Dryad feature pre-formatted citations conforming to the 2014 FORCE11 Joint Declaration on Data Citation Principles.

Versioning

Merritt and Dryad are strongly versioned. Any change to data or metadata automatically results in the creation of a new version of the data object. Versioning relies on file-level backwards deltas to minimize duplicative file storage. Individual file-level components are never edited or replaced; new versions of files are added as components of the new dataset version. All previous versions can be retrieved through the Merritt and Dryad UI and API.

Federation and Internet Search

Content submitted to curatorially-designated collections may be federated with external systems and services to enhance long-term preservation and accessibility. Descriptive metadata associated with datasets that have been assigned DOIs by Merritt or Dryad is registered with DataCite, where it is indexed for online search. The Dryad research portal implements affirmative search engine optimization (SEO) techniques to ensure that managed content is indexed by well-known search engines, such as Google and Yahoo, for enhanced opportunities for internet search and discovery. Merritt is also open to indexing by these search engines.

Submission Agreements and Licensing Terms

Data submitted to Dryad are associated with standard Creative Commons CC-BY licenses or CC0 public domain dedications covering terms of access and acceptable use. Material contributed to Merritt is covered by the terms of campus-level agreements granting CDL a non-exclusive, perpetual, revocable license to save, copy, enrich, federate, create derivatives, and, if so curatorially-designated, distribute for non-commercial use. Access to content contributed without explicit associated terms is determined by the curatorially-assigned access control rules for the collection of which the content is a member, which permit designation for either authenticated access and use only by a restricted set of individuals, or unconstrained public access and use.

Restricted Access During Peer Review

Dryad datasets underlying articles being peer-reviewed may be designated for access restrictions for up to six months. During that time, no public data downloads are allowed, although certain minimal descriptive information (contributor(s), title, and date) will be presented on the dataset landing page.

Take-down Requests

The procedures for responding to DMCA-compliant take-down requests are defined as part of the CDL's general terms of service.

Pricing

Merritt and Dryad operate on a partial cost-recovery basis. There is no service fee for their use, but CDL recoups its costs for provisioning preservation storage, which is typically billed at the campus level. The current nominal pricing is $150/TB/year, but this is pro-rated to reflect actual daily storage usage. Usage accounting is based on the sum total of byte-days of usage over the year, assessed at $0.000000000000411 per byte-day ($150/TB/year ÷ 1,000,000,000,000 bytes/TB ÷ 365 days/year). The reliance on byte-day accounting means that contributors do not need to be concerned about the timing of their deposits. 1 TB deposited on the first day of a billing year and saved for the entire year will accrue a cost of $150 (1 TB * 365 days * 1,000,000,000,000 bytes/TB * $0.000000000000411/byte-day). That same 1 TB deposited on the last day of the billing year will cost only $0.41 (1 TB * 1 day * 1,000,000,000,000 bytes/TB * $0.000000000000411/byte-day).
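The byte-day arithmetic above can be reproduced with a short sketch. The function names and the rounding to whole cents are assumptions for illustration, not CDL's billing code. The sketch divides by the exact denominator rather than multiplying by the rounded per-byte-day rate, because the rounded rate of $0.000000000000411 would give about $150.02 for a full TB-year instead of exactly $150.

```python
from decimal import Decimal

PRICE_PER_TB_YEAR = Decimal("150")   # nominal price stated above, in dollars
BYTES_PER_TB = Decimal(10) ** 12     # 1,000,000,000,000 bytes per TB
DAYS_PER_YEAR = Decimal(365)


def cost_for_byte_days(byte_days: int) -> Decimal:
    """Dollar cost of a byte-day total, rounded to cents (rounding is an assumption)."""
    cost = Decimal(byte_days) * PRICE_PER_TB_YEAR / (BYTES_PER_TB * DAYS_PER_YEAR)
    return cost.quantize(Decimal("0.01"))


def storage_cost(bytes_stored: int, days_stored: int) -> Decimal:
    """Cost of holding a fixed number of bytes for a given number of days."""
    return cost_for_byte_days(bytes_stored * days_stored)


one_tb = 10 ** 12
full_year = storage_cost(one_tb, 365)  # 1 TB held for the whole billing year
last_day = storage_cost(one_tb, 1)     # 1 TB deposited on the last day
print(full_year, last_day)             # prints: 150.00 0.41
```

Because cost depends only on the byte-day total, usage that varies during the year can be handled by summing the bytes stored on each day and passing that sum to `cost_for_byte_days`.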

The billing year is aligned with the University of California fiscal year, July through June. Storage usage for the previous year is billed early in the subsequent year, and payment is due within 60 days of billing.

Third-party Service Providers

Merritt’s external partnerships are shown in this diagram:

Merritt external partnerships

Merritt uses CDL’s EZID for assigning, managing, and resolving object ARK identifiers. Dryad dataset DOIs are assigned, managed, and resolved through DataCite, an international non-profit membership organization of which CDL is a founding member.

Merritt relies on external storage providers for primary and replication storage in its preservation system. The content of all collections in Merritt benefits from three object copies, maintained across three different cloud storage providers. These copies are distributed across two geographic regions (US West Coast and US East Coast) with differing disaster threats in order to mitigate risk.

Though external to the CDL, the San Diego Supercomputer Center (SDSC)’s cloud storage is internal to the University of California system. The service level agreement defining the terms of the contractual arrangements is available at:

SDSC’s cloud storage is routinely subject to scans by Nessus, a professional auditing service that probes for vulnerabilities and malware.

Merritt also relies on two non-UC commercial service providers – Amazon Web Services (AWS) and Wasabi Hot Cloud Storage.

Merritt uses AWS for storage (S3 and Glacier), database hosting (RDS), and virtual server hosting (EC2). All of these services are located on the West Coast (Oregon). The service level agreements defining the terms of the contractual relationship between CDL and Amazon are available at:

AWS [complies](https://aws.amazon.com/compliance/) with a number of regulatory and professional IT standards and certification programs, including CSA, FERPA, FISMA, HIPAA, ISO 9001, 27001, 27017, SOC 1, 2, 3, and others.

Wasabi Hot Cloud Storage is used as preservation storage for an additional object copy and is located on the East Coast (Virginia). The customer agreement that defines the terms of the contractual relationship between the University of California Office of the President and Wasabi, and the Wasabi privacy policy, are available here:

Wasabi complies with a number of regulatory and professional IT standards and certification programs, including HIPAA, FERPA, SOC 2, ISO 27001, and PCI-DSS: Wasabi Compliance

Typical collection storage configurations:

  • The Dryad collection – The primary object copy is stored in AWS S3, while the secondary copy is stored in SDSC Qumulo, and the third copy exists in Glacier.
  • The majority of all other Merritt collections – The primary object copy resides in SDSC Qumulo, while the secondary copy is stored with Wasabi, and the third copy exists in Glacier.

Dryad relies on ORCID, an international non-profit membership organization of which CDL is a member, for managing unique, persistent researcher identifiers, and on Crossref's FundRef for funding agency identifiers. The membership agreement defining the terms of the relationship between CDL and ORCID is available at:

The Funder Registry data are available as CC0 Public Domain via an open public API requiring no prior contractual relationship.

Indemnification

CDL makes no representations or warranties with respect to Merritt or Dryad, and disclaims any liability arising out of their use. Neither the CDL nor Merritt or Dryad users shall be liable for any indirect, special, incidental, punitive or consequential damages arising out of that use. Liability for direct damages is limited to the dollar amount of the fee paid for the service. By making use of Merritt or Dryad, users are indemnifying, defending, and holding harmless CDL, its officers, employees, and agents from and against any liability and damages, including any reasonable attorney’s fees, that arise from that use. No limitation of liability set forth elsewhere in these terms applies to this indemnification; further, this indemnification shall survive the termination of these terms.