Cybersecurity Update for Approved Users of NIH Controlled-Access Genomic Data

If you will be downloading data from one of the 20 listed NIH controlled-access genomic data repositories (e.g., dbGaP), NIH requires that the data from these repositories, and all data derived from it, be secured in systems compliant with NIST SP 800-171 (see NOT-OD-24-157). This applies to any new or renewed Data Use Certifications or similar agreements executed on or after Jan. 25, 2025. Complying with this new requirement may affect project costs; please evaluate UCSF's available services and budget accordingly.

Currently, UCSF offers AWS Secure Enterprise Cloud (SEC) as the compliant IT environment that meets this new cybersecurity requirement:

  1. Submit an IT Consultation Request to assist you with budgeting for and configuring your environment.
    • UCSF IT and our AWS Account Manager can assist with building a quote, including the UC discounts available in SEC.
    • AWS Solutions Architect Office Hours are also available every Wednesday at 11 a.m. Email [email protected] for the meeting link.
  2. Once you are ready to move forward with an SEC environment, note that some workloads require additional support from the cloud services team to ensure compliance. Existing workloads that need to be transitioned to SEC will also require support from a senior solutions architect. Please be patient with us through this process.

UCSF IT plans to offer its Research Analysis Environment (RAE) as another compliant IT environment. Please continue to check this page for an update on timing.

If you are not sure which environment is best for your use case, please submit an IT Consultation Request.

If you plan on using or developing generative artificial intelligence (AI) models with the requested dataset:

On March 28, 2025, NIH posted NOT-OD-25-081, clarifying how to protect human genomic data obtained from NIH controlled-access repositories when using and developing generative AI models. Specifically, NIH states:

  1. Users of NIH controlled-access data cannot use that data to train generative AI models without approval from NIH.
  2. NIH considers generative AI models, including model parameters, to be Data Derivatives. Approved Users of NIH controlled-access data may not share the model, including model parameters, except with collaborators who are also Approved Users.
  3. Approved Users of NIH controlled-access data may not retain the generative AI model, including model parameters, upon closeout of the project. To continue using generative AI models, Approved Users may request to renew expiring projects.
  4. Sharing controlled-access data with public generative AI tools (e.g., third-party tools) is prohibited.

Frequently Asked Questions