Request a project
IMPORTANT! PLEASE NOTE:
All Projects must be requested by a FACULTY MEMBER who will serve as the Project Principal Investigator (PI).
STUDENTS MAY NOT SERVE AS PIs.
Step 1: PIs should provide the following information in one or two paragraphs:
- A short description of the science and computational aspects: what code(s) will be used (in-house, open source, or licensed). Include the name of the code if relevant, and the language(s) it is written in (C, C++, Fortran, MATLAB, Python, Perl, Java, etc.).
- Is the code 'true parallel' (i.e. does it use MPI), multi-threaded (OpenMP), hybrid (MPI/OpenMP), GPU or Xeon Phi-enabled, or is it embarrassingly parallel? ["Embarrassingly parallel" refers to many instances of an identical code run on distinct, independent nodes, each instance with different input data.]
- Please specify if any special libraries or compilers are needed.
- Any other special requirements (e.g. "I will need to routinely store more than 1 TB of data on disk").
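To help decide whether a code counts as "embarrassingly parallel" in the sense described above, here is a minimal Python sketch of the pattern: the same code is run on different inputs, and no instance communicates with another. The function names `analyze` and `run_all` are hypothetical, and worker threads on a single machine stand in for the independent nodes a real run would use.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(sample):
    """Stand-in for one independent run: identical code, different input data."""
    return sum(sample) / len(sample)


def run_all(samples, workers=4):
    """Run analyze() once per input; instances never exchange data with each other."""
    # Threads are only a local stand-in for independent nodes (an assumption
    # made for illustration); results are simply gathered at the end.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, samples))
```

If your instances need to exchange data while they run (for example through MPI messages), the code is 'true parallel' rather than embarrassingly parallel.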
Step 2: Once the Project is submitted, it will be reviewed by the Center, and you should hear back about approval within approximately one week.
Step 3: After the Project has been approved, the PI and the PI's research group members can apply for accounts (see Request an account). Each applicant will need to provide the Project information (listed by project number under the PI's name) and specify the requested machines (see the guidance below).
Which machines to request
- 'True parallel' calculations: Gibbs, Wheeler
- 'Embarrassingly parallel' calculations: Wheeler, Galles
- Calculations that require large memory (> 4 GB/core or > 32 GB/node): Poblano, Gibbs
- Hadoop: Galles
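The guidance above can be read as a simple lookup from workload type to machines. The sketch below is illustrative only: the category labels and function names are hypothetical, not official CARC terms, and the machine names and memory thresholds are copied from the list above.

```python
# Hypothetical lookup table built from the guidance list above.
# The keys are illustrative labels, not official CARC categories.
MACHINES_BY_WORKLOAD = {
    "true_parallel": ["Gibbs", "Wheeler"],
    "embarrassingly_parallel": ["Wheeler", "Galles"],
    "large_memory": ["Poblano", "Gibbs"],
    "hadoop": ["Galles"],
}


def needs_large_memory(gb_per_core, gb_per_node):
    """True when a job exceeds 4 GB/core or 32 GB/node, the thresholds quoted above."""
    return gb_per_core > 4 or gb_per_node > 32


def machines_to_request(workloads):
    """Return the de-duplicated machines for the given workload labels, in first-seen order."""
    machines = []
    for workload in workloads:
        for machine in MACHINES_BY_WORKLOAD[workload]:
            if machine not in machines:
                machines.append(machine)
    return machines
```

For example, a group running an MPI code that also needs 6 GB per core would look up both `"true_parallel"` and `"large_memory"`, which yields Gibbs, Wheeler, and Poblano.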
Is there a fee?
No. The Center is supported by the Office of the Vice President for Research, and all compute resources are provided free of charge. You may use as many cycles as you wish, subject to availability through the queuing system. You may run on different machines simultaneously, although on some machines the queuing system is configured so that no single user can use more than 25% of the entire machine at one time. If you have a research deadline or require a larger share of a specific machine for a special run, please email firstname.lastname@example.org with a brief technical description and justification to request special resources.
Example Project description
Cancer genomes have complex genetic and epigenetic changes including point mutations (SNVs), chromosome rearrangements, changes in copy number (duplication, LOH), alternative RNA splicing, changes in DNA methylation and altered patterns of gene expression. Massively parallel next-generation sequencing provides new types of assays for identifying and characterizing these changes, but produces extremely large data sets requiring many types of interconnected statistical analyses. The size of next-generation sequencing datasets is rapidly exploding. There is a wide variety of commercial and open source tools available to process the datasets. Unfortunately, the datasets we are currently generating are two orders of magnitude larger than what many of the tools were written for, and within a few months the size of the datasets will increase again. We need to rapidly evaluate a wide variety of tools to see which can scale to process our data in a timely manner without major modification. We also need to explore archival options for our influx of data. This will require some space on a device like the StorageWorks SAN, and access to a representative cross section of the supercomputers available at the Center. It is vital that we explore both the high-processor-count, high-memory approach and the more traditional distributed-node systems. The security implications for the project should be minimal. Most packages should not need to run as services or open network ports for evaluation. Beyond storage and compute, the only other requirements are a group folder to minimize dataset duplication and a result folder visible from the web. Many of the result visualization tools are web-based and are designed to retrieve the users’ data from an HTTP/HTTPS source. (Courtesy Prof. S Ness)
Export control and other data restrictions
The U.S. Government controls exports of sensitive equipment, software and technology as a means to promote U.S. national security interests and foreign policy objectives. Please see the Export Control page for relevant CARC policies and procedures.
HIPAA, PHI, PCI, and FERPA data may not be stored on or transferred via CARC systems.
If you need technical assistance, please submit a help request (email@example.com) to make an appointment with one of our HPC Systems staff members for a quick tutorial on getting started. Appointments are required; there is no walk-in consultation. CARC offers half-day "Intro to Computing at CARC" workshops on a rolling basis, depending on demand. The workshops include tutorial lectures and a bring-your-own-code lab session. Please email firstname.lastname@example.org or visit Education & Training to register for upcoming workshops.
You might have trouble opening this form in Chrome. Try using another browser such as Firefox.