Technology that supports researchers’ simulation and data science needs
WSU is making significant strategic investments to extend its research computing capabilities and build its cyberinfrastructure. That large-scale, data-intensive computing power fuels scientific discovery across disciplines and propels major advances in simulation and data science.
Strategic Research Computing Initiative
A high-performance computing (HPC) infrastructure enables WSU researchers to advance the frontier of knowledge in several fields of science. It allows them to analyze and share large amounts of data and conduct collaborative research.
To integrate HPC into existing and future academic and research activities, WSU is launching an ambitious strategic initiative in research computing across the university system.
The initiative has three aims:
- To build a cyberinfrastructure that is responsive to and anticipates the needs of WSU researchers
- To focus on “signature” science applications that reflect WSU’s research priorities, as articulated in the Grand Challenges
- To forge strategic regional partnerships, recruit top faculty, and develop transformative education and training programs
Building a cyberinfrastructure that anticipates research needs
With support from recently awarded grants, WSU is investing in computing to accelerate scientific and data-intensive research.
A new home for big data
Genomics and bioinformatics researchers at WSU’s Pullman and Spokane campuses can accelerate their investigations, thanks to a new high-performance data storage and transfer system. Funded by a 2014 grant from the National Science Foundation’s Major Research Instrumentation Program, the system will support researchers’ “big data” needs.
Increasing the speed of science
A 2014 Campus Cyberinfrastructure grant from the National Science Foundation is enabling WSU to establish a High-Speed Scalable Research Core (HSSRC) that provides access to wide-area science services and software-defined networking environments. The HSSRC will catalyze new multidisciplinary research applications and allow researchers to analyze data from simulations and instruments more rapidly and efficiently. Deployment of the HSSRC also supports the training of the next generation of scientific leaders, enabling them to build knowledge and experience in a broad range of modeling and simulation disciplines and “big data” science.
Focusing on science applications that reflect research priorities
WSU’s initial high-performance computing applications support investigations in the following areas:
Genomics, genetics, bioinformatics, agriculture
- Evolutionary genomics
- Biomedical genomics
- Crop genomics
- Breeding research
- Software platform for next-generation data analysis and sharing
Physics, materials science and engineering, chemistry and biochemistry
- Materials genomics
- Computational design of materials
- Materials for clean energy
- Materials in extreme environments
- Actinide chemistry
- Nuclear theory
- Computational astrophysics
Atmospheric and environmental research
- Air quality forecasting
- Numerical weather prediction
- Regional-scale earth system modeling
- Watershed integrated systems dynamics modeling
Smart energy grid
- Power system analysis
- Control enhancement
- Demand management
- Cyber-physical security for power infrastructure
Health sciences
- Biomedical genomics
- Systems pharmacology
Education and training
- Computational science
- Computer science
- Data science
- Artificial intelligence
- High-performance computing training
Forging partnerships, recruiting top faculty, and advancing education and training
WSU is teaming with the Pacific Northwest National Laboratory and the University of Washington to create a regional hub of expertise in high-performance computing, as well as simulation and data science. The collaboration will give scientists an edge in addressing scientific and societal challenges.
Building “big data” partnerships
WSU researchers have joined with top minds nationwide to develop an open-source toolkit for online genomic and genetic databases. The project is funded by a National Science Foundation grant. The toolkit will provide a common infrastructure to transfer large datasets quickly, expediting discovery.
Research computing resources
The University’s centralized institutional research computing resources consist of two main platforms:
The IBM HPC solution supports jobs with large memory requirements. It consists of 164 computational nodes and three special-purpose nodes.
The IBM dx360 compute nodes each contain two six-core Intel processors operating at 2.67 GHz and 24 GB of RAM. Six IBM x3650 servers, connected via Fibre Channel to several M1015 disk arrays, provide a total of 111 TB of storage, split into three volumes by performance tier: 73 TB (7200 RPM SATA) for home directories, 33 TB (10K RPM SAS) of higher-performance storage, and 5 TB (15K RPM SAS) of high-performance temporary storage for jobs that demand greater data throughput.
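The three tiers trade capacity for throughput. As a rough illustration (the tier labels below are hypothetical, not WSU’s actual volume names), the layout can be sketched as:

```python
# Hypothetical sketch of the three storage tiers described above.
# Tier labels are illustrative; capacities and drive types are from the text.
TIERS = [
    # (label, capacity_tb, drive_type, typical_use)
    ("home",    73, "7200 RPM SATA", "home directories"),
    ("fast",    33, "10K RPM SAS",   "higher-performance storage"),
    ("scratch",  5, "15K RPM SAS",   "high-throughput temporary space"),
]

# The tier capacities sum to the stated 111 TB total.
total_tb = sum(capacity for _, capacity, _, _ in TIERS)
print(total_tb)  # 111
```

The smallest, fastest tier is reserved for temporary job data, the usual arrangement in HPC storage hierarchies.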
The Kamiak pilot cluster provides the platform for deploying an institutional compute- and data-intensive cyberinfrastructure that responds to the existing and anticipated needs of the University’s research community. It enables simulation and data science at scale.
Condominium computing model
This $1.3M procurement was made possible by contributions from the College of Agricultural, Human, and Natural Resource Sciences (CAHNRS) and the Office of the Vice President for Research. It will be operated, managed, and expanded under the functional principles of a “condominium” model, in which modular enhancements to the centralized resources are provided by contributions from researchers. The cluster provides a foundation for further expansion.
The Kamiak pilot system can deliver a peak performance of 20 TFLOPS and supports both CPU and GPU computing, with a high memory-per-core ratio. Components currently include:
- Standard Compute Node (2x Intel Xeon E5-2680 v2, 20 total cores/40 total threads, 256GB RAM, 400GB SSD, 10GbE, FDR InfiniBand)
- Large Compute Node (2x Intel Xeon E5-2680 v2, 20 total cores/40 total threads, 512GB RAM, 400GB SSD, 10GbE, FDR InfiniBand)
- Large Memory Node (4x Intel Xeon E7-4880 v2, 60 total cores/120 total threads, 2TB RAM, 10GbE, FDR InfiniBand)
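A back-of-the-envelope calculation shows how node counts relate to the quoted 20 TFLOPS peak. This sketch assumes published Xeon E5-2680 v2 figures (2.8 GHz base clock, AVX: 8 double-precision FLOPs per core per cycle), which are not stated in the text above:

```python
# Theoretical peak of one standard or large compute node (2x E5-2680 v2).
# Assumes 2.8 GHz base clock and 8 DP FLOPs/core/cycle (4-wide AVX
# multiply + 4-wide AVX add) -- processor specs, not figures from this page.
sockets = 2
cores_per_socket = 10
clock_ghz = 2.8
flops_per_cycle = 8

peak_gflops = sockets * cores_per_socket * clock_ghz * flops_per_cycle
print(peak_gflops)  # 448.0 GFLOPS per node
```

At roughly 0.45 TFLOPS per dual-socket node, the system’s 20 TFLOPS peak corresponds to a few dozen such nodes, consistent with a pilot-scale cluster.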