Artificial intelligence (AI) is a catalyst for scientific discovery. Increasing computational demand from data management and analytics, simulation and modeling, and multi-facility experiments and collaborations calls for two things: the capability to train and deploy AI and machine learning (ML) models at scale, and the compute infrastructure to support these growing workloads. Experts in NCCS meet these demands by providing the following:
- Extreme-scale distributed training of AI models
- Deployment, evaluation, and improvement of state-of-the-art AI frameworks and methods
- Integration of AI models with simulation campaigns
- Development and evaluation of scalable AI/ML benchmarks
- Continuous benchmarking of AI frameworks and workloads
- Mixed-precision benchmarks and scalable numerical algorithms at extreme scale