International Workshop on The Lustre Ecosystem:
Challenges and Opportunities
Large-scale storage systems are often difficult to manage because of the complicated interactions between expensive storage hardware, high-performance interconnection networks, and client computer systems. The Lustre parallel file system has been widely adopted by high-performance computing (HPC) centers as an effective system for managing large-scale storage resources. Lustre achieves unprecedented aggregate performance by parallelizing I/O over file system clients and storage targets at extreme scales. Today, 7 of the 10 fastest supercomputers in the world use Lustre for high-performance storage.
Lustre development has focused on improving the performance and scalability of large-scale scientific workloads. In particular, large-scale checkpoint storage and retrieval, which is characterized by bursty I/O from coordinated parallel clients, has been the primary driver of Lustre development over the last decade. With the advent of extreme-scale computing and Big Data computing, many HPC centers are seeing increased user interest in running diverse workloads that place new demands on Lustre.
In early March 2015, the International Workshop on the Lustre Ecosystem: Challenges and Opportunities was held in Annapolis, Maryland, at the Historic Inns of Annapolis Governor Calvert House. This workshop series is intended to help explore improvements in the performance and flexibility of Lustre for supporting diverse application workloads. The 2015 workshop was the inaugural edition, and its goal was to initiate a discussion on the open challenges associated with enhancing Lustre for diverse applications, the technological advances necessary, and the associated impacts on the Lustre ecosystem. The workshop program featured a day of tutorials and a day of technical paper presentations.
The first day of the program featured a keynote talk from Intel's Eric Barton titled "From Lab to Enterprise - Growing the Lustre Ecosystem". The remainder of the first day was devoted to tutorials on managing and monitoring large-scale, center-wide Lustre deployments. The tutorials were presented by staff of the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory. After the tutorials, an "Ask the OLCF" Q&A session gave attendees the opportunity to ask the tutorial presenters additional questions related to Lustre administration and monitoring.
The technical program, presented on day two of the workshop, featured talks on various topics including:
- Workload Characterization
- Adaptability and Scalability of Lustre for Diverse Workloads
- Resilience and Serviceability of Lustre
- Application-driven Lustre Benchmarking
- Performance Monitoring Tools for Lustre
The workshop was held at the Historic Inns of Annapolis Governor Calvert House, located at 58 State Circle, Annapolis, MD 21401.