NASA - National Aeronautics and Space Administration

Installation Photos
[Photos: Pleiades construction; Pleiades installation with technician; Pleiades power distribution panel]


Keep up with progress on construction of NASA's newest supercomputer, named Pleiades after the star cluster in the constellation Taurus. The system is being built and tested throughout summer 2008, and will be available to users for production computing in September.


WEEKLY UPDATES


09.02.08 - Original 40 Pleiades Racks on Floor—Additional Racks Start Arriving Next Week
The original 40-rack (20,480 core) Pleiades system has been installed on the NAS computer floor, in an 8-rack and 32-rack configuration. System diagnostics on 32 racks have been completed, along with security clearances. NAS continues working on diagnostics and Linpack testing; on 15,256 cores, the Linpack number came in at 58 teraflops—81.2% efficiency.

Performance/scaling comparisons of applications such as Overflow, CART3D, ECCO, and USM3D on Columbia, RTJones, and Pleiades also continue. One "ultralarge" case (96 million grid points) showed that Pleiades is 40-45% faster than RTJones.

Preparations for the additional 44 Pleiades racks purchased at the end of July are under way, including equipment moves and the installation of 2 new RAIDs, more power distribution units, and associated whips. A first shipment of 16 racks arrived on August 29th, with the next 16 scheduled for delivery on September 12th. In total, Pleiades will comprise 84 racks.
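The rack and core totals quoted in these updates can be cross-checked with a short calculation. The sketch below assumes that every compute rack, including the 44 additional ones, holds the 512 cores per rack stated in the 06.06.08 update; the function and variable names are illustrative only.

```python
# Figures taken from the updates on this page.
CORES_PER_RACK = 512     # per-rack figure from the 06.06.08 update (assumed for all racks)
ORIGINAL_RACKS = 40      # original Pleiades order
ADDITIONAL_RACKS = 44    # second purchase order placed July 28th


def total_cores(racks: int, cores_per_rack: int = CORES_PER_RACK) -> int:
    """Return the core count for a given number of compute racks."""
    return racks * cores_per_rack


original_cores = total_cores(ORIGINAL_RACKS)
total_racks = ORIGINAL_RACKS + ADDITIONAL_RACKS
expanded_cores = total_cores(total_racks)

print(f"{ORIGINAL_RACKS} racks -> {original_cores:,} cores")
print(f"{total_racks} racks -> {expanded_cores:,} cores")
```

Under that assumption the 40-rack system comes to 20,480 cores and the 84-rack system to 43,008 cores, in line with the "over 43,000 cores" quoted for the expanded system.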

Other recent work:

  • Network support for Pleiades installs and system moves continued, and fiber InfiniBand optical cables and copper cables were run from the "service" rack to the 32-rack configuration
  • PBSpro queues were added for six "early" exploration users given accounts on Pleiades, and NAS applications experts are helping resolve initial problems reported
  • NASA Mission Directorate managers were given allocations of compute hours for Pleiades projects for FY2009

08.18.08 - Extensive Testing Begins on First 32 Pleiades Racks
Engineers began an extensive (3–5 week) hardware testing phase on the 32 SGI ICE 8200EX racks installed last week. Diagnostics and benchmarks will ensure that performance scales on this large system—Pleiades is the largest SGI ICE installation to date. So far, testing has resulted in minimal hardware failures. NAS staff also ran experimental workload applications using 4,096 cores to check for system stability.

A new file system that doubles the bandwidth and space for Pleiades was created and tested, and NAS staff worked with SGI to integrate the 300-gigabyte Lustre metadata server and test InfiniBand (IB) modifications.

Additional work this week:

  • Applications staff continued testing, and resolved compiling and linking issues for 12 beta users.
  • Patch panels were relocated to accommodate cooling racks.
  • Nearly 1,800 IB cables were connected to Pleiades.
  • A mailing list was created to inform Pleiades users of upcoming outages and maintenance events, and system shutdown procedures were reviewed.

08.01.08 - Pleiades Gets 32 Additional Systems—More to Come
The NAS facility buzzed with activity this week, with the arrival of 32 SGI ICE 8200EX racks, the installation of six new (and two relocated) power distribution units, installation of a pump package for the Pleiades water cooling system, relocation of 16 Columbia racks to Ames building 233, and general computer floor clean-up. All system deliveries, and electrical and plumbing work for the initial system order, are nearing completion.

A second purchase order associated with the NAS Technology Refresh was placed on July 28th for 44 additional SGI ICE racks—doubling the size of Pleiades to over 43,000 cores. More facilities work began to prepare for an early September delivery of these systems.

Also this week:

  • Applications staff conducted testing of "mpiexec" and are preparing to do stress testing with a representative workload without impacting test users.
  • 48 disk drives on the Lustre metadata server for Pleiades were upgraded from 73 gigabytes (GB) to 300 GB.
  • NAS staff held a tutorial on development work for the Nagios Open Source service and network monitoring program.
  • Online user documentation was updated.
    + See SGI Altix ICE Systems

07.25.08 - Pleiades Testing Nearly Complete, Facilities Work Continues
Pleiades testing continued during the week on the 8 compute cabinets (4096 cores) installed, and is almost complete. The NAS Applications group continues to characterize performance for various applications. One user has been invited to begin testing and there are plans for additional users to be given access next week. Ongoing porting and scaling work shows that ModelE runs 30% faster on Pleiades than on RTJones.

The NAS facility power panels were upgraded to support the additional power requirements of Pleiades, with 1200-amp panels replacing old 800-amp panels. Power outages associated with the facility upgrade project caused some processing issues that were handled quickly.

Thirty-two additional Pleiades racks are scheduled for installation starting August 4. Current plans call for these new racks to be initially tested independently of the 8 racks installed in June. This will allow the original racks to continue serving users and Applications group staff while the 32 new racks are carefully analyzed and tuned for performance.

Additional work this week:

  • Applications staff began porting the MIT General Circulation Model (MITgcm) code for a researcher from NASA's Jet Propulsion Laboratory. MITgcm is designed for study of the atmosphere, ocean, and climate.
  • Intel's Integrated Performance Primitives cryptography library was installed on Pleiades as part of the Intel Fortran Compiler (ifort) v11 beta.
  • SGI engineers conducted InfiniBand router testing on 8 racks of Pleiades.
  • NAS and SGI engineers came up with solutions to handle some issues with the Nexis 9000 file server.

07.17.08 - Pleiades Installation Effort Quickly Overcomes Small Setbacks
The facility upgrade project continued over the past two weeks, highlighted by testing and energizing of the new power complex. Additional power distribution units for Pleiades have been delivered and are being installed. Power panels are being activated. Although this phase of the major power upgrade is taking slightly longer than anticipated, it should be completed this week. Due to the sheer volume and complexity of changes, some unexpected outages occurred. NAS staff acted quickly to ameliorate the effects of the outages.

Application testing on Pleiades continues, and most applications are showing a 20-35% performance improvement over RTJones. A set of NPB performance tests was conducted on the newly installed Pleiades system. The system demonstrated ~30% performance improvement per processor over RTJones across a wide range of processor counts.

Other work:

  • Pleiades PBS accounting data is now collected and can be queried at the level of individual PBS jobs.
  • Saturn (home filesystem) was upgraded and reconfigured to improve performance.
  • The "topo" program was run on 512 nodes of the 4,096-core Pleiades system to measure latency from node to node.
  • Versions of DDSCAT and USM3D have been built and tested on Pleiades.
  • NDMFTS code and Phantom code are being benchmarked on both RTJones and Pleiades.

07.03.08 - Facility Upgrade for Pleiades Completed, Testing Continues
The facility shutdown on June 28th to upgrade the chilled water system to accommodate Pleiades went smoothly, with plumbing and electrical work completed, including installation of a 450-ton chiller. The computing systems returned to service on schedule, and some associated hardware issues have since been handled.

Testing continued, and comparisons were made to theoretical performance numbers. CART3D and FUN3D testing continued, and the latter's results for the sample data set showed that Pleiades is 35-38% faster than RTJones, using 64 cores (8 cores in each of 8 nodes). The NAS Parallel Benchmarks ran successfully on the four newest Pleiades racks, with performance numbers very similar to those obtained previously on the first four racks. Several workload stress tests were run, and verification and validation for each application were completed.

Also this week:

  • Testing of the Pleiades filesystem began, and an approach to storage monitoring is in the works
  • Electrical breakers and panels arrived mid-week, and preparations were made to pour concrete for the pump package due to arrive July 10-11
  • Improvements were made to the NAS Control Room's Heads Up Display
  • A proposed allocation plan for Pleiades was drafted

06.27.08 - Test Work Steps up on Pleiades
All 4,096 cores of the first two Pleiades deliveries are ready for substantial applications testing, now that security checks on the second set of Pleiades systems are complete. The legacy parallel code FUN3D was placed on Pleiades to begin performance testing. FUN3D is integral to aerodynamic design work within NASA and industry. Tests were also run on other workload applications including CART3D. Limited user beta testing should begin in a couple of weeks. More NAS Parallel Benchmark tests were run, and results from MPI1, Fortran, OpenMP, and MPT tests turned up no issues.

All plans and support for the 18-hour outage on June 28th to upgrade the chilled water system were in place. Installation of the remaining Pleiades racks is on schedule, with power distribution units and whips for those racks scheduled to arrive next week.

Other work this week:

  • InfiniBand has been integrated into the first set of 2,048 cores, and 1-GigE connections were installed from Pleiades service nodes to the network switch
  • Portable Batch System scripts were modified for submitting multiple copies of workload applications
  • Preparation began to enable Pleiades as an LDAP client
  • A plan for transitioning some users from Columbia to Pleiades will focus on ensuring users are on the best system for their jobs

06.20.08 - Second Pleiades Systems Installed on Schedule
The second of three Pleiades deliveries arrived on Tuesday, June 17th. The four racks housing 2,048 cores are in place with electrical connections completed. Preliminary testing and evaluation of the first 2,048 cores indicate a performance improvement of 5-30% for Pleiades over RTJones. A new chiller will arrive next week, as part of the electrical-mechanical upgrade required to support Pleiades. An 18-hour outage is scheduled for June 28th from 2 a.m. - 9 p.m. The entire NAS computer room floor will be taken down to upgrade the chilled water system.

Also this week:
  • Columbia node C9 was relocated to building 233, completing the Columbia moves required to make room for all 40 Pleiades compute racks
  • Basic tools for monitoring Pleiades are now available to NAS Control Room staff

06.13.08 - Testing Begins on First Set of Pleiades Nodes
The first 2,048 cores and 11 racks comprising the new 20,480-core SGI Altix ICE system were installed within seven days of delivery on May 23rd. Diagnostics and testing, along with security scanning, were completed in two days. Another 2,048 cores will arrive in mid-June, with remaining hardware installed in July. Facility upgrades to power, cooling, and network systems are on schedule.

Other progress this week includes:

  • Columbia node C10 was moved and reinstalled in building 233
  • The second set of RAID racks was received and installed

06.06.08 - First Pleiades Systems Installed on Schedule
The first 2,048 cores and 11 compute racks comprising the new 20,480-core SGI Altix ICE system were installed within seven days of delivery on May 23rd. Each rack contains 512 processor cores and 512 GB of memory. Diagnostics and initial testing were completed in 2 days. Another 2,048 cores will arrive in June, with remaining hardware installed in July. Facility upgrades to power, cooling, and network systems are on schedule.

Other progress this week includes:

  • Relocation of 1 Columbia node
  • First of two Pleiades RAID systems was installed
  • Initial user documentation for Pleiades was posted online

05.30.08 - Project Underway for Next-Generation HECC System
The purchase order for the next-generation high-end computing system at NAS was signed and placed on May 2, 2008. The system, to be named Pleiades, is part of the NAS Technology Refresh (NTR-08) process. The complete 245-teraflop system will have the newest generation of Quad-Core Intel Xeon processors (3.0 GHz speed) and more than 20,800 gigabytes (GB) of system memory. Energy-smart and space-efficient, the dense, water-cooled SGI Altix ICE system will allow NASA to minimize its impact on the facility in terms of space, energy use, and cooling costs. Facilities planning and preparation for the new system are underway and include:

  • Removal of glass walls
  • Moving and reinstallation of multiple Columbia nodes to accommodate the Pleiades floor configuration
  • Installation of power distribution units required for Pleiades delivery #1
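
As a rough sanity check on the system description above, the 245-teraflop peak can be divided across the 20,480 cores quoted elsewhere on this page to see what per-core rate and clock speed it implies. The figure of 4 floating-point operations per cycle per core is an assumption made for this sketch, not a number from the page, and the names used are illustrative.

```python
PEAK_FLOPS = 245e12       # 245 teraflops, as stated for the complete system
TOTAL_CORES = 20_480      # core count quoted in the 06.06.08 update
FLOPS_PER_CYCLE = 4       # ASSUMPTION: 4 floating-point operations per cycle per core


def implied_clock_ghz(peak_flops: float, cores: int, flops_per_cycle: int) -> float:
    """Clock rate (GHz) implied by peak = cores * clock * flops_per_cycle."""
    return peak_flops / (cores * flops_per_cycle) / 1e9


per_core_gflops = PEAK_FLOPS / TOTAL_CORES / 1e9
clock_ghz = implied_clock_ghz(PEAK_FLOPS, TOTAL_CORES, FLOPS_PER_CYCLE)

print(f"Peak per core: {per_core_gflops:.2f} gigaflops")
print(f"Implied clock: {clock_ghz:.2f} GHz")
```

Under that assumption the stated peak works out to roughly 12 gigaflops per core, which corresponds to a clock rate of about 3 GHz.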


Editor: Jill Dunbar
Webmaster: John Hardman
NASA Official: Rupak Biswas

Last Updated: September 3, 2008