Difference between revisions of "Clusters"
Revision as of 06:56, 18 January 2014
This page exists for historical reasons since these are the first systems that ever ran LinuxBIOS (what became coreboot).
- 1 SC 2000: The first LinuxBIOS cluster, built at SC 2000, now at LANL
- 2 Ed: A 50 GFlops Compaq DS10 Alpha cluster, LANL
- 3 Geoffrey: Upgrade of the first LinuxBIOS cluster
- 4 SHREK: Simple Highly Reliable Embedded Komputer
- 5 Bento: Cluster-in-a-lunchbox
- 6 DQ: The rebuilt lunchbox
- 7 MCR: Multiprogrammatic Capability Cluster, Lawrence Livermore National Laboratory
- 8 Pinky: Single Evolucity chassis dual 2.4 GHz P4 Myrinet cluster, LANL
- 9 Pinkish: 1 TeraFlop dual 2.4 GHz P4 Myrinet cluster, LANL
- 10 Pink: A Science Appliance, 9.6 TeraFlop 1024-node dual 2.4 GHz P4 Myrinet cluster, LANL
SC 2000: The first LinuxBIOS cluster, built at SC 2000, now at LANL
News Flash! Magnus Svensson is the first (and only) to correctly identify the rack below as an old Digital rack that contained a couple of drives with platter-sized disks and a washing machine style drive.
We took a 16-node cluster to SC00 in Dallas, TX. It was the first LinuxBIOS cluster and despite the fact that Dallas sucked, this cluster sucked less.
The cluster consisted of the following:
- Frontend Node
  - 1 × 4U box with the Acer Aladdin TNT2
- 16 LinuxBIOS-Based Cluster Appliance Nodes
  - 13 × 1U Linux Labs nodes with the SiS 630E chipset
  - 1 × 4U box with the SiS 630E chipset
  - 2 mid-towers with the SiS 630E chipset
- Packet Engines 20-port switch
The frontend node ran the Scyld Beowulf clustering software. The appliance nodes ran LinuxBIOS out of the Millennium Disk on Chip, along with the Scyld Beowulf node boot program. We used the cluster to run various programs (NAS MG, K-Means clustering, 2-D Elasticity, etc.) written in the ZPL programming language.
Setting up the cluster
On the left, uniformed laboratory employees help prepare the cluster before the show floor opens. On the right, Ron burns a Millennium Disk on Chip to complete the cluster.
Our part of the LANL booth
The front end node, the switch and most of the Linux Labs nodes are in the rack on the right. On the left, a VA Linux node running LinuxBIOS sits on top of dirtball, the flash-burnin', all-around utility machine.
Front and side views of the cluster
The majority of the cluster was housed in a rack we got out of lab salvage (a prize to the first person who can identify what machinery the rack came from). The cluster was the only one on the floor to have bestowed upon it the coveted "THIS CLUSTER SUCKS LESS" award from Scyld (see picture on right).
We left one of the Linux Labs nodes open (left) and, following Ollie Lho's lead at ALS in Atlanta, we also left a completely naked node (right) on the table.
Help from our friends
The guys from Scyld came by to eat candy and help us out.
On the left, Ron and Mitch (a colleague from the lab) track down a problem with one of the nodes. On the right, Ron explains the cluster to the Deputy Director of our division, Buck Thompson.
Ed: A 50 GFlops Compaq DS10 Alpha cluster, LANL
10/2/2002: Ed retired after 1.5 years of service. All 96 working and non-working DS10s removed from Ed.
Our second LinuxBIOS/BProc cluster is a 128-node Alpha cluster consisting of 104 single-processor Compaq DS10s, 16 dual-processor API NetWorks CS20s, and 8 four-processor SMP Compaq ES40s.
Ed retired! All 96 DS10s and Myrinet hardware removed from Ed. The remainder of the cluster is set up as a simple 10/100 Ethernet connected cluster for 64-bit development.
Due to the previous day's fire drill, it was decided that we should remove three racks of DS10s: 96 nodes in total (not all working). The Myrinet cards were also removed from all 96 nodes. The nodes will be shipped to Sandia California to live out the rest of their lives as part of their LinuxBIOS cluster. The Myrinet cards will be used in Pinkish.
A couple more DS10s overheat and cause a burning smell that concerns our staff. After consulting the fire department, it is decided to pull the fire alarm and bring in the troops. It turns out that the heatsinks got so hot they burnt the paint off.
The Quadrics hardware is returned to Myricom as a trade-in.
Once the Myrinet hardware arrived, we needed to remove the Quadrics stuff. The bulk of the work was in removing the cabling. Erik commanded the operation with his troop of three minions. These four brave young men set forth on a journey they will not soon forget.
The purchase order for the Quadrics/Myrinet 2000 trade-in is faxed to Myricom.
- We still cannot get working Linux drivers for Elan 3 under BProc. As far as we can tell, you must be running the Quadrics RMS scheduler for even IP to work, and BProc and RMS are fundamentally incompatible. Also, much of the software we need to work with is not Open Source, which makes it difficult for us to test new ideas. That being the case, we have received a quote from Myricom for a trade-in for Myrinet 2000. The quote is currently being approved for purchase.
- The information needed to get LinuxBIOS working on the ES40s, so that the front end and 5 compute nodes could run LinuxBIOS, was never forthcoming from Compaq.
- Charlie Strauss (from LANL's Biology division) has been running his CPU-intensive protein folding codes on Ed continuously since about day one.
Affectionately named Ed, the current cluster consists of the following:
- 104 DS10s booting Linux out of flash (yes, the SRM is gone!)
- ES40 front end, running BProc (but not LinuxBIOS, yet).
- No Quadrics support, yet. Quadrics is working with us on this and we hope to have it soon.
- 104 DS10 nodes with Quadrics interface delivered
- No switches
- Power in machine room not ready
- Minion sacrifice
Notice no CD ROM drive, no floppy drive.
Geoffrey: Upgrade of the first LinuxBIOS cluster
The first LinuxBIOS cluster recently got an upgrade:
LinuxBIOS based on 2.4 kernel
BProc and associated beo-stuff
Linksys Etherfast II managed switch
Snazzy new rack
SHREK: Simple Highly Reliable Embedded Komputer
Technoland sells a cute little embedded board called the EmbSBC 710 that uses the 440BX chipset and other things we know how to handle in LinuxBIOS. It measures about 6" by 8". We will be putting 5 of these in a 2U box.
Bento: Cluster-in-a-lunchbox
4/22/2002: The lunchbox gets a makeover!
Bento, aka the lunchbox cluster, is our newest LinuxBIOS/BProc cluster. Okay, so it's really in a toolbox, so think of it as a lunchbox for the really hungry. Thanks to Rob Armstrong and Mitch Williams of the Embedded Reasoning Institute at Sandia - Livermore for turning us on to this hardware. They're way ahead of us in terms of picking out good, small iron since they're sending theirs up in the nose cone of a missile.
Front-end: IBM Thinkpad T23 (Ron's laptop) running BProc from the Clustermatic W2002 release
7 smartCoreP5 nodes from Digital Logic running LinuxBIOS configured with BProc support
1 (one) naked 3Com 100 Mb hub (removed from its case)
3 IBM Thinkpad 12 V power bricks
1 (one) Master Mechanic yellow plastic toolbox
This is a nice little demo unit to take around the country. It's been through a lot already -- Ron was randomly selected to have his bag searched ("uh, what's that?" said security), and now he's blacklisted forever. But more importantly, it's been a great way for us to get real, kernel-level development work done while traveling. For example, in Houston Matt was able to work on Supermon when (not) in meetings. Ron integrated lm_sensors into Supermon in California. You just can't do that unless you have a cluster that can be easily rebooted (i.e., on-site).
DQ: The rebuilt lunchbox
The lunchbox cluster recently underwent a change. The change was motivated by the need to make better use of the mounting hardware, replace the hub with a switch, and fix cooling issues in the case. We've renamed it "DQ" -- we'll let you guess what that means.
Here's the parts list:
Front-end: Ron's IBM Thinkpad T23 or Erik's IBM Thinkpad X20 or Sung's Sony VAIO Z505JS.
6 smartCoreP5 nodes from Digital Logic running LinuxBIOS configured with BProc support (one fewer than the lunchbox due to the spacers required to make better use of the mounting hardware)
1 (one) NetGear 100 Mb switch (with its own power brick)
2 IBM Thinkpad 12 V power bricks
1 (one) CD storage case
1 pink fuzzy strap