On Stumbling Blocks  //  Tuesday, September 16, 2008

Last time I covered the /home directory mounted via NFS. In the last week, I have accomplished a lot. Here is what's happened:


1 - Installed MPICH2
I actually got MPICH2 compiled and installed. The install guide that ships with MPICH2 is fantastic. I must say, however, that the walkthrough I mentioned previously was less than helpful in my case. The version differences in both the operating system and MPI were such that it served more as a rough set of guidelines than as a reference.

I installed MPICH2 on disseminate/grid1 in /mpi. The source code can be found in /mpi/mpich2-1.0.7-src, and I built the code into /mpi/mpich2-1.0.7 in the standard fashion:

# run from within /mpi/mpich2-1.0.7-src
./configure --prefix=/mpi/mpich2-1.0.7 \
1>>~/mpich2-conf.log
make && sudo make install
Afterwards, I chown'ed the files to root, and created version-agnostic mpich2 symlinks pointing at the versioned mpich2-1.0.7* directories, as follows:
lrwxrwxrwx  1 root root   14 2008-09-06 18:43 mpich2 -> ./mpich2-1.0.7
drwxr-xr-x 12 root root 4096 2008-09-06 20:02 mpich2-1.0.7
drwxr-xr-x 12 root root 4096 2008-09-06 18:41 mpich2-1.0.7-src
lrwxrwxrwx  1 root root   17 2008-09-07 16:26 mpich2-src -> mpich2-1.0.7-src/
I've found that this tends to make life easier for future upgrades. In fact, I've seen this approach taken in many different UNIX environments, and it consistently works well. This way, I can install a newer version of MPICH2 alongside the current one, test it out, and once everything is squared away, all I have to do is move a symlink.
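
For instance, a future upgrade (the 1.0.8 version number here is hypothetical) boils down to a single command once the new build checks out:

sudo ln -sfn ./mpich2-1.0.8 /mpi/mpich2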

After all that, I exported /mpi via NFS to the other nodes in the grid. That way, each system can mount the same /mpi, and has access to the same versions of MPICH2.
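
I haven't copied the exact export line here, but it would look much like the /home export from last week's post. Something along these lines in /etc/exports ought to do it (the options are a sketch; read-only should suffice, since the other nodes only need to run the binaries):

/mpi 10.XX.XX.0/24(ro,sync,no_subtree_check)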

2 - Set up NIS for authentication

The next step was to allow each user to log in to any grid node they want with the same credentials. Since we're using a homogeneous Linux environment, I decided that NIS was the way to go. Setup was surprisingly simple, which is just another example of why I like Ubuntu. To install NIS, all I did was run the following:

sudo apt-get install nis
More to follow on that when I get around to posting it. The most important thing I learned about NIS is that when all your user accounts are NIS accounts, MAKE SURE YOU HAVE A root PASSWORD SET ON ALL NODES! That is all.
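
In practice, that's one command on each node:

sudo passwd root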

I cron'ed a job on the master that rebuilds the NIS database every 15 minutes. That means that I can create an account on disseminate/grid1, and within 15 minutes, that user will be able to log into any of the nodes with the same credentials. That - coupled with the NFS-mounted /home - makes each node a virtually identical environment.
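
The crontab entry amounts to something like this (a sketch; on Ubuntu, the NIS maps live in /var/yp, and running make there rebuilds them):

*/15 * * * * cd /var/yp && make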

3 - First-Time Login Wizard

I wrote a log-in script for first-time logins that sets up a user's environment to use MPI/MPD and to ssh between grid nodes without entering their passwords. Granted, passwordless SSH is necessary for MPD anyway, but it's convenient regardless. I should probably post the code for those scripts here at some point, but I intend to put up an image of the actual node OS when I'm done, so you can look at it then.
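
In the meantime, here's a rough sketch of what such a script might do. This is not the actual code; the key type and the secret-word scheme are placeholders:

#!/bin/sh
# Generate an SSH key pair and authorize it for ourselves. Since /home
# is NFS-mounted everywhere, this grants passwordless SSH to every node.
if [ ! -f "$HOME/.ssh/id_rsa" ]; then
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    ssh-keygen -q -t rsa -N '' -f "$HOME/.ssh/id_rsa"
    cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
fi
# MPD wants a per-user ~/.mpd.conf containing a secret word, mode 600.
if [ ! -f "$HOME/.mpd.conf" ]; then
    echo "secretword=$(whoami)-$$" > "$HOME/.mpd.conf"
    chmod 600 "$HOME/.mpd.conf"
fi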

Well, that's all I've got for now. I think for next week, I'm going to work on building a .deb file for the setup I've developed for the node. That way, I can just dpkg -i elon-grid-node.deb and it'll set everything up for me. Maybe I'll even set up my own APT repository...
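
For the curious, a bare-bones binary package is less work than it sounds. Assuming a staging directory laid out like the target filesystem, something like this would produce the .deb (control fields abbreviated):

mkdir -p elon-grid-node/DEBIAN
cat > elon-grid-node/DEBIAN/control <<EOF
Package: elon-grid-node
Version: 0.1
Architecture: all
Maintainer: Christian Funkhouser
Description: Grid node setup for the Elon cluster
EOF
dpkg-deb --build elon-grid-node   # produces elon-grid-node.deb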

Cheers!

-CF



posted by Christian @ 5:51 PM

Install, NFS and Shared Keys  //  Tuesday, September 9, 2008

Good evening, ladies and gentlemen. I have made more progress on the Grid, and I believe I am formulating a workable methodology for actually deploying the machines once I've figured everything out. I'll cover that at the end, but first, here's what happened today:

I began the night by comparing and contrasting LVM against standard partitioning schemes, and also did some reading on the differences between ReiserFS, XFS, JFS, and EXT3. I decided on LVM and EXT3. LVM offers the ability to dynamically resize my partitions, which could come in handy should I ever (however unlikely) run out of disk space on the existing partition. It also allows me to take VG snapshots of the filesystem while it is running, without taking anything down. That will make for useful on-the-fly backups. I chose EXT3 because it is the most stable and widely-supported filesystem for Linux.
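
As a sketch of what the snapshot feature buys me (the volume group and LV names here are hypothetical):

# Snapshot the live volume, back it up, then throw the snapshot away.
sudo lvcreate --size 1G --snapshot --name home-snap /dev/grid-vg/home
sudo mount -o ro /dev/grid-vg/home-snap /mnt/snap
sudo tar czf /backups/home.tar.gz -C /mnt/snap .
sudo umount /mnt/snap
sudo lvremove -f /dev/grid-vg/home-snap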

In order to test this setup, I moved a second box into the lab and stacked it right on top of the first. Next, I configured DNS for both systems' static IPs and applied the settings in /etc/network/interfaces. I'll spare you the details, because if you've seen one, you've seen them all. I've kept the name disseminate for the first box, dubbing the second system mete.

Then I configured /etc/hosts to include the gridN hostnames, plus a few necessary system hosts. This I will post, obfuscating IPs as I see fit.

# Me
127.0.0.1 localhost
127.0.1.1 disseminate.cs.elon.edu disseminate

# Grid hosts
10.XX.XX.1 grid1 nfs-host
10.XX.XX.2 grid2

# Necessary System Information
10.NN.NN.1 xdc1 ntp-host

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
I didn't touch any of the IPv6 stuff that comes with Ubuntu, both because we don't use it, and because I haven't bothered to learn anything about IPv6 yet (the latter very well may be a symptom of the former). This file was taken from disseminate, but is essentially the same as the one from mete.

I cron'ed a job to sync the clock automatically at midnight against the campus-wide NTP server. Also, for development purposes, I have set up disseminate/grid1 as the system's NFS host. Because of that (and although I will eventually set the two up in tandem), grid2 will be the system whose image gets distributed to the other machines.
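
The entry is a one-liner (assuming the ntpdate package is installed, and using the ntp-host alias from /etc/hosts above):

0 0 * * * ntpdate ntp-host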

That said, I blew away the /home directory on mete/grid2, and exported /home on disseminate/grid1 via NFS. Mete's /etc/fstab was modified to mount appropriately on boot. The effect was splendid. Here is the entry in disseminate's /etc/exports:

/home 10.XX.XX.0/24(rw,sync,no_subtree_check,no_root_squash)

This will allow all systems on the Grid's VLAN to mount /home, and will allow root to be all rootlike if it wants to.

Here is mete's /etc/fstab entry for /home:
nfs-host:/home /home nfs rw,suid 0 0
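
To test it without rebooting, something like the following should do (assuming the fstab entry above is in place):

sudo mount /home
df -h /home   # should now report nfs-host:/home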


As trivial as it is, I will not discuss how I went about setting up SSH Shared Keys. I'm sure a simple Google search could turn up much more useful tidbits than I wish to divulge here.

For coming posts, look forward to how I am going to do cross-system credential sharing (or whatever the technical jargon for that is). I really like LDAP, but I feel like my supervisor would frown on such a complex solution. Also, I will begin tackling the MPICH2 install problem.

Here's to sleep,

-CF

posted by Christian @ 7:46 PM

MPICH2 and a phone call  //  Monday, September 1, 2008

After getting the nod from my supervisor, I am moving forward with MPICH2. Unfortunately, I understand that the only Ubuntu-packaged version of MPICH is 1-something-ubuntu, so I'll be installing from source. On the up side, there's a fantastic walkthrough I found on how to accomplish this and set up my MPICH2 cluster. Also, there's some great stuff in there about NFS and SSH Shared-Key authentication, which is my intended destination as well. It's good to see I'm not the only one trying to do these things with Ubuntu.

My supervisor expressed concerns over the way Ubuntu server is set up for development, specifically that it's not. In a previous post, I mentioned that finding all the development packages was a tedious task. Well, I had another RTFM moment, and discovered a package called build-essential, which contains everything essential to (you guessed it) building code. It was a good day.
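
One command pulls in gcc, g++, make, and the libc development headers:

sudo apt-get install build-essential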

In other news, I received a phone call from Canonical on Friday. They just wanted to check in on how I was using Ubuntu and how I was liking it. The gentleman on the other end sounded excited to hear that I'm using it for a University's Grid, and told me he would be sending me an email so we could establish contact. Sadly, I never received said email, and there is no way in hell I'm returning a call to a +44 number, so I'll just have to let it go.

In the meantime, I hear tell I'm going to receive a key to the "server room," so I will be able to start moving things about as I see fit, and maybe posting some pictures.

I'll let you know.

-CF



posted by Christian @ 7:10 AM
