Message Passing Interface


I've been mucking around in the world of distributed programming recently. In my defense, I was investigating this topic BEFORE Scott got his hands on a stack of rack-mount computers from Intel. But I will admit the sight of all those idle CPUs put the angst into my code-writing fingers. So I've gone ahead and actually written a distributed application.

A little background on distributed computing. First off, we need the obligatory "Linux is awesome." There is a kernel module for Linux called MOSIX that will transparently migrate CPU/memory-bound processes away from the machine they were launched on, and bring them back whenever they need to make a system call. That's awesome.

Next I need to introduce the term "grain size". In a parallel application, there are usually sections that cannot be parallelized in between the easily parallel parts. For example, in a genetic algorithm, a population of candidate solutions is held in a single list. That one list needs to be modified by exactly one process for things like member mutation and crossovers. This activity happens once per "generation" and is reasonably quick to do. When that is over, each member needs to be evaluated for fitness before the next generation can be processed. This evaluation happens one member at a time, and there is no interaction between the members.
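To make that concrete, here's roughly what one generation looks like in a single-process version (the names here are made up for illustration):

    /* One generation of a genetic algorithm, single-process version.
     * The breeding step has to run alone; the evaluation loop is the
     * part that parallelizes cleanly. */
    for (int gen = 0; gen < MAX_GENERATIONS; gen++) {
        mutate_and_crossover(population);       /* serial: one list, one writer */
        for (int i = 0; i < POP_SIZE; i++)      /* parallel: members don't interact */
            population[i].fitness = evaluate(&population[i]);
    }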

In a single-process application, the time the fitness evaluation will take is computed by multiplying the time it takes to evaluate one member by the number of members. If we had enough processors, however, we could theoretically reduce this to the time it takes to compute just one evaluation. This length of time is the grain size of the application. The smaller the grain, the quicker (theoretically) the parallel job can be finished. The limitation on this notion comes from the speed of networking: the overhead in distributed computing is moving data over the network. If the grain size is too small, all the processes spend more time waiting for the network to move bits around than crunching numbers.
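To put numbers on it: say a population has 100 members and each evaluation takes 2 seconds. A single process needs 100 x 2 = 200 seconds per generation, while 100 processors could in principle finish in about 2 seconds, the grain size, plus whatever time the network eats moving data around.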

The MOSIX approach doesn't perform well when the grain size of a parallel application is smaller than about 10 to 20 seconds. This is because the process starts on a local machine, and there is an algorithm waiting for your process to qualify as "CPU bound." Then, while it's in the middle of crunching numbers, the process is paused, wrapped up, and shipped to a remote computer where it can get its work done in peace. When the process needs a system resource, it's paused again and shipped home to do its work. A programmer whose grain size is too small is likely to see worse performance with MOSIX than with a plain single-process application.

I'd like to see an improvement in performance down to 1- and 2-second grains. Enter stage left: MPI, the Message Passing Interface. After running 'emerge openmpi' I had the libraries and programs necessary to compile and run an MPI-capable application. I love Portage. After tooling around with "hello world" MPI style, I knew I had the right tool for the job.
[Scott Notes: Open MPI is included as a stock package in many distributions these days including your editor's favorites, those based on Red Hat.]
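For the curious, "hello world" MPI style really is just a few lines. Something like this, compiled with mpicc and launched with mpirun (both ship with Open MPI):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);                  /* start the MPI machinery */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which copy am I? */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many copies are there? */
        printf("hello from process %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }

Launch it with 'mpirun -np 4 ./hello' and all four copies report in.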

MPI is commonly considered a "low-level" distributed computing library. It really only gives basic communication tools. It's up to the developer to make those tools work in an application. My goal was to use any configuration of networked computers to improve the performance of a single application, such as a genetic algorithm that is evolving neural networks.

Enter stage right: the thread pool. A thread pool is a common parallel programming tool, used on multiprocessor systems to spread a set of jobs across multiple cores. I just needed to make it work with MPI, and I would have a distributed computing framework.

A thread pool works by using a single queue of jobs to feed a static set of threads. A static thread is one that sticks around. Starting and killing threads for every piece of work is slow, because the operating system has bookkeeping to do each time. If a programmer can keep the same threads active and well fed with new jobs, the application will run faster than one that spawns a new thread every time a job needs to be done.
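In C with pthreads, the bones of a pool look something like the sketch below. This isn't my actual code, and pool_start/pool_submit are names I'm inventing for the example:

    #include <pthread.h>
    #include <stdlib.h>

    #define NUM_THREADS 4

    /* A job is just a function pointer plus its argument. */
    typedef struct job {
        void (*run)(void *);
        void *arg;
        struct job *next;
    } job_t;

    static job_t *queue_head = NULL;
    static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t queue_ready = PTHREAD_COND_INITIALIZER;

    /* Each static worker sleeps until a job shows up, runs it, and
     * goes back for more.  No thread is created or destroyed after
     * startup, which is the whole point. */
    static void *worker(void *unused)
    {
        for (;;) {
            pthread_mutex_lock(&queue_lock);
            while (queue_head == NULL)
                pthread_cond_wait(&queue_ready, &queue_lock);
            job_t *j = queue_head;
            queue_head = j->next;
            pthread_mutex_unlock(&queue_lock);

            j->run(j->arg);    /* do the work outside the lock */
            free(j);
        }
        return NULL;
    }

    void pool_start(void)
    {
        pthread_t tid;
        for (int i = 0; i < NUM_THREADS; i++)
            pthread_create(&tid, NULL, worker, NULL);
    }

    void pool_submit(void (*fn)(void *), void *arg)
    {
        job_t *j = malloc(sizeof(*j));
        j->run = fn;
        j->arg = arg;
        pthread_mutex_lock(&queue_lock);
        j->next = queue_head;    /* LIFO for brevity; a real pool wants FIFO */
        queue_head = j;
        pthread_mutex_unlock(&queue_lock);
        pthread_cond_signal(&queue_ready);
    }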

Hopefully the design of my application is starting to become apparent. The program itself runs in parallel on all the MPI machines that are configured; this is managed entirely by the MPI support programs. Inside the program, each copy knows its index number (its "rank," in MPI-speak), and rank zero is the master node. The master node runs all the non-parallel code and uses a thread pool to control how jobs are delivered to the other copies of the program.
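The skeleton of that arrangement is short. Here's a sketch, with run_master and run_worker standing in for the real code. One wrinkle worth knowing: if the master's pool threads are all going to make MPI calls at once, you have to ask for MPI_THREAD_MULTIPLE up front instead of calling plain MPI_Init, and how well an MPI implementation honors that request varies:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, provided;

        /* Request full thread support, since several pool threads
         * on the master will talk to MPI concurrently. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0)
            run_master(size - 1);    /* serial code plus the thread pool */
        else
            run_worker();            /* loop forever serving jobs */

        MPI_Finalize();
        return 0;
    }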

Each thread in the thread pool takes a job's data and sends it to a particular remote process. One thread -> one remote process. The thread on the master node doesn't do any actual work; it just waits for the remote process to finish and send back the results. It's a lot like a remote procedure call (RPC), only faster. Lots faster.
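Each side of that conversation is a matched send and receive. Here's the idea in sketch form; evaluate, GENOME_LEN, and the flat array of doubles as the job format are all stand-ins for illustration:

    #include <mpi.h>

    /* Master side: one pool thread runs this per job.  Ship the
     * candidate to worker rank 'dest', then block until the fitness
     * score comes back. */
    void evaluate_remotely(double *genome, int len, int dest, double *fitness)
    {
        MPI_Send(genome, len, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
        MPI_Recv(fitness, 1, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    /* Worker side: receive a job, crunch it, send the answer home. */
    void run_worker(void)
    {
        double genome[GENOME_LEN], fitness;

        for (;;) {
            MPI_Recv(genome, GENOME_LEN, MPI_DOUBLE, 0, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            fitness = evaluate(genome);    /* the expensive part */
            MPI_Send(&fitness, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }
    }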

That's it. There are still some bugs to work out, and my gosh if my code isn't ugly, then I'd hate to see what ugly code looks like.

Ta-ta for now,
chad



No, you don't need to know this.

This is the only blog I write to, and I don't write very often. I believe in the open source model, and the GPL, and everything that Linux stands for. That means keeping developments open.
This site seems like the best place for me to describe my work in Linux. On the practical side, posting here counts as prior art. I may think what I've written is obvious, but the last thing I want is to get in trouble later because someone decided to patent it. From my perspective, everything I do is simple and obvious... you know?
So yeah, I know most of the readers on this board aren't programmers, but that's a good thing. The world needs a lot more testers than developers.... hehehehe.

chad




Chad,

Even tho I'm not a programmer, I read your blog just because you posted here, and I enjoyed it. I don't normally stop to read a programmer's blog on other sites due to language barriers. I enjoyed your in-depth coverage and plain-English style. I encourage you to write more!


Message Passing Interface

Thanks Chad...

I have just room enough in my grey matter to absorb this once, but if I reread it, I could absorb more... but my head would explode.

So I will leave with this added comment. Did I really need to know this?

Peace................ed

