ff1073c779e825a643a3fb8fcb2aa29f17d6574f kent Thu Sep 20 18:42:13 2012 -0700
Updating comment to reflect how code actually works these days.

diff --git src/parasol/paraHub/paraHub.c src/parasol/paraHub/paraHub.c
index e947d64..2cb569a 100644
--- src/parasol/paraHub/paraHub.c
+++ src/parasol/paraHub/paraHub.c
@@ -1,38 +1,48 @@
 /* paraHub - Parasol hub server. This is the heart of the parasol system
  * and consists of several threads - socketSucker, heartbeat, a collection
  * of spokes, as well as the main hub thread. The system is synchronized
- * around a message queue.
+ * around a message queue that the hub reads and the other threads write.
  *
  * The purpose of socketSucker is to move messages from the UDP
  * socket, which has a limited queue size, to the message queue, which
  * can be much larger. The spoke daemons exist to send messages to compute
  * nodes. Since sending a message to a node can take a while depending on
  * the network conditions, the multiple spokes allow the system to be
  * delivering messages to multiple nodes simultaneously. The heartbeat
  * daemon simply sits in a loop adding a heartbeat message to the message
  * queue every 15 seconds or so. The hub thread is responsible for
- * keeping track of everything. The hub thread puts jobs
- * on the job list, moves machines from the busy list to the free list,
- * and calls the 'runner' routines, and appends job results to results
- * files in batch directories.
+ * keeping track of everything.
  *
- * The runner routine looks to see if there is a free machine, a free spoke,
- * and a job to run. If so it will send a message to the spoke telling
- * it to run the job on the machine, and then move the job from the 'pending'
- * to the 'running' list, the spoke from the freeSpoke to the busySpoke list,
- * and the machine from the freeMachine to the busyMachine list. This
+ * The hub keeps track of users, batches, jobs, and machines. It tries
+ * to balance machine usage between users and between batches. If a machine
+ * goes down it will restart, on other machines, the jobs that machine was running.
+ * When a job finishes it will add a line about the job to the results file
+ * associated with the batch.
+ *
+ * A fair bit of the hub's code is devoted to scheduling. It does this by
+ * periodically "planning" what batches to associate with what machines.
+ * When a machine is free it will run the next job from one of its batches.
+ * A number of events, including a new batch of jobs or machines being added
+ * or removed, can make the system decide it needs to replan. The
+ * replanning itself is done in the next heartbeat.
+ *
+ * When the plan is in place, the most common thing the system does is
+ * try to run the next job. It keeps lists of free machines and free spokes,
+ * and for the most part just takes the next machine, a job from one
+ * of the batches the machine is running, and the next free spoke, and sends
+ * a message to the machine via the spoke to run the job. This
  * indirection of starting jobs via a separate spoke process avoids the
  * hub daemon itself having to wait for a response from a compute node
  * over the network.
  *
  * When a spoke is done assigning a job, the spoke sends a 'recycleSpoke'
  * message to the hub, which puts the spoke back on the freeSpoke list.
  * Likewise when a job is done the machine running the job sends a
  * 'job done' message to the hub, which puts the machine back on the
  * free list, writes the job exit code to a file, and removes the job
  * from the system.
  *
  * Sometimes a spoke will find that a machine is down. In this case it
  * sends a 'node down' message to the hub as well as the 'spoke free'
  * message. The hub will then move the machine to the deadMachines list,
  * and put the job back on the top of the pending list.
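
The central structure in the new comment is the message queue that socketSucker, the heartbeat daemon, and the spokes write and that the hub alone reads. Below is a minimal sketch of that pattern using pthreads; it is illustrative only, and the names (hubMessage, hubQueue, hubQueuePut, hubQueueGet) are hypothetical rather than the identifiers paraHub.c actually uses.

#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct hubMessage
/* One message for the hub thread to process, e.g. "heartbeat" or a node report. */
    {
    struct hubMessage *next;
    char *text;
    };

struct hubQueue
/* Unbounded FIFO written by socketSucker, heartbeat, and spoke threads; read by the hub. */
    {
    struct hubMessage *head, *tail;
    pthread_mutex_t mutex;       /* Initialize with pthread_mutex_init before use. */
    pthread_cond_t notEmpty;     /* Initialize with pthread_cond_init before use. */
    };

void hubQueuePut(struct hubQueue *q, char *text)
/* Writer side: append a message and wake the hub if it is waiting. */
{
struct hubMessage *m = calloc(1, sizeof(*m));
m->text = strdup(text);
pthread_mutex_lock(&q->mutex);
if (q->tail != NULL)
    q->tail->next = m;
else
    q->head = m;
q->tail = m;
pthread_cond_signal(&q->notEmpty);
pthread_mutex_unlock(&q->mutex);
}

struct hubMessage *hubQueueGet(struct hubQueue *q)
/* Reader side (hub thread only): block until a message arrives, then dequeue it. */
{
pthread_mutex_lock(&q->mutex);
while (q->head == NULL)
    pthread_cond_wait(&q->notEmpty, &q->mutex);
struct hubMessage *m = q->head;
q->head = m->next;
if (q->head == NULL)
    q->tail = NULL;
pthread_mutex_unlock(&q->mutex);
return m;
}

The point of the indirection, per the comment, is that the kernel's UDP receive buffer is small while an in-memory queue like this can grow as needed, so socketSucker can drain the socket quickly even when the hub is busy.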
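
The "run the next job" paragraph describes what amounts to list bookkeeping plus handing the slow network send to a spoke. The sketch below is again only illustrative; every type and helper in it (machine, batch, spoke, job, spokeTellNode, tryRunNextJob) is invented for the example and does not reflect paraHub's real data structures.

#include <stdio.h>

struct job     { struct job *next; char *cmd; };
struct batch   { struct batch *next; struct job *pendingJobs, *runningJobs; };
struct machine { struct machine *next; char *name; struct batch *plannedBatches; };
struct spoke   { struct spoke *next; int id; };

static void spokeTellNode(struct spoke *spoke, struct machine *mach, struct job *job)
/* Stand-in for handing the "run this job on this node" message to a spoke thread. */
{
printf("spoke %d: start '%s' on %s\n", spoke->id, job->cmd, mach->name);
}

int tryRunNextJob(struct machine **freeMachines, struct machine **busyMachines,
                  struct spoke **freeSpokes, struct spoke **busySpokes)
/* If there is a free machine, a free spoke, and a pending job in one of the
 * machine's planned batches, dispatch one job.  Returns 1 if a job was started. */
{
if (*freeMachines == NULL || *freeSpokes == NULL)
    return 0;
struct machine *mach = *freeMachines;
struct batch *batch;
struct job *job = NULL;
for (batch = mach->plannedBatches; batch != NULL; batch = batch->next)
    if ((job = batch->pendingJobs) != NULL)
        break;
if (job == NULL)
    return 0;                          /* Nothing queued for this machine right now. */
batch->pendingJobs = job->next;        /* Job moves from the pending list... */
job->next = batch->runningJobs;        /* ...to the running list. */
batch->runningJobs = job;
*freeMachines = mach->next;            /* Machine moves from free to busy. */
mach->next = *busyMachines;
*busyMachines = mach;
struct spoke *spoke = *freeSpokes;     /* Spoke moves from free to busy. */
*freeSpokes = spoke->next;
spoke->next = *busySpokes;
*busySpokes = spoke;
spokeTellNode(spoke, mach, job);       /* The spoke, not the hub, waits on the network. */
return 1;
}

In this picture, a later 'recycleSpoke' message would move the spoke back to the free list, and a 'job done' message would do the same for the machine while the hub records the exit status, as the comment describes.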
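
The 'node down' case at the end follows the same list discipline: park the machine on a dead list and put the interrupted job back at the front of its pending queue so it is retried elsewhere. A final hedged sketch, reusing the hypothetical structs from the previous example; handleNodeDown is an invented name, not a paraHub routine.

void handleNodeDown(struct machine **busyMachines, struct machine **deadMachines,
                    struct machine *mach, struct batch *batch, struct job *job)
/* A spoke reported machine mach unreachable while it should have been running job. */
{
struct machine **pm;
struct job **pj;
for (pm = busyMachines; *pm != NULL; pm = &(*pm)->next)
    if (*pm == mach)
        {
        *pm = mach->next;              /* Unlink the machine from the busy list. */
        break;
        }
mach->next = *deadMachines;            /* Park it on the dead machine list. */
*deadMachines = mach;
for (pj = &batch->runningJobs; *pj != NULL; pj = &(*pj)->next)
    if (*pj == job)
        {
        *pj = job->next;               /* Take the job off the running list... */
        break;
        }
job->next = batch->pendingJobs;        /* ...and back onto the top of pending. */
batch->pendingJobs = job;
}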