Please note that this is a secondary copy of a file in the gryphon/sched
directory on Dearborn Group CVS, so you may want to check there in case
a later version exists.

Also note that normal Gryphon users should probably _not_ be reading
this file; it is not intended that they should use gcsd directly.


	    Communication with Gryphon Critical Scheduler Daemon
	    ====================================================


Please note that so far gcsd communication has been developed primarily
for the purpose of being controlled by the main gryphon protocol message
scheduler daemon.  Handling of invalid datagrams has not been a priority
and at this stage I can't guarantee gcsd will work perfectly in all
possible situations.  At best gcsd was written in a way not to needlessly
exclude general usage.  General usage by hostile clients is something yet
to be tested and further developed.  The accuracy of this documentation
is also something yet to be checked and updated if necessary.

Communication takes place using unix datagram sockets.  gcsd creates a
socket, "/tmp/gcsd", and then listens for datagram commands from clients,
responding to whichever socket they send from.  Client code using gcsd
should include gcsd.h (currently in the gryphon/sched dir), which contains
the structure for the header used at the beginning of all datagrams sent
from and to gcsd:

struct gcsdhdr
        {
        unsigned char mtype;
        unsigned char a;
        unsigned char b;
        unsigned char c;
        unsigned long jobid;
        };

(note: 'unsigned char' represents a single byte, 'unsigned long' represents
a four byte (32 bit) integer, here)

mtype:

	specifies the purpose of the datagram (mtype - message type).

a, b and c:

	serve as general purpose arguments or are ignored, depending
	on what the mtype is.

jobid:

	serves as a unique reference for a job running on gcsd.

Strangely, it is the client's responsibility to choose a jobid and ensure
that it is unique and doesn't interfere with other clients.  Clients should
use a jobid from (pid * GCSDPIDMULT) to (((pid+1) * GCSDPIDMULT) - 1), where
pid is the client's unix process id (as returned by getpid(2)).  Other
schemes may operate for non-local network clients (when/if implemented) and
for the main gryphon protocol 'gsched' daemon; these, however, should not
conflict with the above scheme.
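
As a rough illustration of the jobid range above, the following sketch
computes the n-th jobid reserved for a given pid.  The helper name
gcsd_jobid_for() is hypothetical, and the value given to GCSDPIDMULT here
is only a placeholder; the real value comes from gcsd.h.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* ASSUMPTION: placeholder value.  The real GCSDPIDMULT is defined in
 * gcsd.h; include that header instead of relying on this fallback. */
#ifndef GCSDPIDMULT
#define GCSDPIDMULT 16u
#endif

/* Hypothetical helper: return the n-th jobid (0 <= n < GCSDPIDMULT) in
 * the range pid * GCSDPIDMULT .. ((pid + 1) * GCSDPIDMULT) - 1. */
uint32_t gcsd_jobid_for(pid_t pid, unsigned n)
{
        assert(n < GCSDPIDMULT);
        return (uint32_t)pid * GCSDPIDMULT + n;
}
```

A client would normally pass getpid() as the first argument and keep its
own count of which of its GCSDPIDMULT slots are in use.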

The meaning of any data following struct gcsdhdr is dependent on mtype.
For performance, and because local operation predominates, host (x86) byte
order is used for all integer parameters.
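
The socket handling described earlier might look roughly like the sketch
below on the client side.  Only the path "/tmp/gcsd" comes from this
document; the function names are hypothetical, and the client's own socket
path is whatever the caller chooses.  The client binds its socket to a path
so that gcsd has an address to respond to.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define GCSD_SOCKET_PATH "/tmp/gcsd"    /* socket created by gcsd */

/* Fill 'sa' with a unix-domain address for 'path'.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int gcsd_make_addr(struct sockaddr_un *sa, const char *path)
{
        assert(sa != NULL && path != NULL);
        if (strlen(path) >= sizeof sa->sun_path)
                return -1;
        memset(sa, 0, sizeof *sa);
        sa->sun_family = AF_UNIX;
        strcpy(sa->sun_path, path);
        return 0;
}

/* Create a datagram socket bound to 'client_path', so that gcsd can
 * reply to it.  Returns the descriptor, or -1 on failure. */
int gcsd_client_socket(const char *client_path)
{
        struct sockaddr_un me;
        int fd;

        if (gcsd_make_addr(&me, client_path) < 0)
                return -1;
        fd = socket(AF_UNIX, SOCK_DGRAM, 0);
        if (fd < 0)
                return -1;
        unlink(client_path);            /* remove a stale socket file */
        if (bind(fd, (struct sockaddr *)&me, sizeof me) < 0) {
                close(fd);
                return -1;
        }
        return fd;
}

/* Send one complete datagram to gcsd.  Returns bytes sent, or -1. */
ssize_t gcsd_send(int fd, const void *buf, size_t len)
{
        struct sockaddr_un to;

        if (gcsd_make_addr(&to, GCSD_SOCKET_PATH) < 0)
                return -1;
        return sendto(fd, buf, len, 0, (struct sockaddr *)&to, sizeof to);
}
```

Responses can then be read from the same descriptor with recv(2) or
recvfrom(2).  The client should unlink its own socket path when it exits.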


Message types:
--------------


mtype=GCSDRUNJOB		(direction = from client to gcsd)

Runs the given job.  The format of the job is exactly the same as the
parameters of gryphon protocol's CMD_SCHED_TX command, except integer values
are in host (x86) byte order and not network byte order.  The job in this
format is given in data immediately following 'struct gcsdhdr' in the same
datagram.  Client must fill in the jobid field as described above.  The
general purpose 'a' field of 'struct gcsdhdr' should be set to the default
device channel.  Fields 'b' and 'c' are currently ignored.  gcsd should
respond with 'GCSDRUNJOBRESP'.
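
A GCSDRUNJOB datagram could be assembled along the following lines.  This
is a sketch under several assumptions: the struct is a local stand-in for
the one in gcsd.h (using uint32_t because 'unsigned long' is eight bytes on
many 64 bit systems), the value given to GCSDRUNJOB is a placeholder, and
the job data is treated as an opaque byte string already laid out in
CMD_SCHED_TX parameter format with host byte order integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct gcsdhdr from gcsd.h, with a fixed-width jobid. */
struct gcsdhdr_sketch {
        unsigned char mtype;
        unsigned char a;
        unsigned char b;
        unsigned char c;
        uint32_t jobid;
};

/* ASSUMPTION: placeholder value; the real one is defined in gcsd.h. */
#ifndef GCSDRUNJOB
#define GCSDRUNJOB 1
#endif

/* Hypothetical helper: write header plus job data into 'out'.
 * Returns the datagram length, or 0 if 'out' is too small. */
size_t gcsd_build_runjob(unsigned char *out, size_t outsz,
                         uint32_t jobid, unsigned char channel,
                         const void *job, size_t joblen)
{
        struct gcsdhdr_sketch h;

        assert(out != NULL && (job != NULL || joblen == 0));
        if (outsz < sizeof h || joblen > outsz - sizeof h)
                return 0;
        memset(&h, 0, sizeof h);        /* 'b' and 'c' are ignored */
        h.mtype = GCSDRUNJOB;
        h.a = channel;                  /* default device channel */
        h.jobid = jobid;                /* host byte order */
        memcpy(out, &h, sizeof h);
        if (joblen > 0)
                memcpy(out + sizeof h, job, joblen);
        return sizeof h + joblen;
}
```

The resulting buffer is sent to gcsd as a single datagram.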


mtype=GCSDRUNJOBRESP		(direction = from gcsd to client)

Indicates whether the request to start the job referred to by 'jobid' was
successful or not.  Field 'a' specifies a success/error code as will be
described later.  Fields 'b' and 'c' should be ignored.  There is no data
following 'struct gcsdhdr'.


mtype=GCSDRUNJOBAT		(direction = from client to gcsd)

Same as GCSDRUNJOB except has an extra four bytes appended to the end
of the job data.  These very last four bytes are interpreted as a
32 bit unsigned integer, which the client should set to the timeslot
value on which it wishes the job to start.  This timeslot is the same
as what GCSDJOBDONE gives; please refer to the GCSDJOBDONE description
below.  The only other difference between GCSDRUNJOBAT and GCSDRUNJOB
is the response returned is GCSDRUNJOBATRESP.  If the timeslot specified
is zero or any other value less than the current timeslot, the job will
start immediately.
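
The extra four bytes can be appended to an already assembled datagram as
in this sketch.  The helper name is hypothetical.  The caller is assumed to
have built the header and job data first, and must also set mtype to
GCSDRUNJOBAT.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: append the start timeslot, in host byte order,
 * to a datagram of 'len' bytes held in a buffer of 'cap' bytes.
 * Returns the new datagram length, or 0 if there is no room. */
size_t gcsd_append_timeslot(unsigned char *dgram, size_t len, size_t cap,
                            uint32_t timeslot)
{
        assert(dgram != NULL);
        if (len > cap || cap - len < sizeof timeslot)
                return 0;
        memcpy(dgram + len, &timeslot, sizeof timeslot);
        return len + sizeof timeslot;
}
```

Passing a timeslot of zero gives the same "start immediately" behaviour
described above.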


mtype=GCSDRUNJOBATRESP		(direction = from gcsd to client)

The equivalent of GCSDRUNJOBRESP for GCSDRUNJOBAT.  The fields are the
same as for GCSDRUNJOBRESP.


mtype=GCSDKILLJOB		(direction = from client to gcsd)

Stops and discards the job specified by 'jobid'.  Fields a, b and c are
ignored.  There is no data.  gcsd should respond with GCSDKILLJOBRESP.


mtype=GCSDKILLJOBRESP		(direction = from gcsd to client)

Indicates whether the request to kill the job referred to by 'jobid' was
successful or not.  Like GCSDRUNJOBRESP, field 'a' specifies a success/error
code as will be described later.  Fields 'b' and 'c' should be ignored.
There is no data.


mtype=GCSDKILLALL		(direction = from client to gcsd)

Kills all running jobs.  This should not be used, as it is very unfriendly
towards other clients, especially the main gryphon protocol 'gsched' daemon,
which maintains its own record of jobs running on gcsd.  There is no data
and all header fields apart from mtype are ignored.  gcsd may or may not
respond with an acknowledgment (currently it doesn't).


mtype=GCSDJOBDONE		(direction = from gcsd to client)

Provides notification that the job specified by 'jobid' will shortly
finish.  Fields a, b and c are ignored.  There are 4 bytes of data:
a 32 bit unsigned integer representing the timeslot at which the job will
finish.  Local clients may obtain the current timeslot by use of the
(currently undocumented) custom 'csched()' system call, which will
increment at the frequency of the critical scheduler (currently approx
1000 times a second), as long as gcsd is currently running one or more
jobs (any jobs, system wide).  In order to reduce the likelihood of
timeslot wrap (a condition that occurs after about 45 days of
continuous running which is currently catastrophic to gcsd), gcsd may
halt incrementing the timeslot counter and reset it to zero after a
few seconds of not being given any jobs to run.  If this is a problem,
clients can just ensure that gcsd is never left without a job to run
for more than a moment (currently about 10 seconds as #define'd by
OVERRUN in gcsd.c, but if it matters to clients, I see no reason for
clients to leave gcsd without a job for more than a second).
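
A client might pick a received GCSDJOBDONE datagram apart as sketched
below.  The helper name and the value given to GCSDJOBDONE are
placeholders, and the offsets assume the eight byte header layout described
earlier (four single-byte fields followed by a four byte jobid).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: placeholder value; the real one is defined in gcsd.h. */
#ifndef GCSDJOBDONE
#define GCSDJOBDONE 7
#endif

#define GCSD_HDR_LEN 8u         /* mtype, a, b, c, then 4-byte jobid */

/* Hypothetical helper: extract jobid and finishing timeslot from a
 * received datagram.  Returns 0 on success, or -1 if the datagram is
 * not a well-formed GCSDJOBDONE message. */
int gcsd_parse_jobdone(const unsigned char *dgram, size_t len,
                       uint32_t *jobid, uint32_t *timeslot)
{
        assert(dgram != NULL && jobid != NULL && timeslot != NULL);
        if (len != GCSD_HDR_LEN + 4 || dgram[0] != GCSDJOBDONE)
                return -1;
        memcpy(jobid, dgram + 4, 4);                    /* host order */
        memcpy(timeslot, dgram + GCSD_HDR_LEN, 4);      /* host order */
        return 0;
}
```

The timeslot obtained this way is the kind of value a client could pass
back in a later GCSDRUNJOBAT request.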


Success/error codes:
--------------------

GCSDOK		Success.
GCSDERR		Unspecified error.
GCSDNOMEM	Command failed due to lack of free memory.
GCSDNOSPACE	Not enough space left to store job's messages.
GCSDINVAL	Invalid argument.
GCSDLENINVAL	Invalid length.
GCSDUNEXPECT	Unexpected error.
GCSDNOSUCHJOB	No such jobid.
GCSDJOBIDUSED	Client specified a jobid that is already being used.