Hurd Hacking Guide
******************
+------+
| |
,--- | | <----. ``The GNU Hurd is the GNU project's
,' | | `-. replacement for the Unix kernel.
| +------+ `. The Hurd is a collection of servers
v | `. that run on the Mach micro-kernel to
+------+ | +------+ | implement file systems, network
| | `. | | | protocols, file access control, and
| | `--> | | | other features that are implemented
| | | | | by the Unix kernel or similar kernels
+------+ +------+ ,' (such as Linux).''
^ | _,'
| +------+ +-' ---
| | | ,-'|
`. | | -' ,' This is the
| | | |
`. +------+ ,' H u r d H a c k i n g G u i d e
`. ,'
`----------' Version 0.2_1 - Mar 25, 2002
Copyright (C) 2001, 2002, 2005, 2007 Free Software Foundation, Inc.
Written by Wolfgang Jährling, .
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.1 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts, and
with no Back-Cover Texts. For details see
.
1 About this document
*********************
1.1 Conventions
===============
The Version of this document follows the convention _. This means that 0.2_7 is the seventh
release since Hurd 0.2 and a version like 0.4_1 would be the first
release for Hurd 0.4 (which, of course, is not yet available :)).
'$(HURD)' means the top directory of the 'Hurd' source tree.
'$(GNUMACH)' means the top directory of the 'GNU Mach' source tree.
'$(MIG)' means the top directory of the 'MIG' source tree. '$(GLIBC)'
means the top directory of the 'GNU Libc' source tree.
Single shell commands start with a '$' for user commands and a '#'
for root commands:
$ diff -u libtrivfs.old/open.c libtrivfs/open.c
# reboot
Prompts of other programs look as they do in the respective
application. For example, the GDB prompt is indicated by '(gdb)':
(gdb) break trivfs_S_io_write
I will try to obey the GNU Coding Standards in my C examples.
1.2 Topic
=========
This document is an introduction to GNU Hurd and Mach programming.
The purpose of this guide is to help interested people start hacking
the Hurd or extending it (by writing translators). It gives lots of
references to the Hurd or GNU Mach source files. It is recommended
that you read through some of these sources. Indeed the Hurd sources
are very well written and commented and you can learn a lot by reading
them.
The Hurd looks very complex and hard to learn -- at a first glance.
But it isn't, because you don't need to understand everything at once,
you may do it slowly and step-by-step and can apply your existing
knowledge. There are also libraries that make hacking of certain
common kinds of translators easy. I think that the only problem is
the absence of nice documentation like the "Linux Module Programming
Guide" and such, which makes it possible to get into it step by step.
This document tries to fill that gap.
Mach and MIG are not handled in depth here, so if you want specific
information on them, I recommend reading the GNU Mach Reference Manual
(1) and the documentation about MIG available on the internet.
The Hurd Hacking Guide is not intended to be a complete reference,
but it ought to help you getting started. The only real reference at
the time of writing is the Hurd source code.
The order of chapters in this document was originally partly based
on a mail from Farid Hajji (2), which in turn seems to be based on
'$(HURD)/doc/navigating'.
---------- Footnotes ----------
(1)
(2)
1.3 Feedback
============
Don't hesitate to send me improvements, corrections and extensions.
You are of course welcome to correct my lousy english -- I'm not a
native speaker/writer. (Thanks to Alfred M. Szmidt for his
corrections, BTW.)
I would be happy to hear about how understandable this text is.
Did you "get" everything? Was some part confusing? Send me feedback!
Actually, I wrote this to document everything I learned about the
Hurd, so that I later could quickly lookup a detail that I had
forgotten. This means that if you send me extensions, I will also
profit from them.
+------+
| | B e a
,-> | Hurd | -. p a r t o f
,' | | `. i t
| +------+ v
2 Requirements
**************
1. You should know at least basic things about the Hurd (1).
2. More specific knowledge is of course welcome (2).
3. It would be good to know what a tranlators is (3).
4. The sources of Hurd and GNU Mach are also useful:
mkdir ~/hurd-cvs
cd ~/hurd-cvs/
cvs -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/hurd \
co hurd
cvs -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/hurd \
co -r gnumach-1-branch gnumach
5. A GNU/Hurd installation certainly will help you, too (4).
6. Diving into the header files of the Hurd's libraries is
dangerous, because you can drown very easily, because you won't
find the way out of all these data structures. Maybe a list of
where to find what might be helpful, e.g. the output of
$ cd $(HURD) && egrep '^struct [^;*]+$' */*.h
7. If you know the principles of Mach, what MIG is, etc., then this
will of course help you a lot, but it should not be necessary.
Knowing about the Linux kernel might also help to some degree.
8. Oh, and you should know the C programming language. :-)
---------- Footnotes ----------
(1)
(2)
(3)
(4)
3 A Short Overview of Hurd and Mach
***********************************
"We're way ahead of you here. The Hurd has always been on the
cutting edge of not being good for anything." (Roland
McGrath(1))
"In short: just say NO TO DRUGS, and maybe you won't end up like
the Hurd people." (Linus Torvalds(2))
Every seasonable piece of software needs a method for communication
between components. Nowadays, things like CORBA or Mozilla's XPCOM
are used for that. The advantage of the Hurd over other systems is,
that it provides such a facility and does not require existing
applications to be modified to take advantage of its communication
framework. How does the Hurd reach this goal?
"The Hurd, at its most central core, is just the protocols that
the cooperating servers use." (Thomas Bushnell(3))
In a Mach environment, communication between programs is mostly
done by sending messages through so-called "ports", which are a kind
of message queue. For each port, there is one task with
receive-permission (i.e. this task receives the messages someone
sends to this port). Other tasks might have a send-permission or a
send-once-permission (which is used for getting a reply from a server,
because ports are one-way channels) for this port, or even no
permission at all.
If you read through '$(GNUMACH)/include/mach/port.h', you may
notice that there are more port rights: A send or send-once right is
turned into a "dead name" if the receive right is destroyed. So you
can't use the dead name right for anything, it's merely a
place-holder. Another port right is "port set", which Marcus
Brinkmann explains as follows (4):
"A port set is a set of ports. It is useful to combine ports
into a port set if you just want the next message on any of the
ports you have a receive right for. In the Hurd, we use port
classes and buckets provided by libports, though.
Well, we are not exactly strict in our wording when talking about
ports. [...] You can think of port rights as capabilities
associated with ports and port sets, if you want."
How can a process find a specific port? It's really simple:
through the file system. For example, the services of an ext2-server
are available through the node where the file system he handles was
"mounted" (please note that there is no such thing as mounting in the
Hurd world, this is what it would be called under Unix; the correct
term for the GNU/Hurd operating system would be "setting a
translator").
Your favourite email client doesn't support random signatures?
Write a random signature translator (5) (which returns a new signature
each time you read from it). And the best thing is: now all email
clients may use this feature! Do you see how the Hurd encourages code
reuse? Do you see how GNU/Hurd does not require programs to be
modified to take advantage of most of the nifty features it provides?
If you would like to learn more about the Hurd filesytem, I highly
recommend reading the presentation "The Hurd" (6).
We can say that the file system is the name space for services,
this also true in the other direction: the name-space for services is
the file system. This a very important thing to understand. While
the file system is the canonical way to get a port, there are other
ways as well; for example, you can get a port in a message.
If you are wondering why I compared this kind of communication with
CORBA, the following quote from the paper "Towards a New Strategy of
OS Design" might help you understand the reason:
"With translators, the filesystem can act as a rendezvous for
interfaces which are not similar to files. Consider a service
which implements some version of the X protocol, using Mach
messages as an underlying transport. For each X display, a file
can be created with the appropriate program as its translator. X
clients would open that file. At that point, few file operations
would be useful (read and write, for example, would be useless),
but new operations (XCreateWindow or XDrawText) might become
meaningful. In this case, the filesystem protocol is used only
to manipulate characteristics of the node used for the
rendezvous. The node need not support I/O operations, though it
should reply to any such messages with a message_not_understood
return code."
---------- Footnotes ----------
(1) REFERENCE missing
(2) REFERENCE missing
(3)
(4) see
for the complete discussion
(5) Or better use the already existing 'run' translator, which was
written my Marcus Brinkmann and is much more flexible; using the
existing filemux translator would also be possible
(6)
4 Basics of Mach and MIG
************************
4.1 Mach ports
==============
Now we will take a look at some Mach details. Yes, I know that you
would like to start writing translators as soon as possible, but you
really should know at least the basics of Mach. Mach ports are used
extensively throughout the Hurd, so let's start with them.
First, let's make the distinction between ports, port rights and
port names clear. Marcus Brinkmann once wrote (on IRC):
"'mach_port_t' is a port name, the port name denotes an entry in
the tasks port name space, which is associated with either a dead
name, a send-once right, or a combination of a receive right and
a send right with potentially many user references.
Assume you have 'mach_port_t 5', and want to send a message to
it. To send a message, you pass the task 'mach_task_self()', the
port name, the msgid and the arguments. The task is used to get
the ipc name space, the port name is used to find the entry in
this name space. The entry tells Mach about the port rights you
have for the associated port with this port name.
You have a single port name for all receive/send rights you might
have for a port. But you have distinct port names for send once
rights, because this is easier for Mach to manage -- and for user
programs, too."
Mach defines the type 'natural_t', which is the native type of the
machine, e.g. 32 bits on a 32-bit processor. 'natural_t' is always
unsigned, but there is a signed variant called 'integer_t'. The
definitions for the i386 platform can be found in
'$(GNUMACH)/i386/include/mach/i386/vm_types.h'. In this file you can
find other interesting types as well, but those are not important for
us right know.
Like Unix file descriptors, Mach port names are plain, boring
integers. In the file '$(GNUMACH)/include/mach/port.h' you can see
the following definitions:
typedef natural_t mach_port_t;
typedef mach_port_t *mach_port_array_t;
(For most data types in Mach, an additional '*_array_t' type is
defined.) The port number identifies a unique port (in the namespace
of the task), thus a 'mach_port_t' value is often refered to as "port
name".
A value of 'MACH_PORT_DEAD' (i.e. '~0') represents a port right
that has died, while 'MACH_PORT_NULL' (quoting 'port.h') "indicates
the absence of any port or port right."
You may check with 'MACH_PORT_VALID (port)' if a port is neither of
these two values.
For the port rights, we have the macros with the names
'MACH_PORT_RIGHT_RECEIVE', 'MACH_PORT_RIGHT_SEND',
'MACH_PORT_RIGHT_SEND_ONCE', 'MACH_PORT_RIGHT_PORT_SET',
'MACH_PORT_RIGHT_DEAD_NAME' and 'MACH_PORT_RIGHT_NUMBER' which are of
type 'mach_port_right_t':
typedef natural_t mach_port_right_t;
This type is used whenever we need to act on a particular port
right. Often, however, we want to carry around a set of rights,
because we may have multiple rights on a single port. For this, we
use another type:
typedef natural_t mach_port_type_t;
typedef mach_port_type_t *mach_port_type_array_t;
A 'mach_port_type_t' variable may either carry the value
'MACH_PORT_TYPE_NONE', which of course represents an empty set of
rights, or a set of the macros 'MACH_PORT_TYPE_SEND',
'MACH_PORT_TYPE_RECEIVE' etc. combined with a bitwise or. There are
also several predefined combinations:
Macro Combination
MACH_PORT_TYPE_SEND_RECEIVE MACH_PORT_TYPE_SEND,
MACH_PORT_TYPE_RECEIVE
MACH_PORT_TYPE_SEND_RIGHTS MACH_PORT_TYPE_SEND,
MACH_PORT_TYPE_SEND_ONCE
MACH_PORT_TYPE_PORT_RIGHTS MACH_PORT_TYPE_SEND_RIGHTS,
MACH_PORT_TYPE_RECEIVE
MACH_PORT_TYPE_PORT_OR_DEAD MACH_PORT_TYPE_PORT_RIGHTS,
MACH_PORT_TYPE_DEAD_NAME
MACH_PORT_TYPE_ALL_RIGHTS MACH_PORT_TYPE_PORT_OR_DEAD,
MACH_PORT_TYPE_PORT_SET
Don't confuse the 'MACH_PORT_RIGHT_*' with the 'MACH_PORT_TYPE_*'
macros. They have similar names, but different meanings as well as
different values.
More details on Mach IPC can be found in the GNU Mach Reference
Manual, Chapter 4 ("Inter Process Communication") (1).
---------- Footnotes ----------
(1)
4.2 Threads and Tasks
=====================
4.3 MIG
=======
Making RPCs (Remote Procedure Calls) by sending Mach messages is not
trivial. Making it easier is the purpose of MIG (Mach Interface
Generator). You must write an interface definition file, feed it into
MIG and then it outputs two C sources and a header file, which do the
Mach port magic for you. Then you can send messages with simple
function calls.
Of course, you only need to do that if you want to define your own
interfaces. The Hurd already contains various interfaces. The
apropriate functions are in glibc, thus you don't have to specify any
special flags if you want to use them.
The syntax of MIG files is similar to Pascal and should not be hard
to understand. We will talk about some details later.
5 The Hurd Interfaces
*********************
Understanding concepts is a very important thing, but this alone will
not enable you to get any work done. You also need to know about the
interfaces which make it possible to use those concepts.
You can find the interface definitions in the '$(HURD)/hurd/*.defs'
files. Important are 'io.defs', 'password.defs', 'fsys.defs',
'fs.defs' and 'auth.defs'. The 'login.defs' interface is nice, but
not implemented so far and maybe it will never be. The '*_reply.defs'
interfaces are -- of course -- for reply messages.
You definitively should take a look at the Hurd interfaces now. In
the next chapter, we will see how one uses those interfaces.
6 How does this look in practice?
*********************************
6.1 Writing a file to standard output
=====================================
Let's take a closer look at a program that dumps a file to its
standard output in a "hurdish" way. Please note that the non-Hurd way
may still be used. Glibc still provides functions like 'write ()' on
GNU/Hurd systems. The Hurd generic parts of glibc are in
'$(GLIBC)/hurd/', Mach dependent parts are in
'$(GLIBC)/sysdeps/mach/hurd/'.
/* dump.c - Dump a file to stdout in a "hurdish" way.
*
* Copyright (C) 2001, 2002, 2007 Free Software Foundation, Inc.
*
* Written by Wolfgang Jährling .
*
* Distributed under the terms of the GNU General Public License.
* This is distributed "as is". No warranty is provided at all.
*/
#define _GNU_SOURCE 1
#include
#include
#include
#include
#include
#include
int
main (int argc, char *argv[])
{
file_t f;
mach_msg_type_number_t amount;
char *buf;
error_t err;
if (argc != 2)
error (1, 0, "Usage: %s ", argv[0]);
/* Open file */
f = file_name_lookup (argv[1], O_READ, 0);
if (f == MACH_PORT_NULL)
error (1, errno, "Could not open %s", argv[1]);
/* Get size of file (buggy! See below) */
err = io_readable (f, &amount);
if (err)
error (1, err, "Could not get number of readable bytes");
/* Create buffer */
buf = malloc (amount + 1);
if (buf == NULL)
error (1, 0, "Out of memory");
/* Read */
err = io_read (f, &buf, &amount, -1, amount);
if (err)
error (1, errno, "Could not read from file %s", argv[1]);
buf[amount] = '\0';
mach_port_deallocate (mach_task_self (), f);
/* Output */
printf ("%s", buf);
return 0;
}
You may compile this with:
$ gcc -g -o dump dump.c
But let's look at the interesting pieces of this program:
#define _GNU_SOURCE 1
You should always define this macro for GNU/Hurd specific programs.
Otherwise, you will not even be able to compile them. Define it
before including any headers. Alternatively, you might want to pass
'-D_GNU_SOURCE' to gcc, by adding it to 'CPPFLAGS', for example.
file_t f;
A 'file_t' is actually a 'mach_port_t', but we use 'file_t' to make
clear we use this to open a file.
char *buf;
You may have noticed that we will use this buffer in a situation
where we should pass a 'data_t' (which is typedef'ed as a 'char *'),
but '$(HURD)/hurd/hurd_types.h' states about 'data_t' and several
other types:
"These names exist only because of MIG deficiencies. You should
not use them in C source; use the normal C types instead."
error_t err;
There is also the Mach type 'kern_return_t', but in the Hurd,
'error_t' is the better choice. Marcus Brinkmann explains:
"kern_return_t is the mach error type from mach_msg for example,
so if you call an RPC, and want to do it in a Mach compatible
fashion, use kern_return_t. BUT on the Hurd, we use error_t,
because that is compatible with the glibc error types. You can
always cast from kern_return_t to error_t on GNU systems."
if (argc != 2)
error (1, 0, "Usage: %s ", argv[0]);
Just in case you are not familiar with the 'error ()' function, I
will quote '/include/error.h' (Remember we don't use /usr in the Hurd
:-)):
/* Print a message with `fprintf (stderr, FORMAT, ...)';
if ERRNUM is nonzero, follow it with ": " and strerror (ERRNUM).
If STATUS is nonzero, terminate the program with `exit (STATUS)'. */
extern void error (int status, int errnum, const char *format, ...);
Now the actual action starts. We try to open the file with the
glibc function 'file_name_lookup ()'. This function returns
'MACH_PORT_NULL' if the attempt failed. 'O_READ' is a GNU extension
and is the same as the POSIX constant 'O_RDONLY'. In the same way,
'O_WRITE' is identical to 'O_WRONLY'.
/* Open file */
f = file_name_lookup (argv[1], O_READ, 0);
if (f == MACH_PORT_NULL)
error (1, errno, "Could not open %s", argv[1]);
Of course we could read the file character-by-character, but in
this example we assume that it is a "normal" file and may be read at
once, so we use 'io_readable ()' to find out about the size of the
file (this won't work for files like '/dev/random'! 'io_readable ()'
only tells us how much data is available right now). Then we allocate
a buffer for the whole file:
/* Get size of file */
err = io_readable (f, &amount);
if (err)
error (1, err, "Could not get number of readable bytes");
/* Create buffer */
buf = malloc (amount + 1);
if (buf == NULL)
error (1, 0, "Out of memory");
Now all we have to do is reading the file into the buffer. We put
a 0-byte at the end, so we will be able to use the file content as a
string (we assume that the file does not contain any 0-bytes, which is
bad style, but in this case, we don't care).
/* Read */
err = io_read (f, &buf, &amount, -1, amount);
if (err)
error (1, errno, "Could not read from file %s", argv[1]);
buf[amount] = '\0';
Note that we pass '&buf', which is a pointer to a pointer. 'buf'
itself might get modified, but this decision is up to the receiver of
the 'io_read' message.
If you look at '$(HURD)/hurd/io.defs', you will probably wonder
that 'io_read' only has four arguments, while we passed five.
'io_object', 'data', 'offset' and 'amount' are written down in the
'.defs' file, while '/include/hurd/io.h' has an additional argument
(called 'dataCnt') after data, which makes perfectly sense: we also
need to get the information which amount of data we actually got.
Adding this argument is done automatically by MIG.
We are finished using the file, so we can close it. This is done
by deallocating the port:
mach_port_deallocate (mach_task_self (), f);
That's all.
6.2 Creating a copy of a file
=============================
Now we will try to copy a file. But this time, we will do it right:
if the translator that provides the file takes while to deliver new
data, we will wait that while. So we will read until we reach a real
EOF. We know that we reached the end of the file if our call to
'io_read ()' gives us zero bytes of data.
/* copy.c - Copy a file in a "hurdish" way.
*
* Copyright (C) 2001, 2007 Free Software Foundation, Inc.
*
* Written by Wolfgang Jährling .
*
* Distributed under the terms of the GNU General Public License.
* This is distributed "as is". No warranty is provided at all.
*/
#define _GNU_SOURCE 1
#include
#include
#include
#include
#include
#include
#define BUFLEN 10 /* Arbitrary */
int
main (int argc, char *argv[])
{
file_t in, out;
mach_msg_type_number_t rd_amount, wr_amount;
char *buf, *ptr;
error_t err;
if (argc != 3)
error (1, 0, "Usage: %s ", argv[0]);
/* Create buffer */
buf = malloc (BUFLEN + 1);
if (buf == NULL)
error (1, 0, "Out of memory");
/* Open files */
in = file_name_lookup (argv[1], O_READ, 0);
if (in == MACH_PORT_NULL)
error (1, errno, "Could not open %s", argv[1]);
out = file_name_lookup (argv[2], O_WRITE | O_CREAT | O_TRUNC, 0640);
if (out == MACH_PORT_NULL)
error (1, errno, "Could not open %s", argv[2]);
/* Copy */
while (1)
{
/* Read */
err = io_read (in, &buf, &rd_amount, -1, BUFLEN);
if (err)
error (1, err, "Could not read from file %s", argv[1]);
if (rd_amount == 0)
break;
/* Write */
ptr = buf;
do
{
err = io_write (out, ptr, rd_amount, -1, &wr_amount);
if (err)
error (1, err, "Could not write to file %s", argv[2]);
rd_amount -= wr_amount;
ptr += wr_amount;
}
while (rd_amount);
}
mach_port_deallocate (mach_task_self (), in);
mach_port_deallocate (mach_task_self (), out);
return 0;
}
Interesting parts are:
out = file_name_lookup (argv[2], O_WRITE | O_CREAT | O_TRUNC, 0640);
Here we open the output file and create it, if it does not exist,
with permissions being 0640 (which is of course 'rw-r---') minus the
umask.
/* Read */
err = io_read (in, &buf, &rd_amount, -1, BUFLEN);
if (err)
error (1, err, "Could not read from file %s", argv[1]);
if (rd_amount == 0)
break;
As we said above: if we couldn't read any bytes, this indicates
that we reached the end of the file. It does not mean that there is
no data available at the moment: if we would read from '/dev/random'
for example, and there would be no data to read at the moment, this
call would not tell us that there is no data, but would block until
new data is available.
/* Write */
ptr = buf;
do
{
err = io_write (out, ptr, rd_amount, -1, &wr_amount);
if (err)
error (1, err, "Could not write to file %s", argv[2]);
rd_amount -= wr_amount;
ptr += wr_amount;
}
while (rd_amount);
There is no guarantee that 'io_write ()' accepts all data we
attempt to feed into it immediately, so we might need to retry, but
only with the data that was not accepted until now.
6.3 Final notes
===============
Finaly I would like to note that on the GNU/Hurd system there is no
reason not to use the nice non-standard extensions of GCC. For
example, nested functions are used frequently throughout the Hurd
sources. It might be helpful to know about these extensions, so you
should probably do
$ info gcc "C Extensions"
A nice example for "hurdish" code is '$(HURD)/init/init.c', so you
should also take a look at that. You maybe won't understand all of
it, but that doesn't matter. More sources you might want to read are
in '$(HURD)/utils/' and '$(HURD)/sutils/'.
7 The Hurd Libraries (Overview)
*******************************
There are several libraries which make writing translators of various
kinds easier.
libtrivfs is used for "trivial" translators. In this case, trivial
translators means all translators that provide only a single file
(node), as opposed to a complete directory or even a file system. As
the first translators any new Hurd hacker will develop are simple
single-file translators, this library is the first you should learn
about.
libnetfs is the library for complete file systems where the
translator does not directly control the underlying data, as is the
case in ftpfs, nfs and unionfs(1), for example. libnetfs will
probably be renamed to libfsserver.
libdiskfs is also for complete filesystems, but it is used in the
case where the translator controls the underlying data. Examples are
ext2fs, UFS and tmpfs.
libtreefs is defunct. It was never finished. Nobody uses it.
Neither should you.
libports provides functions for working with ports. It can also be
seen as an abstraction of what functionality the Hurd expects from a
message-passing system.
libstore: the following explanation can be found in the libstore
header file: "A 'store' is a fixed-size block of storage, which can be
read and perhaps written to. This library implements many different
backends which allow the abstract store interface to be used with
common types of storage -- devices, files, memory, tasks, etc. It
also allows stores to be combined and filtered in various ways."
libiohelp: "Library providing helper functions for io servers."
libthreads is the cthreads library. This library comes from the
Mach microkernel and was developed before the POSIX threads standard
existed. We will have POSIX threads in the future, but currently this
library is used for multithreading.
libihash provides integer-keyed hash table functions.
libps: "Routines to gather and print process information."
libshouldbeinlibc: Nomen est omen. :-)
---------- Footnotes ----------
(1) unionfs is not yet part of the Hurd, but a partly working
implementation exists.
8 An Example using trivfs
*************************
8.1 GNU/Linux and GNU/Hurd
==========================
Before we take a closer look at how to use trivfs, let's see how this
would be done on a GNU/Linux system. I won't explain the GNU/Linux
example in much detail, because it is not very important for us, but
since lots of people are familar with Linux (kernel) coding, it might
be helpful to compare how things are done on GNU/Linux as opposed to
GNU/Hurd.
Writing a Linux kernel module for a device file like '/dev/one'
(which, of course, gives you infinite ones if you read from it is easy
in theory. A module for kernel 2.4.x providing a special file might
look like this:
/* linux-one.c - Linux kernel module for /dev/one.
*
* Copyright (C) 2000, 2001, 2007 Free Software Foundation, Inc.
*
* Written by Wolfgang Jährling .
*
* Distributed under the terms of the GNU General Public License.
* This is distributed "as is". No warranty is provided at all.
*/
#include
#include
#include
#include
#include
#define ONE_NAME "one"
#define ONE_MAJOR 100 /* Major device file number */
static int is_opened = 0;
/* Someone wants to open the file */
static int
device_open (struct inode *inode, struct file *file)
{
/* We allow only one simultaneous usage */
if (is_opened)
return -EBUSY;
is_opened = 1;
MOD_INC_USE_COUNT; /* Module can't be unloaded now */
return 0; /* Could be opened */
}
/* The file is closed again */
static int
device_release (struct inode *inode, struct file *file)
{
is_opened = 0;
MOD_DEC_USE_COUNT; /* Module may be unloaded now */
return 0;
}
/* Somebody wants to have lots of one's */
static ssize_t
device_read (struct file *file, char *buf, size_t len, loff_t *offset)
{
int i;
static char one = '1';
for (i = 0; i < len; i++)
if (copy_to_user (&buf[i], &one, 1))
return -EFAULT;
return len;
}
/* Now he/she wants to write something... */
static ssize_t
device_write (struct file *file, const char *buf, size_t len,
loff_t *offset)
{
/* ...but we don't care */
return len;
}
/* Let's put the supported operations in a structure */
struct file_operations one_operations =
{
NULL, /* Owner module... wonder what this means :-) */
NULL, /* seek */
device_read,
device_write,
NULL, /* readdir */
NULL, /* poll */
NULL, /* ioctl */
NULL, /* mmap */
device_open
NULL, /* flush */
device_release,
NULL, NULL, NULL, NULL, NULL /* Some others */
};
/* This is automatically called when the module is loaded */
int
init_module (void)
{
int result = register_chrdev (ONE_MAJOR, ONE_NAME, &one_operations);
if (result < 0) /* Could not register character device */
{
printk (KERN_ERR "Couldn't register device: %d.\n", result);
return result;
}
printk (KERN_INFO "Loading the %s module.\n", ONE_NAME);
return 0;
}
/* This gets called when unloading the module */
void
cleanup_module (void)
{
int result = unregister_chrdev (ONE_MAJOR, ONE_NAME);
if (result < 0)
printk (KERN_ERR "Couldn't unregister device: %d.\n", result);
else
printk (KERN_INFO "Unloading the %s module.\n", ONE_NAME);
}
We simply implement the usual operations like 'read ()', 'close ()'
and 'open ()', put pointers to these functions in a struct and
register this as a character device.
In the next step, we would compile this into an object file and
load it as root with insmod(8) and create the device file with
# mknod /dev/one c 100 1
This is simple to understand -- but hard to put into practice, for
(at least) four reasons: first, a small mistake in the code might
cause a kernel panic; second, you need root privileges to do this at
all; third, you can't use functions provided by the GNU C library, let
alone other helpful libraries like the GLib; fourth, you need an
unused device number (I used 100 above and hoped that nobody else used
that before).
With the Hurd, things work different though. Of course, the
superstructure looks quite different, but also (and more importantly)
the environment of the code is much more friendly: it's the 'normal'
user space we all know and love. This effectively means that you can
develop a translator almost like any other program. The only
disadvantage is that the programming interface is a bit more complex
than the one of the Linux kernel. But if you understood the GNU/Linux
example above, you won't have problems with the following translator,
which implements the same functionality.
In fact, it provides much more functionality, because libtrivfs
forces us to implement more; when writing a real world translator, you
should also do option parsing, because sometimes users only ask for
'--help', but I tried to keep this example translator as simple as
possible. When writing programs for the GNU system, we recommend
parsing options with the argp functions(1).
---------- Footnotes ----------
(1) see "info libc Argp"
8.2 Implementing trivfs callback functions
==========================================
In the GNU/Hurd system, the usual Unix system calls are provided by
the GNU C Library. The GNU C Library wrapps them to messages and
sends them to the respective ports. This means that you won't write
direct implementations for functions like 'read ()', but one needs
functions with the arguments of, say, 'io_read ()'. When using the
trivfs library, we have to implement routines with slightly different
arguments. For example, the arguments of 'trivfs_S_io_read ()', as
the name of such a function would be when using libtrivfs, are:
Type Name Description
'struct trivfs_protid 'cred' Credentials
*'
'mach_port_t' 'reply' The port where the reply will be
sent
'mach_msg_type_name_t' 'reply_type'The rights we have on the above
port
'vm_address_t *' 'data' Pointer to the place where you
should write you reply data to
'mach_msg_type_number_t''data_len' Here you should store, how much
data you actually return.
Initialy, this is set to the size
of the already available memory
at '*data'.
'loff_t' 'offs' Seek a position. If 'offs' is
-1, use the internal file
pointer. Ignore it if the object
is not seekable.
'mach_msg_type_number_t''amount' How much data you should write
The 'trivfs_S_io_read ()' function of the "Hello, world" translator
(see '$(HURD)/trans/hello.c') is a nice example for how to implement
such a function. The implementation of our "one" translator will be a
less complete example.
This is how our function looks like:
error_t
trivfs_S_io_read (struct trivfs_protid *cred,
mach_port_t reply, mach_msg_type_name_t reply_type,
data_t *data, mach_msg_type_number_t *data_len,
loff_t offs, mach_msg_type_number_t amount)
{
/* Deny access if they have bad credentials. */
if (!cred)
return EOPNOTSUPP;
else if (! (cred->po->openmodes & O_READ))
return EBADF;
if (amount > 0)
{
int i;
/* Possibly allocate a new buffer. */
if (*data_len < amount)
*data = mmap (0, amount, PROT_READ|PROT_WRITE, MAP_ANON, 0, 0);
/* Copy the constant data into the buffer. */
for (i = 0; i < amount; i++)
(*data)[i] = '1';
}
*data_len = amount;
return 0;
}
This is the most complex callback function of our translator. The
others are much simpler.
You should always return 'EOPNOTSUPP' (Operation not supported) if
'cred' is faulty and 'EBADF' (Bad file descriptor) if the necessary
bit is not set in the open mode.
If the user wants to read more bytes (the number in 'amount') than
the buffer can hold, we have to allocate some more memory. Do you
remember that we had to pass a pointer to the pointer to our buffer,
when calling 'io_read ()'? This was done exactly for this reason. We
are allocating the memory with 'mmap ()' here, because we want a page
aligned memory block. Now we made sure that we've got enough space
where we can write the data into. This means we can begin filling in
the ones. Note that '*data' is a 'vm_address_t' and we have to cast
it into a pointer before we can use it as such.
If you understand the above function, I doubt you will have
troubles with the following write routine, which does almost nothing:
kern_return_t
trivfs_S_io_write (struct trivfs_protid *cred,
mach_port_t reply, mach_msg_type_name_t replytype,
data_t data, mach_msg_type_number_t datalen,
loff_t offs, mach_msg_type_number_t *amout)
{
if (!cred)
return EOPNOTSUPP;
else if (!(cred->po->openmodes & O_WRITE))
return EBADF;
*amout = datalen;
return 0;
}
Apart from the usual error checking it only claims that all data
the user of our translator (i.e. the program which opened the file
our translator implements) wanted to write was successfully written --
which makes perfect sense, because we ignore all data someone writes
into our file.
There are several callbacks we will implement in a similar way like
the write function. You can find these functions in the complete
source code of our translator.
Another callback routine is 'trivfs_S_io_readable ()'. It will be
called if somebody wants to know how much data we can deliver
immediately. Of course we can provide an infinite number of bytes
directly. And as Marcus Brinkmann told me (1):
"If you can deliver an unlimited number of bytes without
blocking, I think the highest possible value that fits in
mach_msg_type_number_t seems to be appropriate. (I hope
applications can deal with that)."
Well, if something can go wrong, it will go wrong. For example,
our example program 'dump.c' above would handle such a situation in a
very ungraceful way. This is why I wrote the following
implementation, which is a bit paranoid and does not cause problems if
an application is unable to handle a huge value in a sane way:
kern_return_t
trivfs_S_io_readable (struct trivfs_protid *cred,
mach_port_t reply, mach_msg_type_name_t replytype,
mach_msg_type_number_t *amount)
{
if (!cred)
return EOPNOTSUPP;
else if (!(cred->po->openmodes & O_READ))
return EINVAL;
else
*amount = 10240; /* Dummy value: 10k */
return 0;
}
The last interesting callback function is 'trivfs_S_io_select ()',
which is well commented in the complete source bellow.
---------- Footnotes ----------
(1) for the complete mail see
8.3 Other trivfs callbacks
==========================
We have to do some more work before we have a complete trivfs
translator: we must define some symbols that hold general information
about our translator.
/* Trivfs hooks. */
int trivfs_fstype = FSTYPE_MISC; /* Generic trivfs server */
int trivfs_fsid = 0; /* Should always be 0 on startup */
In most cases, you might want to set 'trivfs_fstype' to
'FSTYPE_MISC'. Other possible values are (descriptions from
'$(HURD)/hurd/hurd_types.h'):
1. 'FSTYPE_IFSOCK': 'PF_LOCAL' socket naming point
2. 'FSTYPE_DEV': GNU Special file server
3. 'FSTYPE_TERM': GNU Terminal driver
In 'trivfs_allow_open', you specify the initial permissions for
your translator:
int trivfs_allow_open = O_READ | O_WRITE;
And with the following three variables, you specify what kinds of
accesses are actually implemented:
/* Actual supported modes: */
int trivfs_support_read = 1;
int trivfs_support_write = 1;
int trivfs_support_exec = 0;
8.4 The main function
=====================
Translators are normal programs, and as such, they need a 'main ()'
function. The trivfs library does not define such a function, so we
have to do this on our own. Our program may be started as a normal
program or as a translator, so we have to distinguish between these
two cases. This can be done by testing if our bootstrap port is
'MACH_PORT_NULL'. If it is, the program was not started as a
translator. In most cases, a translator might simply abort in this
case. If, however, our bootstrap port is not 'MACH_PORT_NULL', we
should initialize libtrivfs and deallocate the bootstrap port. In the
last step, we will launch the translator. Our 'main ()' function
(which does not process any command line arguments) looks like this:
int
main (void)
{
error_t err;
mach_port_t bootstrap;
struct trivfs_control *fsys;
task_get_bootstrap_port (mach_task_self (), &bootstrap);
if (bootstrap == MACH_PORT_NULL)
error (1, 0, "Must be started as a translator");
/* Reply to our parent */
err = trivfs_startup (bootstrap, 0, 0, 0, 0, 0, &fsys);
mach_port_deallocate (mach_task_self (), bootstrap);
if (err)
error (1, err, "trivfs_startup failed");
/* Launch. */
ports_manage_port_operations_one_thread (fsys->pi.bucket,
trivfs_demuxer, 0);
return 0;
}
You may have wondered why our functions had such disgusting names
like 'trivfs_S_io_read ()'. At least by now you should know the
answer: We don't need to register our functions anywhere, we only have
to give them the apropriate names and libtrivfs will do the rest for
us. Of course, the function names actually have a meaning, as Marcus
Brinkmann explains:
"The io_read is the name of the RPC as in the .defs file. The S_
prefix means it is the _S_erver stub implemented here, rather
than the message packaging/unpacking functions. The trivfs_
prefix means that this is not the bare RPC, but the mach_port_t
is actually converted to a credential struct cred (or so). This
is done at the INTRAN.
Usually, you get the mach_port_t as first argument, but in
libtrivfs stubs, you get a different. Grep for intran in
'$(HURD)/libtrivfs/*' and you will see how ports are mapped to
credentials."
8.5 The complete source
=======================
Okay, that's all, folks. I left out some unimportant details, which
you may look up in the following complete listing of 'hurd-one.c'.
You can compile this file with
$ gcc -g -o one hurd-one.c -ltrivfs -lports
Other sources you might want to look at are '$(HURD)/trans/hello.c'
and '$(HURD)/trans/null.c'.
/* hurd-one.c - A trivial single-file translator
Copyright (C) 1995, 1996, 1997, 1998, 1999, 2001, 2002, 2007
Free Software Foundation, Inc.
Written by Wolfgang Jährling , 2001
This is based on hurd/trans/hello.c. The hello.c source says:
Copyright (C) 1998, 1999, 2001 Free Software Foundation, Inc.
Gordon Matzigkeit , 1999
It also uses parts of hurd/trans/null.c. The null.c source says:
Copyright (C) 1995,96,97,98,99,2001 Free Software Foundation, Inc.
Written by Miles Bader
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2, or (at
your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330,
Boston, MA 02111-1307 USA */
#define _GNU_SOURCE 1
#include
#include /* exit () */
#include /* Error numers */
#include /* O_READ etc. */
#include /* MAP_ANON etc. */
/* Trivfs hooks. */
int trivfs_fstype = FSTYPE_MISC; /* Generic trivfs server */
int trivfs_fsid = 0; /* Should always be 0 on startup */
int trivfs_allow_open = O_READ | O_WRITE;
/* Actual supported modes: */
int trivfs_support_read = 1;
int trivfs_support_write = 1;
int trivfs_support_exec = 0;
/* May do nothing... */
void
trivfs_modify_stat (struct trivfs_protid *cred, io_statbuf_t *st)
{
/* .. and we do nothing */
}
error_t
trivfs_goaway (struct trivfs_control *cntl, int flags)
{
exit (EXIT_SUCCESS);
}
error_t
trivfs_S_io_read (struct trivfs_protid *cred,
mach_port_t reply, mach_msg_type_name_t reply_type,
data_t *data, mach_msg_type_number_t *data_len,
loff_t offs, mach_msg_type_number_t amount)
{
/* Deny access if they have bad credentials. */
if (!cred)
return EOPNOTSUPP;
else if (!(cred->po->openmodes & O_READ))
return EBADF;
if (amount > 0)
{
int i;
/* Possibly allocate a new buffer. */
if (*data_len < amount)
*data = mmap (0, amount, PROT_READ|PROT_WRITE, MAP_ANON, 0, 0);
/* Copy the constant data into the buffer. */
for (i = 0; i < amount; i++)
((char *) *data)[i] = 1;
}
*data_len = amount;
return 0;
}
kern_return_t
trivfs_S_io_write (struct trivfs_protid *cred,
mach_port_t reply, mach_msg_type_name_t replytype,
data_t data, mach_msg_type_number_t datalen,
loff_t offs, mach_msg_type_number_t *amout)
{
if (!cred)
return EOPNOTSUPP;
else if (!(cred->po->openmodes & O_WRITE))
return EBADF;
*amout = datalen;
return 0;
}
/* Tell how much data can be read from the object without blocking for
a "long time" (this should be the same meaning of "long time" used
by the nonblocking flag. */
kern_return_t
trivfs_S_io_readable (struct trivfs_protid *cred,
mach_port_t reply, mach_msg_type_name_t replytype,
mach_msg_type_number_t *amount)
{
if (!cred)
return EOPNOTSUPP;
else if (!(cred->po->openmodes & O_READ))
return EINVAL;
else
*amount = 10000; /* Dummy value */
return 0;
}
/* Truncate file. */
kern_return_t
trivfs_S_file_set_size (struct trivfs_protid *cred, mach_port_t reply,
mach_msg_type_name_t reply_type, loff_t size)
{
if (!cred)
return EOPNOTSUPP;
else
return 0;
}
/* Change current read/write offset */
error_t
trivfs_S_io_seek (struct trivfs_protid *cred, mach_port_t reply,
mach_msg_type_name_t reply_type, loff_t offs, int whence,
loff_t *new_offs)
{
if (! cred)
return EOPNOTSUPP;
else
return 0;
}
/* SELECT_TYPE is the bitwise OR of SELECT_READ, SELECT_WRITE, and
SELECT_URG. Block until one of the indicated types of i/o can be
done "quickly", and return the types that are then available.
TAG is returned as passed; it is just for the convenience of the
user in matching up reply messages with specific requests sent. */
kern_return_t
trivfs_S_io_select (struct trivfs_protid *cred,
mach_port_t reply, mach_msg_type_name_t replytype,
int *type, int *tag)
{
if (!cred)
return EOPNOTSUPP;
else
if (((*type & SELECT_READ) && !(cred->po->openmodes & O_READ))
|| ((*type & SELECT_WRITE) && !(cred->po->openmodes & O_WRITE)))
return EBADF;
else
*type &= ~SELECT_URG;
return 0;
}
/* Well, we have to define these four functions, so here we go: */
kern_return_t
trivfs_S_io_get_openmodes (struct trivfs_protid *cred, mach_port_t reply,
mach_msg_type_name_t replytype, int *bits)
{
if (!cred)
return EOPNOTSUPP;
else
{
*bits = cred->po->openmodes;
return 0;
}
}
error_t
trivfs_S_io_set_all_openmodes (struct trivfs_protid *cred,
mach_port_t reply,
mach_msg_type_name_t replytype,
int mode)
{
if (!cred)
return EOPNOTSUPP;
else
return 0;
}
kern_return_t
trivfs_S_io_set_some_openmodes (struct trivfs_protid *cred,
mach_port_t reply,
mach_msg_type_name_t replytype,
int bits)
{
if (!cred)
return EOPNOTSUPP;
else
return 0;
}
kern_return_t
trivfs_S_io_clear_some_openmodes (struct trivfs_protid *cred,
mach_port_t reply,
mach_msg_type_name_t replytype,
int bits)
{
if (!cred)
return EOPNOTSUPP;
else
return 0;
}
int
main (void)
{
error_t err;
mach_port_t bootstrap;
struct trivfs_control *fsys;
task_get_bootstrap_port (mach_task_self (), &bootstrap);
if (bootstrap == MACH_PORT_NULL)
error (1, 0, "Must be started as a translator");
/* Reply to our parent */
err = trivfs_startup (bootstrap, 0, 0, 0, 0, 0, &fsys);
mach_port_deallocate (mach_task_self (), bootstrap);
if (err)
error (1, err, "trivfs_startup failed");
/* Launch. */
ports_manage_port_operations_one_thread (fsys->pi.bucket,
trivfs_demuxer, 0);
return 0;
}
9 Debugging a translator
************************
This chapter requires you to know how to use GDB, the GNU Debugger.
If you did not use GDB before, I recommend reading the sample session
chapter in the GDB Texinfo documentation. If info and the
documentation are installed on your system, simply do
$ info gdb "Sample Session"
Ok, so now how does one debug a translator? It's pretty obvious,
but I will explain it anyway. :-) The easiest way is to start your
program as an active translator:
$ gcc -g -o one one.c -ltrivfs -lports
$ settrans -ac foo one
Now the translator is up and running. You can see it in the
process list:
$ ps Aux
(We don't have POSIX 'ps' at the moment, so 'ps aux' won't work,
sorry.) Now we need to attach to the running (respective waiting)
process. For example, if the PID was 357, we would do:
$ gdb one 357
At the gdb prompt, we can now set breakpoints, then let the
translator continue:
(gdb) break trivfs_S_io_read
(gdb) c
Now, you should switch to another screen window, xterm or similar.
Enter a command like
$ cat foo
there and switch back to the terminal where you are running GDB.
You will see that it stopped at the breakpoint. Now you can debug as
usual. That's easy, isn't it?
If you are done, enter
(gdb) quit
and say that you want to detach the process. We can conclude by
saying that you don't need any special technique for debugging a
translator -- at least such a simple one.
At this point, you probably have an idea about how to develop Hurd
servers (translators). Often, you will need more than libtrivfs
provides. For outdated information on other libraries, see the Hurd
Reference Manual ('info hurd'), also available in
'$(HURD)/doc/hurd.texi', for up-to-date information read the
appropriate header files.
10 Comprehensive trivfs example
*******************************
TODO: Maybe a 'cat' translator?
11 An example using netfs
*************************
TODO
12 An example using diskfs
**************************
TODO
13 Frequently Asked Questions
*****************************
_
.-[_]<-.
v | _`. T h e G N U H u r d
[_] `->[_]|
^ _ ,',' ...be a part of it!
`.[_]-+-'
`.__.'
Q: How can a translator access it's underlying node? A gzip
translator must do this, for example.
A: The underlying node is returned by fsys_startup(). See
'$(HURD)/hurd/fsys.defs'.
Q: Can one stack translators?
A: Yes, stacking active translators is possible, but you can't do
it with passive translators.
Q: What is a 'protid'?
A: 'prot' means protection. Every protid structure denotes a
unique user (i.e. client) of our translator. You can access per-open
information via the 'po' field of the structure.
Q: Which kind of threads should I use if I want to write a program
that will run on both GNU/Linux and GNU/Hurd?
A: Use pthreads. We will have pthreads eventually.
Q: How can we claim to be POSIX compliant without having pthreads?
A: First of all, pthreads are optional. Second, we do have
pthreads, for example GNU Portable Threads (pth) does provide a
non-preemtive pthreads emulation. This seems to be standard
compliant, but of course it's not that useful, as most programs assume
preemtive multi-threading. Jeroen Dekkers is working on "real"
pthreads, which partially work, as of now.
Q: In the GNU Manifesto, RMS wrote that both C and LISP will be
system languages. What about that?
A: The Scheme interpreter Guile is part of the GNU project. As of
now, it provides only POSIX functionality, but there's no reason why
nobody should add GNU specific stuff. Adding support for GNU
functionality to various languages would be nice indeed. That's
certainly not an urgent issue, however.
Q: Why should I learn about Mach if the Hurd switches to L4 soon?
A: As of now, the Hurd uses Mach and you need to know Mach basics
to do Hurd work. Maybe the Hurd will run on L4 in the future, but
currently it's very, very far away from doing so.
14 Appendices
*************
14.1 Stuff to do
================
This document is a work in progress. There are several things that
should be added. If you want to help, please contact me.
- Use consistent formating. often, @code{} should be used but isn't
- Correct the remaining FIXMEs (ok, this one was obvious)
- Take OSKit-Mach into account
- Add Moritz' Mach device access example and link to mailing list
archive with Daniel Wagners Mach device code
- Write about netfs and diskfs, add a longer trivfs example
- Describe more Mach and esp. MIG details