2016-04-01

How to define a low-level C event reactor/proactor API?

Recently, while looking through the source of lispd (a daemon supporting the LISP protocol for mobility), I noticed that it contains some sort of bespoke reactor. It would be nice if there were a relatively simple reactor API that could be easily implemented on multiple platforms and documented only once, so that (for example) the authors of lispd could use it directly, and other developers could follow the code without having to understand how a bespoke reactor works internally.

I have a portable reactor implementation, but I doubt anyone uses it but me. If you were designing a portable reactor API, e.g., as a POSIX interface, what should it look like? If you were using one, what would you expect to find?

Also, a proactor does a similar job, but has a different API. Sometimes, this API has better performance, because it permits the kernel to partly implement it. Can the APIs of a reactor and proactor be combined (into a “reproactor”)?

  • You can always emulate a proactor over a reactor. You don't get the kernel optimizations, but you just rebuild on a different system, and there they are.
  • Having the reactor interface around as well means that you can still access special APIs that are not represented in the proactor API.
  • Using event-type-specific functions (see below) blurs the distinction between reactor and proactor anyway.
  • So, I'm leaning towards supporting proactor-like calls in the API, even if they are not kernel-optimized.

Should events be automatically reprimed?

  • The main benefit is that it allows the reproactor to avoid rebuilding its internal structures on each iteration. It might make the application simpler too.
  • On the other hand, the event handler will almost always know how to reprime if it has been written properly.
  • Auto-repriming makes sense for monitoring (say) sockets, but not for timed events (where the handler would be repeatedly called until you explicitly cancelled the event). I don't like requiring the user to intuit this, as there may be cases where the expected behaviour is not intuitive. Also, if the interface is re-implemented, I foresee this subtle behaviour being missed if it's not clearly formalized, producing variable behaviour with the same interface, and chaos ensuing.
  • I'm leaning towards sticking with manual repriming. Auto-repriming also has other impacts on the issues below.

How should a reproactor support priorities?

  • It could support multiple (major-)priority queues, and on each iteration, flush out only one queue (the highest non-empty). Low-priority events could then be forced to wait for several iterations before being handled. You could ensure that an idle event handler would be invoked only when there really was nothing else to do.
  • There could be sub-priorities (or minor priorities) within a queue. This would merely guarantee ordering of execution of user functions within a queue. Is that worthwhile?
  • Both forms could be supported simultaneously, and the application could exploit one, both or neither, as required.
  • The numbers of priorities and sub-priorities don't have to be established beforehand. Just keep track of the highest priority requested, and create new queues as necessary. There can be implementation-defined limits.
  • Is 0 the highest priority or the lowest? It's perhaps more intuitive to make it the lowest. Yet, if you're going to allow the number of priorities to be dynamic, you probably want 0 to be the highest, and dynamically grow out from there, so you always know what the highest is, and put essential events on it. If you still want < and > to have the expected meanings, then you could grow down from 0, -1, -2, etc., and only do that internally. Probably not high-value.
  • For a real proactor (not one emulated over a reactor), must the priorities be part of the underlying OS API?
  • If multi-level priorities are supported, should they be set together in one call, or set separately?

How should a reproactor interact with multi-threaded programs?

  • Does it ever make sense to have more than one per thread? If not, we can make it a thread-local. In C11, call_once can be used to establish a tss_t key, and a function to identify the thread's reproactor will use that to create the reproactor in the first instance. A destructor will clean everything up when the thread terminates.
  • Should one thread be able to express interest in an event to be handled by a different thread? Perhaps, but it introduces synchronization complexity, undermining the simplicity of thread-locality.
  • Should pools of threads be able to watch for the same event? That doesn't make a lot of sense unless you can guarantee that only one of the watchers will be informed of readiness. If two watchers are informed, one of them will succeed, while the other blocks when it actually tries to process the event. To avoid this by informing only one thread, you'd have to remember which thread you informed, so that the other threads don't test for the event again.
    • Is it important to support this? Or should the user just do it themselves, bypassing the reactor to (say) listen for connections, and just get the threads to use their own reactors on their own sockets thereafter?

How should a reproactor interact with some platform-specific features, like Windows events?

  • The Windows events are especially high-priority, so should they be handled distinctly, with a special yield function? That would be really ugly.
  • Or do we ignore the priority, and always make them the highest?
  • Or perhaps, if the event is low-priority, and is blocked by higher-priority activity, we don't test for further events of the same type until it has been dealt with. This makes the most sense. We use a zero timeout, and use WaitForMultipleObjectsEx instead of MsgWaitForMultipleObjectsEx on Windows. A timeout of zero should be used anyway, because there will be unprocessed event handlers still pending.
    • The problem with the Windows events is that we don't find out ourselves what events have come in. Could we efficiently peek to probe for different types? If not, we would only be able to permit one handler to watch for events, and not watch for any more until the handler was called.

How do we handle signals?

  • Handling signals is do-able so long as a signal will interrupt all threads that have not blocked it, and we can have a signal handler per thread.
  • Handling signals at all should only be done if the user can also avoid using the reproactor for some signals, in case it is important that the user's signal handler actually take some action directly. That should be fine if we expose the sigset_t passed to pselect/ppoll.
  • Per-thread signal handling looks tricky, to say the least. There is some StackOverflow advice on it. I suppose, with one static array of volatile sig_atomic_t counters, and atomic operations to increment them from the signal handler and decrement them from repro_yield(), it doesn't matter how many threads access the array. We just have to make sure that any interrupted ppoll/pselect returns with EINTR after the signal handler has completed.

Can a reproactor exploit splice() or similar functions to copy from one descriptor directly to another?

  • Yes. You should first wait for the destination to become writable. The proactive action to take place is then to prime an event to perform the splice. Interesting. That's a use case for an event to be primed alternately on two distinct event types, i.e., when auto-repriming is undesirable.

Do we allow internal, user-defined events?

  • What for? Isn't that the application's problem?
  • We could have a null event which never fires, except by an explicit trigger call.
  • It might help when providing external extensions to the API.
  • Current position: probably not useful at the moment, but having a null event type is fairly trivial. Could cause problems with repro_yield() detecting when there's no more work.

So far, I've listed event types as normal C enums. All types are listed, including ones not supported on a particular platform.

  • This means you have to include all possible types in the API, and report a run-time error if one attempts to prime on an unsupported type. That way, if an existing one becomes supported on its platform, its number won't change. You can't simply add at the end, because it might already be supported on another platform. This is cumbersome.
  • An alternative is to use addresses of static constant objects. They don't have to have any contents, just distinct addresses. Unsupported event types won't even compile. However, although it's not a big deal, it's not possible to switch on addresses, even constant ones.
  • Another alternative is to do away with condition types, and have specifically typed functions. They will fail in the same way as static constant object addresses, and they are more type-safe. In fact, I'm leaning towards this.

Should we allow a single event handle to wait on multiple events, triggering as soon as any one of them occurs?

  • That means we have to either stop watching for the others until the event handler has been called, or the handle exists in the queue while still primed for the remaining events. It will simultaneously be primed and triggered! Not impossible, but it breaks a previous invariant.
  • It might help to deal with the Windows events too…?
  • I'm leaning towards supporting this.

How do we fit in ALSA's attempts to interact with a reproactor?

  • ALSA has three functions to:
    • get the number of structures to pass to poll(),
    • write those structures into an array,
    • pass them back after poll() has modified them.
  • If we try to support this directly, it means we are allowing a library to wait on multiple events. (See above.)

What does the API look like now?

// Thread-locally create and destroy an event handle. 
repro_ev_t repro_open(void);
void repro_close(repro_ev_t);

// Set and get priorities.
void repro_setprio(repro_ev_t, repro_prio_t);
void repro_setsubprio(repro_ev_t, repro_subprio_t);
repro_prio_t repro_getprio(repro_ev_t);
repro_subprio_t repro_getsubprio(repro_ev_t);
void repro_setprios(repro_ev_t, repro_prio_t, repro_subprio_t);
void repro_getprios(repro_ev_t, repro_prio_t *, repro_subprio_t *);

// Specify the event handler. 
typedef void repro_func_t(void *ctxt);
void repro_direct(repro_ev_t, repro_func_t *, void *ctxt);

// Artificially trigger/cancel an event.
void repro_defuse(repro_ev_t);
void repro_trigger(repro_ev_t); // Defuse and queue?
void repro_cancel(repro_ev_t); // Defuse and dequeue?

typedef const struct repro_cond *repro_cond_t;

// Example condition type as address of static object
extern const struct repro_cond repro_OSOCK;
#define repro_CSOCK (&repro_OSOCK)


// Generic priming function

int repro_prime(repro_ev_t, repro_cond_t, const void *conddata);

// Type-specific priming function
int repro_prime_sock(repro_ev_t, int sock, repro_iomode_t, int *status);

// More proactive type-specific priming functions
int repro_prime_idle(repro_ev_t);
int repro_prime_recv(repro_ev_t, int sock,
                     void *buf, size_t len, int flags,
                     ssize_t *rc, int *ec);
int repro_prime_recvfrom(repro_ev_t, int sock,
                         void *buf, size_t len, int flags,
                         struct sockaddr *addr,
                         socklen_t *addrlen,
                         ssize_t *rc, int *ec);
int repro_prime_read(repro_ev_t, int fd, void *buf, size_t len,
                     ssize_t *rc, int *ec);
int repro_prime_splice(repro_ev_t, int from, int to, ssize_t *rc, int *ec);
int repro_prime_poll(repro_ev_t, struct pollfd *, size_t);
int repro_prime_select(repro_ev_t, fd_set *, fd_set *, fd_set *);
int repro_prime_signal(repro_ev_t, int sig);

#ifdef _WIN32
int repro_prime_winhandle(repro_ev_t, HANDLE, int *status);
#endif

#if _POSIX_C_SOURCE >= 1 || _XOPEN_SOURCE || _POSIX_SOURCE
// Access the signal mask, or return NULL if not available.

sigset_t *repro_sigset(void);
#endif

// Yield control to the reproactor.
int repro_yield(void);

Priming functions should return 0 on success. On error, they could set errno to:

  • EADDRINUSE/EBUSY - something is already waiting for that event.
    • Both could be used when auto-repriming, with one code indicating that an event has already triggered, and will re-prime in a conflicting way when handled. Not good when handling multiple events with one handle, as both error codes are then valid.
  • EINVAL - condition not recognized or has invalid parameters.

repro_yield() would set errno to:

  • EWOULDBLOCK/EAGAIN - nothing to wait for.
  • EINTR - interrupted by signal not handled internally.
  • ? - some essential events not being awaited.

So, can you fill in the blanks? Or point out flaws?

Updated formatting 2021-05-18