Tuesday, September 6, 2016

When is it signaled?

Signals are a mechanism for notifying a process of a simple event.  Most programmers and programs can ignore them, as the default behaviors are reasonable and taking steps to handle them would greatly increase the program complexity.  But when teaching future computer scientists, we want the programmers to know about these mechanisms and therefore properly understand the functioning of the system.

In working with signals, the developing programmers are often exposed to their first dose of concurrency.  The idea that execution can be happening in simultaneous, arbitrary orders except when action is taken by the program.  With signals, a program can do several things:

  • Provide and install a handler for one or more signals
  • Block the receipt of one or more signals
  • Send a signal to one or more processes

We are going to consider the problem of sending and receiving a signal.  After a signal is sent, when does the other process receive it?  Students make the assumption that when the sending function (i.e., kill) has returned, the signal has been sent *and received*.  However, I have found no text that explicitly guarantees this condition.  Instead, I prepared a simple program (source at the end of this post) to test this condition.

As signals are communicated through the operating system, we want a different mechanism for measuring simultaneity, in this case shared memory.  The experiment program will create and set up a small space of shared memory between two processes.  Next, it will wait until both programs are running in a known state (i.e., barrier).  Then one (the parent) will signal the other (the child), while measuring how long it takes to send, as well as receive the signal.  Finally, run this experiment a million times.

On my current Ubuntu box with Skylake processors, the experimental measurements show that 80% of the time, the child lasts for an average of 40 cycles after kill returns.  The maximum time is almost 1200 cycles.  Assuming that each core's clock has a smaller skew, this effectively means that the child can continue to run even after the function has returned.

Source code follows: (compiled with gcc version 4.8.3, with gcc -O3 -lrt)
#include <sys/mman.h>
#include <sys/types.h>

#include <sys/wait.h>
#include <sys/stat.h> /* For mode constants */
#include <fcntl.h> /* For O_* constants */
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdint.h>

#include "rdtsc.h"

static const int ITER = 1000 * 1000;

int main(int argc, char** argv)
{
    int shm_fd = shm_open("squig", O_CREAT | O_RDWR, (S_IREAD | S_IWRITE));
    int count = 0, i;
    uint64_t zero = 0, min = ~0x0, max = 0, sum = 0;
    uint64_t minN = ~0x0, maxN = 0, sumN = 0;
    write(shm_fd, &zero, 8);  // Give the shared file "space"
    void* msh = mmap(NULL, 4 * 1024, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
    volatile uint64_t* tsc = msh;
    
    for (i = 0; i < ITER; i++)
    {
        *tsc = 0;
        pid_t chd = fork();
        
        if (chd == 0)
        {
            // Give the compiler something to think about
            while(zero < ~0x0) { *tsc = rdtsc(); zero++;}
        }
        else
        {
            // Wait for the child
            while (*tsc == 0) {}
            
            uint64_t st = *tsc; // When is it for the child?
            kill(chd, SIGINT); // Send SIGINT to child
            uint64_t k = rdtsc(); // Returned from kill
            wait(NULL); // Reap
            uint64_t delta = 0;
            uint64_t en = *tsc; // When did the child end?
            
            // K >, implies that kill returned after the child terminated
            if (k > en)
            {
                count ++;
                delta = k - en;
                if (delta < minN) minN = delta;
                if (delta > maxN) maxN = delta;
                sumN += delta;
            }
            else
            {
                delta = en - k;
                if (delta < min) min = delta;
                if (delta > max) max = delta;
                sum += delta;
            }
        }
    }
    
    printf("Min: %lx, Max: %lx, Avg: %lx\n", min, max, (sum / (ITER - count)));
    printf("Min: %lx, Max: %lx, Avg: %lx\n", minN, maxN, (sumN / (count)));
    printf("Percent Parent After: %lf\n", (count / (double)ITER));
    
    return 0;
}

Update: Results also hold when sending SIGKILL.


1 comment:

Anonymous said...

Nice experiment. What I'd be curious about next is if the child program actually makes progress after kill() returns, or is the elapsed time just the delay for wait time + kernel (i.e., that the child process is not scheduled when the signal is generated, so there is a delay until the next time it gets scheduled). I'm not sure if there's any good way to measure this without putting instrumentation in the kernel itself, though... ??