Racing Timers: A Kernel Pastime

Timers can produce their own special problems with races. Consider a collection of objects (list, hash, etc) where each object has a timer which is due to destroy it.

If you want to destroy the entire collection (say on module removal), you might do the following:

        /* THIS CODE BAD BAD BAD BAD: IF IT WAS ANY WORSE IT WOULD USE
           HUNGARIAN NOTATION */
        spin_lock_bh(&list_lock);

        while (list) {
                struct foo *next = list->next;
                del_timer(&list->timer);
                kfree(list);
                list = next;
        }

        spin_unlock_bh(&list_lock);
    

Sooner or later, this will crash on SMP, because a timer can have just gone off before the spin_lock_bh(), and it will only get the lock after we spin_unlock_bh(), and then try to free the element (which has already been freed!).

This can be avoided by checking the result of del_timer(): if it returns 1, the timer has been deleted. If 0, it means (in this case) that it is currently running, so we can do:

        retry:  
                spin_lock_bh(&list_lock);

                while (list) {
                        struct foo *next = list->next;
                        if (!del_timer(&list->timer)) {
                                /* Give timer a chance to delete this */
                                spin_unlock_bh(&list_lock);
                                goto retry;
                        }
                        kfree(list);
                        list = next;
                }

                spin_unlock_bh(&list_lock);
    

Another common problem is deleting timers which restart themselves (by calling add_timer() at the end of their timer function). Because this is a fairly common case which is prone to races, the function del_timer_sync() (include/linux/timer.h) is provided to handle this case. It returns the number of times the timer had to be deleted before we finally stopped it from adding itself back in.