Overhead of pthread mutexes?

Accepted answer
Score: 40

All modern thread implementations can handle an uncontended mutex lock entirely in user space (with just a couple of machine instructions) - only when there is contention does the library have to call into the kernel.

Another point to consider is that if an application doesn't explicitly link to the pthread library (because it's a single-threaded application), it will only get dummy pthread functions (which don't do any locking at all) - only if the application is multi-threaded (and links to the pthread library) will the full pthread functions be used.

And finally, as others have already pointed out, there is no point in protecting a getter method for something like isActive with a mutex - once the caller gets a chance to look at the return value, the value might already have been changed (as the mutex is only locked inside the getter method).
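
To make that last point concrete, a minimal sketch (the struct and function names are invented for illustration): the mutex guarantees a consistent read of the flag, but says nothing about its value by the time the caller acts on the result.

#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    bool active;
} worker_t;   /* hypothetical type, not from the question */

bool worker_is_active(worker_t *w)
{
    pthread_mutex_lock(&w->lock);
    bool result = w->active;
    pthread_mutex_unlock(&w->lock);
    return result;   /* 'active' may already have changed by the time the caller tests this */
}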

Score: 23

"A mutex requires an OS context switch. That 1 is fairly expensive. "

  • This is not true on Linux, where mutexes are implemented using something called futexes. Acquiring an uncontested (i.e., not already locked) mutex is, as cmeerw points out, a matter of a few simple instructions, and is typically in the area of 25 nanoseconds with current hardware (see the sketch below).

For more info: Futex

Numbers everybody should know
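
Very roughly, the fast path looks like this (a toy sketch, not glibc's actual implementation; a real lock also tracks whether waiters exist so that the unlock side can skip its syscall in the uncontended case):

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdatomic.h>

static atomic_int lock_word;   /* 0 = free, 1 = locked */

static void toy_lock(void)
{
    int expected = 0;
    /* Fast path: one atomic compare-and-swap in user space, no syscall. */
    if (atomic_compare_exchange_strong(&lock_word, &expected, 1))
        return;
    /* Slow path: contended - sleep in the kernel until the holder wakes us. */
    do {
        syscall(SYS_futex, &lock_word, FUTEX_WAIT, 1, NULL, NULL, 0);
        expected = 0;
    } while (!atomic_compare_exchange_strong(&lock_word, &expected, 1));
}

static void toy_unlock(void)
{
    atomic_store(&lock_word, 0);
    /* A real implementation would only issue this wake when waiters exist. */
    syscall(SYS_futex, &lock_word, FUTEX_WAKE, 1, NULL, NULL, 0);
}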

Score: 7

This is a bit off-topic, but you seem to be new to threading, so: for one thing, only lock where threads can overlap. Then, try to minimize those places. Also, instead of trying to lock every method, think of what the thread is doing (overall) with an object, make that a single call, and lock that. Try to get your locks as high up as possible (this again increases efficiency and may /help/ to avoid deadlocking). But locks don't 'compose'; you have to mentally, at least, cross-organize your code by where the threads are and where they overlap.
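
To illustrate "make that a single call, and lock that" - the counter type below is made up purely for illustration - locking each tiny accessor still leaves the caller's check-then-act sequence racy, whereas one lock around the whole logical operation is safe:

#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    int count;
} counter_t;   /* hypothetical type */

/* Individually locked getter: consistent on its own, but another thread
 * can change 'count' between a caller's get and its follow-up action. */
int counter_get(counter_t *c)
{
    pthread_mutex_lock(&c->lock);
    int v = c->count;
    pthread_mutex_unlock(&c->lock);
    return v;
}

/* Better: the whole task is one call under one lock. */
bool counter_decrement_if_positive(counter_t *c)
{
    pthread_mutex_lock(&c->lock);
    bool ok = (c->count > 0);
    if (ok)
        c->count--;
    pthread_mutex_unlock(&c->lock);
    return ok;
}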

Score: 4

I did a similar library and didn't have any trouble with lock performance. (I can't tell you exactly how they're implemented, so I can't say conclusively that it's not a big deal.)

I'd go for getting it right first (i.e. use locks), then worry about performance. I don't know of a better way; that's what mutexes were built for.

An alternative for single-threaded clients would be to use the preprocessor to build a non-locked vs. locked version of your library. E.g.:

#ifdef BUILD_SINGLE_THREAD
    inline void lock () {}                            // no-ops in the single-threaded build
    inline void unlock () {}
#else
    inline void lock () { doSomethingReal(); }        // real locking in the multithreaded build
    inline void unlock () { doSomethingElseReal(); }
#endif

Of course, that adds an additional build to maintain, as you'd distribute both single- and multithreaded versions.

Score: 3

I can tell you, from Windows, that a mutex is a kernel object and as such incurs a (relatively) significant locking overhead. To get a better-performing lock, when all you need is one that works between threads, use a critical section. This would not work across processes, just the threads in a single process.
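
For reference, the Win32 calls being compared look roughly like this (a CRITICAL_SECTION is process-local and has a user-space fast path, whereas CreateMutex hands back a kernel object):

#include <windows.h>

static CRITICAL_SECTION cs;

void locks_init(void)    { InitializeCriticalSection(&cs); }
void locks_acquire(void) { EnterCriticalSection(&cs); }    /* cheap when uncontended */
void locks_release(void) { LeaveCriticalSection(&cs); }
void locks_destroy(void) { DeleteCriticalSection(&cs); }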

However, Linux is quite a different beast when it comes to multi-process locking. I know that a mutex is implemented using atomic CPU instructions and only applies within a process - so it would have the same performance as a Win32 critical section - i.e. be very fast.

Of course, the fastest locking is not to have any at all, or to use locks as little as possible (but if your lib is to be used in a heavily threaded environment, you will want to hold each lock for as short a time as possible: lock, do something, unlock, do something else, then lock again is better than holding the lock across the whole task - the cost of locking isn't in the time taken to lock, but in the time a thread sits around twiddling its thumbs waiting for another thread to release a lock it wants!)
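
A small sketch of the "lock for as short a time as possible" point (the message-formatting task is invented for illustration): the expensive work happens outside the critical section, so other threads only ever wait for the quick publish step.

#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static char shared_buf[256];

void publish(const char *msg)
{
    char local[256];
    snprintf(local, sizeof local, "msg: %s", msg);   /* expensive work, no lock held */

    pthread_mutex_lock(&m);                          /* short critical section */
    memcpy(shared_buf, local, sizeof shared_buf);
    pthread_mutex_unlock(&m);
}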

Score: 2

A mutex requires an OS context switch. That is fairly expensive. The CPU can still do it hundreds of thousands of times per second without too much trouble, but it is a lot more expensive than not having the mutex there. Putting it on every variable access is probably overkill.

It also probably is not what you want. This kind of brute-force locking tends to lead to deadlocks.

"Do you know better ways to protect such variable accesses?"

Design your application so that as little data as possible is shared. Some sections of code should be synchronized, probably with a mutex, but only those that are actually necessary. And typically not individual variable accesses, but tasks containing groups of variable accesses that must be performed atomically. (Perhaps you need to set your is_active flag along with some other modifications. Does it make sense to set that flag and make no further changes to the object?)

Score: 2

I was curious about the expense of using a pthread_mutex_lock/unlock. I had a scenario where I needed to either copy anywhere from 1500-65K bytes without using a mutex, or to use a mutex and do a single write of a pointer to the data needed.

I wrote a short loop to test each:

gettimeofday(&starttime, NULL);
/* copy the data */
gettimeofday(&endtime, NULL);
timersub(&endtime, &starttime, &timediff);
/* print out timediff data */

or

gettimeofday(&starttime, NULL);
pthread_mutex_lock(&mutex);
gettimeofday(&endtime, NULL);
pthread_mutex_unlock(&mutex);
timersub(&endtime, &starttime, &timediff);
/* print out timediff data */

If I was copying less than 4000 or so bytes, then the straight copy operation took less time. If, however, I was copying more than 4000 bytes, then it was less costly to do the mutex lock/unlock.

The timing on the mutex lock/unlock ran between 3 and 5 usec, including the time for the gettimeofday call for the current time, which took about 2 usec.
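
Pieced together, a self-contained version of that comparison might look like this (the buffer size, variable names, and output format are my own choices, not the poster's):

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

#define NBYTES 4000   /* arbitrary size within the poster's 1500-65K range */

static char src[65536], dst[65536];
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static long usec(struct timeval tv) { return tv.tv_sec * 1000000L + tv.tv_usec; }

int main(void)
{
    struct timeval starttime, endtime, timediff;

    /* Time the raw copy. */
    gettimeofday(&starttime, NULL);
    memcpy(dst, src, NBYTES);
    gettimeofday(&endtime, NULL);
    timersub(&endtime, &starttime, &timediff);
    printf("copy of %d bytes: %ld usec\n", NBYTES, usec(timediff));

    /* Time the mutex lock (as above, only the lock is inside the timed region). */
    gettimeofday(&starttime, NULL);
    pthread_mutex_lock(&mutex);
    gettimeofday(&endtime, NULL);
    pthread_mutex_unlock(&mutex);
    timersub(&endtime, &starttime, &timediff);
    printf("mutex lock:       %ld usec\n", usec(timediff));

    return 0;
}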

Score: 1

For member variable access, you should use read/write locks, which have slightly less overhead and allow multiple concurrent reads without blocking.
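
A minimal sketch of that suggestion using a pthreads read/write lock (the type and field names are invented for illustration):

#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_rwlock_t lock;
    bool active;
} obj_t;   /* hypothetical type */

bool obj_is_active(obj_t *o)           /* readers can run concurrently */
{
    pthread_rwlock_rdlock(&o->lock);
    bool v = o->active;
    pthread_rwlock_unlock(&o->lock);
    return v;
}

void obj_set_active(obj_t *o, bool v)  /* a writer excludes everyone else */
{
    pthread_rwlock_wrlock(&o->lock);
    o->active = v;
    pthread_rwlock_unlock(&o->lock);
}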

In many cases you can use atomic builtins, if your compiler provides them (if you are using gcc or icc, __sync_fetch*() and the like), but they are notoriously hard to handle correctly.
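
For instance, two of the __sync builtins in use (this is the older gcc/icc API; newer code would usually reach for __atomic_* or C11 <stdatomic.h>):

static int counter;
static int flag;

void bump(void)
{
    __sync_fetch_and_add(&counter, 1);   /* atomic counter += 1 */
}

int try_claim(void)
{
    /* Atomically flips flag from 0 to 1; returns nonzero only for the one
       thread that actually made the transition. */
    return __sync_bool_compare_and_swap(&flag, 0, 1);
}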

If you can guarantee the access being atomic (for example, on x86 a dword read or write is always atomic if it is aligned, but not a read-modify-write), you can often avoid locks altogether and use volatile instead, but this is non-portable and requires knowledge of the hardware.

Score: 0

Well, a suboptimal but simple approach is to place macros around your mutex locks and unlocks. Then have a compiler/makefile option to enable/disable threading.

Ex.

#ifdef THREAD_ENABLED
#define pthread_mutex_lock(x) ... //actual mutex call
#else
#define pthread_mutex_lock(x) ... //do nothing
#endif

Then when compiling, do a gcc -DTHREAD_ENABLED to enable threading.

Again, I would NOT use this method in any large project - only if you want something fairly simple.
