InnoDB kernel_mutex Contention and Memory Allocators
tl;dr: We found that in our case, contention for InnoDB's kernel_mutex was caused by contention for a malloc arena lock. We fixed it by moving to tcmalloc. Instructions on how to do that are below.
We recently doubled the IO throughput capacity of our near-capacity MySQL master by adding a second RAID controller, and striping the two together. As we were climbing up to a record throughput peak the following weekend, there was a major db latency spike (>3x).
A look at SHOW ENGINE INNODB STATUS indicated quite a bit of contention for InnoDB's kernel_mutex.
Note: the contention I observed was actually considerably worse than what I pasted above, but I didn't save the output, so this is all I have to show.
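If you want to summarize this kind of contention yourself, here is a minimal sketch. It assumes the SEMAPHORES section reports each wait on a line of the form "--Thread <id> has waited at <file> line <n> for <secs> seconds the semaphore:"; the exact format can differ between MySQL versions. The function name count_semaphore_waits is hypothetical.

```shell
#!/bin/sh
# Count semaphore waits per source location in SHOW ENGINE INNODB STATUS output.
# Reads the status text on stdin; prints "count file:line", busiest site first.
count_semaphore_waits() {
    awk '/ has waited at / {
        # The file name follows the word "at"; the line number is two fields later.
        for (i = 1; i <= NF; i++)
            if ($i == "at") { print $(i + 1) ":" $(i + 3); break }
    }' | sort | uniq -c | sort -rn
}

# Usage (assumes the mysql client can log in without prompting):
#   mysql -e 'SHOW ENGINE INNODB STATUS\G' | count_semaphore_waits
```

This only shows where threads are waiting. Which mutex they are waiting for is reported on the lines that follow each wait.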
The kernel_mutex has been removed in MySQL 5.6, but that's unfortunately not ready for production. As a workaround, the Percona guys suggest modifying innodb_sync_spin_loops, which had absolutely no effect for our workload. They also suggest lowering innodb_thread_concurrency, which did reduce contention, but it also reduced concurrency, which left us right back where we started.
I pulled out my poor man's profiler to see if I could figure out exactly what was holding the lock and what it was doing with it. Here are the stacks I got.
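For anyone who hasn't used one, a poor man's profiler is just gdb dumping every thread's backtrace, followed by a bit of text processing to count identical stacks. The sketch below is my own variant of that idea, not necessarily the exact script used here. The function name collapse_stacks is hypothetical, and it assumes gdb's usual "#N  0xADDR in func (...)" frame format.

```shell
#!/bin/sh
# Collapse the output of gdb's "thread apply all bt" into one line per thread,
# then count identical stacks. Reads gdb output on stdin.
collapse_stacks() {
    awk '
        # A new thread header ends the previous stack.
        /^Thread/ { if (s != "") print s; s = "" }
        # Frame lines look like "#1  0x... in func () ..." or "#0  func () ...".
        /^#/ {
            f = ($2 ~ /^0x/) ? $4 : $2
            s = (s == "") ? f : s "," f
        }
        END { if (s != "") print s }
    ' | sort | uniq -c | sort -rn
}

# Usage (attaching gdb stops mysqld while the stacks are collected):
#   gdb -batch -ex "set pagination 0" -ex "thread apply all bt" \
#       -p "$(pidof mysqld)" 2>/dev/null | collapse_stacks
```

The most common stacks end up at the top, so if many threads are stuck in the same place, it shows up immediately.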
Immediately, we can see that lots of stuff is waiting on locks inside of malloc/free-related functions. After reading through the MySQL sources, it was clear that this thread was holding the kernel_mutex.
Note: all links to glibc code below are specifically to the version that we are using.
Reading through _int_free in the glibc sources seemed to indicate that there was only one lock (malloc_state->mutex) in there.
Our glibc was built with --enable-experimental-malloc, which is supposed to reduce contention by dividing the heap into multiple arenas, each with its own lock (at least as far as I understand it, and I'm far from an expert).
tcmalloc is a malloc implementation from google-perftools that satisfies small malloc requests without locks by using a per-thread cache. Using tcmalloc should mean that the allocations made while holding the kernel_mutex are (at least mostly) lockless.
Here's how to LD_PRELOAD tcmalloc.
Put this in /usr/local/bin/mysqld_wrapper:
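A minimal sketch of such a wrapper, assuming tcmalloc is installed at /usr/lib/libtcmalloc.so and the real server binary is /usr/sbin/mysqld. Both paths are assumptions, so adjust them for your system. It is written as a small helper function, write_mysqld_wrapper (a hypothetical name), that generates the wrapper file so the paths are explicit.

```shell
#!/bin/sh
# Hypothetical helper: writes a wrapper script that preloads tcmalloc and then
# execs the real mysqld.
# Usage: write_mysqld_wrapper DEST [TCMALLOC_LIB] [MYSQLD_BIN]
write_mysqld_wrapper() {
    dest=$1
    lib=${2:-/usr/lib/libtcmalloc.so}   # assumed install path of tcmalloc
    bin=${3:-/usr/sbin/mysqld}          # assumed path of the real mysqld
    cat > "$dest" <<EOF
#!/bin/sh
# Preload tcmalloc so it takes the place of glibc's malloc/free, then replace
# this shell with the real mysqld, passing all arguments through.
LD_PRELOAD="$lib"
export LD_PRELOAD
exec "$bin" "\$@"
EOF
    chmod 755 "$dest"
}

# Usage (as root):
#   write_mysqld_wrapper /usr/local/bin/mysqld_wrapper
```

The generated wrapper uses exec, so mysqld replaces the shell rather than running as its child. That keeps the process tree and signal handling the same as when mysqld is started directly.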
Put this fragment in my.cnf:
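A sketch of what that fragment might look like, assuming mysqld is started through mysqld_safe, whose mysqld and ledir options select the server program and the directory it lives in. The exact fragment depends on how your init scripts launch MySQL and on your MySQL version, so treat the values as assumptions.

```ini
[mysqld_safe]
# Assumption: the wrapper lives in /usr/local/bin and is named mysqld_wrapper.
# mysqld_safe then starts /usr/local/bin/mysqld_wrapper instead of mysqld.
ledir  = /usr/local/bin
mysqld = mysqld_wrapper
```

A restart of MySQL is needed before the new allocator takes effect.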
Since we moved to tcmalloc, all of the contention for the kernel_mutex has completely disappeared. We're also seeing better performance overall and using ~15% less memory in total. This fix probably isn't applicable in all cases, but if you're seeing kernel_mutex contention, it's worth using your poor man's profiler to see whether swapping allocators might help.
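If you do try it, it's worth confirming that the preload actually took effect before crediting the allocator for any change. One way, sketched below for Linux, is to look for the library in the running process's memory map. The function name uses_tcmalloc is hypothetical, and it assumes the shared library's file name contains "libtcmalloc".

```shell
#!/bin/sh
# Succeeds if the given process has a tcmalloc library mapped.
# Usage: uses_tcmalloc PID [MAPS_FILE]
# MAPS_FILE defaults to /proc/PID/maps and is mainly overridable for testing.
uses_tcmalloc() {
    maps=${2:-/proc/$1/maps}
    grep -q 'libtcmalloc' "$maps"
}

# Usage (run as root or as the mysql user, since the maps file is restricted):
#   uses_tcmalloc "$(pidof mysqld)" && echo "tcmalloc is loaded"
```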