InnoDB kernel_mutex Contention and Memory Allocators


Jul 18, 2012

tl;dr: In our case, contention for InnoDB's kernel_mutex was caused by contention for a malloc arena lock. We fixed it by moving to tcmalloc. Instructions on how to do that are at the end of this post.

We recently doubled the IO throughput capacity of our near-capacity MySQL master by adding a second RAID controller and striping the two together. As we were climbing to a record throughput peak the following weekend, there was a major database latency spike (>3x).

A look at SHOW ENGINE INNODB STATUS indicated quite a bit of contention for InnoDB's kernel_mutex.

Note: the contention I observed was actually considerably worse than what I pasted above, but I didn't save the output, so this is all I have to show.
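
To put a rough number on this kind of contention, one option is to tally the waits listed in the SEMAPHORES section by the source location being waited at. This is a minimal sketch, not something used in the investigation itself: the function name count_semaphore_waits is made up, and the line format it parses is an assumption based on how 5.1/5.5-era servers print semaphore waits.

```shell
# Tally semaphore waits by source location (file:line), busiest first.
# Reads SHOW ENGINE INNODB STATUS output on stdin.
# ASSUMPTION: wait lines look like
#   --Thread <id> has waited at <file> line <n> for <secs> seconds the semaphore:
count_semaphore_waits() {
    awk '/ has waited at / {
             # The file name follows "at"; the line number follows "line".
             for (i = 1; i <= NF; i++)
                 if ($i == "at") { print $(i + 1) ":" $(i + 3); break }
         }' | sort | uniq -c | sort -rn
}
```

It could be run as `mysql -e 'SHOW ENGINE INNODB STATUS\G' | count_semaphore_waits`. A single snapshot only shows waits in flight at that instant, so a few samples taken under load are more telling than one.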

The kernel_mutex has been removed in MySQL 5.6, but that's unfortunately not ready for production. As a workaround, the Percona guys suggest modifying innodb_sync_spin_loops, which had absolutely no effect for our workload. They also suggest lowering innodb_thread_concurrency, which did reduce contention, but it also reduced concurrency, which left us right back where we started.

I pulled out my poor man's profiler to see if I could figure out exactly what was holding the lock and what it was doing with it. Here are the stacks I got.

Immediately, we can see that lots of stuff is waiting on locks inside of malloc/free-related functions. After reading through the MySQL sources, it was clear that this thread was holding the kernel_mutex.
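
For anyone who hasn't used one, a poor man's profiler just asks gdb for a backtrace of every thread and counts how many threads share each stack. The following is a minimal sketch of the idea, not the exact script used for this post; the function names sample_stacks and collapse_stacks are made up for illustration.

```shell
# Dump a backtrace for every thread of the given PID.
# Note: gdb pauses the process while it is attached.
sample_stacks() {
    gdb -batch -ex "set pagination 0" -ex "thread apply all bt" -p "$1" 2>/dev/null
}

# Collapse gdb output (on stdin) into one comma-separated line of function
# names per thread, then count identical stacks, most common first.
collapse_stacks() {
    awk '
        # A "Thread N ..." header starts a new stack; flush the previous one.
        /^Thread / { if (stack != "") print stack; stack = "" }
        /^#/ {
            # Frames look like "#1  0xADDR in func (...)" or "#0  func (...)".
            fn = ($2 ~ /^0x/ && $3 == "in") ? $4 : $2
            stack = (stack == "") ? fn : stack "," fn
        }
        END { if (stack != "") print stack }
    ' | sort | uniq -c | sort -rn
}
```

Something like `sample_stacks "$(pidof mysqld)" | collapse_stacks` prints the most common stacks first, which is how a pile of threads all stuck under malloc or free stands out.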

Note: all links to glibc code below are specifically to the version that we are using.

Reading through _int_free in the glibc sources seemed to indicate that there was only one lock (malloc_state->mutex) in there.

Our glibc was built with --enable-experimental-malloc, which is supposed to reduce contention by dividing the heap into multiple arenas, each with its own lock (at least as far as I understand it, and I'm far from an expert).

tcmalloc is a malloc implementation from google-perftools that satisfies small malloc requests without locks by using a per-thread cache. Using tcmalloc should mean that the allocations inside the kernel_mutex are (at least mostly) lockless.

Here's how to LD_PRELOAD tcmalloc.

Put this in /usr/local/bin/mysqld_wrapper:
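
A minimal sketch of such a wrapper follows. The library and mysqld paths are assumptions that will differ between systems, and the helper names preload_value and start_mysqld are made up for illustration.

```shell
#!/bin/sh
# Sketch of /usr/local/bin/mysqld_wrapper.
# ASSUMPTION: both default paths are illustrative; adjust them for your system.
TCMALLOC_LIB="${TCMALLOC_LIB:-/usr/lib/libtcmalloc_minimal.so}"
MYSQLD_BIN="${MYSQLD_BIN:-/usr/sbin/mysqld}"

# Print the LD_PRELOAD value to use: the given library first, followed by
# anything that was already being preloaded.
# $1 = library path, $2 = existing LD_PRELOAD value (may be empty)
preload_value() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

start_mysqld() {
    if [ ! -r "$TCMALLOC_LIB" ]; then
        echo "mysqld_wrapper: cannot read $TCMALLOC_LIB" >&2
        exit 1
    fi
    LD_PRELOAD=$(preload_value "$TCMALLOC_LIB" "${LD_PRELOAD:-}")
    export LD_PRELOAD
    # exec replaces this shell with mysqld, so the PID stays the same.
    exec "$MYSQLD_BIN" "$@"
}

# Only launch the server when run under the wrapper's own name, so the file
# can be sourced for inspection without starting anything.
case "${0##*/}" in
    mysqld_wrapper) start_mysqld "$@" ;;
esac
```

The file needs to be executable (for example `chmod 755 /usr/local/bin/mysqld_wrapper`). Using exec matters because whatever launches the wrapper then ends up supervising mysqld itself rather than an intermediate shell.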

Put this fragment in my.cnf:
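
A sketch of what such a fragment might look like, assuming the server is started through mysqld_safe and the wrapper is installed as /usr/local/bin/mysqld_wrapper. The ledir and mysqld options tell mysqld_safe which directory to look in and which program to start; whether ledir is needed depends on how your installation locates the server binary.

```ini
# ASSUMPTION: the wrapper is /usr/local/bin/mysqld_wrapper and
# the server is launched via mysqld_safe.
[mysqld_safe]
ledir  = /usr/local/bin
mysqld = mysqld_wrapper
```

After a restart, mysqld_safe should launch the wrapper, which in turn starts the real mysqld with tcmalloc preloaded.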

Since we moved to tcmalloc, contention for the kernel_mutex has completely disappeared. We're also seeing better performance overall and using ~15% less memory in total. This fix probably isn't applicable in all cases, but if you're seeing kernel_mutex contention, it's worth using your poor man's profiler to see whether swapping allocators might help.
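
If you try this, it's worth confirming that the preload actually took effect, since a typo in the library path can fail quietly. One way to check on Linux is to look for the library in the running process's memory map. This is a small sketch; the function name maps_has_tcmalloc is made up for illustration.

```shell
# Succeeds if the given maps file (e.g. /proc/<pid>/maps) shows a tcmalloc
# shared library mapped into the process.
# $1 = path to a maps file
maps_has_tcmalloc() {
    grep -q 'libtcmalloc' "$1"
}
```

For example, `maps_has_tcmalloc "/proc/$(pidof mysqld)/maps" && echo "tcmalloc is loaded"`, assuming a single mysqld instance is running on the host.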