A possible fix for a bottleneck in piwik used on high traffic sites

The last weeks I have been confronted with a big performance problem on a mid- to high-traffic website using Piwik to track the user behaviour. The MySQL server permanently had a load of over 90 and the mysqld process consumed permanently 99% CPU time.

The system runs as a single virtual VMware machine on quite good hardware. Only the Apache webserver with PHP 5.2 and MySQL 5.0 are running on this machine.

Most of you may now say something like: „Ok, you are wondering why your MySQL server runs out of performance on a virtual machine? Get a new job…“ But there are some reasons for that. And no, the solution was not moving the system from the virtual machine to a physical one.

Looking at the output of mytop showed me that mostly one kind of queries ran very slowly. The first thing I did was tweeking the configuration files of the MySQL server an the Apache webserver. After that, „show status“ looked much better but mysqld still comsumed 99% CPU time and the load was still over 90.

Even google couldn’t give me any solutions or hints. The only thing that I found out was, that many other people have the same problem with much better hardware. So it seemed that the performance problem comes from the Piwik application.

Knowing that tuning the application instead of tuning the server’s configuration files in the backend shows up the best results in most cases I took a closer look at the database structure (exploring the PHP code of piwik didn’t make sense to me).

As I mentioned before mytop showed me that the slow queries came all from the same source:

SELECT idaction FROM piwik_log_action WHERE name = 'my/site/has/a/long/uri/?s=af3729febc827382424' AND type = 3

The Piwik javascript block has been added to a site with a long URI where the unique part of the URI (the user session ID) started after character 15 of the URI string. Most entries in piwik_log_action.name look like the following:

‚my/site/has/a/long/uri/?s=af3729febc827382424‘
‚my/site/has/a/long/uri/?s=0ca629febb623893883‘
‚my/site/has/a/long/uri/?s=9ceff388edb34093490‘
‚my/site/has/a/long/uri/?s=5bbefef672beaa82839‘
‚my/site/has/a/long/uri/?s=55392fea00bccde0392‘

The index on the coresponding table field was set to 15. Adding a new index with a size of 50 and dropping the old one solved the problem. The server now idles with a load of 0.25 and the mysqld process now comsumes about 8% of the CPU time instead of 99%.

Here is the solution:

The original create table statement looks like this:

CREATE TABLE `piwik_log_action` (
  `idaction` int(10) unsigned NOT NULL auto_increment,
  `name` varchar(255) NOT NULL,
  `type` tinyint(3) unsigned default NULL,
  PRIMARY KEY  (`idaction`),
  KEY `index_type_name` (`type`,`name`(15))
) ENGINE=MyISAM DEFAULT CHARSET=utf8

To add the new index you just type the following command in the MySQL shell:

CREATE INDEX index_type_name2 ON piwik_log_action (type, name(50));

The new table structure should then look like:

CREATE TABLE `piwik_log_action` (
  `idaction` int(10) unsigned NOT NULL auto_increment,
  `name` varchar(255) NOT NULL,
  `type` tinyint(3) unsigned default NULL,
  PRIMARY KEY  (`idaction`),
  KEY `index_type_name` (`type`,`name`(15)),
  KEY `index_type_name2` (`type`,`name`(50))
) ENGINE=MyISAM DEFAULT CHARSET=utf8

To remove the old index just type the following in your MySQL shell:

DROP INDEX index_type_name ON piwik_log_action;

The final table structure should then look like:

CREATE TABLE `piwik_log_action` (
  `idaction` int(10) unsigned NOT NULL auto_increment,
  `name` varchar(255) NOT NULL,
  `type` tinyint(3) unsigned default NULL,
  PRIMARY KEY  (`idaction`),
  KEY `index_type_name2` (`type`,`name`(50))
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Piwik can so be used for mid- and high-traffic sites on a virtual machine without any problem. But be aware: adding a new index on a table with many entries takes some time. While the index is set up by MySQL the table will be locked! So don’t do that while your site has much traffic.

Kudos to Manuel Kiessling from MyHammer AG for providing the crucial hint!

redbrick (12.11.2009 23:00:51)

@Anthon: Thanks for your feedback. I know that this problem will be solved in the upcoming version. But many people have this problem and I could not find any (short term) solutions. And I think this „hack“ is OK as long as the new version is not released. I hope the changes will not raise problems during the upgrade process.

I am glad to read that problem is already fixed in the SVN and I am looking forward to the upcoming release!

Anthon Pang (12.11.2009 19:02:19)

This is a known problem (see forum or trac).

It’s already fixed in svn, Watch for the upcoming 0,5 release.

Manuel Kiessling (12.11.2009 14:55:12)

Offenbar haben sie es schon auf der Uhr:
http://dev.piwik.org/trac/ticket/708