Wettone.com

Reducing SlimStat's database size

Apr
20

One of the most common themes in the feedback I get is that the database becomes quite large over time. That was the main reason behind the Admin plug-in. I've been tinkering with my database this evening and I thought I'd recount what I've done so other people could try it and see how their performance is affected.

The main difference between the databases in ShortStat and SlimStat is that I added lots of indexes. These are the main reason SlimStat is faster than ShortStat, but they are also the main reason that the database grows so quickly.

I added indexes on multiple fields, for maximum speed when selecting. The downside is that their cardinality can be huge. For some of them there is one entry in the index for each row in the table. So an obvious strategy is to reduce these to single fields and therefore reduce their cardinality and size.

To remove the multiple-field indexes, run these queries:

ALTER TABLE `slimstat` DROP INDEX `resource_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `searchterms_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `domain_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `platform_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `browser_version_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `country_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `language_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `dt_total_uniques_idx`;
ALTER TABLE `slimstat` DROP INDEX `visit_total_uniques_idx`;

You could do the equivalent of this from within a graphical tool such as PHPMyAdmin. Just remove all the indexes that have the same cardinality as the primary key.

Then add new single-field indexes:

ALTER TABLE `slimstat` ADD INDEX (`resource`);
ALTER TABLE `slimstat` ADD INDEX (`searchterms`);
ALTER TABLE `slimstat` ADD INDEX (`domain`);
ALTER TABLE `slimstat` ADD INDEX (`platform`);
ALTER TABLE `slimstat` ADD INDEX `browser_version` (`browser`,`version`);
ALTER TABLE `slimstat` ADD INDEX (`country`);
ALTER TABLE `slimstat` ADD INDEX (`language`);
ALTER TABLE `slimstat` ADD INDEX (`visit`);

This will all take quite a while, so I suggest pasting it all in at once and going to put the kettle on. If removing any index breaks, that probably indicates a bug in the setup script, so please let me know. :)

There's still one multiple-field index in there, on browser and version, but it's a useful index to have because one of the modules uses both those fields. I decided not to create an index on dt because it would still have large cardinality, as it's based on a timestamp to the nearest second. If there were separate fields for year, month, day, hour, minute and second, indexing them would probably be worthwhile. That's something I have been considering changing.

Almost forgot: before you do any of this, you may want to remove the HTML comments from line 190 of index.php, so you can see the execution time on screen more easily. It just shows the number of seconds that it took to display the current page. Having several before-and-after times would be useful.

If you do try this, please send feedback with as much detail as you like. I've noticed a bit of slowdown, but nothing drastic. My database is certainly significantly smaller, so I think I'm going to alter the setup script unless I get vastly negative feedback.

12 comments on this entry. Comments are now closed.

About this page

This is an entry in Stephen Wettone's weblog, published on 20 April 2006.

Summary of this entry

One of the most common themes in the feedback I get is that the database becomes quite large over time. I've been tinkering with my database this evening and I thought I'd recount what I've done so other people could try it and see how their performance is affected.

Other weblog entries

The previous entry, published on 14 Mar 2006, was SlimStat via JavaScript.

Use the calendar to find other entries in the weblog.

← Mar Apr
SMTWTFS
301
2345678
9101112131415
16171819202122
23242526272829

Search the weblog

Enter a word or phrase to search my weblog archives: