Web analytics with Piwik: keeping control over your own data

Web analytics is one of the essential tools for a website. Besides measuring web traffic and counting visitors, it can also be used to assess and improve the effectiveness of a website. The most common way to collect data is on-site web analytics, which measures a visitor’s behavior once they are on your website, using page tagging technology as in Google Analytics, a widely used web analytics service. But what would you use if you want to keep control over your own data?

You don’t have to look further than Piwik, an open source web analytics application that aims to be the ultimate open alternative to Google Analytics. Here’s a short overview of Piwik and how to get started with it.

“Web analytics is the measurement, collection, analysis and reporting of web data for purposes of understanding and optimizing web usage.” – Wikipedia

Piwik Open Analytics Platform

Piwik is a web analytics application which tracks online visits to one or more websites and displays reports on these visits for analysis. In short, it aims to be the ultimate open source alternative to Google Analytics. The code is GPL v3 licensed and available on GitHub. On the technical side, Piwik is written in PHP, uses a MySQL or MariaDB database, and you can host it yourself. If you don’t want to set up or host Piwik yourself, commercial hosting services are also available.

Piwik provides the usual features you would expect from a web analytics application. You get reports on the geographic location of visits, the source of visits, the technical capabilities of visitors, what the visitors did and the time of visits. Piwik also provides features for analyzing the data it accumulates, such as annotating data with notes, setting goals for actions, transitions for seeing how visitors navigate, overlaying analytics data on top of a website and displaying how metrics change over time. The easiest way to see what it has to offer is to check the Piwik online demo.

Feature highlights

You might ask how Piwik differs from other web analytics applications such as Google Analytics. One principal advantage of using Piwik is that you are in control. You can host Piwik on your own server, and the data is stored in your own MySQL or, preferably, MariaDB database: you have full control over your data. Software as a service analytics applications, on the other hand, have full access to the data their users collect. Data privacy is essential for the public sector and for enterprises that can’t or don’t want to share their data with, for example, Google. With Piwik you ensure that your visitors’ behavior on your website is not shared with advertising companies.

Another interesting feature is the set of advanced privacy options: the ability to anonymize IP addresses, purge tracking data regularly (but not report data), opt-out support and Do Not Track support. Your website visitors can decide whether they want to be tracked.
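To make the privacy options a bit more concrete, here is a minimal Python sketch of what IP anonymization and a Do Not Track check could look like. This is not Piwik’s own code (Piwik is written in PHP); the function names `anonymize_ip` and `wants_no_tracking` and the `masked_bytes` parameter are made up for illustration, and the sketch simply zeroes the trailing bytes of the address.

```python
import ipaddress


def anonymize_ip(address: str, masked_bytes: int = 1) -> str:
    """Zero out the trailing bytes of an IPv4 or IPv6 address.

    Hypothetical helper for illustration, not part of Piwik.
    """
    ip = ipaddress.ip_address(address)
    keep_bits = ip.max_prefixlen - 8 * masked_bytes  # 32 bits for IPv4, 128 for IPv6
    if keep_bits < 0:
        raise ValueError("masked_bytes exceeds the address length")
    # strict=False lets ip_network accept an address with host bits set
    network = ipaddress.ip_network(f"{address}/{keep_bits}", strict=False)
    return str(network.network_address)


def wants_no_tracking(headers: dict) -> bool:
    """Return True if the request headers contain 'DNT: 1'.

    Hypothetical helper for illustration, not part of Piwik.
    """
    normalized = {key.lower(): value.strip() for key, value in headers.items()}
    return normalized.get("dnt") == "1"
```

For example, `anonymize_ip("192.168.12.34")` gives `"192.168.12.0"`, so the stored address no longer identifies a single machine while still being useful for rough geolocation.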

You can also schedule reports to be sent by e-mail, import data from web server logs and use the API to access reports and administrative functions. Piwik also has a mobile app for accessing the analytics data. Piwik is customizable with plugins, and you can integrate it with WordPress and other applications.
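As a rough idea of what using the reporting API looks like, the following Python sketch builds a request URL for a report. The query parameters (`module=API`, `method`, `idSite`, `period`, `date`, `format`, `token_auth`) follow Piwik’s HTTP Reporting API as I understand it; the host name, site id and token in the usage example are placeholders, and `build_report_url` is just a helper name invented for this post.

```python
from urllib.parse import urlencode


def build_report_url(base_url, method, site_id, period, date, token_auth, fmt="JSON"):
    """Build a Piwik Reporting API URL (helper name is hypothetical)."""
    params = {
        "module": "API",
        "method": method,        # e.g. "VisitsSummary.get"
        "idSite": site_id,
        "period": period,        # e.g. "day", "week", "month"
        "date": date,            # e.g. "today" or "2017-01-31"
        "format": fmt,
        "token_auth": token_auth,
    }
    return f"{base_url.rstrip('/')}/index.php?{urlencode(params)}"


# Placeholder host and token; replace them with your own installation's values.
url = build_report_url(
    "https://analytics.example.org", "VisitsSummary.get", 1, "day", "today", "YOUR_TOKEN"
)
```

Fetching that URL with any HTTP client should return the visits summary for the given site and period, assuming the token has view access to the site.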

Piwik’s User Interface

Piwik has a clean and simple user interface, as seen in the following screenshots (taken from the online demo).

Piwik Dashboard

Piwik Visitors Overview

Setting up Piwik

Setting up Piwik is easy and there’s good documentation available for running it. All you need is a web server like Nginx, PHP 7 and MariaDB, which in some cases has significantly improved the query performance and reliability of Piwik compared to MySQL. You can set it up manually, but the easiest way to get started is to use the provided Docker image and docker-compose. The docker-compose file sets up four containers (MySQL, Piwik, Nginx and Cron), and with compose you can start them all up. The Piwik image is available from the official docker-library.

The alternative is to build your own Docker image for Piwik and the related services. In my opinion it makes sense to have just two containers: one for the Piwik-related web stuff and the other for MariaDB. The Piwik container runs Piwik, Nginx and the Cron script with e.g. supervisor. The official image uses Debian (from the PHP image), but Piwik also runs nicely on Alpine Linux. One thing to tinker with when using Docker is giving MariaDB access to Piwik’s assets for LOAD DATA INFILE, which will greatly speed up Piwik’s archiving process.

If you’re setting up Piwik manually, you can watch a video of the installation and, after that, a video of configuring the settings. After you’re done with the 5 minute installation you get the JavaScript tag, which you add to the bottom of each page of your website. If you’re using React, there’s a Piwik analytics component for React Router. Piwik will then record the activity across your website in your database.
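Under the hood the JavaScript tag reports each page view to Piwik’s tracking endpoint over HTTP, and the same endpoint can be called from server-side code. The Python sketch below builds such a tracking URL; the parameter names (`idsite`, `rec`, `url`, `action_name`, `rand`) are taken from Piwik’s Tracking HTTP API as I understand it, while the host name, site id and the helper name `build_tracking_url` are assumptions made for illustration.

```python
import random
from urllib.parse import urlencode


def build_tracking_url(piwik_url, site_id, page_url, page_title, rand=None):
    """Build a page view request for piwik.php (helper name is hypothetical)."""
    params = {
        "idsite": site_id,          # id of the website in Piwik
        "rec": 1,                   # required for the request to be recorded
        "url": page_url,            # full URL of the viewed page
        "action_name": page_title,  # title shown in the reports
        # random value to keep caches from swallowing the request
        "rand": rand if rand is not None else random.randint(0, 2**31 - 1),
    }
    return f"{piwik_url.rstrip('/')}/piwik.php?{urlencode(params)}"


# Placeholder host and site id; replace them with your own installation's values.
tracking_url = build_tracking_url(
    "https://analytics.example.org", 1, "https://www.example.org/blog/", "Blog"
)
```

For a normal website the JavaScript tag is all you need; a server-side call like this is mostly useful for things the browser never sees, such as API requests or background jobs.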

And that’s about all there is to getting started with Piwik: set it up with Docker or manually, add the JavaScript tag, configure some options if needed and then just wait for the data from your visitors.

Summary

Piwik is a good and feature-rich alternative for web analytics. Setting it up isn’t as straightforward as using a hosted service such as Google Analytics, but that’s the way self-hosted services always are. If you need web analytics, want to keep control of your own data and don’t mind hosting it yourself and paying for the server, then Piwik is a good choice.

