Playing with data in databases is sometimes tricky but when you get down to it it’s just couple of lines on the command line. Sometime ago we switched from Piwik PRO to Matomo and of course we wanted to migrate logs. We couldn’t just use the full MySQL / MariaDB database dump and go with it as table names and the schema was different (Piwik PRO 3.1.1. -> Matomo 3.5.1). In short we needed to export couple of tables and rename them to match new instance similarly as discussed in Stack Overflow.
There’s a VisitExport plugin for Piwik/Matomo which lets you export and import log tables with PHP and JSON files but it didn’t seem usable approach for our use case with tables being 500 MB or so.
The more practical solution was to simply create a dump of the tables we wished to restore separately.
Web analytics is one the essential tools for a website and including measuring web traffic and getting information about the number of visitors it can be also used as a tool to assess and improve the effectiveness of a website. The most common way to collect data is to use on-site web analytics, measure a visitor’s behavior once on your website, with page tagging technology like on Google Analytics which is widely used web analytics service. But what would you use if you want to keep control over your own data?
You don’t have to look farther than Piwik which is open source web analytics application and aims to be the ultimate open alternative to Google Analytics. Here’s a short overview to Piwik Analytics and how to get started with it.
“Web analytics is the measurement, collection, analysis and reporting of web data for purposes of understanding and optimizing web usage.” – Wikipedia
Piwik Open Analytics Platform
Piwik is web analytics application which tracks online visits to one or more websites and displays reports on these visits for analysis. In short it aims to be the ultimate open source alternative to Google Analytics. The code is GPL v3 licensed and available in GitHub. In technical side Piwik is written in PHP, uses MySQL or MariaDB database and you can host it by yourself. And if you don’t want to setup or host Piwik yourself you can also get commercial services.
Piwik provides the usual features you would expect from a web analytics application. You get reports regarding the geographic location of visits, the source of visits, the technical capabilities of visitors, what the visitors did and the time of visits. Piwik also provides features for analysis of the data it accumulates such as saving notes to data, goals for actions, transitions for seeing how visitors navigate, overlaying analytics data on top of a website and displaying how metrics change over time. The easiest way to see what it has to offer is to check the Piwik online demo.
You might ask how Piwik differs from other web analytics applications such as Google Analytics? One principle advantage of using Piwik is that you are in control. You can host Piwik on your own server and the data is tracked inside your MySQL or preferably MariaDB database: you’ve full control over your data. Software as a service analytics applications on the other hand, have full access to the data users collect. Data privacy is essential for public sector and enterprises who can’t or don’t want to share it for example with Google. You ensure that your visitors behavior on your website is not shared with advertising companies.
Other interesting feature is that it provides advanced privacy options: ability to anonymize IP addresses, purge tracking data regularly (but not report data), opt-out support and Do Not Track support. Your website visitors can decide if they want to be tracked.
You can also do scheduled reports which are sent by e-mail, import data from web server logs, use the API for accessing reports and administrative functions and Piwik also has mobile app to access the analytics data. Piwik is also customizable with plugins and you can integrate it with WordPress and other applications.
Piwik’s User Interface
Piwik has clean and simple user interface as seen in the following screenshots (taken from the online demo).
Setting up Piwik
Setting up Piwik is easy and there’s good documention available for running Piwik web analytics. All you need is web server like Nginx, PHP 7 and MariaDB which has in some cases significantly improved query performance and reliability of Piwik over using MySQL. You can setup it manually but the most easiest way to start with it is to use the provided Docker image and docker-compose. The docker-compose file setups four containers (MySQL, Piwik, Nginx and Cron) and with compose you can start it up. The Piwik image is available from official docker-library.
The alternative is to do your own Docker image for Piwik and related services. In my opinion it makes sense to have just two containers: one for Piwik related web stuff and other for MariaDB. The Piwik container runs Piwik, Nginx and Cron script with e.g. supervisor. The official image uses Debian (from PHP) but Piwik runs nicely also on Alpine Linux. One thing to tinker with when using Docker is to get MariaDB access to Piwik’s assets for LOAD DATA INFILE which will greatly speed Piwik’s archiving process.
Piwik is good and feature rich alternative for web analytics application. Setting it up isn’t as straightforward as using some hosted service as Google Analytics but that’s the way self-hosted services always are. If you need web analytics and want to keep control of your own data and don’t mind hosting it yourself and paying for the server then Piwik is a good choice.