Web analytics

This document currently describes the approach to web analytics for content on www.ietf.org maintained in the Wagtail CMS.

The IETF uses the Matomo On-Premise analytics package for www.ietf.org. While Matomo provides a broad range of functionality, only a limited subset is used. Table 1 summarizes the data collection and the motivation for collecting each field. This approach was implemented after consulting the IETF community on a proposal, which provides additional detail and background.

Website analytics is implemented to:

  • limit collection and retention of data to what is needed to serve specific identified purposes;
  • be documented, as appropriate, in the privacy policy;
  • not require the use of web cookies; and
  • not impede the use of the website via browsers that do not have JavaScript enabled.
Table 1: Website Analytics Data Collection

| Item | Description |
| --- | --- |
| IP address | IP address of the request |
| Timestamp | Approximate date and time a site resource was requested |
| Page Title | The title of the requested web page (from the HTML title tag) |
| Page URL | URL of the requested resource |
| Referer URL | URL of the page that linked to the requested resource |
| File Downloads | Identifies which non-HTML resources were downloaded from the current page |
| Outside Link Clicks | Identifies which links to sites outside www.ietf.org were clicked on the current page |
| Page Speed | The time it takes for web pages to be generated by the webserver and then downloaded by the requestor |
| Browser Language | The preferred language of the requestor’s browser (derived from the HTTP Accept-Language header) |
| User Agent | The user agent string of the browser making the request (derived from the HTTP User-Agent header) |

Matomo collects the raw visitor data defined in Table 1, and then computes aggregate data (reports) summarizing this raw data.

The aggregated data is made available to the IETF LLC staff, contractors whose role requires it, and the Internet Engineering Steering Group.

Access to the raw visitor data is restricted to only those users required to operate the system.

The analytics configuration uses only client-side JavaScript to collect metrics. The Matomo Image Tracker feature, which allows limited metric collection without JavaScript, is disabled.

A visitor can prevent all web analytics functionality by disabling JavaScript for www.ietf.org in their browser.

The collection and reporting of website usage metrics entails the handling of IP addresses, which in certain environments might enable user identification.

Therefore, the product will be configured to apply the “Matomo level 2” anonymization scheme:

  • IPv4 – mask the lower 16 bits of the address
  • IPv6 – mask the lower 80 bits of the address
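The masking rule above can be illustrated with a short sketch. This is not Matomo's own code; it is a minimal Python illustration, using only the standard library, of what zeroing the lower 16 bits (IPv4) or lower 80 bits (IPv6) does to an address. The function name `anonymize_ip` and the `MASKED_BITS` table are hypothetical names chosen for this example.

```python
import ipaddress

# Low-order bits zeroed per IP version, per the two bullets above.
MASKED_BITS = {4: 16, 6: 80}


def anonymize_ip(address: str) -> str:
    """Return the address with its low-order bits set to zero."""
    ip = ipaddress.ip_address(address)
    masked_bits = MASKED_BITS[ip.version]
    # All-ones value for this address family, with the low bits cleared.
    mask = ((1 << ip.max_prefixlen) - 1) ^ ((1 << masked_bits) - 1)
    # Rebuild with the same class so an IPv6 result is never read as IPv4.
    return str(type(ip)(int(ip) & mask))


print(anonymize_ip("192.0.2.123"))
print(anonymize_ip("2001:db8:1234:5678:9abc:def0:1234:5678"))
```

With this masking, an IPv4 address retains only its first two octets and an IPv6 address retains only its first 48 bits, so many distinct hosts map to the same stored value.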

IP addresses are not logged in un-anonymized form by the analytics system, and the system is configured to minimize the long-term re-identification of users across visits. Specifically, this entails disabling tracking cookies and not using the Matomo User ID feature in the Tracking API, which allows for persistent user identification (even across networks).

Returning visitor statistics (i.e., the linking of multiple page requests) are enabled based on a dynamically calculated fingerprint that uses the “operating system, browser, browser plugins, [anonymized] IP address and browser language”. The lifetime of this fingerprint is 30 minutes. There is residual risk that could lead to the identification of users:

  • Geolocation of these IP addresses (in concert with the Browser Language) is an expected analysis. For countries with a small number of IETF participants, one might be able to infer their usage.
  • With holistic access to the raw visitor data (likely through SQL-level access to the underlying Matomo database as this is not a product feature), novel de-anonymization approaches could be possible. This risk is mitigated by restricting access to the database (and raw visitor information) as noted above.
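The fingerprint-based linking of page requests described above can be pictured with a rough sketch. This is an illustration only and does not reproduce Matomo's actual algorithm: the choice of SHA-256, the field separator, and the reading of the 30-minute lifetime as a maximum gap between two linked requests are all assumptions made for this example, as are the names `visit_fingerprint` and `same_visit`.

```python
import hashlib

# Assumption: the 30-minute lifetime is modelled as a maximum gap in seconds.
FINGERPRINT_LIFETIME_SECONDS = 30 * 60


def visit_fingerprint(os_name, browser, plugins, anonymized_ip, language):
    """Hash the attributes quoted in the text into one opaque value.

    SHA-256 and the separator are illustrative choices, not Matomo's.
    """
    parts = [os_name, browser, ",".join(sorted(plugins)), anonymized_ip, language]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def same_visit(fp_a, time_a, fp_b, time_b):
    """Link two requests only if fingerprints match within the lifetime."""
    return fp_a == fp_b and abs(time_b - time_a) <= FINGERPRINT_LIFETIME_SECONDS
```

Because the IP address fed into the fingerprint is already anonymized, two different visitors behind the same masked prefix with identical browser attributes would be indistinguishable in this sketch.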

Matomo is configured with data retention periods defined in Table 2. Data beyond this period is purged.

Table 2: Retention Periods

| Data Set | Retention |
| --- | --- |
| Raw Visitor Information | 5 days |
| Aggregate Data | 12 months |
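As a rough illustration of the retention rule in Table 2, the following Python sketch drops records older than their retention window. It is not how Matomo implements purging; the record layout, the `purge` function, and the approximation of 12 months as 365 days are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

RAW_RETENTION = timedelta(days=5)
# Assumption: "12 months" is approximated as 365 days for this sketch.
AGGREGATE_RETENTION = timedelta(days=365)


def purge(records, retention, now):
    """Return only the records still inside the retention window."""
    cutoff = now - retention
    return [r for r in records if r["recorded_at"] >= cutoff]


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
raw = [
    {"id": 1, "recorded_at": datetime(2024, 1, 4, tzinfo=timezone.utc)},  # 6 days old
    {"id": 2, "recorded_at": datetime(2024, 1, 6, tzinfo=timezone.utc)},  # 4 days old
]
kept = purge(raw, RAW_RETENTION, now)
```

Here only the second record survives, since the first is older than the 5-day window for raw visitor information.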

Configuration of website analytics is subject to review for GDPR compliance by IETF LLC Counsel, and compliance with the IETF Privacy Statement.