Google Universal Analytics: Data and Structure
Legacy content
This page is for Universal Analytics (UA). Find a page like this for Google Analytics 4 (GA4)
EPA content related to Google Analytics is changing.
Google's legacy platform, Universal Analytics (UA), will reach end of life in mid-2023 with a one-time extension for contracting clients such as EPA until July 1, 2024. See KB article.
In these Web Analytics pages, content for Universal Analytics is marked "Google Universal Analytics (legacy)."
Content for the new platform, Google Analytics 4, is marked "Google Analytics 4 (GA4)."
Google Analytics (GA) is a Web analytics tool. It collects Web traffic metrics using Google Tag Manager (GTM), which is included in the WebCMS template. Anyone building an application should use the standalone template for applications or, at a minimum, include the GTM code. The code sends User and Session data to Google servers for processing: Google servers are notified any time a page is viewed by a Web browser. This process of using embedded code to collect Web traffic metrics is called page tagging. GA is one of many Web analytics tools that use page tagging.
- How Does it Work?
- Cookies and Privacy
- Page Tagging v. Log File Analysis
- How is Content Organized in Google Analytics?
How Does it Work?
GA works by setting various cookies in visitors’ Web browsers. Cookies are small files that Web servers place in Web browsers, often for the purpose of tracking internet activity. The file consists of a text message that is sent back to the server each time the browser requests a Web page. Session cookies remain in a Web browser only until the browser is closed or has been inactive for a specified amount of time. Persistent cookies, on the other hand, remain after a browser session ends. Some persistent cookies, including those set by GA, are set to expire after a specific amount of time passes (e.g., six months or two years).
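The difference between session and persistent cookies can be illustrated with a minimal Python sketch. It is not GA's own code: the cookie names and values are hypothetical, and the two-year lifetime is simply one of the example durations mentioned above. The sketch builds the header a server could send to set each kind of cookie. A cookie without a lifetime is a session cookie, while adding a Max-Age attribute makes it persistent.

```python
from http.cookies import SimpleCookie

# Two years expressed in seconds (one of the example lifetimes in the text).
TWO_YEARS = 2 * 365 * 24 * 60 * 60


def build_cookie(name, value, max_age_seconds=None):
    """Return a Set-Cookie header value.

    With no max age the browser treats the cookie as a session cookie;
    with a max age the cookie persists after the browser session ends.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    if max_age_seconds is not None:
        cookie[name]["max-age"] = max_age_seconds
    return cookie[name].OutputString()


# Hypothetical cookie names and values, for illustration only.
session_cookie = build_cookie("example_session", "abc123")
persistent_cookie = build_cookie("example_visitor", "xyz789", TWO_YEARS)

print(session_cookie)
print(persistent_cookie)
```

The only difference between the two headers is the Max-Age attribute, which tells the browser how long to keep the cookie.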
Cookies and Privacy
Neither EPA nor Google collects any personally identifiable information (PII) about Users of the EPA website.
GA uses first-party persistent cookies, which the government classifies as Tier 2 persistent cookies. Tier 2 persistent cookies do not collect any PII and are permissible for use by federal agencies. For more information on how the Office of Management and Budget (OMB) defines persistent cookies, see OMB M-10-22, Guidance for Online Use of Web Measurement and Customization Technologies.
Unless you first opt out by blocking the cookies, GTM will automatically set a persistent cookie in the browser of the computer or mobile device you use to access the EPA website. Visitors can choose not to allow GA to track their Web activity by changing their browser settings. Modern browsers have options to block the kinds of cookies set by GA.
Cookie Deletion and Return Users
Page Tagging v. Log File Analysis
An alternative approach to collecting Web traffic metrics through page tagging is log file analysis. This method entails downloading server log files for processing in an analytics software program. It does not use page tagging or cookies.
Since server logs record all server transactions, including activity from Web crawlers and bots, software is needed to filter out non-human activity. While log file software does filter out known crawlers and bots and those that self-identify, not all crawlers and bots self-identify, making it difficult to filter all non-human activity.
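The filtering problem described above can be sketched in a few lines of Python. This is an illustration only, not how any particular log analysis product works: the marker list and the sample records are hypothetical. The filter removes hits whose user-agent string self-identifies as a bot, but it cannot catch a bot that presents an ordinary browser user agent.

```python
# Hypothetical substrings that self-identifying crawlers tend to include.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_self_identified_bot(user_agent):
    """Return True when the user-agent string announces itself as a bot."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def filter_known_bots(log_records):
    """Keep only the records that do not self-identify as bots."""
    return [r for r in log_records if not is_self_identified_bot(r["user_agent"])]


# Sample records (invented for this sketch). The third one stands in for a
# bot that sends a browser-like user agent and therefore slips through.
sample_log = [
    {"path": "/", "user_agent": "Mozilla/5.0 (Windows NT 10.0) Firefox/115.0"},
    {"path": "/", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"path": "/about", "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Chrome/114.0"},
]

remaining = filter_known_bots(sample_log)
print(len(remaining))
```

Two of the three records remain after filtering, including the one from the disguised bot, which is why log-based counts can overstate human activity.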
On the other hand, GA page tags have to be activated by GTM, which the vast majority of Web spiders and bots do not process. While the bots that do process the tags may represent a small amount of traffic, they should be considered as part of any Web traffic analysis.
Log files may not capture all human activity either, since repeat visits to the same Web page can cause the page to be retrieved from the browser’s cache. Web servers do not typically record such transactions.
Where page tagging holds a major advantage over log file analysis, however, is in the breadth of traffic metrics that can be collected and the ad hoc customizations that are available.
Page tagging solutions use cookies to track Return Sessions and other Session-based metrics, such as Pages per Session and Session Duration. Log file software relies on IP addresses to calculate Session-based metrics, which can be problematic since many large companies have dynamic IP addresses that can change after or even during a Session. Even though some Users delete their cookies prior to returning to the same website, page tagging is viewed as providing a more accurate calculation of Session-based metrics.
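The effect of a changing IP address on Session counts can be shown with a simplified Python sketch. The hit records, field names, and the 30-minute inactivity timeout are assumptions made for illustration, and the grouping logic is a rough approximation rather than the algorithm used by GA or by any log file product. The same three hits are grouped once by a cookie identifier and once by IP address.

```python
from datetime import datetime, timedelta

# Assumed inactivity window after which a new Session starts.
SESSION_TIMEOUT = timedelta(minutes=30)


def count_sessions(hits, id_field):
    """Count Sessions by grouping hits on id_field with an inactivity timeout."""
    last_seen = {}
    sessions = 0
    for hit in sorted(hits, key=lambda h: h["time"]):
        visitor = hit[id_field]
        previous = last_seen.get(visitor)
        # A first hit, or a hit after a long gap, opens a new Session.
        if previous is None or hit["time"] - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[visitor] = hit["time"]
    return sessions


# One hypothetical visitor whose IP address changes partway through a visit.
hits = [
    {"time": datetime(2023, 5, 1, 10, 0), "cookie_id": "A", "ip": "192.0.2.10"},
    {"time": datetime(2023, 5, 1, 10, 5), "cookie_id": "A", "ip": "198.51.100.7"},
    {"time": datetime(2023, 5, 1, 10, 10), "cookie_id": "A", "ip": "198.51.100.7"},
]

sessions_by_cookie = count_sessions(hits, "cookie_id")
sessions_by_ip = count_sessions(hits, "ip")
print(sessions_by_cookie, sessions_by_ip)
```

Grouping by cookie yields one Session, while grouping by IP address splits the same visit into two.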
Page tagging tools also provide user-friendly segmentation and custom reporting options. This allows you to quickly calculate the number of Sessions from segments, such as:
- Mobile devices
- Geographical locations (down to the City level)
- Social media referrals
- Searches that included particular keywords
These calculations can be executed quickly in the interface. In contrast, customizing log file reports may require reprocessing the raw log files or even reconfiguring the software itself, and in most cases these customizations are not possible with log file analysis at all.
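As a rough illustration of what segmentation means, the following Python sketch counts Sessions that match one or more conditions. The Session records and field names are hypothetical. In GA these calculations are made through the reporting interface, not through code like this.

```python
from collections import Counter

# Hypothetical Session records with a few segmentable dimensions.
sessions = [
    {"device": "mobile", "city": "Chicago", "source": "social"},
    {"device": "desktop", "city": "Denver", "source": "search"},
    {"device": "mobile", "city": "Denver", "source": "search"},
    {"device": "tablet", "city": "Chicago", "source": "direct"},
]


def count_segment(records, **conditions):
    """Count the Sessions whose fields match every given condition."""
    return sum(
        all(record.get(field) == value for field, value in conditions.items())
        for record in records
    )


def sessions_by_dimension(records, dimension):
    """Tally Sessions for each value of a single dimension."""
    return Counter(record[dimension] for record in records)


mobile_sessions = count_segment(sessions, device="mobile")
mobile_denver_sessions = count_segment(sessions, device="mobile", city="Denver")
city_totals = sessions_by_dimension(sessions, "city")
print(mobile_sessions, mobile_denver_sessions, dict(city_totals))
```

Conditions can be combined freely, which mirrors how a segment narrows a report to a subset of Sessions.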
The main advantage of log file analysis is the internal control of data. Whereas page tagging usually requires third-party hosting and processing of data, log file analysis enables organizations to process metrics without relying on outside parties. Depending on the organization, this can be a major selling point.
For analysis purposes, it is most important to find the tool that meets your needs, understanding that all analytics tools will provide differing calculations, and stick with that tool as you compare metrics month over month and year over year.
How is Content Organized in Google Analytics?
TSSMS areas have no practical application in EPA’s Google Analytics (GA) Account. GA follows an organizational hierarchy that consists of:
- An Account
- Web Properties
- Views
GA accounts can track multiple Web Properties, which represent unique entities, such as websites or applications. It is important to think of each Web Property as representing a distinct entity in its entirety, because the metrics of different Web Properties are inherently separate. For example, one account might include three different Web Properties for an agency’s:
- Public websites
- Intranet websites
- Staging environment
Each Web property has its own set of Views, from which users access metrics and generate reports in the interface. Each view represents a particular instance of data within a Web property. The master view will include all metrics for that Web property, while other views are pre-filtered so that they only include metrics for certain content, such as a single subdomain or Web application.
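The Account, Web Property, and View hierarchy can be modeled conceptually in Python. This sketch is only a mental model: the class names, hostnames, and filter functions are hypothetical, and it does not reflect how GA stores or processes data internally. Each View is represented as a filter over the hits of its Web Property.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Hit = Dict[str, str]


@dataclass
class View:
    name: str
    include: Callable[[Hit], bool]  # filter deciding which hits the View shows

    def report(self, hits: List[Hit]) -> List[Hit]:
        """Return the hits visible in this View."""
        return [hit for hit in hits if self.include(hit)]


@dataclass
class WebProperty:
    name: str
    views: List[View] = field(default_factory=list)
    hits: List[Hit] = field(default_factory=list)


@dataclass
class Account:
    name: str
    properties: List[WebProperty] = field(default_factory=list)


# Hypothetical data: one public Web Property with a master View and one
# View pre-filtered to a single subdomain.
master_view = View("Master", include=lambda hit: True)
data_view = View("Data subdomain", include=lambda hit: hit["hostname"] == "data.example.gov")

public_sites = WebProperty(
    name="Public websites",
    views=[master_view, data_view],
    hits=[
        {"hostname": "www.example.gov", "path": "/"},
        {"hostname": "www.example.gov", "path": "/about"},
        {"hostname": "data.example.gov", "path": "/reports"},
    ],
)
account = Account("Example agency", properties=[public_sites])

for view in public_sites.views:
    print(view.name, len(view.report(public_sites.hits)))
```

The master View reports all three hits, while the subdomain View reports only the one hit from its hostname.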
GA accounts include a limited number of views, so not all content is likely to have, nor will it need, its own unique view. You must contact the Web analytics program to request a new View.