Google Analytics (UA)

This data source refers to the Universal Analytics (UA) version of Google Analytics. To use the newer version, GA4, see this documentation.

This data source will no longer receive updates or bug fixes.

GA is a Google tool that lets you analyze your website data by installing a tracking code on your pages, which collects information about the users who visit them.

The data generated by Google Analytics includes the number of visits to each page, the time users spend on it, the visitor's geographical location, the channel the visit came from (Google, links on other sites, or direct access by address), the browser used, and whether the page was accessed from a smartphone or a desktop, among other metrics and dimensions that you can use with custom tagging.

These statistics not only help you monitor your website traffic, but also help your marketing team understand the effectiveness of its campaigns.

Having GA data integrated into your database can help you understand the user's complete journey with your brand, starting from their first contact with your website, and can also help you optimize the user experience (UI/UX).

Adding the data source

To add the Google Analytics data source on the Kondado platform, follow the steps below:

1) On the Kondado platform, go to the add new sources page and select the Google Analytics data source;

2) Click AUTHORIZE;

3) Select the account you will use;

4) On the next screen, check ALL the required permissions and click Continue;

5) You will be redirected to Kondado, where all you have to do is give your data source a name and click SAVE.

Pipelines

Relationship Chart

Relationship chart between tables

Custom report

The GA Custom Report pipeline allows you to have full control over the format of your pipeline data.

First, choose the Property (UA) that the report will refer to. This choice is required because the report lets you select custom metrics and dimensions from your account, which is also why a single pipeline cannot fetch data from multiple accounts.

Once the Property parameter is selected, you must select the metrics and dimensions. Dimensions are the attributes by which the information is broken down. For example, date is one dimension and page is another. So, if you select these dimensions (date and page), you will get information detailed by page and date: “on this day, on this page we have X”.

Metrics are the values reported for each combination of dimensions. Going back to the previous example, on a given day, for a given page, you might want the number of users and the number of sessions. Those are the metrics.

The table that will be created in your database will be defined by the combination of metrics and dimensions you choose and will have a format similar to this:

Field	Type
dimension_x	text
dimension_y	text
dimension_z	text
metric_x	float
metric_y	float
metric_z	float
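
As an illustration of how the chosen dimensions and metrics shape the table above, here is a minimal Python sketch. The function `build_schema` and the column names `date`, `page`, `users` and `sessions` are hypothetical and used only for this example; the actual column names created by Kondado may differ. The only rule taken from the table above is that dimensions are stored as text and metrics as float.

```python
def build_schema(dimensions, metrics):
    """Return a field -> type mapping for a report (illustrative only)."""
    # Dimensions come first and are stored as text.
    schema = {name: "text" for name in dimensions}
    # Metrics follow and are stored as float.
    schema.update({name: "float" for name in metrics})
    return schema


# Hypothetical selection: two dimensions and two metrics.
schema = build_schema(["date", "page"], ["users", "sessions"])

for field, field_type in schema.items():
    print(f"{field}\t{field_type}")
```

Each row of such a table would hold one combination of dimension values (one date and one page) together with the metric values for that combination.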

To find out which metrics and dimensions to use, you can use these exploration tools created by Google:

Here is the list of metrics and dimensions: https://ga-dev-tools.appspot.com/dimensions-metrics-explorer/

Here you can test metrics and dimensions before building your pipeline: https://ga-dev-tools.appspot.com/query-explorer/

The tools above for listing and selecting dimensions and metrics are very useful, but in rare cases they can generate “false positives”, indicating that a given combination of metrics and dimensions can be queried together when in fact it cannot. This commonly happens with ultra-granular dimensions such as page path, which may contain sensitive data, in which case GA returns some metrics as 0 because of its internal data protection rules. As always, the best approach is to run the pipeline in Kondado first and inspect the data obtained.

The remaining pipeline parameters, Attribution Window and Savepoint, are concepts used in several other Kondado pipelines.

The Savepoint defines the initial reading date. After the first read, the savepoint is updated to today's date. This way, past data (from the savepoint to today) remains in your destination, and the next execution of your pipeline fetches only new and updated data, avoiding re-reading.

The Attribution Window is a parameter that controls how historical data is updated. Depending on your metric's attribution model, data from a few days ago may change. For example, if a user saw a given page one week ago and only completes an action (a goal) today, the goal completion can be assigned to the date one week ago, changing the past. The Attribution Window parameter handles this by always starting the read a few days before the current savepoint. So, on the first run of a pipeline, data is read from the savepoint until today's date, and this data stays in your destination. In the next execution, data is read from where the previous execution stopped minus the number of days defined by the attribution window.
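
The interaction between Savepoint and Attribution Window described above can be sketched in a few lines of Python. This is only an illustration of the date arithmetic under the assumptions stated in the text, not Kondado's actual implementation; the function name `read_range` and its parameters are hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple


def read_range(savepoint: date, attribution_window_days: int,
               today: date, first_run: bool) -> Tuple[date, date]:
    """Return the (start, end) dates a pipeline run would read."""
    if first_run:
        # First run: read from the configured savepoint.
        start = savepoint
    else:
        # Later runs: step back by the attribution window so that
        # recently changed historical data is read again.
        start = savepoint - timedelta(days=attribution_window_days)
    return start, today


# First run on 2023-03-10 with a savepoint of 2023-01-01.
first = read_range(date(2023, 1, 1), 7, date(2023, 3, 10), first_run=True)

# Next run one day later: the savepoint is now 2023-03-10,
# and a 7-day attribution window moves the start back to 2023-03-03.
second = read_range(date(2023, 3, 10), 7, date(2023, 3, 11), first_run=False)
```

A larger attribution window re-reads more days on every run, which captures later changes to past data at the cost of fetching more rows each time.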

Once all your parameters are configured, just click on NEXT to continue creating your pipeline.

Metrics that count distinct values (for example, “distinct users”) behave differently depending on whether you use a “day” dimension or a “month” dimension. For example, suppose the same user visits your page every day for an entire month. In a pipeline that reports distinct users per day, you will get a value of 1 per row. Summing these values to get the number of unique users in that month would give 30, which is wrong, because it was the same single user who accessed your page every day. In this case, the right approach is to use the unique users metric with the month dimension. For the values obtained to be close to those shown in GA, always build the report with the same dimensions as the GA report you are comparing against.

For GA accounts with a large volume of data, there may be a difference between the values obtained by the pipeline and those displayed in GA. This is caused by the sampling that GA applies when sending the data. The Kondado pipeline already tries to avoid the sampling effect as much as possible by using the highest precision available, but sampling can still occur because of the way GA makes data available, and nothing can be done about it.
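
The example above can be reproduced with a small Python sketch. The data is made up for illustration (one hypothetical user, `user_a`, visiting every day of a 30-day month), and the helper `distinct_users` is not part of any GA or Kondado API.

```python
from datetime import date, timedelta

# Hypothetical raw visits: the same user on each of the 30 days of April 2023.
visits = [(date(2023, 4, 1) + timedelta(days=i), "user_a") for i in range(30)]


def distinct_users(visits, key):
    """Count distinct users per group, where key maps a date to its group."""
    groups = {}
    for day, user in visits:
        groups.setdefault(key(day), set()).add(user)
    return {group: len(users) for group, users in groups.items()}


# Day dimension: one row per day, each with 1 distinct user.
daily = distinct_users(visits, key=lambda d: d)
summed_daily = sum(daily.values())

# Month dimension: one row for the month, counting the user once.
monthly = distinct_users(visits, key=lambda d: (d.year, d.month))

print(summed_daily)        # 30, which overstates the unique users
print(monthly[(2023, 4)])  # 1, the correct number of unique users
```

The daily rows are each correct on their own; the error only appears when distinct counts are added across rows.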

Multi-Channel Funnel (MCF)

Field	Type
dimension_x	text
dimension_y	text
dimension_z	text
metric_x	float
metric_y	float
metric_z	float