Google Cloud Storage

Google Cloud Storage is a service for storing objects and files (e.g., CSV and JSON) on Google Cloud.

By adding the Google Cloud Storage data source in Kondado, you will be able to create ETLs from your files directly to your Data Warehouse or Data Lake with just a few clicks.

Adding the data source

To add the Google Cloud Storage connection, follow the steps below:

1) Log in to your Google Cloud account

2) In the Google Cloud console, navigate to the Service Accounts section (under IAM & Admin);

3) Once in the Service Accounts section, click on “CREATE SERVICE ACCOUNT”

4) In the first step, fill in a name for your service account (e.g., "kondado gcs") and click on "CREATE"

5) In the second step of the creation process, select the Role “Storage Object Admin” and click CONTINUE

6) Now just click on “DONE” to finish the creation

7) Once created, you will be directed to a list of all active service accounts. Locate the one you just created, click the three vertical dots on the right, and select "Create key"

8) In the dialog, select the type “JSON” and then click on “CREATE”

9) After clicking create, the key will be downloaded to your computer. Open the downloaded file in a text editor; it will look something like this:
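The downloaded key follows Google's standard service account JSON layout; all values below are placeholders:

```json
{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "abc123...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "kondado-gcs@your-project-id.iam.gserviceaccount.com",
  "client_id": "1234567890",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
}
```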

10) Log in to the Kondado platform, go to the add data sources page, and select the Google Cloud Storage data source;

11) On the add data source page, do the following:

In "Bucket", fill in the name of your bucket

In "JSON Credential", paste the full contents of the key file downloaded in step (9)

12) Now just click on "SAVE" and you will be ready to load your files from Google Cloud Storage into your Data Warehouse or Data Lake

Pipelines

Relationship Chart

CSV

You can indicate the exact name of a file, or just a file-name prefix, and all matching files will be integrated.

Once executed, the pipeline will save the highest modification date of the files it read and, on the next run, will only look for files with a later modification date.
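The prefix-plus-watermark selection described above can be sketched as follows. This is a minimal illustration of the idea, not Kondado's actual implementation; the `select_files` helper and the object layout are assumptions for the example.

```python
from datetime import datetime, timezone

def select_files(objects, prefix, last_watermark):
    """Pick objects whose name matches the prefix and whose
    modification time is later than the saved watermark."""
    matched = [
        o for o in objects
        if o["name"].startswith(prefix) and o["updated"] > last_watermark
    ]
    # The new watermark is the highest modification time among the files read.
    new_watermark = max((o["updated"] for o in matched), default=last_watermark)
    return matched, new_watermark

# Example: only sales_2023.csv matches the prefix AND is newer than the watermark.
objects = [
    {"name": "sales_2022.csv", "updated": datetime(2023, 1, 1, tzinfo=timezone.utc)},
    {"name": "sales_2023.csv", "updated": datetime(2023, 7, 1, tzinfo=timezone.utc)},
    {"name": "other.json",     "updated": datetime(2023, 8, 1, tzinfo=timezone.utc)},
]
files, wm = select_files(objects, "sales_", datetime(2023, 2, 1, tzinfo=timezone.utc))
```

On a subsequent run, `wm` would be passed back in as `last_watermark`, so unchanged files are never reprocessed.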

In order to absorb files with different columns, the data is pivoted on the target and follows this pattern:

Field                 Type
row_number            int
column_number         int
first_column_value    text
value                 text
__file_basename       text
__file_path           text
__file_name           text
__kdd_insert_time     timestamp
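To illustrate the pivoted layout, a small CSV can be unpivoted into one record per cell. This is a sketch of the general idea using only the fields shown above, not Kondado's actual code; the `pivot_csv` helper is hypothetical.

```python
import csv
import io

def pivot_csv(text, file_name):
    """Unpivot a CSV into one record per cell, following the
    row_number / column_number / value pattern described above."""
    rows = list(csv.reader(io.StringIO(text)))
    out = []
    for r, row in enumerate(rows):
        first = row[0] if row else ""
        for c, cell in enumerate(row):
            out.append({
                "row_number": r,
                "column_number": c,
                "first_column_value": first,  # value of the row's first column
                "value": cell,
                "__file_name": file_name,
            })
    return out

# A 2x2 CSV produces 4 records, one per cell.
records = pivot_csv("id,name\n1,alice\n", "sales_2023.csv")
```

Because every cell becomes a (row, column, value) triple, files with different column sets can land in the same target table without schema changes.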

Connect Google Cloud Storage to Kondado

Set up a service account in Google Cloud and configure the JSON credentials in Kondado to start extracting files to your data warehouse.

1) Create a Google Cloud service account

Log in to your Google Cloud account, navigate to the Service Accounts section, and click CREATE SERVICE ACCOUNT. Name it (e.g., "kondado gcs") and assign the "Storage Object Admin" role.

2) Generate and download the JSON key

Locate your new service account in the list, open the three-dot menu, and select "Create key." Choose JSON format, then download and open the file in a text editor to copy its contents.

3) Add the source in Kondado

Log in to the Kondado platform, go to the add data sources page, and select Google Cloud Storage. Enter your bucket name and paste the JSON credential values into the corresponding field.

4) Save and start building pipelines

Click "SAVE" to finalize the connection. You can now create ETL pipelines that extract CSV or JSON files from your bucket directly to your Data Warehouse or Data Lake, with automatic incremental loading based on file modification dates.

Frequently asked questions

What file formats does the Google Cloud Storage connector support?
The connector supports CSV and JSON files stored in your Google Cloud Storage bucket. You can specify an exact file name or a prefix to integrate multiple matching files in a single pipeline.
How does incremental loading work for files in Google Cloud Storage?
After each pipeline execution, Kondado records the latest modification date of the files it processed. On the next run, it only reads files with a more recent change date, enabling efficient incremental data ingestion without reprocessing unchanged files.
What happens when CSV files have different column structures?
To handle varying schemas, Kondado pivots the data into a standardized format with fields like row_number, column_number, first_column_value, and value, plus metadata columns such as __file_basename and __kdd_insert_time.
What role should I assign to the Google Cloud service account?
You should assign the Storage Object Admin role to your service account. This provides the necessary permissions for Kondado to read objects from your bucket while following security best practices.
Where can I find pre-built reports for Google Cloud Storage data?
You can explore pre-built reports for Google Cloud Storage to quickly visualize your extracted data, or connect to BI tools and dashboards for custom analysis.

Published 2023-07-05 · Updated 2026-04-25