Google Cloud Storage is a service for storing objects and files (e.g., CSV and JSON) on Google Cloud.
By adding the Google Cloud Storage data source in Kondado, you will be able to create ETLs from your files directly to your Data Warehouse or Data Lake with just a few clicks.
Adding the data source
To add the Google Cloud Storage connection, follow the steps below:
1) Log in to your Google Cloud account
2) In the console's navigation menu, go to "IAM & Admin" > "Service Accounts"
3) Once in the Service Accounts section, click on “CREATE SERVICE ACCOUNT”
4) In the first step, fill in a name for your service account (e.g., "kondado gcs") and click on "CREATE"
5) In the second step of the creation process, select the Role “Storage Object Admin” and click CONTINUE
6) Now just click on “DONE” to finish the creation
7) Once created, you will be directed to a list of all active service accounts. Locate the one you just created, click the three vertical dots on the right, and select "Create key"
8) In the dialog, select the type “JSON” and then click on “CREATE”
9) After clicking create, the key will be downloaded to your computer. Open the downloaded file in a text editor; it will look something like this:
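A typical service account key file has the shape below (all values here are placeholders, not real credentials):

```json
{
  "type": "service_account",
  "project_id": "my-project",
  "private_key_id": "0123456789abcdef",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "kondado-gcs@my-project.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
}
```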
10) In Kondado, go to add a new data source and select "Google Cloud Storage"
11) On the add data source page, do the following:
In “Bucket” fill in the name of your bucket
In "JSON Credential", paste the full contents of the key file downloaded in step (9)
12) Now just click on "SAVE" and you will be ready to integrate your files from Google Cloud Storage into your Data Warehouse or Data Lake
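If you prefer the command line, the console steps above can also be scripted with the gcloud CLI. This is a sketch assuming a configured gcloud installation; the account name "kondado-gcs" and project "MY_PROJECT" are example values:

```shell
# Create the service account (name is an example)
gcloud iam service-accounts create kondado-gcs --display-name="kondado gcs"

# Grant the Storage Object Admin role on your project
gcloud projects add-iam-policy-binding MY_PROJECT \
  --member="serviceAccount:kondado-gcs@MY_PROJECT.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Download a JSON key for the account
gcloud iam service-accounts keys create key.json \
  --iam-account="kondado-gcs@MY_PROJECT.iam.gserviceaccount.com"
```

The resulting key.json is the file you paste into the "JSON Credential" field.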
Pipelines
Relationship Chart
CSV
You can indicate the exact name of a file or just the beginning of the file name (a prefix), and all matching files will be integrated.
Once executed, the pipeline will save the latest modification date of the files it read and, on the next run, will only look for files with a later modification date.
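Conceptually, the prefix matching and modification-date watermark described above work like the sketch below. This is illustrative only; the function and field names are assumptions, not Kondado's actual code:

```python
# Each file is represented as (name, modification date); in a real run
# these would come from listing the bucket.
files = [
    ("sales_2024-01.csv", "2024-01-31"),
    ("sales_2024-02.csv", "2024-02-29"),
    ("sales_2024-03.csv", "2024-03-31"),
]

def select_files(files, prefix, last_seen):
    """Pick files whose name starts with `prefix` and whose
    modification date is later than the saved watermark."""
    picked = [(name, updated) for name, updated in files
              if name.startswith(prefix) and updated > last_seen]
    # The new watermark is the highest modification date read,
    # so the next run skips everything already processed.
    new_watermark = max((u for _, u in picked), default=last_seen)
    return [n for n, _ in picked], new_watermark

names, watermark = select_files(files, "sales_", "2024-01-31")
print(names)      # -> ['sales_2024-02.csv', 'sales_2024-03.csv']
print(watermark)  # -> '2024-03-31', saved for the next run
```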
To absorb files with different columns, the data is pivoted on the target and follows this pattern:
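As an illustration of what such a pivot can look like (the exact target layout Kondado produces may differ), rows with varying columns can be turned into a fixed key/value shape:

```python
# Rows read from two CSV files with different column sets
# (illustrative data; not Kondado's actual schema).
rows = [
    {"id": 1, "name": "ana", "city": "SP"},
    {"id": 2, "name": "bob", "age": 30},
]

def pivot(rows, key="id"):
    """Turn every non-key column into its own (key, column, value) record,
    so files with different columns fit one fixed target table."""
    out = []
    for row in rows:
        for col, val in row.items():
            if col != key:
                out.append({key: row[key], "column": col, "value": val})
    return out

for rec in pivot(rows):
    print(rec)
# e.g. {'id': 1, 'column': 'name', 'value': 'ana'}
```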