A marketing data warehouse is a central database that consolidates ad, CRM, and analytics data from every platform an agency runs, so reports are built once and refreshed automatically. Instead of pulling Meta Ads, Google Ads, TikTok Ads, and CRM data into separate spreadsheets every month, the warehouse becomes the single source of truth that feeds Looker Studio, Power BI, and any BI report the team needs.
This guide walks through the reference architecture agencies and in-house marketing teams use in 2026, the trade-offs between building it yourself and using a no-code data platform, and how to size the stack so it fits both a single client and a portfolio of fifty.
Key Takeaways:
- A marketing data warehouse replaces manual exports with a central database that auto-refreshes on the schedule you set, so reports stay current without ops work.
- The modern reference stack is Ad and CRM data sources → Kondado → BigQuery (or PostgreSQL / Redshift) → Looker Studio or Power BI. The choice of warehouse depends on volume, team skills, and cost ceilings.
- Building the pipelines yourself with the platform APIs is rarely worth it once an agency manages more than three or four clients. The maintenance cost of broken tokens, schema changes, and rate limits exceeds the licence cost of a managed platform within months.
- The Kondado platform replicates from 80+ data sources into BigQuery, PostgreSQL, MySQL, SQL Server, Redshift, Amazon S3, Google Sheets, and Excel, with bilingual support in English and Portuguese, USD or BRL billing, and a 14-day trial.
What is a marketing data warehouse, and why do agencies need one in 2026?
A marketing data warehouse is a database designed to store marketing performance data from many sources in one place, with a schema optimised for analytical queries. It sits between the source platforms (Meta Ads, Google Ads, TikTok Ads, LinkedIn Ads, HubSpot, RD Station, Salesforce, Google Analytics 4) and the reporting layer (Looker Studio, Power BI, Tableau).
The reason it has become standard in 2026 is the rate at which platform APIs change. Meta deprecated `post_impressions` in November 2025. LinkedIn still does not expose a total-followers metric. TikTok rotates OAuth tokens on a 60-day cycle. An agency that built a fresh Looker Studio data source for every client three years ago now spends days each month fixing broken refreshes. A warehouse decouples the report from the API: the pipeline lands the data once a day, the report reads from the warehouse, and an API change only breaks one pipeline rather than every client report.
The second driver is multi-client scale. A marketing agency reporting on five Meta Ads accounts, five Google Ads accounts, and three TikTok accounts is operating thirteen separate data feeds. The warehouse stores all of them in unified tables (`ads_spend`, `ads_clicks`, `accounts`), with an `account_id` column. One report, ten clients, no copy-paste.
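As a rough illustration of the unified-table idea, the sketch below uses an in-memory SQLite database as a stand-in for the warehouse. The `ads_spend` and `accounts` table names and the `account_id` column come from the text; the `client` and `platform` columns, the sample rows, and the `spend_by_client` helper are assumptions made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse. Table names follow the text;
# the extra columns and sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (account_id TEXT PRIMARY KEY, client TEXT, platform TEXT);
CREATE TABLE ads_spend (account_id TEXT, date TEXT, spend REAL);
INSERT INTO accounts VALUES
  ('meta_1', 'Client A', 'meta_ads'),
  ('gads_1', 'Client A', 'google_ads'),
  ('meta_2', 'Client B', 'meta_ads');
INSERT INTO ads_spend VALUES
  ('meta_1', '2026-01-05', 120.0),
  ('gads_1', '2026-01-05', 80.0),
  ('meta_2', '2026-01-05', 45.5);
""")


def spend_by_client(conn):
    """Return total spend per client across every account and platform."""
    rows = conn.execute("""
        SELECT a.client, SUM(s.spend) AS total_spend
        FROM ads_spend AS s
        JOIN accounts AS a USING (account_id)
        GROUP BY a.client
        ORDER BY a.client
    """).fetchall()
    return dict(rows)


report = spend_by_client(conn)
print(report)
```

Because every feed shares the same `account_id` key, adding a client means adding rows rather than building a new report.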
The reference architecture: data sources to BigQuery to Looker Studio
The reference architecture has four layers. From left to right:
- Data sources - the ad and CRM platforms where the data lives. For a marketing agency this is typically Meta Ads, Google Ads, TikTok Ads, LinkedIn Ads, X Ads, Pinterest Ads, GA4, and one or more CRM and marketing-automation tools (HubSpot, RD Station, Pipedrive, Salesforce, ActiveCampaign).
- Extract and load layer - the data platform that connects to each source's API, handles authentication, paginates the responses, and writes the rows to the warehouse on a schedule. This is where Kondado sits.
- Warehouse - the database. For a marketing data warehouse the common choices are BigQuery (Google Cloud), PostgreSQL (including PostgreSQL-compatible services like Supabase and Neon), or Redshift (AWS).
- Reporting layer - the BI tool the team or the client reads. Looker Studio is the most common for marketing teams because it is free and the Google ecosystem maps directly to BigQuery. Power BI is the choice for teams already on Microsoft. Tableau, Metabase, Qlik Sense and others fit specific company stacks.
A typical day-in-the-life flow looks like this. Kondado's Meta Ads data source pulls campaign, ad-set, and ad-level spend from every connected account at 6 a.m. and writes it to a BigQuery dataset called `marketing_warehouse`. Google Ads, TikTok Ads, and LinkedIn Ads pipelines do the same into the same dataset, each with its own schema. The team's master Looker Studio report connects directly to BigQuery, joins by `account_id` and `date`, and serves the unified view to every client. When a client logs in at 9 a.m., yesterday's numbers are already there.
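The join by `account_id` and `date` could look roughly like the sketch below, again with SQLite standing in for BigQuery. The per-platform table names, their column names, and the idea that both platforms share one account key are assumptions for illustration; real landed schemas will differ, and the micros-to-currency conversion only mirrors how Google Ads reports cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each pipeline lands its own table with its own column names (hypothetical).
CREATE TABLE meta_ads_insights (account_id TEXT, date_start TEXT, spend REAL);
CREATE TABLE google_ads_campaigns (customer_id TEXT, day TEXT, cost_micros INTEGER);
INSERT INTO meta_ads_insights VALUES ('acct_1', '2026-01-05', 100.0);
INSERT INTO google_ads_campaigns VALUES ('acct_1', '2026-01-05', 50000000);
INSERT INTO google_ads_campaigns VALUES ('acct_2', '2026-01-05', 25000000);

-- One view maps every platform onto (platform, account_id, date, spend).
CREATE VIEW unified_spend AS
SELECT 'meta_ads' AS platform, account_id, date_start AS date, spend
FROM meta_ads_insights
UNION ALL
SELECT 'google_ads', customer_id, day, cost_micros / 1000000.0
FROM google_ads_campaigns;
""")

# The report query only ever sees the unified shape.
daily = conn.execute("""
    SELECT account_id, date, SUM(spend)
    FROM unified_spend
    GROUP BY account_id, date
    ORDER BY account_id, date
""").fetchall()
print(daily)
```

Keeping the mapping in a view means a renamed source column is fixed in one place, and the report query stays untouched.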
How does Kondado fit into the marketing data warehouse architecture?
Kondado is the extract-and-load layer that replicates data from 80+ data sources into your warehouse, on a refresh schedule you choose. The platform handles API authentication, pagination, schema mapping, and incremental replication. The team focuses on the report, not on broken tokens.
Kondado's role on each piece of the stack:
- Sources covered: ad platforms (Meta Ads, Google Ads, TikTok Ads, LinkedIn Ads, X Ads, Pinterest Ads, Microsoft Ads, Bing, Criteo, Taboola), CRMs (HubSpot, RD Station, Pipedrive, Salesforce, ActiveCampaign, Bitrix24), marketing automation (Mailchimp, ActiveCampaign, RD Station, E-goi, Mautic), GA4, Mixpanel, social media (Instagram, Facebook, LinkedIn, TikTok organic, YouTube, X), and e-commerce platforms when the agency also reports on transactions (Shopify, VTEX, Nuvemshop, Mercado Libre).
- Supported warehouses and destinations: BigQuery, PostgreSQL, MySQL, SQL Server, Redshift, Amazon S3, Google Sheets, and Excel.
- Compatible reporting layer: Looker Studio, Power BI, Tableau, Metabase, Qlik Sense, Looker, Superset, and many others - either by reading from the warehouse, or by direct connection via Kondado without an intermediate warehouse.
For agencies that do not need full warehouse storage on day one, Kondado also writes directly to Google Sheets or Excel, which then feed Looker Studio or Power BI. This is a common starter pattern that later migrates to BigQuery once volume grows.
Should you build a marketing data warehouse yourself or use a no-code platform?
The honest answer is that an agency or marketing team with more than three reporting clients almost always lands on a managed platform after burning two to four engineering months on the DIY path. The maths is below.
The DIY route:
- Hire or assign a data engineer.
- Build a Python pipeline per platform: OAuth flow, rate-limit handling, pagination, retry logic, schema mapping. Budget 1 to 2 weeks per platform for a working v1, plus ongoing maintenance.
- Host the pipelines (Cloud Run, Airflow, dbt, or a managed orchestrator).
- Monitor for breakage: API deprecations, token expiry, rate-limit changes, schema drift.
- Re-do the work for every new platform a client requests.
Annualised cost for an agency running ten ad platforms with one mid-level data engineer: roughly US$ 80,000 to US$ 150,000 in salary alone, plus cloud infrastructure, plus the opportunity cost of the engineer not building anything that differentiates the agency.
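To give a sense of what "pagination" and "retry logic" in the DIY list involve, here is a minimal sketch. The `make_fake_api` function is a stand-in invented for this example, not any platform's real API, and the cursor format, error class, and backoff settings are all assumptions. A production pipeline would also need the OAuth flow and schema mapping, which are omitted here.

```python
import time


class RateLimitError(Exception):
    """Raised by the stand-in API when the caller should back off."""


def make_fake_api(pages, fail_first_n=1):
    """Build a stand-in for a paginated ad-platform endpoint (not a real API)."""
    state = {"failures_left": fail_first_n}

    def fetch_page(cursor):
        if state["failures_left"] > 0:
            state["failures_left"] -= 1
            raise RateLimitError("too many requests")
        index = cursor or 0
        next_cursor = index + 1 if index + 1 < len(pages) else None
        return {"rows": pages[index], "next": next_cursor}

    return fetch_page


def fetch_all(fetch_page, max_retries=3, base_delay=0.01, sleep=time.sleep):
    """Follow cursors until exhausted, retrying rate-limited calls with backoff."""
    rows, cursor = [], None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(cursor)
                break
            except RateLimitError:
                if attempt == max_retries:
                    raise
                sleep(base_delay * 2 ** attempt)  # exponential backoff
        rows.extend(page["rows"])
        cursor = page["next"]
        if cursor is None:
            return rows


api = make_fake_api([[{"ad_id": 1}], [{"ad_id": 2}, {"ad_id": 3}]])
all_rows = fetch_all(api)
print(len(all_rows))
```

Even this toy version has three moving parts per platform (cursor handling, error classification, backoff), which is where the ongoing maintenance effort tends to accumulate.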
The Kondado route:
- Sign in, pick the data source, paste the credentials.
- Pick the destination (warehouse) and the refresh frequency.
- Watch the data land. Repeat for the next platform.
- When a data source breaks because the platform API changed, Kondado's engineering team fixes it.
Platform cost is in the hundreds of US dollars per month, so the annualised figure is in the thousands, not the tens of thousands.
The DIY route wins in one scenario: when the team needs a transformation that no managed platform supports out of the box and the engineering team already exists. For 95 percent of agencies and in-house marketing teams, the managed platform wins on time-to-value and on long-term maintenance cost.
Which warehouse should you choose: BigQuery, PostgreSQL, or Redshift?
The Kondado platform writes to BigQuery, PostgreSQL, MySQL, SQL Server, Redshift, Amazon S3, Google Sheets, and Excel. For a marketing data warehouse the three serious warehouse candidates are BigQuery, PostgreSQL, and Redshift. Pick by cost model and ecosystem fit.
BigQuery (Google Cloud) is the default choice for marketing teams because:
- Free tier covers 1 TB of queries and 10 GB of storage per month, which holds most agency-scale workloads.
- Pricing is on-demand: pay per byte scanned, no idle cluster cost.
- Native compatibility with Looker Studio means a report connects to a BigQuery dataset in two clicks.
- Nullable columns and nested or repeated fields are forgiving for marketing data, which has many sparse columns.
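The pay-per-byte model can be estimated with simple arithmetic, sketched below. The price per TiB is passed in as a parameter because the figure used in the example (6.25) is only an assumption; check Google's current price list before relying on it. The workload numbers are also invented.

```python
TIB = 1024 ** 4


def on_demand_query_cost(bytes_scanned, price_per_tib_usd, free_tib=1.0):
    """Estimate a month's on-demand query bill after the free allowance."""
    billable_tib = max(0.0, bytes_scanned / TIB - free_tib)
    return round(billable_tib * price_per_tib_usd, 2)


# Assumed workload: reports scanning 2 GiB per refresh, 40 refreshes a day.
monthly_bytes = 2 * 1024 ** 3 * 40 * 30
print(on_demand_query_cost(monthly_bytes, price_per_tib_usd=6.25))
```

The main lever is bytes scanned per query, so selecting only the needed columns and filtering on date usually matters more than the number of dashboards.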
PostgreSQL (including Supabase, Neon, CockroachDB, Amazon Aurora PostgreSQL, Google Cloud SQL PostgreSQL) fits when:
- The team already runs an application database on PostgreSQL and prefers to consolidate.
- Volumes are under 100 GB, so a transactional database handles the analytics workload fine.
- The team needs SQL features that BigQuery dialects implement differently (row-level security, triggers, complex stored procedures).
Redshift (AWS) fits when:
- The team is already on AWS and prefers to keep billing consolidated.
- Workloads benefit from predictable capacity rather than pay-per-byte (large recurring queries).
- The agency has Snowflake-style internal expectations and the team is AWS-native.
A common starter path: begin on Google Sheets to validate the schema and the report layout, migrate to PostgreSQL when row counts pass a few million, and migrate again to BigQuery when monthly data volumes pass tens of gigabytes. Kondado replicates to all three so the team changes destination without re-engineering the upstream extracts.
How much does a marketing data warehouse cost to run?
The cost has three line items: the data platform, the warehouse, and the BI tool. A realistic monthly bill for a marketing agency reporting on ten ad accounts (Meta + Google + TikTok), three CRM sources, and GA4 looks like this:
- Data platform (Kondado): a 14-day free trial includes 30 pipelines and 10 million records, no credit card. Paid plans start at US$ 19 per month internationally or R$ 99 per month for clients invoicing in BRL (with NF, Pix, and boleto available). Pipeline limits and row limits scale with the plan.
- Warehouse (BigQuery): Google's free tier of 10 GB storage and 1 TB of queries covers most agency portfolios under twenty clients. Paid spend begins around US$ 5 to US$ 50 per month for medium workloads.
- BI tool (Looker Studio): free for the core product. Looker Studio Pro is optional at US$ 9 per user per month for advanced features.
Sizing notes that drive the bill:
- Refresh frequency: replicating every hour consumes more rows than replicating once a day. Most marketing reports do not need hourly cadence. Daily is the sensible default; the platform supports the schedule you choose.
- Date range per refresh: pulling "last 30 days" on every run is normal for ad data because retroactive attribution updates land for up to 28 days. Pulling "last 12 months" daily multiplies row count by twelve and almost never adds insight.
- Per-platform row counts: a single ad account generates roughly 10,000 to 100,000 ad-level rows per month, depending on the number of active ads. Ten accounts across three platforms land well inside the 10 million record trial limit.
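The sizing notes above reduce to a few multiplications. The sketch below uses the row ranges and the trial limit quoted in the text; the helper names are made up, rows are assumed to arrive evenly across the month, and how records are counted against a plan limit is not specified here, so treat the result as an order-of-magnitude check only.

```python
TRIAL_RECORD_LIMIT = 10_000_000  # trial allowance quoted in the text


def new_rows_per_month(accounts, rows_per_account_month):
    """New ad-level rows generated per month across all accounts."""
    return accounts * rows_per_account_month


def lookback_multiplier(lookback_days, baseline_days=30):
    """How many times more rows a run re-reads versus a 30-day window."""
    return lookback_days / baseline_days


low = new_rows_per_month(10, 10_000)
high = new_rows_per_month(10, 100_000)
print(low, high, high <= TRIAL_RECORD_LIMIT)
print(round(lookback_multiplier(365), 1))
```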
Marketing data warehouse vs data lake: which one do agencies use?
A warehouse stores structured tables ready for SQL queries and BI tools. A data lake stores raw files (JSON, Parquet, CSV) in object storage and defers structuring until query time. For a marketing reporting workload that powers Looker Studio or Power BI, the warehouse is the right answer in nearly every case: structured tables map directly to a BI tool's data source, while a lake requires an extra query engine (Athena, Trino, Spark) to read.
Kondado supports both patterns. The warehouse pattern writes to BigQuery, PostgreSQL, MySQL, SQL Server, or Redshift. The lake pattern writes to Amazon S3 as raw files. Most marketing teams choose the warehouse; engineering-led teams that already run a lake architecture for product analytics sometimes choose the lake.
Data residency, invoicing, and compliance considerations
A marketing data warehouse stores performance data (ad spend, click-through rates) alongside customer-level data: lead and customer identifiers, sometimes email addresses. Two questions come up before any agency hands warehouse credentials to a client.
Where does the data sit? Each warehouse names its region at creation. BigQuery offers `southamerica-east1` (São Paulo), `us-central1`, `europe-west1`, and many others. PostgreSQL on Supabase or Neon offers a comparable choice of region when the project is created, although the list of available regions differs by provider. The agency picks the region. Kondado replicates to whichever region the warehouse owner configured.
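For BigQuery, the region can be pinned when the dataset is created, for example with a `CREATE SCHEMA` statement. The helper below only builds that statement as a string. The project name `my-agency-project` is hypothetical, and the dataset name simply reuses `marketing_warehouse` from earlier in this guide.

```python
import re


def create_dataset_ddl(project, dataset, location):
    """Build a BigQuery CREATE SCHEMA statement that pins the dataset region."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", dataset):
        raise ValueError(f"invalid dataset name: {dataset!r}")
    return (
        f"CREATE SCHEMA IF NOT EXISTS `{project}.{dataset}`\n"
        f"OPTIONS (location = '{location}');"
    )


ddl = create_dataset_ddl(
    "my-agency-project", "marketing_warehouse", "southamerica-east1"
)
print(ddl)
```

A dataset's location cannot be changed after creation, so it is worth settling the residency question before the first pipeline runs.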
How is the data platform billed and invoiced? Kondado offers USD or BRL billing, with NF (Nota Fiscal), Pix, and boleto available for clients invoicing in BRL. This matters for agencies whose own clients require local invoicing for tax reasons. The 14-day trial requires no credit card.
Frequently Asked Questions
What is a marketing data warehouse?
A marketing data warehouse is a database that consolidates marketing data (ad spend, clicks, conversions, CRM events, GA4 sessions) from multiple platforms into one schema, refreshed on a schedule, so that BI tools like Looker Studio and Power BI can build reports without manual exports. It is the modern alternative to maintaining a separate report per platform.
What is the best ETL tool for a marketing data warehouse?
For a marketing team, the best ETL platform is the one with the widest set of ad and CRM data sources, a no-code interface, and direct compatibility with BigQuery, PostgreSQL, and Redshift. Kondado replicates from 80+ data sources, including Meta Ads, Google Ads, TikTok Ads, LinkedIn Ads, GA4, HubSpot, RD Station, Salesforce, Pipedrive, and ActiveCampaign, into all the common warehouses. The platform offers a 14-day free trial with 10 million records included, no credit card.
How do I send Meta Ads data to BigQuery?
The Kondado platform connects to the Meta Ads account via the official Marketing API, pulls campaign, ad-set, and ad-level metrics, and writes them to a BigQuery dataset the team owns. Setup steps: sign up for Kondado, authorise the Meta Ads account, point the destination to your BigQuery project, choose the refresh schedule. The first replication lands the historical window the platform supports (typically 28 to 90 days depending on the platform's API limits) and incremental refreshes append from there.
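One common way to keep recent days accurate while still loading incrementally is to replace a trailing window on every run, so restated numbers overwrite the old ones and re-runs create no duplicates. The sketch below shows that general pattern with SQLite as a stand-in warehouse. It is an assumption for illustration, not a description of how Kondado implements its refreshes, and the `reload_window` helper and sample rows are invented.

```python
import sqlite3


def reload_window(conn, account_id, window_start, fresh_rows):
    """Replace an account's rows from window_start onward with fresh data."""
    with conn:  # one transaction: delete and insert commit together
        conn.execute(
            "DELETE FROM ads_spend WHERE account_id = ? AND date >= ?",
            (account_id, window_start),
        )
        conn.executemany(
            "INSERT INTO ads_spend (account_id, date, spend) VALUES (?, ?, ?)",
            [(account_id, d, s) for d, s in fresh_rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ads_spend (account_id TEXT, date TEXT, spend REAL)")
conn.executemany(
    "INSERT INTO ads_spend VALUES (?, ?, ?)",
    [("acct_1", "2026-01-01", 10.0), ("acct_1", "2026-01-02", 20.0)],
)

# The platform later restates 2 January and reports a new day.
fresh = [("2026-01-02", 22.5), ("2026-01-03", 30.0)]
reload_window(conn, "acct_1", "2026-01-02", fresh)
reload_window(conn, "acct_1", "2026-01-02", fresh)  # re-running adds no duplicates

rows = conn.execute("SELECT date, spend FROM ads_spend ORDER BY date").fetchall()
print(rows)
```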
How do I send Google Ads data to BigQuery?
The same pattern applies. Kondado's Google Ads data source authenticates to the Google Ads API, pulls keyword, ad group, campaign, and account-level metrics, and writes to BigQuery on the schedule you choose. Google offers its own BigQuery Data Transfer Service for Google Ads, but it covers only the Google product family. The Kondado platform consolidates Google Ads with Meta, TikTok, LinkedIn, X, and Pinterest in the same warehouse so a single Looker Studio report spans them all.
Should an agency build the warehouse pipelines in-house?
For an agency reporting on more than three ad platforms or more than five clients, a managed platform pays back the licence cost within months. The hidden cost of DIY is not the initial build but the long tail of API breakage: tokens expiring, rate limits changing, schema fields being deprecated. Kondado's engineering team absorbs that maintenance load so the agency's engineers, if it has any, can work on differentiation.
What is the difference between a marketing data warehouse and a data lake?
A warehouse stores structured tables that BI tools read directly. A data lake stores raw files in object storage (S3) and needs a query engine to read. For a marketing reporting workload that powers Looker Studio or Power BI, the warehouse pattern is faster to value because the BI tool connects to the dataset in two clicks. The lake pattern fits engineering-led teams already operating a lake for product analytics.
Does Kondado support Snowflake as a destination?
Kondado's destinations cover Google Sheets, Excel, BigQuery, PostgreSQL, MySQL, SQL Server, Redshift, and Amazon S3. A team already invested in Snowflake either reads Kondado output from one of the supported warehouses (PostgreSQL and Redshift are common bridges) or syncs through a separate process. The Kondado platform fits the modern marketing stack on BigQuery, PostgreSQL, or Redshift natively.
How much does a marketing data warehouse cost per month?
A realistic agency-scale monthly bill is the Kondado plan (starting at US$ 19 per month internationally or R$ 99 per month for clients invoicing in BRL), plus a few US dollars to a few tens of US dollars for BigQuery storage and queries (most agencies stay inside the free tier), plus free Looker Studio. The dominant variable is the Kondado plan tier, which scales with pipelines and row counts.
Build your marketing data warehouse with Kondado
The Kondado platform replicates from 80+ data sources into BigQuery, PostgreSQL, MySQL, SQL Server, Redshift, Amazon S3, Google Sheets, and Excel, with bilingual support in English and Portuguese, USD or BRL billing, and a 14-day trial that includes 30 pipelines and 10 million records, no credit card required. Start your free trial and have the first Meta Ads or Google Ads pipeline landing in BigQuery within the hour.
