Amazon Redshift connector

Use Amazon Redshift as the analytics layer of an AWS-native stack.

Data Panda lands the data the rest of your business produces into Amazon Redshift and curates one model that the dashboards, automations, AI workflows and internal apps all build on. Reporting, scheduled jobs and ad-hoc SQL come off the same warehouse, not three copies of it.

About Amazon Redshift

AWS's data warehouse, sitting next to the rest of your AWS data.

Amazon Redshift launched in February 2013, built on top of MPP technology Amazon licensed from ParAccel and modelled on an early PostgreSQL fork. It runs in two shapes: Provisioned, where you size RA3 nodes (compute) against managed storage on S3, and Serverless, where you pay per Redshift Processing Unit (RPU) by the second and AWS scales the compute for you. Both share the same SQL surface and the same storage layer.

What makes Redshift the natural pick for AWS-first teams is what sits around it. Spectrum reads Parquet, ORC and JSON straight out of S3 without loading first. Zero-ETL integrations stream Aurora, RDS and DynamoDB changes into the warehouse without a separate pipeline. Federated Query joins Postgres or MySQL tables live. Redshift ML calls SageMaker from SQL. If your application data, your event logs and your data science already live in AWS, the warehouse stops being a separate destination and starts being the read model on top of what is already there.

What your Amazon Redshift data is for

What you get once Amazon Redshift is connected.

One model, every dashboard

Curated Redshift schemas feed every BI tool from one definition.

  • QuickSight, Tableau and Power BI read the same fact tables
  • Revenue, margin and pipeline have one calculation, not three
  • Spectrum joins S3 history into the same query without a copy job

Scheduled jobs on warehouse data

Operational triggers and exports run off the same Redshift model the dashboards use.

  • Daily exports to ERP and CRM read reconciled numbers
  • Reverse ETL to Salesforce or HubSpot from the curated layer
  • Alerting on thresholds that come from the same SQL the board sees

ML and Q on your own data

Redshift ML and Amazon Q work against the curated tables, not raw dumps.

  • Redshift ML trains SageMaker models from SQL on clean fact tables
  • Amazon Q in Query Editor writes SQL against the schema you trust
  • Bedrock calls enrich text columns without an extra ETL pipeline

Custom apps on the warehouse

Internal tools read from Redshift via the Data API and write back to the same model.

  • Operational portals on warehouse-grade reconciled data
  • Embedded analytics next to the transactional UIs
  • Data API endpoints replace ad-hoc connection strings
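The Data API pattern behind these apps can be sketched in a few lines. This is a minimal sketch, not Data Panda's implementation: it assumes a client object with the boto3 `redshift-data` surface (`execute_statement`, `describe_statement`, `get_statement_result`), injected as a parameter so internal tools share one entry point instead of ad-hoc connection strings; the name `run_query` is ours.

```python
import time

def run_query(client, sql, database, workgroup, poll_seconds=1.0):
    """Submit SQL via the Redshift Data API surface, wait for completion,
    and return rows as plain lists of values.

    In production the client would come from boto3.client("redshift-data");
    injecting it keeps this sketch testable without AWS credentials."""
    stmt = client.execute_statement(
        Database=database, WorkgroupName=workgroup, Sql=sql
    )
    statement_id = stmt["Id"]
    while True:
        desc = client.describe_statement(Id=statement_id)
        if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(poll_seconds)
    if desc["Status"] != "FINISHED":
        raise RuntimeError(desc.get("Error", desc["Status"]))
    result = client.get_statement_result(Id=statement_id)
    # Each field arrives as a one-key dict like {"stringValue": ...}
    # or {"longValue": ...}; unwrap it to the bare value.
    return [
        [next(iter(field.values())) for field in record]
        for record in result["Records"]
    ]
```

Because the statement runs asynchronously over HTTPS, the portal never holds a JDBC connection open — the polling loop is the whole session.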

Use cases

Use cases we deliver with Amazon Redshift data.

A list of concrete reports, automations and AI features we have built on Amazon Redshift data. Pick the one that matches your situation.

  • AWS-native warehouse: Redshift as the read layer for an S3, Aurora and RDS stack.
  • S3 lake plus Spectrum: Query historical Parquet in S3 from the same SQL as your hot tables.
  • Zero-ETL from Aurora: Application data lands in Redshift without a separate ingestion job.
  • Serverless or RA3: Pick per workload, with the same model layer underneath.
  • Reverse ETL to SaaS: Push reconciled metrics from Redshift back into Salesforce, HubSpot, marketing tools.
  • Concurrency control: Workload Management and concurrency scaling for mixed BI and batch loads.
  • Distkey rework: Audit and redesign distribution and sort keys so the slow joins stop being slow.
  • Materialized views: Precompute heavy joins once, refresh incrementally, serve dashboards in milliseconds.
  • Data sharing: Share read-only datasets across Redshift clusters and AWS accounts without copies.
  • ML in SQL: Train and call SageMaker models from Redshift ML on the curated fact tables.
  • Cost cap on Serverless: Put base RPU and max RPU caps in place so a runaway query is not a billing event.
  • Multi-AZ resilience: Failover-ready Redshift for workloads where downtime costs more than the second AZ.
Real business questions

Answers you will finally get.

Are we paying Serverless RPU rates for workloads that would be cheaper on RA3, or the other way around?

A read of the Serverless usage logs against the actual query mix shows where the workload is steady enough to fit a sized RA3 cluster, and where the bursty pattern justifies Serverless. Most warehouses started on one model and never tested the other; a quarterly check on the split usually shifts a meaningful slice of the bill.

Which of our slow Redshift queries are slow because of the data, and which because of the table design?

EXPLAIN plans plus SVV_TABLE_INFO show where queries pay for a network shuffle on the wrong distkey, scan a sortkey that no filter uses, or skew across nodes. Reworking distribution and sort on the top ten heaviest tables usually moves more queries than buying more compute does.
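The SVV_TABLE_INFO side of that triage can be sketched as a small scoring pass. The column names below mirror the real view (skew_rows, unsorted, tbl_rows); the thresholds are illustrative assumptions, not Redshift defaults.

```python
def flag_tables(table_info, max_skew=4.0, max_unsorted_pct=20.0):
    """table_info: iterable of dicts with keys table, skew_rows,
    unsorted, tbl_rows (as returned by a SELECT on SVV_TABLE_INFO).

    Returns (table, reasons) pairs for tables worth a distkey/sortkey
    review, heaviest table first."""
    flagged = []
    for t in table_info:
        reasons = []
        # skew_rows is the ratio of rows on the fullest slice to the
        # emptiest; high values mean one node does the join's work.
        if t["skew_rows"] > max_skew:
            reasons.append(f"row skew {t['skew_rows']:.1f}x across slices")
        # unsorted is the percent of rows outside sortkey order, so the
        # zone maps stop pruning scans.
        if t["unsorted"] > max_unsorted_pct:
            reasons.append(f"{t['unsorted']:.0f}% of rows unsorted")
        if reasons:
            flagged.append((t["table"], reasons, t["tbl_rows"]))
    flagged.sort(key=lambda x: x[2], reverse=True)
    return [(name, reasons) for name, reasons, _ in flagged]
```

Running this against the top of SVV_TABLE_INFO by size gives the shortlist; the EXPLAIN plans then confirm which flagged table each slow query actually touches.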

Where are we copying data into Redshift that Spectrum could read in place from S3?

An audit of the load jobs against the S3 lake shows historical Parquet partitions that are loaded once and queried rarely. Moving those out of managed storage and into a Spectrum external schema cuts the storage line and the load pipeline in one move, with the same SQL surface for the dashboards.

Value for everyone in the organisation

Where each function gets value.

For finance leaders

One reconciled revenue and margin layer in Redshift that the board pack, the controllers and the FP&A models all read. Month-end stops being a debate about whose extract is fresher, and the AWS bill for storage and compute becomes one line you can model against actual reporting load.

For sales leaders

Salesforce or HubSpot opportunity data joined with Aurora-based application usage and S3 event logs in one Redshift model. Pipeline coverage, win-rate and product engagement come off the same numbers, and reverse ETL pushes the curated scores back into the CRM where reps work day to day.

For operations

RA3 sizing, Serverless RPU caps, distribution-key health and Spectrum spend become observable in one place. Ops teams stop debugging ad-hoc COPY jobs from a dozen Lambda functions and start reviewing one model layer with one schedule.

Your existing tools

Your data lands in a warehouse. Your BI tools read from it.

You keep the reporting tool you already have. We connect it to the warehouse where your Amazon Redshift data lives.

  • Power BI (Microsoft)
  • Fabric (Microsoft)
  • Snowflake (data warehouse)
  • BigQuery (Google)
  • Tableau (visualisation)
  • Excel (sheets & pivots)
Three steps

From Amazon Redshift to answers in three steps.

01

Connect securely

OAuth authentication. Read-only by default. We sign a DPA and your admin keeps the keys.

02

Land in your warehouse

Data flows into your warehouse on your schedule. Near real time or nightly, your call. You own the data.

03

Reporting, automation, AI

We build the first dashboard, workflow or AI feature with you, then hand over the keys. Or we stay on for ongoing delivery.

Two ways to work with us

Pick the track that fits how you work.

Track 01

Self-serve

We set up the foundation. Your team builds on top.

  • Amazon Redshift connector configured and running
  • Warehouse set up in your cloud account
  • Clean access for your Power BI, Fabric or Tableau team
  • Documentation on what's in the data model
  • Sync monitoring so you're warned before reports break

Best fit: Teams that already have a BI analyst or data engineer and want to own the build.

Track 02

Done for you

We build the whole thing, end to end.

  • Everything in Self-serve
  • Dashboards built to the questions your team actually asks
  • Automations between your systems
  • AI workflows scoped to real tasks your team runs
  • Custom apps where a dashboard does not cut it
  • Ongoing delivery at a pace that fits your team

Best fit: Teams without in-house BI or dev capacity. You tell us what you need and we deliver it.

Before you book

Frequently asked questions.

Who owns the data?

You do. It lands in your warehouse, on your cloud account. We don't resell or aggregate it. If you stop working with us, the warehouse stays yours and keeps running.

How fresh is the data?

Near real time for most operational systems. For heavier sources we schedule hourly or nightly. You pick based on what the reports need.

Do I need a warehouse already?

No. If you don't have one, we help you pick one and set it up as part of the first delivery. Common starting points are Snowflake, Microsoft Fabric, or a small Postgres instance.

Should we run Redshift Serverless or Provisioned RA3?

Both have a place. Serverless bills per Redshift Processing Unit by the second and is the simpler default for spiky or unpredictable workloads. RA3 nodes bill per hour and become cheaper once the warehouse is steadily busy enough to keep them warm. We size based on the actual query mix and revisit the choice quarterly, since the right answer shifts as the workload grows.
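The sizing arithmetic is simple enough to sketch. A back-of-envelope comparison under stated assumptions: the two prices below are illustrative placeholders, not current AWS rates — check the pricing page for your region before acting on the output.

```python
# Assumed prices -- placeholders for illustration, not AWS's actual rates.
RPU_HOUR_PRICE = 0.375        # $ per RPU-hour, Serverless
RA3_NODE_HOUR_PRICE = 1.086   # $ per node-hour, hypothetical RA3 node type
HOURS_PER_MONTH = 730

def monthly_costs(rpu_hours_per_month, ra3_nodes):
    """Compare one month of Serverless (billed per RPU-second, summed here
    as RPU-hours from the usage logs) against an always-on RA3 cluster."""
    serverless = rpu_hours_per_month * RPU_HOUR_PRICE
    provisioned = ra3_nodes * RA3_NODE_HOUR_PRICE * HOURS_PER_MONTH
    return serverless, provisioned

def cheaper(rpu_hours_per_month, ra3_nodes):
    serverless, provisioned = monthly_costs(rpu_hours_per_month, ra3_nodes)
    return "serverless" if serverless < provisioned else "ra3"
```

The interesting input is rpu_hours_per_month taken from the actual usage logs, not from a guess — a bursty warehouse that idles overnight and at weekends often lands well below the crossover point.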

How do we use Redshift Spectrum and S3 together without doubling the bill?

Spectrum reads Parquet, ORC and JSON straight out of S3, so cold and historical data can stay there instead of in managed storage. The split we usually land on is hot, frequently joined data inside Redshift; older partitions and append-only event data exposed as Spectrum external tables. The same SQL queries both, so the dashboards do not change.
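The move to an external schema is a pair of DDL statements. A sketch that generates them, following Redshift's CREATE EXTERNAL SCHEMA / CREATE EXTERNAL TABLE syntax — the schema, Glue database, IAM role and S3 path arguments in the usage are hypothetical placeholders.

```python
def external_schema_ddl(schema, glue_database, iam_role_arn):
    """DDL to expose an AWS Glue catalog database as a Redshift schema,
    so Spectrum tables sit next to the hot tables in the same SQL."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG DATABASE '{glue_database}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )

def external_table_ddl(schema, table, columns, s3_path):
    """DDL for a Parquet-backed external table; columns is a list of
    (name, type) pairs. The data stays in S3 -- no COPY job."""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n  {cols}\n)\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_path}';"
    )
```

Once the external table exists, dashboards query spectrum.events exactly like a managed table; only the scan is billed per terabyte rather than stored.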

Can distribution and sort keys be changed once a table is loaded?

Yes. ALTER TABLE supports changing distribution style and sort keys on an existing table, and CREATE TABLE AS rebuilds when an in-place change is not feasible. The catch is that picking the wrong keys early can leave a query slow for months before anyone connects the symptom to the design, which is why a quarterly review on the heaviest tables is usually a better investment than more compute.
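The in-place changes above map to two ALTER forms. A sketch that emits them — table and column names are illustrative; since ALTER DISTKEY and ALTER SORTKEY redistribute data in the background, we stage the statements off-peak rather than running them mid-day.

```python
def rework_keys(table, distkey=None, sortkey=None):
    """Return ALTER TABLE statements to change a table's distribution
    key and/or its sort key columns on an existing Redshift table."""
    stmts = []
    if distkey:
        # Switch the table to KEY distribution on the given column.
        stmts.append(
            f"ALTER TABLE {table} ALTER DISTSTYLE KEY DISTKEY {distkey};"
        )
    if sortkey:
        # Replace the sort key with the given column list.
        cols = ", ".join(sortkey)
        stmts.append(f"ALTER TABLE {table} ALTER SORTKEY ({cols});")
    return stmts
```

When an in-place ALTER is not feasible (for example a full column-type change alongside the keys), the fallback is CREATE TABLE AS with the new keys, then a rename swap.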

GDPR-compliant
Data stays in the EU
You own the warehouse

A first deliverable live in four to six weeks.

We review your Amazon Redshift setup and the systems around it. Together we pick the first thing worth building.