amazon web services - Using AWS Data Pipeline for incremental copy of on-premises Postgres to Redshift


We're exploring using Redshift as our warehouse, and we need to move new data from our existing on-premises Postgres DB into Redshift nightly. It looks like we can use Data Pipeline with a template if the primary DB is in RDS, but can we use Data Pipeline if the database is on-premises?

Since you're syncing in bulk on a periodic basis, you might want to consider pushing the data to S3 (which is easy to do from on-premises) and loading it into Redshift via the COPY command. It's fast and reliable, and you get a free (cheap) backup in S3 as a side effect.

You can use aws s3 cp from the awscli Python tool to push gzipped CSV dumps of your Postgres tables (after putting your access key in .aws/config), like:

aws s3 cp current_dump.csv.gz s3://yourbucket/20170108/dump.csv.gz 

and the regular Postgres psql utility to execute the COPY on Redshift, like so:

PGPASSWORD='yourpass' psql -h your.redshift.end.point.com -U youruser -d yourdb -p 5439 -c "copy yourtable from 's3://yourbucket/20170108/dump.csv.gz' credentials 'aws_access_key_id=[yourkey];aws_secret_access_key=[yoursecret]' delimiter ',' null 'null' ignoreblanklines emptyasnull blanksasnull timeformat 'auto' fillrecord maxerror 1 csv gzip;" 

You can put the Postgres dump and these commands in a shell script, with a bit of scripting for the time values, and run it as a cron job.
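
A minimal sketch of such a script, tying the pieces together (the updated_at column used to pick out the new rows, the local connection details, and the dated S3 key are assumptions here, not something taken from your setup):

#!/bin/bash
# nightly_sync.sh - sketch only: table, column, bucket and credential values are placeholders
set -e
today=$(date +%Y%m%d)

# dump the new rows from the on-premises postgres into a gzipped csv
PGPASSWORD='yourlocalpass' psql -h localhost -U youruser -d yourdb \
  -c "\copy (select * from yourtable where updated_at >= current_date - 1) to 'current_dump.csv' with csv"
gzip -f current_dump.csv

# push the dump to s3 under a dated key
aws s3 cp current_dump.csv.gz "s3://yourbucket/$today/dump.csv.gz"

# load it into redshift
PGPASSWORD='yourpass' psql -h your.redshift.end.point.com -U youruser -d yourdb -p 5439 -c "copy yourtable from 's3://yourbucket/$today/dump.csv.gz' credentials 'aws_access_key_id=[yourkey];aws_secret_access_key=[yoursecret]' delimiter ',' null 'null' ignoreblanklines emptyasnull blanksasnull timeformat 'auto' fillrecord maxerror 1 csv gzip;"

A crontab entry like 0 2 * * * /path/to/nightly_sync.sh would then run it each night.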

As a best practice, you'll want to load into a staging table first and then merge into the main table, to support updates if needed and to prevent duplicates.
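
A minimal sketch of that merge (the staging table name and the id key column are assumptions; adjust to your schema) is to COPY into a temp table and do a delete-then-insert in a single transaction, again via psql:

PGPASSWORD='yourpass' psql -h your.redshift.end.point.com -U youruser -d yourdb -p 5439 <<'SQL'
begin;
-- load the dump into a staging table shaped like the target
create temp table yourtable_staging (like yourtable);
copy yourtable_staging from 's3://yourbucket/20170108/dump.csv.gz' credentials 'aws_access_key_id=[yourkey];aws_secret_access_key=[yoursecret]' delimiter ',' null 'null' ignoreblanklines emptyasnull blanksasnull timeformat 'auto' fillrecord maxerror 1 csv gzip;
-- replace any rows that already exist, then insert everything from staging
delete from yourtable using yourtable_staging where yourtable.id = yourtable_staging.id;
insert into yourtable select * from yourtable_staging;
commit;
SQL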

