amazon web services - Using AWS Data Pipeline for incremental copy of on-premises Postgres to Redshift
We're exploring using Redshift as our warehouse and need to move new data from our existing on-premises Postgres DB into Redshift nightly. It looks like this can be done with Data Pipeline and a template if the primary DB is in RDS, but can Data Pipeline be used if the database is on-premises?
Since you're synching in bulk on a periodic basis, you might want to consider pushing the data to S3, which is easy from on-premises, and then loading it into Redshift via the COPY command. It's fast and reliable, and you get a free (cheap) backup in S3 as a side effect.
You can use aws s3 cp from the awscli Python tool to push CSV Postgres dumps (after putting your access key in .aws/config), like:
aws s3 cp current_dump.csv.gz s3://yourbucket/20170108/dump.csv.gz
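If you need a way to produce that gzipped CSV dump on the on-premises side, a one-liner along these lines is enough; yourtable and the updated_at filter column are placeholders for your own schema:
psql -h localhost -U youruser -d yourdb -c "\copy (select * from yourtable where updated_at >= current_date - 1) to stdout with csv" | gzip > current_dump.csv.gz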
and the regular Postgres psql utility to execute the COPY on Redshift, like so:
PGPASSWORD='yourpass' psql -h your.redshift.end.point.com -U youruser -d yourdb -p 5439 -c "copy yourtable from 's3://yourbucket/20170108/dump.csv.gz' credentials 'aws_access_key_id=[yourkey];aws_secret_access_key=[yoursecret]' delimiter ',' null 'null' ignoreblanklines emptyasnull blanksasnull timeformat 'auto' fillrecord maxerror 1 csv gzip;"
You can put the Postgres dump and these commands in a shell script, with a bit of scripting of the time values, and run it as a cron job.
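A minimal sketch of such a script, assuming the same placeholder names as above and a dated S3 prefix per night:
#!/bin/bash
# nightly_sync.sh - sketch of a nightly dump/upload/COPY job; adjust names and credentials
set -euo pipefail

DATE=$(date +%Y%m%d)
DUMP=current_dump.csv.gz

# 1. dump yesterday's rows from the on-premises postgres as gzipped csv
psql -h localhost -U youruser -d yourdb \
  -c "\copy (select * from yourtable where updated_at >= current_date - 1) to stdout with csv" \
  | gzip > "$DUMP"

# 2. push the dump to s3 under a dated prefix
aws s3 cp "$DUMP" "s3://yourbucket/$DATE/dump.csv.gz"

# 3. load it into redshift with copy
PGPASSWORD='yourpass' psql -h your.redshift.end.point.com -U youruser -d yourdb -p 5439 \
  -c "copy yourtable from 's3://yourbucket/$DATE/dump.csv.gz' credentials 'aws_access_key_id=[yourkey];aws_secret_access_key=[yoursecret]' delimiter ',' null 'null' ignoreblanklines emptyasnull blanksasnull timeformat 'auto' fillrecord maxerror 1 csv gzip;"
A crontab entry then runs it nightly, for example at 2am:
0 2 * * * /path/to/nightly_sync.sh >> /var/log/nightly_sync.log 2>&1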
As a best practice, you'll want to load into a staging table and then merge into the main table, to support updates if needed and to prevent duplicates.
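A minimal sketch of that merge, run against Redshift after the COPY; it assumes the COPY above targets a staging table such as yourtable_staging and that id stands in for whatever key identifies a row (both names are placeholders):
begin;
-- replace rows that already exist in the main table
delete from yourtable using yourtable_staging where yourtable.id = yourtable_staging.id;
-- bring in the freshly loaded rows
insert into yourtable select * from yourtable_staging;
end;
-- clear staging for the next run (kept outside the transaction, since truncate commits in Redshift)
truncate yourtable_staging;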