First off you can install the Command Line Interface with pip:
pip install awscli
If you are using the Python virtual environment described on the ‘Configure Docker’ page, you will already have this installed.
# The aws configure command at the command line pulls up an interactive
# session. Press <enter> after each line. Values needed can be found in
# secrets.json, which you will need to request directly from the project team.
# This command configures a awscli profile called 'housinginsights',
# which means you will be required to add the argument "--profile
# housinginsights" to every awscli command. You can optionally omit
# this if housinginsights is the only project you want to use awscli for.
$ aws configure --profile housinginsights
AWS Access Key ID [None]: <use secrets.json.s3.access_key_id>
AWS Secret Access Key [None]: <use secrets.json.s3.secret_access_key>
Default region name [None]: us-east-1
Default output format [None]: <none>
Because our raw data is large, we don’t store it in our GitHub repository (it has a 1Gb total limit on repository size). The sync command of the awscli acts like Dropbox or similar services, but manually using the command line. There is only ever one copy of the data (unlike Git), and you can download changes from S3 or push your own changes to S3. S3 is a file storage service from Amazon Web Services.
Do this before you add/download new files to the /data folder, to make sure you’re up to date.
Navigate to the root project folder path\to\your\housing-insights\
, Type ls (dir on windows) and make sure you see the data folder.
aws s3 sync s3://housinginsights data --profile housinginsights --dryrun
--dryrun
flag lists all the actions that would be performed (e.g. ‘download: /data/newfile.txt’). If none, no action needed. If there are some, make sure they will not cause any conflicts with changes you have made locally since you last synced.aws s3 sync s3://housinginsights data --profile housinginsights
Make sure you download updates before you start working.
Navigate to the root project folder path\to\your\housing-insights\
Type ls (dir on Windows) and check for the data folder
Test push your changes: aws s3 sync data s3://housinginsights --profile housinginsights --dryrun
Note, this is the same command as above, but with the order of the two folders reversed.
Make sure it all is what you want, then do it for real: aws s3 sync data s3://housinginsights --profile housinginsights
This pushes your folder contents to S3.
The sync command only updates or adds files by default. If you deleted a file and actually want it to be deleted (e.g. it was moved to another folder). By default contributors don’t have delete permission - contact a project lead for help. Do try to put things in the right folder before you sync to avoid deletion issues!