Download a large number of files from Data Science servers


#1

I have generated a large number of figures and saved them in a folder on the data science server. How can I download them all at once without having to download the files one by one?


#2

Ideally, one would use the rsync command to transfer a large number of files between a remote server and a local machine. However, when you do not have the permissions needed to run rsync, you can zip the files first and then download the resulting archive. The steps are as follows:

  1. Open a terminal on the notebook server: on the “Files” tab of the Jupyter notebook, click “New” in the top right corner of the page and select “Terminal”.
  2. Change directory (cd) to the folder where your files are.
  3. Run the command “zip -r myfigures *” to zip all the files in that folder into myfigures.zip.
  4. Go back to the notebook server view, find your zip file, then right-click it and download it.
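
If you would rather stay inside a notebook cell than open a terminal, the same zipping step can be sketched in Python with the standard library. This is only a minimal sketch: the function name zip_folder and the folder name "figures" are made up for illustration, and it assumes the folder lives on the notebook server's regular file system.

```python
import shutil
from pathlib import Path


def zip_folder(folder, archive_stem="myfigures"):
    """Zip everything under `folder` into <archive_stem>.zip.

    Returns the path of the archive that was written. Keep the archive
    outside `folder` so that the zip file does not try to include itself.
    """
    folder = Path(folder)
    if not folder.is_dir():
        raise NotADirectoryError(str(folder))
    # make_archive appends the ".zip" suffix itself and recurses into
    # subfolders, much like "zip -r" does.
    return shutil.make_archive(str(archive_stem), "zip", root_dir=str(folder))


# Hypothetical usage: zips ./figures into ./myfigures.zip
# archive_path = zip_folder("figures", "myfigures")
```

The resulting myfigures.zip then shows up in the notebook server view and can be downloaded as in step 4.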

#3

rsync requires ssh access and is not available on the DS servers, so the second approach (zipping the files and downloading the archive) is recommended for now.


#4

Please note that the method described above is not compatible with version 7.8 and later of the C3 platform.

In the 7.8 Jupyter infrastructure, files displayed on the hub page are stored in the C3 object datastore (such as AWS S3).
Those files can be edited using the c3_notebook.c3_open utility function (see tutorials).

There is currently no satisfying, non-convoluted way to download a large number of files. If customers request it, we’ll consider adding it to our product roadmap. In the meantime, if you need to do it, you can create the files locally (using open rather than c3_open), zip them following the example above, and finally copy the binary content of that zip file by opening it and writing its content into a file opened with c3_open.
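
A rough sketch of that workaround is below. It rests on an assumption not confirmed in this thread: that c3_open can be called like the built-in open, i.e. c3_open(path, "wb") returns a writable binary file object usable in a with statement. Please check the tutorials for the real signature. The helper name copy_zip_to_store and the paths are made up for illustration, and the opener is passed in as a parameter so the function itself does not depend on c3_notebook.

```python
import shutil
from pathlib import Path


def copy_zip_to_store(folder, archive_stem, remote_path, remote_open):
    """Zip a local folder, then copy the zip bytes into a remote file.

    `remote_open` is whatever opens a file in the object datastore.
    ASSUMPTION: it accepts (path, mode) like the built-in open and
    returns a binary file object that works as a context manager.
    """
    # 1. Zip the locally created files (written with plain open).
    archive = shutil.make_archive(
        str(archive_stem), "zip", root_dir=str(Path(folder))
    )
    # 2. Copy the binary content of the zip file into the remote file.
    with open(archive, "rb") as src, remote_open(remote_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return archive


# Hypothetical usage, assuming c3_open behaves as described above:
# from c3_notebook import c3_open
# copy_zip_to_store("figures", "myfigures", "myfigures.zip", c3_open)
```

Once the copy has finished, the zip file should appear on the hub page as a single file to download.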