Importing several Jupyter notebook (.ipynb) files into a Jupyter container's tree

I am migrating a bunch of notebooks and files from a DS server instance to a 7.10 Jupyter container. I have about 40 Jupyter notebooks backed up on S3, and I want to copy or import them into this new environment. Based on my limited understanding of how Jupyter containers work, I cannot simply copy the notebooks onto the container’s file system and expect them to work (because the notebooks are persisted as types in the type system, not as files in the file system).

Option 1) I could download those 40 notebooks and upload them to the Jupyter container one at a time (assuming that the upload action triggers the upsert of the .ipynb file). Naturally, that approach is tedious, and I was wondering if there is another way.

Option 2) Is there, for example, a function on one of the Jupyter types that upserts a Jupyter notebook file (.ipynb) from the container’s local file system into the type system? The function’s input would be a local file path on the container. With that feature, I could copy the files from S3 onto the local file system (using the AWS CLI) and then use code to upsert the notebooks into the type system.

Option 3) Better still would be to provide the path to an .ipynb file on S3 (instead of the local file system), so the file could be upserted directly from S3 without having to go through the Jupyter container’s file system.
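To make Option 2 a bit more concrete, here is a minimal sketch of what the client-side half could look like. An .ipynb file is just JSON in the nbformat (v4+) schema, so the import half is parsing and validating that JSON; the missing piece is a platform API to upsert the result as a JupyterNotebook record. The loadIpynb helper and its validation rule are my own assumptions for illustration; no such upsert API exists on the platform today.

```javascript
// Illustrative sketch only: an .ipynb is plain JSON following the nbformat
// schema. loadIpynb is a hypothetical helper, not a platform API.
function loadIpynb(text) {
  const nb = JSON.parse(text);
  // nbformat v4+ notebooks carry a top-level "nbformat" version field.
  if (!(nb.nbformat >= 4)) {
    throw new Error("expected an nbformat v4+ notebook");
  }
  return nb;
}

// Minimal in-memory stand-in for a file read from the container's disk:
const raw = JSON.stringify({
  nbformat: 4,
  nbformat_minor: 5,
  metadata: {},
  cells: [
    {
      cell_type: "code",
      source: "print('hi')",
      metadata: {},
      outputs: [],
      execution_count: null,
    },
  ],
});

const nb = loadIpynb(raw);
console.log(nb.cells.length); // 1
```

The remaining step, turning this parsed structure into a JupyterNotebook record, is exactly the conversion the platform team describes below as not currently exposed.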

I’ve done some investigation with @ihleonard, and right now we are not able to do Option 2 or Option 3. The primary issue is the conversion from .ipynb to the JupyterNotebook c3typ (the storage format in which notebooks are saved in the C3 database). This conversion is currently handled jointly by the Jupyter container’s ContentManager extension, whose code is not generally available, and the nbconvert library that the Jupyter client implementation itself uses.

This does seem like a reasonable feature request. If you can help compose a ticket with the details of what you’d like to see, then we can help flesh that out into a platform API.

We just did this exercise and developed a script for it. @steveders can you outline the APIs we used to move notebooks from a DS server to containerized jupyter?


Overall approach:

  1. Develop the reference notebook(s) in a tenant with the intended root package provisioned
  2. Access the JupyterNotebook from the console. Note: the notebook is stored under a GUID. Run c3Grid(JupyterNotebook.fetch()), copy the GUID (id field), then run var nb = JupyterNotebook.get(<id>). Finally, run copy(nb.jsonForSeedData()) to copy the seed JSON to the macOS clipboard.
  3. In the root package intended for training, create a JSON file at <root_pkg>/seed/JupyterNotebook/<notebook_name>.json. Note: change the id to match the name (instead of using the GUID).
  4. Provision root package as usual.

Note: similarly, seed data for JupyterDirectory needs to be created to match the path info in the JupyterNotebook records.
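For illustration, a JupyterDirectory seed file might look like the fragment below. This is a guess at the shape: only the id/name convention and the need for path info come from this thread, and the file name and the path field are assumptions about the JupyterDirectory schema, so check an actual record (e.g. via c3Grid(JupyterDirectory.fetch())) for the real field names.

```json
{
  "id": "notebooks_dir",
  "name": "notebooks_dir",
  "path": "/notebooks"
}
```

As with the notebook seed data, this would live under <root_pkg>/seed/JupyterDirectory/ and get picked up on the next provision.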

Here’s a slightly faster script for after you’ve done the fetch and have the IDs. It updates the id field to match the name before copying the seed JSON to your clipboard (as mentioned in step 3).

var nb = JupyterNotebook.get(<id>);  // <id> is the GUID from the fetch above
nb.id = nb.name;                     // use the human-readable name as the seed id
copy(nb.jsonForSeedData());          // copy the seed JSON to the clipboard

Now continue with step 3, but you no longer need to change any field values.

Thanks for the writeup @steveders and @rileysiebel! However, this is not the issue that @varun.krishna is facing. Let me delineate the difference:
Your situation

  • the notebook is already developed on the C3 Platform and already stored as a JupyterNotebook; all you are trying to do is get the JSON to seed it
  • problem that is solved: JupyterNotebook -> json seed data

Varun’s situation

  • the notebook is not developed on the C3 Platform; it is not a JupyterNotebook but a more “raw” .ipynb file
  • problem to be solved: ipynb -> JupyterNotebook

Please correct me if you believe I have misunderstood anything. Or perhaps you have solved Varun’s issue as well and just posted the wrong script?

Thanks for trying to help, @rileysiebel and @steveders, but I too don’t understand how this approach will help me import .ipynb files.

Also, I still think that having this feature is essential for migration (I don’t want to have to provision every time I import an .ipynb file). Please correct me if I’m wrong.

Here’s the ticket @ihleonard @dennis.wang: https://c3energy.atlassian.net/browse/DATA-3816

Please leave a comment on the ticket if it isn’t clear.

That’s true; we did manually copy the notebooks from the DS server to a containerized Jupyter instance first.

Yeah, I do not know of a workaround.

  1. Upload the .ipynb to the environment.
  2. Download the JupyterNotebook JSON into the package.
  3. Provision the package.

Maybe @jakewhitcomb or @harry can better advise.