Upload a local csv file into jupyter (not into a type)


#1

Hi there,

I need to load a .csv file from my local hard drive onto the platform. My plan is to read it in Jupyter as a dataframe. Note that I don’t want to save this data into a type; I just need it to filter a type and for some temporary computation.

I tried the upload tool on the Jupyter home page, but although the file appears to upload, I can’t find it anywhere…

Thanks
Alessandro


#2

I suggest uploading the file to an S3 bucket with curl, at a path you define in the command below, and then accessing it through the S3 APIs.
In the command, replace the hostname with that of your environment, and the tenant and tag with your own. You can retrieve the c3auth cookie value from the following URL (again, replace the hostname with your environment’s):

https://myenvironmentUrl/auth/1/token

curl -H "Content-Type: text/csv"  --cookie "c3auth=..."  \
     -X PUT --data-binary @100KServiePoint.csv \
     https://myenv/file/1/mytenant/mytag/myPath/myfile.csv -v -L -k
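
Since the goal is to work from a notebook, the same PUT can also be prepared in Python. The following is a minimal sketch using only the standard library. The helper name build_upload_request and its parameters are hypothetical, the URL layout is simply copied from the curl command above, and the token is a placeholder for whatever the auth endpoint returns. It only builds the request and does not replicate curl's -L and -k behaviour.

```python
import urllib.request


def build_upload_request(host, tenant, tag, remote_path, token, payload):
    """Prepare a PUT request mirroring the curl command (hypothetical helper)."""
    # URL layout copied from the curl example: /file/1/<tenant>/<tag>/<path>
    url = f"https://{host}/file/1/{tenant}/{tag}/{remote_path}"
    headers = {
        "Content-Type": "text/csv",
        "Cookie": f"c3auth={token}",  # same cookie curl sends with --cookie
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")


# Placeholder values only; nothing is sent until urlopen() is called.
request = build_upload_request(
    "myenv", "mytenant", "mytag", "myPath/myfile.csv",
    token="PLACEHOLDER", payload=b"id,value\n1,10\n",
)
print(request.get_method(), request.full_url)
```

To actually send it, pass request to urllib.request.urlopen, with the payload read from the local file in binary mode.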

#3

Hi,

If I understand correctly, you are using the containerized Jupyter.
Therefore, when you upload your file through the Jupyter homepage, it is in fact stored on S3, and not in your Jupyter folder, although it appears there.

To open your file you need to use the function c3_open:

import pandas as pd
from c3notebook.c3_utils import c3_open

# c3_open reads from the S3-backed storage behind the Jupyter file browser
with c3_open('file.csv', 'r') as f:
    df = pd.read_csv(f)

#4

What if you want to read it from an S3File entity?
For example myfile=S3.listFiles().files[0]
I know I can do S3File.readString(myfile,0,1024).
How can I create a pandas dataframe from its contents?


#5

I would say that it depends on the content of the file.

If the file contains a serialized C3 Dataset, you can do something like this (with your variable myfile):

dataset = c3.S3File.readObj(this=myfile)
df = c3.Dataset.toPandas(dataset=dataset)

If the file is a CSV, you could wrap its contents in a StringIO buffer:

import io
import pandas as pd

# Wrap the file contents in an in-memory text buffer that pandas can read
content = io.StringIO(c3.S3File.readString(myfile))
df = pd.read_csv(content)
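
The in-memory buffer approach can be tried without platform access. In this minimal sketch, csv_text is a made-up stand-in for whatever string c3.S3File.readString(myfile) returns, and the column names are hypothetical. It assumes the whole file fits in memory as a single string.

```python
import io

import pandas as pd

# Stand-in for the string returned by c3.S3File.readString(myfile) (assumed).
csv_text = "id,value\n1,10\n2,20\n3,30\n"

# Wrap the text in a file-like buffer so pandas can parse it like a file.
df = pd.read_csv(io.StringIO(csv_text))

# Example of a temporary computation: keep rows above a threshold.
filtered = df[df["value"] > 15]
print(filtered["id"].tolist())
```

The resulting dataframe lives only in the notebook session, which matches the original goal of not saving the data into a type.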

Maybe someone from MLE has a better solution…