Seed data folder structure and contents

#1

I was reading the docs, specifically the topic “documentation/topic/seed.c3doc” on the structure of the seed directory, but it seems to be lacking detail. We have the standard [seed, src, test] folders; what I’m interested in is what is required vs. allowed within the seed directory so that an existing type has instances created from seed data. We are using JSON files with 1:M entries per type to be created during provisioning.

  1. Subdirectories: what is required vs. allowed for naming, paths, and nested folders? I’d like to structure the seed directory just like the source, whereas today our seed directory is flattened.
  2. JSON file names: what is required vs. allowed for naming the files? How many seed data files can exist for the same type? Does certain naming of the seed files short-circuit the processing of other seed files that contain the same type (e.g. a seed/TypeA/ folder containing 1.json, 2.json, and TypeA.json)?
0 Likes

#2

For example, if you have seed for types: CronJob, SimpleMetric, TypeA, TypeB, then you can have this structure:

seed/
├── CronJob/
│   ├── cron-job-1.json
│   ├── . . .
│   └── cron-job-n.json
├── SimpleMetric/
│   └── simple-metric-all.json
├── TypeA/
│   └── type-a-1.json
└── TypeB/
    └── type-b-1.json
src/
test/

Subdirectory names have to match existing types. For the JSON files themselves, there is no naming restriction.
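To make the rule above concrete, here is a minimal sketch of how a seed tree could be read so that the folder name, not the file name, decides the type. It is plain Python for illustration only, not the platform's loader; the function name `group_seed_files_by_type` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_seed_files_by_type(seed_root):
    """Map each type name (the immediate parent folder) to its JSON file names.

    Illustrative only: this mimics the convention described in the thread
    and is not part of any C3 API.
    """
    root = Path(seed_root)
    files_by_type = defaultdict(list)
    for path in sorted(root.rglob("*.json")):
        if path.parent == root:
            # A file sitting directly in seed/ has no type folder; skip it.
            continue
        # The folder that directly contains the file names the type.
        # The file's own name is never interpreted.
        files_by_type[path.parent.name].append(path.name)
    return dict(files_by_type)
```

With the layout shown above, `group_seed_files_by_type("seed")` would list every cron-job-*.json file under the key `CronJob`, whatever the files are called.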

0 Likes

#3

Thank you. So how about the subdirectory allowances?
seed/
└── SocialNetworks/   (arbitrary names or meaningful “package” structures)
    ├── TypeA/
    │   └── type-a-1.json
    └── TypeB/
        └── type-b-1.json
src/
test/

0 Likes

#4

It’s better to have a flat seed directory, but you can have package folders with the type folders inside them.

0 Likes

#5

Can you elaborate on “better”: is that a preference, a performance concern, or a C3 standard?

0 Likes

#6

No performance impact; it is just easier to browse.

0 Likes

#7

Thanks for clarifying that it’s personal preference. I’ll take it up internally on how we choose to move forward.

0 Likes

#8

1. The seed category of metadata files should contain instances of SeedData or Metadata types.
2. Each instance must have an id.
3. When provisioned, the instances will be upserted or replaced according to their type (SeedData vs. Metadata).
4. The typical C3 seed file convention is seed/…/TypeA/id1.json etc., with one instance per file. You can also have multiple instances in one JSON file with the fields “type” : [TypeA], value: [instances], or use CSV.
5. Don’t duplicate the same id within the same package. When the same id is overridden across packages, the instances will be merged in order of package dependency.
6. Have the type name as the immediate parent directory of the files.
7. Don’t keep data (like timeseries) in seed.

I will update the seed.doc for a future release.
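
Rules 2 and 5 above are easy to check before provisioning. The following is a rough sketch of such a check, assuming the instances of one type from one package have already been loaded into a list of dicts; `check_seed_ids` is a hypothetical helper, not a platform function, and treating uniqueness as per type is an assumption.

```python
from collections import Counter


def check_seed_ids(instances):
    """Return a list of problems for one package's instances of one type.

    Assumption: each instance is a dict and its identifier lives under "id".
    """
    problems = []
    ids = []
    for index, instance in enumerate(instances):
        instance_id = instance.get("id")
        if not instance_id:
            # Rule 2: every instance must carry an id.
            problems.append(f"instance at index {index} has no id")
        else:
            ids.append(instance_id)
    for instance_id, count in Counter(ids).items():
        if count > 1:
            # Rule 5: the same id must not repeat within one package.
            problems.append(f"id {instance_id!r} appears {count} times")
    return problems
```

An empty result means the list passes both checks; it says nothing about how ids are merged across packages.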

2 Likes

#9

Thanks Pavan. Regarding #4, could you clarify the redundancy of requiring the immediate parent directory to match the type while also defining “type” : [TypeA] inside a file within that directory? Is it used to validate the end type, so that the platform prevents seed data from inadvertently populating the wrong types?

0 Likes

#10

For the case of many instances in a metadata file, you can have just an array of instances, [instance1, instance2]; the type name is derived from the file path.

“type” : [TypeA], value: [instances] is another way to provide the content, which can be read without path info.
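
The two content shapes described above could be told apart roughly as follows. This is only a sketch under assumptions: the wrapped form is taken to be a JSON object whose "type" field is a string such as "[TypeA]" and whose "value" field is the array of instances, and `read_seed_content` is a hypothetical helper, not the platform's parser.

```python
import json


def read_seed_content(text, folder_type):
    """Return (type_name, instances) for either content shape.

    folder_type is the name of the file's immediate parent directory.
    """
    content = json.loads(text)
    if isinstance(content, list):
        # Bare array of instances: the type comes from the file path.
        return folder_type, content
    if isinstance(content, dict) and "type" in content and "value" in content:
        # Wrapped form (assumed shape): the file names its own type,
        # so it can be read without path info.
        declared = content["type"].strip()
        if declared.startswith("[") and declared.endswith("]"):
            declared = declared[1:-1]
        return declared, content["value"]
    # Anything else is treated as a single instance of the folder's type.
    return folder_type, [content]
```

For example, `read_seed_content('[{"id": "a"}]', "TypeA")` takes the type from the folder, while a wrapped file carries the type with it.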

1 Like

#11

Update after the last sprint: we’ve tried various methods of structuring the seed folder, naming the file, and including the type in the JSON file, all to no avail. Provisioning still gives us an error if the last folder name is not a valid type, regardless of whether the filename or the type string in the file matches a valid type.

[three screenshots, not preserved]

On provisioning:
[screenshot, not preserved]

0 Likes

#12

Your folder should have the same name as your type.

0 Likes

#13

I took that as a soft requirement, one that could be overridden by the seed data file name or by adding the type in the seed data file. I’m confused about the value of defining the type as Pavan has shown above if the folder name is going to dictate the type. If the folder name is truly required per Pavan’s #6, then I’d like to understand what remains convention/best practice vs. requirement. From my point of view, we will have a three-peat of the type name: TypeA/TypeA.json -> {type: TypeA , values:[…]}

0 Likes

#14

@clowtown In your ItemTest.json file, you have the wrong type. It should be "[ItemTest]", since it is an array of ItemTest objects.

0 Likes

#15

You only have to put the type in the file if it doesn’t EXACTLY match the folder name.

For example, the type Facility extends the type FixedAsset, and the type Transformer also extends FixedAsset.

You could have a folder in seed data called FixedAsset, another folder called Facility, and another called Transformer. In this case you don’t need to put any type information in the file.

However, you could also have a single folder called FixedAsset and provide instances of all 3 types in that one folder. In this case you have to say which objects are Transformers and which objects are Facilities using Pavan’s syntax.

Personally, I pretty much always just use the folder name, but we’ve tried to provide the developer with flexible options.

To be clear, it is a REQUIREMENT that the name of the final folder in the seed directory be a valid type, and that all objects in that folder be of that type (they can be more specific, e.g. Facility in a folder called FixedAsset).

You can have an arbitrary folder structure between seed/ and the final folder. I do not agree that a flat structure is easier to read.
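
The requirement above (final folder is a valid type, and every object in it is of that type or a more specific one) can be sketched as a small check. Everything here is illustrative: the `PARENT` table is a hypothetical stand-in for the real type system using the FixedAsset example, and reading a per-object "type" field is an assumption about how a mixed folder declares its subtypes.

```python
# Hypothetical type hierarchy from the example above: child -> parent.
PARENT = {"FixedAsset": None, "Facility": "FixedAsset", "Transformer": "FixedAsset"}


def is_same_or_subtype(type_name, folder_type, parents=PARENT):
    """Walk up the hierarchy from type_name looking for folder_type."""
    current = type_name
    while current is not None:
        if current == folder_type:
            return True
        current = parents.get(current)
    return False


def check_folder(folder_type, instances, parents=PARENT):
    """Return a list of problems for one type folder and its instances."""
    if folder_type not in parents:
        return [f"folder {folder_type!r} is not a valid type"]
    problems = []
    for instance in instances:
        # Assumption: an object without its own "type" takes the folder's type.
        declared = instance.get("type", folder_type)
        if not is_same_or_subtype(declared, folder_type, parents):
            problems.append(
                f"instance {instance.get('id')!r} has type {declared!r}, "
                f"which is not {folder_type!r} or a subtype of it"
            )
    return problems
```

Under these assumptions, a FixedAsset folder holding Facility and Transformer objects passes, a Facility folder holding a Transformer does not, and an intermediate package folder such as SocialNetworks is rejected only if it is used as the final folder.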

4 Likes