How to transform a JSON object to a Map object to be consumed in a Reduce function?

Hi,

I'm trying to convert a single-threaded algorithm into a MapReduce job.

The main object (a JSON object) that I'm using in that algorithm looks like this:

It's the final output of my map function. I'm having trouble sending the JSON object to the reduce function, because the output of the map function must be a Map<string, Obj>.

Is there a simple way to transform the JSON into an object that can be sent through the map object to the reduce function?

I tried several approaches; some were rejected and others are risky. Your ideas are welcome.

FYI, my JSON is a multilevel object and may be very large.

Thanks

A MapReduce job basically outputs (K, V) pairs in the map function; the reduce function then receives all the values for a given key, i.e. (K, [V]).
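To make the K, [V] grouping concrete, here is a minimal plain-JavaScript simulation of what happens between map and reduce. It is only an illustration of the general MapReduce idea, not the platform's implementation; groupByKey is a hypothetical helper name.

```javascript
// Simulates the grouping step between map and reduce.
// Each element of mapOutputs is one map() result: an object of key -> value.
// (groupByKey is a hypothetical helper, not a platform API.)
function groupByKey(mapOutputs) {
    var grouped = {};
    mapOutputs.forEach(function (output) {
        Object.keys(output).forEach(function (key) {
            if (!Object.prototype.hasOwnProperty.call(grouped, key)) {
                grouped[key] = [];
            }
            // Every value emitted under the same key ends up in the same array.
            grouped[key].push(output[key]);
        });
    });
    return grouped;
}

// Two map() calls emit values; reduce would be called once per key.
var grouped = groupByKey([
    { a: 1, b: 2 },
    { a: 3 }
]);
// reduce("a", [1, 3]) and reduce("b", [2]) would follow.
```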

If your objects are too big, you could split them into smaller objects, e.g.:

  • construct K as the concatenation of the keys at each level, like this: res1base-2018-2-01100144586748_SE
  • construct V as the object at the leaf level, like [{'rmsId': ...}].
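The split described in the two bullets above could be sketched as follows. This is a plain-JavaScript sketch under some assumptions: flattenToLeaves is a hypothetical helper, the "-" separator is taken from the example key, arrays are treated as the leaf level, and empty branches produce no key.

```javascript
// Flattens a nested object into { "level1-level2-...": leafArray }.
// Assumptions: arrays are the leaf level; "-" never appears inside a level key.
// (flattenToLeaves is a hypothetical helper, not a platform API.)
function flattenToLeaves(node, prefix, out) {
    out = out || {};
    Object.keys(node).forEach(function (key) {
        var path = prefix ? prefix + "-" + key : key;
        var child = node[key];
        if (Array.isArray(child)) {
            // Leaf level reached: emit one small (K, V) pair.
            out[path] = child;
        } else if (child !== null && typeof child === "object") {
            // Intermediate level: keep descending.
            flattenToLeaves(child, path, out);
        }
        // Scalars above the leaf level are ignored in this sketch.
    });
    return out;
}

var nested = {
    res1base: {
        "2018": {
            "6": {
                "04332127265965_5E": [{ rmsId: "RMSM_04332127265965_5E_1_R" }]
            }
        }
    },
    hphc: {}
};
var flat = flattenToLeaves(nested, "");
// flat has a single key, "res1base-2018-6-04332127265965_5E"
```

Each entry of flat is then small enough to be emitted as its own key/value pair from map, instead of one huge value.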

Just for reference, this is how the MapReduce type is declared:

entity type MyJob mixes MapReduce<MapInputType, ReduceInputKeyType, ReduceInputValType, ReduceOutputType> type key 'XYZ' {
  . . .
}

Can you describe the shape of the object you want to receive in the reduce function, or the logic you want to perform inside reduce? That could give hints about how the data should be structured.

Here are more details about the issue:

// the MapReduce Job declaration

entity type ForecasterEnedisMapReduceJob mixes MapReduce<ServicePoint, string, json, Void> type key 'FORECASTER_ENEDIS_MR' {
  setup: function(): ForecasterEnedisMapReduceJob js server
  map: ~ js server
  reduce: ~ js server
}

////////////// the Map function return instruction

function map(batch, objs, job) {
    .
    .
    .
    var dataWithNbCumSum = calculateNbCumSum(data); // generates a plain JS (JSON-like) object

    var resFinal = c3Make('map<string, json>', { value: dataWithNbCumSum });

    // I tried this as well:
    // var resFinal = c3Make('map<string, json>', { value: Json.fromString(JSON.stringify(explodedDataWithNbCumSum)) });

    return resFinal;
}

///////// the reduce function

function reduce(outKey, interValues, job) {
    // Nothing for the moment; I force an error just to see what I get in interValues.
    try {
        blabla.get('');
    } catch (e) {
        throw new Error(JSON.stringify(interValues));
    }
}

So, I tried many types for the map output. I checked that the output is full, but I receive nothing on the reduce side.

Here is the structure of my JS JSON object:
{
  "res1base": {
    "2018": {
      "6": {
        "04332127265965_5E": [
          {
            "rmsId": "RMSM_04332127265965_5E_1_R",
            "servicePointId": "04332127265965_5E",
            "startDate": "2018-11-03T00:00:00.000+01:00",
            "endDate": "2018-12-02T00:00:00.000+01:00",
            "nbJours": 29,
            "month": 11,
            "year": 2018,
            "profil": "RES1_BASE",
            "consoBase": 53,
            "coeffBase": 32.022510721326135,
            "monthAtStart": 6,
            "yearAtStart": 2018,
            "nbCumSum": 180,
            "nbCumSumMax": 358
          },
          {
            "rmsId": "RMSM_04332127265965_5E_1_R",
            "servicePointId": "04332127265965_5E",
            "startDate": "2018-12-03T00:00:00.000+01:00",
            "endDate": "2019-01-02T00:00:00.000+01:00",
            "nbJours": 30,
            "month": 12,
            "year": 2018,
            "profil": "RES1_BASE",
            "consoBase": 39,
            "coeffBase": 35.87927775537637,
            "monthAtStart": 6,
            "yearAtStart": 2018,
            "nbCumSum": 210,
            "nbCumSumMax": 358
          }
        ]
      },
      "7": {
        "07264109981292_5E": [
          {
            "rmsId": "RMSM_07264109981292_5E_1_R",
            "servicePointId": "07264109981292_5E",
            "startDate": "2018-07-25T00:00:00.000+02:00",
            "endDate": "2019-01-24T00:00:00.000+01:00",
            "nbJours": 183,
            "month": 7,
            "year": 2018,
            "profil": "RES1_BASE",
            "consoBase": 1059,
            "coeffBase": 189.66471270161284,
            "monthAtStart": 7,
            "yearAtStart": 2018,
            "nbCumSum": 183,
            "nbCumSumMax": 362
          }
        ]
      }
    }
  },
  "res11base": {},
  "hphc": {},
  "cad3": {}
}

Thanks,
Hedi

The shape of the object I'm expecting in reduce is an array of the objects sent by the maps. Isn't that the default behaviour of a MapReduce job? Please check the shape of the object returned by the map function in my first comment.

Here are the results I'm receiving in the map/reduce functions (attached):
MAP: the object sent is full

REDUCE: the received array is empty

@HediTek
I tested the code with the any and json types; in both cases null objects are passed to reduce().
You can work around this constraint by defining the intermediate value type as string:

entity type ForecasterEnedisMapReduceJob mixes MapReduce<ServicePoint, string, string, Void>

In the map() function you can serialize your object using JSON.stringify() and decode it in reduce() with JSON.parse():

function map(batch, objs, job) {
    return {
            "res1base": 
                JSON.stringify({
                    "2018": {
                        "2": {
                            "011..5E": [
                                { "rmsId": "RMS1", servicePointId: "SP1" },
                                { "rmsId": "RMS2", servicePointId: "SP2" },
                                { "rmsId": "RMS3", servicePointId: "SP3" },
                            ]
                        }
                    }
                })
    };
}

function reduce(outKey, interValues, job) {
    C3.logger("mylogs").info(JSON.stringify(interValues));
    var value = JSON.parse(interValues[0]);
    return [ value["2018"]["2"]["011..5E"].length ];
}
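The example above only decodes interValues[0]. If several map() calls emit a value for the same key, you would probably want to decode all of them. A minimal sketch, assuming interValues can be indexed and has a length like a regular array (not verified on the platform); parseAll is a hypothetical helper:

```javascript
// Decodes every serialized intermediate value, not just the first one.
// Assumption: interValues behaves like an array of JSON strings.
// (parseAll is a hypothetical helper, not a platform API.)
function parseAll(interValues) {
    var parsed = [];
    for (var i = 0; i < interValues.length; i++) {
        parsed.push(JSON.parse(interValues[i]));
    }
    return parsed;
}

// Stand-in for what two map() calls might have emitted under the same key.
var sample = [
    JSON.stringify({ "2018": { "2": { "011..5E": [{ rmsId: "RMS1" }, { rmsId: "RMS2" }] } } }),
    JSON.stringify({ "2018": { "2": { "011..5E": [{ rmsId: "RMS3" }] } } })
];

// Number of leaf entries contributed by each map() output.
var counts = parseAll(sample).map(function (value) {
    return value["2018"]["2"]["011..5E"].length;
});
```

Inside reduce() you would call parseAll(interValues) in place of the single JSON.parse(interValues[0]).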

Please file a ticket if you want a long-term solution with the json type.

One question: does the string generated from the JSON object have any size limitation? My object may be very big.