The perils of removeAll()


#1

Be careful when using removeAll() functions on persistable types (e.g. Item.removeAll()). You may inadvertently delete data that was not meant to be deleted. Instead, pass in a filter when using this function (e.g. Item.removeAll(Filter.eq("id", "id_of_item"))).
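One way to make that habit harder to forget is a small guard around the call. The sketch below is only an illustration: safeRemoveAll is a hypothetical helper, not part of the platform, and it assumes nothing about the type beyond a removeAll(filter) method.

```javascript
// Hypothetical guard (not a platform API): refuses to call removeAll()
// unless a non-empty filter is supplied, then passes the filter through
// unchanged.
function safeRemoveAll(type, filter) {
  var text = filter == null ? '' : String(filter).trim();
  if (text === '') {
    throw new Error('Refusing to call removeAll() without a filter');
  }
  return type.removeAll(filter);
}

// Intended console usage (Item and Filter come from the platform):
//   safeRemoveAll(Item, Filter.eq('id', 'id_of_item'));
```

The guard only catches a missing or blank filter; it cannot tell whether a filter is broader than intended.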


#2

Hi,

To do a mass removal of risk scores for one well, I have used the following command:

RiskScore.removeAll("parent=='DM193_Slugging'")

However, this doesn’t scale when there is a large number of wells to remove.

Is there a command to add multiple conditions for removing data?
E.g. the below:
RiskScore.removeAll("contains(parent,'Slugging')")
Query


#3

@ChrisHui check the JavaScript helper Filter. It comes with many built-in functions that are useful for creating complex conditions.


#4

@ChrisHui First of all, your screenshot is not very clear:

  • we don’t know what the first error relates to
  • in the second error message, you have a syntax error in your filter: you forgot the closing parenthesis for the contains() function
  • you seem to have fixed this in the last statement, but we don’t see what error was returned, if any

Assuming the first error is what gets returned by your last statement, I believe your issue is that you should pass parent.id to the contains() function, not just parent. With parent alone, it expects the second argument to be of type RiskScore.

Finally, note that if your type is stored in Cassandra, you will not have the same flexibility for filtering as with Relational entity types.


#5

@ishka - Apologies, I should have worded this better. My main interest is this: we have uploaded risk scores against a bunch of wells that contain the _slugging suffix. Rather than removing these well by well, which doesn’t scale when you have 1000+ entries, can we use a command that takes the keyword slugging and removes the risk scores for all entries that have slugging against them?

Thank you.


#6

Given that RiskScore data is partitioned by parent id, if it were possible to do:

RiskScore.removeAll(Filter.contains('parent.id', 'Slugging'))

it would be practically the same performance as:

ParentType.fetch({filter: Filter.contains('id', 'Slugging')}).at('objs').each(function (parent) {
  RiskScore.removeAll(Filter.eq('parent.id', parent.id));
});

I think the only way to achieve better performance for this would be to implement a batch job or map-reduce job to do it, batching on the ParentType.


#7

I’m assuming you are removing both RiskScoreHistory and RiskScore. One way to do this is the following (JS code for console use):

  1. Fetch the desired RiskScoreHistory objects
var rsh = RiskScoreHistory.fetch({
  filter: Filter.eq("scoreName", "Slugging")
});
  2. Remove the RiskScore objects based on the fetched RiskScoreHistory
rsh.objs.each(function(o) {RiskScore.removeAll(Filter.eq("parent.id", o.id));});
  3. Remove the RiskScoreHistory objects themselves
rsh.objs.each(function(o) {RiskScoreHistory.removeAll(Filter.eq("id", o.id));});

Be careful when you do this. Check after step 1 to see if you fetched the correct things.
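A rough way to build that check into the removal itself is to count the matches first and delete only when the count is within what you expect. This is a sketch under assumptions: removeIfExpected is a hypothetical helper, and it assumes the type exposes fetchCount({filter: ...}) alongside removeAll(filter); verify both against your platform version.

```javascript
// Hypothetical helper: counts matching objects first and deletes only when
// the count does not exceed the limit the caller expects.
// Assumes type.fetchCount({filter: ...}) and type.removeAll(filter) exist.
function removeIfExpected(type, filter, maxExpected) {
  var count = type.fetchCount({ filter: filter });
  if (count > maxExpected) {
    // More matches than expected: report the count and delete nothing.
    return { removed: false, count: count };
  }
  type.removeAll(filter);
  return { removed: true, count: count };
}

// Intended console usage (RiskScoreHistory and Filter come from the platform):
//   removeIfExpected(RiskScoreHistory, Filter.eq('scoreName', 'Slugging'), 50);
```

If the returned count is higher than expected, inspect the fetched objects before retrying with a larger limit.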


#8

Be careful with this approach. It might overwhelm the master node, and you would lose your connection to the server.


#9

To do this via MapReduce, you can use a variation of the following code (tweak as needed):

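// Assumed signature (batch, objs, job); only objs is used here.
function map(batch, objs, job) {
  var log = C3.logger("RemoveRiskScores");
  // Remove the RiskScore objects attached to each RiskScoreHistory in this batch.
  objs.each(function (rsh) {
    try {
      RiskScore.removeAll(Filter.eq('parent.id', rsh.id));
    } catch (err) {
      log.error(err);
    }
  });
}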

var spec = JSMapReduceSpec.make({
	targetType: RiskScoreHistory,
	include: "id",
	order:"id",
	limit: -1,
	batchSize: 100,
	map: map
});

var mrj = JS.mapReduce(spec);
mrj.status();

mrj.status() can be called repeatedly to track the status of this removal.


#10

I just noticed on v7.6.1 (and this might also be true on later versions) that the doc says removeAll uses the same FetchSpec#filter as fetch, but I am getting an error.

PointMeasurement.fetch({ filter:Filter.intersects('parent', testHdrs) })

works (where testHdrs is an array of strings), but

PointMeasurement.removeAll({ filter:Filter.intersects('parent', testHdrs) })

fails with:

"Cannot convert object to string for removeAll argument removeFilter.↵value: {"filter":"intersects(parent, [\"33b61cdd-2258-4e65-3e1b-0164cd16552a-15\", \"33b61cdd-2258-4e65-3e1... [InvalidInputParam]"

#11

The doc says that the removeFilter argument should be a string, not an object, hence the error message you are receiving…
It also says: Valid filter expressions follow the same rules as in FetchSpec#filter, so you could try the following:
PointMeasurement.removeAll(Filter.intersects('parent', testHdrs))
But I’m not sure if the Filter helper function will work here…


#12

You’re right about the filter parameter. I keep misreading the documentation…