Killing long running actions


#1
  1. What is the difference between the types Action and ClusterAction?
  2. What is the difference between the APIs interrupt and stop?
  3. When I try to stop/interrupt some long running actions, I sometimes see a response like "Could not find the action: <action-id>". In the past I recall getting around this by tunneling (c3Tunnel(<host-id>)) to the host that is running the action and then retrying the stop/interrupt. Is that the suggested approach, and why do I need to do it?
  4. If after tunneling and running stop/interrupt I get a:
500 response with invalid data (cannot parse JSON) -
org.apache.http.conn.HttpHostConnectException: Connect to <host> [<host>] failed: Connection refused

what is my next course of action?


#2
  1. Action is the type that contains information about the action itself, whereas ClusterAction also contains information about the host running that action.

  2. interrupt - attempts graceful shutdown of the action
    stop - kills the thread running the action

  3. Yes, currently you need to c3Tunnel("IP_ADDRESS:8080") and then call Action.stop/interrupt.

  4. If you get that error after tunneling, it's possible that the tunnel has not been formed correctly. Try another simple action to check that a response comes back. If it fails with the same error, be sure to tunnel using the IP address with :8080.
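
The tunnel-then-stop sequence from points 3 and 4 could be sketched roughly as follows. This is only an illustration under stated assumptions: `tunnel` stands in for `c3Tunnel` and `actionApi` for the `Action` type, both passed in so the snippet also runs outside a C3 console. `withPort` and `stopViaTunnel` are hypothetical helper names, and the argument list of `interrupt` is assumed to match the `stop(action, reason)` signature.

```javascript
// Hypothetical helper: ensure the tunnel target carries an explicit port.
// 8080 is the port mentioned in this thread; it may differ in other setups.
function withPort(host, port = 8080) {
  return /:\d+$/.test(host) ? host : `${host}:${port}`;
}

// Sketch: tunnel to the worker, then ask for a graceful interrupt,
// or kill the thread with stop when `force` is true.
// Assumption: interrupt takes the same (action, reason) arguments as stop.
function stopViaTunnel(tunnel, actionApi, host, action, reason, force = false) {
  tunnel(withPort(host));
  return force
    ? actionApi.stop(action, reason)
    : actionApi.interrupt(action, reason);
}
```

In a console this might be called as `stopViaTunnel(c3Tunnel, Action, "IP_ADDRESS", action, "stuck")`, trying the graceful path first and passing `true` as the last argument only if that does not help.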


#3

Thanks, the :8080 was the key that I was missing. After doing that I still see:
Could not find the action: <action_id>
when I try to stop the action, is there anything else I can try to kill this action?


#4

How about pausing the queues that are triggering this action and subsequently clearing them, i.e. killing it at the source?


#5

I'm not clear on specifically what you mean or how that would help.

@rohit.sureka
This action persists even after trying all of the above, possibly because it's stuck in a waiting state. How can I kill it?

{
  "type": "ClusterAction",
  "host": "xxx-w-004.c3-prod.internal",
  "id": "1136.-1455050460",
  "target": "xxx/JmsDataLoadQueue?action=dispatchDataLoad",
  "status": "Waiting",
  "age": "84:27:07.106",
  "elapsed": 304027106
}
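
For reference, the `age` and `elapsed` fields in this record describe the same duration, with `elapsed` in milliseconds: 84 h 27 min 7.106 s is 304,027,106 ms. A small sketch for spotting such long-waiting records in a list is below; `ageToMillis` and `isStuckWaiting` are hypothetical helper names, and the threshold is an arbitrary choice.

```javascript
// Convert an age string of the form "HH:MM:SS.mmm" into milliseconds.
function ageToMillis(age) {
  const [hours, minutes, seconds] = age.split(':').map(Number);
  return Math.round(((hours * 60 + minutes) * 60 + seconds) * 1000);
}

// True when a record is in the Waiting state and older than the threshold.
// Field names (status, elapsed) follow the JSON record shown above.
function isStuckWaiting(record, thresholdMs) {
  return record.status === 'Waiting' && record.elapsed >= thresholdMs;
}

const record = {
  type: 'ClusterAction',
  id: '1136.-1455050460',
  status: 'Waiting',
  age: '84:27:07.106',
  elapsed: 304027106,
};

const oneDayMs = 24 * 60 * 60 * 1000;
console.log(ageToMillis(record.age) === record.elapsed); // true
console.log(isStuckWaiting(record, oneDayMs));           // true
```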

#6

@DavidT How can one kill actions in the Waiting state? The above method of tunneling to the host does not work


#7

@caljep for all of the below you need to be tunneled to the worker:

a) try Action.stackTrace: function(action: Action): string to make sure that the action is really running. The stack trace could also help identify why it is stuck

b) alternatively, pass true as the last argument (byThread) to Action.dump: function(tenant: string, tag: string, byThread: boolean): [Action]. If you don’t see the action in that output, it is stuck in the queue (this should never happen as long as there are available threads, but bugs do happen)

c) as a last resort, use Action.stop: function(action: Action, reason: string): string to kill the thread the action is running on. Again, this should be a last resort, as resources allocated by that thread will be orphaned
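
The escalation order in a) to c) could be sketched roughly like this. It is a sketch, not a definitive procedure: `actionApi` stands in for the `Action` type in a console already tunneled to the worker, `diagnoseAndStop` is a hypothetical name, and two things are assumed rather than confirmed: that `stackTrace` throws when the action cannot be found, and that the entries returned by `dump` carry an `id` field that can be compared with the action's id.

```javascript
// Sketch of the a) -> b) -> c) escalation, with the API passed in as a parameter.
function diagnoseAndStop(actionApi, action, tenant, tag, reason, allowStop = false) {
  const report = { stackTrace: null, seenInThreadDump: false, stopResult: null };

  // a) A stack trace shows whether the action is really running, and where.
  try {
    report.stackTrace = actionApi.stackTrace(action);
  } catch (err) {
    report.stackTrace = null; // assumption: the call throws if the action is not found
  }

  // b) Dump by thread (byThread = true); an action missing here may be stuck in the queue.
  const onThreads = actionApi.dump(tenant, tag, true);
  report.seenInThreadDump = onThreads.some((a) => a.id === action.id);

  // c) Last resort: kill the thread, only when explicitly allowed
  //    and the action was actually found on a thread.
  if (allowStop && report.seenInThreadDump) {
    report.stopResult = actionApi.stop(action, reason);
  }
  return report;
}
```

Keeping `allowStop` off by default reflects the advice above that stopping should come last, since resources allocated by the killed thread are orphaned.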