A Task Self-destruct Mechanism For ECS
In the last post I demonstrated that ECS might terminate our tasks mid-processing if we don’t build the appropriate infrastructure. In this post I’ll attempt to build a mechanism that works as follows:
- ECS sends a ‘terminate task’ signal to the running ECS task (due to a scale-in/deployment).
- The ECS agent sends a ‘SIGTERM’ signal to the running Docker container.
- The signal is caught by the application running inside the Docker container.
- The application checks whether it is currently processing; if not, it exits.
- If it is processing, it waits until processing is completed, then exits.
A sample application might look like this:
import signal

def _sig_handler(self, signum, frame):
    self.should_exit = True

# processing code

# on startup - register to the sigterm
signal.signal(signal.SIGTERM, self._sig_handler)

# process
self.is_processing = True
self.is_processing = False
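The fragments above can be assembled into a small runnable sketch. This is only one possible arrangement, not necessarily the original application: the `Worker` class, the `run` loop, the `process_one` placeholder and the `completed` counter are hypothetical names introduced for illustration, and exiting straight from the handler when idle is an assumption about how the "if not, exits" step is implemented.

```python
import signal
import sys
import time


class Worker:
    """Hypothetical worker that finishes its current job before exiting."""

    def __init__(self):
        self.should_exit = False
        self.is_processing = False
        self.completed = 0  # illustrative counter, not in the original
        # on startup - register to the sigterm
        signal.signal(signal.SIGTERM, self._sig_handler)

    def _sig_handler(self, signum, frame):
        # Remember that a shutdown was requested.
        self.should_exit = True
        # If nothing is in flight, there is nothing to wait for: exit now.
        if not self.is_processing:
            sys.exit(0)

    def process_one(self):
        # Placeholder for the real work (an assumption for this sketch).
        time.sleep(0.1)

    def run(self):
        # Keep taking jobs until a SIGTERM has been seen.
        while not self.should_exit:
            self.is_processing = True
            self.process_one()
            self.completed += 1
            self.is_processing = False
```

With this layout, a SIGTERM that arrives in the middle of `process_one` only sets the flag, so the current job finishes and the loop then ends on its own. A SIGTERM that arrives while the worker is idle exits immediately.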
One more thing to note is that ECS has a timeout for stopping a container (30 seconds by default), so if our processing might take longer than the configured timeout, ECS will forcefully kill the container. To prevent that, we can simply change the ECS agent configuration so the timeout exceeds our maximal processing time:
ECS_CONTAINER_STOP_TIMEOUT=480s
If you’re using a CMD in shell form in your Dockerfile like I did, keep in mind you’ll need to change it to an ENTRYPOINT. This is because a shell-form command is run through `/bin/sh -c`, so the shell process receives the SIGTERM and your code never sees it. To avoid that, use an ENTRYPOINT in exec form (the JSON-array syntax), which starts your process directly without a wrapping shell.
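One rough way to check which situation you are in is to look at the process ID from inside the application. The sketch below assumes the container runs without an init wrapper; the helper name `runs_as_pid_1` is hypothetical. When the application is started directly, it is normally PID 1 in the container. When it is started through a shell, the shell usually holds PID 1 instead.

```python
import os


def runs_as_pid_1():
    """Return True when this process is PID 1, i.e. the container's main process."""
    return os.getpid() == 1


def startup_warning():
    """Return a warning string if SIGTERM may be delivered to a parent process."""
    if runs_as_pid_1():
        return None
    return (
        "Not running as PID 1 (pid=%d): SIGTERM may be delivered to a parent "
        "process such as a shell instead of this application." % os.getpid()
    )
```

Logging the result of `startup_warning()` once at startup is enough to spot the problem in the task logs. Note that this is only a heuristic: if you deliberately run an init process that forwards signals, the application will not be PID 1 and the warning can be ignored.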
How are you protecting your tasks from getting terminated? Have you found an alternative mechanism? I’ll be happy to hear about it!