Cleaning up after killing a running Docker container
My goal is to write a Docker image that runs a Python script producing a lot of CSV files full of random numbers; once the script finishes, the files are written to an external storage drive, after which the container quits. Assume that it writes so many of these CSV files that they cannot all be held in memory.
What I'm worried about are the cases where the container encounters an error and quits (or is made to quit by the user), leaving behind a bunch of garbage files that have to be cleaned up manually.
The first solution is to mount a fast drive (like an SSD) directly into the container and write to it. Once the script is done, it transfers the data from the SSD to the external storage drive. The downside is that if the container quits unexpectedly, it will leave garbage on the SSD.
The second solution is to create a volume backed by the SSD, start a container with this volume, and then do pretty much the same as in the first solution. In this case, if the container dies unexpectedly, what happens to the volume? Will it be cleaned up automatically as well? Can it be configured to be removed automatically, thereby deleting any garbage that was created?
In case you're curious, the final goal is to use these containers with some sort of orchestration system.
Tags: linux, docker
asked Sep 24 at 20:39 – Mr. Fegur
Stack Overflow is a site for programming and development questions. This question appears to be off-topic because it is not about programming or development. See What topics can I ask about here in the Help Center. Perhaps Super User or Unix & Linux Stack Exchange would be a better place to ask. – jww, Sep 25 at 12:58
2 Answers
Accepted answer (score: 1)
What I'm worried about are the cases where the container encounters an error and quits (or is made to quit by the user), leaving behind a bunch of garbage files that have to be cleaned up manually.
Note that you may configure your ENTRYPOINT Python script to automatically perform the necessary cleanup.
To give you some guidelines/examples of that approach:
- I gave one such example (implemented in Bash, namely with a trap) in this SO answer; a minimal sketch of that approach is shown below.
- Another possible example (implemented in Python) is given in this blog article.
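For illustration, here is a minimal sketch of such a trap-based cleanup entrypoint (not the exact script from the linked answer). The paths /scratch (fast SSD mount) and /final/result (external storage mount), and the name /app/generate_csvs.py, are assumptions for the example, not something from the question:

#!/bin/bash
# Minimal sketch: clean up scratch files on normal exit, on errors, and on
# docker stop (SIGTERM). Paths and script name are placeholders.
set -euo pipefail

WORKDIR="$(mktemp -d /scratch/job.XXXXXX)"
worker_pid=""

cleanup() {
    rm -rf "$WORKDIR"   # always remove the scratch files
}

on_term() {
    # docker stop sends SIGTERM to PID 1 (this script): forward it to the
    # worker, wait for it to stop, then exit; the EXIT trap runs cleanup.
    if [ -n "$worker_pid" ]; then
        kill -TERM "$worker_pid" 2>/dev/null || true
        wait "$worker_pid" || true
    fi
    exit 143
}

trap cleanup EXIT       # runs on normal exit and on errors (set -e)
trap on_term TERM INT

# Run the worker in the background so the shell can react to signals.
python3 /app/generate_csvs.py --out "$WORKDIR" &
worker_pid=$!
wait "$worker_pid"

# Publish the results only if generation succeeded.
mkdir -p /final/result
mv "$WORKDIR"/*.csv /final/result/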
Note that beyond the graceful termination of your containers, you may want to set up a restart policy, such as always or unless-stopped. See for example this codeship blog article.
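For instance, a restart policy is passed when the container is started (the image name here is just a placeholder; note that a restart policy cannot be combined with the --rm flag):

docker run -d --restart unless-stopped your_image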
The first solution is to mount a fast drive (like an SSD) directly into the container and write to it. Once the script is done, it transfers the data from the SSD to the external storage drive. The downside is that if the container quits unexpectedly, it will leave garbage on the SSD.
The second solution is to create a volume backed by the SSD, start a container with this volume, and then do pretty much the same as in the first solution. In this case, if the container dies unexpectedly, what happens to the volume? Will it be cleaned up automatically as well?
Although the two solutions you present aren't strictly necessary to address the main question of this thread, I have to mention that in general it is a best practice to use volumes in production, rather than a mere bind-mount. But of course, using either of these two approaches (-v volume-name:/path or a bind-mount -v /path:/path) is far better than not using the -v option at all, because writing data directly to the writable layer of the container means that data will be lost if the container is recreated from the image.
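To illustrate the volume-based approach with a specific fast drive, the built-in local volume driver can bind a named volume to a directory on the SSD. The /mnt/ssd and /mnt/external paths and the image name are placeholders, not taken from the question:

# Named volume whose data lives on an (assumed) SSD mount point.
# /mnt/ssd/scratch must already exist on the host.
docker volume create --driver local \
    --opt type=none --opt o=bind --opt device=/mnt/ssd/scratch \
    scratch-ssd

# Use it alongside a plain bind-mount for the external storage drive.
docker run --rm -v scratch-ssd:/scratch -v /mnt/external:/final/result your_image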
answered Sep 24 at 21:58, edited Sep 24 at 22:04 – ErikMD

Answer (score: 1)
What I'm worried about are the cases where the container encounters an error and quits (or is made to quit by the user), leaving behind a bunch of garbage files that have to be cleaned up manually.
If you write your intermediate files into the container filesystem, rather than to a persistent volume, then Docker can do all the hard work for you. Simply run your container with the remove option (--rm). E.g. if you run:
docker run --rm -v /path/to/external/storage:/final/result your_image
Then your application can write anywhere other than /final/result, and upon exit of the container (whether successful or any error condition), the container will be automatically deleted by the Docker daemon. On successful completion of your task, write your content to /final/result so it is persisted after the container exits. This path is completely made up, and you'll likely want to adjust it for your usage.
Note that if you are running in a desktop environment (Mac/Windows) and not native Linux, there is an issue with the VM disk expanding with usage and not shrinking as files are deleted. This is the nature of VM filesystems that allocate on use, and it is outside of Docker's control. In that scenario, you'd likely want the entire setup running against an external volume, and configure your entrypoint to clean up any temporary files left over from the last run of your container.
answered Sep 24 at 22:45 – BMitch
The reason I didn't go with this option is that I did not know where Docker does its file I/O when left to its own devices. For example, if I have two hard disks, an HDD and an SSD, and I want my Python script to do its file I/O on the SSD for speed, then I don't know how to get Docker to do that using this method. With volumes, I can specify the fast SSD and have my script use it. – Mr. Fegur, Sep 24 at 22:49
@Mr.Fegur the container files will be stored under /var/lib/docker while the container is running. – BMitch, Sep 25 at 9:22
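Following up on that comment: on a native Linux host, Docker's data root (/var/lib/docker by default), which holds the container writable layers, can itself be moved to the fast drive via the daemon's data-root setting. A minimal sketch, assuming a systemd-based host, the hypothetical path /mnt/ssd/docker, and no pre-existing /etc/docker/daemon.json to merge with:

sudo mkdir -p /mnt/ssd/docker
# Writes a fresh daemon.json; merge the "data-root" key instead if one already exists.
echo '{ "data-root": "/mnt/ssd/docker" }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker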