About Docker and Storage

Docker seems to be all the rage these days. Everyone seems to be running around integrating it, building things on top of it, and generally giving it great press. It is no surprise, then, that I decided to look into what this is all about.

The one bit of information I found somewhat less frequently discussed is where everything gets stored.

Storage is important. Disk partitioning is the first task any OS installer puts you through, and even before that, an experienced sysadmin pays great attention to what kind of storage devices and channels go into a server. Data storage decisions have a great effect on how your system ends up performing, how robust it is, and how easy it is to back up and repair when it breaks. Bad storage decisions tend to be hard to fix, necessitating large data transfers and long downtimes. Indeed, allowing a sysadmin to fix bad storage decisions is where LVM, Veritas Volume Manager and other storage virtualization tools come from.

When it comes to Docker, it seems that by default everything lives under “/var/lib/docker”. And I mean everything! Not only are all the containers there, with all the applications, databases and OS libraries that go inside them, but Docker’s own metadata graph database is there as well.

All this leads to the conclusion that you need one hell of a filesystem available in “/var/lib/docker”. This is a trap for the unwary sysadmin who puts too small a filesystem in “/var”, thinking only logfiles and caches go there. I am very surprised this doesn’t seem to be mentioned anywhere on Docker’s otherwise very well designed and curated website.
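To see how much room a host actually has for this, here is a minimal Python sketch that reports which filesystem backs a path and how much space it has. The helper names `find_mount_point` and `storage_report` are my own, and the fallback to the nearest existing parent directory is an assumption I made so the check also works before Docker is installed.

```python
import os
import shutil


def find_mount_point(path):
    """Walk up from path until we reach a directory that is a mount point."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


def storage_report(path):
    """Return the mount point backing path and its capacity in bytes."""
    # If the path does not exist yet, probe its nearest existing ancestor.
    probe = os.path.realpath(path)
    while not os.path.exists(probe):
        probe = os.path.dirname(probe)
    usage = shutil.disk_usage(probe)
    return {
        "path": path,
        "mount_point": find_mount_point(probe),
        "total_bytes": usage.total,
        "free_bytes": usage.free,
    }


if __name__ == "__main__":
    report = storage_report("/var/lib/docker")
    gib = 1024 ** 3
    print("path:        %s" % report["path"])
    print("mount point: %s" % report["mount_point"])
    print("total:       %.1f GiB" % (report["total_bytes"] / gib))
    print("free:        %.1f GiB" % (report["free_bytes"] / gib))
```

If the mount point comes back as “/” or “/var”, then containers, images and Docker’s metadata are all competing with the rest of the system for the same filesystem.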

Docker doesn’t seem to provide many tools for managing container storage. You can use the “-g” option of the docker daemon to place everything somewhere other than “/var/lib/docker”, but there doesn’t seem to be a way to request that certain containers get stored in certain places. One tool that is available to sysadmins is the so-called “docker volumes”, which allow one to link container directories to outside host directories.

Some blog posts suggest utilizing volumes in the so-called “data-only container pattern” for the sake of portability, but this defeats the purpose of moving data outside of the “/var/lib/docker” directory. I would suggest sysadmins take care to use volumes for moving things such as database files to places where better performance can be delivered, or where operations such as snapshotting and backup can be performed.
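As a rough illustration of the volume approach, the Python sketch below assembles a “docker run” command line that maps host directories into a container with the “-v host_dir:container_dir” option. The function name `docker_run_command`, the image name and both paths are hypothetical values I picked for the example, so substitute whatever fits your own setup.

```python
import shlex


def docker_run_command(image, volumes, detach=True):
    """Build the argv for 'docker run' with host directories mapped as volumes.

    volumes maps an absolute host directory to a directory in the container.
    """
    cmd = ["docker", "run"]
    if detach:
        cmd.append("-d")
    for host_dir, container_dir in volumes.items():
        # A host path must be absolute to be treated as a host directory.
        if not host_dir.startswith("/"):
            raise ValueError("host path must be absolute: %r" % host_dir)
        cmd += ["-v", "%s:%s" % (host_dir, container_dir)]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Hypothetical example: keep database files on a dedicated fast filesystem.
    cmd = docker_run_command(
        "postgres",
        {"/srv/fastdisk/pgdata": "/var/lib/postgresql/data"},
    )
    # Print the command for review instead of running it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The sketch only prints the command. Passing the returned list to `subprocess.run` would start the container, with the database files living on the host filesystem of your choice rather than under “/var/lib/docker”.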

Digging further into the issue of storage in Docker, I finally came across this blog post detailing how data storage is implemented when Docker runs on top of RHEL/CentOS/Fedora. Following that, it seems to me the best approach is to create a small dedicated metadata filesystem mounted at “/var/lib/docker”, and then create dedicated raw volumes linked to “/var/lib/docker/devicemapper/devicemapper/data” and “/var/lib/docker/devicemapper/devicemapper/metadata”.
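To check which situation a given host is in, a small Python sketch like the following can report whether those two paths are plain files or links to block devices. The helper name `describe_backing` and its result strings are my own invention, and this only inspects the paths; it is not a statement about how Docker itself sets them up.

```python
import os
import stat

# The two paths named above.
DEVICEMAPPER_PATHS = [
    "/var/lib/docker/devicemapper/devicemapper/data",
    "/var/lib/docker/devicemapper/devicemapper/metadata",
]


def describe_backing(path):
    """Classify path as missing, a regular file, a block device, or a link."""
    if not os.path.lexists(path):
        return "missing"
    is_link = os.path.islink(path)
    try:
        mode = os.stat(path).st_mode  # follows symlinks to the target
    except OSError:
        return "broken symlink"
    if stat.S_ISBLK(mode):
        kind = "block device"
    elif stat.S_ISREG(mode):
        kind = "regular file"
    else:
        kind = "other"
    return "symlink to %s" % kind if is_link else kind


if __name__ == "__main__":
    for path in DEVICEMAPPER_PATHS:
        print("%s: %s" % (path, describe_backing(path)))
```

A result of “regular file” would suggest the data still lives inside the “/var/lib/docker” filesystem, while “symlink to block device” is what I would expect to see after linking in dedicated raw volumes as described above.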
