persistent volume of type "MOUNT" suddenly disappear #438

Open
julienlau opened this issue Apr 24, 2017 · 2 comments


julienlau commented Apr 24, 2017

I run a cluster with 3 dedicated Cassandra nodes (dcos 3.0.10 - 25) using placement constraints and labels.
Each node has 4 disks available, and I want to use /dev/sdd1 entirely for Cassandra data.
sdd1 is 1 TiB, so I reserve 978000 MiB on Mesos via Marathon to ensure that the Cassandra data will be alone on this partition.
The commitlog max size is 20 GiB.
The cluster runs DC/OS 1.8.7 on Azure.
It sometimes happens that after running correctly for a while (several weeks the first time, 3 days the second), the mount points become messy and the cluster loses all its data after the disk is unmounted.
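For reference, the data disk was prepared the standard DC/OS way, roughly like this (a sketch from memory; the mkfs options are approximate):

```sh
# Sketch: expose /dev/sdd1 as a Mesos MOUNT disk on a DC/OS agent.
sudo mkfs.ext4 /dev/sdd1
sudo mkdir -p /dcos/volume0
sudo mount /dev/sdd1 /dcos/volume0
# DC/OS discovers filesystems mounted at /dcos/volume<N> and advertises
# them as MOUNT disk resources the next time the agent starts.
```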

In the DC/OS Cassandra log I get:
INFO 05:50:56 Redistributing index summaries
WARN 06:34:50 Writing large partition lri/modis_product_status:MYD09GA (160373299 bytes to sstable /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume/data/lri/modis_product_status-43287da0266c11e7aa19cb05e35a70d4/mc-38-big-Data.db)
WARN 06:35:12 Writing large partition lri/modis_product_status:MYD09GQ (160275001 bytes to sstable /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume/data/lri/modis_product_status-43287da0266c11e7aa19cb05e35a70d4/mc-38-big-Data.db)
WARN 06:35:32 Writing large partition lri/modis_product_status:MOD09GA (159282714 bytes to sstable /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume/data/lri/modis_product_status-43287da0266c11e7aa19cb05e35a70d4/mc-38-big-Data.db)
WARN 06:35:51 Writing large partition lri/modis_product_status:MOD09GQ (159777110 bytes to sstable /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume/data/lri/modis_product_status-43287da0266c11e7aa19cb05e35a70d4/mc-38-big-Data.db)
INFO 06:49:08 Redistributing index summaries
INFO 07:47:22 Redistributing index summaries
INFO 08:47:07 Redistributing index summaries
INFO 09:47:05 Redistributing index summaries
ERROR 10:06:52 Stopping gossiper
WARN 10:06:52 Stopping gossip by operator request
INFO 10:06:52 Announcing shutdown
INFO 10:06:52 Node /10.253.3.136 state jump to shutdown
ERROR 10:06:54 Stopping native transport
INFO 10:06:54 Stop listening for CQL clients
ERROR 10:06:55 Failed managing commit log segments.
Commit disk failure policy is stop; terminating thread
org.apache.cassandra.io.FSWriteError: java.nio.file.NoSuchFileException: /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume/commitlog/CommitLog-6-1492762195831.log
	at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:163) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at org.apache.cassandra.db.commitlog.MemoryMappedSegment.<init>(MemoryMappedSegment.java:47) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at org.apache.cassandra.db.commitlog.CommitLogSegment.createSegment(CommitLogSegment.java:124) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:122) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-3.0.10.jar:3.0.10]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.nio.file.NoSuchFileException: /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume/commitlog/CommitLog-6-1492762195831.log
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[na:1.8.0_121]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[na:1.8.0_121]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[na:1.8.0_121]
	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[na:1.8.0_121]
	at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[na:1.8.0_121]
	at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[na:1.8.0_121]
	at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:158) ~[apache-cassandra-3.0.10.jar:3.0.10]
	... 5 common frames omitted
INFO 10:47:03 Redistributing index summaries

And if I look in journalctl I get:
mount -l | grep "^/dev"
/dev/sda1 on / type ext4 (rw,relatime,discard,data=ordered) [cloudimg-rootfs]
/dev/sdb1 on /var/lib/docker type ext4 (rw,relatime,data=ordered)
/dev/sdc1 on /var/lib/mesos type ext4 (rw,relatime,data=ordered)
/dev/sdd1 on /dcos/volume0 type ext4 (rw,relatime,data=ordered)
/dev/sdd1 on /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/8b2debbc-29a2-4fdf-8760-993266ec2db7-0007/executors/node-0_executor__3cd0eefc-f84a-40c1-abe7-ca394596a4bf/runs/832b8072-69f5-4b24-a02d-cf07409e18dc/volume type ext4 (rw,relatime,data=ordered)
/dev/sdd1 on /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume type ext4 (rw,relatime,data=ordered)

First problem: the Cassandra scheduler mounted two different tasks on my data disk and did not properly remove the old mount point. Both volume directories are now empty:
[coretech@dcos-cassandra-363418932]> ll /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/8b2debbc-29a2-4fdf-8760-993266ec2db7-0007/executors/node-0_executor__3cd0eefc-f84a-40c1-abe7-ca394596a4bf/runs/832b8072-69f5-4b24-a02d-cf07409e18dc/volume
drwxr-xr-x 3 root root 4.0K Apr 20 15:46 ..
[coretech@dcos-cassandra-363418932]> ll /var/lib/mesos/slave/slaves/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-S3/frameworks/6a98c117-bd03-452c-87f7-8bcc3dd20aaa-0002/executors/node-2_executor__30cb97c7-e797-43de-a948-159a30338595/runs/f64776d9-2d4b-47c2-893e-eed7b33e73a3/volume
drwxr-xr-x 7 root root 4.0K Apr 21 08:09 ..
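A check along these lines (a sketch; not the exact command I ran) shows the duplicate mounts directly:

```sh
# List every target where the data disk is currently mounted; anything
# beyond /dcos/volume0 plus one task volume is a leftover stale mount.
findmnt -rn -o TARGET --source /dev/sdd1
```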

Second problem: I do not understand why it worked for 3 days and then stopped working (the node is stuck in the DRAINING state).

Any help appreciated.
Regards


verma7 commented May 15, 2017

Disclaimer: I am not affiliated with Mesosphere. I work at Uber, where we use the open-source framework with Apache Mesos (without DC/OS).

We found this bug in production a month ago, and a few of our persistent volumes were deleted. Thankfully we didn't lose any customer data because we had 3 replicas. I debugged the issue and found the cause: a stale mount left in the sandbox directory, followed by the sandbox garbage collector cleaning the sandbox directory, along with the still-mounted persistent volume, after a week.
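The mechanism in miniature (a hypothetical illustration with made-up paths; do not run this anywhere near real data):

```sh
# A persistent volume is bind-mounted into the task sandbox:
mkdir -p /data/volumes/pv-0 /tmp/sandbox/volume
sudo mount --bind /data/volumes/pv-0 /tmp/sandbox/volume
# The unmount is missed when the task ends (the stale mount), so a
# week later the sandbox GC removes the whole sandbox tree:
sudo rm -rf /tmp/sandbox   # recurses through the live mount point and
                           # deletes the persistent volume's contents too
```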

We filed a critical bug, https://issues.apache.org/jira/browse/MESOS-7366, and the Mesos team urgently released a hotfix. Please upgrade to one of the fixed versions ASAP.

While we were upgrading all our agents, we set up alerts for whenever the same directory is mounted more than once on any agent; the on-call then manually unmounts the older mount point.
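The check itself can be very small; something like this (a sketch of the idea, not our exact alert):

```sh
#!/bin/sh
# Flag any block device mounted at more than one target on this agent.
# (Tune the threshold: on DC/OS a MOUNT disk is also mounted at its
# /dcos/volume<N> discovery path, so two mounts can be normal there.)
findmnt -rn -o SOURCE,TARGET | grep '^/dev/' | awk '
    { targets[$1] = targets[$1] "\n    " $2; count[$1]++ }
    END {
        for (dev in count)
            if (count[dev] > 1) { print dev " mounted " count[dev] " times:" targets[dev]; bad = 1 }
        exit bad
    }
'
```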

cc @mohitsoni @triclambert @gabrielhartmann: you should probably send an email to, or force-upgrade, everyone running older versions of Mesos, because they will be affected by this bug, which leads to data loss.

@triclambert

This repo is deprecated and will be archived in one week. Please see the latest version of Cassandra or DSE for DC/OS:

https://docs.mesosphere.com/service-docs/cassandra/
https://docs.mesosphere.com/service-docs/dse/ (enterprise-only)
