Most NoSQL autoscaling is painful because data has to be migrated during peak load. What if the data were kept on shared storage such as CLVM, which has less overhead than NFS or a shared file system? If each bucket/shard is a separate LVM logical volume, a compute node can mount one or more volumes depending on how many shards it is responsible for. Under high load, a node gives up a few shards (unmounts their volumes) and a newly started node mounts them. This decouples the storage and compute concerns of the database and makes the compute layer horizontally scalable. I know Server Fault doesn't accept open-ended discussions, so suggesting a better forum to post this would also help. Any help understanding the pitfalls of this idea is welcome too.
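Concretely, the handoff I have in mind would look something like the following (the volume group, shard names, and mount points are hypothetical):

```shell
# On the overloaded node: stop serving shard07 and release its volume
umount /srv/shards/shard07
lvchange -an vg_shards/shard07        # deactivate the logical volume

# On the newly added node: take over shard07
lvchange -aey vg_shards/shard07       # activate exclusively on this node
mount /dev/vg_shards/shard07 /srv/shards/shard07
```

The exclusive activation (`-aey`) matters: with clustered LVM only one node may have the volume active at a time, otherwise a non-cluster filesystem on it will be corrupted.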
MongoDB, for instance, has the concept of a replica set for this situation: multiple MongoDB instances serve the same data, and if one fails, the others continue serving it. Shared storage is neither necessary nor desirable for a replica set; each MongoDB instance keeps its own separate copy of the data.
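As a minimal sketch (hostnames, ports, and paths below are placeholders), each member runs with the same replica set name and its own local data directory, and the set is initiated once:

```shell
# On each of the three nodes: same replica set name, local (not shared) dbpath
mongod --replSet rs0 --dbpath /var/lib/mongodb --bind_ip_all --port 27017

# Once, from any node: initiate the set listing all members
mongosh --eval 'rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.com:27017" },
    { _id: 1, host: "db2.example.com:27017" },
    { _id: 2, host: "db3.example.com:27017" }
  ]
})'
```

From that point the members replicate among themselves; no shared block device is involved.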
This is entirely orthogonal to sharding, in which data is split among different MongoDB instances or replica sets.