
Distributed file system vs distributed file storage

The main goal of a distributed file system is to offer access transparency to its clients – they do not need to know where the content is located.

A network protocol provides block-level access to specific storage locations. The content can reside in different locations, and the user cannot decide where the content, or parts of it, is stored. This “automatic” handling of how and where file content is stored is part of the distributed file system.

Changes such as scaling up or down, or the failure of single network nodes, are all handled by the distributed file system.

There are further requirements as well, such as heterogeneity across different hardware platforms.

There are quite a few products on the market, such as the Windows Distributed File System (DFS) by Microsoft, the Hadoop Distributed File System (HDFS) by Apache, or the Google File System (GFS), and many more.

All these distributed file systems have their advantages and feature differences, but they share one big design disadvantage: if the software of the distributed file system fails, the user has no way to access the content data. Functionality is only available while the whole system runs correctly. In addition, if parts of the network belonging to the distributed file system go down and the failure cannot be compensated by other nodes, content cannot be accessed.

When a certain number of the network nodes that are part of the steering logic fail, they have to be replaced and reconfigured.

 

cloudplan distributed file storage / software-defined storage

The cloudplan solution follows a different approach: all file content is stored using the native file system of each network node. Different hardware platforms can be mixed in one network, and all files can be accessed at any time on the local machines through the file system of the OS.

The client app takes over the work of the distributed design: each client “knows” which nodes hold the content that the node or user has access to. The client decides itself which content nodes to use for reading and writing, based on each node's geo location and prioritization. Clients connect to other nodes automatically if the targeted source nodes fail or are unavailable.
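
cloudplan does not publish its client internals, so the following Python sketch is only an illustration of the selection logic described above, under the assumption that nodes are ranked by geographic distance first and configured priority second; all names here (Node, distance_km, priority, is_reachable) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

@dataclass
class Node:
    """A content node as seen by the client (hypothetical model)."""
    address: str
    distance_km: float  # rough geographic distance from the client
    priority: int       # lower value = preferred node

def rank_nodes(nodes: Iterable[Node]) -> list[Node]:
    # Assumed ordering: prefer nearby nodes first, then configured priority.
    return sorted(nodes, key=lambda n: (n.distance_km, n.priority))

def pick_source(nodes: Iterable[Node],
                is_reachable: Callable[[Node], bool]) -> Optional[Node]:
    # Walk the ranking and fail over to the next node automatically
    # whenever the preferred node does not respond.
    for node in rank_nodes(nodes):
        if is_reachable(node):
            return node
    return None  # no source node reachable at the moment
```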

This design allows unlimited scalability and redundancy. When content is located on more than one node with the same geo location and prioritization, the clients automatically split their connections across the available source nodes.
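
Continuing the hypothetical sketch above, one simple way to split connections is to stripe chunk requests round-robin over all nodes that share the best rank; cloudplan's actual distribution strategy is not documented, so this only shows the general idea (it reuses the assumed Node and rank_nodes from the previous sketch).

```python
from itertools import cycle

def split_requests(nodes: Iterable[Node],
                   chunk_ids: list[str]) -> dict[str, list[str]]:
    """Distribute chunk reads round-robin over all equally ranked nodes."""
    ranked = rank_nodes(nodes)
    if not ranked:
        return {}
    best = (ranked[0].distance_km, ranked[0].priority)
    peers = [n for n in ranked if (n.distance_km, n.priority) == best]
    assignment: dict[str, list[str]] = {}
    for chunk_id, node in zip(chunk_ids, cycle(peers)):
        assignment.setdefault(node.address, []).append(chunk_id)
    return assignment
```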

The nodes can be spread out globally or moved to any new location within a private or public IP network without extra products or configuration. The nodes find each other automatically through firewalls and routers, without a VPN, port forwarding, or other time-consuming setup.
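
The post does not describe the traversal mechanism itself. The standard technique for connecting peers through NATs without port forwarding is hole punching, coordinated by a rendezvous point that tells each peer the other's public address; the simplified sketch below shows that general idea and is not cloudplan's documented implementation (message contents and retry counts are invented for illustration).

```python
import socket

def punch_hole(local_port: int, peer_addr: tuple[str, int]) -> socket.socket:
    """Open a UDP path to a peer behind NAT (simplified hole punching).

    peer_addr is the peer's public (ip, port) as reported by a rendezvous
    point; both sides must call this at roughly the same time, so that each
    NAT sees outbound traffic and keeps its mapping open.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", local_port))
    sock.settimeout(1.0)
    for _ in range(10):                     # a few attempts usually suffice
        sock.sendto(b"punch", peer_addr)    # outbound packet opens our NAT
        try:
            data, addr = sock.recvfrom(64)  # peer's packet arrives once both
            if addr[0] == peer_addr[0]:     # mappings exist
                return sock
        except socket.timeout:
            continue
    raise ConnectionError("hole punching failed; a relay would be needed")
```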

Every node can fail at any time, yet content stays accessible, since the file system in use is not proprietary.


