As blockchain has become the latest frontier for entrepreneurs, projects built around decentralized storage on the blockchain have drawn growing attention.
Tempted by a data storage market on the scale of trillions, many entrepreneurs have set their sights on this opportunity.
Reportedly, many blockchain projects offering decentralized storage are already on the market, including Storj, Sia, Factom, MaidSafe and Genero. The IPFS protocol is the best known: although its design has not been fully implemented, it has already attracted many fans (Read more: Introduction to Technical Framework and Applications of IPFS).
Wang Donglin is one such practitioner, with nearly 10 years in the storage industry. He told EastShore that people who are confused by terminology such as storage certification mechanisms and network protocols should simply look past the thick clouds gathered over decentralized storage projects. “As for storage itself, decentralized storage is still a continuation of classic technology.”
In other words, anyone who reviews the classic data storage practices of the past can readily understand how decentralized storage solutions improve on them, and why the industry believes the era of “blockchain storage surpassing cloud storage” has already arrived.
Revolution and Reform of Cloud Storage
Without storage, there would be no computers.
The history of storage devices goes back to the birth of the first computer. The era of public storage services dates to 2006, when the e-commerce giant Amazon released its S3 storage service.
Because it was easy to use, the cloud infrastructure originally built to serve Amazon itself was adopted by many enterprises with heavy data storage needs, and the cloud storage industry gradually took shape.
After more than a decade of development, cloud storage has grown from a tiny market into a huge one. According to data released by IDC China, China’s cloud management service market reached US$307 million in 2018, up 131.4% year on year. The forecast puts the compound growth rate of the whole market at 70.8% from 2018 to 2023, with the market size jumping to US$4.66 billion by 2023.
And that is China alone. Worldwide, the cloud storage market, led by technology and Internet giants such as Amazon, Microsoft, Google, Aliyun and Tencent, will reach 10 billion. Counting traditional enterprise storage, where established IT giants such as Dell/EMC, NetApp, IBM, HPE, HDS and Huawei operate, the total market is about US$70 billion.
In fact, the first thing cloud storage did was to shake up traditional storage.
Wang Donglin said a friend at a large storage equipment company told him they would turn down “business with a gross margin below 85%, that is, out of $100 in sales revenue only $15 is spent on purchasing hardware.” Compared with traditional storage, cloud storage services are far more attractively priced.
However, although cloud storage providers can draw on deep capital and resources to serve enterprises from multiple data centers around the world, a number of problems have emerged.
The first, of course, is technical.
Here two concepts need to be distinguished: redundancy and fault domain isolation. A centralized storage solution can improve data reliability by making a single system more reliable, for instance by adding more hard drives or more data centers.
“Improving the reliability of a single system eventually runs into bottlenecks. At that point, redundancy and fault domain isolation are needed to improve reliability,” Wang Donglin explained. “Redundancy ensures the data can still be read in full even if some of it is lost. Fault domain isolation confines a fault to a small scope.”
For example, the annual failure rate of hard disks is around 1% (the nominal figure is lower, but real-world figures are slightly higher). “That figure has probably reached its limit; it can hardly go any lower.”
The solution in traditional enterprise storage is to “distribute data across multiple hard drives and tolerate the failure of one (RAID5) or two (RAID6) hard drives without losing any data.”
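As a minimal sketch of the single-failure tolerance described for RAID5, the Python below spreads data over three hypothetical drives plus one XOR parity block and rebuilds whichever block is lost. The function names and block contents are assumptions made for illustration; real RAID5 rotates parity across the drives and operates at the block-device level, which this sketch does not attempt.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (parity not rotated)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, failed_index):
    """Recover the block on the failed drive by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Three hypothetical data drives; the fourth stripe entry is the parity drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = make_stripe(data)

# Pretend drive 1 has failed and rebuild its contents from the others.
recovered = rebuild(stripe, failed_index=1)
print(recovered == data[1])
```

Because XOR is its own inverse, any one missing block, data or parity, equals the XOR of the remaining ones. Tolerating two failures, as RAID6 does, needs a second, independent parity calculation and is not shown here.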
“Continuing along this route leads to distributed storage systems, in which multiple servers at the same location provide redundancy and fault domain isolation at the storage-server level. Cloud storage providers go further and spread data across different cabinets, achieving redundancy and fault domain isolation at the cabinet level,” Wang Donglin said. “However, the reliability of a single data center has now hit a bottleneck as well. The solution is to keep following the same technical route and achieve redundancy and fault domain isolation across different locations.”
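The progression from server to cabinet to location can be pictured as the same placement rule applied at a wider fault domain each time. The sketch below is one simple way to express that rule, not a description of any provider's actual placement logic; the `Node` fields, node names and locations are hypothetical.

```python
from collections import namedtuple

# Hypothetical node description: every node sits in a cabinet at a location.
Node = namedtuple("Node", ["name", "location", "cabinet"])


def place_replicas(nodes, copies, domain="location"):
    """Pick `copies` nodes such that no two share the same fault domain."""
    chosen, used_domains = [], set()
    for node in nodes:
        key = getattr(node, domain)
        if key in used_domains:
            continue  # a replica already lives in this fault domain
        chosen.append(node)
        used_domains.add(key)
        if len(chosen) == copies:
            return chosen
    raise ValueError(f"fewer than {copies} distinct {domain} values available")


nodes = [
    Node("n1", "site-a", "a-cab-1"),
    Node("n2", "site-a", "a-cab-2"),
    Node("n3", "site-b", "b-cab-1"),
    Node("n4", "site-c", "c-cab-1"),
]

# Isolation at the location level: one replica per site.
replicas = place_replicas(nodes, copies=3, domain="location")
print([node.name for node in replicas])
```

Passing `domain="cabinet"` gives the cabinet-level isolation mentioned in the quote; with `domain="location"`, the loss of a whole site removes at most one replica.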
At that point, it has in effect already become “decentralized storage”. Add blockchain incentives, and it evolves into “blockchain storage”.
“The concept of distributed storage has long existed in the storage industry,” Wang Donglin said. As the ideas of decentralization and sharing have spread, new business models such as Airbnb and Uber have become popular. The storage industry can certainly adopt the same approach and make full use of the storage resources of ordinary users.
The blockchain incentive system can play a valuable role as well. It can encourage miners to join and quickly build a huge storage pool spanning the globe; penalize storage nodes that fail to deliver the promised service, which safeguards service quality; and attract more users, which greatly reduces costs.
Data Redundancy, Fault Domain Isolation and Supervision
“Practitioners in the storage industry should hold one core value: data itself is alive, and they must be responsible for the security and reliability of users’ data.” Wang Donglin said decentralized storage solutions should follow this principle too.
He told EastShore that however storage solutions are improved and reinvented, key concerns such as reliability, security, redundancy, cost, availability, data deduplication and DDoS resistance still have to be addressed.
Of these, data reliability, security and cost matter most.
“Compare data to a bank deposit. If the user urgently needs money one day and the bank says the machine is out of service because of a malfunction, that is an availability problem; if the deposit amount is leaked, that is a security problem; if the money is gone, that is a reliability problem.”
It is basic storage common sense that redundancy is required to improve data reliability. “To ensure data reliability, existing cloud storage providers usually store three copies of the data, that is, a data redundancy rate of 300%.”
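A rough back-of-the-envelope calculation shows why three copies are attractive. The sketch below reuses the roughly 1% annual disk failure rate quoted earlier and assumes, as a deliberate simplification, that copies fail independently and that a lost copy is never repaired during the year. Real systems differ on both counts: failures in the same fault domain are correlated, and lost copies are normally rebuilt. The function names are illustrative.

```python
def loss_probability(annual_failure_rate, copies):
    """Chance that every copy fails within a year.

    Assumes independent failures and no repair in between (a simplification).
    """
    return annual_failure_rate ** copies


def redundancy_rate(copies):
    """Bytes stored as a multiple of the original data size (3 copies -> 3.0)."""
    return float(copies)


afr = 0.01  # the roughly 1% annual failure rate cited in the article
for copies in (1, 2, 3):
    print(
        f"{copies} copies: redundancy {redundancy_rate(copies):.0%}, "
        f"loss probability about {loss_probability(afr, copies):.0e}"
    )
```

Under these assumptions, three copies bring the yearly loss probability down to about one in a million, at the price of storing three times the data. The independence assumption is exactly what fault domain isolation tries to make true in practice.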
Besides redundancy, fault isolation is also necessary.
“Fault isolation has long been used to ensure that other data is unaffected even if some is damaged. It is unrealistic, however, to keep reducing the failure rate of hard disks indefinitely,” Wang Donglin said.
The data also needs to be monitored so that it can be reconstructed as soon as problems occur.
“Data redundancy, fault domain isolation, heartbeat supervision and data reconstruction are all things a professional storage service provider has to consider. Even when storage is integrated with the blockchain, all of these remain very important,” said Wang Donglin.
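Heartbeat supervision and data reconstruction fit together roughly as follows: nodes report in periodically, a monitor flags the ones that have gone silent, and every chunk that lost a replica is queued for rebuilding from the copies that remain. The sketch below is a simplified illustration under those assumptions; the class, node and chunk names are hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that went silent."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}

    def heartbeat(self, node, now):
        """Record that `node` reported in at time `now` (seconds)."""
        self.last_seen[node] = now

    def dead_nodes(self, now):
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


def chunks_to_rebuild(placement, dead):
    """Map each chunk that lost a replica to the healthy nodes still holding it."""
    dead = set(dead)
    plan = {}
    for chunk, holders in placement.items():
        healthy = [node for node in holders if node not in dead]
        if len(healthy) < len(holders):
            plan[chunk] = healthy
    return plan


monitor = HeartbeatMonitor(timeout_seconds=30)
for node in ("node-a", "node-b", "node-c"):
    monitor.heartbeat(node, now=0)
monitor.heartbeat("node-a", now=40)
monitor.heartbeat("node-b", now=40)  # node-c stays silent

placement = {
    "chunk-1": ["node-a", "node-b", "node-c"],
    "chunk-2": ["node-a", "node-b"],
}
dead = monitor.dead_nodes(now=50)
print(dead, chunks_to_rebuild(placement, dead))
```

In this run only `chunk-1` needs a new replica, and it can be copied from either of the two surviving holders. In a decentralized network the same loop also has to cope with nodes that report in but do not actually hold the data, which is where storage certification mechanisms come in.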
Cost Performance of Blockchain Storage
Beyond technology, cost also has to be considered.
“Companies used to keep their data on hard disks, so they always worried about disk failures. Once cloud storage arrived, they began storing data on servers, but then they worried about the whole server system failing. Because accidents such as natural disasters can happen, some enterprises began extending their backups from a single location to multiple locations, lest all the data be corrupted at once,” Wang Donglin told EastShore.
Decentralized storage safeguards data to a certain extent, but the question is who pays for all those data centers. “Even Amazon, the world’s largest cloud service provider, has only dozens of nodes worldwide. Centralized organizations still face heavy financial pressure in building data centers.”
The emergence of blockchain incentive systems may change that.
Wang Donglin told EastShore that applying blockchain to cloud storage can effectively lower the barrier to entry for data storage. “With the incentive mechanism, more nodes can participate, which makes the entire decentralized system work more effectively.”
Beyond its incentive system, blockchain is an innovation that can bring a new decentralized architecture to the whole data storage industry. “In the storage industry, the distributed concept used to mean just a few locations. Since blockchain appeared, every user has been encouraged to contribute their own storage devices to the decentralized cloud storage ecosystem.”
The distributed storage of earlier solutions is thus gradually being replaced by decentralized cloud storage. Wang Donglin believes that “whatever stage the storage industry has reached, it is always a natural continuation of the classic solutions.”
Implementation Challenge
According to EastShore’s incomplete statistics, nearly 10 blockchain projects have decentralized storage as their core concept. Although some have not yet been implemented, they have become so popular that cases of mining-machine fraud have already appeared.
Wang Donglin said the relationship between project teams and mining-machine manufacturers is somewhat like that between Google Search and websites. “On the one hand, some people will optimize their websites’ SEO so that Google’s engine can reach their content more easily. On the other hand, Google also wants to give users more accurate, higher-quality content. The two do not conflict at all.”
“In general, the emergence of professional mining machines will indeed have a positive effect on the ecosystem of decentralized storage projects as a whole.” As for whether their performance will live up to the claims, we can only “wait and see”.
Compared with earlier approaches, the advantages of decentralized storage are obvious, but for blockchain projects dedicated to B2B enterprise-level services, real-world implementation will be a major challenge.
What enterprises care about most is whether storage services will get cheaper. They are not yet fully aware of what blockchain storage means.
However, the need for blockchain storage has been made plain by news reports such as “Tencent cloud may suffer from data loss of startup companies due to Aliyun failure.”
But however attractive blockchain becomes, implementation remains the key.
Many documents hyping decentralized storage projects dwell on the original sin of centralized systems and declare that decentralized storage is safer than centralized storage. Wang Donglin believes this is simply wrong. “Without data encryption, a decentralized solution exposes users’ data and is actually less secure.”
The industry also faces another problem, the “dilemma of encryption and deduplication”. According to Wang Donglin, one project has made a breakthrough in this regard: “TruPrivacy technology can achieve zero-knowledge encryption and cross-user deduplication.”
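The dilemma itself is easy to demonstrate. If every user encrypts with their own random key, two uploads of the same file produce different ciphertexts and the storage network cannot deduplicate them. One well-known workaround, convergent encryption, derives the key from the content so that identical files encrypt identically. The sketch below illustrates only that general idea; it is not a description of TruPrivacy, whose design is not covered here, and the hash-based stream cipher is a toy written for this example, not a vetted cipher.

```python
import hashlib
import os


def keystream_xor(key, data):
    """Toy stream cipher: XOR data with a SHA-256 counter-mode keystream.

    Illustration only; applying it twice with the same key decrypts.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def encrypt_random_key(data):
    """Per-user random key: identical files yield different ciphertexts."""
    key = os.urandom(32)
    return key, keystream_xor(key, data)


def encrypt_convergent(data):
    """Key derived from the content: identical files yield identical ciphertexts."""
    key = hashlib.sha256(data).digest()
    return key, keystream_xor(key, data)


document = b"quarterly-report" * 10  # the same file uploaded by two users

_, alice_random = encrypt_random_key(document)
_, bob_random = encrypt_random_key(document)
alice_key, alice_conv = encrypt_convergent(document)
_, bob_conv = encrypt_convergent(document)

print("random keys, ciphertexts match:", alice_random == bob_random)
print("convergent, ciphertexts match:", alice_conv == bob_conv)
print("decrypts correctly:", keystream_xor(alice_key, alice_conv) == document)
```

Plain convergent encryption has a known weakness: anyone who can guess a file's contents can confirm whether that file is stored. That weakness is one reason combining strong privacy with cross-user deduplication is regarded as a dilemma.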
“The industry as a whole has not yet reached the stage of fierce competition. I hope more projects like these can form alliances so that the whole industry can develop together and eventually compete with centralized organizations such as Amazon,” Wang Donglin said with full confidence.