Introduction to Technical Framework and Applications of IPFS




IPFS stands for InterPlanetary File System, a project launched by Juan Benet in May 2014. Benet's background is remarkable: he graduated from Stanford University, and his first company was acquired by Yahoo before he started the IPFS project. He then took IPFS through the Y Combinator incubator program, which won him substantial investment, and founded Protocol Labs, which has 14 core developers and hundreds of code contributors in the community.

 

What is IPFS?

The apps we use every day mostly rely on the HTTP protocol, which runs on top of TCP/IP and transfers hypertext data from a server to the local browser. The browser or app renders the data and presents it to the user. This network environment gave rise to the client-server (C/S) and browser-server (B/S) architectures on which large Internet companies such as Baidu, Alibaba and Tencent deliver their services.


The network services provided by the Internet platform have undergone an iterative process of three modes as follows:

The first mode is centralization. For example, the early ticketing website 12306.cn had only one central group of servers. When all ticket-buying traffic lands directly on that group, it comes under enormous load.

The second mode is the decentralized cluster. During the fiercely competitive O2O boom, for example, companies set up service groups in different regions; the IDC (Internet Data Center) behind them spreads the same service across machines within a LAN, which relieves pressure on the central servers.

Of course, the first two modes have some drawbacks:

  1. In the former, services depend heavily on the central network, and neither big companies nor startups can afford downtime. Operations teams are measured against a KPI called the SLA (service-level agreement): availability below 99.9% is basically regarded as unacceptable. Meeting an SLA is expensive, and big companies also have to hire professional operators to keep the system stable.
  2. In the latter, there is a risk of losing stored data. People often joke that a cable may be cut, or that an employee may delete the database and run away, but these are real hidden risks.

Both modes also require expensive bandwidth, and much of it is wasted. For example, the first audition episode of the show The Rap of China reached 1 billion views. If the video file is 1 GB, serving all of those views takes 1,000 PB of traffic. At $0.001 per GB of bandwidth, iQIYI would have to pay its ISP (Internet Service Provider) $1 million for a single episode.
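The estimate can be reproduced with a few lines of arithmetic. This is only a sketch using the article's own figures; the decimal unit conversion (1 PB = 1,000,000 GB) is an assumption.

```python
# Back-of-the-envelope bandwidth cost, using the figures quoted in the text.
VIEWS = 1_000_000_000        # 1 billion plays
FILE_SIZE_GB = 1             # size of one video file, in GB
PRICE_PER_GB_USD = 0.001     # bandwidth price assumed in the text
GB_PER_PB = 1_000_000        # assumption: decimal units (1 PB = 10^6 GB)

total_gb = VIEWS * FILE_SIZE_GB          # every play transfers the whole file
total_pb = total_gb / GB_PER_PB
cost_usd = total_gb * PRICE_PER_GB_USD

print(f"traffic: {total_pb:,.0f} PB, cost: ${cost_usd:,.0f}")
```

The model assumes every view downloads the full file from the origin, which is exactly the duplication a content-addressed, peer-to-peer network tries to avoid.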

IPFS is essentially a protocol and network designed to create a content-addressable, peer-to-peer method of storing and sharing hypermedia in a distributed file system. Its purpose is to complement or even replace the Hypertext Transfer Protocol (HTTP) used over the past 20 years, and so to build a faster, safer and more open Internet.


IPFS is expected to become the third mode.

  • IPFS aims to create a peer-to-peer network topology, which overturns the distribution relationship embodied in HTTP. Because it is content-addressed, it derives a unique hash identifier from the content of each file, so identical files share one identifier and storage space is saved.
  • The domain-name addressing used by HTTP is ultimately mapped to a host at the IP address behind the domain name, and to a file in a directory on that host. It does not care whether the same file already exists elsewhere. The content addressing of IPFS instead accesses data by its unique ID and first checks whether that ID has already been stored. If it has, the data is read directly from other nodes without being stored again, which saves space.
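The deduplication effect of content addressing can be sketched with a toy in-memory store. This is a minimal illustration, not how IPFS is implemented: the `ContentStore` class is hypothetical, and plain SHA-256 hex digests stand in for real IPFS identifiers, which are encoded differently.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is the hash of the bytes."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        # The identifier depends only on the content, not on a name or location.
        content_id = hashlib.sha256(data).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self._blocks.setdefault(content_id, data)
        return content_id

    def get(self, content_id: str) -> bytes:
        return self._blocks[content_id]

    def __len__(self) -> int:
        return len(self._blocks)


store = ContentStore()
id_a = store.add(b"same movie bytes")
id_b = store.add(b"same movie bytes")   # duplicate upload, no extra storage
id_c = store.add(b"different bytes")

print(id_a == id_b, len(store))
```

Two uploads of the same bytes yield the same ID and occupy one slot, whereas a path-based scheme would happily keep both copies.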

Take a concrete example. Suppose I want to watch the movie Pacific Rim and Xiao Ming happens to have downloaded it already. He starts an IPFS node and adds the video file to the IPFS network. He gets back a hash fingerprint b, and the file becomes available through a public gateway under the path /ipfs/b.

Once he tells me the hash fingerprint and the path, all I need to do is start a local node and request that path from the gateway. IPFS automatically looks up the hash in the distributed hash table to find the list of nodes holding content with fingerprint b.

A large video file is usually not held by a single node; its pieces may be stored on several other nodes. IPFS connects to these nodes in parallel, and the local node reassembles the pieces into the complete file. Fetching from several nodes in parallel is much faster than downloading the whole file from one source, so I can watch the movie in my local browser sooner and then share it with others.
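The split, parallel fetch and reassembly described above can be sketched as follows. Everything here is illustrative: the tiny `CHUNK_SIZE`, the two in-memory "peers" and the `fetch` helper are hypothetical stand-ins for real network nodes.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # hypothetical and tiny on purpose; real blocks are far larger


def split(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Cut the file into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_id(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


# Publisher side: split the file and record the ordered list of block hashes.
movie = b"pacific rim, the whole movie"
blocks = split(movie)
manifest = [block_id(b) for b in blocks]

# Hypothetical peers, each holding only part of the file.
peers = [
    {block_id(b): b for b in blocks[0::2]},
    {block_id(b): b for b in blocks[1::2]},
]


def fetch(wanted_id: str) -> bytes:
    """Ask each peer in turn and verify the block against its hash."""
    for peer in peers:
        block = peer.get(wanted_id)
        if block is not None and block_id(block) == wanted_id:
            return block
    raise KeyError(wanted_id)


# Downloader side: fetch blocks concurrently; map() keeps manifest order.
with ThreadPoolExecutor(max_workers=4) as pool:
    fetched = list(pool.map(fetch, manifest))

restored = b"".join(fetched)
print(restored == movie)
```

Because each block is checked against its hash, the downloader does not need to trust any individual peer, only the manifest.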

 

IPFS Framework

IPFS is built from a stack of at least eight sub-protocols: identity, network, routing, exchange, objects (Merkle DAG), files, naming, and application. Each layer performs its own function and works together with the others.


The identity layer and the routing layer

The identity layer and the routing layer can be explained together. Both the identity of a peer node and the routing rules come from the Kademlia protocol (KAD). KAD essentially builds a loosely organized distributed hash table (DHT). Everyone who joins the DHT network must first generate an identity, and then uses that ID to store resource information and the contact information of other members in the network. It is like sharing WeChat contact cards: if you cannot find someone by searching for their WeChat ID directly, you can ask a friend to share their card with you.
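A minimal sketch of the Kademlia idea follows: node IDs and content keys live in the same hash space, and "closeness" is the XOR of two IDs. The peer names and key material are placeholders, and a real DHT keeps routing buckets and talks over the network rather than sorting a local list.

```python
import hashlib


def node_id(public_key: bytes) -> int:
    """Derive a node ID by hashing the public key (placeholder bytes here)."""
    return int.from_bytes(hashlib.sha256(public_key).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: bitwise XOR, read as an integer."""
    return a ^ b


def closest_peers(target: int, known: list, k: int = 2) -> list:
    """Return the k known peers whose IDs are nearest to the target key."""
    return sorted(known, key=lambda peer: xor_distance(peer, target))[:k]


# Hypothetical peers identified by made-up key material.
peers = [node_id(f"peer-{i}".encode()) for i in range(8)]

# A content key is hashed into the same ID space as the nodes.
key = int.from_bytes(hashlib.sha256(b"some content").digest(), "big")
nearest = closest_peers(key, peers)

print([hex(p)[:10] for p in nearest])
```

Asking the peers nearest to a key, who in turn know peers nearer still, is the "ask a friend for the contact card" step from the analogy.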

The network layer

The network layer is the core. Built on libp2p, it can run over any transport-layer protocol. It also has to deal with NAT: NAT lets the devices on an intranet share one external IP address, as the routers we have at home do, so nodes behind such a router need NAT traversal before other peers can reach them.

The exchange layer

The exchange layer resembles a BT (BitTorrent-style) download tool such as Thunder. In fact, Thunder simulates a P2P network around a central server. When a user registers a request for some resource, the server gathers the other users requesting the same resource into a small swarm that shares the data. The drawback is that the server is maintained by Thunder: once it fails or goes down, the download fails too.

A centralized service can also restrict certain download requests. BitTorrent was invented as a smarter method: the record of which data each seed node stores is kept in a hash table. BT tools are relatively free from supervision, so the service is more stable.

The IPFS team then improved on BitTorrent and called the result BitSwap, which adds a credit system to encourage nodes to share. It is inferred that Filecoin is most probably based on BitSwap. Users who add data in BitSwap raise their credit, and the more they share, the higher it goes. If users only retrieve data without storing any, their credit falls, and other nodes prefer peers with higher credit when choosing whom to serve.

This design resists the Sybil attack, because credit cannot be inflated by repeated refreshing: a node that keeps sending search requests only loses more credit. A precise function relates the amount a peer requests to the amount it stores. The curve is tolerant in the early stage and stops trusting the peer once a threshold is passed.
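One concrete form of such a curve is the strategy described in the IPFS whitepaper, sketched below. The formula and its constants come from that paper rather than from this article, and they describe only one possible strategy.

```python
import math


def debt_ratio(bytes_sent: int, bytes_recv: int) -> float:
    """How much we have sent to a peer per byte it sent us (+1 avoids /0)."""
    return bytes_sent / (bytes_recv + 1)


def send_probability(r: float) -> float:
    """Chance of serving a peer whose debt ratio is r.

    Sigmoid from the IPFS whitepaper: close to 1 for small r,
    then falling quickly once r passes roughly 2.
    """
    return 1 - 1 / (1 + math.exp(6 - 3 * r))


for sent, received in [(0, 0), (1_000, 1_000), (2_000, 999), (4_000, 999)]:
    r = debt_ratio(sent, received)
    print(f"debt ratio {r:.2f} -> send probability {send_probability(r):.3f}")
```

A peer that only takes sees its debt ratio climb and its chance of being served collapse, which is why flooding the network with requests does not pay.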

The objects and files layers

The objects and files layers should be discussed together. They manage about 80% of the data structures in IPFS. Most data objects are stored as a Merkle DAG, which makes content addressing and deduplication straightforward. The files layer is a further set of data structures defined alongside the DAG; it adopts Git-like structures to support version snapshots.
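A toy Merkle DAG shows why this structure gives deduplication and cheap snapshots. The node layout (a `data` field plus a list of child hashes, serialized as JSON) and the `put` helper are assumptions made for illustration, not the actual IPFS object format.

```python
import hashlib
import json

store = {}  # hash -> node; stands in for block storage


def put(data: bytes = b"", links=()) -> str:
    """Store a node and return its hash, which covers data and child hashes."""
    node = {"data": data.hex(), "links": list(links)}
    encoded = json.dumps(node, sort_keys=True).encode()
    digest = hashlib.sha256(encoded).hexdigest()
    store[digest] = node
    return digest


intro = put(b"intro")
scene = put(b"scene")

# Version 1 references the same scene block twice; it is stored once.
v1 = put(links=[intro, scene, scene])

# Version 2 changes one block; the unchanged blocks are shared with v1.
v2 = put(links=[intro, put(b"new scene"), scene])

print(len(store), v1 != v2)
```

Each version is just a new root hash pointing at mostly shared children, much like a Git commit pointing at a tree.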

The naming layer

The naming layer is self-certifying. When other users fetch an object, they verify it with the publisher's public key and check that this public key matches the publisher's NodeId. This confirms that the object was really released by that user and lets them read its current mutable state. On top of this, the IPNS design lets the hash-based names of DAG objects be given definable, more readable names.

The application layer

Finally, there is the application layer. The core value of IPFS lies in the applications that run on it. They can use its CDN-like behavior to fetch the data they need at low bandwidth cost, which improves the efficiency of the whole application.

 

Application Significance of IPFS

Firstly, it gives content creators a degree of freedom. A typical application is Akasha, a social blogging platform based on Ethereum and IPFS. Blog content created by users is published through the IPFS network instead of a central server. Users are bound to Ethereum wallet accounts, so readers can reward high-quality content with ETH and creators can earn ETH, a kind of “brain mining”. Without heavy regulatory restrictions or commissions, the proceeds go directly to the creators.


Secondly, storage and bandwidth costs can be reduced. One successful video project is “DTube”, a decentralized video platform based on Steemit. Video files uploaded by users are stored under unique IDs via the IPFS protocol. Compared with a traditional video website, this reduces redundant copies of the same resource and greatly cuts the bandwidth cost of large numbers of users playing videos.


Thirdly, it combines well with blockchain. A blockchain is essentially a distributed ledger, and one of its bottlenecks is storage capacity. At present, the biggest problem of most public chains is that large amounts of hypermedia data cannot be stored on the chain itself. Bitcoin has only about 30-40 GB of block data so far. Programmable blockchain projects such as Ethereum can execute and store only small pieces of contract code. This greatly limits the ability of a DApp to grow into a super app.

Using IPFS to ease the storage bottleneck looks like the transitional solution for now, and the most typical example is EOS. What is most notable about EOS is its claimed support for million-level TPS, which it owes to the DPoS consensus mechanism and to an underlying storage design that uses IPFS to transmit large data more efficiently.


EOS uses IPLD to normalize its heterogeneous packaged block data into a single, easily addressable data structure. It then links that structure into IPFS, so that the IPFS network takes over storage and P2P retrieval without consuming much of the EOS blockchain's computing resources.

Fourthly, it can provide a distributed caching scheme for traditional applications. IPFS-GEO is a project that provides distributed caching for traditional LBS (location-based service) applications. It converts geographic coordinates into a one-dimensional string with the GeoHash algorithm and stores the associated data in the IPFS network, so that the data is uniquely identified and distributed across neighboring nodes.


When a retrieval request arrives, the system first compares string prefixes to narrow the search range and speed up the lookup. It then collects hypermedia data from nearby nodes by NodeID, which works like a distributed cache and greatly improves retrieval efficiency for the LBS application.
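The GeoHash step and the prefix comparison can be sketched as below. The encoder follows the standard GeoHash algorithm; the `index`, `add_place` and `nearby` helpers and the place names are hypothetical and are not taken from IPFS-GEO.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 7) -> str:
    """Encode a coordinate by repeatedly halving the lon/lat ranges."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        bit_count += 1
        use_lon = not use_lon
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


index = {}  # geohash -> names; a stand-in for data stored in the network


def add_place(name: str, lat: float, lon: float) -> None:
    index.setdefault(geohash_encode(lat, lon), []).append(name)


def nearby(lat: float, lon: float, prefix_len: int = 5) -> list:
    """Return places whose geohash shares a prefix with the query point."""
    prefix = geohash_encode(lat, lon)[:prefix_len]
    return [n for gh, names in index.items() if gh.startswith(prefix) for n in names]


add_place("cafe", 57.6492, 10.4075)
add_place("far away", 39.9, 116.4)
print(geohash_encode(57.64911, 10.40744, 5), nearby(57.64911, 10.40744))
```

Nearby points usually share a long prefix, which is what makes the string comparison a cheap first filter. Points on opposite sides of a cell boundary can be close yet share only a short prefix, so practical systems also check neighboring cells.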

 

Star Applications with IPFS

OpenBazaar is a star application built on IPFS; its Chinese name means “Open Market”. It recently received a $5 million investment from Bitmain.


OpenBazaar 1.0 was once called a black market. It used ZeroMQ rather than IPFS for P2P transactions, which to some extent bypassed centralized review, and transaction fees were paid out to users as bonuses. Because it also integrated Bitcoin as its payment channel, it became well known and attracted many users within a short time.

With the release of OpenBazaar 2.0, a review mechanism was officially added to take laws and regulations into account. The new version also supports cryptocurrencies other than Bitcoin, such as BCH, and it integrates a reworked IPFS in place of the earlier ZeroMQ layer.

Stores on OpenBazaar can now stay reachable even when their owners are offline. Previously, no transaction could take place unless the seller was logged in; IPFS makes offline stores possible. It also means that the more people visit your store, the more copies of its data exist, which helps promote high-quality stores. To some extent, it is a return to value.

I call it a star project not only because it is built on IPFS and performs well, but also because it is a complete rework of IPFS. It forks the IPFS code base and redevelops the source code, protocols and supporting facilities, even changing the protocol name and the protocol headers. In a sense, this separates the OB network from the main IPFS network.

OpenBazaar wants more centralized control, building its own mining farms and machine rooms to keep its services stable, and so looks for a balance point between decentralization and centralization. We hope traditional applications and companies will weigh the same trade-off. After all, complete decentralization is unlikely, but we can keep the essence and discard the dross.

 

Postscript

When a new technology replaces an old one, there are usually two reasons: it improves system efficiency, or it reduces system cost. Fortunately, IPFS can do both.

 

Appendix Ⅰ

Graph Credit: IPFS-GEO, Themerkle, IPFS, Akasha

 
