Data mining and user data collection applications, like Facebook and Yahoo, make dealing with huge amounts of data more and more frequent. A solution to cope with this problem is to spread data over multiple network-connected physical devices. Having more devices, though, means increasing system complexity and introducing additional possible points of failure. Moreover, despite the capacity of hard drives as massive storage systems has increased extremely during years, the speed at which data can be accessed has not. In order to address this problem, over the years, distributed file systems, such as NFS and HDFS, have been designed and deployed. Such systems provide access to files stored on multiple hosts connected through a computer network in a transparent way to users. The peer-to-peer network paradigm has been introduced to overcome some limitations of the client-server architecture by adding features, such as scalability, fault-tolerance, and self-organization. In this work, we present a solution that integrates peer-to-peer network support to HDFS in order to realize a flexible, low-cost and, dynamic distributed file system.
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.