Independent recovery in large-scale distributed systems

Triantafillou, P. (1996) Independent recovery in large-scale distributed systems. IEEE Transactions on Software Engineering, 22(11), pp. 812-826. (doi: 10.1109/32.553700)


Abstract

In large systems, replication can become an important means to improve data access times and availability. Existing recovery protocols, on the other hand, were proposed for small-scale distributed systems. Such protocols typically update stale, newly-recovered sites with replicated data and resolve the commit uncertainty of recovering sites. Thus, given that in large systems failures are more frequent and that data access times are costlier, such protocols can potentially introduce large overheads in large systems and must be avoided, if possible. We call these protocols dependent recovery protocols since they require a recovering site to consult with other sites. Independent recovery has been studied in the context of one-copy systems and has been proven unattainable. This paper offers independent recovery protocols for large-scale systems with replicated data. It shows how the protocols can be incorporated into several well-known replication protocols and proves that these protocols continue to ensure data consistency. The paper then addresses the issue of nonblocking atomic commitment. It presents mechanisms which can reduce the overhead of termination protocols and the probability of blocking. Finally, the performance impact of the proposed recovery protocols is studied through the use of simulation and analytical studies. The results of these studies show that the significant benefits of independent recovery can be enjoyed with a very small loss in data availability and a very small increase in the number of transaction abortions.

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Triantafillou, Professor Peter
Authors: Triantafillou, P.
College/School: College of Science and Engineering > School of Computing Science
Journal Name: IEEE Transactions on Software Engineering
ISSN: 0098-5589
ISSN (Online): 1939-3520
