Fault-Tolerance Techniques for High-Performance Computing
This timely text/reference presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correcti...
Language: | English |
---|---|
Published: | Cham : Springer International Publishing : Imprint: Springer, 2015. |
Edition: | 1st ed. 2015. |
Series: | Computer Communications and Networks |
Online Access: | https://doi.org/10.1007/978-3-319-20943-2 |