REFIT: Resource-Efficient Fault and Intrusion Tolerance
Internet-based services play a central role in today's society. As such services progressively take over from traditional infrastructures, their complexity steadily increases, and with it the number of faults that occur. Because improved software-engineering techniques alone cannot prevent all of these faults, systems have to be prepared to tolerate faults and intrusions.
REFIT investigates how systems can provide fault and intrusion tolerance in a resource-efficient manner. The key technology to achieve this goal is virtualization, as it enables multiple service instances to run in isolation on the same physical host. Server consolidation through virtualization not only saves resources in comparison to traditional replication, but also opens up new possibilities to apply optimizations (e.g., deterministic multi-threading).
Resource efficiency and performance of the REFIT prototype are evaluated using a web-based multi-tier architecture, and the results are compared to non-replicated and traditionally replicated scenarios. Furthermore, REFIT develops an infrastructure that supports the practical integration and operation of fault- and intrusion-tolerant services, for example in the context of cloud computing.
News
- December 2020: Our paper "Resilient Cloud-based Replication with Low Latency" received the Best Student Paper Award at the 21st Middleware Conference (Middleware '20).
- July 2020: The source code of the REFIT-Framework is now publicly available.
- June 2018: Our paper "Strome: Energy-Aware Data-Stream Processing" received the Best Paper Award at the 18th International Conference on Distributed Applications and Interoperable Systems (DAIS '18).
- May 2015: REFIT has been renewed for a second funding period by the DFG.
- Tobias Distler received an IBM Ph.D. Fellowship Award for his work in REFIT.
- Starting January 2012, Rüdiger Kapitza is Professor at the Technische Universität Braunschweig, Distributed Systems Group.
Publications
PRDC 2021 | Michael Eischer and Tobias Distler. Egalitarian Byzantine Fault Tolerance. In Proceedings of the 26th Pacific Rim International Symposium on Dependable Computing (PRDC '21), pages 77–86, Perth, 1–4 December 2021. (BibTeX, Extended version) |
---|---|
EDCC 2021 | Laura Lawniczak and Tobias Distler. Stream-based State Machine Replication. In Proceedings of the 17th European Dependable Computing Conference (EDCC '21), pages 119–126, Munich, 13–16 September 2021. (BibTeX, Extended version, Source code) |
Habilitation | Tobias Distler. Resource-Aware System Software for Replicated Services. Habilitation, 2021. (BibTeX) |
CSUR 2021 | Tobias Distler. Byzantine Fault-Tolerant State-Machine Replication from a Systems Perspective. In ACM Computing Surveys, 54(1):24:1–38, 2021. (BibTeX) |
Middleware 2020 (Best Student Paper) | Michael Eischer and Tobias Distler. Resilient Cloud-based Replication with Low Latency. In Proceedings of the 21st Middleware Conference (Middleware '20), pages 14–28, Delft, 7–11 December 2020. (BibTeX, Extended version) |
PaPoC 2020 | Michael Eischer, Benedikt Straßner, and Tobias Distler. Low-Latency Geo-Replicated State Machines with Guaranteed Writes. In Proceedings of the 7th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC '20), Heraklion, 27 April 2020. (BibTeX) |
FB-SYS 2019 | Michael Eischer and Tobias Distler. Efficient Checkpointing in Byzantine Fault-Tolerant Systems. In Tagungsband des FB-SYS Herbsttreffens 2019 (FB-SYS '19), Osnabrück, 21–22 November 2019. (BibTeX) |
SRDS 2019 | Michael Eischer, Markus Büttner, and Tobias Distler. Deterministic Fuzzy Checkpoints. In Proceedings of the 38th International Symposium on Reliable Distributed Systems (SRDS '19), pages 153–162, Lyon, 1–4 October 2019. (BibTeX) |
PaPoC 2019 | Christian Deyerl and Tobias Distler. In Search of a Scalable Raft-based Replication Architecture. In Proceedings of the 6th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC '19), pages 1–7, Dresden, 25 March 2019. (BibTeX) |
Computing 2019 | Michael Eischer and Tobias Distler. Scalable Byzantine Fault-tolerant State-Machine Replication on Heterogeneous Servers. Computing, 101(2):97–118, 2019. (BibTeX) |
DSN 2018 | Bijun Li, Nico Weichbrodt, Johannes Behl, Pierre-Louis Aublin, Tobias Distler, and Rüdiger Kapitza. Troxy: Transparent Access to Byzantine Fault-Tolerant Systems. In Proceedings of the 48th International Conference on Dependable Systems and Networks (DSN '18), pages 59–70, Luxembourg City, 25–28 June 2018. (BibTeX) |
BCRB 2018 | Michael Eischer and Tobias Distler. Latency-Aware Leader Selection for Geo-Replicated Byzantine Fault-Tolerant Systems. In Proceedings of the 1st Workshop on Byzantine Consensus and Resilient Blockchains (BCRB '18), pages 140–145, Luxembourg City, 25 June 2018. (BibTeX) |
DAIS 2018 (Best Paper) | Christopher Eibel, Christian Gulden, Wolfgang Schröder-Preikschat, and Tobias Distler. Strome: Energy-Aware Data-Stream Processing. In Proceedings of the 18th International Conference on Distributed Applications and Interoperable Systems (DAIS '18), pages 40–57, Madrid, 18–20 June 2018. (BibTeX) |
IC2E 2018 | Christopher Eibel, Thao-Nguyen Do, Robert Meißner, and Tobias Distler. Empya: Saving Energy in the Face of Varying Workloads. In Proceedings of the 6th International Conference on Cloud Engineering (IC2E '18), pages 134–140, Orlando, 17–20 April 2018. (BibTeX) |
Technical Report | Christopher Eibel, Thao-Nguyen Do, Robert Meißner, and Tobias Distler. Empya: An Energy-aware Middleware Platform for Dynamic Applications. Technical Report CS-2018-01, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2018. (BibTeX) |
EDCC 2017 | Michael Eischer and Tobias Distler. Scalable Byzantine Fault Tolerance on Heterogeneous Servers. In Proceedings of the 13th European Dependable Computing Conference (EDCC '17), pages 34–41, Geneva, 4–8 September 2017. (BibTeX) |
DSN 2017 | Rainer Schiekofer, Johannes Behl, and Tobias Distler. Agora: A Dependable High-Performance Coordination Service for Multi-Cores. In Proceedings of the 47th International Conference on Dependable Systems and Networks (DSN '17), pages 333–344, Denver, 26–29 June 2017. (BibTeX) |
EuroSys 2017 | Johannes Behl, Tobias Distler, and Rüdiger Kapitza. Hybrids on Steroids: SGX-based High Performance BFT. In Proceedings of the 12th European Conference on Computer Systems (EuroSys '17), pages 222–237, Belgrade, 23–26 April 2017. (BibTeX) |
Technical Report | Johannes Behl, Tobias Distler, and Rüdiger Kapitza. Hybster – A Highly Parallelizable Protocol for Hybrid Fault-Tolerant Service Replication. Technical Report 64440, TU Braunschweig, 2017. (BibTeX) |
IEEE TC 2016 | Tobias Distler, Christian Cachin, and Rüdiger Kapitza. Resource-efficient Byzantine Fault Tolerance. In IEEE Transactions on Computers, 65(9):2807–2819, 2016. (BibTeX) |
EDCC 2016 | Bijun Li, Wenbo Xu, Muhammad Zeeshan Abid, Tobias Distler, and Rüdiger Kapitza. SAREK: Optimistic Parallel Ordering in Byzantine Fault Tolerance. In Proceedings of the 12th European Dependable Computing Conference (EDCC '16), pages 77–88, Gothenburg, 5–9 September 2016. (BibTeX) |
Middleware 2015 | Johannes Behl, Tobias Distler, and Rüdiger Kapitza. Consensus-Oriented Parallelization: How to Earn Your First Million. In Proceedings of the 16th Middleware Conference (Middleware '15), pages 173–184, Vancouver, 7–11 December 2015. (BibTeX) |
ARM 2015 | Christopher Eibel and Tobias Distler. Towards Energy-Proportional State-Machine Replication. In Proceedings of the 14th Workshop on Adaptive and Reflective Middleware (ARM '15), pages 19–24, Vancouver, 8 December 2015. (BibTeX) |
HotDep 2014 | Johannes Behl, Tobias Distler, and Rüdiger Kapitza. Scalable BFT for Multi-Cores: Actor-based Decomposition and Consensus-oriented Parallelization. In Proceedings of the 10th Workshop on Hot Topics in System Dependability (HotDep '14), pages 49–54, Broomfield, 5 October 2014. (BibTeX) |
Dissertation | Tobias Distler. Resource-efficient Fault and Intrusion Tolerance. Dissertation, 2014. (BibTeX) |
EuroSys 2012 | Rüdiger Kapitza, Johannes Behl, Christian Cachin, Tobias Distler, Simon Kuhnle, Seyed Vahid Mohammadi, Wolfgang Schröder-Preikschat, and Klaus Stengel. CheapBFT: Resource-efficient Byzantine Fault Tolerance. In Proceedings of the 7th European Conference on Computer Systems (EuroSys '12), pages 295–308, Bern, 10–13 April 2012. (BibTeX) |
EuroSys 2011 | Tobias Distler and Rüdiger Kapitza. Increasing Performance in Byzantine Fault-Tolerant Systems with On-Demand Replica Consistency. In Proceedings of the 6th European Conference on Computer Systems (EuroSys '11), pages 91–105, Salzburg, 10–13 April 2011. (BibTeX) |
NDSS 2011 | Tobias Distler, Rüdiger Kapitza, Ivan Popov, Hans P. Reiser, and Wolfgang Schröder-Preikschat. SPARE: Replicas on Hold. In Proceedings of the 18th Network and Distributed System Security Symposium (NDSS '11), pages 407–420, San Diego, 6–9 February 2011. (BibTeX) |
SICHERHEIT 2010 | Tobias Distler, Rüdiger Kapitza, and Hans P. Reiser. State Transfer for Hypervisor-Based Proactive Recovery of Heterogeneous Replicated Services. In Proceedings of the 5th "Sicherheit, Schutz und Zuverlässigkeit" Conference (SICHERHEIT '10), pages 61–72, Berlin, 5–7 October 2010. (BibTeX) |
HotDep 2010 | Rüdiger Kapitza, Matthias Schunter, Christian Cachin, Klaus Stengel, and Tobias Distler. Storyboard: Optimistic Deterministic Multithreading. In Proceedings of the 6th Workshop on Hot Topics in System Dependability (HotDep '10), pages 1–6, Vancouver, 3 October 2010. (BibTeX) |
Source Code
- REFIT-Framework, an implementation of multiple agreement protocols and system architectures
- Tara, a stream-based replication protocol
Project Partners
- Rüdiger Kapitza (TU Braunschweig)
- Johannes Behl (TU Braunschweig)
People Involved in REFIT at Erlangen
Dr.-Ing. Tobias Distler | Michael Eischer, M. Sc. | Laura Lawniczak, M. Sc. | Dipl.-Inf. Christopher Eibel | Dipl.-Inf. Klaus Stengel |
Related Projects
VM-FIT | Virtual Machine-based Fault and Intrusion Tolerance |
---|---|
TClouds | Trustworthy Clouds – Privacy and Resilience for Internet-scale Critical Infrastructure |
FOREVER | Fault/intrusiOn REmoVal through Evolution & Recovery |