Please use this identifier to cite or link to this item: http://ktisis.cut.ac.cy/handle/10488/3514
Title: A fault-tolerant routing algorithm for interconnected networks-on-chips based on distributed graph theory
Authors: Χαραλάμπους, Πέτρος 
Keywords: Chip Multi-Processors;Network-on-Chip;Processing Elements;Routing Algorithm
Advisor: Σωτηρίου, Βάσος
Issue Date: 2013
Publisher: Τμήμα Ηλεκτρολόγων Μηχανικών και Τεχνολογιών Πληροφορικής, Σχολή Μηχανικής και Τεχνολογίας, Τεχνολογικό Πανεπιστήμιο Κύπρου
Abstract: The rapid down-scaling of silicon technology has made massive on-chip transistor integration densities possible. Consequently, parallel-processing chips have emerged as the new digital design paradigm in ultra-high-performance computing. Unfortunately, buses and dedicated interconnect wiring, which present severe restrictions in their ability to scale with system size, cannot meet the inter-tile communication requirements in current parallel on-chip systems such as Chip Multi-Processors (CMPs), imposing severe limitations in the performance efficiency and scalability of the entire parallel system. In addition, long wires give rise to numerous design restrictions which must be resolved, such as routing contention and challenges in synchronizing the various components on-chip. Network-on-Chip (NoC) architectures, a scalable and modular fabric, have been proposed as an alternative way to solve the above problems, replacing buses and “spaghetti” wiring, by using a packet-based on-chip network. Hence, communication among numerous Processing Elements (PEs) which reside on a single chip now takes the form of exchanging messages over the NoC, which function like off-chip interconnects found in supercomputers. However, a single link failure in an NoC topology, which can occur due to various damaging physical effects such as electro-migration, can prevent the communication process. This effect may eventually render the entire CMP chip useless if the routed packets are stalled indefinitely in their routers. In this Thesis, a routing algorithm capable of handling large numbers of link failures that can occur either at manufacture-time (statically) or at run-time (dynamically), named Pythia1, is presented. As opposed to marking the entire CMP as faulty, whenever some links fail, Pythia routes packets around the faulty link(s) until a new healthy link is available. This process can lead in-flight packets towards their destinations, maintaining network connectivity, and guaranteeing packet delivery, albeit at a slower rate which degrades gracefully. Pythia is an adaptive and localized fault-tolerant routing algorithm, oblivious to the global state of the network. It uses distributed graph tables, which are held by each NoC router, in order to determine the best choice in routing a packet under the presence of faulty links. Every router has a graph table, which contains only the statuses of a its own links and of those nodes it leads to. Given this information, routing is made possible by manipulating either the coordinates of a packet’s destination or destination distance, and the next available router’s number of faulty links, for each link available from the current router to the next ones. Graph tables that hold all the statuses of all links residing at the next nodes found in the vicinity of the route, gives the foreknowledge of a possible deadlocked route a the next-hop router(s); hence deadlocked parts of a topology are avoided. In case a deadlocked situation cannot be avoided a-priori, as its existence is not yet known, a deadlock-resolution mechanism resolves such a detrimental scenario. Moreover, Pythia introduces a new type of header flit, which holds information that helps in livelock-avoidance as well. Pythia was simulated under uniform random and transpose synthetic traffic patterns, with a range of virtual channel per port counts using wormhole flow-control, in order to determine its performance and behavior, utilizing two faulty link spatial placement scenarios: (1) random, and (2) hotspot faulty link distributions. When compared against ARIADNE, an existing state-of-the-art fault-tolerant routing algorithm, Pythia demonstrated up to 237.5% and 166.67% improvement in throughput with a random faulty link placement, while it showed up to 165.0% and 66.7% increase in throughput with a hotspot faulty link placement, under uniform random and transpose traffic pattern usages, respectively. In addition, the proposed routing algorithm was simulated under a lightly-loaded network to determine its basic routing delay under “no stress” conditions, while further experiments exhibit its superior throughput attainment just before network saturation.
URI: http://ktisis.cut.ac.cy/handle/10488/3514
http://hdl.handle.net/10488/3514
Rights: Απαγορεύεται η δημοσίευση ή αναπαραγωγή, ηλεκτρονική ή άλλη χωρίς τη γραπτή συγκατάθεση του δημιουργού και κατόχου των πνευματικών δικαιωμάτων.
Type: Bachelors Thesis
Appears in Collections:Πτυχιακές Εργασίες/ Bachelor's Degree Theses

Files in This Item:
File Description SizeFormat 
Petros_Charalambous_Thesis_Abstract.pdf238.53 kBAdobe PDFView/Open
Show full item record

Page view(s) 20

40
Last Week
1
Last month
2
checked on Nov 24, 2017

Download(s) 20

12
checked on Nov 24, 2017

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.