Please use this identifier to cite or link to this item: http://ktisis.cut.ac.cy/handle/10488/3513
Title: An implementation of Sybilrankon Apache Giraph
Authors: Καρκούλιας, Νικόλας 
Keywords: Cloud Computing;Big Data;Apache Giraph;Bulk Synchronous Parallel Μodel
Advisor: Σιριβιανός, Μιχάλης
Issue Date: 2013
Publisher: Τμήμα Ηλεκτρολόγων Μηχανικών και Τεχνολογιών Πληροφορικής, Σχολή Μηχανικής και Τεχνολογίας, Τεχνολογικό Πανεπιστήμιο Κύπρου
Abstract: Contemporary technological advancements, besides affecting all facets of human society in a multitude of ways, often demand the aggregated processing of ever-increasing amounts of data. This is often achieved with the use of computer clusters, a practice popularized by industry behemoths such as Google and Amazon. The Cloud Computing era, reminiscent of the older one of the mainframes but differentiated by the startling expansion of the Internet, is associated with the distant harnessing of this computing power and remote storage space, turning Cloud Computing into a viable solution for a variety of applications. Simultaneously, economic and ecological reasons dictate that this harnessing should be achieved in an efficient and cost-effective way; the computational resources are finite and the Big Data field, especially, requires the harmonious coordination of numerous processing nodes. From the programmer's perspective, parallel processing -in its various forms- is notoriously difficult to get right and reason about. In order to surpass the inherent difficulties of the domain and use the aforementioned infrastructure with optimal efficiency, reliability and ease, a variety of parallel processing models have been proposed and implemented over time. These models often differ in their purpose and their level of abstraction. In this work, we present a brief overview of certain promising ones, mostly oriented towards Big Data processing in cluster environments. We focus on Valiant's abstract Bulk Synchronous Parallel model and on the technical aspects of Apache Giraph, one of its open-source implementations and member of the wider Apache Hadoop software ecosystem. As a proof-of-concept, we attempt to lay the foundations for an implementation of SybilRank on the programming environment of Giraph; SybilRank is an efficient algorithm for the detection of "Sybil" nodes in large Online Social Networks. Finally, we describe the deployment process of the resulting program on the Amazon EC2 cluster and attempt to draw conclusions from our experience.
URI: http://ktisis.cut.ac.cy/handle/10488/3513
http://hdl.handle.net/10488/3513
Rights: Απαγορεύεται η δημοσίευση ή αναπαραγωγή, ηλεκτρονική ή άλλη χωρίς τη γραπτή συγκατάθεση του δημιουργού και κατόχου των πνευματικών δικαιωμάτων.
Type: Bachelors Thesis
Appears in Collections:Πτυχιακές Εργασίες/ Bachelor's Degree Theses

Files in This Item:
File Description SizeFormat 
Nikolaos_Karkoulias_Abstract.pdf59.06 kBAdobe PDFView/Open
Show full item record

Page view(s) 20

43
Last Week
1
Last month
2
checked on Nov 22, 2017

Download(s) 5

30
checked on Nov 22, 2017

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.