Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/802
Title: An implementation of SybilRank on Apache Giraph
Authors: Καρκούλιας, Νικόλας 
Keywords: Cloud Computing; Big Data; Apache Giraph; Bulk Synchronous Parallel Model
Advisor: Σιριβιανός, Μιχάλης
Issue Date: 2013
Department: Department of Electrical Engineering, Computer Engineering and Informatics
Faculty: Faculty of Engineering and Technology
Abstract: Contemporary technological advancements, besides affecting all facets of human society in a multitude of ways, often demand the aggregated processing of ever-increasing amounts of data. This is often achieved with the use of computer clusters, a practice popularized by industry behemoths such as Google and Amazon. The Cloud Computing era, reminiscent of the older mainframe era but differentiated by the startling expansion of the Internet, is associated with the remote use of this computing power and storage space, turning Cloud Computing into a viable solution for a variety of applications. Simultaneously, economic and ecological reasons dictate that these resources should be used in an efficient and cost-effective way; computational resources are finite, and the Big Data field, especially, requires the harmonious coordination of numerous processing nodes. From the programmer's perspective, parallel processing, in its various forms, is notoriously difficult to get right and to reason about. In order to overcome the inherent difficulties of the domain and use the aforementioned infrastructure with optimal efficiency, reliability and ease, a variety of parallel processing models have been proposed and implemented over time. These models often differ in their purpose and their level of abstraction. In this work, we present a brief overview of certain promising ones, mostly oriented towards Big Data processing in cluster environments. We focus on Valiant's abstract Bulk Synchronous Parallel model and on the technical aspects of Apache Giraph, one of its open-source implementations and a member of the wider Apache Hadoop software ecosystem. As a proof of concept, we attempt to lay the foundations for an implementation of SybilRank in the programming environment of Giraph; SybilRank is an efficient algorithm for the detection of "Sybil" nodes in large Online Social Networks.
Finally, we describe the deployment process of the resulting program on the Amazon EC2 cluster and attempt to draw conclusions from our experience.
URI: https://hdl.handle.net/20.500.14279/802
Rights: Publication or reproduction, electronic or otherwise, is prohibited without the written consent of the creator and copyright holder.
Type: Bachelor's Thesis
Affiliation: Cyprus University of Technology 
Appears in Collections: Bachelor's Degree Theses

Files in This Item:
File: Nikolaos_Karkoulias_Abstract.pdf
Size: 59.06 kB
Format: Adobe PDF


Items in KTISIS are protected by copyright, with all rights reserved, unless otherwise indicated.