Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/13927
DC Field | Value | Language
dc.contributor.author | Athanasiou, George S. | -
dc.contributor.author | Kritikakou, Angeliki S. | -
dc.contributor.author | Goutis, Costas E. | -
dc.contributor.author | Kelefouras, Vasileios I. | -
dc.contributor.author | Alachiotis, Nikolaos | -
dc.contributor.author | Michail, Harris | -
dc.date.accessioned | 2019-05-31T09:19:13Z | -
dc.date.available | 2019-05-31T09:19:13Z | -
dc.date.issued | 2010-09-07 | -
dc.identifier.citation | Journal of Supercomputing, 2012, vol. 59, no. 2, pp. 830-851 | en_US
dc.identifier.issn | 15730484 | -
dc.description.abstract | Matrix-matrix multiplication (MMM) is a key kernel in linear algebra algorithms, and the performance of its implementations depends on memory utilization and data locality. Several MMM algorithms exist, such as the standard algorithm and the Strassen-Winograd variant, along with many recursive array layouts, such as Z-Morton and U-Morton. However, their data locality is lower than that of the proposed methodology. Moreover, several state-of-the-art (SOA) self-tuning libraries exist, such as ATLAS, which tests many MMM implementations. Installing ATLAS requires an extremely complex empirical tuning step and uses a large number of compiler options, both of which are outside the scope of this paper. This paper presents a new methodology based on the standard MMM algorithm that achieves improved performance by focusing on data locality, both temporal and spatial. The methodology finds the scheduling that conforms with optimum memory management. Compared with (Chatterjee et al. in IEEE Trans. Parallel Distrib. Syst. 13:1105, 2002; Li and Garzaran in Proc. of Lang. Compil. Parallel Comput., 2005; Bilmes et al. in Proc. of the 11th ACM Int. Conf. Supercomput., 1997; Aberdeen and Baxter in Concurr. Comput. Pract. Exp. 13:103, 2001), the proposed methodology has two major advantages. First, the scheduling used at the tile level differs from the one used at the element level; it has better data locality and is suited to the sizes of the memory hierarchy. Second, its exploration time is short, because it searches only for the number of tiling levels used, and between (1, 2) (Sect. 4) for the best tile size for each cache level. A software tool (C code) implementing the methodology was developed; it takes the hardware model and the matrix sizes as input. The methodology outperforms the others across a wide range of architectures. Compared with the best existing related work, which we implemented, performance improves by up to 55% over the standard MMM algorithm and by up to 35% over Strassen's, both under recursive data array layouts. © 2010 Springer Science+Business Media, LLC. | en_US
dc.format | pdf | en_US
dc.language.iso | en | en_US
dc.relation.ispartof | Journal of Supercomputing | en_US
dc.rights | © Springer | en_US
dc.subject | Compilers | en_US
dc.subject | Data locality | en_US
dc.subject | Data reuse | en_US
dc.subject | Matrix-matrix multiplication | en_US
dc.subject | Memory management | en_US
dc.subject | Recursive array layouts | en_US
dc.subject | Scheduling | en_US
dc.subject | Strassen's algorithm | en_US
dc.title | A Data Locality Methodology for Matrix-matrix Multiplication Algorithm | en_US
dc.type | Article | en_US
dc.collaboration | University of Patras | en_US
dc.collaboration | Cyprus University of Technology | en_US
dc.subject.category | Electrical Engineering - Electronic Engineering - Information Engineering | en_US
dc.journals | Subscription | en_US
dc.country | Greece | en_US
dc.country | Cyprus | en_US
dc.subject.field | Engineering and Technology | en_US
dc.publication | Peer Reviewed | en_US
dc.identifier.doi | 10.1007/s11227-010-0474-3 | en_US
dc.identifier.scopus | 2-s2.0-84855646851 | en
dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/84855646851 | en
dc.relation.issue | 2 | en_US
dc.relation.volume | 59 | en_US
cut.common.academicyear | 2010-2011 | en_US
dc.identifier.spage | 830 | en_US
dc.identifier.epage | 851 | en_US
item.grantfulltext | none | -
item.languageiso639-1 | en | -
item.cerifentitytype | Publications | -
item.openairecristype | http://purl.org/coar/resource_type/c_6501 | -
item.openairetype | article | -
item.fulltext | No Fulltext | -
crisitem.journal.journalissn | 1573-0484 | -
crisitem.journal.publisher | Springer Nature | -
crisitem.author.dept | Department of Electrical Engineering, Computer Engineering and Informatics | -
crisitem.author.faculty | Faculty of Engineering and Technology | -
crisitem.author.orcid | 0000-0002-8299-8737 | -
crisitem.author.parentorg | Faculty of Engineering and Technology | -
Appears in Collections: Άρθρα/Articles
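
The abstract describes improving the data locality of the standard MMM algorithm through tiling. As a rough illustration only, the sketch below shows one level of square tiling next to the untiled standard algorithm. It is not the paper's scheduling: the function names (`mmm_standard`, `mmm_tiled`, `min_size`), the single tile-size parameter `tile`, the row-major layout, and the use of the same i-k-j loop order at both the tile and element level are all assumptions made for this example. The paper states that its tile-level scheduling differs from its element-level one and that it selects the number of tiling levels and a tile size per cache level.

```c
#include <assert.h>
#include <stddef.h>

/* Smaller of two sizes; clips a tile at the matrix edge. */
static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Reference: standard MMM, C += A * B, for row-major n x n matrices. */
void mmm_standard(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            const double a = A[i * n + k];
            for (size_t j = 0; j < n; j++)
                C[i * n + j] += a * B[k * n + j];
        }
}

/* One level of square tiling (hypothetical sketch, not the paper's
 * scheduling): the three outer loops walk over tile x tile blocks and the
 * three inner loops multiply one pair of blocks, so each block is reused
 * while it is still likely to be in cache. */
void mmm_tiled(size_t n, size_t tile, const double *A, const double *B,
               double *C)
{
    assert(tile > 0); /* a zero tile size would never advance the loops */
    for (size_t ii = 0; ii < n; ii += tile) {
        const size_t i_end = min_size(ii + tile, n);
        for (size_t kk = 0; kk < n; kk += tile) {
            const size_t k_end = min_size(kk + tile, n);
            for (size_t jj = 0; jj < n; jj += tile) {
                const size_t j_end = min_size(jj + tile, n);
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        const double a = A[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
        }
    }
}
```

Both functions accumulate into `C`, so the caller zeroes `C` first to obtain the plain product. In this sketch the tile size is a free parameter; a value that keeps three tiles within the target cache level is a common starting point, but the best choice depends on the hardware.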

Scopus citations: 10 (checked on Mar 14, 2024)
Web of Science citations: 10; last week: 0; last month: 0 (checked on Oct 29, 2023)
Page views: 327; last week: 0; last month: 4 (checked on Nov 6, 2024)

Items in KTISIS are protected by copyright, with all rights reserved, unless otherwise indicated.