{"data":{"id":"10.48550/arxiv.1007.1388","type":"dois","attributes":{"doi":"10.48550/arxiv.1007.1388","prefix":"10.48550","suffix":"arxiv.1007.1388","identifiers":[{"identifier":"1007.1388","identifierType":"arXiv"}],"alternateIdentifiers":[{"alternateIdentifierType":"arXiv","alternateIdentifier":"1007.1388"}],"creators":[{"name":"Feichtinger, Christian","nameType":"Personal","givenName":"Christian","familyName":"Feichtinger","affiliation":[],"nameIdentifiers":[]},{"name":"Habich, Johannes","nameType":"Personal","givenName":"Johannes","familyName":"Habich","affiliation":[],"nameIdentifiers":[]},{"name":"Koestler, Harald","nameType":"Personal","givenName":"Harald","familyName":"Koestler","affiliation":[],"nameIdentifiers":[]},{"name":"Hager, Georg","nameType":"Personal","givenName":"Georg","familyName":"Hager","affiliation":[],"nameIdentifiers":[]},{"name":"Ruede, Ulrich","nameType":"Personal","givenName":"Ulrich","familyName":"Ruede","affiliation":[],"nameIdentifiers":[]},{"name":"Wellein, Gerhard","nameType":"Personal","givenName":"Gerhard","familyName":"Wellein","affiliation":[],"nameIdentifiers":[]}],"titles":[{"title":"A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters"}],"publisher":"arXiv","container":{},"publicationYear":2010,"subjects":[{"lang":"en","subject":"Distributed, Parallel, and Cluster Computing (cs.DC)","subjectScheme":"arXiv"},{"lang":"en","subject":"Performance (cs.PF)","subjectScheme":"arXiv"},{"subject":"FOS: Computer and information sciences","subjectScheme":"Fields of Science and Technology (FOS)"},{"subject":"FOS: Computer and information sciences","schemeUri":"http://www.oecd.org/science/inno/38235147.pdf","subjectScheme":"Fields of Science and Technology (FOS)"}],"contributors":[],"dates":[{"date":"2010-07-08T14:27:05Z","dateType":"Submitted","dateInformation":"v1"},{"date":"2012-03-01T01:01:58Z","dateType":"Updated","dateInformation":"v1"},{"date":"2010-07","dateType":"Available","dateInformation":"v1"},{"date":"2010","dateType":"Issued"}],"language":null,"types":{"ris":"RPRT","bibtex":"article","citeproc":"article-journal","schemaOrg":"ScholarlyArticle","resourceType":"Article","resourceTypeGeneral":"Text"},"relatedIdentifiers":[{"relationType":"IsVersionOf","relatedIdentifier":"10.1016/j.parco.2011.03.005","relatedIdentifierType":"DOI"}],"relatedItems":[],"sizes":[],"formats":[],"version":"1","rightsList":[{"rights":"arXiv.org perpetual, non-exclusive license","rightsUri":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/"}],"descriptions":[{"description":"Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, suitable for load balancing and heterogeneous computations on CPUs and GPUs. The overhead required for multi-GPU simulations is discussed in detail and it is demonstrated that the kernel performance can be sustained to a large extent. With our GPU implementation, we achieve nearly perfect weak scalability on InfiniBand clusters. However, in strong scaling scenarios multi-GPUs make less efficient use of the hardware than IBM BG/P and x86 clusters. Hence, a cost analysis must determine the best course of action for a particular simulation task. Additionally, weak scaling results of heterogeneous simulations conducted on CPUs and GPUs simultaneously are presented using clusters equipped with varying node configurations.","descriptionType":"Abstract"},{"description":"20 pages, 12 figures","descriptionType":"Other"}],"geoLocations":[],"fundingReferences":[],"xml":"PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz4KPHJlc291cmNlIHhtbG5zPSJodHRwOi8vZGF0YWNpdGUub3JnL3NjaGVtYS9rZXJuZWwtNCIgeG1sbnM6eHNpPSJodHRwOi8vd3d3LnczLm9yZy8yMDAxL1hNTFNjaGVtYS1pbnN0YW5jZSIgeHNpOnNjaGVtYUxvY2F0aW9uPSJodHRwOi8vZGF0YWNpdGUub3JnL3NjaGVtYS9rZXJuZWwtNCBodHRwOi8vc2NoZW1hLmRhdGFjaXRlLm9yZy9tZXRhL2tlcm5lbC00LjMvbWV0YWRhdGEueHNkIj4KICA8aWRlbnRpZmllciBpZGVudGlmaWVyVHlwZT0iRE9JIj4xMC40ODU1MC9BUlhJVi4xMDA3LjEzODg8L2lkZW50aWZpZXI+CiAgPGFsdGVybmF0ZUlkZW50aWZpZXJzPgogICAgPGFsdGVybmF0ZUlkZW50aWZpZXIgYWx0ZXJuYXRlSWRlbnRpZmllclR5cGU9ImFyWGl2Ij4xMDA3LjEzODg8L2FsdGVybmF0ZUlkZW50aWZpZXI+CiAgPC9hbHRlcm5hdGVJZGVudGlmaWVycz4KICA8Y3JlYXRvcnM+CiAgICA8Y3JlYXRvcj4KICAgICAgPGNyZWF0b3JOYW1lIG5hbWVUeXBlPSJQZXJzb25hbCI+RmVpY2h0aW5nZXIsIENocmlzdGlhbjwvY3JlYXRvck5hbWU+CiAgICAgIDxnaXZlbk5hbWU+Q2hyaXN0aWFuPC9naXZlbk5hbWU+CiAgICAgIDxmYW1pbHlOYW1lPkZlaWNodGluZ2VyPC9mYW1pbHlOYW1lPgogICAgPC9jcmVhdG9yPgogICAgPGNyZWF0b3I+CiAgICAgIDxjcmVhdG9yTmFtZSBuYW1lVHlwZT0iUGVyc29uYWwiPkhhYmljaCwgSm9oYW5uZXM8L2NyZWF0b3JOYW1lPgogICAgICA8Z2l2ZW5OYW1lPkpvaGFubmVzPC9naXZlbk5hbWU+CiAgICAgIDxmYW1pbHlOYW1lPkhhYmljaDwvZmFtaWx5TmFtZT4KICAgIDwvY3JlYXRvcj4KICAgIDxjcmVhdG9yPgogICAgICA8Y3JlYXRvck5hbWUgbmFtZVR5cGU9IlBlcnNvbmFsIj5Lb2VzdGxlciwgSGFyYWxkPC9jcmVhdG9yTmFtZT4KICAgICAgPGdpdmVuTmFtZT5IYXJhbGQ8L2dpdmVuTmFtZT4KICAgICAgPGZhbWlseU5hbWU+S29lc3RsZXI8L2ZhbWlseU5hbWU+CiAgICA8L2NyZWF0b3I+CiAgICA8Y3JlYXRvcj4KICAgICAgPGNyZWF0b3JOYW1lIG5hbWVUeXBlPSJQZXJzb25hbCI+SGFnZXIsIEdlb3JnPC9jcmVhdG9yTmFtZT4KICAgICAgPGdpdmVuTmFtZT5HZW9yZzwvZ2l2ZW5OYW1lPgogICAgICA8ZmFtaWx5TmFtZT5IYWdlcjwvZmFtaWx5TmFtZT4KICAgIDwvY3JlYXRvcj4KICAgIDxjcmVhdG9yPgogICAgICA8Y3JlYXRvck5hbWUgbmFtZVR5cGU9IlBlcnNvbmFsIj5SdWVkZSwgVWxyaWNoPC9jcmVhdG9yTmFtZT4KICAgICAgPGdpdmVuTmFtZT5VbHJpY2g8L2dpdmVuTmFtZT4KICAgICAgPGZhbWlseU5hbWU+UnVlZGU8L2ZhbWlseU5hbWU+CiAgICA8L2NyZWF0b3I+CiAgICA8Y3JlYXRvcj4KICAgICAgPGNyZWF0b3JOYW1lIG5hbWVUeXBlPSJQZXJzb25hbCI+V2VsbGVpbiwgR2VyaGFyZDwvY3JlYXRvck5hbWU+CiAgICAgIDxnaXZlbk5hbWU+R2VyaGFyZDwvZ2l2ZW5OYW1lPgogICAgICA8ZmFtaWx5TmFtZT5XZWxsZWluPC9mYW1pbHlOYW1lPgogICAgPC9jcmVhdG9yPgogIDwvY3JlYXRvcnM+CiAgPHRpdGxlcz4KICAgIDx0aXRsZT5BIEZsZXhpYmxlIFBhdGNoLUJhc2VkIExhdHRpY2UgQm9sdHptYW5uIFBhcmFsbGVsaXphdGlvbiBBcHByb2FjaCBmb3IgSGV0ZXJvZ2VuZW91cyBHUFUtQ1BVIENsdXN0ZXJzPC90aXRsZT4KICA8L3RpdGxlcz4KICA8cHVibGlzaGVyPmFyWGl2PC9wdWJsaXNoZXI+CiAgPHB1YmxpY2F0aW9uWWVhcj4yMDEwPC9wdWJsaWNhdGlvblllYXI+CiAgPHN1YmplY3RzPgogICAgPHN1YmplY3QgeG1sOmxhbmc9ImVuIiBzdWJqZWN0U2NoZW1lPSJhclhpdiI+RGlzdHJpYnV0ZWQsIFBhcmFsbGVsLCBhbmQgQ2x1c3RlciBDb21wdXRpbmcgKGNzLkRDKTwvc3ViamVjdD4KICAgIDxzdWJqZWN0IHhtbDpsYW5nPSJlbiIgc3ViamVjdFNjaGVtZT0iYXJYaXYiPlBlcmZvcm1hbmNlIChjcy5QRik8L3N1YmplY3Q+CiAgICA8c3ViamVjdCBzdWJqZWN0U2NoZW1lPSJGaWVsZHMgb2YgU2NpZW5jZSBhbmQgVGVjaG5vbG9neSAoRk9TKSI+Rk9TOiBDb21wdXRlciBhbmQgaW5mb3JtYXRpb24gc2NpZW5jZXM8L3N1YmplY3Q+CiAgPC9zdWJqZWN0cz4KICA8ZGF0ZXM+CiAgICA8ZGF0ZSBkYXRlVHlwZT0iU3VibWl0dGVkIiBkYXRlSW5mb3JtYXRpb249InYxIj4yMDEwLTA3LTA4VDE0OjI3OjA1WjwvZGF0ZT4KICAgIDxkYXRlIGRhdGVUeXBlPSJVcGRhdGVkIiBkYXRlSW5mb3JtYXRpb249InYxIj4yMDEyLTAzLTAxVDAxOjAxOjU4WjwvZGF0ZT4KICAgIDxkYXRlIGRhdGVUeXBlPSJBdmFpbGFibGUiIGRhdGVJbmZvcm1hdGlvbj0idjEiPjIwMTAtMDc8L2RhdGU+CiAgPC9kYXRlcz4KICA8cmVzb3VyY2VUeXBlIHJlc291cmNlVHlwZUdlbmVyYWw9IlRleHQiPkFydGljbGU8L3Jlc291cmNlVHlwZT4KICA8cmVsYXRlZElkZW50aWZpZXJzPgogICAgPHJlbGF0ZWRJZGVudGlmaWVyIHJlbGF0ZWRJZGVudGlmaWVyVHlwZT0iRE9JIiByZWxhdGlvblR5cGU9IklzVmVyc2lvbk9mIj4xMC4xMDE2L2oucGFyY28uMjAxMS4wMy4wMDU8L3JlbGF0ZWRJZGVudGlmaWVyPgogIDwvcmVsYXRlZElkZW50aWZpZXJzPgogIDx2ZXJzaW9uPjE8L3ZlcnNpb24+CiAgPHJpZ2h0c0xpc3Q+CiAgICA8cmlnaHRzIHJpZ2h0c1VSST0iaHR0cDovL2FyeGl2Lm9yZy9saWNlbnNlcy9ub25leGNsdXNpdmUtZGlzdHJpYi8xLjAvIj5hclhpdi5vcmcgcGVycGV0dWFsLCBub24tZXhjbHVzaXZlIGxpY2Vuc2U8L3JpZ2h0cz4KICA8L3JpZ2h0c0xpc3Q+CiAgPGRlc2NyaXB0aW9ucz4KICAgIDxkZXNjcmlwdGlvbiBkZXNjcmlwdGlvblR5cGU9IkFic3RyYWN0Ij5TdXN0YWluaW5nIGEgbGFyZ2UgZnJhY3Rpb24gb2Ygc2luZ2xlIEdQVSBwZXJmb3JtYW5jZSBpbiBwYXJhbGxlbCBjb21wdXRhdGlvbnMgaXMgY29uc2lkZXJlZCB0byBiZSB0aGUgbWFqb3IgcHJvYmxlbSBvZiBHUFUtYmFzZWQgY2x1c3RlcnMuIEluIHRoaXMgYXJ0aWNsZSwgdGhpcyB0b3BpYyBpcyBhZGRyZXNzZWQgaW4gdGhlIGNvbnRleHQgb2YgYSBsYXR0aWNlIEJvbHR6bWFubiBmbG93IHNvbHZlciB0aGF0IGlzIGludGVncmF0ZWQgaW4gdGhlIFdhTEJlcmxhIHNvZnR3YXJlIGZyYW1ld29yay4gV2UgcHJvcG9zZSBhIG11bHRpLUdQVSBpbXBsZW1lbnRhdGlvbiB1c2luZyBhIGJsb2NrLXN0cnVjdHVyZWQgTVBJIHBhcmFsbGVsaXphdGlvbiwgc3VpdGFibGUgZm9yIGxvYWQgYmFsYW5jaW5nIGFuZCBoZXRlcm9nZW5lb3VzIGNvbXB1dGF0aW9ucyBvbiBDUFVzIGFuZCBHUFVzLiBUaGUgb3ZlcmhlYWQgcmVxdWlyZWQgZm9yIG11bHRpLUdQVSBzaW11bGF0aW9ucyBpcyBkaXNjdXNzZWQgaW4gZGV0YWlsIGFuZCBpdCBpcyBkZW1vbnN0cmF0ZWQgdGhhdCB0aGUga2VybmVsIHBlcmZvcm1hbmNlIGNhbiBiZSBzdXN0YWluZWQgdG8gYSBsYXJnZSBleHRlbnQuIFdpdGggb3VyIEdQVSBpbXBsZW1lbnRhdGlvbiwgd2UgYWNoaWV2ZSBuZWFybHkgcGVyZmVjdCB3ZWFrIHNjYWxhYmlsaXR5IG9uIEluZmluaUJhbmQgY2x1c3RlcnMuIEhvd2V2ZXIsIGluIHN0cm9uZyBzY2FsaW5nIHNjZW5hcmlvcyBtdWx0aS1HUFVzIG1ha2UgbGVzcyBlZmZpY2llbnQgdXNlIG9mIHRoZSBoYXJkd2FyZSB0aGFuIElCTSBCRy9QIGFuZCB4ODYgY2x1c3RlcnMuIEhlbmNlLCBhIGNvc3QgYW5hbHlzaXMgbXVzdCBkZXRlcm1pbmUgdGhlIGJlc3QgY291cnNlIG9mIGFjdGlvbiBmb3IgYSBwYXJ0aWN1bGFyIHNpbXVsYXRpb24gdGFzay4gQWRkaXRpb25hbGx5LCB3ZWFrIHNjYWxpbmcgcmVzdWx0cyBvZiBoZXRlcm9nZW5lb3VzIHNpbXVsYXRpb25zIGNvbmR1Y3RlZCBvbiBDUFVzIGFuZCBHUFVzIHNpbXVsdGFuZW91c2x5IGFyZSBwcmVzZW50ZWQgdXNpbmcgY2x1c3RlcnMgZXF1aXBwZWQgd2l0aCB2YXJ5aW5nIG5vZGUgY29uZmlndXJhdGlvbnMuPC9kZXNjcmlwdGlvbj4KICAgIDxkZXNjcmlwdGlvbiBkZXNjcmlwdGlvblR5cGU9Ik90aGVyIj4yMCBwYWdlcywgMTIgZmlndXJlczwvZGVzY3JpcHRpb24+CiAgPC9kZXNjcmlwdGlvbnM+CjwvcmVzb3VyY2U+","url":"https://arxiv.org/abs/1007.1388","contentUrl":null,"metadataVersion":0,"schemaVersion":"http://datacite.org/schema/kernel-4","source":"mds","isActive":true,"state":"findable","reason":null,"viewCount":0,"viewsOverTime":[],"downloadCount":0,"downloadsOverTime":[],"referenceCount":0,"citationCount":0,"citationsOverTime":[],"partCount":0,"partOfCount":0,"versionCount":0,"versionOfCount":0,"created":"2022-03-14T05:11:54.000Z","registered":"2022-03-14T05:11:55.000Z","published":"2010","updated":"2022-03-14T05:11:55.000Z"},"relationships":{"client":{"data":{"id":"arxiv.content","type":"clients"}},"provider":{"data":{"id":"arxiv","type":"providers"}},"media":{"data":{"id":"10.48550/arxiv.1007.1388","type":"media"}},"references":{"data":[]},"citations":{"data":[]},"parts":{"data":[]},"partOf":{"data":[]},"versions":{"data":[]},"versionOf":{"data":[]}}}}