Euro-Par 2018: Parallel Processing: 24th International Conference on Parallel and Distributed Computing, Turin, Italy, August 27-31, 2018, Proceedings

This book constitutes the proceedings of the 24th International Conference on Parallel and Distributed Computing, Euro-Par 2018, held in Turin, Italy, in August 2018. The 57 full papers presented in this volume were carefully reviewed and selected from 194 submissions. They were organized in topical...

Corporate Author: SpringerLink (Online service)
Other Authors: Aldinucci, Marco. (Editor, http://id.loc.gov/vocabulary/relators/edt), Padovani, Luca. (Editor, http://id.loc.gov/vocabulary/relators/edt), Torquati, Massimo. (Editor, http://id.loc.gov/vocabulary/relators/edt)
Language: English
Published: Cham : Springer International Publishing : Imprint: Springer, 2018.
Edition: 1st ed. 2018.
Series: Theoretical Computer Science and General Issues ; 11014
Online Access: https://doi.org/10.1007/978-3-319-96983-1
Table of Contents:
  • Support Tools and Environments
  • Automatic detection of synchronization errors in codes that target the Open Community Runtime
  • A Methodology for Performance Analysis of Applications Using Multi-layer I/O
  • Runtime Determinacy Race Detection for OpenMP Tasks
  • Estimating the impact of external interference on application performance
  • GT-Race: Graph Traversal based Data Race Detection for Asynchronous Many-Task Parallelism
  • Performance and Power Modeling, Prediction and Evaluation
  • Reducing GPU Register File Energy
  • Taxonomist: Application Detection through Rich Monitoring Data
  • Diagnosing Highly-Parallel OpenMP Programs With Aggregated Grain Graphs
  • Characterization of smartphone governor strategies
  • HPC Benchmarking: Scaling Right and Looking Beyond the Average
  • Combined Vertical and Horizontal Autoscaling Through Model Predictive Control
  • Scheduling and Load Balancing
  • Early Termination of Failed HPC Jobs Through Machine and Deep Learning
  • Peacock: Probe-Based Scheduling of Jobs by Rotating Between Elastic Queues
  • Online Scheduling of Task Graphs on Hybrid Platforms
  • Interference-Aware Scheduling using Geometric Constraints
  • Resource-efficient execution of conditional parallel real-time tasks
  • High Performance Architectures and Compilers
  • Improving GPU Cache Hierarchy Performance with a Fetch and Replacement Cache
  • Abelian: A Compiler for Graph Analytics on Distributed, Heterogeneous Platforms
  • Using Dynamic Compilation to achieve Ninja Performance for CNN training on Many-Core Processors
  • Parallel and Distributed Data Management and Analytics
  • Privacy-Preserving Top-k Query Processing in Distributed Systems
  • Minimizing Network Traffic for Distributed Joins Using Lightweight Locality-Aware Scheduling
  • Cluster and Cloud Computing
  • VIoLET: A Large-scale Virtual Environment for Internet of Things
  • Adaptive Bandwidth-Efficient Recovery Techniques in Erasure-Coded Cloud Storage Systems
  • IT Optimization for Datacenters Under Renewable Power Constraint
  • GPU Provisioning: The 80-20 Rule
  • ECSched: Efficient Container Scheduling on Heterogeneous Clusters
  • Combinatorial Auction Algorithm Selection for Cloud Resource Allocation using Machine Learning
  • Cloud Federation Formation in Oligopolistic Markets
  • Improving Cloud Simulation using the Monte-Carlo Method
  • Distributed Systems and Algorithms
  • Nobody cares if you liked Star Wars: KNN graph construction on the cheap
  • One-Sided Communications for more Efficient Parallel State Space Exploration over RDMA Clusters
  • Robust Decentralized Mean Estimation with Limited Communication
  • Parallel and Distributed Programming, Interfaces, and Languages
  • Snapshot-based Synchronization: A Fast Replacement for Hand-over-Hand Locking
  • Measuring Multi-threaded Message Matching Misery
  • Global-Local View: Scalable Consistency for Concurrent Data Types
  • OpenABL: A Domain-Specific Language for Parallel and Distributed Agent-Based Simulations
  • Bulk: a Modern C++ Interface for Bulk-Synchronous Parallel Programs
  • SharP Unified Memory Allocator: An Intent-based Memory Allocator for Extreme-scale Systems
  • Multi-Granularity Locking in Hierarchies with Synergistic Hierarchical and Fine-Grained Locks
  • Efficient Communication/Computation Overlap with MPI+OpenMP Runtimes Collaboration
  • Multicore and Manycore Methods and Tools
  • Efficient Lock-Free Removing and Compaction for the Cache-Trie Data Structure
  • NUMA Optimizations for Algorithmic Skeletons
  • Improving System Turnaround Time with Intel CAT by Identifying LLC Critical Applications
  • Dynamic Placement of Progress Thread for Overlapping MPI Non-Blocking Collectives on Manycore Processor
  • Load balancing strategies for graph traversal applications on GPUs
  • Energy Efficient Stencil Computations on the Low-Power Manycore MPPA-256 Processor
  • Theory and Algorithms for Parallel Computation and Networking
  • High-Quality Shared-Memory Graph Partitioning
  • Design Principles for Sparse Matrix Multiplication on the GPU
  • Distributed Graph Clustering using Modularity and Map Equation
  • Improved Distributed Algorithm for Graph Truss Decomposition
  • Parallel Numerical Methods and Applications
  • Exploiting Data Sparsity for Large-Scale Matrix Computations
  • Hybrid Parallelization and Performance Optimization of the FLEUR Code: New Possibilities for All-electron Density Functional Theory
  • Efficient Strict-Binning Particle-in-Cell Algorithm for Multi-Core SIMD Processors
  • Task-Based Programming on Emerging Parallel Architectures for Finite-Differences Seismic Numerical Kernel
  • Accelerator Computing for Advanced Applications
  • CEML: a Coordinated Runtime System for Efficient Machine Learning on Heterogeneous Computing Systems
  • Stream Processing on Hybrid CPU/Intel Xeon Phi Systems
  • Tile Low-Rank GEMM Using Batched Operations on GPUs