Identifying suitable computing resources to solve a scientific or engineering problem in a Grid environment requires increasingly sophisticated resource management systems, for two reasons: 1) strategies and technologies must be able to master the complexity of modern large-scale networks and computing facilities, and 2) the convergence of Grid computing toward the service-oriented approach is fostering a new vision in which economic aspects are central to boosting the adoption of computing as a utility. In this context, the design and execution of data- and compute-intensive applications are often simplified by adopting model-driven approaches based on workflows. The execution of Grid workflows can leverage meta-scheduling systems to automatically and transparently allocate tasks to resources that satisfy the functional requirements and quality-of-service (QoS) constraints specified by the user. This paper presents a time- and cost-constrained scheduling strategy that, following the data parallelism pattern, deploys scientific and business workflow tasks (or other kinds of application tasks) on pools of resources selected to minimize the overall execution time. The strategy was implemented as a plug-in in a matchmaker for Grid services, and its validity and accuracy were demonstrated experimentally on a real testbed using a framework for the deployment of data-parallel tasks. The results show that the task deployment is effective and accurate, and they pave the way for using the Internet as a utility computing facility.
|Title:||Time and Cost-driven Scheduling of Data Parallel Tasks in Grid Workflows|
|Publication date:||2009|
|Appears in categories:||1.1 Journal article|