Time and Cost-driven Scheduling of Data Parallel Tasks in Grid Workflows

ZIMEO E.
2009

Abstract

Identifying suitable computing resources to solve a scientific or engineering problem in a Grid environment requires increasingly sophisticated resource management systems, for two reasons: 1) strategies and technologies must master the complexity of modern large-scale networks and computing facilities, and 2) the convergence of Grid computing toward the service-oriented approach is fostering a new vision in which economic aspects are central to boosting the adoption of computing as a utility. In this context, the design and execution of data- and compute-intensive applications are often simplified by adopting model-driven approaches based on workflows. The execution of Grid workflows can leverage meta-scheduling systems to automatically and transparently allocate tasks to resources that fulfill the functional requirements and quality-of-service (QoS) constraints specified by the user. This paper presents a time- and cost-constrained scheduling strategy that, following the data parallelism pattern, deploys scientific and business workflow tasks (or other kinds of application tasks) on pools of resources selected to minimize the overall execution time. The strategy was implemented as a plug-in for a matchmaker for Grid services, and its validity and accuracy were demonstrated experimentally on a real testbed using a framework for the deployment of data parallel tasks. The results show that task deployment is effective and accurate, and they pave the way for using the Internet as a utility computing facility.
Keywords: data parallelism; meta-scheduling; divisible load; service matchmaking
Files in this record:
File: 04785112.pdf
Availability: not available
License: not specified
Size: 789.57 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12070/2439
Citations
  • PMC: N/A
  • Scopus: 25
  • ISI: 11