A data driven dispatcher for big data applications in heterogeneous systems

Souza Junior, Paulo Ricardo Rodrigues de

dc.contributor.advisor	Geyer, Claudio Fernando Resin	pt_BR
dc.contributor.author	Souza Junior, Paulo Ricardo Rodrigues de	pt_BR
dc.date.accessioned	2019-01-18T02:31:31Z	pt_BR
dc.date.issued	2018	pt_BR
dc.identifier.uri	http://hdl.handle.net/10183/187882	pt_BR
dc.description.abstract	Technological capacity grows every day across areas such as automation, prediction, and taking actions. In this process, data is produced at different rates and in different quantities; viewed in isolation, the output of a single sensor is small and offers no clear insight. Taken together, however, this information may contain useful knowledge about business intelligence and about the behavior of people and sensors. The global view of all this data is called Big Data; it can reach overwhelming volumes and is produced at remarkable rates by devices and people. It is therefore necessary to provide solutions for managing Big Data systems that offer robustness and quality of service. To build robust systems that process large amounts of data, Big Data frameworks are proposed and deployed using several management tools. Furthermore, Big Data frameworks are usually split by processing model (i.e., batch and stream processing) and focus on processing balanced data in homogeneous environments. Stream and batch processing engines have to support high data ingestion rates to ensure quality and efficiency for the end user or system administrator. The data flow processed by a Stream Processing Engine (SPE) fluctuates over time and requires real-time or near-real-time adjustments to the resource pool (network, memory, CPU, and others). This scenario leads to the problem known as skewed data production, caused by non-uniform incoming flow at specific points in the environment; it slows applications down through network bottlenecks and inefficient load balancing. This thesis proposes Aten, a data-driven dispatcher, as a solution for unbalanced data flows processed by Big Data stream applications in heterogeneous systems.
Aten manages data aggregation and data streams within message queues, using different algorithms as strategies to partition the data flow across all available computational resources. The thesis presents results indicating that it is possible to maximize throughput while also providing low latency for SPEs.	en
dc.format.mimetype	application/pdf	pt_BR
dc.language.iso	eng	pt_BR
dc.rights	Open Access	en
dc.subject	Big data	pt_BR
dc.subject	Big data	en
dc.subject	Processamento de dados	pt_BR
dc.subject	Communication optimization	en
dc.subject	Data-stream partition	en
dc.subject	Load balance	en
dc.title	A data driven dispatcher for big data applications in heterogeneous systems	pt_BR
dc.title.alternative	Um dispatcher acionado por dados de aplicações de big data em sistemas heterogêneos	pt
dc.type	Dissertação	pt_BR
dc.identifier.nrb	001084082	pt_BR
dc.degree.grantor	Universidade Federal do Rio Grande do Sul	pt_BR
dc.degree.department	Instituto de Informática	pt_BR
dc.degree.program	Programa de Pós-Graduação em Computação	pt_BR
dc.degree.local	Porto Alegre, BR-RS	pt_BR
dc.degree.date	2018	pt_BR
dc.degree.level	mestrado	pt_BR

Name: 001084082.pdf
Size: 2.503Mb
Format: PDF
Description: Full text (English)

This item is licensed under a Creative Commons License
