SBBD Tutorials – Schedule

Monday, September 25, 2023

Tutorials 2 and 4

14:00 – 16:00

Tutorial 4: Graph-based Methods for Similarity Searches
Larissa Shimomura (Eindhoven University of Technology – Netherlands), Daniel Kaster (UEL – Brazil)
Chair: Marcos Bedo (IFF)

Similarity searches retrieve data items that are similar to one or more reference items according to an intrinsic characteristic of the data. Recently, graph-based methods have emerged as a very efficient option for executing similarity queries in metric and non-metric spaces. Graphs model the interconnectivity among data items, making it possible to explore relationships and neighbors quickly. In this tutorial, we introduce similarity searches and present graph-based methods for them, the types of graphs these methods use, their properties, open challenges, and research opportunities.

16:30 – 18:30

Tutorial 2: Automatic Disambiguation of Author Names: Foundations, Methods and Open Issues
Anderson Ferreira (UFOP – Brazil), Alberto Laender (UFMG – Brazil)
Chair: Eduardo Pena (UTFPR)

This tutorial is based on our book “Automatic Disambiguation of Author Names in Bibliographic Repositories” and aims to raise awareness of the problem and its challenges within the SBBD community. The author name ambiguity problem occurs when an author publishes works under distinct names or when distinct authors publish works under similar names. It may arise for a number of reasons, including the lack of standards and common practices and the decentralized generation of bibliographic content. In this tutorial, we present a broad view of the automatic disambiguation of author names. We start by discussing its motivation, defining the author name disambiguation task, and presenting its foundations. Next, we describe some methods proposed by our research group, as well as some recent approaches to author name disambiguation. Finally, we discuss open issues.

Tuesday, September 26, 2023

Tutorials 1 and 3

14:00 – 16:00

Tutorial 3: Introduction to Data Science in Cybersecurity
Michele Nogueira (UFMG – Brazil), Ligia Borges (UFMG – Brazil)
Chair: Fabio Porto (LNCC)

Given the ongoing changes in cybersecurity, many of them driven by data science and by the availability of more computational resources, it is increasingly necessary to disseminate the topic and to train and/or update professionals. This tutorial aims to give participants a first contact with data science applied to cybersecurity from a hybrid perspective: theoretical and practical. The tutorial begins with a theoretical part on cybersecurity and data science, highlighting its use in the prevention, detection, and mitigation of attacks. Platforms used in data science, such as Kaggle and Jupyter, will be briefly presented through examples from the cybersecurity area. The tutorial continues with a series of case studies, in Python, that are representative of the cybersecurity area and related to its main problems, such as malware detection, anomaly detection, bot detection, spam detection, information leaks, and phishing. The tutorial takes an introductory approach and will not go into depth on the topics covered, but it will offer a broad overview of the advances and concepts related to the subject.

14:00 – 16:00

Tutorial 1: Privacy-Preserving Techniques for Social Network Analysis
André Luís Mendonça (UFC – Brazil), Felipe Brito (UFC – Brazil), Javam Machado (UFC – Brazil)
Chair: Humberto Razente (UFU)

With increasing concerns over data privacy, preserving the privacy of individuals in social network analysis has become crucial. This tutorial provides a comprehensive overview of methods and techniques for protecting individual privacy while conducting social network analysis. We examine differential privacy in depth, a rigorous mathematical framework that protects individual privacy while enabling accurate analysis of network structure and characteristics. Additionally, the tutorial explores a variety of examples and case studies that demonstrate the application of these techniques in practical scenarios.

Wednesday, September 27, 2023

Tutorial 5

13:00 – 16:00

Tutorial 5: Prompting and Fine-tuning Pre-trained Generative Language Models
Johny Moreira (UFAM – Brazil), Altigran da Silva (UFAM – Brazil), Luciano Barbosa (UFPE – Brazil)
Chair: Fabio Porto (LNCC)

There has been an explosion of available pre-trained and fine-tuned Generative Language Models (LMs). They vary in the number of parameters, architecture, training strategy, and training set size. Alongside this growth, alternative strategies exist to exploit these models, such as Fine-tuning and Prompt Engineering. However, many questions may arise throughout this process: Which model should be applied to a given task? Which strategies should be used? Will Prompt Engineering solve all tasks? What are the computational and financial costs involved? This tutorial will introduce and explore typical modern LM architectures with a hands-on approach to the available strategies.