Abs:
Image synthesis has been a hot topic in computer vision and graphics, showing remarkable performance in applications such as image translation and image editing. As research on generative modeling has matured, it has become relatively easy to create high-resolution, photorealistic images. However, today's users want models that give them full control over the creation process, beyond simply generating high-quality images. One intuitive way to give users such control is to use a simple text input as a condition to generate the desired image or to manipulate specific objects in the image. In this talk, I will present recent trends in the development of text-to-image translation models, from GANs to diffusion models.
Bio:
Recently, we have seen remarkable progress in image generation and translation, in particular in text-to-image translation, which synthesizes high-quality images reflecting the semantic meaning of an input text. Diffusion models play a major role in this progress. In this talk, I will present how diffusion models work in detail and discuss future research directions.
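To make the denoising idea concrete, the following is a minimal sketch of a DDPM-style diffusion process under standard assumptions; the schedule and the `add_noise`, `denoise_step`, and `eps_model` names are illustrative, not taken from the talk.

```python
import torch

# Illustrative DDPM-style forward and reverse steps (all names are placeholders).
# Forward process: q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I).

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)    # cumulative product \bar{alpha}_t

def add_noise(x0, t):
    """Sample x_t from q(x_t | x_0) in closed form; also return the noise used."""
    eps = torch.randn_like(x0)
    abar = alpha_bars[t]
    return abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps, eps

def denoise_step(xt, t, eps_model):
    """One reverse step using a noise-prediction network eps_model(x_t, t)."""
    eps_hat = eps_model(xt, t)
    mean = (xt - betas[t] / (1.0 - alpha_bars[t]).sqrt() * eps_hat) / alphas[t].sqrt()
    if t > 0:                                # add noise except at the final step
        mean = mean + betas[t].sqrt() * torch.randn_like(xt)
    return mean

# Training minimizes MSE between eps and eps_model(x_t, t); a text-to-image model
# additionally conditions eps_model on a text embedding.
x0 = torch.randn(4, 3, 32, 32)               # dummy "images"
xt, eps = add_noise(x0, t=500)
x_prev = denoise_step(xt, t=500, eps_model=lambda x, t: torch.zeros_like(x))
```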
To realize artificial general intelligence (AGI), an AI system must possess two features: task generalization and self-learning. A number of recent learning paradigms, such as meta-learning, automated learning, continual learning, and generative pre-training, aim to achieve these features; agent learning is another, and it is most closely in line with how humans learn to perform multiple tasks through trial and error. In particular, agent learning can be defined as learning from experience to perform optimal actions given observations, with reinforcement learning as a core component. On top of reinforcement learning, it generally combines sequential modeling and world modeling.
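As a minimal illustration of this definition, the sketch below runs tabular Q-learning on a hypothetical toy chain environment: the agent learns, purely from experienced transitions, which action to take in each observed state. The environment and all names are assumptions for illustration only.

```python
import numpy as np

# Toy chain environment + tabular Q-learning: the agent observes its position,
# acts (0 = left, 1 = right), and learns from (state, action, reward, next state)
# experiences; reward 1 is given only at the rightmost state.

N_STATES, N_ACTIONS = 6, 2

def step(state, action):
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, float(done), done      # reward 1.0 on reaching the goal

Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.95, 0.2            # learning rate, discount, exploration

for episode in range(2000):
    s, done, t = 0, False, 0
    while not done and t < 50:
        if np.random.rand() < eps or Q[s, 0] == Q[s, 1]:
            a = np.random.randint(N_ACTIONS)  # explore (or break ties randomly)
        else:
            a = int(Q[s].argmax())            # exploit the current estimate
        s_next, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() * (not done) - Q[s, a])
        s, t = s_next, t + 1

print(Q.argmax(axis=1))   # greedy policy; non-terminal states should prefer "right" (1)
```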
This tutorial will first briefly review basic reinforcement learning and then focus on recent distributed deep reinforcement learning for large-scale agent learning. Throughout the tutorial, a number of representative agent learning problems, including sparse rewards, high-dimensional state spaces, and procedurally generated or partially observable environments, will be introduced, along with how these problems can be solved using recent machine learning algorithms. In addition, recent approaches for generalizable agent learning based on multi-task / multi-modal learning, self-supervised representation learning, world modeling, and offline reinforcement learning will be introduced and discussed.
Continual learning, especially class-incremental learning, uses an episodic memory of past knowledge to improve performance. Updating a model with the episodic memory is similar to (1) updating a model with the past knowledge stored in the memory through a few-shot learning scheme, and (2) learning from an imbalanced distribution of past and present data. We address unrealistic factors in popular continual learning setups and propose a few ideas to move continual learning research toward realistic scenarios.
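As a rough, hypothetical illustration of such an episodic-memory update (not the specific method proposed in this talk), the sketch below mixes replayed exemplars of past classes into each current-task mini-batch; all function and variable names are placeholders.

```python
import random
import torch
import torch.nn.functional as F

# Sketch of one update in class-incremental learning with an episodic memory.
# Old exemplars are few and heavily outnumbered by current-task samples, which
# is where the few-shot and class-imbalance views above come from.

def update_memory(memory, x_batch, y_batch, capacity=200):
    """Naive FIFO memory of (example, label) pairs, capped at `capacity`."""
    for x, y in zip(x_batch, y_batch):
        memory.append((x.detach(), int(y)))
    del memory[:-capacity]

def training_step(model, optimizer, cur_x, cur_y, memory, mem_batch_size=32):
    """One SGD step on the current mini-batch mixed with replayed exemplars."""
    if memory:
        replay = random.sample(memory, min(mem_batch_size, len(memory)))
        mem_x = torch.stack([x for x, _ in replay])
        mem_y = torch.tensor([y for _, y in replay])
        cur_x = torch.cat([cur_x, mem_x])     # joint batch: new task + old classes
        cur_y = torch.cat([cur_y, mem_y])
    loss = F.cross_entropy(model(cur_x), cur_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```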
Stomach cancer is the third leading cause of cancer mortality worldwide. Early detection and treatment remain the best measures to improve patient survival. Early gastric cancer (EGC) is hard to find and can easily be overlooked. Currently, screening for EGC is based on direct visualization during gastroscopy, and meticulous examination of the whole stomach using current techniques can be time-consuming. Because early cancer detection significantly improves the prognosis, the need for reliable EGC detection systems is increasing. In this tutorial, I will discuss how the application of AI to endoscopy could help endoscopists detect cancer and improve survival and quality of life for patients.
Bio (Prof. 정준원):
Supervised learning with deep neural networks has brought phenomenal advances to many fields of research, but the performance of such systems relies heavily on the quality and quantity of annotated databases tailored to the particular application. It can be prohibitively difficult to manually collect and annotate databases for every task. There is a plethora of data on the internet that is not used in machine learning due to the lack of such annotations. Self-supervised learning allows a model to learn representations using properties inherent in the data itself, such as natural co-occurrence.
In this talk, I will introduce recent works on self-supervised learning of audio and speech representations. Recent work demonstrates that phonetic and semantic representations of audio and speech can be learnt from unlabelled audio and video. The learnt representations can be used for downstream tasks such as automatic speech recognition, speaker recognition, face recognition and lip reading. Other noteworthy applications include localizing sound sources in images and separating simultaneous speech from video.
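One common way to exploit such natural audio-visual co-occurrence is a cross-modal contrastive (InfoNCE-style) objective; the sketch below is a generic illustration with hypothetical names, not the specific method of the works discussed in the talk.

```python
import torch
import torch.nn.functional as F

# Generic cross-modal contrastive (InfoNCE-style) objective exploiting natural
# co-occurrence: audio and video clips recorded together form positive pairs,
# and all other pairings in the batch serve as negatives.

def audio_visual_nce(audio_emb, video_emb, temperature=0.07):
    """audio_emb, video_emb: (batch, dim) embeddings of co-occurring clips."""
    a = F.normalize(audio_emb, dim=-1)
    v = F.normalize(video_emb, dim=-1)
    logits = a @ v.t() / temperature          # pairwise cosine similarities
    targets = torch.arange(a.size(0))         # the i-th audio matches the i-th video
    # Symmetric loss: audio-to-video and video-to-audio retrieval.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Toy usage with random tensors standing in for real audio/video encoder outputs.
loss = audio_visual_nce(torch.randn(8, 128), torch.randn(8, 128))
```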
Graph data are widely used across the natural and social sciences, including social networks, communication, and protein interactions. Graph Neural Networks (GNNs) are one family of graph representation learning methods that take graph data as input, and they have shown higher performance than existing methods across a wide range of graph applications such as node classification, link analysis, and clustering. This tutorial will cover the basic concepts of GNNs, their theoretical motivation, and real-world application cases, and will introduce several noteworthy recent trends.
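To make the message-passing idea concrete, here is a minimal GCN-style layer sketch operating on a dense adjacency matrix; the class and variable names are illustrative and not tied to any particular library covered in the tutorial.

```python
import torch
import torch.nn as nn

# Minimal GCN-style layer: each node aggregates degree-normalized neighbor
# features, then applies a shared linear transform and a nonlinearity -- the
# core message-passing step behind most GNN variants.

class SimpleGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        """x: (num_nodes, in_dim) features; adj: (num_nodes, num_nodes) adjacency."""
        a_hat = adj + torch.eye(adj.size(0))              # add self-loops
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt          # D^{-1/2} (A + I) D^{-1/2}
        return torch.relu(self.linear(a_norm @ x))        # aggregate then transform

# Toy usage: 4 nodes on a path graph, 8-dim features -> 16-dim node representations.
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])
h = SimpleGCNLayer(8, 16)(torch.randn(4, 8), adj)
```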