
Academic Events

Korean AI Association

  >   ÇмúÇà»ç   >   ±¹³»Çмú´ëȸ

±¹³»Çмú´ëȸ

Speakers and Abstracts (20th)
 
Prof. Youngjae Yu (Yonsei University)
 
Title: Grounded Commonsense Reasoning for LLMs
 
Abstract
Human learning is inherently multimodal, encompassing observation, listening, reading, and communication to understand and learn from our environment. Significant advancements in machine learning fields relevant to these multimodal interactions, such as Speech Recognition and Computer Vision, have enabled the computational modeling of this innate learning process. Multimodal commonsense reasoning on massive web data closely mirrors this approach. In this presentation, I will discuss my recent work on curating multimodal datasets and developing Multimodal LLMs. Specifically, I will focus on foundational models that integrate the training of various tasks in multimodal language understanding. Additionally, I will extend this work to grounded commonsense reasoning, which not only involves perception but also provides explanations and facilitates communication based on video understanding. To this end, I will explore multimodal foundation models incorporating self-judgment to improve video understanding and commonsense reasoning.
 
Bio
Youngjae Yu is an Assistant Professor of Artificial Intelligence at Yonsei University, focusing on computer vision, natural language processing, and multimodal learning. Before joining Yonsei, he was a researcher at the Allen Institute for AI (AI2). He received his Ph.D. and B.S. in Computer Science and Engineering from Seoul National University. His research interests include video understanding and large language models, with a particular focus on large-scale video dataset curation for multimodal foundation models. His work has earned recognition, including the Best Paper Award at NAACL 2022, as well as two Outstanding Paper Awards at EMNLP 2023 and ACL 2024.
 

 
 
Prof. Dongha Lee (Yonsei University)
 
Title: Towards Enhanced Reasoning Capabilities of Large Language Models
 
Abstract
As Large Language Models (LLMs) continue to evolve, improving their reasoning capabilities remains a key challenge. In this talk, we explore strategies that enhance LLM performance by guiding their reasoning toward smarter, more accurate, and more reliable outputs. While effective prompting techniques have played a crucial role in shaping model behavior, recent advancements in fine-tuning and optimization methods provide additional avenues for improvement. This talk will examine both prompting strategies and broader model refinement techniques, highlighting how these approaches contribute to more coherent and effective reasoning. By exploring how different methodologies can reduce errors and enhance reliability, we aim to demonstrate their impact on achieving more robust AI performance across various use cases.
 
Bio

Dongha Lee is an assistant professor in the Department of Artificial Intelligence at Yonsei University. He holds a Ph.D. in Computer Science from POSTECH and completed his postdoctoral research with the Data Mining Group at the University of Illinois Urbana-Champaign (UIUC). There, he specialized in text mining, natural language processing, and artificial intelligence. His research primarily explores the development and utilization of symbolic knowledge derived from extensive web texts and language models. Currently, his work focuses on improving the reasoning abilities of AI models by integrating and enhancing both parametric and non-parametric knowledge.


 
 
Prof. Tae-Kyun Kim (KAIST)
 
Title: Image and 3D Shape Generation
 
Abstract
Following the motivations and challenges of 3D video generation, we present our recent works published at CVPR and ECCV 2024.
These include InterHandGen (two-hand interaction generation via cascaded reverse diffusion), arbitrary-scale image upscaling by a latent diffusion model with an implicit neural decoder, prompt augmentation for self-supervised text-guided image editing, and BiTT (bi-directional texture reconstruction of interacting two hands from a single image). We emphasize the use of diffusion models, diffusion sampling, implicit functions, and self-supervised learning.
 
Bio
Tae-Kyun (T-K) Kim has been a full Professor and the director of the Computer Vision and Learning Lab at the School of Computing, KAIST, since 2020, and was an adjunct reader at Imperial College London (ICL), UK, from 2020 to 2024. He led the Computer Vision and Learning Lab at ICL from 2010 to 2020. He obtained his PhD from the University of Cambridge in 2008 and held a Junior Research Fellowship (governing body) at Sidney Sussex College, University of Cambridge, from 2007 to 2010. His BSc and MSc are from KAIST. His research interests lie primarily in machine (deep) learning for 3D computer vision and generative AI, including articulated 3D hand/body reconstruction, face analysis and recognition, 6D object pose estimation, activity recognition, object detection/tracking, and active robot vision, leading to novel active and interactive visual sensing. He has co-authored over 100 academic papers in top-tier conferences and journals in the field, and has co-organised the HANDS and 6D Object Pose workshop series (in conjunction with CVPR/ICCV/ECCV) since 2015. He was the general chair of BMVC 2017 in London and the program co-chair of BMVC 2023, and is an Associate Editor of the Pattern Recognition and Image and Vision Computing journals. He regularly serves as an Area Chair for top-tier vision/ML conferences. He received the KUKA Best Service Robotics Paper Award at ICRA 2014, the 2016 Best Paper Award from the ASCE Journal of Computing in Civil Engineering, and a best paper finalist mention at CVPR 2020, and his co-authored algorithm for face image representation is an international standard in MPEG-7 ISO/IEC.
 

 
 
Prof. Seungyul Han (UNIST)
 
Title: Fundamentals and Applications of Reinforcement Learning
 
Abstract
Reinforcement Learning (RL) is a machine learning paradigm in which an agent interacts with its environment according to a policy and learns to maximize cumulative reward. This lecture covers the basic concepts of RL, its major algorithms (Q-learning, SARSA, etc.), and value-based and policy-based methodologies. It also introduces various fields where RL is applied in practice (robotics, autonomous driving, game AI, etc.). The aim is to help attendees understand the core principles of RL and explore its range of potential applications.
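As a rough illustration of the tabular Q-learning algorithm mentioned above, the following minimal sketch trains an agent on a toy one-dimensional corridor. The environment, hyperparameters, and function names are illustrative assumptions for this page, not material from the talk itself:

```python
import random

# Toy corridor (an assumed example, not from the talk): states 0..4,
# actions 0 = left / 1 = right; reaching state 4 gives reward +1 and
# ends the episode.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def step(state, action):
    """Move one cell left or right; reward only on entering the goal."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train(episodes=500, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if random.random() < EPSILON:
                action = random.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best next-state value
            q[state][action] += ALPHA * (
                reward + GAMMA * max(q[next_state]) - q[state][action]
            )
            state = next_state
    return q

q = train()
# After training, the greedy policy should point right in every
# non-goal state, since moving toward the goal maximizes return.
print([max((0, 1), key=lambda a: q[s][a]) for s in range(GOAL)])
```

SARSA, also named in the abstract, differs only in the bracketed target: it bootstraps from the action actually taken next (on-policy) rather than from `max(q[next_state])`.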
 
Bio
Seungyul Han is an assistant professor in the Graduate School of Artificial Intelligence and the Department of Electrical Engineering at UNIST. He received his Ph.D. in Electrical Engineering from KAIST in 2021 and joined UNIST after completing a postdoctoral fellowship. His research spans machine learning broadly, with a particular focus on reinforcement learning. Aiming at real-world applications of RL, he is actively conducting research in areas such as multi-agent systems, multi-task learning, domain adaptation, and autonomous driving systems.