Prof. Yong Jae Lee (University of Wisconsin-Madison)
Title: Beyond Understanding: Toward Controllable and Agentic Multimodal Models
Abs:
The field of AI has been undergoing a transformative shift with the emergence of generalist models capable of performing a wide range of understanding and generation tasks. Trained on massive, internet-scale datasets---often unlabeled or weakly labeled---many of these models are multimodal, seamlessly integrating vision, language, audio, and action. In this talk, I will first present our work on LLaVA, a family of intelligent assistants that interpret the visual world and communicate naturally in language. I will highlight strategies to make these models more controllable, efficient, and agentic, enabling them to not only describe but also act upon the world around them. I will conclude with reflections on current limitations and opportunities for advancing toward more grounded, interactive AI systems.
Bio:
Yong Jae Lee is a Professor in the Department of Computer Sciences at the University of Wisconsin-Madison, and a Research Scientist at Adobe Research. His core research interests are in computer vision and machine learning, with a focus on creating robust AI systems that can understand our multimodal world with minimal human supervision. Before joining UW-Madison in 2021, he spent one year as an AI Visiting Faculty at Cruise, and before that, six years as an Assistant and then Associate Professor at UC Davis. He received his PhD from the University of Texas at Austin in 2012 advised by Kristen Grauman, and was a postdoc at Carnegie Mellon University (2012-2013) and UC Berkeley (2013-2014) advised by Alyosha Efros. He is a recipient of the Army Research Office Young Investigator Program Award, NSF CAREER Award, industry awards from Amazon, Adobe, Samsung, and Sony, UC Davis College of Engineering Outstanding Junior Faculty Award, UW-Madison SACM Student Choice Professor of the Year Award, Susan Beth Horwitz Professorship, and H. I. Romnes Faculty Fellowship. He and his collaborators received the Most Innovative Award at the COCO Object Detection Challenge ICCV 2019 and the Best Paper Award at BMVC 2020.
Prof. Kyunghyun Cho (New York University)
Title: Reality Checks
Abs:
Despite its amazing success, leaderboard chasing has become something researchers dread and mock. When implemented properly and executed faithfully, leaderboard chasing can lead to fast and easily reproducible progress in science, as is evident from the remarkable progress we have seen with machine learning, or more broadly artificial intelligence, in recent decades. That does not mean, however, that it is easy to implement and execute leaderboard chasing properly. In this talk, I will go over four case studies demonstrating issues that ultimately prevent leaderboard chasing from being a valid scientific approach. The first case study concerns the lack of proper hyperparameter tuning in continual learning, the second the lack of consensus on evaluation metrics in machine unlearning, the third the challenges of properly evaluating the evaluation metrics themselves in free-form text generation, and the final one wishful thinking. By going over these cases, I hope we can collectively acknowledge some of our own fallacies, think about the underlying causes behind them, and come up with better ways to approach artificial intelligence research.
- Education
- Career Highlights