Keynotes
Prof. Jungong Han
Tsinghua University, China
Multimodal Scene Understanding and Multimodal Foundation Model
Professor Jungong Han is a Xinghua Chair Professor with the Department of Automation at Tsinghua University. Previously, he held chair and tenured associate professor positions at leading European universities. His research has made notable contributions to dynamic neural networks, multimodal visual perception, brain-inspired learning, and foundation model optimization. In these areas, he has published over 200 papers in leading journals and conferences, accumulating more than 33,000 citations and an H-index of 84, with consistent recognition among Stanford University’s Global Top 2% Scientists. His AI innovations, such as a CSI Award-winning content retrieval system, have been successfully translated into industrial applications and recognized with a Technology & Engineering Emmy Award. He is a Fellow of the IAPR and the AAIA.
Abstract
Our world is becoming increasingly interconnected through diverse data sources, and understanding complex scenes from modalities such as vision, language, and depth has become central to artificial intelligence. This talk will explore multimodal scene understanding—how machines integrate and reason across different sensory inputs to perceive, interpret, and interact with their environments. We will present recent advances in model architectures, fusion strategies, applications, and multimodal foundation models, illustrating how these developments push the boundaries of perception and cognition. Finally, we will discuss the challenges and opportunities in building systems capable of cross-modal learning, robustness, contextual awareness, and human-like understanding.
Prof. Dusit Niyato
Nanyang Technological University, Singapore
Toward Scalable Generative AI via Mixture of Experts in Mobile Edge Networks and Metaverse
Dusit Niyato is currently a President’s Chair Professor in the College of Computing & Data Science (CCDS), Nanyang Technological University, Singapore. His research interests are in the areas of mobile generative AI, edge intelligence, quantum computing and networking, and incentive mechanism design. He is currently serving as Editor-in-Chief of IEEE Transactions on Network Science and Engineering (TNSE). He is also an area editor of IEEE Communications Surveys and Tutorials and IEEE Transactions on Vehicular Technology (TVT), a topical editor of IEEE Internet of Things Journal (IoTJ), a lead series editor of IEEE Communications Magazine, and an associate editor of IEEE Transactions on Wireless Communications (TWC) and IEEE Transactions on Communications. He is a Member-at-Large of the Board of Governors of the IEEE Communications Society for 2024-2026. He was named a Highly Cited Researcher in Computer Science from 2017 to 2024. He is a Fellow of the IEEE and a Fellow of the IET.
Abstract
TBA
