This is coming up next Monday, so I am resending the announcement below.

Sugimoto

On Mon, 03 Feb 2025 13:01:34 +0900
Akihiro Sugimoto <sugimoto@nii.ac.jp> wrote:
> Dear all,
>
> The following three talks will be held in hybrid format. Everyone is
> welcome to attend.
>
> Sugimoto
> ------
> Date and time:
> Monday, February 17, 13:30-16:45
>
> Venue:
> National Institute of Informatics, 19F, Rooms 1902 & 1903, and
> Zoom link:
> https://us02web.zoom.us/j/89003851193?pwd=ZiSceBggwoA0OKtPNOqbqA8os79wNG.1
> Meeting ID: 890 0385 1193
> Passcode: 381228
>
> -------------------- 1st talk -----------------------
> Speaker: Zuzana Kukelova (Czech Technical University in Prague)
> https://cmp.felk.cvut.cz/~kukelova/
>
> Title: A Brief Introduction to Camera Geometry Estimation Solvers
>
> Abstract:
> We will briefly introduce the most common camera geometry estimation
> problems, including relative and absolute pose problems for calibrated,
> uncalibrated, and partially calibrated cameras. Starting with a short
> historical overview, we will then discuss the current state of the art
> for these problems, highlighting the challenges faced when aiming for
> efficient and robust solutions for camera geometry estimation.
>
> -------------------- 2nd talk -----------------------
> Speaker: Torsten Sattler (Czech Technical University in Prague)
> https://tsattler.github.io/
>
> Title: 3D Reconstruction with Gaussian Splatting
>
> Abstract:
> Accurate 3D reconstruction is a core computer vision problem with
> many applications, including autonomous robots such as self-driving
> cars, cultural heritage documentation, and content creation for the
> entertainment industry (movies, games, etc.). Traditionally, 3D
> reconstructions have been based on 3D meshes and point clouds.
> Recently, learning-based approaches, such as neural radiance fields
> (NeRFs) and, most recently, 3D Gaussian Splatting (3DGS), have become
> popular. These representations are learned from images with known
> intrinsics and extrinsics and generate (close-to) photorealistic
> representations of scenes and objects. Compared to NeRFs, which can
> be slow to train and slow to render, 3DGS offers both faster training
> and rendering. This talk first briefly reviews the original 3DGS
> formulation before identifying its shortcomings and explaining how to
> resolve them. In particular, we will discuss (i) how to handle
> artifacts in the reconstruction caused by a limited set of training
> viewpoints, (ii) how to extend the original formulation to handle
> images taken under different conditions (day, night, etc.), and (iii)
> how to extract accurate 3D meshes from 3DGS representations by
> defining a field on top of the 3D Gaussians used to represent the
> scene. In addition, we will briefly mention ongoing efforts to ensure
> that benchmark results are comparable and that comparisons are fair.
>
> -------------------- 3rd talk -----------------------
> Speaker: Ming-Hsuan Yang (University of California, Merced / Google DeepMind)
> https://faculty.ucmerced.edu/mhyang/
>
> Title: Video Understanding and Generation with Multimodal Foundation Models
>
> Abstract:
> Recent advances in vision and language models have significantly
> improved visual understanding and generation tasks. In this talk, I
> will present our latest research on designing effective tokenizers for
> transformers and our efforts to adapt frozen large language models for
> diverse vision tasks. These tasks include visual classification,
> video-text retrieval, visual captioning, visual question answering,
> visual grounding, video generation, stylization, outpainting, and
> video-to-audio conversion. If time permits, I will also discuss our
> recent findings in dynamic 3D vision.
>
> ----------------------------------------------------------------------------------