#12 [Vision AI] From CNN to ViT, Diffusion, and the GPT-4o multimodal LLM: a thorough guide to the lineage of generative AI

from SingularRadio - シンギュラーラジオ · host Keisuke / Takeshi

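This episode starts with AlexNet (2012), which swept ImageNet and heralded the dawn of deep learning, moves on to ViT, which brought the Transformer to vision with Patch + Attention, and to CLIP, which tied text and images together, and then reaches the frontier of image generation opened up by GAN, VAE, and Diffusion. It closes with multimodal LLMs, represented by Omni-Transformer 4o, covering the whole arc in one pass.

(00:00:00) Intro: where image AI stands today
(00:02:12) CNN
(00:04:24) ViT: bringing the Transformer to vision with Patch + Attention
(00:09:55) CLIP: the multimodal representations opened up by the Dual Encoder
(00:18:27) GAN
(00:26:18) VAE and a survey of generative models
(00:33:38) Diffusion
(00:37:08) Omni-Transformer 4o
(00:47:08) Will scaling laws solve the multimodal problem?

SingularRadio (シンギュラーラジオ) is a podcast offering deep knowledge and insight on technology, innovation, and the future of society. Hosts Keisuke and Takeshi, who study computer science at an overseas university (the University of British Columbia), dig into what is happening at the forefront of AI, robotics, startups, the economy, and more, delivering content that stimulates intellectual curiosity.

Operating company (株式会社日本自動化技術): https://japan-automation-technology.vercel.app
For work inquiries, please use the contact form on the site above or email [email protected].

Apple Podcast: https://podcasts.apple.com/us/podcast/id1809437976
Spotify: https://open.spotify.com/show/2nOYrpc9PhKQ5v7s81KzCW
X (Twitter) account: https://x.com/SingularRadio

#CNN #LeNet #AlexNet #ResNet #ViT #CLIP #GAN #Diffusion #生成AI #GPT4o #OmniTransformer #Stargate #AI解説 #深層学習 #multimodal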

Frequently Asked Questions

How long is this episode of SingularRadio - シンギュラーラジオ?

This episode is 57 minutes long.

When was this SingularRadio - シンギュラーラジオ episode published?

This episode was published on July 11, 2025.
