PaliGemma Implementation Part 2
modeling_gemma
modeling_gemma
modeling_siglip, processing_paligemma
Segment Anything
Mistral 7B, Mixtral 8x7B
Phi-3-3.8B (Multi-turn PE, Generated Knowledge PE)
Mistral-7B (CoT PE, Zero-shot PE)
LLaMA-3-8B (Multi-turn PE, Few-shot PE)
Flash Attention Concepts, Hands-on Code Practice
Flash Attention Concepts, Hands-on Code Practice
Hugging Face, Ollama, LangChain, VectorDB, RAG
LLM Evaluation, Evaluation of LLM-based Systems
sLLM, sLLM vs LLM, sLLM Examples
94 Architectures
94 Architectures
VLM downstream tasks
arXiv 2024
Inference
Libraries for LLM Inference
Building a DPO Dataset & Running DPO
SFT Data & Full Fine-tuning
Evolving
Data Generation with LLMs
Types and Characteristics of Open Source Models
Preprocessing & Generating DPO Data
Multi-GPU
FSDP, ZeRO Examples
Distributed Processing Techniques
Running LLMs in a Single GPU Environment
Hugging Face & PEFT
GPU vs CPU
LLM & GPU
NeurIPS 2024
arXiv 2024
arXiv 2024
arXiv 2024
arXiv 2024
arXiv 2024
arXiv 2024
arXiv 2024
Diffusion Models and Representation Learning: A Survey (TPAMI 2024)
Diffusion Models and Representation Learning: A Survey (TPAMI 2024)
MME, MMMU, GQA, ChartQA, POPE, NoCaps, TextVQA
ICML 2024 Oral
arXiv 2024
CVPR 2023 Highlighted Paper
arXiv 2023
Proximal Policy Optimization, Direct Preference Optimization
Offload, DeepSpeed
Float32 vs Float16 vs BFloat16
NeurIPS 2023 Oral
ICLR 2024 Oral
feat. ChatGPT
arXiv 2024
A Survey on Speech Large Language Models
A Survey on Speech Large Language Models
A Survey on Speech Large Language Models
Multimodal Transformer, Cross-modal attention, self-attention
Signal Data, Wav2Vec, SincNet, PASE
Signal Data, Wav2Vec, SincNet, PASE
Signal Data, Fourier Transform, MFCC
Signal Data, Fourier Transform, MFCC
Multimodal Learning, Multimodal Representations
Multimodal Learning, Translation
Multimodal Learning, Multimodal Representations
An Introduction to Multimodal Deep Learning