Text-to-video model

A text-to-video model is a machine learning model that takes a natural-language description as input and generates a video matching that description.^[1]

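Video prediction, which aims to render realistic objects against a stable background, is performed with a recurrent neural network in a sequence-to-sequence model, joined to a connector convolutional neural network that encodes and decodes each frame pixel by pixel.^[2] The video itself is generated using deep learning.^[3]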
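The data flow described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names (`encode`, `rnn_step`, `decode`, `predict_frames`), the tiny frame and hidden sizes, the single convolution kernel, and the random, untrained weights. It is not the architecture of any cited system. It only shows how a per-frame convolutional encoder, a recurrent state update, and a per-frame decoder fit together in a sequence-to-sequence rollout.

```python
import math
import random

H, W, K, HIDDEN = 6, 6, 3, 8          # toy sizes, chosen only for illustration
FEAT = (H - K + 1) * (W - K + 1)      # length of the flattened conv feature map


def rand_matrix(rows, cols, rng):
    """Random (untrained) weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def encode(frame, kernel):
    """Convolutional encoder stand-in: one valid 2D convolution + ReLU, flattened."""
    feats = []
    for i in range(H - K + 1):
        for j in range(W - K + 1):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(K) for b in range(K))
            feats.append(max(0.0, s))
    return feats


def rnn_step(x, h, w_x, w_h):
    """Plain recurrent update: h' = tanh(W_x x + W_h h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_x, x), matvec(w_h, h))]


def decode(h, w_out):
    """Decoder stand-in: linear map to H*W pixels, squashed into [0, 1]."""
    flat = [1.0 / (1.0 + math.exp(-z)) for z in matvec(w_out, h)]
    return [flat[r * W:(r + 1) * W] for r in range(H)]


def predict_frames(context, n_future, seed=0):
    """Read the context frames, then roll out n_future predicted frames."""
    rng = random.Random(seed)
    kernel = rand_matrix(K, K, rng)
    w_x = rand_matrix(HIDDEN, FEAT, rng)
    w_h = rand_matrix(HIDDEN, HIDDEN, rng)
    w_out = rand_matrix(H * W, HIDDEN, rng)

    h = [0.0] * HIDDEN
    for frame in context:                 # encoder pass over observed frames
        h = rnn_step(encode(frame, kernel), h, w_x, w_h)

    outputs = []
    for _ in range(n_future):             # decoder pass: feed predictions back in
        frame = decode(h, w_out)
        outputs.append(frame)
        h = rnn_step(encode(frame, kernel), h, w_x, w_h)
    return outputs


if __name__ == "__main__":
    blank = [[[0.0] * W for _ in range(H)] for _ in range(3)]
    future = predict_frames(blank, n_future=2)
    print(len(future), "frames of size", len(future[0]), "x", len(future[0][0]))
```

Because the weights are random, the predicted frames carry no meaning. The sketch is only meant to make the encode, recur, decode loop concrete.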

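Methodology

Data collection and dataset preparation using clear videos from the Kinetics human action video dataset.
Training a convolutional neural network for video generation.
Extracting keywords from the text using natural language processing.
Testing the dataset in a conditional generative model that captures the static and dynamic information in the text, using a variational autoencoder and a generative adversarial network.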
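The keyword-extraction step can be sketched in a few lines. This is a minimal illustration, assuming a simple stop-word filter and frequency count; the source does not say which technique is actually used, and the `STOPWORDS` set and the `extract_keywords` helper are hypothetical names introduced here.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption for this sketch;
# practical systems use much larger lists or learned models).
STOPWORDS = {
    "a", "an", "the", "is", "are", "in", "on", "of",
    "and", "to", "with", "at", "by", "for",
}


def extract_keywords(prompt, top_k=5):
    """Return up to top_k content words from the prompt, most frequent first."""
    # Lowercase the prompt and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", prompt.lower())
    # Count the tokens that are not stop words.
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    # most_common orders by count; equal counts keep first-seen order.
    return [word for word, _ in counts.most_common(top_k)]


if __name__ == "__main__":
    print(extract_keywords("A man is playing a guitar on the beach"))
```

In a pipeline like the one listed above, keywords of this kind would serve as the conditioning signal for the generative model.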

Models

Various models exist, including open-source ones. CogVideo has published its code on GitHub.^[4] Meta Platforms offers text-to-video through makeavideo.studio.^[5]^[6]^[7] Google has used Imagen Video to convert text to video.^[8]^[9]^[10]^[11]^[12]

Antonia Antonova has presented another model.^[13]

References

[1] Artificial Intelligence Index Report 2023 (PDF) (Report). Stanford Institute for Human-Centered Artificial Intelligence. p. 98. "Multiple high quality text-to-video models, AI systems that can generate video clips from prompted text, were released in 2022."
[2] "Leading India" (PDF).
[3] Narain, Rohit (2021-12-29). "Smart Video Generation from Text Using Deep Neural Networks". Retrieved 2022-10-12.
[4] CogVideo, THUDM, 2022-10-12. Retrieved 2022-10-12.
[5] Davies, Teli (2022-09-29). "Make-A-Video: Meta AI's New Model For Text-To-Video Generation". W&B. Retrieved 2022-10-12.
[6] Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt". Medium. Retrieved 2022-10-12.
[7] "Meta's Make-A-Video AI creates videos from text". www.fonearena.com. Retrieved 2022-10-12.
[8] "google: Google takes on Meta, introduces own video-generating AI - The Economic Times". m.economictimes.com. Retrieved 2022-10-12.
[9] Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt". Medium. Retrieved 2022-10-12.
[10] "Nuh-uh, Meta, we can do text-to-video AI, too, says Google". www.theregister.com. Retrieved 2022-10-12.
[11] "Papers with Code - See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction". paperswithcode.com. Retrieved 2022-10-12.
[12] "Papers with Code - Text-driven Video Prediction". paperswithcode.com. Retrieved 2022-10-12.
[13] "Text to Video Generation". Antonia Antonova. Retrieved 2022-10-12.

