diff --git a/README.md b/README.md
index 441f0b7..885ff4d 100644
--- a/README.md
+++ b/README.md
@@ -2,18 +2,21 @@
 
 This is the official repo for the paper: CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers.
 
-<video src="assets/CogVideo_samples.mp4"></video>
+
+https://user-images.githubusercontent.com/48993524/170857367-2033c514-3c9f-4297-876f-2468592a254b.mp4
+
 
 ## Generated Samples
 
 **Video samples generated by CogVideo**. The actual text inputs are in Chinese. Each sample is a 4-second clip of 32 frames, and here we sample 9 frames uniformly for display purposes.
 
-![Overview](assets/intro-image.pdf)
+![Intro images](assets/intro-image.pdf)
 
 ![More samples](assets/appendix-moresamples.pdf)
 
 
 
-**CogVideo is able to generate relatively high-frame-rate videos. ** A 4-second clip of 32 frames is shown below. 
+**CogVideo is able to generate relatively high-frame-rate videos.**
+A 4-second clip of 32 frames is shown below.
 
-![Overview](assets/appendix-sample-highframerate.pdf)
\ No newline at end of file
+![High-frame-rate sample](assets/appendix-sample-highframerate.pdf)