Self-Supervised Cross-View Action Synthesis


1 Self-Supervised Cross-View Action Synthesis
Kara Schatz
Advisor: Dr. Yogesh Rawat
UCF CRCV – REU, Summer 2019

2 Project Goal
Synthesize a video from an unseen view.

3 Project Goal
Synthesize a video from an unseen view.
Given: a video of the same scene from a different viewpoint, and appearance conditioning from the desired viewpoint.

4 Approach
This diagram shows the approach we use to accomplish our goal. One network learns the appearance of the desired view, and another learns a representation of the 3D pose from a different view of the video. Both outputs are fed into a video generator that reconstructs the video from the desired view. During training, we run the network on two different views and reconstruct both viewpoints. Once trained, the model needs only one view of the video and one frame of the desired view.
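
The data flow described for the approach can be sketched in a few lines. This is only an illustration of how the pieces connect, not the project's code: the three functions are trivial hypothetical stand-ins for the real networks, a video is a list of frames, a frame is a flat list of floats, the "one frame of the desired view" is assumed to be the first frame, and mean squared error is an assumed reconstruction loss.

```python
# Hypothetical stand-ins for the appearance network, the pose network,
# and the video generator. None of these are the real models.

def appearance_encoder(frame):
    """Stand-in: summarize one frame of the desired view by its mean value."""
    return sum(frame) / len(frame)

def pose_encoder(video):
    """Stand-in: per-frame change relative to the first frame of the video."""
    base = sum(video[0]) / len(video[0])
    return [sum(frame) / len(frame) - base for frame in video]

def video_generator(appearance, pose, frame_size):
    """Stand-in: combine appearance with per-frame pose to build a video."""
    return [[appearance + p] * frame_size for p in pose]

def mean_squared_error(video_a, video_b):
    """Average squared difference over all values of all frames."""
    squared = [
        (a - b) ** 2
        for frame_a, frame_b in zip(video_a, video_b)
        for a, b in zip(frame_a, frame_b)
    ]
    return sum(squared) / len(squared)

def synthesize(source_video, target_frame):
    """Inference: one view of the video plus one frame of the desired view."""
    appearance = appearance_encoder(target_frame)
    pose = pose_encoder(source_video)
    return video_generator(appearance, pose, len(target_frame))

def training_loss(view_1, view_2):
    """Training: reconstruct both viewpoints, each from the other view's pose."""
    recon_2 = synthesize(view_1, view_2[0])  # assumed: first frame as conditioning
    recon_1 = synthesize(view_2, view_1[0])
    return mean_squared_error(recon_1, view_1) + mean_squared_error(recon_2, view_2)

view_1 = [[0.0, 0.0], [1.0, 1.0]]
view_2 = [[2.0, 2.0], [3.0, 3.0]]
loss = training_loss(view_1, view_2)
```

In this toy setup the two views differ only by a constant offset, so each view is reconstructed exactly from the other and the loss is zero. With real networks the same symmetric structure would apply, but the loss terms and conditioning frame may differ from what is assumed here.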

5 Datasets
NTU: 13K+ training videos, 5K+ testing videos, 3 camera angles (-45°, 0°, +45°)

6 Datasets
NTU: 13K+ training videos, 5K+ testing videos, 3 camera angles (-45°, 0°, +45°)
Panoptic: 3800 training samples, 500 testing samples, 100 cameras
Panoptic has far fewer samples but many more viewpoints, so its training set is more diverse.
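
To put a rough number on the difference in viewpoint diversity, the camera counts above can be turned into counts of (source view, target view) pairs. This is a back-of-the-envelope sketch that assumes every sample is recorded by every camera and that any two distinct cameras form a usable pair, which may not hold in practice. The function name is hypothetical.

```python
from itertools import permutations

def ordered_view_pairs(num_cameras):
    """Count (source, target) pairs made of two distinct cameras."""
    return sum(1 for _ in permutations(range(num_cameras), 2))

ntu_pairs = ordered_view_pairs(3)         # 3 camera angles
panoptic_pairs = ordered_view_pairs(100)  # 100 cameras

# The NTU pairs can be listed explicitly from the three angles.
ntu_angle_pairs = list(permutations([-45, 0, 45], 2))
```

Under these assumptions NTU offers 6 ordered view pairs per sample while Panoptic offers 9900.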

7 Total Loss vs. Epochs
Batch size = 20, frame count = 16, skip rate = 2
Plots: NTU, Panoptic
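
The frame count and skip rate settings can be read as a clip-sampling rule. The sketch below assumes that a skip rate of 2 means keeping every second frame; the slide does not define the term, so it could also mean skipping two frames between kept frames. The function name and the `start` parameter are hypothetical.

```python
def sample_clip(frames, frame_count=16, skip_rate=2, start=0):
    """Keep `frame_count` frames, taking every `skip_rate`-th frame from `start`."""
    # Number of source frames spanned by the clip, first to last kept frame.
    span = (frame_count - 1) * skip_rate + 1
    if start < 0 or start + span > len(frames):
        raise ValueError("video is too short for the requested clip")
    return frames[start:start + span:skip_rate]

video = list(range(40))  # frame indices stand in for real frames
clip = sample_clip(video)
```

With these settings a clip covers 31 consecutive source frames, and a batch of 20 would hold 20 such clips.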

9 Output Frames

10 Output Frames
NTU
We noticed that people are often cropped out of the Panoptic frames.

11 Output Frames
Panoptic, NTU
We noticed that people are often cropped out of the Panoptic frames. One likely difference is that the colors in Panoptic are so close to one another that they are hard to differentiate.

12 Modified Network
After that, I can start making changes to try to improve the model. I can change the network, the loss function, and the data input strategies to see how each affects performance.

14 Modified Network
Diagram labels: Key Point Extraction, Key-points

15 Modified Network
Diagram labels: Key Point Extraction, Transformation, viewpoint, Key-points, Estimated Keypoints
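
One possible reading of the diagram labels is that key points extracted from one view are transformed, given the viewpoint, into estimated key points for the other view, which can then be compared with the key points extracted there. The sketch below illustrates that reading only. It assumes 3D key points and models the transformation as a fixed rotation about the vertical axis by the viewpoint angle, whereas the transformation in the modified network is presumably learned. All names are hypothetical.

```python
import math

def rotate_keypoints(keypoints, angle_degrees):
    """Rotate (x, y, z) key points about the vertical y-axis."""
    theta = math.radians(angle_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in keypoints]

def keypoint_error(estimated, extracted):
    """Mean squared distance between matching key points."""
    total = sum(
        (a - b) ** 2
        for p, q in zip(estimated, extracted)
        for a, b in zip(p, q)
    )
    return total / len(estimated)

# Key points as seen from one view (illustrative values only).
keypoints_view_1 = [(1.0, 0.0, 0.0), (0.0, 1.5, 0.5)]

# Estimate the key points for a view 45 degrees away, then map them back.
estimated_view_2 = rotate_keypoints(keypoints_view_1, 45.0)
round_trip = rotate_keypoints(estimated_view_2, -45.0)
error = keypoint_error(round_trip, keypoints_view_1)
```

Rotating by +45° and then by -45° returns the original key points up to floating-point error, which is the kind of consistency check a comparison between estimated and extracted key points could provide.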

17 Total Loss vs. Epochs
Dataset = NTU, batch size = 20, frame count = 16, skip rate = 2
Curves: new network, old network

18 Total Loss vs. Epochs
Dataset = Panoptic, batch size = 20, frame count = 16, skip rate = 2
Curves: new network, old network

19 Next Steps
Reconstruction with new network

20 Next Steps
Reconstruction with new network
Fix dataset issues
Missing data
Cropping people out

21 Next Steps
Reconstruction with new network
Fix dataset issues
Missing data
Cropping people out
Using close cameras

22 Next Steps
Reconstruction with new network
Fix dataset issues
Missing data
Cropping people out
Using close cameras
Modify network design

