GU Junli, SUN Yihe. Introduction & Related work; Parallel encoder implementation; Test results and Analysis; Conclusions.


1 GU Junli, SUN Yihe

2
• Introduction & Related work
• Parallel encoder implementation
• Test results and Analysis
• Conclusions

3
• Parallel processing
  ◦ Real time
• Parallel processing type
  ◦ Cluster [5], MPP [4]
  ◦ Shared memory [6]

4
• MPI (Message Passing Interface)
  ◦ Processes communicate by passing messages
    ▪ Inefficient
• Shared memory
  ◦ Processes share the same data space
    ▪ Efficient

5
• Most MPI codes adopt a master-slave model, in which one master and several slaves do different jobs.
  ◦ Workload imbalance
  ◦ Communication cost is high
• On a typical shared memory CMP
  ◦ Each core has a private L1 cache
  ◦ All cores share a large L2 cache

6
• Balanced parallel scheme
  ◦ A strip-wise balanced parallel scheme

7
  ◦ Each process takes one strip.
  ◦ Each strip contains a number of slices
    ▪ S_n = Frame_size / P
  ◦ If S_n is not an integer, the strips cannot all be the same size, which creates a workload problem
• Data dependency
  ◦ Message passing
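The strip sizing above can be sketched in a few lines. This is only an illustration under stated assumptions: Frame_size is taken to be the number of slices in a frame, P the number of processes, and the function names `strip_sizes` and `strip_bounds` are hypothetical. Giving the leftover slices to the first few processes is one common way to handle a non-integer S_n; the slides do not say how the authors handle that case.

```python
from typing import List, Tuple


def strip_sizes(num_slices: int, num_procs: int) -> List[int]:
    """Split num_slices into num_procs strips whose sizes differ by at most one."""
    if num_procs <= 0:
        raise ValueError("num_procs must be positive")
    # base is floor(Frame_size / P); extra is the remainder to spread out.
    base, extra = divmod(num_slices, num_procs)
    # Assumption: the first `extra` processes each take one additional slice.
    return [base + 1 if rank < extra else base for rank in range(num_procs)]


def strip_bounds(num_slices: int, num_procs: int) -> List[Tuple[int, int]]:
    """Return (first_slice, end_slice) per process, with end_slice exclusive."""
    bounds = []
    start = 0
    for size in strip_sizes(num_slices, num_procs):
        bounds.append((start, start + size))
        start += size
    return bounds
```

For a hypothetical frame of 68 slices on 8 processes, S_n = 8.5 is not an integer, so `strip_sizes(68, 8)` gives four strips of 9 slices and four of 8.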

8
• Hybrid communication

9
  ◦ Combine MPI and shared memory
    ▪ To reduce the communication cost
  ◦ Example: reading a file and sending the data to the other processes takes 54.5 ms with MPI but 9 ms with shared memory (roughly 6 times faster).
• The memory allocation scheme has one global shared memory area that stores the original video data; all processes read their original strip data from it.
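The global shared area can be illustrated with Python's standard `multiprocessing.shared_memory` module. This is a minimal sketch, not the authors' implementation (the encoder in the slides is built on MPI and the JM reference code). The helper names `publish_frame`, `read_strip`, and `read_all_strips` are hypothetical, and the frame is assumed to be non-empty and to divide evenly into strips. In a real setup each worker process would call `read_strip` with the segment name; here the calls run in one process to keep the example self-contained.

```python
from multiprocessing import shared_memory
from typing import List


def publish_frame(frame: bytes) -> shared_memory.SharedMemory:
    """Copy one original frame into a new shared memory segment."""
    shm = shared_memory.SharedMemory(create=True, size=len(frame))
    shm.buf[:len(frame)] = frame
    return shm


def read_strip(shm_name: str, offset: int, length: int) -> bytes:
    """Attach to the segment by name and copy out one strip."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        return bytes(shm.buf[offset:offset + length])
    finally:
        shm.close()  # detach this handle; the segment itself stays alive


def read_all_strips(frame: bytes, num_procs: int) -> List[bytes]:
    """Publish a frame once, then read it back one strip per process rank."""
    strip_len = len(frame) // num_procs
    shm = publish_frame(frame)
    try:
        return [read_strip(shm.name, rank * strip_len, strip_len)
                for rank in range(num_procs)]
    finally:
        shm.close()
        shm.unlink()  # free the segment once every reader is done
```

The frame is written once and every reader copies only its own strip, so no per-process message carrying the frame data is needed.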

10
• Each process keeps three dedicated memory spaces: one for original data, a second for reconstructed data, and a third for up-sampled data.
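The per-process layout can be pictured as a small record with three buffers. This is a sketch under assumptions: the names `StripBuffers` and `allocate_strip_buffers` are hypothetical, and the `upsample_scale` parameter and its default are placeholders, since the slides do not give the size of the up-sampled area.

```python
from dataclasses import dataclass


@dataclass
class StripBuffers:
    """The three private memory spaces one process keeps for its strip."""
    original: bytearray        # source pixels of the strip
    reconstructed: bytearray   # strip after encoding and decoding
    upsampled: bytearray       # interpolated copy of the reconstructed strip


def allocate_strip_buffers(strip_bytes: int, upsample_scale: int = 2) -> StripBuffers:
    """Allocate zero-filled buffers for a strip of strip_bytes bytes."""
    # Assumption: up-sampling enlarges the strip by upsample_scale per dimension.
    upsampled_bytes = strip_bytes * upsample_scale * upsample_scale
    return StripBuffers(
        original=bytearray(strip_bytes),
        reconstructed=bytearray(strip_bytes),
        upsampled=bytearray(upsampled_bytes),
    )
```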

11
• Environment
  ◦ Two Intel Xeon E5310 processors @ 1.6 GHz, each with 4 cores.
• Test cases
  ◦ HD, VGA, SD, CIF and QCIF
• Version
  ◦ H.264 JM 10.2


13
• The speed improvement on the shared memory architecture is 25% higher than in the cluster case [5].


20
• Upgrading legacy MPI applications to the class of shared memory architectures can provide significant performance improvements.
• Optimizing the communication mechanism and further enhancements to the hybrid shared-memory and message-passing multi-core processor design can be expected to raise performance to still higher levels.
