Parallel Programming: for Multicore and Cluster Systems - P39

Innovations in hardware architecture, like hyper-threading or multicore processors, mean that parallel computing resources are available for inexpensive desktop computers. In only a few years, many standard software products will be based on concepts of parallel programming implemented on such hardware, and the range of applications will be much broader than that of scientific computing, up to now the main application area for parallel computing.

Gaussian Elimination

5. Computation of the elimination factors: The function fact_loc() is used to compute the elimination factors l_ik for all elements a_ik owned by the processor. The elimination factors are stored in the buffer elim_buf.

5a. Distribution of the elimination factors: A single-broadcast operation is used to send the elimination factors to all processors in the same row group Ro_p(q); the corresponding communicator comm(Ro_p(q)) is used. The number (rank) of the root processor q for this broadcast operation in a group G is determined by the function rank(q, G).

6. Computation of the matrix elements: The computation of the matrix elements by compute_local_entries() and the backward substitution performed by backward_substitution() are similar to the pseudocode in Fig. . The main differences from the program in Fig. are that more communication is required and that almost all collective communication operations are performed on subgroups of the set of processors and not on the entire set of processors. (A hedged C/MPI sketch of phases 5, 5a, and 6 is given at the end of this excerpt.)

Analysis of the Parallel Execution Time

The analysis of the parallel execution time of the Gaussian elimination uses functions expressing the computation and communication times depending on the characteristics of the parallel machine (see also Sect. ). The function describing the parallel execution time of the program in Fig. additionally contains the parameters p1, p2, b1, and b2 of the parameterized data distribution in Formula . In the following, we model the parallel execution time of the Gaussian elimination with checkerboard distribution, neglecting pivoting and backward substitution for simplicity (see also [147]). These are phases 4, 5, 5a, and 6 of the Gaussian elimination. For the derivation of functions reflecting the parallel execution time, these four SPMD computation phases can be considered separately, since there is a barrier synchronization after each phase. For a communication phase, a formula describing the time of a collective …
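The argument above, that the four phases can be analyzed separately because of the barrier synchronization after each phase, amounts to summing per-phase times over all elimination steps. The following is a minimal sketch of that decomposition, assuming the notation T_sb(p, m) for the time of a single-broadcast of m values among p processors and assuming that p1 and p2 denote the numbers of processor rows and columns of the virtual processor grid; neither convention is taken from this excerpt, and the model for the broadcast phase is illustrative, not the book's formula.

% Hedged sketch: total parallel time as a sum of per-step, per-phase times.
% T_4, T_5, T_5a, T_6 are the (maximum) phase times in elimination step k;
% the model for T_5a is an illustrative assumption, not the book's formula.
\[
  T_{\mathrm{par}} \;=\; \sum_{k=1}^{n-1}
    \bigl( T_4(k) + T_5(k) + T_{5a}(k) + T_6(k) \bigr),
  \qquad
  T_{5a}(k) \;\approx\; T_{sb}\!\left(p_2,\, \frac{n-k}{p_1}\right).
\]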

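The pseudocode figures referenced above are not part of this excerpt. As a rough, non-authoritative illustration of how phases 5, 5a, and 6 fit together, here is a minimal C/MPI sketch. It assumes a checkerboard distribution in which a_loc stores the local block in row-major order, glob_row and glob_col map local to global indices, pivot_row holds the pivot-row elements a_kj for the locally stored columns (received in an earlier phase), owns_col_k flags the processes that store column k, root_rank is their rank inside the row-group communicator comm_row, and elim_buf is large enough for the local factors. All of these identifiers are illustrative assumptions; this is not the book's fact_loc/compute_local_entries code.

#include <mpi.h>

/* Hedged sketch of elimination step k for a checkerboard distribution.
 * All identifiers are illustrative assumptions, not the book's code. */
void elimination_step(double *a_loc, int loc_rows, int loc_cols, int k,
                      int (*glob_row)(int), int (*glob_col)(int),
                      double pivot_akk, const double *pivot_row,
                      double *elim_buf, int owns_col_k,
                      int root_rank, MPI_Comm comm_row)
{
    /* All members of a row group store the same global rows, so each of
     * them can compute the number of elimination factors redundantly. */
    int n_factors = 0;
    for (int i = 0; i < loc_rows; i++)
        if (glob_row(i) > k)
            n_factors++;

    /* Phase 5: the owners of column k compute l_ik = a_ik / a_kk and
     * store one factor per local row below the pivot row in elim_buf. */
    if (owns_col_k) {
        int cnt = 0;
        for (int i = 0; i < loc_rows; i++) {
            if (glob_row(i) > k) {
                for (int j = 0; j < loc_cols; j++)
                    if (glob_col(j) == k)
                        elim_buf[cnt] = a_loc[i * loc_cols + j] / pivot_akk;
                cnt++;
            }
        }
    }

    /* Phase 5a: single-broadcast of the factors within the row group;
     * the root is the member of the row group that owns column k. */
    MPI_Bcast(elim_buf, n_factors, MPI_DOUBLE, root_rank, comm_row);

    /* Phase 6: update the locally stored elements
     * a_ij = a_ij - l_ik * a_kj for local indices with i > k and j > k. */
    int f = 0;
    for (int i = 0; i < loc_rows; i++) {
        if (glob_row(i) > k) {
            for (int j = 0; j < loc_cols; j++)
                if (glob_col(j) > k)
                    a_loc[i * loc_cols + j] -= elim_buf[f] * pivot_row[j];
            f++;
        }
    }
}

Computing n_factors redundantly on every member of the row group keeps the MPI_Bcast count consistent across processes without an extra message for the length.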