(Due in class by end of lecture, Thursday, May 3rd, 2012) Description: As in the previous project, you will explore the effect of different branch predictor choices. You will use sim-outorder, which models all the execution aspects of the Alpha 21264, so the simulator reports the calculated CPI (the sim_CPI variable) for you to use in your comparisons. Because simulation of the out-of-order model is much slower, you can use the -fastfwd and -max:inst switches to limit the number of instructions executed. Skip at least 1M instructions and execute at least 10M instructions. Please report the numbers you used for each part. The syntax for modifying the branch prediction mechanisms is the following:
-bpred bimod               # branch predictor type {nottaken|taken|bimod|2lev|comb}
-bpred:bimod 256           # bimodal predictor config (<table size>)
-bpred:2lev 1 1024 8 0     # 2-level predictor config (<l1size> <l2size> <hist_size> <xor>)
-bpred:comb 1024           # combining predictor config (<meta_table_size>)
-bpred:ras 8               # return address stack size (0 for no return stack)
-bpred:btb 64 2            # BTB config (<num_sets> <associativity>)

Branch predictor configuration examples for 2-level predictor:
  Configurations: N, M, W, X
    N   # entries in first level (# of shift register(s))
    W   width of shift register(s)
    M   # entries in 2nd level (# of counters, or other FSM)
    X   (yes-1/no-0) xor history and address for 2nd level index
  Sample predictors:
    GAg     : 1, W, 2^W, 0
    GAp     : 1, W, M (M > 2^W), 0
    PAg     : N, W, 2^W, 0
    PAp     : N, W, M (M == 2^(N+W)), 0
    gshare  : 1, W, 2^W, 1

Predictor `comb' combines a bimodal and a 2-level predictor.
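The sample predictors above are written as N, W, M, X, while the -bpred:2lev option takes its values as <l1size> <l2size> <hist_size> <xor>. The following minimal Python sketch shows one way to turn a sample predictor into an option string. It assumes that l1size is N, l2size is M, hist_size is W, and xor is X, which matches the example values 1 1024 8 0 shown above. The function names are hypothetical helpers, not part of the simulator.

```python
def two_level_option(n, w, m, x):
    """Format a -bpred:2lev option string.

    Assumed mapping: N -> l1size, M -> l2size, W -> hist_size, X -> xor.
    The option order is <l1size> <l2size> <hist_size> <xor>.
    """
    return f"-bpred:2lev {n} {m} {w} {x}"


def gag(w):
    # GAg: one global history register, 2^W second-level entries, no xor.
    return two_level_option(1, w, 2 ** w, 0)


def pag(n, w):
    # PAg: N per-address history registers, 2^W second-level entries, no xor.
    return two_level_option(n, w, 2 ** w, 0)


def gshare(w):
    # gshare: like GAg, but history is xor-ed with the branch address.
    return two_level_option(1, w, 2 ** w, 1)


print(gag(8))      # -bpred:2lev 1 256 8 0
print(gshare(10))  # -bpred:2lev 1 1024 10 1
```

Under the same mapping, the setting 1 256 4 0 used in Part 1 has M = 256 and 2^W = 16, so it falls under the GAp pattern (M > 2^W).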
Project #2
Grading: Please submit a hard copy of the deliverables for each part, in order and clearly labeled. Discuss the results. DO NOT SUBMIT ANY SCRIPTS.
Part 1: Compare performance of several branch predictor types and different RAS configurations
Run sim-outorder for the three benchmarks used in the previous project.
(1) Baseline: Bimodal predictor with the default value for RAS. Note: These bimodal predictor values differ from the simulator defaults.
-bpred bimod -bpred:bimod 256 -bpred:ras 8 -bpred:btb 64 2
(2) Change the predictor type to the 2-level predictor with the following options.
-bpred 2lev -bpred:2lev 1 256 4 0 -bpred:ras 8 -bpred:btb 64 2
(3) Change the predictor type to the combining predictor with the following options.
-bpred comb -bpred:comb 256 -bpred:bimod 256 -bpred:2lev 1 256 4 0 -bpred:ras 8 -bpred:btb 64 2
(4) Change the return address stack (RAS) size to 4.
-bpred bimod -bpred:bimod 256 -bpred:ras 4 -bpred:btb 64 2
(5) Change the return address stack (RAS) size to 16.
-bpred bimod -bpred:bimod 256 -bpred:ras 16 -bpred:btb 64 2
Deliverables: Compare the results to the baseline and write your own analysis, with graphs, in your report. When you compare performance, use CPI and branch predictor hit rates.
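As an illustration of how the Part 1 runs could be organized, the sketch below assembles one command line per configuration, using the minimum instruction counts from the description (skip 1M, execute 10M). It also includes a small helper that pulls a named statistic such as sim_CPI out of captured simulator output. The benchmark arguments, the configuration labels, and the assumed output layout (statistic name at the start of a line, followed by its value) are assumptions for this example, so check them against your own runs.

```python
import re
import shlex

FASTFWD = 1_000_000     # instructions to skip (minimum from the description)
MAX_INST = 10_000_000   # instructions to execute (minimum from the description)

# Labels are hypothetical; option strings are the Part 1 configurations.
PART1_CONFIGS = {
    "bimod_ras8": "-bpred bimod -bpred:bimod 256 -bpred:ras 8 -bpred:btb 64 2",
    "2lev": "-bpred 2lev -bpred:2lev 1 256 4 0 -bpred:ras 8 -bpred:btb 64 2",
    "comb": ("-bpred comb -bpred:comb 256 -bpred:bimod 256 "
             "-bpred:2lev 1 256 4 0 -bpred:ras 8 -bpred:btb 64 2"),
    "bimod_ras4": "-bpred bimod -bpred:bimod 256 -bpred:ras 4 -bpred:btb 64 2",
    "bimod_ras16": "-bpred bimod -bpred:bimod 256 -bpred:ras 16 -bpred:btb 64 2",
}


def build_command(config, benchmark_args, simulator="sim-outorder"):
    """Return an argv list: simulator, instruction limits, predictor options, benchmark."""
    return [simulator,
            "-fastfwd", str(FASTFWD),
            "-max:inst", str(MAX_INST),
            *shlex.split(config),
            *benchmark_args]


def read_stat(output, name):
    """Return the number following `name` at the start of a line, or None.

    Assumes each statistic is printed as '<name> <value> ...' on its own line.
    """
    match = re.search(rf"^{re.escape(name)}\s+([-+0-9.eE]+)", output, re.MULTILINE)
    return float(match.group(1)) if match else None


# "benchmark.ss" is a placeholder for one of your three benchmarks.
for label, config in PART1_CONFIGS.items():
    print(label, " ".join(build_command(config, ["benchmark.ss"])))
```

Keeping the instruction limits in one place makes it easier to report the exact numbers used for each part.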
Part 3: Compare performance of the branch target buffer (BTB) with different configurations
Run sim-outorder for the three benchmarks used in the previous project.
(1) Baseline BTB configuration: 64 sets, 2-way associativity.
-bpred bimod -bpred:bimod 256 -bpred:btb 64 2
(2) Show the effect of the number of sets in the BTB with the following options.
-bpred bimod -bpred:bimod 256 -bpred:btb 32 2
-bpred bimod -bpred:bimod 256 -bpred:btb 128 2
(3) Show the effect of associativity when the total size of the BTB is fixed, with the following options.
-bpred bimod -bpred:bimod 256 -bpred:btb 32 4
-bpred bimod -bpred:bimod 256 -bpred:btb 128 1
Deliverables: Compare the performance of the BTB with different options to the baseline. Comparison of BTB performance should be made with addr_hits and addr_misses. Use graphs when you compare performance. Every graph must have a detailed discussion. Show the effect of BTB performance on overall performance, using CPI and branch predictor hit rates.
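To make the "fixed total size" condition in step (3) concrete, the sketch below computes the number of BTB entries as sets times associativity for each configuration in this part. It also shows one way to turn a pair of addr_hits and addr_misses counts into a hit rate. The function names are hypothetical, and defining the hit rate as hits divided by hits plus misses is an assumption of this example.

```python
def btb_entries(num_sets, associativity):
    """Total BTB entries for a <num_sets> <associativity> configuration."""
    return num_sets * associativity


def hit_rate(hits, misses):
    """Fraction of lookups that hit; returns 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


# (num_sets, associativity) pairs used in this part.
BTB_CONFIGS = [(64, 2), (32, 2), (128, 2), (32, 4), (128, 1)]

for num_sets, assoc in BTB_CONFIGS:
    print(f"-bpred:btb {num_sets} {assoc}: {btb_entries(num_sets, assoc)} entries")
```

The baseline (64 2) and the two step (3) configurations (32 4 and 128 1) all have 128 entries, so they differ only in associativity. The step (2) configurations have 64 and 256 entries.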