To specify the configuration we want, we will also use the following
command-line options:
--list-cpu-types: List available CPU types.
--cpu-type=CPU_TYPE: Type of CPU to run with.
--caches: Enable the L1 caches. You need this flag if you want your cache
specification to take effect.
--l2cache: Enable the L2 cache. As with --caches, you need this flag if you
want your L2 specification to take effect.
--l1d_size=L1D_SIZE: Set size of L1 data cache.
--l1i_size=L1I_SIZE: Set size of L1 instruction cache.
--l2_size=L2_SIZE: Set size of L2 cache.
--l1d_assoc=L1D_ASSOC: Set associativity of L1 data cache (DCache).
--l1i_assoc=L1I_ASSOC: Set associativity of L1 instruction cache (ICache).
--l2_assoc=L2_ASSOC: Set associativity of L2 cache.
--cacheline_size=CACHELINE_SIZE: Set the cache block size. This setting
affects all caches.
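The options above can be combined into a single gem5 invocation. The following is a minimal Python sketch that assembles such a command line; it is an illustration, not part of the assignment. The function name `build_se_command`, the default values, the binary and script paths, the workload path, and the use of `-c` to pass the workload are all assumptions and should be checked against your own gem5 setup.

```python
from typing import List


def build_se_command(
    gem5_binary: str,
    se_script: str,
    workload: str,
    cpu_type: str = "TimingSimpleCPU",  # assumed default; see --list-cpu-types
    l1d_size: str = "32kB",
    l1i_size: str = "32kB",
    l1d_assoc: int = 2,
    l1i_assoc: int = 2,
    cacheline_size: int = 64,
    l2_size: str = "",  # empty string means "do not enable the L2 cache"
    l2_assoc: int = 8,
) -> List[str]:
    """Return an argv list for one gem5 run using the options listed above."""
    cmd = [
        gem5_binary,
        se_script,
        f"--cpu-type={cpu_type}",
        "--caches",  # without this flag the L1 settings are not applied
        f"--l1d_size={l1d_size}",
        f"--l1i_size={l1i_size}",
        f"--l1d_assoc={l1d_assoc}",
        f"--l1i_assoc={l1i_assoc}",
        f"--cacheline_size={cacheline_size}",
    ]
    if l2_size:
        # The L2 options only matter when --l2cache is also given.
        cmd += ["--l2cache", f"--l2_size={l2_size}", f"--l2_assoc={l2_assoc}"]
    # Assumption: the workload binary is passed to se.py with -c.
    cmd += ["-c", workload]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; replace them with the ones from your environment.
    argv = build_se_command(
        "build/X86/gem5.opt", "configs/example/se.py", "./fm_index", l1d_assoc=4
    )
    print(" ".join(argv))
```

Returning a list rather than a single string makes it easy to pass the result to `subprocess.run` and to change one option at a time when sweeping configurations.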
Please see the video on E3 about how to run FM-Index on a virtual machine.
1. (3%) Observe performance trends by adjusting the configuration, with L1
caches only.
⚫ (0.6%) (a) Set --l1d_assoc to 2, 4, 8, and 16, record the simulation
time for each, and fill in the table below.
Associativity    Simulation Tick
2-way
4-way
8-way
16-way
⚫ (0.6%) (b) Continuing from (a), explain why, in this experiment,
higher associativity gives worse or better performance. Hint: consider
the advantages and disadvantages of 2-way and 16-way associativity.
⚫ (0.6%) (c) Set --mem-ranks to 2, 4, and 8, record the simulation
time for each, and fill in the table below.
Number of Ranks    Simulation Tick
1
2
4
8
⚫ (0.6%) (d) Continuing from (c), explain why, in this experiment,
a higher number of ranks gives worse or better performance.
Hint: what are the advantages and disadvantages of more ranks?
⚫ (0.6%) (e) Set --cacheline_size to 32 and 64, and explain why, in this
experiment, a larger cache line size gives worse or better
performance.
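The simulation ticks requested in the tables above are normally read from the statistics file that gem5 writes after each run. The sketch below shows one way to extract that number in Python. It assumes the statistics are in a text file (commonly `m5out/stats.txt`) containing a line that starts with `sim_ticks`, or `simTicks` in newer gem5 releases; the helper name `read_sim_ticks` is hypothetical, and the stat name should be verified against your own output.

```python
import re
from typing import Optional

# Assumption: the tick count appears on a line such as
#   sim_ticks    1234567    # Number of ticks simulated
# Newer gem5 releases may name the stat simTicks, so both are accepted.
_TICKS_RE = re.compile(r"^(?:sim_ticks|simTicks)\s+(\d+)", re.MULTILINE)


def read_sim_ticks(stats_text: str) -> Optional[int]:
    """Return the simulated tick count found in stats_text, or None."""
    match = _TICKS_RE.search(stats_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Hypothetical location; gem5 writes to the directory given by --outdir.
    with open("m5out/stats.txt", encoding="utf-8") as stats_file:
        print(read_sim_ticks(stats_file.read()))
```

Running each configuration with a separate output directory keeps the statistics files from overwriting one another, so the values for every row of a table can be collected afterwards.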
2. (3%) The total size of L1D + L1I + L2 must be less than 200KB. Try to
achieve the shortest run time. You may change the FM-Index algorithm, but the
answer must be correct, meaning that it must pass the checker. Performance
will be ranked: the team with the shortest running time gets 3%, and the team
with the slowest running time gets 0%.
Your report should include your simulation time, your configuration (command),
and how to run your program. You must also submit your FM-Index code
(student_ID.cpp).
Note that you may only change the command line options provided by
se.py. Other changes are not allowed!
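Before launching a long simulation, it can help to confirm that a candidate configuration respects the size limit. The following Python sketch sums the three cache sizes and compares the total with the budget. It assumes that 200KB means 200 × 1024 bytes, that "less than" is strict, and that sizes are written in the `32kB` style used for the se.py options; the helper names are hypothetical.

```python
import re

# Assumption: kB/KB/KiB all mean 1024 bytes and MB/MiB mean 1024 * 1024 bytes.
_UNITS = {"B": 1, "kB": 1024, "KB": 1024, "KiB": 1024,
          "MB": 1024 ** 2, "MiB": 1024 ** 2}


def size_to_bytes(size: str) -> int:
    """Convert a size string such as '32kB' to a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]+)", size.strip())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def within_budget(l1d_size: str, l1i_size: str, l2_size: str,
                  limit_bytes: int = 200 * 1024) -> bool:
    """Return True if L1D + L1I + L2 is strictly below the limit."""
    total = sum(size_to_bytes(s) for s in (l1d_size, l1i_size, l2_size))
    return total < limit_bytes


if __name__ == "__main__":
    # 32 + 32 + 128 = 192 KB, which is below the limit.
    print(within_budget("32kB", "32kB", "128kB"))
    # 64 + 8 + 128 = 200 KB, which is not strictly less than the limit.
    print(within_budget("64kB", "8kB", "128kB"))
```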
3. PIM on Gem5 (9%)
Processing In Memory (PIM) is used to reduce the delay and power
consumption of data movement. Please use the PIM API in FM-Index to get the
shortest running time. The TA will provide some PIM API functions to help you
accelerate your program.
For detailed information on this part, please see 2020 CA FP User Guide.pdf.
1. (6%: 3% for correctness, 3% for ranking) Use the PIM API in
FM-Index to get the shortest simulated run time.
2. (3%) Write a report that answers: (a) (1.5%) Which PIM API functions
did you use? (b) (1.5%) Why do you think these PIM API functions
benefit the program? Please hand in your report in .pdf format.