#/* Author : Rick van der Zwet
# * S-number : 0433373
# * Version : $Id: README.txt 438 2007-12-28 02:53:47Z rick $
# * Copyright : FreeBSD Licence
# * Description : Assignment documentation
# */

= Index =
* Preface
* Methodology
* Design decisions
* Directory structure
* Programs structure
* Definitions
* Configurations
** Memory
** Cache
* Assumptions
* BUS2 Output
* Usage / Running
* Conclusion
* Recommendations


= Preface =
This document describes the results of assignment 3 (internally also
called assignment 4) of Computer Architecture. The main purpose of the
assignment was to determine the number of bank conflicts and the memory
bandwidth for a set of cache and memory configurations. Cache simulation
is done with dinero. Several C programs had to be written to calculate
the results.

= Methodology =
Only the BUS2 records of the dinero output (option -o2) are needed;
everything else is ignored. As perl, awk, grep and similar tools are not
allowed, a small custom C program called grep-bus2 was written to
extract them. The shell script calculate.sh determines the correct cache
options and calls compu with the matching memory configuration.

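The filtering step described above can be sketched as follows. This is a minimal illustration, not the actual grep-bus2 source: the function names is_bus2_line and filter_bus2 and the 256-byte line buffer are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the line starts with the literal "BUS2 " marker, 0 otherwise. */
int is_bus2_line(const char *line)
{
    assert(line != NULL);
    return strncmp(line, "BUS2 ", 5) == 0;
}

/*
 * Copy only the BUS2 records from `in` to `out` and return how many
 * records were copied. Assumes (hypothetically) that no input line is
 * longer than 255 characters; longer lines would be read in pieces.
 */
long filter_bus2(FILE *in, FILE *out)
{
    char line[256];
    long count = 0;

    assert(in != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (is_bus2_line(line)) {
            fputs(line, out);
            count++;
        }
    }
    return count;
}
```

Calling filter_bus2(stdin, stdout) from a small main would give a filter that can sit in a pipe between dinero and compu.
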
= Design decisions =
* As we are 'debugging' a standard program, I assume every call to the
  memory fetches 32 bits (4 bytes)
* The 'no cache' configuration is simulated in dinero with a cache of
  1 word, which might distort the results when the same memory cell is
  accessed several times in a row (highly unlikely, however)
* The shell script allows the benchmark file to be set


= Directory structure =
Makefile             = GNU Make config file
compu.c              = calculates bandwidth and bank conflicts
common.[ch]          = common functions
memory_std.[ch]      = standard memory
memory_bank.[ch]     = bank memory
memory_dram.[ch]     = DRAM memory
calculate.sh         = shell script to generate the results of all combinations
develop.sh           = very simple script to keep running while coding, to
                       ensure continuous feedback
data                 = directory of the traces
data/lisp.002.din    = provided input
data/spic.002.din    = provided input
docs                 = some additional documentation
docs/dineroIII.txt   = man page
src/dineroIII.tar.gz = dineroIII source

= Programs structure =
compu.c contains the logic for choosing the correct memory module. All
memory implementations are defined in memory_<type>.[ch]. To avoid
duplicate code, an 'interface' common.c is defined, which holds the
common functions (mainly output routines).


= Definitions =
The bandwidth of a cache/memory system is the total number of bytes sent
between CPU and cache divided by the number of cycles. We assume here
that memory activity is the bottleneck for the entire system, in other
words: that the cache is continuously busy processing requests.

A bank conflict is defined as follows:

For normal memory, a bank conflict is a request that is sent to a bank
that is still busy handling the previous request (either in the access
phase or in the bus transfer phase). The first memory access incurs no
conflict.

In page-mode DRAM we do not use multiple banks, so the above definition
is not so useful. In that case, interpret the following as a bank
conflict: a request for a page (= column) when the previous request was
not for the same page. The first access here also does not incur a
conflict.

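The two definitions above translate almost directly into code. The sketch below is an illustration only, not the code from compu.c or memory_dram.c: the function names and the choice to pass the page size in bytes are assumptions.

```c
#include <assert.h>

/* Bandwidth in bytes per cycle: bytes transferred divided by cycles used. */
double bandwidth(unsigned long bytes, unsigned long cycles)
{
    assert(cycles > 0);
    return (double)bytes / (double)cycles;
}

/*
 * Count page-mode DRAM 'bank conflicts' in a sequence of byte addresses:
 * a request counts as a conflict when the previous request was for a
 * different page. The first access never counts as a conflict.
 */
unsigned long count_page_conflicts(const unsigned long *addr,
                                   unsigned long n,
                                   unsigned long page_bytes)
{
    unsigned long conflicts = 0;
    unsigned long i;

    assert(page_bytes > 0);
    for (i = 1; i < n; i++) {
        if (addr[i] / page_bytes != addr[i - 1] / page_bytes)
            conflicts++;
    }
    return conflicts;
}
```

With 4-byte words, a page of 64 words corresponds to page_bytes = 256 and a page of 1024 words to page_bytes = 4096.
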
= Configurations =

== Memory ==
1) standard memory with a random access time of 8 clock cycles
2) 4-bank word-interleaved memory with a random access time of 8 clock
   cycles
3) 8-bank word-interleaved memory with a random access time of 8 clock
   cycles
4) page-mode DRAM with a page size of 64 words, a random access time of
   8 clock cycles and a 'next access time' of 3 clock cycles
5) page-mode DRAM with a page size of 1024 words, a random access time
   of 8 clock cycles and a 'next access time' of 2 clock cycles


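As an illustration of configurations 1 to 3, the sketch below models word-interleaved banks in a simplified way. It is not the code from memory_std.c or memory_bank.c. The assumptions are: words are 4 bytes, the bank number is the word address modulo the number of banks, one request is issued per cycle in order, a request to a busy bank waits until that bank is free, and the bus transfer phase is ignored. All names (sim_result, simulate_banks, MAX_BANKS) are hypothetical.

```c
#include <assert.h>

#define WORD_BYTES 4   /* assumed word size in bytes */
#define MAX_BANKS  64  /* arbitrary upper bound for this sketch */

typedef struct {
    unsigned long cycles;     /* cycle in which the last access completes */
    unsigned long conflicts;  /* requests that hit a busy bank */
} sim_result;

/* Simulate n in-order word requests on `banks` interleaved banks. */
sim_result simulate_banks(const unsigned long *addr, unsigned long n,
                          unsigned banks, unsigned access_time)
{
    unsigned long busy_until[MAX_BANKS] = {0};
    unsigned long now = 0;
    unsigned long i;
    sim_result r = {0, 0};

    assert(banks > 0 && banks <= MAX_BANKS);
    for (i = 0; i < n; i++) {
        /* Word-interleaving: consecutive words map to consecutive banks. */
        unsigned b = (unsigned)((addr[i] / WORD_BYTES) % banks);

        if (busy_until[b] > now) {   /* bank still busy: conflict, stall */
            r.conflicts++;
            now = busy_until[b];
        }
        busy_until[b] = now + access_time;
        if (busy_until[b] > r.cycles)
            r.cycles = busy_until[b];
        now++;                       /* at most one request per cycle */
    }
    return r;
}
```

Standard memory is the special case banks = 1. In this model two accesses that are 16 bytes apart collide with 4 banks but not with 8, which is why the number of banks matters for strided access patterns.
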
== Cache ==
a) no cache, a write buffer 1 word deep
b) a 64 KB, unified, direct-mapped, write-through, no-write-allocate
   cache with 4-word blocks and a 1-word write buffer
c) a 64 KB, unified, direct-mapped, write-back, write-allocate cache
   with 4-word blocks and a 1-word write buffer

= Assumptions =
== Given ==
* All addresses in this assignment are word-aligned and all
  data accesses to the cache are 1 word.
* Per clock cycle, 1 request for 1 word can be handled, but requests
  must remain in order.
* The time for submitting the requests does not have to be taken into
  consideration
* If a memory access is requested to the same bank or row of a busy
  memory part, it is assumed to be to a different address
* When accessing memory, the same 'checking' cycle will be used as
  the initial memory call cycle
* DRAM memory will also have a RAS time of 1; all memory types have a
  bus and there is no need to distinguish between them
* Calculating the byte transfer between memory and cache is trivial:
  number of lines * bytes per line
* Both reads and writes are treated the same; no optimizations are made,
  to ensure simplicity
* The code is not built to be optimized, but to be clear instead, which
  results in 'dumb' loops
* With DRAM the worst-case scenario is used, meaning a call to a locked
  memory address or a different page will block for the RAS time


= BUS2 Output =
BUS2 <type> <size> <address> <reference_count> <instruction_count>
* BUS2 are four literal characters that start a bus record
* type is the access type (r for a bus-read, w for a bus-write, p for a
  bus-prefetch, s for snoop activity (output style 3 only))
* size is the transfer size in bytes
* address is a hexadecimal byte address between 0 and ffffffff
  inclusive
* reference_count is the number of demand references since the
  last bus transfer (i.e. cache misses)
* instruction_count is the number of demand instruction fetches
  since the last bus transfer

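A record in the format above can be read with a single sscanf call. The sketch below is an illustration and not taken from the project sources: the struct bus2_record and the function parse_bus2 are hypothetical names, and it assumes that the two counters are printed in decimal.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed BUS2 record; field names follow the format line above. */
typedef struct {
    char type;                        /* r, w, p or s */
    unsigned size;                    /* transfer size in bytes */
    unsigned long address;            /* byte address (hexadecimal in the trace) */
    unsigned long reference_count;    /* assumed to be decimal */
    unsigned long instruction_count;  /* assumed to be decimal */
} bus2_record;

/* Parse one line into *rec. Returns 1 on success, 0 if the line does not match. */
int parse_bus2(const char *line, bus2_record *rec)
{
    assert(line != NULL && rec != NULL);
    return sscanf(line, "BUS2 %c %u %lx %lu %lu",
                  &rec->type, &rec->size, &rec->address,
                  &rec->reference_count, &rec->instruction_count) == 5;
}
```

Summing rec.size over all records gives the number of bytes transferred, and rec.address is the value a memory model needs to decide on the bank or page.
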
= Usage / Running =
# Alter calculate.sh to specify the right dinero binary path
# Build binaries
$ make
# Call calculate.sh with the proper argument for the datafile
$ sh calculate.sh <path-to-datafile>
# Results will be written to stderr and to the files res-xx.txt
# A rerun will simply overwrite all the old data


= Conclusion =
Which type of cache and memory to use really depends on the situation.
A lot of assumptions and generalisations were made. To make the
components work better with each other, good knowledge of the
underlying hardware implementation is needed. It seems that the traces
lisp and spic have been optimized for the use of the class c cache and
'word' type memory. It is also pretty clear that without any cache the
system will not perform at all and will just suffer from the really
slow memory.


= Recommendations =
* As the input is text and only simple calculations are being used,
  allowing an interpreted language like perl, python or similar would be
  pretty handy
* /usr/local/edu/data does not exist; this should be ~csca/edu/data
* The 'large files' are tar gzipped 100k, 300k ;-)
* Please specify which version of dinero to use and where the program
  needs to run
* dineroIII is not available online anymore and has been replaced by
  dineroIV; compiling dineroIII also fails on more modern systems. Quick
  and dirty fix patch: http://rickvanderzwet.nl/svn/data/liacs/ca/opdr3/src/
* The submit instructions do not include an email address to submit to
* Translate the 'Het gaat om' part of the assignment