

## **Review from last lecture**

- Quantify and summarize performance
  - Ratios, Geometric Mean, Multiplicative Standard Deviation
- F&P: Benchmarks age, disks fail, single point of failure danger
- Control via State Machines and Microprogramming
- Pipelining: just overlap tasks; easy if tasks are independent
- Speedup ≤ Pipeline Depth; if ideal CPI is 1, then:
  - $Speedup = \frac{Pipeline \ depth}{1 + Pipeline \ stall \ CPI} \times \frac{Cycle \ Time_{unpipelined}}{Cycle \ Time_{pipelined}}$
- Hazards limit performance on computers:
  - Structural: need more HW resources
  - Data (RAW, WAR, WAW): need forwarding, compiler scheduling
  - Control: delayed branch, prediction
- Exceptions, Interrupts add complexity
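
As a quick numeric check of the speedup formula in this review, here is a minimal sketch in Python. The function name and the 5-stage example values are hypothetical and chosen only for illustration; the formula itself assumes an ideal CPI of 1.

```python
def pipeline_speedup(depth, stall_cpi, cycle_unpipelined=1.0, cycle_pipelined=1.0):
    """Speedup over an unpipelined machine, assuming ideal CPI is 1."""
    return depth / (1.0 + stall_cpi) * (cycle_unpipelined / cycle_pipelined)

# Hypothetical 5-stage pipeline with equal cycle times.
print(pipeline_speedup(5, 0.0))   # no stalls: speedup equals the depth -> 5.0
print(pipeline_speedup(5, 0.25))  # 0.25 stall cycles per instruction -> 4.0
```

Stall cycles from hazards are what keep the real speedup below the pipeline depth.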

# Outline

- Review
- Memory hierarchy
- Locality
- Cache design
- Virtual address spaces
- Page table layout
- TLB design options
- Conclusion











# The Principle of Locality

- The Principle of Locality:
  - Programs access a relatively small portion of the address space at any instant of time.
- Two Different Types of Locality:
  - Temporal Locality (Locality in Time): If an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
  - Spatial Locality (Locality in Space): If an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
- For the last 15 years, HW has relied on locality for speed

Locality is a property of programs that is exploited in machine design.
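
To see spatial locality at work, the sketch below counts misses in a toy direct-mapped cache for two traversal orders of the same array. The cache geometry (64 blocks of 8 words) and the 128 x 128 array are assumptions picked for illustration, not values from the lecture.

```python
def count_misses(addresses, num_blocks=64, block_size=8):
    """Count misses in a direct-mapped cache (sizes in words; hypothetical)."""
    lines = [None] * num_blocks          # block number currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_blocks
        if lines[index] != block:        # miss: fetch the block into its line
            lines[index] = block
            misses += 1
    return misses

ROWS, COLS = 128, 128                    # array stored row-major, one word per element

row_major = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_major = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

print(count_misses(row_major))  # 2048: one miss per 8-word block
print(count_misses(col_major))  # 16384: every access misses
```

Both loops touch the same 16384 words; only the order differs. The row-major order reuses each fetched block for the next seven accesses, while the column-major order strides past it and evicts it before it is reused.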







- Miss rate fallacy: miss rate is to average memory access time as MIPS is to CPU performance (a proxy that can mislead)
- Average memory-access time
  = Hit time + Miss rate × Miss penalty (ns or clocks)
- Miss penalty: time to replace a block from the lower level, including time to replace in CPU
  - access time: time to lower level = f(latency to lower level)
  - transfer time: time to transfer block = f(bandwidth between upper & lower levels)
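
The average memory-access time formula can be tried out with a few lines of Python. This is a minimal sketch: the function names, the simple additive model of miss penalty (access time plus transfer time), and all numbers are assumptions for illustration.

```python
def miss_penalty(access_time, block_bytes, bandwidth_bytes_per_cycle):
    """Miss penalty in cycles: latency to the lower level plus block transfer time."""
    return access_time + block_bytes / bandwidth_bytes_per_cycle

def amat(hit_time, miss_rate, penalty):
    """Average memory-access time = hit time + miss rate x miss penalty."""
    return hit_time + miss_rate * penalty

# Hypothetical lower level: 40-cycle latency, 64-byte blocks, 16 bytes per cycle.
penalty = miss_penalty(access_time=40, block_bytes=64, bandwidth_bytes_per_cycle=16)

amat_a = amat(hit_time=1, miss_rate=0.05, penalty=penalty)  # about 3.2 cycles
amat_b = amat(hit_time=2, miss_rate=0.03, penalty=penalty)  # about 3.32 cycles
print(amat_a, amat_b)
```

Cache B has the lower miss rate yet the higher average access time, which is the miss rate fallacy in miniature: miss rate alone does not determine memory performance.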













# Cache misses

- Compulsory: first access miss, cold start miss.
- Capacity: cache is full.
- Conflict: two blocks are mapped to the same location.
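
One conventional way to tell the three miss types apart in a simulator is sketched below; it goes beyond what the slide states, so treat it as an assumption. A miss is compulsory if the block was never referenced before, capacity if a fully associative LRU cache of the same size would also miss, and conflict otherwise. The function name and the tiny traces are hypothetical.

```python
from collections import OrderedDict

def classify_misses(blocks, num_lines):
    """Classify each miss of a direct-mapped cache as compulsory, capacity, or conflict."""
    direct = [None] * num_lines              # direct-mapped cache under study
    full = OrderedDict()                     # fully associative LRU cache of the same size
    seen = set()                             # blocks referenced at least once
    counts = {"compulsory": 0, "capacity": 0, "conflict": 0}

    for b in blocks:
        # Reference cache: fully associative, LRU replacement.
        full_hit = b in full
        if full_hit:
            full.move_to_end(b)
        else:
            if len(full) == num_lines:
                full.popitem(last=False)     # evict the least recently used block
            full[b] = True

        # Direct-mapped cache: block b can only live in line b % num_lines.
        idx = b % num_lines
        if direct[idx] != b:
            direct[idx] = b
            if b not in seen:
                counts["compulsory"] += 1
            elif not full_hit:
                counts["capacity"] += 1
            else:
                counts["conflict"] += 1
        seen.add(b)
    return counts

print(classify_misses([0, 4, 0, 4], num_lines=4))
# {'compulsory': 2, 'capacity': 0, 'conflict': 2}
print(classify_misses([0, 1, 2, 3, 4, 0], num_lines=4))
# {'compulsory': 5, 'capacity': 1, 'conflict': 0}
```

In the first trace blocks 0 and 4 fight over the same line even though the cache is mostly empty, so the repeat misses are conflicts. In the second trace five distinct blocks do not fit in four lines, so the final miss is a capacity miss.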

# **6 Basic Cache Optimizations**

### **Reducing Miss Rate**

- 1. Larger Block size (compulsory misses)
- 2. Larger Cache size (capacity misses)
- 3. Higher Associativity (conflict misses)

### **Reducing Miss Penalty**

- 4. Multilevel Caches
- 5. Giving Reads Priority over Writes
  - E.g., Read completes before earlier writes in write buffer

### **Reducing Hit Time**

- 6. Avoiding Address Translation during Indexing of the Cache
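
Optimization 6 works when the cache can be indexed using only the page-offset bits of the address, since those bits are the same before and after translation. The sketch below checks that condition; the function names and the cache and page sizes are hypothetical, and all sizes are assumed to be powers of two.

```python
def address_fields(cache_bytes, block_bytes, associativity):
    """Return (block-offset bits, index bits) for a set-associative cache."""
    sets = cache_bytes // (block_bytes * associativity)
    offset_bits = block_bytes.bit_length() - 1   # log2 for powers of two
    index_bits = sets.bit_length() - 1
    return offset_bits, index_bits

def index_fits_in_page_offset(cache_bytes, associativity, page_bytes):
    """True if index + block-offset bits all come from the untranslated page offset."""
    return cache_bytes // associativity <= page_bytes

# Hypothetical 32 KiB cache with 64-byte blocks and 4 KiB pages.
print(address_fields(32 * 1024, 64, 8))               # (6, 6): 12 bits, same as the page offset
print(index_fits_in_page_offset(32 * 1024, 8, 4096))  # True: 8-way, index needs no translation
print(index_fits_in_page_offset(32 * 1024, 1, 4096))  # False: direct-mapped needs translated bits
```

Under these assumptions, raising associativity is one way to grow such a cache without making the index depend on translated address bits.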


































# Summary #3/3: TLB, Virtual Memory

- Page tables map virtual address to physical address
- TLBs are important for fast translation
- TLB misses are significant in processor performance
  - Funny times, as most systems can't access all of 2nd level cache without TLB misses!
- Caches, TLBs, Virtual Memory all understood by examining how they deal with 4 questions:
  - 1) Where can a block be placed?
  - 2) How is a block found?
  - 3) What block is replaced on a miss?
  - 4) How are writes handled?
- Today VM allows many processes to share a single memory without having to swap all processes to disk; <u>today VM protection is more important than memory hierarchy benefits, but computers are insecure</u>
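
The relationship between page tables and TLBs summarized above can be sketched in Python. This is a toy model under stated assumptions: a single-level page table held in a dict, a small fully associative TLB with LRU replacement, and 4 KiB pages. The class, the function, and all numbers are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes; hypothetical page size

class TLB:
    """Small fully associative TLB with LRU replacement in front of a page table."""

    def __init__(self, page_table, entries=4):
        self.page_table = page_table      # virtual page number -> physical frame number
        self.entries = entries
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(vpn)   # mark as most recently used
        else:
            self.misses += 1
            # Page-table lookup; a missing entry raises KeyError, standing in for a page fault.
            frame = self.page_table[vpn]
            if len(self.cache) == self.entries:
                self.cache.popitem(last=False)
            self.cache[vpn] = frame
        return self.cache[vpn] * PAGE_SIZE + offset

def tlb_reach(entries, page_size=PAGE_SIZE):
    """Bytes of memory the TLB can map without missing."""
    return entries * page_size

page_table = {0: 7, 1: 3, 2: 9}
tlb = TLB(page_table)
print(hex(tlb.translate(0x1234)))  # VPN 1 maps to frame 3 -> 0x3234
tlb.translate(0x1238)              # same page again: TLB hit
print(tlb.hits, tlb.misses)        # 1 1
print(tlb_reach(64))               # 262144 bytes (256 KiB)
```

With these assumed numbers, a 64-entry TLB reaches 256 KiB, so any second-level cache larger than that cannot be fully accessed without TLB misses, which is the point made in the summary.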

# Reading

- This lecture: Appendix C, Memory Hierarchy
- Next lecture: Chapter 2, Instruction-Level Parallelism