Memory Hierarchy and Access Time (Sand, Software and Sound)


This page takes a closer look at the Raspberry Pi memory hierarchy. Each level of the memory hierarchy has a capacity and a speed. Capacities are relatively easy to discover by querying the operating system or reading the ARM1176 technical reference manual. Speed, however, is not as easy to find and must often be measured. I use a simple pointer chasing technique to characterize the behavior of each level in the hierarchy (a sketch appears below). The technique also reveals the behavior of memory-related performance counter events at each level. The Raspberry Pi implements five levels in its memory hierarchy. The levels are summarized in the table below. The highest level consists of virtual memory pages that are maintained in secondary storage. Raspbian Wheezy keeps its swap space in the file /var/swap on the SDHC card. That is enough space for 25,600 4KB pages. You are allowed as many pages as will fit into the preallocated swap space.
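To make the pointer chasing idea concrete, here is a minimal sketch in C. It is not the original measurement program; the working set size, stride, and iteration count (SIZE, STRIDE, ITERS) are illustrative assumptions. Each load in the chain depends on the previous one, so the average time per load approximates the access latency of whichever level of the hierarchy the working set fits into.

```c
/*
 * Minimal pointer-chasing sketch (illustration only, not the original
 * measurement program). A chain of pointers is laid out with a fixed
 * stride through a buffer; walking the chain serializes the loads.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SIZE   (4 * 1024 * 1024)   /* working set in bytes (assumed)        */
#define STRIDE 32                  /* step in bytes; 32 = one cache line     */
#define ITERS  (10 * 1000 * 1000)  /* number of dependent loads to time      */

int main(void)
{
    char *buf = malloc(SIZE);
    if (buf == NULL) return 1;

    /* Link each element to the one STRIDE bytes ahead, wrapping at the end. */
    for (size_t i = 0; i < SIZE; i += STRIDE) {
        size_t next = (i + STRIDE) % SIZE;
        *(char **)(buf + i) = buf + next;
    }

    char **p = (char **)buf;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < ITERS; i++)
        p = (char **)*p;   /* dependent load: next address comes from memory */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    /* Print p so the compiler cannot optimize the chase loop away. */
    printf("%.2f ns per load (last p = %p)\n", ns / ITERS, (void *)p);
    free(buf);
    return 0;
}
```

Sweeping SIZE from a few kilobytes up through several megabytes, and STRIDE from one word up to a page, is what exposes the transitions between the L1 data cache, the TLBs, and main memory discussed in the rest of this page.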


The Raspberry Pi has either 256MB (Model A) or 512MB (Model B) of main memory. That is enough space for 65,536 or 131,072 physical pages, respectively, if all of main memory were available for paging. It isn't all available for user-space programs because the Linux kernel needs space for its own code and data. Linux also supports huge pages, but that's a separate topic for now. The vmstat command displays information about virtual memory usage; please refer to the man page for details. vmstat is a good tool for troubleshooting paging-related performance issues since it shows page in and page out statistics (the sketch below reads the same counters directly from /proc/vmstat). The processor in the Raspberry Pi is the Broadcom BCM2835. The BCM2835 does have a unified level 2 (L2) cache; however, the L2 cache is dedicated to the VideoCore GPU, and memory references from the CPU side are routed around it. The BCM2835 has two level 1 (L1) caches: a 16KB instruction cache and a 16KB data cache.
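As a side note on the paging statistics mentioned above, the same swap-in and swap-out activity that vmstat summarizes is exposed by the kernel in /proc/vmstat. The following is only an illustrative sketch that reads the pswpin and pswpout counters from that file:

```c
/* Illustrative reader for the swap counters in /proc/vmstat. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/vmstat", "r");
    if (f == NULL) { perror("/proc/vmstat"); return 1; }

    char name[64];
    unsigned long value;
    /* Each line of /proc/vmstat is "name value"; pick out the swap counters. */
    while (fscanf(f, "%63s %lu", name, &value) == 2) {
        if (strcmp(name, "pswpin") == 0 || strcmp(name, "pswpout") == 0)
            printf("%s = %lu pages\n", name, value);
    }
    fclose(f);
    return 0;
}
```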


Our analysis below concentrates on the data cache. The data cache is 4-way set associative. Each way in an associative set stores a 32-byte cache line. The cache can handle up to four active references to the same set without conflict. If all four ways in a set are valid and a fifth reference is made to the set, then a conflict occurs and one of the four ways is victimized to make room for the new reference. The data cache is virtually indexed and physically tagged. Cache lines and tags are stored separately in the DATARAM and TAGRAM, respectively. Virtual address bits 11:5 index the TAGRAM and DATARAM. Given a 16KB capacity, 32-byte lines, and four ways, there must be 128 sets. Virtual address bits 4:0 are the offset into the cache line. The data MicroTLB translates a virtual address to a physical address and sends the physical address to the L1 data cache.
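As a quick illustration of that address decomposition, assuming the geometry just described (16KB capacity, 32-byte lines, 4 ways, 128 sets), the set index and line offset can be extracted from an address as follows; the example address is arbitrary:

```c
/* Sketch of how the L1 data cache decomposes an address:
 * bits 4:0 select the byte within the line, bits 11:5 select the set. */
#include <stdio.h>
#include <stdint.h>

#define LINE_SIZE 32u   /* bytes per cache line                  */
#define NUM_SETS  128u  /* 16KB / (32 bytes per line * 4 ways)   */

int main(void)
{
    uint32_t addr   = 0x0001ABCDu;                    /* arbitrary example   */
    uint32_t offset = addr & (LINE_SIZE - 1);         /* bits 4:0            */
    uint32_t set    = (addr / LINE_SIZE) % NUM_SETS;  /* bits 11:5           */

    printf("address 0x%08X -> set %u, line offset %u\n", addr, set, offset);
    return 0;
}
```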


The L1 data cache compares the physical address with the tag and determines hit/miss status and the correct way. The load-to-use latency is three (3) cycles for an L1 data cache hit. The BCM2835 implements a two-level translation lookaside buffer (TLB) structure for virtual to physical address translation. There are two MicroTLBs: a 10-entry data MicroTLB and a 10-entry instruction MicroTLB. The MicroTLBs are backed by the main TLB (i.e., the second-level TLB). The MicroTLBs are fully associative. Each MicroTLB translates a virtual address to a physical address in a single cycle when the page mapping information is resident in the MicroTLB (that is, on a hit in the MicroTLB). The main TLB is a unified TLB that handles misses from the instruction and data MicroTLBs. It is a 64-entry, 2-way associative structure. Main TLB misses are handled by a hardware page table walker. A page table walk requires at least one additional memory access to find the page mapping information in main memory.
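To put those sizes in perspective, a back-of-the-envelope calculation (assuming the 4KB pages used elsewhere on this page) shows how little memory the TLBs can map at once:

```c
/* TLB reach for the structure sizes quoted above, assuming 4KB pages. */
#include <stdio.h>

int main(void)
{
    const unsigned page_size = 4 * 1024; /* bytes per small page      */
    const unsigned microtlb  = 10;       /* entries per MicroTLB      */
    const unsigned main_tlb  = 64;       /* entries in the main TLB   */

    printf("MicroTLB reach: %u KB\n", microtlb * page_size / 1024);
    printf("Main TLB reach: %u KB\n", main_tlb * page_size / 1024);
    return 0;
}
```

With only about 40KB of MicroTLB reach and 256KB of main TLB reach, a pointer chase through a buffer of a few megabytes touches far more pages than the TLBs can hold, which is why large working sets quickly incur main TLB misses and hardware page table walks.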