Determining whether an application has poor cache performance – Red Hat Developer Blog: "Modern computer systems include cache memory to hide the higher latency and lower bandwidth of RAM from the processor. The cache has access latencies ranging from a few processor cycles to ten or twenty cycles, rather than the hundreds of cycles needed to access RAM. If the processor must frequently obtain data from RAM rather than the cache, performance will suffer. With Red Hat Enterprise Linux 6 and newer distributions, the system's use of cache can be measured with the perf utility, available from the perf RPM.
perf uses the Performance Monitoring Unit (PMU) hardware in modern processors to collect data on hardware events such as cache accesses and cache misses without undue overhead on the system. The PMU hardware is specific to each processor implementation, and the underlying events may differ between processors. For example, one processor implementation may measure first-level cache events for the cache closest to the processor, while another may measure lower-level cache events for a cache farther from the processor and closer to main memory. The cache configuration may also differ between processor models; one processor in a family may have 2MB of last-level cache and another member of the same family may have 8MB. These differences make direct comparison of event counts between processors difficult."
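As a rough illustration of the measurement described in the quote, the sketch below computes a cache miss ratio from the CSV output of a command such as `perf stat -x, -e cache-references,cache-misses -- ./app` (perf writes these counts to stderr). Here `./app` stands in for whatever program is being measured. The helper names `parse_perf_csv` and `miss_ratio` and the sample numbers are made up for illustration and do not come from the quoted article. The column layout of the CSV varies between perf versions, so the parser only assumes that the count is the first field and that the event name appears somewhere after it.

```python
def parse_perf_csv(text, events):
    """Return {event: count} from `perf stat -x,` style CSV text.

    Assumption: the first field of each line is the count and one of the
    later fields is the event name (possibly with a modifier such as ":u").
    """
    counts = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        # Find the field naming an event we care about, ignoring modifiers.
        name = next((f for f in fields[1:] if f.split(":")[0] in events), None)
        if name is None:
            continue
        try:
            counts[name.split(":")[0]] = int(fields[0])
        except ValueError:
            # perf prints "<not counted>" or "<not supported>" instead of a number.
            continue
    return counts


def miss_ratio(counts):
    """Fraction of cache references that missed, or None if unavailable."""
    refs = counts.get("cache-references", 0)
    if refs == 0:
        return None
    return counts.get("cache-misses", 0) / refs


# Hypothetical sample output; the numbers are invented for this example.
SAMPLE = """\
1000000,,cache-references,2000000,100.00,,
250000,,cache-misses,2000000,100.00,25.000,of all cache refs
"""

counts = parse_perf_csv(SAMPLE, {"cache-references", "cache-misses"})
print(miss_ratio(counts))  # 0.25
```

Because the events behind `cache-references` and `cache-misses` map to different hardware counters on different processors, a ratio like this is best used to compare runs on the same machine rather than across processor models.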