Memory Part 2: CPU Caches
It should have been noted in the text that much of the description of multi-cache interaction is specific to x86 and similarly "sequentially-consistent" architectures. Most modern architectures are not sequentially consistent, and threaded programs must be extremely careful about one thread depending on data written by another thread becoming visible in the order in which it was written. This applies to Alpha, PPC, Itanium, and (sometimes) SPARC, but not x86, AMD, or MIPS. The consequence of the requirement to maintain sequential consistency is poor performance and/or horrifyingly complex cache interaction machinery on machines with more than (about) four CPUs, so we can expect to see more non-x86 multi-core chips in use soon.

I think your criticism is misdirected. The text doesn't touch on memory consistency at all - it is entirely out of its scope. Besides, you need a cache coherency protocol on any multiprocessor system. On the subject of memory consistency, there are different opinions.
Some time ago there was a very interesting discussion on RealWorldTech where Linus Torvalds made the point that explicit memory barriers can be argued to be more expensive than what the CPU has to do in order to create the illusion of sequential memory consistency, because explicit MBs are by necessity more general and actually have stronger guarantees.

Sorry, not true. It describes how caches of different x86 CPUs interact, but doesn't say it only describes x86, falsely suggesting that is how every other machine does it too. It leaves the reasonable reader under the impression that programmers don't need to know anything about memory consistency. That's not entirely true even on x86, but is just false on most non-x86 platforms. If Ulrich is writing for people programming only x86, the article should say so without quibbling. If not, it should call out places where it is describing x86-specific behavior.

To the best of my knowledge, the description in the article applies to all cache coherent systems, including the ones listed in your previous post.
It has nothing to do with memory consistency, which is an issue mostly internal to the CPU. I am very possibly wrong, of course - I'm not a hardware system designer - so I am glad to discuss it. Can you describe how the cache/memory behavior in an Alpha (for example; or any other weak consistency system) differs from the article?

I agree that coding with memory barriers (etc.!) is a big subject, and beyond the scope of this installment. It would have sufficed, though, to mention that (and where) it is a matter for concern, and why.

x86 and x86-64 actually aren't sequentially-consistent, because this would result in a huge performance hit. They implement "processor consistency", which means loads can pass stores but no other reordering is allowed (aside from some special instructions). Or to put it another way, loads have an acquire barrier and stores have a release barrier.
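To make the acquire/release wording concrete, here is a minimal sketch of the "one thread depends on data written by another" pattern, written with C++11 atomics rather than raw barrier instructions. The names `payload`, `ready`, `producer`, `consumer`, and `run_message_passing` are hypothetical and chosen only for illustration; nothing here comes from the article itself. It assumes a C++11 (or later) compiler and a threads library at link time (for example `-pthread` with GCC or Clang).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                  // ordinary, non-atomic data
std::atomic<bool> ready{false};   // flag used to publish the data

void producer() {
    payload = 42;                                  // 1. write the data
    // 2. Release store: the write to payload above may not be
    //    reordered after this store.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire load: once this observes 'true', every write the producer
    // made before its release store is visible to this thread.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);

    int seen = -1;
    std::thread writer(producer);
    std::thread reader([&seen] { seen = consumer(); });
    writer.join();
    reader.join();

    assert(seen == 42);  // guaranteed by the release/acquire pairing
    return seen;
}
```

Calling `run_message_passing()` from `main` should return 42 on any conforming implementation. On x86 the release store and acquire load typically compile to plain moves, which matches the comment above; on weakly ordered machines the compiler has to emit barrier or special load/store instructions to provide the same guarantee. If both operations were changed to `std::memory_order_relaxed`, the language would no longer promise that the consumer sees 42.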
Implementations can issue loads to the bus out of order, but will invalidate early loads if necessary to achieve the same effect as if all loads were done in order. Explicit memory barrier instructions may be necessary or useful even on x86 and x86-64. But ideally programmers will use portable locking or lockless abstractions instead.

The idea of disabling hyperthreading (SMT) in the BIOS as a way to reduce cache misses and possibly increase performance is interesting (and pertinent to me as I run a system with such a CPU and motherboard). After all, my CPU seems to utilize this feature about 10% of the time, and even then it's usually (99.99% of the time) with two distinct, non-threaded applications. It does seem logical that, if the hyperthreaded CPU shows as two CPUs to the OS (I get two penguins at boot time plus cat /proc/cpuinfo shows two processors), but each virtual CPU is sharing the same 512K of L2 cache, then maybe my PC is sucking rocks in performance due to the cache miss rate alone.