MorphOS Developer
Posts: 619 from 2005/8/27
From: the land with ...

I'm really getting tired of repeating my answers to you; please take some time to process what you read before you reply...
Quote:
The truth is that the test will measure both the
memory bandwith, the 1st level cache , and 2nd level cache bandwidth.
I didn't bother looking at the code, but neither the graphs on your page nor your previously pasted figures show any such thing.
Quote:
The tests will work on different sized array from small ones to big ones. As finally copying blocks of 80MB each to another 80 MB region.
This is of course 100% memory bandwidth what gets measured in that case, as the Pegasos can never fit 80 MB in its 500KB cache.
I'm not talking about the speed of the cache when I say "cache efficiency"; I'm talking about how efficient your code (with the cache-modes of the areas you are manipulating) is with regard to the CPU's ability to keep the cache saturated.
Quote:
The G4 Pegasos under Linux using 32bit copy loop gets 700MB/sec
Best possible copy loop on MOS (MorphOS copymem)
will score around 400 MB/sec.
You are getting these figures by way of cache-prefetching though, hence my "cache efficiency" claims; this has nothing to do with "memory throughput". When you are cache-prefetching, the CPU will do the transfer in the background with the widest possible operation (in this case 64bit), so you are not really benchmarking 32bit ops to memory, but rather cache access (which has next-to-no impact) plus the time the CPU takes to prefetch.
If you want to generate some more realistic figures, try loading/storing random bits and pieces in memory with 32bit ops, that should throw the prefetching out of whack.
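To illustrate what I mean, here is a rough sketch in C of the two access patterns. It is not the code from your benchmark; the names (copy_sequential, copy_random, shuffle_indices, mb_per_sec) and the xorshift32 generator are my own choices for illustration, and MB is assumed to mean 10^6 bytes. Note that the random variant also reads an index array sequentially, so treat its numbers as a rough lower bound rather than an exact measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sequential 32-bit copy: the access pattern prefetching handles well. */
static void copy_sequential(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Small xorshift32 PRNG (illustrative choice, not from the thread). */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill idx with a pseudo-random permutation of 0..n-1 (Fisher-Yates).
 * The modulo introduces a slight bias, which does not matter here. */
static void shuffle_indices(uint32_t *idx, size_t n, uint32_t seed)
{
    uint32_t state = seed ? seed : 0x9E3779B9u; /* xorshift must not start at 0 */

    assert(n <= UINT32_MAX); /* indices are stored as 32-bit values */
    for (size_t i = 0; i < n; i++)
        idx[i] = (uint32_t)i;
    for (size_t i = n; i > 1; i--) {
        size_t j = xorshift32(&state) % i;
        uint32_t tmp = idx[i - 1];
        idx[i - 1] = idx[j];
        idx[j] = tmp;
    }
}

/* Same amount of 32-bit loads/stores, but visited in shuffled order,
 * so the next address is not predictable from the previous one. */
static void copy_random(uint32_t *dst, const uint32_t *src,
                        const uint32_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[idx[i]] = src[idx[i]];
}

/* Convert a byte count and elapsed seconds to MB/sec (MB = 10^6 bytes). */
static double mb_per_sec(size_t bytes, double seconds)
{
    return ((double)bytes / 1e6) / seconds;
}
```

Time both copy functions (for example with clock() from <time.h>) over buffers far larger than the cache, such as the 80MB blocks mentioned earlier, and compare the two MB/sec figures. If the sequential number is much higher than the random one, most of what you measured was the prefetching, not the 32bit ops.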
Quote:
Question: Is 400 less than 700 ? 
Sure, but it's not half (and you seem to repeatedly round MorphOS figures down and Linux figures up).
Quote:
If its not true that MOS can not get over 400 MB/s (coldcache) ? 
Not much more than that (on cached areas) at least; yes, it's true.
Quote:
If MOS can not get in the range of 700 MB then MOS is not utilizing the possible throughput of a G4 !
(700 MB on Linux with 32bit access!)
Again, you're neither measuring "memory throughput" nor 32bit access here...
Quote:
So what is true and who is lying? 
I don't think anyone is lying (you really are obsessed with this). I'm simply claiming you are mixing up your facts, be it due to incompetence or unwillingness; you choose...
Quote:
What is the best throughput you can achieve on MOS when copying lets say a block of 100MB ?
On MOS you can't even get close to Linux !
When you're talking about "cache efficiency" during a large copy, you are right; the same goes for 64bit access to cached areas (same thing really). However, as far as "memory throughput" with 32bit access goes, you are wrong.
Quote:
CISC, either acknowledge that on MOS you can not achieve the same memory bandwith and memory copying speed as on Linux or come back with real results and show us how to achive 700 MB/sec when copying for example a block of 100MB. 
I've never disputed that memory-block copies on cached areas are faster in Linux; in fact, I believe I mentioned this specifically as the downside of MorphOS' cache-mode, so you will never achieve those kinds of speeds on cached areas.
- CISC