I’ve been seeing a lot of talk about CachyOS recently. Has anyone here tried it? It seems interesting and I might give it a go (currently on EndeavourOS) on a spare drive in my PC.
[citation needed]
Those would show up in any benchmark that is sensitive to I/O latency.
Also, again, [citation needed] that march optimisations measurably lower I/O latency for compressed I/O. For that to happen it is a necessary condition that compression is a significant component in I/O latency to begin with. If 99% of the time was spent waiting for the device to write the data, optimising the 1% of time spent on compression by even as much as 20% would not gain you anything of significance. This is obviously an exaggerated example but, given how absolutely dog slow most I/O devices are compared to how fast CPUs are these days, not entirely unrealistic.
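To put numbers on that exaggerated example, here is a rough sketch using Amdahl's law. The 1% compression share and the 20% figure come from the example above; reading "20%" as "the compression step takes 20% less time" is my assumption, and `overall_speedup` is just an illustrative helper name.

```python
def overall_speedup(fraction_optimised: float, local_speedup: float) -> float:
    """Amdahl's law: total speedup when only part of the work gets faster."""
    return 1.0 / ((1.0 - fraction_optimised) + fraction_optimised / local_speedup)

# Assumed numbers from the example above: compression is 1% of I/O latency
# and the optimised build spends 20% less time in it (0.8x the time).
compression_share = 0.01
local = 1.0 / 0.8  # 20% less time is a 1.25x local speedup

total = overall_speedup(compression_share, local)
latency_saved_pct = (1.0 - 1.0 / total) * 100.0
print(f"overall speedup: {total:.4f}x, latency saved: {latency_saved_pct:.2f}%")
```

Under those assumptions the total I/O latency drops by about 0.2%, which is well inside the noise of most benchmarks.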
Generally, the effect of such esoteric “optimisations” is so small that the length of your unix username has a greater effect on real-world performance. I wish I was kidding.
You have to account for a lot of variables and measurement biases if you want to make factual claims about them. You can observe performance differences on the order of 5-10% just due to slight memory layout changes with different compile flags, without any actual performance improvement due to the change in code generation.
That’s not my opinion, that’s a rather well-established fact. Read here:
So far, I have yet to see data that shows a significant performance increase from march optimisations which either controlled for the measurement bias or showed an effect that couldn’t be explained by measurement bias alone.
There might be an improvement and my personal hypothesis is that there is at least a small one but, so far, we don’t actually know.
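If you want to get a feel for the measurement bias described above on your own machine, something like the following sketch could be a starting point. It runs the same command with differently sized environments and compares median wall-clock times. `PAD_BYTES` is a made-up variable name, the workload is a placeholder, and wall-clock timing of a whole process is noisy, so treat this as an illustration of the idea rather than a rigorous benchmark.

```python
import os
import statistics
import subprocess
import sys
import time

def time_with_env_padding(cmd, pad_sizes, repeats=5):
    """Return {pad_size: median wall-clock seconds} for cmd.

    PAD_BYTES is a made-up variable; it only changes the size of the
    environment block handed to the child process.
    """
    results = {}
    for pad in pad_sizes:
        env = dict(os.environ, PAD_BYTES="x" * pad)
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            subprocess.run(cmd, env=env, check=True,
                           stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            samples.append(time.perf_counter() - start)
        results[pad] = statistics.median(samples)
    return results

if __name__ == "__main__":
    # Hypothetical workload: replace with the binary you actually benchmark.
    workload = [sys.executable, "-c", "sum(i * i for i in range(200_000))"]
    timings = time_with_env_padding(workload, pad_sizes=[0, 64, 512, 4096])
    fastest = min(timings.values())
    spread = (max(timings.values()) - fastest) / fastest
    for pad, seconds in timings.items():
        print(f"PAD_BYTES={pad:5d}  median={seconds * 1000:.2f} ms")
    print(f"spread from environment size alone: {spread * 100:.1f}%")
```

If the spread you get from padding alone is in the same range as the gain you measured from a compile flag, the flag's effect has not been shown.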
The more realistic case is that an execution that would have taken 4 CPU cycles on average would then take 3.9 CPU cycles.
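As a percentage, using the illustrative 4 and 3.9 cycle figures from above (they are example numbers, not measurements):

```python
baseline_cycles = 4.0   # average cycles per execution before, from the example
optimised_cycles = 3.9  # average cycles per execution after

reduction = (baseline_cycles - optimised_cycles) / baseline_cycles
print(f"cycles saved per execution: {reduction:.1%}")
```

That works out to 2.5% fewer cycles for that piece of code, before accounting for everything else the CPU is doing.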
I don’t have data on how power scales with varying cycles/task at a constant task/time but I doubt it’s linear, especially with all the complexities surrounding speculative execution.
“visible” in what way? March optimisations are hardly visible in controlled synthetic tests…
These features cater towards specialised workloads, not general purpose computing.
Applications which facilitate such specialised workloads and are performance-critical usually have hand-made assembly for the critical paths where these specialised instructions can make a difference. Generic compiler optimisations will do precisely nothing to improve performance in any way in that case.
I’d worry more about your applications not making any use of all the cores you’ve paid good money for. Spoiler alert: Compiler optimisations don’t help with that problem one bit.