Please send an email to lwn@lwn.net to report this issue to them; they usually fix things quickly.
Sounds interesting! As I don’t know restic, which this is apparently based on, what are the differentiating factors between them? While I’m always on board for a rewrite in Rust in general, I’m curious whether there is anything more to it than that.
EDIT: seems this is already answered in the FAQ, my bad.
I have read it; it is a very good book, and the memory ordering and atomics sections also apply to C and C++, since all of these languages use the same memory ordering model.
I can strongly recommend it if you want to do any low-level concurrency (which I do in my C++ day job). I have recommended it to my colleagues too whenever they had occasion to look at such code.
I do wish there were a bit more on the more obscure and advanced patterns though. Things like RCU, seqlocks etc. basically only get an honorable mention in chapter 10.
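To make the point about the shared model concrete, the classic release/acquire message-passing pattern looks essentially the same in Rust as it does with C/C++ atomics (a minimal sketch of my own, not an excerpt from the book):

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::thread;

static DATA: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let t = thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);
        // The Release store "publishes" DATA: it synchronises-with the
        // Acquire load below, exactly like release/acquire in C and C++.
        READY.store(true, Ordering::Release);
    });
    while !READY.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    assert_eq!(DATA.load(Ordering::Relaxed), 42);
    t.join().unwrap();
}
```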
Yes, Sweden really screwed up its first attempt at switching to the Gregorian calendar. But there were also multiple countries that switched back and forth a couple of times. Or Switzerland, where each administrative region switched separately.
But I think we in Sweden still “win” for the worst screw-up. Also, there is no good way to handle these dates without a specific reference to the precise location and which calendar they refer to (timestamps will be ambiguous where a region switched back to the Julian calendar).
I would go with the Arch-specific https://aur.archlinux.org/packages/aconfmgr-git instead of Ansible, since it can save the current system state as well. I use it and love it. See another reply on this post for a slightly deeper discussion of it.
I can second this, I use aconfmgr and love it. Especially useful to manage multiple computers (desktop, laptop, old computer doing other things etc).
Though I’m currently planning to rewrite it, since it doesn’t seem maintained any more and I want a multi-distro solution (because I also want to use it on my Pis, where I run Raspbian). The rewrite will be in Rust, and I’m currently deciding on which configuration language to use. I’m leaning towards rhai (because it seems easy to integrate from the Rust side, and I don’t get too angry at the language when reading its docs). Oh, and one component of it is already written and published: https://github.com/VorpalBlade/paketkoll is a fast Rust replacement for paccheck (which is used internally by aconfmgr to find files that differ).
I went ahead and implemented support for filtering packages (just made a new release: v0.1.3).
It is of course still faster. Here are two examples that show a small package (where it doesn’t really matter that much) and a huge package (where it makes a massive difference). Excuse the strange paths; this is straight from the development tree.
Let’s check on pacman itself, and let’s include config files too (not sure if pacman even has that option?). Including config files or not doesn’t make a measurable difference though:
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll --config-files=include pacman" "pacman -Qkk pacman"
Benchmark 1: ./target/release/paketkoll --config-files=include pacman
Time (mean ± σ): 14.0 ms ± 0.2 ms [User: 21.1 ms, System: 19.0 ms]
Range (min … max): 13.4 ms … 14.5 ms 216 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: pacman -Qkk pacman
Time (mean ± σ): 20.2 ms ± 0.2 ms [User: 11.2 ms, System: 8.8 ms]
Range (min … max): 19.9 ms … 21.1 ms 147 runs
Summary
./target/release/paketkoll --config-files=include pacman ran
1.44 ± 0.02 times faster than pacman -Qkk pacman
Let’s check on davinci-resolve as well, which is massive (5.89 GB):
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll --config-files=include pacman davinci-resolve" "pacman -Qkk pacman davinci-resolve"
Benchmark 1: ./target/release/paketkoll --config-files=include pacman davinci-resolve
Time (mean ± σ): 770.8 ms ± 4.3 ms [User: 2891.2 ms, System: 641.5 ms]
Range (min … max): 765.8 ms … 778.7 ms 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: pacman -Qkk pacman davinci-resolve
Time (mean ± σ): 10.589 s ± 0.018 s [User: 9.371 s, System: 1.207 s]
Range (min … max): 10.550 s … 10.620 s 10 runs
Warning: Ignoring non-zero exit code.
Summary
./target/release/paketkoll --config-files=include pacman davinci-resolve ran
13.74 ± 0.08 times faster than pacman -Qkk pacman davinci-resolve
What about some midsized packages (vtk 359 MB, linux 131 MB)?
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll vtk" "pacman -Qkk vtk"
Benchmark 1: ./target/release/paketkoll vtk
Time (mean ± σ): 46.4 ms ± 0.6 ms [User: 204.9 ms, System: 93.4 ms]
Range (min … max): 45.7 ms … 48.8 ms 65 runs
Benchmark 2: pacman -Qkk vtk
Time (mean ± σ): 702.7 ms ± 4.4 ms [User: 590.0 ms, System: 109.9 ms]
Range (min … max): 698.6 ms … 710.6 ms 10 runs
Summary
./target/release/paketkoll vtk ran
15.15 ± 0.23 times faster than pacman -Qkk vtk
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll linux" "pacman -Qkk linux"
Benchmark 1: ./target/release/paketkoll linux
Time (mean ± σ): 34.9 ms ± 0.3 ms [User: 95.0 ms, System: 78.2 ms]
Range (min … max): 34.2 ms … 36.4 ms 84 runs
Benchmark 2: pacman -Qkk linux
Time (mean ± σ): 313.9 ms ± 0.4 ms [User: 233.6 ms, System: 79.8 ms]
Range (min … max): 313.4 ms … 314.5 ms 10 runs
Summary
./target/release/paketkoll linux ran
9.00 ± 0.09 times faster than pacman -Qkk linux
For small packages, where neither tool performs much work, the majority of the time is spent on fixed overheads that both tools have (loading the binary, setting up glibc internals, parsing the command line arguments, etc.). For medium sizes paketkoll pulls ahead quite rapidly, and for large sizes pacman is painfully slow.
Just for laughs I decided to check an empty meta-package (base, 0 bytes). Here pacman actually beats paketkoll, slightly. Not a useful scenario, but for full transparency I should include it:
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll base" "pacman -Qkk base"
Benchmark 1: ./target/release/paketkoll base
Time (mean ± σ): 13.3 ms ± 0.2 ms [User: 15.3 ms, System: 18.8 ms]
Range (min … max): 12.8 ms … 14.1 ms 218 runs
Benchmark 2: pacman -Qkk base
Time (mean ± σ): 8.8 ms ± 0.2 ms [User: 2.8 ms, System: 5.8 ms]
Range (min … max): 8.4 ms … 10.0 ms 327 runs
Summary
pacman -Qkk base ran
1.52 ± 0.05 times faster than ./target/release/paketkoll base
I always start a thread pool regardless of whether I have work to do (and changing that would slow down the case I actually care about). That is the most likely cause of this slightly larger fixed overhead.
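For reference, the eager pool construction I mean looks roughly like this (a sketch assuming rayon; the actual paketkoll code is organised differently):

```rust
use rayon::ThreadPoolBuilder;

fn main() {
    // Building the pool up front costs a few milliseconds even when the
    // query turns out to be tiny, but it keeps the large runs (the case I
    // actually care about) as fast as possible.
    let pool = ThreadPoolBuilder::new()
        .build()
        .expect("failed to build thread pool");
    pool.install(|| {
        // ... the parallel package checking would go here ...
    });
}
```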
My guess is that the relevant keyword for the choice of OpenSSL is FIPS. Rustls doesn’t (or at least didn’t) have that certification, which matters if you are dealing with the US government (directly or indirectly). I believe there is an alternative backend (instead of ring) these days that does have FIPS certification though.
It very much is (as I even acknowledge at the end of the GitHub README). 😀
I have only implemented checking all packages at the current point in time (as that is what I need later on). It would be possible to add support for checking a single package.
Thank you for reminding me of pacman -Qkk though, I had forgotten it existed.
I just did a test of pacman -Qk and pacman -Qkk (with no package given, so checking all of them), and paketkoll is much faster. Based on the man page:
- pacman -Qk only checks that the files exist. I don’t have that option; I always check file properties at least, but I have the option to skip checking the file hash if the mtime and size match (paketkoll --trust-mtime; a sketch of that short-circuit is shown after the benchmark numbers below). Even though I check more in this scenario, I’m still about 4x faster.
- pacman -Qkk checks the checksum as well (similar to plain paketkoll). It is unclear to me if pacman will check the checksum if the mtime and size match.
I can report that paketkoll handily beats pacman in both scenarios (pacman -Qk is slower than paketkoll --trust-mtime, and pacman -Qkk is much slower than plain paketkoll). Below is the output of the hyperfine benchmarking tool:
$ hyperfine -i -N --warmup=1 "paketkoll --trust-mtime" "paketkoll" "pacman -Qk" "pacman -Qkk"
Benchmark 1: paketkoll --trust-mtime
Time (mean ± σ): 246.4 ms ± 7.5 ms [User: 1223.3 ms, System: 1247.7 ms]
Range (min … max): 238.2 ms … 261.7 ms 11 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: paketkoll
Time (mean ± σ): 5.312 s ± 0.387 s [User: 17.321 s, System: 13.461 s]
Range (min … max): 4.907 s … 6.058 s 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 3: pacman -Qk
Time (mean ± σ): 976.7 ms ± 5.0 ms [User: 101.9 ms, System: 873.5 ms]
Range (min … max): 970.3 ms … 984.6 ms 10 runs
Benchmark 4: pacman -Qkk
Time (mean ± σ): 86.467 s ± 0.160 s [User: 53.327 s, System: 16.404 s]
Range (min … max): 86.315 s … 86.819 s 10 runs
Warning: Ignoring non-zero exit code.
It appears that pacman -Qkk is even much slower than paccheck --file-properties --sha256sum. I don’t know how that is possible!
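As mentioned above, --trust-mtime skips hashing when the cheap metadata check already matches. The short-circuit boils down to something like this (a hypothetical sketch, not the actual paketkoll code):

```rust
use std::fs;
use std::time::SystemTime;

/// Decide whether we need to hash the file at all: only fall back to the
/// expensive checksum when the cheap metadata check is inconclusive.
fn needs_hash_check(
    path: &str,
    expected_size: u64,
    expected_mtime: SystemTime,
    trust_mtime: bool,
) -> std::io::Result<bool> {
    let meta = fs::symlink_metadata(path)?;
    if meta.len() != expected_size {
        return Ok(true); // Size differs: definitely modified, no need to hash.
    }
    if trust_mtime && meta.modified()? == expected_mtime {
        return Ok(false); // Size and mtime match: assume unmodified, skip hashing.
    }
    Ok(true) // Fall back to comparing the checksum.
}
```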
The above benchmarks were executed on an AMD Ryzen 5600X with 32 GB RAM and a Gen3 NVMe SSD, with the most recent pacman -Syu executed as of yesterday. The disk cache was hot between runs for all the tools, so only the first run would have been a bit slower (not to a large extent on an SSD, though I imagine it would dominate on a mechanical HDD).
In conclusion:
- paketkoll is 3.96 times faster than pacman when just checking that the files exist.
- paketkoll is 16.3 times faster than pacman when checking file properties. This is impressive on a 6 core/12 thread CPU. pacman must be doing something exceedingly stupid here (might be worth looking into; perhaps it is checking both sha256sum and md5sum, which is totally unneeded). Compared to paccheck I see a 7x speedup in that scenario, which is more in line with what I would expect.
Another aspect is that calling a CLI command is way slower than a library function (in general). This is most apparent for short-running commands, since the overhead is mostly fixed per command invocation rather than scaling with the amount of work or data.
As such I would at the very least keep those commands out of any hot/fast paths.
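A hypothetical illustration of what I mean: each invocation below pays the full process startup cost (fork/exec, dynamic linking, runtime init), no matter how small the query actually is, whereas a library call in the same process would pay it zero times.

```rust
use std::process::Command;

fn main() {
    // Every iteration spawns a fresh pacman process; the fixed per-invocation
    // overhead dominates for small queries like these.
    for pkg in ["pacman", "linux", "vtk"] {
        let status = Command::new("pacman")
            .args(["-Qkk", pkg])
            .status()
            .expect("failed to run pacman");
        println!("{pkg}: {status}");
    }
}
```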
That assembly program the author compares to is way bloated. This guy managed with 105 bytes: https://nathanotterness.com/2021/10/tiny_elf_modernized.html (that is with overlapping part of the code into the ELF header and other shenanigans on that level). ;)
All kidding aside, interesting article.
The example FileDescriptorPollContext doesn’t really work. What if my runtime uses io_uring instead of polling? Those need very different interfaces to be sound. How do you abstract over that?
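To make the objection concrete, here is a hypothetical sketch (the trait names are mine, not from the article) of how different the two contracts are:

```rust
use std::os::fd::RawFd;

// Readiness-based I/O (poll/epoll): the runtime only tells you the fd is
// ready; the caller performs the read itself afterwards, using any buffer.
trait ReadinessPoll {
    fn wait_readable(&mut self, fd: RawFd);
}

// Completion-based I/O (io_uring): the runtime must own the buffer for the
// whole duration of the operation and hands it back once the kernel is done.
// Borrowed buffers are not sound here, which is why one interface cannot
// simply cover both models.
trait CompletionIo {
    /// Submit a read; returns an operation id to match the completion later.
    fn submit_read(&mut self, fd: RawFd, buf: Vec<u8>) -> u64;
    /// Wait for any completion: (operation id, buffer back, bytes read).
    fn wait_completion(&mut self) -> (u64, Vec<u8>, std::io::Result<usize>);
}
```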
Two tips that work for me:
Thanks for the clear and detailed explanation!
Looks cool. This is absolutely not my area of knowledge, let alone expertise, but I thought digital colour handling was all about ICC profiles (which basically describe how a device deviates in its colour handling and how to correct for it).
I don’t see any mention of ICC profiles in the docs though? Or is this the lower-level building block that you would use to work with data from ICC profiles? Basically I think I’m asking: who would use this crate, and for what? Image viewers/editors?
I don’t feel like Rust compile times are that bad, but I’m coming from C++, where the compile times are similar or even worse. (With GCC at work a full debug build takes 40 minutes; with Clang it is down to about 17.)
Rust isn’t an interpreted or bytecode-compiled language, and as such it is hard for it to compete with those on compile times. But that is comparing apples and oranges really. It is better to compare with other languages that compile to machine code. C and C++ come to mind, though there are of course others that I have less experience with (Fortran, Ada, Haskell, Go, Zig, …). Rust is on par with or faster than C++, but much slower than C for sure. Both Rust and C++ have way more features than C, so this is to be expected. And of course it also depends on what you do in your code (template-heavy C++ is much slower to compile than C-like C++; similarly, in Rust it depends on which features you use).
That said: should we still strive to optimise the build times? Yes, of course. But please put the situation into the proper perspective and don’t compare to Python (there was a quote by a Python developer in the article).
It all depends on what part you want to work with. But some understanding of the close-to-hardware aspects of Rust wouldn’t hurt; it comes in handy for debugging and optimising.
But I say that as someone who has a background (and job) in hard real-time C++ (writing control software for industrial vehicles). We recently did our first Rust project as a test at work though! I hope there will be more. But the question then becomes how to teach 200+ devs (over time, gradually presumably). For now it is just three of us who know Rust and are pushing for this, plus a few more who are interested.
I would indeed consider Go a bigger language, because I do think in terms of the size of the runtime.
But your way of defining it also makes sense. Though in those terms I have no idea whether Go is smaller or not (as I don’t know Go).
But Rust is still a small language by this definition, compared to for example C++ (which my day job still involves to a large extent). It is also much smaller than Python (a much smaller standard library to learn). Definitely smaller than Haskell. Smaller than C, I would argue (since there are fewer footguns to keep in mind), though C has a smaller standard library to learn.
What other languages do I know… Erlang: hm, again the standard library is pretty big, so Rust is smaller or of similar size, I would argue. Shell script? Well, arguably all the Unix commands are the standard library, so that would make shell script pretty big.
So yeah, Rust is still a pretty small language compared to all the other languages I know. Unsafe Rust probably isn’t, but I have yet to need to write any (except one line to work around an AsRawFd vs AsFd mismatch between two libraries).
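For the curious, that one-liner has the following general shape (a sketch of the pattern, not the exact line from my code):

```rust
use std::os::fd::{AsRawFd, BorrowedFd};

// Bridge a type that only implements AsRawFd (from one library) to the
// BorrowedFd that another library's API wants.
// SAFETY: the raw fd must stay open for as long as the returned borrow is used,
// which holds here because the borrow is tied to the lifetime of `t`.
fn borrow_fd<T: AsRawFd>(t: &T) -> BorrowedFd<'_> {
    unsafe { BorrowedFd::borrow_raw(t.as_raw_fd()) }
}
```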
The standard library does have some specialisation internally for certain iterator and collection combinations. I’m not sure if it will optimise that one specifically, but Vec::into_iter().collect::<Vec>() is optimised (it may look silly, but it comes up with functions returning impl Iterator).
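For example, something like this is the pattern that the in-place collect specialisation targets (whether the original allocation is actually reused depends on the exact adapters and element sizes involved):

```rust
// A Vec flows through iterator adapters behind `impl Iterator`...
fn doubled(v: Vec<u32>) -> impl Iterator<Item = u32> {
    v.into_iter().map(|x| x * 2)
}

fn main() {
    // ...and is collected back into a Vec by the caller.
    let out: Vec<u32> = doubled(vec![1, 2, 3]).collect();
    assert_eq!(out, [2, 4, 6]);
}
```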