this post was submitted on 31 Oct 2023

Quite a bit of exciting progress since the last progress report! There have been 180 commits since then.

As of today, rustc_codegen_cranelift is available on nightly! :tada: You can run rustup component add rustc-codegen-cranelift-preview --toolchain nightly to install it and then either CARGO_PROFILE_DEV_CODEGEN_BACKEND=cranelift cargo +nightly build to use it for the current invocation or add

top 3 comments
[–] AppleSheeple@programming.dev 6 points 1 year ago* (last edited 1 year ago) (1 children)

Trying cranelift for the first time (I think).

Let's create a "release-dev-cl" profile that inherits the "release-dev" profile and compare.

For reference, "release-dev" is:

inherits = "release"
debug = "full"
codegen-units = 8
lto = "off"

Cool, cold builds (including deps) went from 73s to 37s, with zstd-sys becoming a bigger offender.

But but but...

warning: unsupported x86 llvm intrinsic llvm.x86.aesni.aesimc; replacing with trap
warning: unsupported x86 llvm intrinsic llvm.x86.aesni.aesdec; replacing with trap
warning: unsupported x86 llvm intrinsic llvm.x86.aesni.aesdeclast; replacing with trap

Alright, which dep is using these? Let's cargo vendor and rg.

% cargo vendor &>/dev/null
% cd vendor
% rg -l 'aesimc|aesdec|aesdeclast' | sed 's|/.*||' | sort -u
aes
ring

reported

Alright, let's try another project...

Nice, this one goes from 52s to 19s, and no unsupported intrinsics.

Let's test the binary.

Hmm, it's orders of magnitude slower... let's perf it...

  • LLVM
   1.71%  async-global-ex  abcd-cli              [.] alloc::collections::btree::map::IntoIter<K,V,A>::dying_next
   1.66%  async-global-ex  abcd-cli              [.] <alloc::collections::btree::map::Keys<K,V> as core::iter::traits::iterator::Iterator>::next
   1.53%  async-global-ex  libc.so.6             [.] 0x0000000000158c4a
   1.48%  blocking-4       abcd-cli              [.] <lz4_flex::frame::decompress::FrameDecoder<R> as std::io::Read>::read_to_end
   1.35%  async-global-ex  libc.so.6             [.] 0x000000000015818d
   1.28%  blocking-1       abcd-cli              [.] <lz4_flex::frame::decompress::FrameDecoder<R> as std::io::Read>::read_to_end
   1.25%  blocking-4       abcd-cli              [.] <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
   1.25%  async-global-ex  libc.so.6             [.] malloc
   1.19%  blocking-4       libc.so.6             [.] malloc
   1.12%  blocking-2       abcd-cli              [.] <lz4_flex::frame::decompress::FrameDecoder<R> as std::io::Read>::read_to_end
   1.06%  async-global-ex  libc.so.6             [.] 0x0000000000158487
   0.99%  async-global-ex  abcd-cli              [.] <alloc::collections::btree::map::BTreeMap<K,V,A> as core::ops::drop::Drop>::drop
   0.92%  blocking-2       libc.so.6             [.] malloc
   0.91%  blocking-4       libc.so.6             [.] 0x0000000000158180
   0.85%  blocking-2       [kernel.vmlinux]      [k] clear_page_erms
   0.84%  async-global-ex  abcd-cli              [.] alloc::collections::btree::search::<impl alloc::collections::btree::node::NodeRef<BorrowType,
   0.81%  async-global-ex  abcd-cli              [.] abcd::all::ELSMap::mk_extracted_st
   0.78%  blocking-1       libc.so.6             [.] malloc
   0.75%  blocking-4       abcd-cli              [.] core::str::converts::from_utf8
   0.75%  async-global-ex  [kernel.vmlinux]      [k] clear_page_erms
   0.74%  async-global-ex  abcd-cli              [.] core::ptr::drop_in_place<abcd::foo::FooStream>
   0.74%  async-global-ex  abcd-cli              [.] <alloc::string::String as core::fmt::Write>::write_str
   0.74%  blocking-4       libc.so.6             [.] 0x000000000015818d
   0.74%  async-global-ex  libc.so.6             [.] 0x000000000009a9f8
   0.66%  async-global-ex  abcd-cli              [.] alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle
  • Cranelift
  13.54%  async-global-ex  abcd-cli              [.] alloc::vec::Vec<T,A>::extend_with
   2.34%  async-global-ex  abcd-cli              [.] <usize as core::iter::range::Step>::forward_unchecked
   1.57%  async-global-ex  libc.so.6             [.] 0x000000000015818d
   1.38%  async-global-ex  abcd-cli              [.] core::clone::impls::<impl core::clone::Clone for u8>::clone
   1.34%  blocking-4       abcd-cli              [.] lz4_flex::block::decompress_safe::decompress_internal
   1.18%  async-global-ex  abcd-cli              [.] <core::iter::adapters::enumerate::Enumerate<I> as core::iter::traits::iterator::Iterator>::ne
   1.09%  blocking-4       abcd-cli              [.] lz4_flex::block::decompress_safe::read_u16
   1.00%  blocking-4       libc.so.6             [.] 0x000000000015818d
   0.97%  blocking-3       abcd-cli              [.] lz4_flex::block::decompress_safe::read_u16
   0.94%  async-global-ex  abcd-cli              [.] alloc::collections::btree::search::<impl alloc::collections::btree::node::NodeRef<BorrowType,
   0.86%  blocking-2       abcd-cli              [.] lz4_flex::block::decompress_safe::read_u16
   0.84%  blocking-4       abcd-cli              [.] <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index_mut
   0.77%  blocking-2       abcd-cli              [.] lz4_flex::block::decompress_safe::decompress_internal
   0.72%  async-global-ex  abcd-cli              [.] <core::slice::iter::Iter<T> as core::iter::traits::iterator::Iterator>::next
   0.71%  blocking-3       abcd-cli              [.] <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index_mut
   0.69%  blocking-3       libc.so.6             [.] 0x000000000015818d
   0.68%  blocking-2       libc.so.6             [.] 0x000000000015818d
   0.67%  blocking-4       abcd-cli              [.] <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index
   0.67%  blocking-3       abcd-cli              [.] lz4_flex::block::decompress_safe::decompress_internal
   0.62%  blocking-2       abcd-cli              [.] <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index_mut
   0.60%  blocking-3       abcd-cli              [.] <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index
   0.57%  blocking-3       abcd-cli              [.] speedy::circular_buffer::CircularBuffer::consume_into
   0.56%  async-global-ex  abcd-cli              [.] alloc::collections::btree::search::<impl alloc::collections::btree::node::NodeRef<BorrowType,
   0.56%  blocking-4       abcd-cli              [.] speedy::circular_buffer::CircularBuffer::consume_into
   0.54%  blocking-3       abcd-cli              [.] speedy::reader::Reader::read_u64

Ouch, Vec::extend_with(), usize::forward_unchecked(), and even worse, u8::clone() are slow!
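
For anyone who wants to poke at this in isolation, here's a rough sketch of a micro-benchmark. It assumes the hot path is something like Vec::resize on a Vec<u8>, which is a guess: the actual call site in abcd-cli isn't shown above. As far as I can tell from the standard library source, resize goes through Vec::extend_with, which loops over a range and clones the fill value for each new element, and that would line up with forward_unchecked and u8::clone showing up as separate symbols when nothing gets inlined. The function names fill_with_resize and time_it are made up for illustration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the hot path: grows an empty byte buffer to
/// `len` elements with `Vec::resize`, filling it with `byte`.
fn fill_with_resize(len: usize, byte: u8) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.resize(len, byte);
    buf
}

/// Runs `f` `iters` times and returns the total wall-clock time.
/// `black_box` keeps an optimizing backend from discarding the result.
fn time_it<F: FnMut() -> Vec<u8>>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        let v = f();
        black_box(&v);
    }
    start.elapsed()
}

fn main() {
    // Sizes are arbitrary; pick whatever makes the difference visible.
    const LEN: usize = 1 << 20;
    const ITERS: u32 = 100;

    let elapsed = time_it(ITERS, || fill_with_resize(LEN, 0xAB));
    println!("{ITERS} x resize to {LEN} bytes: {elapsed:?}");
}
```

Build it once with the default backend and once with CARGO_PROFILE_DEV_CODEGEN_BACKEND=cranelift cargo +nightly build, then compare the printed times. If the gap looks like the one in the perf output, the slowdown is probably coming from these tiny helpers not being inlined rather than from anything specific to the project.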

[–] xav@programming.dev 2 points 1 year ago

That's a hell of a comment!

[–] open_world@lemmy.world 3 points 1 year ago

Really exciting news! This should mitigate a longstanding problem that Rust developers have had with slow builds when iterating on their code.