英文:
Java is almost twice as fast as Rust?!? What am I missing?
问题
I ported a Java class that generates combinations to Rust, but Rust runs slower than Java. The Java code produces results in about 6.5 to 7 seconds, while the Rust code takes 12 to 12.5 seconds to run. The provided Rust code appears to be correctly translated and optimized. The issue might be related to differences in the runtime performance of Rust and Java. Without deeper analysis, it's hard to pinpoint the exact cause.
英文:
I ported a Java class that I have that loops over all possible unordered combinations of choosing k elements from n choices to Rust, expecting Rust to help me speed up the calculations. But when running both head-to-head, it turned out that Java was almost two times faster!
Since this does not sound right to me at all and since I'm just getting started in Rust, I must be doing something wrong and was wondering if someone with more Rust experience would help me figure out why my Rust code is so much slower.
Here is my Java code for the generic interface, the implementation, and the test-code:
public interface Choice<Type> {
/**
* Returns the number of possible options that this choice provides.
*
* @return the number of choices
*/
public long getChoices();
/**
* Returns the choice with the given index.
*
* @param index - the index of the choice to return
* @return the choice of the given index
*/
public Type getChoice(long index);
}
public class NChooseK implements Choice<int[]> {
private final int n;
private final int k;
private final long count;
public NChooseK(int n, int k) {
if ((n<=0) || (k<0) || (k>n)) throw new IllegalArgumentException();
this.n = n;
this.k = k;
this.count = choose(n, k);
}
@Override
public long getChoices() {
return count;
}
@Override
public int[] getChoice(long m) {
if (k==0) return new int[0];
long count = this.count;
int[] result = new int[this.k];
int n = this.n;
int k = this.k;
long x = (count-1) - m;
while (true) {
if (n == k) {
while (true) {
result[this.k - k] = this.n - k;
if (k==1) return result;
k--;
}
}
count = count * (n-k) / n;
if (x >= count) {
result[this.k - k] = this.n - n;
if (k==1) return result;
x -= count;
count = count * k / (n-k);
k--;
}
n--;
}
}
private long choose(int n, int k) {
if (n<k) return 0;
if (k>n-k) k=n-k;
long b=1;
for (int i=1, m=n; i<=k; i++, m--)
b = b*m/i;
return b;
}
}
public class Test {
public static void main(String[] args) {
NChooseK nck = new NChooseK(26, 13);
long choices = nck.getChoices();
System.out.println("Running ("+choices+" choices)...");
long start = System.currentTimeMillis();
for (long index = 0; index<choices; index++) {
int[] choice = nck.getChoice(index);
//System.out.println(ArrayTools.toString(choice));
}
long end = System.currentTimeMillis();
System.out.println("Done ("+((end - start)/1000.0)+"s)!");
}
}
And here is what I thought would be the closest translation to Rust:
pub trait Choice<Type> {
/// Returns the number of possibilities for this choice.
fn get_choices(&self) -> u32;
/// Returns the posibility of the given index.
fn get_choice(&self, index: u32) -> Type;
}
use super::choice::Choice;
pub struct NChooseK {
n: u32,
k: u32,
count: u32,
}
impl NChooseK {
pub fn new(n: u32, k: u32) -> Result<NChooseK, &'static str> {
if k > n {
Err("invalid parameters: k cannot be larger than n")
} else {
Ok(NChooseK {
n: n,
k: k,
count: choose(n, k).unwrap() as u32,
})
}
}
}
impl<'a> Choice<Vec<u32>> for NChooseK {
fn get_choices(&self) -> u32 {
self.count
}
fn get_choice(&self, m: u32) -> Vec<u32> {
if self.k == 0 {
return vec![];
}
let mut count = self.count;
let mut result:Vec<u32> = Vec::with_capacity(self.k as usize);
let mut n = self.n;
let mut k = self.k;
let mut x = (count-1) - m;
loop {
if n == k {
loop {
result.push(self.n - k);
if k == 1 {
return result;
}
k -= 1;
}
}
count = count * (n - k) / n;
if x >= count {
result.push(self.n - n);
if k == 1 {
return result;
}
x -= count;
count = count * k / (n - k);
k -= 1;
}
n -= 1;
}
}
}
fn choose(n: u32, mut k: u32) -> Option<u64> {
if k > n-k {
k = n-k;
}
let mut b : u64 = 1;
let mut m = n;
for i in 1..=k {
if b > 0xFFFFFFFFFFFFFFFF / (m as u64) {
return None;
}
b = b * (m as u64) / (i as u64);
m -= 1;
}
Some(b)
}
fn main() {
let nck = NChooseK::new(26, 13).unwrap();
let choices = nck.get_choices();
println!("Running ({} choices)...", choices);
let start = time::precise_time_s();
for index in 0..choices {
let choice = nck.get_choice(index);
//println!("{:?}", choice);
}
let end = time::precise_time_s();
println!("Done ({}s)!", end - start);
}
The Rust code takes around 12 to 12.5s to run through the ~10 million calls to get_choice and the Java code takes 6.5 to 7s! WTF?!?!?
This is using rustc v1.45.2 and OpenJDK v1.8.0_212-3-redhat on Windows 7 64-bit.
Notes:
- I initially had get_choices also return a u64 in Rust, but changed it to a u32 to try to eliminate as much type casting as possible (didn't actually make a difference, though).
- I also tried replacing all u32s with i32s. Also no difference.
- Commenting out the
result.push(...)
statements lowers the run-time to 9 to 9.5s. That's a big difference (bigger than I expected!), but still way slower than Java! - If I uncomment the print statements inside the inner loop and change the parameters to something more reasonable (
NChooseK(7, 3)
, for example), both versions do produce the same exact output values (ArrayTools.toString(...)
is just a simple helper-function that concatenates the components of an int-array with commas into a string; decided to leave it out since there is already so much code here)
答案1
得分: 9
Always use cargo build --release
or cargo run --release
to have rustc
/llvm
optimize your code when you try to squeeze performance:
$ cargo build 2>/dev/null && time cargo -q run 2>/dev/null
Running (10400600 choices)...
Done (9.487796306610107s)!
real 0m9.512s
user 0m9.500s
sys 0m0.000s
$ cargo build --release 2>/dev/null && time cargo -q run --release 2>/dev/null
Running (10400600 choices)...
Done (3.2046568393707275s)!
real 0m3.229s
user 0m3.222s
sys 0m0.008s
英文:
Always use cargo build --release
or cargo run --release
to have rustc
/llvm
optimize your code when you try to squeeze performance:
<!-- language: none -->
$ cargo build 2>/dev/null && time cargo -q run 2>/dev/null
Running (10400600 choices)...
Done (9.487796306610107s)!
real 0m9,512s
user 0m9,500s
sys 0m0,000s
$ cargo build --release 2>/dev/null && time cargo -q run --release 2>/dev/null
Running (10400600 choices)...
Done (3.2046568393707275s)!
real 0m3,229s
user 0m3,222s
sys 0m0,008s
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论