March 7, 2023 20:09:22

Function to change random distribution in Swift?

Question


Does Swift have some built in methods to change the distribution of random numbers? I want to use a linear equation to define the distribution, something like y = k * x + m. When k = 0, all numbers should be equally distributed. When k = 1 the distribution should follow the line, so low x-values would be very rare while high x-values would be common. I played around in Excel and tried different strategies and finally came up with this code, which seems to work - but there must be a neater way to do this in Swift?

As a note: First I did use an array of ClosedRange<Double> instead of the tuples-approach, then used .contains. Then I changed it to an array of tuples because my code didn't work as expected. Probably another bug, but I stayed with tuples as the code works now.

import Foundation

/* function to create an array of tuples with upper and lower
 limits based on a linear distribution (y = k * x + m) */
func createDistributions(numbers: ClosedRange<Int>, k: Double) -> [(Double, Double)] {
    var dist = [(Double, Double)]()
    let m: Double = 0.5
    let nVal: Int = numbers.count
    var pStop: Double = 0.0

    for x in numbers {
        let t = (Double(x) + 0.5) / Double(nVal)
        let y = (k * (t - 0.5) + m) * 2.0 / Double(nVal)
        let start = pStop
        let stop = y + start
        
        dist.append((start, stop))
        pStop = stop
    }
    
    return dist
}

// create distributions based on k-value of 1.0
var result = createDistributions(numbers: 0...34, k: 1.0)


// loop ten times, creating a Double random number each time
for _ in 0...9 {
    let ran = Double.random(in: 0...1)
    
    // check in which indexed array the random number belongs to by checking lower and upper limit
    for i in 0..<result.count {
        
        // the random number belongs to the i:th element, print i
        if ran >= result[i].0 && ran <= result[i].1 {
            print(i)
        }
    }
}

Answer 1

Score: 4

<details>
<summary>English:</summary>

Your `y = kx+m` is a [probability density function](https://en.wikipedia.org/wiki/Probability_density_function) (PDF). A very good way to apply this to random number generation is an [inverse transform sampling function](https://en.wikipedia.org/wiki/Inverse_transform_sampling). I'll walk through how to develop that, step by step, so that you can adapt it to your specific needs. In the general case, this would be done with first-year calculus, but for the linear case it's easy enough to do with basic algebra and some grade-school geometry. For this example, I'll be generating a random value between 0 and 1.

(For other Americans reading along: this is the slope-intercept form we learn as `y = mx+b`. Please don't get confused that `m` here is the intercept, not the slope. I hopefully haven't mixed them up anywhere in this answer.)

To experiment with this answer, see the [GeoGebra worksheet](https://www.geogebra.org/calculator/xpy6mum5) that the images come from.

The TL;DR of all of this is:

    let u = Double.random(in: 0...1)
    if k == 0 {
        return u
    } else {
        return (sqrt(k*k + k*(8*u - 4) + 4) + k - 2)/(2*k)
    }

But learning why that's the answer is the real goal.

A PDF is a function whose area between two x-values is the probability that the value is between those values. This leads to the fact that the PDF must be positive for all values in its range, and the area under it must be exactly 1 (representing a 100% chance of selecting some value over the entire range).

But a quick look at an arbitrary version of this curve shows that it may not have the right area:

[![Graph of y=kx+m with k=0.4 and m=0.4][1]][1]

For a given value of `k` there is a specific value of `m` that is valid. We can work that out by computing the area in terms of `k` and `m`, setting it to 1, and then solving for `m`. The area of the graph is a rectangle with base 1 (the range 0-1 of values we'll select) and height `m`, plus a triangle with base 1 and height `k`. So:

    Area = Rectangle + Triangle = 1
           m + k/2 = 1
           m = 1 - k/2

And substituting back into the PDF, which I'll write as F(x):

    F(x) = kx + 1 - k/2
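
As a quick sanity check on that normalization, here is a minimal sketch that estimates the area under the PDF numerically. The names `linearPDF` and `areaUnderPDF` are mine, not part of any library, and the midpoint rule is just one convenient way to approximate the area.

```swift
import Foundation

/// Linear PDF on 0...1 with slope `k`. The intercept is fixed at 1 - k/2
/// so that the total area is 1. (`linearPDF` is an illustrative name.)
func linearPDF(_ x: Double, k: Double) -> Double {
    let m = 1 - k / 2
    return k * x + m
}

/// Midpoint-rule estimate of the area under the PDF over 0...1.
func areaUnderPDF(k: Double, steps: Int = 1_000) -> Double {
    let width = 1.0 / Double(steps)
    var area = 0.0
    for i in 0..<steps {
        let midpoint = (Double(i) + 0.5) * width
        area += linearPDF(midpoint, k: k) * width
    }
    return area
}

// The area should come out very close to 1 for every slope.
for k in [0.0, 0.4, 1.0, 2.0] {
    print("k = \(k): area ≈ \(areaUnderPDF(k: k))")
}
```

By contrast, the unconstrained example in the first figure (k = 0.4, m = 0.4) has an area of 0.4 + 0.4/2 = 0.6, which is why it is not a valid PDF.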

We are also constrained that `m` cannot be less than 0, which caps `k` at 2; since we want larger values to be more likely, `k` is also at least 0, giving the range [0,2]. When `k` is 0, all values are equally likely. When `k` is 2, the likelihood of a value is directly proportional to the value itself (F(x) = 2x).

[![Constrained PDF with linear relationship][2]][2]

With a valid PDF, it is time to create a [cumulative distribution function](https://en.wikipedia.org/wiki/Cumulative_distribution_function). This is a function that represents the likelihood that a randomly selected value is no greater than the given value. These functions are constrained for the same reasons as PDFs. They must monotonically increase from zero to one over the valid range.

[![CDF from 0 to 0.3 = 0.174][3]][3]

This area can be computed just like the full area, by summing a rectangle and a triangle:

    CDF(x) = Rectangle + Triangle
           = mx + (x/2 * (F(x) - m))
           = ... some algebra later ...
           = (k*x^2)/2 + (1-k/2)*x

[![CDF function that intersects 0,0 and 1,1][4]][4]

Notice that this function correctly passes through (0,0) and (1,1) as it must, and is non-negative and increasing over the entire range. There is no chance of a value less than zero, and there is a 100% probability of a value less than or equal to one.
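
The CDF is easy to check in code. This is a minimal sketch; `linearCDF` is an illustrative name, and the slope of 1.2 is my own choice for the example (it happens to give about 0.174 at x = 0.3, the value in the figure caption, but the figure's actual slope is not stated).

```swift
/// CDF of the linear PDF: k*x^2/2 + (1 - k/2)*x, for x in 0...1.
/// (`linearCDF` is an illustrative name.)
func linearCDF(_ x: Double, k: Double) -> Double {
    return k * x * x / 2 + (1 - k / 2) * x
}

let k = 1.2  // assumed slope, chosen only for illustration

print(linearCDF(0.0, k: k))  // the CDF starts at 0
print(linearCDF(0.3, k: k))  // about 0.174
print(linearCDF(1.0, k: k))  // the CDF ends at 1

// The CDF should never decrease as x grows.
var previous = 0.0
var isMonotonic = true
for i in 1...100 {
    let current = linearCDF(Double(i) / 100, k: k)
    if current < previous { isMonotonic = false }
    previous = current
}
print("monotonic: \(isMonotonic)")
```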


Almost there. An inverse transform sample applies the inverse of the CDF. That's not particularly complicated, but it is a lot of algebra, so let [WolframAlpha](https://www.wolframalpha.com/input?i=solve+y+%3D+%28k*x%5E2%29%2F2+%2B+%281-k%2F2%29*x+for+x) do it:

    solve y = (k*x^2)/2 + (1-k/2)*x for x
    ==>
    x = y and k = 0
    x = -(sqrt(k^2 + 8 k y - 4 k + 4) - k + 2)/(2 k) and k!=0
    x = (sqrt(k^2 + k (8 y - 4) + 4) + k - 2)/(2 k) and k!=0

For k=0, x=y. Elsewhere, there are two solutions. Only positive values make sense here, so ignore the negative one.
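
One way to convince yourself that the positive solution really is the inverse is a round trip: push some x through the CDF, then through the inverse, and see whether x comes back. A minimal sketch, with `linearCDF` and `inverseLinearCDF` as illustrative names:

```swift
import Foundation

/// CDF of the linear PDF: k*x^2/2 + (1 - k/2)*x, for x in 0...1.
func linearCDF(_ x: Double, k: Double) -> Double {
    return k * x * x / 2 + (1 - k / 2) * x
}

/// Inverse of the CDF, using the positive root of the quadratic.
/// For k == 0 the CDF is the identity, so the inverse is too.
func inverseLinearCDF(_ y: Double, k: Double) -> Double {
    if k == 0 { return y }
    return (sqrt(k * k + k * (8 * y - 4) + 4) + k - 2) / (2 * k)
}

let k = 1.5
var worstError = 0.0
for x in [0.0, 0.25, 0.5, 0.75, 1.0] {
    let y = linearCDF(x, k: k)
    let roundTrip = inverseLinearCDF(y, k: k)
    worstError = max(worstError, abs(roundTrip - x))
    print("x = \(x), CDF = \(y), inverse = \(roundTrip)")
}
print("largest round-trip error: \(worstError)")
```

The error should be down at floating-point noise level. For slopes extremely close to zero the formula divides by a tiny `2*k`, so precision may suffer there.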

[![Inverse of CDF for k=1.5][5]][5]

The red line is the function you want (this is for k=1.5). Very long road to get here, but now the code is pretty easy:

    import Foundation

    // `k` ranges from 0 to 2, which is confusing. Map it to range 0...1
    func randomValue(distribution d: Double) -> Double {
        assert((0...1).contains(d))
        let u = Double.random(in: 0...1)

        // k ranges from 0 to 2
        let k = d * 2

        if k == 0 {
            return u
        } else {
            return (sqrt(k*k + k*(8*u - 4) + 4) + k - 2)/(2*k)
        }
    }
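
To tie this back to the question, which picks an integer from a range like `0...34`, the continuous value can be scaled onto the range. This is a sketch under my own assumptions: `randomIndex` is a hypothetical helper, not something from the question or the standard library. If I'm reading the question's `createDistributions` correctly, its `k` plays the same role as `d` here, so each index should come up with the same probability as the matching bucket there.

```swift
import Foundation

func randomValue(distribution d: Double) -> Double {
    assert((0...1).contains(d))
    let u = Double.random(in: 0...1)
    let k = d * 2
    if k == 0 {
        return u
    } else {
        return (sqrt(k*k + k*(8*u - 4) + 4) + k - 2)/(2*k)
    }
}

/// Hypothetical helper: picks an integer from `range`, with larger
/// integers becoming more likely as `d` goes from 0 to 1.
func randomIndex(in range: ClosedRange<Int>, distribution d: Double) -> Int {
    let value = randomValue(distribution: d)
    // Scale 0...1 onto the number of integers in the range. Clamp because
    // `value` can be exactly 1.0 (or off by a rounding error at the ends).
    let scaled = Int(value * Double(range.count))
    let offset = max(0, min(scaled, range.count - 1))
    return range.lowerBound + offset
}

// Count how often each index in 0...34 is drawn.
var counts = [Int](repeating: 0, count: 35)
for _ in 0..<10_000 {
    counts[randomIndex(in: 0...34, distribution: 1)] += 1
}
print(counts)
```

With `distribution: 1` the counts should climb roughly linearly from the first index to the last; with `distribution: 0` they should be roughly flat.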


And just to test it out:

    func testRun(distribution d: Double) {
        print("Distribution for \(d)")
        let n = 10_000

        // How many results begin with a given digit after the decimal point?
        // Bucket numerically rather than slicing the printed string, since
        // very small values print in exponent form (e.g. 5e-05).
        var h: [Int: Int] = [:]
        for _ in 0..<n {
            let value = randomValue(distribution: d)
            let firstDigit = min(Int(value * 10), 9)
            h[firstDigit, default: 0] += 1
        }

        for digit in h.keys.sorted() {
            let ratio = Double(h[digit]!)/Double(n)
            print("\(digit) -> \(ratio.formatted(.percent.precision(.fractionLength(0))))")
        }
    }

    testRun(distribution: 0)
    testRun(distribution: 0.5)
    testRun(distribution: 1)

    ===>
    Distribution for 0.0
    0 -> 10%
    1 -> 10%
    2 -> 10%
    3 -> 11%
    4 -> 10%
    5 -> 10%
    6 -> 10%
    7 -> 10%
    8 -> 10%
    9 -> 10%
    Distribution for 0.5
    0 -> 6%
    1 -> 6%
    2 -> 7%
    3 -> 9%
    4 -> 10%
    5 -> 11%
    6 -> 11%
    7 -> 13%
    8 -> 13%
    9 -> 14%
    Distribution for 1.0
    0 -> 1%
    1 -> 3%
    2 -> 5%
    3 -> 7%
    4 -> 9%
    5 -> 11%
    6 -> 13%
    7 -> 15%
    8 -> 17%
    9 -> 19%


A linear equation can only push this so far. I don't believe you can get a larger difference between low-probability and high-probability values with only a linear PDF (a better mathematician may correct me here; this isn't my speciality). If you want that, I would explore applying this to higher-order polynomials. Rather than `F(x) = kx + m`, you could do the same thing with `F(x) = kx^2 + m` or even higher powers. This will take some first-year calculus, but the overall approach should be similar.
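
As a rough sketch of what that might look like for `F(x) = kx^2 + m`: by the same area argument (now with an integral instead of a triangle), `m` would be `1 - k/3`, `k` would range over 0...3, and the CDF would be `k*x^3/3 + (1 - k/3)*x`. That derivation is mine, so double-check it before relying on it. Inverting a cubic by hand is unpleasant, so this example swaps the closed-form inverse for a numerical one (bisection). All names here are illustrative.

```swift
import Foundation

/// CDF of the quadratic PDF F(x) = k*x^2 + m on 0...1,
/// assuming m = 1 - k/3 so that the total area is 1.
func quadraticCDF(_ x: Double, k: Double) -> Double {
    return k * x * x * x / 3 + (1 - k / 3) * x
}

/// Inverts an increasing CDF on 0...1 by bisection. This is a numerical
/// stand-in for a closed-form inverse, not an exact solution.
func invert(_ cdf: (Double) -> Double, at u: Double, iterations: Int = 60) -> Double {
    var low = 0.0
    var high = 1.0
    for _ in 0..<iterations {
        let mid = (low + high) / 2
        if cdf(mid) < u {
            low = mid
        } else {
            high = mid
        }
    }
    return (low + high) / 2
}

/// Random value in 0...1 drawn from the quadratic PDF.
func randomQuadratic(k: Double) -> Double {
    precondition((0...3).contains(k), "k must be in 0...3 so that m is not negative")
    let u = Double.random(in: 0...1)
    return invert({ quadraticCDF($0, k: k) }, at: u)
}

// With k = 3 the PDF is 3x^2, so small values should be rarer
// than they are with the steepest linear PDF.
let samples = (0..<10_000).map { _ in randomQuadratic(k: 3) }
let belowTenth = samples.filter { $0 < 0.1 }.count
print("share below 0.1: \(Double(belowTenth) / Double(samples.count))")
```

The same `invert` function should work for any CDF that increases from 0 to 1 over the range, at the cost of a few dozen evaluations per sample.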

  [1]: https://i.stack.imgur.com/HEN20.png
  [2]: https://i.stack.imgur.com/syvJH.png
  [3]: https://i.stack.imgur.com/WRHWw.png
  [4]: https://i.stack.imgur.com/dDzsw.png
  [5]: https://i.stack.imgur.com/flgZA.png


</details>
