2 Answers

Contributor with 1813 experience points, 2+ upvotes
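Well, I think I could apply the inverse weighting I once implemented (see "How to randomly equalize unequal values?") to your case.
Basically, the sampling probability of a value is inversely proportional to its population count. The initial population is your guidance parameter: if it is high, its inverse is low and the accumulated counters have almost no effect, so sampling stays very close to uniform. If the initial population is low (say, 1), the accumulated counters affect the sampling much more.
The second parameter to consider is when to drop the accumulated counters and return to the original probabilities; without such a reset, the effect of a low initial counter fades over time.
Code, using Math.NET for categorical sampling in the range [0...6), .NET Core 2.2, x64.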
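The weighting described above can be written out explicitly. The symbols here are notation introduced for illustration and do not appear in the original answer: $c_i$ is the current counter of value $i$, $G$ the guidance parameter, $d$ the per-sample increment, $k_i$ the number of times value $i$ has been drawn since the last reset, and $n$ the number of values. It is assumed that the weights $1/c_i$ are normalized before sampling.

```latex
c_i = G + d\,k_i,
\qquad
p_i = \frac{1/c_i}{\sum_{j=1}^{n} 1/c_j}
```

When $G \gg d\,k_i$ for every $i$, all counters are nearly equal and $p_i \approx 1/n$, which is the near-uniform regime. When $G$ is comparable to $d$, a single draw noticeably lowers $p_i$ for the value just drawn.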
using System;
using System.Linq;
using MathNet.Numerics.Random;
using MathNet.Numerics.Distributions;

namespace EqualizedSampling
{
    class Program
    {
        static void Main(string[] args)
        {
            int increment = 10; // how much the inverse probability of the drawn value grows per sample
            int guidanceParameter = 1000000; // small: consecutive sampling is more affected by outcomes; large: closer to uniform sampling

            int[] invprob = new int[6];            // inverse-probability counters
            double[] probabilities = new double[6];
            int[] counter = new int[] {0, 0, 0, 0, 0, 0}; // total appearances of each value
            int[] repeat = new int[] {0, 0, 0, 0, 0, 0};  // times a value immediately followed itself
            int prev = -1;

            for (int k = 0; k != 100000; ++k)
            {
                if (k % 60 == 0) // drop accumulation, important for low guidance
                {
                    for (int i = 0; i != 6; ++i)
                    {
                        invprob[i] = guidanceParameter;
                    }
                }

                // weights are the reciprocals of the counters
                for (int i = 0; i != 6; ++i)
                {
                    probabilities[i] = 1.0 / (double)invprob[i];
                }

                var cat = new Categorical(probabilities);
                var q = cat.Sample();

                counter[q] += 1;
                invprob[q] += increment; // make the drawn value less likely next time
                if (q == prev)
                    repeat[q] += 1;
                prev = q;
            }

            counter.ToList().ForEach(Console.WriteLine);
            repeat.ToList().ForEach(Console.WriteLine);
        }
    }
}
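I counted the repeated pairs as well as the total number of appearances of each value (the first six lines of each output are the totals, the last six are the repeat counts). With a low guidance parameter the totals stay close to uniform and consecutive pairs appear less often: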
16670
16794
16713
16642
16599
16582
2431
2514
2489
2428
2367
2436
With a guidance parameter of 1000000, the probability of drawing consecutive pairs is higher:
16675
16712
16651
16677
16663
16622
2745
2707
2694
2792
2682
2847
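Update
We can add another parameter: the increment applied per sample. A larger increment makes consecutive repeats less likely. The code above has been updated; output: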
16659
16711
16618
16609
16750
16653
2184
2241
2285
2259
2425
2247
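To make the effect of the increment concrete, here is a minimal sketch that computes the chance of an immediate repeat. The class and method names (`RepeatOdds`, `ImmediateRepeat`) are hypothetical, and the calculation assumes that every counter was equal to the guidance parameter just before the draw (that is, right after a reset) and that the reciprocal weights are normalized.

```csharp
using System;
using System.Linq;

public static class RepeatOdds
{
    // Probability that the value just drawn is drawn again on the very next sample,
    // assuming all counters were equal to `guidance` before that draw.
    public static double ImmediateRepeat(int categories, int guidance, int increment)
    {
        // Inverse-probability counters right after one draw of category 0.
        double[] invprob = Enumerable.Repeat((double)guidance, categories).ToArray();
        invprob[0] += increment;

        // Weights are the reciprocals; normalizing them gives the probabilities.
        double[] weights = invprob.Select(c => 1.0 / c).ToArray();
        return weights[0] / weights.Sum();
    }
}
```

For six categories, `RepeatOdds.ImmediateRepeat(6, 1, 1)` evaluates to 1/11 (about 0.091), compared with 1/6 (about 0.167) for plain uniform sampling. With `guidance = 1000000` and `increment = 10` the result is only marginally below 1/6.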

Contributor with 1863 experience points, 2+ upvotes
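I ended up modifying Severin's solution to better suit my needs, so I thought I'd share it here in case someone runs into the same problem. What I did:
Replaced Categorical with my own code based on the Random class, because Categorical was giving me strange results.
Changed the way the probabilities are calculated.
Added more statistics.
The key parameter to change is ratio:
The minimum value is 1.0, which makes it behave like a plain random number generator.
The higher the value, the more it resembles a shuffling algorithm, so every number is practically guaranteed to show up in the near future and repeats become rare. The order is still unpredictable.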
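As a rough illustration of what ratio does to the probabilities after a single draw, the sketch below divides the drawn entry by ratio and rescales everything to sum to 1. The names `RatioUpdate` and `Penalize` are hypothetical and chosen only for this example.

```csharp
using System;
using System.Linq;

public static class RatioUpdate
{
    // Returns a new array in which the drawn entry is divided by `ratio`
    // and all entries are then rescaled so that they sum to 1.
    public static double[] Penalize(double[] probabilities, int drawn, double ratio)
    {
        double[] updated = (double[])probabilities.Clone();
        updated[drawn] /= ratio;
        double sum = updated.Sum();
        return updated.Select(p => p / sum).ToArray();
    }
}
```

Starting from six equal probabilities of 1/6 and ratio 5.0, the drawn number drops to 1/26 (about 0.038) while each of the others rises to 5/26 (about 0.192). With ratio 1.0 the array is unchanged, which is why that setting behaves like an ordinary random number generator.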
Results for ratio 1.0:
This behaves just like plain pseudorandom number generation.
3, 5, 3, 3, 3, 3, 0, 3, 3, 5, 5, 5, 2, 1, 3, 5, 3, 3, 2, 3, 1, 0, 4, 1, 5, 1, 3, 5, 1, 5, -
Number of occurences:
2
5
2
12
1
8
Max occurences in a row:
1
1
1
4
1
3
Max length where this number did not occur:
14
13
12
6
22
8
Results for ratio 5.0:
My favorite. A good distribution, occasional repeats, and no overly long stretches in which some number does not occur.
4, 1, 5, 3, 2, 5, 0, 0, 1, 3, 2, 4, 2, 1, 5, 0, 4, 3, 1, 4, 0, 2, 4, 3, 5, 5, 2, 4, 0, 1, -
Number of occurences:
5
5
5
4
6
5
Max occurences in a row:
2
1
1
1
1
2
Max length where this number did not occur:
7
10
8
7
10
9
Results for ratio 1000.0:
A very even distribution, but still with some randomness.
4, 5, 2, 0, 3, 1, 4, 0, 1, 5, 2, 3, 4, 3, 0, 2, 5, 1, 4, 2, 5, 1, 3, 0, 2, 4, 5, 0, 3, 1, -
Number of occurences:
5
5
5
5
5
5
Max occurences in a row:
1
1
1
1
1
1
Max length where this number did not occur:
8
8
7
8
6
7
Code:
using System;
using System.Linq;

namespace EqualizedSampling
{
    class Program
    {
        static Random rnd = new Random(DateTime.Now.Millisecond);

        /// <summary>
        /// Returns a random int from the range [0 .. numNumbers-1] using the given probabilities.
        /// The probabilities have to add up to 1.
        /// </summary>
        static int Sample(int numNumbers, double[] probabilities)
        {
            double r = rnd.NextDouble();
            double sum = 0.0;
            for (int i = 0; i < numNumbers; i++)
            {
                sum = sum + probabilities[i];
                if (sum > r)
                    return i;
            }
            return numNumbers - 1; // fallback in case rounding keeps the running sum at or below r
        }

        static void Main(string[] args)
        {
            const int numNumbers = 6;
            const int numSamples = 30;

            // low ratio makes everything behave more randomly
            // min is 1.0, which makes things behave like a random number generator
            // higher ratio makes number selection more "natural"
            double ratio = 5.0;

            double[] probabilities = new double[numNumbers];
            int[] counter = new int[numNumbers];       // how many times each number occurred
            int[] maxRepeat = new int[numNumbers];     // longest run of this number in a row
            int[] maxDistance = new int[numNumbers];   // longest stretch of samples without this number
            int[] lastOccurence = new int[numNumbers]; // index of the last sample with this number

            // init
            for (int i = 0; i < numNumbers; i++)
            {
                counter[i] = 0;
                maxRepeat[i] = 0;
                probabilities[i] = 1.0 / numNumbers;
                lastOccurence[i] = -1;
            }

            int prev = -1;
            int numRepeats = 1;
            for (int k = 0; k < numSamples; k++)
            {
                // sample the next number
                var q = Sample(numNumbers, probabilities);
                Console.Write($"{q}, ");

                // lower the probability of the selected number
                probabilities[q] /= ratio;

                // rescale all probabilities so they add up to 1
                double sumProbabilities = probabilities.Sum();
                for (int i = 0; i < numNumbers; i++)
                    probabilities[i] /= sumProbabilities;

                // gather statistics
                counter[q] += 1;
                numRepeats = q == prev ? numRepeats + 1 : 1;
                maxRepeat[q] = Math.Max(maxRepeat[q], numRepeats);
                lastOccurence[q] = k;
                for (int i = 0; i < numNumbers; i++)
                    maxDistance[i] = Math.Max(maxDistance[i], k - lastOccurence[i]);
                prev = q;
            }

            Console.WriteLine("-\n");
            Console.WriteLine("Number of occurences:");
            counter.ToList().ForEach(Console.WriteLine);
            Console.WriteLine();
            Console.WriteLine("Max occurences in a row:");
            maxRepeat.ToList().ForEach(Console.WriteLine);
            Console.WriteLine();
            Console.WriteLine("Max length where this number did not occur:");
            maxDistance.ToList().ForEach(Console.WriteLine);
            Console.ReadLine();
        }
    }
}
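If the sampler is needed outside a console demo, the same idea can be wrapped in a small class. This is only a sketch under stated assumptions: the class name `EqualizedSampler`, its constructor parameters, and the explicit `seed` argument (added so that runs are reproducible) are not part of the original answer.

```csharp
using System;
using System.Linq;

public sealed class EqualizedSampler
{
    private readonly Random _rnd;
    private readonly double[] _probabilities;
    private readonly double _ratio;

    public EqualizedSampler(int numNumbers, double ratio, int seed)
    {
        if (numNumbers < 1) throw new ArgumentOutOfRangeException(nameof(numNumbers));
        if (ratio < 1.0) throw new ArgumentOutOfRangeException(nameof(ratio));
        _rnd = new Random(seed);
        _ratio = ratio;
        // Start from a uniform distribution.
        _probabilities = Enumerable.Repeat(1.0 / numNumbers, numNumbers).ToArray();
    }

    // Draws the next number, then makes that number less likely for later draws.
    public int Next()
    {
        double r = _rnd.NextDouble();
        double sum = 0.0;
        int q = _probabilities.Length - 1; // fallback guards against rounding error
        for (int i = 0; i < _probabilities.Length; i++)
        {
            sum += _probabilities[i];
            if (sum > r) { q = i; break; }
        }

        // Penalize the drawn number and rescale so the probabilities sum to 1.
        _probabilities[q] /= _ratio;
        double total = _probabilities.Sum();
        for (int i = 0; i < _probabilities.Length; i++)
            _probabilities[i] /= total;

        return q;
    }
}
```

A caller would create it once, for example `var sampler = new EqualizedSampler(6, 5.0, 12345);`, and call `sampler.Next()` for each number. Two instances built with the same arguments produce the same sequence, which can help when comparing different ratio values.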