Java Streams - Standard Deviation
To be clear upfront: I am looking for a way to calculate the standard deviation (SD) using Streams. I already have a working method that calculates and returns the SD, but it does not use Streams.
The dataset I am working with closely matches the one in the linked question. As shown there, I am able to group my data and get the average, but I cannot figure out how to get the SD.
Code
outPut.stream()
      .collect(Collectors.groupingBy(e -> e.getCar(),
              Collectors.averagingDouble(e -> (e.getHigh() - e.getLow()))))
      .forEach((car, avgHLDifference) -> System.out.println(car + " " + avgHLDifference));
I also checked the linked material on DoubleSummaryStatistics, but it does not seem to help with the SD.
Recommended answer
You can use a custom collector for this task that keeps track of the sum of squares. The built-in DoubleSummaryStatistics collector does not track it. This was discussed by the expert group in a linked thread, but in the end it was not implemented. The difficulty when calculating the sum of squares is the potential overflow when squaring the intermediate results.
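For reference, the class below computes the population standard deviation (it divides by n, not n - 1) from the count, the sum and the sum of squares. Written out, with x_1 ... x_n standing for the collected values, the identity it relies on is:

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^{2} \;-\; \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)^{2}}
```

Because the two terms under the root can be large and nearly equal, the subtraction is sensitive to rounding, which is why the class accumulates the squares with compensated summation.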
// Requires: import java.util.DoubleSummaryStatistics;
static class DoubleStatistics extends DoubleSummaryStatistics {

    private double sumOfSquare = 0.0d;
    private double sumOfSquareCompensation; // Low order bits of sum
    private double simpleSumOfSquare; // Used to compute right sum for non-finite inputs

    @Override
    public void accept(double value) {
        super.accept(value);
        double squareValue = value * value;
        simpleSumOfSquare += squareValue;
        sumOfSquareWithCompensation(squareValue);
    }

    public DoubleStatistics combine(DoubleStatistics other) {
        super.combine(other);
        simpleSumOfSquare += other.simpleSumOfSquare;
        sumOfSquareWithCompensation(other.sumOfSquare);
        // The compensation term stores accumulated excess, so it is subtracted
        sumOfSquareWithCompensation(-other.sumOfSquareCompensation);
        return this;
    }

    // Kahan (compensated) summation of the squared values
    private void sumOfSquareWithCompensation(double value) {
        double tmp = value - sumOfSquareCompensation;
        double velvel = sumOfSquare + tmp; // Little wolf of rounding error
        sumOfSquareCompensation = (velvel - sumOfSquare) - tmp;
        sumOfSquare = velvel;
    }

    public double getSumOfSquare() {
        // Apply the pending compensation to the running sum
        double tmp = sumOfSquare - sumOfSquareCompensation;
        if (Double.isNaN(tmp) && Double.isInfinite(simpleSumOfSquare)) {
            return simpleSumOfSquare;
        }
        return tmp;
    }

    // Population standard deviation: sqrt(mean of squares - square of mean)
    public final double getStandardDeviation() {
        if (getCount() == 0) {
            return 0.0d;
        }
        double mean = getAverage();
        double variance = getSumOfSquare() / getCount() - mean * mean;
        // Rounding can push a tiny variance slightly below zero; clamp to avoid NaN
        return Math.sqrt(Math.max(variance, 0.0d));
    }
}
You can then use this class as follows:
Map<String, Double> standardDeviationMap =
    list.stream()
        .collect(Collectors.groupingBy(
            e -> e.getCar(),
            Collectors.mapping(
                e -> e.getHigh() - e.getLow(),
                Collector.of(
                    DoubleStatistics::new,
                    DoubleStatistics::accept,
                    DoubleStatistics::combine,
                    d -> d.getStandardDeviation()
                )
            )
        ));
This collects the input list into a map whose values are the standard deviation of high - low for each key.
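If the cancellation in the sum-of-squares formula is a concern, one alternative is to accumulate with Welford's online update and merge partial results with the usual pairwise formula. The following is only a sketch of that different technique, not the class from the answer. The `Quote` class, the `Welford` accumulator and the method names are hypothetical stand-ins; only `getCar()`, `getHigh()` and `getLow()` mirror the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class StdDevDemo {

    // Hypothetical element type with the getters used in the question.
    public static final class Quote {
        private final String car;
        private final double high;
        private final double low;

        public Quote(String car, double high, double low) {
            this.car = car;
            this.high = high;
            this.low = low;
        }

        public String getCar() { return car; }
        public double getHigh() { return high; }
        public double getLow() { return low; }
    }

    // Mutable accumulator: count, running mean, and sum of squared deviations (m2).
    static final class Welford {
        long n;
        double mean;
        double m2;

        // Welford's single-value update.
        void accept(double x) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }

        // Merge two partial results (needed for parallel streams).
        Welford combine(Welford o) {
            if (o.n == 0) return this;
            if (n == 0) { n = o.n; mean = o.mean; m2 = o.m2; return this; }
            long total = n + o.n;
            double delta = o.mean - mean;
            m2 += o.m2 + delta * delta * n * o.n / total;
            mean += delta * o.n / total;
            n = total;
            return this;
        }

        // Population standard deviation (divides by n); 0.0 for an empty group.
        double populationStdDev() {
            return n > 0 ? Math.sqrt(m2 / n) : 0.0;
        }
    }

    static Collector<Double, Welford, Double> stdDev() {
        return Collector.of(Welford::new, Welford::accept,
                Welford::combine, Welford::populationStdDev);
    }

    public static Map<String, Double> stdDevByCar(List<Quote> quotes) {
        return quotes.stream().collect(Collectors.groupingBy(
                Quote::getCar,
                Collectors.mapping(q -> q.getHigh() - q.getLow(), stdDev())));
    }

    public static void main(String[] args) {
        List<Quote> quotes = Arrays.asList(
                new Quote("A", 11, 10),   // spread 1
                new Quote("A", 13, 10),   // spread 3
                new Quote("B", 5, 3));    // spread 2
        // Car A has spreads 1 and 3 (SD 1.0); car B has a single spread (SD 0.0).
        stdDevByCar(quotes).forEach((car, sd) -> System.out.println(car + " " + sd));
    }
}
```

As with the answer's class, this returns the population standard deviation; dividing `m2` by `n - 1` instead would give the sample version.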