如何选择一行内的标准差?(在 SQL - 或 R :)

2022-01-07 00:00:00 statistics r mysql

我想知道是否有办法从同一行内的MySQL中的多个整数字段中选择标准偏差.显然,如果我使用

I wonder whether there is a way to select the standard deviation from several integer fields in MySQL within the same row. Obviously, if I use

SELECT STDDEV(col1) FROM mytable

我只是得到该特定列的标准偏差.假设我有一张像:id,somefield1,somefield2, integerfield1,integerfield2,integerfield3, ... ,integerfield30 .现在我想在一行中选择 integerfield 1-30 的标准偏差并将其保存为 sdfield .当然我可以使用统计软件来做这个,但我只是想知道是否有一种方法可以直接在 MySQL 中完成.

I just get the standard deviation of that particular column. Let´s assume I have a table like: id,somefield1,somefield2, integerfield1,integerfield2,integerfield3, ... ,integerfield30 . Now I´d like to select the standard deviation of integerfield 1-30 within a row and save it AS sdfield . Of course I could use statistical software for this, but I just wonder if there is a way to do it directly in MySQL.

推荐答案

我自己找到了两个解决方案:

I found two solutions on my own:

1) 规范化数据库.我最终得到两个表:

1) Normalize the database. I end up with two tables:

表一用户名 |信息1 |元信息2

table one uid | information1 | metainformation2

表二用户名 |上校 |result_of_col

table two uid | col | result_of_col

然后我就可以轻松使用标准的 STDDEV 函数了.

Then I can easily use the standard STDDEV function.

2) 使用 R.数据是非规范化格式,因为它应该用于统计分析.因此很容易进入 R 并使用以下代码.

2) Use R. The data is a de-normalized format because it should be used in statistical analysis. Thus it´s easy to get into R and use the following code.

sd(t(dataset[1:4,3:8]))

sd(t(dataset[1:4,3:8]))

请注意,我只是通过选择第 3-8 列来获取此 data.frame 的数字部分.并且不要被太多数据击中(这就是为什么我这次只使用前几行).t() 转置必要的数据,因为 sd() 仅适用于列.

Note that, I just take the numeric part of this data.frame by leaving selecting the columns 3-8. And dont get hit by too much data (that´s why I only use the first couple of rows this time). t() transposes the data which is necessary because sd() only works with columns.

在 vsn 包中有一个函数 rowSds,它应该与 rowMean 和 rowSum 类似地工作,但不知何故这可能会被弃用.至少这个包在 Swiss CRAN 镜像上是不可用的 ;) .

There´s a function rowSds around in the vsn package, that is supposed to work analogously to rowMean and rowSum, but somehow this might be deprecated. At least this packages was not available on the Swiss CRAN mirror ;) .

HTH 其他人.

相关文章