Convert float to string in positional format (without scientific notation and false precision)
Question
I want to print floating point numbers so that they are always written in decimal form (e.g. 12345000000000000000000.0 or 0.000000000000012345), not in scientific notation, yet I want the result to have up to the ~15.7 significant figures of an IEEE 754 double, and no more.
Ideally, the result should be the shortest string in positional decimal format that still yields the same value when converted back to a float.
It is well known that the repr of a float is written in scientific notation if the exponent is greater than 15 or less than -4:
>>> n = 0.000000054321654321
>>> n
5.4321654321e-08 # scientific notation
If str is used, the resulting string is again in scientific notation:
>>> str(n)
'5.4321654321e-08'
It has been suggested that I can use format with the f flag and sufficient precision to get rid of the scientific notation:
>>> format(0.00000005, '.20f')
'0.00000005000000000000'
It works for that number, though it has some extra trailing zeroes. But the same format fails for .1, where it gives decimal digits beyond the actual machine precision of float:
>>> format(0.1, '.20f')
'0.10000000000000000555'
And if my number is 4.5678e-20, using .20f would still lose relative precision:
>>> format(4.5678e-20, '.20f')
'0.00000000000000000005'
Thus these approaches do not match my requirements.
This leads to the question: what is the easiest and also well-performing way to print an arbitrary floating point number in decimal format, having the same digits as in repr(n) (or str(n) on Python 3), but always using the decimal format, not scientific notation?
That is, a function or operation that, for example, converts the float value 0.00000005 to the string '0.00000005'; 0.1 to '0.1'; 420000000000000000.0 to '420000000000000000.0' or 420000000000000000; and formats the float value -4.5678e-5 as '-0.000045678'.
After the bounty period: it seems that there are at least two viable approaches, as Karin demonstrated that string manipulation can achieve a significant speed boost compared to my initial algorithm on Python 2.
Thus,
- If performance is important and Python 2 compatibility is required, or if the decimal module cannot be used for some reason, then Karin's approach using string manipulation is the way to do it.
- On Python 3, my somewhat shorter code will also be faster.
Since I am primarily developing on Python 3, I will accept my own answer and award Karin the bounty.
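To make the string manipulation idea concrete, here is a minimal sketch. It is not Karin's actual code; the function name float_to_str_manual is hypothetical, and the sketch assumes repr(f) yields either a positional string or the form [-]d[.ddd]e[+-]xx, as CPython does.

```python
def float_to_str_manual(f):
    """Hypothetical sketch: expand repr(f) into positional notation
    by shifting the decimal point, without using the decimal module."""
    s = repr(f)
    if 'e' not in s:
        # Already positional; this also passes 'inf' and 'nan' through.
        return s

    mantissa, exp = s.split('e')
    exp = int(exp)

    sign = ''
    if mantissa.startswith('-'):
        sign, mantissa = '-', mantissa[1:]

    int_part, _, frac_part = mantissa.partition('.')
    digits = int_part + frac_part

    # Index in `digits` where the decimal point lands after the shift.
    point = len(int_part) + exp

    if point <= 0:
        # Small number: pad with zeros between the point and the digits.
        return sign + '0.' + '0' * (-point) + digits
    if point >= len(digits):
        # Large number: pad with zeros before the point.
        return sign + digits + '0' * (point - len(digits)) + '.0'
    # The point falls inside the digit string.
    return sign + digits[:point] + '.' + digits[point:]


print(float_to_str_manual(0.00000005))
print(float_to_str_manual(420000000000000000.0))
print(float_to_str_manual(-4.5678e-5))
```

Because it only rearranges the digits that repr already produced, the result carries the same significant digits as repr(f).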
Solution
Unfortunately it seems that not even the new-style formatting with float.__format__ supports this. The default formatting of floats is the same as with repr, and with the f flag there are 6 fractional digits by default:
>>> format(0.0000000005, 'f')
'0.000000'
However, there is a hack to get the desired result. It is not the fastest one, but it is relatively simple:
- First, the float is converted to a string using str() or repr().
- Then a new Decimal instance is created from that string. Decimal.__format__ supports the f flag, which gives the desired result, and, unlike floats, it prints the actual precision instead of a default precision.
Thus we can make a simple utility function, float_to_str:
import decimal

# create a new context for this task
ctx = decimal.Context()

# 20 digits should be enough for everyone :D
ctx.prec = 20

def float_to_str(f):
    """
    Convert the given float to a string,
    without resorting to scientific notation
    """
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')
Care must be taken not to use the global decimal context, so a new context is constructed for this function. This is the fastest way; another way would be to use decimal.localcontext, but it would be slower, since it creates a new thread-local context and a context manager for each conversion.
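For comparison, the context-manager alternative mentioned above might look like the following sketch. The name float_to_str_local is hypothetical; decimal.localcontext() is the standard library call that yields a temporary copy of the current context.

```python
import decimal


def float_to_str_local(f):
    """Hypothetical variant: same conversion, but with a temporary
    context created on every call instead of a module-level one."""
    with decimal.localcontext() as local_ctx:
        # Settings changed here are discarded when the block exits.
        local_ctx.prec = 20
        d = local_ctx.create_decimal(repr(f))
        return format(d, 'f')


print(float_to_str_local(0.1))
print(float_to_str_local(0.00000005))
```

This keeps the precision setting out of module state, at the cost of the per-call setup described above.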
This function now returns the string with all possible digits from the mantissa, rounded to the shortest equivalent representation:
>>> float_to_str(0.1)
'0.1'
>>> float_to_str(0.00000005)
'0.00000005'
>>> float_to_str(420000000000000000.0)
'420000000000000000'
>>> float_to_str(0.000000000123123123123123123123)
'0.00000000012312312312312313'
The last result is rounded at the last digit.
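One way to gain confidence in the "shortest equivalent representation" claim is a round-trip check: converting the string back with float() should give exactly the original value. The sketch below repeats float_to_str so that it runs on its own; the sample values are simply the ones used in this post.

```python
import decimal

ctx = decimal.Context()
ctx.prec = 20


def float_to_str(f):
    """Same function as above, repeated so this block is self-contained."""
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')


samples = [0.1, 0.00000005, 420000000000000000.0,
           -4.5678e-5, 4.5678e-20, 0.000000000123123123123123123123]

for x in samples:
    s = float_to_str(x)
    # No exponent marker should appear in the positional string.
    assert 'e' not in s.lower()
    # Parsing the string must give back exactly the same float.
    assert float(s) == x
    print(repr(x), '->', s)
```

Only finite values are covered here; inf and nan are not handled by this check.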
As @Karin noted, float_to_str(420000000000000000.0) does not strictly match the expected format; it returns 420000000000000000 without the trailing .0.
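If the trailing .0 matters, one possible adjustment is to append it whenever the formatted string has no decimal point. This is only a sketch; the name float_to_str_with_point is hypothetical, and the isfinite guard is an added assumption so that inf and nan are left untouched.

```python
import decimal
import math

_ctx = decimal.Context()
_ctx.prec = 20


def float_to_str_with_point(f):
    """Hypothetical variant of float_to_str that always keeps a
    fractional part for finite values, e.g. '420000000000000000.0'."""
    s = format(_ctx.create_decimal(repr(f)), 'f')
    if math.isfinite(f) and '.' not in s:
        s += '.0'
    return s


print(float_to_str_with_point(420000000000000000.0))
print(float_to_str_with_point(0.1))
```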