Checking if a string can be converted to float in Python

181

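I've got some Python code that runs through a list of strings and converts them to integers or floating point numbers if possible. Doing this for integers is pretty easy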

if element.isdigit():
  newelement = int(element)

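Floating point numbers are more difficult. Right now I'm using partition('.') to split the string and checking to make sure that one or both sides are digits.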

partition = element.partition('.')
# Parentheses around the whole condition let it span several lines.
if ((partition[0].isdigit() and partition[1] == '.' and partition[2].isdigit())
    or (partition[0] == '' and partition[1] == '.' and partition[2].isdigit())
    or (partition[0].isdigit() and partition[1] == '.' and partition[2] == '')):
  newelement = float(element)

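This works, but obviously the if statement for that is a bit of a bear. The other solution I considered is to just wrap the conversion in a try/catch block and see if it succeeds, as described in this question.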
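A minimal sketch of the try/except idea, assuming the goal is to turn each string into an int where possible, then a float, and otherwise leave it alone. The name `convert_element` is made up for illustration and does not come from the question.

```python
def convert_element(element):
    """Return element as an int if possible, else a float, else unchanged."""
    for cast in (int, float):
        try:
            return cast(element)
        except ValueError:
            continue  # this type rejected the string, try the next one
    return element


elements = ["12", "3.5", "-7", "1e3", "abc"]
converted = [convert_element(e) for e in elements]
print(converted)  # [12, 3.5, -7, 1000.0, 'abc']
```

Trying int first keeps "12" from silently becoming 12.0.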

Anyone have any other ideas? Opinions on the relative merits of the partition and try/catch approaches?

python string type-conversion

— Chris Upchurch

304

I would just use..

try:
    float(element)
except ValueError:
    print("Not a float")

..it's simple, and it works

Another option would be a regular expression:

import re
if re.match(r'^-?\d+(?:\.\d+)?$', element) is None:
    print("Not float")

— dbr

3

@S.Lott: Most of the strings this is applied to will turn out to be ints or floats.

— Chris Upchurch

10

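Your regex is not optimal. "^\d+\.\d+$" will fail a match at the same speed as above, but will succeed faster. Also, a more correct way would be: "^[+-]?\d(>?\.\d+)?$" However, that still doesn't match numbers like: +1.0e-10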

— John Gietzen

86

Except that you forgot to name your function "will_it_float".

— unmounted

3

The second option won't catch nan and exponential expressions, such as 2e3.

— Patrick B.

4

I think the regular expression is not parsing negative numbers.

— Carlos

191

Python method to check for float:

def isfloat(value):
  try:
    float(value)
    return True
  except ValueError:
    return False

Don't get bit by the goblins hiding in the float boat! DO UNIT TESTING!

What is, and is not, a float may surprise you:

Command to parse                        Is it a float?  Comment
--------------------------------------  --------------- ------------
print(isfloat(""))                      False
print(isfloat("1234567"))               True 
print(isfloat("NaN"))                   True            nan is also float
print(isfloat("NaNananana BATMAN"))     False
print(isfloat("123.456"))               True
print(isfloat("123.E4"))                True
print(isfloat(".1"))                    True
print(isfloat("1,234"))                 False
print(isfloat("NULL"))                  False           case insensitive
print(isfloat(",1"))                    False           
print(isfloat("123.EE4"))               False           
print(isfloat("6.523537535629999e-07")) True
print(isfloat("6e777777"))              True            This is same as Inf
print(isfloat("-iNF"))                  True
print(isfloat("1.797693e+308"))         True
print(isfloat("infinity"))              True
print(isfloat("infinity and BEYOND"))   False
print(isfloat("12.34.56"))              False           Two dots not allowed.
print(isfloat("#56"))                   False
print(isfloat("56%"))                   False
print(isfloat("0E0"))                   True
print(isfloat("x86E0"))                 False
print(isfloat("86-5"))                  False
print(isfloat("True"))                  False           Boolean is not a float.   
print(isfloat(True))                    True            Boolean is a float
print(isfloat("+1e1^5"))                False
print(isfloat("+1e1"))                  True
print(isfloat("+1e1.3"))                False
print(isfloat("+1.3P1"))                False
print(isfloat("-+1"))                   False
print(isfloat("(1)"))                   False           brackets not interpreted
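A few rows of the table above, written as a self-checking sketch. The `CASES` list is only a sample picked for illustration, and `isfloat` is repeated so the block runs on its own.

```python
def isfloat(value):
    try:
        float(value)
        return True
    except ValueError:
        return False


# (input, expected result) pairs copied from the table
CASES = [
    ("", False),
    ("123.E4", True),
    ("NaN", True),
    ("-iNF", True),
    ("1,234", False),
    ("12.34.56", False),
    ("6e777777", True),
    (True, True),
]

for value, expected in CASES:
    assert isfloat(value) is expected, repr(value)
```

Note that float(None) raises TypeError rather than ValueError, so this version of isfloat does not return False for None; it lets the exception propagate.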

— Eric Leschinski

6

Great answer. Just adding 2 more where float=True: isfloat(" 1.23 ") and isfloat(" \n \t 1.23 \n\t\n"). Useful in web requests; no need to trim white space first.

— BareNakedCoder

22

'1.43'.replace('.','',1).isdigit()

This returns true only if there is one or no '.' in the string of digits.

'1.4.3'.replace('.','',1).isdigit()

will return false

'1.ww'.replace('.','',1).isdigit()

will return false
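A possible extension of the same trick that also tolerates a single leading sign. This is only a sketch; `is_plain_decimal` is a made-up name, and exponents, "nan" and "inf" are still deliberately rejected.

```python
def is_plain_decimal(text):
    """True for strings like '12', '-1.5' or '+.5'; no exponents, nan or inf."""
    if text[:1] in ("+", "-"):
        text = text[1:]  # drop one leading sign, if present
    # Remove at most one dot, then require that only digits remain.
    return text.replace(".", "", 1).isdigit()


for sample in ["1.43", "-1.5", "+.5", "1.4.3", "1.ww", "--1", ""]:
    print(repr(sample), is_plain_decimal(sample))
```

One caveat: str.isdigit() also accepts some Unicode digits, such as the superscript '²', that float() rejects, so this check is not a strict subset of what float() parses.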

— Tulasi Reddy

3

Not optimal, but actually pretty clever. Won't handle +/- and exponents.

— Mad Physicist

Years late, but this is a nice method. Worked for me using the following in a pandas dataframe: [i for i in df[i].apply(lambda x: str(x).replace('.','').isdigit()).any()]

— Mark Moretto

1

@MarkMoretto You are going to be in for a shock when you learn of the existence of negative numbers

— David Heffernan

Best one-liner for my scenario, where I just need to check for positive floats or numbers. I like it.

— MJohnyJ

8

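TL;DR:

If your input is mostly strings that can be converted to floats, the try: except: method is the best native Python method.
If your input is mostly strings that cannot be converted to floats, regular expressions or the partition method will be better.
If you are 1) unsure of your input or need more speed and 2) don't mind and can install a third-party C-extension, fastnumbers works very well.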
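To check which of the first two cases applies to your own data, something like the following could be used. It is a rough sketch: the regex is a simplified pattern written for this example (not the one benchmarked in this thread), and `time_checks` is a hypothetical helper name.

```python
import re
import timeit

# Simplified float pattern, assumed for illustration only.
_FLOAT_RE = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def is_float_try(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def is_float_re(text):
    return _FLOAT_RE.fullmatch(text) is not None


def time_checks(samples, number=10_000):
    """Return {function name: seconds} for running each check over samples."""
    results = {}
    for func in (is_float_try, is_float_re):
        timer = timeit.Timer(lambda: [func(s) for s in samples])
        results[func.__name__] = timer.timeit(number=number)
    return results


if __name__ == "__main__":
    print("mostly valid:  ", time_checks(["12.2"] * 9 + ["12.2x"]))
    print("mostly invalid:", time_checks(["12.2x"] * 9 + ["12.2"]))
```

The absolute numbers depend on the machine and Python version, so only the relative ordering for your mix of inputs is meaningful.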

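Another method is available via a third-party module called fastnumbers (disclosure, I am the author); it provides a function called isfloat. I have taken the unittest example outlined by Jacob Gabrielson in his answer, but added the fastnumbers.isfloat method. I should also note that Jacob's example did not do justice to the regex option, because most of the time in that example was spent in global lookups caused by the dot operator. The regex function below binds the compiled pattern's match method once, to give a fairer comparison to try: except:.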

def is_float_try(str):
    try:
        float(str)
        return True
    except ValueError:
        return False

import re
_float_regexp = re.compile(r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$").match
def is_float_re(str):
    return True if _float_regexp(str) else False

def is_float_partition(element):
    partition=element.partition('.')
    if (partition[0].isdigit() and partition[1]=='.' and partition[2].isdigit()) or (partition[0]=='' and partition[1]=='.' and partition[2].isdigit()) or (partition[0].isdigit() and partition[1]=='.' and partition[2]==''):
        return True
    else:
        return False

from fastnumbers import isfloat


if __name__ == '__main__':
    import unittest
    import timeit

    class ConvertTests(unittest.TestCase):

        def test_re_perf(self):
            print
            print 're sad:', timeit.Timer('ttest.is_float_re("12.2x")', "import ttest").timeit()
            print 're happy:', timeit.Timer('ttest.is_float_re("12.2")', "import ttest").timeit()

        def test_try_perf(self):
            print
            print 'try sad:', timeit.Timer('ttest.is_float_try("12.2x")', "import ttest").timeit()
            print 'try happy:', timeit.Timer('ttest.is_float_try("12.2")', "import ttest").timeit()

        def test_fn_perf(self):
            print
            print 'fn sad:', timeit.Timer('ttest.isfloat("12.2x")', "import ttest").timeit()
            print 'fn happy:', timeit.Timer('ttest.isfloat("12.2")', "import ttest").timeit()


        def test_part_perf(self):
            print
            print 'part sad:', timeit.Timer('ttest.is_float_partition("12.2x")', "import ttest").timeit()
            print 'part happy:', timeit.Timer('ttest.is_float_partition("12.2")', "import ttest").timeit()

    unittest.main()

On my machine, the output is:

fn sad: 0.220988988876
fn happy: 0.212214946747
.
part sad: 1.2219619751
part happy: 0.754667043686
.
re sad: 1.50515985489
re happy: 1.01107215881
.
try sad: 2.40243887901
try happy: 0.425730228424
.
----------------------------------------------------------------------
Ran 4 tests in 7.761s

OK

As you can see, regex is actually not as bad as it originally seemed, and if you have a real need for speed, the fastnumbers method is quite good.

— Seth Morton

The fastnumbers check works so well if you have a majority of strings that can't be converted to floats. It really speeds things up, thank you

— ragardner

5

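If you cared about performance (and I'm not suggesting you should), the try-based approach is the clear winner (compared with your partition-based approach or the regexp approach), as long as you don't expect a lot of invalid strings, in which case it's potentially slower (presumably due to the cost of exception handling).

Again, I'm not suggesting you care about performance, just giving you the data in case you're doing this 10 billion times a second, or something. Also, the partition-based code doesn't handle at least one valid string.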

$ ./floatstr.py
F..
partition sad: 3.1102449894
partition happy: 2.09208488464
..
re sad: 7.76906108856
re happy: 7.09421992302
..
try sad: 12.1525540352
try happy: 1.44165301323
.
======================================================================
FAIL: test_partition (__main__.ConvertTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./floatstr.py", line 48, in test_partition
    self.failUnless(is_float_partition("20e2"))
AssertionError

----------------------------------------------------------------------
Ran 8 tests in 33.670s

FAILED (failures=1)

Here's the code (Python 2.6, regexp taken from John Gietzen's answer):

def is_float_try(str):
    try:
        float(str)
        return True
    except ValueError:
        return False

import re
_float_regexp = re.compile(r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$")
def is_float_re(str):
    return re.match(_float_regexp, str)


def is_float_partition(element):
    partition=element.partition('.')
    # Parentheses around the whole condition allow it to span several lines.
    if ((partition[0].isdigit() and partition[1]=='.' and partition[2].isdigit())
            or (partition[0]=='' and partition[1]=='.' and partition[2].isdigit())
            or (partition[0].isdigit() and partition[1]=='.' and partition[2]=='')):
        return True
    return False

if __name__ == '__main__':
    import unittest
    import timeit

    class ConvertTests(unittest.TestCase):
        def test_re(self):
            self.failUnless(is_float_re("20e2"))

        def test_try(self):
            self.failUnless(is_float_try("20e2"))

        def test_re_perf(self):
            print
            print 're sad:', timeit.Timer('floatstr.is_float_re("12.2x")', "import floatstr").timeit()
            print 're happy:', timeit.Timer('floatstr.is_float_re("12.2")', "import floatstr").timeit()

        def test_try_perf(self):
            print
            print 'try sad:', timeit.Timer('floatstr.is_float_try("12.2x")', "import floatstr").timeit()
            print 'try happy:', timeit.Timer('floatstr.is_float_try("12.2")', "import floatstr").timeit()

        def test_partition_perf(self):
            print
            print 'partition sad:', timeit.Timer('floatstr.is_float_partition("12.2x")', "import floatstr").timeit()
            print 'partition happy:', timeit.Timer('floatstr.is_float_partition("12.2")', "import floatstr").timeit()

        def test_partition(self):
            self.failUnless(is_float_partition("20e2"))

        def test_partition2(self):
            self.failUnless(is_float_partition(".2"))

        def test_partition3(self):
            self.failIf(is_float_partition("1234x.2"))

    unittest.main()

— Jacob Gabrielson

4

Just for variety, here is another method to do it.

>>> all([i.isnumeric() for i in '1.2'.split('.',1)])
True
>>> all([i.isnumeric() for i in '2'.split('.',1)])
True
>>> all([i.isnumeric() for i in '2.f'.split('.',1)])
False

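EDIT: I'm sure it will not hold up to all cases of float though, especially when there is an exponent. To solve that it looks like this. This will return True only if val is a float and False for an int, but it is probably less performant than regex.

>>> def isfloat(val):
...     return all(c.isnumeric() or c in '.e' for c in val) and len(val.split('.')) == 2
...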
>>> isfloat('1')
False
>>> isfloat('1.2')
True
>>> isfloat('1.2e3')
True
>>> isfloat('12e3')
False

— Peter Moore

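The isnumeric function looks like a poor choice, as it returns true on various Unicode characters like fractions. The docs say: "Numeric characters include digit characters, and all characters that have the Unicode numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH"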
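To illustrate the point about isnumeric, here is a small sketch comparing the three str predicates with what float() actually accepts. `float_accepts` is a helper name invented for this example.

```python
def float_accepts(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


# "5", VULGAR FRACTION ONE FIFTH, SUPERSCRIPT TWO
for ch in ("5", "\u2155", "\u00b2"):
    print(
        ascii(ch),
        "isnumeric:", ch.isnumeric(),
        "isdigit:", ch.isdigit(),
        "isdecimal:", ch.isdecimal(),
        "float:", float_accepts(ch),
    )
```

For these three characters only isdecimal() agrees with float(): the fraction is numeric but nothing else, and the superscript passes both isnumeric() and isdigit() while float() rejects it.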

3

This regex will check for scientific floating point numbers:

^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$

However, I believe that your best bet is to use the parser in a try.
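A short sketch that runs the regex next to float() on a few strings, to show where the two disagree. The helper names `matches_regex` and `parses_as_float` are made up for this example.

```python
import re

FLOAT_RE = re.compile(
    r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$"
)


def matches_regex(text):
    return FLOAT_RE.match(text) is not None


def parses_as_float(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


for text in ["1e5", "-1.5E-3", ".5", "5.", "1.2.3", "nan", " 1.0 ", ".5e10"]:
    print(f"{text!r:10} regex={matches_regex(text)} float={parses_as_float(text)}")
```

The regex and float() agree on the first five strings, but float() also accepts "nan", " 1.0 " and ".5e10", which the regex rejects (the last one because of the \b after the fractional digits).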

— John Gietzen

2

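If you don't need to worry about scientific or other expressions of numbers, and are only working with strings that could be numbers with or without a period:

Function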

def is_float(s):
    result = False
    if s.count(".") == 1:
        if s.replace(".", "").isdigit():
            result = True
    return result

Lambda version

is_float = lambda x: x.replace('.','',1).isdigit() and "." in x

Example

if is_float(some_string):
    some_string = float(some_string)
elif some_string.isdigit():
    some_string = int(some_string)
else:
    print("Does not convert to int or float.")

This way you aren't accidentally converting what should be an int into a float.

— Kodetojoy

2

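A simplified version of the function is_digit(str), which suffices in most cases (it doesn't consider exponential notation and "NaN" values):

def is_digit(str):
    # Remove at most one '.', so that strings like '1.2.3' are rejected.
    return str.lstrip('-').replace('.', '', 1).isdigit()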

— Simhumileco

1

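I used the function already mentioned, but soon noticed that strings such as "Nan", "Inf" and their variations are considered numbers. So I propose an improved version of the function that returns false on those types of input and does not fail on "1e3" variants: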

def is_float(text):
    # check for nan/infinity etc.
    if text.isalpha():
        return False
    try:
        float(text)
        return True
    except ValueError:
        return False
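An alternative sketch for the same goal, assuming you want to reject every NaN and infinity spelling (including signed ones such as "-inf") by checking the parsed value rather than the text. `is_finite_float` is a name chosen for this example.

```python
import math


def is_finite_float(text):
    """True only if text parses as a float that is neither NaN nor infinite."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False


for sample in ["1e3", "nan", "-inf", "Infinity", "6e777777", "abc"]:
    print(repr(sample), is_finite_float(sample))
```

This also rejects strings such as "6e777777", which float() parses as infinity.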

— Mathfac

1

Can't we start with the if text.isalpha(): check right away?

— Csaba Toth

BTW I need the same: I don't want to accept NaN, Inf and stuff

— Csaba Toth

1

Try to convert to float. If there is an error, print the ValueError exception.

try:
    x = float('1.23')
    print('val=',x)
    y = float('abc')
    print('val=',y)
except ValueError as err:
    print('floatErr:',err)

Output:

val= 1.23
floatErr: could not convert string to float: 'abc'

— edW

1

When passed a dictionary as an argument, it converts the strings that can be converted to float and leaves the others as they are

def covertDict_float(data):
        for i in data:
            if data[i].split(".")[0].isdigit():
                try:
                    data[i] = float(data[i])
                except ValueError:
                    continue
        return data
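A variant sketch that returns a new dictionary instead of modifying the argument, and relies on float() alone rather than an isdigit() pre-check, so negative numbers and strings like ".5" are converted too. Note that this also converts "nan" and "inf", which the function above skips. `convert_dict_floats` is a hypothetical name.

```python
def convert_dict_floats(data):
    """Return a copy of data with every value float() accepts converted."""
    converted = {}
    for key, value in data.items():
        try:
            converted[key] = float(value)
        except (TypeError, ValueError):
            converted[key] = value  # leave anything float() rejects untouched
    return converted


row = {"price": "19.99", "qty": "-3", "name": "widget"}
print(convert_dict_floats(row))  # {'price': 19.99, 'qty': -3.0, 'name': 'widget'}
```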

— Rahul Jain

0

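I was looking for some similar code, but it looks like using try/except is the best way. Here is the code I'm using. It includes a retry if the input is invalid. I needed to check whether the input was greater than 0 and, if so, convert it to a float.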

def cleanInput(question,retry=False): 
    inputValue = input("\n\nOnly positive numbers can be entered, please re-enter the value.\n\n{}".format(question)) if retry else input(question)
    try:
        if float(inputValue) <= 0 : raise ValueError()
        else : return(float(inputValue))
    except ValueError : return(cleanInput(question,retry=True))


willbefloat = cleanInput("Give me the number: ")
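The same idea can be written as a loop instead of recursion, which avoids growing the call stack on repeated bad input. This is only a sketch: `read_positive_float` and its `read` parameter are invented for illustration, and `read` defaults to the built-in input so it can be swapped out when testing.

```python
def read_positive_float(question, read=input):
    """Keep asking until the reply parses as a float greater than zero."""
    prompt = question
    while True:
        reply = read(prompt)
        try:
            value = float(reply)
        except ValueError:
            value = None  # not a number at all
        if value is not None and value > 0:
            return value
        prompt = "Only positive numbers can be entered.\n" + question


# Example with canned replies instead of real keyboard input.
replies = iter(["abc", "-1", "2.5"])
print(read_positive_float("Give me the number: ", read=lambda _prompt: next(replies)))  # 2.5
```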

— Loki

0

def try_parse_float(item):
  # Return the parsed float, or None if item cannot be converted.
  try:
    return float(item)
  except (TypeError, ValueError, OverflowError):
    return None

— Tawanda Matereke

2

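While this code may solve the question, including an explanation of how and why it solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.

— double-beep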

0

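I tried some of the simple options above, using a try test around converting to a float, and found that there is a problem in most of the replies.

Simple test (along the lines of the answers above):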

entry = ttk.Entry(self, validate='key')
entry['validatecommand'] = (entry.register(_test_num), '%P')

def _test_num(P):
    try: 
        float(P)
        return True
    except ValueError:
        return False

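The problem comes when:

You enter '-' to start a negative number:

You are then trying float('-'), which fails

You enter a number, but then try to delete all the digits

You are then trying float(''), which likewise fails

The quick solution I had is: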

def _test_num(P):
    if P == '' or P == '-': return True
    try: 
        float(P)
        return True
    except ValueError:
        return False

— Richard

-2

str(strval).isdigit()

Seems to be simple.

Handles values stored as a string or int or float

— ks鼠

In [2]: '123,123'.isdigit() Out[2]: False

— Daniil Mashkin

1

It doesn't work for negative number literals, please fix your answer

— RandomEli

'39.1'.isdigit()

— Lad

all([x.isdigit() for x in str(VAR).strip('-').replace(',','.').split('.')]) If you are looking for a more complete implementation.

— lotrus28