Storing Python dictionaries

197

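I'm used to bringing data in and out of Python using .csv files, but there are obvious challenges to this. Any advice on simple ways to store a dictionary (or sets of dictionaries) in a json or pck file? For example: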

data = {}
data['key1'] = "keyinfo"
data['key2'] = "keyinfo2"

I would like to know how to save this, and then how to load it back in.

— mike

8

Have you read the documentation for the json or pickle standard modules?

— 2011

Related: "How to save a dictionary to a file in Python (alternative to pickle)?"

— Martin Thoma

441

Pickle save:

try:
    import cPickle as pickle
except ImportError:  # python 3.x
    import pickle

with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

See the pickle module documentation for additional information regarding the protocol argument.

Pickle load:

with open('data.p', 'rb') as fp:
    data = pickle.load(fp)

JSON save:

import json

with open('data.json', 'w') as fp:
    json.dump(data, fp)

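Supply extra arguments such as sort_keys or indent to get a pretty result. The argument sort_keys sorts the keys alphabetically, and indent=N indents your data structure with N spaces per level.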

json.dump(data, fp, sort_keys=True, indent=4)

JSON load:

with open('data.json', 'r') as fp:
    data = json.load(fp)
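
As a quick check that both approaches round-trip the dictionary from the question, here is a minimal sketch. It writes into a temporary directory so nothing is left behind; the names `from_pickle` and `from_json` are only illustrative.

```python
import json
import pickle
import tempfile
from pathlib import Path

data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = Path(tmp) / 'data.p'
    json_path = Path(tmp) / 'data.json'

    # pickle works on bytes, so open in binary mode for writing and reading
    with open(pickle_path, 'wb') as fp:
        pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)
    with open(pickle_path, 'rb') as fp:
        from_pickle = pickle.load(fp)

    # json works on text, so open in text mode for writing and reading
    with open(json_path, 'w') as fp:
        json.dump(data, fp, sort_keys=True, indent=4)
    with open(json_path, 'r') as fp:
        from_json = json.load(fp)

print(from_pickle == data)  # True
print(from_json == data)    # True
```

Note that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.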

— Marty

4

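JSON does dictionaries natively (though they obviously don't behave exactly as a Python dictionary does while in memory, for persistence purposes they are identical). In fact, the foundational unit in JSON is the "object", which is defined as {<string>: <value>}. Look familiar? The json module in the standard library supports the common Python native types (dict, list, str, int, float, bool, None) and can easily be extended, with minimal knowledge of JSON, to support user-defined classes. The JSON homepage completely defines the language in just over 3 printed pages, so it's easy to absorb quickly.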
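
To make the comment above concrete, here is a small sketch of what does and does not survive a JSON round trip, and one way to extend the encoder for a user-defined class. The `Point` class and `encode_point` helper are hypothetical names used only for illustration.

```python
import json

# Tuples come back as lists and non-string keys come back as strings.
original = {'point': (1, 2), 1: 'int key', 'flag': True, 'nothing': None}
restored = json.loads(json.dumps(original))
print(restored)


class Point:
    """A hypothetical user-defined class."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


def encode_point(obj):
    # Called by json only for objects it cannot serialize on its own.
    if isinstance(obj, Point):
        return {'x': obj.x, 'y': obj.y}
    raise TypeError('Cannot serialize %r' % type(obj).__name__)


text = json.dumps({'p': Point(1, 2)}, default=encode_point)
print(text)
```

Decoding gives back plain dictionaries, so rebuilding `Point` instances on load would need an extra step such as the `object_hook` argument of `json.loads`.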

— Jonathanb

1

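It's also worth knowing about the third argument to pickle.dump (the protocol). If the file doesn't need to be human-readable, it can speed things up a lot.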
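
A rough way to see the effect of the protocol argument is to compare the serialized sizes. This is only a sketch with made-up sample data; actual sizes and speeds depend on the data and the Python version, and in Python 3 the default protocol is already a binary one.

```python
import pickle

# Made-up sample data: 1000 keys, each mapping to a short list of ints.
data = {'key%d' % i: list(range(10)) for i in range(1000)}

ascii_based = pickle.dumps(data, protocol=0)  # oldest, text-oriented protocol
binary = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

print(len(ascii_based), len(binary))

# Both protocols restore the same dictionary.
same = pickle.loads(ascii_based) == pickle.loads(binary) == data
print(same)
```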

— Steve Jessop

11

If you add the sort_keys and indent arguments to the dump call, you get a much prettier result. For example: json.dump(data, fp, sort_keys=True, indent=4).

— juliusmh '16

1

You should probably use pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

— Martin Thoma

1

For Python 3, use import pickle

— danger89

35

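Minimal example, writing directly to a file:

import json
# text mode ('w'): in Python 3, json writes str, so 'wb' would raise a TypeError
json.dump(data, open(filename, 'w'))
data = json.load(open(filename))

Or safely opening/closing:

import json
with open(filename, 'w') as outfile:
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)

If you want to save it in a string instead of a file: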

import json
json_str = json.dumps(data)
data = json.loads(json_str)

— f

7

Also see the faster ujson package: https://pypi.python.org/pypi/ujson

import ujson
# text mode ('w'): ujson.dump writes str in Python 3
with open('data.json', 'w') as fp:
    ujson.dump(data, fp)
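
Because ujson is a third-party package, one common pattern (an assumption here, not something the answer itself proposes) is to fall back to the standard json module when ujson is not installed. This relies on both modules exposing dumps and loads for basic use.

```python
try:
    import ujson as json  # optional third-party speed-up
except ImportError:
    import json  # standard-library fallback

data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}

text = json.dumps(data)
restored = json.loads(text)
print(restored == data)  # True
```

The two libraries do not accept exactly the same keyword arguments, so it is safest to stick to plain calls when using this pattern.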

— Elliott

5

To write to a file:

import json
myfile.write(json.dumps(mydict))

To read from a file:

import json
mydict = json.loads(myfile.read())

myfile is the file object for the file in which you stored the dict.

— Rafe Kettler

Are you aware that json has functions that take files as arguments and write to them directly?

json.dump(mydict, myfile) and json.load(myfile)

— Niklas R

5

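If you're after serialization but won't need the data in other programs, I strongly recommend the shelve module. Think of it as a persistent dictionary.

import shelve

myData = shelve.open('/path/to/file')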

# check for values.
keyVar in myData

# set values
myData[anotherKey] = someValue

# save the data for future use.
myData.close()
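
For a version of the snippet above that runs as-is, here is a minimal sketch. It stores the shelf in a temporary directory, and the key names are made up for illustration. Shelf keys must be strings, while values can be any picklable object.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'shelf')

    # First session: store a couple of values under string keys.
    with shelve.open(path) as db:
        db['key1'] = 'keyinfo'
        db['nested'] = {'x': 1, 'y': [2, 3]}

    # Later session: read individual keys without loading everything.
    with shelve.open(path) as db:
        has_key1 = 'key1' in db
        nested = db['nested']
        keys = sorted(db.keys())

print(has_key1, nested, keys)
```

Using shelve.open as a context manager closes the shelf automatically, which replaces the explicit close() call.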

— g.d.d.c

2

If you want to store a whole dict, or load a whole dict, json is more convenient. shelve is only better for accessing one key at a time.

— 2011

3

If you want an alternative to pickle or json, you can use klepto.

>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache        
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump() 
>>> 
>>> # import from 'memo.py'
>>> from memo import memo
>>> print(memo)
{'y': 2, 'x': 1, 'z': 3}

With klepto, if you had used serialized=True, the dictionary would have been written to memo.pkl as a pickled dictionary instead of as clear text.

You can get klepto here: https://github.com/uqfoundation/klepto

dill is probably a better choice for pickling than pickle itself, as dill can serialize almost anything in Python. klepto can also use dill.

You can get dill here: https://github.com/uqfoundation/dill

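The additional mumbo-jumbo on the first few lines is there because klepto can be configured to store dictionaries to a file, to a directory context, or to a SQL database. The API is the same whatever you choose as the backend archive. It gives you an "archivable" dictionary on which you can use load and dump to interact with the archive.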

— Mike McKerns

3

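This is an old topic, but for completeness we should include ConfigParser and configparser, which are part of the standard library in Python 2 and 3, respectively. This module reads and writes a config/ini file and (at least in Python 3) behaves in a lot of ways like a dictionary. It has the added benefit that you can store multiple dictionaries in separate sections of your config/ini file and recall them. Sweet!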

Python 2.7.x example.

import ConfigParser

config = ConfigParser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# make each dictionary a separate section in config
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])

config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])

config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])

# save config to file
f = open('config.ini', 'w')
config.write(f)
f.close()

# read config from file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')

dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]

dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]

dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]

print(dictA)
print(dictB)
print(dictC)

Python 3.X example.

import configparser

config = configparser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# make each dictionary a separate section in config
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3

# save config to file
f = open('config.ini', 'w')
config.write(f)
f.close()

# read config from file
config2 = configparser.ConfigParser()
config2.read('config.ini')

# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'] )
dictB = dict(config2['dict2'] )
dictC = dict(config2['dict3'])

print(dictA)
print(dictB)
print(dictC)

Console output

{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}

Contents of config.ini

[dict1]
key2 = keyinfo2
key1 = keyinfo

[dict2]
k1 = hot
k2 = cross
k3 = buns

[dict3]
z = 3
y = 2
x = 1
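
As the console output above shows, the values of dict3 come back as strings. A minimal sketch of converting them back to integers, assuming Python 3 and using read_string so that no file is needed:

```python
import configparser

# Same content as the [dict3] section of config.ini above.
ini_text = """
[dict3]
x = 1
y = 2
z = 3
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

section = config['dict3']
as_strings = dict(section)                             # values stay strings
as_ints = {key: section.getint(key) for key in section}  # explicit conversion

print(as_strings)
print(as_ints)
```

The section proxy also offers getfloat and getboolean for other value types. By default, option names are lowercased when the file is read.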

— bfris

1

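If saving to a json file, the best and easiest way to do this is:

import json
# 'data' is the dictionary to save; avoid naming it 'dict', which shadows the built-in type
with open("file.json", "wb") as f:
    f.write(json.dumps(data).encode("utf-8"))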

— Adam Liu

Why is this easier than json.dump() as outlined in the other answer?

— baxx

0

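My use case was to save multiple json objects to a file, and Marty's answer helped me somewhat. But for my use case the answer was not complete, because it overwrites the old data every time a new entry is saved.

To save multiple entries in one file, you must check for the old content first (i.e., read before writing). A typical file holding json data has either a list or an object as its root. So I decided that my json file always holds a list of objects; every time I add data to it, I load the list first, append my new data, and dump it back to a write-only instance of the file (w):

import json

def saveJson(url,sc): #this function writes the 2 values to file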
    newdata = {'url':url,'sc':sc}
    json_path = "db/file.json"

    old_list= []
    with open(json_path) as myfile:  #read the contents first
        old_list = json.load(myfile)
    old_list.append(newdata)

    with open(json_path,"w") as myfile:  #overwrite the whole content
        json.dump(old_list,myfile,sort_keys=True,indent=4)

    return "success"

The new json file will look something like this:

[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]

NOTE: A file named file.json containing [] as its initial data must already exist for this approach to work.
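
One way to avoid having to create the initial file by hand is to treat a missing file as an empty list. This is a sketch under that assumption; `append_entry` is a hypothetical helper name, and the demo writes to a temporary directory rather than db/file.json.

```python
import json
import os
import tempfile


def append_entry(json_path, entry):
    """Append entry to the JSON list stored at json_path, creating the file if needed."""
    try:
        with open(json_path) as fp:
            entries = json.load(fp)
    except FileNotFoundError:
        entries = []  # first call: no file yet, start with an empty list

    entries.append(entry)

    # Rewrite the whole file with the updated list.
    with open(json_path, 'w') as fp:
        json.dump(entries, fp, sort_keys=True, indent=4)
    return entries


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file.json')
    append_entry(path, {'url': 'www.google.com', 'sc': 'a11'})
    result = append_entry(path, {'url': 'www.google.com', 'sc': 'a12'})

print(len(result))  # 2
```

A file that exists but is empty or contains invalid JSON would still raise json.JSONDecodeError, and this sketch deliberately leaves that case unhandled.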

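PS: This is not related to the original question, but the approach could be improved further by first checking whether the entry already exists (based on one or more keys) and only then appending and saving the data. Let me know if anyone needs that check and I will add it to the answer.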
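
A possible shape for the existence check mentioned in the PS, written as a guess at what is meant rather than the author's own code. `add_unique` and its `keys` parameter are hypothetical names. An entry counts as a duplicate when it matches an existing one on every listed key.

```python
def add_unique(entries, new_entry, keys=('url', 'sc')):
    """Append new_entry unless an existing entry matches it on all of keys.

    Returns True if the entry was appended, False if it was a duplicate.
    """
    for existing in entries:
        if all(existing.get(k) == new_entry.get(k) for k in keys):
            return False
    entries.append(new_entry)
    return True


entries = [{'url': 'www.google.com', 'sc': 'a11'}]
first = add_unique(entries, {'url': 'www.google.com', 'sc': 'a11'})   # duplicate
second = add_unique(entries, {'url': 'www.google.com', 'sc': 'a12'})  # new entry

print(first, second, len(entries))  # False True 2
```

The check is a linear scan, which is fine for small files. For large lists, keeping a set of key tuples alongside the list would be faster.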

— Ansh Sachdeva