
Creating a new corpus with NLTK

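I thought the answer to the title question would usually be "read the documentation", but I went through the NLTK book and it does not give the answer. I am new to Python.

I have a bunch of .txt files, and I want to use on them the corpus functions that NLTK provides for the corpora in nltk_data. I tried PlaintextCorpusReader, but I could not get further than:

    >>> import nltk
    >>> from nltk.corpus import PlaintextCorpusReader
    >>> corpus_root = './'
    >>> newcorpus = PlaintextCorpusReader(corpus_root, '.*')
    >>> newcorpus.words()

How can I split newcorpus into sentences with punkt? I tried the punkt functions, but they could not read the PlaintextCorpusReader class. Could you also show me how to write the segmented data to text files?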

83 python nlp nltk corpus
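
One possible way to approach the question above, as a minimal sketch rather than a definitive answer: PlaintextCorpusReader already exposes a sents() method that applies its sentence tokenizer (Punkt by default), so the punkt functions do not need to read the reader object directly. The helper names load_sentences and write_sentences and the .txt pattern are assumptions made for illustration; the sketch also assumes NLTK is installed and its Punkt model data has been downloaded.

```python
def load_sentences(corpus_root, pattern=r".*\.txt"):
    """Return {fileid: tokenized sentences} for every file matching pattern."""
    # Imported here so write_sentences stays usable without NLTK installed.
    from nltk.corpus import PlaintextCorpusReader

    reader = PlaintextCorpusReader(corpus_root, pattern)
    # sents() runs the reader's sentence tokenizer (Punkt by default) and
    # yields each sentence as a list of word tokens.
    return {fid: reader.sents(fid) for fid in reader.fileids()}


def write_sentences(sentences, out_path):
    """Write one space-joined sentence per line; return how many were written."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as handle:
        for tokens in sentences:
            handle.write(" ".join(tokens) + "\n")
            count += 1
    return count
```

A call such as write_sentences(load_sentences("./")["notes.txt"], "notes.sents") would then segment one file and save the result, where notes.txt is a hypothetical file name. Restricting the pattern to .txt files keeps scripts and other files in the same directory out of the corpus.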

How can I change the default MySQL connection timeout when connecting through Python?

I connect to a MySQL database from Python with con = _mysql.connect('localhost', 'dell-pc', '', 'test'). The program I wrote takes a long time to run to completion, around 10 hours. I am actually trying to read the distinct words from a corpus. After the reading finished, I got a timeout error. I checked the MySQL default timeouts, which are:

+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 10       |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 50       |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 30       |
…

70 python mysql corpus
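
A hedged sketch of one way to raise the timeouts for a long-running job, assuming the error comes from the server closing a connection that sat idle while the corpus was being read. The helper names timeout_statements and apply_timeouts are hypothetical, and the sketch assumes a DB-API 2.0 connection (for example one returned by MySQLdb.connect) rather than the low-level _mysql object used in the question. The variables wait_timeout, net_read_timeout and net_write_timeout are MySQL system variables that can be set per session.

```python
def timeout_statements(seconds):
    """Build SET SESSION statements that raise idle and network timeouts."""
    seconds = int(seconds)  # reject non-numeric input before it reaches SQL
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    variables = ("wait_timeout", "net_read_timeout", "net_write_timeout")
    return [f"SET SESSION {name} = {seconds}" for name in variables]


def apply_timeouts(connection, seconds):
    """Run the statements on a DB-API 2.0 connection for this session only."""
    cursor = connection.cursor()
    try:
        for statement in timeout_statements(seconds):
            cursor.execute(statement)
    finally:
        cursor.close()
```

Calling apply_timeouts(con, 36000) right after connecting would cover a run of about 10 hours. Session-level settings affect only the current connection, so other clients of the same server keep the defaults listed above.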

Questions tagged «corpus»