Questions tagged «lxml»

lxml is a full-featured, high-performance Python library for processing XML and HTML.

11
How to install lxml on Ubuntu
I am having difficulty installing lxml with easy_install on Ubuntu 11. When I type $ easy_install lxml I get: Searching for lxml Reading http://pypi.python.org/simple/lxml/ Reading http://codespeak.net/lxml Best match: lxml 2.3 Downloading http://lxml.de/files/lxml-2.3.tgz Processing lxml-2.3.tgz Running lxml-2.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-7UdQOZ/lxml-2.3/egg-dist-tmp-GacQGy Building lxml version 2.3. Building without Cython. ERROR: /bin/sh: xslt-config: not found ** make sure the development packages of libxml2 and libxslt are installed …
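
The log above stops at `xslt-config: not found`, and the build output itself points to missing libxml2 and libxslt development packages. A minimal sketch, using only the standard library, for checking whether the helper tools are on `PATH` before retrying the install. The function name `find_build_tools` is made up for illustration, and `xml2-config` is included as the libxml2 counterpart of `xslt-config`:

```python
import shutil


def find_build_tools(names=("xml2-config", "xslt-config")):
    """Map each helper tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_build_tools().items():
        print(f"{tool}: {path or 'not found'}")
```

If either tool is reported as not found, the source build is likely to fail in the same way as in the question.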

28
libxml install error using pip
This is my error: (mysite)zjm1126@zjm1126-G41MT-S2:~/zjm_test/mysite$ pip install lxml Downloading/unpacking lxml Running setup.py egg_info for package lxml Building lxml version 2.3. Building without Cython. ERROR: /bin/sh: xslt-config: not found ** make sure the development packages of libxml2 and libxslt are installed ** Using build configuration of libxslt Installing collected packages: lxml Running setup.py install …
269 python  lxml  pip 

23
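Cannot install Lxml on Mac OS X 10.9
I want to install Lxml so that I can then install Scrapy. When I updated my Mac today it would not let me reinstall lxml, and I get the following error: In file included from src/lxml/lxml.etree.c:314: /private/tmp/pip_build_root/lxml/src/lxml/includes/etree_defs.h:9:10: fatal error: 'libxml/xmlversion.h' file not found #include "libxml/xmlversion.h" ^ 1 error generated. error: command 'cc' failed with exit status 1 I tried installing libxml2 and libxslt with brew; both installed fine, but I still cannot install lxml. Last time I installed it, I needed to enable the developer tools in Xcode, but since the update to Xcode 5 that option is no longer offered. Does anybody know what I need to do?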
234 python  xcode  macos  scrapy  lxml 

12
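bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
... soup = BeautifulSoup(html, "lxml") File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__ % ",".join(features)) bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library? The above is the output on my terminal. I am on Mac OS 10.7.x. I have Python 2.7.1 and followed this tutorial to get Beautiful Soup and lxml, which both installed successfully and work with a separate test file located here. In the Python script that causes this error, I have included this line: from pageCrawler import comparePages And in the pageCrawler file I have included the following two lines: from bs4 import BeautifulSoup from urllib2 import …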
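
The error above means Beautiful Soup could not load lxml in the interpreter that ran the script. One possible cause, not confirmed by the excerpt, is that lxml was installed for a different Python than the one executing the script. A small standard-library sketch for checking this and falling back to the built-in `html.parser` feature name; the helper name `choose_parser` is hypothetical:

```python
import importlib.util
import sys


def choose_parser(preferred="lxml", fallback="html.parser"):
    """Return `preferred` if this interpreter can import it, else `fallback`."""
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


if __name__ == "__main__":
    # Shows which Python is running, so it can be compared with the one pip used.
    print("interpreter:", sys.executable)
    print("parser:", choose_parser())
```

The returned string could then be passed as the second argument to `BeautifulSoup(html, ...)`.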

2
builtins.TypeError: must be str, not bytes
I have converted my scripts from Python 2.7 to 3.2, and I have a bug. # -*- coding: utf-8 -*- import time from datetime import date from lxml import etree from collections import OrderedDict # Create the root element page = etree.Element('results') # Make a new document tree doc = etree.ElementTree(page) # Add the subelements pageElement = etree.SubElement(page, 'Country',Tim = 'Now', name='Germany', AnotherParameter …
219 python  python-3.x  lxml 
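
The script above is cut off before the failing line, so the exact cause is not visible. A common source of this message on Python 3 is that `etree.tostring` returns `bytes` by default, which a text-mode file rejects. A hedged sketch of the two consistent pairings, using in-memory buffers as stand-ins for files opened with `"wb"` and `"w"`:

```python
import io

from lxml import etree

page = etree.Element("results")
etree.SubElement(page, "Country", name="Germany")

as_bytes = etree.tostring(page)                     # bytes on Python 3
as_text = etree.tostring(page, encoding="unicode")  # str

# bytes belong in a binary-mode target, like open(path, "wb")
binary_target = io.BytesIO()
binary_target.write(as_bytes)

# str belongs in a text-mode target, like open(path, "w")
text_target = io.StringIO()
text_target.write(as_text)
```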

2
How to select following sibling/xml tag using xpath
I have an HTML file (from Newegg) whose HTML is organized as shown below. All of the data in the specifications table is "desc", while the title of each section is "name". Below are two examples of data from Newegg pages. <tr> <td class="name">Brand</td> <td class="desc">Intel</td> </tr> <tr> <td class="name">Series</td> <td class="desc">Core i5</td> </tr> <tr> <td class="name">Cores</td> <td class="desc">4</td> </tr> <tr> <td class="name">Socket</td> <td class="desc">LGA 1156</td> <tr> <td class="name">Brand</td> <td class="desc">AMD</td> </tr> <tr> <td class="name">Series</td> <td class="desc">Phenom II X4</td> </tr> <tr> <td class="name">Cores</td> <td class="desc">4</td> </tr> <tr> …
102 xml  xpath  lxml 
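
For the table layout described above, the XPath `following-sibling` axis can step from a "name" cell to the "desc" cell in the same row. A minimal sketch with lxml, assuming a shortened copy of the markup; the helper name `spec_value` is made up for illustration:

```python
from lxml import html

snippet = """
<table>
  <tr><td class="name">Brand</td><td class="desc">Intel</td></tr>
  <tr><td class="name">Series</td><td class="desc">Core i5</td></tr>
</table>
"""

root = html.fromstring(snippet)


def spec_value(root, label):
    """Return the text of the 'desc' cell that follows the 'name' cell `label`."""
    cells = root.xpath(
        '//td[@class="name"][text()=$label]'
        '/following-sibling::td[@class="desc"]/text()',
        label=label,  # passed as an XPath variable, so no string formatting is needed
    )
    return cells[0] if cells else None


print(spec_value(root, "Brand"))
print(spec_value(root, "Series"))
```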

5
src/lxml/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory
I am running the following command to install the packages listed in that file: "pip install -r requirements.txt --download-cache=~/tmp/pip-cache". requirements.txt contains the following: # Data formats # ------------ PIL==1.1.7 # html5lib==0.90 httplib2==0.7.4 lxml==2.3.1 # Documentation # ------------- Sphinx==1.1 docutils==0.8.1 # Testing # ------- behave==1.1.0 dingus==0.3.2 django-testscenarios==0.7.2 mechanize==0.2.5 mock==0.7.2 testscenarios==0.2 testtools==0.9.14 wsgi_intercept==0.5.1 While installing the "lxml" package, I get the following error: Requirement already satisfied (use --upgrade to upgrade): django-testproject>=0.1.1 in /usr/lib/python2.7/site-packages/django_testproject-0.1.1-py2.7.egg (from django-testscenarios==0.7.2->-r requirements.txt …
99 python-2.7  lxml  pip 

7
Installing lxml module in python
While running a Python script, I get this error: from lxml import etree ImportError: No module named lxml Now I tried to install lxml with sudo easy_install lmxl but that gives me the following error: Building lxml version 2.3.beta1. NOTE: Trying to build without Cython, pre-generated 'src/lxml/lxml.etree.c' needs to be available. ERROR: /bin/sh: xslt-config: not found ** make sure the development packages of libxml2 and libxslt are installed ** Using build configuration of libxslt …

5
How to remove an element in lxml
I need to completely remove elements, based on the contents of an attribute, using Python's lxml. Example: import lxml.etree as et xml=""" <groceries> <fruit state="rotten">apple</fruit> <fruit state="fresh">pear</fruit> <fruit state="fresh">starfruit</fruit> <fruit state="rotten">mango</fruit> <fruit state="fresh">peach</fruit> </groceries> """ tree=et.fromstring(xml) for bad in tree.xpath("//fruit[@state=\'rotten\']"): #remove this element from the tree print et.tostring(tree, pretty_print=True) I would like this to print: <groceries> <fruit state="fresh">pear</fruit> <fruit state="fresh">starfruit</fruit> <fruit state="fresh">peach</fruit> </groceries> Is there a way to do this without storing a temporary variable and printing it out manually, like this: newxml="<groceries>\n" for elt in tree.xpath('//fruit[@state=\'fresh\']'): newxml+=et.tostring(elt) newxml+="</groceries>"
84 python  xml  lxml 
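
A minimal Python 3 sketch of one way to fill in the `#remove this element from the tree` step from the question above: lxml elements know their parent, so each match can be removed through `getparent()`. Note that `remove()` also drops any tail text that follows the removed element.

```python
from lxml import etree

xml = """<groceries>
  <fruit state="rotten">apple</fruit>
  <fruit state="fresh">pear</fruit>
  <fruit state="fresh">starfruit</fruit>
  <fruit state="rotten">mango</fruit>
  <fruit state="fresh">peach</fruit>
</groceries>"""

tree = etree.fromstring(xml)

# xpath() returns a plain list, so removing elements while looping over it is safe.
for bad in tree.xpath("//fruit[@state='rotten']"):
    bad.getparent().remove(bad)

remaining = [fruit.text for fruit in tree.xpath("//fruit")]
print(remaining)
print(etree.tostring(tree, encoding="unicode"))
```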

15
Get all text inside a tag in lxml
I would like to write a code snippet that grabs all of the text inside the <content> tag, in lxml, in all three instances below, including the code tags. I have tried tostring(getchildren()), but that misses the text between the tags. I did not have much luck searching the API for a relevant function. Could you help me out? <!--1--> <content> <div>Text inside tag</div> </content> #should return "<div>Text inside tag</div> <!--2--> <content> Text with no tag </content> #should return "Text with no tag" <!--3--> <content> Text outside tag <div>Text inside tag</div> </content> #should return "Text outside tag <div>Text inside tag</div>"
75 python  parsing  lxml 

9
How to pretty-print HTML to a file, with indentation
I am using lxml.html to generate some HTML. I want to pretty-print (with indentation) my final result into an HTML file. How do I do that? This is what I have tried and got so far (I am relatively new to Python and lxml): import lxml.html as lh from lxml.html import builder as E sliderRoot=lh.Element("div", E.CLASS("scroll"), style="overflow-x: hidden; overflow-y: hidden;") scrollContainer=lh.Element("div", E.CLASS("scrollContainer"), style="width: 4340px;") sliderRoot.append(scrollContainer) print lh.tostring(sliderRoot, pretty_print = True, method="html") As you can see, I am passing pretty_print=True. I thought that would indent the code, but it does not really help. This is the output: <div style="overflow-x: hidden; overflow-y: hidden;" class="scroll"><div style="width: 4340px;" class="scrollContainer"></div></div>
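
A hedged sketch of one possible approach to the question above, assuming lxml 4.5 or later, where `etree.indent` is available: it inserts indentation whitespace into the tree itself, so the serializer emits it regardless of `pretty_print`. The function name `write_pretty_html` is made up for illustration. Adding whitespace can change how inline HTML elements render, so this suits block-level markup like the `div`s in the question.

```python
import lxml.html as lh
from lxml import etree

slider_root = lh.Element("div", style="overflow-x: hidden; overflow-y: hidden;")
slider_root.set("class", "scroll")
scroll_container = lh.Element("div", style="width: 4340px;")
scroll_container.set("class", "scrollContainer")
slider_root.append(scroll_container)

# Rewrites text/tail whitespace in place so nested elements sit on their own lines.
etree.indent(slider_root, space="  ")
pretty = lh.tostring(slider_root, method="html", encoding="unicode")
print(pretty)


def write_pretty_html(element, path):
    """Serialize `element` as HTML text and write it to `path` as UTF-8."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(lh.tostring(element, method="html", encoding="unicode"))
```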

5
lxml installation error ubuntu 14.04 (internal compiler error)
I am having problems installing lxml. I have tried the solutions to the related questions on this site and others, but could not fix the problem. I need some suggestions/solutions. Here is the full log after executing pip install lxml: Downloading/unpacking lxml Downloading lxml-3.3.5.tar.gz (3.5MB): 3.5MB downloaded Running setup.py (path:/tmp/pip_build_root/lxml/setup.py) egg_info for package lxml /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url' warnings.warn(msg) Building lxml version 3.3.5. Building without Cython. Using build configuration of libxslt 1.1.28 warning: no previously-included files found matching '*.py' Installing collected packages: lxml Running …
Licensed under cc by-sa 3.0 with attribution required.