There are many ways to read XML, both all-at-once (DOM) and one-bit-at-a-time (SAX). I have used SAX or lxml to iteratively read large XML files (e.g. wikipedia dump which is 6.5GB compressed).

有许多方法可以读取XML,包括一次性(DOM)和一次一位(SAX)。我使用SAX或lxml迭代读取大型XML文件(例如压缩6.5GB的wikipedia转储)。

However after doing some iterative processing (in python using ElementTree) of that XML file, I want to write out the (new) XML data to another file.

但是,在对该XML文件进行一些迭代处理(使用ElementTree的python)之后,我想将(新)XML数据写出到另一个文件。

Are there any libraries to iteratively write out XML data? I could create the XML tree and then write it out, but that is not possible without oodles of ram. Is there anyway to write the XML tree to a file iteratively? One bit at a time?

是否有任何库可以迭代地写出XML数据?我可以创建XML树,然后将其写出来,但如果没有大量的ram,这是不可能的。反正有迭代地将XML树写入文件吗?一次一点?

I know I could generate the XML myself with print "<%s>" % tag_name, etc., but that seems a bit... hacky.

我知道我可以使用print“<%s>”%tag_name等自己生成XML,但这看起来有点...... hacky。

4 个解决方案

#1


4

Fredrik Lundh's elementtree.SimpleXMLWriter will let you write out XML incrementally. Here's the demo code embedded in the module:

Fredrik Lundh的elementtree.SimpleXMLWriter将允许您逐步写出XML。这是模块中嵌入的演示代码:

from elementtree.SimpleXMLWriter import XMLWriter
import sys

w = XMLWriter(sys.stdout)

html = w.start("html")

w.start("head")
w.element("title", "my document")
w.element("meta", name="generator", value="my application 1.0")
w.end()

w.start("body")
w.element("h1", "this is a heading")
w.element("p", "this is a paragraph")

w.start("p")
w.data("this is ")
w.element("b", "bold")
w.data(" and ")
w.element("i", "italic")
w.data(".")
w.end("p")

w.close(html)

更多相关文章

  1. pytorch中tensor数据和numpy数据转换中注意的一个问题
  2. 使用自定义qemu二进制文件与libvirt失败?
  3. python处理数据,存进hive表
  4. 【python coding 1:网络检测】ping本地文件里的ip地址
  5. 如何输出NLTK块到文件?
  6. python 读写文本文件
  7. 批量重命名文件——python实现
  8. python 读写json数据
  9. python编程之一:使用网格索引算法进行空间数据查询

随机推荐

  1. [置顶] Python + C/C++ 嵌入式编
  2. 零基础Python教程:如何实现PCA算法
  3. 解除装饰器作用(python3新增)
  4. django模板引擎有错误检查?
  5. Python定义函数时,不同参数类型的传递
  6. Python学习之==>json处理
  7. 28.mysql数据库之查询
  8. Python编程之解释器
  9. python - pandas或者sklearn中如何将字符
  10. Python随心记--进程、线程