Splitting a text file into short lines for reading?


10

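Is there a program that takes a plain-text file with long lines and inserts a line break after a certain number of characters (breaking only between words) to make it readable? For example, take the following: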

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vel lectus ac enim venenatis porttitor in et est. Curabitur ut eros quis risus consequat dictum a a lectus. Integer ut risus quis augue lobortis molestie vel id nibh. Aliquam sit amet mattis lorem, vel ornare felis. Donec pulvinar tempus lorem, at porta sem pretium ut. Cras ut lorem tincidunt, scelerisque nunc vitae, posuere augue. Vestibulum iaculis libero id congue ultrices. Nullam mauris ipsum, aliquet eget nisl non, venenatis euismod enim. Phasellus a eleifend velit. Aenean molestie venenatis turpis, consectetur convallis velit fringilla non.

and turn it into:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vel
lectus ac enim venenatis porttitor in et est. Curabitur ut eros quis
risus consequat dictum a a lectus. Integer ut risus quis augue lobortis
molestie vel id nibh. Aliquam sit amet mattis lorem, vel ornare felis.
Donec pulvinar tempus lorem, at porta sem pretium ut. Cras ut lorem
tincidunt, scelerisque nunc vitae, posuere augue. Vestibulum iaculis
libero id congue ultrices. Nullam mauris ipsum, aliquet eget nisl non,
venenatis euismod enim. Phasellus a eleifend velit. Aenean molestie
venenatis turpis, consectetur convallis velit fringilla non.

Answers:


16

I think the command you are looking for is called fmt.

$ fmt loremipsum.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vel
lectus ac enim venenatis porttitor in et est. Curabitur ut eros quis risus
consequat dictum a a lectus. Integer ut risus quis augue lobortis molestie
vel id nibh. Aliquam sit amet mattis lorem, vel ornare felis. Donec
pulvinar tempus lorem, at porta sem pretium ut. Cras ut lorem tincidunt,
scelerisque nunc vitae, posuere augue. Vestibulum iaculis libero id congue
ultrices. Nullam mauris ipsum, aliquet eget nisl non, venenatis euismod
enim. Phasellus a eleifend velit. Aenean molestie venenatis turpis,
consectetur convallis velit fringilla non.

You can control the result, for example the width, through its options:

$ fmt --help
Usage: fmt [-WIDTH] [OPTION]... [FILE]...
Reformat each paragraph in the FILE(s), writing to standard output.
The option -WIDTH is an abbreviated form of --width=DIGITS.

Mandatory arguments to long options are mandatory for short options too.
  -c, --crown-margin        preserve indentation of first two lines
  -p, --prefix=STRING       reformat only lines beginning with STRING,
                              reattaching the prefix to reformatted lines
  -s, --split-only          split long lines, but do not refill
  -t, --tagged-paragraph    indentation of first line different from second
  -u, --uniform-spacing     one space between words, two after sentences
  -w, --width=WIDTH         maximum line width (default of 75 columns)
      --help     display this help and exit
      --version  output version information and exit

With no FILE, or when FILE is -, read standard input.
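To make the idea of "break only between words, up to a maximum width" concrete, here is a minimal Python sketch of a greedy word wrap. It is an illustration only, not fmt's actual algorithm: fmt may choose different break points, which is why its output above does not match the question's example line for line. The function name `greedy_wrap` and the sample text are made up for this example; the default width of 75 mirrors the default shown in the help text.

```python
def greedy_wrap(text, width=75):
    """Fill each line with as many whole words as fit within `width` columns."""
    lines = []
    current = ""
    for word in text.split():
        if not current:
            # First word on a line is always accepted, even if it is too long.
            current = word
        elif len(current) + 1 + len(word) <= width:
            # The word still fits after a single separating space.
            current += " " + word
        else:
            # The word does not fit: close the line and start a new one.
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return "\n".join(lines)


# Hypothetical sample input, wrapped at 30 columns.
sample = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " * 3
print(greedy_wrap(sample, width=30))
```

A word longer than the width is left on a line of its own rather than being cut, which matches the "only between words" requirement in the question.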

5

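The best option is probably a text editor. Most of them offer some form of text wrapping.

If you are looking for something simpler, you could put something together with sed or similar. Put your long line into loremipsum.txt, then have sed match 56 to 73 characters followed by a space and insert a newline after the match, and you get the desired result...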

$ sed -r -e 's/.{56,73} /&\n/g' loremipsum.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vel 
lectus ac enim venenatis porttitor in et est. Curabitur ut eros quis 
risus consequat dictum a a lectus. Integer ut risus quis augue lobortis 
molestie vel id nibh. Aliquam sit amet mattis lorem, vel ornare felis. 
Donec pulvinar tempus lorem, at porta sem pretium ut. Cras ut lorem 
tincidunt, scelerisque nunc vitae, posuere augue. Vestibulum iaculis 
libero id congue ultrices. Nullam mauris ipsum, aliquet eget nisl non, 
venenatis euismod enim. Phasellus a eleifend velit. Aenean molestie 
venenatis turpis, consectetur convallis velit fringilla non.

...or you could use fold -s -w 74 loremipsum.txt, I guess...
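For readers unsure why the sed pattern breaks at word boundaries, the same substitution can be sketched in Python. The quantifier is greedy, so it grabs up to 73 characters and then backs off until the next character is a space; the newline is appended after that space, which is why each output line above ends in a trailing space. The function name `sed_style_wrap` and its `low`/`high` parameters are illustrative, not part of any tool.

```python
import re


def sed_style_wrap(text, low=56, high=73):
    """Mimic s/.{low,high} /&\\n/g: append a newline after each match."""
    pattern = re.compile(r".{%d,%d} " % (low, high))
    # m.group(0) plays the role of sed's "&" (the whole matched text).
    return pattern.sub(lambda m: m.group(0) + "\n", text)


# Hypothetical sample input with smaller bounds so the effect is easy to see.
sample = "word " * 40
print(sed_style_wrap(sample, low=10, high=14))
```

One limitation of this approach: if no space falls between `low` and `high` characters from the current position (for instance, around a very long word), the match shifts and lines can come out longer than intended.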


3

You can pipe the text through fold -s -w 72 to get that result.

If your system does not have fold but does have python installed, you can do the following:

cat /var/tmp/li.txt | python -c "import sys; from textwrap import fill; print(fill(sys.stdin.read(), width=72))"
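One caveat with the one-liner: textwrap's fill treats its whole input as a single paragraph, so blank lines between paragraphs are collapsed. A minimal sketch that wraps each paragraph separately might look like the following; the function name `wrap_paragraphs` and the sample text are made up for illustration, and it assumes paragraphs are separated by blank lines.

```python
import re
import textwrap


def wrap_paragraphs(text, width=72):
    """Wrap each blank-line-separated paragraph on its own, keeping the breaks."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


# Hypothetical two-paragraph sample, wrapped at 40 columns.
sample = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vel lectus.\n"
    "\n"
    "Integer ut risus quis augue lobortis molestie vel id nibh."
)
print(wrap_paragraphs(sample, width=40))
```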