Converting line endings for a whole directory tree (Git)


162

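The situation:

I work on a Mac running OS X and recently joined a project whose members have all used Windows so far. One of my first tasks was to set up the codebase in a Git repository, so I pulled the directory tree from FTP and tried to check it into the Git repository I had prepared locally. When I tried to do so, all I got was this: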

fatal: CRLF would be replaced by LF in blog/license.txt.

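Since this affects all files below the "blog" folder, I am looking for a convenient way to convert all files in the tree to Unix line endings. Is there a tool that does this out of the box, or do I have to script something myself?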

For reference, my Git configuration concerning line endings:

core.safecrlf=true
core.autocrlf=input

Answers:


268

dos2unix does that for you. It is a fairly straightforward process.
dos2unix filename

Thanks to toolbear, here is a one-liner that recursively replaces line endings and properly handles whitespace, quotes, and shell metacharacters.

find . -type f -exec dos2unix {} \;

If you are using dos2unix 6.0, binary files will be ignored.


8
find blog -type f | xargs dos2unix should be faster. Don't use -name *.* with either one unless you only want files that have a dot in their names. That is a Windows-ism, not a *nix-ism.
没用

15
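find piped to xargs will fail if find matches any files with spaces, quotes, or other shell metacharacters in their path. At the very least use find blog -type f -print0 | xargs -0 dos2unix to handle the spaces case. You have to use find -exec instead of a pipe to avoid problems with quotes and the like. The dos2unix man page does not specify what happens if you invoke it on binary files. If it converts CRLFs in binary files, it will corrupt them. See my answer for a safer alternative.
toolbear, 2011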
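
The null-delimited form mentioned in this comment can also serve as a dry run before anything is converted. The following is a minimal sketch and not part of the original thread: the function name list_crlf_files is made up for illustration, and it assumes a find and xargs that support -print0 and -0 (GNU and BSD both do) and a grep that supports -I to skip binary files. With -0, xargs takes every character of a file name literally, so spaces and quotes in paths are passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print every non-binary file
# under a directory that has at least one line ending in CR, i.e. CRLF.
list_crlf_files() {
    dir=${1:-.}
    cr=$(printf '\r')   # a literal carriage return, portable across shells
    # -print0 / -0 pass file names NUL-delimited, so spaces and quotes in
    # paths are neither split nor interpreted by xargs.
    # grep -l prints only the names of matching files;
    # grep -I treats binary files as non-matching.
    find "$dir" -type f -print0 | xargs -0 grep -lI "$cr\$"
}
```

Running list_crlf_files blog first shows which files a conversion would touch, without modifying anything.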

1
@lukmdo, that is not true of the version installed on CentOS 6.4... it certainly did corrupt them... instead I had to download it from here: rpmfind.net/linux/rpm2html/search.php?query=dos2unix
Kerridge0

Addendum: the dos2unix CLI is most easily installed through Homebrew (not npm).
2015

2
How would one ignore directories with this method, if that is possible?
datatype_void

50

Assuming you have GNU grep and perl, this will recursively convert CRLF to LF in non-binary files under the current directory:

find . -type f -exec grep -qIP '\r\n' {} ';' -exec perl -pi -e 's/\r\n/\n/g' {} '+'

How it works

Find recursively under the current directory; change . to blog or whatever subdirectory you like to limit the replacement:

find .

Match only regular files:

  -type f

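Test whether the file contains CRLF, and exclude binary files. This runs the grep command once for every regular file; that is the price of excluding binaries. If you have an older grep, you could try building the test with the file command instead: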

  -exec grep -qIP '\r\n' {} ';'

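Replace CRLF with LF. The '+' on the second -exec tells find to accumulate the matching files and pass them to one (or as few as possible) invocations of the command, much like piping to xargs, but without problems if file paths contain spaces, quotes, or other shell metacharacters. The i in -pi tells perl to modify the file in place. You could use sed or awk here with some work, but you would probably have to change '+' to ';' and invoke a separate process for each match: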

  -exec perl -pi -e 's/\r\n/\n/g' {} '+'

6
In case it helps anyone: grep -qIP '\r\n' never matched anything on my CentOS system. Changing it to grep -qIP '\r$' worked.
史蒂夫·奥纳拉托

I hate to ask in a comment, but is there a way to exclude folders such as node_modules?
datatype_void

1
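@datatype_void See stackoverflow.com/questions/4210042/… for how to modify the find portion of the command to exclude directories. They suggest -path, but you can also use -regex or -iregex, e.g. -not -regex '.*/node_modules/.*', which will exclude a node_modules directory at any depth.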
toolbear

Sorry if I come across as a regex/bash noob, but what about multiple exclusions, say node_modules and dist for example?
datatype_void
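
For several exclusions at once, one common approach is to group the directory names and prune them. The sketch below is illustrative only and was not posted in the thread: the function name find_skipping_dirs is hypothetical, and the pruned names (node_modules, dist, .git) are just examples. It only lists the files that survive the exclusions; the -print at the end could be swapped for the two -exec actions from the answer.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory while skipping
# several directory names at any depth. The pruned names are examples;
# add or remove -name tests inside the parentheses as needed.
find_skipping_dirs() {
    dir=${1:-.}
    # Left of -o: a directory with one of the listed names is pruned,
    # so find never descends into it.
    # Right of -o: everything else that is a regular file is printed.
    find "$dir" \
        -type d \( -name node_modules -o -name dist -o -name .git \) -prune \
        -o -type f -print
}
```

Unlike a -not -regex filter, -prune stops find from walking into the excluded directories at all, which matters for large trees such as node_modules.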

The -P flag requires GNU grep. OS X switched from GNU grep to BSD grep. Some alternatives for OS X: stackoverflow.com/questions/16658333/…
toolbear

28

Here is a better option: Swiss File Knife. It works recursively across subdirectories and properly handles spaces and special characters.

All you have to do is:

sfk remcr -dir your_project_directory

Bonus: sfk also does a lot of other conversions. See the full list below:

SFK - The Swiss File Knife File Tree Processor.
Release 1.6.7 Base Revision 2 of May  3 2013.
StahlWorks Technologies, http://stahlworks.com/
Distributed for free under the BSD License, without any warranty.

type "sfk commandname" for help on any of the following.
some commands require to add "-help" for the help text.

   file system
      sfk list       - list directory tree contents.
                       list latest, oldest or biggest files.
                       list directory differences.
                       list zip jar tar gz bz2 contents.
      sfk filefind   - find files by filename
      sfk treesize   - show directory size statistics
      sfk copy       - copy directory trees additively
      sfk sync       - mirror tree content with deletion
      sfk partcopy   - copy part from a file into another one
      sfk mkdir      - create directory tree
      sfk delete     - delete files and folders
      sfk deltree    - delete whole directory tree
      sfk deblank    - remove blanks in filenames
      sfk space [-h] - tell total and free size of volume
      sfk filetime   - tell times of a file
      sfk touch      - change times of a file

   conversion
      sfk lf-to-crlf - convert from LF to CRLF line endings
      sfk crlf-to-lf - convert from CRLF to LF line endings
      sfk detab      - convert TAB characters to spaces
      sfk entab      - convert groups of spaces to TAB chars
      sfk scantab    - list files containing TAB characters
      sfk split      - split large files into smaller ones
      sfk join       - join small files into a large one
      sfk hexdump    - create hexdump from a binary file
      sfk hextobin   - convert hex data to binary
      sfk hex        - convert decimal number(s) to hex
      sfk dec        - convert hex number(s) to decimal
      sfk chars      - print chars for a list of codes
      sfk bin-to-src - convert binary to source code

   text processing
      sfk filter     - search, filter and replace text data
      sfk addhead    - insert string at start of text lines
      sfk addtail    - append string at end of text lines
      sfk patch      - change text files through a script
      sfk snapto     - join many text files into one file
      sfk joinlines  - join text lines split by email reformatting
      sfk inst       - instrument c++ sourcecode with tracing calls
      sfk replace    - replace words in binary and text files
      sfk hexfind    - find words in binary files, showing hexdump
      sfk run        - run command on all files of a folder
      sfk runloop    - run a command n times in a loop
      sfk printloop  - print some text many times
      sfk strings    - extract strings from a binary file
      sfk sort       - sort text lines produced by another command
      sfk count      - count text lines, filter identical lines
      sfk head       - print first lines of a file
      sfk tail       - print last lines of a file
      sfk linelen    - tell length of string(s)

   search and compare
      sfk find       - find words in binary files, showing text
      sfk md5gento   - create list of md5 checksums over files
      sfk md5check   - verify list of md5 checksums over files
      sfk md5        - calc md5 over a file, compare two files
      sfk pathfind   - search PATH for location of a command
      sfk reflist    - list fuzzy references between files
      sfk deplist    - list fuzzy dependencies between files
      sfk dupfind    - find duplicate files by content

   networking
      sfk httpserv   - run an instant HTTP server.
                       type "sfk httpserv -help" for help.
      sfk ftpserv    - run an instant FTP server
                       type "sfk ftpserv -help" for help.
      sfk ftp        - instant anonymous FTP client
      sfk wget       - download HTTP file from the web
      sfk webrequest - send HTTP request to a server
      sfk tcpdump    - print TCP conversation between programs
      sfk udpdump    - print incoming UDP requests
      sfk udpsend    - send UDP requests
      sfk ip         - tell own machine's IP address(es).
                       type "sfk ip -help" for help.
      sfk netlog     - send text outputs to network,
                       and/or file, and/or terminal

   scripting
      sfk script     - run many sfk commands in a script file
      sfk echo       - print (coloured) text to terminal
      sfk color      - change text color of terminal
      sfk alias      - create command from other commands
      sfk mkcd       - create command to reenter directory
      sfk sleep      - delay execution for milliseconds
      sfk pause      - wait for user input
      sfk label      - define starting point for a script
      sfk tee        - split command output in two streams
      sfk tofile     - save command output to a file
      sfk toterm     - flush command output to terminal
      sfk loop       - repeat execution of a command chain
      sfk cd         - change directory within a script
      sfk getcwd     - print the current working directory
      sfk require    - compare version text

   development
      sfk bin-to-src - convert binary data to source code
      sfk make-random-file - create file with random data
      sfk fuzz       - change file at random, for testing
      sfk sample     - print example code for programming
      sfk inst       - instrument c++ with tracing calls

   diverse
      sfk media      - cut video and binary files
      sfk view       - show results in a GUI tool
      sfk toclip     - copy command output to clipboard
      sfk fromclip   - read text from clipboard
      sfk list       - show directory tree contents
      sfk env        - search environment variables
      sfk version    - show version of a binary file
      sfk ascii      - list ISO 8859-1 ASCII characters
      sfk ascii -dos - list OEM codepage 850 characters
      sfk license    - print the SFK license text

   help by subject
      sfk help select   - how dirs and files are selected in sfk
      sfk help options  - general options reference
      sfk help patterns - wildcards and text patterns within sfk
      sfk help chain    - how to combine (chain) multiple commands
      sfk help shell    - how to optimize the windows command prompt
      sfk help unicode  - about unicode file reading support
      sfk help colors   - how to change result colors
      sfk help xe       - for infos on sfk extended edition.

   All tree walking commands support file selection this way:

   1. short format with ONE directory tree and MANY file name patterns:
      src1dir .cpp .hpp .xml bigbar !footmp
   2. short format with a list of explicite file names:
      letter1.txt revenues9.xls report3\turnover5.ppt
   3. long format with MANY dir trees and file masks PER dir tree:
      -dir src1 src2 !src\save -file foosys .cpp -dir bin5 -file .exe

   For detailed help on file selection, type "sfk help select".

   * and ? wildcards are supported within filenames. "foo" is interpreted
   as "*foo*", so you can leave out * completely to search a part of a name.
   For name start comparison, say "\foo" (finds foo.txt but not anyfoo.txt).

   When you supply a directory name, by default this means "take all files".

      sfk list mydir                lists ALL  files of mydir, no * needed.
      sfk list mydir .cpp .hpp      lists SOME files of mydir, by extension.
      sfk list mydir !.cfg          lists all  files of mydir  EXCEPT .cfg

   general options:
      -tracesel tells in detail which files and/or directories are included
                or excluded, and why (due to which user-supplied mask).
      -nosub    do not process files within subdirectories.
      -nocol    before any command switches off color output.
      -quiet    or -nohead shows less output on some commands.
      -hidden   includes hidden and system files and dirs.
      For detailed help on all options, type "sfk help options".

   beware of Shell Command Characters.
      command parameters containing characters < > | ! & must be sur-
      rounded by quotes "". type "sfk filter" for details and examples.

   type "sfk ask word1 word2 ..."   to search ALL help text for words.
   type "sfk dumphelp"              to print  ALL help text.

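EDIT: A word of caution: be careful when running this on folders that contain binary files, as it will effectively destroy them, especially .git directories. If that is your case, do not run sfk on the whole folder; select specific file extensions instead (*.rb, *.py, etc.). Example: sfk remcr -dir chef -file .rb -file .json -file .erb -file .md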


Works great on OS X Mavericks. There is nothing to install; just run the script from the mounted dmg and it is ready to use in your terminal.
内特·库克

@Gui Ambros You don't need to worry about files in the .git folder. By default, sfk does not update files in hidden folders.
bittusarkar, 2015

1
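@bittusarkar: at the time of my answer, sfk effectively processed my entire .git folder and destroyed a bunch of binary files (hence my edit; I don't remember whether it was on Linux or Mac). They may have changed the default behavior in newer versions, but I would still recommend specifying extensions, just to be on the safe side.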
Gui Ambros

1
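This worked well for me after I spent way too much time trying to normalize my repository with the suggested git commands, which simply did not fix all the affected files.
angularsen, 2015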

1
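Thanks! I just used this to quickly and easily convert a whole bunch of files, and now I can add them to Git's staging area. This was on OS X 10.9.5; I am not sure where the files were created.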
ryanwc

16
find . -not \( -name .svn -prune -o -name .git -prune \) -type f -exec perl -pi -e 's/\r\n|\n|\r/\n/g' {} \;

This is much safer, as it avoids corrupting your git repo. Add to or replace .git and .svn with .bzr, .hg, or whatever source control you are using in the -not list.


3
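This is the best answer if you would rather not install something like dos2unix. It allows excluding file types and avoids breaking source control files.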
拉格万

10

On OS X, this worked for me:

find ./ -type f -exec perl -pi -e 's/\r\n|\n|\r/\n/g' {} \;

Warning: please back up your directory before executing this command.


5
Just a note that this corrupted my git repository. I tried again by moving the .git folder out, running the command, and then moving it back, with better success.
加里

1
I would also note that this does not exclude binary files, so it will corrupt jpgs.
尼克,

Licensed under cc by-sa 3.0 with attribution required.