diff -r for only certain file types


12

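Is there a way to perform a recursive comparison of two directories, but compare (in their respective locations) only the files that match a certain file name or file type predicate?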

For example, I would like to do something like:

diff -r dir-a dir-b -filenames *.java, ivy.xml, build.xml

...or even better:

diff -r dir-a dir-b -filetype text

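Obviously, using diff is not mandatory, as I think an incantation with find and -exec diff might also do the trick (I just don't know how to generate the complementary file paths in the latter case).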


3
There is an option to exclude files that match a pattern; I don't see an option to include only files that match a pattern.
2014

1
All the options specific to comparing directories are at gnu.org/software/diffutils/manual/html_node/…
Barmar

1
See this link and look at Sérgio's answer.
yehudahs

Answers:


1

Shellscript differ-r

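This shellscript performs a recursive diff of two directories, but compares (in their respective locations) only the files that match a specific file name or file type pattern.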

#!/bin/bash

greenvid="\0033[32m"
resetvid="\0033[0m"

if [ $# -ne 3 ]
then
 echo "Usage: compare files in two directories including subdirectories"
 echo "         $0 <source-dir> <target-dir> <pattern>"
 echo "Example: $0  subdir-1     subdir-2     \"*.txt\""
 exit 1
fi

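# Inner script, run by a child bash for each batch of files that find collects.
# The single-quoted string is interrupted around $1 and $2 so that the outer
# shell splices both directory names into the ${pathname/source/target}
# substitution, which turns a source path into its counterpart in the target.
cmd='for pathname do
        greenvid="\0033[32m"
        resetvid="\0033[0m"
        echo -e "${greenvid}diff \"$pathname\" \"${pathname/'\"$1\"'/'\"$2\"'}\"${resetvid}"
        diff "$pathname" "${pathname/'\"$1\"'/'\"$2\"'}"
    done'
# Uncomment to inspect the generated inner script:
#echo "$cmd"

find "$1" -type f -name "$3" -exec bash -c "$cmd" bash {} +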
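The script above builds its inner command by splicing the directory names into a code string. As a hedged alternative, here is a minimal sketch that hands both directories to the child shell as positional parameters and derives the counterpart path by stripping the source prefix. The function name differ_r and the variable counterpart are made up for this illustration, the colour output is left out, and a source directory of "/" is not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script):
#   differ_r <source-dir> <target-dir> <pattern>
differ_r() {
    src=${1%/}    # drop one trailing slash so the prefix matches what find prints
    dst=${2%/}
    find "$src" -type f -name "$3" -exec sh -c '
        src=$1
        dst=$2
        shift 2
        for pathname do
            # Drop the source prefix and graft the remainder onto the target directory.
            counterpart=$dst${pathname#"$src"}
            printf "diff \"%s\" \"%s\"\n" "$pathname" "$counterpart"
            diff "$pathname" "$counterpart"
        done' sh "$src" "$dst" {} +
}
```

Called as differ_r 1 2 "*.txt", this should visit the same files as the script above. Because the directory names travel as arguments rather than as part of the code, quotes or other special characters in them are not interpreted by the child shell.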

Demo

Files:

$ find -type f
./1/ett.txt
./1/two.doc
./1/t r e.txt
./1/sub/only-one.doc
./1/sub/hello.doc
./1/sub/hejsan.doc
./differ-r2
./differ-r1
./differ-r
./2/ett.txt
./2/two.doc
./2/t r e.txt
./2/sub/hello.doc
./2/sub/hejsan.doc

Usage:

$ ./differ-r
Usage: compare files in two directories including subdirectories
         ./differ-r <source-dir> <target-dir> <pattern>
Example: ./differ-r  subdir-1     subdir-2     "*.txt"

Running differ-r

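Each diff command line that is executed is printed in green; when the files do not match, diff's output follows in the default text color (white on black in the original screenshot).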

$ ./differ-r 1 2 "*.doc"
diff "1/two.doc" "2/two.doc"
diff "1/sub/only-one.doc" "2/sub/only-one.doc"
diff: 2/sub/only-one.doc: No such file or directory
diff "1/sub/hello.doc" "2/sub/hello.doc"
2d1
< world
diff "1/sub/hejsan.doc" "2/sub/hejsan.doc"

$ ./differ-r 1 2 "*.txt"
diff "1/ett.txt" "2/ett.txt"
2c2
< stabben
---
> farsan
diff "1/t r e.txt" "2/t r e.txt"
1c1
< t r e
---
> 3
$ 

$ ./differ-r 1 2 "*"
diff "1/ett.txt" "2/ett.txt"
2c2
< stabben
---
> farsan
diff "1/two.doc" "2/two.doc"
diff "1/t r e.txt" "2/t r e.txt"
1c1
< t r e
---
> 3
diff "1/sub/only-one.doc" "2/sub/only-one.doc"
diff: 2/sub/only-one.doc: No such file or directory
diff "1/sub/hello.doc" "2/sub/hello.doc"
2d1
< world
diff "1/sub/hejsan.doc" "2/sub/hejsan.doc"

$ ./differ-r 2 1 "*"
diff "2/ett.txt" "1/ett.txt"
2c2
< farsan
---
> stabben
diff "2/two.doc" "1/two.doc"
diff "2/t r e.txt" "1/t r e.txt"
1c1
< 3
---
> t r e
diff "2/sub/hello.doc" "1/sub/hello.doc"
1a2
> world
diff "2/sub/hejsan.doc" "1/sub/hejsan.doc"

rsync with filters

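If you do not need any output describing the differences, but only want to know which files differ or are missing (so that rsync would copy them), you can use the following command line.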

rsync --filter="+ <pattern>" --filter="+ */" --filter="- *" --filter="- */"  -avcn <source directory>/ <target directory>

Demo

$ rsync --filter="+ *.doc" --filter="+ */" --filter="- *"  -avcn 1/ 2
sending incremental file list
./
sub/
sub/hello.doc
sub/only-one.doc

sent 276 bytes  received 35 bytes  622.00 bytes/sec
total size is 40  speedup is 0.13 (DRY RUN)

sent 360 bytes  received 41 bytes  802.00 bytes/sec
total size is 61  speedup is 0.15 (DRY RUN)
olle@bionic64 /media/multimed-2/test/test0/temp $ rsync --filter="+ *.txt" --filter="+ */" --filter="- *" -avcn 1/ 2
sending incremental file list
./
ett.txt
t r e.txt
sub/

sent 184 bytes  received 29 bytes  426.00 bytes/sec
total size is 21  speedup is 0.10 (DRY RUN)

If you want clean output without the comment lines and directories, you can grep the output like this:

$ pattern="*.doc"; rsync --filter="+ $pattern" --filter="+ */" --filter="- *"  -avcn 1/ 2 | grep "${pattern/\*/.\*}"
sub/hello.doc
sub/only-one.doc
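The substitution ${pattern/\*/.\*} above turns "*.doc" into the regular expression ".*.doc", which is looser than the glob: the dot matches any character, nothing is anchored, and only the first "*" is replaced. A stricter conversion could look like the sketch below. The function name glob_to_regex is made up for this illustration, and it assumes patterns that contain only literal characters, "." and "*".

```shell
#!/bin/sh
# Hypothetical helper: convert a simple file name glob such as "*.doc" into an
# anchored extended regular expression for grep -E.
# Assumption: the glob uses only literal characters, "." and "*".
glob_to_regex() {
    # Escape dots first, then let each "*" match any run of non-slash characters.
    body=$(printf '%s\n' "$1" | sed -e 's/\./\\./g' -e 's|\*|[^/]*|g')
    # Anchor the match to the last path component.
    printf '%s%s%s\n' '(^|/)' "$body" '$'
}
```

In the one-liner above, the final grep would then become grep -E "$(glob_to_regex "$pattern")". With "*.doc" this keeps sub/hello.doc but drops names such as notes.docx or mydoc, both of which the looser expression lets through.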

Shellscript rsync-diff

This one-liner can be made the core command of a shellscript, rsync-diff:

#!/bin/bash

LANG=C

if [ $# -ne 3 ]
then
 echo "Usage: compare files in two directories including subdirectories"
 echo "         $0 <source-dir> <target-dir> <pattern>"
 echo "Example: $0  subdir-1     subdir-2     \"*.txt\""
 exit 1
fi

pattern="$3"; rsync --filter="+ $pattern" --filter="+ */" --filter="- *" \
 -avcn "$1"/ "$2" | grep "${pattern//\*/.\*}" | grep -v \
  -e '/$' \
  -e '^sending incremental file list$' \
  -e '^sent.*received.*sec$' \
  -e '^total size is.*speedup.*(DRY RUN)$'

0

Since you mention "Obviously, using diff is not mandatory",

Meld should do the job for you. Which file types to ignore is easy to configure.

Alternatively, you could write a simple script that converts a whitelist into a blacklist and then passes the blacklist to diff with the --exclude option.


Updated the tags to add "command-line".
Marcus Junius Brutus

0

With a shell that supports process substitution, you can use the following one-liner (as already illustrated by @JammingThebBits):

diff -r dir-a dir-b --exclude-from=<( \
find dir-a dir-b -type f -not \( -name '*.xml'  -or -name '*.java' \) \
| sed 's:^.*/\([^/]*\)$:\1:' \
)

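It works like this: find searches for the files that are not of interest, and sed extracts their base names (running basename is very slow when there are many files). The shell presents that list to diff as a file, and diff is told to exclude the listed names from the comparison (double exclusion = inclusion).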

If process substitution is not available, write the sed output to a file and pass that file to diff explicitly.

In the example I keep only the XML and Java files; change the list as needed, separating the patterns with -or.
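For shells without process substitution, the same double exclusion can be sketched with an explicit temporary file. The function name diff_only is made up for this illustration, and the sketch assumes GNU diff (for --exclude-from), at least one pattern, and file names free of newlines and glob characters. As with the one-liner, the exclusion works on base names, so a directory that shares its name with an excluded file is skipped too.

```shell
#!/bin/sh
# Hypothetical helper: diff_only <dir-a> <dir-b> <pattern>...
# Recursively compares only the files whose names match one of the patterns.
diff_only() {
    dir_a=$1
    dir_b=$2
    shift 2
    list=$(mktemp) || return 2

    # Rebuild the positional parameters as: -name P1 -o -name P2 ...
    first=1
    for p do
        if [ "$first" = 1 ]; then set --; first=0; else set -- "$@" -o; fi
        set -- "$@" -name "$p"
    done

    # Collect the base names of every file that matches none of the patterns.
    find "$dir_a" "$dir_b" -type f ! \( "$@" \) | sed 's:^.*/::' | sort -u > "$list"

    diff -r --exclude-from="$list" "$dir_a" "$dir_b"
    status=$?
    rm -f "$list"
    return "$status"
}
```

For instance, diff_only dir-a dir-b '*.java' '*.xml' corresponds to the one-liner above; the return status is diff's own.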

Licensed under cc by-sa 3.0 with attribution required.