What do the "Features enabled" in GNU find mean?

8

When I run find --version with GNU find, I get something like this:

find (GNU findutils) 4.5.9     
[license text]
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS(FTS_CWDFD) CBO(level=2)

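What do these "features" mean? man find mentions O_NOFOLLOW as some kind of security measure, and LEAF_OPTIMISATION as an optimisation that saves lstat calls on leaf nodes. But I cannot find anything about FTS, D_TYPE or CBO.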

find gnu

— Nonneneo

1

This seems to be the end of the ladder. Maybe you can coerce somebody into reading the source of find. Promise some chocolate.

— ott-- 2015

8

This is a complete answer, put together from the answers by Ketan and daniel Kullman plus my own research.

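Most of the "features" turn out to be query optimisations, since find is in general capable of (almost) arbitrarily complex queries against the filesystem.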

D_TYPE

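The presence of the D_TYPE feature means that find was compiled with support for the d_type field in struct dirent. This field is a BSD extension, also adopted by Linux, which provides the file type (directory, regular file, pipe, socket, character/block device, etc.) in the structure returned by readdir and friends. As an optimisation, find can use it to reduce or eliminate lstat calls when -type is used as a filter expression.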

readdir may not populate d_type on some filesystems, so an lstat is sometimes still needed.
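A rough illustration of the same idea, in Python rather than find's C: os.scandir exposes the type information that readdir returns, so DirEntry.is_dir(follow_symlinks=False) can usually answer without a separate lstat call, and falls back to one when the filesystem leaves d_type unset. The helper name subdirs is made up for this sketch; it is not how find itself is written.

```python
import os

def subdirs(path):
    """Return the sorted names of the real subdirectories of path.

    Symlinks to directories are excluded, much like `find -type d`.
    Hypothetical helper for illustration only.
    """
    with os.scandir(path) as entries:
        # is_dir() uses the type reported by readdir when it is available
        # and only issues a stat-like call when it is not.
        return sorted(e.name for e in entries if e.is_dir(follow_symlinks=False))
```

Calling subdirs(".") lists the directories directly under the current directory.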

O_NOFOLLOW

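This option will read either (enabled) or (disabled). If present and enabled, this feature implements a security measure that protects find from certain TOCTTOU race attacks. Specifically, it prevents find from traversing a symlink while walking directories, which could happen if a directory were replaced by a symlink after its file type had been checked but before find entered it.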

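With this option enabled, find uses open(..., O_NOFOLLOW) on the directory so that only real directories are opened, and then uses openat to open files within that directory.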
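The open/openat pattern described above can be sketched in Python as follows. This assumes a POSIX system where os.O_NOFOLLOW, os.O_DIRECTORY and the dir_fd parameter of os.open are available; the function names open_real_dir and read_inside are hypothetical and do not come from findutils.

```python
import os

def open_real_dir(path):
    """Open path only if it is a real directory, not a symlink to one.

    Raises OSError if the final path component is a symlink.
    """
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW)

def read_inside(dir_fd, name, limit=4096):
    """Read up to limit bytes of name, resolved relative to dir_fd (openat-style)."""
    fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
    try:
        return os.read(fd, limit)
    finally:
        os.close(fd)
```

Because the file is opened relative to the already-open directory descriptor, swapping the directory's path for a symlink afterwards does not change which directory is used.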

LEAF_OPTIMISATION

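This slightly obscure optimisation allows find to deduce how many entries of a directory are subdirectories by looking at the directory's link count, since each subdirectory contributes to its parent's link count (through its .. link). In certain circumstances this lets find skip a stat call. However, if the filesystem or OS misreports st_nlink, find may produce bogus results (fortunately, this is very rare).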
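To make the arithmetic concrete: on a traditional Unix filesystem a directory's link count is 2 (its name in its parent plus its own . entry) plus one for each subdirectory's .. entry. The sketch below applies that rule; the function names are invented for this example, and the rule is an assumption that does not hold on every filesystem, which is exactly the failure mode mentioned above.

```python
import os

def subdir_count_from_nlink(nlink):
    """Number of subdirectories implied by a directory's link count.

    Returns None when the count is below 2, which means the filesystem
    does not follow the traditional convention and cannot be trusted.
    """
    if nlink < 2:
        return None
    return nlink - 2

def subdir_count(path):
    """Apply the rule to a real directory (hypothetical helper)."""
    return subdir_count_from_nlink(os.stat(path).st_nlink)
```

Once a traversal has seen that many subdirectories among a directory's entries, every remaining entry must be a non-directory, so no further stat calls are needed to rule them out.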

FTS

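When enabled, the FTS feature causes find to use the fts API to traverse the file hierarchy, instead of a straightforward recursive implementation.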

It is not clear to me what the advantage of fts is, but FTS has been enabled in every default build of find I have seen so far.

CBO

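It turns out (after reading the find source code, as daniel kullman suggested) that "CBO" refers to the query optimisation level (it stands for "cost-based optimiser"). For example, if I run find -O9001 --version, I get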

Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS() CBO(level=9001)

Looking at the -O option in man find, I see

-Olevel
  Enables query optimisation.   The find program reorders tests to speed up execution  while  preserving  the  overall
  effect; that is, predicates with side effects are not reordered relative to each other.  The optimisations performed
  at each optimisation level are as follows.

  0      Equivalent to optimisation level 1.

  1      This is the default optimisation level  and  corresponds  to  the  traditional  behaviour.   Expressions  are
         reordered  so that tests based only on the names of files (for example -name and -regex) are performed first.

  2      Any -type or -xtype tests are performed after any tests based only on the names  of  files,  but  before  any
         tests  that  require information from the inode.  On many modern versions of Unix, file types are returned by
         readdir() and so these predicates are faster to evaluate than predicates which need to stat the file first.

  3      At this optimisation level, the full cost-based query optimiser is enabled.  The order of tests  is  modified
         so  that  cheap  (i.e. fast) tests are performed first and more expensive ones are performed later, if neces-
         sary.  Within each cost band, predicates are evaluated earlier or later according to whether they are  likely
         to  succeed or not.  For -o, predicates which are likely to succeed are evaluated earlier, and for -a, predi-
         cates which are likely to fail are evaluated earlier.

  The cost-based optimiser has a fixed idea of how likely any given test is to succeed.  In some cases the probability
  takes  account of the specific nature of the test (for example, -type f is assumed to be more likely to succeed than
  -type c).  The cost-based optimiser is currently being evaluated.   If it does not actually improve the  performance
  of find, it will be removed again.  Conversely, optimisations that prove to be reliable, robust and effective may be
  enabled at lower optimisation levels over time.  However, the default behaviour (i.e. optimisation level 1) will not
  be  changed  in  the 4.3.x release series.  The findutils test suite runs all the tests on find at each optimisation
  level and ensures that the result is the same.
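As a toy illustration of the level-3 rule for -a chains (cheap tests first; within a cost band, tests likely to fail first), here is a small Python sketch. The cost bands and success probabilities are invented numbers, the Predicate class is hypothetical, and treating every side-effecting predicate as a barrier is a simplification; none of this is taken from the findutils source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Predicate:
    name: str
    cost: int               # assumed band: 0 = name only, 1 = file type, 2 = needs stat
    p_success: float        # assumed chance that the test succeeds
    side_effect: bool = False

def _by_cost_then_failure(p):
    # For -a, a test that is likely to fail should run early to short-circuit.
    return (p.cost, p.p_success)

def reorder_and_chain(preds):
    """Reorder an -a chain, never moving a test across a side-effecting predicate."""
    result, run = [], []
    for p in preds:
        if p.side_effect:
            result.extend(sorted(run, key=_by_cost_then_failure))
            result.append(p)
            run = []
        else:
            run.append(p)
    result.extend(sorted(run, key=_by_cost_then_failure))
    return result

chain = [
    Predicate("-size", 2, 0.5),
    Predicate("-type", 1, 0.9),
    Predicate("-name", 0, 0.1),
    Predicate("-print", 0, 1.0, side_effect=True),
]
print([p.name for p in reorder_and_chain(chain)])
```

For an -o chain the second sort key would be reversed, so that tests likely to succeed run first.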

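Mystery solved! It is a bit odd that this option is a runtime value; normally I would expect the --version output to reflect only compile-time options.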

— Nonneneo

1

Information about O_NOFOLLOW is given in the info page for find:

9.2.1.1 O_NOFOLLOW

.....................

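If your system supports the O_NOFOLLOW flag (1) to the `open(2)' system call, `find' uses it when safely changing directory. The target directory is first opened and then `find' changes working directory with the `fchdir()' system call. This ensures that symbolic links are not followed, preventing the sort of race condition attack in which use is made of symbolic links.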

...
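The open-then-fchdir sequence described in the info page can be sketched in Python like this. It assumes a POSIX system with os.O_NOFOLLOW, os.O_DIRECTORY and os.fchdir; safe_chdir is a made-up name for illustration, not a function from findutils.

```python
import os

def safe_chdir(path):
    """Change directory to path only if it is a real directory.

    The directory is opened first with O_NOFOLLOW, so a symlink in the
    final path component makes the open fail with OSError, and fchdir
    then enters exactly the directory that was opened.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW)
    try:
        os.fchdir(fd)
    finally:
        os.close(fd)
```

If the open fails, the working directory is left unchanged.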

In the source tree, CBO appears only in the file parser.c:

 printf("CBO(level=%d) ", (int)(options.optimisation_level));

which suggests that it stands for cost-based optimisation (my best guess).

D_TYPE occurs in several places in the source tree and seems to be related to the directory entry type:

$ grep 'D_TYPE' */**

yields:

find/parser.c:#if defined USE_STRUCT_DIRENT_D_TYPE && defined HAVE_STRUCT_DIRENT_D_TYPE
lib/savedirinfo.c:#if defined HAVE_STRUCT_DIRENT_D_TYPE && defined USE_STRUCT_DIRENT_D_TYPE

and some more entries. You can find the source here.

— k

0

Browsing the findutils source tree (http://git.savannah.gnu.org/cgit/findutils.git/tree/), I found the following:

configure.ac: --enable-d_type-optimization, make use of the file type data returned in struct dirent.d_type by readdir()
m4/withfts.m4: --without-fts, use an older mechanism for searching the filesystem, instead of using fts()

I did not find anything about CBO. You may have to download the source code and search for the term.

— daniel kullman