How does RecursiveIteratorIterator work in PHP?

2022-01-10 00:00:00 iterator php spl

How does RecursiveIteratorIterator work?

The PHP manual does not document or explain much about it. What is the difference between IteratorIterator and RecursiveIteratorIterator?

Solution

RecursiveIteratorIterator is a concrete Iterator implementing tree traversal. It enables a programmer to traverse a container object that implements the RecursiveIterator interface. See Iterator in Wikipedia for the general principles, types, semantics and patterns of iterators.

In contrast to IteratorIterator, which is a concrete Iterator implementing object traversal in linear order (and by default accepts any kind of Traversable in its constructor), the RecursiveIteratorIterator allows looping over all nodes in an ordered tree of objects, and its constructor takes a RecursiveIterator.

In short: RecursiveIteratorIterator allows you to loop over a tree, IteratorIterator allows you to loop over a list. I show that with some code examples further below.

Technically this works by breaking out of linearity by traversing all of a node's children (if any). This is possible because, by definition, the children of a node are again a RecursiveIterator. The top-level Iterator then internally stacks the different RecursiveIterators by their depth and keeps a pointer to the currently active sub-Iterator for traversal.

This makes it possible to visit all nodes of a tree.
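The stacking described above can be sketched by hand. This is only an illustration of the principle, not the actual implementation inside PHP: the function name walkLeaves and the nested sample array are assumptions made for this example, and only leaves are reported (which corresponds to the default mode discussed later).

```php
<?php
// Illustrative data: nested arrays stand in for a tree (an assumption, not from the answer).
$tree = new RecursiveArrayIterator([
    'dirA'  => ['dirB' => ['fileD' => 'D'], 'fileB' => 'B', 'fileC' => 'C'],
    'fileA' => 'A',
]);

// Hypothetical helper: walk the tree with an explicit stack of
// RecursiveIterators, one per depth, and collect the keys of all leaves.
function walkLeaves(RecursiveIterator $root): array
{
    $leaves = [];
    $stack  = [$root];
    $root->rewind();

    while ($stack) {
        $active = end($stack);           // the currently active sub-iterator

        if (!$active->valid()) {
            array_pop($stack);           // this level is exhausted, go back up
            if ($stack) {
                end($stack)->next();     // move the parent past the finished node
            }
            continue;
        }

        if ($active->hasChildren()) {
            $children = $active->getChildren();
            $children->rewind();
            $stack[] = $children;        // descend one level
            continue;
        }

        $leaves[] = $active->key();      // a leaf: report it, then advance
        $active->next();
    }

    return $leaves;
}

$manual = walkLeaves($tree);

// The built-in class should visit the same leaves in the same order.
$builtIn = [];
foreach (new RecursiveIteratorIterator($tree) as $key => $value) {
    $builtIn[] = $key;
}
```

Both $manual and $builtIn end up listing the leaf keys in the same depth-first order, which is the point of the comparison: the foreach over RecursiveIteratorIterator hides the stack handling.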

The underlying principles are the same as with IteratorIterator: an interface specifies the type of iteration, and the base iterator class is the implementation of these semantics. Compare with the examples below: for linear looping with foreach you normally do not think much about the implementation details unless you need to define a new Iterator (e.g. when some concrete type itself does not implement Traversable).

For recursive traversal, unless you use a pre-defined Traversable that already performs recursive traversal, you normally need to instantiate the existing RecursiveIteratorIterator or even write your own Traversable that performs recursive traversal, to get this type of iteration with foreach.

Tip: You probably have not implemented either one yourself, so doing so might be worthwhile for getting practical experience of the differences between them. You will find a DIY suggestion at the end of the answer.

Technical differences in short:

  • While IteratorIterator takes any Traversable for linear traversal, RecursiveIteratorIterator needs a more specific RecursiveIterator to loop over a tree.
  • Where IteratorIterator exposes its main Iterator via getInnerIterator(), RecursiveIteratorIterator provides the current active sub-Iterator only via that method.
  • While IteratorIterator is totally not aware of anything like parent or children, RecursiveIteratorIterator knows how to get and traverse children as well.
  • IteratorIterator does not need a stack of iterators, RecursiveIteratorIterator has such a stack and knows the active sub-iterator.
  • Where IteratorIterator has its order due to linearity and no choice, RecursiveIteratorIterator has a choice for further traversal and needs to decide per each node (decided via mode per RecursiveIteratorIterator).
  • RecursiveIteratorIterator has more methods than IteratorIterator.
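A minimal sketch of the first few differences, using nested arrays instead of a real directory so it runs anywhere. The sample data and variable names are assumptions made for this illustration.

```php
<?php
// Illustrative data (an assumption): one "directory" with two "files", plus one top-level "file".
$data = ['dirA' => ['fileB' => 'B', 'fileC' => 'C'], 'fileA' => 'A'];

// Linear order: IteratorIterator takes any Traversable and only sees the top level.
$flat     = new IteratorIterator(new ArrayIterator($data));
$flatKeys = [];
foreach ($flat as $key => $value) {
    $flatKeys[] = $key;
}

// Tree order: RecursiveIteratorIterator needs a RecursiveIterator and descends into children.
$deep     = new RecursiveIteratorIterator(new RecursiveArrayIterator($data));
$deepKeys = [];
$depths   = [];
foreach ($deep as $key => $value) {
    $deepKeys[] = $key;
    $depths[]   = $deep->getDepth();   // extra state that IteratorIterator does not have
}

print_r($flatKeys);
print_r($deepKeys);
print_r($depths);
```

The flat loop never looks inside dirA, while the recursive loop reports the entries below it together with the depth at which each one was found.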

To summarize: RecursiveIteratorIterator is a concrete type of iteration (looping over a tree) that works on its own kind of iterators, namely RecursiveIterator. That is the same underlying principle as with IteratorIterator, but the type of iteration is different (IteratorIterator uses linear order).

Ideally you can create your own set, too. The only requirement is that your iterator implements Traversable, which is possible via Iterator or IteratorAggregate. Then you can use it with foreach. For example, some kind of ternary tree traversal recursive iteration object, together with the corresponding iteration interface for the container object(s).
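As a small sketch of the "implements Traversable via IteratorAggregate" route: the class name Playlist and its contents are hypothetical and chosen only for this illustration.

```php
<?php
// Hypothetical container class, used only for illustration.
final class Playlist implements IteratorAggregate
{
    /** @var string[] */
    private $titles;

    public function __construct(array $titles)
    {
        $this->titles = $titles;
    }

    // IteratorAggregate only asks for this one method; foreach calls it for us.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->titles);
    }
}

$seen = [];
foreach (new Playlist(['intro', 'outro']) as $index => $title) {
    $seen[$index] = $title;
}

print_r($seen);
```

Implementing Iterator directly is the other option. It takes more methods, but gives full control over the iteration state.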


Let's review this with some real-life examples that are less abstract. Between interfaces, concrete iterators, container objects and iteration semantics, this is probably not a bad idea.

Take a directory listing as an example. Consider a file and directory tree on disk (shown in full in the Recursive column below).

While an iterator with linear order just traverses the top-level folders and files (a single directory listing), the recursive iterator traverses through subfolders as well and lists all folders and files (a directory listing with listings of its subdirectories):

Non-Recursive        Recursive
=============        =========

   [tree]            [tree]
    ├ dirA            ├ dirA
    └ fileA           │ ├ dirB
                      │ │ └ fileD
                      │ ├ fileB
                      │ └ fileC
                      └ fileA

You can easily compare this with IteratorIterator, which does no recursion for traversing the directory tree, and with RecursiveIteratorIterator, which can traverse into the tree as the Recursive listing shows.

First, a very basic example with a DirectoryIterator, which implements Traversable and therefore allows foreach to iterate over it:

$path = 'tree';
$dir  = new DirectoryIterator($path);

echo "[$path]\n";
foreach ($dir as $file) {
    echo " ├ $file\n";
}

The exemplary output for the directory structure above then is:

[tree]
 ├ .
 ├ ..
 ├ dirA
 ├ fileA

As you can see, this is not yet using IteratorIterator or RecursiveIteratorIterator. Instead it is just using foreach, which operates on the Traversable interface.

As foreach by default only knows the type of iteration named linear order, we might want to specify the type of iteration explicitly. At first glance it might seem too verbose, but for demonstration purposes (and to make the difference from RecursiveIteratorIterator more visible later), let's specify the linear type of iteration explicitly by using the IteratorIterator type of iteration for the directory listing:

$files = new IteratorIterator($dir);

echo "[$path]\n";
foreach ($files as $file) {
    echo " ├ $file\n";
}

This example is nearly identical to the first one. The difference is that $files is now an IteratorIterator type of iteration for the Traversable $dir:

$files = new IteratorIterator($dir);

As usual the act of iteration is performed by the foreach:

foreach ($files as $file) {

The output is exactly the same. So what is different? The object used within the foreach. In the first example it is a DirectoryIterator, in the second example it is the IteratorIterator. This shows the flexibility iterators have: you can replace them with each other, and the code inside foreach just continues to work as expected.

Let's start to get the whole listing, including subdirectories.

As we have now specified the type of iteration, let's consider changing it to another type of iteration.

We know we need to traverse the whole tree now, not only the first level. To have that work with a simple foreach we need a different type of iterator: RecursiveIteratorIterator. And that one can only iterate over container objects that have the RecursiveIterator interface.

The interface is a contract. Any class implementing it can be used together with the RecursiveIteratorIterator. An example of such a class is the RecursiveDirectoryIterator, which is something like the recursive variant of DirectoryIterator.
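To make the contract concrete, here is a minimal sketch of a custom class fulfilling it. The class name NodeIterator and the node layout (arrays with 'name' and 'children' keys) are assumptions invented for this illustration; only the interface methods themselves come from PHP.

```php
<?php
// Hypothetical node list: each node is ['name' => string, 'children' => array of nodes].
final class NodeIterator implements RecursiveIterator
{
    private $nodes;
    private $pos = 0;

    public function __construct(array $nodes)
    {
        $this->nodes = array_values($nodes);
    }

    // The five Iterator methods: plain linear iteration over one level.
    public function current(): string { return $this->nodes[$this->pos]['name']; }
    public function key(): int        { return $this->pos; }
    public function next(): void      { $this->pos++; }
    public function rewind(): void    { $this->pos = 0; }
    public function valid(): bool     { return isset($this->nodes[$this->pos]); }

    // The two RecursiveIterator methods: how to find and enter the next level.
    public function hasChildren(): bool
    {
        return !empty($this->nodes[$this->pos]['children']);
    }

    public function getChildren(): self
    {
        return new self($this->nodes[$this->pos]['children']);
    }
}

$tree = new NodeIterator([
    ['name' => 'dirA', 'children' => [
        ['name' => 'fileB', 'children' => []],
    ]],
    ['name' => 'fileA', 'children' => []],
]);

// SELF_FIRST is used here so that 'dirA' itself is reported as well.
$names = [];
foreach (new RecursiveIteratorIterator($tree, RecursiveIteratorIterator::SELF_FIRST) as $name) {
    $names[] = $name;
}

print_r($names);
```

Nothing in NodeIterator knows about stacks or depth. It only answers "does the current element have children?" and "give me an iterator for them", and RecursiveIteratorIterator does the rest.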

Let's see a first code example before writing any other sentence with the I-word:

$dir  = new RecursiveDirectoryIterator($path);

echo "[$path]\n";
foreach ($dir as $file) {
    echo " ├ $file\n";
}

This third example is nearly identical with the first one, however it creates some different output:

[tree]
 ├ tree\.
 ├ tree\..
 ├ tree\dirA
 ├ tree\fileA

Okay, not that different: the filename now contains the pathname in front, but the rest looks similar as well.

As the example shows, even though the directory object already implements the RecursiveIterator interface, this is not yet enough to make foreach traverse the whole directory tree. This is where the RecursiveIteratorIterator comes into action. Example 4 shows how:

$files = new RecursiveIteratorIterator($dir);

echo "[$path]\n";
foreach ($files as $file) {
    echo " ├ $file\n";
}

Using the RecursiveIteratorIterator instead of just the previous $dir object makes foreach traverse all files and directories in a recursive manner. This then lists all files, as the type of object iteration has now been specified:

[tree]
 ├ tree\.
 ├ tree\..
 ├ tree\dirA\.
 ├ tree\dirA\..
 ├ tree\dirA\dirB\.
 ├ tree\dirA\dirB\..
 ├ tree\dirA\dirB\fileD
 ├ tree\dirA\fileB
 ├ tree\dirA\fileC
 ├ tree\fileA

This should already demonstrate the difference between flat and tree traversal. The RecursiveIteratorIterator is able to traverse any tree-like structure as a list of elements. Because there is more information available (like the level at which the iteration currently takes place), it is possible to access the iterator object while iterating over it and, for example, indent the output:

echo "[$path]\n";
foreach ($files as $file) {
    $indent = str_repeat('   ', $files->getDepth());
    echo $indent, " ├ $file\n";
}

And output of Example 5:

[tree]
 ├ tree\.
 ├ tree\..
    ├ tree\dirA\.
    ├ tree\dirA\..
       ├ tree\dirA\dirB\.
       ├ tree\dirA\dirB\..
       ├ tree\dirA\dirB\fileD
    ├ tree\dirA\fileB
    ├ tree\dirA\fileC
 ├ tree\fileA

Sure, this does not win a beauty contest, but it shows that with the recursive iterator there is more information available than just the linear order of key and value. Even though foreach can only express this kind of linearity, accessing the iterator itself allows you to obtain more information.

Similar to the meta-information, there are also different possible ways to traverse the tree and therefore to order the output. This is the mode of the RecursiveIteratorIterator, and it can be set with the constructor.

The next example tells the RecursiveDirectoryIterator to remove the dot entries (. and ..) as we do not need them. In addition, the recursion mode is changed to take the parent element (the subdirectory) first (SELF_FIRST), before the children (the files and sub-subdirectories in the subdirectory):

$dir  = new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS);
$files = new RecursiveIteratorIterator($dir, RecursiveIteratorIterator::SELF_FIRST);

echo "[$path]\n";
foreach ($files as $file) {
    $indent = str_repeat('   ', $files->getDepth());
    echo $indent, " ├ $file\n";
}

The output now shows the subdirectory entries properly listed. If you compare it with the previous output, those were not there:

[tree]
 ├ tree\dirA
    ├ tree\dirA\dirB
       ├ tree\dirA\dirB\fileD
    ├ tree\dirA\fileB
    ├ tree\dirA\fileC
 ├ tree\fileA

The recursion mode therefore controls what is returned (a branch or a leaf in the tree) and when. For the directory example:

  • LEAVES_ONLY (default): Only list files, no directories.
  • SELF_FIRST (above): List directory and then the files in there.
  • CHILD_FIRST (w/o example): List files in subdirectory first, then the directory.

Output of Example 5 with the two other modes:

  LEAVES_ONLY                           CHILD_FIRST

  [tree]                                [tree]
         ├ tree\dirA\dirB\fileD                ├ tree\dirA\dirB\fileD
      ├ tree\dirA\fileB                     ├ tree\dirA\dirB
      ├ tree\dirA\fileC                     ├ tree\dirA\fileB
   ├ tree\fileA                             ├ tree\dirA\fileC
                                         ├ tree\dirA
                                         ├ tree\fileA
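The three modes can also be compared without touching the disk. The following sketch mirrors the directory tree with nested arrays, which is an assumption made so the example runs anywhere. The helper name keysInMode is likewise hypothetical.

```php
<?php
// Illustrative stand-in for the directory tree: arrays are branches, strings are leaves.
$tree = [
    'dirA'  => ['dirB' => ['fileD' => 'D'], 'fileB' => 'B', 'fileC' => 'C'],
    'fileA' => 'A',
];

// Hypothetical helper: collect the keys in the order a given mode visits them.
function keysInMode(array $tree, int $mode): array
{
    $iterator = new RecursiveIteratorIterator(new RecursiveArrayIterator($tree), $mode);
    $keys     = [];
    foreach ($iterator as $key => $value) {
        $keys[] = $key;
    }
    return $keys;
}

$leavesOnly = keysInMode($tree, RecursiveIteratorIterator::LEAVES_ONLY);
$selfFirst  = keysInMode($tree, RecursiveIteratorIterator::SELF_FIRST);
$childFirst = keysInMode($tree, RecursiveIteratorIterator::CHILD_FIRST);

echo implode(' ', $leavesOnly), "\n"; // fileD fileB fileC fileA
echo implode(' ', $selfFirst), "\n";  // dirA dirB fileD fileB fileC fileA
echo implode(' ', $childFirst), "\n"; // fileD dirB fileB fileC dirA fileA
```

CHILD_FIRST is the order you typically want when deleting a directory tree, because every directory is reported only after everything inside it.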

When you compare that with standard traversal, none of these things are available there. Recursive iteration is therefore a little more complex to wrap your head around, but it is easy to use because it behaves just like an iterator: you put it into a foreach and you are done.

I think these are enough examples for one answer. You can find the full source-code as well as an example to display nice-looking ascii-trees in this gist: https://gist.github.com/3599532

Do It Yourself: Make the RecursiveTreeIterator Work Line by Line.

Example 5 demonstrated that there is meta-information about the iterator's state available. However, this was purposefully demonstrated within the foreach iteration. In real life this naturally belongs inside the RecursiveIterator.

A better example is the RecursiveTreeIterator, which takes care of indenting, prefixing and so on. See the following code fragment:

$dir   = new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS);
$lines = new RecursiveTreeIterator($dir);
$unicodeTreePrefix($lines);
echo "[$path]\n", implode("\n", iterator_to_array($lines));

The RecursiveTreeIterator is intended to work line by line. The output is pretty straightforward, with one little problem:

[tree]
 ├ tree\dirA
 │ ├ tree\dirA\dirB
 │ │ └ tree\dirA\dirB\fileD
 │ ├ tree\dirA\fileB
 │ └ tree\dirA\fileC
 └ tree\fileA

When used in combination with a RecursiveDirectoryIterator it displays the whole pathname and not just the filename. The rest looks good. This is because the file-names are generated by SplFileInfo. Those should be displayed as the basename instead. The desired output is the following:

/// Solved ///

[tree]
 ├ dirA
 │ ├ dirB
 │ │ └ fileD
 │ ├ fileB
 │ └ fileC
 └ fileA

Create a decorator class that can be used with RecursiveTreeIterator instead of the RecursiveDirectoryIterator. It should provide the basename of the current SplFileInfo instead of the pathname. The final code fragment could then look like:

$lines = new RecursiveTreeIterator(
    new DiyRecursiveDecorator($dir)
);
$unicodeTreePrefix($lines);
echo "[$path]\n", implode("\n", iterator_to_array($lines));
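If you want to check your own attempt, one possible shape of such a decorator is sketched below. It is not the gist's implementation: only the class name DiyRecursiveDecorator comes from the exercise, and everything else is an assumption (extending IteratorIterator for the delegation, a scratch directory under the system temp dir, PHP 7.4 or later). The $unicodeTreePrefix helper from the gist is not used, so RecursiveTreeIterator prints its default ASCII prefixes.

```php
<?php
// Hypothetical solution sketch; only the class name comes from the exercise text.
class DiyRecursiveDecorator extends IteratorIterator implements RecursiveIterator
{
    public function __construct(RecursiveIterator $inner)
    {
        parent::__construct($inner);
    }

    public function current(): string
    {
        // The inner iterator yields SplFileInfo objects; expose only the basename.
        return parent::current()->getBasename();
    }

    public function hasChildren(): bool
    {
        return $this->getInnerIterator()->hasChildren();
    }

    public function getChildren(): self
    {
        // Wrap the children too, so every level of the tree is decorated.
        return new self($this->getInnerIterator()->getChildren());
    }
}

// Build a scratch copy of the example tree (location and name are assumptions).
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'tree_' . uniqid();
mkdir($path . '/dirA/dirB', 0777, true);
foreach (['dirA/dirB/fileD', 'dirA/fileB', 'dirA/fileC', 'fileA'] as $file) {
    touch($path . '/' . $file);
}

$dir    = new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS);
$lines  = new RecursiveTreeIterator(new DiyRecursiveDecorator($dir));
$output = iterator_to_array($lines, false);

echo "[tree]\n", implode("\n", $output), "\n";

// Remove the scratch directory again; CHILD_FIRST yields entries before their parent directory.
$cleanup = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS),
    RecursiveIteratorIterator::CHILD_FIRST
);
foreach ($cleanup as $entry) {
    $entry->isDir() ? rmdir($entry->getPathname()) : unlink($entry->getPathname());
}
rmdir($path);
```

The order of the entries depends on the filesystem, so the lines may appear in a different sequence than in the desired output above. What matters is that each line now shows a basename instead of the full pathname.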

These fragments, including $unicodeTreePrefix, are part of the gist in Appendix: Do It Yourself: Make the RecursiveTreeIterator Work Line by Line.
