How do I stop iteration and return an error when Iterator::map returns a Result::Err?


76

I have a function that returns a Result:

fn find(id: &Id) -> Result<Item, ItemError> {
    // ...
}

Then another function uses it like this:

let parent_items: Vec<Item> = parent_ids.iter()
    .map(|id| find(id).unwrap())
    .collect();

How do I handle a failure inside any of the map iterations?

I know I could use flat_map, in which case the error results would be ignored:

let parent_items: Vec<Item> = parent_ids.iter()
    .flat_map(|id| find(id).into_iter())
    .collect();

Result's iterator yields either 0 or 1 items depending on whether it succeeded, and flat_map filters it out when it yields 0.

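However, I don't want to ignore errors. Instead, I want the whole code block to stop and return a new error (based on the error that came up inside the map, or just forwarding the existing error).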

How do I best handle this in Rust?

Answers:


109

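Result implements FromIterator, so you can move the Result to the outside (collect into Result<Vec<_>, _> instead of Vec<Result<_, _>>) and the iterator will take care of the rest, including stopping iteration as soon as an error is found.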

#[derive(Debug)]
struct Item;
type Id = String;

fn find(id: &Id) -> Result<Item, String> {
    Err(format!("Not found: {:?}", id))
}

fn main() {
    let s = |s: &str| s.to_string();
    let ids = vec![s("1"), s("2"), s("3")];

    let items: Result<Vec<_>, _> = ids.iter().map(find).collect();
    println!("Result: {:?}", items);
}

Playground
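The question also asks about returning a new error based on the one raised inside the map. A minimal sketch of that, assuming a hypothetical wrapper type LoadError and u32 ids (neither is from the original post; find, Item, and ItemError are stand-ins for the asker's own definitions): map_err converts each error before collecting, and ? forwards the first one to the caller.

```rust
// Hypothetical stand-ins for the asker's types; u32 ids are an assumption.
#[derive(Debug, PartialEq)]
struct Item(u32);

#[derive(Debug, PartialEq)]
struct ItemError(u32);

// Hypothetical "new error" that wraps the error produced inside the map.
#[derive(Debug, PartialEq)]
enum LoadError {
    Parent(ItemError),
}

// Stand-in lookup: ids below 10 exist, everything else is an error.
fn find(id: &u32) -> Result<Item, ItemError> {
    if *id < 10 {
        Ok(Item(*id))
    } else {
        Err(ItemError(*id))
    }
}

fn load_parents(parent_ids: &[u32]) -> Result<Vec<Item>, LoadError> {
    let parent_items = parent_ids
        .iter()
        // Convert each ItemError into the caller-facing error type.
        .map(|id| find(id).map_err(LoadError::Parent))
        // Collecting into Result stops at the first Err; `?` returns it.
        .collect::<Result<Vec<_>, _>>()?;
    Ok(parent_items)
}
```

With these definitions, load_parents(&[1, 2, 3]) yields Ok with three items, while load_parents(&[1, 42, 3]) yields Err(LoadError::Parent(ItemError(42))) and never looks up id 3.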


7
+1 This is awesome! (The example from my answer ported to this: is.gd/E26iv9)
Dogbert

1
@KaiSellgren Yes, you can apply this same trick. The key is in the type signature of collect, which is polymorphic on the return type, which must implement FromIterator. I don't know what you mean by "can it be applied in a broader way." Rust supports polymorphic return types... So, yes? (See the Rng and Default traits for more examples of return type polymorphism.)
BurntSushi5

3
@KaiSellgren from_iter is called in the collect method.
BurntSushi5

1
Use of collect() requires the iterator to be finite, correct? If so, how would a similar but infinite iterator be handled?
U007D

1
What would you do in case of multiple map()s? If the first map() returns a Result, then the following map() has to accept a Result as well, which can be annoying. Is there a way to achieve the same from the middle of a map() chain? Short of just doing .map(...).collect::<Result<Vec<_>, _>>()?.into_iter().map(...), of course.
Good Night Nerd Pride

2

The accepted answer shows how to stop on error while collecting, and that's fine because that's what the OP requested. If you need processing that also works on large or infinite fallible iterators, read on.

As already noted, for can be used to emulate stop-on-error, but that is sometimes inelegant, as when you want to call max() or another consuming method. In other situations it's next to impossible, as when the consuming method is in another crate, such as itertools or Rayon [1].

Iterator consumer: try_for_each

When you control how the iterator is consumed, you can just use try_for_each to stop on first error. It will return a result that is Ok if there was no error, and is Err otherwise, containing the error value:

use std::{io, fs};

fn main() -> io::Result<()> {
    fs::read_dir("/")?
        // Each `e` is an io::Result<DirEntry>; `?` propagates the first error
        // out of the closure, which makes try_for_each stop and return it.
        .try_for_each(|e| -> io::Result<()> {
            println!("{}", e?.path().display());
            Ok(())
        })?;
    // ...
    Ok(())
}

If you need to maintain state between the invocations of the closure, you can also use try_fold. Both methods are implemented by ParallelIterator, so you can use them with Rayon.
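As a rough illustration of the try_fold variant, here is a small sketch that carries a running total between closure invocations and stops at the first failure. The function name sum_parsed and the string inputs are made up for this example.

```rust
use std::num::ParseIntError;

/// Sums the parsed values, stopping at the first input that fails to parse.
/// `sum_parsed` is a hypothetical helper used only for illustration.
fn sum_parsed(inputs: &[&str]) -> Result<i64, ParseIntError> {
    inputs
        .iter()
        // The accumulator is the state kept between closure invocations.
        .try_fold(0i64, |acc, s| -> Result<i64, ParseIntError> {
            let n: i64 = s.parse()?; // an Err here ends the fold early
            Ok(acc + n)
        })
}
```

Calling sum_parsed(&["1", "2", "3"]) gives Ok(6), while sum_parsed(&["1", "x", "3"]) returns the parse error for "x" without touching "3".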

This approach requires that you control how the iterator is consumed. If that is done by code not under your control - for example, if you are passing the iterator to itertools::merge() or similar, you will need an adapter.

Iterator adapter: scan

The first attempt at stopping on error is to use take_while:

use std::{io, fs};

fn main() -> io::Result<()> {
    fs::read_dir("/")?
        .take_while(Result::is_ok)
        .map(Result::unwrap)
        .for_each(|e| println!("{}", e.path().display()));
    // ...
    Ok(())
}

This works, but we don't get any indication that an error occurred; the iteration just silently stops. It also requires the unsightly map(Result::unwrap), which makes it seem like the program will panic on error, which is in fact not the case because we stop on error.

Both issues can be fixed by switching from take_while to scan, a more powerful combinator that not only supports stopping the iteration, but passes its callback owned items, allowing the closure to extract the error to the caller:

use std::{io, fs};

fn main() -> io::Result<()> {
    let mut err = Ok(());
    fs::read_dir("/")?
        .scan(&mut err, |err, res| match res {
            Ok(o) => Some(o),
            Err(e) => {
                **err = Err(e);
                None
            }
        })
        .for_each(|e| println!("{}", e.path().display()));
    err?;
    // ...
    Ok(())
}

If needed in multiple places, the closure can be abstracted into a utility function:

fn until_err<T, E>(err: &mut &mut Result<(), E>, item: Result<T, E>) -> Option<T> {
    match item {
        Ok(item) => Some(item),
        Err(e) => {
            **err = Err(e);
            None
        }
    }
}

...in which case we can invoke it as .scan(&mut err, until_err) (playground).

These examples trivially exhaust the iterator with for_each(), but one can chain it with arbitrary manipulations, including Rayon's par_bridge(). Using scan() it is even possible to collect() the items into a container and have access to the items seen before the error, which is sometimes useful and unavailable when collecting into Result<Container, Error>.
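To make the last point concrete, the following sketch collects the items seen before the error and still reports the error afterwards. It reuses until_err as defined above; parse_prefix and its string inputs are hypothetical names chosen for this example.

```rust
use std::num::ParseIntError;

// Same helper as in the answer: stash the first error and end the iteration.
fn until_err<T, E>(err: &mut &mut Result<(), E>, item: Result<T, E>) -> Option<T> {
    match item {
        Ok(item) => Some(item),
        Err(e) => {
            **err = Err(e);
            None
        }
    }
}

/// Returns every value parsed before the first failure, plus the failure
/// itself (if any). `parse_prefix` is a hypothetical illustration.
fn parse_prefix(inputs: &[&str]) -> (Vec<i32>, Result<(), ParseIntError>) {
    let mut err = Ok(());
    let parsed: Vec<i32> = inputs
        .iter()
        .map(|s| s.parse::<i32>())
        .scan(&mut err, until_err) // yields Ok values, stops at first Err
        .collect();
    (parsed, err)
}
```

For the input ["1", "2", "x", "4"] this returns the vector [1, 2] together with an Err holding the parse error for "x"; collecting into Result<Vec<i32>, _> would have discarded the [1, 2] prefix.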


[1] Needing to use par_bridge() comes up when using Rayon to process streaming data in parallel:
fn process(input: impl BufRead + Send) -> Result<Output, Error> {
    let mut err = Ok(());
    let output = input
        .lines()
        .scan(&mut err, until_err)
        .par_bridge()
        .map(|line| ... executed in parallel ... )
        .reduce(|item| ... also executed in parallel ...);
    err?;
    ...
    Ok(output)
}

Again, an equivalent effect cannot be trivially achieved by collecting into Result.



as when you want to sum() [...] the Ok items — that's already implemented in the standard library, using the same technique as the process_results method in itertools.
Shepmaster

@Shepmaster I didn't know about process_results(), thanks. Its upside is that it doesn't require a separate error variable. Its downsides are that it's only available as a top-level function that calls you (possible issue when iterating over several things in parallel), and that it requires an external crate. The code in this answer is reasonably short, works with stdlib, and participates in iterator chaining.
user4815162342

1

This answer pertains to a pre-1.0 version of Rust and the required functions were removed

You can use std::result::fold function for this. It stops iterating after encountering the first Err.

An example program I just wrote:

fn main() {
  println!("{}", go([1, 2, 3]));
  println!("{}", go([1, -2, 3]));
}

fn go(v: &[int]) -> Result<Vec<int>, String> {
    std::result::fold(
        v.iter().map(|&n| is_positive(n)),
        vec![],
        |mut v, e| {
            v.push(e);
            v
        })
}

fn is_positive(n: int) -> Result<int, String> {
    if n > 0 {
        Ok(n)
    } else {
        Err(format!("{} is not positive!", n))
    }
}

Output:

Ok([1, 2, 3])
Err(-2 is not positive!)

Demo
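Since std::result::fold no longer exists, a rough equivalent on current stable Rust can be sketched with Iterator::try_fold. This is a port written for illustration, not part of the original answer; int is replaced by i32, and go now takes a slice reference.

```rust
fn is_positive(n: i32) -> Result<i32, String> {
    if n > 0 {
        Ok(n)
    } else {
        Err(format!("{} is not positive!", n))
    }
}

// Port of `go` from the pre-1.0 example, using try_fold in place of the
// removed std::result::fold. It stops at the first Err.
fn go(v: &[i32]) -> Result<Vec<i32>, String> {
    v.iter()
        .map(|&n| is_positive(n))
        .try_fold(Vec::new(), |mut acc, res| -> Result<Vec<i32>, String> {
            acc.push(res?); // `?` returns the first error from the closure
            Ok(acc)
        })
}
```

Here go(&[1, 2, 3]) returns Ok(vec![1, 2, 3]) and go(&[1, -2, 3]) returns Err("-2 is not positive!"), matching the output shown above.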

Licensed under cc by-sa 3.0 with attribution required.