Stream has no last() method:
Stream<T> stream;
T last = stream.last(); // No such method
What is the most elegant and/or efficient way to get the last element (or null for an empty Stream)?
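count(), yet Stream still has a count() method. In fact, that argument applies to any non-short-circuiting terminal operation on infinite streams.
last() method. There could be an April 1st survey on how it should be defined for infinite streams. I would propose: "It never returns, and it uses at least one processor core at 100%. On parallel streams, it is required to use all cores at 100%."
max() method: stream()...max(Comparator...).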
Answers:
Do a reduction that simply returns the current value:
Stream<T> stream;
T last = stream.reduce((a, b) -> b).orElse(null);
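O(n), even when divided by the number of CPU cores. Since the stream does not know what the reduction function does, it still has to evaluate every element.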
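The point about every element being evaluated can be checked with a small sketch. The class ReduceLastDemo and its helper lastAndCalls are hypothetical names introduced here for illustration; they are not part of the original answer. The helper counts how often the mapping function runs while reduce((a, b) -> b) finds the last element.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

public class ReduceLastDemo {
    // Hypothetical helper: returns {last element, number of mapper invocations}.
    public static long[] lastAndCalls(int n, boolean parallel) {
        AtomicLong calls = new AtomicLong();
        IntStream source = IntStream.range(0, n);
        if (parallel) {
            source = source.parallel();
        }
        Optional<Integer> last = source
                .mapToObj(i -> {
                    calls.incrementAndGet(); // stands in for a heavy operation
                    return i;
                })
                .reduce((a, b) -> b);        // keep the later of the two values
        return new long[] { last.orElse(-1), calls.get() };
    }

    public static void main(String[] args) {
        long[] sequential = lastAndCalls(1_000, false);
        long[] parallel = lastAndCalls(1_000, true);
        System.out.println(sequential[0] + " after " + sequential[1] + " mapper calls");
        System.out.println(parallel[0] + " after " + parallel[1] + " mapper calls");
    }
}
```

In both modes the mapper runs once per element; going parallel only spreads that work over more cores.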
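This heavily depends on the nature of the Stream. Keep in mind that "simple" does not necessarily mean "efficient". If you suspect the stream to be very large, to carry heavy operations, or to have a source that knows its size in advance, the following may be substantially more efficient than the simple solution: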
static <T> T getLast(Stream<T> stream) {
    Spliterator<T> sp = stream.spliterator();
    if (sp.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED)) {
        for (;;) {
            Spliterator<T> part = sp.trySplit();
            if (part == null) break;
            if (sp.getExactSizeIfKnown() == 0) {
                sp = part;
                break;
            }
        }
    }
    T value = null;
    for (Iterator<T> it = recursive(sp); it.hasNext(); )
        value = it.next();
    return value;
}
private static <T> Iterator<T> recursive(Spliterator<T> sp) {
    Spliterator<T> prev = sp.trySplit();
    if (prev == null) return Spliterators.iterator(sp);
    Iterator<T> it = recursive(sp);
    if (it != null && it.hasNext()) return it;
    return recursive(prev);
}
You can illustrate the difference with the following example:
String s = getLast(
    IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
);
System.out.println(s);
It will print:
potential heavy operation on 9999999
9999999
In other words, it did not perform the operation on the first 9999999 elements, but only on the last one.
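The approach relies on how Spliterator.trySplit() divides the work: it returns a spliterator covering a prefix of the elements, while the spliterator it was called on keeps the remaining suffix. A minimal sketch of that behaviour follows; TrySplitDemo and splitOnce are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class TrySplitDemo {
    // Hypothetical helper: splits once and returns
    // {size of the prefix, size of the suffix, first element of the suffix}.
    public static long[] splitOnce(List<Integer> list) {
        Spliterator<Integer> suffix = list.spliterator();
        Spliterator<Integer> prefix = suffix.trySplit(); // null if no split is possible
        long prefixSize = prefix == null ? 0 : prefix.getExactSizeIfKnown();
        long suffixSize = suffix.getExactSizeIfKnown();
        int[] first = { -1 };
        suffix.tryAdvance(v -> first[0] = v);            // peek at where the suffix starts
        return new long[] { prefixSize, suffixSize, first[0] };
    }

    public static void main(String[] args) {
        long[] r = splitOnce(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        System.out.println("prefix size " + r[0] + ", suffix size " + r[1]
                + ", suffix starts at " + r[2]);
    }
}
```

Because the suffix stays in the original spliterator, repeatedly calling trySplit() and discarding the returned prefix narrows the work down to the tail of the stream.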
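hasCharacteristics() block? What value does it add that is not already covered by the recursive() method? The latter already navigates to the last split point. Moreover, recursive() can never return null, so you can remove the it != null check.
SUBSIZED stream, which can guarantee non-empty split halves, so we never need to go back to the left side. Note that in this case recursive will not actually recurse, because trySplit has already been shown to return null.
null-check stems from an earlier version, but I later found that for non-SUBSIZED streams you have to handle possibly empty split parts, i.e. you have to iterate to find out whether a part has values. So I moved the Spliterators.iterator(…) call into the recursive method, to be able to back up to the left side if the right side is empty. The loop is still the preferred operation.
parallel(), because this may actually perform some operations (such as sorting) in parallel, unexpectedly consuming more CPU cores.
.parallel(), but indeed, it can have an effect on sorted() or distinct(). I don't think any of the other intermediate operations would be affected…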
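Whether the fast path applies depends on the characteristics the stream's spliterator reports. The following sketch checks a few common sources; CharacteristicsDemo and sizedAndSubsized are hypothetical names, and the results describe the standard JDK collections and stream implementation rather than anything guaranteed by the answer above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Spliterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class CharacteristicsDemo {
    // Hypothetical helper: true only if the stream reports both SIZED and SUBSIZED.
    public static boolean sizedAndSubsized(Stream<?> stream) {
        return stream.spliterator()
                .hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }

    public static void main(String[] args) {
        // A range keeps its exact size through a one-to-one mapping such as boxed().
        System.out.println(sizedAndSubsized(IntStream.range(0, 10).boxed()));
        // filter() may drop elements, so the size is no longer known.
        System.out.println(sizedAndSubsized(Stream.of(1, 2, 3).filter(i -> i > 0)));
        // A HashSet knows its total size, but not the size of its split parts.
        System.out.println(sizedAndSubsized(new HashSet<>(Arrays.asList(1, 2, 3)).stream()));
    }
}
```

Only the first stream qualifies for the sized loop; the other two go through the fallback that has to iterate.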
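This is just a refactoring of Holger's answer. The code, while fantastic, is a bit hard to read and understand, especially for people who were not C programmers before coming to Java. Hopefully my refactored example class is easier to follow for those who are not familiar with spliterators, what they do, or how they work.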
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Predicate;
import java.util.stream.LongStream;
import java.util.stream.Stream;

public class LastElementFinderExample {
    public static void main(String[] args) {
        String s = getLast(
            LongStream.range(0, 10_000_000_000L).mapToObj(i -> {
                System.out.println("potential heavy operation on " + i);
                return String.valueOf(i);
            }).parallel()
        );
        System.out.println(s);
    }
    public static <T> T getLast(Stream<T> stream) {
        Spliterator<T> sp = stream.spliterator();
        if (isSized(sp)) {
            sp = getLastSplit(sp);
        }
        return getIteratorLastValue(getLastIterator(sp));
    }
    private static boolean isSized(Spliterator<?> sp) {
        return sp.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }
    // Sized case: keep splitting the suffix; switch to the prefix when the suffix is empty.
    private static <T> Spliterator<T> getLastSplit(Spliterator<T> sp) {
        return splitUntil(sp, s -> s.getExactSizeIfKnown() == 0);
    }
    // trySplit() returns the prefix and leaves the suffix in sp, so search the
    // suffix first and fall back to the prefix only if the suffix has no elements.
    private static <T> Iterator<T> getLastIterator(Spliterator<T> sp) {
        Spliterator<T> prefix = sp.trySplit();
        if (prefix == null) return Spliterators.iterator(sp);
        Iterator<T> it = getLastIterator(sp);
        return it.hasNext() ? it : getLastIterator(prefix);
    }
    private static <T> T getIteratorLastValue(Iterator<T> it) {
        T result = null;
        while (it.hasNext()) {
            result = it.next();
        }
        return result;
    }
    private static <T> Spliterator<T> splitUntil(Spliterator<T> sp, Predicate<Spliterator<T>> condition) {
        Spliterator<T> result = sp;
        for (Spliterator<T> part = sp.trySplit(); part != null; part = result.trySplit()) {
            if (condition.test(result)) {
                result = part;
            }
        }
        return result;
    }
}
Guava has Streams.findLast:
Stream<T> stream;
Optional<T> last = Streams.findLast(stream);
reduce((a, b) -> b)
uses Spliterator.trySplit
Here is another solution (not very efficient):
List<String> list = Arrays.asList("abc","ab","cc");
long count = list.stream().count();
list.stream().skip(count-1).findFirst().ifPresent(System.out::println);
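substream method, and even if there were one, this would not work, because count is a terminal operation. So what's the story behind this?
count==0 first, because Stream.skip does not like -1 as input. Besides that, the question did not say that you can acquire the Stream twice. Nor did it say that acquiring the Stream twice is guaranteed to yield the same number of elements.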
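A sketch of the count-then-skip idea with the empty case guarded, for sources that can be streamed more than once. SkipLastDemo, lastOf and skipRejectsNegative are hypothetical names; the sketch assumes a Collection whose contents do not change between the two traversals, which is exactly the assumption the comment above questions for a general Stream.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Optional;
import java.util.stream.Stream;

public class SkipLastDemo {
    // Hypothetical helper: works only because a Collection can hand out a fresh
    // stream for each traversal. The last element must not be null, because
    // findFirst() rejects null results.
    public static <T> Optional<T> lastOf(Collection<T> source) {
        long count = source.stream().count();
        if (count == 0) {
            return Optional.empty();             // avoids calling skip(-1)
        }
        return source.stream().skip(count - 1).findFirst();
    }

    // Shows why the guard is needed: skip() rejects negative arguments.
    public static boolean skipRejectsNegative() {
        try {
            Stream.of("a").skip(-1);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lastOf(Arrays.asList("abc", "ab", "cc")));
        System.out.println(lastOf(Arrays.<String>asList()));
        System.out.println(skipRejectsNegative());
    }
}
```

This still traverses the source twice, so it remains less efficient than a single reduction.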
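Parallel unsized streams combined with skip are tricky, and @Holger's implementation gives a wrong answer for them. @Holger's implementation is also a bit slower because it uses iterators.
An optimisation of @Holger's answer: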
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();
    Spliterator<? extends T> lastSpliterator = spliterator;
    // Note that this method does not work well with unsized parallel
    // streams combined with skip(); in that case it answers Optional.empty().
    // Find the last spliterator when the size estimate is unknown.
    // Meaningful only on unsized parallel streams.
    if (spliterator.estimateSize() == Long.MAX_VALUE) {
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null; prev = spliterator.trySplit()) {
            lastSpliterator = prev;
        }
    }
    // Find the last spliterator on sized streams.
    // Meaningful only on parallel streams (note that unsized was turned into sized above).
    for (Spliterator<? extends T> prev = lastSpliterator.trySplit(); prev != null; prev = lastSpliterator.trySplit()) {
        if (lastSpliterator.estimateSize() == 0) {
            lastSpliterator = prev;
            break;
        }
    }
    // Find the last element of the last spliterator.
    // Parallel streams perform the operation on one element only.
    AtomicReference<T> last = new AtomicReference<>();
    lastSpliterator.forEachRemaining(last::set);
    return Optional.ofNullable(last.get());
}
Unit tests using JUnit 5:
@Test
@DisplayName("last sequential sized")
void last_sequential_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}
@Test
@DisplayName("last sequential unsized")
void last_sequential_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}
@Test
@DisplayName("last parallel sized")
void last_parallel_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}
@Test
@DisplayName("getLast parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}
@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    // Unfortunately, unsized parallel streams do not work well with skip
    //assertThat(Streams.last(stream)).hasValue(expected);
    //assertThat(count).hasValue(1);
    // @Holger's implementation gives a wrong answer!
    //assertThat(Streams.getLast(stream)).hasValue(9_950_000L); //!!!
    //assertThat(count).hasValue(1);
    // This is not a very good answer either
    assertThat(Streams.last(stream)).isEmpty();
    assertThat(count).hasValue(0);
}
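The only solution that supports both cases is to avoid looking for the last spliterator on unsized parallel streams. The consequence is that the solution performs the operations on all elements, but it always gives the right answer.
Note that on sequential streams it will always perform the operations on all elements anyway.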
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();
    // Find the last spliterator with estimate size (sized parallel streams)
    if (spliterator.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED)) {
        // Find the last spliterator on sized streams (parallel streams)
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null; prev = spliterator.trySplit()) {
            if (spliterator.getExactSizeIfKnown() == 0) {
                spliterator = prev;
                break;
            }
        }
    }
    // Find the last element of the spliterator
    //AtomicReference<T> last = new AtomicReference<>();
    //spliterator.forEachRemaining(last::set);
    //return Optional.ofNullable(last.get());
    // A better one that supports native parallel streams
    return (Optional<T>) StreamSupport.stream(spliterator, stream.isParallel())
            .reduce((a, b) -> b);
}
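As for the unit tests of this implementation, the first three tests are exactly the same (sequential, and sized parallel). The tests for unsized parallel streams are here: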
@Test
@DisplayName("last parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(10_000_000L);
}
@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}
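StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel()), the detour via Iterable yields a spliterator with no characteristics at all, in other words, it creates an unordered stream. So the result has nothing to do with parallel or with using skip, but with the fact that "last" has no meaning for an unordered stream, so any element is a valid result.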
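The remark about the Iterable detour can be checked directly, since the default Iterable.spliterator() reports no characteristics. The sketch below also shows one possible way to keep the encounter order when deliberately dropping the size information; IterableDetourDemo, detourCharacteristics and orderedUnsized are hypothetical names and not part of the tests above.

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.LongStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class IterableDetourDemo {
    // The default Iterable.spliterator() wraps the iterator without any characteristics.
    public static int detourCharacteristics(Stream<Long> stream) {
        Iterable<Long> iterable = stream::iterator;
        return iterable.spliterator().characteristics();
    }

    // Hypothetical alternative: an unsized copy that still declares ORDERED,
    // so "last" remains well defined.
    public static <T> Stream<T> orderedUnsized(Stream<T> stream) {
        Spliterator<T> sp =
                Spliterators.spliteratorUnknownSize(stream.iterator(), Spliterator.ORDERED);
        return StreamSupport.stream(sp, stream.isParallel());
    }

    public static void main(String[] args) {
        System.out.println(detourCharacteristics(LongStream.rangeClosed(1, 10).boxed()));
        Stream<Long> ordered = orderedUnsized(LongStream.rangeClosed(1, 10).boxed().parallel());
        System.out.println(ordered.reduce((a, b) -> b));
    }
}
```

With ORDERED declared, a parallel reduce((a, b) -> b) has to respect the encounter order, so an unsized test built this way would exercise the intended case.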
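We needed last on a Stream in production - I am still not sure we really did, but various members of my team said we did, for various "reasons". I ended up writing something like this: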
private static class Holder<T> implements Consumer<T> {
    T t = null;
    // needed because null elements can be valid values
    boolean set = false;
    @Override
    public void accept(T t) {
        this.t = t;
        set = true;
    }
}
/**
 * when a Stream is SUBSIZED, it means that all children (direct or not) are also SIZED and SUBSIZED;
 * meaning we know their size "always" no matter how many splits are there from the initial one.
 * <p>
 * when a Stream is SIZED, it means that we know its current size, but nothing about its "children",
 * a Set for example.
 */
private static <T> Optional<Optional<T>> last(Stream<T> stream) {
    Spliterator<T> suffix = stream.spliterator();
    // nothing left to do here
    if (suffix.getExactSizeIfKnown() == 0) {
        return Optional.empty();
    }
    return Optional.of(Optional.ofNullable(compute(suffix, new Holder<>())));
}
private static <T> T compute(Spliterator<T> sp, Holder<T> holder) {
    Spliterator<T> s;
    while (true) {
        Spliterator<T> prefix = sp.trySplit();
        // we can't split any further
        // BUT don't look at: prefix.getExactSizeIfKnown() == 0 because this
        // does not mean that suffix can't be split even more further down
        if (prefix == null) {
            s = sp;
            break;
        }
        // if prefix is known to have no elements, just drop it and continue with suffix
        if (prefix.getExactSizeIfKnown() == 0) {
            continue;
        }
        // if suffix has no elements, try to split prefix further
        if (sp.getExactSizeIfKnown() == 0) {
            sp = prefix;
        }
        // after a split, a stream that is not SUBSIZED can give birth to a spliterator that is
        if (sp.hasCharacteristics(Spliterator.SUBSIZED)) {
            return compute(sp, holder);
        } else {
            // if we don't know the size of suffix or prefix, just try to walk them individually,
            // starting from suffix, and see if we find our "last" there
            T suffixResult = compute(sp, holder);
            if (!holder.set) {
                return compute(prefix, holder);
            }
            return suffixResult;
        }
    }
    s.forEachRemaining(holder::accept);
    return holder.t;
}
And some usages of it:
Stream<Integer> st = Stream.concat(Stream.of(1, 2), Stream.empty());
System.out.println(2 == last(st).get().get());
st = Stream.concat(Stream.empty(), Stream.of(1, 2));
System.out.println(2 == last(st).get().get());
st = Stream.concat(Stream.iterate(0, i -> i + 1), Stream.of(1, 2, 3));
System.out.println(3 == last(st).get().get());
st = Stream.concat(Stream.iterate(0, i -> i + 1).limit(0), Stream.iterate(5, i -> i + 1).limit(3));
System.out.println(7 == last(st).get().get());
st = Stream.concat(Stream.iterate(5, i -> i + 1).limit(3), Stream.iterate(0, i -> i + 1).limit(0));
System.out.println(7 == last(st).get().get());
String s = last(
    IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
).get().get();
System.out.println(s.equalsIgnoreCase("9999999"));
st = Stream.empty();
System.out.println(last(st).isEmpty());
st = Stream.of(1, 2, 3, 4, null);
System.out.println(last(st).get().isEmpty());
st = Stream.of((Integer) null);
System.out.println(last(st).isPresent());
IntStream is = IntStream.range(0, 4).filter(i -> i != 3);
System.out.println(last(is.boxed()));
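First, the return type Optional<Optional<T>> - I agree, it looks weird. If the outer Optional is empty, it means there are no elements in the Stream; if the inner Optional is empty, it means the last element was actually null, i.e. Stream.of(1, 2, 3, null) (unlike guava's Streams::findLast, which throws an Exception in such a case).
I admit my main inspiration came from Holger's answer to a similar question of mine and from guava's Streams::findLast.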
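To make the nested return type concrete, here is a sketch of how a caller might tell the three cases apart. It uses a simplified, purely sequential stand-in called lastNullable instead of the spliterator-based method above; NestedOptionalDemo, lastNullable and describe are hypothetical names introduced for illustration.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.stream.Stream;

public class NestedOptionalDemo {
    // Simplified sequential stand-in with the same return convention:
    // outer empty = no elements, inner empty = last element is null.
    public static <T> Optional<Optional<T>> lastNullable(Stream<T> stream) {
        Iterator<T> it = stream.iterator();
        if (!it.hasNext()) {
            return Optional.empty();
        }
        T value = null;
        while (it.hasNext()) {
            value = it.next();
        }
        return Optional.of(Optional.ofNullable(value));
    }

    // Turns the nested result into a readable description.
    public static <T> String describe(Optional<Optional<T>> result) {
        if (!result.isPresent()) {
            return "empty stream";
        }
        return result.get().map(v -> "last = " + v).orElse("last element is null");
    }

    public static void main(String[] args) {
        System.out.println(describe(lastNullable(Stream.<Integer>empty())));
        System.out.println(describe(lastNullable(Stream.of(1, 2, 3, null))));
        System.out.println(describe(lastNullable(Stream.of(1, 2))));
    }
}
```

The outer Optional answers "was there any element?", and the inner one answers "was that element non-null?".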
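Stream, you may want to reconsider your design and whether you really want to use a Stream. Streams are not necessarily ordered or finite. If your Stream is unordered, infinite, or both, the last element has no meaning. In my mind, the purpose of a Stream is to provide a layer of abstraction between data and how you process it. As such, a Stream itself does not need to know anything about the relative order of its elements. Finding the last element in a Stream is O(n). If you have a different data structure, it could be O(1).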
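As an illustration of that last point, some data structures expose their last element directly, without any traversal. This is a minimal sketch; DirectLastDemo and its two helpers are hypothetical names, and the constant-time claim assumes a random-access List such as ArrayList and a Deque such as ArrayDeque.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class DirectLastDemo {
    // Constant time for random-access lists such as ArrayList; null if the list is empty.
    public static <T> T lastOfList(List<T> list) {
        return list.isEmpty() ? null : list.get(list.size() - 1);
    }

    // Deque offers the last element directly; peekLast() returns null if it is empty.
    public static <T> T lastOfDeque(Deque<T> deque) {
        return deque.peekLast();
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("abc", "ab", "cc"));
        Deque<String> deque = new ArrayDeque<>(list);
        System.out.println(lastOfList(list));
        System.out.println(lastOfDeque(deque));
    }
}
```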