Reading a stream twice


127

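How do you read the same input stream twice? Is it possible to copy it somehow?

I need to get an image from the web, save it locally, and then return the saved image. I just thought it would be faster to use the same stream instead of starting a new stream for the downloaded content and then reading it again.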


1
Maybe use mark and reset
Vyacheslav Shylkin

Answers:


113

You can use org.apache.commons.io.IOUtils.copy to copy the contents of the InputStream to a byte array, and then read from the byte array repeatedly using a ByteArrayInputStream. For example:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
org.apache.commons.io.IOUtils.copy(in, baos);
byte[] bytes = baos.toByteArray();

// either
while (needToReadAgain) {
    ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
    yourReadMethodHere(bais);
}

// or
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
while (needToReadAgain) {
    bais.reset();
    yourReadMethodHere(bais);
}
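
If Commons IO is not available, the same idea can be sketched with the JDK alone. This is a minimal sketch assuming Java 9 or later (for InputStream.transferTo and readAllBytes); the class name BufferOnce and the method name drain are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
final class BufferOnce {
    /** Drains the stream into memory; only suitable when the content is known to be small. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        in.transferTo(baos); // Java 9+: copies until end of stream
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        byte[] bytes = drain(source);

        // Each pass gets its own independent stream over the same bytes.
        String first = new String(new ByteArrayInputStream(bytes).readAllBytes(), StandardCharsets.UTF_8);
        String second = new String(new ByteArrayInputStream(bytes).readAllBytes(), StandardCharsets.UTF_8);
        System.out.println(first + " / " + second);
    }
}
```

The original stream is consumed exactly once; every later read goes through a fresh ByteArrayInputStream over the buffered bytes.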

1
I think this is the only valid solution, since mark is not supported by all stream types.
Warpzit 2012

3
@Paul Grime: IOUtils.toByteArray also calls the copy method internally.
Ankit 2012

4
As @Ankit said, this solution does not work for me, because the input is read internally and cannot be reused.
Xtreme Biker 2014

30
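I know this comment is out of date, but in the first option, if you read the input stream into a byte array, doesn't that mean you are loading all of the data into memory? That could be a big problem if you are loading something like large files.
jaxkodex 2015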
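
For the large-content concern raised in this comment, one possible alternative is to spool the stream to a temporary file and reopen that file for each pass. The sketch below is not from the answer; SpoolToFile and spool are hypothetical names, and it assumes Java 9+ for readAllBytes in the demo:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of any library.
final class SpoolToFile {
    /** Copies the stream to a temporary file so that it can be reopened any number of times. */
    static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("spool-", ".bin");
        // createTempFile already created the file, so the copy has to replace it.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = spool(new ByteArrayInputStream(new byte[] {1, 2, 3, 4}));
        try {
            for (int pass = 1; pass <= 2; pass++) {
                try (InputStream again = Files.newInputStream(tmp)) {
                    System.out.println("pass " + pass + ": " + again.readAllBytes().length + " bytes");
                }
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The copy itself streams through a small internal buffer, so memory use stays bounded regardless of the content size; the cost is disk I/O.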

2
You can use IOUtils.toByteArray(InputStream) to get the byte array in a single call.
useful

30

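Depending on where the InputStream comes from, you may not be able to reset it. You can check whether mark() and reset() are supported by calling markSupported().

If they are, you can call reset() on the InputStream to return to the beginning (more precisely, to the most recently marked position). If not, you need to read the InputStream from the source again.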
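
The check described above could look roughly like this. It is a sketch only: MarkCheck and readThenRewind are hypothetical names, and readNBytes(int) assumes Java 11 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, not part of any library.
final class MarkCheck {
    /**
     * Reads up to n bytes and, if the stream supports it, rewinds so that the
     * same bytes can be read again. Returns true when the stream was rewound.
     */
    static boolean readThenRewind(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            in.readNBytes(n); // consumed for good; the caller must reopen the source
            return false;
        }
        in.mark(n);           // remember the current position for up to n bytes
        in.readNBytes(n);
        in.reset();           // back to the marked position
        return true;
    }

    public static void main(String[] args) throws IOException {
        InputStream markable = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(readThenRewind(markable, 2) + ", next byte: " + markable.read());

        // A bare InputStream subclass inherits markSupported() == false.
        InputStream plain = new InputStream() {
            private int next = 1;
            @Override public int read() { return next <= 3 ? next++ : -1; }
        };
        System.out.println(readThenRewind(plain, 2) + ", next byte: " + plain.read());
    }
}
```

With the markable stream the next byte after the call is 1 again; with the plain stream the first two bytes are gone and the next byte is 3.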


1
InputStream does not support 'mark': you can call mark on an IS, but it does nothing. Likewise, calling reset on an IS will throw an exception.
ayahuasca

4
@ayahuasca InputStream subclasses like BufferedInputStream do support 'mark'
梅德诺维奇

10

If your InputStream supports mark, you can call mark() on it and later call reset(). If your InputStream does not support mark, you can use the class java.io.BufferedInputStream and wrap your stream inside a BufferedInputStream like this:

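    // The underlying stream does not need to support mark(); the wrapper adds it.
    InputStream bufferedInputStream = new BufferedInputStream(yourInputStream);
    // some_value is the read limit: how many bytes may be read before the mark becomes invalid
    bufferedInputStream.mark(some_value);
    // read your bufferedInputStream
    bufferedInputStream.reset();
    // read it again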
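
A runnable variant of the snippet above, as a sketch: it deliberately uses a 4-byte initial buffer and a 16-byte read limit to illustrate that the value passed to mark() governs how far back reset() can go. BufferedRewind and readTwice are hypothetical names, and readNBytes(int) assumes Java 11 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper class, not part of any library.
final class BufferedRewind {
    /** Reads the first n bytes twice through a BufferedInputStream and returns both reads. */
    static byte[][] readTwice(InputStream source, int n) throws IOException {
        // Tiny initial buffer on purpose; the read limit given to mark() is larger.
        BufferedInputStream in = new BufferedInputStream(source, 4);
        in.mark(n);                      // allow rewinding after up to n bytes
        byte[] first = in.readNBytes(n);
        in.reset();                      // rewind to the marked position
        byte[] second = in.readNBytes(n);
        return new byte[][] {first, second};
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[32];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        byte[][] reads = readTwice(new ByteArrayInputStream(data), 16);
        System.out.println(Arrays.equals(reads[0], reads[1])); // both passes see the same 16 bytes
    }
}
```

Reading more than the read limit before calling reset() is not covered here; in that case reset() may throw an IOException.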

1
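A buffered input stream can only mark back as far as its buffer size, so if the source does not fit, you cannot go all the way back to the start.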
L. Blanc

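@L.Blanc sorry, but that does not seem to be correct. Take a look at the "grow buffer" part of BufferedInputStream.fill(), where the new buffer size is only compared with marklimit and MAX_BUFFER_SIZE.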
eugene82 '19

8

You can wrap the input stream with a PushbackInputStream. PushbackInputStream lets you unread ("write back") bytes that have already been read, so you can do this:

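import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class StreamTest {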
  public static void main(String[] args) throws IOException {
    byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    InputStream originalStream = new ByteArrayInputStream(bytes);

    byte[] readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 1 2 3

    readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 4 5 6

    // now let's wrap it with PushBackInputStream

    originalStream = new ByteArrayInputStream(bytes);

    InputStream wrappedStream = new PushbackInputStream(originalStream, 10); // 10 means that at most 10 bytes can be "written back" to the stream

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3

    ((PushbackInputStream) wrappedStream).unread(readBytes, 0, readBytes.length);

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3


  }

  private static byte[] getBytes(InputStream is, int howManyBytes) throws IOException {
    System.out.print("Reading stream: ");

    byte[] buf = new byte[howManyBytes];

    int next = 0;
    for (int i = 0; i < howManyBytes; i++) {
      next = is.read();
      if (next >= 0) { // read() returns -1 at end of stream; 0 is a valid byte value
        buf[i] = (byte) next;
      }
    }
    return buf;
  }

  private static void printBytes(byte[] buffer) {
    for (int i = 0; i < buffer.length; i++) {
      System.out.print(buffer[i] + " ");
    }
    System.out.println();
  }


}

Note that PushbackInputStream keeps an internal byte buffer, so it really does create a buffer in memory that holds the bytes "written back".
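
A typical use of this technique is peeking at a file header before handing the stream to a parser. The sketch below checks the first four bytes of the PNG signature and pushes them back; PeekHeader and looksLikePng are hypothetical names, and readNBytes(int) assumes Java 11 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.Arrays;

// Hypothetical helper class, not part of any library.
final class PeekHeader {
    // First four bytes of the PNG signature.
    private static final byte[] PNG_MAGIC = {(byte) 0x89, 'P', 'N', 'G'};

    /** Looks at the first four bytes without consuming them. */
    static boolean looksLikePng(PushbackInputStream in) throws IOException {
        byte[] head = in.readNBytes(PNG_MAGIC.length);
        in.unread(head); // push back exactly what was read
        return Arrays.equals(head, PNG_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0x89, 'P', 'N', 'G', 13, 10, 26, 10};
        // The pushback buffer must be at least as large as the header being peeked at.
        PushbackInputStream in =
            new PushbackInputStream(new ByteArrayInputStream(data), PNG_MAGIC.length);
        System.out.println(looksLikePng(in));          // the header is inspected...
        System.out.println(in.readAllBytes().length);  // ...and all 8 bytes are still readable
    }
}
```

Only the peeked bytes are held in memory, which is the main difference from buffering the whole stream.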

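Knowing this approach, we can go further and combine it with FilterInputStream. FilterInputStream stores the original input stream as a delegate. This makes it possible to define a new class that "unreads" the original data automatically. The class is defined as follows: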

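import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class TryReadInputStream extends FilterInputStream {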
  private final int maxPushbackBufferSize;

  /**
  * Creates a <code>TryReadInputStream</code> that wraps the argument
  * <code>in</code> in a <code>PushbackInputStream</code> and assigns it
  * to the field <code>this.in</code>, so that bytes read by
  * <code>tryRead</code> can be pushed back.
  *
  * @param in the underlying input stream
  * @param maxPushbackBufferSize the maximum number of bytes that a single
  *           <code>tryRead</code> call may read and push back
  */
  public TryReadInputStream(InputStream in, int maxPushbackBufferSize) {
    super(new PushbackInputStream(in, maxPushbackBufferSize));
    this.maxPushbackBufferSize = maxPushbackBufferSize;
  }

  /**
   * Reads <code>length</code> bytes from the input stream into the given buffer. The bytes read are still available
   * in the stream afterwards.
   *
   * @param buffer the destination buffer into which the data is read
   * @param offset the start offset in the destination <code>buffer</code>
   * @param length how many bytes to read from the stream into <code>buffer</code>. Must not be greater than
   *        <code>maxPushbackBufferSize</code>, or an IOException is thrown
   *
   * @return number of bytes read
   * @throws java.io.IOException in case length is greater than <code>maxPushbackBufferSize</code>
   */
  public int tryRead(byte[] buffer, int offset, int length) throws IOException {
    validateMaxLength(length);

    // NOTE: below reading byte by byte instead of "int bytesRead = is.read(firstBytes, 0, maxBytesOfResponseToLog);"
    // because read() guarantees to read a byte

    int bytesRead = 0;

    int nextByte = 0;

    for (int i = 0; (i < length) && (nextByte >= 0); i++) {
      nextByte = read();
      if (nextByte >= 0) {
        buffer[offset + bytesRead++] = (byte) nextByte;
      }
    }

    if (bytesRead > 0) {
      ((PushbackInputStream) in).unread(buffer, offset, bytesRead);
    }

    return bytesRead;

  }

  public byte[] tryRead(int maxBytesToRead) throws IOException {
    validateMaxLength(maxBytesToRead);

    ByteArrayOutputStream baos = new ByteArrayOutputStream(); // as ByteArrayOutputStream to dynamically allocate internal bytes array instead of allocating possibly large buffer (if maxBytesToRead is large)

    // NOTE: below reading byte by byte instead of "int bytesRead = is.read(firstBytes, 0, maxBytesOfResponseToLog);"
    // because read() guarantees to read a byte

    int nextByte = 0;

    for (int i = 0; (i < maxBytesToRead) && (nextByte >= 0); i++) {
      nextByte = read();
      if (nextByte >= 0) {
        baos.write((byte) nextByte);
      }
    }

    byte[] buffer = baos.toByteArray();

    if (buffer.length > 0) {
      ((PushbackInputStream) in).unread(buffer, 0, buffer.length);
    }

    return buffer;

  }

  private void validateMaxLength(int length) throws IOException {
    if (length > maxPushbackBufferSize) {
      throw new IOException(
        "Trying to read more bytes than maxBytesToRead. Max bytes: " + maxPushbackBufferSize + ". Trying to read: " +
        length);
    }
  }

}

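This class has two methods. The first reads into an existing buffer (its signature is analogous to public int read(byte b[], int off, int len) of the InputStream class). The second returns a new buffer (this may be more efficient if the size of the buffer to read is not known in advance).

Now let's see our class in action: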

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamTest2 {
  public static void main(String[] args) throws IOException {
    byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    InputStream originalStream = new ByteArrayInputStream(bytes);

    byte[] readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 1 2 3

    readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 4 5 6

    // now let's use our TryReadInputStream

    originalStream = new ByteArrayInputStream(bytes);

    InputStream wrappedStream = new TryReadInputStream(originalStream, 10);

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); // NOTE: no manual call to "unread"(!) because TryReadInputStream handles this internally
    printBytes(readBytes); // prints 1 2 3

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); 
    printBytes(readBytes); // prints 1 2 3

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3);
    printBytes(readBytes); // prints 1 2 3

    // we can also call normal read which will actually read the bytes without "writing them back"
    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 4 5 6

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); // now we can try read next bytes
    printBytes(readBytes); // prints 7 8 9

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); 
    printBytes(readBytes); // prints 7 8 9


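  }

  // getBytes(...) and printBytes(...) are the helper methods defined in StreamTest above;
  // copy them into this class (or make them non-private there) so that this example compiles.
}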

5

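If you are using an implementation of InputStream, you can check the result of InputStream#markSupported(), which tells you whether or not you can use the mark() / reset() methods.

If you can mark the stream when you read it, call reset() to go back to the start.

If you can't, you will have to open the stream again.

Another solution is to convert the InputStream to a byte array and then iterate over the array as many times as you need. You can find several solutions in this post, Convert InputStream to byte array in Java, with or without third-party libraries. Note that if the content read is too large, you may run into memory problems.

Finally, if what you need is to read an image, use: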

BufferedImage image = ImageIO.read(new URL("http://www.example.com/images/toto.jpg"));

Using ImageIO#read(java.net.URL) also lets you use the cache.
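
For the use case in the question (download once, save locally, return the image), the byte-array approach can be sketched as follows. This is an illustration under assumptions, not the answer's code: SaveAndDecode and saveAndDecode are hypothetical names, the whole image is held in memory, and Java 9+ is assumed for readAllBytes. The demo generates a small PNG in memory instead of downloading one.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

// Hypothetical helper class, not part of any library.
final class SaveAndDecode {
    /** Reads the stream once, writes the bytes to target, and decodes the image from the same bytes. */
    static BufferedImage saveAndDecode(InputStream in, Path target) throws IOException {
        byte[] bytes = in.readAllBytes(); // holds the whole image in memory
        Files.write(target, bytes);
        // ImageIO.read returns null if no registered reader understands the format.
        return ImageIO.read(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the downloaded content: a 2x3 PNG generated in memory.
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        ImageIO.write(new BufferedImage(2, 3, BufferedImage.TYPE_INT_RGB), "png", png);

        Path target = Files.createTempFile("image-", ".png");
        BufferedImage image = saveAndDecode(new ByteArrayInputStream(png.toByteArray()), target);
        System.out.println(image.getWidth() + "x" + image.getHeight()
            + ", saved " + Files.size(target) + " bytes");
        Files.deleteIfExists(target);
    }
}
```

The network stream is read exactly once; both the saved file and the decoded image come from the same byte array.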


1
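A word of warning when using ImageIO#read(java.net.URL): some web servers and CDNs may reject the bare calls made by ImageIO#read (i.e. calls without a User-Agent that makes the server believe the request comes from a web browser). In that case, using URL.openConnection(), setting the user agent on that connection, and then using ImageIO.read(InputStream) will do the trick most of the time.
Clint Eastwood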
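
The workaround from this comment might look like the sketch below. ImageWithUserAgent is a hypothetical name, and the User-Agent value is only a placeholder; which value a particular server accepts is not something this sketch can promise.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import javax.imageio.ImageIO;

// Hypothetical helper class, not part of any library.
final class ImageWithUserAgent {
    /** Opens the URL with an explicit User-Agent header and decodes the response as an image. */
    static BufferedImage read(URL url, String userAgent) throws IOException {
        URLConnection connection = url.openConnection();
        // Request properties must be set before the connection is actually opened.
        connection.setRequestProperty("User-Agent", userAgent);
        try (InputStream in = connection.getInputStream()) {
            return ImageIO.read(in); // null if the content is not a supported image format
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the image URL as the first argument.
        BufferedImage image = read(URI.create(args[0]).toURL(), "Mozilla/5.0");
        System.out.println(image == null ? "not an image" : image.getWidth() + "x" + image.getHeight());
    }
}
```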

InputStream is not an interface
Brice

3

How about:

if (!stream.markSupported()) {
    // Replace the stream object with an in-memory copy of its contents
    // (IOUtils is org.apache.commons.io.IOUtils).
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    IOUtils.copy(stream, baos);
    stream.close();
    stream = new ByteArrayInputStream(baos.toByteArray());
    // now the stream supports 'mark' and 'reset'
}

5
That's a terrible idea. You put the whole stream contents in memory like that.
Niels Doucet

3

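To split an InputStream in two, while avoiding loading all the data into memory, and then process the two halves independently:

  1. Create a couple of OutputStreams, specifically PipedOutputStream.
  2. Connect each PipedOutputStream to a PipedInputStream; these PipedInputStreams are the returned InputStreams.
  3. Connect the source InputStream to the OutputStreams just created, so that everything read from the source InputStream is written to both OutputStreams. There is no need to implement that, because it is already done in TeeInputStream (commons.io).
  4. In a separate thread, read the whole source InputStream; the input data is implicitly transferred to the target InputStreams.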

    public static final List<InputStream> splitInputStream(InputStream input) 
        throws IOException 
    { 
        Objects.requireNonNull(input);      
    
        PipedOutputStream pipedOut01 = new PipedOutputStream();
        PipedOutputStream pipedOut02 = new PipedOutputStream();
    
        List<InputStream> inputStreamList = new ArrayList<>();
        inputStreamList.add(new PipedInputStream(pipedOut01));
        inputStreamList.add(new PipedInputStream(pipedOut02));
    
        TeeOutputStream tout = new TeeOutputStream(pipedOut01, pipedOut02);
    
        TeeInputStream tin = new TeeInputStream(input, tout, true);
    
        Executors.newSingleThreadExecutor().submit(tin::readAllBytes);  
    
        return Collections.unmodifiableList(inputStreamList);
    }

Remember to close the InputStreams once you have consumed them, and to shut down the thread that runs TeeInputStream.readAllBytes().
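
To make the cleanup concrete, here is a JDK-only sketch of the same pipe-based split in which the caller owns the executor and shuts it down. It is an assumption-laden illustration, not the code above: PipedSplit, split and readAndClose are hypothetical names, a hand-written copy loop stands in for TeeInputStream, and Java 9+ is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class, not part of any library.
final class PipedSplit {
    /** Starts a pump task that copies source into two pipes; returns the pipes' read ends. */
    static List<InputStream> split(InputStream source, ExecutorService executor) throws IOException {
        PipedOutputStream out1 = new PipedOutputStream();
        PipedOutputStream out2 = new PipedOutputStream();
        List<InputStream> branches = List.of(new PipedInputStream(out1), new PipedInputStream(out2));
        executor.submit(() -> {
            // Closing the write ends signals end-of-stream to both readers.
            try (source; out1; out2) {
                byte[] chunk = new byte[8192];
                int n;
                while ((n = source.read(chunk)) != -1) {
                    out1.write(chunk, 0, n); // blocks while a reader lags behind
                    out2.write(chunk, 0, n);
                }
            }
            return null;
        });
        return branches;
    }

    static byte[] readAndClose(InputStream in) throws IOException {
        try (in) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(3); // one pump + two readers
        try {
            List<InputStream> branches = split(new ByteArrayInputStream(new byte[100_000]), executor);
            // Each branch is drained on its own thread; draining them one after
            // the other could deadlock once a pipe buffer fills up.
            Future<byte[]> a = executor.submit(() -> readAndClose(branches.get(0)));
            Future<byte[]> b = executor.submit(() -> readAndClose(branches.get(1)));
            System.out.println(a.get().length + " " + b.get().length);
        } finally {
            executor.shutdown(); // otherwise the pool threads keep the JVM alive
        }
    }
}
```

Both branches must be consumed concurrently: the pump blocks as soon as either pipe's buffer is full, so a slow or absent reader stalls the other branch as well.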

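In case you need to split it into more than two InputStreams, replace TeeOutputStream in the previous code fragment with your own implementation that encapsulates a List<OutputStream> and overrides the OutputStream methods: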

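import java.io.IOException;
import java.io.OutputStream;
import java.util.List;
import java.util.Objects;

public final class TeeListOutputStream extends OutputStream {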
    private final List<? extends OutputStream> branchList;

    public TeeListOutputStream(final List<? extends OutputStream> branchList) {
        Objects.requireNonNull(branchList);
        this.branchList = branchList;
    }

    @Override
    public synchronized void write(final int b) throws IOException {
        for (OutputStream branch : branchList) {
            branch.write(b);
        }
    }

    @Override
    public void flush() throws IOException {
        for (OutputStream branch : branchList) {
            branch.flush();
        }
    }

    @Override
    public void close() throws IOException {
        for (OutputStream branch : branchList) {
            branch.close();
        }
    }
}

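Please, could you explain step 4 a bit more? Why do we have to trigger the reading manually? Why doesn't reading from any of the pipedInputStreams trigger reading of the source inputStream? And why don't we make the call that way?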
ДмитрийКулешов

2

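Convert the inputstream into bytes and then pass the bytes to the savefile function, where you assemble them into an inputstream again. In the original function, also use the bytes for the other tasks.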


5
I'd say this is a bad idea: the resulting array could be huge and would rob the device of memory.
Kevin Parker
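
One way to address the memory concern in this comment is to copy the bytes to the save target while the consumer reads them, so nothing is held in memory beyond the read buffer. The sketch below is a JDK-only illustration; CopyingInputStream is a hypothetical class (similar in spirit to TeeInputStream from commons.io), and the demo uses a ByteArrayOutputStream as a stand-in for a FileOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: copies every byte read from the wrapped stream into a side channel. */
final class CopyingInputStream extends FilterInputStream {
    private final OutputStream copy;

    CopyingInputStream(InputStream in, OutputStream copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            copy.write(buf, off, n);
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        return false; // rewinding would write the same bytes to the copy twice
    }

    @Override
    public void close() throws IOException {
        try {
            super.close();
        } finally {
            copy.close();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream saved = new ByteArrayOutputStream(); // stand-in for a FileOutputStream
        try (InputStream in = new CopyingInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}), saved)) {
            in.readAllBytes(); // the consumer reads as usual
        }
        System.out.println(saved.size()); // every byte read was also written to the copy
    }
}
```

Bytes that the consumer skips with skip() or never reads are not copied, so the saved file is complete only if the consumer reads the stream to the end.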

0

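If anyone is running in a Spring Boot app and wants to read the response body of a RestTemplate (which is why I wanted to read a stream twice), there is a cleaner way to do it.

First, you need to use Spring's StreamUtils to copy the stream to a String: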

String text = StreamUtils.copyToString(response.getBody(), Charset.defaultCharset());

But that's not all. You also need to use a request factory that buffers the stream for you, like so:

ClientHttpRequestFactory factory = new BufferingClientHttpRequestFactory(new SimpleClientHttpRequestFactory());
RestTemplate restTemplate = new RestTemplate(factory);

Or, if you are using a factory bean (this is Kotlin, but the idea is the same):

@Bean
@Scope(ConfigurableBeanFactory.SCOPE_PROTOTYPE)
fun createRestTemplate(): RestTemplate = RestTemplateBuilder()
  .requestFactory { BufferingClientHttpRequestFactory(SimpleClientHttpRequestFactory()) }
  .additionalInterceptors(loggingInterceptor)
  .build()

Source: https://objectpartners.com/2018/03/01/log-your-resttemplate-request-and-response-without-destroying-the-body/


0

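If you are using RestTemplate to make HTTP calls, simply add an interceptor. The response body is cached by the implementation of ClientHttpResponse. The input stream can now be retrieved from the response as many times as needed.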

ClientHttpRequestInterceptor interceptor = new ClientHttpRequestInterceptor() {

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
            ClientHttpRequestExecution execution) throws IOException {
        ClientHttpResponse response = execution.execute(request, body);

        // additional work before returning response
        return response;
    }
};

// Add the interceptor to the RestTemplate instance
restTemplate.getInterceptors().add(interceptor);
Licensed under cc by-sa 3.0 with attribution required.