How to read Jetty HttpInput (ServletInputStream) many times?
Question
I'm currently developing a REST API using RestEasy and Jetty. One of my plans for this API is a hook plugin that handles incoming requests through a JAX-RS ContainerRequestFilter. The problem I'm hitting with the ContainerRequestFilter on Jetty is that once I call requestContext.getEntityStream() in the filter, the request body can no longer be read by my endpoint class, even after I set the entity stream again.
Here's my filter code:
    @Provider
    @Priority(2000)
    public class DummyRequestFilter implements ContainerRequestFilter {
        static Logger log = Logger.getLogger(DummyRequestFilter.class.getName());

        @Context
        private HttpServletRequest servletRequest;

        @Override
        public void filter(ContainerRequestContext requestContext) {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            String requestBody = "";
            try {
                IOUtils.copy(requestContext.getEntityStream(), baos);
                InputStream is1 = new ByteArrayInputStream(baos.toByteArray());
                InputStream is2 = new ByteArrayInputStream(baos.toByteArray());
                requestBody = IOUtils.toString(is1);
                log.info(requestBody);
                requestContext.setEntityStream(is2);
            } catch (Exception e) {
                log.log(Level.SEVERE, "Exception Occurred", e);
            }
        }
    }
And here's my endpoint class:
    @Path("/")
    public class DummyService {
        Logger log = Logger.getLogger(DummyService.class.getName());

        @GET
        @Path("test")
        @Produces(MediaType.APPLICATION_JSON)
        public Response test(@FormParam("name") String name) {
            log.info("Name = " + name);
            return Response.status(200).build();
        }
    }
Whenever I call the test method, I can see the name that was sent in the filter class, but in the endpoint class the name is null.
Later, I realized that the stream returned by requestContext.getEntityStream() is a Jetty-specific ServletInputStream called org.eclipse.jetty.server.HttpInput. I suspect the request can't be read in the endpoint because I set the entity stream using a ByteArrayInputStream.
So, my question is: is there a way to convert the Jetty HttpInput to a generic InputStream implementation, or is there another way to work around this issue? I want to be able to read the Jetty HttpInput multiple times.
Thanks & Regards
Answer 1
Score: 1
As you have no doubt noticed, the Servlet spec does not allow you to read the request body contents twice.
This is an intentional decision, as any such feature would require caching or buffering the request body content, which leads to:
- Various DoS / Denial of Service attacks against your webapp.
- Idle Timeouts on request processing when your code reads the request the second time from the buffer and produces no network traffic to reset the idle timeout.
- The inability to benefit from or use Servlet Async I/O processing.
JAX-RS endpoints typically require that the javax.servlet.http.HttpServletRequest input stream has not been read at all, for any reason (*).
Your code makes no attempt to limit the size of the byte arrays it allocates, so it would be easy to abuse your service with a Zip Bomb (for example, sending 42 kilobytes of data that unpacks to 3.99 petabytes).
You may find a JAX-RS implementation-specific way, such as using Jersey internal code to set the entity stream, but that kind of code will be fragile, and you will likely have to fix and recompile it whenever you update your Jersey library.
If you go the custom route, please take extra care not to introduce obvious vulnerabilities in your code: limit your request size, limit what you can buffer, etc.
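The advice to limit what you buffer can be made concrete with a small sketch. Assuming you still decide to copy the body into memory, the helper below reads an InputStream in chunks and gives up once a cap is exceeded. BoundedBodyReader, readAtMost, and the cap value are hypothetical names chosen for illustration, not part of Jetty, RestEasy, or the Servlet API; only java.io is used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: buffers a stream in memory but refuses to exceed a fixed cap. */
final class BoundedBodyReader {

    private BoundedBodyReader() {
    }

    /**
     * Reads the whole stream into a byte array, failing as soon as more than
     * maxBytes have been seen instead of growing the buffer without limit.
     */
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        long total = 0; // long, so the running count cannot overflow
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("Request body exceeds " + maxBytes + " bytes");
            }
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

In the filter from the question, the unbounded IOUtils.copy call could be swapped for something like BoundedBodyReader.readAtMost(requestContext.getEntityStream(), 64 * 1024), with the result wrapped in a ByteArrayInputStream and passed to setEntityStream. The 64 KiB figure is arbitrary. This only caps memory use; it does not address the idle-timeout or async I/O concerns listed above.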
Typically, webapps that need to modify request input stream content do it via proxy servlets that perform middle-man modification of the request in real time, on a buffer-by-buffer basis. Jetty has such a class, conveniently called AsyncMiddleManServlet. This essentially means your client talks to the proxy, which talks to your endpoint, an arrangement that honors network behaviors and network backpressure needs (something a buffering filter wouldn't be able to handle properly).
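To illustrate what working "buffer by buffer" means, independent of Jetty, here is a rough sketch using plain java.io. It is not the AsyncMiddleManServlet API and it uses blocking I/O; PreviewingInputStream and previewLimit are hypothetical names. The wrapper looks at each chunk only as the downstream consumer pulls it, and keeps aside at most a small, fixed number of bytes, so memory use does not grow with the body size.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper: copies at most previewLimit bytes aside while the real consumer reads. */
final class PreviewingInputStream extends FilterInputStream {

    private final ByteArrayOutputStream preview = new ByteArrayOutputStream();
    private final int previewLimit;

    PreviewingInputStream(InputStream in, int previewLimit) {
        super(in);
        this.previewLimit = previewLimit;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1 && preview.size() < previewLimit) {
            preview.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            // Keep only as many bytes as still fit under the preview cap.
            int room = Math.max(previewLimit - preview.size(), 0);
            preview.write(buf, off, Math.min(n, room));
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        return false; // rewinding would make the preview count bytes twice
    }

    /** Bytes seen so far, truncated to previewLimit. */
    byte[] preview() {
        return preview.toByteArray();
    }
}
```

Nothing is read from the underlying stream until the consumer asks for data, which is the property a whole-body buffering filter gives up. Whether a given JAX-RS implementation accepts such a wrapper as the entity stream for every parameter type is something you would need to verify yourself.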
(*) You can accidentally read the HttpServletRequest body by using request APIs that ask for the request parameters or the request parts, which require the body content to be read for certain Content-Types.