Building Apache Nutch Docker container
Question
I am following the instructions for installing Apache Nutch at:
https://hub.docker.com/r/apache/nutch
https://hub.docker.com/r/apache/nutch/dockerfile
# NOTE TO DEVELOPERS: Make sure this file passes linting tests
# by running https://github.com/replicatedhq/dockerfilelint
# BUILD_MODE can be either
# 0 == Nutch master branch source install with 'crawl' and 'nutch' scripts on PATH
# 1 == Same as mode 0 with addition of Nutch REST Server
# 2 == Same as mode 1 with addition of Nutch WebApp
ARG BUILD_MODE=0
FROM alpine:3.13 AS base
ARG SERVER_PORT=8081
ARG SERVER_HOST=0.0.0.0
ARG WEBAPP_PORT=8080
LABEL maintainer="Apache Nutch Developers <dev@nutch.apache.org>"
LABEL org.opencontainers.image.authors="Apache Nutch Developers <dev@nutch.apache.org>"
LABEL org.opencontainers.image.description="Docker image for running Apache Nutch, a highly extensible and scalable open source web crawler software project. Visit the project website at https://nutch.apache.org"
LABEL org.opencontainers.image.documentation="https://hub.docker.com/r/apache/nutch"
LABEL org.opencontainers.image.licenses="Apache-2.0"
LABEL org.opencontainers.image.source="https://raw.githubusercontent.com/apache/nutch/master/docker/Dockerfile"
LABEL org.opencontainers.image.title="Apache Nutch 1.x Docker Image"
LABEL org.opencontainers.image.url="https://hub.docker.com/r/apache/nutch"
LABEL org.opencontainers.image.vendor="Apache Nutch https://nutch.apache.org"
WORKDIR /root/
# Install dependencies
RUN apk update
RUN apk add apache-ant bash git openjdk11 supervisor
# Establish environment variables
RUN echo 'export JAVA_HOME=/usr/lib/jvm/java-11-openjdk' >> $HOME/.bashrc
ENV JAVA_HOME='/usr/lib/jvm/java-11-openjdk'
ENV NUTCH_HOME='/root/nutch_source/runtime/local'
# Checkout and build the Nutch master branch (1.x)
RUN git clone https://github.com/apache/nutch.git nutch_source && \
cd nutch_source && \
ant runtime && \
rm -rf build/ && \
rm -rf /root/.ivy2/
# Create symlinks for runtime/local/bin/nutch and runtime/local/bin/crawl
RUN ln -sf $NUTCH_HOME/bin/nutch /usr/local/bin/
RUN ln -sf $NUTCH_HOME/bin/crawl /usr/local/bin/
FROM base AS branch-version-0
RUN echo "Nutch master branch source install with 'crawl' and 'nutch' scripts on PATH"
FROM base AS branch-version-1
RUN echo "Nutch master branch source install with 'crawl' and 'nutch' scripts on PATH and Nutch REST Server on $SERVER_HOST:$SERVER_PORT"
ARG SERVER_PORT=8081
ARG SERVER_HOST=0.0.0.0
ENV SERVER_PORT=$SERVER_PORT
ENV SERVER_HOST=$SERVER_HOST
# Arrange necessary setup for supervisord
RUN mkdir -p /var/log/supervisord
COPY ./config/supervisord_startserver.conf /etc/supervisord.conf
# Expose port for server which can only be accessed if
# the same port is published when the container is run.
EXPOSE $SERVER_PORT
ENTRYPOINT [ "supervisord", "--nodaemon", "--configuration", "/etc/supervisord.conf" ]
FROM base AS branch-version-2
RUN echo "Nutch master branch source install with 'crawl' and 'nutch' scripts on PATH, Nutch REST Server on $SERVER_HOST:$SERVER_PORT and WebApp on this container port $WEBAPP_PORT"
ARG SERVER_PORT=8081
ARG SERVER_HOST=0.0.0.0
ARG WEBAPP_PORT=8080
ENV SERVER_PORT=$SERVER_PORT
ENV SERVER_HOST=$SERVER_HOST
ENV WEBAPP_PORT=$WEBAPP_PORT
# Install the webapp
RUN apk add maven
RUN git clone https://github.com/apache/nutch-webapp.git nutch_webapp && cd nutch_webapp && mvn package
# Arrange necessary setup for supervisord
RUN mkdir -p /var/log/supervisord
COPY ./config/supervisord_startserver_webapp.conf /etc/supervisord.conf
# Expose ports for server and webapp, these can only be accessed if
# the same ports are published when the container is run.
EXPOSE $SERVER_PORT
EXPOSE $WEBAPP_PORT
ENTRYPOINT [ "supervisord", "--nodaemon", "--configuration", "/etc/supervisord.conf" ]
FROM branch-version-$BUILD_MODE AS final
RUN echo "Successfully built image, see https://s.apache.org/m5933 for guidance on running a container instance."
> => ERROR [branch-version-2 5/5] COPY ./config/supervisord_startserver_webapp.conf /etc/supervisord.conf    0.0s
> ------
>  > [branch-version-2 5/5] COPY ./config/supervisord_startserver_webapp.conf /etc/supervisord.conf:
> ------
> failed to compute cache key: failed to walk /var/lib/docker/tmp/buildkit-mount3360673970/config: lstat /var/lib/docker/tmp/buildkit-mount3360673970/config: no such file or directory
Answer 1
Score: 1
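I had the same problem.
I downloaded the https://github.com/apache/nutch project to my working directory.
The repository contains a docker directory, which holds the Dockerfile together with the config directory; that config directory has the files your build is missing. COPY resolves its source paths relative to the build context, so the build has to run with that docker directory as the context.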
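As a rough sketch of how to confirm the build context is complete before building, the helper below checks for the Dockerfile and the two supervisord config files that the COPY instructions in the question refer to. The function name check_context is made up for illustration, and the directory passed to it is assumed to be the docker directory of a local clone.

```shell
#!/bin/sh
# check_context DIR (hypothetical helper, not part of Nutch):
# report any file the Dockerfile's COPY steps expect but DIR lacks.
check_context() {
  dir="$1"
  missing=0
  for f in Dockerfile \
           config/supervisord_startserver.conf \
           config/supervisord_startserver_webapp.conf; do
    if [ ! -e "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

Assuming the repository was cloned into ./nutch, a typical sequence would be `git clone https://github.com/apache/nutch.git`, then `check_context nutch/docker`, then `docker build -t apache/nutch --build-arg BUILD_MODE=2 nutch/docker`. The image tag is only an example; BUILD_MODE=2 matches the stage that failed in the question.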