Apache Drill - Unable to start Drill in distributed mode (In GCP Dataproc)
Question
I am trying to run Apache Drill in distributed mode on Google Cloud Dataproc, but I am unable to start the drillbit on each node in the cluster.
I created a basic cluster (1 master, 2 workers) with the GCP Dataproc service, using the initialization scripts and instructions provided on the Apache Drill website:
Installing Drill in Distributed Mode in Dataproc
Apache Drill 1.19.0 and Apache ZooKeeper 3.6.3 were configured in the setup script. The cluster was provisioned successfully in Dataproc, and I can connect to each node over SSH. When I check the status of ZooKeeper by running telnet localhost 2181 and entering stats, it shows the following
Then I try to start the drillbit service on each node with the command bin/drillbit.sh start, as described in Starting Drill in Distributed Mode, and it shows
> Starting drillbit, logging to /opt/drill/log/drillbit.out
When I check the status of Drill with bin/drillbit.sh status, it displays
> /opt/drill/drillbit.pid file is present but drillbit is not running.
Please help me resolve this issue and set up Apache Drill in distributed mode.
Answer 1
Score: 1
I don't know Dataproc but the contributed scripts you're using, specifically automation.sh and apache-drill.sh, already contain commands to start ZooKeeper and Drill. So you shouldn't be using drillbit.sh to start up Drillbits yourself. You can check whether Drill is running by going to its web UI at http://[drillbit-host]:8047. Note that there is no master node in a Drill cluster and you can use any one of the Drillbits in the web UI URL.
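As a rough sketch of that check from the command line (not taken from the Drill scripts themselves), the helpers below test whether the PID recorded in a pid file still belongs to a live process and whether the web UI answers on port 8047. The function names are hypothetical, and the sketch assumes curl is installed on the node.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; adjust paths and hosts to your cluster.

# Return 0 if the PID stored in the given pid file belongs to a live process
# that the current user is allowed to signal; return non-zero otherwise.
drillbit_pid_alive() {
  local pid_file="$1" pid
  [ -r "$pid_file" ] || return 1
  pid="$(head -n 1 "$pid_file" | tr -d '[:space:]')"
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
  esac
  kill -0 "$pid" 2>/dev/null   # signal 0 only tests for existence
}

# Build the web UI URL for a Drillbit host (8047 is the port named above).
drill_ui_url() {
  printf 'http://%s:8047' "$1"
}

# Return 0 if the web UI answers an HTTP request within 5 seconds.
drill_ui_up() {
  curl --silent --fail --max-time 5 --output /dev/null "$(drill_ui_url "$1")"
}
```

With the paths from the question, a possible use is `drillbit_pid_alive /opt/drill/drillbit.pid || tail -n 50 /opt/drill/log/drillbit.out`, which prints the end of the startup log when the recorded process is gone, and `drill_ui_up localhost && echo "web UI reachable"` on any node.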
Footnote: Drill has moved on a bit since 1.19 so you might try making the following change on line 10 of apache-drill.sh.
readonly DRILL_VERSION='1.21.1'
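If you would rather script that edit than make it by hand, something like the following might do. It is a sketch that assumes the existing line in your copy of apache-drill.sh already starts with `readonly DRILL_VERSION=`. The function name `set_drill_version` is hypothetical, and the line is matched by its content rather than by line number.

```shell
#!/usr/bin/env bash
# Sketch only: set_drill_version is a hypothetical helper, not part of the Drill scripts.

# Rewrite the DRILL_VERSION assignment in the given script.
# A backup of the original file is kept next to it with a .bak suffix.
# The version string must not contain '/' or '&', which are special to sed.
set_drill_version() {
  local script="$1" version="$2"
  sed -i.bak "s/^readonly DRILL_VERSION=.*/readonly DRILL_VERSION='${version}'/" "$script"
}
```

For example, `set_drill_version apache-drill.sh 1.21.1` would apply the change shown above. Since initialization scripts run when the cluster is created, the edited copy needs to be the one the cluster actually uses.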