Configuring Apache Spark Logging with Maven and Logback, and finally sending messages to Loggly

Question

I'm having trouble getting my Spark application to ignore Log4j in order to use Logback. One of the reasons I'm trying to use Logback is the Loggly appender it supports.

I have the following dependencies and exclusions in my pom file. (Versions are managed in the dependencyManagement section of my parent POM.)

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-classic</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-core</artifactId>
    </dependency>
    <dependency>
        <groupId>org.logback-extensions</groupId>
        <artifactId>logback-ext-loggly</artifactId>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>log4j-over-slf4j</artifactId>
    </dependency>

I have referenced these two articles:

Separating application logs in Logback from Spark Logs in log4j
Configuring Apache Spark Logging with Scala and logback

I first tried (when running spark-submit):

    --conf "spark.driver.userClassPathFirst=true"
    --conf "spark.executor.userClassPathFirst=true"

but received the error:

    Exception in thread "main" java.lang.LinkageError: loader constraint violation: when resolving method "org.slf4j.impl.StaticLoggerBinder.getLoggerFactory()Lorg/slf4j/ILoggerFactory;" the class loader (instance of org/apache/spark/util/ChildFirstURLClassLoader) of the current class, org/slf4j/LoggerFactory, and the class loader (instance of sun/misc/Launcher$AppClassLoader) for the method's defining class, org/slf4j/impl/StaticLoggerBinder, have different Class objects for the type org/slf4j/ILoggerFactory used in the signature

I would like to get it working with the above, but I also looked at trying the below:

    --conf "spark.driver.extraClassPath=$libs"
    --conf "spark.executor.extraClassPath=$libs"

But since I'm passing my uber jar to spark-submit both locally AND on an Amazon EMR cluster, I really can't specify a library file location that is local to my machine. Since the uber jar contains the files, is there a way for it to use those? Am I forced to copy these libraries to the master/nodes on the EMR cluster where the Spark app finally runs?

The first approach, using userClassPathFirst, seems like the best route though.

Answer 1

Score: 1


So I solved the issue; there were several problems going on.

In order to get Spark to let Logback work, the solution that worked for me combined items from the articles I posted above with a fix for a cert file problem.

The cert file I was passing into spark-submit was incomplete and overrode the base truststore certs. This was causing a problem sending HTTPS messages to Loggly.

Part 1 change:
Update Maven to shade org.slf4j (as stated in an answer by @matemaciek):

    <dependencies>
        ...
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.2.3</version>
        </dependency>
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-core</artifactId>
            <version>1.2.3</version>
        </dependency>
        <dependency>
            <groupId>org.logback-extensions</groupId>
            <artifactId>logback-ext-loggly</artifactId>
            <version>0.1.5</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>log4j-over-slf4j</artifactId>
            <version>1.7.30</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <transformers>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                            <manifestEntries>
                                <Main-Class>com.TestClass</Main-Class>
                            </manifestEntries>
                        </transformer>
                    </transformers>
                    <relocations>
                        <relocation>
                            <pattern>org.slf4j</pattern>
                            <shadedPattern>com.shaded.slf4j</shadedPattern>
                        </relocation>
                    </relocations>
                </configuration>
            </plugin>
        </plugins>
    </build>
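As a quick sanity check (a sketch, not part of the original fix; the jar path is an assumption based on the submit commands below), you can confirm the relocation took effect by listing the shaded jar's contents:

    # Build the uber jar
    mvn clean package

    # org.slf4j classes should now appear under com/shaded/slf4j/ instead
    # of org/slf4j/ (adjust the jar path to your actual build output)
    jar tf target/testproject-0.0.1.jar | grep slf4j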

Part 1a: the logback.xml

    <configuration debug="true">
        <appender name="logglyAppender" class="ch.qos.logback.ext.loggly.LogglyAppender">
            <endpointUrl>https://logs-01.loggly.com/bulk/TOKEN/tag/TAGS/</endpointUrl>
            <pattern>${hostName} %d{yyyy-MM-dd HH:mm:ss,SSS}{GMT} %p %t %c %M - %m%n</pattern>
        </appender>
        <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
            <encoder>
                <pattern>${hostName} %d{yyyy-MM-dd HH:mm:ss,SSS}{GMT} %p %t %c %M - %m%n</pattern>
            </encoder>
        </appender>
        <root level="info">
            <appender-ref ref="logglyAppender" />
            <appender-ref ref="STDOUT" />
        </root>
    </configuration>
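For this file to be found at runtime it has to end up on the application classpath; with Maven the usual place is src/main/resources, so the shade plugin bundles it into the uber jar (a minimal sketch, assuming logback.xml currently sits in the project root):

    # Keep logback.xml in Maven's standard resources directory so it is
    # packaged at the root of the uber jar's classpath
    mkdir -p src/main/resources
    mv logback.xml src/main/resources/logback.xml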

Part 2 change: The MainClass

    import org.slf4j.*;

    public class TestClass {
        static final Logger log = LoggerFactory.getLogger(TestClass.class);

        public static void main(String[] args) throws Exception {
            log.info("this is a test message");
        }
    }

Part 3 change:
I was submitting the Spark application as such (example):

    spark-submit --deploy-mode client --class com.TestClass \
        --conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/rds-truststore.jks -Djavax.net.ssl.trustStorePassword=changeit" \
        --conf "spark.driver.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/rds-truststore.jks -Djavax.net.ssl.trustStorePassword=changeit" \
        com/target/testproject-0.0.1.jar

So the above spark-submit failed on an HTTPS certification problem (when Loggly was contacted to send the message to the Loggly service), because rds-truststore.jks replaced the default certs without containing all of them. I changed this to use the cacerts store, which had all the certs it needed.
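If you do need custom certs on top of the JDK defaults, one approach (a sketch; the alias, file name, and Java 8 style path are assumptions) is to copy the JDK's cacerts and import into the copy, rather than replacing the truststore outright:

    # Start from the JDK's default truststore so the standard CA certs are kept
    # (path shown is the Java 8 layout; on Java 9+ it is $JAVA_HOME/lib/security/cacerts)
    cp "$JAVA_HOME/jre/lib/security/cacerts" ./cacerts

    # Import the extra certificate on top of the defaults
    # (alias and .pem file are placeholders)
    keytool -importcert -noprompt -keystore ./cacerts -storepass changeit \
        -alias my-extra-cert -file my-extra-cert.pem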

No more errors from the Loggly part when sending with this:

    spark-submit --deploy-mode client --class com.TestClass \
        --conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/cacerts -Djavax.net.ssl.trustStorePassword=changeit" \
        --conf "spark.driver.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/cacerts -Djavax.net.ssl.trustStorePassword=changeit" \
        com/target/testproject-0.0.1.jar

Answer 2

Score: 0


You have to use, in the Spark opts:

    -Dspark.executor.extraJavaOptions=-Dlogback.configurationFile=/spark/logback/logback.xml

In logback.xml you should have your Logback settings.
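For example, a full submit command might look like this (a sketch, assuming logback.xml exists at that path on both the driver and executor hosts; the class and jar names are taken from the example above):

    # Hypothetical example: point driver and executors at an external logback.xml
    spark-submit \
        --conf "spark.driver.extraJavaOptions=-Dlogback.configurationFile=/spark/logback/logback.xml" \
        --conf "spark.executor.extraJavaOptions=-Dlogback.configurationFile=/spark/logback/logback.xml" \
        --class com.TestClass \
        target/testproject-0.0.1.jar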
