Server won't start after upgrade from 9.3.0 to 9.5.0

Standard upgrade from 9.3.0 to 9.5.0 running on Windows 2019 Server, SQL Server 2017, Microsoft JDBC driver.

Looking in the logs I don’t get a lot of help. I see this in the es log:

2022.06.23 16:47:30 ERROR es[][o.e.b.ElasticsearchUncaughtExceptionHandler] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: org.elasticsearch.common.ssl.SslConfigException: failed to find a X509ExtendedTrustManager in the trust manager factory for [PKIX] and truststore [null]

SQ is running behind an IIS reverse proxy for TLS, so I’m confused about why SQ would be having an issue with SSL.

I see this in the web log which might be a clue


2022.06.23 16:09:39 WARN  web[][o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [OkHttp TaskRunner] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:\n java.base@11.0.15/java.lang.Object.wait(Native Method)\n java.base@11.0.15/java.lang.Object.wait(Object.java:462)\n app//okhttp3.internal.concurrent.TaskRunner$RealBackend.coordinatorWait(TaskRunner.kt:294)\n app//okhttp3.internal.concurrent.TaskRunner.awaitTaskToRun(TaskRunner.kt:218)\n app//okhttp3.internal.concurrent.TaskRunner$runnable$1.run(TaskRunner.kt:59)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n java.base@11.0.15/java.lang.Thread.run(Thread.java:829)
2022.06.23 16:09:39 WARN  web[][o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [OkHttp katahdin.edge.noblis.org] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:\n java.base@11.0.15/java.net.SocketInputStream.socketRead0(Native Method)\n java.base@11.0.15/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)\n java.base@11.0.15/java.net.SocketInputStream.read(SocketInputStream.java:168)\n java.base@11.0.15/java.net.SocketInputStream.read(SocketInputStream.java:140)\n java.base@11.0.15/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)\n java.base@11.0.15/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)\n java.base@11.0.15/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)\n java.base@11.0.15/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1454)\n java.base@11.0.15/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1065)\n app//okio.InputStreamSource.read(JvmOkio.kt:90)\n app//okio.AsyncTimeout$source$1.read(AsyncTimeout.kt:129)\n app//okio.RealBufferedSource.request(RealBufferedSource.kt:206)\n app//okio.RealBufferedSource.require(RealBufferedSource.kt:199)\n app//okhttp3.internal.http2.Http2Reader.nextFrame(Http2Reader.kt:89)\n app//okhttp3.internal.http2.Http2Connection$ReaderRunnable.invoke(Http2Connection.kt:618)\n app//okhttp3.internal.http2.Http2Connection$ReaderRunnable.invoke(Http2Connection.kt:609)\n app//okhttp3.internal.concurrent.TaskQueue$execute$1.runOnce(TaskQueue.kt:98)\n app//okhttp3.internal.concurrent.TaskRunner.runTask(TaskRunner.kt:116)\n app//okhttp3.internal.concurrent.TaskRunner.access$runTask(TaskRunner.kt:42)\n app//okhttp3.internal.concurrent.TaskRunner$runnable$1.run(TaskRunner.kt:65)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n 
java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n java.base@11.0.15/java.lang.Thread.run(Thread.java:829)
2022.06.23 16:09:39 WARN  web[][o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [Okio Watchdog] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:\n java.base@11.0.15/java.lang.Object.wait(Native Method)\n app//okio.AsyncTimeout$Companion.awaitTimeout$okio(AsyncTimeout.kt:300)\n app//okio.AsyncTimeout$Watchdog.run(AsyncTimeout.kt:187)
2022.06.23 16:09:39 WARN  web[][o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [OkHttp TaskRunner] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:\n java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method)\n java.base@11.0.15/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:462)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:361)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:937)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1053)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n java.base@11.0.15/java.lang.Thread.run(Thread.java:829)
2022.06.23 16:09:39 WARN  web[][o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [OkHttp TaskRunner] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:\n java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method)\n java.base@11.0.15/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:462)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:361)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:937)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1053)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n java.base@11.0.15/java.lang.Thread.run(Thread.java:829)
2022.06.23 16:09:39 WARN  web[][o.a.c.l.WebappClassLoaderBase] The web application [ROOT] appears to have started a thread named [OkHttp TaskRunner] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:\n java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method)\n java.base@11.0.15/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:462)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:361)\n java.base@11.0.15/java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:937)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1053)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)\n java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n java.base@11.0.15/java.lang.Thread.run(Thread.java:829)
2022.06.23 16:09:39 INFO  web[][o.s.s.app.WebServer] WebServer stopped

Any assistance is appreciated as I’m not sure where to look. I’ll work on rolling back now.

B

Is there any more information I can add to get some feedback from the community?

Hi @bduffey
It seems someone already had a very similar issue on Azure with an Azul JVM.

Could you please share the JVM vendor and version you are using? Please provide the SonarQube edition you are running and the full stacktrace for the error you encountered.

Also, do you have any of the following properties set in your sonar.properties?

sonar.cluster.enabled
sonar.cluster.es.ssl.truststore
sonar.cluster.es.ssl.keystore
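
For anyone who wants to check this without reading the file by eye, here is a minimal sketch. It assumes sonar.properties is a standard Java properties file, so commented-out lines are ignored. The class name `ClusterPropertyCheck` and the default path are hypothetical and not part of SonarQube.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of SonarQube: reports which of the three
// properties listed above are actually set (uncommented) in sonar.properties.
class ClusterPropertyCheck {

    static final List<String> KEYS = List.of(
            "sonar.cluster.enabled",
            "sonar.cluster.es.ssl.truststore",
            "sonar.cluster.es.ssl.keystore");

    static List<String> findSetProperties(String propertiesText) {
        Properties props = new Properties();
        try {
            // Lines starting with '#' are treated as comments and skipped.
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> found = new ArrayList<>();
        for (String key : KEYS) {
            if (props.containsKey(key)) {
                found.add(key);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location; pass the real path to conf/sonar.properties as the first argument.
        Path file = Path.of(args.length > 0 ? args[0] : "conf/sonar.properties");
        System.out.println("Set properties: " + findSetProperties(Files.readString(file)));
    }
}
```

An empty list means none of the three properties is active in that file.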

@leo.geoffroy - thanks so much for your reply. I AM using Azul (it looks like I currently have Azul Zulu JDK 11.56.19, 64-bit, installed).

I’m running SQ 9.5.0.56709 developer edition (upgraded from 9.3.0.51899 developer edition, which was working fine with this version of Java)

I don’t see any of those properties in the file, so I don’t believe they are enabled.

How can I provide the full stacktrace?

Thanks again
BD

You should be able to get the full stack trace from the logs/es.log file.
You can also check whether the following JVM argument has been changed when launching SonarQube:
javax.net.ssl.trustStore

or any other argument related to truststore
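
One way to check: Elasticsearch writes its JVM arguments to es.log at startup, in a line of the form `JVM arguments [...]`. The sketch below pulls the `javax.net.ssl.*` flags out of such a line. The class name `TrustStoreArgScan` and the sample value are made up for illustration. The parsing also assumes that no single argument contains `, `.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: extracts the -Djavax.net.ssl.* flags from the
// "JVM arguments [...]" line that Elasticsearch logs at startup.
class TrustStoreArgScan {

    static List<String> trustStoreArgs(String logLine) {
        List<String> matches = new ArrayList<>();
        String marker = "JVM arguments [";
        int start = logLine.indexOf(marker);
        int end = logLine.lastIndexOf(']');
        if (start < 0 || end < start + marker.length()) {
            return matches; // not a JVM-arguments line
        }
        String body = logLine.substring(start + marker.length(), end);
        // Simplification: assumes no argument itself contains ", ".
        for (String arg : body.split(", ")) {
            if (arg.startsWith("-Djavax.net.ssl.")) {
                matches.add(arg);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample line; paste the real one from logs/es.log instead.
        String sample = "INFO  es[][o.e.n.Node] JVM arguments "
                + "[-XX:+UseG1GC, -Xmx512m, -Djavax.net.ssl.trustStore=C:/certs/truststore.jks]";
        System.out.println(trustStoreArgs(sample));
    }
}
```

Any flag it prints is worth comparing against what the previous, working installation was started with.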

Were you able to perform the rollback to 9.3 that you mentioned in your initial message?

I was able to roll back by changing the service definition (I had not upgraded the database yet)

I hope to spend more time on this tomorrow. Thanks again

I’ve switched things back to using the 9.5 files and this is the es.log

2022.06.29 14:29:46 INFO  es[][o.e.n.Node] version[7.17.1], pid[5648], build[unknown/unknown/e5acb99f822233d62d6444ce45a4543dc1c8059a/2022-02-23T22:20:54.153567231Z], OS[Windows Server 2019/10.0/amd64], JVM[Azul Systems, Inc./OpenJDK 64-Bit Server VM/11.0.15/11.0.15+10-LTS]
2022.06.29 14:29:46 INFO  es[][o.e.n.Node] JVM home [C:\Program Files\Zulu\zulu-11]
2022.06.29 14:29:46 INFO  es[][o.e.n.Node] JVM arguments [-XX:+UseG1GC, -Djava.io.tmpdir=E:\sonarqube-data\temp, -XX:ErrorFile=../logs/es_hs_err_pid%p.log, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djna.tmpdir=E:\sonarqube-data\temp, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j2.formatMsgNoLookups=true, -Djava.locale.providers=COMPAT, -Dcom.redhat.fips=false, -Des.enforce.bootstrap.checks=true, -Xmx512m, -Xms512m, -XX:MaxDirectMemorySize=256m, -XX:+HeapDumpOnOutOfMemoryError, -Delasticsearch, -Des.path.home=E:\sonarqube-9.5.0.56709\elasticsearch, -Des.path.conf=E:\sonarqube-data\temp\conf\es, -Djavax.net.ssl.trustStoreType=WINDOWS-ROOT]
2022.06.29 14:29:48 INFO  es[][o.e.p.PluginsService] loaded module [analysis-common]
2022.06.29 14:29:48 INFO  es[][o.e.p.PluginsService] loaded module [lang-painless]
2022.06.29 14:29:48 INFO  es[][o.e.p.PluginsService] loaded module [parent-join]
2022.06.29 14:29:48 INFO  es[][o.e.p.PluginsService] loaded module [reindex]
2022.06.29 14:29:48 INFO  es[][o.e.p.PluginsService] loaded module [transport-netty4]
2022.06.29 14:29:48 INFO  es[][o.e.p.PluginsService] no plugins loaded
2022.06.29 14:29:48 INFO  es[][o.e.e.NodeEnvironment] using [1] data paths, mounts [[(E:)]], net usable_space [92.9gb], net total_space [99.9gb], types [NTFS]
2022.06.29 14:29:48 INFO  es[][o.e.e.NodeEnvironment] heap size [512mb], compressed ordinary object pointers [true]
2022.06.29 14:29:49 INFO  es[][o.e.n.Node] node name [sonarqube], node ID [Z4EnM2WhQ0mGZVtiOzgBVA], cluster name [sonarqube], roles [data_frozen, master, remote_cluster_client, data, data_content, data_hot, data_warm, data_cold, ingest]
2022.06.29 14:29:55 ERROR es[][o.e.b.Bootstrap] Exception
org.elasticsearch.common.ssl.SslConfigException: failed to find a X509ExtendedTrustManager in the trust manager factory for [PKIX] and truststore [null]
	at org.elasticsearch.common.ssl.KeyStoreUtil.createTrustManager(KeyStoreUtil.java:152) ~[?:?]
	at org.elasticsearch.common.ssl.DefaultJdkTrustConfig.createTrustManager(DefaultJdkTrustConfig.java:58) ~[?:?]
	at org.elasticsearch.common.ssl.SslConfiguration.createSslContext(SslConfiguration.java:132) ~[?:?]
	at org.elasticsearch.reindex.ReindexSslConfig.reload(ReindexSslConfig.java:137) ~[?:?]
	at org.elasticsearch.reindex.ReindexSslConfig.<init>(ReindexSslConfig.java:107) ~[?:?]
	at org.elasticsearch.reindex.ReindexPlugin.createComponents(ReindexPlugin.java:101) ~[?:?]
	at org.elasticsearch.node.Node.lambda$new$18(Node.java:736) ~[elasticsearch-7.17.1.jar:7.17.1]
	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?]
	at org.elasticsearch.node.Node.<init>(Node.java:750) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) [elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) [elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) [elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) [elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) [elasticsearch-cli-7.17.1.jar:7.17.1]
	at org.elasticsearch.cli.Command.main(Command.java:77) [elasticsearch-cli-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) [elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) [elasticsearch-7.17.1.jar:7.17.1]
2022.06.29 14:29:55 ERROR es[][o.e.b.ElasticsearchUncaughtExceptionHandler] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: org.elasticsearch.common.ssl.SslConfigException: failed to find a X509ExtendedTrustManager in the trust manager factory for [PKIX] and truststore [null]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-7.17.1.jar:7.17.1]
	at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-7.17.1.jar:7.17.1]
Caused by: org.elasticsearch.common.ssl.SslConfigException: failed to find a X509ExtendedTrustManager in the trust manager factory for [PKIX] and truststore [null]
	at org.elasticsearch.common.ssl.KeyStoreUtil.createTrustManager(KeyStoreUtil.java:152) ~[?:?]
	at org.elasticsearch.common.ssl.DefaultJdkTrustConfig.createTrustManager(DefaultJdkTrustConfig.java:58) ~[?:?]
	at org.elasticsearch.common.ssl.SslConfiguration.createSslContext(SslConfiguration.java:132) ~[?:?]
	at org.elasticsearch.reindex.ReindexSslConfig.reload(ReindexSslConfig.java:137) ~[?:?]
	at org.elasticsearch.reindex.ReindexSslConfig.<init>(ReindexSslConfig.java:107) ~[?:?]
	at org.elasticsearch.reindex.ReindexPlugin.createComponents(ReindexPlugin.java:101) ~[?:?]
	at org.elasticsearch.node.Node.lambda$new$18(Node.java:736) ~[elasticsearch-7.17.1.jar:7.17.1]
	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?]
	at org.elasticsearch.node.Node.<init>(Node.java:750) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.17.1.jar:7.17.1]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-7.17.1.jar:7.17.1]
	... 6 more

FWIW - I found this link: Elasticsearch wont start with FIPS Mode Enabled - Elastic Stack / Elasticsearch - Discuss the Elastic Stack

The error is the same; that thread is about Elasticsearch not starting when in FIPS mode.

I took the machine out of FIPS mode - but it didn’t help.

B

PROGRESS!

I have been using a _JAVA_OPTIONS setting:

-Djavax.net.ssl.trustStoreType=WINDOWS-ROOT

This (previously) told the JVM to use the Windows certificate store (instead of having to hack up the text files with a corporate CA). I’ve used this for years with no issues because it saves me the trouble of having to muck with the Java cert files.

Once I removed this option I got past that error.
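
To confirm what a given JVM actually ends up with, a small check like the one below can help. It is only a sketch: `TrustStoreTypeCheck` is a made-up name, and the fallback to `KeyStore.getDefaultType()` reflects my understanding of the documented JSSE default when `javax.net.ssl.trustStoreType` is unset.

```java
import java.security.KeyStore;
import java.util.Properties;

// Hypothetical helper: shows which trust store type a JVM would use,
// given its system properties.
class TrustStoreTypeCheck {

    // Assumption: when javax.net.ssl.trustStoreType is not set, JSSE
    // falls back to KeyStore.getDefaultType().
    static String effectiveTrustStoreType(Properties systemProps) {
        return systemProps.getProperty("javax.net.ssl.trustStoreType", KeyStore.getDefaultType());
    }

    public static void main(String[] args) {
        // _JAVA_OPTIONS is an environment variable, so child JVMs inherit it too.
        System.out.println("_JAVA_OPTIONS = " + System.getenv("_JAVA_OPTIONS"));
        System.out.println("effective trust store type = "
                + effectiveTrustStoreType(System.getProperties()));
    }
}
```

Running it once with the environment variable set and once without shows whether the option is reaching the JVM at all.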

Ok, thanks for sharing the progress!
The problem arises inside this method, and it seems Elasticsearch is unable to find a suitable trust manager for the algorithm (PKIX) and the default trust store (null).
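
For reference, the kind of lookup the error message describes can be approximated with the standard JSSE API. This is a sketch based on the wording of the exception, not a copy of Elasticsearch's code, and `TrustManagerProbe` is a made-up name. It also runs outside Elasticsearch, so it may behave differently from the real startup path.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.Optional;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509ExtendedTrustManager;

// Hypothetical probe: asks the JVM for a trust manager the same general way
// the error message implies (algorithm PKIX, trust store null).
class TrustManagerProbe {

    static Optional<X509ExtendedTrustManager> findExtendedTrustManager(String algorithm, KeyStore trustStore)
            throws GeneralSecurityException {
        TrustManagerFactory factory = TrustManagerFactory.getInstance(algorithm);
        factory.init(trustStore); // null means "use the JVM's default trust store"
        for (TrustManager tm : factory.getTrustManagers()) {
            // instanceof is false for null entries, so those are skipped safely.
            if (tm instanceof X509ExtendedTrustManager) {
                return Optional.of((X509ExtendedTrustManager) tm);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        Optional<X509ExtendedTrustManager> tm = findExtendedTrustManager("PKIX", null);
        System.out.println(tm.isPresent()
                ? "found: " + tm.get().getClass().getName()
                : "no X509ExtendedTrustManager found");
    }
}
```

Running it with and without `-Djavax.net.ssl.trustStoreType=WINDOWS-ROOT` on the affected machine would at least show whether a plain JVM can build the trust manager with that option.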

Leo - I was able to complete the upgrade yesterday once this hurdle was removed. I’m not sure what’s changed recently to invalidate that _JAVA_OPTIONS setting - in this case I don’t think I’m specifically using it.

I truly appreciate your help to get things moved forward.


I’m going to revive this thread as now my LDAP integration is broken (it looks like they are now using a cert from the internal CA as opposed to a purchased cert). I’ve also cross-posted this to the Elastic forum:

Failed to find a X509ExtendedTrustManager in the trust manager factory for [PKIX] and truststore [null] - Elastic Stack / Elasticsearch - Discuss the Elastic Stack

The _JAVA_OPTIONS works on other systems - so my current guess is this is specifically related to something in Elastic. Was there an Elastic upgrade in SQ 9.5?

B