K8S 1.20
Sonarqube Community Edition 8.9.6.50800
Postgres:10.6 (AWS Managed Service)
Sonar (K8S/POD) sits behind NGINX/Controller, which in turn is behind AWS/LB. This very same architecture/setup used to work perfectly fine - no timeouts. As you can see below, nginx gets the request, though it times out. Nothing has changed other than sonar[ce] being upgraded from 8.2 to 8.9. One out of 10+ attempts goes through…
I'm not familiar with Nginx logging, and so I'm not able to see any timeouts in the logs you shared. Fortunately you also shared what's timing out: submission of an analysis report. You didn't say this explicitly, but I'm going to guess that when you say "One out of 10+ attempts goes through" you mean that one out of every 10 attempts to submit an analysis report goes through (and that all other requests to SonarQube, e.g. to paint the UI, succeed).
When we see analysis report submission timing out, it’s generally something on the network blocking the submission. And generally it’s about the size of the analysis report. That would explain why 1 in 10 gets through: the successful 1 fits under the allowed size limit. The other 9 are too big.
Why would this show up only after upgrade? Because with each new version of SonarQube, we add more rules (i.e. new rules raise new/more issues), and rules get smarter (i.e. fewer False Negatives → more issues from old rules). I haven’t done any statistical analysis, but it’s a fair bet that analysis reports get slightly larger with each new SonarQube version. And with this upgrade you hit the limits set in your internal network controls.
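As an aside, if you want to rule the proxy out quickly, it should be possible to dump the effective NGINX config from the ingress controller and look at the size and timeout limits it applies. This is only a sketch; the namespace and workload names are placeholders for whatever your controller is actually called:

# Dump the rendered NGINX config and check the limits it applies (names are placeholders)
kubectl -n ingress-nginx exec deploy/ingress-nginx-controller -- nginx -T | grep -E 'client_max_body_size|client_body_timeout|proxy_read_timeout|proxy_send_timeout'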
Thanks for taking the time to not only look into it but also share your thoughts. Appreciated. This is what makes this community GREAT! I am sorry I did not mention that it was the submission of the analysis report that times out. In the NGINX logs, you can see a 408 ["status": "408" == 'request timeout'] and "request_time": "60.004", and truth be told, 60 seconds is a very long timer. Though it is a client 4XX kind of error, it is not necessarily a client problem. If the body were too large, we would have expected a 413 right away and not a 408. How can we check how much time SonarQube is taking? We have had Sonar for a few years now and it has gone through quite a few upgrades, though this is the first time we get a considerable number of 408s (the default timer is 60 seconds). We do understand what you mentioned, though it would be good to connect the dots, that is, measure exactly how much time Sonar takes and, assuming it is acceptable, perhaps increase the default 60 seconds. Any thoughts? thanks /Pedro
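PS: for reference, if it does come down to the proxy timer or body size, I assume we could raise them per Ingress with the usual ingress-nginx annotations, along these lines (values are only examples, and the namespace/Ingress names are placeholders):

# Raise the proxy read timeout and allowed body size on the SonarQube Ingress (example values)
kubectl -n <sonarqube-namespace> annotate ingress <sonarqube-ingress> nginx.ingress.kubernetes.io/proxy-read-timeout=300 nginx.ingress.kubernetes.io/proxy-body-size=64m --overwrite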
The timeout you're seeing is "I never got a response when I submitted the analysis report". That's it. The scanner submits the report and SonarQube says "Yup! Got it!" So the question is not how long SonarQube is taking, but the transit time across your network. That's what leads me to believe that the problem is not a timer in SonarQube, but somewhere else in your network. The post is being rejected because it's larger than the (somewhat) arbitrary size limit baked into your system.
But okay. If you really want to verify this, watch the SonarQube access logs at analysis completion to see when the POST arrives at SonarQube.
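The access log sits next to web.log and ce.log under <SONARQUBE_HOME>/logs (for example /opt/sonarqube/logs in the official image, assuming you haven't redirected logging). If memory serves, the scanner submits the report with a POST to api/ce/submit, so something along these lines should show when (and whether) it actually reaches SonarQube; the pod name is a placeholder:

# Watch the SonarQube access log for the report submission (pod name is a placeholder)
kubectl exec <sonarqube-pod> -- tail -f /opt/sonarqube/logs/access.log | grep 'POST /api/ce/submit'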
Thanks, though I am not really sure I follow you here. So allow me to better explain the setup and the problem. Sorry I did a bad job of that.
Jenkins (client) sends the request, which promptly makes it to NGINX (the last hop in the network path before SonarQube).
In 9 out of 10 attempts, this leads to a request timeout from NGINX's point of view (this is the "408" I shared earlier).
The NGINX default timer I mentioned is 60 seconds; NGINX forwards the request to SonarQube and, 9 out of 10 times, throws a 408 because the 60-second timer expired before SonarQube was able to send a response back.
nginx (used as a reverse proxy) is the last network hop before SonarQube. I can tell you that nginx gets every single one of these requests, though please advise where Sonar's access logs are located. There are many possibilities here and it would be good to reduce the search domain. Consider that just before the Sonar upgrade from 8.2 (the only change made), this really never occurred - which does not mean it is Sonar, and that's why I reached out to this community: to get some help and guidance. Please let me know where the access logs are and I can definitely check on them. thanks /Pedro
It did not take much to reproduce it and, as you can see in the logs, the request is received by Sonar. Please check it out (attached file with debug logs). Please let me know your thoughts. Cheers /Pedro
Here’s the part of your log that’s interesting to me:
2022.01.28 22:18:32 DEBUG ce[][o.postgresql.Driver] Connecting with URL: jdbc:postgresql://infra-postgresql.cluster-cskx4cpsa.us-west-1.rds.amazonaws.com:5432/sonarqube
2022.01.28 22:18:32 DEBUG ce[][o.p.c.v.ConnectionFactoryImpl] Trying to establish a protocol version 3 connection to infrastructure-postgresql.cluster-csuktjafy04g.us-east-1.rds.amazonaws.com:5432
2022.01.28 22:18:32 DEBUG ce[][o.p.c.v.ConnectionFactoryImpl] Receive Buffer Size is 6,291,456
2022.01.28 22:18:32 DEBUG ce[][o.p.c.v.ConnectionFactoryImpl] Send Buffer Size is 6,291,456
2022.01.28 22:18:32 DEBUG ce[][o.p.ssl.MakeSSL] converting regular socket connection to ssl <<--Pedro: Last log line before Jenkins receives:
This is SonarQube trying to talk to your DB. I’ve been assuming the block was between Jenkins & SonarQube (because 99.9% of the time that’s where it is) but it looks instead like the problem is between SonarQube and the DB. It looks like the failure is in trying to store the analysis report in the DB while the job is queued for processing.
Thank you. Appreciated. The Sonar logs do not show a failure, do they? The last line shows that it is trying to convert a regular socket to SSL, and that's it. If there were a connectivity problem or an SSL handshake problem, then I'd have expected Sonar to throw an error, but it does not. AWS/CloudWatch, at a glance, does not show any problems at all. Perhaps the analysis report is too big and sometimes Sonar takes more than 60 seconds to process it? There is a client (Jenkins) to server (Sonar) leg that is established and awaiting an answer (and the default timer there is a very conservative 60 seconds). Now, if Sonar is slow, or has problems, or takes a long time to push the analysis to the DB, then we would need help to check that. Something is fishy here because, again, the only thing that changed in the entire setup was Sonar. Nothing else got changed and/or upgraded.
That’s just it. From the log snippet, it looks like this is the initial receipt by SonarQube & storage of the analysis report in the DB for later processing. There’s no attempt to process it yet. That happens asynchronously, later.
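As a side note, if you want to see how long that later processing actually takes, the Compute Engine reports it after the fact. A sketch, with the token and host as placeholders:

# List recent background tasks with their timings (submittedAt / executedAt / executionTimeMs are in the JSON)
curl -s -u <admin-token>: 'https://<your-sonarqube>/api/ce/activity?ps=10' | jq '.tasks[] | {status, submittedAt, executedAt, executionTimeMs}'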
The file you provided earlier ends at
sonarqube-sonarqube-5fc55f4947-qkcdc sonarqube 2022.01.28 22:18:32 DEBUG ce[o.p.ssl.MakeSSL] converting regular socket connection to ssl <<--Pedro: Last log line before Jenkins receives:
From Jenkins/CI:
[okhttp3.internal.http2.StreamResetException: stream was reset: CANCEL, okhttp3.internal.http2.StreamResetException: stream was reset: CANCEL]
What happens in ce.log and web.log after Jenkins gets that message?
Yes, and I explained in my initial response why a SonarQube upgrade could result in a slightly larger analysis report.
The log shared has ce and web as well. Have you had a chance to follow through the log to check on both? It is all there and in debug mode. Something is still fishy here because the client (Jenkins) to server (Sonar) timeout of 60 seconds (quite a lot, as in most cases the response should either be given right away, or handled asynchronously so as not to keep the connection open for nothing) does not seem to be enough, most of the time, for Sonar to handle the analysis report. Even if Sonar adds more checks and so on, if someone decides to take advantage of most of them, then Sonar would not operate properly, as the client/server times out before Sonar is done. This is why I was wondering how we can check on why Sonar takes so long, to ensure we are not missing anything? thanks /Pedro
This is exactly why I asked you to check on the connection between SonarQube and Postgres. What happens when Jenkins contacts SonarQube at the end of analysis is that the analysis report is submitted to the server and SonarQube stores the report in the DB for later processing.
Not coincidentally, I believe, the last thing we see in your log is SonarQube trying to establish a connection with Postgres.
Again, can you check on what’s going on between SonarQube and Postgres?
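A quick way to check that leg, sketched with placeholder names (a throwaway pod won't have exactly the same pod identity, but it exercises the same namespace and VPC path; the image tag is just chosen to roughly match your server version):

# Ask Postgres from inside the cluster whether it is accepting connections on 5432
kubectl -n <sonarqube-namespace> run -it --rm dbcheck --image=postgres:10 --restart=Never -- pg_isready -h <rds-endpoint> -p 5432 -t 10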
That trace is recurrent and, even in the working case, it prints the very same lines. Therefore, it does not seem to indicate a problem. Moreover, there are no transient problems between Sonar and the DB as far as we can tell. We can try to allocate even more heap/memory to the client (Jenkins node) to see if the transport/cancel (originally reported) still occurs, though at a glance we have not seen OOM / heap exhaustion. Cheers /P
Good morning.
Sorry it took me a bit to get back, but we have been trying to reduce the search domain by running different tests and going through a lot of logs. The latest on this one is that we cannot reproduce it when I run it locally, though it happens intermittently - as reported herein - when exercised via Jenkins/CI (K8S Jenkins node), which never happened before this upgrade was done.
Regarding your request - do you want the whole log from during and after the Jenkins timeout? Cheers /Pedro