Make Azure DevOps build tasks atomic to the agent

Hi all

Some context:

I am trying to solve an interesting problem, but at the moment I am struggling. We have an on-premises SonarQube server and a set of agents that can communicate with both Azure DevOps and the SonarQube server. The SonarQube server itself sits behind a company firewall that prevents the Microsoft-hosted Azure Pipelines agents from communicating with it.

As more teams have adopted SonarQube, the load on the agent pool has built up dramatically. Many teams have multiple jobs running per build pipeline, and they all run directly on the agent pool, which consists of only 3 VMs. So to handle the load, I installed 3 instances of the Azure Pipelines agent service on each VM, as you can see in the screenshot below.

The problem:

The problem we are having is that when multiple builds run on a single VM, all using the MSBuild Sonar Scanner, we get unexpected and inconsistent build errors. Initially my assumption was that one of the security tools running on our servers was the problem, but after further investigation, it seems to be related to the fact that the SonarQubePrepare@4 task appears to install a global dotnet tool, or something to that effect. When multiple builds are running, one can influence the state of the Sonar Scanner for the other. We end up with a variety of errors, but the most common is the following:

##[error]15:30:11.322 The SonarScanner for MSBuild integration failed: SonarQube was unable to collect the required information about your projects. 
Possible causes: 
 1. The project has not been built - the project must be built in between the begin and end steps 
 2. An unsupported version of MSBuild has been used to build the project. Currently MSBuild 14.0.25420.1 and higher are supported. 
 3. The begin, build and end steps have not all been launched from the same folder 
 4. None of the analyzed projects have a valid ProjectGuid and you have not used a solution (.sln)

I searched and searched for solutions to this error until I eventually found this post this morning: SonarQube impacting seperate agent job?

This post seems to deal with the issue but doesn’t resolve it for me, and here’s why:

In my case, I am managing 3 VMs (9 agents) that are available to all teams within my organization, and they can and do use them as they need. It is not feasible for every team to add configuration to their .csproj or .sqlproj files to tell the scanner whether or not to run. What we need for our use case is for the SonarQubePrepare@4, SonarQubeAnalyze@4 and SonarQubePublish@4 tasks to be totally isolated and atomic to their $(Pipeline.Workspace).

Example of the scenario that’s causing the race condition:

Team A has a build running on Agent1_1 which has passed SonarQubePrepare@4 and is currently running a dotnet command, say, dotnet build. During this time, Team B triggers a build on Agent1_2, so there are now two builds running on the Agent1 VM. Team B's first task runs SonarQubePrepare@4, which updates global dotnet SDK state shared by both agents. A little time passes, Team A's build on Agent1_1 reaches the SonarQubeAnalyze@4 task, and it fails with the error message detailed above. Not long after, Team B's build on Agent1_2 passes.
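For context, each affected pipeline runs the tasks in roughly this order (a minimal sketch; the service connection name, project key and solution name are placeholders, not our real values):

```yaml
# Minimal sketch of the per-team task sequence (placeholder names throughout).
steps:
  - task: SonarQubePrepare@4          # "begin" step - installs the global scanner state
    inputs:
      SonarQube: 'OnPremSonarQube'    # placeholder service connection name
      scannerMode: 'MSBuild'
      projectKey: 'TeamA.Project'     # placeholder project key
  - script: dotnet build TeamA.sln    # the build must happen between begin and end
  - task: SonarQubeAnalyze@4          # "end" step - this is where the race surfaces
  - task: SonarQubePublish@4
    inputs:
      pollingTimeoutSec: '300'
```

Nothing in this sequence references another pipeline, yet the begin/end steps of two such pipelines on the same VM can still interfere.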

Configuration:

3 VMs - Agent1, Agent2 and Agent3 with 3 instances of the Azure Pipelines service running under the following directories:
C:\agent\1\...
C:\agent\2\...
C:\agent\3\...
These VMs are running Windows Server 2019 Standard, each with 6 cores, 12 threads, 12 GB of RAM and a 10 Gb network connection.
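For anyone comparing setups, each of the three instances was registered from its own directory with something along these lines (a sketch; the organization URL, PAT variable, pool and agent names are placeholders for our real values):

```cmd
rem Run once from each of C:\agent\1, C:\agent\2 and C:\agent\3,
rem changing --agent each time (Agent1_1, Agent1_2, Agent1_3).
rem The URL, token and pool below are placeholders.
.\config.cmd --unattended ^
  --url https://dev.azure.com/YourOrg ^
  --auth pat --token %AZP_TOKEN% ^
  --pool SelfHostedPool ^
  --agent Agent1_1 ^
  --runAsService
```

Each instance gets its own service and its own work directory, which is exactly why I expected them to be isolated from each other.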

The desired state:

3 VMs with 3 agent services each, all able to run the SonarQube tasks without influencing each other, and without having to update the configuration of every single team's project that uses the SonarQube scan in their pipelines.

Is this currently possible? My research tells me no - hence 2 of the 3 agents have been disabled in the screenshot above - as does the article I linked above.

Thanks & looking forward to hearing back!


I guess my main reason for posting here is that, to my knowledge, Microsoft's intention for the Azure Pipelines agent service is that it should be totally atomic to its own install location (when configured as such, as I have), but the SonarQube tasks seem to violate that design thinking. It would be nice if I could leverage the horsepower on these VMs rather than build more to scale the agent pool.

I am also interested in this scenario, because my self-hosted agent setup is almost identical (3 VMs, 3 agents on each). Currently we have decent throughput of jobs because there can be 9 builds running at once. I am concerned that once we add SonarQube analysis tasks to the builds, our throughput will drop to 1 job per machine (3 total), which would start to cause backlogs, wait times, and annoyed developers. I am also skeptical of our ability to get more VMs approved due to (justified) cost concerns. (SonarQube EE and the resources to host it are $$ enough already.)

Per a note in the Microsoft agent documentation:

Although multiple agents can be installed per machine, we strongly suggest to only install one agent per machine. Installing two or more agents may adversely affect performance and the result of your pipelines.

But that suggestion is, in my mind, negated by the Hardware specs for self-hosted Windows agents:

… As a point of reference, the Azure DevOps team builds the hosted agents code using pipelines that utilize hosted agents. On the other hand, the bulk of the Azure DevOps code is built by 24-core server class machines running 4 self-hosted agents apiece.

I agree that multiple self-hosted agents on a single machine is a real and valid scenario. Any Azure Pipeline tasks executing on one agent should not interfere with tasks executing on another agent on the same system.

I have managed to find a workaround to this problem: running the agents in Docker containers rather than sharing a single OS. This is a slightly more involved solution, but it does work.

You can find a guide on how to create a self-hosted Azure Pipelines agent here: https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/docker?view=azure-devops

And further guides on how to set up Visual Studio Build Tools in a Dockerfile here: https://docs.microsoft.com/en-us/visualstudio/install/build-tools-container?view=vs-2019

Just remember to install Node.js, a JRE and a JDK, and to update the Path and JAVA_HOME variables through PowerShell commands, as they're all needed for the SonarQube scanner to run.
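As a sketch of that last point, the environment variables can be set machine-wide inside the image with commands along these lines (the C:\jdk and C:\nodejs paths are placeholder install locations, not values from the guides):

```powershell
# Sketch only: run during the image build (e.g. in a Dockerfile RUN step).
# 'C:\jdk' and 'C:\nodejs' are placeholder install locations.
[Environment]::SetEnvironmentVariable('JAVA_HOME', 'C:\jdk', 'Machine')
$machinePath = [Environment]::GetEnvironmentVariable('Path', 'Machine')
[Environment]::SetEnvironmentVariable('Path', "$machinePath;C:\jdk\bin;C:\nodejs", 'Machine')
```

Setting them at Machine scope ensures the agent service picks them up regardless of which user account it runs under.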

This can work around the problem, but I do not want to close this issue, as it is a genuine design fault in the Azure DevOps SonarQube/SonarCloud tasks that needs to be resolved.

Hi @KrylixZA and @ahaleiii

Thanks for your insights on that. We're already aware that concurrent builds may lead to unwanted behavior. This is caused by our targets file, which we install globally during the begin step and delete during the end step. So once a build has reached the Run Code Analysis (end) step, it deletes the targets file and can potentially prevent another parallel build from analyzing its code.

We’re working on it :wink:

Mickaël


Any progress on this? Even without parallel jobs, the settings sometimes seem to "leak" into other jobs that run later on the same agent. See: SonarQube impacting seperate agent job? - #16 by PaulVrugt