103 commits
ceebd22
feat: bump windows image version for 2026-03B (#8074)
rchincha Mar 13, 2026
57cd406
feat(rcv1p): unify cert bootstrap flow and add Windows CA refresh task
rchincha Mar 16, 2026
965f64f
feat: enhance CA certificates refresh task with endpoint mode based o…
rchincha Mar 18, 2026
7dba267
feat: add tests for certificate endpoint mode handling in AKS custom …
rchincha Mar 19, 2026
8b629de
feat: simplify certificate endpoint mode handling and refresh task re…
rchincha Mar 19, 2026
394c3e0
feat: implement conditional CA certificates refresh task registration…
rchincha Mar 19, 2026
1a3c77f
feat: enhance CA certificates refresh task registration for legacy CS…
rchincha Mar 19, 2026
53abecc
feat: update tests for certificate endpoint mode handling and refresh…
rchincha Mar 20, 2026
d0bb7e6
feat: refactor test setup functions for improved readability and cons…
rchincha Mar 20, 2026
be9ddef
feat: update Get-CustomCloudCertEndpointModeFromLocation to clarify e…
rchincha Mar 20, 2026
4678c46
feat: enhance tests for Should-InstallCACertificatesRefreshTask and G…
rchincha Mar 20, 2026
c0bec67
feat: update cse_cmd.sh and cse_cmd.sh.gtpl to ensure consistent logg…
rchincha Mar 25, 2026
65924a2
feat: update CA certificates functions for backward compatibility wit…
rchincha Mar 26, 2026
133e6c6
feat: remove deprecated Ubuntu repository initialization logic from i…
rchincha Mar 27, 2026
c60e3c7
Split init-aks-custom-cloud.sh to fix Flatcar/ACL customData size limit
rchincha Apr 2, 2026
cf59a85
feat(e2e): add RCV1P cert mode end-to-end tests
rchincha Apr 13, 2026
97c4f5c
Address PR review feedback: fix multi-subscription, validation, and e…
rchincha Apr 14, 2026
d8dd1fb
Add Windows not-opted-in negative test for RCV1P cert mode
rchincha Apr 14, 2026
e346cf9
e2e: add VM instance-level tag update for RCV1P wireserver opt-in
rchincha Apr 16, 2026
ef80be3
e2e: use JSON injection for VM profile tags at VMSS creation time
rchincha Apr 16, 2026
084a5a5
e2e: use lightweight PATCH for VM instance tags instead of JSON injec…
rchincha Apr 16, 2026
ef89698
Revert "e2e: use lightweight PATCH for VM instance tags instead of JS…
rchincha Apr 16, 2026
6da97f8
e2e: use Microsoft.Resources/tags API for VM instance tag patching
rchincha Apr 16, 2026
12bc156
e2e: use BeginUpdate + deferred CSE for VM instance tagging
rchincha Apr 16, 2026
180254e
e2e: add feature flag check for RCV1P subscription
rchincha Apr 17, 2026
2328667
REVERT ME: poll wireserver IsOptedInForRootCerts with retry loop
rchincha Apr 17, 2026
e76bc3f
e2e: always log PlatformSettingsOverride feature flag status
rchincha Apr 17, 2026
8780afe
fix(windows): parse wireserver IsOptedInForRootCerts JSON with Conver…
rchincha Apr 17, 2026
9d1a2f1
e2e: make RCV1P_SUBSCRIPTION_ID optional with feature flag auto-detec…
rchincha Apr 18, 2026
fe84342
e2e: always collect Windows CSE logs (not just on failure)
rchincha Apr 18, 2026
b6651ec
fix: add wireserver HTTP error diagnostic logging for cert endpoints
rchincha Apr 19, 2026
d9a4539
e2e: use testDir() for Windows CSE output log path consistency
rchincha Apr 20, 2026
421fbe9
fix(e2e): filter CSE extension to fix empty Windows CSE log files
rchincha Apr 21, 2026
2f33739
fix(e2e): re-fetch VM instance view for fresh CSE extension status
rchincha Apr 21, 2026
bcfed44
e2e: trim whitespace from RCV1P_SUBSCRIPTION_ID to fix gating
rchincha Apr 21, 2026
82f2983
e2e: add gen2 Windows RCV1P tests, fix Windows2025 TrustedLaunch
rchincha Apr 22, 2026
829218d
e2e: switch RCV1P tests to Azure CNI Overlay to fix IP exhaustion
rchincha Apr 22, 2026
0ce609f
e2e: revert RCV1P from overlay back to kubenet
rchincha Apr 22, 2026
f1e42b6
REVERT ME: use dedicated kubenet cluster for RCV1P tests
rchincha Apr 23, 2026
c20ef57
REVERT ME: use Azure CNI cluster for Windows RCV1P tests
rchincha Apr 23, 2026
0b1ed48
REVERT ME: add wireserver endpoint diagnostics to Windows RCV1P valid…
rchincha Apr 23, 2026
8d71344
fix: use correct wireserver JSON field name for rcv1p cert download
rchincha Apr 23, 2026
374b84b
REVERT ME: add azcopy error logging to Windows log collection
rchincha Apr 23, 2026
98eca0a
REVERT ME: enable verbose test output for azcopy/wireserver diagnostics
rchincha Apr 23, 2026
71a344b
REVERT ME: canary check to prove whether SSH validators are broken
rchincha Apr 24, 2026
016e8f9
Remove canary check - validators confirmed working
rchincha Apr 24, 2026
5721d10
fix: make wireserver cert retrieval failures fatal on Linux
rchincha Apr 24, 2026
13f9833
revert: remove diagnostic commits used during RCV1P development
rchincha Apr 25, 2026
38e4e6d
fix: make wireserver unreachable fatal for RCV1P opt-in check
rchincha Apr 26, 2026
eb412d2
fix: use RCV1P Azure CNI cluster for Windows tests when explicit subs…
rchincha Apr 27, 2026
057a92b
fix: replace legacy ca-refresh cron entry with location-aware version
rchincha Apr 27, 2026
fd6df99
fix: align Windows wireserver retries to 10 to match Linux parity
rchincha Apr 27, 2026
9669db8
fix: enhance RCV1P opt-in tag handling in VMSS creation process
rchincha Apr 29, 2026
bf76e25
fix: use Azure CNI cluster for Windows RCV1P tests
rchincha May 6, 2026
cdba52f
revert: drop 'REVERT ME' cluster switching commits (now superseded)
rchincha May 6, 2026
1bd253b
revert: drop canary validator and wireserver polling debug commits
rchincha May 6, 2026
9c24851
feat(e2e): auto-detect RCV1P feature flag on E2E subscription
rchincha May 7, 2026
3732a24
fix(e2e): skip NotOptedIn tests on auto-detected enrolled subscriptions
rchincha May 7, 2026
2af3e4c
fix(e2e): use caller context in getCustomScriptExtensionStatus
rchincha May 7, 2026
9466de4
fix(e2e): remove TrustedLaunch from non-Gen2 Windows 2025 RCV1P test
rchincha May 7, 2026
54ffb32
fix: return code 2 when wireserver is unreachable in is_opted_in_for_…
rchincha May 7, 2026
f0a1396
fix: throw when opted-in but no certs downloaded with -FailOnError
rchincha May 7, 2026
56bf65d
e2e: use branch-built CSE zip for Windows RCV1P tests
rchincha May 7, 2026
d5c4f5c
fix: parse wireserver IsOptedInForRootCerts JSON response with jq
rchincha May 8, 2026
e335c3d
fix(e2e): update BootstrapConfigMutator signatures after rebase
rchincha May 8, 2026
2b4d429
fix: fail process_cert_operations when no cert bodies are saved
rchincha May 8, 2026
83ac070
fix: pass repodepot_endpoint explicitly to add_key_ubuntu and add_ms_…
rchincha May 8, 2026
63383eb
chore(e2e): remove REVERT ME wireserver diagnostic block from Windows…
rchincha May 8, 2026
27d5086
fix: guard against unresolved ADO pipeline variable expressions in RC…
rchincha May 18, 2026
9d0da88
fix: update for main branch API changes (getClusterVNet, remove Windo…
rchincha Jun 1, 2026
4eb13f5
fix: fail fast if LOCATION is empty when installing ca-refresh schedule
rchincha Jun 2, 2026
2d414ea
e2e: filter transient waagent ProtocolError in ValidateWaagentLog
rchincha Jun 2, 2026
4353cd4
e2e: simplify RCV1P to single-subscription-per-job model
rchincha Jun 2, 2026
cadcff8
init-aks-custom-cloud: add telemetry events for cert provisioning
rchincha Jun 4, 2026
6b31a42
e2e: skip NotOptedIn tests when tags are auto-injected
rchincha Jun 11, 2026
a5139fa
e2e: use ab-e2e-tme-rcv1p variable group for RCV1P pipeline
rchincha Jun 15, 2026
308eff9
fix: correct typo 'usuable' → 'usable' in chrony comment
rchincha Jun 17, 2026
fb8996b
fix: remove duplicate Register-NodeResetScriptTask call in BasePrep
rchincha Jun 17, 2026
46111f5
style: add missing space before inline comment in rcv1p win test
rchincha Jun 17, 2026
7b8b174
fix: fail hard if legacy CA cert trust store install fails
rchincha Jun 17, 2026
a12a0d8
fix: remove confusing '(true on MSFT tenant)' from displayName
rchincha Jun 18, 2026
8e1167c
docs: add comment explaining rcv1pTagsAutoInjected=false
rchincha Jun 18, 2026
f4965bb
fix: remove redundant cron schedule from e2e-rcv1p pipeline
rchincha Jun 18, 2026
708ca07
fix: reword skip message to reference environment, not tenant
rchincha Jun 18, 2026
1fd6fef
fix: parse AFEC feature flag response as JSON instead of string contains
rchincha Jun 18, 2026
54b48f3
docs: clarify deferred extension pattern is E2E-specific
rchincha Jun 18, 2026
dc3455a
fix: remove vmssResp2 code smell, assign directly to vmssResp
rchincha Jun 18, 2026
01a4590
fix: add logs_to_events telemetry for chrony restart
rchincha Jun 18, 2026
d36c6a4
fix: guard initAKSCustomCloudRepos with IsAKSCustomCloud in nodecusto…
rchincha Jun 18, 2026
9e34e6f
fix(e2e): expose subscriptionId parameter on e2e-tme.yaml
rchincha Jun 25, 2026
260f3c3
fix(init-aks-custom-cloud): use fixed-string grep for crontab cleanup
rchincha Jun 25, 2026
0c16a4d
refactor(e2e): remove dead per-scenario SubscriptionID override
rchincha Jun 25, 2026
9e172c8
test(parts): add ShellSpec coverage for init-aks-custom-cloud-repos.sh
rchincha Jun 25, 2026
d5108b9
fix(e2e): close file immediately in CSE zip walk to avoid FD accumula…
rchincha Jun 25, 2026
db670a2
test(cse-windows): re-stub Set-ExitCode after dot-sourcing to prevent…
rchincha Jun 25, 2026
67e5645
fix(e2e): override E2E_SUBSCRIPTION_ID pipeline variable with subscri…
rchincha Jun 25, 2026
2e68f4b
fix(rcv1p): fail hard when installing rcv1p CA certs to trust store f…
rchincha Jun 25, 2026
892116a
fix(e2e): use empty default for subscriptionId param to avoid cyclica…
rchincha Jun 25, 2026
5ddbc1e
fix(rcv1p): address copilot review comments
rchincha Jun 25, 2026
bba6793
fix(e2e): override RCV1P-incompatible settings when running TME e2e a…
rchincha Jun 26, 2026
f05669c
fix(e2e): detect RCV1P sub via E2E_SUBSCRIPTION_ID_RCV1P instead of h…
rchincha Jun 26, 2026
95a16c3
docs(e2e): expand reviewer comments on RCV1P override block
rchincha Jun 26, 2026
8b75bf9
fix(e2e): make blob storage account name subscription-unique
rchincha Jun 26, 2026
16 changes: 16 additions & 0 deletions .pipelines/e2e-rcv1p.yaml
@@ -0,0 +1,16 @@
+name: $(Date:yyyyMMdd)$(Rev:.r)
+variables:
+  TAGS_TO_RUN: "rcv1pcertmode=true"
+  SKIP_E2E_TESTS: false
+  E2E_GO_TEST_TIMEOUT: "75m"
+trigger: none
+pr: none
+jobs:
+  - template: ./templates/e2e-template.yaml
+    parameters:
+      name: RCV1P Cert Mode Tests
+      IgnoreScenariosWithMissingVhd: false

@r2k1 r2k1 May 15, 2026
Contributor

Who is going to monitor this pipeline and address any issues?

Contributor

We should probably include an explicit run of this pipeline in the daily build system we use for official releases; that way we're guaranteed to have visibility during official release flows.

Though at the end of the day, it's going to be on us to deal with failures.

Contributor Author

This should be enabled in the TME tenant, probably as an async nightly, so that it doesn't interfere with "immediate" tests (PRs, etc.).

+      variableGroup: ab-e2e-tme-rcv1p
+      # The RCV1P testing subscription does not have platform auto-injection enabled,
+      # so the E2E framework explicitly injects opt-in tags on each VMSS.
+      rcv1pTagsAutoInjected: "false"
Comment thread
rchincha marked this conversation as resolved.
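
The pipeline comment above says the E2E framework injects opt-in tags itself when the platform does not. A minimal Go sketch of how a test helper might act on `RCV1P_TAGS_AUTO_INJECTED` is shown here. The function names and the tag key `example-rcv1p-opt-in` are hypothetical placeholders, since the real tag name and helper code are not visible in this part of the diff.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// optInTagKey is a hypothetical placeholder; the real tag name is not shown in this diff.
const optInTagKey = "example-rcv1p-opt-in"

// tagsAutoInjected parses the RCV1P_TAGS_AUTO_INJECTED value. An empty value
// falls back to true, matching the template parameter's default of "true".
func tagsAutoInjected(raw string) (bool, error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return true, nil
	}
	return strconv.ParseBool(raw)
}

// vmssTags returns a copy of base and adds the opt-in tag only when the
// platform is not expected to inject it.
func vmssTags(base map[string]string, autoInjected bool) map[string]string {
	out := make(map[string]string, len(base)+1)
	for k, v := range base {
		out[k] = v
	}
	if !autoInjected {
		out[optInTagKey] = "true"
	}
	return out
}

func main() {
	auto, err := tagsAutoInjected("false")
	if err != nil {
		panic(err)
	}
	fmt.Println(vmssTags(map[string]string{"owner": "e2e"}, auto))
}
```

Returning an error for an unparseable value, rather than silently assuming a default, keeps a mistyped pipeline parameter from quietly changing which tests run.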
6 changes: 6 additions & 0 deletions .pipelines/e2e-tme.yaml
@@ -1,4 +1,9 @@
 name: $(Date:yyyyMMdd)$(Rev:.r)
+parameters:
+  - name: subscriptionId
+    type: string
+    displayName: Subscription ID to use for E2E tests (empty = use variable group default)
+    default: ""
 variables:
   SKIP_E2E_TESTS: false

@@ -8,4 +13,5 @@ jobs:
       name: Linux Tests
       IgnoreScenariosWithMissingVhd: false
       variableGroup: ab-e2e-tme
+      subscriptionId: ${{ parameters.subscriptionId }}

3 changes: 3 additions & 0 deletions .pipelines/scripts/e2e_run.sh
@@ -17,6 +17,9 @@ set -euo pipefail
 az account set -s "${E2E_SUBSCRIPTION_ID}"
 echo "Using subscription ${E2E_SUBSCRIPTION_ID} for e2e tests"

+# Map E2E_SUBSCRIPTION_ID to SUBSCRIPTION_ID which the Go test framework reads
+export SUBSCRIPTION_ID="${E2E_SUBSCRIPTION_ID}"
+
 # Setup go
 export GOPATH="$(go env GOPATH)"
 go version
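
The script change above exports `SUBSCRIPTION_ID` for the Go test framework to read. As a rough illustration of the consuming side, the sketch below reads that variable, trims stray whitespace, and rejects a value that still looks like an unexpanded `$(...)` pipeline expression. The function name `subscriptionID` and both checks are assumptions made for illustration; the framework's actual reading code is not part of this diff.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// subscriptionID reads SUBSCRIPTION_ID through the supplied getenv function
// (os.Getenv in production, a stub in tests) and validates it.
func subscriptionID(getenv func(string) string) (string, error) {
	id := strings.TrimSpace(getenv("SUBSCRIPTION_ID"))
	if id == "" {
		return "", errors.New("SUBSCRIPTION_ID is empty")
	}
	// A literal "$(NAME)" means the pipeline never substituted the variable.
	if strings.HasPrefix(id, "$(") {
		return "", fmt.Errorf("SUBSCRIPTION_ID looks like an unresolved pipeline expression: %q", id)
	}
	return id, nil
}

func main() {
	id, err := subscriptionID(os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using subscription", id)
}
```

Passing `getenv` as a parameter keeps the function testable without mutating the process environment.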
16 changes: 13 additions & 3 deletions .pipelines/templates/e2e-template.yaml
@@ -12,14 +12,24 @@ parameters:
     default: ab-e2e
   - name: subscriptionId
     type: string
-    displayName: Subscription ID to use for E2E tests
-    default: $(E2E_SUBSCRIPTION_ID)
+    displayName: Subscription ID to use for E2E tests (empty = use variable group default)
+    default: ""
+  - name: rcv1pTagsAutoInjected
+    type: string
+    displayName: Whether the platform auto-injects RCV1P opt-in tags on all VMSSes
+    default: "true"

 jobs:
   - job: e2e
     condition: and(succeeded(), ne(variables.SKIP_E2E_TESTS, 'true'))
     variables:
       - group: ${{parameters.variableGroup}} # all variables prefixed with E2E_* come from this variable group
+      # When a caller (e.g. aks-rp orchestrator) explicitly passes subscriptionId,
+      # override E2E_SUBSCRIPTION_ID from the variable group so the run targets the
+      # requested subscription (e.g. RCV1P). When empty, keep the variable group default.
+      - ${{ if ne(parameters.subscriptionId, '') }}:
+        - name: E2E_SUBSCRIPTION_ID
+          value: ${{parameters.subscriptionId}}
     pool:
       name: $(E2E_POOL_NAME)
     timeoutInMinutes: 90
@@ -41,13 +51,13 @@ jobs:
           bash .pipelines/scripts/e2e_run.sh
         displayName: Run AgentBaker E2E
         env:
-          E2E_SUBSCRIPTION_ID: ${{parameters.subscriptionId}}
           SYS_SSH_PUBLIC_KEY: $(SYS_SSH_PUBLIC_KEY)
           SYS_SSH_PRIVATE_KEY_B64: $(SYS_SSH_PRIVATE_KEY_B64)
           BUILD_SRC_DIR: $(System.DefaultWorkingDirectory)
           DefaultWorkingDirectory: $(Build.SourcesDirectory)
           VHD_BUILD_ID: $(VHD_BUILD_ID)
           IGNORE_SCENARIOS_WITH_MISSING_VHD: ${{parameters.IgnoreScenariosWithMissingVhd}}
+          RCV1P_TAGS_AUTO_INJECTED: ${{parameters.rcv1pTagsAutoInjected}}

       - task: PublishTestResults@2
         displayName: Upload test results
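
The conditional variable block in this template amounts to "a non-empty parameter wins, otherwise keep the variable group value". The Go sketch below models that precedence only; it is not how Azure Pipelines evaluates templates, and the names `variable` and `jobVariables` are made up for illustration.

```go
package main

import "fmt"

// variable is a simplified stand-in for one name/value pair from a variable group.
type variable struct {
	Name  string
	Value string
}

// jobVariables starts from the variable group and overrides E2E_SUBSCRIPTION_ID
// only when the caller passed a non-empty subscriptionId parameter.
func jobVariables(group []variable, subscriptionParam string) map[string]string {
	vars := make(map[string]string, len(group)+1)
	for _, v := range group {
		vars[v.Name] = v.Value
	}
	if subscriptionParam != "" {
		vars["E2E_SUBSCRIPTION_ID"] = subscriptionParam
	}
	return vars
}

func main() {
	group := []variable{{Name: "E2E_SUBSCRIPTION_ID", Value: "group-default"}}
	fmt.Println(jobVariables(group, "")["E2E_SUBSCRIPTION_ID"])
	fmt.Println(jobVariables(group, "explicit-sub")["E2E_SUBSCRIPTION_ID"])
}
```

Using an empty string as the "not set" sentinel is what lets the parameter default change from `$(E2E_SUBSCRIPTION_ID)` to `""` without altering behavior for callers that pass nothing.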
7 changes: 6 additions & 1 deletion aks-node-controller/parser/helper.go
@@ -64,6 +64,7 @@ func getFuncMap() template.FuncMap {
 	return template.FuncMap{
 		"getInitAKSCustomCloudFilepath": getInitAKSCustomCloudFilepath,
 		"getIsAksCustomCloud": getIsAksCustomCloud,
+		"getCloudLocation": getCloudLocation,
 	}
 }

@@ -538,11 +539,15 @@ func getIsAksCustomCloud(customCloudConfig *aksnodeconfigv1.CustomCloudConfig) b
 	return strings.EqualFold(customCloudConfig.GetCustomCloudEnvName(), helpers.AksCustomCloudName)
 }

+func getCloudLocation(v *aksnodeconfigv1.Configuration) string {
+	return strings.ToLower(strings.Join(strings.Fields(v.GetClusterConfig().GetLocation()), ""))
+}
+
 /* GetCloudTargetEnv determines and returns whether the region is a sovereign cloud which
 have their own data compliance regulations (China/Germany/USGov) or standard. */
 // Azure public cloud.
 func getCloudTargetEnv(v *aksnodeconfigv1.Configuration) string {
-	loc := strings.ToLower(strings.Join(strings.Fields(v.GetClusterConfig().GetLocation()), ""))
+	loc := getCloudLocation(v)
 	switch {
 	case strings.HasPrefix(loc, "china"):
 		return "AzureChinaCloud"
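
To make the effect of the extracted helper concrete, here is a standalone sketch of the same normalization expression applied to plain strings instead of the `aksnodeconfigv1.Configuration` type. `normalizeLocation` and `targetEnv` are illustrative names; only the `"china"` case is visible in the hunk, so the `"AzurePublicCloud"` fallback is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLocation uses the same expression as getCloudLocation in the diff:
// split on whitespace, join without separators, then lowercase.
func normalizeLocation(location string) string {
	return strings.ToLower(strings.Join(strings.Fields(location), ""))
}

// targetEnv is a reduced version of the prefix switch. Only the "china" case
// appears in the diff; the default return value here is assumed.
func targetEnv(location string) string {
	loc := normalizeLocation(location)
	switch {
	case strings.HasPrefix(loc, "china"):
		return "AzureChinaCloud"
	default:
		return "AzurePublicCloud"
	}
}

func main() {
	fmt.Println(normalizeLocation("  West US 2 ")) // prints "westus2"
	fmt.Println(targetEnv("China East 2"))         // prints "AzureChinaCloud"
}
```

Because `strings.Fields` splits on any run of whitespace, display names such as "West US 2" and already-normalized names such as "westus2" map to the same value.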
3 changes: 2 additions & 1 deletion aks-node-controller/parser/templates/cse_cmd.sh.gtpl
@@ -1,6 +1,7 @@
 echo $(date),$(hostname) > ${PROVISION_OUTPUT};
 {{if getIsAksCustomCloud .CustomCloudConfig}}
 REPO_DEPOT_ENDPOINT="{{.CustomCloudConfig.RepoDepotEndpoint}}"
-{{getInitAKSCustomCloudFilepath}} >> /var/log/azure/cluster-provision.log 2>&1;
 {{end}}
+LOCATION="{{getCloudLocation .}}"
Comment thread
rchincha marked this conversation as resolved.
+{{getInitAKSCustomCloudFilepath}} >> /var/log/azure/cluster-provision.log 2>&1;
Comment thread
rchincha marked this conversation as resolved.

@cameronmeissner cameronmeissner May 15, 2026
Contributor

I think we should change the name of this template func, maybe to getInitCertificateTrustStoreFilepath or something similar. Keeping the notion of "custom cloud" tied to this script doesn't really make sense to me at this point, since we're running it everywhere.

Contributor Author

I was planning a follow-up PR that cleans up references to "custom" after this PR lands. Also see my comment below. But OK either way.

/usr/bin/nohup /bin/bash -c "/bin/bash /opt/azure/containers/provision_start.sh"
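
The template change moves the init script call out of the custom-cloud conditional, so `LOCATION` and the script invocation are rendered for every node while `REPO_DEPOT_ENDPOINT` stays conditional. The sketch below reproduces that shape with Go's `text/template`. The `nodeConfig` type, the `normalizeLocation` template function, and the script path `/opt/example/init-trust-store.sh` are stand-ins invented for the example, not the PR's real types or paths.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// nodeConfig is a simplified stand-in for the real configuration type.
type nodeConfig struct {
	IsCustomCloud     bool
	RepoDepotEndpoint string
	Location          string
}

// cseCmd keeps REPO_DEPOT_ENDPOINT inside the conditional and renders
// LOCATION plus the script call unconditionally. The script path is hypothetical.
const cseCmd = `{{if .IsCustomCloud}}REPO_DEPOT_ENDPOINT="{{.RepoDepotEndpoint}}"
{{end}}LOCATION="{{normalizeLocation .Location}}"
/opt/example/init-trust-store.sh >> /var/log/azure/cluster-provision.log 2>&1;
`

// render executes the template with a FuncMap, as getFuncMap does in the diff.
func render(cfg nodeConfig) (string, error) {
	funcs := template.FuncMap{
		"normalizeLocation": func(s string) string {
			return strings.ToLower(strings.Join(strings.Fields(s), ""))
		},
	}
	tmpl, err := template.New("cse_cmd").Funcs(funcs).Parse(cseCmd)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, cfg); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	for _, cfg := range []nodeConfig{
		{Location: "East US 2"},
		{IsCustomCloud: true, RepoDepotEndpoint: "https://repodepot.example", Location: "East US 2"},
	} {
		script, err := render(cfg)
		if err != nil {
			panic(err)
		}
		fmt.Print(script, "---\n")
	}
}
```

Note that `Funcs` must be called before `Parse`, otherwise the parser rejects the unknown function name.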
16 changes: 16 additions & 0 deletions e2e/cache.go
@@ -210,6 +210,22 @@ func clusterCiliumNetwork(ctx context.Context, request ClusterRequest) (*Cluster
 	return prepareCluster(ctx, model, false, false)
 }

+var ClusterRCV1PKubenet = cachedFunc(clusterRCV1PKubenet)
+
+// clusterRCV1PKubenet creates a kubenet cluster for RCV1P cert mode testing.
+func clusterRCV1PKubenet(ctx context.Context, request ClusterRequest) (*Cluster, error) {
+	return prepareCluster(ctx, getKubenetClusterModel("abe2e-rcv1p-kubenet-v1", request.Location, request.K8sSystemPoolSKU), false, false)
+}
+
+var ClusterRCV1PAzureNetwork = cachedFunc(clusterRCV1PAzureNetwork)
+
+// clusterRCV1PAzureNetwork creates an Azure CNI cluster for Windows RCV1P cert mode testing.
+// Windows tests require Azure CNI (not kubenet) because baseTemplateWindows() configures the NBC for
+// Azure CNI overlay mode.
+func clusterRCV1PAzureNetwork(ctx context.Context, request ClusterRequest) (*Cluster, error) {
+	return prepareCluster(ctx, getAzureNetworkClusterModel("abe2e-rcv1p-azure-v1", request.Location, request.K8sSystemPoolSKU), false, false)
+}
+
 // isNotFoundErr checks if an error represents a "not found" response from Azure API
 func isNotFoundErr(err error) bool {
 	var respErr *azcore.ResponseError