Add Environment Validator TSG: AzStackHci_Software_IsNotPartofDomain (Domain Membership) #303

Open
1008covingtonlane wants to merge 4 commits into Azure:main from 1008covingtonlane:tsg-software-isnotpartofdomain

@1008covingtonlane

Collaborator

What this adds

A public remediation TSG for the pre-deployment Environment Validator check AzStackHci_Software_IsNotPartofDomain (display name Domain Membership), plus its entry in the EnvironmentValidator README.

The check fails when a machine is already joined to an Active Directory domain before deployment. Azure Local requires each machine to start in a workgroup, and the deployment process joins it to the domain itself. There was no public remediation guide for this validator.

What the TSG covers

  • Where the failure appears: the deployment Validation step in the portal, the targeted validator Invoke-AzStackHciSoftwareValidation -Include Test-IsNotPartofDomain, and the on-machine Event ID 17205, with the exact failure detail line.
  • Remediation: unjoin the machine with Remove-Computer -UnjoinDomainCredential and restart, then re-validate. This is the remediation the validator itself recommends.
  • Verification: re-run the single validator and confirm SUCCESS.
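
The remediation and verification flow listed above can be sketched as follows. This is a minimal sketch assembled from the commands named in this PR, not the TSG's exact text. It assumes an elevated PowerShell session on the affected machine, that the Environment Validator module is already installed, and that a working local administrator sign-in exists. Supplying the credential through `Get-Credential` is an assumption, and the validator cmdlet may need additional parameters in a given environment.

```powershell
# Sketch only: run on the affected machine, signed in as a local administrator.

# Unjoin the machine from the domain. Get-Credential (an assumption here) prompts
# for a domain account that is allowed to remove the machine from the domain.
# Without -Force, Remove-Computer also asks for confirmation before it proceeds.
Remove-Computer -UnjoinDomainCredential (Get-Credential)

# Restart so the workgroup membership takes effect.
Restart-Computer -Force

# After the restart, re-run only the Domain Membership check and confirm SUCCESS.
Invoke-AzStackHciSoftwareValidation -Include Test-IsNotPartofDomain
```

Because the restart ends the current session, the final command has to be run in a new session after the machine comes back up.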

Accuracy and validation

  • The check name, display name, severity, description, the failure and success detail strings, and the remediation text are taken from the validator source, not paraphrased.
  • The guidance was validated end to end on a live lab cluster (Tier 1): baseline workgroup, inject a domain join, confirm the real check reports FAILURE with the production signature, run the documented unjoin and restart, and confirm the check returns to SUCCESS.
  • The public guide covers this single validator only and contains no internal or telemetry content.

Tracked by ADO 38564291. Follows the same structure as the System Drive Free Space TSG (#302).

…(Domain Membership)

Adds a public remediation guide for the pre-deployment Software validator
AzStackHci_Software_IsNotPartofDomain (display name "Domain Membership"). The
check fails when a machine is already joined to an Active Directory domain before
deployment; Azure Local requires each machine to start in a workgroup and joins it
to the domain itself during deployment.

The TSG covers detection (the deployment Validation step, the targeted validator
Invoke-AzStackHciSoftwareValidation -Include Test-IsNotPartofDomain, and the
on-machine Event ID 17205), where the failure appears, the affected-machine detail
line, the consequence, the remediation (unjoin with
Remove-Computer -UnjoinDomainCredential and restart), and verification.

The check name, display name, severity, description, the failure and success
detail strings, and the remediation text are taken from the validator source. The
guidance was validated end to end on a live lab cluster (baseline workgroup,
inject a domain join, confirm the real check reports FAILURE with the production
signature, run the documented unjoin and restart, confirm the check returns to
SUCCESS).

Tracked by ADO 38564291.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 25, 2026 16:04

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a new public troubleshooting guide (TSG) for the Environment Validator check AzStackHci_Software_IsNotPartofDomain (“Domain Membership”) and indexes it in the EnvironmentValidator README, improving self-service remediation for pre-deployment failures caused by nodes being domain-joined.

Changes:

  • Adds Troubleshooting-Software-IsNotPartofDomain.md documenting symptom location, remediation (unjoin + reboot), and re-validation steps.
  • Updates TSG/EnvironmentValidator/README.md to include the new TSG in the list.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| TSG/EnvironmentValidator/Troubleshooting-Software-IsNotPartofDomain.md | New TSG for the “Domain Membership” validator failure, including remediation and verification steps. |
| TSG/EnvironmentValidator/README.md | Adds an index entry pointing to the new TSG. |

Comment thread TSG/EnvironmentValidator/Troubleshooting-Software-IsNotPartofDomain.md Outdated
Comment thread TSG/EnvironmentValidator/Troubleshooting-Software-IsNotPartofDomain.md Outdated
- Add a pre-unjoin step to confirm a working local administrator sign-in
  before Remove-Computer + restart, so an operator is not locked out of a
  previously domain-joined machine (review finding, MEDIUM).
- Reframe the single-validator instruction: -Include runs only this check;
  drop the inaccurate "excluded from the default Software run" claim. A bare
  Invoke-AzStackHciSoftwareValidation runs all checks; the exclude lives only
  in the deployment orchestrator (Test-AzStackHciSoftware) and is conditional.
- Use Restart-Computer -Force in the remediation to avoid a hang.
- README: surface the "Domain Membership" display name in the link text.
- Related: add the canonical Learn deployment-local-identity link.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@1008covingtonlane

Collaborator Author

Addressed the review feedback (commit b598478):

  1. [MEDIUM] Local-admin lockout — added a new pre-unjoin step (now step 2) that confirms the built-in local Administrator is enabled with a known password (Get-LocalUser / Enable-LocalUser / Set-LocalUser, plus Get-LocalGroupMember) before Remove-Computer + restart, so an operator is not locked out of a previously domain-joined machine.
  2. [LOW / verify] "excluded from the default Software run" — confirmed against the validator source that a bare Invoke-AzStackHciSoftwareValidation runs all checks (including Test-IsNotPartofDomain). The exclude is applied only by the deployment orchestrator (Test-AzStackHciSoftware) and is conditional. Reworded the instruction to "-Include runs only this check" and dropped the inaccurate claim.
  3. [NIT] README — link now reads "Troubleshooting Domain Membership (Software IsNotPartofDomain)".
  4. [NIT] reboot — remediation now uses Restart-Computer -Force.
  5. [OPTIONAL] Related — added the canonical Learn deployment-local-identity link.
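
The local-admin pre-check from finding 1 could look roughly like the sketch below. It uses only the cmdlets named above (`Get-LocalUser`, `Enable-LocalUser`, `Set-LocalUser`, `Get-LocalGroupMember`); the exact commands in the TSG may differ. Locating the built-in Administrator by its well-known SID suffix (-500) and prompting for the password with `Read-Host` are assumptions made for illustration.

```powershell
# Sketch only: run in an elevated session BEFORE unjoining the machine.

# Find the built-in Administrator by its well-known RID (500), in case the
# account has been renamed. (Assumption: the TSG may simply use the name.)
$admin = Get-LocalUser | Where-Object { $_.SID.Value -like 'S-1-5-21-*-500' }
$admin | Select-Object Name, Enabled

# Enable the account if it is disabled.
if (-not $admin.Enabled) {
    $admin | Enable-LocalUser
}

# Set a password that the operator knows; Read-Host keeps it out of the transcript.
$password = Read-Host -AsSecureString -Prompt 'New local Administrator password'
$admin | Set-LocalUser -Password $password

# Confirm which accounts are members of the local Administrators group.
Get-LocalGroupMember -Group 'Administrators'
```

Doing this first matters because domain accounts can no longer be used to sign in once the machine has left the domain.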

Merge note: findings 1 and 4 change on-box commands, so per our embedded-test standard the live VM-cluster loop will be re-run (the -Force reboot plus the new local-admin pre-check) and LastValidated refreshed before merge.

@1008covingtonlane

Collaborator Author

Re-validation complete: the domain-membership loop was re-validated (Grade A) on a live masonenode VM cluster (Azure Local build 2607) on 2026-06-25.

  • baseline: throwaway workgroup node, validator SUCCESS
  • inject (domain join): the real Environment Validator reported the exact documented FAILURE: '<NODE>' is part of a domain. Please remove '<NODE>' from the domain.
  • mitigate (the TSG's own Remove-Computer -UnjoinDomainCredential + a forced restart): the node returned to a workgroup (PartOfDomain = False)
  • revalidate: validator SUCCESS (1 of 1)
  • teardown: throwaway node destroyed
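
The `PartOfDomain = False` result in the mitigate step can be checked independently of the validator. The sketch below reads the `PartOfDomain` property of the standard `Win32_ComputerSystem` CIM class; whether the harness used this exact query is an assumption.

```powershell
# Sketch only: report the machine's current domain or workgroup membership.
# PartOfDomain is True on a domain-joined machine and False on a workgroup machine.
Get-CimInstance -ClassName Win32_ComputerSystem |
    Select-Object Name, Domain, PartOfDomain
```

On a workgroup machine the `Domain` column shows the workgroup name rather than an Active Directory domain.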

This re-exercises the two material changes made in response to the review. The Restart-Computer -Force reboot is covered by the loop's forced restart. The new local-admin pre-check (finding 1) documents the precondition the harness itself relies on: it reaches the node as the local administrator for the detect, unjoin, and revalidate steps. The re-validation merge gate is therefore cleared.

LastValidated: 2026-06-25, build 2607 (recorded in the internal companion, not the public TSG).

1008covingtonlane self-assigned this Jun 26, 2026
1008covingtonlane and others added 2 commits June 26, 2026 09:40
Drop -Force from the Remove-Computer unjoin command; the restart is already
explicit via Restart-Computer -Force. Make the "Azure Local deployment
prerequisites" Related reference a clickable Markdown link.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add prompt guidance before the unjoin command (the credential dialog and the
confirmation prompt now shown after -Force was removed), equate "machine" and
"node" once in the Overview, and use the same illustrative node name
(AzL-Node-01) in the verify-step detail line for consistency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>