Lemonde
  • Lemonde
  • 100% (Exalted)
  • Advanced Member Topic Starter
3 years ago

How has this happened?

Running dcdiag /e /test:sysvolcheck /test:advertising against a new 2016 DC fails.

Gives this:

DSGetDCName returned information for DC1 when we were trying to reach DC2

And SYSVOL is not being shared.

I see nothing in the Event Log to explain this.


Lemonde
  • Lemonde
  • 100% (Exalted)
  • Advanced Member Topic Starter
3 years ago
OK so this is a fairly serious problem.

The DCs in this place all seem to be broken except one, and NTFRS doesn't replicate from that one. It is the only DC advertising SYSVOL and has an error in FRS that states:

Log Name: File Replication Service
Source: NtFrs
Date: 18/09/2020 00:04:25
Event ID: 13562
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: DC2.domain.local
Description:
Following is the summary of warnings and errors encountered by File Replication Service while polling the Domain Controller DC2.domain.local for FRS replica set configuration information.

The nTFRSSubscriber object cn=domain system volume (sysvol share),cn=ntfrs subscriptions,cn=dc2,ou=domain controllers,dc=domain,dc=local has a invalid value for the attribute frsMemberReference.



Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="NtFrs" />
<EventID Qualifiers="32768">13562</EventID>
<Level>3</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2020-09-17T23:04:25.000000000Z" />
<EventRecordID>2413</EventRecordID>
<Channel>File Replication Service</Channel>
<Computer>DC2.domain.local</Computer>
<Security />
</System>
<EventData>
<Data>DC2.domain.local</Data>
<Data>The nTFRSSubscriber object cn=domain system volume (sysvol share),cn=ntfrs subscriptions,cn=dc2,ou=domain controllers,dc=domain,dc=local has a invalid value for the attribute frsMemberReference.

</Data>
</EventData>
</Event>
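
For anyone collecting these events from several DCs, here is a minimal Python sketch that pulls the event ID, computer name and data strings out of an exported event like the one above. The function name summarize_event and the trimmed SAMPLE string are my own illustration, not part of any Microsoft tooling, and it assumes the XML has been saved or copied as text.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML, as seen in the event above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def summarize_event(xml_text):
    """Return the event ID, computer and data strings from one event's XML."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    event_id = int(system.find("e:EventID", NS).text)
    computer = system.find("e:Computer", NS).text
    data = [(d.text or "").strip() for d in root.findall("e:EventData/e:Data", NS)]
    return {"event_id": event_id, "computer": computer, "data": data}


# Trimmed copy of the 13562 event from this post (hypothetical sample input).
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="NtFrs" />
<EventID Qualifiers="32768">13562</EventID>
<Computer>DC2.domain.local</Computer>
</System>
<EventData>
<Data>DC2.domain.local</Data>
<Data>The nTFRSSubscriber object cn=domain system volume (sysvol share),cn=ntfrs subscriptions,cn=dc2,ou=domain controllers,dc=domain,dc=local has a invalid value for the attribute frsMemberReference.
</Data>
</EventData>
</Event>"""

if __name__ == "__main__":
    print(summarize_event(SAMPLE))
```

The second data string is the one that names the broken object and attribute, which is what pointed me at ADSIEdit.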

Looking at the setting in ADSIEdit, under Default Naming > Domain > Domain Controllers > NTFRS Subscriptions, I could see that the error was being caused by the frsMemberReference value being not set.

When I tried to enter the correct value, taken from a different DC with the name corrected, it threw a 0x20B5 error saying that the data was invalid.

Operation Failed. Error Code: 0x20b5
The name reference is invalid
00020B5: AtrErr: DSID-03152804, #1:
Problem 1005
(Constraint_ATT_TYPE), data 0, Att 903b
(frsMemberReference)

But it would let me enter the data if I used the value from the other DC unchanged. So the DC I was trying to fix must have lost its settings in ADSIEdit > Domain > System > File Replication Services.

So I navigated to System > File Replication Services, and sure enough:

The server was missing.

So I recreated the nTFRSMember object with settings based on the entries for the other servers, making sure I checked all writable values in ADSIEdit. After applying the data, I was able to return to frsMemberReference and correct the entry, and this time it was accepted.
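
To show how the two objects relate, here is a small Python sketch that builds the distinguished names involved. The subscriber DN matches the one in the 13562 event above. The member DN assumes the default layout (CN=File Replication Service under CN=System), so treat it as an assumption and compare it against the value on a healthy DC before typing anything into ADSIEdit. The function names are mine.

```python
# Name of the SYSVOL replica set, as it appears in the 13562 event.
SYSVOL_SET = "CN=Domain System Volume (SYSVOL share)"


def domain_dn(dns_domain):
    """Turn 'domain.local' into 'DC=domain,DC=local'."""
    return ",".join("DC=" + label for label in dns_domain.split("."))


def frs_subscriber_dn(dc_name, dns_domain, dc_ou="OU=Domain Controllers"):
    """DN of the nTFRSSubscriber object that holds frsMemberReference."""
    return (f"{SYSVOL_SET},CN=NTFRS Subscriptions,CN={dc_name},"
            f"{dc_ou},{domain_dn(dns_domain)}")


def frs_member_dn(dc_name, dns_domain):
    """DN of the nTFRSMember object (assumed default location under System)."""
    return (f"CN={dc_name},{SYSVOL_SET},CN=File Replication Service,"
            f"CN=System,{domain_dn(dns_domain)}")


if __name__ == "__main__":
    # frsMemberReference on the first object should point at the second.
    print(frs_subscriber_dn("DC2", "domain.local"))
    print(frs_member_dn("DC2", "domain.local"))
```

This also explains the 0x20b5 error: frsMemberReference is a reference to another object, so the directory rejects it while the nTFRSMember object it names does not exist.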

I was then able to restart the NTFRS service which then started trying to replicate.

All I have to do now is find out what is stopping the communication between the DCs as they are still not actually advertising...
Lemonde
  • Lemonde
  • 100% (Exalted)
  • Advanced Member Topic Starter
3 years ago
Just to say what got this working. I disabled IPv6 on the 2008 R2 DCs (not sure this was necessary though), stopped the NTFRS service on all DCs, and then set BurFlags under

HKLM\System\CurrentControlSet\services\NTFRS\Parameters\Backup/Restore\Process at Startup\

to D4 (authoritative restore) on the good DC and to D2 (non-authoritative restore) on all of the other non-advertising servers.

After restarting the NTFRS service, advertising is back and SYSVOL is available everywhere.
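
As a rough aid for doing this across several servers, here is a Python sketch that works out which BurFlags value each DC should get and prints a matching reg add command to run on that DC. It only builds strings and does not touch the registry. The helper names are mine, and the DC list in the example is just the two names from this thread, with DC2 assumed to be the good one because it was the only DC advertising SYSVOL.

```python
# Registry key quoted in this post, with the usual NtFrs service key casing.
BURFLAGS_KEY = (r"HKLM\System\CurrentControlSet\Services\NtFrs"
                r"\Parameters\Backup/Restore\Process at Startup")

AUTHORITATIVE = 0xD4      # D4: this DC's SYSVOL becomes the source copy
NON_AUTHORITATIVE = 0xD2  # D2: this DC re-syncs SYSVOL from a partner


def burflags_plan(good_dc, all_dcs):
    """Map every DC name to the BurFlags value it should be given."""
    if good_dc not in all_dcs:
        raise ValueError(f"{good_dc} is not in the DC list")
    return {dc: AUTHORITATIVE if dc == good_dc else NON_AUTHORITATIVE
            for dc in all_dcs}


def reg_command(value):
    """Build a reg add command line that sets BurFlags (decimal DWORD)."""
    return f'reg add "{BURFLAGS_KEY}" /v BurFlags /t REG_DWORD /d {value} /f'


if __name__ == "__main__":
    for dc, value in burflags_plan("DC2", ["DC1", "DC2"]).items():
        print(f"{dc}: BurFlags = 0x{value:X}")
        print("  " + reg_command(value))
```

Only one DC should ever get D4. Stop the NTFRS service everywhere before changing the value and start it on the D4 server first.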

And we have gone from seeing:

Log Name: File Replication Service
Source: NtFrs
Date: 18/09/2020 14:06:53
Event ID: 13508
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: DC1.domain.local
Description:
The File Replication Service is having trouble enabling replication from DC1 to DC2 for c:\windows\sysvol\domain using the DNS name DC1.domain.local. FRS will keep retrying.
Following are some of the reasons you would see this warning.

[1] FRS can not correctly resolve the DNS name W2016DC1.domain.local from this computer.
[2] FRS is not running on W2016DC1.domain.local.
[3] The topology information in the Active Directory Domain Services for this replica has not yet replicated to all the Domain Controllers.

This event log message will appear once per connection, After the problem is fixed you will see another event log message indicating that the connection has been established.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="NtFrs" />
<EventID Qualifiers="32768">13508</EventID>
<Level>3</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2020-09-18T13:06:53.000000000Z" />
<EventRecordID>1737</EventRecordID>
<Channel>File Replication Service</Channel>
<Computer>DC1.domain.local</Computer>
<Security />
</System>
<EventData>
<Data>W2016DC1</Data>
<Data>DC1</Data>
<Data>c:\windows\sysvol\domain</Data>
<Data>DC1.domain.local</Data>
<Binary>D9060000</Binary>
</EventData>
</Event>

To seeing:

Log Name: File Replication Service
Source: NtFrs
Date: 18/09/2020 14:08:43
Event ID: 13509
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: DC1.domain.local
Description:
The File Replication Service has enabled replication from DC2 to DC1 for c:\windows\sysvol\domain after repeated retries.

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="NtFrs" />
<EventID Qualifiers="32768">13509</EventID>
<Level>3</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2020-09-18T13:08:43.000000000Z" />
<EventRecordID>1738</EventRecordID>
<Channel>File Replication Service</Channel>
<Computer>DC1.domain.local</Computer>
<Security />
</System>
<EventData>
<Data>W2016DC1</Data>
<Data>DC1</Data>
<Data>c:\windows\sysvol\domain</Data>
</EventData>
</Event>

We are also seeing this on the boxes that were not advertising before:

Log Name: File Replication Service
Source: NtFrs
Date: 18/09/2020 14:05:13
Event ID: 13516
Task Category: None
Level: Information
Keywords: Classic
User: N/A
Computer: DC1.domain.local
Description:
The File Replication Service is no longer preventing the computer DC1 from becoming a domain controller. The system volume has been successfully initialized and the Netlogon service has been notified that the system volume is now ready to be shared as SYSVOL.

Type "net share" to check for the SYSVOL share.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="NtFrs" />
<EventID Qualifiers="16384">13516</EventID>
<Level>4</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2020-09-18T13:05:13.000000000Z" />
<EventRecordID>1736</EventRecordID>
<Channel>File Replication Service</Channel>
<Computer>DC1.domain.local</Computer>
<Security />
</System>
<EventData>
<Data>DC1</Data>
</EventData>
</Event>
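
If you want to check the same sequence on your own DCs, here is a minimal Python sketch that takes FRS event IDs in the order they were logged and says whether they look like the recovery above. The one-line descriptions are paraphrased from the events in this thread. The rule itself (13516 present, and the last 13508 followed by a 13509) is my own rough heuristic, not an official health check.

```python
# Short meanings of the FRS events quoted in this thread (paraphrased).
FRS_EVENTS = {
    13508: "trouble enabling replication, FRS will keep retrying",
    13509: "replication enabled after repeated retries",
    13516: "system volume initialized, SYSVOL ready to be shared",
    13562: "invalid FRS configuration found while polling AD",
}


def last_index(ids, wanted):
    """Position of the last occurrence of wanted in ids, or -1 if absent."""
    for i in range(len(ids) - 1, -1, -1):
        if ids[i] == wanted:
            return i
    return -1


def looks_recovered(event_ids):
    """True if SYSVOL was initialized and no 13508 is left unanswered."""
    ids = list(event_ids)
    if 13516 not in ids:
        return False
    return last_index(ids, 13508) < last_index(ids, 13509) or 13508 not in ids


if __name__ == "__main__":
    seen = [13516, 13508, 13509]  # order logged on DC1 in this thread
    for event_id in seen:
        print(event_id, "-", FRS_EVENTS[event_id])
    print("recovered:", looks_recovered(seen))
```

A 13508 with no later 13509 means the connection is still failing, which is the state we were stuck in before resetting BurFlags.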

drdread
  • drdread
  • 100% (Exalted)
  • Advanced Member
3 years ago
Whew!

That sounds like it was a night of 😣

The active directory structure and lack of visibility of something like FRS can be infuriating, and often misleading.

The File Replication Service rebuild was the issue all along, I guess. It sounds like your DC went through an unsuccessful demotion at some point and was restored without communicating with the other DCs, and so wound up breaking FRS replication by being a verifiable member but without permission to replicate.
andrewt2m
2 years ago
Awesome diagnosis and fix - thanks for sharing this as it has saved our bacon!

I am not sure exactly how this has happened either, but we think it was somehow due to demoting a DC and finding that this DC was the only one that worked. We think that said DC was then restored and the domain did not replicate from that moment on and there was only actually ever one DC working.