With the release of XenDesktop 7.12, Citrix introduced Local Host Cache functionality into the FMA world. Since 2013, when XenDesktop 7.0 shipped without LHC, this feature has been the most awaited change. Taking the opportunity that I recently started some tests of XenDesktop 7.15 in my LAB, I would like to write down my notes about Local Host Cache. Let’s start from the beginning …

Update 30.03.2018 – Added detailed information about localDB import process

Update 19.04.2019 – Added link to post about monitoring and troubleshooting

Function

The main function of Local Host Cache is to allow all users to connect and reconnect to all published resources during a database outage. In the FMA world, Local Host Cache functionality is the next step towards a stable, truly highly available XenApp and XenDesktop 7.15 infrastructure. The first solution to enable HA was connection leasing, introduced in XenDesktop 7.6. For more details see my post: Connection Leasing. The implementation history is presented in Table 1 below.

Table 1 – Default HA configuration by release

| Release | Default configuration |
| --- | --- |
| XenApp 6.5 | LHC - Enabled |
| XenDesktop 7.0 | No HA option |
| XenDesktop 7.6 | CL - Enabled |
| XenDesktop 7.12 | CL - Enabled (*), LHC - Disabled (*) |
| XenDesktop 7.15 | CL - Disabled (*), LHC - Enabled (*) |

(*) Depends on the installation type.

The following table shows the Local Host Cache and connection leasing settings after a new XenApp or XenDesktop installation, and after an upgrade to XenApp or XenDesktop 7.12 (or later supported version).

| Operation | Number of VDAs | Connection leasing before operation | Local Host Cache after operation | Connection leasing after operation |
| --- | --- | --- | --- | --- |
| Install | any | - | Disabled | Enabled |
| Upgrade | < 5K | Enabled | Disabled | Enabled |
| Upgrade | < 5K | Disabled | Enabled | Disabled |
| Upgrade | > 5K | Enabled | Disabled | Enabled |
| Upgrade | > 5K | Disabled | Disabled | Disabled |

 

LHC comparison:  XenDesktop 7.15 vs XenApp 6.5

Although the Local Host Cache implementation in XenDesktop 7.15 (more precisely, from version 7.12 onwards) shares its name with the Local Host Cache feature in XenApp 6.x, there are significant differences you should be aware of. My subjective pros and cons summary is as follows:

Advantages:

  • LHC is supported for on-premises and Citrix Cloud installations
  • LHC implementation in XenDesktop 7.15 is more robust and immune to corruption
  • Maintenance requirements are minimized, such as eliminating the need for periodic dsmaint commands

Disadvantages:

  • Local Host Cache is supported for server-hosted applications and desktops, and static (assigned) desktops; it is not supported for pooled VDI desktops (created by MCS or PVS).
  • No control over the Secondary Broker election – the election is based on an alphabetical list of the FQDNs of the registered Delivery Controllers. The election process is described in detail below.
  • Additional compute resources must be included in the sizing of all Delivery Controllers.

Local Host Cache vs Connection leasing – highlights

  • Local Host Cache was introduced to replace Connection Leasing, which will be removed in a future release!
  • Local Host Cache supports more use cases than connection leasing.
  • During outage mode, Local Host Cache requires more resources (CPU and memory) than connection leasing.
  • During outage mode, only a single broker will handle VDA registrations and broker sessions.
  • An election process decides which broker will be active during outage, but does not take into account broker resources.
  • If a single broker in a zone would not be capable of handling all logons during normal operations, it will not work well in outage mode either.
  • No site management is available during outage mode.
  • A highly available SQL Server is still the recommended design.
  • For intermittent database connectivity scenarios, it is still better to isolate the SQL Server and leave the site in outage mode until all underlying issues are fixed.
  • There is a limit of 10,000 VDAs per zone.
  • There is no 14-day limit.
  • Pooled desktops are not supported in outage mode, in the default configuration.

In the overall assessment, we would say that Citrix has achieved one of the biggest milestones of the XenDesktop 7.x releases. The current implementation is far from an ideal solution, but the changes are going in the right direction. Additional improvements to LHC are still required to make it an enterprise-wide high availability feature for database outages.

How to turn it on?

The status of the HA options can be checked with the PowerShell command Get-BrokerSite. See the screenshot below:

Figure 1 – LHC status
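
The same can be checked non-interactively by selecting just the two relevant properties. A minimal sketch, assuming the Citrix PowerShell snap-ins are installed on the Delivery Controller:

# Load the Citrix snap-ins and show the current HA settings
Add-PSSnapin Citrix*
Get-BrokerSite | Select-Object LocalHostCacheEnabled, ConnectionLeasingEnabled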

To change the status of the HA options, use the Set-BrokerSite command.

To enable Local Host Cache (and disable connection leasing), enter:

Set-BrokerSite -LocalHostCacheEnabled $true -ConnectionLeasingEnabled $false

To disable Local Host Cache (and enable connection leasing), enter:

Set-BrokerSite -LocalHostCacheEnabled $false -ConnectionLeasingEnabled $true

 

How does it work?

Local Host Cache functionality in the FMA world is built on three core FMA services and MS SQL Server Express LocalDB:

  • Citrix Broker Service – also called the Principal Broker Service. On the Windows Server operating system it runs as the BrokerService process. Within the scope of Local Host Cache functionality, the Principal Broker Service is responsible for the following tasks:
    • registration of all VDAs, including their ongoing management from a Delivery Controller perspective
    • brokering new and managing existing sessions, handling resource enumeration, the creation and verification of STA tickets, user validation, disconnected sessions, etc.
    • monitoring the availability of the site database
    • monitoring changes in the site database
  • Citrix Config Synchronizer Service – on the Windows Server operating system it runs as the ConfigSyncService process. The main tasks served by this service are the following:
    • when a configuration change in the site database is detected, copy the content of the site database to the High Availability Service/Secondary Broker Service
    • provide the High Availability Service/Secondary Broker Service(s) with information on all other Controllers within your Site (Primary Zone), including any additional Zones
  • Citrix High Availability Service – also called the Secondary Broker Service. On the Windows Server operating system it runs as the HighAvailabilityService process. The main task served by this service is to handle all new and existing connections/sessions during a database outage.
  • MS SQL Express LocalDB – a dedicated SQL Express instance located on every Controller, used to store all site information synchronized from the Site database. Only the secondary broker communicates with this database; you cannot use PowerShell cmdlets to change anything about it. The LocalDB cannot be shared across Controllers.
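
To quickly verify that the three services are up on a Controller, a simple service query can be used. A sketch; the Windows service names below are my assumption and may differ between versions:

# Check the state of the three LHC-related Windows services
Get-Service -Name CitrixBrokerService, CitrixConfigSyncService, CitrixHighAvailabilityService |
    Format-Table DisplayName, Status -AutoSize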

Process flow during normal operations

  • The principal broker (Citrix Broker Service) on a Controller accepts connection requests from StoreFront and communicates with the Site database to connect users with VDAs that are registered with the Controller. In the background, the broker monitors the database status. A heartbeat message is exchanged between a Delivery Controller and the database every 20 seconds, with a default timeout of 40 seconds.
  • Every 2 minutes a check is made to determine whether changes have been made to the principal broker’s configuration. Those changes could have been initiated by PowerShell/Studio actions (such as changing a Delivery Group property) or system actions (such as machine assignments). The synchronized data does not include dynamic runtime information – who is connected to which server (load balancing), which application(s) are in use, and so on – referred to as the current state of the Site/Farm.
    • If a change has been made since the last check, the principal broker uses the Citrix Config Synchronizer Service (CSS) to synchronize (copy) the information to a secondary broker (Citrix High Availability Service) on the Controller.
    • The secondary broker imports the data into a temporary database (HAImportDatabaseName) in Microsoft SQL Server Express LocalDB on the Controller.
    • When the import into the temporary DB succeeds, the previous DB is removed and the temporary DB is renamed to HADatabaseName. The LocalDB database is re-created each time synchronization occurs. The CSS ensures that the information in the secondary broker’s LocalDB database matches the information in the Site database. Correlated event IDs (a query example follows this list):
      • id 503 – CSS receives a config change
      • id 504 – LocalDB update successful
      • id 505 – LocalDB update failure
    • If no changes have occurred since the last check, no data is copied.
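
The synchronization health can be followed in the Application event log. A sketch querying the three correlated event IDs with Get-WinEvent (other providers may reuse these IDs, so filter by provider if needed):

# List the most recent Config Synchronizer events
# 503 = change received, 504 = import succeeded, 505 = import failed
Get-WinEvent -FilterHashtable @{ LogName = 'Application'; Id = 503, 504, 505 } -MaxEvents 20 |
    Format-Table TimeCreated, Id, Message -AutoSize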

Standard LHC process flow is presented in the figure below:

Figure 2 – LHC standard mode

Process flow during database outage

  • The principal broker can no longer communicate with the Site database
    • The principal broker stops listening for StoreFront and VDA information (marked with red X in the figure below). Correlated event ids: 1201, 3501
    • The principal broker then instructs the secondary broker (High Availability Service) to start listening for and processing connection requests (marked with a red dashed line in the figure below).  Correlated event ids: 2007, 2008
    • Based on an alphabetical list of FQDNs, an election process starts to determine which Controller takes over the secondary broker role. There can be only one secondary broker accepting connections during a database outage. Correlated event id: 3504. The non-elected secondary brokers in the zone actively reject incoming connection and VDA registration requests.
    • While the secondary broker is handling connections, the principal broker continues to monitor the connection to the Site database
    • As soon as a VDA communicates with Secondary Broker, a re-registration process is triggered (shown with red arrows for XML and VDA registration traffic in the figure below). During that process, the secondary broker also gets current session information about that VDA. Correlated event ids: 1002, 1014, 1017
  • The connection to the database is restored:
    • The principal broker instructs the secondary broker to stop listening for connection information, and the principal broker resumes brokering operations.  Correlated event ids: 1200-> 3503-> 3500, 3004-> 3000-> 1002
    • The secondary broker removes all VDA registration information captured during the outage (this information is lost and is not synchronized to the Site database) and resumes updating the LocalDB database with configuration changes received from the CSS.
    • The next time a VDA communicates with the principal broker, a re-registration process is triggered.
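
Whether a Controller last entered (event 3501) or left (event 3500) outage mode can be derived from the newest of those two events in the Application log. A sketch; Get-WinEvent returns the newest matching event first and throws an error if none exists:

# Determine the current LHC mode from the latest 3500/3501 event
$last = Get-WinEvent -FilterHashtable @{ LogName = 'Application'; Id = 3500, 3501 } -MaxEvents 1
if ($last.Id -eq 3501) { 'Broker is in outage (Local Host Cache) mode' }
else { 'Broker is in normal mode' }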

LHC process flow during database outage is presented in the figure below:

Figure 3 – LHC outage mode

 

Sites with multiple controllers and zones

As mentioned above, the Config Synchronizer Service updates the secondary broker with information about all Controllers in the site or zone. If your deployment contains multiple zones, this is done for each zone independently and affects all Controllers in every zone. With that information, each secondary broker knows about all its peer secondary brokers.

In a deployment with a single zone configuration (or with multiple zones but all Controllers placed in a single zone), the election is based on the FQDNs of all configured Controllers.
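
To preview the election order in your own deployment, the registered Controllers can be sorted by FQDN; the alphabetically first one wins. A sketch, assuming the Citrix Broker snap-in is loaded:

# List Delivery Controllers in election order (alphabetical by FQDN)
Get-BrokerController | Sort-Object DNSName | Select-Object DNSName, State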

Figure 4 – Single zone

Figure 5 – Election in single zone deployment

 

In a deployment with multiple zones, each containing Delivery Controllers, the election is done separately per zone, based on the FQDNs of the Controllers configured in that zone.

Figure 6 – Multiple zones

Figure 7 – Election in the first zone

Figure 8 – Election in the second zone

 

SQL Express LocalDB

LocalDB is an instance of SQL Server Express that can create and open SQL Server databases. The local SQL Express database has been part of the XenApp/XenDesktop installation since version 7.9. It is installed automatically when you install a new Controller or upgrade a Controller from a version earlier than 7.9.

The binaries for SQL Express LocalDB are located in:

%ProgramFiles%\Microsoft SQL Server\120\LocalDB\.
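
To confirm that the LocalDB runtime is present and to list its instances, the SqlLocalDB utility can be used. A sketch; the utility path below is my assumption and may vary per installation:

# List the SQL Server Express LocalDB instances on this Controller
& "$env:ProgramFiles\Microsoft SQL Server\120\Tools\Binn\SqlLocalDB.exe" info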

The LHC database files are located at:

C:\Windows\ServiceProfiles\NetworkService\HaDatabaseName.mdf

C:\Windows\ServiceProfiles\NetworkService\HaDatabaseName_log.ldf

During every import process a temporary database is created:

C:\Windows\ServiceProfiles\NetworkService\HaImportDatabaseName.mdf

C:\Windows\ServiceProfiles\NetworkService\HaImportDatabaseName_log.ldf
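
The presence and current size of these files can be checked quickly, which is also handy when watching database growth during an outage. A sketch using the paths listed above:

# Show the LHC database files and their size in MB
Get-ChildItem 'C:\Windows\ServiceProfiles\NetworkService' -Filter 'Ha*' |
    Where-Object { $_.Extension -in '.mdf', '.ldf' } |
    Select-Object Name, @{ Name = 'SizeMB'; Expression = { [math]::Round($_.Length / 1MB, 1) } }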

The Local Host Cache database contains only static configuration information; it does not include the dynamic runtime information referred to above as the current state of the Site/Farm. In a multi-zone scenario, the Local Host Cache databases in all zones contain exactly the same set of information.

A size comparison of the 20 biggest tables is shown in the figure below:

LocalDB is used exclusively by the secondary broker. PowerShell cmdlets and Citrix Studio cannot be used to communicate with or update this database. The LocalDB cannot be shared across Controllers. Each Controller holds its own copy of the Site database content.

 

Design considerations

The following must be considered when using local host cache:

  • Elections – When a zone loses contact with the SQL database, an election occurs, nominating a single Delivery Controller as master. All remaining Controllers go into idle mode. Simple alphabetical order determines the winner of the election (based on an alphabetical list of the FQDNs of the registered Delivery Controllers).
  • Sizing – When using local host cache mode, a single delivery controller is responsible for all VDA registrations, enumerations, launches and updates. The elected controller must have enough resources (CPU and RAM) to handle the entire load for the zone. A single controller can scale to 10,000 users, which influences the zone design.
    • RAM – The local host cache services can consume 2+ GB of RAM, depending on the duration of the outage and the number of user launches during the outage:
      • the LocalDB service can use approximately 1.2 GB of RAM (up to 1 GB for the database cache, plus 200 MB for running SQL Server Express LocalDB)
      • the High Availability Service can use up to 1 GB of RAM if an outage lasts for an extended interval with many logons occurring
    • CPU – The local host cache can use up to 4 cores in a single socket. A combination of multiple sockets with multiple cores should be considered to provide the expected performance. Based on Citrix testing, a 2×3 (2 sockets, 3 cores) configuration provided better performance than 4×1 and 6×1 configurations.
    • Storage – During local host cache mode, storage space increased by 1 MB every 2-3 minutes with an average of 10 logons per second. As a rough example, at that rate an 8-hour outage adds only around 160-240 MB. When connectivity to the site database is restored, the local database is re-created and the space is returned; however, the broker must have sufficient space on the drive where the LocalDB is installed to accommodate database growth during an outage. The increased I/O during a database outage should be considered as well.
    • Power Options – Powered off virtual resources will not start when the delivery controller is in local host cache mode. Pooled virtual desktops that reboot at the end of a session are placed into maintenance mode.
  • Consoles – When using local host cache mode, Studio and PowerShell are not available.
  • VDI limits:
    • In a single-zone VDI deployment, up to 10,000 VDAs can be handled effectively during an outage.
    • In a multi-zone VDI deployment, up to 10,000 VDAs in each zone can be handled effectively during an outage, to a maximum of 40,000 VDAs in the site.

Monitoring

For additional information on the monitoring approach, see my post about the Zabbix monitoring template.

When preparing a dedicated template for XenDesktop 7.15 monitoring, the event log items listed in the table below should be considered.

| Level | Source | Registered on | Event ID | Message |
| --- | --- | --- | --- | --- |
| Information | Citrix ConfigSync Service | Broker Server | 503 | The Citrix Config Sync Service received an updated configuration. |
| Information | Citrix ConfigSync Service | Broker Server | 504 | The Citrix Config Sync Service imported an updated configuration. |
| Error | Citrix High Availability Service | Broker Server | 505 | An import to the local DB failed; see below for more information. |
| Information | Citrix Broker Service | Broker Server | 506 | The Citrix Broker Service started successfully. |
| Information | Citrix Broker Service | Broker Server | 1002 | The Citrix Broker Service is ready to accept connections from virtual machines. |
| Information | Citrix Broker Service | Broker Server | 1011 | The Citrix Broker Service successfully initialized the Windows Communication Foundation (WCF) services required for interaction between this machine and virtual machines. |
| Warning | Citrix Broker Service | Broker Server | 1039 | The Citrix Broker Service failed to contact virtual machine 'VDA03.LAB.citrix24.ctx' (IP address ). Check that the virtual machine can be contacted from the controller and that any firewall on the virtual machine allows connections from the controller. See Citrix Knowledge Base article CTX126992. |
| Information | Citrix High Availability Service | Broker Server | 1065 | The Citrix Broker Service failed to determine the base settings needed for the Virtual Desktop Agent of machine 'VDA02.LAB.citrix24.ctx'. Please restart this machine and if this problem persists, see Citrix Knowledge Base article CTX126990. |
| Information | Citrix High Availability Service | Broker Server | 1066 | The Citrix Broker Service successfully determined the base settings needed for the Virtual Desktop Agent of machine 'VDA02.LAB.citrix24.ctx'. |
| Warning | Citrix Broker Service | Broker Server | 1194 | Registration request for worker S-1-5-21-3924863940-1422453360-4280703915-1617 (VDA03.LAB.citrix24.ctx) was rejected because the Broker service was unable to contact the worker during the registration process. |
| Information | Citrix Broker Service (*) | Broker Server | 1200 | The connection between the Citrix Broker Service and the database has been restored. |
| Warning | Citrix Broker Service (*) | Broker Server | 1201 | The connection between the Citrix Broker Service and the database has been lost. |
| Information | Citrix Broker Service | Broker Server | 2003 | The Citrix Broker Service successfully started XML services. |
| Information | Citrix High Availability Service | Broker Server | 2008 | The Citrix Broker Service successfully stopped XML services. |
| Information | Citrix High Availability Service | Broker Server | 3000 | The Citrix Broker Service is ready to contact hosts. |
| Information | Citrix High Availability Service | Broker Server | 3004 | The Citrix Broker Service successfully connected to the XenDesktop database. |
| Information | Citrix Broker Service | Broker Server | 3500 | The Citrix Broker Service has detected that the issue with communication with the database has been resolved and will resume normal brokering activity using configuration in the main site database. |
| Information | Citrix Broker Service | Broker Server | 3501 | The Citrix Broker Service has detected an issue with communication with the database. To preserve functionality, responsibility for brokering requests will be handed over to the Citrix High Availability Service using locally cached site configuration. |
| Information | Citrix High Availability Service | Broker Server | 3502 | The Citrix High Availability Service has become active and will broker user requests for sessions until the issue discovered with the normal brokering activity is resolved. |
| Information | Citrix High Availability Service | Broker Server | 3503 | The issue discovered with the normal brokering activity has been resolved, and the Citrix High Availability Service has now stopped participating in brokering user requests for sessions. |
| Information | Citrix High Availability Service | Broker Server | 3504 | The Citrix High Availability Service 'XD01.LAB.CITRIX24.CTX' has become the elected instance amongst its peers (XD01.LAB.CITRIX24.CTX, XD02.LAB.CITRIX24.CTX). |
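
A simple way to collect these events in a monitoring script is a provider-based query. A sketch; the provider names are taken from the Source column above and may differ slightly between versions:

# Pull recent events from the three LHC-related providers
Get-WinEvent -FilterHashtable @{
    LogName      = 'Application'
    ProviderName = 'Citrix Broker Service', 'Citrix High Availability Service', 'Citrix ConfigSync Service'
} -MaxEvents 50 | Format-Table TimeCreated, ProviderName, Id -AutoSize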

Tests and Troubleshooting

Force an outage

You might want to force a database outage when:

  • your network is going up and down repeatedly – forcing an outage until the network issues are resolved prevents continuous transitions between normal and outage modes
  • testing a disaster recovery plan
  • replacing or servicing the site database server

To force an outage, edit the registry of each server containing a Delivery Controller.

  • In HKLM\Software\Citrix\DesktopServer\LHC, set OutageModeForced to 1. This instructs the broker to enter outage mode, regardless of the state of the database. (Setting the value to 0 takes the server out of outage mode.) A scripted example follows this list.
  • In a Citrix Cloud scenario, the connector enters outage mode, regardless of the state of the connection to the control plane or primary zone.
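
The registry change can also be applied from an elevated PowerShell session on each Delivery Controller. A minimal sketch using the documented value:

# Force the broker into LHC outage mode (set the value back to 0 to return to normal)
Set-ItemProperty -Path 'HKLM:\Software\Citrix\DesktopServer\LHC' -Name 'OutageModeForced' -Value 1 -Type DWord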

Troubleshooting

As usual, the main source of information about the status of Local Host Cache is the Windows Event Viewer. All actions performed by the LHC components are logged to the Windows Server Application log. Examples of the most important events are presented in the figures below.

For additional information on Local Host Cache troubleshooting, please see my post: XenDesktop 7.15 Local Host Cache troubleshooting

Delivery Controller

Event ID: 503 and 504 – LHC configuration change and update

Event ID: 503 – details

Event ID: 1201 and 3501 – Site database connection lost

 

Event ID: 1201 – details

Event ID: 3504 – details

Event ID: 3501 – details

Event ID: 3502 – details

Event ID: 3503 – Site database connection restored

Event ID: 3500 – Site database connection restored

VDA

Event IDs on the VDA use slightly different notation. Although the Event Viewer displays event IDs such as 1010 and 1001, the raw values stored in the log are 1073742834 and 3221226473 respectively: the displayed ID occupies the low 16 bits, while the high bits encode the severity (0x40000000 for informational, 0xC0000000 for error events).
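
The displayed ID can be recovered from the raw value by masking off the severity bits. A quick sketch:

# Recover the displayed event IDs from the raw qualified values
1073742834 -band 0xFFFF # -> 1010 (high bits 0x40000000 = informational)
3221226473 -band 0xFFFF # -> 1001 (high bits 0xC0000000 = error)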

Event ID: 1010 / 1073742834 – details

 

Event ID: 1001 / 3221226473 – details