Troubleshooting Replication

Replication Logs

The replication process writes its own log files on both the source and target systems, separately from the regular error logs. The files are located in https://instance_address/on/demandware.servlet/webdav/Sites/Logs/, with filenames like staging-blade_name-appserver-yyyymmdd.log. To view the log files in Business Manager, select Administration > Site Development > Development Setup > Log Files.

Note: All the log file names contain "staging", regardless of the instance type.
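
The following Python sketch shows how the location and filename pattern above fit together. The function names and the example blade name are hypothetical, and the pattern is only the "filenames like" form given above, so actual names on an instance may differ.

```python
from datetime import date

# WebDAV folder that holds the log files, as given in the text above.
LOG_PATH = "/on/demandware.servlet/webdav/Sites/Logs/"


def replication_log_name(blade_name: str, day: date) -> str:
    """Build a name of the form staging-blade_name-appserver-yyyymmdd.log."""
    return f"staging-{blade_name}-appserver-{day:%Y%m%d}.log"


def replication_log_url(instance_address: str, blade_name: str, day: date) -> str:
    """Join the instance address, the log folder, and the log filename."""
    return f"https://{instance_address}{LOG_PATH}{replication_log_name(blade_name, day)}"


# Hypothetical blade name, for illustration only.
print(replication_log_name("blade0-1", date(2024, 3, 5)))
```

The "staging" prefix is hard-coded because, as noted above, it appears in the log file names on every instance type.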

Hung Replication

Some database transactions, especially those involving catalog data, can take a long time to complete, depending on the amount of data involved. If the replication remains in the running state for longer than expected, use the following steps to check whether it is hung.

  1. Open the most recent replication log on the staging instance.
    • Check whether it contains the line "Staging pipeline in live system successfully called." If it doesn't, there is a problem.
    • Check whether it includes an entry showing that a state is set to ErrorAcquiringEditingLocks. If so, resource locks from a previous replication process might not have been released, which can hang the replication.
  2. Open the most recent replication log on the target instance, and scroll to the end.
    • Refresh the view a few times to determine whether new entries are being added. If no new entries appear after a while, the replication might be hung.
    • Check whether it includes an entry showing that a state is set to ErrorAcquiringLivelocks. If so, resource locks from a previous replication process might not have been released, which can hang the replication.
    • If the last log entry is a database action (INSERT, ALTER INDEX, and so on), check previous logs to see how long that action took and what the following entry was.
    • If the last log entry starts with Rsync, the delay might be due to a large number of changed static content files. Files that have been moved to a different folder are included, even if their content is the same. If the Rsync is stuck, contact Support to check its status.
    • If the log shows the state ErrorLiveStagingProcessKilled, the replication is probably hung due to a concurrent deployment or instance restart.
  3. If either log contains a line similar to "resource busy and acquire with NOWAIT specified", open a ticket with Support and provide the troubleshooting steps that you have attempted.
  4. If the replication process shows as completed on the target instance but its status is still waiting or in progress on the staging instance, the staging instance might have been down when the replication finished. Restart the staging instance, and check the status again.
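
The marker checks in the steps above can be sketched as a small Python helper that scans log text you have already downloaded. This is a minimal sketch, not a product tool: the function name and hint wording are hypothetical, and it assumes the state names and messages appear in the log as plain substrings.

```python
from typing import List

# State names and messages quoted in the steps above, with a suggested reading.
MARKERS = {
    "ErrorAcquiringEditingLocks": "locks from a previous replication might not have been released",
    "ErrorAcquiringLivelocks": "locks from a previous replication might not have been released",
    "ErrorLiveStagingProcessKilled": "probably a concurrent deployment or instance restart",
    "resource busy and acquire with NOWAIT specified": "open a ticket with Support",
}

# Line expected in the staging-side log (step 1).
HANDOFF_LINE = "Staging pipeline in live system successfully called"


def hang_hints(log_text: str, is_staging_log: bool) -> List[str]:
    """Return a hint for each hang marker found in log_text."""
    hints = [f"{marker}: {meaning}" for marker, meaning in MARKERS.items() if marker in log_text]
    # Only the staging-side log is expected to contain the handoff line.
    if is_staging_log and HANDOFF_LINE not in log_text:
        hints.append("handoff line missing: the target instance may not have been called")
    return hints
```

An empty result does not prove that the replication is healthy. The checks that depend on timing, such as whether new entries are still being added or how long a database action took in earlier logs, still need to be done by hand.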

If you determine that the replication is hung, use Control Center to restart the staging instance. Make sure that the hung replication has stopped by verifying that its status on the staging instance is Failed. When it has stopped, rerun the replication.

If the replication hangs again, you can try restarting the target instance, then restarting the staging instance, and then rerunning the replication. However, restarting the target instance disrupts all running jobs, returns errors for all storefront requests, and clears all caches. Restart a production instance only as a last resort.

If the replication still hangs, open a Support ticket and provide the troubleshooting steps that you have attempted.

Post-Replication Issues

If you encounter a problem after a data replication operation, take the following steps.

  1. Examine the replication logs on the staging and target instances for error messages. Look for entries that include "failed" or "ORA-". These messages can help determine the problem.
  2. If the replication logs do not provide helpful data, check the error logs.
  3. Try to isolate the replication task that caused the problem by running tests with and without each task.
  4. If replication of multiple objects fails, try replicating individual objects to narrow the cause.
  5. If a scheduled replication did not run, try to run it manually.
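
The search in step 1 can be sketched in Python as follows, for a log file that has been downloaded as text. The function name is hypothetical, and matching without regard to case is an assumption made here so that entries such as "Failed" are also caught.

```python
import re
from typing import List, Tuple

# Entries worth reading first: anything containing "failed" or "ORA-".
# Case-insensitive matching is an assumption, not a documented log convention.
ERROR_PATTERN = re.compile(r"failed|ORA-", re.IGNORECASE)


def error_lines(log_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs for entries that match ERROR_PATTERN."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_PATTERN.search(line)
    ]


# Made-up sample text, not real replication log output.
sample = "start\nindex rebuild failed\nORA-00054: resource busy\ndone"
for number, line in error_lines(sample):
    print(number, line)
```

Keeping the line numbers makes it easier to return to the surrounding entries in the full log, which usually show what the replication was doing when the error occurred.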

Related Links

Replication

Verify a Data Replication

Page Cache and Replication