We are currently stable. The risk of an outage is still higher than normal, but we do not anticipate any major outages at this time.
Technical Information:
The trigger appears to be a backlog of log data waiting to be replicated by SQL. Our index optimization job causes this backlog, so we have disabled the job on both availability groups for the time being. Any query that generates a large amount of log data would have the same effect. Lastly, if the VPN between NY and OR goes down, the replication queue would build up. If that happens, we have two options:
1. Let the log replay. We discovered today that we can handle the load even without the plan cache. We have also added plan hints to reduce CPU load when there is no plan cache.
2. Remove OR from the availability group if the outage is prolonged.
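The choice between the two options can be sketched as a small decision helper. This is only an illustration under stated assumptions: the function name, the action labels, and the 120-minute cutoff for a "prolonged" outage are hypothetical, and the real threshold would need to be agreed on by the team.

```python
from dataclasses import dataclass

# Hypothetical cutoff for what counts as a "prolonged" VPN outage.
# This value is an assumption for illustration, not an agreed number.
PROLONGED_OUTAGE_MINUTES = 120


@dataclass
class ReplicationState:
    queue_kb: int          # log data waiting to be replicated to OR, in KB
    vpn_down_minutes: int  # how long the NY-OR VPN has been down (0 if up)


def choose_action(state: ReplicationState,
                  prolonged_minutes: int = PROLONGED_OUTAGE_MINUTES) -> str:
    """Return which of the two options applies to the current state."""
    # Option 2: the outage is prolonged, so remove OR from the availability group.
    if state.vpn_down_minutes >= prolonged_minutes:
        return "remove_or_from_availability_group"
    # Option 1: a backlog exists but the outage is short, so let the log replay.
    if state.queue_kb > 0:
        return "let_log_replay"
    # Nothing is queued and the VPN is up (or only briefly down).
    return "no_action"


if __name__ == "__main__":
    print(choose_action(ReplicationState(queue_kb=500_000, vpn_down_minutes=15)))
    print(choose_action(ReplicationState(queue_kb=500_000, vpn_down_minutes=240)))
```

In practice the inputs would come from whatever monitoring is already in place for the replication queue and the VPN link; the sketch only encodes the decision itself.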
Our next step is to keep pushing Microsoft for a fix and to make sure the issue remains a high priority for them.