This relationship had been intentionally broken for some testing on the destination volume, and when a resync was issued it failed because the volume was busy. The relationship status showed:
Healthy: false
Unhealthy Reason: Scheduled update failed to start. (Destination volume must be a data-protection volume.)
Constituent Relationship: false
Destination Volume Node:
Relationship ID: aa9b0b54-64d9-11e5-be3f-00a0984ad3aa
Current Operation ID: 1bed480d-1554-11e7-aa85-00a098a230de
Transfer Type: resync
Transfer Error: -
Current Throttle: 103079214
Current Transfer Priority: normal
Last Transfer Type: resync
Last Transfer Error: Failed to change the volume to data-protection. (Volume busy)
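(For reference, this kind of per-relationship detail can be pulled with snapmirror show and the -instance flag; the destination path below is a placeholder.)
snapmirror show -destination-path vserver_name:volume_name -instance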
To check the snapshots on the volume for busy status and dependency:
snapshot show -vserver 'vserver_name' -volume 'volume_name' -fields busy,owners
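Illustrative output (snapshot names and the owner tag are placeholders; a snapshot held by an NDMP dump backup typically shows busy true with an owner such as dump):
::> snapshot show -vserver vserver_name -volume volume_name -fields busy,owners
vserver       volume       snapshot               busy  owners
------------- ------------ ---------------------- ----- ------
vserver_name  volume_name  daily.2017-04-01_0010  false -
vserver_name  volume_name  daily.2017-04-02_0010  true  dump
2 entries were displayed.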
In this case, a running NDMP backup session was preventing the resync.
To list NDMP backup sessions:
system services ndmp status
The system services ndmp status command lists all the NDMP sessions in the cluster, along with basic details about each active session.
To list details for an NDMP backup session:
system services ndmp status -node 'node_name' -session-id 'session-id'
From here you can confirm this is the NDMP session you need to kill by referencing the ‘Data Path’ field. This should be the path to the volume that is failing the resync.
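An abbreviated, illustrative excerpt (most fields omitted; the exact path format depends on whether NDMP is node-scoped or SVM-scoped):
::> system services ndmp status -node node_name -session-id session_id
...
Data Path: /vserver_name/volume_name
...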
To kill the NDMP backup session:
system services ndmp kill 'session-id' -node 'node_name'
The system services ndmp kill command is used to terminate a specific NDMP session on a particular node in the cluster. This command is not supported on Infinite Volumes.
After clearing the busy snapshot application dependency by killing the NDMP session, I was able to issue the resync successfully as per normal operations.
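For completeness, the resync itself is just the standard command (the destination path is a placeholder):
snapmirror resync -destination-path vserver_name:volume_name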
Today I encountered VMware SRM error "Failed to create snapshots of replica device Cause: SRA command 'testFailoverStart' failed. Storage port not found Either Storage port information provided in NFS list is incorrect else Verify the "isPv4" option in ontap_config file matches the ipaddress in NFS field."
This looks to be caused by the firewall-policy of the SVM data LIFs. These were set to "mgmt", which is not detected by the SRA, according to the KB article.
To change the firewall-policy from “mgmt” to “data”:
net int modify -vserver [vserver_name] -lif [data_lif_name] -firewall-policy data
To list LIFs by firewall-policy:
net int show -fields firewall-policy
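Illustrative output (SVM and LIF names are placeholders); any data LIF still showing mgmt here is a candidate for the modify above:
::> net int show -fields firewall-policy
vserver       lif             firewall-policy
------------- --------------- ---------------
vserver_name  data_lif_name   mgmt
vserver_name  data_lif_name2  data
2 entries were displayed.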
The article also advises checking the ontap_config file on the SRM server to ensure that the NFS IP address on the controller is correct and that the IP address format in the NFS address field matches the value set for the isipv4 option in the ontap_config file.
By default, the configuration file is located at install_dir\Program Files\VMware\VMware vCenter Site Recovery Manager\storage\sra\ONTAP\ontap_config.txt. You'll look for the "isPv4" option.
I opened a case recently as I had a cluster in OnCommand Unified Manager that was no longer polling. When I tried to delete and recreate it, I received the error "Cluster cannot be deleted when discovery is in progress." I am running version 7.0.
Turns out I had hit a Burt. The Burt # is 1053008, and you can view it by logging into the support site with your credentials.
Symptom
• Error: Unable to discover cluster. Cluster already exists.
• When a cluster is added to the OnCommand Unified Manager (UM) Dashboard, this event is logged:
Failed to add cluster 172.16.42.16. An internal error has occurred. Contact technical support. Details: Cannot update server (com.netapp.oci.server.UpdateTaskException [68-132-983])
• When the user attempts to remove the cluster, it fails, indicating that it is being acquired.
• Navigating to Health, Settings, Manage Data Sources shows that the datasource is failing.
• For UM, the ocumserver-debug log may contain:
2016-12-19 08:06:14,583 DEBUG [oncommand] [reconcile-0] [c.n.dfm.collector.OcieJmsListener] OCIE JMS notification message received: {DatasourceName=Unknown, DatasourceID=-1, ClusterId=3387647, ChangeType=ADDED, UpdateTime=1482152623430, MessageType=CHANGE}
• When a cluster is added to the UM Dashboard this message may be displayed indicating that the issue is in the OnCommand Performance Manager (OPM) database:
Cluster in a MetroCluster configuration is added only to Unified Manager. Cluster add failed for Performance Manager.
Note: The MetroCluster part of the message is not relevant but is included as that full message is possible.
• For OPM, the ocfserver-debug log may contain:
2016-12-19 09:15:00,013 ERROR [system] [taskScheduler-5] [o.s.s.s.TaskUtils$LoggingErrorHandler] Unexpected error occurred in scheduled task.
com.netapp.ocf.collector.OcieException: com.onaro.sanscreen.acquisition.sessions.AcquisitionUnitException [35]
Failed to getById id:-1
<>
Caused by: com.onaro.sanscreen.acquisition.sessions.AcquisitionUnitException: Failed to getById id:-1
<>
Cause
This is under investigation in Documented Issue 1053008.
Because the cluster is successfully removed from the datasource tables but not from the inventory tables, there is a disconnect between these two tables when the cluster is re-added. Attempts to re-add the inventory and performance data fail due to duplicate entries tied to the old objects in the database, because the values are not unique.
Solution
1. Shut down the OPM host.
2. Shut down the UM host.
3. Take a VMware snapshot or other backup per your company policy.
4. Boot UM.
5. When the UM WebUI is accessible, boot OPM.
6. Check MySQL to determine which clusters have the invalid datasource ID.
For vApps:
Use KB 000030068 to get to the diag shell.
diag@OnCommand:~# sudo mysql -e "select datasourceId, name, managementIp from netapp_model.cluster where datasourceId = -1;"
+--------------+-------------+--------------+
| datasourceId | name        | managementIp |
+--------------+-------------+--------------+
|           -1 | clusterName | 10.0.0.2     |
+--------------+-------------+--------------+
diag@OnCommand:~#
For RHEL:
diag@OnCommand:~# sudo mysql -e "select datasourceId, name, managementIp from netapp_model.cluster where datasourceId = -1;"
+--------------+-------------+--------------+
| datasourceId | name        | managementIp |
+--------------+-------------+--------------+
|           -1 | clusterName | 10.0.0.2     |
+--------------+-------------+--------------+
diag@OnCommand:~#
Windows:
A. Open a Windows Command Prompt window.
B. Browse to the MySQL\MySQL Server 5.6\bin directory.
EXAMPLE: > cd "Program Files\MySQL\MySQL Server 5.6\bin"
Authenticate MySQL to access the database: > mysql -u [username] -p
C. When you press [ENTER], the system will prompt you to enter the user's password.
a. NOTE: The user and password were created when MySQL was first installed on the Windows host. There is no NetApp default user that can be used to authenticate MySQL.
mysql> select datasourceId, name, managementIp from netapp_model.cluster where datasourceId = -1;
+--------------+-------------+--------------+
| datasourceId | name        | managementIp |
+--------------+-------------+--------------+
|           -1 | clusterName | 10.0.0.1     |
+--------------+-------------+--------------+
Download the attached script appropriate for the version of UM and OPM.
For vApps:
A. Use an application such as FileZilla or WinSCP to upload the script to the /upload directory on the vApp.
B. Use KB 000030068 to get to the diag shell.
C. Add the execute attribute to the script.
a. Syntax: sudo chmod +x /jail/upload/BURT1053008_um_70_2016-12-27.sh
For RHEL:
A. Be sure to have sudo or root access to the host.
B. Move the script to /var/logs/ocum/.
C. Add the execute attribute to the script.
a. Syntax: sudo chmod +x /var/logs/ocum/BURT1053008_um_70_2016-12-27.sh
For Windows (UM only):
A. Browse to the directory where MySQL is installed: MySQL\MySQL Server 5.6\bin
a. EXAMPLE: C:\Program Files\MySQL\MySQL Server 5.6\bin
B. Save a copy of the script in the 'MySQL Server 5.6\bin' directory: BURT1053008_um_70_Windows
Execute the script.
A. For vApps: sudo /jail/upload/scriptname
B. For RHEL: sudo /var/logs/ocum/scriptname
C. For Windows: from the MySQL\MySQL Server 5.6\bin directory, run mysql.exe -u [username] -p < Script_name
Confirm that the datasources with ID -1 are gone:
vApps and RHEL: diag@OnCommand:~# sudo mysql -e "select datasourceId, name, managementIp from netapp_model.cluster where datasourceId = -1;"
Windows:
A. Authenticate MySQL as per the steps outlined above, then run:
select datasourceId, name, managementIp from netapp_model.cluster where datasourceId = -1;
B. Type 'exit' to exit MySQL.
Reboot the host.
Once UM is back up, perform this same process with OPM.
Once UM and OPM have been corrected, perform a discovery of the cluster and verify that it is showing up within the WebUI.
If the same failure occurs, please contact NetApp Technical Support for further assistance.
After I ran the scripts provided in the Burt on both the OnCommand Unified Manager and OnCommand Performance Manager servers, the cluster no longer showed up in inventory and I could add it successfully.
Had an issue this week on a pair of FAS8080s running 8.2.2P1 Cluster-Mode. During a firmware upgrade of the IOM6 modules from version 0208 to 0209, I ran into ACP (Alternate Control Path) Connectivity Status showing "Partial Connectivity".
I followed the recommended action plan:
1. Disable the ACP feature in ONTAP:
>options acp.enabled off
2. Reseat the IOM module with the unresponsive ACP processor.
3. Reenable the ACP feature:
>options acp.enabled on
That, however, did not resolve the issue, and we had to replace the module to correct it. I did not disable ACP prior to the replacement.
After replacement, ACP showed status "Additional Connectivity". ACP status then showed "Full Connectivity" with the module status as "inactive (upgrading firmware)". After the firmware upgrade from 02.08 to 02.09, the module rebooted, reported 02.09 firmware with status "inactive (initializing)", and concluded with status "active".
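For reference, ACP connectivity and module status can be checked from the nodeshell with storage show acp; a minimal example run from the clustershell (node name is a placeholder):
::> system node run -node node_name -command "storage show acp"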
I didn’t expect to resolve this via HW replacement because it hadn’t been reporting an ACP issue prior to the IOM6 firmware upgrade. But that’s what resolved it.
Tracking quotas are like regular quotas but without any limits enforced. They enable you to generate disk and file capacity reports, and when used in conjunction with enforced quotas they are helpful because you can resize quota values without having to reinitialize quotas (turning them off and on to activate changes).
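For reference, a resize applies changed limits without a reinitialize; a minimal example (SVM and volume names are placeholders):
::> volume quota resize -vserver vserver_name -volume uservol1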
I recently used tracking quotas on volumes dedicated to user home directories in order to automate a chargeback report of user directory folder sizes using the Data ONTAP PowerShell Toolkit. But more on that later. First we need to get tracking quotas enabled.
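The job output below references a quota policy named quotatrackingpolicy; as an assumption about how it was set up, a default user tracking rule (no limits) in that policy would look something like this, with the policy then assigned to the SVM:
::> volume quota policy rule create -vserver vserver_name -policy-name quotatrackingpolicy -volume uservol1 -type user -target "" -qtree ""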
::> quota modify -vserver vserver_name -volume uservol1 -state on
[Job 4992] Job is queued: "quota on" performed for quota policy "quotatrackingpolicy" on volume "uservol1" in Vserver "vserver_name".
::> quota modify -vserver vserver_name -volume uservol2 -state on
[Job 4993] Job is queued: "quota on" performed for quota policy "quotatrackingpolicy" on volume "uservol2" in Vserver "vserver_name".
::> quota modify -vserver vserver_name -volume uservol3 -state on
[Job 4994] Job is queued: "quota on" performed for quota policy "quotatrackingpolicy" on volume "uservol3" in Vserver "vserver_name".
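To confirm the quota state once the jobs complete (SVM name is a placeholder):
::> volume quota show -vserver vserver_name -fields state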
Test volume quota report:
::> quota report -volume uservol1 -vserver vserver_name
Vserver: vserver_name
----Disk---- ----Files----- Quota
Volume Tree Type ID Used Limit Used Limit Specifier
------- -------- ------ ------- ----- ----- ------ ------ ---------
uservol1 user * 0B - 0 - *
uservol1 user BUILTIN\Administrators 78.74GB - 163733 -
uservol1 user root 0B - 2 -
uservol1 user ADDOMAIN\user1 495.3MB - 13087 - *
uservol1 user ADDOMAIN\user2 3.88GB - 49889 - *
uservol1 user ADDOMAIN\user3 38.03MB - 301 - *
uservol1 user ADDOMAIN\user4 3.33GB - 9079 - *
uservol1 user ADDOMAIN\user5 3.18GB - 37629 - *
uservol1 user ADDOMAIN\user6 612.0MB - 4815 - *
uservol1 user ADDOMAIN\user7 83.76MB - 989 - *
uservol1 user ADDOMAIN\user8 260.4MB - 5378 - *
11 entries were displayed.
::>