VLDB not consistent after a forced MetroCluster switchover…

After a forced MCC switchover and a successful switchback, I found the following error in the event log of a customer MCC:
6/21/2017 09:56:49 cluster01-01 ERROR vldb.aggrBladeID.missing: The volume 'cluster02_svm1_root' is located on the aggregate with UUID '3b61c6fd-b6d7-4d5f-9168-81b12632581c' whose owning dblade UUID 'cce2f293-3bd4-11e7-85ef-a1f0d21750d7' does not exist in the Volume Location Database.

This is an open internal NetApp bug that can occur in rare cases when the VLDB has been updated without the knowledge of the other cluster:
http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=1019376

It can be solved by executing the following command on the cluster which hosts the affected SVM:
cluster02# metrocluster resync vserver -cluster cluster02 -vserver cluster02_svm1

Check the status with:
cluster02# metrocluster vserver show
and
cluster01# set diag; debug vreport show; set admin

Has anyone had the same problem recently? Thanks for leaving a comment.

NetApp HCI is here…

Normally I don't do posts about product releases or new features, but this one I really want to share with you because it is a huge change in the world of HCI (Hyper Converged Infrastructure).

On 5th June, NetApp announced the new HCI product, which integrates into the NetApp Data Fabric.

The NetApp HCI blocks are based on VMware vSphere Compute Nodes and SolidFire All-Flash Storage Nodes. Each block has a total of 4 Nodes.

But what makes it special, and what's the killer feature of this product?

  • You will get all the SolidFire advantages like guaranteed performance, world-class QoS, vVol integration, and so on…
  • You will get full Storage power with the All-Flash HCI…
  • You can change the identity of the different Nodes from Storage to Compute and vice versa…

  • Easy management via vCenter Server Web Interface…
  • Easy setup – The appliance is setup in about 30 minutes…
  • IMHO the real killer feature is the integration into the NetApp Data Fabric like SnapMirror and StorageGrid…

As I already mentioned, each block has 4 Nodes. You can also combine multiple blocks over the 25GbE/10GbE links and get more Storage/Compute power. The Datastores are attached via iSCSI. If you want the familiar OnTap feeling, just deploy an OnTap Select instance and you get the full range of CIFS, iSCSI and NFS functionality for your environment.

What can we expect next?

  • We will hopefully get more and more Hypervisors
  • NetApp HCI SnapMirror integration with OnTap 9.3
  • Integration into SnapCenter 4.0

I’m really excited about my first use of this product and I’m #DataDriven…

Changing the RAID type from RAID-DP to RAID4 and vice versa

I just needed to change the RAID type of the root aggregates of an AFF MCC from RAID-DP to RAID4. This is an easy step:

storage aggregate show-status -aggregate AGGR0_NAME
storage aggregate modify -aggregate AGGR0_NAME -raidtype raid4

And verify:

storage aggregate show-status -aggregate AGGR0_NAME

The same can be done the other way:

storage aggregate modify -aggregate AGGR_NAME -raidtype raid_dp

or with RAID-TEC

storage aggregate modify -aggregate AGGR_NAME -raidtype raid_tec
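
If you only want to verify the RAID type without the whole disk layout, a field-limited query should do as well (a sketch; AGGR_NAME is a placeholder and the field name may vary slightly between ONTAP releases):

storage aggregate show -aggregate AGGR_NAME -fields raidtype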

SnapCreator throws a timeout error when listing volumes or snapshots

It is so easy to do an Oracle restore with NetApp SnapCreator, but what happens if listing the snapshots for the restore runs into a timeout?

This happens because there are too many volumes or snapshots and the listing takes longer than the expected 2 minutes.

Just change the option

SNAPCREATOR_ENGINE_READ_TIMEOUT=300000

in the snapcreator.properties file to 5 minutes for example (5 minutes = 300000ms) and restart the SnapCreator service. The default is 120000ms = 2 minutes.
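
For reference, the relevant part of snapcreator.properties would then look like this (a sketch; the exact path of the file depends on your SnapCreator installation):

# snapcreator.properties - raise the engine read timeout
# default: 120000 ms (2 minutes), here: 300000 ms (5 minutes)
SNAPCREATOR_ENGINE_READ_TIMEOUT=300000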

Grow and shrink with NetApp volume autosize

You want to use volume autosize to grow and shrink a volume?
First, the grow part. This works as it should if you have set the following options per volume:

-autosize-increment (has a default value)
-autosize-grow-threshold-percent (has a default value)
-autosize-mode (set to grow at least – should be default)
-autosize (set to on)
-max-autosize (has a default value – 120% of the volume)

The shrink part is a bit trickier and includes these options per volume:

-autosize-increment (has a default value)
-autosize-shrink-threshold-percent (has a default value – 50%)
-autosize-mode (set to grow-shrink)
-autosize (set to on)
-min-autosize (has a default value – the initial size)
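
A minimal sketch of how this could look on the CLI, using the option names listed above (SVM1, volume1 and the sizes are just placeholders; on the command line the mode value is written grow_shrink):

volume modify -vserver SVM1 -volume volume1 -autosize-mode grow_shrink -min-autosize 50g -max-autosize 200g
volume show -vserver SVM1 -volume volume1 -fields autosize-mode,min-autosize,max-autosize,autosize-grow-threshold-percent,autosize-shrink-threshold-percent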

If you want to shrink the volume with autosize, beware of the following:
– Adjust the min-autosize if necessary
– Reduce the “files” parameter if autosize logs the following events in “event log show” and the volume does not shrink to the specified minimum:

4/3/2017 10:02:20   node01 INFORMATIONAL wafl.vol.autoSize.done: Volume Autosize: Automatic shrink of volume 'volume1@vserver:596d33d2-88d9-33e0-9092-90383kde9229' by 295GB complete.
4/3/2017 10:02:20   node01 NOTICE        wafl.vol.autoSize.shrink.cap: Volume Autosize: Volume 'volume1@vserver:596d33d2-88d9-33e0-9092-90383kde9229' could not be auto shrunk below 44.9GB to recover space.

– The “files” parameter can be modified with

volume modify -vserver SVM1 -volume volume1 -files FILECOUNT

– Attention: the “files” parameter only grows automatically; it is never shrunk automatically
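
To see whether the “files” value is what keeps the volume from shrinking, a quick look at the current values helps (a sketch; SVM1 and volume1 are placeholders):

volume show -vserver SVM1 -volume volume1 -fields size,files,files-used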

More Info:
http://docs.netapp.com/ontap-9/index.jsp?topic=%2Fcom.netapp.doc.dot-cm-vsmg%2FGUID-ECE453FE-D158-4BDA-9851-6B8780474E11.html

Veeam 9.5 FLR from NetApp storage fails using NFS

If you are using NFS volumes from a NetApp cDOT cluster as datastores and you want to do a File Level Restore (FLR) with Veeam 9.5, the datastore fails to mount with “permission denied” from the server.

This is because Veeam creates a new export rule in your root export policy if you have set the IP address of your ESXi host to read-only, as per NetApp best practices. Veeam inserts the new rule as number 1, so it is obviously not possible to mount the datastore:

cluster::> export-policy rule show -vserver vmwaresvm -fields rorule,rwrule
vserver   policyname        ruleindex rorule rwrule ipaddress
--------- ----------------- --------- ------ ------ ----------
vmwaresvm ex_vmwaresvm_lab1 8         any    any    10.10.1.20
vmwaresvm ex_vmwaresvm_lab1 9         any    any    10.10.1.21
vmwaresvm ex_vmwaresvm_lab1 10        any    any    10.10.1.22
vmwaresvm ex_vmwaresvm_root 1         none   any    10.10.1.20
vmwaresvm ex_vmwaresvm_root 9         any    none   10.10.1.20
vmwaresvm ex_vmwaresvm_root 10        any    none   10.10.1.21
vmwaresvm ex_vmwaresvm_root 11        any    none   10.10.1.22

The workaround is to set the export rule for the IP address of the ESXi host to read-write before the restore:

cluster::> export-policy rule show -vserver vmwaresvm -fields rorule,rwrule
vserver   policyname        ruleindex rorule rwrule ipaddress
--------- ----------------- --------- ------ ------ ----------
vmwaresvm ex_vmwaresvm_lab1 8         any    any    10.10.1.20
vmwaresvm ex_vmwaresvm_lab1 9         any    any    10.10.1.21
vmwaresvm ex_vmwaresvm_lab1 10        any    any    10.10.1.22
vmwaresvm ex_vmwaresvm_root 9         any    any    10.10.1.20
vmwaresvm ex_vmwaresvm_root 10        any    any    10.10.1.21
vmwaresvm ex_vmwaresvm_root 11        any    any    10.10.1.22
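
On the CLI such a change could look like this (a sketch; the policy name and rule indexes are taken from the output above and will differ in your environment). The read-only rules are switched to read-write, and the rule Veeam inserted at index 1 is removed again:

cluster::> export-policy rule modify -vserver vmwaresvm -policyname ex_vmwaresvm_root -ruleindex 9 -rwrule any
cluster::> export-policy rule delete -vserver vmwaresvm -policyname ex_vmwaresvm_root -ruleindex 1

Repeat the modify for rule indexes 10 and 11.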

In my opinion this is not a secure workaround, because someone could mount your SVM root volume and write to it.

Let me know if you have the same issues…

BTW, the same problem exists with Veeam 9.0, but there the new rule is not placed as number 1, so everything works as expected…

UPDATE: This problem is solved with Veeam 9.5U2 by unchecking the corresponding checkbox.

Add and remove volumes in a SVM DR relationship

Every modification in an SVM DR relationship needs a “snapmirror resync” on the destination site, so it is very easy to add volumes to and remove volumes from the relationship.
Here is a brief introduction on how to do these two tasks.

Add a new volume to SVM-DR
# Create a new volume in the already existing SVM-DR Source SVM

source_cluster::> volume create -vserver source_svm -volume NewDRVolume -aggregate aggr1_node01_sas -size 10g -state online -type RW
[Job 2706] Job succeeded: Successful

# Do a snapmirror resync on the destination SVM

dest_cluster::> snapmirror resync -destination-path dest_svm:

# Wait until the resync is done (Relationship Status -> Idle)

dest_cluster::> snapmirror show dest_svm:

Source Path: source_svm:
Destination Path: dest_svm:
Relationship Type: DP
Relationship Group Type: –
SnapMirror Schedule: 30min
SnapMirror Policy Type: async-mirror
SnapMirror Policy: DPDefault
Tries Limit: –
Throttle (KB/sec): unlimited
Mirror State: Snapmirrored
Relationship Status: Idle
File Restore File Count: –
File Restore File List: –
Transfer Snapshot: –
Snapshot Progress: –
Total Progress: –
Network Compression Ratio: –
Snapshot Checkpoint: –
Newest Snapshot: vserverdr.2.8a2831d4-558f-11e6-b980-00a0989f2857.2016-09-28_100000
Newest Snapshot Timestamp: 09/28 10:00:00
Exported Snapshot: vserverdr.2.8a2831d4-558f-11e6-b980-00a0989f2857.2016-09-28_100000
Exported Snapshot Timestamp: 09/28 10:00:00
Healthy: true
Unhealthy Reason: –
Constituent Relationship: false
Destination Volume Node: –
Relationship ID: d2cbabb9-558f-11e6-b980-00a0989f2857
Current Operation ID: –
Transfer Type: –
Transfer Error: –
Current Throttle: –
Current Transfer Priority: –
Last Transfer Type: update
Last Transfer Error: –
Last Transfer Size: 1.64MB
Last Transfer Network Compression Ratio: –
Last Transfer Duration: 0:0:11
Last Transfer From: source_svm:
Last Transfer End Timestamp: 09/28 10:00:11
Progress Last Updated: –
Relationship Capability: –
Lag Time: 0:21:7
Identity Preserve Vserver DR: true
Number of Successful Updates: –
Number of Failed Updates: –
Number of Successful Resyncs: –
Number of Failed Resyncs: –
Number of Successful Breaks: –
Number of Failed Breaks: –
Total Transfer Bytes: –
Total Transfer Time in Seconds: –

# Check the new volume on the DR site

dest_cluster::> vol show -vserver dest_svm
Vserver   Volume              Aggregate        State  Type  Size Available Used%
--------- ------------------- ---------------- ------ ---- ----- --------- -----
dest_svm  dest_svm_root       aggr1_node01_sas online RW     1GB   972.6MB    5%
dest_svm  dest_svm_nfs_prod01 aggr1_node01_sas online DP     4TB    2.00TB   50%
dest_svm  NewDRVolume         aggr1_node01_sas online DP    10GB    9.50GB    5%
3 entries were displayed.
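
If you do not need the full snapmirror show output from above, a field-limited check is usually enough to see when the resync has finished (a sketch; the field names may differ slightly between ONTAP releases):

dest_cluster::> snapmirror show -destination-path dest_svm: -fields state,status,healthy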

Remove a no longer used volume from SVM-DR
# Remove the volume from the SVM-DR Source SVM

source_cluster::> volume offline -vserver source_svm -volume OldDRVolume
Volume “source_svm:OldDRVolume” is now offline.
source_cluster::> volume delete -vserver source_svm -volume OldDRVolume
Warning: Are you sure you want to delete volume “OldDRVolume” in Vserver “source_svm” ?
{y|n}: y
[Job 2711] Job succeeded: Successful

# Resync the destination svm

dest_cluster::> snapmirror resync -destination-path dest_svm:

# Wait until the resync is done (Relationship Status -> Idle)

dest_cluster::> snapmirror show dest_svm:

Source Path: source_svm:
Destination Path: dest_svm:
Relationship Type: DP
Relationship Group Type: –
SnapMirror Schedule: 30min
SnapMirror Policy Type: async-mirror
SnapMirror Policy: DPDefault
Tries Limit: –
Throttle (KB/sec): unlimited
Mirror State: Snapmirrored
Relationship Status: Idle
File Restore File Count: –
File Restore File List: –
Transfer Snapshot: –
Snapshot Progress: –
Total Progress: –
Network Compression Ratio: –
Snapshot Checkpoint: –
Newest Snapshot: vserverdr.0.8a2831d4-558f-11e6-b980-00a0989f2857.2016-09-28_131416
Newest Snapshot Timestamp: 09/28 13:14:16
Exported Snapshot: vserverdr.0.8a2831d4-558f-11e6-b980-00a0989f2857.2016-09-28_131416
Exported Snapshot Timestamp: 09/28 13:14:16
Healthy: true
Unhealthy Reason: –
Constituent Relationship: false
Destination Volume Node: –
Relationship ID: d2cbabb9-558f-11e6-b980-00a0989f2857
Current Operation ID: –
Transfer Type: –
Transfer Error: –
Current Throttle: –
Current Transfer Priority: –
Last Transfer Type: resync
Last Transfer Error: –
Last Transfer Size: 400KB
Last Transfer Network Compression Ratio: –
Last Transfer Duration: 0:0:35
Last Transfer From: source_svm:
Last Transfer End Timestamp: 09/28 13:14:51
Progress Last Updated: –
Relationship Capability: –
Lag Time: 0:0:40
Identity Preserve Vserver DR: true
Number of Successful Updates: –
Number of Failed Updates: –
Number of Successful Resyncs: –
Number of Failed Resyncs: –
Number of Successful Breaks: –
Number of Failed Breaks: –
Total Transfer Bytes: –
Total Transfer Time in Seconds: –

# Check that the volume has been removed on the DR site

dest_cluster::> vol show -vserver dest_svm
Vserver   Volume              Aggregate        State  Type  Size Available Used%
--------- ------------------- ---------------- ------ ---- ----- --------- -----
dest_svm  dest_svm_root       aggr1_node01_sas online RW     1GB   972.6MB    5%
dest_svm  dest_svm_nfs_prod01 aggr1_node01_sas online DP     4TB    2.00TB   50%
2 entries were displayed.
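
If you want to confirm that the removed volume is also gone from the destination, you can query it directly; after the resync the command should no longer return an entry (a sketch; OldDRVolume as in the example above):

dest_cluster::> vol show -vserver dest_svm -volume OldDRVolume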