Tuesday, 4 January 2011

NetApp AutoSupport and Aggregates

Ok, so first day back in the office after the Xmas break.

Guess what - I go into the computer room and orange lights greet me on the new NetApp. Hmmm, something's up: a closer inspection reveals an orange light on one of the disks! Great, a failed disk!

I now remember a failed-to-deliver message in my inbox for an email sent to NetApp AutoSupport over the Christmas period - yet there was no-one around to kick these off. Putting two and two together, this must have been when the disk failed - and I had not got around to setting up AutoSupport!

So first things first, let's dust off the instructions to set up AutoSupport:

First we check the transport. It is set to https but we wish to use smtp, so we change it:

options autosupport.support.transport smtp

We then need to check it sends correctly so we send a test message:

options autosupport.doit TEST

Bingo, that is now working, so we repeat the commands on the other node.

Now I log in to the now.netapp.com website and see my system is not listed :( So I raise a support call with NetApp to add it - hopefully this will be resolved tomorrow and the new disk can then be shipped.


Just before Christmas our consultant said we should set up the system with one large aggr0 and not use a separate aggr1 like we had on our old system, as this wastes 3 disks - i.e. nearly 1TB of space per node (as we have 300GB disks, 3 of them come to 900GB).
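To put a number on the consultant's "nearly 1TB" figure, here is a quick back-of-the-envelope sketch in Python. It only uses the two figures quoted above (3 disks, 300GB each) and counts raw capacity; the function name is made up for illustration, and real usable space would be lower once right-sizing and parity are taken into account.

```python
# Raw capacity tied up by the three "wasted" disks on each node.
# Figures are the ones quoted in the post; right-sizing and parity
# overhead are deliberately ignored, so this is an upper bound.
DISK_SIZE_GB = 300   # disk size quoted in the post
WASTED_DISKS = 3     # disks the consultant says are wasted per node


def raw_capacity_gb(disks: int, disk_size_gb: int = DISK_SIZE_GB) -> int:
    """Return the raw capacity in GB of the given number of disks."""
    return disks * disk_size_gb


per_node_gb = raw_capacity_gb(WASTED_DISKS)
print(f"{per_node_gb} GB per node")
```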

Duh, we have already set this up with aggr1 on both new nodes, so let's first offline and destroy aggr1 on both nodes, using the http interface.

Now we need to zero the disks:

disk zero spares

This seems to finish instantly and not do much, so we move on to adding the spare disks to aggr0:

aggr add aggr0 20

where 20 is derived from the 24 disks in the shelf, minus 3 already in use, minus 1 kept as a spare (24 - 3 - 1 = 20).
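The same sum as a small Python sketch, in case the shelf layout is different on another system. The function and variable names are my own invention for illustration, not anything from the filer itself; the only real command here is the aggr add line it prints.

```python
# Work out how many disks to hand to "aggr add", using the numbers
# from the post: 24 disks in the shelf, 3 already in use, 1 kept as a spare.
def disks_to_add(shelf_disks: int, in_use: int, spares: int) -> int:
    """Return the number of disks left to add to the aggregate."""
    remaining = shelf_disks - in_use - spares
    if remaining < 0:
        raise ValueError("more disks reserved than the shelf holds")
    return remaining


count = disks_to_add(shelf_disks=24, in_use=3, spares=1)
command = f"aggr add aggr0 {count}"
print(command)
```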

This takes a while to run and says it is zeroing the disks as it goes, so I wonder if the disk zero spares command is actually required?!

This is then repeated on the other node.

After a few minutes aggr0 is populated with the extra disks, although one of the nodes is showing an error about a lack of spare disks - not surprising seeing as one disk has failed.

So onwards to tomorrow, when I can hopefully get the system registered with NetApp support correctly and receive another disk!

1 comment:

  1. The 3 wasted disks are worth it IMHO

    Good luck if you ever want to resize aggr0 without reformatting and starting completely from scratch. I think it's best to keep the root vol and aggr0 off on their own, or it could end up creating bigger issues down the line.
