Sunday, June 8, 2014

ISILON InsightIQ Appliance - A smart way to get LIVE performance data of ISILON

Hi Friends,

Today I tested the ISILON InsightIQ Appliance for getting LIVE performance data from my ISILON cluster. The good things about this tool are its ease of use and fast deployment !!

You can also run some stress tests on your ISILON and check the load with InsightIQ. Have a look at the picture below, which shows the main dashboard page of your ISILON cluster with LIVE performance details.

(Click to expand)

Thursday, June 5, 2014

My article on calculating IOPS... The much awaited one!

Larger disks would give more capacity than you need, and faster disks would provide performance above and beyond what was requested. This may be fine, depending on your confidence in the performance requirements.

For random I/O:
RAID10: write penalty = 2, read = 1; available space = number of disks divided by 2
RAID5: write penalty = 4, read = 1; available space = number of disks minus 1 disk
RAID6: write penalty = 6, read = 1; available space = number of disks minus 2 disks

Always count all the drives in the group when adding up IOPS; the write penalty already accounts for the mirror/parity overhead.

Say an app does 1000 IOPS with a read/write ratio of 3:1, i.e. three times as many reads as writes. Those 1000 IOPS are 750 reads and 250 writes.

Backend IOPS:
RAID10: 750 + 2 x 250 = 1250; you'll need 1250 / 180 (15k) ≈ 7, so at least 8 drives (RAID10 needs an even count), or 1250 / 130 (10k) = at least 10 drives
RAID5: 750 + 4 x 250 = 1750; you'll need 1750 / 180 (15k) = at least 10 drives, or 1750 / 130 (10k) = at least 14 drives
RAID6: 750 + 6 x 250 = 2250; you'll need 2250 / 180 (15k) = at least 13 drives, or 2250 / 130 (10k) = at least 18 drives

Not digested ???


Let me take it this way now...

As for IOPS per drive, here is what's used as the industry standard (note that the worked example above used 180 IOPS for 15k drives and 130 for 10k; pick the figures you trust):

FC 10K= 150 IOPS 
FC 15K= 200 IOPS 
SSD = 400 IOPS 
SATA = 80 IOPS 
Flash: 3500 IOPS
SAS 15k: 180 IOPS
NLSAS: 90 IOPS

These are just rules of thumb used to size environments. 
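If you'd rather let the shell do the math, here is a minimal bash sketch of the worked example above (the per-drive figures are the same rules of thumb; swap in whichever numbers you trust):

#!/bin/bash
# backend IOPS = reads + (write penalty x writes)
reads=750; writes=250
for raid in "RAID10 2" "RAID5 4" "RAID6 6"; do
  set -- $raid                          # $1 = RAID level, $2 = write penalty
  backend=$(( reads + $2 * writes ))
  drives=$(( (backend + 179) / 180 ))   # ceiling of backend / 180 (15k drives)
  echo "$1: backend = $backend IOPS, needs at least $drives x 15k drives"
done
# for RAID10, round the drive count up to an even number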

Wednesday, June 4, 2014

Manually registering a host into VNX Storage - abstract from an EMC forum thread

https://community.emc.com/thread/194551



IBM XIV: Phasing Out and Phasing In a component

When a part fails in an XIV system, it is marked as phased out.

This command instructs the system to stop using the component, where the component can be either a disk, module, switch or UPS.

For disks, the system starts a process for copying the disk’s data, so that even without this disk, the system is redundant. The state of the disk after the command is Phasing-out.

The same process applies for data modules. The system starts a process for copying all the data in the module, so that the system is redundant even without this module. A data module phase-out causes a phase-out for all the disks in that module.

For UPSs and switches, the system configures itself to work without the component. There is no phase-out for power supplies, SFPs or batteries.

Phasing out a module or a disk is not permitted if it would leave the system non-redundant. Components must be in either the OK or the Phase-in state.

Once the phase-out process is completed, the component's state is either Fail or Ready, depending on the argument markasfailed. If true, the phased-out component is marked as a failed component (in order to replace the component). If false, the phased-out component is in the Ready state.

component_phaseout component=ComponentId [ markasfailed=<yes|no> ]

Phasing In:
This command instructs the system to phase in a component. Components are used by the system immediately. For disk and data modules, a process for copying data to the components (redistribution) begins. Components must be in Ready or Phasing Out states. There is no phase-in for power supplies, SFPs or batteries.

component_phasein component=ComponentId
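For example, a full phase-out / phase-in cycle for a disk might look like this in XCLI (the component ID 1:Disk:3:5 is made up for illustration):

component_phaseout component=1:Disk:3:5 markasfailed=no
# wait for the copy process to finish and the disk to reach Ready...
component_phasein component=1:Disk:3:5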

VMAX: Newly created datadevs not immediately available for allocations ???

There is a background process, PHCO/IVTOC, that runs on the new devices; until that process finishes, they are not available for allocations.

There is a fix (#68779) available for 5876.268.174 that gives PHCO higher priority than normal IVTOC.

PHCO is mainly a safety feature introduced in 5876.229 that makes the ucode run a scan on all newly added TDATs to check whether they were degraded by a disk failure in the RAID group. The DAs scan the devices, and once the scan is complete, the PHCO flag is cleared and the devices become eligible for new extent allocations.

To be honest with you, the scan takes some time. There is a way to disable it (a senior PSE is needed for this), but the PSE lab usually discourages doing so. With the fix mentioned above, though, PHCO is given very high priority, so the scan should take less time.
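While the scan runs, you can keep an eye on the pool from Solutions Enabler; a hedged example (SID and pool name are made up) that lists the data devices and whether they are enabled yet:

symcfg -sid 1234 list -pool -thin                 # all thin pools and their utilization
symcfg -sid 1234 show -pool R5_Pool -thin -all    # per-datadev detail for one pool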

Hope this helps!

Friday, April 11, 2014

Cisco SMART Zoning - Yes it is Smart enough !

Smart Zoning is pretty awesome and I highly recommend it to anyone. The basic idea is that you create smart zones of various types. In my example I'm going with a 15-host ESX environment called ESXProd, connected to two fabrics, A and B, on separate HBA ports. Each fabric has two VMAX FA ports shared by the ESX hosts. Using traditional 1-to-1 zoning, each host would have something like this:

ESX1_HBA1_7e1 VSAN 1
ESX1_HBA1_9e1 VSAN 1
ESX1_HBA2_8e1 VSAN 2
ESX1_HBA2_10e1 VSAN 2

Each zone contains two member pWWNs - one for the server's HBA and one for the FA port. All in, I'd have 60 zones (4 zones x 15 hosts) to manage and deal with.

Using a one-to-many SmartZone we allow a single initiator to connect to multiple targets.  That means my zoning per server is now:

ESX1_HBA1_7e1_9e1 VSAN 1
ESX1_HBA2_8e1_10e1 VSAN 2

Each zone has three pWWNs - one for the server's HBA and two for the FA ports. That's a lot fewer zones than the traditional approach, but it's still 30 zones in all. I think the better option is the many-to-one Smart Zone, which allows multiple initiators to talk to a single target. Using this we end up with far fewer zones, like this:

ESXProd_7e1 VSAN 1
ESXProd_9e1 VSAN 1
ESXProd_8e1 VSAN 2
ESXProd_10e1 VSAN 2

Each Smart Zone contains 16 pWWNs: 15 server HBA WWNs and 1 FA WWN. Total zones used? Four. When adding a new ESX host to this cluster we simply modify the Smart Zone to add a new member pWWN.

The third option, which Cisco does not recommend but which does work, is many-to-many. We'd end up with two zones, one per fabric.

You can zone using device alias, pWWN, FCID, or FC alias as members in Smart Zones. Enable Smart Zoning at the fabric, zone, or zoneset level.

When you add a member it's just like the current syntax, except you have to specify it as initiator, target, or both:

member pwwn 10:00:00:12:34:56:78:99 target

Zoning then will only allow initiators to talk to targets and not other initiators.

Enabling Smart Zoning is as simple as "zone smart-zoning enable vsan 100". Then you have to convert your existing zones; you can do it per zone, VSAN, zoneset, or FC alias, like so: "zone convert smart-zoning zone name <name> vsan <n>" or "zone convert smart-zoning vsan <n>".


If, for whatever reason, you need a target to talk to a target or an initiator to an initiator, you just disable Smart Zoning on that zone with "attribute disable-smart-zoning" inside that zone.
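Putting it all together, configuring one of the many-to-one zones from the example above would look roughly like this (WWNs are made up; verify against your NX-OS release):

switch# configure terminal
switch(config)# zone smart-zoning enable vsan 1
switch(config)# zone name ESXProd_7e1 vsan 1
switch(config-zone)# member pwwn 10:00:00:12:34:56:78:01 initiator
switch(config-zone)# member pwwn 10:00:00:12:34:56:78:02 initiator
switch(config-zone)# member pwwn 50:00:09:72:08:12:34:56 target
switch(config-zone)# exit
switch(config)# zoneset name FabricA_ZS vsan 1
switch(config-zoneset)# member ESXProd_7e1
switch(config-zoneset)# exit
switch(config)# zoneset activate name FabricA_ZS vsan 1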

Wednesday, April 9, 2014

VPLEX Best Practices Document - Release: November 2013

Hello Friends,


Here you can download the EMC VPLEX Planning and Best Practices Document.


All rights reserved by EMC; no unauthorized copying or editing. To download a copy, log in to support.emc.com with your customer account.

SRDF/A Concat META Expansion

I'm back !!

Today I'm gonna walk you through an issue I faced during a recent device expansion on a Symmetrix VMAX.

The task: adding an additional 100 GB to an existing RDF concat meta. But how ???

(Click to expand)

1) Create devices on R1 and R2 and R2 Clone as per their hyper sizes.
2) Terminate Clone Session on R2
3) Disable consistency on R1 group
4) Suspend RDF links for that group
5) Do a delete pair on the R1 group. This is an IMPORTANT step.
6) Add the created hypers to the existing META on R1 side (Remember this procedure is exclusively for Concat META expansion)
7) Add the created hypers to the existing META on R2 side.
8) Add the created hypers to the existing META on R2 Clone.
9) Establish a new R1 - R2 sync with the mode set to ACP_DISK. This will be a fresh sync between R1 and R2, since the new devices added to the concat META adjust the META geometry. Because this sync will take time, it's better to set the mode to ACP_DISK.
10) You can either create a temporary RDF group and later on move the pair to the actual production group or you can directly add the pair to the actual production group.
11) Once the device state changes to "Synchronized", we have to change the mode to ASYNC and then suspend the RDF group to move the pair to the actual group.
12) Once the move pair operation is complete, we can resume the RDF links and enable the consistency on the actual group which will show the state of the device pair as "Consistent".
13) So we have completed the R1 - R2 setup. Now moving over to the R2 Clone setup.
14) Since we terminated the previous clone session, we will be using the same set of clone devices and will create a clone pair.
15) You can create a pairing file with the R2 and R2 Clone device(s) and create the clone session with the -PRECOPY option. This will start the background copy of the device and the host can access data through it. 
16) You can keep the devices in precopy mode always or you can activate the clone copy which will change the status of the devices to "Copied" status.

That's it !.. Your R1 - R2 and R2 Clone concat META expansion is complete. You can ask the platform guys to expand the drive at their end so that the new size is visible to them. A SYMCLI sketch of the key steps follows below.
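For reference, here is a hedged SYMCLI sketch of those steps (SIDs, RDF group numbers, device ranges, and file names are all made up; verify each command against the Solutions Enabler documentation for your code level):

# steps 3-5: disable consistency, suspend the links, delete the pairs
symrdf -g ProdGrp disable -nop
symrdf -g ProdGrp suspend -nop
symrdf -sid 1234 -rdfg 10 -f pairs.txt deletepair -force -nop
# steps 6-8: add the new hypers to the concat metas (R1 shown; repeat on R2 / R2 clone)
symconfigure -sid 1234 -cmd "add dev 0ABC:0ABD to meta 0AB0;" commit -nop
# step 9: fresh sync in a temporary RDF group, adaptive copy disk mode
symrdf createpair -sid 1234 -rdfg 20 -f pairs.txt -type R1 -establish -nop
symrdf -sid 1234 -rdfg 20 -f pairs.txt set mode acp_disk -nop
# steps 10-11: once Synchronized, switch to async, suspend, and move the pair back
symrdf -sid 1234 -rdfg 20 -f pairs.txt set mode async -nop
symrdf -sid 1234 -rdfg 20 -f pairs.txt suspend -nop
symrdf movepair -sid 1234 -rdfg 20 -new_rdfg 10 -f pairs.txt -nop
# step 12: resume the links and re-enable consistency
symrdf -g ProdGrp resume -nop
symrdf -g ProdGrp enable -nop
# steps 14-16: recreate the clone session on the R2 side with precopy
symclone -sid 5678 -f clone_pairs.txt create -precopy -differential -nop
symclone -sid 5678 -f clone_pairs.txt activate -nop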

Friday, April 4, 2014

IBM SVC (SAN Volume Controller) - or - Ease of use Storage provisioning

Okay friends.. in this post I'm gonna give an overview of the IBM SVC GUI - how things look and how it works.

The IBM SVC main screen, or launch page, gives a summary of the storage being served by the "SAN Volume Controller". I don't know whether SVC has internal storage of its own, but when it comes to supporting external storage through the SAN Volume Controller, it is a good choice: it supports storage provisioning and data migration between heterogeneous storage arrays.
(Click to expand)

But wait - did I say anything about the SVC architecture? No - I'm not going to cover the architecture here, but here is a look at the SVC logical structure.
(Click to expand)

Claiming storage from other arrays:

When storage is masked to the SVC ports, the LUNs are organized, automatically or manually, into different storage pools (pool names can be user-defined or dynamic). Once the LUNs are added to the storage pools, the capacity of each pool is visible as shown in the picture.
(Click to expand)

IBM SVC Volumes - yes !! They are called volumes in SVC, not LUNs. A summary of what the volumes look like:
(Click to expand)

Using the "new Volume" create wizard:
(Click to expand)
As you can see from the pic above, the volume-creation options are Generic Volume, Thin-Provision, Mirror, Thin Mirror, and Compressed. I use thin-provision as I'm concerned about space. Using this one-page wizard, you can create a volume, or create it and map it straight away to a host. Once a volume is created, it is given a unique ID and a unique LUN ID; both can be used for quick reference. (A CLI sketch follows below.)
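If you prefer the CLI over the wizard, the same thing is a two-liner with svctask (pool, host, and volume names are made up):

# create a 100 GB thin-provisioned volume in pool Pool0
svctask mkvdisk -mdiskgrp Pool0 -iogrp 0 -size 100 -unit gb -rsize 2% -autoexpand -name testvol01
# map it straight to a host
svctask mkvdiskhostmap -host esx01 testvol01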

Let's have a look at a sample volume's properties.
(Click to expand)

IBM SVC Diagnostics - Troubleshooting an unknown issue:
In SVC we can create support bundles, which in turn can be uploaded to IBM support cases for analysis. The support bundles can be too big to upload to the support site, so work with your IBM personnel to get them across.

Hope you liked this post. I would be glad to hear any views.

Thursday, April 3, 2014

EMC VPLEX or IBM SVC ?

I prefer EMC VPLEX over IBM SVC. Guess why ?

Cost? No !
High performance? No !
Availability? No !
Easy to use, aka user-friendly? No !

Then what is it ??????

I would say "distributed volumes in a VPLEX Metro", which keep serving storage through a site failure or a disaster without any disruption to host / application I/O. That is one of the best features of VPLEX and makes it one of the best storage products for mission-critical environments.


Why, Why and Why ?

I know this is a bit of a weird post, but I had to write it. A few blog readers have asked why I post content with no "technical deep dive". True - I'm not a deep-dive blogger, but I do post content that is useful to users in its own way.

Whether it's simple storage provisioning or business continuity using remote replication, I keep the concept straightforward and quickly accessible for end users.

But anywayz, thanks for bringing this to my notice. I will try to be more specific and to the point in my coming posts.

Good day all.

Wednesday, April 2, 2014

SRDF - Simple (Symmetrix) Remote Data Facility

Alright... it's been ages since I wrote something on SRDF. Let me write about the much-talked-about performance issues with SRDF/A. But before I do that, here is a simple layout of how SRDF looks. (Click to expand)
A basic storage fact of life: if you're doing synchronous replication, the source has overhead and delay because every remote write must be confirmed. That limits the distance for synchronous replication to about 300 km over fiber, per http://www.intellimagic.net/intellimagic/what-is-spm/copy-services/srdfs. Asynchronous avoids that.
Storage systems are reliant upon the network in one way or another. EMC DMX SRDF synchronous data replication, NetApp MetroCluster node-to-node communication, data replication, iSCSI and NDMP backups all rely on the network to be stable and functioning. But sometimes, it's not, and network troubleshooting is required to fix the problem.

Virtual Provisioning with SRDF:
- Storage is allocated in extents of 768 KB
- SRDF operations continue to be supported with a granularity of blocks (512 bytes)
- SRDF supports connecting a VP device to another VP device (thin to thin only)
- There is an indirection penalty with every write
- Noticeable at high I/O rates
- Consequently, more SRDF CPU power is needed to achieve equivalent performance
- Maximum of roughly 2500 IOPS (still doing testing and engineering)

Best Practice Considerations:
- Always consider disk throughput requirements when creating or growing a data device pool
- Segregate applications into separate pools if they won't play well together
- Use RAID1 or RAID5 (3+1) when write performance is critical; use RAID6 for the highest availability within a thin pool
- Pre-allocate capacity if response-time-sensitive applications expand by randomly writing into new space
- Use concatenated meta volumes for ease of expansion
- Be aware of the performance considerations of replication
- General performance tuning principles still apply to thin devices

Data Domain DD990 Chassis layout

Below pics show the DD990 Hardware Chassis layout: (Click to expand)


Data Domain AutoSupport bundle

The pic below shows how to gather the support bundle required for Data Domain issues. Once the support bundle is generated, upload it to your EMC support case for analysis.
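If you prefer the CLI over the GUI, the DD OS autosupport commands do the same job (from memory; check "help autosupport" on your DD OS release):

autosupport show report    # view the current autosupport data
autosupport send           # send the autosupport to the configured recipients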

Integrating ISILON with Networker 8

I added ISILON to the Networker console for NDMP backups. And it worked like a charm.

Have a look at this pic (Click to expand)

ISILON SyncIQ Remote Replication Explained

Alright guys... here I come with my ISILON SyncIQ 7.1 remote replication setup on two clusters. Note that this is IP-based remote replication, not Fibre Channel-based replication, where you could expect a much higher rate of traffic / data flow per second.

So for SyncIQ to work correctly, we require two ISILON clusters with identical source and target directories. You can replicate SMB or NFS shares as well.

Lets start with the SyncIQ policy creation: (Click to expand)
More pics.. (Click to expand)

Once the policy is created, we can modify the policy settings as per business requirements, and also stop/start it.
For testing, I added one more policy to see how the two policies would react. ISILON is intelligent enough to load-balance the policies according to their data size.

There is more to add to this post, but I suggest referring to the ISILON OneFS 7.1 web guide or the CLI guide, which cover more uses of SyncIQ and how the policies can be customized.
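As a teaser for the CLI side, creating and kicking off a policy looks roughly like this (paths, target host, and schedule are made up, and the exact syntax can differ between OneFS builds, so check "isi sync policies create --help"):

isi sync policies create MyPolicy sync /ifs/data/source target-cluster.example.com /ifs/data/target --schedule "every day at 22:00"
isi sync jobs start MyPolicy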

Tuesday, April 1, 2014

Implementing ISILON SMB Share on a new ISILON OneFS 7.1 ? Go through this

Phew!

It took me ages to identify what was causing the access issue with the new SMB / CIFS share from my other Windows servers. Thanks to the EMC folks who gave some valuable suggestions on rectifying it.


Here is the catch, tested in 7.1 virtual nodes, no AD:
 
In the user-defined access zone, use "lsa-local-provider:System"
for the LOCAL provider. Looks a bit weird, but works (for me).
 
One would expect enabling Guest (like in the KB article)
within the user-defined access zone should do it, but no luck with that so far.
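In CLI terms, the fix boils down to something like this (the zone name is made up; the flags are from memory, so verify with "isi zone zones modify --help"):

# add the System local provider to the user-defined access zone
isi zone zones modify MyZone --add-auth-provider lsa-local-provider:System
# confirm the provider list for the zone
isi zone zones list --verbose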

Thursday, March 27, 2014

ISILON OneFS 7.1 Overview !!!

In this post I'm gonna take you on a tour of an ISILON cluster with 2 nodes...

Again, the package download is at support.emc.com > Downloads > Isilon. You will need a registered EMC customer / partner account to download it.

Installation of ISILON "virtual nodes" - yes, EMC does provide ISILON OneFS 7.1 for testing ;)) but you will have to download trial / permanent licenses for all the additional ISILON features to work.

ISILON supports:

1) SmartPools
2) SmartQuotas
3) Deduplication
4) iSCSI
5) SnapshotIQ
6) SyncIQ (Remote replication Aka Business continuity for ISILON Cluster)
7) Backup
8) CIFS / SMB Protocol
9) NFS
10) FTP
11) HTTP
12) Diagnostics
13) Performance reporting

I configured an ISILON test cluster with 2 nodes and am waiting to configure another cluster to set up remote replication. But before I do that, I can share some pics from the GUI showing what all an ISILON can do.

A cluster overview from the main dashboard page, which gives a summary of how your cluster is performing. (Click to expand)
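By the way, you can get much the same at-a-glance summary from any node's shell:

isi status     # cluster health, capacity, and per-node performance summary
isi devices    # per-drive status on the current node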

Curious about which tasks / jobs are running on the ISILON right now? Get into the Job Reports tab, which shows the running tasks. (Click to expand)

SyncIQ - the most amazing feature of ISILON, for remote replication. Yes - ISILON provides this as a licensed feature. When creating a policy, you specify the source cluster path and the target cluster path, and set the times when you want replication to run. (Click to expand)

Performance issues during a SyncIQ session? Have a look at this. (Click to expand)
And last but not least, everyone's favourite topic - CIFS !!! (Click to expand)

Oops - I forgot to mention: ISILON does provide the iSCSI protocol, but since ISILON is a scale-out NAS architecture, you have to open an RPQ with EMC to get the ISILON iSCSI feature working, as iSCSI is a block protocol.

Let's have a look at the "Filesystem Explorer" in ISILON. It doesn't show up in the GUI under the "admin" account; you have to log in as "root" to get the full ISILON GUI. (Click to expand)

My next post will cover the ISILON "SyncIQ" configuration and some command-line examples for managing an ISILON cluster.

Wednesday, March 26, 2014

A big question in a small mind - What are the pre-requisites to install Networker 8.1.1.2 ?

Okay... here is the big question for every newbie who wants to install Networker 8.1.1.2 in their environment. I strongly recommend approaching your EMC reps to get an idea of what is what.

But I can give a summary of my install steps. 

1) Download Networker 8.1.1.2 from the support.emc.com website using your EMC customer account.

2) Use Windows Server 2008 R2 or another appropriate OS for your Networker server package and Management Console. I suggest Windows Server 2008 R2 SP1 Enterprise as your "server", with the rest as clients. I installed Windows Server 2008 R2 SP1 on an x64 Core i5 box with 20 GB of RAM. If you would rather use vCenter VMs as your server and clients, install them as per your requirements.

3) JRE 1.6.26 or greater is required to launch the Networker Management Console. Disable automatic updates for Java. I suggest installing both the x86 and x64 versions of the JRE.

4) Windows firewall should be disabled.

5) Do not install Chrome or any other browser before your first launch of the Networker Management Console (NMC, from here on). If you do, you will never be able to launch the NMC on your server: it will permanently stop the "gconsole.jnlp" file from launching the NMC, even after multiple reboots of your server.

6) Do not install any anti-virus software, as it can disable or create issues with your NMC ports (9000/9001).

That's it! I suggest keeping your NMC server setup simple; that way it runs the best it can, without any issues. One last sanity check follows below.
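One quick sanity check before the first NMC launch: confirm Java is visible and that nothing else has grabbed the NMC ports (9000/9001 are the defaults):

java -version
netstat -ano | findstr "9000 9001"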

Networker 8.1.1 GUI overview

So,

In this post I'm gonna share some pics from the testing I did in my lab on Networker 8.1.x.

I used a Networker AFTD (Advanced File Type Device) for a backup / restore. (Click to enlarge)





Welcome to my blog !!!

Hi Guys,


I'm Imtiaz... a lonely SAN and storage guy who loves to capture his lab research on this little blog. Whatever the information may be, I love to capture it here so that anyone hitting the same trouble can implement the fix without any issues...

Now for the disclaimer: I do not own any of these products, nor am I using them in production. This is solely for people troubleshooting issues they face in their own environments.

Note: Do not use these instructions in your production environment; I STRONGLY recommend testing them in a lab first.