Feb 17

By default, when you enable iscsi sharing within zfs, the share is created and bound to all available ethernet interfaces. This isn’t necessarily a bad thing, but if for some reason you can reach your iscsi share via two paths, you run the chance of sending iscsi traffic over a non-optimized path and really messing with your performance. Fortunately, a way exists to bind iscsi to specific interfaces using interface groups (target portal groups, or tpgt in iscsitadm terms).

First, we need to create the interface group. This assumes that 192.168.1.1 is the IP address assigned to the interface (or in the case of bonded channels, multiple interfaces) that you want a specific share to use, and that we give the group the tag 1.

iscsitadm create tpgt 1
iscsitadm modify tpgt -i 192.168.1.1 1

A quick

iscsitadm list tpgt -v 1

Will let you know if this worked.

Now that your interface group is created, all you have to do is bind it to a specific share.

iscsitadm modify target -p 1 zpool/iscsiTarget

Done! This leaves open some interesting opportunities for using the same iscsi SAN to service connections on different networks in a relatively secure manner. Have fun!
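If you end up doing this for more than one share, the steps above can be wrapped in a small function. This is only a sketch: the function name and the RUN variable are made up for illustration, and it assumes the portal group tag is passed to create tpgt explicitly. By default it just prints the commands so you can look them over before running anything.

```shell
# Sketch only: prints the iscsitadm commands by default.
# bind_target_to_ip and RUN are hypothetical names, not part of iscsitadm.
# Set RUN to an empty string on the Solaris target host to really execute.
bind_target_to_ip() {
    tag=$1      # target portal group tag, e.g. 1
    ip=$2       # address of the interface the share should use
    target=$3   # target name, e.g. zpool/iscsiTarget

    ${RUN-echo} iscsitadm create tpgt "$tag" &&
    ${RUN-echo} iscsitadm modify tpgt -i "$ip" "$tag" &&
    ${RUN-echo} iscsitadm list tpgt -v "$tag" &&
    ${RUN-echo} iscsitadm modify target -p "$tag" "$target"
}
```

Calling `bind_target_to_ip 1 192.168.1.1 zpool/iscsiTarget` prints the four commands from this post in order; `RUN= bind_target_to_ip ...` as root would run them for real.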

Nov 24

In my never-ending quest to have enterprise level stuff in my humble home, I have put this short guide together on how to assemble the cheapest VMware ESXi server possible. There are a few guidelines to keep in mind.

First, ESXi will run using SATA drives in AHCI mode connected directly to a motherboard that uses an Intel ICH7 or ICH9 chipset, but NOT an ICH8 or ICH10.

VMware will use as much RAM as you can give it. RAM is more important than CPU.

Having said that, a multicore CPU is really helpful, especially if you plan on using the software iSCSI initiator.

Use an Intel Pro 100 or Pro 1000 NIC. NIC support is pretty bare in ESXi, so just hop on newegg and spend 20 bucks on one.

We have two options. The first and cheapest is to use a motherboard based on the ICH7 or ICH9 chipset. ESXi will support SATA drives in AHCI mode connected to this chipset, so an ICH7/9 board, a few gig of RAM, a SATA drive and an Intel NIC will have you on your feet. If you have this kind of setup, all you’re going to need to do is boot off of the ESXi installer CD and you’ll be done a short time later.

The second option is a bit more involved, but will work on ICH8 and ICH10 motherboards. I personally run an ICH10 based board with a quad core and 8 gig of RAM using this method. This method also requires iSCSI storage to be available, since ESXi won’t see the local SATA drives. I use Solaris and ZFS for my SAN. So, if you have an ICH8 or ICH10 motherboard, you need to create a bootable USB thumbdrive and install ESXi onto that. You can follow these simple instructions for Linux or you can google for instructions for Windows. Once you have your boot drive created, you will need to give your host an IP and use the management console to set up iscsi storage. Have fun!
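As a quick way to keep the chipset rule straight, here it is as a tiny lookup. This is just a sketch of the rule of thumb from this post, not an official compatibility list, and the function name and return strings are made up for illustration.

```shell
# Encodes the chipset rule of thumb from this post.
# esxi_install_method and its output strings are hypothetical names.
esxi_install_method() {
    case $(printf '%s' "$1" | tr '[:lower:]' '[:upper:]') in
        ICH7|ICH9)  echo "local-sata" ;;           # option one: install from CD to a SATA drive
        ICH8|ICH10) echo "usb-boot-plus-iscsi" ;;  # option two: USB boot, VMs on iSCSI
        *)          echo "unknown" ;;              # not covered by this post
    esac
}
```

For example, `esxi_install_method ich10` prints `usb-boot-plus-iscsi`.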

Sep 14

I really only have 1 lun that is important to me on my home san. It stores a VM running ubuntu that I remote into using NoMachine on a fairly constant basis from all over the place. When I upgraded the SAN and did a zfs send and receive from one SAN to another, I neglected to think about the iscsi IDs getting changed between SANs thus screwing up my ESXi machine. I could reimport the LUNs easy enough, but when you import a LUN into an ESX node, the first thing it does is format it! Not good. So after digging around, I found a way to tell ESXi to simply scan the SAN for available LUNs and to allow access to VMFS that might be on them.

In your VI client, do the following:

  • Select the Configuration tab
  • Select Advanced Settings
  • Select LVM in the left pane
  • Set LVM.EnableResignature to 1 and hit OK
  • Go back to Configuration
  • Click Storage Adapters
  • Right click on your iscsi adapter and select rescan. This will take a minute or two.
  • In storage, you’ll now see a bunch of LUNs labeled snapshot-whatever_you_called_the_LUN
  • Rename the snapshots to just be whatever_you_called_the_LUN and go along your merry way
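If you have more than a couple of LUNs to rename, it can help to work out the target names up front. The sketch below only strips the snapshot- prefix described in the last two steps; the function name is made up, and the rename itself still happens in the VI client.

```shell
# Strip the "snapshot-" prefix described above.
# Names without the prefix pass through unchanged.
# original_lun_name is a hypothetical helper name.
original_lun_name() {
    printf '%s\n' "${1#snapshot-}"
}
```

So `original_lun_name snapshot-whatever_you_called_the_LUN` prints `whatever_you_called_the_LUN`.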

Crisis averted!

Aug 24

A TCP packet on Ethernet is normally up to 1500 bytes large, 40 bytes of that being TCP and IP header information and 1460 of it being data. Trouble is, sometimes you don’t have 1460 bytes of data to send. In a telnet session for example, hitting enter is 1 byte, but we still need to send an entire packet for that single byte, meaning 41 bytes on the wire. This isn’t efficient, and too many small packets like this can bring a firewall to its knees. A solution was put in place years ago called Nagle’s algorithm that holds off on sending data until a larger packet can be assembled. Good for general use, not good for an iSCSI SAN, where small requests end up waiting around. Here are some worthless and misleading numbers.
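To put numbers on that overhead, here is the arithmetic as a couple of shell one-liners. It is a rough sketch that assumes a fixed 40 bytes of header per packet (no TCP options, no Ethernet framing); the function names are made up for illustration.

```shell
HEADER_BYTES=40   # TCP + IP header size assumed in this post

# Total bytes on the wire for a given payload size.
wire_bytes() {
    echo $(( $1 + HEADER_BYTES ))
}

# Share of the packet that is actual data, as a whole percentage (rounded down).
payload_percent() {
    echo $(( 100 * $1 / ($1 + HEADER_BYTES) ))
}
```

`wire_bytes 1` gives 41 and `payload_percent 1` gives 2, while a full packet (`payload_percent 1460`) comes out at 97.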

Single SATA drive, P4 OpenSolaris Build 91 shared via iSCSI on a dedicated gig network with MTU at 1500.

while true; do dd if=/dev/zero of=test.img bs=1024 count=100000; rm test.img; done
102400000 bytes (102 MB) copied, 3.84045 s, 26.7 MB/s
102400000 bytes (102 MB) copied, 4.24423 s, 24.1 MB/s

Now with Nagle disabled

while true; do dd if=/dev/zero of=test.img bs=1024 count=100000; rm test.img; done
102400000 bytes (102 MB) copied, 2.13878 s, 47.9 MB/s
102400000 bytes (102 MB) copied, 2.14086 s, 47.8 MB/s

For the statisticians among you, I did several more tests and used other utilities like cp, but this is a blog and blogs are quick so work with me here.
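If you want to sanity check the MB/s figures dd prints, they are just bytes divided by seconds, in units of a million bytes. A minimal sketch (the function name is made up, and it assumes awk is on the box):

```shell
# Throughput in MB/s (1 MB = 1,000,000 bytes, the unit dd reports here).
# mb_per_sec is a hypothetical helper name.
mb_per_sec() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}
```

`mb_per_sec 102400000 3.84045` prints 26.7 and `mb_per_sec 102400000 2.13878` prints 47.9, matching the first run of each test above.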

A bug was opened with the OpenSolaris folks (6621560) but in the meantime, we can take care of this ourselves with one little command.   To check the status of Nagle, run the following as root

ndd -get /dev/tcp tcp_naglim_def

If that comes back with anything other than 1, Nagle is kicking you in the shorts. So let’s fix that.

ndd -set /dev/tcp tcp_naglim_def 1

Tada. Things should be a fair bit speedier now. Unfortunately, this is system wide and may impact things like webservers, but if you are running apache on your SAN you pretty much get what you deserve. Enjoy!
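The check and the fix can be rolled into one small script. This is a sketch rather than anything official: the function names are made up, it has to run as root on the Solaris box, and as far as I know a value set with ndd does not survive a reboot, so you would need to rerun it from a startup script.

```shell
# Interpret the value printed by: ndd -get /dev/tcp tcp_naglim_def
# A value of 1 means Nagle is effectively off; anything else means it is on.
nagle_state() {
    if [ "$1" -eq 1 ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

# Turn Nagle off only if it is currently on (hypothetical helper name).
ensure_nagle_disabled() {
    current=$(ndd -get /dev/tcp tcp_naglim_def) || return 1
    if [ "$(nagle_state "$current")" = "enabled" ]; then
        ndd -set /dev/tcp tcp_naglim_def 1
    fi
}
```

Run `ensure_nagle_disabled` as root; `nagle_state 1` on its own prints `disabled`.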

Aug 22

Solaris had seen better days by the time Solaris 9 was released. No groundbreaking innovations had occurred, the SPARC architecture had started to lose its place as the data center chip of choice, and Linux was really kicking it in the teeth with its ease of access for younger sysadmins. An x86 version existed, but it was really just a hobby OS and no data center in its right mind would deploy it in production. Things looked bleak, and then came Solaris 10. Solaris 10, and the CoolThreads/Niagara CPUs, helped to put the shine back on Sun. Zones and containers helped to virtualize server hardware, giving a bit more return on investment, but what really did it for the geeks was ZFS. ZFS is billed as “the last word in file systems” and I gotta say, I believe it. It combines LVM, RAID and an atomic copy-on-write file system, and manages to increase performance all at the same time. Add to the equation that VMware recently released ESXi (the bare metal hypervisor that they had been charging 3500 per node for) and you have a really sweet SAN backed virtualization solution in the making.

First things first, install OpenSolaris and immediately patch it. You can find instructions on how to do that Here but the condensed version is

pfexec pkg refresh
pfexec pkg image-update
pfexec mount -F zfs rpool/ROOT/opensolaris-2 /mnt
pfexec /mnt/boot/solaris/bin/update_grub -R /mnt

Depending on your internet connection this may take an hour or a few. The reason for the upgrade is that the shipping version of OpenSolaris (2008.05) has a bug with the serial number generation that prevents VMware from using volumes exported via iscsi. Once you’ve upgraded Solaris, we need to create our pool. We’re going to assume three drives, c0t0d0, c0t0d1 and c0t0d2, and we’re going to put them into a raidz (better look this up, but think RAID 5 only better).

zpool create tank raidz c0t0d0 c0t0d1 c0t0d2

And you can check your handiwork by running

zpool status -v tank
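As a rough guide to how much space a raidz pool like this gives you: single-parity raidz spends about one drive’s worth of space on parity, so usable capacity is roughly (drives - 1) times the drive size. The sketch below ignores metadata overhead and assumes equal-sized drives; the function name and the 500 GB figure are made up for illustration.

```shell
# Approximate usable space of a single-parity raidz, in GB.
# Assumes equal-sized drives and ignores filesystem overhead.
# raidz_usable_gb is a hypothetical helper name.
raidz_usable_gb() {
    drives=$1
    size_gb=$2
    echo $(( (drives - 1) * size_gb ))
}
```

With three hypothetical 500 GB drives, `raidz_usable_gb 3 500` prints 1000.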

So, we now have a zfs pool called tank that is made of 3 drives. Next we’re going to create a 100 gig volume that we’ll use in the SAN.

zfs create -V 100g tank/iscsi-vol

We now have a 100 gig volume in the tank pool called iscsi-vol. Next step is to share that bugger out via iscsi

zfs set shareiscsi=on tank/iscsi-vol

And we’re done. You can verify with

iscsitadm list target -v
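For repeat builds, the pool, volume and share steps above can be strung together in one function. This is a sketch only: the function name and the RUN variable are made up for illustration, and by default it just prints the commands instead of touching any disks.

```shell
# Sketch only: prints the commands by default.
# share_zvol_over_iscsi and RUN are hypothetical names.
# Set RUN to an empty string (as root) to really run them.
share_zvol_over_iscsi() {
    pool=$1      # e.g. tank
    volume=$2    # e.g. iscsi-vol
    size=$3      # e.g. 100g
    shift 3      # everything left is the list of disks

    ${RUN-echo} zpool create "$pool" raidz "$@" &&
    ${RUN-echo} zfs create -V "$size" "$pool/$volume" &&
    ${RUN-echo} zfs set shareiscsi=on "$pool/$volume" &&
    ${RUN-echo} iscsitadm list target -v
}
```

`share_zvol_over_iscsi tank iscsi-vol 100g c0t0d0 c0t0d1 c0t0d2` prints the same four commands used in this post.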

Now that we have the volume shared out, we need to get access to it with VMware. I’m assuming here that you have a single ESXi 3.5 Update 2 node to play with, and a VI client connected directly to that node. This is a pretty simple operation. In the VMware console, click on Configuration and go to Networking. Add a VMkernel port, then click Properties and enable iscsi for that adapter. Back on the main Configuration tab, click on Storage Adapters and select Properties for the iscsi software adapter. You’ll need to enable the device and then click OK to close the window. Open that property window again and go to Dynamic Discovery. Here you’ll add the IP of the Solaris box and then click OK.

Right click on the iscsi adapter and select rescan. This may take a minute. When it’s done, go into Storage and click Add Storage. Looky what shows up in your vmfs storage pools: our new 100 gig volume.
