A few notes on Guest Virtual Machine Memory Ballooning

Looking over some of my Windows 2008 R2 servers, I notice in Task Manager that their memory usage is often above 90%.

Using the Sysinternals RAMMap tool, I can quickly identify whether this is due to ballooning.

Here’s a screenshot showing that 2.2GB of RAM is driver-locked (e.g., ballooned using the VMware Tools driver vmmemctl).

Using vSphere Client, I check this VM's performance data to confirm that this is ballooned memory and to see what the actual memory usage currently is.

You can see that it is indeed ballooning nearly 2GB of RAM on this VM and the Guest VM is actively using only a little over 1GB of RAM.
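
To make the arithmetic concrete, here is a small sketch of why Task Manager looks alarming while the guest is mostly idle. The 2.2GB driver-locked figure is the one from RAMMap above; the 4GB VM size and the 92% reading are hypothetical numbers I picked for illustration, and the function name is mine.

```python
def usage_without_balloon(total_mb, in_use_mb, driver_locked_mb):
    """Return (reported %, % once driver-locked pages are excluded)."""
    reported = 100.0 * in_use_mb / total_mb
    excluding = 100.0 * (in_use_mb - driver_locked_mb) / total_mb
    return reported, excluding


# Hypothetical guest: 4 GB of RAM with Task Manager showing about 92% in use.
total_mb = 4096
in_use_mb = 3768
# 2.2 GB driver-locked, as reported by RAMMap.
locked_mb = 2253

reported, excluding = usage_without_balloon(total_mb, in_use_mb, locked_mb)
print(f"reported: {reported:.0f}%  excluding balloon: {excluding:.0f}%")
```

With those assumed numbers, the memory the guest itself is holding comes out far lower than what Task Manager reports, which matches what the vSphere performance view shows.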

To control how much memory the VM will balloon, you can set the sched.mem.maxmemctl parameter in the virtual machine’s .vmx configuration file. (References: VMware KB 1003586; MSDN: Lock Pages in Memory; MSDN: VirtualLock)

The KB mentioned above is a good starting point for troubleshooting Guest VMs that pin pages in memory (for example, SQL Server with Lock Pages in Memory enabled).
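
As a rough sketch of what the .vmx edit looks like: the file stores one key = "value" pair per line, and as I understand the KB, sched.mem.maxmemctl takes a size in megabytes. The helper name set_vmx_option, the sample file contents, and the 1024 value are mine, purely for illustration.

```python
import re


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with key set to value, replacing an existing entry if present."""
    new_line = f'{key} = "{value}"'
    # Match the whole line for this key; [ \t]* avoids swallowing blank lines.
    pattern = re.compile(
        rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE | re.IGNORECASE
    )
    if pattern.search(vmx_text):
        return pattern.sub(lambda _match: new_line, vmx_text)
    if vmx_text and not vmx_text.endswith("\n"):
        vmx_text += "\n"
    return vmx_text + new_line + "\n"


# Hypothetical .vmx contents; 1024 is an example limit, not a recommendation.
sample = 'memsize = "4096"\nnvram = "ZEUS.nvram"\n'
updated = set_vmx_option(sample, "sched.mem.maxmemctl", 1024)
print(updated)
```

Running the helper a second time with a different value replaces the existing line rather than appending a duplicate.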

Get VM Extra Configuration Details

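# Load the PowerCLI snap-in and connect to the server
Add-PSSnapin VMware.VimAutomation.Core
Connect-VIServer -Server 10.10.10.1 -User "user" -Password "luser"
# List the VM's extra configuration entries, sorted by key
(Get-VM -Name "ZEUS").ExtensionData.Config.ExtraConfig | Sort-Object Key | Format-Table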

Example Output

Key                            Value                          DynamicType                    DynamicProperty
---                            -----                          -----------                    ---------------
ethernet0.pciSlotNumber        32
evcCompatibilityMode           true
guestCPUID.0                   0000000b756e65476c65746e496...
guestCPUID.1                   000106a400010800809822010fe...
guestCPUID.80000001            000000000000000000000001281...
hostCPUID.0                    0000000b756e65476c65746e496...
hostCPUID.1                    000206e60020080000bce3bdbfe...
hostCPUID.80000001             000000000000000000000001281...
nvram                          ZEUS.nvram
pciBridge0.pciSlotNumber       17
pciBridge0.present             true
pciBridge4.functions           8
pciBridge4.pciSlotNumber       21
pciBridge4.present             true
pciBridge4.virtualDev          pcieRootPort
pciBridge5.functions           8
pciBridge5.pciSlotNumber       22
pciBridge5.present             true
pciBridge5.virtualDev          pcieRootPort
pciBridge6.functions           8
pciBridge6.pciSlotNumber       23
pciBridge6.present             true
pciBridge6.virtualDev          pcieRootPort
pciBridge7.functions           8
pciBridge7.pciSlotNumber       24
pciBridge7.present             true
pciBridge7.virtualDev          pcieRootPort
replay.supported               false
sched.swap.derivedName         /vmfs/volumes/500d986f-e408...
scsi0.pciSlotNumber            160
scsi0.sasWWID                  50 05 05 65 7f 15 c3 70
scsi0:0.redo
scsi0:1.ctkEnabled             false
scsi0:1.redo
snapshot.action                keep
userCPUID.0                    0000000b756e65476c65746e496...
userCPUID.1                    000206e600200800009822010fe...
userCPUID.80000001             000000000000000000000001281...
virtualHW.productCompatibility hosted
vmci0.pciSlotNumber            33
vmotion.checkpointFBSize       4194304
vmware.tools.installstate      none
vmware.tools.internalversion   0
vmware.tools.lastInstallStatus unknown
vmware.tools.requiredversion   8300
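
Note that the output above contains no sched.mem.maxmemctl entry, which suggests no explicit balloon limit has been set on this VM. If you save output like this to a file, a short script can check for the key. This is only a sketch: the function name is mine, and it assumes the two-column Key/Value layout shown above (the sample rows are copied from it).

```python
def parse_extraconfig(table_text):
    """Parse 'Key Value' rows from Format-Table style output into a dict."""
    options = {}
    for line in table_text.splitlines():
        parts = line.split(None, 1)
        # Skip blank lines, the header row, and the dashed separator row.
        if not parts or parts[0] in ("Key", "---"):
            continue
        value = parts[1].strip() if len(parts) > 1 else ""
        options[parts[0]] = value
    return options


sample = """\
Key                            Value
---                            -----
nvram                          ZEUS.nvram
sched.swap.derivedName         /vmfs/volumes/500d986f-e408...
scsi0.sasWWID                  50 05 05 65 7f 15 c3 70
scsi0:0.redo
snapshot.action                keep
"""

config = parse_extraconfig(sample)
limit = config.get("sched.mem.maxmemctl")
print("balloon limit:", limit if limit is not None else "not set")
```

Keys that have no value in the table, such as scsi0:0.redo, are stored with an empty string.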

Hardware Monitoring on This Host is Not Responding

I received the following error this morning on my vCenter server after a brief reboot for a few Windows updates:

Hardware Monitoring on This Host is Not Responding

This error appears when I click on the Hardware Status tab on any Host. I went to Plugins -> Manage Plugins and disabled and re-enabled the plugin, to no avail.

I came across a post at VMware Communities. About halfway through the suggestions, I found one recommending a restart of the VMware VirtualCenter Management Webservices service.

Set Ambient Temperature Alarm VMware ESX Host

I had some temperature spikes in the data center recently that caused havoc, and for some reason I never received notifications. I guess the alarm isn’t configured to send email by default; rather, it sends a trap.
To get escalating notifications when the temperature spikes dramatically, you can modify the alarm.
Go to the Alarms for the vCenter server root domain and double-click on Host Hardware Temperature Status:
In the alarm's settings, click on the Action tab and change the action to send a notification email. I send to my email as well as SMS:
You can always check the temperature of your ESX host by going to the host in vCenter, clicking on the Hardware Status tab, and expanding Front Panel Board Ambient Temp (or similar, depending on your hardware):

Testing Disk in Linux using fio

I recently discovered a utility called fio that allows you to benchmark the disk subsystem in Linux. Here are the results for this test.
What is fio?

fio is an I/O tool meant to be used both for benchmarking and for stress/hardware verification. It has support for 13 different types of I/O engines (sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O, forked or threaded jobs, and much more. It can work on block devices as well as files. fio accepts job descriptions in a simple-to-understand text format. Several example job files are included. fio displays all sorts of I/O performance information. fio is in wide use in many places for benchmarking, QA, and verification purposes. It supports Linux, FreeBSD, NetBSD, OS X, OpenSolaris, AIX, HP-UX, and Windows.

Windows fio download: http://www.bluestop.org/fio/

OS – Debian Linux “Wheezy” AMD64
RAM – 8GB
Virtualized – YES
VMware Tools – YES
Disk – 1 x 50GB Thin Provisioned
Test File – 10GB
Note: The disk is on a RAID5 LUN made up of 6 disks @ 15k RPM. No disk or CPU throttling is configured on the VM.

 
Here are my fio test files:

[randrw]
rw=randread
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=randr-4k
bs=4k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=randread
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=randr-8k
bs=8k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=randrw
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=randrw-4k
bs=4k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=randrw
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=randrw-8k
bs=8k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=randwrite
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=randw-4k
bs=4k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=randrw
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=random-rw-direct
bs=8k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=read
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=seqr-4k
bs=4k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=read
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=seqr-8k
bs=8k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=rw
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=seqrw-4k
bs=4k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=rw
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=seqrw-8k
bs=8k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=write
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=seqw-4k
bs=4k
runtime=30
write_iops_log
write_lat_log
write_bw_log

[randrw]
rw=write
size=10G
direct=1
directory=/tmp/
numjobs=1
group_reporting
name=seqw-8k
bs=8k
runtime=30
write_iops_log
write_lat_log
write_bw_log
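
The job files above differ only in the rw mode, the block size, and the name, so they could also be generated rather than written by hand. Here is a minimal sketch of that idea. It is not how I produced the files above: the helper name and the choice to use the job name as the section header are mine, and it builds a full 6 x 2 matrix, which is slightly different from my set (I ran random-rw-direct instead of an 8k random write).

```python
from itertools import product

# Options shared by every job file listed above.
COMMON = ["size=10G", "direct=1", "directory=/tmp/", "numjobs=1", "group_reporting"]
LOGS = ["write_iops_log", "write_lat_log", "write_bw_log"]

# fio rw mode -> short label used in the job names above.
MODES = {
    "randread": "randr",
    "randrw": "randrw",
    "randwrite": "randw",
    "read": "seqr",
    "rw": "seqrw",
    "write": "seqw",
}


def make_job(rw, bs, runtime=30):
    """Build the text of one job file for the given rw mode and block size."""
    name = f"{MODES[rw]}-{bs}"
    lines = [f"[{name}]", f"rw={rw}", *COMMON, f"name={name}",
             f"bs={bs}", f"runtime={runtime}", *LOGS]
    return "\n".join(lines) + "\n"


# One job file body per (mode, block size) combination, keyed by job name.
jobs = {f"{MODES[rw]}-{bs}": make_job(rw, bs)
        for rw, bs in product(MODES, ["4k", "8k"])}

print(jobs["randr-4k"])
```

Each value in jobs can then be written to its own file and passed to fio as the job file argument.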

I used fio_generate_plot to generate gnuplot graphs.
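
If you would rather summarize the logs yourself, the files produced by the write_*_log options can be read directly. As far as I know, each log line is comma-separated and starts with the time in milliseconds, the sampled value, the data direction (0 for read, 1 for write), and the block size in bytes. Treat that layout as an assumption and check it against the documentation for your fio version, since the units of the value differ between log types and releases. The sample lines below are made up for illustration, not taken from my runs.

```python
import csv
from collections import defaultdict
from io import StringIO

DIRECTIONS = {0: "read", 1: "write", 2: "trim"}


def mean_by_direction(log_file):
    """Average the value column of a fio log, grouped by data direction."""
    totals = defaultdict(lambda: [0, 0])  # direction -> [sum of values, sample count]
    for row in csv.reader(log_file):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        value, ddir = int(row[1]), int(row[2])
        entry = totals[DIRECTIONS.get(ddir, str(ddir))]
        entry[0] += value
        entry[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


# Made-up sample in the assumed "time, value, direction, block size" layout.
sample = StringIO("1000, 5120, 0, 4096\n2000, 4096, 0, 4096\n2000, 2048, 1, 4096\n")
print(mean_by_direction(sample))
```

To use it on a real log, open the file and pass the file object in place of the StringIO sample.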