On July 20th I will be applying VMware updates to the ESXi hosts in the Prod 1, 2, and 3 clusters. This is a rolling maintenance and will not cause any outages for running guest VMs, as they will be live migrated off one blade at a time as it is put into maintenance mode and patched.

Regular remote access mechanisms like ssh or remote desktop to the VMs will be unaffected. All VMs and their services will continue to run as normal. There should be no customer impact.
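
As a rough illustration of why a rolling maintenance does not interrupt guests, here is a toy Python model of the process. It is only a sketch: the function `rolling_patch`, the blade names, and the "least-loaded host" placement rule are all made up for this example, and it does not call any VMware tooling.

```python
from typing import Dict, List


def rolling_patch(cluster: Dict[str, List[str]]) -> List[str]:
    """Simulate patching one host at a time (hypothetical helper).

    cluster maps a host name to the list of VMs currently running on it.
    Returns the order in which hosts were patched.
    """
    patched = []
    for host in list(cluster):
        others = [h for h in cluster if h != host]
        if not others:
            raise RuntimeError("need at least one other host to receive VMs")
        # Entering maintenance mode: move every VM to the least-loaded other host.
        while cluster[host]:
            vm = cluster[host].pop()
            target = min(others, key=lambda h: len(cluster[h]))
            cluster[target].append(vm)
        # The host is now empty, so patching it does not touch any running VM.
        patched.append(host)
        # Leaving maintenance mode: the host can receive VMs again on later rounds.
    return patched


if __name__ == "__main__":
    blades = {"blade1": ["vm1", "vm2"], "blade2": ["vm3"], "blade3": []}
    print(rolling_patch(blades))
    print(blades)
```

At every step each VM is running on some host, which is the property the announcement relies on.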

Start: 07/20/2013 9:00 PM

End: 07/21/2013 1:00 AM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

To improve security on access.cws.oregonstate.edu, all incoming connections originating from an off-campus IP address (see the note below for the definition of a campus IP address) will be dropped beginning at 10pm on Tuesday, July 16th. Outbound connections from access.cws will not be affected.

After this change a VPN connection will be required for connectivity to access.cws from off campus. For more information on VPN please see: http://oregonstate.edu/helpdocs/network/vpn-campus-access

 

Note: Campus IP addresses are defined below; any IP address outside these ranges will be considered off campus.

  • 10.0.0.0/8
  • 128.193.0.0/16
  • 140.211.0.0/16
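
To check which side of the rule a given address falls on, something like the following Python sketch could be used. The helper name `is_campus_ip` is invented for this example, and it only mirrors the three ranges listed in the note; it is not how the filtering on access.cws is actually implemented.

```python
import ipaddress

# Campus ranges exactly as listed in the note.
CAMPUS_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("128.193.0.0/16"),
    ipaddress.ip_network("140.211.0.0/16"),
]


def is_campus_ip(address: str) -> bool:
    """Return True if address is inside any campus range (hypothetical helper)."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in CAMPUS_NETWORKS)


if __name__ == "__main__":
    for addr in ("128.193.4.20", "10.1.2.3", "8.8.8.8"):
        label = "campus" if is_campus_ip(addr) else "off campus (VPN required)"
        print(f"{addr}: {label}")
```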

If you have questions or concerns about this, please contact Central Web Services via their web form at http://oregonstate.edu/is/services/cws/contact

** Emergency Maintenance Announcement – No service interruption anticipated **

We will be replacing a failed RAID/ROM battery in a fast disk node in our LeftHand SAN. This is the storage network that backs our VMware infrastructure.

We do not anticipate any service interruption. Our nodes are redundant with each other, and we will only be working on a single node, so the change should not interrupt service.

Start: 07/14/2013 11:00 PM

End: 07/15/2013 12:00 AM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

** Maintenance Announcement – No service interruption anticipated **

On Saturday the 22nd at 10pm we will be upgrading the NetScalers to version 10 build 76.7. Since they are in HA mode, no outages or downtime are expected. In the unlikely event of problems, the changes will be rolled back and the maintenance will be scheduled for a later date.

Start: 06/22/2013 2200

End: 06/22/2013 2259

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

** Maintenance Announcement – No service interruption anticipated **

We will be applying a configuration change to the iSCSI switches that support our StoreVirtual SAN. This is the storage network that backs our VMware infrastructure. (Configuring the timezone offset, take 2, and raising the syslog debug level.)

We do not anticipate any service interruption. Our switching is redundant, we will only change the switches one at a time, and the changes should not be service interrupting.

Start: 06/08/2013 10:00 PM

End: 06/08/2013 10:15 PM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

** Maintenance Announcement – No service interruption anticipated **

We will be applying a configuration change to the iSCSI switches that support our StoreVirtual SAN. This is the storage network that backs our VMware infrastructure. (Configuring syslog.)

We do not anticipate any service interruption. Our switching is redundant, we will only change the switches one at a time, and the changes should not be service interrupting.

Start: 06/04/2013 10:00 PM

End: 06/04/2013 10:15 PM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

We made the changes requested of us in Part 2. However, we are still experiencing the occasional ping failure on one of the switches. We have not seen a loop detected. It is interesting to have loop protection turned on, and it doesn’t hurt anything, but the blade centers and Flex-10s do not seem to be the problem.

So where do we go now? We put in another call to HP support, and they have directed us to perform the following steps:

  1. Re-seat all components in the switch that keeps paging us.
  2. Enable syslog on all switches and see if they say anything.
  3. Add a timezone offset to the NTP configuration.

Hopefully this will show us what is happening, as we still do not have a resolution. Or is it time to start replacing hardware? Is the SFP unit in the switch bad? Is the 10G module bad? Is the switch bad? There are lots of questions and no firm answers as to why we get woken up in the middle of the night when all seems fine except for a quick down/up event.

Part 1 | Part 1 follow up | Part 2 | Part 3 | Part 4

** Maintenance Announcement – No service interruption anticipated **

We will be applying a configuration change to the iSCSI switches that support our StoreVirtual SAN. This is the storage network that backs our VMware infrastructure.

We do not anticipate any service interruption. Our switching is redundant, we will only change the switches one at a time, and the changes should not be service interrupting.

Start: 06/01/2013 10:00 PM

End: 06/01/2013 11:00 PM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

On May 25th I will be applying VMware updates to the ESXi hosts in the DEV cluster. This is a rolling maintenance and will not cause any outages for running guest VMs, as they will be live migrated off one blade at a time as it is put into maintenance mode and patched.

Regular remote access mechanisms like ssh or remote desktop to the VMs will be unaffected. All VMs and their services will continue to run as normal. There should be no customer impact.

Start: 05/25/2013 9:00 PM

End: 05/25/2013 11:30 PM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

This last Monday we received two more pages for down events on one of our switches, one at 8am and one at 5pm. This did not impact service, but it is troubling, as we want everything in our environment to be healthy all the time. For a description of the problem we are seeing, take a look at my earlier blog post and its follow-up.

I put in a support call to HP and referenced the older ticket and the repeat of the problem. Support requested a copy of the output of the command show tech all from each switch. I dumped the output and sent it off to the helpful support person. Later that day the support person called back and asked why we were on such a new version of the firmware! So I pointed out that it was their support who gave us the copy of the firmware and told us to run it. At the end of this support call, HP came back with two changes. They would like us to add loop protect on the ports that feed our blade centers. They would also like us to reconfigure both switch2s in each site so that their trunk ports are statically defined instead of auto-detected/dynamic.

So our next maintenance window is this Saturday and we will perform the following changes:

All switches will get:

config
loop-protect mode port
loop-protect a2
write mem
exit

Each switch2 will get (where ? is 1 for site1 and 2 for site2):

config
no interface <port list> lacp
trunk <port list> trk? lacp
write mem
exit
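
Since the switch2 commands differ only in the trunk name per site, they can be generated instead of typed by hand. The Python below is a small sketch under that assumption; the function name `switch2_trunk_commands` is invented here, and the port list is passed in as-is because the real ports are not given in this post.

```python
from typing import List


def switch2_trunk_commands(site: int, port_list: str) -> List[str]:
    """Build the switch2 command sequence for one site (hypothetical helper).

    site replaces the '?' in 'trk?': 1 for site1, 2 for site2.
    port_list is inserted verbatim wherever <port list> appears.
    """
    if site not in (1, 2):
        raise ValueError("site must be 1 or 2")
    return [
        "config",
        f"no interface {port_list} lacp",
        f"trunk {port_list} trk{site} lacp",
        "write mem",
        "exit",
    ]


if __name__ == "__main__":
    # "<port list>" is left as a placeholder, exactly as in the commands above.
    for site in (1, 2):
        print(f"# site{site} switch2")
        print("\n".join(switch2_trunk_commands(site, "<port list>")))
```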

Part 1 | Part 1 follow up | Part 2 | Part 3 | Part 4