Crosswinds.Net - Simply Professional
Updates - Friday, November 20th, 2009 5:53:26pm

September 24, 11:31 AM - Partial Network Outage
Posted by: tony
One of the major feeds into the NOC was cut - literally. A backhoe at a construction site near one of the internet peering points (a place where lots of backbones connect to share traffic) cut a fiber trunk.

This is a major internet cable with approximately 1000 strands of fiber optic cable. They noticed it almost immediately and a crew was on site repairing in under an hour. The link reconnecting us was brought up just prior to 9am EST.

I wasn't affected, but all of Europe, Africa and a good chunk of the US were, depending on which provider you used.

Due to business issues (and not technical ones), traffic on the affected providers was not re-routed to other networks. It's silly and, in my opinion, completely defeats the purpose of these connections if businesses won't do business with each other to ensure traffic can be redirected.

The servers and everything were up the entire time and my access was not affected - it wasn't until our NOC was informed at approximately 8am that we knew what had happened. They did know there was a traffic reduction on one of their feeds, but no details.

We apologize for those affected, even though it was beyond our control.

-- Tony

September 23, 7:07 PM - Emergency Maintenance Complete!
Posted by: tony
It is finished!

The data has been kept safe, the drive replaced, an additional drive added. The file system is up and running well, all services have been restored.

I do apologize for the extended downtime.

I could not risk any of your data and had to bring the system back in as safe and careful a manner as possible.

Thank you for bearing with us and I hope to not do this for a VERY long time!

Thank you for your patience and understanding.

-- Tony

September 23, 6:44 PM - FSCK Underway
Posted by: tony
System is up and the data partition is fsck'ing (filesystem check).

Since the partition is big, this does take a while. Keep checking.

-- Tony

September 23, 6:33 PM - Emergency Maintenance Update - Second Drive Swap
Posted by: tony
The first drive swap and data mirror sync finished successfully.

This ensures your data is duplicated and safe.

The second drive swap is for another level of safety in case one of the drives fails - it needs to be integrated into the system.

This can be done after the system is up.

Once the cables are reseated, the system is coming up. I then need to fsck (filesystem check) the data partitions and then we will be live!

Stay tuned!

-- Tony

September 23, 5:11 PM - Emergency Maintenance Update - Rebuild Phase 1 Restarted
Posted by: tony
You will see some life as the web site is displaying the error page. We leave this up so that people will be pointed here.

The first replacement is being resynched. Knock on wood or anything else to fend off Murphy!

Once this drive is synced, we are doing 1 more drive swap but the sync for it will take place with the system online.

Will keep you updated!

-- Tony

September 23, 4:55 PM - New Drives Arrived
Posted by: tony
You might have seen a blip of uptime - we were conducting a test prior to the drive swapout.

The drives are there and ready. We'll be replacing the failed drive in approximately 10 minutes, bringing the system back up and looking at resynching the mirrors.

September 23, 4:22 PM - Drive D'oh!
Posted by: tony
Right package, wrong drives.

They were too small - tech is rushing back to supplier to replace them.

September 23, 3:13 PM - Maintenance Update
Posted by: tony
The tech has the server out of the rack and is swapping out the drive.

Once the drive is replaced, we will begin the process of saving the data and forcing the drives to re-replicate (the drives are all paired into mirrors for data safety).

-- Tony
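For readers wondering what "paired into mirrors" and "re-replicate" mean in practice, here is a minimal toy sketch. It is not the storage software running on this server; the `MirrorPair` class and its method names are hypothetical, made up purely for illustration. It shows two drives holding identical copies, one drive failing, and the replacement being filled from the surviving copy.

```python
from typing import List, Optional


class MirrorPair:
    """Toy model (hypothetical) of two drives holding identical copies of every block."""

    def __init__(self, blocks: int) -> None:
        # Each drive is a list of blocks; None stands for a failed or missing drive.
        self.drives: List[Optional[List[bytes]]] = [[b""] * blocks, [b""] * blocks]

    def write(self, index: int, data: bytes) -> None:
        # Every write goes to each drive that is still present.
        for drive in self.drives:
            if drive is not None:
                drive[index] = data

    def fail(self, drive_no: int) -> None:
        # Simulate a drive dropping out of the mirror.
        self.drives[drive_no] = None

    def replace_and_resync(self, drive_no: int) -> None:
        # Fill the replacement drive with a full copy of the surviving drive.
        healthy = self.drives[1 - drive_no]
        if healthy is None:
            raise RuntimeError("both drives lost; nothing to resync from")
        self.drives[drive_no] = list(healthy)

    def in_sync(self) -> bool:
        # The mirror is healthy only when both drives exist and match.
        return self.drives[0] is not None and self.drives[0] == self.drives[1]


pair = MirrorPair(blocks=4)
pair.write(0, b"site files")
pair.fail(1)                  # one drive goes bad
pair.write(1, b"new upload")  # the system keeps running on the surviving drive
pair.replace_and_resync(1)    # swap in the new drive and re-replicate
print(pair.in_sync())         # True
```

The point of the sketch is that data written while one drive is out still reaches the replacement, because the resync copies everything from the healthy drive.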

September 23, 2:48 PM - Maintenance Update
Posted by: tony
Replacement drives have been located and are en route. ETA, barring construction/other traffic barriers, is 10 minutes.

This will let us start the rebuild process.

September 23, 1:55 PM - Maintenance Update
Posted by: tony
Awaiting disks. Supplier tracking down our make/model has been the delay (not entirely clear if they had them on hand or had to get them).

The tech will be calling me when he's at the colo with them in hand.

-- Tony

September 23, 12:33 PM - Emergency Maintenance
Posted by: tony
If I could have avoided this I would.

The second drive that started to go bad after the first is apparently causing issues with the system. It made the system 'panic' (reboot spontaneously) and I started to get some warnings in the error logs about CRC errors (bad data).

This occurred on a reboot so, luckily, no user data has been affected.

This mandates that the disk swap happens asap.

The system is currently offline to ensure no data corruption. A Level 3 tech has been called in to work with me (as my hands and eyes for odd situations) and a drive has been ordered and is on the way to the colo.

I will post more information as I move forward and let you know about the status.
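As background on the CRC errors mentioned above, here is a minimal sketch of how a checksum flags bad data. It uses Python's standard `zlib.crc32`; the drives and error logs in this incident use their own checks, so the `crc_matches` helper and the sample data are assumptions for illustration only.

```python
import zlib


def crc_matches(block: bytes, stored_crc: int) -> bool:
    """Return True when the block's CRC-32 equals the checksum saved at write time."""
    return zlib.crc32(block) == stored_crc


original = b"user data block"
stored = zlib.crc32(original)  # checksum recorded when the data was written

corrupted = bytearray(original)
corrupted[0] ^= 0x01           # flip a single bit to simulate a bad read

print(crc_matches(original, stored))          # True
print(crc_matches(bytes(corrupted), stored))  # False
```

A mismatch only says the data read back differs from what was written; it does not say which copy is right, which is why the bad drive is swapped out rather than trusted.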

September 23, 12:20 PM - Equipment problem
Posted by: Jenn
Apologies for the inconvenience, we will have this fixed as soon as possible.

--Jenn

September 23, 9:26 AM - Complete and Up
Posted by: tony
The last fsck (file system check) finished with no errors reported and the partition has been brought back online.

Sites are visible, no data was lost and we are live!

September 23, 9:21 AM - Update
Posted by: tony
This is the last fsck of the procedure, then the filesystem will be back online.

Awaiting response from my supplier on the delivery timeline for the replacement drive so I can schedule the swapout.

September 23, 9:09 AM - Drive Failure
Posted by: tony
The gear is starting to show its age.

The second disk in this batch has started to fail.

Do not worry, no data has been lost but it put the file partition offline. I have ordered the replacement and we have been testing data integrity and making sure no data has been lost.

We are doing the last fsck before we can bring the partition online. We will be scheduling this second disk swap soon (the drives are in mirror pairs so we can continue to operate).

We do apologize for the downtime - hardware failures are just a fact of life...

September 10, 2:30 PM - System Back Up
Posted by: tony
One of the disks "burped" (and that's as close to a technical term as we can get - it gave back a random error of the kind all drives are prone to from time to time) and caused the system to reboot.

Since it did this in a sudden fashion the user data partition had to be fully fscked to ensure it is safe and clean. That takes a long time.

It's complete and back up.

I have put in a request for a new disk - these burps are not rare but for the $100 the drive costs, I want to replace it asap!

September 10, 1:47 PM - Equipment problem
Posted by: Jenn
We are on top of the problem. Please bear with us as we get it sorted out.

--Jenn