Life of a Sysadmin

The occasional trials and tribulations of a jack-of-all-trades sysadmin in a startup in Silicon Valley

March 2007

About that call to Dell Customer Service, or a Latitude Lemon

The first report of a problem with the laptop was that the screen went "funky". Our advice was to hibernate the system and power it back on; if the problem occurred again, the user was to call us while it was happening. The problem of course recurred, and I was called. The drive was moved to a new laptop and the user continued on their merry way. A call to technical support went smoothly, and the laptop was shipped to the repair depot, where they replaced the screen.

Fast forward a month: I am setting up the laptop for a new user and the screen problems return. I deploy a different laptop to the new user and make another service call. The laptop takes another trip to the depot, where the motherboard is replaced.

Fast forward another month: the laptop is out on short-term loan to an engineer on vacation. He calls me up and tells me the screen is "woogie". He was able to work around the problem for the week by simply suspending and waking the machine whenever the screen went south. A trip to the depot replaces the screen (again, in theory).

The laptop is returned to us in a worse state than it was sent in, with some of the damage likely caused by the repair depot's poor packing job. My call to technical support reports a broken wireless switch (it falls off), a flaky screen (same problem), and a horrible whine. It only took me 20 minutes to get them to send a technician onsite to make these repairs (as opposed to sending it back to the depot again). I am told that the repair person will contact me the next day and come out to do the repairs the day after that.

When I haven't heard from the repair person by the end of the second day, I call Dell, who can't escalate the matter because the people it would be escalated to have gone for the weekend. I am promised a call on Monday morning from the support escalation team. Monday morning passes and I finally hear from the repairman. He asks when would be convenient for him to come out. He doesn't catch the sarcasm when I reply "Three days ago". We set up an appointment for the following day.

The next day arrives and I show the repairman into a conference room, where he opens the boxes with the parts and learns what work he will be performing on the laptop. In about an hour he replaces the screen, motherboard, and wireless switch. I check the laptop out and everything appears to be in working order. About 30 minutes into a Windows install, the screen problem comes back. About 30 minutes later the whine comes back. The repairman's visit did, however, yield one useful piece of information: after the second repair, I should have called Customer Service instead of Technical Support.

That's just what I did. They listened to my complaint, they offered to transfer me to Technical Support, I asked for a new laptop, and 20 minutes later they agreed. Four days later a new laptop shows up.

Too bad the laptop only vaguely resembles our standard-issue laptops. The CPU is faster (great), there is more RAM (no problem), the hard drive is bigger (sure), it has the Intel wireless card instead of the Dell one (annoying, as I like consistency), the Nvidia graphics card instead of the Intel one (seriously annoying for the automated installer we use), a fingerprint reader (annoying in that it is a driver I now need to deal with for just one laptop), a DVD writer instead of a DVD/CD writer (sure, whatever), the extended battery (eww, it makes the laptop bigger), and a more expensive version of Office (not cool, as it is yet another special case I have to track).

Sigh. I guess this will become my test/development machine.

[2007/03/29 | /hardware | permanent link]

Prescription Drug, or Backup software company

While researching backup solutions for our Windows laptops, a coworker recommended I look into the products of Avamar. My first thought was "That sounds like the name of a prescription drug." My first glance at the website only provided further evidence that this was the name of a drug and not a company that makes backup software.

Turns out that they make some really cool software that works out how to avoid backing up multiple copies of the same thing (even across different systems). It also turns out that they are well out of the price range of this particular project.

[2007/03/27 | /misc | permanent link]

x11vnc, or How to Recover from a wedged X session

An engineer came to me complaining that his X session was wedged. My immediate reaction (without looking up from my work) has become a standard response: "Ctrl, Alt, Backspace to kill the X server, and once you log back in, set the screensaver to just power off the monitors."

This particular engineer, however, had a couple of long-running tasks in terminal windows that he didn't wish to lose, as they had been running for nearly 8 hours. Could I please help him fix the X session? He couldn't just let the machine sit until the tasks were done and then kill the X server, as he needed to see the output of the jobs that were running.

Tangent: Why do so few people understand that when running tasks that are going to take a while, the output should be redirected to a file (or perhaps to a file and stdout with the tee command), so that if the computer were to crash mid-run, at least the output would be saved? Not even a bad experience or two (resulting in lost output) seems to convince people of this need.
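For what it's worth, here is a minimal sketch of what I mean. The long_job function and the log file location are hypothetical stand-ins made up for illustration; substitute the real command and a path that suits you.

```shell
#!/bin/sh
# Hypothetical stand-in for a long-running job; replace with the real command.
long_job() {
    echo "step 1 done"
    echo "warning: something odd" >&2
    echo "step 2 done"
}

# Hypothetical log location; $$ (the shell's PID) keeps concurrent runs apart.
logfile="${TMPDIR:-/tmp}/long_job.$$.log"

# 2>&1 folds stderr into stdout so that warnings reach the log as well.
# tee writes everything to the log file and still echoes it to the terminal.
long_job 2>&1 | tee "$logfile"
```

If the terminal (or the X session around it) goes away, whatever the job has printed so far is still sitting in the log file.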

In this particular instance, the X server wasn't horked far enough to stop x11vnc (which takes a running X session and exports it as a VNC session) from working. This technique won't work on all X server breakages (or even most, in my experience), but it does work on occasion.

[2007/03/23 | /software | permanent link]

Low Disk Space and VMware, or Corrupting a VM's RAM

Yesterday evening, I received a report that a user couldn't log into a particular virtual machine. A quick look showed that the virtual machine was stalled awaiting an answer to a question.

For those without access to the image above, the relevant portion is "The directory /vmware/virtual_machine/ has less than 150 MB of free space. Running out of free space in this directory may corrupt the virtual machine's RAM."

Sidenote: You can see if a VM in VMware Server has a question to be answered by either connecting to the console of the VM (this is what generated the image above) or with the command line "vmware-cmd vmwareconfig.vmx answer"

This became a problem only because we do not pre-allocate the disk space of virtual machines upon creation. As users added data to the VM, the host's disk filled up and we got this error message.

Doh
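A check along the following lines, run from cron, would have warned us before VMware did. This is only a sketch: the free_mb and check_free functions and the VM_DIR variable are names I made up for illustration, /tmp stands in for the real VM directory, and the 150 MB threshold is simply borrowed from the message above.

```shell
#!/bin/sh
# free_mb DIR: print the free space, in whole megabytes, on the
# filesystem that holds DIR.
free_mb() {
    # -P forces the POSIX one-line-per-filesystem output format;
    # -k reports sizes in 1024-byte blocks. Column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# check_free DIR MIN_MB: return 0 if at least MIN_MB megabytes are free,
# otherwise print a warning to stderr and return 1.
check_free() {
    free=$(free_mb "$1")
    if [ "$free" -lt "$2" ]; then
        echo "WARNING: $1 has only ${free} MB free (threshold $2 MB)" >&2
        return 1
    fi
    echo "OK: $1 has ${free} MB free"
}

# VM_DIR is a hypothetical variable; /tmp is used so the sketch runs anywhere.
vm_dir="${VM_DIR:-/tmp}"
check_free "$vm_dir" 150 || echo "this is where a page or email would go" >&2
```

In practice the threshold should be far higher than 150 MB, since by the time that little is left the VM is already stalled.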

[2007/03/16 | /software | permanent link]

VI Keybindings and tcsh, or Oh the humanity

For historical reasons the default shell for the engineers I support is tcsh. The default cshrc sources a half dozen other csh scripts. One of those scripts issues the command bindkey -v which sets the shell to use vi style keybindings.

I learned of this little joy when I first started working here, but as I use vi regularly I could adapt (and there was the little fact that I use bash as my day-to-day shell and enter tcsh as little as possible). I had quickly forgotten about it until I received a bug report complaining about terminal windows behaving "oddly".

Sitting down at the user's machine shortly after the bug report hit my inbox, I quickly realized what the problem was: vi keybindings. After I explained to her what was going on, we added the line bindkey -e to the end of her cshrc and her world was better. For the rest of the week, I kept receiving thank-yous from people who learned about my "fix" through the grapevine.

It seems that many of my coworkers dislike vi keybindings in the shell.

For more information on bindkey, see man tcsh.

[2007/03/16 | /software | permanent link]

Dell Customer Service, or Not an acceptable message

Written 2007-03-14

The story of why I was calling Dell Customer Service will be saved for another day; suffice it to say I had issues that the technical support group clearly could not resolve.

After a few minutes navigating the general phone tree and a few more with a call manager (aka an operator), I get to the Customer Service phone tree and select the option to talk to a human. A moment later, I receive the following message:

"Dell Customer Service is currently closed for a company meeting. Please try calling again in one hour." I was speechless, which is good, since Dell hung up on me at that point.

[2007/03/14 | /misc | permanent link]