Saturday, February 05, 2005

Day 1 of a Week of Evening Shifts

The first thing that strikes me in the CDF control room is the fan noise. Everything hums. The second thing that strikes me is the screens. As Consumer Operator (aka peon) I stare at 10 monitors (5 dual-screen PCs) and a couple of TVs. One TV shows the radar map for northern Illinois, and the other is the beam conditions monitor (store 3962, stack 24 E10, B0Lum 34.43 E30, etc). The message at the bottom reads "Today has been declared an air pollution action day." I'm not sure exactly what that means: do we stay indoors because of ozone levels or clobber a smoker?

The Ace faces 10 more screens (and the same two TVs). His monitors display and govern details of the data acquisition system. As I type this the Ace is arguing in Italian with one of our Japanese collaborators about the difference between something on the monitor and something on his laptop.

The SciCo (pronounced like Bates Motel) sits nested in an 8-foot-long curved desk staring at a display of the beam loss monitors. He keeps the e-log, answers the phone, and is the team's interface to the outside world. Need experts? He pages them. So far things are very quiet... Ring binders of instructions, reference plots, and manuals line the long desk. The e-log contains such exciting entries as

Halo Counter Work: The above figure shows the effect of the new voltages on C:B0PAGC(yellow), C:B0PBSM(cyan) and B0PHSM(green). The largest effect occurs with the return of the top counter. All work on the halo counters is now complete and any features should be considered real.

Next to the Ace is another bank of monitors. These include TVs showing what cameras in the collision hall see, fire monitoring displays, oxygen monitors, ACNET (accelerator information) displays, solenoid current and other arcana. Think safety. And, of course, that ubiquitous accelerator TV display, with the iniquitous fonts. Over the years, many pixels in the collision hall cameras have gone west, so the images look pretty ratty.

The control room is longer than deep, with the CO on the left, Ace center left, safety center right, and on the right a pair of ACNET relay racks and the array of monitors for the silicon detector and high voltage. The ACNET relay racks are stuffed with NIM and CAMAC crates, which are stuffed with scalers and timers and special-purpose triggering gear and beam loss monitoring. The silicon detector is quite sensitive, and if the beam gets out of control it can be badly damaged even if the power is off, and toasted if the power is on. So we have some fancy gear to shut things down fast if the accelerator has problems.

Oh, and there's another curly desk in the center on the other side of the SciCo. When we have an access to the collision hall that's where the safety officer lives.

Urp. Chung! An absolutely stinking text-to-voice goes "DAQ HRR in error" over and over again. If I hadn't gone over to the desk to see what that meant I'd be living in ignorance. Oh well, it is better than the situation 3 years ago, when every single alarm system had its own voice and signal (toot! beep! burp! dong! pop!).

Time to run through the checklist again. SVXMon is giving grief: Can't find any root files for it. (Root is a CERN package for I/O, histogramming, and general data management.) OK, that took about 50 minutes. I'll get faster at this, no doubt. The trick is that little line about "new dead channels." You have to crosscheck with the FixList web page to find out which are known already.
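For the curious, that crosscheck amounts to a set difference: whatever the monitor flags as dead, minus whatever is already on the FixList. Here is a minimal sketch in Python. The channel labels are made up, and the hand-typed list is only a stand-in for the FixList page; on shift the comparison is done by eye, and none of these names come from CDF software.

```python
def new_dead_channels(observed, known):
    """Return the channels flagged dead that are not already on the known list."""
    return sorted(set(observed) - set(known))


# Hypothetical channel labels, invented purely for illustration
observed_dead = ["CMP-017", "SVX-B2-L3", "CMX-104"]
fixlist_known = ["SVX-B2-L3", "COT-SL4-22"]

print(new_dead_channels(observed_dead, fixlist_known))
# prints ['CMP-017', 'CMX-104']
```

Anything that survives the subtraction is what needs a note in the checklist and, eventually, an entry in the FixList.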

The wall not covered with monitors is covered with white boards and cork boards and windows into the assembly hall. Through the near window I can see out the side of the assembly hall onto the road, and see each car going by. There are quite a few for 19:00 hours.

On the long narrow counter that marks the "wall" of the control room and the start of the hall sits a large pink Energizer bunny, whose drum has the CDF logo on the membranes. Fitting: we've been running and running and running off and on since what, 1985? OK, 1987; an engineering run doesn't count. The cork boards have newspaper clippings, detector maps, a 9-page list of experts and their phone numbers, and the occasional notice "DO NOT USE FIB or Si VRB EVER unless svx expert says you can!" I don't know what that means either. Also on the counter is a pencil sharpener, a 3-hole paper punch, a manual of CDF plans, a box of teabags, and a large box used for depositing supervised access keys.

We use raised floors here, and every 2-foot square floor tile has an address. If there's a relay rack where the tile was, that relay rack gets that address. 2RR18I is the rack on my right, containing a computer (b0dap50) and the CDF ED (event display) showing a new event every 6 or 7 seconds. That's our eyecatcher for visitors: see physics happening! Actually, it is occasionally useful for spotting problems, though most problems are subtle enough to require histograms of tens of thousands of events.

And the CMP has an oscillating channel which the expert wasn't able to fix. This didn't show up in the FixList, but it shows up bright and clear in YMon and TrigMon and ObjectMon. I added it to the FixList.

A slight digression about names: The suffix Mon is obviously short for Monitoring Program, right? And the prefixes Trig, Lum, Sili, Beam etc are also pretty obvious: Trigger, Luminosity, Silicon Detector, Beam Properties. But why Y?

Years ago CERN had a histogramming package called ZBOOK. After some rewriting they married a variant of this with a memory management package called ZEBRA, and gave the world HBOOK. Meanwhile another group put together a memory management package called BOS. (Recall that all this is in FORTRAN, and some tinkering was needed to request and release chunks of memory for our data structures.) Fermilab decided that wasn't quite good enough and expanded it to a package they called YBOS. And of course, the wheel needs to be reinvented, so they wrote a histogramming package to go with it, and called it YBOOK. Naturally CDF went along with Fermilab software, and when Larry wrote his detector occupancy monitoring program, it used YBOOK, and was called YMON. And the pun was probably intentional. After a short time we ditched YBOOK and rewrote in HBOOK (it took too long to process the histograms at the end of a run), and these days most things are in ROOT, but we've not felt any pressing need to rename the monitoring package (even though it has no code left from the original).

FWIW, HBOOK, YBOOK, and YBOS all suffered from a lack of tutorials. YBOS and YBOOK had every routine rigorously explained, but there was no simple "here's how you do a simple job" explanation. I eventually wrote one for YBOS, just shortly before we ditched YBOS forever. I checked the CDF code, and found that we used fewer than 15 subroutines out of over 60 user-callable subroutines in YBOS. Most of the stuff was just not useful.

Another checklist done. 35 minutes this time. More orange juice, some rye bread, and some low-fat corned beef hash. The latter tasted a bit dry; I don't think I'll make a habit of it.

The newspaper clippings mentioned above are local stories about Fermilab (including myths about the place), a Fermilab News article about a celebration, and the lyrics to "A Dying Cub Fan's Last Request." I wonder what they'll post next year.

Oh, and you'll find menus from various food delivery joints in the neighborhood. I don't think I've seen an experimental area at Fermilab without pizza/Chinese delivery menus. I don't know how much delivery they do these days, what with tighter security. Now that I think of it, I don't recall seeing menus posted like that at CERN. Whether that's because security was tighter at CERN then, or because fast food delivery wasn't a big industry in Geneva, or because they took food too seriously to order fast food like that, I don't know.

The Japanese gentleman is back, and I'm hearing human voices again. I can't say it was quiet before, but it is livelier now. Almost time for another checklist runthrough. We're on the same run we started the shift with. It isn't a high luminosity run by recent standards. And it is funny to see a plot with the D0 mass being used for online validation. The online consumers only get a tiny fraction of all events, but still get enough events to use for 'calibration' something we used to design experiments to discover.

Dang that text-to-voice program. It sounds like a woman with a very bad sinus condition and a speech impediment.

Every two seconds the solenoid current monitor gives a gentle "fweep."

One of the crates upstairs did a midrun reboot, and the Ace is talking over the options to the SciCo. We don't really like to end a run and restart, since there's a certain amount of time spent in resetting everything, which means beam time is lost. On the other hand, things seem to be well and truly hung. He can't even reset, so he's trotting upstairs to see if he can do a hardware reset on the event builder.

Don't you love the phrase "it's a known problem"? Restart. The expert calls back in time for the Ace to recite what he had done so far, right up to the "OK, it works now."

Stage0 works OK, but then hangs. Restart it with SciCo looking over my shoulder (he had the Consumer pager for a while, and has a professional interest). Figure out that the bit mismatch for the XTRD checking is a simulation bug. On to another checklist! It is 23:11, and I'm getting a little cross-eyed.

Only a quarter of an hour to go. Checklist is done, though some items are flagged "Not Yet" rather than OK or Bad.

The electronic sign reads

Solenoid CHILLN
AC ON FULL
?1212?1212
PROCESS SYST OK

And relief arrives. There's a lost horizon, where the sound of fans doesn't throb in your ears anymore.
