Sunday, July 29, 2012

I Remember Babylon

Remember the story Arthur Clarke used to popularize geosynchronous satellites? In it a Chinese agent explains how they'll use such a broadcast satellite to rot and propagandize the US. Salon of course calls it a "bizarre mix of fact and paranoid fantasy", but others just call it prescient. He erred on one major point: it isn't the Chinese but US firms that drive this.
“For the first time in history, any form of censorship's become utterly impossible. There's simply no way of enforcing it; the customer can get what he wants, right in his own home. Lock the door, switch on the TV set--friends and family will never know."

"Very clever," I said, "but don't you think such a diet will soon pall?"

"Of course; variety is the spice of life. We'll have plenty of conventional entertainment; let me worry about that. And every so often we'll have information programs--I hate that word 'propaganda'--to tell the cloistered American public what's really happening in the world. Our special features will just be the bait."

..

He saw that I was beginning to get bored; there are some kinds of single-mindedness that I find depressing. But I had done Hartford an injustice, as he hastened to prove.

"Please don't think," he said anxiously, "that sex is our only weapon. Sensation is almost as good. Ever see the job Ed Murrow did on the late sainted Joe McCarthy? That was milk and water compared with the profiles we're planning in 'Washington Confidential.'

"And there's our 'Can You Take It?' series, designed to separate the men from the milksops. We'll issue so many advance warnings that every red-blooded American will feel he has to watch the show. It will start innocently enough, on ground nicely prepared by Hemingway. You'll see some bullfighting sequences that will really lift you out of your seat--or send you running to the bathroom--because they show all the little details you never get in those cleaned-up Hollywood movies."

"We'll follow that with some really unique material that cost us exactly nothing. Do you remember the photographic evidence the Nuremberg war trials turned up? You've never seen it, because it wasn't publishable. There were quite a few amateur photographers in the concentration camps, who made the most of opportunities they'd never get again. Some of them were hanged on the testimony of their own cameras, but their work wasn't wasted. It will lead nicely into our series 'Torture Through the Ages'--very scholarly and thorough, yet with a remarkably wide appeal."

Clarke was a great one for seeing the possibilities in the technological side of things. It is too much to expect him to foresee the spectrum of social consequences of this sort of media onslaught, or to suspect who would benefit most.

Friday, July 27, 2012

EAA 2012

A friend of Middle Daughter found tickets (much thanks to you both!) and Eldest Son and I went today. He and Youngest Daughter were going to Pirates of Penzance and so we couldn’t stay past 4, but we got to see part of the big air show (it lasts until 6), and lots of aircraft old and new.

They have a model of SpaceShipOne in the museum now. I think it would be fun for them to add HotRod, aka Putt-Putt, the prototype for Project Orion, to the collection, though I suppose it wouldn't strictly have been an aircraft since nobody ever figured out how the thing could land.

A Hornet was a noisy part of the show, and a B-17 and B-29 (with a photography plane shadowing them) a much more sedate interlude as they carried Doolittle’s co-pilot and the Enola Gay navigator and a surprised team of Young Eagles (the winner of the bid didn’t fly himself and his friends but gave the flight to the Young Eagles) around Oshkosh. A Redtail Mustang was taxiing out to be ready after the acrobatics when we left.

The timing of the skydivers worked well, except that the jumper with the US flag came down faster than the singer sang, so the "Oh say does that star spangled banner yet wave" coincided with the landing and the end of the waving.

The kits for sale were interesting, but with my eyebones not quite as young as they used to be, and less than no space to store anything, I was able to consider and restrain myself. (I wanted to be an astronaut, not a pilot.)

We got to look around inside a C-46, of the sort my wife’s uncle had flown over the Hump in China. And a tour through the new Orbis flying eye surgery unit. And others. And a drone. And fields full of planes.

I’m interested in aviation, but you meet the real enthusiasts there. Eldest Son, as we entered the gate, opined that it looked like a fair because of all the sales tents. But when 90% of the tents display aviation-related equipment, the resemblance to the Dane County Fair begins to subside. Even Bose and Sennheiser had booths: hearing protection, of course; obvious in retrospect. (I have trouble with names and couldn’t name most of the aircraft I saw if my life depended on it.)

The scattered drips and brisk wind kept it all cool, though I now sport a jolly red nose that seems to have projected beyond my floppy hat.

Thursday, July 26, 2012

Amino acids in meteorites

Researchers found amino acids in the Murchison meteorite 4 years ago, and from the C13/C12 ratio figured that they were extra-terrestrial, and not contamination or an old chunk that got blasted loose long ago and finally fell back to Earth.

In a "yet-to-be-published" article (not on archiv, dang it), they find that the ratio of left to right handed amino acids isn't the same as on Earth either for the Tagish Lake meteorite. In fact the ratios for aspartic acid were 4:1 left:right, but were only 52:48 for alanine, compared to 1:O(0) for Earthly proteins and enzymes. Random synthesis should result in 1:1. Right-handed amino acids just don't fit in proteins. You can imagine proteins made of nothing but right-handed amino acids, but they just don't exist. Here. Though right-handed acids do rarely show up.

The carbon 13 enrichment, combined with the large left-hand excess in aspartic acid but not in alanine, provides very strong evidence that some left-handed proteinogenic amino acids — ones used by life to make proteins — can be produced in excess in asteroids, according to the team.

They point out that some amino acid crystals will be of one handedness only, and that's one way of getting a pure sample. That's nice, but amino acids aren't usually found in crystal form. I suppose one model might be that there is a pool of mixed amino acids, and a crystal of right-handed aspartic acid forms, leaving the left-handed in the pool to react and get turned into the first proto-proteins. Or something. Alternatively, a crystal of the left-handed form survives some chemical insult better than the loose pool of right-handed, and is still around to form proto-proteins. Eh.

This process only amplifies a small excess that already exists. Perhaps a tiny initial left-hand excess was created by conditions in the solar nebula. For example, polarized ultraviolet light or other types of radiation from nearby stars might favor the creation of left-handed amino acids or the destruction of right-handed ones, according to the team. This initial left-hand excess could then get amplified in asteroids by processes like crystallization.

Um. Let's suppose we have something like 100% left-handed in a sample that splashes away from Earth. Over time, despite the small rate, we can get C13 from p + C12 -> N13 + γ, with N13 then decaying to C13, and with the solar wind or low-energy cosmic rays as a source for the protons. There should be C14 as well, though I'm not sure of the ratio, and some of the transmutations might disrupt the molecule. So chemicals of terrestrial origin can start to look cosmic. If they were in solution I'd expect nuclear recoil to break the molecule, but in a rock matrix there's not much place for the loose carbon to go, and it might as well recombine after a while.

At what rate will ionized amino acids in a rock matrix spontaneously shift handedness (racemize)? Is it different for different acids?

That sounds like a research project for somebody...

Wednesday, July 25, 2012

Old Habits Die Hard

The high school sent us a registration notice today. We've been getting those for 12 years now and we finally get to ignore it. Our refrigerator no longer wears a school calendar.

Of course there still are college calendars...

Saturday, July 21, 2012

Wisconsin and Space

Youngest Son is headed to Oshkosh tomorrow with his Civil Air Patrol group to help set up for the EAA air show (biggest civilian airshow in the nation). It is unpleasantly expensive, so I've only been there once--but we got to see SpaceShipOne fly.

Harrison Schmitt is an adjunct professor at the UW-Madison. (I heard him lecture once on lunar geological history.) Deke Slayton and five other astronauts are from Wisconsin.

But for sheer excitement can you beat a Sputnik crashing into the street? True, the actual event wasn't very alarming--nobody noticed for an hour and when they did they thought the hot object was some slag from a local foundry. I'm trying to imagine the reactions: "Hard hat area: falling spacecraft" or "Reinforced umbrellas for sale."

Odd that I should only learn about this from a BBC article about a claim that one crashed in Scotland too. Next time we go through Manitowoc we'll have to go look.

Friday, July 20, 2012

Town Philosopher

By now everybody has probably read about Corigliano and the new official town philosopher, who will discuss things like how to think clearly for €15/hour (ie no real expense for the taxpayers).

I love her response to the complaint by the guild of psychologists:

Dr. Giuseppe Luigi Palma said the use of a consulting philosopher was "not only misleading and confusing, but utterly perilous". He said his organisation was ready to take "all the most appropriate actions to combat any offence that may be identified".

to which the reply was

But "the work is not on the emotions, but about ideas," said Lupo. "I don't think the college of psychologists knows what a philosophical consultant is." And being a philosophical consultant, she added: "Their criticism is in any case devoid of epistemological content."

I can think of plenty of people who could profitably spend a little time in such discussions, but not that many who'd actually do it.

Novelty would bring a few people by, but I can't imagine very many people making the pilgrimage to the office, even if it was at a restaurant. I could be wrong: I gather pastors do a lot of counseling and some of that is bound to be big-picture stuff.

What sort of town atmosphere lends itself to philosophical discussion? A lot of what I eavesdrop on at the sidewalk tables turns out to be business or politics. Perhaps asking the important "why" questions makes people feel too vulnerable, and the rigor of logic doesn't seem very popular (especially in politics).

Now an on-call philosopher who you could ring up from a 1am college dorm bull session about the meaning of the universe--he might get some business.

Understatement

I was reading (not for fun, I assure you) a report on data preservation in physics. I ran across this gem.
It would therefore be beneficial to have a framework to automatically test and validate the software and data of an experiment against changes and upgrades to the environment, as well as changes to the experimental software. As such a framework would examine many facets common to several current HEP experiments interested in a more complete data preservation model, the development of a generic validation suite is favourable.

My credentials: I spent several years working with code management and validation on CDF, and the last project I worked on had, at my insistence, a focus on making the release as portable as possible so that colleagues could continue to analyze data for years to come, without worrying about changes in operating system.

It took months to validate an ordinary release--there were always little gotchas, code that used to work but didn't anymore, and things that changed when the underlying system libraries changed. The last project is still in progress :-( after a year.

It would be wonderful to have an automatic testing framework, but it would be hugely complex and there's no way it could be generic. And software upgrades (bug fixes) are anathema to people who want a stable framework for analysis and comparison. (If both the simulation you did two years ago and this year's data processing share the same bugs you can still compare the results.) The releases also have to be vetted to make sure the physics results make sense. The last time I heard, CMS and ATLAS still hadn't validated their code on the SL6 operating system either.
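To make that concrete, here is a minimal sketch (Python, with invented histogram names and a made-up tolerance; this is not CDF's or HERA's actual tooling) of the kind of check such a validation suite would have to run over and over: compare physics-level histograms from a new release against a frozen reference and flag anything that moves by more than a few sigma.

    # Minimal sketch of a release-validation check: compare histograms produced by a
    # new software release against a frozen reference release and flag divergences.
    # Hypothetical names and tolerances -- not the actual CDF/HERA tooling.

    import math

    def compare_histograms(reference, candidate, max_sigma=3.0):
        """Return a list of (name, sigma) for bins that moved by more than max_sigma,
        treating each bin as a counting experiment (Poisson errors)."""
        problems = []
        for name, ref_bins in reference.items():
            new_bins = candidate.get(name)
            if new_bins is None or len(new_bins) != len(ref_bins):
                problems.append((name, float("inf")))   # missing or reshaped histogram
                continue
            for i, (r, n) in enumerate(zip(ref_bins, new_bins)):
                err = math.sqrt(r + n) or 1.0            # crude combined Poisson error
                sigma = abs(n - r) / err
                if sigma > max_sigma:
                    problems.append((f"{name}[bin {i}]", sigma))
        return problems

    # Toy usage: a "mass" histogram that agrees and a "pt" histogram that doesn't.
    reference = {"mass": [100, 250, 90], "pt": [400, 300, 200]}
    candidate = {"mass": [103, 246, 95], "pt": [400, 300, 20]}
    for name, sigma in compare_histograms(reference, candidate):
        print(f"DIVERGENCE {name}: {sigma:.1f} sigma")

The hard part, of course, is not the comparison but deciding which of the flagged differences are bugs, which are improvements, and which are somebody else's fault.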

There are two nightmares: when something triggers a bug in the compiler and when "the physics results diverge." The first eats a lot of expensive talent's time and wastes a lot of everybody else's as you converge on an agreement that the compiler is at fault. The second involves getting some very busy scientists to step through a lot of tedious debugging and then argue about whether the result is an improvement and if it is, is it a big enough improvement to disturb the status quo.

Yes, most of the analysis software was written by scientists. They have the domain knowledge, but it doesn't always translate into clean code. (And C++ evolved, and some people learned to do it one way and others another. And some people tried to get cute. And so on.)

To be fair, the paper goes on to say that HERA has a test version of a test suite, and if it is built in from the get-go it is probably easier to do.

Thursday, July 19, 2012

Looking one more time

This perhaps is starting to feel like one of Aquinas' propositions; to and fro and to.

There was one more bit of low-hanging fruit: consider the distribution of scores for the highest-scoring inning (and what the heck, the second-highest too, since if I remove that the distribution for the rest of the game looks very simple). Is this different if it happens in the problematic first inning than if it happens in the others? I guessed that it would be.

And as the plots below show, it is: and so is the second inning's distribution. The first inning is twice as likely as most of the rest to be the highest-scoring inning, but the average score is less.

The average for the highest score if it happens in the first inning is 1.5, while for innings 3-8 it slowly climbs from 2.5 to 2.8. Given that starting pitchers typically get relieved somewhere between 4 and 6, I don't see any smoking gun for pitchers being exhausted, which was the guess of mine that started this study.

The first inning looks like a combination of two distributions: that common to the rest of the innings and something else. If I subtract off the "common distribution": well, see the bottom plot below where I did a rough-and-ready subtraction. The low-score end of the leftover distribution looks like a simple Poissonian random sampling. The high end you shouldn't pay any mind to; the statistics are low and the subtraction was crude.
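For what it's worth, the subtraction itself is the easy part. A crude version (Python; the counts are invented and the normalization choice--scaling to the same total--is just one reasonable rough choice, not necessarily the one I used) looks like this:

    # Rough-and-ready histogram subtraction (a sketch with made-up counts, not my real numbers):
    # scale the "common" innings-2-through-8 shape to the same total as the first-inning
    # histogram, then subtract bin by bin to see what is left over.

    first_inning = [700, 300, 160, 90, 50, 25, 12]   # games in which that inning scored 0,1,2,... runs
    common_shape = [800, 260, 120, 55, 25, 11, 5]    # the same histogram built from innings 2-8

    scale = sum(first_inning) / sum(common_shape)
    leftover = [f - scale * c for f, c in zip(first_inning, common_shape)]

    for runs, excess in enumerate(leftover):
        print(f"{runs} runs: excess {excess:+.1f}")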



The second-highest score looks pretty similar, though smaller.


So from the last post we see that when we take out the two highest-scoring innings, the score distribution per inning seems to suggest that the runs are relatively independent (which is pretty odd: home runs yes, but small ball no). At any rate, it looks pretty clean.

But we seem to have two failure modes: one that can happen in any inning and gives the other team an average of at least a couple of runs, and an additional problem in the first inning, which follows the same sort of Poisson distribution as an ordinary inning, with about the same relative probability.

So a team fouls up with some random probability (grounders take bad hops, etc) all through the game (except the last inning), though in the first inning they seem about half again as likely to foul up. On top of this there's some correlated foul-up that happens in just about any inning.

And one more thing: the second-highest scoring inning is more than twice as likely to come immediately after the first-highest than at any other time. So whatever the effect is, it can last more than an inning.

I think I've taken this about as far as I can without delving into pitching changes. It might be amusing to compare different years with different rules/equipment ("Jackrabbit balls"), but I'll leave those as exercises for the readers.

Wednesday, July 18, 2012

Matching the data to the predictions

The results are in. And one prediction held up and the other fell flat.

First the falling flat:

As you can see from the top two plots, the highest and second highest scores tended to come in the first inning, not later in the game. In fact, there's a strong drop at the end, suggesting that relief pitchers do help keep the score down.

I conclude that the high score doesn't come when the pitcher gets tired. I was wrong. In fact, it looks more like the pitcher isn't quite in the game yet, or the team hasn't gotten synchronized for play yet.

Now let's look at the second-highest score distributions, to see what we can see.

The distributions of highest and second-highest are roughly similar, except that the second-highest inning's score is less than half that of the highest inning's score, on the average. And the reduced score distribution for 7 innings (the extra-inning games are a small effect here and I ignore the correction) looks much more Poissonian, dropping by about a factor of 2 each time. That's on the lower right: compare with the 8-inning reduced score on the lower left.

I used the 7-inning reduced score distribution to predict the highest and second-highest innings' scores, and those are in the bottom row of the figure below. Recall that when you look at the maximum of a set of innings, a score of 0 is much less likely than it is for any single inning. The estimates have the same general shape as the real distributions above, but with much smaller averages. The average real highest inning's score is 2.3 but the estimate's is 0.7, and the average real second-highest inning's score is 1.1 but the estimate's is 0.5.
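For anyone who wants to reproduce the estimate, the prediction can be generated by brute force (a Python sketch; the per-inning probabilities below are invented placeholders, not the fitted reduced distribution): draw 9 innings per game from the reduced distribution and histogram the largest and second-largest values.

    # Sketch of the "background" prediction: sample 9 innings per game from the reduced
    # per-inning run distribution and record the highest and second-highest values.
    # The probabilities below are invented placeholders, not the fitted ones.

    import random
    from collections import Counter

    run_probs = {0: 0.73, 1: 0.15, 2: 0.07, 3: 0.03, 4: 0.015, 5: 0.005}  # per-inning P(runs)

    def simulate_game(n_innings=9):
        innings = random.choices(list(run_probs), weights=run_probs.values(), k=n_innings)
        innings.sort(reverse=True)
        return innings[0], innings[1]          # highest, second-highest inning scores

    highest, second = Counter(), Counter()
    n_games = 100_000
    for _ in range(n_games):
        h, s = simulate_game()
        highest[h] += 1
        second[s] += 1

    print("runs  P(highest)  P(second-highest)")
    for r in sorted(run_probs):
        print(f"{r:4d}  {highest[r] / n_games:10.3f}  {second[r] / n_games:17.3f}")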

What can I conclude from this?

The highest inning's score is in substantial disagreement with the estimate. So is the second-highest inning's score, though the real distribution could be a combination of the background estimate and something else.

  1. The first inning is the problematic one: either the team isn't together or the pitcher isn't.
  2. The distribution of scores in the highest-scoring inning is not attributable to chance. Something is different during that inning.
  3. The second-highest scoring inning is partly like the highest-scoring inning and partly like chance.
  4. The distributions don't look much different between home team and visiting team.

Interlude with water

Posting plans this evening changed slightly. The long-awaited rain came!

Unfortunately it filled the window well and started to overflow into the basement, so I bailed about a third of a cubic yard of water(*) while standing in the window well and hoped the lightning wouldn't notice somebody mostly below ground. Deferred maintenance; don't ask. But at least the landscaping I improvised in the back seems to have worked this year, and the sump pump had nothing to do for a change.

But we got a good soaking so I can't complain too much.

(*) The window well was modified to serve as an emergency exit for the basement, so it is all nice and legal for somebody to live down there and there won't be problems if we have to sell and move. So it is a bit wide. The rain filled up 10" of pea gravel and went on to rise another 7" to lap at the window.

Counterpoint

Let me be the first to challenge my previous post.

The probability of scoring a run in baseball is a complicated mix of the probabilities of walks and hits and steals (and hit batsmen and home runs and so forth). So a nice simple Poissonian model isn't really appropriate for estimating what the expectation for the maximum inning's score should be. It is possible that these combinations conspire to make higher scores more likely. If that is the case, then the "second-highest" inning should have a distribution more like that of the "highest" than a run-of-the-mill inning.

Also, I was hinting that this might be due to pitcher failure. That should be more likely in later innings, so I should check that also.

So, I want to look at the highest and second-highest distributions, and the Monte Carlo estimates of what those distributions should look like based on the reduced-reduced score. I also want to see what the distributions of the inning numbers for the highest and second-highest scores look like. The former should be more skewed to higher inning numbers than the latter if this is due to pitcher issues. And the second-highest score should look more like the prediction from the average than like the highest score, if the high-score inning is pitcher failure.

So let's see if my predictions agree with the data. (No, I haven't done these yet.)

Sunday, July 15, 2012

Pitcher Pooped Out?

Many baseball games turn on a decisive inning when a pitcher starts to lose it and isn’t yanked fast enough.

Does this effect, in the end, make a huge difference? How much of the game revolves around the performance of the team as a whole, and how much around how fast the manager spots problems with his pitchers?

The right way to study this is to review all the games and look for the innings with pitching changes, and then drop those, and see if the overall won/loss ratios between the teams change with the new scores.

That gets messy when there’s more than one pitching change. I could try to program that, but instead I used the inning with the most runs scored as a proxy for when the opposing pitcher pooped out. Yes, there are problems with this kind of proxy. The sun getting in the outfielders' eyes can undermine the defense, and contribute to extra runs, for example.

But first, here’s where I got the data for regular season games in 2010 and 2011:

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".

There may be errors in the site’s data, but I’ll assume it is accurate enough to serve the purpose: the uncertainty of my proxy is undoubtedly larger than their typographical error rate.

What I did was read the text file into OpenOffice and remove almost all the columns, so that I was left with the date, team names and leagues, number of outs in the game, whether there were forfeits or protests, and the line scores. The rest was just simple awk scripts and ROOT for the graphics.

The procedure is simple. Look at the line scores, and for each team find the inning with the most runs. Remove it from their total. Now look at which team wins the game on the reduced scores. Ties count as half a win for each side. For a season, add up the wins for each team, rank them using the normal scores and again using the reduced scores, and see whether the standings change.

I ignore inter-league games, and toss out any games that were flagged as being forfeits or protested by one side or the other. I also don’t worry about divisions; lump them all together.
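For anyone who wants to play along, here is a sketch of the bookkeeping (Python rather than the awk I actually used; the game tuples are invented examples standing in for what comes out of the Retrosheet game logs):

    # Sketch of the reduced-score ranking: for each game, drop each team's highest-scoring
    # inning, re-decide the winner (ties count 0.5 each), and tally both sets of wins.
    # The games list below is a stand-in for the real parsed Retrosheet line scores.

    from collections import defaultdict

    # (visitor, home, visitor line score, home line score) -- invented examples
    games = [
        ("CHN", "MIL", [0, 0, 3, 0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 2, 0, 0, 0]),
        ("MIL", "SLN", [2, 0, 0, 4, 0, 0, 1, 0, 0], [0, 0, 5, 0, 0, 0, 0, 0, 1]),
    ]

    normal_wins = defaultdict(float)
    modified_wins = defaultdict(float)

    def credit(wins, team_a, score_a, team_b, score_b):
        if score_a > score_b:
            wins[team_a] += 1.0
        elif score_b > score_a:
            wins[team_b] += 1.0
        else:                        # ties get half a win apiece
            wins[team_a] += 0.5
            wins[team_b] += 0.5

    for visitor, home, v_line, h_line in games:
        credit(normal_wins, visitor, sum(v_line), home, sum(h_line))
        v_reduced = sum(v_line) - max(v_line)    # drop the highest-scoring inning
        h_reduced = sum(h_line) - max(h_line)
        credit(modified_wins, visitor, v_reduced, home, h_reduced)

    for team in sorted(set(normal_wins) | set(modified_wins)):
        print(team, normal_wins[team], modified_wins[team])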

The first thing to check is consistency. When I look at the win totals I find that they are close to, but smaller than, those at baseball-reference.com. I have not yet accounted for the discrepancy.

If I check for a home-team advantage, there is one: about 2 sigma for the ordinary score and a bit over 1 sigma for the reduced score.

So far so good, modulo the discrepancy with the official stats.

The distributions of the scores look a little different! They aren't Poissonian, which isn't too surprising: unless it's a home run, if one man scores he had some help, so there's a chance somebody else is now on base and ready to try to score too. And, if I didn't botch this calculation, over half the average scoring comes from a single inning's success.



Add them up and rank them!

As you might expect, the win numbers cluster more tightly together.

As you might expect, the overall shape of rankings is pretty much the same.

But the final rankings do change. In 2011 Arizona undergoes a huge shift when ranked with the modified score, dropping from 3rd in the league to 6th. Can their opponents all have had so many moments of bad pitching?

The astute reader will notice that there are no error estimates on the modified win totals. One contribution comes from the fact that if the high-score innings are left out, the home team would sometimes have come to bat in the 9th where in the real world they didn't. Real scores average about 0.47 runs/inning and modified scores about 0.22 runs/inning.


2010 Season

National League

Rank  Normal  Wins  Modified  Mod. wins
1     PHI     87    CIN       84.5
2     SFN     85    PHI       84
3     CIN     83    SFN       83
4     ATL     82    COL       78.5
5     SDN     81    SDN       76.5
6     SLN     77    SLN       76
7     LAN     76    ATL       75.5
8     COL     74    LAN       75
9     HOU     73    NYN       73.5
10    FLO     73    CHN       71.5
11    MIL     68    HOU       71
12    CHN     67    FLO       70.5
13    NYN     66    MIL       69.5
14    WAS     64    WAS       67.5
15    ARI     59    ARI       58.5
16    PIT     55    PIT       55

American League

Rank  Normal  Wins  Modified  Mod. wins
1     TBA     88    TBA       83.5
2     MIN     86    NYA       83
3     NYA     83    MIN       82.5
4     TOR     78    BOS       77
5     TEX     76    TOR       77
6     BOS     76    TEX       75.5
7     OAK     73    OAK       74
8     CHA     73    CHA       74
9     DET     70    ANA       70.5
10    ANA     69    CLE       68.5
11    CLE     64    DET       67
12    BAL     59    KCA       62.5
13    KCA     59    BAL       61
14    SEA     52    SEA       50

2011 Season

National League

Rank  Normal  Wins  Modified  Mod. wins
1     PHI     93    MIL       90
2     MIL     90    PHI       88
3     ARI     84    SLN       86.5
4     SLN     82    SFN       80.5
5     ATL     79    ATL       78
6     SFN     76    ARI       76.5
7     LAN     75    CIN       74
8     CIN     73    WAS       71
9     WAS     72    LAN       70.5
10    NYN     68    COL       68.5
11    CHN     66    NYN       68.5
12    SDN     65    SDN       67
13    COL     65    CHN       66
14    FLO     64    PIT       64
15    PIT     64    FLO       62
16    HOU     52    HOU       57

American League

Rank  Normal  Wins  Modified  Mod. wins
1     DET     88    TEX       86
2     TEX     87    NYA       85
3     NYA     84    DET       84
4     BOS     80    BOS       82
5     TBA     79    TBA       75
6     ANA     73    TOR       74.5
7     TOR     73    CHA       71.5
8     CLE     69    ANA       70.5
9     CHA     68    KCA       68.5
10    OAK     66    OAK       67
11    KCA     66    CLE       66.5
12    BAL     62    BAL       61
13    SEA     58    SEA       58.5
14    MIN     55    MIN       58

I should check that the score distribution in the real world agrees with other measurements, just to be safe.

UPDATE: But if there's no error, then something is different about those high-scoring innings.

I really need to be more careful. The maximum of a set of samples has a different distribution than the samples themselves. If you throw a die once you'll get a 6 about 1 time in 6, but if you throw it 3 times and get to pick the highest value you'll wind up with a 6 about 42% of the time. So I have to show that the real-world high-score inning has a different distribution than what you'd get from just picking the best of 9 tries with the reduced distribution.
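In symbols (nothing deep, just the standard order-statistics fact behind the die example): if a single inning's score has cumulative distribution F(k), then for n independent innings

    \[
      P\!\left(\max_{1\le i\le n} X_i \le k\right) = F(k)^n,
      \qquad
      P(\text{at least one 6 in 3 throws}) = 1 - \left(\tfrac{5}{6}\right)^3 = \tfrac{91}{216} \approx 0.42 .
    \]

So the maximum of 9 innings is pulled away from 0 and toward higher scores even if every inning is drawn from the same humdrum distribution; that is the null hypothesis the comparison below has to beat.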

The following plot includes the average distribution of runs per inning without the high-scoring inning in the upper right. In the lower right is an estimate of the maximum scoring inning in 9 innings if the upper right distribution is correct. The average is about 0.7, which is far less than the 2.3 average for the real-world high-scoring inning.

There is something different about the high-scoring innings.


Baseball Observations

My better half was a Cubs fan when I married her (with George Will's "90% scar tissue"). We’ve lived in Wisconsin for a couple of decades now, and she’s added the Brewers to her repertoire, so there’s about twice as much baseball around the house as there used to be. Along the way I’ve learned a few things about the sport. (I still can’t hit for beans, and my running has gotten worse, if my puffing after running for a bus is any indicator.)

There are a few oddities (infield fly rule, how to tell a balk), and I have to admire umpires who can keep all the rules straight.

Later innings tend to last longer than earlier ones. I remember a few 8th innings that lasted longer than the first three combined.

And saves. Suppose a pitcher (call him Leroy Paige) pitches a 20-0 no-hitter through the first six innings, then tears his rotator cuff and is carried off the field. The relief pitcher (call him Phillip Wrigley) pitches the last three innings so badly that 19 runs score, and the game only ends when the second base runner trips and falls on a grounder. Wrigley is credited with a save. Umm...

The commercials have shown an uptick in "insulting my intelligence and/or character" in the past couple of years. But that may be a function of the radio stations.

And it is startling how much the game relies on the pitcher. But more on that in the next post.

Friday, July 13, 2012

Olympic Uniforms

I understand the frustration that the US doesn't seem to make its own uniforms, but as my mother pointed out, the Greek Olympics had a different uniform. Maybe temporary tattoos of the national flags? But then, where do the tattoos come from?

Wednesday, July 11, 2012

Electron speed distribution puzzle

What's this about electron speed distributions in dwarf galaxies? Apparently for decades there’s been some contradiction between temperature estimates for them. We can’t just stick a thermometer in another galaxy while we take its pulse, so astronomers use proxies for temperature: for example, ratios of the abundances of different ionization states. Except that it wasn’t working; different proxies were giving different answers.

As everyone learned in high school and promptly forgot, when a gas is at some given temperature the molecules in it have a distribution of speeds, with a long tail on the high speed end. There’s a peak speed, and the peak speed increases with temperature and the distribution widens. It is fairly easy to describe and is given the name Maxwell-Boltzmann distribution, which should tell you how long people have known about it.

The formula works for plasmas and free electrons, which are pretty common around stars. Turns out there’s a hidden assumption in the prediction of the distribution though—equilibrium.

People who study the plasmas around the sun use a class of related distributions called "kappa" distributions, which have a much bigger high energy tail. When your gas or plasma is fed with some energy source (magnetic reconnection, or what have you) and the gas doesn’t have time to come back to equilibrium, you get more high-energy particles in the mix—the high speed tail increases.
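For reference, and hedged because the exact conventions vary from paper to paper: the Maxwell-Boltzmann speed distribution and the kappa family that generalizes it look roughly like

    \[
      f_{\mathrm{MB}}(v) \propto v^2 \exp\!\left(-\frac{m v^2}{2 k_B T}\right),
      \qquad
      f_{\kappa}(v) \propto v^2 \left[1 + \frac{v^2}{\kappa\,\theta^2}\right]^{-(\kappa+1)},
    \]

where θ is a characteristic speed set by the temperature. As κ goes to infinity the bracket turns back into the exponential and you recover Maxwell-Boltzmann; for small κ (solar-wind fits often land at κ of a few) the power-law bracket falls off much more slowly, which is the fat high-speed tail.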

I didn’t know that, and I didn’t know that astronomers were scratching their heads about the temperature estimates; but Nicholls did, and he put the two together in one of those "obvious" connections. If you do the calculations for a "kappa" distribution instead of a "M-B" one, you find that processes that sample the high-speed part will proceed as though the temperature were higher than it really is, and those sampling the lower-speed part will proceed as though it were lower. So the team goes on to estimate "kappa" for several sources, and notes that several have good candidates for providing a strong source of high-energy electrons.

I like it. A "cool" result, and it sounds plausible.

A subscription is required to view the article itself, though unless you have some professional interest in it you probably shouldn't bother.

Saturday, July 07, 2012

Breast cancer and size

Drudge pointed to a CBS report warning that women with bigger breasts are more likely to get breast cancer. My first thought was: is the risk proportional to volume? (No, I’m not kidding; I guess I’m a geek.) More cells present means more chances to mutate, so you’d think the risk would rise proportionately.

The actual study says that women with certain genes known to increase the risk of breast cancer tend to have larger breasts—which is a slightly different statement. They took information from 16,165 contributors to 23andMe, used the self-reported information (bra sizes, age, cancer history, pregnancy history, currently nursing or not, etc.), and found the correlations mentioned in their conclusions. They used cup size, which they admit is problematic. And they mention a study that supports CBS' headline a little better: "Kusano et al. [7] found that among women with a BMI under 25, those with a cup size of D or larger had a 1.8 times higher risk of breast cancer than those with a cup size of A or smaller."

So I went to the font of all wisdom and looked up cup size to volume and found that it is a complete mess (and apparently manufacturers have vanity sizes). This study used strap size as a proxy for BMI, so I will too. Picking a few sizes with the same strap size one finds: the ratio of estimated volume for a 34D to a 34A is about 1.9, 36D to 36A is about 1.8, 38D to 38A is about 1.8. So maybe my intuition is OK. If size matters most, then how you got the size should have the biggest bearing on what kinds of cancer you’re at risk for and when.

Their Figure 1 is pretty dramatic. But they warn

... the shared relationships between breast size and breast cancer at these three regions are not strong enough to account for the possible epidemiological connection that has been reported elsewhere between breast size and breast cancer

Maybe their volume estimation isn't clean enough. Or maybe the naive intuition isn't enough of the story.

BTW, increased BMI increases cancer risk in parts of the body that don’t necessarily increase in size/cell count proportionately. Cancer stories are complicated.

[7] Kusano et al. Int J Cancer 2006, 118:2031-2034

Friday, July 06, 2012

"Symmetry breaking" and Higgs

Why a Higgs field? Why does one need "symmetry-breaking?"

An engineer, a mathematician, and a physicist went to the races one Saturday and laid their money down. Commiserating in the bar after the race, the engineer said, "I don’t understand why I lost all my money. I measured all the horses and calculated their strength and mechanical advantage and figured out how fast they could run..."

The physicist interrupted him: "...but you didn’t take individual variations into account. I did a statistical analysis of their previous performances and bet on the horses with the highest probability of winning..."

"...so if you’re so hot why are you broke?" asked the engineer. But before the argument can grow, the mathematician takes out his pipe and they get a glimpse of his well-fattened wallet. Obviously here was a man who knows something about horses. They both demanded to know his secret.

"Well," he says, between puffs on the pipe, "first I assumed all the horses were identical and spherical..."

You’ve probably noticed that the world is extremely complicated, even if you leave out people’s love lives. To understand it we try to figure out patterns, and then see if we can isolate the things that go to make up those patterns. For example, things slide downhill. Sometimes they stop on the way; sometimes they keep on going through your garden and the back of the garage. You can conclude that something pulls them down (more strongly the steeper the slope), and that something slows them down (that varies in strength depending on the surface roughness and how fast the object is going).

The obvious thing to do is set up some experiments that limit the number of things that change. For example, pick rocks the same shape and slide them down a straight slope of uniform texture with no bushes in the way, and see what you learn about the force that pulls them down. Then vary the tilt, and so on, until you have a model for the force. Then you study the effect of the surface, and so on until you have a picture of friction. Then maybe you worry about air resistance...

Isolate, simplify, and model—and then introduce the complications. It works pretty well.

But what do you do when the pattern is irreducibly "almost symmetric?" You have electrons and anti-electrons, electron neutrinos and anti-electron neutrinos—but you also have muons and anti-muons and similar neutrinos. And there are tau particles as well. The groups are called "generations" and the particles in each generation behave in very similar ways to their counterparts in the others; similar enough to use the patterns to form a model to describe them. But they aren’t the same, and the different masses have consequences.

Noether showed that conservation laws (like conservation of momentum, energy, angular momentum) could be associated with symmetries. For example, conservation of momentum is associated with the symmetry of displacements in empty space. Move from one point to another in empty space, and it looks exactly the same: symmetry. And it turns out that some symmetries have representations (mathematical models that reflect the properties of the symmetry) in which some of the parts can represent fundamental particles like electrons. Offer a scientist a powerful tool like that and watch him apply it everywhere he can--it unifies symmetries, forces, and particles in a single bundle.

Analyzing interactions of matter and energy using the symmetries of the relevant fields is immensely powerful—very accurate calculations prove it (QED). Working backwards, by noting that there exist what appear to be parts of the representation and inferring the symmetries, has also been powerful (QCD) and surprisingly accurate (when we’ve been able to actually do the extremely complicated calculations).

I say "surprisingly" because one of the assumptions that goes into the theory is that the particles are massless. This isn’t even close to true for quarks, and the only reason we put up with such nonsense is because the theory works so well. In the real world, and if you want full accuracy, you have to include the masses so that the heavy particles aren’t quite so interchangeable with the light ones anymore. The wonderful symmetry in which all the particles are massless is "broken".

It is only "broken" because we started with the theory where everything was nice and symmetric, so something has to "add on" to the theory to make the proper bits different. It is easier to understand the interactions when you study the problem in that order, but it leads to weird jargon. Sorry.

So what makes the difference?

Higgs (actually several people before him as well, including a superconductivity physicist named Anderson) discovered that you could give a particle or quasi-particle what was to all intents and purposes a mass by the way it interacted with a field with special properties, in particular that it has a non-zero vacuum expectation value. Massless particles could acquire masses in a natural way. It is really a little more involved, since particles in families can share the effects in different ways (e.g. the Z boson is heavy and the photon is massless), but that’s the general idea. Different particles interact more or less strongly with this field, and the effect is to give them what we call mass.

Of course this doesn’t explain why an electron has one mass and a muon a different one. They couple to the Higgs field with different strengths, but nothing yet explains the different coupling strengths. But hey, at least they don’t have to be massless, right?

This kind of field demands a particle to go along with it, and people have been looking for it for years to try to verify the theory. And it looks like it has been found, though there are a few oddities that may just be statistical--we should know by Christmas. (There might be more than one type of Higgs—several theories demand that.)

Perhaps I don’t understand it well enough, but the theory still feels a little ad hoc (*), and several years ago I predicted that there wasn’t any Higgs. Looks like I was wrong.

(*) Especially that non-zero average value for the Higgs field.

Wednesday, July 04, 2012

Higgs, or something like it

I'm no longer on CDF or CMS, so I didn't have any prior knowledge of the results announced today (or Monday). Others who were there--Resonaances and Dorigo, for example--have already reported, and I can't improve on their work. I didn't even get up at 2:00am to listen--figured I'd read the reports later. I, and most people I knew, were already pretty sure of what they'd announce. Fermilab got in one last shot Monday, reminding us that there was some independent verification, though not a discovery there.

It was quite a feat, and a great milestone--but I wish the reporting were more accurate. First, of course, I wish they'd get rid of all references to the g-d particle. Second, though this fills in a hole in the Standard Model, we already know the Standard Model is incomplete: there are known technical problems with it. Third, astrophysics tells us there's more dark matter than visible, and dark matter isn't in the Standard Model at all.

This isn't the capstone for particle physics: there's still work to be done. I don't know if the LHC will address all of these problems--there are things more easily done with ultra-cold and ultra-pure low energy detectors.

What could go wrong?

The Telegraph reports that doctors at a conference announced that ovary implantation surgery should now be considered a mature technology. If you'll pardon the pun.

The idea is that a woman could have part of an ovary removed, sliced thin, and frozen for years. The slices could be re-implanted later, and if they take--which they seem to--the woman can stave off menopause.

This has been done with women with cancer who wanted to have children after the cancer drug treatments, which would otherwise render them sterile--and it seems to work: "without the need for IVF" (a problematic procedure, but that's another issue).

Imagine: a woman could remain fertile into her 70's at the simple cost of abdominal surgery every few years. Perhaps a mere man can't appreciate such things, but somehow I don't think doctors will be overwhelmed with requests for the procedure.

As a side note, this is their model:

The controversial notion would allow career women peace of mind with a fertility insurance policy so they can find a partner, settle down and become financially secure before starting a family.

Is it better for the husband and wife to grow their lives together into the marriage, or to try to join already-established lives? I've seen both ways work fine, but I wonder if the first isn't the easier path. There's no question that it is easier to take care of kids when you're young (no, not early teen young!), even though there's not a lot of financial security then. I'm not convinced their model is at all ideal.

Combat dogs

The office had CNN on with a story about a move to try to bring dogs back from Afghanistan when they retire. Dogs, you see, are classified as "equipment" and typically left behind. CNN included an interview with some legislator who had a plan that seemed to involve some kind of privately funded way to bring the dogs back.

I'd have thought the issue was easier than that. The military invests a lot in morale, and you'd think it a no-brainer that morale for people working with the dogs would be higher if they knew their companions would be able to come home too. It seems like a cheap investment; a couple grand for the trip and maybe a little more for the paperwork to connect them with homes. Medical issues are another matter, though.

Monday, July 02, 2012

Bat diet preference?

Bats eat mosquitoes, I'm told, though plainly not nearly enough.

When the smoke alarm started beeping at 4 in the morning (even the stepladder wouldn't have been enough for me to reach the cabin ceiling), we embraced necessity and got up. Outside Mercury and Venus were rising above the hill and dawn was threatening, so we sat and watched while a small squadron of bats flashed back and forth around us. My better half was glad to see them, saying that they'd eat the mosquitoes.

Since the mosquitoes were eating me instead, I wondered what the bats were up to--clearly not deterrence. And they didn't fly all that close to us, where you'd think there'd be great pickings.

Light dawned. On me, anyway. Full mosquitoes are slower fliers, and are simultaneously easier targets and flavored with a little something extra. Perhaps the bats were waiting to get the full ones leaving us instead of interdicting the hungry ones headed our way.

How would you test this? Release known numbers of empty and full mosquitoes into a bat pen, and count the number that escape? Sounds kind of tedious, but it might be inexpensively do-able, with a large room, a pen, and big fans forcing the insects into a small screened room where you BlackFlag the beasts and count the corpses. You might even be able to get some preliminary separation between full and empty by shaking the tray of dead mosquitoes. You'd need lots of repetitions to get good statistics, since it would be easy to make counting errors, and sometimes the input group would be all empties and sometimes all fulls to let you try to measure the counting biases.
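A back-of-the-envelope estimate of "lots of repetitions" (Python; the 60%-vs-40% escape fractions are pure guesses, and this ignores the counting biases the all-empty and all-full calibration runs are meant to measure):

    # Back-of-the-envelope: how many mosquitoes per group do you need to tell apart two
    # escape fractions (say 60% of hungry ones escape the bats vs 40% of fed ones),
    # using simple binomial counting errors? The fractions are pure guesses.

    import math

    def releases_needed(p1, p2, n_sigma=3.0):
        """Releases per group so the difference p1 - p2 is an n_sigma effect."""
        combined_var = p1 * (1 - p1) + p2 * (1 - p2)   # variance of the difference, per release
        return math.ceil(n_sigma**2 * combined_var / (p1 - p2) ** 2)

    print(releases_needed(0.60, 0.40))   # ~108 per group for a 3-sigma separation

A hundred or so mosquitoes per group per trial doesn't sound crazy, but you would want many trials to beat down the systematic counting errors.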

Now where does one find volunteers for the feeding chamber?

Sunday, July 01, 2012

Belated orphan thoughts

AVI posted about orphans shortly before I left on vacation, and I only had time to think about it on the road. In Romania I gather there's a strong bias against orphans, and he suggested that it might be because they, in a culture that relies heavily on mutual support from extended families, are more likely to be a drain than a support. Perpetual mooching?

I have less experience with Romanian culture than he, but there might be a more universal collection of reasons why people might not trust Oliver Twist.

There's the correlation of bad luck and bad character: the bad things that happened to his parents are because of bad things they did, and the lad is going to inherit the bad luck. Keep him away from me.

You can't know an orphan as well as you can know a child when you know his parents. Knowing the parents gives a strong clue as to which unfledged traits are going to be important in his character. In addition, you can tell from them how he has been/will be trained. Without them he might have been trained by Fagin.

Without parents to tie him into the extended family and tribe, is he going to be part of the tribe? Will he be loyal? (OK, this one is a little circular, since if you are part of their tribe you could take over the office of tying them into the tribe yourself, if it occurred to you.)

quick report

Alive and well. McLain camp in the Upper Peninsula of Michigan was a nice jumping-off place for visiting places along the coast, especially when staying in a mini-cabin instead of a tent. (My better half is a good cook, and even with the hiking and lugging I gained a few pounds.) Tettegouche camp in Minnesota is a lovely site also, provided you don't plan to go anywhere--4 hike-in cabins on a lake (with electricity to run a stovetop and a few lights). (Update: another description of Tettegouche.)

Pattison park in Wisconsin is nice too, but we discovered that the tent was shot and the heat was more than we cared to endure so we cut out for home a day early.

There was only a little rain (actually a thunderstorm lively enough to encourage us to put the canoe away for the rest of the day), and the heat didn't get to be a problem until we got well away from the lake. Lots of resting, lots of hiking, lots of visiting Great Lakes shipping history sites (and a taconite plant), scenic trails, and so on.

Not everything was perfect: I should learn braille to figure out what the mosquitoes at Pattison were trying to tell me, and the smoke detector decided to let us know the battery was low at 4 in the morning--and the ceiling was 12 feet high at Tettegouche so I couldn't deal with it. We can't afford to do something like this very often, but I'm glad we went.