Monday, May 28, 2007

Almost Forgotten

A few weeks ago it was ANZAC day in NZ; a day of remembrance originally defined by the failed WW1 landings at Gallipoli, but now extended to encompass those Australians and New Zealanders injured or lost during all the wars since. A day also celebrated in Turkey for the creation of a secular state: for one world a disastrous invasion; for another the rise of a modern state.

Now, I usually find these days disconcerting; there seems too great a yearning for days past, as though something about those historic events made us better people, proved our mettle, gave perhaps a reason to celebrate our forefathers' heroic achievements. It seems a dangerous thing.

That might be true, but on this occasion I think I've found something else that warrants mentioning.

My family background on my father's side is Polish. That side of my family came to NZ as refugees in the 1940s and 1950s via the Soviet Union, Persia, and Palestine. Of 1.5 million Poles deported by Stalin in 1939/1940 to Siberian labour camps, only about 700 made their way to start new lives in NZ. The original arrivals consisted of orphaned kids, followed after the war, thanks to the Red Cross, by those close family members that managed to find their children or siblings. It was by this mechanism that what remained of my Polish ancestry came together in NZ.

The impact upon families of the trip from the labour camps was severe, with many dying along the way, including members of my father's family. The men of fighting age went into the army, navy or air force, split between those fighting for the West and those fighting for the Soviets.

I recently came across a web site which attempts to document some of the Polish forces during the war, a force which included members of my family. I know that the internet is unreliable but the site seems to confirm what I heard from my family as I grew up.

The Polish forces were a major part of WW2 and yet are now largely forgotten.

The Soviet-controlled Polish army fought through to the battle for Berlin and put a Polish flag on the Brandenburg Gate next to the Soviet flags, albeit temporarily, as it was rapidly taken down. They numbered 396,000 through the war, with 23,000 killed or missing in action.

The British-controlled forces numbered 255,000 across air force, army, and navy, with 13,000 killed or missing in action.

This means that, ignoring the initial invasion of Poland and the 15,000 officers believed killed by the Soviets in camps in 1940, Poland still fielded a combined total of approximately 650,000 servicemen through WWII, making it the 4th largest Allied armed force.

Comparing the casualty rate with that suffered by the US forces during the war is enlightening: 5% of those fighting in the West were killed or missing; 6% of those fighting in the East were killed or missing. As many Poles fought in WW2 under British and Soviet command as did US Marines, but they suffered twice the casualty rate (http://www.usmm.org/casualty.html). And that's excluding the initial invasion of Poland by Germany and the Soviet Union in 1939.

Yet you'll never hear that mentioned on the History channel!

With Yalta and Stalin's desire to incorporate Poland into the Eastern Bloc, the fate of those 2 armies was effectively sealed. In 1946 the British command demobilized the Polish army. They were not allowed to join the London victory parade and, according to an interview on the Polish Solder website, they were given identity cards with "Enemy Alien" written on them.

In the case of my family, 3 members were mobilized into the army, navy and cadets. All survived and ended up coming to live in NZ, a country that, interestingly, fought alongside the Polish 2nd Corps throughout North Africa and Italy.

I think it's about time that the efforts of these forgotten soldiers were better brought to light.

SQL Server 2005 Data Mining Classification Matrix

Finally figured it out. It's so easy, and yet so poorly described in the documentation: the classification matrix tab in SQL Server 2005 data mining shows you the accuracy of your model's predictions. Figuring out how to read it had me confused for quite some time. If you're as daft as me then perhaps the following might help.

The easiest way to discover how to correctly read the classification matrix is to generate a table from your testing data set (you do split your original data into training and test data sets, don't you...!?) with the ID, the actual outcome you're trying to predict, and the predicted value. You can do this in the Mining Model Prediction tab. It's easier there because the option to save to a table in your database is accessible from the little disk icon in the top left corner. You choose the ID, the actual and the predicted value in the bottom half of the screen. The first time I looked at this it was a bit of a mystery how to drive that half of the screen; I'm guessing you've mastered that part.
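To make the shape of that saved table concrete, here's a minimal sketch in Python rather than in the SQL Server tools. The rows, the "yes"/"no" states and the names `saved_rows` and `overall_accuracy` are all made up for illustration; your own table will have whatever columns you picked in the prediction tab.

```python
# Hypothetical stand-in for the table saved from the Mining Model Prediction tab:
# one row per test case holding its ID, the actual outcome and the predicted outcome.
saved_rows = [
    (1, "yes", "yes"),
    (2, "yes", "no"),
    (3, "no", "no"),
    (4, "no", "no"),
    (5, "no", "no"),
    (6, "yes", "yes"),
    (7, "yes", "no"),
]


def overall_accuracy(rows):
    """Return the fraction of test cases whose predicted value equals the actual value."""
    correct = sum(1 for _case_id, actual, predicted in rows if actual == predicted)
    return correct / len(rows)


accuracy = overall_accuracy(saved_rows)
print(f"correctly predicted: {accuracy:.0%} of {len(saved_rows)} test cases")
```

That single number is the coarse version of what the classification matrix breaks down state by state.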

Load up the table that is saved from your mining query into SQL Server's query window and count up the number of entries that are correctly predicted, and the number of actual values for each state of the actual attribute. You'll then see how it maps to the classification table. Basically, the succinct description at the top of the table is correct: columns correspond to actuals; rows to predicted values. The bit they leave out that would've been useful to me is that you get the total number of cases of any one actual value by summing down its column, and the total number of cases predicted as a value by summing along its row.
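The counting can be sketched like this. It's only an illustration: it uses Python's built-in SQLite as a stand-in for SQL Server, and the table name `predictions`, its column names and the data are my own assumptions, not anything the mining tools generate. The `GROUP BY` query itself is plain SQL, so something very like it should run in the SQL Server query window against your saved table.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE predictions (id INTEGER, actual TEXT, predicted TEXT)")
conn.executemany(
    "INSERT INTO predictions VALUES (?, ?, ?)",
    [
        (1, "yes", "yes"),
        (2, "yes", "no"),
        (3, "no", "no"),
        (4, "no", "no"),
        (5, "no", "no"),
        (6, "yes", "yes"),
        (7, "yes", "no"),
    ],
)

# One count per (predicted, actual) pair: these are the cells of the matrix.
cells = {
    (predicted, actual): count
    for predicted, actual, count in conn.execute(
        "SELECT predicted, actual, COUNT(*) FROM predictions GROUP BY predicted, actual"
    )
}
states = sorted({state for pair in cells for state in pair})

# Rows are predicted values, columns are actual values, as in the classification matrix.
matrix = {p: {a: cells.get((p, a), 0) for a in states} for p in states}

# Summing down a column gives the total number of cases with that actual value.
actual_totals = {a: sum(matrix[p][a] for p in states) for a in states}
# Summing along a row gives the total number of cases predicted as that value.
predicted_totals = {p: sum(matrix[p][a] for a in states) for p in states}
# The diagonal holds the correctly predicted cases.
correct = sum(matrix[s][s] for s in states)

print("predicted \\ actual", states)
for p in states:
    print(p, [matrix[p][a] for a in states], "row total:", predicted_totals[p])
print("column totals:", [actual_totals[a] for a in states], "correct:", correct)
```

With this made-up data the column totals (actuals) and row totals (predictions) differ, which is exactly the distinction that tripped me up when reading the matrix.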