Tuesday, August 31, 2010

I had a training at Region XIII (the government agency that interfaces between schools and the Texas Education Agency) today. The guy leading the training asked us, as a subset of the thousands of people who can edit student data in the state-wide database, to voluntarily adopt a policy that would help prevent what is apparently a real, actually-occurring problem: EDIT WARS of STUDENT RECORDS in the STATE-WIDE DATABASE--specifically of their STATE-WIDE STUDENT IDs (that's THE PRIMARY KEY), but also of their names.

We also discussed 'Hey, this student's Birth Certificate and Passport disagree. What do we do?' (Throw one away--it's illegal for anyone to have two legal identities.) And 'We have a record of this student's name being changed upon adoption, but now he's with his biological father who is providing us with the birth certificate and asking us to use that name. What do we do?' (Go with the most recent date.) And 'We don't know this student's SSN. What do we do?' (Answer: Start an edit war in the state-wide database. No, wait. How about the twenty of us DON'T do that and just hope that everyone else in the state follows our lead.)

I know privacy is an issue, but DATABASES CAN ENFORCE REFERENTIAL INTEGRITY.
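
For anyone wondering what that looks like in practice, here's a minimal sketch using Python's built-in sqlite3 module. I have no idea what the real state-wide schema looks like, so the table names, column names, and sample values below are all made up for illustration. The point is only that a database can refuse a duplicate primary key, refuse a record that points at a nonexistent student, and refuse to let someone rewrite an ID that other records depend on.

```python
import sqlite3

def build_db():
    # Hypothetical schema; the real state-wide database is not known to me.
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.execute(
        "CREATE TABLE students (state_id TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE enrollments ("
        " state_id TEXT NOT NULL REFERENCES students(state_id),"
        " campus TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO students VALUES ('S001', 'Pat Example')")
    conn.execute("INSERT INTO enrollments VALUES ('S001', 'Campus A')")
    return conn

def try_sql(conn, sql):
    # Return 'ok' if the statement is accepted, 'rejected' if a constraint blocks it.
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

conn = build_db()
# A second student with the same state-wide ID violates the primary key.
duplicate_id = try_sql(conn, "INSERT INTO students VALUES ('S001', 'Someone Else')")
# An enrollment for a student who does not exist violates the foreign key.
orphan_row = try_sql(conn, "INSERT INTO enrollments VALUES ('S404', 'Campus B')")
# Changing an ID that enrollments still reference also violates the foreign key.
rewrite_id = try_sql(conn, "UPDATE students SET state_id = 'S999' WHERE state_id = 'S001'")
print(duplicate_id, orphan_row, rewrite_id)
```

Constraints like these won't settle who is right about a student's name, but they do stop the key itself from being edited out from under everything that refers to it.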

Saturday, May 01, 2010

(Potentially relevant link)

Your post advocates a

( ) technical (X) legislative (X) market-based ( ) vigilante

approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

( ) Spammers can easily use it to harvest email addresses
( ) Mailing lists and other legitimate email uses would be affected
(X) No one will be able to find the guy or collect the money
( ) It is defenseless against brute force attacks
( ) It will stop spam for two weeks and then we'll be stuck with it
(X) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
(X) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business

Specifically, your plan fails to account for

( ) Laws expressly prohibiting it
(X) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( ) Asshats
(X) Jurisdictional problems
(X) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
(X) Armies of worm riddled broadband-connected Windows boxes
( ) Eternal arms race involved in all filtering approaches
( ) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook

and the following philosophical objections may also apply:

(X) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
(X) Sending email should be free
( ) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
( ) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough

Furthermore, this is what I think about you:

(X) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!

Monday, April 05, 2010

I got up in the middle of the night to write this and it was all muddled. Then I got up in the middle of another night thinking 'This makes no sense!' And it didn't, because I had written P(causation|correlation) = P(causation) - P(correlation|no causation). Stupid me. I'll correct it later. In short: They're not independent, as correlation is a prerequisite for causation. P(correlation) is not 0 or we wouldn't be discussing P(correlation) with such fascination. I'll explain it with more rigor when I'm not really tired and needing to get up in seven hours.

----

I occasionally see misuse of the mantra 'Correlation does not imply causation'. So I don't have to write all this every time I see it, here's a concise explanation:

(Note on notation for non-mathy people: P(x) is the probability that x is true. P(y|x) means the probability that y is true given that x is true.)

The source of the confusion with this statement is that 'imply' can mean either 'entail' or 'suggest'. 'x entails y' means P(y|x)=1. 'x suggests y' means P(y|x)>P(y).

In 'correlation does not imply causation', the word 'imply' is being used to mean entail. Correlation does suggest causation.
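
Here's a small worked example of the two meanings, as a sketch. The numbers are invented purely for illustration: a prior P(causation) of 1/10, P(correlation|causation) = 1 (treating correlation as a prerequisite for causation), and P(correlation|no causation) = 1/5. Bayes' rule then gives P(causation|correlation).

```python
from fractions import Fraction

def posterior(prior, p_corr_given_cause, p_corr_given_no_cause):
    # Bayes' rule: P(cause|corr) = P(corr|cause) * P(cause) / P(corr)
    p_corr = prior * p_corr_given_cause + (1 - prior) * p_corr_given_no_cause
    return prior * p_corr_given_cause / p_corr

# All three inputs are made-up illustrative values, not measurements.
prior = Fraction(1, 10)              # P(causation)
post = posterior(prior, Fraction(1), Fraction(1, 5))

suggests = post > prior              # 'suggest': P(y|x) > P(y)
entails = post == 1                  # 'entail':  P(y|x) = 1
print(post, suggests, entails)       # 5/14 True False
```

With these numbers, seeing the correlation raises the probability of causation from 1/10 to 5/14, which is 'suggests', but leaves it well short of 1, which is what 'entails' would require.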

Also, if you've ever been tempted to claim that 'lack of evidence is not evidence of lack', I encourage you to apply the same [deleted because I muddled it] reasoning, keeping in mind that the terms 'evidence' and 'proof' are not typically considered interchangeable.