Tuesday, January 1, 2008

Equality operators in PHP and Haskell


Sometimes, developers at my company work in PHP, and I noticed the following chart posted up on the wall nearby. This chart shows by example the semantics of two different equality operators in PHP:
PHP: Loose comparisons with ==
         TRUE    FALSE   1       0       -1      "1"     "0"     "-1"    NULL    array() "php"   ""
TRUE     TRUE    FALSE   TRUE    FALSE   TRUE    TRUE    FALSE   TRUE    FALSE   FALSE   TRUE    FALSE
FALSE    FALSE   TRUE    FALSE   TRUE    FALSE   FALSE   TRUE    FALSE   TRUE    TRUE    FALSE   TRUE
1        TRUE    FALSE   TRUE    FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
0        FALSE   TRUE    FALSE   TRUE    FALSE   FALSE   TRUE    FALSE   TRUE    FALSE   TRUE    TRUE
-1       TRUE    FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE
"1"      TRUE    FALSE   TRUE    FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
"0"      FALSE   TRUE    FALSE   TRUE    FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE
"-1"     TRUE    FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE
NULL     FALSE   TRUE    FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   TRUE    TRUE    FALSE   TRUE
array()  FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    TRUE    FALSE   FALSE
"php"    TRUE    FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    FALSE
""       FALSE   TRUE    FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   TRUE


This means you can get the following non-transitive behavior: "-1" == TRUE and TRUE == "1", but "-1" != "1".
The following chart for the strict equality operator is a lot more sane. Notice that its only TRUE values are the expected ones along the diagonal.
PHP: Strict comparisons with ===
         TRUE    FALSE   1       0       -1      "1"     "0"     "-1"    NULL    array() "php"   ""
TRUE     TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
FALSE    FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
1        FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
0        FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
-1       FALSE   FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
"1"      FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
"0"      FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   FALSE
"-1"     FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   FALSE   FALSE
NULL     FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   FALSE
array()  FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    FALSE   FALSE
"php"    FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE    FALSE
""       FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   TRUE


In Haskell, the situation is quite a bit easier. Lots of these comparisons just aren't allowed by the compiler, which I show as "-":
Equality in Haskell ==
          True    False   1       0       -1      "1"     "0"     "-1"    []      "Haskell" ""
True      True    False   -       -       -       -       -       -       -       -         -
False     False   True    -       -       -       -       -       -       -       -         -
1         -       -       True    False   False   -       -       -       -       -         -
0         -       -       False   True    False   -       -       -       -       -         -
-1        -       -       False   False   True    -       -       -       -       -         -
"1"       -       -       -       -       -       True    False   False   False   False     False
"0"       -       -       -       -       -       False   True    False   False   False     False
"-1"      -       -       -       -       -       False   False   True    False   False     False
[]        -       -       -       -       -       False   False   False   True    False     -True-
"Haskell" -       -       -       -       -       False   False   False   False   True      False
""        -       -       -       -       -       False   False   False   -True-  False     True


Here I use the empty list instead of the empty array. Once you get past the types, the only real difference from PHP's strict comparison is that "" == [] (marked -True- in the table). That's because a String is a list of Chars in Haskell. A list is probably the closest Haskell counterpart to a PHP array in this case, but it doesn't match exactly.
Basically, we have three types represented here:
Bool: True, False
String or [Char]: "1", "0", "-1", "", "Haskell", []
Integer: 0, 1, -1
And Haskell doesn't have a NULL constant, as far as I know.
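To make the table concrete, here is a small sketch of which comparisons the type checker accepts. The definition names are made up for illustration, and the use of Maybe as a rough stand-in for NULL is my assumption rather than something the chart covers.

```haskell
-- Comparisons within a single type are accepted by the type checker.
boolsEqual :: Bool
boolsEqual = True == True

integersEqual :: Bool
integersEqual = (1 :: Integer) == 1

-- "" and [] denote the same value at type String (that is, [Char]).
emptyStringIsEmptyList :: Bool
emptyStringIsEmptyList = "" == ([] :: String)

-- Assumption (not from the chart): Nothing is only loosely similar to NULL.
-- It can be compared with a Just value of the same type, never with a bare 1.
absentVersusPresent :: Bool
absentVersusPresent = (Nothing :: Maybe Integer) == Just 1

-- Mixed-type comparisons are the "-" cells in the table. Uncommenting the
-- next line should make this file fail to compile, assuming the
-- OverloadedStrings extension is off:
-- mixed = True == "1"
```

Loaded into GHCi, the first three definitions should evaluate to True and the last one to False.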
The PHP philosophy seems to be that comparing a number to a string is so important that it created these really strangely behaved operators to do it conveniently.
If you need to compare a Num with a String in Haskell, what do you do? The obvious and wrong thing to do is this:
> read "1" == 1
True
The "read" function parses "1" (a String value) into a number, and the result is then compared with 1.
But what happens if the input isn't actually a numeric value?
> read "that's no number" == 1
*** Exception: Prelude.read: no parse
Ouch! The Haskell runtime threw an exception instead of returning "False". That means the program crashes, so you can't just do it the "easy" way. Instead, you need to parse the String value: if the parse fails, return False, and if it succeeds, go ahead and compare the result of the parse.
Here's the code I came up with after some discussion on #haskell. Maybe someone can come up with something better:
numEqString :: Integer -> String -> Bool
numEqString n s =
  case reads s of
    [(n2, "")] -> n2 == n  -- parsing worked, so safe to compare
    _          -> False    -- parsing failed
Now we can do something like this:
> numEqString 1 "1"
True
> numEqString 1 "that's not a number"
False
Our function works, and it's not too hard to use at all. I personally like it better than being able to say "1" == 1. You could come up with a better name for it, though.
What I don't like is that it was hard to write numEqString. It's not at all obvious how to write it, and the line:
[(n2, "")] -> n2 == n
looks like total magic unless you understand reads. What it really means is something like "If it parses successfully, and the parse is unambiguous, and there is no text left over, then compare n2 to n."
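To see what that pattern is selecting, it may help to look at the different shapes of list that reads can return. The sketch below sorts them into cases; the ParseOutcome type and the classify function are hypothetical names introduced only for this illustration.

```haskell
-- Hypothetical helper type naming the possible results of reads.
data ParseOutcome
  = Complete Integer          -- one parse, nothing left over
  | Leftover Integer String   -- one parse, but unconsumed input remains
  | NoParse                   -- reads returned the empty list
  | Ambiguous                 -- reads returned more than one parse
  deriving (Eq, Show)

-- Inspect the list returned by reads at type Integer.
classify :: String -> ParseOutcome
classify s =
  case reads s :: [(Integer, String)] of
    [(n, "")]   -> Complete n
    [(n, rest)] -> Leftover n rest
    []          -> NoParse
    _           -> Ambiguous
```

Only the Complete case corresponds to the [(n2, "")] pattern in numEqString. In GHCi, classify "42" should give Complete 42, classify "12 apples" should give Leftover 12 " apples", and classify "that's no number" should give NoParse.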
This would all be a bit easier if there were a maybeRead function, which I've seen defined by at least two projects:
maybeRead :: Read a => String -> Maybe a
maybeRead s =
  case reads s of
    [(x, "")] -> Just x
    _         -> Nothing
That's something that maybe we should consider adding to Text.Read, but it's definitely still not as easy as it is in PHP.
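Later releases of the base library export a function of this shape as Text.Read.readMaybe. Here is a minimal sketch of the comparison written with it, assuming a base version recent enough to include it. The name numEqString' is mine, chosen only to avoid clashing with the version above.

```haskell
import Text.Read (readMaybe)

-- Compare an Integer with a String by parsing the String first.
-- A failed parse yields Nothing, which never equals Just n.
numEqString' :: Integer -> String -> Bool
numEqString' n s = readMaybe s == Just n
```

The result type of readMaybe is inferred from the comparison with Just n, so no type annotation is needed at the call site.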
Anyone picking up PHP can easily write "1" == 1. In Haskell, there's no built-in function to do this, and implementing one requires strong familiarity with the parsing libraries.
We should learn from PHP here. Maybe this is an important pattern to support, and if so, we should have a convenient way to do it, probably in a standard library (not as a built-in language feature). There's no reason that it should be hard in Haskell. We can make it easy, and we can make it safe.
Thanks to folks on #haskell for discussion on this post.
There's a discussion thread on Reddit about this article.
