Type checking is useful...
2004-06-18 10:03 am
A Google search for WSDL and type checking led me to this thread on LtU, where Frank Atanassow puts up a strong argument for static type checking and explores the relationship between programming language semantics and type system semantics. Might as well grab a local copy, given the instability of the LtU site...
Frank Atanassow - Type checking is useful
2/7/2002; 11:46:23 AM
[Ehud] Typing is obviously not enough, since types don't capture all useful information.
First, many type systems are not statically decidable; among these are ones which do capture all the (extensional) behavior of a function.
But let's suppose that you don't want to put up with such type systems since, after all, they are not very well-suited for programming (no type inference, no principal types, etc.). So, instead, you do testing.
To test something, you need to know what you are testing for. So you need a specification. If you are scrupulous, it will be a formal specification. Then you devise a set of inputs for the tests, and maybe a testing program. Now you need to know that your tests will actually ensure that your program matches your specification. If you are scrupulous, you will write a proof of the fact. If your proof is constructive, then your specification and proof together constitute a typed program, by the Curry-Howard isomorphism.
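To make the Curry-Howard point concrete, here is a minimal Haskell sketch (my example, not from the original post): a polymorphic type is read as a proposition, and a total program inhabiting that type is a constructive proof of it, checked by the compiler.

    -- Proposition: A and B implies B and A.
    -- Proof: the program 'swap'; the type checker verifies it.
    swap :: (a, b) -> (b, a)
    swap (x, y) = (y, x)

    -- Proposition: (B implies C) and (A implies B) imply (A implies C).
    -- Proof: function composition.
    compose :: (b -> c) -> (a -> b) -> (a -> c)
    compose g f = g . f

    main :: IO ()
    main = print (swap (True, "and"))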
Well, that's a lot of "if"s, perhaps (but it's also a lot of work, and if your programming language isn't typed, then your test set will get a lot larger, since you have to test so many trivial things explicitly). My point is that even if you use the testing approach, you're doing something fairly close to what a type checker and a programmer using a typed language do.
Here's another thing to consider: haphazard testing can make your programs less efficient. This is because, whereas types make assertions about the extension of a program (the relation of inputs to outputs), tests are frequently over-specified, so they tend to assert things about the intension of a program (how it computes its outputs from its inputs). Thus, if you optimize your program, since the intension changes, your tests may break.
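A small sketch of the extension/intension distinction (my example, using Haskell and the QuickCheck library): an extensional property relates inputs to outputs only, so it survives an optimizing rewrite; a test that pins down how the output is computed does not.

    import Test.QuickCheck (quickCheck)

    -- Original implementation: naive recursion.
    sumSlow :: [Int] -> Int
    sumSlow []       = 0
    sumSlow (x : xs) = x + sumSlow xs

    -- Optimized implementation: tail-recursive accumulator.
    -- Same extension, different intension.
    sumFast :: [Int] -> Int
    sumFast = go 0
      where
        go acc []       = acc
        go acc (x : xs) = go (acc + x) xs

    -- Extensional property: holds for both versions, so the
    -- optimizing rewrite does not break it.
    prop_sum :: [Int] -> Bool
    prop_sum xs = sumFast xs == sumSlow xs

    main :: IO ()
    main = quickCheck prop_sum

    -- An intensional test (say, one asserting the exact sequence of
    -- recursive calls) would break the moment sumSlow is replaced by
    -- sumFast, even though the program's behaviour is unchanged.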
[Patrick] OK, you got the argument count correct, and yes, you sent an Integer when it was expected. You still have to write all the same unit tests to demonstrate the behavior is what you expected.
Yeah, but you don't have to write a test to check those properties, do you? And it's not so easy to check them. To prove that your function accepts any integer, you have to test it on every possible integer input (plainly impossible), or prove that its implementation is somehow invariant with respect to the property of being an integer and then test it on one integer, or something like that.
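In a statically typed language, "accepts any integer" is exactly what the signature asserts, and the compiler discharges that claim once, against the implementation and against every call site. A minimal sketch (mine):

    -- The signature is the claim "this function accepts any Int";
    -- the checker proves it, so no test for it is needed.
    double :: Int -> Int
    double n = n + n

    -- Rejected at compile time, before any test runs:
    --   bad = double "five"   -- type error: [Char] is not Int

    main :: IO ()
    main = print (double 21)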
I see this argument about arity checking, etc. quite often. Usually the people who make it are people who have only ever used dynamically typed languages and a language or two with very weak type systems, like C. Often they have switched to Perl or something because C's type system is so unexpressive. This is not the case for advanced languages, like ML.
Another thing is that people who use expressive static type systems tend to exploit them much more than others do. Have you ever tried to write a program which makes really extensive use of higher-order functions, or genericity, or monads, or algebras, in a dynamically typed language like Scheme? I have. It's very frustrating. You make lots of silly errors which are hard to track down, errors which would be caught instantly in a statically typed language. In this respect, static typing is an enabling technology, not a constrictive one.
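As a small illustration of the point about higher-order, monadic code (my example, using the State monad from the mtl package): getting the plumbing wrong here produces an immediate, pointed compile-time error rather than a failure deep inside a call chain at run time.

    import Control.Monad.State (State, evalState, get, put)

    -- Number the elements of a list, threading a counter through a
    -- monadic, higher-order pipeline.
    label :: [a] -> [(Int, a)]
    label xs = evalState (mapM step xs) 0
      where
        step x = do
          n <- get
          put (n + 1)
          return (n, x)

    -- Returning (x, n) instead of (n, x), forgetting the initial
    -- state, or mixing up mapM's arguments are all caught instantly
    -- by the checker.

    main :: IO ()
    main = print (label "abc")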
I think these are all good arguments for static typing. But they're not why I use it. The real reason I like static typing is as follows.
Suppose you are given a massive box full of millions of oddly shaped Lego-like blocks and told to build a house with them. (This is not a skyscraper argument, BTW.) One way to do it is to use a greedy strategy: start at the bottom and, every time you need a piece, look for something with the right shape. This might work, but probably your house will look very odd, with lots of bits sticking out, occluding windows and doorways.
Another way is to look at the pieces and arrange them into categories, and then develop rules which tell you what kinds of shapes you get when you stick together pieces from certain categories, rules which you can rely on in specific situations. This will not solve all your problems, but if you pick a good set of rules, it will let you concentrate on the larger task. If your set of rules is "complete" in a sense, then you furthermore know that you will always be able to build your house following those rules exclusively.
It's true that another set of rules may work as well even though they conflict with yours, and following your own rules to the letter will cause you to miss some opportunities for "optimization". But that's a small price to pay for the gains you make.
Now, here's the important bit. Nobody uses the greedy strategy, or at least nobody who has successfully built a medium-sized program. Everybody has some more complex strategy for programming. OO, or waterfall, or patterns, or whatever. They attack problems using that strategy. Usually exclusively with that strategy. In my case, my strategy is to attack problems using algebraic methods.
A type system also induces such a strategy if it has a denotational semantics. In my case, the type systems I use have an algebraic semantics. Therefore, they exactly match my own strategy. Therefore, I hardly ever complain about typing being too strict or something, because when I get a type error, it reflects an error in my mental model of the task, not somebody else's. Once you get in this frame of mind, typing is 99% boon, and only 1% hindrance (mostly when you hit the limits of the type system, but I use languages with very expressive ones, so it doesn't happen much).
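A sketch of what "the type system matching the strategy" looks like in practice (my example): the problem domain is modelled as an algebra, and a type error or a non-exhaustive match then signals a flaw in the model itself.

    -- The domain as an algebraic datatype: a sum of constructors.
    data Shape
      = Circle Double        -- radius
      | Rect   Double Double -- width, height

    area :: Shape -> Double
    area (Circle r) = pi * r * r
    area (Rect w h) = w * h

    -- Add a 'Triangle' constructor and the compiler points at every
    -- function whose model of Shape is now incomplete; the error
    -- reflects my mental model of the task, not an arbitrary rule.

    main :: IO ()
    main = print (map area [Circle 1, Rect 2 3])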
If you reverse this line of thought, then users can learn to work in harmony with their type system if they pick one which uses the same semantics as they do. If you think in terms of OO, you need a language which has an "OO semantics", etc.
The thing is, semantics are always expressed in the language of mathematics, and mathematics is simply not OO; it's algebraic. OO is more coalgebraic, which can be expressed mathematically, but is at odds, stylistically, with a hundred or so years of mathematical literature. Other approaches, team project methodologies and so on, have no conceivable mathematical basis at all.
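The algebra/coalgebra contrast can be sketched in a few lines (my example): an algebraic value is built from constructors and consumed by pattern matching, while a coalgebraic, OO-flavoured value is characterised by the observations you can make of it.

    -- Algebraic: how can I *construct* a value?
    data Nat = Zero | Succ Nat

    -- Coalgebraic, in the OO spirit: what can I *observe* of a value?
    -- An object as a record of methods over a hidden state.
    data Counter = Counter
      { value :: Int      -- observe the current count
      , incr  :: Counter  -- observe the "next" object
      }

    mkCounter :: Int -> Counter
    mkCounter n = Counter { value = n, incr = mkCounter (n + 1) }

    main :: IO ()
    main = print (value (incr (incr (mkCounter 0))))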
This is why I strongly believe that the only programming paradigms which will survive to the end of this century are the functional and logic ones. Eventually, programmers will be writing mostly programs which deal with real-world problems, problems which come up in physics and statistics and biology (someone will say "what about business?", but every business has to produce a product, and products typically involve the sciences), and these problems are always expressed mathematically, which is to say mostly algebraically, because that is the easiest way to express a complex problem unambiguously. Then the most obvious choice of paradigm is declarative.