[plt-scheme] Mathematical Specifications of Partition Algorithms
This message is fairly long; if you're not interested in the
mathematical
specification of algorithms, press DELETE now.
A recent thread discussed various functional and non-functional
versions of a program to compute the partitions of an integer.
At one point Matthias Felleisen asked for a specification of
the mathematical solution in math. I am interested in program design
techniques driven by specifications, so I decided to see what I could
come up with for this problem. I developed several equivalent
set-theoretic formulations containing different amounts of solution
information.
I'll use the following notation:
1..n = {x: integer(x) & 1 <= x <= n}.
BAG(B,M) asserts that B is a bag (multiset) of members of M.
SEQ(S,M) asserts that S is a sequence (ordered multiset) of members
of M.
{m . S), for m in M and SEQ(S,M), is the sequence with m for its
first
member and the members of S for its remaining members.
< ... > is a literal sequence.
M U N is the union of sets M and N.
UNION(M(i): i in P) is the union of the sets M(i) for all i in P.
MAX(x: x in Y) is the largest member of M.
SUM(x: x in Y) is the sum of the members of the set, bag or sequence
Y.
falling(S) states that the sequence S is sorted in non-increasing
order.
rising(S) states that the sequence S is sorted in non-decreasing
order.
The first, "stingiest" specification is of the set P(n):
Def 1: P(n) = { B: BAG(B, 1..n) & SUM(x: x in B) = n }
Since it is useful to look at partitions ordered in a particular
way, Def 1 can be altered to use sequences. This leads to the
definitions
of the sets Pf(n) and Pr(n):
Def 2: Pf(n) = { S: SEQ(S, 1..n) & SUM(x: x in S) = n & falling(S) }
Def 3: Pr(n) = { S: SEQ(S, 1..n) & SUM(x: x in S) = n & rising(S) }
It is pretty clear that the bags underlying the sequences in Defs 2 and
3
are precisely the bags of Def 1, so they are suitable specifications of
partitions.
One reason for specifying order on the partitions is to make it easier
to
see any structure that can be exploited in the set of partitions.
First, consider Def 2: If we list the partitions of, say, 7, each
sorted
in decreasing order, something pops out -- the set is made up of
partitions
starting with 1, partitions starting with 2, all the way up to
partitions
starting with 6 (and the singleton partition {<7>}). In addition, each
partition starting with, say, 3, is made up of 3 cons'd onto a partition
of 4 whose largest (first) member is no larger than 3. This observation
leads to the following mutually recursive definition of two families of
sets, J(n) and G(n,k):
Def 4: J(n) = {<n>} U UNION( G(n,k): k in 1..(n-1) )
G(n,k) = { (n-k) . S : S in J(k) & MAX(x: x in S) <= n-k }
The claim is that J(n) = Pf(n). The proof is a straight-forward use of
course-of-values (transfinite, strong) induction. The sets J(n)
correspond to the lists built in the partition-generating program
provided by Jerzy Karczmarczuk.
On the other hand, if we list the partitions of, say, 6 using Def 3, we
see lots of sequences with small numbers at the front; further
examination
reveals that a partition <i, ...> of 6 yields the partition <1, i, ...>
of
7, and the partition <i, j, ...> of 6 yields the partition <(i+1), j,
...>
whenever i < j, and both kinds are in the proper order. These points
lead to the mutually recursive definition of three families of sets,
M(n),
C(k) and I(k):
Def 5: M(n) = {<n>} U C(n-1) U I(n-1), n > 1; M(1) = {<1>}
C(k) = { 1 . S : S in M(k) }
I(k) = { (i+1) . j . S : i . j . S in M(k) & i < j }
The claim is that M(n) = Pr(n). This may be established by simple
mathematical induction. The sets M(n) correspond to the lists built up
in the program exhibited by Matthias Felleisen.
An interesting difference between J(n) and M(n) is that J(n) uses all of
the sets J(1), ..., J(n-1), whereas M(n) uses only M(n-1).
I think the following points can be made:
1) Mathematical specifications can, but need not, contain sufficient
structural information to feed a specification-driven design
process.
2) Standard mathematical objects, including recursively-defined
families
of sets, are adequate to specify interesting structural information
even in the absence of higher-order functions or lambda calculus.
Since set theory and type theory are equally capable of formalizing
mathematics, it would seem that set theory is as good as lambda calculus
for specifying computations. However, there is a difference between
the definitions above and the corresponding functional programs --
the definitions may present structural information, but they don't show
how that information is to be used to generate partitions, whereas the
programs must show precisely that. The programs express much more
detail
than do the mathematical specifications.
-- Bill Wood
bill.wood at acm.org