# Learning F#: Case study with branch and bound – Part II (page 4)

So far, we’ve developed a 0-1 knapsack solver in Part I and then generalized it somewhat. The solver is setup to solve general branch and bound solutions of any type (for maximums), given (these are the parameters of `BandB`):

1. A function `f` that evaluates the objective function where all decisions are made
2. A function `g` that provides an upper bound estimate for a partially decided problem instance, such that it gives an upper bound for any possible subsequent settings
3. A function `unFinished` that determines if all decisions have been made
4. A node of the search tree, starting with all decisions unmade
5. An objective function initial value that is less than any valid solution.

So we can now provide the solver with a different set of functions to instead solve the Max-Sat problem. The goal of this problem is to maximize the number of true clauses in a Boolean formula. More precisely, given a Boolean formula in disjunctive normal form, decide a setting for variables such that the number of clauses satisfied (true) is maximum.

Consider the following formula:

Max-Sat example

Here we have nine clauses, each with three literals. There are a total of six Boolean variables (x1 through x6) involved in the problem – the setting of these is the decision that needs to be made. Note the ¬ symbol is logical negation, the ∧ symbol is logical conjunction (and) and the ∨ is logical disjunction (or). The problem is to find settings for each of the variables such that the maximum number of the nine clauses are satistfied. This happens to be in ‘3-SAT’ form where each clause has 3 literals (a Boolean variable or its negation) but the code will not require this. For example above, if x1 = true, x2 = false and x4 = false, then clause 1 is satisfied since at least one literal is true (¬x4 is true).

The evaluation function `f` for this problem is quite simple, for a given variable setting evaluate each clause and determine if it is satisfied, and then add them up. Similarly, the `unFinished` function is also very simple, test to see if all variables are set; we basically did this for the 0-1 knapsack already. A search-space ‘node’ of the problem is also quite similar to the 0-1 knapsack, its a set of variables such that some may have been decided and some still unset. An initial value less than any valid solution would be -1, that is because a valid number of clauses satisfied would range from 0 to the number in the problem instance (as many as 9 in Eq 3 above).

So that leaves what is usually the most challenging function in branch and bound, the estimator `g`.

This is a simple form of Max-Sat solver, this problem has a very long history see this for solvers that use branch and bound as well

Well, one simple upper limit is, “evaluate each clause against the variables that have been decided already, and then assume that literals that are not decided yet will be true.” Conversely, that is if all the literals in a clause are false, then there are no future settings that will change this. So the upper bound function will be the total number of clauses, less those that are already known to be false.

With these concepts in hand, we can tackle the coding. Our first step is the define a variable of the problem.

 ```1: 2: 3: 4: 5: 6: 7: ``` ```type MaxSatDiscreteVar = { (* this is a record type in F# *) Name : string; Setting : ZeroOneVarSetting; // discrete value setting } with member var.isUnset = (var.Setting = ZeroOneVarSetting.Unset) ```

This is similar to the 0-1 knapsack variable, but we don’t need Weight or Utility. It does use the `ZeroOneVarSetting` from our `DiscreteVarTypes` module, although it might be better to use a different type, say one that had `Unset, True` and `False` instead but we interpret `Zero` as false and `One` as true. It also introduces the abilty to write member functions for types, here it creates a function `isUnset` that returns true if the `ZeroOneVarSetting` is `Unset`. Next, let’s consider how we are going to represent our problem in F#. Here is a sample:

 ``` 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: ``` ```let vars2 = [| {Name = "v1"; Setting = Unset}; {Name = "v2"; Setting = Unset}; {Name = "v3"; Setting = Unset}; {Name = "v4"; Setting = Unset}; {Name = "v5"; Setting = Unset}; {Name = "v6"; Setting = Unset} |] let clauses2 = [| [|-1; 2; -4|]; // positive = var at that index; negative means not var at index [|-1; 3; 4|]; [|-2; 3; 4|]; [|-2; -4; -5|]; [| 2; 3; 6|]; [| 3; 5; -6|]; [|-4; -5; 6|]; [|-2; 5; -6|]; [| 3; -5; 6|]; |] ```

Here we have a variable array of `MaxSatDiscreteVar` with given names such that all are Unset, setting up the top node in our search tree. To represent the clauses, we decided on using an integer array of arrays. Each row of the array is a clause, and each element of the row is an index such that an positive index is used for non-negated literals and a negative value for negative (not) literals. We used indexing that starts at 1 rather than 0 since we could not distinguish +0 and -0. So, a literal -4 means that this is the fourth variable in the variable array in negated (¬) form. A literal 5 is the fifth one in positive form (since F# uses zero-based indexing the fifth variable is at index 4). The example above should correspond to Eq 3 above.

We’ll put the Max-Sat related code in a module `MaxSat`, here is the whole thing:

 ``` 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: 49: 50: 51: 52: 53: 54: 55: 56: 57: 58: 59: 60: 61: 62: 63: 64: 65: 66: 67: 68: 69: 70: 71: 72: 73: 74: 75: 76: 77: 78: 79: 80: 81: 82: 83: 84: 85: 86: 87: 88: 89: 90: 91: ``` ```namespace BandBapplication //open DiscreteVarTypes open System.Collections.Generic type MaxSatDiscreteVar = { (* this is a record type in F# *) Name : string; Setting : ZeroOneVarSetting; // discrete value setting } with member var.isUnset = (var.Setting = ZeroOneVarSetting.Unset) module MaxSat = let clausefulleval clauses (vars:MaxSatDiscreteVar array) = // assume all vars are set clauses |> Array.map (fun clause -> clause |> Array.fold (fun acc literal -> let literalIndex = (abs literal) - 1 let literalTrue = match vars.[literalIndex].Setting with | Zero -> if literal < 0 then true else false | One -> if literal > 0 then true else false | _ -> failwith "Attempt to eval with an Unset var" max acc (if literalTrue then 1 else 0) ) 0 ) |> Array.sum let clauseNumFalse (clauses:int array array) (vars:MaxSatDiscreteVar array) = let mutable count = 0 for clause in clauses do let mutable clauseIsDefinitelyFalse = true // an empty clause is false? for literal in clause do let literalIndex = (abs literal) - 1 let literalTrueOrUnknown = match vars.[literalIndex].Setting with | Zero -> if literal < 0 then true else false | One -> if literal > 0 then true else false | _ -> true clauseIsDefinitelyFalse <- clauseIsDefinitelyFalse && (not literalTrueOrUnknown) count <- count + if clauseIsDefinitelyFalse then 1 else 0 // add an always false clause! clauses.Length - count let clauseNumFalseFunc clauses (vars:MaxSatDiscreteVar array) = let count = ref 0 Array.iter (fun clause -> let clauseIsDefinitelyFalse = ref true // an empty clause is false? Array.iter (fun literal -> let literalIndex = (abs literal) - 1 let literalTrueOrUnknown = match vars.[literalIndex].Setting with | Zero -> if literal < 0 then true else false | One -> if literal > 0 then true else false | _ -> true clauseIsDefinitelyFalse := !clauseIsDefinitelyFalse && (not literalTrueOrUnknown) ) clause count := !count + if !clauseIsDefinitelyFalse then 1 else 0 // add an always false clause! ) clauses clauses.Length - !count (* return index of unset (first) var, or -1 if all are set (to One or Zero) *) let unsetVarIndex vars = try Array.findIndex (fun var -> var.Setting = ZeroOneVarSetting.Unset) vars with | :? KeyNotFoundException -> -1 // only create feasible solutions. In this case, all settings are feasible let maxsat_branch g bestSolution vars = let mutable branches = [] let unsetVarIndex = unsetVarIndex vars // should only be called when there are Unset vars, so no -1 check let varArrayZero = vars |> Array.mapi (fun index var -> if index = unsetVarIndex then {var with Setting = Zero} else var) let varArrayOne = vars |> Array.mapi (fun index var -> if index = unsetVarIndex then {var with Setting = One} else var) let gEstimateZero = g varArrayZero let gEstimateOne = g varArrayOne if gEstimateZero > bestSolution then branches <- [varArrayZero] if gEstimateOne > bestSolution then branches <- varArrayOne :: branches branches ```

The code will assume that zero-length clauses are false

There are three (purposely) different styles of coding used in `clausefulleval`, `clauseNumFalse`, and `clauseNumFalseFunc`.

The first function `clausefulleval` is going to be used for our `f` function, that is to determine how many clauses are true for a fully decided variable set. We note that it is explicitly typing the last argument, `vars` as an array of `MaxSatDiscreteVar` like this: `let clausefulleval clauses (vars:MaxSatDiscreteVar array)`. Why is this needed? Well this module is included along with the `knapsack01` module from before. Note that it uses the Settings value, but there isn’t enough in the body of the code to determine if these vars are `MaxSatDiscreteVar` or `KnapDiscreteVar` — if the code accessed Weight or Utility then it would be clear. But as it stands, we need to clarify that these are `MaxSatDiscreteVar`. Notice that this function requires that all of the variables be set, otherwise it failswith a message (think ‘throws an exception’). This function follows a primarily idiomatic functional style.

From the two `for` loops in `clauseNumFalse`, the compiler can deduce that `clauses` is an array of array (technically it only knows that it is a sequence of sequence type), and that the elements must be ints since on line 22 we use an expression that would require the element to be int. However, if we don’t explicitly type the `clauses` argument as: `(clauses:int array array)` then the compile fails at the `clauses.Length` (line 44). Why? Exactly because the for loops only imply a sequence, not necessarily an Array or List type; and sequences are lazily evaluated that is they are only evaluated when an element of the sequence is needed. Therefore they don’t really have a defined length (and can be conceptually infinite – as long as you don’t ask for the value at index ∞), So we needed to clarify that `clauses` is an int array of array. The body of the code is straightforward, looping through each clause and then each literal of the clause. If a clause is determined to be true, then the mutable value count is incremented and ultimately returned as the number of true clauses.

The functions `clauseNumFalse` and `clauseNumFalseFunc` do the same thing, the latter uses a `ref` value to accumulate the needed value but the former only needs a mutable one. But in `clauseNumFalseFunc` notice that there is no need to define the type for `clauses` – this is because the Array.iter applications clarify what clauses must be. Each of these could be re-written in the functional style of `clausefulleval` but I will leave them in the more imperative form, emphasizing that not everything must be in a functional structure. Which is easier to read and maintain is probably subjective.

The final function of interest is `maxsat_branch` on lines 73-89 in listing above for `MaxSat`. It must take an existing search tree node (called `vars` since the search node is an array of variables) and branch to feasible new settings. Like in knapsack, we find an Unset variable, and branch to each of One and Zero settings (with the interpretation that One is true and Zero is false). There is no need for a feasibility test as in 0-1 knapsack where we made sure that the weight limit was not exceeded — all possible settings of Boolean variables are feasible for the problem. We do test that the new settings can be better by testing the upperbound (`g`) of the search node against the best known so far.

Now we just need to use our new code for Max-Sat in the main program (note that `open MaxSat` was included prior to the entry point code shown):

 ``` 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: 49: 50: 51: 52: 53: 54: 55: ``` ```[] let main argv = (* *************************************************************************************** *) (* KNAPSACK EXAMPLE 1 *) let vars1 = [|{Name = "food"; Setting = ZeroOneVarSetting.Unset; Weight = 5.0; Utility = 8.0}; {Name = "tent"; Setting = ZeroOneVarSetting.Unset; Weight = 14.5; Utility = 5.0}; {Name = "gps"; Setting = ZeroOneVarSetting.Unset; Weight = 1.0; Utility = 3.0}; {Name = "map"; Setting = ZeroOneVarSetting.Unset; Weight = 0.5; Utility = 3.0} |] let weightLimit1 = 16.0 let vars1Sorted = (Array.sortBy (fun elem -> -elem.Utility / elem.Weight) vars1) // still have this let gknap1 = knapestimator weightLimit1 let knapAnyUnset = Array.exists (fun (elem:KnapDiscreteVar) -> ZeroOneVarSetting.isUnset elem.Setting) (* Invoke *) let (solutionUtility, solutionVars) = BandB knapmultiplySumUtility gknap1 branchknap knapAnyUnset vars1Sorted -1.0 printfn "Best Utility = %f" solutionUtility for var in solutionVars do printfn "%A" var (* *************************************************************************************** *) (* MAXSAT EXAMPLE 1 *) let vars2 = [| {Name = "v1"; Setting = Unset}; {Name = "v2"; Setting = Unset}; {Name = "v3"; Setting = Unset}; {Name = "v4"; Setting = Unset}; {Name = "v5"; Setting = Unset}; {Name = "v6"; Setting = Unset} |] let clauses2 = [| [|-1; 2; -4|]; // positive = var at that index; negative means not var at index [|-1; 3; 4|]; [|-2; 3; 4|]; [|-2; -4; -5|]; [| 2; 3; 6|]; [| 3; 5; -6|]; [|-4; -5; 6|]; [|-2; 5; -6|]; [| 3; -5; 6|]; |] let f2 = clausefulleval clauses2 // partial func application let g2 = clauseNumFalseFunc clauses2 let maxsatAnyUnset = Array.exists (fun elem -> elem.Setting = ZeroOneVarSetting.Unset) // no qualifier needed let (bestValue2, solution2) = BandB f2 g2 maxsat_branch maxsatAnyUnset vars2 -1 printfn "Solution 2:" printfn "# clauses satisfied = %d out of %d" bestValue2 clauses2.Length solution2 |> Array.iter (fun elem -> printfn "%s = %s" elem.Name (match elem.Setting with | ZeroOneVarSetting.Unset -> "" | ZeroOneVarSetting.Zero -> "false" | _ -> "true")) System.Console.ReadKey() |> ignore // don't close the output window until a key is hit 0 ```

We see that we are using our solver for both the previous 0-1 knapsack and our new Max-Sat formulation. Let’s investigate the types in the `BandB` calls, which is:

BandB type

This says that:

1. `BandB` takes a first argument `f` that takes an (generic) argument of `'a` and produces a `'b`.
2. `BandB` takes a second argument `g` that takes an (generic) argument of `'c`.
3. `BandB` takes a third argument `branch` that takes an (generic) argument of `'a` and produces an `'a list`.
4. `BandB` takes a fourth argument `unFinished` that takes generic argument of `'a` and produces a `bool`.
5. `BandB` takes a fifth argument `node` that is of type `'a`.
6. `BandB` takes a sixth argument `initLowestValue` which is of type `'b`.
7. And finally, `BandB` produces a tuple of type `'b * 'a`.

Other than the `bool` result for `unFinished` all of the arguments are generic. The specific types are a bit different for each of the invocations, for example `'b` is an float for knapsack (line 16) but an int for Max-Sat (line 46). We will revisit this topic after we do the traveling salesman problem on the next page. But for now if we run the code we get:

 ``` 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: ``` ```BandB: total settings queued = 13, total number of evals: 1 Best Utility = 14.000000 {Name = "map"; Weight = 0.5; Utility = 3.0; Setting = One;} {Name = "gps"; Weight = 1.0; Utility = 3.0; Setting = One;} {Name = "food"; Weight = 5.0; Utility = 8.0; Setting = One;} {Name = "tent"; Weight = 14.5; Utility = 5.0; Setting = Zero;} BandB: total settings queued = 14, total number of evals: 3 Solution 2: #clauses satisfied = 9 out of 9 v1 = true v2 = true v3 = true v4 = true v5 = false v6 = false ```

Looking at the new results for Max-Sat we see that we were able to satisfy all nine clauses with the given variable settings. We also note that there are 6 variables, so there are 26 = 64 possible variable settings and 26+1 – 1 = 127 nodes in the search tree, but we were able to determine an optimum result queueing only 14 nodes and evaluating 3 settings.

As on the previous page, the download is a Visual Studio solution file.

## Refactoring and improving

At this point, we see that the queue we are creating, based on the branch function for each problem type, is unsorted. That is, the queue is simply built by concatenanting new nodes from the branch function. However, we would like to do a best-first search, that is select the node from the queue that has the best (highest) possible upper limit. The idea is that this may lead to a good solution faster (with less exploration of the search tree).

As it stands now, the upper-limit bound function `g` is embedded in the branch function; the functionality of which is to both provide branches that are feasible and to ensure that they meet the upperbound limit. But if we want to sort the queue by bound, then we need to know the `g` value at in the `BandB` function where the whole queue is, not just in the branch function. We could invoke it again in the `BandB` function but that is wasteful.

So we are going to refactor the code, that is ‘move functionality around’. We can remove both the `g` function and bestUtility arguments from the branch functions. And then apply `g` in the `BandB` function using it to sort or prune nodes returned by the branch.

 ``` 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: ``` ```module BandB = let BandB f g branch unFinished node initLowestValue = let bestUtility = ref initLowestValue; // update: queue elements are now a 2-tuple (pair) of g-upperbound estimate and node let mutable queue = [ (g node, node) ] let mutable bestNode = node // best var settings found, nothing so far. // these are only informational, not needed in the routine let mutable numberOfQueues = 1 // we just queued the start, so 1 let mutable numberOfEvals = 0 // number of times we evaluated the function with all vars set while not queue.IsEmpty do match queue with | (gNode,curNode) :: restOfQueue -> if unFinished curNode then // update: apply `g` here instead, simplifying branch, and sort the queue let toQueueAll = branch curNode let toQueueAugmented = toQueueAll |> List.map (fun elem -> ((g elem), elem)) let toQueue = toQueueAugmented |> List.filter (fun elem -> (fst elem) > !bestUtility) queue <- (toQueue @ restOfQueue) |> List.sortBy (fun elem -> fst elem) |> List.rev numberOfQueues <- numberOfQueues + toQueue.Length else let util = f curNode numberOfEvals <- numberOfEvals + 1 if util > !bestUtility then bestUtility := util bestNode <- curNode queue <- restOfQueue // we've processed curNode, take it off the queue | _ -> () // queue isempty, do nothing printfn "BandB: total settings queued = %d, total number of evals: %d" numberOfQueues numberOfEvals (!bestUtility, bestNode) ```

In this version, the queue is not simply the list of search nodes, but a tuple (pair) of: upper bound estimate for the node and the node. The branch function is called on line 18, now with just an argument of `curNode`. We expect the branch function to return a list of nodes branched from `curNode` such that each are feasible (e.g. for knapsack don’t go over weight) but do not concern themselves with bound estimates. That is handled on lines 19-22 where we take the list of branched nodes (`toQueueAll`), augment each with their upperbound (`toQueueAugmented`) and then filter them down to only those that can be better than the best solution found so far (`toQueue`). As before since we want to use the value `bestUtility` in an inner closure (the filtering part) we need to make it a reference type. We will revisit line 24 later. The modified branch function for knapsack is:

 ``` 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: ``` ``` let branchknap wlimit vars = let unsetVarIndex = unsetVarIndex vars // should only be called when there are Unset vars, so no -1 check let varArrayZero = vars |> Array.mapi (fun index var -> if index = unsetVarIndex then {var with Setting = Zero} else var) let varArrayOne = vars |> Array.mapi (fun index var -> if index = unsetVarIndex then {var with Setting = One} else var) // not only use g but also ensure feasibilty! let zeroMakesWeight = (knapmultiplySumWeight varArrayZero) <= wlimit let oneMakesWeight = (knapmultiplySumWeight varArrayOne) <= wlimit // update the queue, first match is the match (zeroMakesWeight, oneMakesWeight) with | true,true -> varArrayOne :: [varArrayZero] | true,false -> [varArrayZero] | false,true -> [varArrayOne] | _ -> [] // Neither case, nothing ```

And the modified branch for Max-Sat is:

 ```1: 2: 3: 4: 5: 6: 7: 8: 9: ``` ```// only create feasible solutions. In this case, all settings are feasible let maxsat_branch vars = let unsetVarIndex = unsetVarIndex vars // should only be called when there are Unset vars, so no -1 check let varArrayZero = vars |> Array.mapi (fun index var -> if index = unsetVarIndex then {var with Setting = Zero} else var) let varArrayOne = vars |> Array.mapi (fun index var -> if index = unsetVarIndex then {var with Setting = One} else var) varArrayOne :: [varArrayZero] ```

Each of these are simpler now that bounding functionality was moved to the main `BandB` routine. The main code is the same.

Finally, let’s revisit line 24 from `BandB`, it reads:

 ```1: ``` ```queue <- (toQueue @ restOfQueue) |> List.sortBy (fun elem -> fst elem) |> List.rev ```

The documentation I’ve found doesn’t say what the computational complexity of a List.sort is, but since it appears to be a comparison sort, this would be O(n log n).

The idea being that we want to sort descending on the `g` upper bound estimates, using the sortBy function. This puts the node with the highest upperbound at the front of the queue for the next loop. As of F# v4.0, there is a `sortDescending` function for lists, but we leave this code as is for now as we will revisit reverse sorting in Part III. We could avoid application of the lambda function that is used to sort on the upper bound only and instead use:

 ```1: ``` ```queue <- (toQueue @ restOfQueue) |> List.sort |> List.rev ```

This would sort the values as a tuple, rather than only on the first element of the tuple (which is the `g` estimate), but the downside of this is then node is required to have a comparison function. For our cases here, of `KnapDiscreteVar` and `MaxSatDiscreteVar` this isn’t a problem as they are sortable but in general there is no reason to require ordering of search nodes so its better to leave the sorting to just using the bound. To avoid using the list reverse, what if we sorted the elements using the negative of the `fst elem` value? This should result in a reverse sorted list, so this should work:

 ```1: ``` ```queue <- (toQueue @ restOfQueue) |> List.sortBy (fun elem -> -(fst elem)) // elem becomes int??? ```

Update: From this reference, it appears that use of the unary operator (-) will change the generic arg to an `int`

But it doesn’t … it will cause the elem in the lambda function to change from a generic type to an int (see update note in gutter). As unary minus is indeed applicable to both ints and floats, we would have to explicitly type the first element argument to float to have a unary float minus operator. But we will leave this code as is for now, and revisit this in Part III.

If we run this code, we get:

 ``` 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: ``` ```BandB: total settings queued = 19, total number of evals: 4 Best Utility = 14.000000 {Name = "map"; Weight = 0.5; Utility = 3.0; Setting = One;} {Name = "gps"; Weight = 1.0; Utility = 3.0; Setting = One;} {Name = "food"; Weight = 5.0; Utility = 8.0; Setting = One;} {Name = "tent"; Weight = 14.5; Utility = 5.0; Setting = Zero;} BandB: total settings queued = 23, total number of evals: 4 Solution 2: #clauses satisfied = 9 out of 9 v1 = false v2 = true v3 = true v4 = true v5 = false v6 = false ```

which is worse than our previous solver, requiring more queues and evals than before. So it might seem that the best-first idea isn’t working, but let’s wait to analyze this more in later sections. Also note that the Max-Sat solution found is different, but still all the clauses are satisfied — branch and bound is a complete solver and will find an optimal solution but it may not be unique (in this case there are mulitple settings that provide the optimal solution).