Fluent.Interface


Scheduling algorithm in GAMS, AMPL, MFS and lp_solve

In researching scheduling problems, I came across Nathan Brixius’ blog post which utilised Microsoft solver foundation for a  critical path project scheduling problem.   This was a step in the right direction, but my problem was resource constrained without concern for the order in which tasks finished.  After some further research I found Erwin Kalvelagen from the Amsterdam Optimization Modeling Group who had a relevant post on a conference scheduling problem that used GAMS to model a solution.  My goal was to port this to MSF, which was made easier with Ewrin’s great tutorial on OML – the modelling language behind Microsoft’s Solver Foundation.

I started by porting the GAMS solution to AMPL – also a modelling language for linear optimization problems, and closer to MSF as far as syntax is concerned.  One advantage GAMS has over AMPL (and indeed MSF) is its richer language syntax and support for sparse matrix’s – see this thread for a discussion on more differences.  However AMPL is clean and so was a first step.

AMPL requres two files

1) the data file which defines ‘sets’ and parameter values associated with the domain:

set TASKS := T1 T2 T3 T4 T5;
set PEOPLE := P1 P2 P3 P4;
set SLOTS := S1 S2;

param pref (tr):
         T1    T2    T3    T4    T5 :=
    P1    1     0     2     0     0
    P2    0     1     2     0     0
    P3    0     0     1     2     0
    P4    0     0     0     1     2 ;

param:   ntasks :=
    T1   1
    T2   1
    T3   1
    T4   1
    T5   1 ;

param:   minp   maxp :=
    T1   1      5
    T2   1      5
    T3   1      5
    T4   1      5
    T5   1      5 ;

param:   nrooms :=
    S1   5
    S2   5 ; 

2) The model file which declares constaints, and the objective function:

set TASKS;
set PEOPLE;
set SLOTS;

param pref {TASKS,PEOPLE} >= 0;
param nrooms {SLOTS} >= 0;
param ntasks {TASKS} >= 0;
param minp {TASKS} >= 0;
param maxp {i in TASKS} >= minp[i];

var x {t in TASKS, p in PEOPLE, s in SLOTS} binary;
var xts {t in TASKS, s in SLOTS} binary;
var xtp {t in TASKS, p in PEOPLE} binary;

subject to rule1 {t in TASKS, s in SLOTS}:
sum {p in PEOPLE} x[t,p,s] >= xts[t,s];
subject to rule2 {t in TASKS, p in PEOPLE, s in SLOTS}:
x[t,p,s] <= xts[t,s]; subject to rule3 {t in TASKS, p in PEOPLE}: sum {s in SLOTS} x[t,p,s] == xtp[t,p]; subject to rule4 {t in TASKS}: sum {s in SLOTS} xts[t,s] == ntasks[t]; subject to rule5 {t in TASKS}: minp[t] <= sum {p in PEOPLE} xtp[t,p] <= maxp[t]; subject to rule6 {s in SLOTS}: sum {t in TASKS} xts[t,s] <= nrooms[s]; subject to rule7 {p in PEOPLE, s in SLOTS}: sum {t in TASKS} x[t,p,s] == 1; maximize z: sum {t in TASKS, p in PEOPLE} pref[t,p] * xtp[t,p]; [/sourcecode] This can be executed by running following commands inside ampl.exe console: [sourcecode language='css'] ampl: model sched.mod; ampl: data sched.dat; ampl: solve; MINOS 5.51: ignoring integrality of 70 variables MINOS 5.51: optimal solution found. 42 iterations, objective 12 ampl: display x; x [*,*,S1] : P1 P2 P3 P4 := T1 1 0 0 0 T2 8.01124e-18 1 -8.01124e-18 0 T3 1.13258e-16 1.13258e-16 1.13258e-16 0 T4 0 0 1 1 T5 0 0 0 3.08286e-16 [*,*,S2] : P1 P2 P3 P4 := T1 1.40275e-16 -1.40275e-16 1.40275e-16 0 T2 0 2.2428e-16 -8.40105e-26 0 T3 1 1 1 0 T4 -2.21125e-16 0 2.21125e-16 2.21125e-16 T5 2.89604e-16 0 -2.78275e-16 1 ; [/sourcecode] Although in this case we are solving a binary linear optimization problem, AMPL treats all variables as double values - so you can interpret the small numbers in results as FALSE, and the 1's as TRUE.  The 3d matrix x then represents the cross section of person(p1-p4) attending task(t1-t5) during slot (s1,s2).  Which can more simply be represented as the following Schedule:

  s1 s2
p1 t1 t3
p2 t2 t3
p3 t4 t3
p4 t4 t5

Now we are ready to port this same solution to MFS.  Using the Excel 2007 foundation solver add-in, all data is defined as region linked to cells on a sheet, and the model is declare in OML.  The syntax is similar defining the ‘sets’ and ‘decisions’ (variables) with constraints using Foreach to represent a sum. 

Model[

Parameters[Sets,People,Tasks,Slots],
Parameters[Integers,NTasks[Tasks],MinPeople[Tasks],MaxPeople[Tasks],MaxRooms[Slots],Pref[People,Tasks]],

Decisions[Integers[0,1],x[Tasks, People, Slots], xts[Tasks, Slots], xtp[Tasks, People]],
Decisions[Reals,z],

Constraints[
// rule 1: x(t,p,s)=0 for all p => xts(t,s) = 0
Foreach[{t,Tasks}, {s,Slots}, Sum[{p,People},x[t,p,s]] >= xts[t,s]],
// rule 2: any x(t,p,s)=1 => xts(t,s) = 1
Foreach[{t,Tasks}, {p,People}, {s,Slots}, x[t,p,s] <= xts[t,s]], // rule 3: p visits talk t in any s Foreach[{t,Tasks}, {p,People}, Sum[{s,Slots},x[t,p,s]] == xtp[t,p]], // rule 4: each talk happens exactly ntalk times Foreach[{t,Tasks}, Sum[{s,Slots}, xts[t,s]] == NTasks[t]], // rule 5: task must have between min&max people Foreach[{t,Tasks}, Sum[{p,People}, xtp[t,p]] <= MaxPeople[t]], Foreach[{t,Tasks}, Sum[{p,People}, xtp[t,p]] >= MinPeople[t]],
// rule 6: timeslot can only hold up to 5 talks
Foreach[{s,Slots}, Sum[{t,Tasks}, xts[t,s]] <= MaxRooms[s]], // rule 7: p can only visit one talk in each time period Foreach[{p,People}, {s,Slots}, Sum[{t,Tasks}, x[t,p,s]] == 1], z == Sum[{t,Tasks},{p,People},Pref[p,t]*xtp[t,p]] ], Goals[Maximize[z]] ] [/sourcecode] The following could also be solved directly in C# code, which will be an exercise for another post.  I did however take the solution one more step further by creating a lp_solve implementation in the form of an '.lp' file which can be solved through the lp_solve IDE.  This format is much more verbose as it doesn’t abstract the data, instead deals with the 70 binary variables for each cell of each matrix (x, xtp, and xts).   This simple solution is solved in 20ms – faster then the other higher level libraries.

Download the source for  solutions in GAMS, AMPL, MSF /w excel, and lp_solve.

Advertisements

Billy McCaferty and his S#arp Architecture

I have been following Billy McCaferty’s Sharp Architecture project for some time now – a best practice Domain Driven Design approach to developing ASP.NET MVC web apps.  It has seen a number of iterations, the latest including the configuration with FluentNhibernate and independent ServiceLocator

In his recent article on InfoQ he discusses some of the motivations behind his TDD/DDD implementation.

What’s currently lacking, at least in the world of .NET web development, is a common architecture and foundation for application development which combines best of breed technologies and techniques, using recent technologies based on proven practices, while taking into account the availability of high quality tools developed by the open source community. S#arp Architecture is a response to this need. The open source S#arp Architecture attempts to leverage the proven practices described in this article with carefully selected tools to increase development productivity, while enabling a high level of quality and maintainability

He also makes dudious use of the T4 templating support built into VS2008 to codegen files from the Model objects through to Views and Controllers.  A great kick start for any new project – Well worth a read!


Cloudera building a business around open source Map Reduce

The heavy hitting ex-executives behind start up CloudEra are banking on a business based around Hadoop, the open source Map Reduce implementation with a distribution capable of running on Amazon’s EC2.   Google is credited with popularising (inventing) Map Reduce and has been tuning its own implementation for many years.  It gave insights into the origin and future research direction in a round table video last year.

Increasingly companies need to make sense of Terabytes or even Petabytes of data.  This information is stored across many machines on many disks, and needs distributed algorithms for sifting through the data in any reasonable time.  This is where Map-Reduce comes in.

Interestingly Microsoft has taken a step back from this direction when with deciding that its SDS offering should support standard ‘relational’ features, in effect turning the product into a hosted SQL Server cloud.

It has however been active in this research field.  It released its functional programming language F# and it runs its ad serving on Dryad – a distributed execution software engine.  DryadLINQ combines the power of this engine, with the simplicity of LINQ by creating a SQL-like execution plan for distributed processing, very cool! 

Large scale distributed processing software typically runs on many low grade Linux servers running open source software so that licensing costs are kept low.  However with the army of MS developers out there, there are companies springing up to provide software to make the most out of idle cycles on Windows boxes around the network.  Manjrasoft a recent graduate from Melbourne University’s GridBus laboratory have released an Alpha of their Aneka software – a .NET Map Reduce implementation.


Regression testing with PEX

A new project from Microsoft research, PEX is an smart tool integrated directly into Visual Studio which helps you generate test suites with max code coverage.  It analyzes your code and suggests various arguments to ensure all paths of your code are tests.  These arguments and return results can then be turned directly in to .NET test methods either using the standard VS Test framework, or NUnit/MBUnit via extensions.

Also recently released is another free testing tool NP .NET Profiler, which can be hooked into .NET processes such as IIS to monitor memory and performance bottlenecks as well as catch first chance exceptions.  This low impact tool can be used in production environments and even has its own query interface for getting to the heart of problems.