Cloud computing report; Sun’s new API and DabbleDB

A recent McKinsey & Company report, 'Clearing the air on cloud computing', argues that for companies to migrate their enterprise software platforms to the cloud:

The cost of cloud must come down significantly for outsourcing a complete data center to make economic sense

It is, however, becoming cost effective for smaller 'compute equivalent' environments, where the total cost of assets for a typical data centre works out at around US$45 per CPU per month.

The report also suggests that:

CIOs should be working the cloud technology stack bottom up; cloud providers top down

Or in other words: utilise the virtualization and CDNs available now, while waiting for cloud service providers to improve reliability and standardize their offerings.  The recent Open Cloud initiative spearheaded by Sam Johnston from Australian Online Solutions might go some way to achieving this goal.

A common set of protocols and APIs for accessing cloud resources could be as revolutionary as HTTP was for TCP/IP.  Possibly the simplest and best implementation I have seen to date is Sun's new cloud initiative, illustrated in this Hello Cloud example leveraging JSON over a RESTful API.  Hopefully the recent acquisition of Sun by Oracle won't dent the enthusiasm of this project.
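I have not reproduced Sun's actual wire format here, but as a sketch of what 'JSON over REST' means in practice, the snippet below builds the JSON body for a hypothetical 'create VM' call – the endpoint and field names are invented for illustration and are not Sun's API:

```python
import json

def make_create_vm_request(name, cpus, memory_mb):
    """Build the JSON body for a hypothetical 'create VM' REST call.

    A client would POST this to an endpoint like /cloud/vms with
    Content-Type: application/json and parse the JSON response.
    """
    body = {
        "name": name,
        "cpus": cpus,
        "memoryMB": memory_mb,
    }
    return json.dumps(body)

payload = make_create_vm_request("hello-cloud", cpus=1, memory_mb=512)
print(payload)
```

The appeal of a common API is exactly this: once the resource representations are plain JSON over HTTP, any language with an HTTP client can drive any compliant cloud.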

One of the great things about the cloud is the flexibility developers/vendors have when choosing the software stack for their implementation.  DabbleDB was a recent example I came across that offers a rich database through a clever online spreadsheet-style interface.  It is built on the Squeak Smalltalk engine utilising the Seaside framework.  Avi Bryant talks to InfoQ on how his application works without the use of an RDBMS, instead storing virtual images of the Smalltalk run-time.


Eiffel-style Code Contracts for .NET

Design-by-Contract was coined by Bertrand Meyer, the creator of Eiffel, as a way for software designers to formally define pre- and post-conditions as well as object invariants, such that the caller knows what to expect when executing code.

This practice is applied in unit testing when asserting the result of a function; until now, however, there have only been workarounds for achieving the same logic in-line.  With Code Contracts it can now be expressed elegantly:

public void Deposit(int amount)
{
    Contract.Requires(amount > 0);
    Contract.Ensures(Balance == Contract.OldValue(Balance) + amount);

    Balance += amount;
}

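For readers outside .NET, the same pre/post-condition idea can be sketched in plain Python with a decorator.  This is an illustrative analogue invented for this post, not the Code Contracts API; capturing `self.balance` as the 'old' state is a simplification that works for this account-like example:

```python
import functools

def contract(requires=None, ensures=None):
    """Minimal Design-by-Contract decorator: check a precondition on the
    arguments, and a postcondition on (old state, new state, args)."""
    def wrap(method):
        @functools.wraps(method)
        def checked(self, *args):
            if requires and not requires(*args):
                raise ValueError("precondition failed")
            old = self.balance  # capture 'old' state, like Contract.OldValue
            method(self, *args)
            if ensures and not ensures(old, self.balance, *args):
                raise AssertionError("postcondition failed")
        return checked
    return wrap

class Account:
    def __init__(self):
        self.balance = 0

    @contract(requires=lambda amount: amount > 0,
              ensures=lambda old, new, amount: new == old + amount)
    def deposit(self, amount):
        self.balance += amount

acct = Account()
acct.deposit(100)   # passes both checks
# acct.deposit(-5)  # would raise ValueError: precondition failed
```

The difference in the real thing, of course, is that the binary rewriter and static checker verify these conditions at compile time rather than by raising at run time.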
Microsoft, through its DevLabs site, has released Code Contracts for .NET, a subset of its ongoing Spec# research project.  The offering consists of:

  • a stand-alone library that will in future ship as part of the BCL in .NET 4, 
  • a binary rewriter to support post-condition checking, 
  • and extensions to the Visual Studio 2008 IDE to support static checking after every compile.   

The distribution also includes a number of helpful code snippets, such as:

cr Contract.Requires(…);
ce Contract.Ensures(…);
ci Contract.Invariant(…);

Where the Contract.Invariant call is declared within a protected ObjectInvariant method (snippet cim) and defines the state of a property throughout its life cycle:

protected void ObjectInvariant()
{
    Contract.Invariant(f != null);
}

I recommend watching the short Channel 9 video for an introduction, or consulting the manual for further details.  One feature not yet available is integration with code documentation, which will be very helpful for API developers.

TDD on show with ASP.NET MVC at Mix09

It's been a long road, but ASP.NET MVC 1.0 is out and is now open source, as posted on ScottGu's blog.  It's a great milestone, and one that has been paraded in front of the Mix09 attendees in Vegas.  I can recommend setting aside the two hours required to get through Phil Haack's Ninja preso and Scott Hanselman's NerdDinner talk.

Those guys, along with Rob Conery and the main man Scott Guthrie, have been slaving away on a new book, ASP.NET MVC 1.0, from Wrox Press.

the extended forehead edition

And they have been kind enough to give away for free a chapter that walks through all the steps required to create the NerdDinner sample, which includes details on:

  • Validation
  • Security tips
  • Rendering partial controls
  • Custom routing
  • Authentication/Authorization
  • AJAX (with jQuery)
  • LINQ to SQL mapping with scalar/table MSSQL functions (very cool!)
  • Dependency injection with Moq

It also has 20 pages dedicated to unit testing, and encourages the 'AAA' approach: Arrange, Act, and Assert, which looks like this:

public void Dinner_Should_Not_Be_Valid_When_Some_Properties_Incorrect() {
    // Arrange
    Dinner dinner = new Dinner() {
        Title = "Test title",
        Country = "USA",
        ContactPhone = "BOGUS"
    };

    // Act
    bool isValid = dinner.IsValid;

    // Assert
    Assert.IsFalse(isValid);
}

PhilHa and ScottHa also spend a good amount of time covering red/green/refactor in their videos, and why it is important to have the test fail first.  Great Stuff!
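The Arrange/Act/Assert shape carries over to any xUnit-style framework.  As a minimal Python unittest equivalent – where the Dinner class is a toy stand-in invented for illustration, not the NerdDinner model:

```python
import unittest

class Dinner:
    """Toy stand-in model object, for illustration only."""
    def __init__(self, title="", country="", contact_phone=""):
        self.title = title
        self.country = country
        self.contact_phone = contact_phone

    @property
    def is_valid(self):
        # Toy rule: a dinner needs a title and a numeric contact phone.
        return bool(self.title) and self.contact_phone.isdigit()

class DinnerTests(unittest.TestCase):
    def test_dinner_not_valid_when_phone_is_bogus(self):
        # Arrange
        dinner = Dinner(title="Test title", country="USA",
                        contact_phone="BOGUS")
        # Act
        is_valid = dinner.is_valid
        # Assert
        self.assertFalse(is_valid)

# Run the test case directly, outside a test runner, to demonstrate:
result = unittest.TextTestRunner().run(
    unittest.TestLoader().loadTestsFromTestCase(DinnerTests))
```

The point of the discipline is the same in any language: one block to set up state, one line to exercise the unit, and assertions that name the expected behaviour.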

Scheduling algorithm in GAMS, AMPL, MSF and lp_solve

In researching scheduling problems, I came across Nathan Brixius' blog post which utilised Microsoft Solver Foundation for a critical path project scheduling problem.  This was a step in the right direction, but my problem was resource constrained, without concern for the order in which tasks finished.  After some further research I found Erwin Kalvelagen from the Amsterdam Optimization Modeling Group, who had a relevant post on a conference scheduling problem that used GAMS to model a solution.  My goal was to port this to MSF, which was made easier by Erwin's great tutorial on OML – the modelling language behind Microsoft's Solver Foundation.

I started by porting the GAMS solution to AMPL – also a modelling language for linear optimization problems, and closer to MSF as far as syntax is concerned.  One advantage GAMS has over AMPL (and indeed MSF) is its richer language syntax and support for sparse matrices – see this thread for a discussion of more differences.  However AMPL is clean, and so was a good first step.

AMPL requires two files:

1) the data file which defines ‘sets’ and parameter values associated with the domain:

set TASKS := T1 T2 T3 T4 T5;
set PEOPLE := P1 P2 P3 P4;
set SLOTS := S1 S2;

param pref (tr):
         T1    T2    T3    T4    T5 :=
    P1    1     0     2     0     0
    P2    0     1     2     0     0
    P3    0     0     1     2     0
    P4    0     0     0     1     2 ;

param:   ntasks :=
    T1   1
    T2   1
    T3   1
    T4   1
    T5   1 ;

param:   minp   maxp :=
    T1   1      5
    T2   1      5
    T3   1      5
    T4   1      5
    T5   1      5 ;

param:   nrooms :=
    S1   5
    S2   5 ; 

2) The model file, which declares constraints and the objective function:

set TASKS;
set PEOPLE;
set SLOTS;

param pref {TASKS,PEOPLE} >= 0;
param nrooms {SLOTS} >= 0;
param ntasks {TASKS} >= 0;
param minp {TASKS} >= 0;
param maxp {i in TASKS} >= minp[i];

var x {t in TASKS, p in PEOPLE, s in SLOTS} binary;
var xts {t in TASKS, s in SLOTS} binary;
var xtp {t in TASKS, p in PEOPLE} binary;

subject to rule1 {t in TASKS, s in SLOTS}:
sum {p in PEOPLE} x[t,p,s] >= xts[t,s];
subject to rule2 {t in TASKS, p in PEOPLE, s in SLOTS}:
x[t,p,s] <= xts[t,s];
subject to rule3 {t in TASKS, p in PEOPLE}:
sum {s in SLOTS} x[t,p,s] == xtp[t,p];
subject to rule4 {t in TASKS}:
sum {s in SLOTS} xts[t,s] == ntasks[t];
subject to rule5 {t in TASKS}:
minp[t] <= sum {p in PEOPLE} xtp[t,p] <= maxp[t];
subject to rule6 {s in SLOTS}:
sum {t in TASKS} xts[t,s] <= nrooms[s];
subject to rule7 {p in PEOPLE, s in SLOTS}:
sum {t in TASKS} x[t,p,s] == 1;

maximize z: sum {t in TASKS, p in PEOPLE} pref[t,p] * xtp[t,p];

This can be executed by running the following commands inside the ampl.exe console:

ampl: model sched.mod;
ampl: data sched.dat;
ampl: solve;
MINOS 5.51: ignoring integrality of 70 variables
MINOS 5.51: optimal solution found.
42 iterations, objective 12
ampl: display x;
x [*,*,S1]
:      P1            P2            P3            P4     :=
T1     1             0             0             0
T2     8.01124e-18   1            -8.01124e-18   0
T3     1.13258e-16   1.13258e-16   1.13258e-16   0
T4     0             0             1             1
T5     0             0             0             3.08286e-16

 [*,*,S2]
:      P1            P2            P3            P4     :=
T1     1.40275e-16  -1.40275e-16   1.40275e-16   0
T2     0             2.2428e-16   -8.40105e-26   0
T3     1             1             1             0
T4    -2.21125e-16   0             2.21125e-16   2.21125e-16
T5     2.89604e-16   0            -2.78275e-16   1
;

Although we are solving a binary linear optimization problem, MINOS ignored the integrality of the variables and solved the continuous relaxation – so you can interpret the tiny numbers in the results as FALSE, and the 1's as TRUE.  The 3D matrix x then represents the cross section of person (P1–P4) attending task (T1–T5) during slot (S1, S2), which can more simply be represented as the following schedule:

     S1   S2
P1   T1   T3
P2   T2   T3
P3   T4   T3
P4   T4   T5
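As a sanity check, the schedule above can be verified against the preferences (and the reported objective value of 12) with a few lines of Python; the preference matrix is transcribed from the AMPL data file:

```python
# Non-zero preferences transcribed from the AMPL data file: pref[p][t]
pref = {
    "P1": {"T1": 1, "T3": 2},
    "P2": {"T2": 1, "T3": 2},
    "P3": {"T3": 1, "T4": 2},
    "P4": {"T4": 1, "T5": 2},
}
# Schedule found by the solver: person -> task attended in each slot
schedule = {
    "P1": {"S1": "T1", "S2": "T3"},
    "P2": {"S1": "T2", "S2": "T3"},
    "P3": {"S1": "T4", "S2": "T3"},
    "P4": {"S1": "T4", "S2": "T5"},
}

# rule 7: each person attends exactly one task per slot
assert all(len(slots) == 2 for slots in schedule.values())

# rule 4: each of the 5 tasks is scheduled (in exactly one slot here,
# since ntasks = 1 for every task)
tasks_run = {(t, s) for slots in schedule.values() for s, t in slots.items()}
assert len({t for t, s in tasks_run}) == 5

# objective: total preference score of attended tasks
z = sum(pref[p].get(t, 0) for p, slots in schedule.items() for t in slots.values())
print("objective =", z)  # prints: objective = 12
```

Every person lands a score of 3, giving the optimal objective of 12 that MINOS reported.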

Now we are ready to port this same solution to MSF.  Using the Solver Foundation add-in for Excel 2007, all data is defined as regions linked to cells on a sheet, and the model is declared in OML.  The syntax is similar, defining the 'sets' and 'decisions' (variables), with constraints expressed using Foreach and Sum.



Decisions[Integers[0,1],x[Tasks, People, Slots], xts[Tasks, Slots], xtp[Tasks, People]],

// rule 1: x(t,p,s)=0 for all p => xts(t,s) = 0
Foreach[{t,Tasks}, {s,Slots}, Sum[{p,People},x[t,p,s]] >= xts[t,s]],
// rule 2: any x(t,p,s)=1 => xts(t,s) = 1
Foreach[{t,Tasks}, {p,People}, {s,Slots}, x[t,p,s] <= xts[t,s]],
// rule 3: p visits talk t in any s
Foreach[{t,Tasks}, {p,People}, Sum[{s,Slots},x[t,p,s]] == xtp[t,p]],
// rule 4: each talk happens exactly ntalk times
Foreach[{t,Tasks}, Sum[{s,Slots}, xts[t,s]] == NTasks[t]],
// rule 5: task must have between min&max people
Foreach[{t,Tasks}, Sum[{p,People}, xtp[t,p]] <= MaxPeople[t]],
Foreach[{t,Tasks}, Sum[{p,People}, xtp[t,p]] >= MinPeople[t]],
// rule 6: timeslot can only hold up to 5 talks
Foreach[{s,Slots}, Sum[{t,Tasks}, xts[t,s]] <= MaxRooms[s]],
// rule 7: p can only visit one talk in each time period
Foreach[{p,People}, {s,Slots}, Sum[{t,Tasks}, x[t,p,s]] == 1],
z == Sum[{t,Tasks},{p,People},Pref[p,t]*xtp[t,p]]
],
Goals[Maximize[z]]
]

The same model could also be solved directly in C# code, which will be an exercise for another post.  I did however take the solution one step further by creating an lp_solve implementation in the form of an '.lp' file, which can be solved through the lp_solve IDE.  This format is much more verbose as it doesn't abstract the data, instead dealing with the 70 binary variables for each cell of each matrix (x, xtp, and xts).  This simple solution is solved in 20ms – faster than the other, higher-level libraries.

Download the source for the solutions in GAMS, AMPL, MSF /w Excel, and lp_solve.

Billy McCafferty and his S#arp Architecture

I have been following Billy McCafferty's S#arp Architecture project for some time now – a best-practice Domain Driven Design approach to developing ASP.NET MVC web apps.  It has seen a number of iterations, the latest including configuration with Fluent NHibernate and an independent ServiceLocator.

In his recent article on InfoQ he discusses some of the motivations behind his TDD/DDD implementation.

What’s currently lacking, at least in the world of .NET web development, is a common architecture and foundation for application development which combines best of breed technologies and techniques, using recent technologies based on proven practices, while taking into account the availability of high quality tools developed by the open source community. S#arp Architecture is a response to this need. The open source S#arp Architecture attempts to leverage the proven practices described in this article with carefully selected tools to increase development productivity, while enabling a high level of quality and maintainability

He also makes judicious use of the T4 templating support built into VS2008 to codegen files from the Model objects through to Views and Controllers.  A great kick-start for any new project – well worth a read!

Cloudera building a business around open source Map Reduce

The heavy-hitting ex-executives behind startup Cloudera are banking on a business based around Hadoop, the open source Map Reduce implementation, with a distribution capable of running on Amazon's EC2.  Google is credited with popularising (and inventing) Map Reduce and has been tuning its own implementation for many years.  It gave insights into the origin and future research directions in a round table video last year.

Increasingly, companies need to make sense of terabytes or even petabytes of data.  This information is stored across many machines and many disks, and needs distributed algorithms to sift through it in any reasonable time.  This is where Map-Reduce comes in.
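The programming model itself is small: a map function emits key/value pairs, the framework groups values by key, and a reduce function folds each group.  A single-process Python sketch of the classic word-count job – just the model, with none of the distribution that Hadoop or Google's implementation provide:

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in the input chunk."""
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: fold all the values for one key into a single count."""
    return key, sum(values)

documents = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = (pair for doc in documents for pair in map_phase(doc))
counts = dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
print(counts["the"])  # prints 3
```

Because map runs independently per chunk and reduce independently per key, both phases parallelise across machines – which is the whole trick.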

Interestingly, Microsoft has taken a step back from this direction by deciding that its SDS offering should support standard 'relational' features, in effect turning the product into a hosted SQL Server cloud.

It has, however, been active in this research field.  It released its functional programming language F#, and it runs its ad serving on Dryad – a distributed execution engine.  DryadLINQ combines the power of this engine with the simplicity of LINQ, by creating a SQL-like execution plan for distributed processing – very cool!

Large-scale distributed processing software typically runs on many low-grade Linux servers running open source software, so that licensing costs are kept low.  However, with the army of MS developers out there, companies are springing up to provide software that makes the most of idle cycles on Windows boxes around the network.  Manjrasoft, a recent graduate from Melbourne University's GridBus laboratory, has released an alpha of its Aneka software – a .NET Map Reduce implementation.

Regression testing with PEX

A new project from Microsoft Research, Pex is a smart tool integrated directly into Visual Studio which helps you generate test suites with maximum code coverage.  It analyzes your code and suggests various arguments to ensure all paths of your code are tested.  These arguments and return results can then be turned directly into .NET test methods, either using the standard VS test framework, or NUnit/MbUnit via extensions.

Also recently released is another free testing tool, NP .NET Profiler, which can be hooked into .NET processes such as IIS to monitor memory and performance bottlenecks, as well as catch first-chance exceptions.  This low-impact tool can be used in production environments and even has its own query interface for getting to the heart of problems.

Microsoft Solver Foundation Express /w Gurobi

Microsoft has released v1.1 of its Solver Foundation, which introduces a plug-in model to support running against any number of solvers.  The express edition ships with a throttled Gurobi solver, which has proven to be a strong competitor to industry heavyweights such as the recently acquired CPLEX from ILOG (IBM).

As part of this release, samples are included for swapping in different solvers, including the open source lp_solve mixed integer linear programming (MILP) library.  It's worth noting that the samples are installed into your 'My Documents' directory, if you find yourself scratching your head!

The Excel add-in makes it easy to experiment with models using an Optimization Modeling Language (OML) editor data-bound to cells, and for developers the C# object model supports LINQ data-binding.

Bringing music to the cloud

After a well deserved break, I am back in Sydney enjoying good coffee, food and summer weather!

On my travels through Germany I saw an advertisement for SoundCloud in a music magazine.  It's basically Flickr for musos, done really well.  And although Web 2.0 is dead, or at least passé these days, I see it as a shining example of how to build a great web app.

SoundCloud allows you to upload your own tracks or mixes to your profile, or post to your friends' drop boxes.  It has a great visualisation of your track, and allows people to comment at various points throughout your track – at the breakdown, for example.  You can also follow other artists, in a similar fashion to Flickr.

Not only is it a killer Web 2.0 application, it has a complete API able to support mashups like RadioClouds and the pure web app CloudPlayer.

I see this as a big year for musicians; another highlight being Ableton's upcoming release of Live 8 with its hotly awaited Max/MSP integration – bringing sound design to the masses.  Bring it on!

Open source .NET CMS, Oxite

There are numerous open source CMS platforms out there, WordPress being one of the most recognizable (and hosting this blog).  But when it comes to .NET the choice is more limited.  Microsoft's Channel 9 community platform started its life from the Telligent Community Server codebase.  Since then that team has built a new blog/CMS platform, Oxite, which powers MIX Online.

The project targets ASP.NET MVC developers and leverages LINQ to SQL for persistence, but hides the implementation behind a clean repository interface.  There is no end-user installer, so without VSTS you will need to attach the database and configure a login yourself.  Boasting 'standards compliant' features such as RSS and Trackback, it makes good use of modern language features such as extension methods to keep the front end clean.  Although ASP.NET MVC is still not out of beta, it is proving popular, and powers some high-traffic sites such as stackoverflow.com.

Microsoft's other CMS platform, SharePoint, is getting behind CMIS.  The Content Management Interoperability Services (CMIS) specification leverages SOAP, REST and Atom to enable communication with and between ECM systems.  Although Roy Fielding isn't impressed by the initiative, in a Web 2.0 environment a language-neutral API is almost as important as the website – so it should make for some good vendor competition.  Alfresco in particular has recently released beta support for CMIS on its Java-based platform.

Also worth a look: the .NET CMS Umbraco, and the popular Radiant CMS built on Rails.