Apr 22

I have an idea for a trading app involving the in-play soccer markets. Now this isn’t my traditional territory - basically I prefer the long-term soccer stuff as a) there’s more ‘structure’ to the markets [just like fixed income] and b) the pace is slower; prices only move after each game [think chess as opposed to video games; more appropriate at my age]

However in-play is where all the action is, and there’s a huge amount of feed info available. On a Saturday afternoon, Betfair might be showing 10 in-play games at once, each with 4 or more markets; so 40 markets to cover, all changing continuously. How can I grab, parse and persist all that data ?

I started with a simple Python scraper, but the best performance I could get was about 2 seconds per feed [connect to server, download page, parse page, store in database]. Not good enough - with 40 markets to cover, you’re going to get killed reloading each one only once every 80 [40*2] secs. I could convert to Java to get a performance pickup, but that’s not really the problem here .. I need those feeds to execute in parallel rather than sequentially.
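The gap between polling sequentially and polling in parallel can be sketched in a few lines. This is a minimal illustration in modern Python [concurrent.futures, which postdates this post], not the scraper described above: fetch_feed is a hypothetical stand-in that just sleeps to simulate the connect/download/parse/store round trip, and the market ids are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_feed(market_id, delay=2.0):
    """Hypothetical stand-in for one feed round trip (connect, download, parse, store)."""
    time.sleep(delay)  # simulate network and parsing latency
    return {"market": market_id, "fetched_at": time.time()}


def poll_sequential(market_ids, fetch=fetch_feed):
    """One feed after another: total time is roughly len(market_ids) * delay."""
    return [fetch(market_id) for market_id in market_ids]


def poll_parallel(market_ids, fetch=fetch_feed, workers=40):
    """Feeds fetched on a thread pool: with workers >= len(market_ids),
    total time is roughly one delay."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as market_ids
        return list(pool.map(fetch, market_ids))
```

Calling poll_parallel on 40 made-up market ids should take about as long as a single feed, where poll_sequential takes the full 80 secs. Threads are enough here because the work is almost all waiting on the network; Erlang processes are the same idea, only much lighter.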

Cue Erlang, and Joe Armstrong’s book.

I did try it six months ago but got bogged down; the whole concurrency-oriented aspect is pretty alien to anyone coming from Ruby or Python. But now I seem to be making better progress; it’s starting to click. For anyone reading the book, I would say:

  • Start by figuring out how to compile and run a Hello World program; go to Chapter 6 and use Rake rather than Make [Joe: using Make files is going to scare off 99.9% of web folk]
  • Don’t get bogged down in all the language specifics of Chapters 1-5; I’d give Chapters 1-3 a cursory read and move on
  • Focus on the concurrency aspect; read and re-read Chapters 7 and 8, and code up the examples
  • Section 16.1 ['The Road to the Generic Server'] has some good examples
  • The code provided for the book is helpful, but terribly organised! All the different examples are jumbled in together with no kind of packaging, and one Makefile to compile them all at once
  • In Chapters 10/11, it’s worth pointing out that the lib_chan module is one of Joe’s own [and is included in the src code], and isn’t part of the standard Erlang distribution
  • If you want to compile and run the IRC Lite example [Chapter 11], you need the following files:
    • chat_client.erl
    • chat.conf
    • chat_group.erl
    • chat_server.erl
    • io_widget.erl
    • lib_chan_auth.erl
    • lib_chan_cs.erl
    • lib_chan.erl
    • lib_chan_mm.erl
    • lib_md5.erl
    • mod_chat_controller.erl

Once you get into it, the syntax seems quite familiar. It’s dynamic, so not so far from Ruby and Python. The recursive list-processing style [and the map/filter/foldl functions in the lists module] is very similar to Python’s filter/map/reduce functionality, or Ruby’s Enumerable module. You can reference functions as first-class objects, so more Python there.

The key example for a beginner IMHO is right at the end of the book on p366; how to take a list-processing action that would normally be processed sequentially, and execute it in parallel. I’ve reproduced it here, with the Ref reference removed [not necessary for demonstration]. If anyone thinks I’m just copying and pasting code from the src into the blog, I’d like to point out that I arrived at this code snippet via an independent process and with a not inconsiderable amount of mental strain!

To compile: erlc -o ebin src/pmap.erl
To run: erl -pa ebin -noshell -s pmap main -s init stop

It’s taken a few days, but I think I’m beginning to get a grip on Erlang’s mojo.

written by justin

Apr 10

Lots of good comment on AppEngine’s newly-announced support for Java:

What’s amazing here is how quickly the JRuby on Rails, Helma etc communities have got round to porting their frameworks to AppEngine; heck, it’s only three days old. I think the ThoughtWorks guys had some kind of early access to the program, but still, you have to admire how quickly it’s been done. I had assumed folk would stick with the default stack provided by Google, just like I do with their Python stack; obviously there’s a lot of pent-up interest out there. [Alternatively, given the paucity of cheap Java hosting alternatives currently available, perhaps I shouldn't be surprised]

I have previously been disparaging about the point of porting Django onto Python AppEngine; however I totally get the importance of porting JRuby on Rails to the new platform. I personally don’t have any interest in the default JSP/Servlet/JDO stack provided by Google; however the JVM provides a mature, stable, fast base on which to support higher level frameworks - JRuby on Rails, Grails, Helma, AppJet, take your pick - maybe if we’re lucky we’ll see some Clojure- or Scala-based stacks emerge.

And here’s the real importance of Java for AppEngine - JVM support opens up a cornucopia of different cloud-based programming models, in a way that Python support doesn’t [I'm still a Python fan BTW]. Now you have the option of coding all your front-end stuff in a nice high level language, then dropping down to Java for all the performance intensive stuff, all in a single application - no more calling out from AppEngine to a separate Java server. A bit like writing a CPython desktop program and then re-writing the slow bits in C - except that Java gives you web portability in a way you can forget about with C.

Reminds me a bit of Microsoft’s CLR - one runtime, many languages supported. Where on earth have Microsoft been in all this ?

Forget Java, it’s all about the JVM.

BTW I keep reading about what a wonderful piece of engineering GWT is. For the record, I just don’t get it; I have no idea why you’d want to generate JavaScript from Java, especially now with jQuery being so damn good. Frankly I’d like someone to come up with a JavaScript-to-Java compiler - now that really would be worth looking into.

written by justin

Apr 06

Having ranted about the inadequacies of Django templates, I started to realise I’m pretty much stuck with them for AppEngine development given they are the default option. So … how to get SmartIf and similar to work under AppEngine ?

As I was digging through the webapp.template source code, it was interesting to find that the AppEngine designers have pretty much the same view of templates as I do -

The main purpose of this module is to hide all of the package import pain you normally have to go through to get Django to work. We expose the Django Template and Context classes from this module, handling the import nonsense on behalf of clients.

Hah! Not alone!

Django requires you to register your extensions with a template ‘registry’. If you look in the SmartIf src, you’ll see the following.

from django import template
register = template.Library()

You need to replace this with an AppEngine registry:

from google.appengine.ext.webapp import template
register = template.create_template_register()

Next, the SmartIf docs say you should include the following declaration at the top of your templates to replace the built-in ‘if’ with the new smart version:

{% load smartif %}

Don’t do this! You will end up in some kind of Django namespace hell. Instead, register the new template library with the AppEngine template module.

template.register_template_library(path_to_module)

Finally, don’t comment out the line ‘from django import template’; SmartIf uses the template.TemplateSyntaxError class, which exists in the Django template module but not the AppEngine equivalent. So you still need to import the Django template module; just make sure you do it after you’ve registered the extension with the AppEngine template.

Done!

written by justin

Mar 30

  • I’ve been talking to my friends at Smarkets about market making in live soccer games [Match Odds, Correct Score etc]
  • I visited a leading bookie’s HQ recently and watched one of their traders managing the book in a live soccer game; he let slip that a lot of the modelling is done using Poisson processes
  • I came across this video by the founder of BetAngel software, explaining how in-play soccer prices can often drift in counter-intuitive ways.

All of which generated a minor firestorm in my brain.

With respect to the last point, let’s take a typical ‘Correct Score’ soccer market with contracts for each possible outcome - 0-0, 1-0, 0-1, 1-1 etc. When the match starts, the score is 0-0. Let’s say 20 minutes elapse, and no goals have been scored. How would you expect the price for each of these contracts to have moved ?

Intuition says that the price of the 0-0 contract should have increased [a 0-0 draw being more likely], whilst everything else should have decreased.

Not so!

[Kudos to Google Charts for the graph; fantastic utility]

The chart shows the evolution of the prices of four contracts across the 90 minute game, assuming no goals are scored. Not surprisingly, 0-0 [red] increases over the game to 100% by the end of the match. More interestingly, 1-0 [green] actually increases from an initial 17% at the start of the game to approx 30% at around 55 minutes, before decaying away to zero.

Counterintuitive ? I think so. What’s happening here ? Well, there are two distinct things going on:

  • Probability of 0-0 is increasing all the time
  • There are a whole range of other probabilities not shown by the graph [2-0, 2-1, 5-0 etc]; initially these have small but non-negligible values. However by 55 minutes, if no goal has been scored, their probabilities will have decayed practically to zero

For the first 55 mins, I’d wager that the second effect is declining at a rate greater than the first effect is increasing - so all the other correct scores [1-0, 0-1, 1-1] tend to have positive biases in the early part of the game [given that the sum of the prices of the contracts must add up to 100%].

Interesting, huh ? I’m wondering if this is the start of a possible trading strategy. I haven’t traded the in-play markets before, but the first thing I would want is some kind of model for assessing fair value at any point in time - any deviation from these prices might represent some kind of opportunity. Lots more work to be done here, I’ll blog some more later.

But back to the modelling - how is it done ?

Essentially I modelled the probability of each team scoring a goal as a Poisson process. From Wikipedia:

“The Poisson distribution .. expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event.”

Sounds reasonable for modelling goals - all you need is the expected number of goals a team will score over the course of the game. Once you have this number the Poisson function will give you the probability of a team scoring no goals, 1 goal, 2 goals, n goals etc. You repeat the process for each team, then the probability of a particular score [eg 1-0] is the product of the two probabilities in question (Team A [1 goal] x Team B [0 goals]), which assumes the two teams score independently of each other.
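As a rough sketch of that calculation [function names are my own, and the two teams' goals are assumed to be independent]:

```python
from math import exp, factorial


def poisson_pmf(goals, expected_goals):
    """Probability of exactly `goals` goals when the mean is `expected_goals`."""
    return expected_goals ** goals * exp(-expected_goals) / factorial(goals)


def score_probability(goals_a, goals_b, expected_a, expected_b):
    """Probability of a correct score, treating the two teams' goals as independent."""
    return poisson_pmf(goals_a, expected_a) * poisson_pmf(goals_b, expected_b)


def score_grid(expected_a, expected_b, max_goals=10):
    """Probabilities for every score up to max_goals each; they sum to just under 1."""
    return {
        (a, b): score_probability(a, b, expected_a, expected_b)
        for a in range(max_goals + 1)
        for b in range(max_goals + 1)
    }
```

For example, score_probability(1, 0, 1.5, 1.0) gives the chance of a 1-0 result when Team A is expected to score 1.5 goals and Team B 1.0; both figures are invented for illustration.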

To obtain the graphs above, you simply repeat the calculation at regular intervals during the match [say every minute]; the interesting question is then how to scale the ‘expected number of goals’ parameter for time. Unfortunately I can’t find the link, but I came across a site which suggested the correct way to do this was linearly with respect to time - ie if the expected number of goals over 90 mins is 1.5, it’s 0.75 over 45 mins. Again, sounds reasonable; I say ‘interesting’ as this contrasts with dealing with normal distributions / Brownian motion in finance, where typically you scale with respect to the root of time.
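Here is a minimal sketch of that repeated calculation, assuming linear time scaling and that the score is still 0-0 at each point. The two full-match rates are hypothetical, picked only so that the 1-0 curve has roughly the rise-then-fall shape described above; they are not the figures behind the chart.

```python
from math import exp, factorial


def poisson_pmf(goals, expected_goals):
    """Probability of exactly `goals` goals when the mean is `expected_goals`."""
    return expected_goals ** goals * exp(-expected_goals) / factorial(goals)


def remaining_expected_goals(full_match_rate, minute, match_length=90.0):
    """Linear time scaling: 1.5 goals over 90 mins becomes 0.75 over the last 45."""
    return full_match_rate * (match_length - minute) / match_length


def price_if_goalless(final_a, final_b, rate_a, rate_b, minute):
    """Fair price of a final score at `minute`, given the score is still 0-0."""
    mean_a = remaining_expected_goals(rate_a, minute)
    mean_b = remaining_expected_goals(rate_b, minute)
    return poisson_pmf(final_a, mean_a) * poisson_pmf(final_b, mean_b)


# Hypothetical full-match scoring rates for Team A and Team B
RATE_A, RATE_B = 2.1, 0.47

for minute in (0, 20, 55, 80, 90):
    nil_nil = price_if_goalless(0, 0, RATE_A, RATE_B, minute)
    one_nil = price_if_goalless(1, 0, RATE_A, RATE_B, minute)
    print("%2d mins  0-0: %5.1f%%  1-0: %5.1f%%" % (minute, 100 * nil_nil, 100 * one_nil))
```

In this model the 1-0 price peaks at the moment the combined expected goals for the rest of the match drop to 1, so the higher-scoring the fixture, the later the peak.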

I guess I don’t have enough experience with Poisson distributions at this stage to understand why [or have forgotten more A level maths than I can remember]. Looks promising, but I’d like to understand exactly why the Poisson process occurs, and why it might be a good candidate for modelling goals.

Back to Wikipedia.

written by justin

Mar 27

Coming soon to a cloud near you, apparently.

Thoughts :-

This already exists with these guys, who must be pissed at Google entering their market, although they say they always expected it. Nice thing about Stax is that you’re not bound to a single framework type; there are templates for standard JSP projects, Jython projects, JRuby on Rails projects. Will AppEngine/Java be quite as flexible ? Based on the Python implementation I’d say probably not.

Which web stack will they go for ? I imagine, like the Python version, Google will give you a default stack to get you up and running quickly. With the Python version you get

Of course you’re not stuck with these, but I imagine a large proportion of folk playing with the stack stick with the default options [why go to the bother of installing a full Django stack on AppEngine when a) you get the templates for free already and b) you can't use Django models as you're stuck with BigTable ? All that work just to get some controller logic!]

Last time I looked [2006], Java was plagued by an overabundance of options at each of the MVC layers, and a multitude of different frameworks binding different MVC combinations together, with no obvious market leader. AppEngine is going to give a massive stamp of approval to whatever stack it chooses as its default. My guess would be something proprietary at the model level [to deal with BigTable], no idea at the controller level, maybe simple JSP for the views.

Just please God don’t let them use GWT for the view.

This is Google going for the enterprise market. Who uses Python in the enterprise ? Undoubtedly some, but the Java market must be orders of magnitude bigger.

How does the individual developer benefit ? If the Python example is anything to go by, Google will design something which is extremely simple to get up and running. No more stitching different frameworks together, no more looking for cheap Java hosting, no more reams and reams of boilerplate code. Hopefully you’ll be able to focus on getting your app code running, and getting it out there quickly. If you’re lucky, maybe your codebase will only be 3x the size of the Python version!

.. which would actually be a big win. The one serious reason to use Java for webapps is the performance of the JVM versus something interpreted like Python. A lot of my recent enterprise stuff has Python/Ruby webapp front ends, but they have to call a separate Java server for anything performance intensive; a decent Java web stack would mean you might be able to integrate the two layers together and do away with all the request/response marshalling code between them. So here’s hoping.

written by justin

Mar 23

From Django design philosophies:

The template system intentionally doesn’t allow the following:

* Assignment to variables
* Advanced logic

The goal is not to invent a programming language. The goal is to offer just enough programming-esque functionality, such as branching and looping, that is essential for making presentation-related decisions.

The Django template system recognizes that templates are most often written by designers, not programmers, and therefore should not assume Python knowledge.

Bollox to that!

Let’s say I’m writing a finance app and want to present someone’s banking transactions in a table. One of the columns is likely to be ‘Amount’, and could be positive (a credit) or negative (a debit). I want credits to appear green, debits red.

The obvious way to do this is to

  • Create td.credit/ td.debit CSS classes, with different :color attributes
  • Render td.credit if :amount > 0, td.debit if :amount < 0

For the second of these conditions I need simple greater_than / less_than logic available in the template. Which Django doesn’t have. The nearest you get is the ‘ifequal’ statement. So your only way around this is to determine the CSS class at the controller level and pass it into the template.
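For what it’s worth, the controller-level workaround looks something like the sketch below. The function names and the shape of the transaction dicts are hypothetical; the point is only that the greater-than / less-than decision has to be made in Python before the template ever sees the data.

```python
def amount_css_class(amount):
    """Pick the CSS class for a transaction amount: credits green, debits red."""
    if amount > 0:
        return "credit"
    if amount < 0:
        return "debit"
    return ""  # zero amounts get no special styling


def annotate_transactions(transactions):
    """Return copies of the transaction dicts with a 'css_class' key added,
    so the template only has to print it, e.g. <td class="{{ t.css_class }}">."""
    return [
        dict(transaction, css_class=amount_css_class(transaction["amount"]))
        for transaction in transactions
    ]
```

It works, but the view now depends on the controller knowing the names of the CSS classes.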

WTF ?!

The impression here is that Django’s designers are paranoid about protecting designers from anything that looks like programming logic. The unfortunate results of this decision are that

  • The abstraction leaks; you end up with presentation logic in the controller in all but the most trivial app.
  • Your template language ends up looking like a half-assed version of a real programming language (.. just enough programming-esque functionality .. )

Reminds me of Greenspun’s Tenth Rule:

Any sufficiently complicated template framework [originally 'C or Fortran program'] contains an ad hoc, informally-specified, bug-ridden, slow implementation of a real programming language [originally 'half of Common Lisp'].

This is where Rails wins big - you get the full power of Ruby within the template framework. Better still, you don’t have to learn a separate language for the view layer - it’s Ruby throughout the whole MVC stack. And I’m guessing this is where Python has a problem - with its fixation on whitespace, raw Python is spectacularly ill-equipped to act as a glue language within the messy world of HTML tags, template tags etc.

Come on Django, sort it out.

[Postscript: I happened upon SmartIf, a template helper which sorts out a lot of my immediate problems. But the criticism still holds - and it's very interesting that Simon Willison seems to feel the same way].

written by justin

Mar 18

OK, my take on cloud computing for the independent developer.

I refer the interested reader to a recent GigaOm article on cloud computing’s three separate models [infrastructure, platform, software as a service]. I’m solely interested in the platform as a service model; third party platforms to which I can deploy webapps and have them scale [relatively] painlessly to a large audience at an affordable cost; the sort of thing that’s really only been available to the small guy with Apache/PHP until recently [if you've ever tried to deploy a Rails app prior to the launch of Heroku or Passenger, you'll know what I mean].

There’s a slight apples-to-oranges thing going on here. AppEngine is fairly and squarely a platform-as-a-service model, 100% owned, developed, hosted by Google. Heroku is a small startup which uses Amazon’s EC2 infrastructure-as-a-service model to host their Rails deployment platform; one layer of cloud computing supported by another. For my purposes, they are direct competitors.

I should also say this isn’t an exhaustive survey, just my personal preferences and experience. I’m sure other folk in this space [Appjet, Stax] have interesting things to say - I just haven’t got around to looking at them yet.

Language

Simple shootout - Python [AppEngine] vs Ruby [Heroku]. I started with Python in 2003 - what a breath of fresh air after years of Java. Loved the dynamic nature, the whitespace, the libraries, the whole nine yards. Convinced I would never need anything else, at least until I started looking for a half-decent web framework [2005 - ie pre-Django]. At which point I was casting envious glances over at Rails .. dammit, I was just going to have to learn Ruby as well.

And whilst it started out looking like a poor imitation of Python [who needs all these 'end' keywords ?], gradually it gets under your skin. You start to love writing ‘array.size’ rather than ‘len(array)’, or ‘-1.abs’ instead of ‘abs(-1)’; and you start to wonder why Python needs all these keywords littered across the place. And when you finally figure out what code blocks and the Enumerable module are all about, you realise there will never be any going back.

Yeah, I know Python has list comprehensions, lambda functions etc; it’s just they feel much more of a hack if you’re used to Ruby’s code blocks.

So .. one up to Heroku for Ruby.

Framework

Rails [Heroku] versus .. hmm, something a little more bare-bones [AppEngine].

If you come from the Rails world then AppEngine presents you with something decidedly more stripped down. The familiar MVC structure is there [except in AppEngineLand they're called models, templates and scripts]; it’s rather like working with Rails but without all the helpers.

For example if you’re working in the controller layer, there’s none of the nice Rails respond_to code that helps you differentiate between ‘Accept’ header requests; you have to build it yourself. There’s none of the code that helpfully parses HTTP parameters into easy-to-use Hash structures; you get some basic functions allowing you to query for parameters by name, and that’s about it.

Similarly at the persistence level; there are no ActiveRecord-style belongs_to/has_many relationships allowing you to define model relationships in a sexy way; you can do it [using Reference Properties]; it’s just it feels more basic.

And the weird thing is .. I kind of like it. Though I think Rails is a wonderful beast, there are aspects of the helpers layer which really don’t seem to work to me .. leaky abstractions, I think Joel would call them. Such as being able to define belongs_to/has_many relationships at the model level, but still having to remember to define a parent_id field at the database level. WTF ?! Talk about something poking through the abstraction layer! Either do this stuff for me properly, or not at all.

So .. a close call, but AppEngine wins here.

Persistence

Traditional relational database for Heroku, BigTable for AppEngine; old versus new.

It’s not quite that simple however as Rails’ ActiveRecord takes a lot of the pain out of dealing with the database; it’s certainly the cleanest ORM framework I’ve ever used; a lot of the time it really does feel like you’re dealing with an in-memory structure - no ‘database-iness’ at all.

Unfortunately, as I mentioned earlier, the abstraction leaks in a couple of places. You still have to manage your foreign keys manually. You also can’t just make arbitrary changes to your database schema; you have to ‘migrate’ the database.

Now I can see what migrations are trying to achieve - give you a way to modify your schema whilst preserving your data - which is a worthwhile goal - but they also introduce an extra layer of complexity. I haven’t personally come across a situation where I needed to modify the schema in situ; I tend to mess with the schema in development [when the data doesn't matter] and then stick with that schema in production. So I tend to view Rails migrations as a layer I would rather do without.

And although the foundations of BigTable are decidedly un-relational, dealing with it at the developer level feels very like dealing with a migration-less Rails datastore. No, the ORM layer isn’t nearly as sexy or complete as ActiveRecord; but you benefit from being able to hack the database structure much more easily in development. No migrations to design, run and then manage; just simply add a column definition to your model.

Certainly more bare bones, but preferable in my opinion. Another win for AppEngine.

Tools

  • Source Control. The Heroku API uses Git; if you use Heroku then your source code is living on Heroku with your app; there’s no choice here. AppEngine has no such restriction; you can keep your source code wherever you like [GitHub etc]. I have a marginal preference for the AppEngine model; I don’t really see why my source code should have to reside with my app; it’s nicer to have the option.

  • Deployment. Pretty good in both cases - with Heroku you simply run ‘git push’ to push your source code / deploy your app; AppEngine has a custom script (appcfg.py) which you run to deploy your app. Not much to choose between them here.

  • Database management. The yardstick here is Rails ‘script/console’, which gives you an IRB shell right into the database in development mode - very nice. Until recently neither AppEngine nor Heroku offered anything as clean as this; both have an HTML-based ‘interactive window’ into which you can enter code; they work, but it’s not very clean. But stop press! The new Heroku API has a shell command ‘heroku console’, which gives you direct access into your production database from the command line - exactly what the doctor ordered!

Given each platform is only 12-18 months old right now, the tools are still fairly primitive; there’s a lot of room for improvement on both sides.

Verdict ? Draw.

Conclusion

Aaarghhh OK decision time. This is tricky. Truth is I like both of these platforms. There was a time, not so long ago, when deployment was *the* major headache for the small guy like me. Deploying Rails - Apache ? Mongrel ? FastCGI ? Lighttpd ? Total acronym soup. No consensus in the community about the right way to do it. A total nightmare. So bad, I almost considered moving to PHP (yuk) just to ease the deployment pain.

These platforms are the way out of that quagmire. Now you have a simple, pushbutton way of deploying Python and Ruby webapps, with the added promise of [relatively] painless scaling. They’ve both taken away a huge headache, and there’s a lot to like about each.

But we need a winner. So I’ve decided to plump for … Heroku! Why ? Dunno. Hard to call. AppEngine won on a lot of the points above. I think at the end of the day it’s my simple preference for coding in Ruby over Python. Plus the recent deployment of the latest Heroku platform, which looks like a major leap forward. Perhaps in the end it’s a British preference for the plucky underdog over the 800 pound gorilla.

Whatever. They’re both worthy platforms and I hope [for my sake] they continue to prosper, innovate and breed new competition. Here’s to them both.

written by justin

Mar 10

I wrote a couple of extremely dull posts about Java and Jython, how much I hated using Java etc, not very interesting.

I went away and wrote a couple of projects in Jython; and having thought Jython was far too slow for production purposes, I had a lot of success with the following strategy:

  • Write project in Jython
  • Figure out performance bottleneck
  • Re-write bottleneck in Java
  • Err, repeat

Genius. The projects were written with all the fluidity of Python, but with the performance of Java. Why didn’t I think of this in 2002 ??

Anyway along the way, I obviously had to compile some Java classes. If you love writing in Python then let’s just say that reaching for Ant at this point is not the most obvious step. You’re busy working in Jython - how can you seamlessly compile your Java classes ?

Here’s a couple of handy hints:

Calling javac dynamically

So you’re busy writing your Jython build script and you need to compile and distribute your Java classes. Normally you call javac from the command line; how do you do it now you’re in a Jython script ?

Fortunately the javac compiler is a Java class and so can be called directly from Jython; the trick is in figuring out the slightly arcane list of args that have to be passed to it.

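import jarray
from java.io import ByteArrayOutputStream, PrintWriter
from java.lang import String

def compile_java(files, classpath, dest):
    # javac is itself a Java class, so Jython can call it directly
    from com.sun.tools.javac import Main as compiler
    for filename in files:
        # the same arguments you would pass to javac on the command line
        args = jarray.array(['-cp', ':'.join(classpath), '-d', dest, filename], String)
        error_log = ByteArrayOutputStream()
        writer = PrintWriter(error_log)
        status = compiler.compile(args, writer)
        writer.flush()
        # compile() returns 0 on success and a non-zero status on failure
        if status != 0:
            raise RuntimeError(error_log.toString())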

Where

  • ‘files’ is a python list of filenames
  • ‘classpath’ is a python list of classpath entries

[The above code was tested on Ubuntu; if you're on Windows you'll need to change the ':' classpath separator to ';']

Dynamic classpath

Great! You can now compile your Java classes dynamically. But let’s say they depend on some external jar files which aren’t contained in the global CLASSPATH; how do you dynamically modify the local classpath to include these jars [so you can compile successfully] ?

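import jarray
from java.io import File
from java.lang import Class, ClassLoader, Object
from java.net import URL, URLClassLoader

def load_resources(resources):
    # addURL is protected, so fetch it by reflection and make it accessible
    addURL = URLClassLoader.getDeclaredMethod("addURL", jarray.array([URL], Class))
    addURL.accessible = 1
    # assumes the system class loader is a URLClassLoader
    cl = ClassLoader.getSystemClassLoader()
    for resource in resources:
        url = File(resource).toURI().toURL()
        addURL.invoke(cl, jarray.array([url], Object))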

Where ‘resources’ is a python list of items to be included in the classpath.

[Original Java code here]

Bonus Hint

Since you’re building with Jython, you have access to all the standard Python library modules. Do yourself a huuuuge favour and write unit tests with Python’s unittest suite as opposed to JUnit - you won’t regret it.

written by justin

Mar 06

I’m buying MK Dons Season Points with Sporting Index.

MK Dons currently have 62 points after 32 games played; they have 14 games to play this season. Sporting Index currently make the Season Points market 82.5 - 84; if I buy them at 84, then I will be in profit provided they make more than 22 points [84-62] from the remaining 14 games; an average of 1.57 points per game.

Every time MK Dons play a match, Betfair will make a liquid ‘Match Odds’ market - here are the prices for Swindon vs MK Dons this Saturday [Mar 7th 2009]. Using the prices from the Match Odds market, I can calculate the market-implied probabilities of MK Dons winning, drawing or losing that match, as follows [using mid-market prices]

  • Probability of MK Dons win = 1 / Away Win Price = 1 / 2.2 = 45.45%
  • Probability of MK Dons loss = 1 / Home Win Price = 1 / 3.55 = 28.17%
  • Probability of draw = 1 / Draw Price = 1 / 3.75 = 26.67%

Note these probabilities sum to very close to 100% [45.45% + 28.17% + 26.67% = 100.29%] - there’s a bit of inaccuracy as I took simple mid-market prices, but nonetheless the market is pretty efficient.

I can now calculate how many points the market expects MK Dons to gain from this match; remember it’s 3 points for a win, 1 point for a draw, nothing for a loss. So expected points is

[45.45% * 3] + [26.67% * 1] + [28.17% * 0] = 1.63
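The same arithmetic as a small sketch, using the mid-market prices quoted above [the function and variable names are my own]:

```python
# Mid-market decimal prices for Swindon vs MK Dons, from MK Dons' point of view
PRICES = {"win": 2.2, "draw": 3.75, "loss": 3.55}
POINTS = {"win": 3, "draw": 1, "loss": 0}


def implied_probabilities(prices):
    """Market-implied probability of each outcome: 1 / decimal price."""
    return {outcome: 1.0 / price for outcome, price in prices.items()}


def expected_points(prices, points=POINTS):
    """Probability-weighted league points: 3 for a win, 1 for a draw, 0 for a loss."""
    probabilities = implied_probabilities(prices)
    return sum(probabilities[outcome] * points[outcome] for outcome in points)


print("sum of implied probabilities: %.4f" % sum(implied_probabilities(PRICES).values()))
print("expected points: %.2f" % expected_points(PRICES))
```

The implied probabilities are left un-normalised here, as in the text; dividing each by their sum would remove the small overround.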

Fine for this Saturday’s match against Swindon; but what about the other 13 matches ? Unfortunately Betfair don’t provide markets for any match more than a week in advance. But armed with a fistful of Betfair Historical Data, a good book and the computing equivalent of the Swiss Army Knife, I think I can predict the expected points for MK Dons in all their remaining 14 matches:

Opposition        Home/Away   Expected Points
Crewe             Home        2.27
Walsall           Home        2.16
Brighton          Home        2.09
Bristol Rovers    Home        2.03
Huddersfield      Home        1.85
Millwall          Home        1.80
Yeovil            Away        1.73
Hereford          Away        1.71
Oldham            Home        1.67
Swindon           Away        1.60
Southend          Away        1.58
Northampton       Away        1.49
Scunthorpe        Away        1.11
Leeds             Away        1.02
Total                         24.11

How does this work ? Well, that’s a secret [for the minute]. But essentially you do the following

  • Select a large number of historical games for which you have prices [see Betfair data]
  • Use a Genetic Algorithm to try and pick abilities for each team which, when passed through a ‘Match Odds’ function, produce probabilities as close as possible to the observed historical probabilities [the 'Training Step']
  • Try out your team’s abilities on a set of games not used in the Training Step [the 'Test Step']; adjust your team abilities based on the results
  • Use the adjusted abilities to predict Match Odds for the remainder of the season [the 'Prediction Step']

How did we do ? Well, two thirds of the way down the table you can see that the algorithm predicted 1.60 points for MK Dons in the Swindon match, within a few hundredths of the figure implied by the current Betfair prices. So not too shabby.

Now what’s the significance of all this ? Remember I need MK Dons to score more than 22 points in the final 14 games to make money on my Season Points bet. What this result is saying to me is that the algorithm expects MK Dons to take more points from the remaining 14 matches [24.11, per the table total] than the 22 points currently implied by the Season Points market.
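To make the comparison concrete, here is the sum of the per-match predictions from the table set against the break-even implied by buying at 84 with 62 points already banked. It’s only a sketch of the arithmetic; the names are my own.

```python
# Predicted points per remaining match, copied from the table above
PREDICTED_POINTS = {
    "Crewe": 2.27, "Walsall": 2.16, "Brighton": 2.09, "Bristol Rovers": 2.03,
    "Huddersfield": 1.85, "Millwall": 1.80, "Yeovil": 1.73, "Hereford": 1.71,
    "Oldham": 1.67, "Swindon": 1.60, "Southend": 1.58, "Northampton": 1.49,
    "Scunthorpe": 1.11, "Leeds": 1.02,
}


def break_even_points(buy_price, points_so_far):
    """Points still needed for a Season Points buy to break even."""
    return buy_price - points_so_far


predicted_total = sum(PREDICTED_POINTS.values())
needed = break_even_points(buy_price=84, points_so_far=62)
print("predicted %.2f vs needed %d: edge of %.2f points"
      % (predicted_total, needed, predicted_total - needed))
```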

OK, but it’s just a computer’s opinion. Why might this be useful in practice ?

The reason is that the Season Points markets and the Match Odds markets are really pricing the same thing - an expectation of how many points each team is likely to get over the course of the season. The Season Points price is an aggregation of the individual Match Odds prices - which means that if there’s a difference in the two prices, there might be an arbitrage opportunity.

What’s crucial here is that I can design a strategy with Match Odds prices that offsets my exposure in the Season Points. Let’s say I buy MK Dons Season Points at 84 for £1 a point; how much am I likely to make or lose on the result of the Swindon match ? Remember my expected points in the Swindon match are 1.6 and that it’s 3 points for a win, 1 for a draw, 0 for a loss. So:

  • Win = 3 - 1.6 = 1.4
  • Draw = 1 - 1.6 = -0.6
  • Loss = 0 - 1.6 = -1.6
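The same arithmetic as a small Ruby sketch, assuming a £1-a-point stake; the hash and variable names are illustrative, not taken from any real trading code.

```ruby
# League points awarded to MK Dons for each possible result.
points_for = { :win => 3, :draw => 1, :loss => 0 }

# Expected points for the Swindon match, taken from the table above.
expected = 1.6

# A £1-a-point Season Points buy gains or loses the difference between
# the points actually won and the points already priced in.
exposure = {}
points_for.each { |outcome, pts| exposure[outcome] = (pts - expected).round(2) }

exposure.each { |outcome, pnl| puts format("%-5s %+.1f", outcome, pnl) }
```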

Now, back to the Swindon / MK Dons match:

Let’s say I place the following bets:

  • £0.8450 on Swindon to win at 3.55
  • £0.5333 on the Draw at 3.75

How much do I make or lose on this strategy, depending on the outcome ?

  • MK Dons win: -£0.8450 -£0.5333 = -£1.3783
  • MK Dons lose: £0.8450 * (3.55-1) - £0.5333 = £1.6215
  • MK Dons draw: -£0.8450 + £0.5333 * (3.75 - 1) = £0.6215

Which, if you compare it to the earlier figures, is an almost exact hedge! So all I should need to do to make some money here is

  • Buy MK Dons Season Points at 84
  • Place bets in the Match Odds markets of all their remaining games to offset my exposure
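To see why those two stakes work as a hedge, here is a hedged Ruby sketch. The stake formula is my assumption [the figures quoted above are consistent with it, to within rounding]: each back bet is sized to cover the gap between a win and that outcome, divided by the decimal odds. Commission and margin are ignored.

```ruby
# Season Points exposure per MK Dons result, from the earlier list (£1 a point).
exposure = { :win => 1.4, :draw => -0.6, :loss => -1.6 }

# Betfair decimal odds quoted in the text.
swindon_odds = 3.55 # Swindon win, i.e. MK Dons loss
draw_odds    = 3.75

# Assumed stake rule: cover the gap between a win and each other outcome.
swindon_stake = (exposure[:win] - exposure[:loss]) / swindon_odds # 3.0 / 3.55
draw_stake    = (exposure[:win] - exposure[:draw]) / draw_odds    # 2.0 / 3.75

# Profit and loss of the two back bets under each result.
hedge = {
  :win  => -swindon_stake - draw_stake,
  :draw => -swindon_stake + draw_stake * (draw_odds - 1),
  :loss => swindon_stake * (swindon_odds - 1) - draw_stake
}

# Combined position: an identical residual for every result means the
# match outcome no longer matters.
net = {}
exposure.each_key { |outcome| net[outcome] = exposure[outcome] + hedge[outcome] }

net.each do |outcome, pnl|
  puts format("%-5s season %+.4f  hedge %+.4f  net %+.4f",
              outcome, exposure[outcome], hedge[outcome], pnl)
end
```

With unrounded stakes the net figure comes out the same for win, draw and loss, which is what "an almost exact hedge" means in practice; the small differences in the figures above come from rounding the stakes to four decimal places.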

At least that’s the theory. In practice:

  • The computational wizardry is unlikely to be 100% perfect in its predictive ability
  • I have to pay 5% commission on any Betfair winnings
  • I will have to post margin to Sporting Index if MK Dons start to lose

Nevertheless the season only has 6 weeks to run, and I think this has a good chance of making money. I’ll blog in due course about how it went.

written by justin

Mar 04

I was going to call this post ‘RailsSide’ [or something equally inane] but thankfully came to my senses in time.

Randall and Ramon were kind enough to comment on my last post on this subject.

To Randall’s comment, I can see that Seaside may well be quicker than Rails in terms of runtime CPU, but I don’t see that it’s evident Seaside is faster in terms of development time:

  • The absence of a full stack in Seaside is a net negative to lumpenprogrammers like me; I want something that just works out of the box, and don’t want to have to evaluate all the different persistence options; just tell me I should use ActiveRecord, and I’ll be happy.
  • I’d argue the biggest headache to folk like me is deployment, and the biggest win in recent years has been the emergence of services like Heroku for Rails, and AppEngine for Python/Django. Where’s the Seaside equivalent ?

Anyway guys, thanks for noticing; on to the meat of this post. Last time I talked about wanting to use Seaside’s component-style view layer in Rails. Now it’s been bugging me; how do I get started ?

I’m going to need some kind of tree structure to represent an HTML document. I started off using hashmaps, but quickly found [doh] that they don’t preserve order - you need an ordered hashtable structure. I dived into building an ordered hashtable class but suddenly remembered a quote from Paul Graham in his Arc documentation:

“There is a tradition in Lisp going back to McCarthy’s 1960 paper of using lists to represent key/value pairs; [these are] called association lists, or alists for short. I once thought alists were just a hack, but there are many things you can do with them that you can’t do with hash tables, including sort them, build them up incrementally in recursive functions, have several that share the same tail, and preserve old values.”

So, you might represent a simple HTML document as follows:

['html', ['head', ['title', "Here's the title"], 'body', ['h1', "Here's the body"]]]

Couple of points about this structure:

  • Each list contains a series of name/value pairs
  • The ‘name’ item in the pair is always a string [tag]
  • The ‘value’ item in the pair can be a string [eg 'title', 'h1'] or a list [eg 'html', 'head', 'body'], representing a tag’s attributes/children/text

With this in place, a quick bit of recursive code will convert the list to HTML.
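A minimal sketch of what such a function might look like follows. The HTML.generate name comes from this post, but the body is my own reconstruction under simple assumptions: every list is a flat run of name/value pairs, string values are tag text, list values are children, and there is no escaping or attribute handling.

```ruby
module HTML
  # Convert a flat [name, value, name, value, ...] list into an HTML string.
  # A String value becomes the tag's text; an Array value is a nested list of
  # further name/value pairs and is rendered recursively.
  def self.generate(list)
    list.each_slice(2).map do |tag, value|
      inner = value.is_a?(Array) ? generate(value) : value.to_s
      "<#{tag}>#{inner}</#{tag}>"
    end.join
  end
end

doc = ['html', ['head', ['title', "Here's the title"], 'body', ['h1', "Here's the body"]]]
puts HTML.generate(doc)
# prints <html><head><title>Here's the title</title></head><body><h1>Here's the body</h1></body></html>
```

Walking the list two items at a time is what makes the alist approach work: order is preserved for free, and nesting falls out of the recursion.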

Now we’re cooking! Of course the HTML.generate function will have to be beefed up to differentiate between tag attributes and children, handle single tags etc; I’ll leave that to your imagination. I want to focus on a different aspect - how to structure a set of ruby classes which represent my on-screen HTML components, and which when linked together will yield the list structure from which my HTML can be generated.

Seaside’s ‘renderContentOn’ method is a useful starting point here. Seaside components typically extend the WAComponent class; grossly simplified, each object’s renderContentOn method accepts an HTML renderer as an argument, and paints onto that renderer HTML representing the object itself. We can use a similar strategy here; if I define classes for Document, Head and Body, I can generate my HTML list provided I define a render method at each level:

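# Top-level component; owns a Head and a Body
class Document
  attr_accessor :head, :body
  # Concatenate the children's name/value pairs and wrap them in an 'html' pair
  def render
    html=[]
    html+=self.head.render
    html+=self.body.render
    ['html', html]
  end
end
# Renders as a 'head' pair containing a single 'title' pair
class Head
  attr_accessor :title
  def render; ['head', ['title', self.title]]; end
end
# Renders as a 'body' pair containing a single 'h1' pair
class Body
  attr_accessor :text
  def render; ['body', ['h1', self.text]]; end
end
# Build a small document by linking the components together
def sample_doc
  head=Head.new
  head.title="Hello World!"
  body=Body.new
  body.text="Hello World!"
  doc=Document.new
  doc.head, doc.body = head, body
  doc
end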

So now, a call to sample_doc.render will generate my HTML tree, and passing the results to HTML.generate [from the code above] will turn it into HTML; all I need to do now is change my controllers to render the HTML so generated [using render(:text)], and I’m done.

Will this work in production ? I really like the idea of being able to develop view-level components this way, but there are a few steps outstanding:

  • The HTML generation code is very naive and will need to be updated to handle HTML attributes, single tags, empty tags etc; I don’t think this will be too problematic
  • I’m guessing I’ll want to develop the CSS in a similar fashion [ie embed it in the underlying component]; Seaside uses the ’styles’ method here; I’ll probably need something similar.
  • I have a feeling that when you have a large number of components, the rendering process is going to be very slow

Still, I think it’s a promising start; I’ll post an update when I have some experience with it on a real, live project.

written by justin