Functional Programming

Simple color-based search by image in F#

Last week I was playing with a photomosaic composer toy project and needed a simple search by image engine. By search by image I mean searching a database for an image similar to a given one. In this tutorial I will show you how you can implement this functionality –with some obvious limitations– in an extremely simple way just by looking at an image’s color distribution.

If you are looking for feature similarity (shapes, patterns, etc.) you most likely need edge detection algorithms (linear filters or other similar methods), which give excellent results but are usually quite complicated. I suppose that’s the way most image search engines work. Alternatively this paper describes the sophisticated color-based approach used by Google’s skin-detection engine.

In many cases however, finding images with a perceptually similar color distribution can be enough.
If you are in this situation, you may get away with a very simple technique that still gives pretty good results with a minimal implementation effort. The technique is long known and widely used, but if you have no experience in image processing this step-by-step guide may be a fun and painless warm-up to the topic.

I’ll show the concept with the help of F# code, but the approach is so straightforward that you should understand it even without prior knowledge of the language.


This is the high level outline of the process.

Just once, to build a database “index”:

  • Create a normalized 8-bit color distribution histogram of each image in the database.

For every query:

  • Create a normalized 8-bit color distribution histogram of the query image.
  • Search the database for the histogram closest to the query using some probability distribution distance function.

If you are still interested in the details of each step, please read on.

Extracting an image’s color signature

Given that we want to compare images, we’ll have to transform them into something that can be easily compared. We could just compute the average color of all pixels in an image, but this is not very useful in practice. Instead, we will use a color histogram, i.e. we will count the number of pixels of each possible color.

A color histogram is created in four steps:

  1. Load/decode the image into an array of pixels.
  2. Downsample the pixels to 8-bit “truecolor” in order to reduce the color space to 256 distinct colors.
  3. Count the number of pixels for each given color.
  4. Normalize the histogram (to allow the comparison of images with different size).

1. Loading the image

This is almost trivial in most languages/framework. Here’s the F# code using the System.Windows.Media APIs:


2. Downsampling 32-bit color to 8-bit



With the help of some basic bitwise operations we reduce pixels from 32 bits down to 8. We discard the alpha channel and keep 2 bits for blue (out of the original 8), 3 for red and 3 for green (we discard the least significant bits of each color component). The result is that each pixel (being a byte) can represent one of exactly 256 colors. We obviously loose some color detail because we cannot represent all the original gradients, but having a smaller color space keeps the histogram size manageable.

Note: in general 8-bit images use a palette, i.e. every pixel value is a pointer to a color in a 256-color palette. That way the palette can be optimized to only include the most frequent color in the image. In our case the benefit would not be worth the trouble as we would need a common palette across all the images anyways (plus the above method is faster and simpler).

3. Creating the histogram



Nothing special here: we just count the number of pixels that are of a given color. The histogram is nothing more than a 256-elements array of integers (plus the image file name). You can read it like “this image has 23 “light green” pixels, 10 “dark red” pixels, etc.”
We then normalize the histogram by dividing each value by the total number of pixels so that each color amount is a float value in the 0 .. 1 range, where for ex. 0.3 means that a picture has 30% of pixels of that given color.


Comparing color histograms

Now we have a collection of histograms (the database) and a query histogram. In order to find the best matching image, we need a way to measure how similar two histograms are. In other words we need a distance function that quantifies the similarity between two histograms (and thus between two images).

You probably have noticed that a normalized histogram is in fact a discrete probability distribution. Every value is between 0 and 1 and the sum of all values is 1. This means we can use statistical “goodness of fit” tests to measure the distance between two histograms. For example the chi-squared test is one of those. We are going to use a slight variation of it, called quadratic-form distance. It is pretty effective in our case because it reduces the importance of differences between large peaks.
The test is defined as follows (p and q are the two histograms we are comparing):

[latex]distQF(p, q) = \frac{1} {2}  \sum_{i=0}^n \frac{(p_i – q_i)^2} {p_i + q_i}[/latex]


the implementation is straightforward:

The more two histograms are different, the larger is the return value of this test. The test returns 0 for two identical histograms.

A more sophisticated option is the Jensen-Shannon divergence, that is a  a smoothed version of the Kullback-Leibler divergence. While being more complicated, it has the interesting property that its square root is a metric, i.e. it defines a metric space (in layman’s terms, a space where the distance between two points can be measured, and where the distance A → B → C cannot be shorter than the direct distance A → B). This property is going to be useful in the next post when we’ll optimize our search algorithm.

The Kullback-Leibler and Jensen-Shannon divergences are defined as:

[latex]distKL(p, q) = \sum_{i=0}^n p_i ln \frac {p_i} {q_i}[/latex]


[latex]distJS(p, q) = \frac{1}{2} distKL(p, \frac{1}{2}(p + q)) + \frac{1}{2} distKL(q, \frac{1}{2}(p + q))[/latex]


This is the corresponding F# code:

This paper includes an interesting comparison of various distance functions.

At this point our problem is almost solved. All we have to do is iterating through all the samples measuring the distance between query and sample and selecting the histogram with the smallest distance:

Notice that I use head because I’m only interested in the best matching item. I could truncate the list at any given length to obtain n matching items in order of relevance.

Optimizing the search

Maybe you’ve noticed one detail: for every query, we need to walk the full database computing the distance function between our query histogram and each image. Not very smart. If  the database contains billions of images that’s not going to be a very fast search. Also if we perform a large number of queries in a short time we are going to be in trouble.

If you expected a quick and easy answer to this issue I’m going to disappoint you. However, the good news is that this problem is very interesting, much more so than it may look at first sight. This will be the topic of the next post, where I’ll write about VP-Trees, BK-Trees, and Locality-Sensitive Hashing.

Grab the source

The complete F# source of this tutorial is available on GitHub (a whopping 132-lines of code).

Thanks to my brother Lorenzo for the review and feedback.

Real world F#: my experience (part two)

The second project I recently completed in F# is a completely different animal. While the first one is a pet project I’ve put together in my spare time (with no deadline at all), this one has been a full-time work for my company (for this reason I cannot disclose some details or share source code). Additionally, time available was limited. Very limited. Like 2 weeks limited. That’s 10 working days plus a 2-days emergency buffer.

A load simulator tool

My company produces a high performance client-server platform that ships with our own proprietary database engine. After some important changes to the server and database codebase, we needed to test the system’s behavior under heavy load, i.e. when a large number of users are connected and firing queries.

As you can guess, hiring and coordinating hundreds of people to load the system the way you need is very impractical, if possible at all. Maybe it’s doable if you have your own Army of Clones, but we don’t have one, so we had to somehow automate the process. Keep in mind that the server interface is proprietary, i.e. it’s not http, SQL, or anything similar: we have to go through our library and API to access the server. For that reason we could not use any existing tool.

The application I was going to build was meant for internal use but it was clear that something usable by non-über-geeks would have been nice to have at some point (for example to help sizing hardware for large customers). Anyways, being the deadline very close, it was imperative (no pun intended) to focus on the most important stuff.

Writing your own DSL

I decided to define an external DSL to describe the simulation scenarios. The language would let you express the creation of users, connections, queries, pauses, etc. in a simple way.
The second decision was to use F#. Fortunately nobody objected (again no pun intended). I was to work on the project alone, so I could basically use whatever I liked.

Once I defined the grammar I went to step 2, i.e. parsing. Obviously I was not going to reinvent the wheel by rolling out my own lexer and parser so the choice was between parser generators (FsLex/FsYacc, Irony for C# & co.) and combinator libraries à la FParsec. After taking some advice from the great F# community on Twitter (thanks Robert!), I opted for FParsec. I admit it looked a bit intimidating, but the idea of not introducing a tooling step in the build process was appealing, plus I had never used a combinator library before and was curious.

Here starts the amazement. As mentioned, at first FParsec looks slightly cryptic, but once you get the main concepts and get over a few gotchas it just “clicks”. You quickly reach a point where reading the parser code is almost like reading the grammar definition. Making changes is a matter of a few minutes with a very low risk of introducing new errors. FParsec gives you an enormous flexibility and even if the learning curve is steeper than learning parser generators I suggest you look at it if you’ve never done it before. The official documentation is great too.

Anyways, in a few days I had a parser that lifted the input program to the abstract syntax tree. Sweet!

Note: in case you are wondering, the language I defined was not super complicated but also not trivial. It supports regular loops as well as parallel ones (iterations are executed in parallel), nested loops and a plethora of options on all the various commands. I opted for a rich syntax that results in programs that are almost written in natural language. I cannot disclose all the details, but you can get an idea by looking at the screenshots.


Walking the tree

Second amazement: thanks to discriminated unions and pattern matching, walking the syntax tree is an incredibly fluid and easy process. The code is so compact and elegant that I keep opening that file just to look at it. No boilerplate, no class proliferation, no wasted characters. Just the code.

Unfortunately I could not leverage the powerful F# concurrency features to run parallel loops because the client library that interfaces with our server is not thread-safe, so all I could do was starting new threads with each its own separate AppDomain. My skills on asynchronous workflows & co. are still limited so I don’t know if there’s a better way. If that’s the case, I’d love to hear your feedback in the comments.

GUI and extras

With parsing and interpreting done, the bulk of the job was over. I just needed to add logging and a less geeky interface than the command line. With room to spare, I created a WPF GUI that controls the execution and reads logs to display status and stats. This was nothing particularly exotic, but I was able to fit in some nice touches like a graphical timeline to represent operations executed on the different threads. I wrote the GUI in XAML/C# using MVVM-Light. The parser/interpreter runs in a separate process, so that in case of a crash (not a remote possibility when you are pushing the hardware limits) the GUI keeps running and tells you what happened.


So 10 days had passed and this is what had been done:

  • the DSL grammar definition
  • a parser and an interpreter for it. It took slightly more than necessary because I had to learn FParsec along the way (this talk by Robert Pickering has been very helpful).
  • a GUI with some bells and whistles

plus some extras (that as you know better than me, are very time consuming):

  • a (admittedly basic) distributable package
  • the syntax highlighting definition for Notepad++ Smile
  • several code samples that show the DSL capabilities
  • the user manual and language specification (I got some help with that)
  • a tutorial

Developing the GUI and producing the extras went at normal speed, but I’m positive that writing this parser and interpreter in C# would have taken me close to the ten days alone. Maybe my standards are low, I don’t know, but I’m honestly blown away by what I could achieve in such a short time. Also notice that I’m much more experienced in C# than in F#.

Truth to be told, I had another advantage: this project was done in the year ending period when several people are on holidays and the office is very quiet. I also put in some late evenings, but I have a family with two kids, I just cannot code 24*7 even if I wanted.

The stars of the show

The goal of this post is not telling the world how fast I work. It’s impossible for anyone to judge if a project would have needed 2 or 100 days without knowing all the details. No, I’m writing this because I know all the details and I know that F# gave me a huge advantage. Much more so than I imagined when I started.

These are things that I think make F# ideal for a project like this:

Higher order functions

These are what allow libraries like FParsec to exist, amongst the rest.

Discriminated unions, tuples and pattern matching

This trio is worth alone the price of entry. They make for very terse code and bring other great advantages on the table as well.

It works the first time

I still don’t get why it is so. Maybe it’s because of the lack of nulls. Maybe it’s because (as I’ve written in part one) I think functional programming forces you to think more and write/debug less. The net result is that when I write F# I mostly get it right the first time. Because of the higher-order functions there are less corner cases that suddenly appear and crash everything.

Now most of these features are available in several functional languages, however the seamless .NET integration was fundamental in my case (the libraries I had to use are .NET), and some F#-only constructs make coding fun and speedy at the same time.


If you’re not living under a rock (like I’m literally doing right now –but that’s another story) you’ve sure heard of F#. Maybe you’ve even seen some examples, but as I’ve heard many times from C# developers, they looked incomprehensible. Don’t let that stop you, it’s just not true. If you’re new to functional programming it looks that way because F# is (mostly) a functional language, i.e. you’re not only learning a new language, you’re learning a new paradigm. A different way of thinking of your programs. It does take some effort, for sure. Is it worth it? It’s up to you to decide. To me, getting back to functional programming with F# after several years of OOP/C# has been a real breath of fresh air.

If you decide to learn more, here are some great places to start:

Advice for getting started with F# by Richard Minerich
An overview of functional programming by Dorian Corompt (recursion, lists, more to come…)

I suggest starting with the basics: you can already accomplish a lot with just lists, sequences, tuples, unions and pattern matching. When you feel ready you can move on to the more advanced topics.
Have fun!

Again, many thanks to Steffen and Samuel for the feedback!

Real world F#: my experience (part one)

I’ve been playing with F# on and off for about one year, but only recently I was able to complete a few “real world” projects. I was so impressed that I decided to share my experience. In this two-part series I will talk about two very different projects to give you an idea of how wide the spectrum of applications is where F# feels right at home.

The first project

The first project is named VeloSizer. You can check it out here (I may release it as open source but I’m still undecided on what to do with it). I assume you are not a cycling geek so I’ll spare you the details, but in short this application computes the bike setup given your position and the frame geometry. If you’re interested there’s a detailed description on the application page. Surprisingly enough, I’ve never found anything that does this very thing (except for full blown and expensive CADs), so I decided to write it myself.


The application is built in Silverlight: the XAML frontend is basically a glorified input form. It’s not particularly complex, some details are more complicated than it may look at first sight, but still there’s nothing extraordinary. I took a rather standard approach and employed the MVVM pattern (using MVVM-Light) for a clear separation of concerns. The View Model is C# while the Model –where the interesting stuff happens– is written in F#.

Solving this particular problem does not require very complicated mathematics, but involves a large number of geometrical operations (trigonometry and the likes). Without abstracting and hiding away all the math, the solution quickly becomes a nightmare that spirals out of control (don’t ask how I know). For this reason I’ve implemented a simple 2D CAD engine that sits at the application core.

How it went

Here are some things I noticed while using F# in a “real” project for the first time.

Units of measure

F#’s support for units of measures built straight into the type system has been very helpful to avoid stupid errors like mixing degrees with radians with millimeters, etc. It is really a plus when dealing with physical dimensions.


The language syntax is very light and unobtrusive, which makes it ideal to write mathematically-oriented code. The main benefit to me has been that the math stands out clearly, without parenthesis, type annotations or artifacts that make things harder to read. Also writing the code is a joy: you can really focus on the reasoning and almost forget that you are actually programming. In fact translating the equations written on paper to code is almost copy & paste.


I heartily agree with Richard Minerich when he says that testing does not replace a strong, theoretically-validated model. It’s the very same reason that pushed me to build most of this application’s engine on paper before writing a line of code. However I still make (lots of) mistakes when implementing a model –regardless of how correct it is– so I feel safer with the additional support of a solid testing framework.
The nature of functional programming makes it an ideal target for unit tests. Short, side-effects free functions are a joy to test. Result: it has been very easy to create a nice safety net in form of an NUnit project.
I must admit I would probably have written this library more or less using the same style in C#, but in functional programming this is the default.

Interoperatibily with C#/GUI

This is somewhat of a sore point. I don’t know if it is due to my lack of experience (likely) or the nature of a GUI-driven application, but I’ve ended up with many mutable (and not very idiomatic in general) classes, for two main reasons:

  • I had to persist the business objects (using the Sterling NoSQL database) and all the serializers for Silverlight need public setters as they are not allowed to use reflection.
  • With MVVM, each View is bound to its respective View Model, which is nothing more than a wrapper around its respective business object defined in the model (F#).
    Now when for instance the user changes a value in a TextBox, the new value is propagated to the View Model, which in turns propagates it to the Model. You can tell it’s not very practical to create a new instance of the model every time a value changes, so immutable objects do not adapt so well to the situation.

This means that the business objects are very C#-like. They still benefit from the lighter syntax, type inference, etc…, but they don’t fully leverage the power of the language. Fortunately the “application brain” does not suffer much from this.

Is this due to MVVM, XAML and in general GUI patterns being oriented towards the object-oriented paradigm? I don’t know. I’ve heard of a GUI framework specifically written for F#, but I don’t know much more.
I would be very interested to hear your opinion on this subject.

Note: as Stephen points out, keeping the model immutable may not be so much of a problem. I’ll give it a try.

Guidance and community support

The F# community is still small, but it more than makes up for it in quality. The active users on Stackoverflow and other sites are extremely competent. It’s rare to get bogus answers or to get stuck on a problem for long.
What I’ve found difficult though is getting guidance. I often ask myself if my code is well written or a pile of junk. I suppose the only solution is to refine my own sense by reading other people’s code.


Visual Studio’s Intellisense for C# is spectacular and has made us very lazy. F# support is much better than it was at the beginning, but it’s still not up to the same level of C#. In the end though it’s only lacking a few details like parameter names or support for the pipeline operator –the next release already includes some improvements in this area.


Setting breakpoints and watching state change is not simple in functional code because (usually) there is no state. If you debug a lot, this may be a bit unsettling at first, but then you realize it is not so much of a drawback. It is a benefit in fact. Breakpoints are evil: building a half-working solution, running it through breakpoints and tune it until the result matches what you expect is very close to the definitions of cargo-cult programming/programming by coincidence.
It is my opinion that functional programming makes you think more and write/edit/debug less. I believe this has made me a better developer because I now tend to stop, think about the solution “offline” and only write it down when I get it.


I can’t give any judgment on productivity because this application has been a pet project I’ve built alone without any deadline, working literally 15 minutes at a time. We recently welcomed another family member, which has made things even harder. Anyways it took me about 7 months to complete this project, but it’s very hard for me to tell if F# has given any productivity boost at all. More on this in part two.


It has been a real pleasure to write the F# part of this application. When you look at the application source, the first things that jumps to the eye is that the View Model (C#/OO) is way larger (in lines of code) than the model (F#/mostly functional), yet it only does “stupid” things: it’s almost exclusively made of property definitions, RaisePropertyChanged events, brackets, etc. It is like a very large box full of bubble wrap sheet, with only a small, precious gift in the middle.

That said I’ve been left with the impression that I haven’t used all of the language’s power. Writing the View Model in F# would only have slightly alleviated its ineffectiveness, what I need is probably a different pattern for GUI interaction.

In part two I’ll talk about a very different (and more interesting) project, where F# really shined. In the mean time I would be very interested to hear your opinions.

Thanks a lot to Steffen Forkmann and Samuel Bosch for proof reading and general feedback!