OpenTLD

OpenTLD is an open-source visual tracking algorithm by Zdenek Kalal. He released the source earlier this year, and there have since been efforts to port the algorithm from the original MATLAB to a C/C++-only implementation.

I've recently been involved in that effort, and we have successfully ported the code to C++. It's easy to compile and run on Linux (I'll be testing it on other platforms as well, OSX and Windows), and it implements a basic feature set that I hope we can expand. One priority is still to increase its performance so it can be used in embedded systems and robotics.

Here's a quick view of the algorithm in action:

Here's Zdenek's original video:

The C++ port simply requires:

  1. CMake
  2. OpenCV (2.3)

Here are some features I want to add:

  1. Obtaining an initial bounding box from a Haar classifier
  2. Saving learned models
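
As a rough idea of how the first feature could work, here is a minimal sketch of choosing an initial bounding box from a list of detections. It assumes the detections come from something like OpenCV's CascadeClassifier::detectMultiScale, and it simply takes the largest one. The Box struct and the pickInitialBox function are hypothetical names made up for this illustration; they are not part of the port.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical box type for illustration; with OpenCV, cv::Rect would play this role.
struct Box {
    int x;
    int y;
    int width;
    int height;
};

// Area of a box in pixels.
inline long boxArea(const Box& b) {
    return static_cast<long>(b.width) * b.height;
}

// Pick the largest detection as the initial bounding box.
// Returns false (leaving `out` untouched) when there are no detections,
// so the caller can fall back to a manually drawn box.
inline bool pickInitialBox(const std::vector<Box>& detections, Box& out) {
    if (detections.empty()) {
        return false;
    }
    std::size_t best = 0;
    for (std::size_t i = 1; i < detections.size(); ++i) {
        if (boxArea(detections[i]) > boxArea(detections[best])) {
            best = i;
        }
    }
    // Assumption: the detector never reports empty boxes.
    assert(detections[best].width > 0 && detections[best].height > 0);
    out = detections[best];
    return true;
}
```

Taking the largest box is only one possible heuristic. It favours the object closest to the camera, which may or may not be the one you want to track.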

Performance of the C++ port is approximately 40-50 fps on my MacBook (Ubuntu 11.10 on an Intel Core 2 Duo 2.16GHz with 1GB of RAM) for the motocross video, and around 15 fps for the MacBook's built-in webcam. However, this implementation runs on a single CPU core; a GPU or multi-core implementation could be much faster. This is very much a work in progress, with certainly very exciting times ahead!
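
To put those numbers in terms of time per frame: 40-50 fps is a budget of 20-25 ms per frame, and 15 fps is roughly 67 ms. If you want to measure this on your own machine, here is a small sketch using std::chrono. The names fpsToMillis and measureFps are hypothetical helpers for illustration, not functions from the port, and processFrame stands in for whatever per-frame work you want to time.

```cpp
#include <cassert>
#include <chrono>

// Convert a frame rate (frames per second) into the time budget per frame, in milliseconds.
inline double fpsToMillis(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Call `processFrame` `frames` times and return the average frames per second.
// Returns 0.0 if the elapsed time is too small for the clock to measure.
template <typename F>
double measureFps(F processFrame, int frames) {
    assert(frames > 0);
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int i = 0; i < frames; ++i) {
        processFrame();
    }
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    if (elapsed.count() <= 0.0) {
        return 0.0;
    }
    return frames / elapsed.count();
}
```

Averaging over many frames smooths out the occasional slow frame, which matters here because the tracker's per-frame cost is not constant.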

So far I've only tested the implementation on Linux, but I'll try to get it working on OSX as well. To test the code, you can get it on GitHub from either my repository or alantrrs' repository: https://github.com/alantrrs/OpenTLD

My development fork of his repository (changes are regularly merged with Alan's repo): https://github.com/arthurv/OpenTLD

And of course, Zdenek's original MATLAB implementation: https://github.com/zk00006/OpenTLD

Comments

Hi, thanks for the work on OpenTLD, it's great. Do you have documentation for running OpenTLD on Windows? It would really help. Thanks again.

Thanks for your open-source release! I've downloaded it and successfully run it on my laptop.
However, the running speed is not great: it takes nearly 500 ms per frame, and sometimes even longer.
Could you please suggest some ways to improve the performance of OpenTLD (C++)? I've tried many methods so far.
I'd appreciate that!

Hi, I have seen that you are not only active in porting OpenTLD but also own a Raspberry Pi... Do you think it is feasible to port OpenTLD to the Raspberry Pi? My target would be autonomous robotics, as you mentioned in your article.
I am not sure if this endeavor is possible at all, because it would require porting OpenTLD, OpenCV, and the necessary MATLAB routines over to the ARM platform. And I am not sure if the 256MB of the R-Pi are sufficient to do anything useful with OpenTLD. On the other hand, the Pi has a very powerful GPU. A few days ago I read about a new camera module currently in development for the Pi, and OpenTLD would be a perfect fit. Do you think this is feasible?

OpenTLD does not use the GPU for processing, so even having a powerful GPU is not very useful, unfortunately. This is why OpenTLD is very slow even for full-sized computers!

Additionally, the Raspberry Pi's GPU has closed-source drivers, meaning we can't easily use OpenCL or put the GPU to any meaningful use for general-purpose computation.

Don't even get me started on MATLAB - it's closed-source, not designed for ARM, and very, very slow. If you got it working on a Raspberry Pi (and that's a very big if), it would turn the video into a slide show. You would have a better chance using alantrrs' C++ port on the Raspberry Pi, but I'm sure there would be speed issues as well.

Despite all these issues, the Raspberry Pi is still going to be immensely useful for autonomous robotics; we just have to improvise and innovate around its limitations!

Thank you for your response!

I was not thinking of getting Matlab to work on ARM. Instead I was more thinking of a "pure C" re-implementation of the required Matlab functions as you mention in your comment.

I am a computer scientist (with background in machine learning) but I have never done GPU programming. I know that there is an OpenCV version which was significantly improved by porting it to use the GPU. But I don't know if this would be feasible for OpenTLD or the re-implemented Matlab functions.

I read that the GPU of the Raspberry Pi is programmable through OpenGL ES. It was only recently that I learnt about the closed-source drivers. I don't know what that means: is there no interface to access the GPU at all (but then what's the sense in putting a powerful GPU on it)? Is the OpenGL interface enough to do this kind of programming (I guess not)? Or is there another way to use the GPU (apart from the high-volume NDA with Broadcom)?

I have also read about the BeagleBone, which contains a much more powerful CPU (Cortex A8). Its GPU is less powerful (but open-source). Maybe this beast would be a better fit for OpenTLD, since the CPU should make a difference. But I haven't read about a camera module for the BeagleBone, in contrast to the announced camera module that connects directly to the onboard camera interface/GPU on the Raspberry Pi.

What do you think? Imagine a quadrocopter with a camera controlled by OpenTLD! Unlimited possibilities :)

Chris

I too would like to see an OpenTLD version for the Raspberry Pi. Instead of putting it on a quadcopter, I'd like to track those and other flying objects from the ground. Sure, this could be done with a notebook of any kind, but I want it as compact as possible, and stepper motors can be controlled so easily with the Raspberry Pi.

Hello,

You mentioned that you were going to try to get the C++ version compiled on Mac (OS X). Have you been able to compile it successfully? If so, would you be able to explain how, or maybe put up a tutorial?

Thanks for all your help!

Hi, thanks for open-sourcing the OpenTLD C++ port. I downloaded the alantrrs-OpenTLD version and converted it to a VS2010 project with OpenCV 2.3.1. In a release build the processing speed is OK, but it keeps losing the object. This happens about 60% of the time, with video files as well as with the camera. The MATLAB C++ version, however, tracks objects well. Did I make a mistake?
Please point me in the right direction.

Hi!
I tried downloading OpenTLD from GitHub on Linux with
git clone git@github.com:alantrrs/OpenTLD.git
but I get a permission denied (publickey) error. Is this a problem on my side or at GitHub?

The ALIEN Visual Tracker application IS OUT!
Download it here: http://www.micc.unifi.it/pernici/
(available for Windows7 64bit).

The ALIEN visual tracker is a generic visual object tracker achieving state of the art performance. The object is selected at run-time by drawing a bounding box around it and then its appearance is learned and tracked as time progresses. The ALIEN tracker has been shown to outperform other competitive trackers, especially in the case of long-term tracking, large amount of camera blur, low frame rate videos and severe occlusions including full object disappearance.

The scientific paper introducing the technology behind the tracker will appear at the 12th European Conference on Computer Vision 2012 under the following title:
• FaceHugger: The ALIEN Tracker Applied to Faces. In Proceedings of European Conference on Computer Vision (ECCV) - DEMO Session -- 2012 Florence Italy.
A real time demo of the released downloadable application (http://www.micc.unifi.it/pernici/) will also be given during the conference [1].
Video demos showing the capability of this novel technology may be seen here http://www.youtube.com/user/pernixVision.

So I guess it's Alien vs. Predator yet again :)