Obtaining data from wasted providers
An optical approach to data acquisition
May 10, 2015 Version 1 in production.
In 2015 I was commissioned to develop a real-time customer information system for an important railway operator. Europe was immersed in a major financial crisis and the available budget was very tight. As a starting point, all trains were equipped with a GPS system controlled by fleet management software.
I soon realized that the GPS data would not be useful: part of the infrastructure (including the main station) runs underground. There was also a wide region where GSM coverage was too poor, and the satellite accuracy was insufficient to place each train on the correct track without errors. The information was not reliable enough to announce train departures and arrivals to travellers.
Using the data from our own interlocking system would have been the perfect solution, but it had a prohibitive cost. In fact, the manufacturer had its own product, 100% compatible with its internal database. It was frustrating to have all the information we needed right there, displayed on four computer screens, but useless to us due to a lack of will.
Thinking about reverse engineering
Meanwhile, the lights kept blinking on those dark screens. The lines representing each track section changed color, and absolutely everyone who attended the system could understand where the trains were and where they were going in the next few minutes… Was it so difficult to take note of what was happening on those screens to inform customers?
What was there to write down? Knowing the logic that governs how the track sections connect (the topology), the only information we lacked was the color. From the color of each section you can deduce everything else. It is enough to observe the color of a few dozen points on each screen to deduce the location of each train, the state of every track and thus the course of each train movement. But how do you get those points of color?
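To make the deduction step concrete, here is a minimal sketch of how train locations could be derived once the state of each section is known. The section names, the "occupied" label and the tiny four-section topology are all hypothetical, chosen only for illustration; the real rules of an interlocking are richer than this. The idea is simply that adjacent occupied sections are grouped together and each group is treated as one train.

```python
from typing import Dict, List, Set

# Hypothetical topology: each track section lists the sections it connects to.
TOPOLOGY: Dict[str, List[str]] = {
    "S1": ["S2"],
    "S2": ["S1", "S3"],
    "S3": ["S2", "S4"],
    "S4": ["S3"],
}


def occupied_groups(states: Dict[str, str],
                    topology: Dict[str, List[str]]) -> List[List[str]]:
    """Group adjacent occupied sections; each group is assumed to be one train."""
    occupied: Set[str] = {s for s, state in states.items() if state == "occupied"}
    seen: Set[str] = set()
    groups: List[List[str]] = []
    for start in sorted(occupied):
        if start in seen:
            continue
        # Depth-first walk over occupied neighbours only.
        seen.add(start)
        stack = [start]
        group: List[str] = []
        while stack:
            current = stack.pop()
            group.append(current)
            for neighbour in topology.get(current, []):
                if neighbour in occupied and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    snapshot = {"S1": "occupied", "S2": "occupied", "S3": "free", "S4": "occupied"}
    print(occupied_groups(snapshot, TOPOLOGY))
```

Comparing the groups found in consecutive snapshots would then give the direction of travel of each train, which is the kind of inference an operator makes by eye.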
Legal piracy stuff
At first we thought we would point a video camera at each screen and process the images using computer vision algorithms (image filters). Thinking a little more, we realized that we could simplify the process by using video capture cards connected directly to the video output of the screens we wanted to observe. After experimenting, we knew we had found the solution (eureka).
The captured image was so accurate that we could look for the color dots using the same coordinates as in the source image. Developing the algorithm was the simplest part of the whole process.
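As an illustration of how simple that algorithm can be, the sketch below reads one pixel per track section at fixed coordinates and maps it to the nearest known color. It is not our production code: the palette, the state names and the sample coordinates are assumptions made up for the example, and the frame is a plain nested list standing in for whatever the capture card delivers.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]
Frame = List[List[RGB]]  # frame[y][x], one RGB tuple per pixel

# Hypothetical palette: which display color means which section state.
PALETTE: Dict[str, RGB] = {
    "free": (128, 128, 128),
    "route_set": (0, 255, 0),
    "occupied": (255, 0, 0),
}

# Hypothetical sample points: one (x, y) screen coordinate per track section.
SAMPLE_POINTS: Dict[str, Tuple[int, int]] = {
    "section_A": (1, 0),
    "section_B": (2, 1),
}


def classify(pixel: RGB) -> str:
    """Return the palette entry closest to the pixel (squared RGB distance)."""
    def distance(reference: RGB) -> int:
        return sum((p - r) ** 2 for p, r in zip(pixel, reference))
    return min(PALETTE, key=lambda name: distance(PALETTE[name]))


def read_sections(frame: Frame) -> Dict[str, str]:
    """Sample the frame at each fixed coordinate and classify the color found."""
    return {name: classify(frame[y][x]) for name, (x, y) in SAMPLE_POINTS.items()}


if __name__ == "__main__":
    grey = (128, 128, 128)
    frame = [[grey] * 3 for _ in range(2)]
    frame[0][1] = (250, 10, 5)    # slightly noisy red at section_A
    frame[1][2] = (10, 240, 20)   # slightly noisy green at section_B
    print(read_sections(frame))
```

Matching to the nearest palette color rather than to an exact value leaves some tolerance for small variations introduced by the capture hardware.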
Combining the theoretical train schedule with the track layout topology, and allowing manual data entry to handle last-minute changes, the result is a real-time traffic control system that not only reports on the punctuality of each train but also makes it possible to generate statistics, resolve incidents and evaluate the efficiency of the whole operation. In fact we're still exploring the possibilities that this endless data source has put within our reach. To improve the readability of the information, we've developed a clone of the original interlocking display that we can replicate as many times as we want, serve over the web using HTML5 and SVG, and make available to many more company employees involved in traffic operations.
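For readers curious about the SVG clone, a rough sketch of the idea follows: each track section becomes an SVG line whose stroke color reflects its current state. The geometry, colors and section identifiers here are hypothetical and much simpler than a real track diagram.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Hypothetical geometry: (x1, y1, x2, y2) of the line drawn for each section.
SECTION_LINES: Dict[str, Tuple[int, int, int, int]] = {
    "S1": (10, 20, 110, 20),
    "S2": (110, 20, 210, 20),
}

# Hypothetical mapping from section state to stroke color.
STATE_COLORS: Dict[str, str] = {
    "free": "#808080",
    "route_set": "#00ff00",
    "occupied": "#ff0000",
}


def render_svg(states: Dict[str, str]) -> str:
    """Build an SVG document with one colored line per track section."""
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "viewBox": "0 0 220 40",
    })
    for section, (x1, y1, x2, y2) in SECTION_LINES.items():
        # Sections with a missing or unknown state are drawn as free.
        color = STATE_COLORS.get(states.get(section, "free"), STATE_COLORS["free"])
        ET.SubElement(svg, "line", {
            "id": section,
            "x1": str(x1), "y1": str(y1),
            "x2": str(x2), "y2": str(y2),
            "stroke": color,
            "stroke-width": "4",
        })
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(render_svg({"S1": "occupied", "S2": "free"}))
```

Because the output is plain markup, the same drawing can be embedded in any HTML5 page and refreshed whenever a new snapshot of section states arrives.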
The next steps of this research consist of doing the same with data imported from several other auxiliary systems, to obtain a single, robust management block which will take all kinds of information into account. Everything will be done in a clean, legal and absolutely transparent way.